Introduce Valkey Over RDMA transport #477

Open, wants to merge 2 commits into base: unstable

pizhenwei

Hi,

In 2021/06 I created a PR proposing Redis Over RDMA. I then did some work to fully abstract the connection layer and make TLS dynamically loadable, so that since Redis 7.2.0 a new connection type can be built into Redis statically or as a separate shared library (loaded by Redis on startup).

Based on the new connection framework, I created a new PR, which some people (@xiezhq-hermann @zhangyiming1201 @JSpewock @uvletter @FujiZ) noticed, tried out and tested. However, because of the lack of time and knowledge from the maintainers, that PR has been pending for about 2 years.

Changes in this PR:

  • introduce the Valkey Over RDMA specification (the same as the Redis one, and it should stay the same).
  • implement Valkey Over RDMA (adapted to the Valkey style).

Finally, if this feature is considered for merging, I volunteer to maintain it.

pizhenwei and others added 2 commits May 9, 2024 10:38
RDMA stands for remote direct memory access. It is a technology that
enables computers in a network to exchange data in main memory
without involving the processor, cache, or operating system of either
computer. As a result, RDMA can perform better than TCP: the test
results show that Valkey Over RDMA reaches ~2.5X the QPS with lower
latency.

In recent years, RDMA has become popular in the data center; in
particular, the RoCE (RDMA over Converged Ethernet) architecture is
widely used.

Introduce the Valkey Over RDMA protocol as a new transport for Valkey.
For now, we define 4 commands:
- GetServerFeature & SetClientFeature: these two commands are used to
  negotiate features for future extension. No features are defined in
  this version. Flow control and multi-buffer may be supported in the
  future, and both would need feature negotiation.
- Keepalive
- RegisterXferMemory: the core command, used to transfer the real
  payload.
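
As an illustration only, the four commands listed above could be modelled as a small enumeration. The numeric values, the Python names and the `describe` helper are assumptions made for this sketch and are not taken from this patch; the authoritative definitions are in the *Protocol* section of RDMA.md.

```python
from enum import IntEnum


class RdmaCommand(IntEnum):
    """Hypothetical opcodes for the four commands; the values are assumed."""
    GET_SERVER_FEATURE = 0
    SET_CLIENT_FEATURE = 1
    KEEPALIVE = 2
    REGISTER_XFER_MEMORY = 3


# The two commands that exist only for feature negotiation.
NEGOTIATION = {RdmaCommand.GET_SERVER_FEATURE, RdmaCommand.SET_CLIENT_FEATURE}


def describe(cmd: RdmaCommand) -> str:
    """Return a short human-readable role for a command."""
    if cmd in NEGOTIATION:
        return "feature negotiation"
    if cmd is RdmaCommand.KEEPALIVE:
        return "keepalive"
    return "register transfer memory for the payload"
```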

The 'TX buffer' and 'RX buffer' are built on RDMA remote memory using
RDMA write / write with imm. The design is similar (but not identical)
to several mechanisms introduced in these papers:
- Socksdirect: datacenter sockets can be fast and compatible
  <https://dl.acm.org/doi/10.1145/3341302.3342071>
- LITE Kernel RDMA Support for Datacenter Applications
  <https://dl.acm.org/doi/abs/10.1145/3132747.3132762>
- FaRM: Fast Remote Memory
  <https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-dragojevic.pdf>
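
The idea of writing into a peer-owned buffer and then notifying the peer can be pictured with a toy, in-process model. This is a sketch under assumptions, not the implementation in this patch: `RxBuffer`, `TxView` and their methods are hypothetical names, a plain `bytearray` stands in for registered RDMA memory, and the assumption that the immediate value carries the number of bytes written is made for illustration. What happens once the buffer is full is not modelled.

```python
class RxBuffer:
    """Receiver-owned memory the peer writes into (stand-in for a registered region)."""

    def __init__(self, length):
        self.mem = bytearray(length)
        self.read_offset = 0  # bytes already consumed locally
        self.write_end = 0    # bytes the peer has announced so far

    def on_write_with_imm(self, nbytes):
        # Assumed: the immediate value tells the receiver how many bytes landed.
        self.write_end += nbytes

    def read(self):
        data = bytes(self.mem[self.read_offset:self.write_end])
        self.read_offset = self.write_end
        return data


class TxView:
    """Sender-side view of the peer's RX buffer."""

    def __init__(self, remote):
        self.remote = remote
        self.offset = 0  # next free position in the remote buffer

    def send(self, payload):
        room = len(self.remote.mem) - self.offset
        chunk = payload[:room]
        if chunk:
            # Stand-in for the one-sided RDMA write into remote memory.
            self.remote.mem[self.offset:self.offset + len(chunk)] = chunk
            self.remote.on_write_with_imm(len(chunk))
            self.offset += len(chunk)
        return len(chunk)  # 0 means the remote buffer is full
```

For example, with `rx = RxBuffer(8)` and `tx = TxView(rx)`, calling `tx.send(b"PING\r\n")` copies six bytes into `rx.mem`, and a following `rx.read()` returns them.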

Co-authored-by: Xinhao Kong <xinhao.kong@duke.edu>
Co-authored-by: Huaping Zhou <zhouhuaping.san@bytedance.com>
Co-authored-by: zhuo jiang <jiangzhuo.cs@bytedance.com>
Co-authored-by: Yiming Zhang <zhangyiming1201@bytedance.com>
Co-authored-by: Jianxi Ye <jianxi.ye@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Main changes in this patch:
* implement the *Valkey Over RDMA* protocol, see the *Protocol* section
  in RDMA.md
* implement only the server side as a connection module; this means we
  can *NOT* compile RDMA support as built-in.
* add the necessary information to RDMA.md
* support 'CONFIG SET/GET', for example 'CONFIG SET rdma.port 6380';
  the change can then be checked with 'rdma res show cm_id' and with
  valkey-cli (with RDMA support, which is not implemented in this patch)
* the full listener list looks like this:
    listener0:name=tcp,bind=*,bind=-::*,port=6379
    listener1:name=unix,bind=/var/run/valkey.sock
    listener2:name=rdma,bind=xx.xx.xx.xx,bind=yy.yy.yy.yy,port=6379
    listener3:name=tls,bind=*,bind=-::*,port=16379
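
A listener line in the format shown above is easy to split into fields. The helper below is a minimal sketch; `parse_listener` is a hypothetical name, and the only thing it assumes about the format is what the four example lines show: a `listenerN:` label followed by comma-separated `key=value` pairs, where `bind` may repeat.

```python
def parse_listener(line):
    """Parse 'listenerN:key=value,...' into (label, fields)."""
    # Split on the first ':' only, so IPv6-style binds such as '-::*' survive.
    label, _, rest = line.strip().partition(":")
    fields = {"bind": []}
    for item in rest.split(","):
        key, _, value = item.partition("=")
        if key == "bind":
            fields["bind"].append(value)  # 'bind' may appear several times
        else:
            fields[key] = value
    return label, fields


label, fields = parse_listener(
    "listener2:name=rdma,bind=xx.xx.xx.xx,bind=yy.yy.yy.yy,port=6379"
)
print(label, fields["name"], fields["bind"], fields["port"])
```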

The valgrind test passes:
valgrind --track-origins=yes --suppressions=./src/valgrind.sup \
         --show-reachable=no --show-possibly-lost=no --leak-check=full \
         --log-file=err.txt ./src/valkey-server --port 6379 \
         --loadmodule src/valkey-rdma.so port=6379 bind=xx.xx.xx.xx \
         --loglevel verbose --protected-mode no --server_cpulist 2 \
         --bio_cpulist 3 --aof_rewrite_cpulist 3 --bgsave_cpulist 3 \
         --appendonly no

performance test:
server side (TCP port 6379 does not conflict with RDMA port 6379):
./src/valkey-server --port 6379 \
             --loadmodule src/valkey-rdma.so port=6379 bind=xx.xx.xx.xx bind=yy.yy.yy.yy \
             --loglevel verbose --protected-mode no --server_cpulist 2 --bio_cpulist 3 \
             --aof_rewrite_cpulist 3 --bgsave_cpulist 3 --appendonly no

build a valkey-benchmark with RDMA support (not implemented in this patch) and
run it on an x86 host (Intel Platinum 8260) with a RoCEv2 interface (Mellanox ConnectX-5):
client side: ./src/valkey-benchmark -h xx.xx.xx.xx -p 6379 -c 30 -n 10000000 --threads 4 \
             -d 1024 -t ping,get,set --rdma
====== PING_INLINE ======
480561.28 requests per second, 0.060 msec avg latency.

====== PING_MBULK ======
540482.06 requests per second, 0.053 msec avg latency.

====== SET ======
399952.00 requests per second, 0.073 msec avg latency.

====== GET ======
443498.31 requests per second, 0.065 msec avg latency.
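
As a rough cross-check of the numbers above, Little's law relates throughput to concurrency and latency: throughput is about the number of in-flight requests divided by the average latency. The sketch below assumes each of the 30 connections (`-c 30`) keeps exactly one request in flight, which is an assumption about the run rather than something stated in this patch; `estimated_rps` is a hypothetical helper.

```python
CONNECTIONS = 30  # from '-c 30' in the benchmark command

# (requests per second, average latency in msec) as reported above
RESULTS = {
    "PING_INLINE": (480561.28, 0.060),
    "PING_MBULK": (540482.06, 0.053),
    "SET": (399952.00, 0.073),
    "GET": (443498.31, 0.065),
}


def estimated_rps(connections, avg_latency_msec):
    """Little's law: throughput ~= in-flight requests / average latency."""
    return connections / (avg_latency_msec / 1000.0)


for name, (rps, latency) in RESULTS.items():
    est = estimated_rps(CONNECTIONS, latency)
    print(f"{name}: reported {rps:.0f}, estimated {est:.0f}, ratio {rps / est:.2f}")
```

Under that assumption the reported throughput stays within about 5% of the estimate for all four tests, so the QPS and latency figures are consistent with each other.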

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>

codecov bot commented May 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.81%. Comparing base (6cff0d6) to head (36bd4e5).
Report is 9 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #477      +/-   ##
============================================
+ Coverage     68.91%   69.81%   +0.90%     
============================================
  Files           109      109              
  Lines         61792    61792              
============================================
+ Hits          42581    43138     +557     
+ Misses        19211    18654     -557     

see 21 files with indirect coverage changes

@pizhenwei
Author

This PR can be tested with an RDMA-enabled client.

To build the client with RDMA support:

make BUILD_RDMA=yes -j16

To test with these commands:

Config of redis: appendonly no, port 6379, rdma-port 6379,
                 server_cpulist 12, bgsave_cpulist 16.
For RDMA: ./redis-benchmark -h HOST -c 30 -n 10000000 -r 1000000000 \
          --threads 8 -d 512 -t ping,set,get,lrange_100 --rdma \
          --server_cpulist 2 --bio_cpulist 3 --aof_rewrite_cpulist 3 --bgsave_cpulist 3
For TCP: ./redis-benchmark -h HOST -c 30 -n 10000000 -r 1000000000 \
          --threads 8 -d 512 -t ping,set,get,lrange_100

@madolson madolson added the major-decision-pending Needs decision by core team label May 12, 2024