# Performance

message-io focuses on high performance, adding the minimum possible overhead over a raw OS socket. It offers the following performance features (see the sketch after this list):

- **Zero-copy read/write messages**: Messages sent and received pass their data reference (`&[u8]`) through the library, from user space to the OS socket and vice versa, without copies. The overhead the library adds is therefore independent of the message size: only the pointer to the data is copied, no matter how long the data is.
- **Full-duplex**: You can simultaneously read and write over the same socket from two different threads.
- **Multiwriter**: Over the same node, you can write simultaneously from any number of different sockets without blocking each other.
- **One internal thread with non-blocking sockets**: Creating new connections and listeners does not spawn new threads. All sockets are managed from a single thread, saving resources when scaling.
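
As an illustration of the zero-copy and single-thread design, here is a minimal echo-server sketch based on message-io's node API (the address and transport are arbitrary, and exact event variants may differ between library versions):

```rust
use message_io::network::{NetEvent, Transport};
use message_io::node;

fn main() {
    // One internal thread manages every socket; no extra threads are
    // spawned per connection or listener.
    let (handler, listener) = node::split::<()>();

    handler.network().listen(Transport::Tcp, "0.0.0.0:3042").unwrap();

    listener.for_each(move |event| match event.network() {
        NetEvent::Connected(_endpoint, _ok) => (),
        NetEvent::Accepted(_endpoint, _listener_id) => (),
        NetEvent::Message(endpoint, data) => {
            // `data` is a &[u8]: the received bytes are borrowed, not
            // copied, and the echo hands the same slice back to the library.
            handler.network().send(endpoint, data);
        }
        NetEvent::Disconnected(_endpoint) => (),
    });
}
```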

## Benchmarks

The benchmarks compare two dimensions:

- **Vertical**: among different transports, comparing how they perform relative to one another.
- **Horizontal**: between a native usage and the same usage through message-io. Native usage is represented here as the most basic usage of a transport, in a blocking way between two sockets, with zero overhead (see the sketch after this list). When checking the results, take into account that message-io manages a pool of non-blocking sockets for you, waking them from an OS poll, and this handling adds some delay. However, most network applications need to build some kind of socket management on top of the native usage anyway, so in practice not using message-io also implies the overhead of implementing that management layer yourself.
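
For reference, a "native usage" baseline in the sense above might look like this minimal sketch: a plain blocking transfer between two `std::net` sockets, with no management layer on top (the address and payload are arbitrary):

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Minimal blocking "native usage": one sender, one receiver, no socket
// management layer at all.
fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:4000")?;

    let sender = thread::spawn(|| -> std::io::Result<()> {
        let mut stream = TcpStream::connect("127.0.0.1:4000")?;
        stream.write_all(b"hello") // send a small payload and finish
    });

    let (mut conn, _addr) = listener.accept()?;
    let mut buf = [0u8; 5];
    conn.read_exact(&mut buf)?; // block until the whole payload arrives
    assert_eq!(&buf, b"hello");

    sender.join().unwrap()
}
```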

Note that the network benchmark results can vary slightly among different runs, since they depend considerably on the OS load at the moment of execution.

### Throughput benchmarks

Throughput is the amount of data transferred per second, measured in bytes per second. The measurement starts when the sender sends the first byte and ends when the receiver receives the last byte.
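
In other words, throughput here is simply bytes transferred divided by elapsed time. A hypothetical helper for the computation:

```rust
use std::time::Duration;

// Hypothetical helper: throughput in GB/s, given the total bytes transferred
// and the time elapsed between the first byte sent and the last byte received.
fn throughput_gb_per_s(total_bytes: u64, elapsed: Duration) -> f64 {
    (total_bytes as f64 / elapsed.as_secs_f64()) / 1e9
}

// Example: 1 GB in 0.17 s -> ~5.9 GB/s, in the range of the table below.
```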

The following results were measured transferring 1 GB of data over localhost:

| Transport  | native   | message-io | efficiency |
|------------|----------|------------|------------|
| UDP        | 7.1 GB/s | 5.9 GB/s   | ~83%       |
| TCP        | 6.4 GB/s | 5.2 GB/s   | ~81%       |
| Framed TCP | 5.5 GB/s | 5.0 GB/s   | ~91%       |
| Web Socket | 590 MB/s | 560 MB/s   | ~95%       |

Take into account that the throughput is measured over localhost. On localhost the computer does little more than a data copy, so the measurement is not hidden by network latency. This means that the time difference between the native usage and message-io measures exactly the overhead that message-io adds over that native usage. Here, efficiency is the message-io throughput as a fraction of the native one; for example, 5.9 / 7.1 ≈ 83% for UDP. Even in these conditions, which are unfavorable for message-io, it maintains an efficiency of 80%-95%; in practice, in a real network environment with some latency, the efficiency is higher.

As mentioned above, the native usage is a raw, basic usage of the sockets. Any real application will add some kind of management over them, which also adds overhead.

If you want to test it yourself, see the throughput example.

### Latency benchmarks

Latency can be measured as the time one byte takes to be transferred. The measurement starts when the sender sends the byte and ends when the receiver receives it.
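
As a sketch, the native UDP measurement could look like this (addresses are illustrative; a real benchmark would repeat the exchange many times and average):

```rust
use std::net::UdpSocket;
use std::time::Instant;

// Minimal one-byte latency measurement between two localhost sockets.
fn main() -> std::io::Result<()> {
    let receiver = UdpSocket::bind("127.0.0.1:5000")?;
    let sender = UdpSocket::bind("127.0.0.1:5001")?;

    let start = Instant::now(); // the sender sends a byte...
    sender.send_to(&[0x42], "127.0.0.1:5000")?;

    let mut buf = [0u8; 1];
    receiver.recv_from(&mut buf)?; // ...and the clock stops on reception
    println!("1-byte latency: {:?}", start.elapsed());
    Ok(())
}
```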

The following results were measured transferring 1 byte over localhost:

| Transport  | native | message-io | overhead  |
|------------|--------|------------|-----------|
| UDP        | 1.2 µs | 2.1 µs     | + ~0.9 µs |
| TCP        | 2.6 µs | 3.5 µs     | + ~0.9 µs |
| Framed TCP | 5.2 µs | 6.6 µs     | + ~1.4 µs |
| Web Socket | 9.1 µs | 10.1 µs    | + ~1.0 µs |

Depending on the transport used, message-io adds around 1 µs of overhead per transmitted chunk of data. Because reading and writing messages is zero-copy, this overhead is constant and independent of the size of that chunk: the library only copies the pointer to the data. For example, sending and receiving a message of 1 byte or of 64 KB over UDP (if the MTU supports that size) adds the same ~0.9 µs of overhead to the entire transmission.
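
To make the point concrete, here is an illustrative sketch (the handler and endpoint would come from the node setup shown earlier; exact signatures may vary by library version):

```rust
use message_io::network::Endpoint;
use message_io::node::NodeHandler;

// Both sends below pay the same library overhead: message-io only forwards
// the (pointer, length) pair of the &[u8] slice, never the payload itself.
fn send_small_and_large(handler: &NodeHandler<()>, endpoint: Endpoint) {
    let small = [0u8; 1];             // 1-byte message
    let large = vec![0u8; 64 * 1024]; // 64 KB message (if the MTU allows it, for UDP)

    handler.network().send(endpoint, &small);
    handler.network().send(endpoint, &large);
}
```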

Note that TCP-based protocols could internally split your message into several chunks of data.

If you want to test it yourself, execute `cargo bench`.