Key/Value Embedding in Main Dictionary #394

hpatro · 2024-04-26T20:43:57Z

Ref: redis/redis#12498
I would like to restart the conversation in Valkey around achieving better memory efficiency for the main dictionary. @vitarb and I had proposed a change in Redis project which encapsulate(s) the key and value within the dictEntry. Please find the details below.

The problem/use-case that the feature addresses

Redis main dictionary is a hash table with separate chaining approach on collision. The entry in the dictionary looks like the following

struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next; 
};

Each member here is a pointer consuming upto 24 bytes of data which is a high overhead for actual data of small size. For main dictionary, the observation we have that is key is always an sds string and value is a redisObject that holds a pointer to an actual value or contains embedded sds value in case if it's short (44 bytes or less). The overhead caused by the entry can be significantly reduced by changing the struct layout and will lead to more key/value storage in the same amout of memory.

Description of the feature

Introduce a new entry structure which will embedded both the key (sds) and value (redisObject wrapper).

struct embeddedDictEntry {
    struct redisObject robj;
    struct dictEntry *next;
    unsigned char data[];
};

Key points

Use data array concept, similar to Redis RAX design to store the key
Promote the Redis object to be the dictEntry (saves a pointer).
Save upto 15 bytes per entry, allows packing 8-45% more key/value pair in the same amount of space based on scenario (jemalloc bucketing).
Able to achieve the above memory savings by maintaining the same latency/throughput compared to unstable branch.
Uses the existing hashtable design with chaining based approach for collisions.

Benchmarking

No increase/decrease in latency/throughput was observed with these changes.
Memory optimization: Maximum no. of key/value pair stored with maxmemory set to 100 MB

Server configuration

src/redis-server --daemonize yes --save "" --maxmemory 104857600

Benchmark configuration

src/redis-benchmark -t set -n 100000000 -r 1000000000 -d <value-size>

Results

Key Size (bytes)	Value Size (bytes)	actual memory used for embeddedDictEntry	je_malloc_bin_used for embedded	No. of keys (unstable)	No. of keys unstable (human)	No. of keys (embedded)	No. of keys (embedded) (human)	% gain (additional key/value stored)
16	3	53	56	1078257	1.07 M	1536113	1.53 M	42.4
16	32	81	96	907804	907 K	983446	983 K	8.3
16	40	89	96	842961	842 K	983446	983 K	16.6
16	44	93	96	842961	842 K	983446	983 K	16.6
16	64	45	48	626508	626 K	737181	737 K	17.6

Drawbacks

New dict entry proposed is tightly coupled with robj.
robj is duplicated on insertion to the dictionary (avoiding the robj creation in input buffer parsing could avoid this)

Additional information

Previous discussion(s):

The text was updated successfully, but these errors were encountered:

lipzhu · 2024-04-27T02:33:23Z

I am also working on the evaluation for short keys from performance perspective, the idea is simple, just change the layout of dictEntry when zmalloc dictEntry to benefit from cache line. In my local POC implementation, I can observe 4% QPS gain because of the reduced memory latency of dictSdsKeyCompare.

taskset -c 0-3 ~/valkey/src/valkey-server /tmp/valkey.conf

port 9001
bind * -::*
daemonize no
protected-mode no
save ""

taskset -c 6-9 ~/valkey/src/valkey-benchmark -p 9001 -t set -d 100 -r 10000000 -n 10000000 -c 50 --threads 4 -P 20

zuiderkwast · 2024-04-28T03:27:56Z

I like the idea but I have a solution that avoids the drawbacks. It's very much like @madolson's idea in "Rethinking the main dict" idea.

Replace 'dict' with a better hash table for the keyspace. I described it in Re-thinking the main hash table #169. No more dict entry.
Embed key and value in robj as data array, like you do here.

hpatro · 2024-04-29T23:14:15Z

I am also working on the evaluation for short keys from performance perspective, the idea is simple, just change the layout of dictEntry when zmalloc dictEntry to benefit from cache line. In my local POC implementation, I can observe 4% QPS gain because of the reduced memory latency of dictSdsKeyCompare.

So the PR redis/redis#12498 raised in Redis project has two commits

Key embedding
Key + Value embedding

@lipzhu This is exactly what we have done as part of key embedding.

hpatro · 2024-04-29T23:37:04Z

I like the idea but I have a solution that avoids the drawbacks. It's very much like @madolson's idea in "Rethinking the main dict" idea.

Replace 'dict' with a better hash table for the keyspace. I described it in Re-thinking the main hash table #169. No more dict entry.

Embed key and value in robj as data array, like you do here.

@madolson and I were discussing internally to explore open addressing further with embedding to get rid of the next pointers.
@zuiderkwast The part which we were concerned about was how to support SCAN command appropriately. Could you share some more thoughts on "scanning block-by-block until a block with the "ever full" bit zero"?

Key point I feel about this problem space is not in the embedding front rather how do we handle all the layers in the code primarily affected by embedding. redis/redis#12498 (comment) I would like to rehighlight the Lifecycle/Ownership part here.

Due to some of the unique command(s) and behavior of Redis, we need to introduce properties around the lifecycle of the key/value struct embedded together. As the value could be passed around, it should have both the properties of refcount as well as marker flag to indicate reference from the dictionary to avoid destruction of the object. Some of the scenarios are:
Cleanup of the client argument during end of command processing shouldn't cleanup the data which is also referenced by the embedded key/value structure.
Commands like MSET, INCR could operate on the existing embedded structure and should be safe to do it. Otherwise, we might have to pay a penalty of reallocating memory for a bunch of these operations.
Redis Module API like RM_StringDMA can directly manipulate the value stored in the dictionary.

zuiderkwast · 2024-04-30T00:52:34Z

Regarding the hash table, I replied in the other issue.

For embedding, I think it would be very good if we can make robj opaque. Then we can completely reorganize it without the risk of breaking things.

lipzhu · 2024-05-07T03:33:35Z

So the PR redis/redis#12498 raised in Redis project has two commits

Key embedding

Key + Value embedding

@lipzhu This is exactly what we have done as part of key embedding.

@hpatro I took a look at the redis/redis#12498, my optimization is a little different from yours, I only embedded sds with dictEntry together when they can be put into a cache line. The optimization will not break existing structure, only want to benefit from CPU cache hit rate to reduce the memory latency. From the profiling data I can observe the cycles ratio of dictSdsKeyCompare decreased as expected.

The pseudocode is like below, I can send a normalized patch for you to review if possible.

#define DEFAULT_CACHE_LINE_SIZE 64
static inline dictEntry *createEntryEmbKey(void *key, dictEntry *next) {
    size_t keylen = sdslen((sds)key);
    size_t total_size = keylen + sizeof(dictEntry) + sizeof(struct sdshdr8) + 1;
    /* Embed key with dictEntry when can be filled into a cache line. */
    if (total_size <= DEFAULT_CACHE_LINE_SIZE) {
        dictEntry *entry = zmalloc(total_size);
        struct sdshdr8 *sh = (void *)(entry + 1);
        sh->len = keylen;
        sh->alloc = keylen;
        sh->flags = SDS_TYPE_8;
        memcpy(sh->buf, key, keylen);
        sh->buf[keylen] = '\0';
        entry->key = sh->buf;
        entry->next = next;
        return (dictEntry *)(void *)((uintptr_t)(void *)entry | ENTRY_PTR_EMB_KEY);
    }
    dictEntry *entry = zmalloc(sizeof(*entry));
    entry->key = key;
    entry->next = next;
    return entry;
}

hpatro changed the title ~~[NEW]~~ Key/Value Embedding in Redis Main Dictionary Apr 26, 2024

hpatro changed the title ~~Key/Value Embedding in Redis Main Dictionary~~ Key/Value Embedding in Main Dictionary Apr 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Key/Value Embedding in Main Dictionary #394

Key/Value Embedding in Main Dictionary #394

hpatro commented Apr 26, 2024

lipzhu commented Apr 27, 2024 •

edited

zuiderkwast commented Apr 28, 2024 •

edited

hpatro commented Apr 29, 2024

hpatro commented Apr 29, 2024

zuiderkwast commented Apr 30, 2024

lipzhu commented May 7, 2024 •

edited

Key/Value Embedding in Main Dictionary #394

Key/Value Embedding in Main Dictionary #394

Comments

hpatro commented Apr 26, 2024

Key points

Benchmarking

lipzhu commented Apr 27, 2024 • edited

zuiderkwast commented Apr 28, 2024 • edited

hpatro commented Apr 29, 2024

hpatro commented Apr 29, 2024

zuiderkwast commented Apr 30, 2024

lipzhu commented May 7, 2024 • edited

lipzhu commented Apr 27, 2024 •

edited

zuiderkwast commented Apr 28, 2024 •

edited

lipzhu commented May 7, 2024 •

edited