New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Key/Value Embedding in Main Dictionary #394
Comments
I am also working on the evaluation for short keys from performance perspective, the idea is simple, just change the layout of dictEntry when zmalloc dictEntry to benefit from cache line. In my local POC implementation, I can observe 4% QPS gain because of the reduced memory latency of
|
I like the idea but I have a solution that avoids the drawbacks. It's very much like @madolson's idea in "Rethinking the main dict" idea.
|
So the PR redis/redis#12498 raised in Redis project has two commits
@lipzhu This is exactly what we have done as part of key embedding. |
@madolson and I were discussing internally to explore open addressing further with embedding to get rid of the next pointers. Key point I feel about this problem space is not in the embedding front rather how do we handle all the layers in the code primarily affected by embedding. redis/redis#12498 (comment) I would like to rehighlight the Lifecycle/Ownership part here.
|
Regarding the hash table, I replied in the other issue. For embedding, I think it would be very good if we can make |
@hpatro I took a look at the redis/redis#12498, my optimization is a little different from yours, I only embedded sds with dictEntry together when they can be put into a cache line. The optimization will not break existing structure, only want to benefit from CPU cache hit rate to reduce the memory latency. From the profiling data I can observe the cycles ratio of The pseudocode is like below, I can send a normalized patch for you to review if possible. #define DEFAULT_CACHE_LINE_SIZE 64
static inline dictEntry *createEntryEmbKey(void *key, dictEntry *next) {
size_t keylen = sdslen((sds)key);
size_t total_size = keylen + sizeof(dictEntry) + sizeof(struct sdshdr8) + 1;
/* Embed key with dictEntry when can be filled into a cache line. */
if (total_size <= DEFAULT_CACHE_LINE_SIZE) {
dictEntry *entry = zmalloc(total_size);
struct sdshdr8 *sh = (void *)(entry + 1);
sh->len = keylen;
sh->alloc = keylen;
sh->flags = SDS_TYPE_8;
memcpy(sh->buf, key, keylen);
sh->buf[keylen] = '\0';
entry->key = sh->buf;
entry->next = next;
return (dictEntry *)(void *)((uintptr_t)(void *)entry | ENTRY_PTR_EMB_KEY);
}
dictEntry *entry = zmalloc(sizeof(*entry));
entry->key = key;
entry->next = next;
return entry;
} |
Ref: redis/redis#12498
I would like to restart the conversation in Valkey around achieving better memory efficiency for the main dictionary. @vitarb and I had proposed a change in Redis project which encapsulate(s) the key and value within the dictEntry. Please find the details below.
The problem/use-case that the feature addresses
Redis main dictionary is a hash table with separate chaining approach on collision. The entry in the dictionary looks like the following
Each member here is a pointer consuming upto 24 bytes of data which is a high overhead for actual data of small size. For main dictionary, the observation we have that is key is always an sds string and value is a redisObject that holds a pointer to an actual value or contains embedded sds value in case if it's short (44 bytes or less). The overhead caused by the entry can be significantly reduced by changing the struct layout and will lead to more key/value storage in the same amout of memory.
Description of the feature
Introduce a new entry structure which will embedded both the key (sds) and value (redisObject wrapper).
Key points
Benchmarking
No increase/decrease in latency/throughput was observed with these changes.
Memory optimization: Maximum no. of key/value pair stored with maxmemory set to 100 MB
Server configuration
Benchmark configuration
Results
Drawbacks
robj
is duplicated on insertion to the dictionary (avoiding the robj creation in input buffer parsing could avoid this)Additional information
Previous discussion(s):
The text was updated successfully, but these errors were encountered: