Avoid copying between processes #3

Open
zuiderkwast opened this issue Jun 9, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@zuiderkwast
Collaborator

Minimize the copying of data between processes. Do as much as possible in the calling process.

@drmull
Collaborator

drmull commented Jun 10, 2022

One idea is to make a separate benchmark repo where we test ered vs eredis_cluster and maybe include other Erlang Redis Cluster clients.

As we have discussed before, it would be interesting to try out a more optimized way of doing things. One thing we could do is remove the ered and ered_client processes and instead use atomics for the slot map lookup, persistent term for the connection lookup, and the counters module for keeping track of the queue size.

slot -> connection index (atomics)
connection index -> queue size (counters)
connection index -> connection pid (persistent term, local pid fits in a word so no global GC to update)

The connection module would have to handle reconnect and status reporting to the cluster module. The queue would be the connection send process message queue.
Avoid gen_server:call since setting up the link is expensive, rely on a timeout instead.

Not sure if it will work, there might be a catch, but if it works I think it would be quite efficient.
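
A minimal sketch of the first two lookups above, assuming one `atomics` array indexed by slot and one `counters` array indexed by connection. The module name, function names, and the convention that connection index 0 means "unassigned" are hypothetical, made up for illustration; nothing here is taken from ered itself.

```erlang
-module(slot_lookup_sketch).
-export([new/1, set_slot/3, lookup/2, enqueue/2, dequeue/2, queue_size/2]).

%% Redis Cluster has 16384 hash slots, numbered 0..16383.
-define(SLOTS, 16384).

%% Create the shared tables. NumConns is the number of connections.
%% Both references can be handed to any process and used without copying.
new(NumConns) ->
    SlotMap = atomics:new(?SLOTS, [{signed, false}]),
    QueueSizes = counters:new(NumConns, [atomics]),
    {SlotMap, QueueSizes}.

%% atomics indices are 1-based, so slot N is stored at index N + 1.
%% ConnIx must be >= 1; 0 is left to mean "no connection assigned"
%% (assumption of this sketch, atomics are zero-initialized).
set_slot({SlotMap, _}, Slot, ConnIx) when ConnIx >= 1 ->
    atomics:put(SlotMap, Slot + 1, ConnIx).

%% Runs in the calling process: slot -> connection index.
lookup({SlotMap, _}, Slot) ->
    atomics:get(SlotMap, Slot + 1).

%% connection index -> queue size, kept in a counters array.
enqueue({_, QueueSizes}, ConnIx) ->
    counters:add(QueueSizes, ConnIx, 1).

dequeue({_, QueueSizes}, ConnIx) ->
    counters:sub(QueueSizes, ConnIx, 1).

queue_size({_, QueueSizes}, ConnIx) ->
    counters:get(QueueSizes, ConnIx).
```

With this layout the caller can resolve a slot and check the queue size without sending a message to any process; only the final send to the connection process copies data.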

@zuiderkwast
Collaborator Author

Benchmarking is a good idea. We should include ecredis in the comparison.

Atomics and counters are probably good, but I'm not sure about persistent term. It's true that replacing a pid doesn't trigger a global GC, but it still rewrites the whole persistent term table, which may contain stuff out of control of this lib. Perhaps an ETS table is an acceptable choice for connection index -> pid lookup?
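
A rough sketch of the ETS alternative, assuming a public `set` table with `read_concurrency` enabled so that any caller process can read it directly. The module and function names are hypothetical and not part of ered.

```erlang
-module(conn_table_sketch).
-export([new/0, put_conn/3, get_conn/2]).

%% Create an unnamed table owned by the calling process.
%% It is deleted automatically when that owner process exits.
new() ->
    ets:new(conn_table_sketch, [set, public, {read_concurrency, true}]).

%% Store or replace the pid for a connection index.
%% Only this one row is rewritten; nothing outside the table is touched.
put_conn(Tab, ConnIx, Pid) when is_pid(Pid) ->
    true = ets:insert(Tab, {ConnIx, Pid}),
    ok.

%% Runs in the caller's process: connection index -> pid.
get_conn(Tab, ConnIx) ->
    case ets:lookup(Tab, ConnIx) of
        [{_, Pid}] -> {ok, Pid};
        [] -> error
    end.
```

Unlike a persistent term, updating one entry here only affects this table, at the cost of copying the looked-up tuple into the caller's heap on each read.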

> Avoid gen_server:call since setting up the link is expensive, rely on a timeout instead.

You mean gen_server:call's monitor is expensive? With timeout you mean we use cast + receive after?

@drmull
Collaborator

drmull commented Jun 11, 2022

> but it still rewrites the whole persistent term table, which may contain stuff out of control of this lib.

Yes you are right, I did not realize the persistent term table was global. Better not go that way.

> You mean gen_server:call's monitor is expensive?

Yes, I meant monitor. I remember it showed up when I did some profiling, and bang/cast + receive performed better. It is hackish but might be worth it if we are going all in for speed. At least we could profile it and see if it makes any difference.
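
For reference, the bang + `receive ... after` pattern could look roughly like the sketch below. The message shapes (`{request, From, Ref, Request}` and `{reply, Ref, Reply}`), the module name, and the echo server are assumptions made for illustration, not ered's protocol.

```erlang
-module(call_sketch).
-export([call/3, start_echo/0]).

%% Send a request without setting up a monitor and wait for a reply
%% tagged with a unique reference. A dead or slow server is only
%% noticed when the timeout fires.
call(Pid, Request, Timeout) ->
    Ref = make_ref(),
    Pid ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Reply} -> {ok, Reply}
    after Timeout ->
        %% A reply arriving after this point stays in the caller's
        %% mailbox; this sketch does not clean it up.
        {error, timeout}
    end.

%% Toy server that echoes each request back, just to exercise call/3.
start_echo() ->
    spawn(fun echo_loop/0).

echo_loop() ->
    receive
        {request, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_loop()
    end.
```

The trade-off compared to `gen_server:call` is that a crashed server is not detected immediately, and late replies can accumulate in the caller's mailbox unless they are flushed somewhere.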

@ghost

ghost commented Jul 26, 2022

> Perhaps an ETS table is an acceptable choice for connection index -> pid lookup?

A process dictionary might also be an option; it has the same lifetime as an ETS table (it dies with the owner process). For simple key lookups only, it should be faster than ETS, I guess.

@zuiderkwast
Collaborator Author

Ideally, the lookup should happen in the caller's (user's) process before the first gen_server call. We don't want to pollute the process dictionary of the user's process.

@zuiderkwast zuiderkwast added the enhancement New feature or request label Nov 13, 2023