Support Hash Algorithms Benchmark #18539

Open: 4 commits to be merged into base branch main

zhezhidashi (Contributor) commented Mar 6, 2024

What changes are proposed in this pull request?

Adds a benchmarking tool for comparing the available hash policies.

Why are the changes needed?

This test will measure:

  1. Time Cost: the time taken to assign every file to a worker once, as a measure of the algorithm's efficiency.
  2. Standard Deviation: the standard deviation of the number of files assigned to each worker, as a measure of how uniformly the algorithm distributes files.
  3. File Reallocated: after one randomly chosen worker is removed, the files are assigned again and the benchmark counts how many files are now assigned to a different worker. The fewer files that move, the better the consistency of the algorithm.
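
As a rough illustration of the three measurements above, here is a minimal, self-contained Java sketch. It is not the code in this PR: the class name `HashBenchSketch`, the helper names, and the modulo-based stand-in policy are all hypothetical, and the use of the population (rather than sample) standard deviation is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of the three measurements; not the PR's actual code. */
public class HashBenchSketch {

  /** Population standard deviation of the per-worker file counts (assumed variant). */
  public static double stdDev(int[] filesPerWorker) {
    double mean = 0;
    for (int c : filesPerWorker) {
      mean += c;
    }
    mean /= filesPerWorker.length;
    double sumSq = 0;
    for (int c : filesPerWorker) {
      sumSq += (c - mean) * (c - mean);
    }
    return Math.sqrt(sumSq / filesPerWorker.length);
  }

  /** Number of files whose assigned worker differs between two assignments. */
  public static int filesReallocated(String[] before, String[] after) {
    int moved = 0;
    for (int i = 0; i < before.length; i++) {
      if (!before[i].equals(after[i])) {
        moved++;
      }
    }
    return moved;
  }

  /** Stand-in policy (assumption): plain modulo hashing over the worker list. */
  public static String moduloPolicy(String file, List<String> workers) {
    return workers.get(Math.floorMod(file.hashCode(), workers.size()));
  }

  /** Assigns files "file0".."file(n-1)" to workers using the stand-in policy. */
  public static String[] assign(int fileNum, List<String> workers) {
    String[] result = new String[fileNum];
    for (int i = 0; i < fileNum; i++) {
      result[i] = moduloPolicy("file" + i, workers);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> workers = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      workers.add("worker" + i);
    }
    int fileNum = 100_000;

    // 1. Time cost: time to assign every file once.
    long start = System.nanoTime();
    String[] before = assign(fileNum, workers);
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;

    // 2. Standard deviation of the number of files per worker.
    int[] counts = new int[workers.size()];
    for (String w : before) {
      counts[workers.indexOf(w)]++;
    }

    // 3. Files reallocated after removing one randomly chosen worker.
    workers.remove(new Random(0).nextInt(workers.size()));
    String[] after = assign(fileNum, workers);

    System.out.println("time cost (ms): " + elapsedMs);
    System.out.println("standard deviation: " + stdDev(counts));
    System.out.println("files reallocated: " + filesReallocated(before, after));
  }
}
```

Plain modulo hashing is used here only because it is short; it is expected to move a large share of the files when a worker disappears, which is the kind of behaviour the third metric is meant to expose.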

Does this PR introduce any user facing changes?

You can run bin/alluxio exec class alluxio.stress.cli.client.StressClientHashBench directly without adding any parameters.

You can also pass parameters to the benchmark, for example: bin/alluxio exec class alluxio.stress.cli.client.StressClientHashBench -- --hash-policy CONSISTENT,JUMP,KETAMA,MAGLEV,MULTI_PROBE --virtual-node-num 10000 --worker-num 10 --node-replicas 1000 --lookup-size 65537 --probe-num 21 --report-path . --file-num 1000000.

This runs the hashing tests with the following settings:

  • 5 hash algorithms are tested: CONSISTENT, JUMP, KETAMA, MAGLEV, MULTI_PROBE (--hash-policy);
  • 10000 virtual nodes are used (--virtual-node-num);
  • 10 workers are used (--worker-num);
  • 1000 worker replicas are used (--node-replicas);
  • the size of the lookup table is 65537, which must be a prime (--lookup-size);
  • the number of probes is 21 (--probe-num);
  • the report is generated under the current path (--report-path);
  • the number of simulated test files is 1,000,000 (--file-num).
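
Because the lookup table size must be a prime, it can be handy to check a candidate value before passing it to --lookup-size. The following is a small hypothetical helper, not part of this PR; the class name `LookupSizeCheck` is an assumption, and whether the benchmark itself validates the value is not stated here.

```java
/** Hypothetical helper for checking a candidate --lookup-size value. */
public class LookupSizeCheck {

  /** Trial-division primality test; adequate for table sizes of this magnitude. */
  public static boolean isPrime(int n) {
    if (n < 2) {
      return false;
    }
    if (n % 2 == 0) {
      return n == 2;
    }
    // Use long for the divisor so that d * d cannot overflow for large n.
    for (long d = 3; d * d <= n; d += 2) {
      if (n % d == 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    int candidate = args.length > 0 ? Integer.parseInt(args[0]) : 65537;
    System.out.println(candidate + (isPrime(candidate) ? " is prime" : " is not prime"));
  }
}
```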
