Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TIKV as a metadata storage method #18566

Open
wants to merge 1 commit into
base: master-2.x
Choose a base branch
from

Conversation

wangbin83-gmail-com
Copy link

What changes are proposed in this pull request?

Add Tikv as a new metadata storage method.

Why are the changes needed?

1.To cope with the huge daily growth of data.

@jasondrogba
Copy link
Contributor

Thank you for your contribution.
Can you add some illustrative information about the design? Have you done any actual testing?

key[i] = (byte) (n & 0xffL);
n >>= Byte.SIZE;
}
//System.arraycopy(strBytes, 0, key, Longs.BYTES, strBytes.length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed there are some comments in the code that haven't been removed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

byte[] strBytes = str.getBytes();

byte[] key = new byte[Longs.BYTES + strBytes.length];
System.arraycopy(strBytes, 0, key, 0, strBytes.length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why System.arraycopy is placed before the for loop, could you please explain?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Str is obtained by concatenating the prefix and inode, and the concatenated part of str is operated separately to obtain the key, which can minimize the distribution hash of the key as much as possible

* @param n a long value
* @param str1 a string value
* @param str2 a string value
* @return a byte array formed by writing the bytes of n followed by the bytes of str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this description correct? You repeated in the above description.

     * @return a byte array formed by writing the bytes of n followed by the bytes of str
     */
    public static byte[] toByteArray(String str, long n) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be an error. Different implementations of toByteArray() are used to handle different keys, but their effects are consistent.

valid.set(false);
throw new RuntimeException(exc);
} finally {
//tikvIterator.next();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

* It seeks given iterator to first entry before returning the iterator.
*
* @param tikvIterator the rocks iterator
* @param parser parser to produce iterated values from rocks key-value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from rocks key-value?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this is indeed an error and has been corrected.

* @param str a String value
* @param long1 a long value
* @param long2 a long value
* @return a byte array formed by writing the two long values as bytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the description is slightly different from the content of the method.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unified description of the function of toByteArray().

* Used to wrap an {@link CloseableIterator} over {@link ListIterator<Kvrpcpb.KvPair>}.
* It seeks given iterator to first entry before returning the iterator.
*
* @param tikvIterator the rocks iterator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rocks iterator?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this is indeed an error and has been corrected.

Copy link
Contributor

@jasondrogba jasondrogba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution, TiKV is a good try. But I hope you can modify the code and clearly describe the design. It is best to add some tests.

@alluxio-bot
Copy link
Contributor

Automated checks report:

  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word of title ("Adding") is not an imperative verb. Please use one of the valid words
  • Commits associated with Github account: PASS

Some checks failed. Please fix the reported issues and reply
alluxio-bot, check this please
to re-run checks.

@wangbin83-gmail-com wangbin83-gmail-com changed the title Adding TIKV as a metadata storage method Add TIKV as a metadata storage method Apr 18, 2024
@alluxio-bot
Copy link
Contributor

Automated checks report:

  • PR title follows the conventions: PASS
  • Commits associated with Github account: PASS

All checks passed!

@wangbin83-gmail-com
Copy link
Author

Thank you for your contribution. Can you add some illustrative information about the design? Have you done any actual testing?

Thank you very much for raising questions about the code, which can help me better examine and improve the quality of code writing.

We believe that using Racks as a metastore will be a bottleneck in future large-scale application scenarios when using Alluxio. Therefore, we consider using distributed KV storage as a new metastore that can rival racks while achieving higher scalability. At present, our testing metadata has reached a scale of billions and will continue to increase in the future.
We are currently only storing file meta and block meta in TIKV, hoping to receive support and assistance from the community to better improve these features.

The entire PR is only a part of the complete functionality. I will submit a total of 8 PRs, which is the first one. In the final PR, I will modify the POM files at all levels and officially integrate them into Alluxio.

@wangbin83-gmail-com
Copy link
Author

Thanks for your contribution, TiKV is a good try. But I hope you can modify the code and clearly describe the design. It is best to add some tests.

This PR is only a small part of the entire feature, and we have split the entire feature for submission. At present, our functional testing has come to an end, and the data scale can reach billions.We only store file information and block information on TikV.

Copy link
Author

@wangbin83-gmail-com wangbin83-gmail-com left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All modifications have been completed. Please review them again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants