v0.5 - Add the "levels of hashing" document in here as a .md? #49

Open
cmuratori opened this issue May 18, 2019 · 12 comments

Comments

@cmuratori
Owner

This one is all you @NoHatCoder :) Not sure what you have planned for that, but it seems like we should check in a copy of your blog post at some point when it is ready, just so people can understand what is going on here with hash levels. Maybe we make a "doc" directory? I can then link to it from the main readme.md (assuming GitHub actually has a way to do that... I will check).

- Casey

@NoHatCoder

I had planned to post it on my blog, but I need to incorporate your feedback first. I think we should just link to it, especially since I might need to update it depending on feedback; we wouldn't want an outdated version in this repository.

@cmuratori
Owner Author

OK, I will add a hyperlink from the readme.md once you've got it posted...

- Casey

@NoHatCoder

First public draft: http://nohatcoder.dk/2019-05-19-1.html, comments are very welcome.

@cmuratori
Owner Author

Some text clean up possibilities for your perusal. From level 3, I removed the phrase "nor have a chance of recovering the seed greater than the proportion of work put to the task" because I couldn't understand what that meant beyond what was already in the text. But it should probably still be in there, just rewritten to be clearer about what it means, assuming it meant something additional?

- Casey

Note: This categorization system is a work in progress. Please do not hesitate to contact me if you have suggestions or find flaws in the following text.

Until now hash functions have generally been categorized as either cryptographic or non-cryptographic. However, there are several different hash function capabilities that are useful for different purposes. I here try to categorize the most useful capability sets. A complete description consists of a level, a description of the hash's capabilities, an output size, and possibly a seed size. The output size and seed size describe how hard it is for an attacker to break the capabilities, either through luck or computation.

Level 1 (n bits of output)
A level 1 hash should, given two different inputs, produce a collision with a probability of 1 in 2^n. This must hold true even if the inputs differ only by minor differences, such as inserted, deleted, altered or swapped sections. It may be very easy to intentionally design input that collides. In order to avoid signalling level 2 capabilities, a level 1 hash should not accept a seed, nonce or similar.

Level 2 (n bits of output) (s bits of seed)
A level 2 hash must take a seed as part of the input. In addition to level 1 capabilities, a level 2 hash must be designed so that an attacker, given no knowledge of the seed or any output from the hash, cannot produce a c-way collision with a probability greater than 1 in 2^min(s, n(c-1)).
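
A minimal sketch of the arithmetic in that bound, reading it as 1 in 2^min(s, n(c-1)) (the exponent is an interpretation of the flattened superscript). The function name `collision_bound_exponent` and the example parameters are hypothetical and only illustrate how seed size and output size cap each other.

```python
def collision_bound_exponent(seed_bits: int, output_bits: int, c: int) -> int:
    """Return e such that a blind c-way collision should succeed with
    probability at most 1 in 2**e, where e = min(s, n * (c - 1))."""
    if c < 2:
        raise ValueError("a collision needs at least two inputs (c >= 2)")
    return min(seed_bits, output_bits * (c - 1))


# Example parameters chosen for illustration only.
print(collision_bound_exponent(seed_bits=64, output_bits=128, c=2))   # 64
print(collision_bound_exponent(seed_bits=256, output_bits=128, c=2))  # 128
print(collision_bound_exponent(seed_bits=256, output_bits=128, c=3))  # 256
```

With a short seed the seed size is the limit; with a long seed the output size (times c-1) takes over.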

Level 3 (n bits of output) (s bits of seed)
In addition to level 2 capabilities, a level 3 hash must be designed so that an attacker with limited access to compute hashes for arbitrary input, without knowing the seed, must not with certainty be able to recover the seed, or any derivative thereof that is useful for computing hashes, using fewer computations than the equivalent of computing 2^s hashes. Producing a collision with certainty must be no easier than recovering the seed. Generating a c-way collision with probability 1 in 2^p must not be possible for p < min(s, n(c-1)).

Level 4 (n bits of output)
It must take computation equivalent to 2^(n/2) hash computations to produce a collision. The hash may take a seed as part of the input, but if it does, it must also have all capabilities of a level 3 hash.

Level 5 (n bits of output)
It must take computation equivalent to 2^(n(c-1)/c) hash computations to produce a c-way collision. The hash may take a seed as part of the input, but if it does, it must also have all capabilities of a level 3 hash.
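
To compare the level 4 and level 5 requirements, here is a small sketch that computes the work exponents, reading them as n/2 and n(c-1)/c (an interpretation of the flattened superscripts). The function names are hypothetical. For an ordinary two-way collision (c = 2) the two exponents coincide; level 5 only asks for more once c is 3 or larger.

```python
from fractions import Fraction


def level4_work_exponent(output_bits: int) -> Fraction:
    """Exponent e where a collision should cost about 2**e hash computations."""
    return Fraction(output_bits, 2)


def level5_work_exponent(output_bits: int, c: int) -> Fraction:
    """Exponent e where a c-way collision should cost about 2**e hash computations."""
    if c < 2:
        raise ValueError("a collision needs at least two inputs (c >= 2)")
    return Fraction(output_bits * (c - 1), c)


# Illustrative output size of 256 bits.
print(level4_work_exponent(256))            # 128
print(level5_work_exponent(256, 2))         # 128
print(float(level5_work_exponent(256, 3)))  # roughly 170.67
```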

For all hash levels it should be possible to drop part of the output and maintain all capabilities for the new output size.
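
A minimal sketch of what dropping part of the output looks like in practice. SHA-256 from Python's `hashlib` is used purely as a stand-in hash here, and the helper name `truncated_digest` is hypothetical; the text does not say which functions meet which level.

```python
import hashlib


def truncated_digest(data: bytes, out_bytes: int) -> bytes:
    """Hash with SHA-256, then keep only the first out_bytes bytes."""
    full = hashlib.sha256(data).digest()
    if not 1 <= out_bytes <= len(full):
        raise ValueError("out_bytes must be between 1 and 32")
    return full[:out_bytes]


# A 128-bit output taken from a 256-bit hash.
short = truncated_digest(b"example input", 16)
print(len(short) * 8)  # 128
```

Under the requirement above, the 128-bit result should then be judged as a hash with n = 128.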

Hash functions may claim lower capabilities than their implemented output and seed size suggest. Authors of such functions should generally advertise output and seed sizes according to capabilities.

A level 1 hash is useful for cases where there are absolutely no adversaries. Level 1 is what is usually referred to as a non-cryptographic hash.

Level 2 hashes are useful as internal hash functions for hash maps and similar data structures, where the input might be from an adversarial source. While a practical attack seems unlikely, timing attacks could reveal a small amount of information about hash values, thus technically breaking the premise of an attacker having no knowledge of hash values.
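
A rough sketch of the hash map use case: a bucket index derived from a hash keyed with a secret per-table seed, so that an outside party cannot precompute keys that all land in one bucket. Keyed BLAKE2b from `hashlib` is an assumption chosen only because it ships with Python; it is much heavier than what a real hash map would use, and the class name `SeededBucketHasher` is hypothetical.

```python
import hashlib
import secrets
from typing import Optional


class SeededBucketHasher:
    """Maps byte-string keys to bucket indices using a secret seed."""

    def __init__(self, bucket_count: int, seed: Optional[bytes] = None):
        if bucket_count < 1:
            raise ValueError("bucket_count must be positive")
        self.bucket_count = bucket_count
        # A fresh random 128-bit seed per table unless one is supplied.
        self._seed = seed if seed is not None else secrets.token_bytes(16)

    def bucket(self, key: bytes) -> int:
        # 64-bit keyed digest, reduced to a bucket index.
        digest = hashlib.blake2b(key, digest_size=8, key=self._seed).digest()
        return int.from_bytes(digest, "little") % self.bucket_count


hasher = SeededBucketHasher(bucket_count=1024)
print(hasher.bucket(b"user-supplied key"))
```

The index is stable for the lifetime of one table and changes when the seed changes.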

Level 3 hashes can be used for hash maps, and suffer no theoretical deficiency for this job. They can also be used for generating message authentication codes.
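
A minimal sketch of the message authentication use: the secret seed acts as the key, and the hash of the message under that key is the tag. Keyed BLAKE2b from `hashlib` is an assumed stand-in for a seeded hash, and `make_tag` and `verify_tag` are hypothetical helper names.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute a 128-bit authentication tag for message under key."""
    return hashlib.blake2b(message, digest_size=16, key=key).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"k" * 32  # illustration only; use a random secret in practice
tag = make_tag(key, b"hello")
print(verify_tag(key, b"hello", tag))  # True
print(verify_tag(key, b"hellp", tag))  # False
```

The constant-time comparison matters because an ordinary equality check could leak how many leading bytes of a forged tag were correct.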

Level 4 and level 5 functions can be used for identifying and verifying the integrity of a piece of data, without the need for a secret seed. Level 5 is what is usually called a cryptographic hash.

@NoHatCoder

It is pretty much legalese, to guard against cases of "this broken function technically passes your definition". One could imagine a function where it is in most cases pretty easy to recover the seed, but sometimes it is really hard; that would technically pass the shortened text, but still be practically broken. There is probably a better way to write that.

Also you killed the sups when you copied the text, that certainly doesn't make it any more sensible.

@cmuratori
Owner Author

> Also you killed the sups when you copied the text, that certainly doesn't make it any more sensible.

Well, yes, but I assume you are not literally going to use this version :) It was just faster to edit the sentences in line than to try to quote each one and then rewrite it. I was assuming you would only grab rewordings you actually wanted.

- Casey

@NoHatCoder

I incorporated most of your edits, but I'll keep it British and but-free.

@cmuratori
Owner Author

> consists of a level, which describe

"which describes"?

> it should be possible drop part

"be possible to drop"?

Can I go ahead and link/tweet your post out now, or should I wait?

- Casey

@NoHatCoder

It is public, you are free to spread.

@cmuratori
Owner Author

Spreaded.

One technological suggestion that may be unnecessarily frilly: it may be nice to have some kind of permanent link you can do, like this:

https://nohatcoder.dk/hashcat.html#level3

or something. Know what I mean? So we can do something like "Meow Hash attempts to achieve the maximum possible x64 performance for a level 3 hash" and other people could do the same.

- Casey

@NoHatCoder

Well, I can put some anchors in, no problem: http://nohatcoder.dk/2019-05-19-1.html#level3 . But I don't feel like it is a text that benefits from starting in the middle; the first paragraphs are way easier to read, and provide some dearly needed context for the abomination that is the level 3 description.

@cmuratori
Owner Author

Both the Meow .h file and the readme for the GitHub front page now link to the "levels of hashing" doc and call Meow a Level 3 hash. You should definitely read through them though and correct any incorrect wording, or let me know if more needs to be said.

- Casey
