Half-precision float vector metrics #4122

TheQuantumFractal · 2024-04-26T17:24:12Z

All Submissions:

Contributions should target the dev branch. Did you create your branch from dev?
Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

Does your submission pass tests?
Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
Have you checked your code using cargo clippy --all --all-features command?

Changes to Core Features:

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your core changes, as applicable?
Have you successfully run tests with your changes locally?

I built a SIMD implementation, with tests, for Neon, AVX2, and SSE2 covering the euclidean, manhattan, and dot similarity metrics. Note that float16 SIMD arithmetic is not supported on most ISAs: ARM32/64 processors can handle it, and AVX512 has recently announced some hardware support, but most AVX2 and SSE2 machines do not support it. F16C is an x86 instruction set extension, available on most modern x86 machines, that converts between half- and single-precision floating-point formats. So to run the metrics on AVX2 or SSE2, f16 vectors must first be converted to f32 and then processed with f32 SIMD, which is what my implementations do. I also wrote a separate C / assembly file that enables Neon f16 SIMD operations, since Rust does not currently support ARM f16 SIMD operations.
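To illustrate the convert-then-compute approach described above, here is a minimal sketch of a dot product over f16 data on x86_64. It is not the code from this PR: the function names are hypothetical, f16 values are represented as raw `u16` bit patterns instead of the project's `VectorElementTypeHalf`, and the scalar conversion is a hand-written fallback added only to keep the example self-contained.

```rust
/// Convert IEEE 754 binary16 bits to f32 (scalar fallback, hypothetical helper).
fn f16_bits_to_f32(h: u16) -> f32 {
    let exp = u32::from((h >> 10) & 0x1f);
    let man = u32::from(h & 0x03ff);
    let magnitude = match exp {
        // Zero and subnormals: value = mantissa * 2^-24.
        0 => man as f32 / 16_777_216.0,
        // All-ones exponent: infinity or NaN.
        0x1f => if man == 0 { f32::INFINITY } else { f32::NAN },
        // Normal numbers: rebias the exponent (15 -> 127) and widen the mantissa.
        _ => f32::from_bits(((exp + 112) << 23) | (man << 13)),
    };
    if h & 0x8000 != 0 { -magnitude } else { magnitude }
}

/// Plain scalar dot product over f16 bit patterns.
fn dot_f16_scalar(a: &[u16], b: &[u16]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| f16_bits_to_f32(x) * f16_bits_to_f32(y))
        .sum()
}

/// AVX + FMA + F16C kernel: widen 8 halves to 8 floats, then use f32 SIMD.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
#[target_feature(enable = "fma")]
#[target_feature(enable = "f16c")]
unsafe fn dot_f16_avx(a: &[u16], b: &[u16]) -> f32 {
    use std::arch::x86_64::*;
    let n = a.len().min(b.len());
    let mut acc = _mm256_setzero_ps();
    let mut i = 0;
    while i + 8 <= n {
        // Load 8 x f16 (128 bits) from each input.
        let ha = _mm_loadu_si128(a.as_ptr().add(i) as *const __m128i);
        let hb = _mm_loadu_si128(b.as_ptr().add(i) as *const __m128i);
        // F16C conversion to f32, then a fused multiply-add in f32.
        acc = _mm256_fmadd_ps(_mm256_cvtph_ps(ha), _mm256_cvtph_ps(hb), acc);
        i += 8;
    }
    let mut lanes = [0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut sum: f32 = lanes.iter().sum();
    // Scalar tail for the remaining elements.
    while i < n {
        sum += f16_bits_to_f32(a[i]) * f16_bits_to_f32(b[i]);
        i += 1;
    }
    sum
}

/// Safe entry point: use the SIMD kernel only when the CPU reports support.
pub fn dot_f16(a: &[u16], b: &[u16]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx")
            && is_x86_feature_detected!("fma")
            && is_x86_feature_detected!("f16c")
        {
            // SAFETY: the required CPU features were checked just above.
            return unsafe { dot_f16_avx(a, b) };
        }
    }
    dot_f16_scalar(a, b)
}
```

On targets other than x86_64 the sketch always takes the scalar path; the Neon route discussed in this PR goes through C and is not covered here.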

The AVX2 / SSE2 SIMD was tested on an Intel(R) Xeon(R) CPU, while the Neon SIMD was tested on an Apple M1 Pro.

As for cosine similarity, the current preprocessing step accepts float32 DenseVectors and simply normalizes them. float16 vectors can be normalized the same way: compute the dot product of the vector with itself using the dot similarity SIMD implementation, then divide each component by the square root of that value. The actual metric after preprocessing would use the same SIMD dot similarity implementation.
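The normalize-then-dot scheme for cosine similarity can be sketched as follows. This is an illustration under assumptions, not the project's preprocessing code: plain `f32` slices stand in for f16 values that have already been widened, the function names are hypothetical, and the zero-vector guard is a choice made for this example.

```rust
/// Dot product; in the PR's design this would be the SIMD dot kernel.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Preprocess step: scale the vector to unit length using dot(v, v).
fn cosine_preprocess(v: &[f32]) -> Vec<f32> {
    let norm = dot(v, v).sqrt();
    if norm < f32::EPSILON {
        // Assumption for this sketch: leave (near-)zero vectors unchanged.
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// After preprocessing, cosine similarity reduces to a dot product.
fn cosine_similarity(a_normalized: &[f32], b_normalized: &[f32]) -> f32 {
    dot(a_normalized, b_normalized)
}
```

For example, preprocessing `[3.0, 4.0]` and `[4.0, 3.0]` and then calling `cosine_similarity` on the results gives a value close to 0.96, which is 24/25.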

/claim #4110


algora-pbc · 2024-04-26T17:24:20Z

💵 To receive payouts, sign up on Algora, link your Github account and connect with Stripe/Alipay.

generall · 2024-04-29T16:56:17Z

lib/segment/src/spaces/metric_f16/arm.s

I don't feel too confident about including ASM files into the project as-is.
@TheQuantumFractal could you please elaborate why you decided to do it like this instead of directly linking C?

FYI, in the quantization repo we have an example: https://github.com/qdrant/quantization/tree/master/quantization/cpp

If you think ASM is strictly necessary, could you please include instructions for generating it from C?

Yes, there isn't really a reason to include asm files. Directly linking C is a better solution. I can just set up linking for neon in qdrant/lib/segment then?

That would help, thanks!

@TheQuantumFractal you can find an example of how to integrate C code here:
https://github.com/qdrant/quantization/blob/master/quantization/build.rs

Please use this example, because it already solves cross-compilation issues (for instance, building a binary for an ARM target on an x64 host).

Added build.rs to link the C file.

IvanPleshkov

Also, it would be nice to add f16 scoring benchmarks. You can do it here, where byte scoring is defined:
https://github.com/qdrant/qdrant/blob/dev/lib/segment/benches/metrics.rs

IvanPleshkov · 2024-04-30T08:30:59Z

lib/segment/src/spaces/metric_f16/simple.rs

+        #[cfg(target_arch = "x86_64")]
+        {
+            if is_x86_feature_detected!("avx")
+                && is_x86_feature_detected!("fma")


Check if f16c is supported

Do you have a suggestion of how to make this check?

is_x86_feature_detected!("f16c")

Resolved these.
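As a sketch of what the requested check could look like, the function below picks a code path and treats `f16c` as a hard requirement for both the AVX and the SSE branches, since both rely on the f16 to f32 conversion. The enum, the function name, and the threshold value are hypothetical; only the `is_x86_feature_detected!` calls and the `MIN_DIM_SIZE_SIMD` name come from the snippets under review.

```rust
/// Which implementation a metric call would use (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum F16Backend {
    Avx,
    Sse,
    Scalar,
}

/// Assumed threshold for this sketch; the project's actual value may differ.
const MIN_DIM_SIZE_SIMD: usize = 16;

pub fn select_f16_backend(dim: usize) -> F16Backend {
    // Short vectors are not worth the SIMD setup cost.
    if dim < MIN_DIM_SIZE_SIMD {
        return F16Backend::Scalar;
    }
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Without F16C there is no hardware f16 -> f32 conversion,
        // so neither x86 SIMD path can be used.
        if !is_x86_feature_detected!("f16c") {
            return F16Backend::Scalar;
        }
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
            return F16Backend::Avx;
        }
        if is_x86_feature_detected!("sse") {
            return F16Backend::Sse;
        }
    }
    F16Backend::Scalar
}
```

Detection happens at run time, so the same binary can fall back to the scalar path on CPUs that lack F16C.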

IvanPleshkov · 2024-04-30T08:31:06Z

lib/segment/src/spaces/metric_f16/simple.rs

+            }
+        }
+
+        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]


Check if f16c is supported

IvanPleshkov · 2024-04-30T08:31:19Z

lib/segment/src/spaces/metric_f16/simple.rs

+    fn similarity(v1: &[VectorElementTypeHalf], v2: &[VectorElementTypeHalf]) -> ScoreType {
+        #[cfg(target_arch = "x86_64")]
+        {
+            if is_x86_feature_detected!("avx")


Check if f16c is supported

IvanPleshkov · 2024-04-30T08:31:27Z

lib/segment/src/spaces/metric_f16/simple.rs

+
+        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
+        {
+            if is_x86_feature_detected!("sse") && v1.len() >= MIN_DIM_SIZE_SIMD {


Check if f16c is supported

IvanPleshkov · 2024-04-30T08:31:58Z

lib/segment/src/spaces/metric_f16/simple.rs

+    fn similarity(v1: &[VectorElementTypeHalf], v2: &[VectorElementTypeHalf]) -> ScoreType {
+        #[cfg(target_arch = "x86_64")]
+        {
+            if is_x86_feature_detected!("avx")


Check if f16c is supported

IvanPleshkov · 2024-04-30T09:15:01Z

lib/segment/src/spaces/metric_f16/arm.c

+    float16x8_t sum2 = vdupq_n_f16(0.0f);
+    float16x8_t sum3 = vdupq_n_f16(0.0f);
+    float16x8_t sum4 = vdupq_n_f16(0.0f);
+    uint32_t i = 0;


Why do you want to define the iterator here instead of in the loop header, for (int i, ...)?

Moved the iterator into the for loop.

IvanPleshkov · 2024-04-30T09:15:31Z

lib/segment/src/spaces/metric_f16/arm.c

+    float16x8_t sub3 = vdupq_n_f16(0.0f);
+    float16x8_t sum4 = vdupq_n_f16(0.0f);
+    float16x8_t sub4 = vdupq_n_f16(0.0f);
+    uint32_t i = 0;


Why not inside for definition?

IvanPleshkov · 2024-04-30T09:15:36Z

lib/segment/src/spaces/metric_f16/arm.c

+    float16x8_t sum2 = vdupq_n_f16(0.0f);
+    float16x8_t sum3 = vdupq_n_f16(0.0f);
+    float16x8_t sum4 = vdupq_n_f16(0.0f);
+    uint32_t i = 0;


Why not inside for definition?

IvanPleshkov · 2024-04-30T09:17:40Z

lib/segment/src/spaces/metric_f16/arm.c

+    float32_t tmp = 0.0f;
+    for (i=0; i < (blockSize % 32); i++) {
+        tmp = (*pSrcA - *pSrcB);
+        manhattanDistance += tmp > 0 ? tmp : -tmp;


Why not abs instead?

Using the arm f16 abs operation now

IvanPleshkov · 2024-04-30T09:21:31Z

lib/segment/src/spaces/metric_f16/simple_avx.rs

+#[target_feature(enable = "avx")]
+#[target_feature(enable = "fma")]
+#[target_feature(enable = "f16c")]
+pub(crate) unsafe fn euclid_similarity_avx(


One comment that applies to all the naming: euclid_similarity_avx already exists. Please rename euclid_similarity_avx to avx_euclid_similarity_half, following the naming used for the byte type:
https://github.com/qdrant/qdrant/blob/dev/lib/segment/src/spaces/metric_uint/avx2/euclid.rs#L9

Please do this for all SIMD functions.

IvanPleshkov · 2024-05-13T11:19:57Z

@TheQuantumFractal please rebase onto the latest dev; CI is red.

generall · 2024-05-13T21:18:31Z

Hey @TheQuantumFractal thanks a lot for the contribution! We will take it from here and finish the integration as a separate PR.

TheQuantumFractal · 2024-05-13T21:25:13Z

Sounds good! Happy to help :)

TheQuantumFractal added 2 commits April 26, 2024 02:26

Adding half-precision floating point SIMD-optimized implementation fo…

71f4c99

…r vector distance metrics.

Primitives adjustment

71b130a

algora-pbc bot mentioned this pull request Apr 26, 2024

Support for f16 vector metrics #4110

Open

algora-pbc bot added the 🙋 Bounty claim label Apr 26, 2024

TheQuantumFractal changed the base branch from master to dev April 26, 2024 17:25

TheQuantumFractal added 3 commits April 26, 2024 10:46

Remove ds store

272bf96

Load assembly only for neon

8fa52ad

Fixing linter errors

83e294f

generall requested a review from IvanPleshkov April 27, 2024 10:55

Adding float16 type

3912012

generall reviewed Apr 29, 2024


IvanPleshkov requested changes Apr 30, 2024


TheQuantumFractal and others added 5 commits May 12, 2024 15:22

Addressing f16 comments

fefab7b

Refactoring and adding benchmarks

80ca0b2

Renaming simd functions

c3d699c

Merge branch 'dev' into f16_metrics

ca19e84

Cleaning openapi

3bb93e4

TheQuantumFractal requested a review from IvanPleshkov May 13, 2024 01:32

TheQuantumFractal and others added 4 commits May 13, 2024 12:48

Merging in changes to dev

a8959f8

Fixing linter error

4a252e2

fix clippy

c8035e9

disable float16 feature in API

3517a0e

generall approved these changes May 13, 2024


generall merged commit c230a48 into qdrant:dev May 13, 2024
16 of 17 checks passed

This was referenced May 16, 2024

Vector Storage f16 #4061

Closed

[WIP] f16 feature #4032

Closed

generall mentioned this pull request May 16, 2024

(WIP) Generic float #3922

Closed

18 tasks
