build 1.6.7 version pip package #386

MaggieQi · 2023-05-24T07:59:31Z

No description provided.

`cmake` throws an error if the `ThirdParty/zstd/build/cmake` git submodule is not cloned before the build:

```
CMake Error at CMakeLists.txt:96 (add_subdirectory):
  add_subdirectory given source "ThirdParty/zstd/build/cmake" which is not an existing directory.
```

So update the install command to avoid more users running into this. See #320

Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* improve performance and memory usage
* fix CLRCore build
* clean code
* align the cosine with production
* remove the quantizer check between search head index and ssd index.
* remove some freq condition check
* unify AsyncReadRequest

Co-authored-by: chenqi <cheqi@microsoft.com>
Co-authored-by: Philip Adams <35666630+PhilipBAdams@users.noreply.github.com>

Co-authored-by: jinweizhang <jinweizhang@microsoft.com>

Fix GPU PQ build errors and added error checks

* GPU KNN Integrate
* Merged cuda hxx files
* Remove extra files, fix CoreLibrary build
* GPU code needs to be in cu file
* break circular including
* Rearranged the method
* Included the files like NeighborhoodGraph.h did. Only included needed files in Kernel.cu
* Relocate the Quey_KNN to KNN.hxx (templates cannot be in kernel.cu)
* Try to instantiate before compile
* Fixed compile error. Need GPU to test rest
* Set Mem Failure
* Adjusted the max dimension for vectors up to 184
* Change DistCalcMethod
* Print Thread Start/End
* Shared memory
* Cannot launch kernel with 60% MaxSharedMemory
* Root Cause: Transpose_Mem
* L2 480s, Cosine 703s after Transpose
* Two functions for Shared and local ThreadHeap, two launch settings with shared and local DistPair
* Test version for Ben. 32 Threads defined in params.h. query_KNN has transpose and all shared memory. query_KNN has transpose and heapMem in local.
* Found the debugger issue, next step: fix dist calc
* 1. Fixed int8 dist calc; 2. Tested batch splitting on 35M; 3. Moved the malloc before launching to avoid waste of memory
* 45cap, monitor mem usage, track where Convert failed
* Succeeded on 400M 100D, fixed int overflow, major change: int to size_t
* Multi-GPU detection
* Fixed the CPU mem over-usage, located the bug in updating results
* Fixed multi free results
* Fixed hard coded Cosine DistCalcMethod
* Relocate the Point & Transposed Point to GPUKNNDistance.hxx
* Merge SPTAG current changes, complaining about cuh cuda lib
* Move Generate Truth to TruthSet.cpp
* Fix optimization issue, add GPUCoreLibrary/GPUSSDServing to default build
* Fix Error for Linux cmake.
* Remove Unnecessary Changes.
* restore datasets
* missing s in folder name
* Restore build config to Lib, restore sln Config
* update tlog for lib config
* Remove Wrappers
* Removed not ignored log info
* Remove Static in TruthSet.cpp. Remove debug command, build GPU SSDServing to exe in Debug
* Recover CL compile for GPU SSDServing/main.cpp

Co-authored-by: diegocai <diegocai@microsoft.com>
Co-authored-by: Philip Adams <philipadams@microsoft.com>
Co-authored-by: Diego-Cai <103398280+Diego-Cai@users.noreply.github.com>

…tion (#339) * Fix bug with int8/cosine configuration, and enabled hardware optimization for this case. * Trigger CI Co-authored-by: diegocai <diegocai@microsoft.com>

* add nni_auto_tune * support other data format and add more result * add result picture * refine readme * refine readme * refine code style * update readme and datareader * add aml training config * update config * fix licence * refactor for data type * refactor for data type * refactor for data type * fix overflow on bruteforce * fix compute metric by index * add limits and preprocessing * change code dir Co-authored-by: Guoxin <suiguoxin@gmail.com> Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* Fixed bugs with accuracy fix for TPT build with recon vectors * Accuracy fix working for GPU index build with PQ/OPQ enabled * Fix bug with accuracy fix * Trigger CI * Fix low accuracy issue with GPU index build for int8/cosine configuration (#339) * Fix bug with int8/cosine configuration, and enabled hardware optimization for this case. * Trigger CI Co-authored-by: diegocai <diegocai@microsoft.com> * Add nni_auto_tune example (#325) * add nni_auto_tune * support other data format and add more result * add result picture * refine readme * refine readme * refine code style * update readme and datareader * add aml training config * update config * fix licence * refactor for data type * refactor for data type * refactor for data type * fix overflow on bruteforce * fix compute metric by index * add limits and preprocessing * change code dir Co-authored-by: Guoxin <suiguoxin@gmail.com> Co-authored-by: MaggieQi <chenqi871025@gmail.com> * Fixed bugs with accuracy fix for TPT build with recon vectors * Accuracy fix working for GPU index build with PQ/OPQ enabled * Fix bug with accuracy fix * Trigger CI Co-authored-by: diegocai <diegocai@microsoft.com> Co-authored-by: smallv0221 <33639025+smallv0221@users.noreply.github.com> Co-authored-by: Guoxin <suiguoxin@gmail.com> Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* Remove Transposed Point to save shared Memory * Add Generate GT int8 Dim to 768, add CUDA CHECK to debug, Add infty for uint32 * Fix sharedmem usage for K=100 gt Co-authored-by: Diego Cai <diegocai@microsoft.com> Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* Fix build issue and windows issue with is_same_v * fix is_same_v error

* update python to python3
* use dynamic linking
* enable ANNIndexTestTool code compiling
* add python version in wheel package
* update nuspec
* enable to set different maxcheck and hashexponent
* change to configure python version
* trigger azurepipeline
* trigger
* fix python version
* fix python version
* fix python version in windows
* fix cosine kmeans
* clean avx/sse header files
* fix nuspec

Co-authored-by: cheqi <cheqi@SRGSSD-07>

* prevent SimpleBufferIO fails to resize * Set put area for SimpleBufferIO * Update DiskIO.h

…ling into it from other TUs (#358)

* Addresses static initialization fiasco problem with the logger and code calling into it from other translation units
* fix namespace
* make logger init multithreading safe
* actually we can use magic statics to simplify
* Fix #356
* gate GMH/GPA on Windows only
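
The "magic statics" idea mentioned above can be sketched as follows. This is a minimal illustration, not the code from #358: `DemoLogger`, `GetDemoLogger`, and `EarlyUser` are hypothetical names. A function-local static is constructed on first use, and since C++11 that construction is thread-safe, so a global object in any translation unit can call into the logger from its constructor without depending on cross-TU initialization order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical logger type for illustration; not SPTAG's actual Logger.
class DemoLogger {
public:
    void Log(const std::string& msg) { m_lines.push_back(msg); }
    std::size_t Count() const { return m_lines.size(); }
private:
    std::vector<std::string> m_lines;
};

// Function-local static ("magic static"): constructed on the first call.
// Since C++11 the initialization itself is guaranteed to be thread-safe.
// (Log() is not synchronized here; only the construction is.)
inline DemoLogger& GetDemoLogger() {
    static DemoLogger instance;
    return instance;
}

// A global whose constructor logs. With a plain global logger object this
// could run before the logger is constructed; with the accessor above the
// logger is created on demand.
struct EarlyUser {
    EarlyUser() { GetDemoLogger().Log("constructed during static init"); }
};
static EarlyUser g_earlyUser;
```

The trade-off is one extra guard check per call to the accessor, which is usually negligible next to the cost of logging.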

* add winrt projects * add gitignore for VS files * retarget to vc142 * make neighborCount uint32 * api takes byte[] for metadata * enable CFG and disable incremental linking to make BinSkim pass * format * remove edit and continue /ZI since it's incompatible for CFG * remove arm/arm64 platforms

* modify for thread_local context
* fix initialization issue
* fix ExtraWorkSpace id issue
* fix workSpacePool
* set thread affinity
* add more affinity strategies
* fix cmake compiler
* fix linux libnuma compile
* fix compiling and core bind
* fix NumaStrategy and OrderStrategy enum type
* remove space
* Clear the workspace to ensure the heap size and pagebuffer size
* User-overrideable workspace implementation draft (#362)
* make it possible to override workspace implementation
* bool -> ErrorCode
* SPANN index should allow setting child index workspace
* finish replacing by workspace factory
* switch to unique_ptr
* unresolved external
* linux build error
* windows build error

---------

Co-authored-by: cheqi <cheqi@SRGSSD-07>
Co-authored-by: Philip Adams <35666630+PhilipBAdams@users.noreply.github.com>

I went to "https://sourceforge.net/projects/boost/files/boost-binaries/1.67.0/" and downloaded "boost_1_67_0-msvc-14.1-64.exe", but encountered this failure during cmake:

-- Could NOT find Boost (missing: system thread serialization wserialization regex filesystem) (found suitable version "1.67.0", minimum required is "1.66")
CMake Error at src/legacy/sptag/SPTAG/CMakeLists.txt:90 (message):
  Could not find Boost >= 1.67!

I noticed that there are no precompiled libs, and had to run bootstrap + build exe to get the compiled libs.

Co-authored-by: Philip Adams <35666630+PhilipBAdams@users.noreply.github.com>

* add filter support for BKT index * put the nullptr check into the upper function instead of #define, and add filter checking before duplicated check --------- Co-authored-by: Qianxi Zhang <qiazh@microsoft.com> Co-authored-by: Qianxi Zhang <Qianxi.Zhang@microsoft.com> Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* rename logging macro to avoid name conflicts * fix rename * resolve merge --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* Allow setting a different Logger at runtime. A mutex is needed here because Logger::Logging is not const, so we can mangle things if the shared_ptr is not updated atomically, and the std::atomic<std::shared_ptr<T>> specialization isn't available to us in C++17.
* Missed one file
* use atomics
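
A minimal sketch of the mutex-guarded variant described in the first bullet, under stated assumptions: `ILogger`, `CountingLogger`, and `LoggerHolder` are hypothetical names, not SPTAG types. The later "use atomics" bullet suggests the merged code took a different route, so treat this only as an illustration of why the swap needs synchronization in C++17.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical logger interface; Logging is non-const, as in the commit note.
struct ILogger {
    virtual ~ILogger() = default;
    virtual void Logging(const std::string& msg) = 0;
};

// Hypothetical implementation that just counts messages.
struct CountingLogger : ILogger {
    int count = 0;
    void Logging(const std::string&) override { ++count; }
};

// Holds the current logger. std::atomic<std::shared_ptr<T>> is C++20,
// so in C++17 a mutex guards both the swap and the read of the pointer.
class LoggerHolder {
public:
    void Set(std::shared_ptr<ILogger> logger) {
        std::lock_guard<std::mutex> guard(m_lock);
        m_logger = std::move(logger);
    }
    // Returns a copy, so callers keep the logger alive even if it is
    // replaced while they are still using it.
    std::shared_ptr<ILogger> Get() const {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_logger;
    }
private:
    mutable std::mutex m_lock;
    std::shared_ptr<ILogger> m_logger;
};
```

Note that the mutex protects only the pointer; calls to `Logging` on the returned logger are not serialized by this holder.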

… to support lambda expression (#371) Co-authored-by: Qianxi Zhang <Qianxi.Zhang@microsoft.com>

…n graph refine (#369)

* add protections against overflow and size mismatch, and invalid IDs
* avoid compilation issue
* add more safety to rebuild job
* add check to RNG
* try to resolve gcc compiler issue
* debug-guard expensive check
* fix the logging macro
* make it more branch-predictor friendly
* transform macro into function
* turn KDT macro to function
* special case for index==-1
* Update to LL_ERROR
* prevent extra allocations in BKT search by templated search function
* static dispatch in KDT index
* fix RNG prefetching
* skip checking index in graph traversal, since we will check in At
* dispatch by switch instead of if for fewer branches, use AlwaysTrue when filterFunc is null to allow compiler to optimize
* make the template function naming easier to understand, formatting improvements
* make checks in RebuildNeighbors IF_DEBUG only
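
The "templated search function" and "AlwaysTrue when filterFunc is null" bullets describe a common static-dispatch pattern, sketched below. Only the name `AlwaysTrue` comes from the commit message; `CountAccepted` and `CountAcceptedDispatch` are hypothetical stand-ins for the real search routines, and the signature of the filter is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Trivial filter: accepts every id. Because the call is resolved at
// compile time, the compiler can inline it and drop the check entirely.
struct AlwaysTrue {
    bool operator()(int) const { return true; }
};

// Hot loop templated on the filter type, so the no-filter case pays
// nothing for std::function's indirect call.
template <typename Filter>
int CountAccepted(const std::vector<int>& ids, const Filter& filter) {
    int accepted = 0;
    for (int id : ids) {
        if (filter(id)) ++accepted;
    }
    return accepted;
}

// Runtime entry point: choose the instantiation once, outside the loop.
inline int CountAcceptedDispatch(const std::vector<int>& ids,
                                 const std::function<bool(int)>& filterFunc) {
    if (!filterFunc) {
        return CountAccepted(ids, AlwaysTrue{});
    }
    return CountAccepted(ids, filterFunc);
}
```

Taking `std::function` at the boundary is also what lets callers pass a capturing lambda, which a plain function pointer cannot hold.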

Co-authored-by: Menghao Li <menghaoli@microsoft.com>

* Add support for and error checking for 384 dim and other PQ dimensions * Fix error message for GPU code --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* add .net core support * add linux nuget * fix linux nuspec * fix CsharpClient.vcxproj * fix Linux and Windows conflict * fix windows nuget package * add dump and loadfromdump * fix setup.py * fix cuda LOG * fix GPU log * fix Dockerfile for ubuntu20.04 * fix Dockerfile --------- Co-authored-by: cheqi <cheqi@SRGSSD-07>

* add logger for total distance * enhance syncing code execution. * change typo --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* avoid off-by-one due to post-increment in do-while comparison
* Update to not miss last BKT node
* syntax
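
The actual loop from #383 is not shown on this page, so the following is only a hypothetical pair of loops illustrating how the position of the increment in a do-while condition changes the iteration count by one. Which form is correct depends on whether the upper bound is meant to be inclusive.

```cpp
#include <cassert>
#include <vector>

// Condition uses post-increment: the comparison sees the OLD value of i,
// so the body also runs for i == last.
inline std::vector<int> VisitPostIncrement(int first, int last) {
    std::vector<int> visited;
    int i = first;
    do {
        visited.push_back(i);
    } while (i++ < last);
    return visited;
}

// Condition uses pre-increment: the comparison sees the NEW value of i,
// so the loop stops one step earlier and never visits i == last.
inline std::vector<int> VisitPreIncrement(int first, int last) {
    std::vector<int> visited;
    int i = first;
    do {
        visited.push_back(i);
    } while (++i < last);
    return visited;
}
```

For `first = 0` and `last = 2`, the first function visits 0, 1, 2 and the second visits 0, 1.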

* add .net core support * add linux nuget * fix linux nuspec * fix CsharpClient.vcxproj * fix Linux and Windows conflict * fix windows nuget package * add dump and loadfromdump * fix setup.py * fix cuda LOG * fix GPU log * fix Dockerfile for ubuntu20.04 * fix Dockerfile * add new apis for CLRCore * fix CLR version * fix test case * add quantizevector and reconstructvector support * fix quantizeVector * fix CLR build * fix fresh BKT bug * update swig to 4.0.0 * fix README --------- Co-authored-by: cheqi <cheqi@SRGSSD-07>

This pr is auto merged as it contains a mandatory file and is opened for more than 10 days.

…e DeletedIDs set (#391)

* outline convenience function changes * implement GetPostingDebug * fix make_shared of abstract class * dont need to change VectorIndex.h interface * resolve build

…problematic when dealing with binary metadata (like zstd-compressed data). Turns out we don't need to use the GetMetadataOffsets function since we're always adding one vector+one metadata at a time. So we will treat the metadata as one continuous chunk of data. (#393)

* remove unnecessary checks * fix multithread threadaffinity issue --------- Co-authored-by: cheqi <cheqi@SRGSSD-07>

Co-authored-by: denisyang <denisyang@tencent.com>

* add iterator interface * add relaxed mono signal in the interface * add batch in iterator interface * IterativeScanTest: change delete flag to false in Search * enable iterator and relax monotonicity support in java, python and c# * avoid queryresult empty issue * fix Iterator issue * clean code for SPANN * fix CLR compiling * add result check for IterativeScan * add iterator for spann * fix SSDTest bugs * remove logs in truthset * trigger the pipeline * add iterator example for python tutorial * modify the README --------- Co-authored-by: Qianxi Zhang <qiazh@microsoft.com> Co-authored-by: Qi Chen <cheqi@microsoft.com>

Bumps [pillow](https://github.com/python-pillow/Pillow) from 9.4.0 to 10.0.1. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](python-pillow/Pillow@9.4.0...10.0.1) --- updated-dependencies: - dependency-name: pillow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: MaggieQi <chenqi871025@gmail.com>

* Initial unit tests and KNN build test * Fix linking error * Fix TPT tests * Change test files and fix tpt test issues * Fix linking issues * Fix buildssd tests and add new tests - new bug with SPTAG logger when running tests * Add benchmark tests for PQ optimization --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

suiguoxin and others added 30 commits July 28, 2022 18:44

update README: clone submodule (#323)

f96aa4b

add tools for opq training and inference (#324)

759758c

Co-authored-by: jinweizhang <jinweizhang@microsoft.com>

Fix build errors for GPU PQ

60f709b

Added more error checking for GPU PQ

4d2eccf

Small fixes

4cabd0b

Fix Windows build error

5ce6fd2

Fix zstd build/link error

2b17965

Fix for Windows build

77b93a2

Add SIMDUtils.cpp to GPU build

5112fff

Try to trigger Azure Pipeline CL build

6242cff

Remove space

cbc4e0d

Merge pull request #334 from bkarsin/fix_gpu_pq_build

ca61760

Fix GPU PQ build errors and added error checks

Fix low accuracy issue with GPU index build for int8/cosine configura…

f7e3fe4

…tion (#339) * Fix bug with int8/cosine configuration, and enabled hardware optimization for this case. * Trigger CI Co-authored-by: diegocai <diegocai@microsoft.com>

Fix windows build issues for GPU code (#346)

ebeb690

* Fix build issue and windows issue with is_same_v * fix is_same_v error

fix bkt bug: shuffle bug when clustering (#349)

2d2cfb3

add logger for total distance (#351)

de5b7f8

Prevent saving failure in SaveindexToFile (#355)

9c777df

* prevent SimpleBufferIO fails to resize * Set put area for SimpleBufferIO * Update DiskIO.h

add nuget package for WinRT (#361)

0207479

Update versions in requirements.txt to patched versions (#367)

b725cac

zqxjjj and others added 25 commits March 16, 2023 10:56

Rename logging macro to avoid name conflicts (#366)

fe03712

* rename logging macro to avoid name conflicts * fix rename * resolve merge --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

add filter support for BKT: replace function pointer to std::function…

72f929d

… to support lambda expression (#371) Co-authored-by: Qianxi Zhang <Qianxi.Zhang@microsoft.com>

Add dimension check for ProcessWithoutMPI. (#373)

d6aca40

resolve compilation warnings (#374)

2c0bb04

Co-authored-by: Menghao Li <menghaoli@microsoft.com>

Fix 384d GPU index build support and error reporting (#363)

fc24d65

* Add support for and error checking for 384 dim and other PQ dimensions * Fix error message for GPU code --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

std::out_of_range doesn't have a default constructor (#378)

0df1b75
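
As the commit title says, `std::out_of_range` has no default constructor; it must be given a message (`const std::string&` or `const char*`). A small sketch, where `CheckedAt` is a hypothetical helper and not a function from SPTAG:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical bounds-checked accessor, for illustration only.
inline int CheckedAt(const std::vector<int>& values, std::size_t index) {
    if (index >= values.size()) {
        // `throw std::out_of_range();` would not compile:
        // the exception type requires a what_arg.
        throw std::out_of_range("index " + std::to_string(index) +
                                " out of range");
    }
    return values[index];
}
```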

Remove old cuda tests and fix linux build error (#381)

98bc296

enhance syncing script uploading and downloading (#382)

ccb78d6

* add logger for total distance * enhance syncing code execution. * change typo --------- Co-authored-by: MaggieQi <chenqi871025@gmail.com>

Avoid off-by-one in BKT search (#383)

268578d

* avoid off-by-one due to post-increment in do-while comparison * Update to not miss last BKT node * syntax

Microsoft mandatory file

ff3825d

Auto merge mandatory file pr

2ffaec6

This pr is auto merged as it contains a mandatory file and is opened for more than 10 days.

More gracefully handle when invalid IDs are inserted or checked in th…

c05bdc8

…e DeletedIDs set (#391)

Convenience functions for parsing SPANN index (#384)

35a9bd1

* outline convenience function changes * implement GetPostingDebug * fix make_shared of abstract class * dont need to change VectorIndex.h interface * resolve build

Opt perf and fix thread affinity bug (#395)

7e9ff64

* remove unnecessary checks * fix multithread threadaffinity issue --------- Co-authored-by: cheqi <cheqi@SRGSSD-07>

fix compile failed, ErrorCode::fail should be ErrorCode::Fail (#404)

1f6a85e

Co-authored-by: denisyang <denisyang@tencent.com>

add Wait() method to SPTAG::Helper::DiskIO (#409)

e7e4f71
