Releases: BBC-Esq/VectorDB-Plugin-for-LM-Studio
v5.0.1
BREAKING CHANGES:
- This release has breaking changes. Notably, the "Embedding_Models" folder is no longer used and has been replaced with the "Models" folder.
IMPROVEMENTS:
- Added the ability to use "Local Models" as well as LM Studio!!! CPU-only installations must still use LM Studio.
- Fixed setup.py to allow for CPU-only installation and to fix the overall functionality of the program when using only a CPU.
- Updated dependencies.
COMMENTS:
This may be the last release in a little while, but it's a good one! I'll continue to monitor the GitHub issues for any bugs or small changes and address those until further notice.
NEW CHAT MODELS!
V5.0 - CHAT MODELS!
BREAKING CHANGES:
- This release has breaking changes. Notably, the "Embedding_Models" folder is no longer used and has been replaced with the "Models" folder.
IMPROVEMENTS:
- Added the ability to use "Local Models" as well as LM Studio!!! CPU-only installations must still use LM Studio.
- Fixed setup.py to allow for CPU-only installation and to fix the overall functionality of the program when using only a CPU.
- Updated dependencies.
COMMENTS:
This may be the last release in a little while, but it's a good one! I'll continue to monitor the GitHub issues for any bugs or small changes and address those until further notice.
NEW CHAT MODELS!
v4.4.1 - INSTRUCTOR fixed
IMPROVEMENTS:
- Normalize embeddings is now set to "True"... not sure how it got set to "False", which led to suboptimal search results.
- Custom library for Instructor fixing all Instructor models - They're Back!
- Changed WhisperSpeech to use the "base" size for both the s2a and t2s models.
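For context on the normalization fix above: normalizing embeddings to unit length makes the dot product used in similarity search equal to cosine similarity, so scores stop being skewed by vector magnitude (sentence-transformers exposes this as the `normalize_embeddings` flag on `encode()`). A minimal sketch of the math in plain Python, with made-up vectors for illustration:

```python
import math

def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Two vectors pointing the same direction but with different magnitudes.
a = [3.0, 4.0]
b = [6.0, 8.0]

# The raw dot product depends on magnitude, so similarity scores are skewed.
raw = dot(a, b)

# After normalization, the dot product equals cosine similarity (1.0 here,
# since the vectors point the same way) -- magnitude no longer matters.
cos = dot(normalize(a), normalize(b))
```

Without normalization, a database full of long documents can drown out short but highly relevant ones, which matches the "no results / suboptimal results" symptom described above.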
COMMENTS:
- Feel free to experiment with different sizes of the "s2a" and "t2s" models by modifying tts_module.py. Here are some benchmarks for the various model combinations.
- Benchmarks for Bark coming soon!
v4.4.0 - Dude! Upgrade your Dependencies
BREAKING CHANGES:
- Significant refactoring and some minimal script renaming as well.
- Updating certain libraries to be compatible with one another.
- Revised the list of vector models, so be aware if you created a database with a model that's been removed from the list.
IMPROVEMENTS:
- Upgrade versions of dependencies as much as possible.
- Using custom library for Instructor models.
- Removed Jina models since they have comparable performance to existing models and, recently, require an API key to use, which is stupid.
KNOWN BUGS:
- Using any of the Instructor models requires downloading it through the "Models Tab", and at runtime the model will be downloaded a second time. This is because of the custom modified Instructor script. It will be fixed in a future patch - e.g. release 4.4.1.
v4.3.0 - reliability!!!
BREAKING CHANGES:
- This version is the first to use TileDB instead of ChromaDB. Therefore, any databases previously created with ChromaDB will no longer work with Version 4.3.0+ of this program.
- TileDB is much more robust, faster, will not give errors when trying to ingest massive amounts of documents, and is overall "better." ChromaDB has been used since the beginning of this repository, so this is a major change. However, it was necessary because ChromaDB could not handle massive amounts of documents reliably - it simply couldn't insert, for example, 200k+ vectors reliably. This was tested extensively with older versions that use duckdb + clickhouse as well as the current version that uses sqlite3. ALL HAIL TILEDB!
IMPROVEMENTS:
- WhisperSpeech was added as an alternative to Bark when doing text-to-speech. WhisperSpeech is a very promising library managed by Collabora. It rivals Bark in terms of quality, VRAM/compute requirements, and speed - even outperforms it in certain metrics. As newer models from WhisperSpeech come out this will only improve further.
- As part of the sentence-transformers library, vector models were loaded with a batch size of 32. It became apparent after extensive testing that this caused slower performance and higher compute/memory requirements in EVERY SINGLE MODEL. database_interactions.py now sets the batch size based on the size of the vector model being used when creating the database with a GPU. When using a CPU, the batch size is always 2. This reduces memory requirements by over 50% in most cases with the larger models, and improves speed for both GPU and CPU.
- Updated some of the dependency versions, with further updates to come based on additional testing in the near future.
- Started using the pickle library to save representations of image files and audio transcriptions within a given database. When a user double-clicks an image or audio file within an existing database, the program reads the pickle file and routes the call to where the file is located on your computer. Obviously, if you move the actual file (i.e. the image or audio file), it'll give an error.
- Multiple scripts were refactored and/or renamed.
BUG FIXES:
- Changes made after Version 4.1 resulted in "Port," "Temperature," and "Max_tokens" not being updated properly. This was fixed by reverting the specific script managing this to a prior version.
- The Models Tab was revamped to correct alignment, a longstanding bug.
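The pickle-based lookup described under IMPROVEMENTS can be sketched like this. The record layout, file names, and save location below are illustrative assumptions, not the program's actual format:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical record mapping a database entry back to its source file.
record = {
    "doc_id": "audio_0001",
    "source_path": "/home/user/recordings/meeting.mp3",  # original location
    "transcription": "Hello world...",
}

# Serialize the record alongside the database (illustrative location here).
pkl_path = Path(tempfile.gettempdir()) / "audio_0001.pkl"
with open(pkl_path, "wb") as f:
    pickle.dump(record, f)

# On double-click, load the record and resolve the original file path.
with open(pkl_path, "rb") as f:
    loaded = pickle.load(f)

source = Path(loaded["source_path"])
# If the user has moved or deleted the original file, this check fails --
# exactly the caveat the release notes mention.
file_still_there = source.exists()
```

Because only a path (plus the transcription) is stored rather than the media itself, the database stays small, at the cost of breaking if the source file moves.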
v4.2.0 - general upgrade
- Eliminated multiple bugs regarding creating, saving, and searching databases.
- Faster database creation by specifying batch size based on the vector model selected, and also lower VRAM usage as a result. This was only done after extensive testing.
- Ensured that all audio and image files are displayed in the database viewer tab and that they open the appropriate file when double-clicked.
- Updated user guide.
- Removed certain vector models that had significantly higher compute/VRAM requirements but no greater quality than other models requiring less compute/VRAM. This was only done after extensive testing.
- Other relatively minor changes.
V4.1.0 - MULTIPLE databases!!!
- Finally painstakingly implemented the ability to create multiple databases.
- Changed the voice recorder to use the new backend for the file transcriber, which is the amazing WhisperS2T library.
- Updated multiple libraries to the most recent versions while still maintaining inter-dependency compatibility.
- Finally updated the openai library to use the new API.
- Refactored multiple scripts, revised some conditional checks, reduced some clutter printed to the command prompt, and made other improvements.
v4.0 - CUDA 12.1+ support!
NOTE:
This release is only for Windows. Linux and MacOS users should continue to use v3.5.2 until I can get those versions up and running. Download the ZIP file from Release 3.5.2 and follow the instructions in the readme.md, INCLUDING the prerequisites, which are different from this release's.
CUDA 12.1+ support finally brings support for flash attention 2 and other improvements. Those will be implemented in subsequent incremental releases. For this initial release, the following major improvements have been made:
The Transcribe tool has had a major improvement due to CUDA 12.1+ being supported. It allowed switching from faster-whisper to the amazing new library (only ~75 stars) located here:
https://github.com/shashikg/WhisperS2T
In summary, this library enables "batch" processing of audio using ctranslate2 version 4.0, which supports CUDA 12.1+. Here is a comparison of a long audio file under Release 3.5.2 versus this release:
Release 3.5.2:
- large-v2 model, float16 = 10 minutes 1 second
This release:
- large-v2, float16, speed set at 50 = 54 seconds
- medium.en, float16, speed set at 75 = 32 seconds
- small.en, float16, speed set at 100 = 15 seconds!!!
This cannot be overstated, and batch processing is a feature that faster-whisper, while great, has been lacking for quite some time.
KNOWN ISSUES:
- Bark models will still run but with errors printed to the command prompt. This will be fixed when flash attention 2 is implemented and the option to NOT use FA2 is made available, thus preventing the errors.
- The voice transcriber sometimes takes far longer than in release 3.5.2 and/or prints multiple transcriptions. This is due to issues with the faster-whisper library itself - not ctranslate2 - since the improved transcribe file tool works just fine and already relies on ctranslate2. If it's not addressed in the future, this may require switching from faster-whisper to something else like WhisperS2T, but for shorter audio faster-whisper is probably just as good and I'd rather keep it if possible.
- The transcriber tool no longer lets you choose a quantization or compute device (e.g. cuda or cpu). This was a choice made in order to get initial CUDA 12+ support out as soon as possible. It'll be addressed in subsequent releases.
Please contact me if you want to help out, or if you want faster releases for Linux and MacOS, as I don't own those systems. I plan to again support Nvidia GPUs on Windows and AMD GPUs on Linux, just like before, as well as MacOS.
v3.5.2 - final CUDA 11.8 release
All subsequent releases will only support CUDA 12+ unless popular demand dictates otherwise.
v3.5 - revamp baby!
- Fixed a HUGE BUG preventing databases created with certain vector models from returning any results... apparently embeddings need to be "normalized" when using similarity search...
- Transcribe File now adds the transcription into the DB, enabling metadata searching and filtering by document type!
- Revamped the GUI to give tabs more space, including the Databases tab, which will need it for creating multiple databases in a subsequent release.
- Revamped the GitHub instructions and support matrix.
- Refactoring.
- ELIMINATED the annoying "qpainter" bug, hopefully for good! Dirty little bug bastard!
- Changed the location to download bitsandbytes for Windows during the installation process, and made minor improvements to installation procedures.