
Releases: BBC-Esq/VectorDB-Plugin-for-LM-Studio

v5.0.1

29 Apr 16:06
e75816a

BREAKING CHANGES:

  • This release has breaking changes. Notably, the "Embedding_Models" folder is no longer used and has been replaced with the "Models" folder.
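If you are upgrading an existing installation, the folder rename above can be handled with a few lines of Python. This is only a sketch: `migrate_models_folder` is a hypothetical helper, not part of the program, and these notes don't say whether models downloaded under the old layout will load correctly from the new folder, so re-downloading through the program may be the safer route.

```python
import shutil
from pathlib import Path

OLD_NAME = "Embedding_Models"  # folder name used before v5.0 (per the notes above)
NEW_NAME = "Models"            # folder name used from v5.0 onward (per the notes above)


def migrate_models_folder(install_dir):
    """Rename the old folder to the new name when it is safe to do so.

    Hypothetical helper for illustration only.
    Returns "moved", "nothing_to_do", or "conflict".
    """
    root = Path(install_dir)
    old, new = root / OLD_NAME, root / NEW_NAME
    if not old.is_dir():
        return "nothing_to_do"   # no legacy folder present
    if new.exists():
        return "conflict"        # never overwrite an existing "Models" folder
    shutil.move(str(old), str(new))
    return "moved"
```

Call it with the directory that contains the program, for example `migrate_models_folder(".")`, and check the returned status before launching the new version.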

IMPROVEMENTS:

  • Added the functionality of using "Local Models" as well as LM Studio!!! CPU only installations must still use LM Studio.
  • Fixed setup.py to allow for CPU only installation and the overall functionality of the program when using only CPU.
  • Updated dependencies.

COMMENTS:

This may be the last release in a little while, but it's a good one! I'll continue to monitor the GitHub issues for any bugs or small changes and address those until further notice.

NEW CHAT MODELS!

  • Tested on RTX 4090.

V5.0 - CHAT MODELS!

28 Apr 22:41
4ea382c

BREAKING CHANGES:

  • This release has breaking changes. Notably, the "Embedding_Models" folder is no longer used and has been replaced with the "Models" folder.

IMPROVEMENTS:

  • Added the functionality of using "Local Models" as well as LM Studio!!! CPU only installations must still use LM Studio.
  • Fixed setup.py to allow for CPU only installation and the overall functionality of the program when using only CPU.
  • Updated dependencies.

COMMENTS:

This may be the last release in a little while, but it's a good one! I'll continue to monitor the GitHub issues for any bugs or small changes and address those until further notice.

NEW CHAT MODELS!

  • Tested on RTX 4090.

v4.4.1 - INSTRUCTOR fixed

14 Apr 00:11
cdeb79b

IMPROVEMENTS:

  • Set "normalize embeddings" back to "True." Not sure how it got set to false, which led to suboptimal search results.
  • Custom library for Instructor fixing all Instructor models - They're Back!
  • Changed WhisperSpeech to use "base" for both the s2a and t2s models.

COMMENTS:

  • Feel free to experiment with different sizes of "s2a" and "t2s" models by modifying tts_module.py. Here are some benchmarks for the various model combinations.
  • Benchmarks for Bark coming soon!

[Two images: benchmarks for the s2a/t2s model combinations]

v4.4.0 - Dude! Upgrade your Dependencies

04 Apr 17:51
2062674

BREAKING CHANGES:

  • Significant refactoring and some minimal script renaming as well.
  • Updating certain libraries to be compatible with one another.
  • Revised the list of vector models, so be aware if you created a database with one that has since been removed from the list.

IMPROVEMENTS:

  • Upgrade versions of dependencies as much as possible.
  • Using custom library for Instructor models.
  • Removed Jina models since they have comparable performance to existing models and, recently, require an API key to use, which is stupid.

KNOWN BUGS:

  • Using any of the Instructor models requires you to download it through the "Models Tab," and at runtime it will be downloaded a second time. This is because of the custom modified Instructor script. This will be fixed in a future patch, e.g. release 4.4.1.

v4.3.0 - reliability!!!

25 Mar 19:42
04919f6

BREAKING CHANGE

  • This version is the first to use TileDB instead of ChromaDB. Therefore, any databases previously created with ChromaDB will no longer work with Version 4.3.0+ of this program.
  • TileDB is much more robust, faster, will not give errors when trying to ingest massive amounts of documents, and is overall "better." ChromaDB has been used since the beginning of this repository so it's a major change. However, it was necessary due to ChromaDB not being able to handle massive amounts of documents reliably. It simply couldn't insert, for example, 200k+ vectors reliably. This was tested extensively with older versions that use duckdb + clickhouse as well as the current version that uses sqlite3. ALL HAIL TILEDB!

IMPROVEMENTS:

  • WhisperSpeech was added as an alternative to Bark when doing text-to-speech. WhisperSpeech is a very promising library managed by Collabora. It rivals Bark in terms of quality, VRAM/compute requirements, and speed - even outperforms it in certain metrics. As newer models from WhisperSpeech come out this will only improve further.
  • As part of the sentence-transformers library vector models were loaded with a batch size of 32. It became apparent after extensive testing that this caused slower performance and higher compute/memory requirements in EVERY SINGLE MODEL. database_interactions.py now sets the batch size based on the size of the vector model being used when creating the database with a GPU. When using a CPU, the batch size is always 2. This reduces memory requirements by over 50% in most cases with the larger models, and improves speed for GPU and CPU.
  • Updated some of the dependency versions; more updates will follow in the near future based on further testing.
  • Started using the pickle library to save representations of image files and audio transcriptions within a given database. When a user double-clicks on an image or audio file within an existing database, it will read the pickle file and route the call to where the file is located on your computer. Obviously, if you move the actual file (i.e. the image or audio file) it'll give an error.
  • Multiple scripts were refactored and/or renamed.
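The pickle-based lookup described above can be sketched roughly as follows. The record layout (`kind`, `source_path`) and both function names are assumptions made for illustration; the notes don't describe what the program actually stores in its pickle files.

```python
import pickle
from pathlib import Path


def save_file_reference(record_path, source_file, kind):
    """Pickle a small record pointing back at the original image or audio file.

    The dictionary keys are hypothetical, not the program's real format.
    """
    record = {"kind": kind, "source_path": str(Path(source_file).resolve())}
    with open(record_path, "wb") as fh:
        pickle.dump(record, fh)


def resolve_file_reference(record_path):
    """Load the record and return the original file's path.

    Raises FileNotFoundError if the original file was moved or deleted,
    which mirrors the error mentioned in the notes above.
    """
    with open(record_path, "rb") as fh:
        record = pickle.load(fh)
    source = Path(record["source_path"])
    if not source.is_file():
        raise FileNotFoundError(f"Original {record['kind']} file not found: {source}")
    return source
```

Because the record stores an absolute path, moving the original file breaks the link, which matches the behavior described in the list. Note that `pickle.load` should only be used on files the program wrote itself, since unpickling untrusted data can execute arbitrary code.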

BUG FIXES:

  • Changes made after Version 4.1 resulted in "Port," "Temperature," and "Max_tokens" not being updated properly. This was fixed by reverting the specific script managing this to a prior version.
  • The Models Tab was revamped to correct alignment, a longstanding bug.

v4.2.0 - general upgrade

22 Mar 14:54
a4c6589
  • Eliminated multiple bugs regarding creating, saving, and searching databases.
  • Faster database creation by specifying batch size based on the vector model selected, and also lower VRAM usage as a result. This was only done after extensive testing.
  • Made sure that all audio and image files are now displayed in the database viewer tab and that they open the appropriate file when double-clicked.
  • Updated user guide.
  • Removed certain vector models that had significantly higher compute/VRAM requirements but no greater quality than other models that require less compute/VRAM. This was only done after extensive testing.
  • Other relatively minor changes.
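As a rough illustration of the batch-size change in the list above: the v4.3.0 notes add that CPU runs always use a batch size of 2, while GPU runs pick a size based on the vector model. The size tiers and GPU numbers below are made up for the sketch (including the assumption that larger models get smaller batches); the real values live in database_interactions.py and are not given in these notes.

```python
CPU_BATCH_SIZE = 2  # stated in the v4.3.0 notes

# Hypothetical tiers and values, for illustration only.
GPU_BATCH_SIZES = {"small": 16, "base": 8, "large": 4}
DEFAULT_GPU_BATCH_SIZE = 8  # hypothetical fallback for unknown tiers


def choose_batch_size(model_size, device):
    """Pick an embedding batch size from the device and a model size tier."""
    if device.lower() == "cpu":
        return CPU_BATCH_SIZE
    return GPU_BATCH_SIZES.get(model_size, DEFAULT_GPU_BATCH_SIZE)


print(choose_batch_size("large", "cuda"))  # GPU: looked up by tier
print(choose_batch_size("large", "cpu"))   # CPU: always 2
```

The point of the design is that the library default of 32 is never used blindly; the batch size becomes a per-model, per-device decision.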

V4.1.0 - MULTIPLE databases!!!

28 Feb 15:08
d0d2985

Finally painstakingly implemented the ability to create multiple databases.

Changed the voice recorder to use the new backend for the file transcriber, which is the amazing WhisperS2T library.

Updated multiple libraries to the most recent versions while still maintaining inter-dependency compatibility.

Updated openai library to use the new API finally.

Refactored multiple scripts, revised some conditional checks, reduced some clutter printed to the command prompt, and other improvements.

v4.0 - CUDA 12.1+ support!

22 Feb 18:00
5410c1b

NOTE~

This release is only for Windows. Linux and macOS users should continue to use v3.5.2 until I can get those versions up and running. Download the ZIP file from Release 3.5.2 and follow the instructions in the readme.md INCLUDING the prerequisites, which are different from those of this release.

CUDA 12.1+ support finally brings support for flash attention 2 and other improvements. Those will be implemented in subsequent incremental releases. For this initial release, the following major improvements have been made:

The transcribe Tool has had a major improvement due to CUDA 12.1+ being supported. It allowed switching from faster-whisper to the amazing new library (only ~75 stars) located here:

https://github.com/shashikg/WhisperS2T

In summary, this library enables "batch" processing of audio using ctranslate2 version 4.0, which supports CUDA 12.1+. Here is a comparison of a long audio file under Release 3.5.2 versus this release:

Release 3.5.2

large-v2 model, float16 = 10 minutes 1 second

This release:

large-v2, float16, speed set at 50 = 54 seconds
medium.en, float16, speed set at 75 = 32 seconds
small.en, float16, speed set at 100 = 15 seconds!!!

This cannot be overstated and has been a feature that faster-whisper, while great, has been lacking for quite some time.
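To put the timings above in perspective, here is the arithmetic as a small script. It uses only the numbers listed above; keep in mind that only the large-v2 row is a like-for-like comparison, since the other two rows use smaller models.

```python
# Baseline from Release 3.5.2: large-v2, float16 = 10 minutes 1 second
BASELINE_SECONDS = 10 * 60 + 1  # 601 seconds

# Timings reported for this release, in seconds
new_timings = {
    "large-v2, speed 50": 54,
    "medium.en, speed 75": 32,
    "small.en, speed 100": 15,
}


def speedup(baseline_seconds, seconds):
    """How many times faster a run is compared to the baseline."""
    return baseline_seconds / seconds


speedups = {
    name: round(speedup(BASELINE_SECONDS, seconds), 1)
    for name, seconds in new_timings.items()
}

for name, factor in speedups.items():
    print(f"{name}: about {factor}x faster than the 3.5.2 baseline")
```

That works out to roughly 11x for large-v2, 19x for medium.en, and 40x for small.en.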

KNOWN ISSUES:

  1. Bark models will still run but with errors printed to the command prompt. This will be fixed as flash attention 2 is implemented and the option to NOT use FA2 is made available, thus preventing the errors.

  2. The voice transcriber sometimes takes way longer than in release 3.5.2 and/or prints multiple transcriptions. This is due to issues with the faster-whisper library itself - not ctranslate2 - since the improved transcribe file tool works just fine and it relies on ctranslate2 already. If it's not addressed in the future, this may require switching from faster-whisper to something else like whisperS2T, but for shorter audio faster-whisper is probably just as good and I'd rather keep it if possible.

  3. The transcriber tool no longer lets you choose a quantization or compute device (e.g. cuda or cpu). This was a choice in order to get initial CUDA 12+ support as soon as possible. It'll be addressed in subsequent releases.

Please contact me if you want to help out and want faster releases for Linux and macOS, as I don't own those systems. I plan to update support for Linux systems using Nvidia GPUs on Windows and AMD GPUs on Linux, just like before, as well as macOS support, just like before.

v3.5.2 - final CUDA 11.8 release

22 Feb 17:07
d2fa021

All subsequent releases will only support CUDA 12+ unless popular demand dictates otherwise.

v3.5 - revamp baby!

16 Feb 19:55
4dc86f4

Fix a HUGE BUG preventing databases created with certain vector models from returning any results...apparently embeddings need to be "normalized" when using similarity search...
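A small sketch of why normalization matters. The notes don't say which similarity score the program uses, so this assumes a dot-product style score; the vectors are toy values chosen for illustration. Without normalization, a long vector pointing in a different direction can outscore a short vector pointing in exactly the same direction as the query.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


query = [1.0, 1.0]
doc_close = [2.0, 2.0]   # same direction as the query, small magnitude
doc_long = [10.0, 0.0]   # different direction, large magnitude

# Raw scores: magnitude dominates, so the wrong document ranks first.
raw_close = dot(query, doc_close)
raw_long = dot(query, doc_long)

# Normalized scores: the dot product now equals cosine similarity,
# so the document pointing the same way as the query ranks first.
norm_close = dot(normalize(query), normalize(doc_close))
norm_long = dot(normalize(query), normalize(doc_long))

print("raw:       ", raw_close, raw_long)
print("normalized:", round(norm_close, 3), round(norm_long, 3))
```

Once every embedding has unit length, scores fall in a fixed range, which also makes any score threshold meaningful across models.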

Transcribe file now adds the transcription into the DB, enabling metadata searching and filtering by document type!

Revamped GUI to afford tabs more space, including the Databases tab, which will need it to create multiple databases in a subsequent release.

Revamp GitHub instructions and support matrix.

Refactoring

ELIMINATE annoying "qpainter" bug, hopefully for good! Dirty little bug bastard!

Change location to download bitsandbytes for Windows during installation process, and minor improvements to installation procedures.