DR-305 1.25 release blog #2047

daveatweaviate · 2024-05-07T22:25:47Z

UPDATE: 2024-05-17 Moved to new PR for everything except the blog front page

…nto DR-305-1-25-release-blog

databyjp · 2024-05-08T09:51:38Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

@@ -40,8 +44,22 @@ Above the threshold value, HNSW indexes are faster for queries, but they also ha

 Dynamic indexes are particularly useful in a multi-tenant setup. The resource costs are high to build an HNSW index for every tenant. However, if a tenant collection grows large enough, the index dynamically switches to HNSW. The smaller tenants continue to use flat indexes.


I would also point out that the flat index takes up very little memory, whereas HNSW requires an in-memory graph (to enable fast searches). I think this accounts for most of the resource requirements.
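To make the multi-tenant behavior concrete, here is a minimal sketch of the switching logic the paragraph describes. It is a toy model, not Weaviate's implementation: the `TenantIndex` class, the `SWITCH_THRESHOLD` value, and the one-way switch are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold for illustration only; not a Weaviate default.
SWITCH_THRESHOLD = 10_000


@dataclass
class TenantIndex:
    """Toy model of one tenant's vector index (hypothetical class)."""

    name: str
    object_count: int = 0
    index_type: str = "flat"  # every tenant starts on the low-memory flat index

    def add_objects(self, n: int) -> None:
        self.object_count += n
        # Assumed one-way switch: once a tenant grows past the threshold,
        # it moves to HNSW and stays there.
        if self.index_type == "flat" and self.object_count > SWITCH_THRESHOLD:
            self.index_type = "hnsw"


def index_types(tenants):
    """Map each tenant name to the index type it currently uses."""
    return {t.name: t.index_type for t in tenants}


small = TenantIndex("small-tenant")
large = TenantIndex("large-tenant")
small.add_objects(500)
large.add_objects(25_000)
print(index_types([small, large]))
```

Only the tenant that crosses the threshold pays for the in-memory HNSW graph; the small tenant stays on the flat index.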

databyjp · 2024-05-08T09:53:48Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vectr-index) An index configuration that allows Weavaite to dyanamically switch from flat to HNSW as object count increases.
- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that will allow you to use open source locally running and hosted embedding and language models!  
+- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vector-index) Dynamically switch from flat indexes to HNSW to efficiently scale as your data grows.
+- [**Raft**](/blog/weaviate-1-25-release#raft) Improves schema management for faster updates and more reliable clusters. 


I'm not sure the updates are faster as a whole (worth confirming). The two benefits that I'm aware of are concurrent schema operations and higher fault tolerance in clusters.

Background: Previously, any node being down would prevent schema operations, and only one schema operation could be performed at a time.
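A rough sketch of the fault-tolerance difference, assuming every node in the cluster takes part in the vote (which may not match how Weaviate assigns Raft voters). The function names are hypothetical; the majority rule itself is the standard Raft commit condition.

```python
def can_change_schema_all_nodes(total_nodes: int, nodes_up: int) -> bool:
    """Earlier behavior as described above: any node being down blocks schema changes."""
    return nodes_up == total_nodes


def can_change_schema_raft(total_nodes: int, nodes_up: int) -> bool:
    """Raft commits a change once a majority of nodes acknowledges it."""
    majority = total_nodes // 2 + 1
    return nodes_up >= majority


# Compare both rules on a 5-node cluster as nodes go down.
for up in (5, 4, 3, 2):
    print(up, can_change_schema_all_nodes(5, up), can_change_schema_raft(5, up))
```

Under these assumptions a 5-node cluster keeps accepting schema changes with up to two nodes down, whereas the all-nodes rule stops at the first failure.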

databyjp · 2024-05-08T09:54:08Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+- [**Raft**](/blog/weaviate-1-25-release#raft) Improves schema management for faster updates and more reliable clusters. 
+- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that empower you to host and run open source embedding and language models locally!
+- [**Batch vectorization**](/blog/weaviate-1-25-release#batch-vectorization) Faster and more efficient use of APIs during data import.
+- [**Automatic tenant creation**](/blog/weaviate-1-25-release#automatic-tenant-creation) Easier data uploads and tenant creation for you application.


databyjp · 2024-05-08T09:56:06Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that empower you to host and run open source embedding and language models locally!
+- [**Batch vectorization**](/blog/weaviate-1-25-release#batch-vectorization) Faster and more efficient use of APIs during data import.
+- [**Automatic tenant creation**](/blog/weaviate-1-25-release#automatic-tenant-creation) Easier data uploads and tenant creation for you application.
+- [**Search improvements**](/blog/weaviate-1-25-release#search-improvements) Hybrid search gets vector similarity search and grouping. method. Keyword search also gets grouping.


It might be worth clarifying "vector similarity search", as vector search was already available. I think it's aspects like "move to/away", which allow more nuanced vector searches.

How about "more nuanced" vector search, or "finer control over vector search"?

databyjp · 2024-05-08T09:58:22Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+
+<center><img src={WV8onRaft} width="50%" alt="Weaviate on a raft"/></center>
+
+Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant. 


I echo the comment above.

It may be worth highlighting that this primarily affects multi-node clusters.

databyjp · 2024-05-08T10:00:30Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+
+Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant. 
+
+There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads. 


Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details.

What about:
Earlier releases used two-phase commits to achieve strong consistency for the schema. Starting in 1.25, Weaviate uses Raft to improve the cluster's fault tolerance when it comes to the schema.

databyjp · 2024-05-08T10:02:46Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+
+There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads. 
+
+If you are new to Weaviate, you can take immediate advantage of Raft. If you are upgrading from an earlier version, be sure to review the [migration guide](/developers/weaviate/more-resources/migration/weaviate-1-25) before you upgrade. There is a one-time change in the upgrade process. 


How about:

"If you are upgrading from an earlier version" -> "If you are upgrading a Kubernetes deployment from an earlier version"

And maybe "There is a one-time change in the upgrade process." -> "There is a one-time migration that is required in the upgrade process."

daveatweaviate

Thanks @databyjp !

daveatweaviate · 2024-05-08T13:56:54Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vectr-index) An index configuration that allows Weavaite to dyanamically switch from flat to HNSW as object count increases.
- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that will allow you to use open source locally running and hosted embedding and language models!  
+- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vector-index) Dynamically switch from flat indexes to HNSW to efficiently scale as your data grows.
+- [**Raft**](/blog/weaviate-1-25-release#raft) Improves schema management for faster updates and more reliable clusters. 


daveatweaviate · 2024-05-08T14:44:12Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

@@ -40,8 +44,22 @@ Above the threshold value, HNSW indexes are faster for queries, but they also ha

 Dynamic indexes are particularly useful in a multi-tenant setup. The resource costs are high to build an HNSW index for every tenant. However, if a tenant collection grows large enough, the index dynamically switches to HNSW. The smaller tenants continue to use flat indexes.


daveatweaviate · 2024-05-08T14:47:42Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+
+<center><img src={WV8onRaft} width="50%" alt="Weaviate on a raft"/></center>
+
+Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant. 


daveatweaviate · 2024-05-08T14:57:37Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+
+Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant. 
+
+There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads. 


daveatweaviate · 2024-05-08T15:00:43Z

blog/2024-05-07-weaviate-1-25-release/_core-1-25-include.mdx

+
+There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads. 
+
+If you are new to Weaviate, you can take immediate advantage of Raft. If you are upgrading from an earlier version, be sure to review the [migration guide](/developers/weaviate/more-resources/migration/weaviate-1-25) before you upgrade. There is a one-time change in the upgrade process. 


daveatweaviate · 2024-05-08T15:02:24Z

developers/weaviate/search/hybrid.md

 `Hybrid` search combines the results of a vector search and a keyword (BM25F) search. 

-<<<<<<< HEAD
 Weaviate uses a ranking method to merge the search results. The [ranking method](#change-the-ranking-method) and the [ranking weights](#balance-keyword-and-vector-search) are configurable.


REVIEWER: Ignore this section. It fixes a merge error I happened to bump into.

daveatweaviate · 2024-05-20T13:45:08Z

Closing - published under the "all but blog" ticket

daveatweaviate added 9 commits May 7, 2024 13:52

DR-305 1.25-release-blog

2b17b8f

Merge branch '1_25_update/final' of github.com:weaviate/weaviate-io i…

436b2cb

…nto DR-305-1-25-release-blog

spell check

f88d1f1

headings

03841e9

raft

caded0d

batch

2cc9ecf

automatic-tenants

2480eb5

vector similarity

d304db6

groupby

cf897d4

daveatweaviate changed the base branch from main to 1_25_update/final May 7, 2024 22:26

staging

67670ab

databyjp reviewed May 8, 2024


daveatweaviate added 6 commits May 8, 2024 10:15

fix missing files

3aeae7f

broken step

acab9a9

restore images

df11750

fix deleted files

005e0bd

dynamic update

35de355

review

ef82e60

daveatweaviate commented May 8, 2024


daveatweaviate added 4 commits May 8, 2024 13:59

moudule to model

61745ca

anchor

f426376

anchors

ae2f0a7

staging

f46ac29

Base automatically changed from 1_25_update/final to main May 10, 2024 02:34

daveatweaviate added 5 commits May 17, 2024 14:57

Merge conflicts

985d350

Hide blog

6d485c0

rename

fcb5138

batch

5dadf82

staging

1de8bee

daveatweaviate mentioned this pull request May 17, 2024

Dr 305 1.25 release all but blog #2111

Merged

daveatweaviate closed this May 20, 2024

daveatweaviate deleted the DR-305-1-25-release-blog branch May 20, 2024 15:03
