DR-305 1.25 release blog #2047
@@ -40,8 +44,22 @@ Above the threshold value, HNSW indexes are faster for queries, but they also ha | |||
|
|||
Dynamic indexes are particularly useful in a multi-tenant setup. The resource costs are high to build an HNSW index for every tenant. However, if a tenant collection grows large enough, the index dynamically switches to HNSW. The smaller tenants continue to use flat indexes. |
I would also point out that the `flat` index takes up very little memory, whereas HNSW requires an in-memory graph (to enable fast searches). I think this is the big part of the resource requirements.
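To make the trade-off concrete, here is a minimal Python sketch of the switching rule described in the paragraph under review. It is a toy model, not Weaviate's implementation: `TenantIndex`, `DEFAULT_THRESHOLD`, and the threshold value are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold for illustration only; check the Weaviate docs for the real default.
DEFAULT_THRESHOLD = 10_000


@dataclass
class TenantIndex:
    """Toy model of one tenant's dynamic vector index."""

    object_count: int = 0
    index_type: str = "flat"  # every tenant starts with the low-memory flat index
    threshold: int = DEFAULT_THRESHOLD

    def add_objects(self, n: int) -> None:
        self.object_count += n
        # One-way switch: a tenant that grows past the threshold moves to HNSW.
        if self.index_type == "flat" and self.object_count > self.threshold:
            self.index_type = "hnsw"


tenants = {"small": TenantIndex(), "large": TenantIndex()}
tenants["small"].add_objects(500)
tenants["large"].add_objects(25_000)
index_types = {name: t.index_type for name, t in tenants.items()}
```

In this sketch only the large tenant pays the memory cost of an HNSW graph; the small tenant stays on the flat index.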
Done
- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vectr-index) An index configuration that allows Weavaite to dyanamically switch from flat to HNSW as object count increases. | ||
- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that will allow you to use open source locally running and hosted embedding and language models! | ||
- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vector-index) Dynamically switch from flat indexes to HNSW to efficiently scale as your data grows. | ||
- [**Raft**](/blog/weaviate-1-25-release#raft) Improves schema management for faster updates and more reliable clusters. |
I'm not sure the updates are faster as a whole (could confirm). The two benefits that I'm aware of are: allow concurrent schema operations, and higher fault tolerance in clusters.
Background: Previously, any node being down would prevent schema operations, and only one schema operation could be performed at a time.
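The fault-tolerance difference can be sketched with the standard Raft majority rule. This is a simplified illustration, not Weaviate code: the function names are hypothetical, and the "all nodes must be up" branch only models the background described above.

```python
def raft_quorum(total_nodes: int) -> int:
    """Smallest majority of the cluster, as Raft requires to commit an entry."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """Number of nodes that can be down while a majority is still reachable."""
    return total_nodes - raft_quorum(total_nodes)


def can_change_schema(total_nodes: int, live_nodes: int, use_raft: bool) -> bool:
    """Hypothetical helper comparing the two behaviors described in this thread."""
    if use_raft:
        # With Raft, a majority of live nodes is enough.
        return live_nodes >= raft_quorum(total_nodes)
    # Assumed model of the earlier behavior: any node being down blocks the change.
    return live_nodes == total_nodes


# A three-node cluster with one node down.
before = can_change_schema(total_nodes=3, live_nodes=2, use_raft=False)
after = can_change_schema(total_nodes=3, live_nodes=2, use_raft=True)
```

Under these assumptions, a three-node cluster with one node down cannot change the schema in the earlier model but can with Raft.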
Done
- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that empower you to host and run open source embedding and language models locally! | ||
- [**Batch vectorization**](/blog/weaviate-1-25-release#batch-vectorization) Faster and more efficient use of APIs during data import. | ||
- [**Automatic tenant creation**](/blog/weaviate-1-25-release#automatic-tenant-creation) Easier data uploads and tenant creation for you application. |
Typo: "you" should be "your".
- [**Search improvements**](/blog/weaviate-1-25-release#search-improvements) Hybrid search gets vector similarity search and grouping. method. Keyword search also gets grouping. |
It might be worth clarifying "vector similarity search", as vector search was already available. I think it's aspects like "move to/away", which allow more nuanced vector searches.
How about "more nuanced" vector search, or "finer control over vector search"?
|
||
<center><img src={WV8onRaft} width="50%" alt="Weaviate on a raft"/></center> | ||
|
||
Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant. |
I echo the comment above.
It may be worth highlighting that this primarily affects multi-node clusters.
Done
|
||
There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads. |
> Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details.

What about:

> Earlier releases used two-phase commits to achieve strong consistency for the schema. Starting in 1.25, Weaviate uses Raft to improve the cluster's fault tolerance when it comes to the schema.
Done
|
||
If you are new to Weaviate, you can take immediate advantage of Raft. If you are upgrading from an earlier version, be sure to review the [migration guide](/developers/weaviate/more-resources/migration/weaviate-1-25) before you upgrade. There is a one-time change in the upgrade process. |
How about:
"If you are upgrading from an earlier version"
-> "If you are upgrading a Kubernetes deployment from an earlier version"
Maybe "There is a one-time change in the upgrade process."
-> "There is a one-time migration that is required in the upgrade process."
Done
Thanks @databyjp !
`Hybrid` search combines the results of a vector search and a keyword (BM25F) search. | ||
|
||
<<<<<<< HEAD | ||
Weaviate uses a ranking method to merge the search results. The [ranking method](#change-the-ranking-method) and the [ranking weights](#balance-keyword-and-vector-search) are configurable. |
REVIEWER: Ignore this section. It fixes a merge error I happened to bump into.
Closing - published under the all but blog ticket |
issue
staging blog
staging releases
UPDATE: 2024-05-17 Moved to new PR for everything except the blog front page