DR-305 1.25 release blog #2047

Closed
wants to merge 25 commits into from

@daveatweaviate daveatweaviate commented May 7, 2024

issue
staging blog
staging releases

UPDATE: 2024-05-17 Moved to new PR for everything except the blog front page

@daveatweaviate daveatweaviate changed the base branch from main to 1_25_update/final May 7, 2024 22:26
@@ -40,8 +44,22 @@ Above the threshold value, HNSW indexes are faster for queries, but they also ha

Dynamic indexes are particularly useful in a multi-tenant setup. The resource costs are high to build an HNSW index for every tenant. However, if a tenant collection grows large enough, the index dynamically switches to HNSW. The smaller tenants continue to use flat indexes.
Contributor

I would also point out that the flat index takes up very little memory, whereas HNSW requires an in-memory graph (to enable fast searches). I think this is the big part of the resource requirements.

Contributor Author

Done
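
For readers following this thread, here is a minimal sketch of the per-tenant behaviour discussed above: small tenants stay on a cheap flat index, and a tenant that grows past a threshold switches to HNSW. The `TenantIndex` class, the `threshold` value, and the one-way switch are assumptions for illustration only, not Weaviate's implementation.

```python
from dataclasses import dataclass


@dataclass
class TenantIndex:
    """Toy model of a per-tenant index that starts flat and may upgrade."""

    threshold: int = 10_000  # hypothetical value chosen for this sketch
    kind: str = "flat"       # flat: little memory; hnsw: in-memory graph
    object_count: int = 0

    def add_objects(self, n: int) -> None:
        self.object_count += n
        # Assumed one-way switch: upgrade once the tenant outgrows the threshold.
        if self.kind == "flat" and self.object_count > self.threshold:
            self.kind = "hnsw"


tenants = {"small": TenantIndex(), "large": TenantIndex()}
tenants["small"].add_objects(500)
tenants["large"].add_objects(25_000)

# Only the large tenant pays the memory cost of an HNSW graph.
kinds = {name: index.kind for name, index in tenants.items()}
```

The point of the sketch is the cost split: with many tenants, only those that actually grow large take on the HNSW memory overhead.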

- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vectr-index) An index configuration that allows Weavaite to dyanamically switch from flat to HNSW as object count increases.
- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that will allow you to use open source locally running and hosted embedding and language models!
- [**Dynamic Vector Index**](/blog/weaviate-1-25-release#dynamic-vector-index) Dynamically switch from flat indexes to HNSW to efficiently scale as your data grows.
- [**Raft**](/blog/weaviate-1-25-release#raft) Improves schema management for faster updates and more reliable clusters.
Contributor

I'm not sure the updates are faster as a whole (could confirm). The two benefits that I'm aware of are concurrent schema operations and higher fault tolerance in clusters.

Background: Previously, any node being down would prevent schema operations, and only one schema operation could be performed at a time.

Contributor Author

Done
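
As a rough illustration of the fault-tolerance point in this thread: the background above says that any node being down used to block schema operations, whereas a Raft-style consensus algorithm only needs a majority of nodes to agree. The two function names below are hypothetical, and the sketch models only the availability rule, not Weaviate's actual schema code.

```python
def schema_change_allowed_all_nodes(total_nodes: int, healthy_nodes: int) -> bool:
    """Earlier behaviour as described in the thread: every node must be up."""
    return healthy_nodes == total_nodes


def schema_change_allowed_quorum(total_nodes: int, healthy_nodes: int) -> bool:
    """Raft-style rule: a strict majority of nodes must be reachable."""
    quorum = total_nodes // 2 + 1
    return healthy_nodes >= quorum


# A three-node cluster with one node down.
blocked_before = not schema_change_allowed_all_nodes(3, 2)
allowed_with_quorum = schema_change_allowed_quorum(3, 2)
```

With three nodes, losing one node blocks the all-nodes rule but still leaves a majority of two, which is why this mostly matters for multi-node clusters.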

- [**Raft**](/blog/weaviate-1-25-release#raft) Improves schema management for faster updates and more reliable clusters.
- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that empower you to host and run open source embedding and language models locally!
- [**Batch vectorization**](/blog/weaviate-1-25-release#batch-vectorization) Faster and more efficient use of APIs during data import.
- [**Automatic tenant creation**](/blog/weaviate-1-25-release#automatic-tenant-creation) Easier data uploads and tenant creation for you application.
Contributor

Typo: you

- [**New Modules**](/blog/weaviate-1-25-release#new-modules) Loads of new modules that empower you to host and run open source embedding and language models locally!
- [**Batch vectorization**](/blog/weaviate-1-25-release#batch-vectorization) Faster and more efficient use of APIs during data import.
- [**Automatic tenant creation**](/blog/weaviate-1-25-release#automatic-tenant-creation) Easier data uploads and tenant creation for you application.
- [**Search improvements**](/blog/weaviate-1-25-release#search-improvements) Hybrid search gets vector similarity search and grouping. method. Keyword search also gets grouping.
Contributor

It might be worth clarifying "vector similarity search", as vector search was already available. I think it's aspects like "move to/away", which allow more nuanced vector searches.

How about "more nuanced" vector search, or "finer control over vector search"?


<center><img src={WV8onRaft} width="50%" alt="Weaviate on a raft"/></center>

Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant.
Contributor

I echo the comment above.

Contributor

It may be worth highlighting that this primarily affects multi-node clusters.

Contributor Author

Done


Weaviate clusters can be large; there can be a lot of nodes to coordinate. The host systems have to work together reliably and efficiently, even under high loads. [Raft](https://raft.github.io/) is a robust consensus algorithm that helps make Weaviate faster and more fault tolerant.

There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads.
Contributor

> Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details.

What about:

> Earlier releases used two-phase commits to achieve strong consistency for the schema. Starting in 1.25, Weaviate uses Raft to improve the cluster's fault tolerance when it comes to the schema.

Contributor Author

Done


There are two types of data in a Weaviate cluster: your actual data, and system state information. Earlier releases use the same mechanism to store both types of data. Starting in 1.25, Weaviate uses Raft to store schema information and cluster state details. The underlying data [storage](/developers/weaviate/concepts/storage) is managed the same way as before - [replication](/developers/weaviate/concepts/replication-architecture) and [sharding](/developers/weaviate/concepts/cluster) continue to safeguard your data while making it available for your applications. Raft ensures your schemas are safe and helps you to reliably scale your production workloads.

If you are new to Weaviate, you can take immediate advantage of Raft. If you are upgrading from an earlier version, be sure to review the [migration guide](/developers/weaviate/more-resources/migration/weaviate-1-25) before you upgrade. There is a one-time change in the upgrade process.
Contributor

How about:

"If you are upgrading from an earlier version" -> "If you are upgrading a Kubernetes deployment from an earlier version"

Contributor

Maybe "There is a one-time change in the upgrade process." -> "There is a one-time migration that is required in the upgrade process."

Contributor Author

Done

Contributor Author

@daveatweaviate daveatweaviate left a comment

Thanks @databyjp !

`Hybrid` search combines the results of a vector search and a keyword (BM25F) search.

<<<<<<< HEAD
Weaviate uses a ranking method to merge the search results. The [ranking method](#change-the-ranking-method) and the [ranking weights](#balance-keyword-and-vector-search) are configurable.
Contributor Author

REVIEWER: Ignore this section. It fixes a merge error I happened to bump into.

Base automatically changed from 1_25_update/final to main May 10, 2024 02:34
@daveatweaviate
Contributor Author

Closing - published under the "all but blog" ticket

@daveatweaviate daveatweaviate deleted the DR-305-1-25-release-blog branch May 20, 2024 15:03
2 participants