Merge branch 'main' into fix-989
langchain4j committed Apr 29, 2024
2 parents decf497 + 0ae7f0a commit 7fc604b
Showing 67 changed files with 502 additions and 161 deletions.
42 changes: 33 additions & 9 deletions .github/pull_request_template.md
@@ -1,21 +1,45 @@
<!-- Thank you so much for your contribution! -->
<!-- Please fill in all the sections below. -->

<!-- Please open the PR as a draft initially. Once it is reviewed and approved, we will ask you to add documentation and examples. -->
<!-- Please note that PRs with breaking changes will be rejected. -->
<!-- Please note that PRs without tests will be rejected. -->

## Context
<!-- Please provide some context so that it is clear why this change is required. -->
<!-- Please note that PRs will be reviewed based on the priority of the issues they address. -->
<!-- We ask for your patience. We are doing our best to review your PR as quickly as possible. -->
<!-- Please refrain from pinging and asking when it will be reviewed. Thank you for understanding! -->


## Issue
<!-- Please paste the link to the issue this PR is addressing. For example: https://github.com/langchain4j/langchain4j/issues/1012 -->


## Change
<!-- Please describe the changed you made. -->
<!-- Please describe the changes you made. -->


## Checklist
Before submitting this PR, please check the following points:
## General checklist
<!-- Please double-check the following points and mark them like this: [X] -->
- [ ] There are no breaking changes
- [ ] I have added unit and integration tests for my change
- [ ] All unit and integration tests in the module I have added/changed are green
- [ ] All unit and integration tests in the [core](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-core) and [main](https://github.com/langchain4j/langchain4j/tree/main/langchain4j) modules are green
- [ ] I have manually run all the unit and integration tests in the module I have added/changed, and they are all green
- [ ] I have manually run all the unit and integration tests in the [core](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-core) and [main](https://github.com/langchain4j/langchain4j/tree/main/langchain4j) modules, and they are all green
<!-- Before adding documentation and example(s) (below), please wait until the PR is reviewed and approved. -->
- [ ] I have added/updated the [documentation](https://github.com/langchain4j/langchain4j/tree/main/docs/docs)
- [ ] I have added an example in the [examples repo](https://github.com/langchain4j/langchain4j-examples) (only for "big" features)
- [ ] I have added my new module in the [BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml) (only when a new module is added)


## Checklist for adding new model integration
<!-- Please double-check the following points and mark them like this: [X] -->
- [ ] I have added my new module in the [BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml)


## Checklist for adding new embedding store integration
- [ ] I have added a {NameOfIntegration}EmbeddingStoreIT that extends from either EmbeddingStoreIT or EmbeddingStoreWithFilteringIT
<!-- Please double-check the following points and mark them like this: [X] -->
- [ ] I have added a `{NameOfIntegration}EmbeddingStoreIT` that extends from either `EmbeddingStoreIT` or `EmbeddingStoreWithFilteringIT`
- [ ] I have added my new module in the [BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml)


## Checklist for changing existing embedding store integration
<!-- Please double-check the following points and mark them like this: [X] -->
- [ ] I have manually verified that the `{NameOfIntegration}EmbeddingStore` works correctly with the data persisted using the latest released version of LangChain4j
36 changes: 26 additions & 10 deletions CONTRIBUTING.md
@@ -1,8 +1,12 @@
Thank you for investing your time and effort in contributing to our project. We appreciate it a lot! 🤗

# General Guidelines

- If you want to contribute a bug fix or a new feature that isn't listed in the [issues](https://github.com/langchain4j/langchain4j/issues) yet, please open a new issue for it and link it to your PR.
# Current situation (25 April 2024)
There are over 60 open PRs. Please help us by reviewing them first, before opening new ones. 🙏


# General guidelines
- If you want to contribute a bug fix or a new feature that isn't listed in the [issues](https://github.com/langchain4j/langchain4j/issues) yet, please open a new issue for it. We will prioritize it shortly.
- Follow [Google's Best Practices for Java Libraries](https://jlbp.dev/)
- Keep the code compatible with Java 8. We plan to increase the baseline to Java 17 a bit later.
- Avoid adding new dependencies as much as possible. If absolutely necessary, try to use the same libraries which are already used in the project.
@@ -13,27 +17,34 @@ Thank you for investing your time and effort in contributing to our project, we
- Follow existing code style present in the project.
- Large features should be discussed with maintainers before implementation. Please ping @langchain4j in the comments on the issue.


# Priorities
All [issues](https://github.com/langchain4j/langchain4j/issues) are prioritized by maintainers. There are 4 priorities: [P1](https://github.com/langchain4j/langchain4j/issues?q=is%3Aissue+is%3Aopen+label%3AP1), [P2](https://github.com/langchain4j/langchain4j/issues?q=is%3Aissue+is%3Aopen+label%3AP2), [P3](https://github.com/langchain4j/langchain4j/issues?q=is%3Aissue+is%3Aopen+label%3AP3) and [P4](https://github.com/langchain4j/langchain4j/issues?q=is%3Aissue+is%3Aopen+label%3AP4).

Please start with the higher priorities. PRs will be reviewed in order of priority, with bugs being a higher priority than new features.

Please note that we do not have the capacity to review all PRs immediately.
Please note that we do not have the capacity to review PRs immediately. We ask for your patience. We are doing our best to review your PR as quickly as possible.


# Opening an issue
- Please fill in all sections of the issue template.

# Opening a PR
- Link an [issue](https://github.com/langchain4j/langchain4j/issues) to your PR. If there is no issue yet, open one.

# Opening a draft PR
- Please open the PR as a draft initially. Once it is reviewed and approved, we will then ask you to finalize it (see section below).
- Fill in all the sections of the PR template.
- Make sure you've added tests.
- Make sure you've added documentation where required.
- For new big features, make sure you've added an example in the [examples repository](https://github.com/langchain4j/langchain4j-examples) (as a separate PR, linked to the main one).
- Please make it easier to review your PR:
- Keep changes as small as possible.
- Do not combine refactoring with changes in a single PR.
- Avoid reformatting existing code.


# Finalizing the draft PR
- Add [documentation](https://github.com/langchain4j/langchain4j/tree/main/docs/docs) (if required).
- Add an example to the [examples repository](https://github.com/langchain4j/langchain4j-examples) (if required).
- [Mark a PR as ready for review](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/changing-the-stage-of-a-pull-request#marking-a-pull-request-as-ready-for-review)


# Guidelines on adding a new model integration
- [Integration with Anthropic](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-anthropic) is a good example.
- Use the official SDK if available.
@@ -43,12 +54,17 @@ Please note that we do not have the capacity to review all PRs immediately.
- Add a new module to the appropriate section of the [BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml).
- It would be great if you could add a [Spring Boot starter](https://github.com/langchain4j/langchain4j-spring).


# Guidelines on adding a new embedding store integration
- [Integration with Chroma](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-chroma) is a good example.
- Use the official SDK if available.
- If the official SDK is not available, use Retrofit and Gson to implement the client.
- `{IntegrationName}EmbeddingStoreIT` should extend from `EmbeddingStoreWithFilteringIT` or `EmbeddingStoreIT` and pass all tests.
- Add a `{IntegrationName}EmbeddingStoreIT`. It should extend from `EmbeddingStoreWithFilteringIT` or `EmbeddingStoreIT` and pass all tests.
- Document the new integration [here](https://github.com/langchain4j/langchain4j/blob/main/README.md), [here](https://github.com/langchain4j/langchain4j/tree/main/docs/docs/integrations/embedding-stores) and [here](https://github.com/langchain4j/langchain4j/blob/main/docs/docs/integrations/embedding-stores/index.md).
- Add an example to the [examples repository](https://github.com/langchain4j/langchain4j-examples), similar to [this](https://github.com/langchain4j/langchain4j-examples/tree/main/chroma-example).
- Add a new module to the appropriate section of the [BOM](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-bom/pom.xml).
- It would be great if you could add a [Spring Boot starter](https://github.com/langchain4j/langchain4j-spring).
- It would be great if you could add a [Spring Boot starter](https://github.com/langchain4j/langchain4j-spring). (after


# Guidelines on changing an existing embedding store integration
- Ensure that your changes are backwards compatible. `Embedding`s and `TextSegment`s persisted with the latest released version of LangChain4j should still work.
@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion docs/docs/integrations/embedding-stores/index.md
@@ -1,5 +1,5 @@
---
title: Comparison Table
title: Comparison table of all supported Embedding Stores
hide_title: false
sidebar_position: 0
---
2 changes: 1 addition & 1 deletion docs/docs/integrations/language-models/index.md
@@ -1,5 +1,5 @@
---
title: Comparison Table
title: Comparison Table of all supported Language Models
hide_title: false
sidebar_position: 0
---
36 changes: 29 additions & 7 deletions docs/docs/tutorials/7-rag.md
@@ -4,9 +4,6 @@ sidebar_position: 8

# RAG (Retrieval-Augmented Generation)

[Great tutorial on RAG](https://www.sivalabs.in/langchain4j-retrieval-augmented-generation-tutorial/)
by [Siva](https://www.sivalabs.in/).

An LLM's knowledge is limited to the data it has been trained on.
If you want to make an LLM aware of domain-specific knowledge or proprietary data, you can:
- Use RAG, which we will cover in this section
@@ -82,8 +79,8 @@ in glob: `glob:**.pdf`.
</details>

3. Now, we need to preprocess and store documents in a specialized embedding store, also known as a vector database.
This is necessary to quickly find relevant pieces of information on the fly when a user asks a question.
We can use any of our 15+ [supported embedding stores](/category/embedding-stores),
This is necessary to quickly find relevant pieces of information when a user asks a question.
We can use any of our 15+ [supported embedding stores](/integrations/embedding-stores),
but for simplicity, we will use an in-memory one:
```java
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
@@ -139,13 +136,34 @@ String answer = assistant.chat("How to do Easy RAG with LangChain4j?");

## RAG APIs
LangChain4j offers a rich set of APIs to make it easy for you to build custom RAG pipelines,
ranging from very simple ones to very advanced ones. In this section, we will cover the main domain classes and APIs.
ranging from simple ones to advanced ones.
In this section, we will cover the main domain classes and APIs.

### Document
A `Document` class represents an entire document, such as a single PDF file or a web page.
At the moment, the `Document` can only represent textual information,
but future updates will enable it to support images and tables as well.

### Metadata
Each `Document` contains `Metadata`.
It stores information about the `Document`, such as its name, source, creation date, owner,
or any other relevant details.

The `Metadata` is stored as a key-value map, where the key is of the `String` type,
and the value can be one of the following types: `String`, `Integer`, `Long`, `Float`, `Double`.

`Metadata` is useful for several reasons:
- When including the content of the `Document` in a prompt to the LLM,
metadata entries can also be included, providing the LLM with additional information to consider.
For example, providing the `Document` name and source can help improve the LLM's understanding of the content.
- When searching for relevant content to include in the prompt,
one can filter by `Metadata` entries.
For example, you can narrow down a semantic search to only `Document`s
belonging to a specific owner.
- When the source of the `Document` is updated (e.g., a particular page of documentation),
one can easily locate the corresponding `Document` by its metadata entry "source"
and update it in the `EmbeddingStore` as well.
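
The filtering and prompt-enrichment uses listed above can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the LangChain4j `Metadata` or `Filter` classes, and `MetadataSketch`, `Doc`, `doc`, `filterBy` and `withMetadata` are hypothetical names introduced here. It models metadata as a `String`-keyed map, as described in this section.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetadataSketch {

    // Hypothetical stand-in for a document: text plus key-value metadata.
    static final class Doc {
        final String text;
        final Map<String, Object> metadata = new HashMap<>();

        Doc(String text) {
            this.text = text;
        }
    }

    // Convenience factory that fills in the "source" and "owner" entries.
    static Doc doc(String text, String source, String owner) {
        Doc d = new Doc(text);
        d.metadata.put("source", source);
        d.metadata.put("owner", owner);
        return d;
    }

    // Keeps only the documents whose metadata entry `key` equals `value`.
    static List<Doc> filterBy(List<Doc> docs, String key, Object value) {
        List<Doc> result = new ArrayList<>();
        for (Doc d : docs) {
            if (value.equals(d.metadata.get(key))) {
                result.add(d);
            }
        }
        return result;
    }

    // Prepends the requested metadata entries to the text,
    // e.g. before adding the content to a prompt.
    static String withMetadata(Doc d, String... keys) {
        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            Object value = d.metadata.get(key);
            if (value != null) {
                sb.append(key).append(": ").append(value).append('\n');
            }
        }
        return sb.append(d.text).toString();
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                doc("Easy RAG notes", "docs/rag.md", "alice"),
                doc("Pricing FAQ", "docs/faq.md", "bob"));

        // Narrow the candidates to one owner before any semantic search.
        List<Doc> aliceDocs = filterBy(docs, "owner", "alice");

        // Include source and owner alongside the content in the prompt.
        System.out.println(withMetadata(aliceDocs.get(0), "source", "owner"));
    }
}
```

The same `filterBy` helper, called with the `"source"` key, would locate the entry to replace when the underlying page changes. In a real pipeline the filter would be applied by the embedding store rather than in application code.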

### Document Loader
You can create a `Document` from a `String`, but a simpler method is to use one of our document loaders included in the library:
- `FileSystemDocumentLoader` from the `langchain4j` module
@@ -209,7 +227,8 @@ instead of the entire knowledge base in the prompt:
- LLMs have a limited context window, so the entire knowledge base might not fit
- The more information you provide in the prompt, the longer it takes for the LLM to process it and respond
- The more information you provide in the prompt, the more you pay
- Irrelevant information in the prompt might confuse or distract the LLM and increase the chance of hallucinations
- Irrelevant information in the prompt might distract the LLM and increase the chance of hallucinations
- The more information you provide in the prompt, the harder it is to explain which information the LLM based its response on

We can address these concerns by splitting a knowledge base into smaller, more digestible segments.
How big should those segments be? That is a good question. As always, it depends.
@@ -282,6 +301,9 @@ More details are coming soon.

Currently supported embedding stores can be found [here](/category/embedding-stores).

### Filter
More details are coming soon.

### Embedding Store Ingestor
More details are coming soon.

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -6,7 +6,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-anthropic/pom.xml
@@ -5,7 +5,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-azure-ai-search/pom.xml
@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-azure-cosmos-mongo-vcore/pom.xml
@@ -6,7 +6,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-azure-open-ai/pom.xml
@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-bedrock/pom.xml
@@ -6,7 +6,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-bom/pom.xml
@@ -6,7 +6,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-cassandra/pom.xml
@@ -10,7 +10,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-chatglm/pom.xml
@@ -6,7 +6,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-chroma/pom.xml
@@ -7,7 +7,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion langchain4j-cohere/pom.xml
@@ -6,7 +6,7 @@
<parent>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-parent</artifactId>
<version>0.30.0</version>
<version>0.31.0-SNAPSHOT</version>
<relativePath>../langchain4j-parent/pom.xml</relativePath>
</parent>
