Expanding more document splitters #218
Closed
jiangsier-xyz started this conversation in Ideas
Replies: 1 comment
@jiangsier-xyz there is a plan to add more splitters, including one for Markdown, but it is not clear when we will have the capacity to do so. If you are looking to contribute, please add it to the |
What we currently need is a document splitter for Markdown, which is already the most popular format for technical documentation.
LangChain for Python already has a built-in implementation. Its logic is simple: it splits recursively and uses " " as the fallback delimiter, which makes it unsuitable for documents in non-Latin-script languages such as Chinese, where words are not separated by spaces.
Is there any plan to add more document splitters to langchain4j, at least one that supports Markdown? And if someone contributes code, which module would be the most appropriate place for it?
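
To make the recursive strategy mentioned above concrete, here is a minimal sketch in plain Java. It is not the LangChain implementation and does not use any langchain4j API: the class name `MarkdownSplitterSketch`, the `split` method, and the separator list are all hypothetical. The one deliberate difference from a pure " " fallback is a sentence-punctuation step (including CJK marks) that is tried before spaces, so text without spaces can still be cut at sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MarkdownSplitterSketch {

    // Zero-width split points, tried from coarse to fine (an assumed ordering).
    // Because every pattern is zero-width, no characters are dropped.
    private static final Pattern[] SEPARATORS = {
        Pattern.compile("(?<=\\n)(?=#{1,6} )"),            // before a heading line
        Pattern.compile("(?<=\\n\\n)"),                    // after a blank line
        Pattern.compile("(?<=\\n)"),                       // after a line break
        // After sentence punctuation; the escapes are the CJK full stop,
        // exclamation mark, and question mark.
        Pattern.compile("(?<=[\u3002\uFF01\uFF1F.!?])"),
        Pattern.compile("(?<= )")                          // after a space
    };

    /** Splits text into chunks of at most maxChars UTF-16 code units. */
    public static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> out = new ArrayList<>();
        split(text, maxChars, 0, out);
        return out;
    }

    private static void split(String text, int maxChars, int level, List<String> out) {
        if (text.length() <= maxChars) {
            out.add(text);
            return;
        }
        if (level == SEPARATORS.length) {
            // Last resort: hard cut. This may split a surrogate pair;
            // a real implementation would cut on code point boundaries.
            for (int i = 0; i < text.length(); i += maxChars) {
                out.add(text.substring(i, Math.min(text.length(), i + maxChars)));
            }
            return;
        }
        StringBuilder buf = new StringBuilder();
        for (String piece : SEPARATORS[level].split(text)) {
            // Flush the buffer if adding this piece would exceed the limit.
            if (buf.length() > 0 && buf.length() + piece.length() > maxChars) {
                out.add(buf.toString());
                buf.setLength(0);
            }
            if (piece.length() > maxChars) {
                // Still too long: retry with the next, finer separator.
                split(piece, maxChars, level + 1, out);
            } else {
                buf.append(piece);
            }
        }
        if (buf.length() > 0) {
            out.add(buf.toString());
        }
    }
}
```

Calling `MarkdownSplitterSketch.split(markdown, 500)` returns the chunks in document order, and concatenating them reproduces the input exactly. The sketch ignores fenced code blocks, tables, and chunk overlap, all of which a production splitter would likely need to handle.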