dalty999/how-to-ingest-pdfs-with-unstructured

Introduction

This repo demonstrates how to use the Unstructured library with Weaviate. The Unstructured library parses a variety of data sources and extracts structured text from them, including documents such as PDFs, PowerPoint presentations, and JPEG images.
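As a rough illustration of what "extracting structured text" looks like, the sketch below partitions a file into elements and counts them by category. It assumes the `unstructured` package is installed and uses its `partition` function; the helper names `partition_file` and `summarize_elements` are hypothetical and not taken from the notebook, and the file path in the usage note is a placeholder.

```python
from collections import Counter
from types import SimpleNamespace


def partition_file(path):
    """Split a document into Unstructured elements.

    Assumption: the `unstructured` package is installed. The import is lazy
    so the helper below can be used without it.
    """
    from unstructured.partition.auto import partition

    return partition(filename=path)


def summarize_elements(elements):
    """Count elements per category (for example Title or NarrativeText)."""
    return Counter(el.category for el in elements)


# Stand-in elements so the summary can be tried without parsing a real file.
# Real elements returned by Unstructured also expose .category and .text.
demo_elements = [
    SimpleNamespace(category="Title", text="A Sample Paper"),
    SimpleNamespace(category="NarrativeText", text="First paragraph."),
    SimpleNamespace(category="NarrativeText", text="Second paragraph."),
]
for category, count in summarize_elements(demo_elements).items():
    print(f"{category}: {count}")
```

With the library installed, `summarize_elements(partition_file("path/to/paper.pdf"))` would give a quick view of how a document was split, which can help when comparing single-column and two-column layouts.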

The included dataset consists of two publicly available research papers: one with a single-column layout and one with a two-column layout. The notebook starts with a basic approach to using Unstructured and finishes with an end-to-end example that connects to your Weaviate instance, defines a schema, uploads the data, and runs two queries.
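The end-to-end flow described above could be sketched roughly as follows. This is not the notebook's code: it assumes the v4 `weaviate-client` package and a Weaviate instance running locally, and the collection name `Document`, the property names, the keyword (BM25) query, and both helper functions are hypothetical choices made for illustration.

```python
from types import SimpleNamespace


def elements_to_objects(elements, source):
    """Turn parsed elements into plain dicts, skipping elements with no text."""
    objects = []
    for el in elements:
        text = (el.text or "").strip()
        if not text:
            continue
        objects.append({"source": source, "category": el.category, "text": text})
    return objects


def upload_and_search(objects, query, collection_name="Document"):
    """Create the collection if needed, upload objects, run a keyword query.

    Assumptions: weaviate-client v4 is installed and a local Weaviate
    instance is reachable. Imports are lazy so the helper above works
    without the client installed.
    """
    import weaviate
    from weaviate.classes.config import DataType, Property

    client = weaviate.connect_to_local()
    try:
        if not client.collections.exists(collection_name):
            # Hypothetical schema: three text properties per element.
            client.collections.create(
                collection_name,
                properties=[
                    Property(name="source", data_type=DataType.TEXT),
                    Property(name="category", data_type=DataType.TEXT),
                    Property(name="text", data_type=DataType.TEXT),
                ],
            )
        collection = client.collections.get(collection_name)
        collection.data.insert_many(objects)
        response = collection.query.bm25(query=query, limit=3)
        return [obj.properties for obj in response.objects]
    finally:
        client.close()


# Stand-in elements; in practice these would come from Unstructured.
demo_elements = [
    SimpleNamespace(category="Title", text="A Sample Paper"),
    SimpleNamespace(category="NarrativeText", text="  Body text.  "),
    SimpleNamespace(category="PageBreak", text=""),
]
print(elements_to_objects(demo_elements, source="sample.pdf"))
```

Keeping the conversion step separate from the upload step makes it possible to inspect the objects before anything is written to Weaviate.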

Read the blog post for more information!

About

Weaviate, Unstructured | Ingest PDFs into Weaviate

Languages

  • Jupyter Notebook 100.0%