lancedb/lance-deeplearning-recipes

Lance Deep Learning - recipes


Dive into building deep learning pipelines using Lance datasets! This repository contains examples to help you use Lance datasets in your deep learning projects.
  • These are built using Lance, a free, open-source, columnar data format that requires no setup.

  • High-performance random access: More than 1000x faster than Parquet.

  • Zero-copy, automatic versioning: manage versions of your data automatically, and reduce redundancy with built-in zero-copy logic.


Join our community for support: Discord, Twitter

Why Lance

Convenience
The Lance columnar file format is designed for large-scale DL workloads. Its columnar layout lets you manage complex, unstructured multi-modal datasets easily and efficiently. Updates, filtering, and zero-copy versioning let you iterate faster on large datasets. It is designed to be used with images, videos, 3D point clouds, audio, and of course tabular data. It supports any POSIX file system, as well as cloud storage such as AWS S3 and Google Cloud Storage.


Performance
The Lance format supports fast reads and writes, making data loading at training time significantly faster.

Dataset Examples

Examples of how to convert existing datasets to the Lance format.

Example | Scripts | Read The Blog!
Creating text dataset for LLM pre-training | Open In Colab | Ghost
Creating instruction dataset for LLM fine-tuning | Open In Colab |

Training Examples

Practical examples showcasing how to adapt your Lance dataset to popular deep learning projects.

Example | Notebook & Scripts
PEFT Supervised Fine-tuning of Gemma using Hugging Face Trainer | Open In Colab
LLM pre-training | Open In Colab
COCO Image segmentation | Open In Colab

Contributing Examples

If you're working on some cool deep learning examples using Lance that you'd like to add to this repo, please open a PR!

About

Deep Learning how-to's using Lance file format
