Embed DuckDB extensions in the Rill binary #4574

Closed
nishantmonu51 opened this issue Apr 10, 2024 · 3 comments · Fixed by #4919

@nishantmonu51
Collaborator

At present, the DuckDB extension binaries are downloaded when Rill starts, which can fail if the DuckDB CDN is having issues.

HTTP Error: Failed to download extension "json" at URL "http://nightly-extensions.duckdb.org/v0.9.2/osx_arm64/json.duckdb_extension.gz" Extension "json" is an existing extension.

We would like to avoid downloading the extensions at runtime.

@nishantmonu51
Collaborator Author

@himadrisingh: Can you get this assigned on the Infra side and prioritize it?

@himadrisingh
Contributor

This will increase the size of the CLI, though?

@begelundmuller
Contributor

Implementation proposal:

  • To maintain a single binary, we need to use Go's embed package to embed the DuckDB extensions we use in the binary. See runtime/pkg/examples for an example of how we use it to embed example projects.
  • We should implement the embedding in our DuckDB driver (runtime/drivers/duckdb) and unpack the embedded extensions before first opening a DuckDB handle.
    • We should probably unpack to ~/.duckdb/extensions, which is DuckDB's default location for extensions (notice a specific nested folder structure is required).
    • We should avoid unpacking an extension if it already exists (for faster startup after the first run).
  • We also need a script that downloads the DuckDB extensions we rely on and places them in the embed directory (probably runtime/drivers/duckdb/embed). The script could be placed in the scripts directory and invoked when make cli is executed (in the cli.prepare Make target).
  • Lastly, we need to embed different DuckDB extensions for different architectures (they are architecture-specific). Since we build for multiple architectures using a cross-compiler, we can't just download extensions for the current system architecture. There are a few ways to solve this issue. Here's one proposal:
    • Have the download script download extensions for all supported architectures and place them in runtime/drivers/duckdb/embed/<arch>/<duckdb version>/
    • Use Go build tags to only embed extensions for the current architecture. For example, runtime/drivers/duckdb/embed_darwin_arm64.go would have the build tag //go:build darwin && arm64 and embed ./embed/darwin_arm64/**

@begelundmuller begelundmuller changed the title Package duckDB extension alongwith rill binary Embed DuckDB extensions in the Rill binary May 14, 2024