rafaelvargas/bytebridge

bytebridge

A data tool designed to move data seamlessly between various sources and destinations.

CLI

Bytebridge aims to provide a CLI that makes it easy to transfer data between multiple sources and targets. Some examples are shown below:

Parquet to PostgreSQL

bytebridge transfer \
    --connections connections.json \
    --source parquet_conn \
    --source-object data.parquet \
    --target postgresql_conn \
    --target-object bytebridge.public.data

PostgreSQL to Parquet

bytebridge transfer \
    --connections connections.json \
    --source postgresql_conn \
    --source-object bytebridge.public.data \
    --target parquet_conn \
    --target-object data.parquet

In both cases, the connection metadata is defined in the connections.json file. An example of the file is:

{
    "parquet_conn": {
        "type": "parquet"
    },
    "postgresql_conn": {
        "type": "postgresql",
        "parameters": {
            "host": "[hostname_of_the_connection]",
            "user": "[postgresql_username_to_be_used]",
            "password": "[postgresql_password_to_be_used]",
            "port": "[postgresql_port_to_be_used]"
        }
    }
}
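
Because the file must be strict JSON (a trailing comma after the last parameter, for instance, is rejected by standard parsers), it can help to check it before running a transfer. The following is a minimal sketch, not part of bytebridge: `load_connections` is a hypothetical helper, the only rule it enforces is that each connection declares a "type" as in the example above, and the host, user, password, and port values are placeholders.

```python
import json


def load_connections(text):
    """Parse connections JSON text and check that each entry declares a "type".

    Hypothetical helper for illustration; bytebridge may validate differently.
    """
    # json.loads raises json.JSONDecodeError on invalid JSON, e.g. a trailing comma.
    connections = json.loads(text)
    if not isinstance(connections, dict):
        raise ValueError("top level must be an object mapping names to definitions")
    for name, definition in connections.items():
        if not isinstance(definition, dict) or "type" not in definition:
            raise ValueError(f"connection {name!r} is missing a 'type' field")
    return connections


# Placeholder values, assumed for this example only.
EXAMPLE = """
{
    "parquet_conn": {"type": "parquet"},
    "postgresql_conn": {
        "type": "postgresql",
        "parameters": {
            "host": "localhost",
            "user": "example_user",
            "password": "example_password",
            "port": "5432"
        }
    }
}
"""

connections = load_connections(EXAMPLE)
print(sorted(connections))
```

To check a real file, pass its contents instead, for example `load_connections(open("connections.json").read())`.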

API

Coming soon.
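
Until a Python API is available, one option is to drive the CLI from Python. The sketch below is not bytebridge's API: `build_transfer_command` and `run_transfer` are hypothetical helpers that only assemble the `bytebridge transfer` flags shown in the CLI section and hand them to `subprocess`. Actually running a transfer assumes the `bytebridge` executable is installed and on the PATH.

```python
import shlex
import subprocess


def build_transfer_command(connections_file, source, source_object,
                           target, target_object):
    """Return the argument list for a `bytebridge transfer` invocation."""
    return [
        "bytebridge", "transfer",
        "--connections", connections_file,
        "--source", source,
        "--source-object", source_object,
        "--target", target,
        "--target-object", target_object,
    ]


def run_transfer(command):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    # Requires the bytebridge CLI to be installed; not called in this sketch.
    return subprocess.run(command, check=True)


# Same arguments as the "Parquet to PostgreSQL" CLI example.
command = build_transfer_command(
    "connections.json",
    "parquet_conn", "data.parquet",
    "postgresql_conn", "bytebridge.public.data",
)
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting issues; `shlex.join` is used here only to print a copy-pasteable command line.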

Data Connectors

Currently Supported

| Name | Type | Client |
| --- | --- | --- |
| PostgreSQL | Database | psycopg |
| MySQL | Database | mysql-connector-python |
| MS SQL Server | Database | pymssql |
| Parquet | File | pyarrow |
| CSV | File | csv |

Planned

| Name | Type |
| --- | --- |
| SQLite | Database |
| MongoDB | Database |
| MariaDB | Database |
| Clickhouse | Database |
| BigQuery | Database |
| Oracle | Database |
| Cassandra | Database |
| ORC | File |
| Avro | File |
| Excel (XLSX) | File |
| JSON | File |

Contributing

Contributions are welcome. See the project's contribution guidelines for details.

License

This project is licensed under the terms of the Apache 2.0 license.
