Example implementation of the ODBC driver connection for Dremio and Jupyter Notebook

vcwild/dremio-jupyter-connection

Plug DREMIO Data Lake Driver into Jupyter Notebooks

Standalone Container

Setup

import pandas as pd
import pyodbc
import credentials  # separate local file that defines `user` and `password`
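The `credentials` module itself is not shown in this README. As a minimal sketch, assuming the values are kept in environment variables, it could look like the following. The function name `load_credentials` and the variable names `DREMIO_USER` and `DREMIO_PASSWORD` are hypothetical and chosen only for illustration.

```python
# credentials.py -- hypothetical sketch; the original file is not part of this README.
import os


def load_credentials(env=None):
    """Return (user, password) read from DREMIO_USER and DREMIO_PASSWORD.

    Both variable names are assumptions made for this example.
    """
    env = os.environ if env is None else env
    try:
        return env["DREMIO_USER"], env["DREMIO_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None


# In a real credentials.py, the names used by the notebook would then be set with:
# user, password = load_credentials()
```

Reading the values at runtime keeps the password out of the notebook and out of version control.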

Pyodbc settings

host = 'localhost'
port = 31010
uid = credentials.user
pwd = credentials.password
driver = '/opt/dremio-odbc/lib64/libdrillodbc_sb64.so' # ubuntu/debian default odbc driver
# Build the ODBC connection string, then open the connection
conn_str = (
    "Driver={};ConnectionType=Direct;HOST={};PORT={};"
    "AuthenticationType=Plain;UID={};PWD={};"
).format(driver, host, port, uid, pwd)
cnxn = pyodbc.connect(conn_str, autocommit=True)
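The connection string can be inspected before a connection is attempted, which may help when debugging driver or authentication errors. The sketch below assembles the same keys used above. The helper name `build_connection_string` and the placeholder credentials are illustrative assumptions, and the helper does not escape values that contain `;`.

```python
def build_connection_string(driver, host, port, uid, pwd):
    """Assemble the semicolon-separated ODBC connection string."""
    settings = [
        ("Driver", driver),
        ("ConnectionType", "Direct"),
        ("HOST", host),
        ("PORT", port),
        ("AuthenticationType", "Plain"),
        ("UID", uid),
        ("PWD", pwd),
    ]
    return "".join(f"{key}={value};" for key, value in settings)


conn_str = build_connection_string(
    "/opt/dremio-odbc/lib64/libdrillodbc_sb64.so",
    "localhost",
    31010,
    "example_user",      # placeholder, not a real account
    "example_password",  # placeholder, not a real password
)
print(conn_str)
```

The resulting string can be passed to `pyodbc.connect(conn_str, autocommit=True)` in place of the inline `format` call.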

Read dataframe based on SQL Query

sql = 'SELECT * FROM "test"."weather" LIMIT 10'
df = pd.read_sql(sql, cnxn)
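When pandas is not needed, the same query can be read through a plain DB-API cursor. The sketch below assumes only the standard cursor interface (`execute`, `description`, `fetchall`), which pyodbc connections also provide. The helper name `fetch_dicts` is hypothetical, and an in-memory SQLite database stands in for Dremio so the example runs without a server.

```python
import sqlite3
from contextlib import closing


def fetch_dicts(connection, sql):
    """Run a query on a DB-API connection and return the rows as dicts."""
    with closing(connection.cursor()) as cursor:
        cursor.execute(sql)
        # cursor.description holds one tuple per column; the name comes first.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in data so the sketch runs without a Dremio server.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE weather (station TEXT, date TEXT)")
demo.execute("INSERT INTO weather VALUES ('USW00023272', '2018-01-01')")
rows = fetch_dicts(demo, "SELECT * FROM weather LIMIT 10")
print(rows)
demo.close()
```

With the notebook's connection, the call would be `fetch_dicts(cnxn, sql)`.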

Output

df.head()
   STATION      NAME                           LATITUDE  LONGITUDE  ELEVATION  DATE        PRCP  SNOW  SNWD  TAVG  TMAX  TMIN
0  USW00023272  SAN FRANCISCO DOWNTOWN, CA US  37.7705   -122.4269  45.7       2018-01-01  0.00                    61    48
1  USW00023272  SAN FRANCISCO DOWNTOWN, CA US  37.7705   -122.4269  45.7       2018-01-02  0.00                    61    52
2  USW00023272  SAN FRANCISCO DOWNTOWN, CA US  37.7705   -122.4269  45.7       2018-01-03  0.09                    58    53
3  USW00023272  SAN FRANCISCO DOWNTOWN, CA US  37.7705   -122.4269  45.7       2018-01-04  0.06                    63    53
4  USW00023272  SAN FRANCISCO DOWNTOWN, CA US  37.7705   -122.4269  45.7       2018-01-05  0.26                    61    52

Requirements

References

DREMIO - The Data Lake Engine docs.
