# pyspark-notebook

Here are 190 public repositories matching this topic...

hyunjoonbok / PySpark

PySpark functions and utilities with examples. Assists ETL process of data modeling

spark hadoop pyspark sparksql pyspark-notebook pyspark-api pyspark-python pyspark-machine-learning

Updated Dec 3, 2020
Jupyter Notebook

rlilojr / Detecting-Malicious-URL-Machine-Learning

machine-learning big-data apache-spark logistic-regression binary-classification naive-bayes-classification pyspark-notebook one-vs-rest malicious-url-detection linear-support-vector-machine

Updated Jul 24, 2018
Jupyter Notebook

josephmachado / efficient_data_processing_spark

Code for "Efficient Data Processing in Spark" Course

apache-spark pyspark data-engineering minio data-pipeline pyspark-notebook

Updated May 13, 2024
Python

brennerh1 / databricks-demos

Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and more.

pyspark spark-streaming databricks pyspark-notebook databricks-notebooks databricks-demos

Updated May 27, 2021
Python

archivesunleashed / notebooks

Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archives Unleashed Toolkit.

spark python3 notebooks web-archives pyspark-notebook juypter-notebook

Updated Dec 5, 2022
Jupyter Notebook

arjones / bigdata-workshop-es

Big Data Workshop in Spanish

docker postgres machine-learning scala kafka spark apache-spark postgresql jupyter-notebook superset pyspark pyspark-notebook

Updated Nov 9, 2023
HTML

aakinlalu / Crime-Classification-using-PySpark

Classify crimes into different categories using PySpark

machine-learning pyspark-notebook pyspark-mllib pyspark-python crime-classification

Updated May 20, 2019
Jupyter Notebook

mohanakrishnavh / PySpark-Tutorial

pyspark pyspark-notebook pyspark-tutorial pyspark-mllib

Updated May 8, 2018
Jupyter Notebook

josephmachado / docker_for_data_engineers

Code for blog at: https://www.startdataengineering.com/post/docker-for-de/

docker docker-compose pyspark pyspark-notebook apachespark

Updated Apr 29, 2024
C

jplane / pyspark-devcontainer

A simple VS Code devcontainer setup for local PySpark development

python spark jupyter vscode pyspark jupyter-notebooks devcontainer pyspark-notebook devcontainers

Updated Jul 11, 2023
Jupyter Notebook

jacobceles / intro-to-colab-pyspark-emr

A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics like EMR sizing, Google Colaboratory, fine-tuning PySpark jobs, and much more.

pyspark dataframe pyspark-notebook pyspark-tutorial colaboratory colab-notebook colab-tutorial

Updated Nov 12, 2021
Jupyter Notebook

microsoft / Fabric-RTA-FlightStream

Microsoft Fabric Real-time Analytics flight streaming

streaming etl powerbi pyspark-notebook kql lakehouse

Updated Feb 8, 2024
Jupyter Notebook

johntelforduk / betfair-data-analysis

Explore, analyse and visualise Betfair Historical Data Feed using PySpark.

spark betfair pyspark matplotlib pyspark-notebook jupiter-notebook betfair-historical-data

Updated Feb 10, 2023
Jupyter Notebook

yennanliu / analysis

Repo of practical approaches to data science problems, including notebook demos and working scripts | #DS | #analysis

data-science machine-learning statistics deep-learning algorithms analysis tensorflow sklearn pytorch pyspark-notebook

Updated Oct 13, 2020
Jupyter Notebook

prabeesh / pyspark-notebook

PySpark Notebook with Docker

python docker spark apache-spark docker-compose notebook docker-image bigdata jupyter-notebook pyspark python-notebook pyspark-notebook

Updated Aug 18, 2015
Python

hyeonsangjeon / dataplatform

Hadoop 3.2 in single-node or cluster mode with the gotty web terminal, Spark, Jupyter PySpark, Hive, and other ecosystem tools.

hive hadoop hadoop-cluster hadoop-mapreduce hadoop-docker pyspark-notebook zeppelin-notebook hadoop-ecosystem

Updated Nov 7, 2019
Shell

miquido / DataScience

Useful scripts and notebooks for Data Science. The project was made by Miquido. https://www.miquido.com/

docker machine-learning spark pipeline aws-s3 pyspark pyspark-notebook pyspark-tutorial pyspark-mllib

Updated Jul 6, 2023
Jupyter Notebook

jitsejan / pyspark-101

A PySpark course to get started with the basics for a Data Engineer

python spark pyspark pyspark-notebook pyspark-tutorial

Updated May 4, 2018
Jupyter Notebook

imsanjoykb / PySpark-Bootcamp

My practice and projects on PySpark

hadoop pyspark spark-streaming sparkjava transformation hadoop-mapreduce spark-sql pyspark-notebook pyspark-mllib pyspark-machine-learning pyspark-ml

Updated Sep 17, 2021
Jupyter Notebook

awkepler / PySpark_Spark_Adventure

Sample code for PySpark

python pyspark pyspark-notebook pyspark-tutorial mlib pyspark-mllib pyspark-machine-learning

Updated May 1, 2019
Jupyter Notebook
