Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Updated Mar 9, 2024 (HTML)
Ephemeral Hadoop clusters using Google Cloud Platform
Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service
Creating an Inverted Index of words occurring in a large set of documents extracted from web pages using Hadoop MapReduce and Google Dataproc
EtlFlow is an ecosystem of functional Scala libraries based on ZIO for running complex, auditable workflows that can interact with Google Cloud Platform, AWS, Kubernetes, databases, SFTP servers, on-prem systems, and more.
Performance Observability for Apache Spark
Customisable Dataproc HA cluster (Debian 9) with ZooKeeper, Kafka, BigQuery, and other tools/jobs, using Terraform
Data pipeline from the Global Historical Climatology Network dataset
The company GreenMiles NYC Taxis is interested in investing in the passenger car transportation sector, with a vision of a less polluted future and of aligning with current market trends.
Working examples for some components on GCP, and instructions on how to run them.
GKE with Terraform, Dataproc with Terraform
Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and pipelines.
A search engine for querying politically themed social media insights