Spark examples give a quick overview of the Spark API using Java and Python

selvamselvam/spark-examples

Spark examples

Java, Python and Jupyter notebook

These Spark examples give a quick overview of the Spark API.

Spark is built on the concept of distributed datasets, which contain arbitrary Java or Python objects. You create a dataset from external data, then apply parallel operations to it. The building block of the Spark API is its RDD API.
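The create-then-transform flow described above can be sketched with a small word count. This is a minimal illustration, not code taken from this repository: the function names split_words and count_words_with_spark and the application name are hypothetical, and the sketch assumes PySpark is installed and a JDK is available.

```python
import re


def split_words(line):
    """Lowercase a line and return its words (runs of letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", line.lower())


def count_words_with_spark(lines):
    """Count words across `lines` with the RDD API on a local Spark master.

    Hypothetical helper for illustration; requires PySpark and a JDK.
    """
    # Imported lazily so split_words stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")                 # run in-process on all local cores
        .appName("rdd-word-count-sketch")   # assumed name, any string works
        .getOrCreate()
    )
    try:
        # parallelize() builds an RDD from an in-memory list;
        # sparkContext.textFile(path) would load external data instead.
        rdd = spark.sparkContext.parallelize(lines)
        counts = (
            rdd.flatMap(split_words)              # one record per word
               .map(lambda word: (word, 1))       # pair each word with a count of 1
               .reduceByKey(lambda a, b: a + b)   # sum the counts per word in parallel
        )
        return dict(counts.collect())             # bring the results back to the driver
    finally:
        spark.stop()
```

Calling count_words_with_spark(["to be or not to be"]) should return a dictionary of per-word counts. The transformations are lazy, so nothing is computed until collect() runs.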

  • Java uses Gradle
  • Python uses PySpark
  • Jupyter notebook

Features

  • Explains the Spark environment setup
  • Uses JDK 11
  • IntelliJ Community Edition IDE
  • PySpark

Tech

Spark examples uses the following software to work properly:

  • OpenJDK 11
  • PySpark
  • MongoDB
  • Windows 10

nc or netcat

The nc (or netcat) utility is used for just about anything under the sun involving TCP or UDP. It can open TCP connections, send UDP packets, listen on arbitrary TCP and UDP ports, do port scanning, and deal with both IPv4 and IPv6.

The socket examples use the following command, which listens on TCP port 9999 (-l) and keeps listening after a client disconnects (-k):

nc -lk 9999

On Windows, download the netcat build distributed with Nmap and run the following command:

netcat -lk 9999
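Where netcat is not available, a small standard-library listener can stand in for it while trying the socket examples. The sketch below is not part of this repository; open_listener and serve_lines are hypothetical names, and it assumes that a single client connects and that newline-terminated UTF-8 text is what the client expects.

```python
import socket


def open_listener(host="localhost", port=9999):
    """Open a TCP listening socket, similar in spirit to `nc -l 9999`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for the old socket to time out.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def serve_lines(server_sock, lines):
    """Accept one client and send it each line, newline-terminated."""
    conn, _addr = server_sock.accept()   # blocks until a client connects
    with conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))
```

A typical use is to call open_listener(), then serve_lines(sock, ["hello spark", "hello world"]) once the client has been started. Unlike nc -lk, this sketch serves one connection and then returns.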

Installation

Spark requires a JDK to run; these examples use OpenJDK 11.
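A quick way to confirm that a Java launcher is on the PATH before starting Spark is sketched below. The helper name java_version_banner is hypothetical, and the check only reports what the java command prints; it does not prove that a full JDK, rather than a runtime only, is installed.

```python
import shutil
import subprocess


def java_version_banner():
    """Return the first line printed by `java -version`, or None if java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr, so capture both streams.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    banner = (result.stderr or result.stdout).strip()
    return banner.splitlines()[0] if banner else ""
```

If the function returns None, install a JDK and make sure its bin directory is on the PATH before running the examples.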
