predictiveworks/cdap-spark

CDAP Spark

CDAP Spark is an all-in-one library for unified, plug & play data analytics. It is built on Apache Spark, which has become the big data platform of choice for enterprises.

CDAP Spark covers all flavors of modern data analytics, from deep learning and machine learning to business rule and query analysis, up to comprehensive text & time series processing.

CDAP Spark externalizes modern data analytics in the form of plugins for Google CDAP data pipelines, and enables data analysts and scientists to build data-driven applications without coding.

Externalization is an appropriate means of making advanced analytics reusable and transparent. Notably, it also secures the knowledge of how enterprise data are transformed into insights, foresights and knowledge.

We selected Google's CDAP because this unified environment was designed to cover all aspects of corporate data processing, from data integration & ingestion to SQL & business rules, up to machine learning & deep learning.

CDAP Spark offers more than 150 analytics plugins for CDAP-based pipelines and provides the world's largest collection of visual analytics components.

Overview

Visual Analytics is supported by the following modules:

Module Description
DL Externalizes deep learning algorithms (adapted from Intel's Analytics Zoo) as plugins for Google CDAP data pipelines.
ML Externalizes Apache Spark ML machine learning algorithms as plugins for Google CDAP data pipelines.
TS Completes Apache Spark with proven time series algorithms and also externalizes them as plugins for Google CDAP data pipelines.
Rules Externalizes the Drools rule engine as a plugin for CDAP data pipelines.
SQL Supports the application of Apache Spark compliant SQL queries for CDAP batch and stream pipelines.
Text Integrates John Snow Labs' excellent Spark NLP library with Google CDAP and offers proven NLP features as plugins for CDAP data pipelines.

Background

Interested in more detailed information? Read here
