This is an internal project with Intel: a framework for monitoring data quality across disparate sources and automating that monitoring using Python.

praveenchoragudi/DataQuality_Enhancement_Framework

framework

Data Quality enhancement and monitoring framework for High Performance Systems leveraging Machine Learning techniques

The data originating from the source systems is not standardized, integrated, deduplicated, or cleansed, which leads to ineffective, non-actionable business indicators.

The data must be aggregated to identify the common master data sets. The "clean" dataset is then sourced and fed to the learning engine, and the generated model is applied to auto-cleanse the data, suggest fixes, or run human-assisted (semi-automated) scenarios.
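
The three outcomes described above (auto-cleanse, suggest, human-assisted) can be pictured with a minimal sketch. This is not the project's learning engine: it swaps in plain string similarity from Python's standard `difflib` module in place of a trained model. The function names `normalize`, `learn_master_values`, and `triage` are hypothetical, and the two thresholds are illustrative assumptions.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(value.lower().split())


def learn_master_values(clean_values):
    """Stand-in for the learning step: map each normalized form to its
    canonical spelling, taken from the "clean" dataset."""
    return {normalize(v): v for v in clean_values}


def triage(value, master, auto_threshold=0.9, suggest_threshold=0.6):
    """Route an incoming value to one of three outcomes.

    Returns (action, canonical_value), where action is "auto_fix",
    "suggest", or "manual_review". Both thresholds are assumptions.
    """
    key = normalize(value)
    if key in master:
        # Exact match after normalization: safe to fix on the fly.
        return ("auto_fix", master[key])

    # Otherwise find the closest known master value by similarity ratio.
    best_key, best_score = None, 0.0
    for candidate in master:
        score = SequenceMatcher(None, key, candidate).ratio()
        if score > best_score:
            best_key, best_score = candidate, score

    if best_score >= auto_threshold:
        return ("auto_fix", master[best_key])
    if best_score >= suggest_threshold:
        # Close, but not close enough to change without confirmation.
        return ("suggest", master[best_key])
    return ("manual_review", None)
```

Values that match a master entry closely are corrected automatically, moderately similar values produce a recommendation, and everything else is left for a person to review. A real deployment would likely replace the similarity ratio with the model produced by the learning engine.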

The goal is a significant reduction in data quality deviations, with on-the-fly fixes for about 70% of the master data entities identified by data quality indicators, and dynamic recommendations to fix about 20% of the persistent data sets.
