
⚒️ Data preprocessing is the process of transforming raw data into a clean, understandable format. It is an important step in data mining because raw data is rarely usable as collected. The quality of the data should be checked before applying machine learning or data mining algorithms.

rojaAchary/Data_Preprocessing_Techniques


⚒️ Data Preprocessing Techniques ✨

Data preprocessing is the step in which data is transformed, or encoded, into a state that a machine can easily parse. In other words, the features of the data become interpretable by an algorithm.
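As a minimal sketch of what "encoding" can mean in practice, the snippet below one-hot encodes a text column with pandas so that every feature becomes numeric. The DataFrame and its column names (`color`, `size_cm`) are made up for illustration and are not taken from this repository's notebooks.

```python
import pandas as pd

# Hypothetical toy data: one categorical and one numeric feature.
raw = pd.DataFrame({
    "color": ["red", "green", "red", "blue"],
    "size_cm": [10.0, 12.5, 9.0, 11.0],
})

# One-hot encode the categorical column; the numeric column passes through unchanged.
encoded = pd.get_dummies(raw, columns=["color"], dtype=int)

print(encoded)
```

After encoding, each category has its own 0/1 column, which is a form most scikit-learn estimators can consume directly.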

By the end of this, you will be equipped to handle data gracefully. So let's get started 🏃‍♀️

Why Data Preprocessing ⚡

✅Accuracy: To check whether the data entered is correct.
✅Believability: The data should be trustworthy.
✅Completeness: To check whether any data is missing or was not recorded.
✅Consistency: To check whether copies of the same data kept in different places match.
✅Interpretability: How easily the data can be understood.
✅Timeliness: The data should be kept up to date.
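A few of the checks above can be sketched with pandas. This is only an illustration under assumed data: the columns (`customer_id`, `age`, `country`) and the plausible age range of 0 to 120 are hypothetical choices, not part of this repository.

```python
import pandas as pd

# Hypothetical toy records; column names are illustrative only.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "age": [34, 41, 41, None],
    "country": ["IN", "in", "in", "US"],
})

# Completeness: number of missing values per column.
missing_per_column = df.isna().sum()

# Consistency: exact duplicate rows, and the same category spelled differently.
duplicate_rows = int(df.duplicated().sum())
distinct_raw = df["country"].nunique()
distinct_normalized = df["country"].str.upper().nunique()

# Accuracy: values outside an assumed plausible range (0 to 120 years).
implausible_ages = df[(df["age"] < 0) | (df["age"] > 120)]

print(missing_per_column)
print(duplicate_rows, distinct_raw, distinct_normalized, len(implausible_ages))
```

Believability, interpretability, and timeliness usually need domain knowledge or metadata (source, documentation, timestamps) and are harder to check automatically.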

Primary Tasks 🎯

Libraries required

Use the package manager pip to install the following:

pip install numpy
pip install pandas
pip install scikit-learn

Table of Contents:

| No | Topics | Code Link 🔗 |
|----|--------|--------------|
| 1 | Cardinality Encoding | Code |
| 2 | Delete Missing Values | Code |
| 3 | Delete Outliers | Code |
| 4 | Feature Discretization | Code |
| 5 | Feature Rescaling | Code |
| 6 | Handling Imbalance | Code |
| 7 | Data Imputation - Mean | Code |
| 8 | Imputation Missing Labels | Code |
| 9 | Normalization | Code |
| 10 | One Hot Encoding | Code |
| 11 | Outliers Dealing | Code |
| 12 | Pandas Categorical with Sklearn | Code |
| 13 | Preprocess Categorical Features | Code |
| 14 | Standardize IRIS | Code |
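To give a flavor of the topics listed above, here is a small sketch that chains mean imputation, standardization, and min-max rescaling with scikit-learn. It is not the code from the linked notebooks; the feature matrix `X` is invented for illustration, and min-max scaling is used here as one possible reading of "feature rescaling".

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical feature matrix with one missing value (np.nan) in each column.
X = np.array([
    [1.0, 200.0],
    [2.0, np.nan],
    [np.nan, 400.0],
    [3.0, 600.0],
])

# Mean imputation: each NaN is replaced by the mean of its column.
X_imputed = SimpleImputer(strategy="mean").fit_transform(X)

# Standardization: each column is shifted to zero mean and scaled to unit variance.
X_standardized = StandardScaler().fit_transform(X_imputed)

# Min-max rescaling: each column is mapped linearly onto the range [0, 1].
X_rescaled = MinMaxScaler().fit_transform(X_imputed)

print(X_imputed)
print(X_standardized)
print(X_rescaled)
```

In a real project the imputer and scalers would normally be fitted on the training split only and then applied to the test split, so that no information leaks from test data into preprocessing.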

Want to Stay Updated !!

Fork 🍴 the repository

Learned Something !!

Give a 🌟 to support me 😊

@misc{Charged Neuron,
    author       = {Roja Achary},
    title        = {Data Preprocessing Techniques},
    Credits      = {websites,CA,me,AV},
    month        = {November},
    year         = {2021}
}
