nursnaaz/25DaysInMachineLearning

25DaysInMachineLearning

I will update this repository with content and materials for learning machine learning with Python and statistics.

Day - 1: 6-4-2019 We learnt about: Different types of Analytics, Different types of Machine Learning, Why Python?, Features of Python

Day - 2: 7-4-2019 We started practising Python: Ways to implement Python, Why Jupyter Notebook?, What is a keyword and a variable?, Rules for creating an identifier, Different data structures: List, Tuple, Set, Dictionary, String, Typecasting

Day - 3: 13-4-2019 Control Statements, Conditional Statements, What is Indentation?, Functions, Parameters, Default Parameters, Arbitrary Parameters
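The Day 3 topics can be illustrated with a small sketch. The function names (`greet`, `total`, `describe`) are hypothetical and chosen only for this example; they are not taken from the course notebooks.

```python
def greet(name, greeting="Hello"):
    """Default parameter: `greeting` falls back to "Hello" when omitted."""
    return f"{greeting}, {name}!"


def total(*numbers, **options):
    """Arbitrary parameters: any number of positional and keyword arguments."""
    result = sum(numbers)
    if options.get("double"):
        result *= 2
    return result


def describe(n):
    """Conditional statement; the indentation defines each block."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


# Control statement: a for loop that skips odd numbers with `continue`
evens = []
for i in range(10):
    if i % 2:
        continue
    evens.append(i)
```

Calling `greet("Asha")` uses the default greeting, while `total(1, 2, 3, double=True)` shows both kinds of arbitrary parameters in one call.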

Day - 4: 14-4-2019 Recursive Function, Lambda Function, Map, Filter, Reduce, List Comprehension, Set Comprehension, Try, Except, Finally
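A minimal sketch of the Day 4 topics follows. The sample list and the helper names (`factorial`, `safe_divide`, `calls`) are assumptions made for illustration only.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Lambda: a small anonymous function
square = lambda x: x * x

# map applies a function to every item; filter keeps items that pass a test
squares = list(map(square, numbers))
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce folds the sequence into a single value (here, the sum)
total = reduce(lambda acc, x: acc + x, numbers, 0)

# List and set comprehensions
even_squares = [x * x for x in numbers if x % 2 == 0]
remainders = {x % 3 for x in numbers}


def factorial(n):
    """Recursive function: n! = n * (n - 1)!, with 0! = 1! = 1."""
    return 1 if n <= 1 else n * factorial(n - 1)


calls = []  # records every attempted division


def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None
    finally:
        # Runs whether or not an exception was raised
        calls.append((a, b))
```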

Day - 5: 27-4-2019 Class and Object, OS Library, Modules in Python, import and from import, NumPy: Why NumPy?, NumPy Basics

Day - 6: 28-4-2019 Pandas: Data Loading, Data Manipulation, Data Filtering, Data Grouping

Day - 7: 04-05-2019 What is Data Preprocessing?, Why Data Preprocessing?, Different Techniques of Data Preprocessing, Data Preprocessing with pandas example

Day - 8: 05-05-2019 Part - 1: What is Statistics?, What are the Data types?, Different measures - Central Tendency and Dispersion, Percentiles, Quartiles and Box Plots

Day - 9: 11-05-2019 Part - 2: Understanding in detail, with examples, the Concepts of Descriptive Statistics from Part - 1, Correlation, Covariance and Visualization

Day - 10: 12-05-2019 Exercise Session: Explaining Sampling bias, Various Sampling techniques, Characteristics of Normal Distribution and empirical rule, Central Limit Theorem, Standard Error, Z - Score, Confidence Intervals
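Standard error, z-score and a confidence interval can be computed with the standard library alone. This is a sketch under stated assumptions: the `sample` values are made up for illustration, and the interval uses the normal critical value, although a t critical value would usually be more appropriate for a sample this small.

```python
import math
import statistics

# Illustrative sample, not course data
sample = [52, 48, 50, 55, 45, 51, 49, 50]

mean = statistics.mean(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
standard_error = stdev / math.sqrt(len(sample))


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


# Critical value for a 95% two-sided interval under the normal distribution
z_crit = statistics.NormalDist().inv_cdf(0.975)

# Approximate 95% confidence interval for the population mean
confidence_interval = (
    mean - z_crit * standard_error,
    mean + z_crit * standard_error,
)
```

`statistics.NormalDist` requires Python 3.8 or later.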

Day - 11: 18-05-2019 Exercise Session: Finding Descriptive Statistics for a data set in Excel, Understanding Covariance and Correlation Matrices, Solving Exercises for various Descriptive statistics concepts.

Day - 12: 19-05-2019 Part - 1: Understanding Null and Alternate hypothesis, Left tailed, Right tailed and two tailed tests, Level of Significance and Confidence Interval, Traditional and P-value approaches to Hypothesis testing, Type 1 and Type 2 errors

Day - 13: 25-05-2019 Part - 2: Understanding Degree of Freedom, Z - Test, t - Test and Chi - Square Test

Day - 14: 26-05-2019 Part - 3: Analysis of Variance and Understanding Various plots using Seaborn

Day - 15: 02-06-2019 Probability: Introduction to probability, Trials, Sample space, Intersections, Unions & Complements, Independent and dependent events and Conditional Probability

Day - 16: Hackathon Session, Stats Revision

Day - 17: 16-06-2019 Linear Regression: Supervised Learning, What is Linear Regression?, find slope and intercept, Different ways to solve Linear Regression, Line of best fit method, Linear Algebra, Gradient Descent
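Two of the ways to solve linear regression named above, the closed-form line of best fit and gradient descent, can be sketched in plain Python. The data, learning rate and epoch count are assumptions chosen so that the example converges; they are not values from the course.

```python
# Illustrative data generated from y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_least_squares(xs, ys):
    """Closed-form slope and intercept for simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_gradient_descent(xs, ys, lr=0.05, epochs=5000):
    """Minimise mean squared error by repeatedly stepping against the gradient."""
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(epochs):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= lr * grad_slope
        intercept -= lr * grad_intercept
    return slope, intercept


closed_form = fit_least_squares(xs, ys)
iterative = fit_gradient_descent(xs, ys)
```

Both approaches should agree on a slope close to 2 and an intercept close to 1 for this data. Gradient descent gets there iteratively and is sensitive to the learning rate.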

Day - 18: 17-06-2019 Linear Regression Practice: Model Validation, train-test split, Cross-validation, Variance, Bias, bias-variance trade-off, Error Metrics, simple linear regression model practice on the company salary dataset, the cab price dataset and house price prediction.

Practice on Kaggle:

Predict the insurance income : https://www.kaggle.com/noordeen/insurance-premium-prediction/kernels
Predict the count of bikes rented : https://www.kaggle.com/noordeen/bikeshare-data
Predict the lung cancer value : https://www.kaggle.com/noordeen/big-city-health-data

Day - 19: 22-06-2019 Big Mart Sales (Linear Regression Hackathon Practice): BigMart Sales Hackathon contest on Analytics Vidhya

Did the necessary preprocessing, predicted the result using linear regression and uploaded the result to Analytics Vidhya.

Assignment:

Predict Restaurant food cost

Day - 20: 23-06-2019 Logistic Regression Theory and Practice: Converting the continuous output to a probability, Cost Function - Log loss, Error Metrics - Confusion Matrix, Accuracy, Recall, Precision, F1-score, ROC curve, AUC.

Predicting gender of an employer, Predicting marketing subscription by a customer.
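The error metrics and the log loss listed for Day 20 can be written out directly from their definitions. The labels, predicted probabilities and the 0.5 threshold below are illustrative assumptions, not results from the course exercises.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def log_loss(y_true, probs, eps=1e-15):
    """Average negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)


# Illustrative labels and predicted probabilities
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8, 0.1, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in probs]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
loss = log_loss(y_true, probs)
```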

Day - 21: 29-06-2019 KNN and Naive Bayes Algorithm: How KNN works, Regression and classification, Why scaling is mandatory, How to find the optimal K value, Computational Complexity of O(N^2), Why KD Tree. How Naive Bayes works, Classification, Assumption of Naive Bayes, Bayes Theorem, Example
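A brute-force KNN classifier with min-max scaling can be sketched as follows. The age and salary rows, the labels and the helper names are hypothetical. The example scales the query together with the training rows only to keep the code short.

```python
import math
from collections import Counter


def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical rows: [age in years, salary]
rows = [[25, 30000], [30, 32000], [45, 90000], [50, 95000]]
labels = ["junior", "junior", "senior", "senior"]
query = [48, 88000]

scaled = min_max_scale(rows + [query])
prediction = knn_predict(scaled[:-1], labels, scaled[-1], k=3)
```

Without scaling, the salary column would dominate the Euclidean distance because its values are thousands of times larger than the ages. This brute-force version computes the distance to every training point for each query, which is the cost that a KD tree is meant to reduce.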

Assignment: Implement the KNN algorithm on the following Kaggle datasets

Predict the insurance income : https://www.kaggle.com/noordeen/insurance-premium-prediction/kernels
Predict the count of bikes rented : https://www.kaggle.com/noordeen/bikeshare-data
Predict the lung cancer value : https://www.kaggle.com/noordeen/big-city-health-data
Predict employee attrition : https://www.kaggle.com/noordeen/employee-attrition/kernels

Day - 22: 30-06-2019 Unsupervised Learning (K-means, Hierarchical Clustering): Unsupervised Learning, K-means, K-means++, Within-Cluster Sum of Squares, optimal K value, Elbow curve, Scaling is mandatory, Hierarchical, Agglomerative, Dendrogram

Assignment: Find the pattern of credit card usage in the following Kaggle dataset.
https://www.kaggle.com/noordeen/card-usage/kernels

Day - 23: 06-07-2019 Decision Tree: Entropy, Information Gain, Gini Index, Problems of Decision Trees, Pruning, High Bias. Decision Tree Classification and Decision Tree Regressor.
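Entropy, Gini index and information gain follow directly from their definitions. The label lists below are made-up examples, and the function names are chosen for this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: 1 - sum(p^2) over the class proportions."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Illustrative node with a 50/50 class mix
parent = ["yes"] * 5 + ["no"] * 5

# A split that separates the classes perfectly
perfect_split = [["yes"] * 5, ["no"] * 5]

# A split that leaves both children as mixed as the parent
useless_split = [["yes", "no"] * 2, ["yes", "no"] * 3]

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

A tree-building algorithm would pick the split with the higher gain, which here is the perfect split.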

Assignment: Implement Decision Tree Classification: Predict employee attrition : https://www.kaggle.com/noordeen/employee-attrition/kernels
Implement Decision Tree Regressor: Predict the insurance income : https://www.kaggle.com/noordeen/insurance-premium-prediction/kernels
Predict the count of bikes rented : https://www.kaggle.com/noordeen/bikeshare-data
Predict the lung cancer value : https://www.kaggle.com/noordeen/big-city-health-data

Day - 24: 07-07-2019 Ensemble: Bagging - Random Forest, Boosting - AdaBoost.

Assignment: Implement Bagging and Boosting https://github.com/nursnaaz/25DaysInMachineLearning/tree/master/24%20-%20Day%20-%2024%20-%20Ensemble/Assignment

Day - 25: 13-07-2019 Stacking and SVM: Stacking - Example - Support vector machine - perceptron, kernels

Assignment: Implement SVM and Stacking https://github.com/nursnaaz/25DaysInMachineLearning/tree/master/25%20-%20Day%20-%2025%20-%20SVM/Assignment

Day - 26: 14-07-2019 Time Series: ACF - PACF - Regression for Forecasting - Smoothing - SMA - WMA - EMA - AR - MA - ARMA - ARIMA
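The three smoothing methods named above (SMA, WMA, EMA) can be sketched in plain Python. The series, window size and smoothing factor are illustrative assumptions. The linearly increasing weights in the WMA are one common choice, not necessarily the one used in the course.

```python
def simple_moving_average(series, window):
    """Unweighted mean of the last `window` values at each position."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def weighted_moving_average(series, window):
    """Linearly increasing weights, so the most recent value counts most."""
    weights = list(range(1, window + 1))
    weight_sum = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, series[i - window + 1:i + 1]))
        / weight_sum
        for i in range(window - 1, len(series))
    ]


def exponential_moving_average(series, alpha):
    """Recursive smoothing: ema[t] = alpha * x[t] + (1 - alpha) * ema[t - 1]."""
    ema = [series[0]]
    for value in series[1:]:
        ema.append(alpha * value + (1 - alpha) * ema[-1])
    return ema


# Illustrative series
series = [10, 12, 14, 16, 18]

sma = simple_moving_average(series, window=3)
wma = weighted_moving_average(series, window=3)
ema = exponential_moving_average(series, alpha=0.5)
```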