SonaOburka/Regression-tree-applied-on-wine-catalogue-dataset

Regression Tree - Wine Catalogue Dataset

R programming language

LaTeX formatting

Article made in LaTeX - "Data Analysis and Their Visualization", Unicorn University, Winter 2022/2023

Attached:

  • Article: pdf
  • R code: R
  • CSV file: Wine Catalogue Dataset. The original has over 320,000 records; because of its size, only a sample of 120,000 records was uploaded to GitHub.

Content:

  • Large dataset cleaning
  • Regression tree analysis.

The regression tree variables:

  • Wine price - dependent variable
  • Wine category and country of wine production - two independent variables.
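As a rough illustration of the model described above, the following sketch fits a regression tree with the `rpart` package, using price as the dependent variable and category and country as the two independent variables. The data are synthetic, and the data frame name `wines`, the column names `price`, `category` and `country`, and all price values are assumptions made for this example; they are not taken from the repository's R code or CSV file.

```r
# Minimal sketch, not the repository's script: column names and data are hypothetical.
library(rpart)

set.seed(42)
n <- 600
wines <- data.frame(
  category = factor(sample(c("red", "white", "dessert"), n, replace = TRUE)),
  country  = factor(sample(c("Italy", "France", "Spain"), n, replace = TRUE))
)

# Synthetic prices: a made-up base price per category plus random noise.
base_price <- c(red = 30, white = 20, dessert = 45)
wines$price <- unname(base_price[as.character(wines$category)]) + rnorm(n, sd = 5)

# method = "anova" requests a regression tree for a continuous response.
fit <- rpart(price ~ category + country, data = wines, method = "anova")

# Each leaf predicts the mean price of the training rows that fall into it.
new_wine <- data.frame(
  category = factor("dessert", levels = levels(wines$category)),
  country  = factor("Italy", levels = levels(wines$country))
)
predicted <- predict(fit, newdata = new_wine)

print(fit)
```

Printing the fitted object lists the splits and the mean price in each node, which is the kind of output the hypotheses about categories and countries can be read from.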

Testing of four hypotheses:

  • The hypothesis that dessert wine is the most expensive wine category was confirmed.
  • The hypothesis that white wines are less expensive than red wines was also confirmed.
  • The hypothesis that Italy produces the most expensive wines was confirmed only partially, as the result depends on the wine category.
  • The hypothesis that the regression tree model is more complex than the regression tree model from the reduced dataset was not confirmed.
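One possible way to compare the complexity of two regression trees, as in the last hypothesis, is to count their terminal nodes (leaves). The sketch below assumes that leaf count is an acceptable complexity measure, which the README does not state, and it uses synthetic data with hypothetical column names and sample sizes rather than the actual wine catalogue.

```r
# Minimal sketch under stated assumptions: synthetic data, leaf count as complexity measure.
library(rpart)

# Number of terminal nodes (leaves) in a fitted rpart tree.
count_leaves <- function(tree) sum(tree$frame$var == "<leaf>")

set.seed(1)
n <- 3000
full <- data.frame(
  category = factor(sample(c("red", "white", "dessert"), n, replace = TRUE)),
  country  = factor(sample(c("Italy", "France", "Spain"), n, replace = TRUE))
)

# Made-up prices that depend on category only, plus noise.
full$price <- ifelse(full$category == "dessert", 45,
                     ifelse(full$category == "red", 30, 20)) + rnorm(n, sd = 5)

# A random subset stands in for the reduced dataset.
reduced <- full[sample(nrow(full), 1000), ]

fit_full    <- rpart(price ~ category + country, data = full, method = "anova")
fit_reduced <- rpart(price ~ category + country, data = reduced, method = "anova")

leaves_full    <- count_leaves(fit_full)
leaves_reduced <- count_leaves(fit_reduced)

cat("Leaves (full):", leaves_full, "\n")
cat("Leaves (reduced):", leaves_reduced, "\n")
```

If both trees end up with the same number of leaves, the larger dataset did not yield a more complex model under this measure. Other measures, such as tree depth or the number of splits in the `cptable`, could be substituted.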

About

Seminar paper in the form of an article prepared in LaTeX
