
Thesis scope: train and develop a Table-to-Text Transformer-based model for contextual summarization of tabular data. To achieve this, T5-small, T5-base, BART-base, and Llama 2 7B Chat were fine-tuned on ToTTo and QTSumm. On the ToTTo dataset, the fine-tuned models outperformed the benchmark.

justdepie/MSc-Thesis-From-Tables-to-Natural-Language-Summaries


Notebook Guide:
- The Data Preparation notebook includes all the preprocessing and transformations applied to convert the ToTTo tables to text (a linearization sketch follows this list).
- The Modeling Current Version notebook includes the modeling code for ToTTo.
- The QTSumm All notebook includes everything I did for QTSumm (preprocessing + modeling).
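As a rough illustration of what the Data Preparation step does, here is a minimal sketch of ToTTo-style table linearization. The JSON field names follow the public ToTTo schema; the exact tag format and the span handling are simplifying assumptions, not the notebook's actual code.

```python
# Hedged sketch of ToTTo-style linearization: flatten the page/section titles
# and the highlighted cells into one input string for a seq2seq model.
# Field names follow the ToTTo JSON schema; the tag format is an assumption,
# and row/column spans are ignored for simplicity.
def linearize(example: dict) -> str:
    parts = [
        f"<page_title> {example['table_page_title']} </page_title>",
        f"<section_title> {example['table_section_title']} </section_title>",
    ]
    table = example["table"]
    header_row = table[0]
    for row_idx, col_idx in example["highlighted_cells"]:
        cell = table[row_idx][col_idx]
        # Attach the column header when the first row actually holds headers.
        header = ""
        if col_idx < len(header_row) and header_row[col_idx].get("is_header"):
            header = header_row[col_idx]["value"]
        parts.append(f"<cell> {cell['value']} <col_header> {header} </col_header>")
    return " ".join(parts)
```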

Info:
The goal is to provide table summaries that include only the requested information.
The evaluation metrics used are METEOR, ROUGE, and SacreBLEU (a scoring sketch follows below).
Regarding ToTTo, the models were evaluated on the whole test set as well as on the top 5 most populated domains.
This is a Capstone Project conducted in partnership with the company Incelligent IKE for the completion of my MSc in Data Science.
For Llama 2 7B Chat there are no available results, as the GPU memory required for inference exceeds the available 40 GB.
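A minimal scoring sketch using the Hugging Face `evaluate` library; the prediction/reference strings below are placeholders, not outputs from the thesis.

```python
# Hedged sketch: computing METEOR, ROUGE and SacreBLEU with `evaluate`.
# The example strings are placeholders, not results from the thesis.
import evaluate

meteor = evaluate.load("meteor")
rouge = evaluate.load("rouge")
sacrebleu = evaluate.load("sacrebleu")

predictions = ["the team won the final in 2010"]   # model outputs (placeholder)
references = ["the team won the 2010 final"]       # gold summaries (placeholder)

print(meteor.compute(predictions=predictions, references=references))
print(rouge.compute(predictions=predictions, references=references))
# SacreBLEU expects one list of references per prediction.
print(sacrebleu.compute(predictions=predictions,
                        references=[[r] for r in references]))
```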

For more information, please check the Thesis Presentation file and the Thesis Report.
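As a rough sketch of the fine-tuning setup (the Modeling notebooks contain the actual code; the hyperparameters below are illustrative assumptions, not the settings used in the thesis):

```python
# Hedged sketch: fine-tuning T5-small on linearized table/summary pairs with
# Hugging Face transformers. Hyperparameters and paths are placeholders.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Toy pair; in the thesis the inputs are the linearized ToTTo tables.
data = Dataset.from_dict({
    "source": ["<page_title> Example </page_title> <cell> 42 <col_header> Score </col_header>"],
    "target": ["The example score is 42."],
})

def tokenize(batch):
    enc = tokenizer(batch["source"], truncation=True, max_length=512)
    enc["labels"] = tokenizer(text_target=batch["target"],
                              truncation=True, max_length=128)["input_ids"]
    return enc

tokenized = data.map(tokenize, batched=True, remove_columns=data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="t5-small-totto",          # placeholder output path
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=3e-4,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```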

Demo for ToTTo (T5-base generated summaries vs provided reference summaries):
[image]

To verify that the generated summaries are faithful to the source, the corresponding Wikipedia tables are provided:
1st Table (Snippet):
[image]

2nd Table:
[image]

3rd Table:
[image]
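For reference, summaries like the ones in the demo can be generated from a fine-tuned checkpoint roughly as follows; the checkpoint path and decoding parameters are assumptions, not the thesis settings.

```python
# Hedged sketch: generating a summary from a linearized table with a
# fine-tuned checkpoint. The path and decoding settings are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small-totto")  # placeholder path
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small-totto")

source = "<page_title> Example </page_title> <cell> 42 <col_header> Score </col_header>"
inputs = tokenizer(source, return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```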
