YanSte/NLP-PEFT-LoRA-DialogSum-Dialogue-Summarize

| NLP | PEFT/LoRA | DialogSum | Dialog Summarize |

NLP (Natural Language Processing) with PEFT (Parameter Efficient Fine-Tuning) and LoRA (Low-Rank Adaptation) for Dialogue Summarization

Introduction

This project explores the capabilities of Large Language Models (LLMs), with a specific focus on using Parameter Efficient Fine-Tuning (PEFT) to improve dialogue summarization with the FLAN-T5 model.

Our goal is to improve the quality of dialogue summarization by fine-tuning the model and evaluating the results with ROUGE metrics. We also explore the advantages of PEFT and aim to show that its benefits outweigh any minor performance trade-offs.

  • NOTE: This is an example; we do not use the entire dataset for PEFT / LoRA training.
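To make the efficiency argument concrete, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning versus LoRA. LoRA freezes the original d × k matrix and trains two low-rank factors, B (d × r) and A (r × k), so the trainable count drops from d·k to r·(d + k). The dimensions and rank used here are illustrative assumptions, not values taken from this project.

```python
def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for LoRA: B is d x r and A is r x k."""
    return r * (d + k)


if __name__ == "__main__":
    # Assumed, illustrative sizes (not taken from the project).
    d, k, r = 768, 768, 32
    full = full_finetune_params(d, k)
    lora = lora_params(d, k, r)
    print(f"Full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (r={r}): {lora:,} trainable parameters")
    print(f"LoRA trains {100 * lora / full:.2f}% of this matrix")
```

The ratio depends only on the rank and the matrix shape, so a smaller r shrinks the trainable fraction further, at the possible cost of some summarization quality.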

Objectives:

  • Train an LLM for dialogue summarization.

The DialogSum Dataset:

DialogSum is a large-scale dialogue summarization dataset consisting of 13,460 dialogues (plus 100 holdout dialogues for topic generation) with corresponding manually labeled summaries and topics.
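As a rough illustration of how one example could be turned into a zero-shot summarization prompt, the sketch below uses a made-up record. The field names `dialogue`, `summary`, and `topic`, the sample text, the helper `build_prompt`, and the prompt wording are all assumptions for illustration; none of them are copied from the dataset or from this project's notebook.

```python
def build_prompt(dialogue: str) -> str:
    """Wrap a dialogue in a simple instruction prompt (hypothetical wording)."""
    return (
        "Summarize the following conversation.\n\n"
        f"{dialogue.strip()}\n\n"
        "Summary: "
    )


# Hypothetical record; real DialogSum entries will differ.
record = {
    "dialogue": "#Person1#: Are you free on Friday?\n#Person2#: Yes, after 3 pm.",
    "summary": "#Person1# asks whether #Person2# is free on Friday.",
    "topic": "scheduling",
}

prompt = build_prompt(record["dialogue"])
print(prompt)
```

The human-written `summary` field is kept out of the prompt so it can serve as the reference when scoring the model's output.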

Project Workflow:

  • Setup: Import necessary libraries and define project parameters.
  • Dataset Exploration: Explore the DialogSum dataset.
  • Zero-Shot Inference: Test the FLAN-T5 model on dialogue summarization without any fine-tuning to establish a baseline.
  • Preprocess Dialogues and Summaries: Preprocess each dialogue and its corresponding summary to prepare the data for training.
  • Perform Parameter Efficient Fine-Tuning (PEFT): Apply PEFT, which can significantly reduce training time compared with full fine-tuning while largely maintaining performance.
  • Evaluation:
    • Perform human evaluation to gauge the model's output in terms of readability and coherence. This can involve annotators ranking generated summaries for quality.
    • Utilize ROUGE metrics to assess the quality of the generated summaries. ROUGE measures the overlap between generated summaries and human-written references.
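The overlap idea behind ROUGE can be sketched with a simplified ROUGE-1 computation. This is a minimal illustration that assumes lowercase whitespace tokenization; it is not the implementation used in this project, and standard ROUGE packages apply their own tokenization and optional stemming, so their scores may differ.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between candidate and reference."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge1("the cat sat", "the cat sat on the mat")
    print(scores)
```

Precision penalizes extra words in the generated summary, while recall penalizes content from the reference that the summary leaves out; the F1 score balances the two.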

| View on Kaggle |

About

An exploration of Large Language Models (LLMs) and their dialogue summarization abilities. It highlights the use of a fine-tuning approach called Parameter Efficient Fine-Tuning (PEFT).
