Topic Modeling:

  • Developed a topic modeling system using BERTopic on the SaudiNewsNet dataset.
  • Converted the unstructured data into a structured form in which each document is assigned to a topic.
  • Tested the model and got good results.
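
The step from unstructured text to structured records can be sketched as follows. This is a minimal illustration, not the repository's code: the documents and topic ids below are made up, and `group_docs_by_topic` is a hypothetical helper. It assumes one integer topic id per document, which is the shape BERTopic's `fit_transform` returns (with `-1` marking outliers).

```python
from collections import defaultdict


def group_docs_by_topic(docs, topics):
    """Return one record per document plus an index from topic id to document ids."""
    if len(docs) != len(topics):
        raise ValueError("docs and topics must have the same length")
    records = [
        {"doc_id": i, "topic": topic, "text": text}
        for i, (text, topic) in enumerate(zip(docs, topics))
    ]
    by_topic = defaultdict(list)
    for record in records:
        by_topic[record["topic"]].append(record["doc_id"])
    return records, dict(by_topic)


# Hypothetical documents and topic ids standing in for real model output.
docs = ["oil prices rise", "football final tonight", "crude output cut", "misc note"]
topics = [0, 1, 0, -1]  # -1 is the outlier topic

records, by_topic = group_docs_by_topic(docs, topics)
print(by_topic)
```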

Semantic Search:

  • Developed a semantic search system using Sentence-Transformers on the Quora dataset to retrieve relevant documents based on input queries.
  • Used cosine similarity to measure semantic similarity between query embeddings and document embeddings.
  • Tested the model and got great results.
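
The ranking step can be sketched in plain Python. The vectors below are tiny hand-written stand-ins for real sentence embeddings, and `cosine_similarity` and `rank_documents` are illustrative names rather than functions from the repository. Cosine similarity is the dot product of two vectors divided by the product of their norms.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_index, score) pairs, highest similarity first."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional "embeddings"; real sentence embeddings have hundreds of dimensions.
doc_vecs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query_vec = [1.0, 0.1, 0.0]

top = rank_documents(query_vec, doc_vecs)
print(top)
```

In a real pipeline the document embeddings would be computed once and stored, so that only the query needs to be encoded at search time.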

Sentiment Analysis:

  • Fine-tuned the BERT base model on a Twitter dataset and the IMDB dataset to identify the sentiment of a given sentence.
  • Tested the model and achieved high accuracy.

Text Generation:

  • Implemented a text generation system using LSTMs to predict the next word given a context.
  • Trained the model on the Shakespeare Sonnets dataset.
  • Tested the model and achieved acceptable results.
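
One common way to prepare data for next-word prediction is to turn each line into (context, next word) pairs with left-padded contexts. The sketch below shows that idea only; the repository's actual preprocessing may differ, and `build_vocab`, `make_training_pairs`, and `max_len` are assumed names.

```python
def build_vocab(lines):
    """Map each distinct lowercase word to an integer id; 0 is reserved for padding."""
    vocab = {}
    for line in lines:
        for word in line.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def make_training_pairs(lines, vocab, max_len):
    """Build (padded_context_ids, next_word_id) pairs from every prefix of every line."""
    pairs = []
    for line in lines:
        ids = [vocab[word] for word in line.lower().split()]
        for end in range(1, len(ids)):
            context = ids[:end][-max_len:]                    # keep the last max_len words
            padded = [0] * (max_len - len(context)) + context  # left-pad with zeros
            pairs.append((padded, ids[end]))
    return pairs


# Two short sample lines used only for illustration.
lines = ["shall i compare thee", "thou art more lovely"]
vocab = build_vocab(lines)
pairs = make_training_pairs(lines, vocab, max_len=3)

for context, target in pairs:
    print(context, "->", target)
```

Each padded context would be fed to the LSTM as input, with the target id as the class label to predict.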

Arabic POS:

  • Leveraged the Farasa framework for Arabic NLP to obtain Part-of-Speech (POS) tags for each word in a given sentence.
  • Developed a visualization using NetworkX to represent each word along with its corresponding POS tag in a graphical format.
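
A word-to-tag graph of this kind can be sketched with NetworkX as below. The (word, tag) pairs are hand-written placeholders rather than real Farasa output, and `build_pos_graph` is a hypothetical helper. The graph layout chosen here (one node per word, one node per tag, plus edges for word order) is an assumption about how such a visualization might be structured.

```python
import networkx as nx


def build_pos_graph(tagged_words):
    """Build a directed graph linking each word to its POS tag and to the next word."""
    graph = nx.DiGraph()
    previous = None
    for index, (word, tag) in enumerate(tagged_words):
        word_node = f"w{index}"  # index-based ids keep repeated words distinct
        tag_node = f"t{index}"
        graph.add_node(word_node, label=word, kind="word")
        graph.add_node(tag_node, label=tag, kind="pos")
        graph.add_edge(word_node, tag_node, relation="has_tag")
        if previous is not None:
            graph.add_edge(previous, word_node, relation="next")
        previous = word_node
    return graph


# Placeholder tags for illustration; a real tagger's tag set may differ.
tagged = [("ذهب", "V"), ("الولد", "NOUN"), ("إلى", "PREP"), ("المدرسة", "NOUN")]
graph = build_pos_graph(tagged)
labels = nx.get_node_attributes(graph, "label")
print(graph.number_of_nodes(), graph.number_of_edges())
```

With matplotlib installed, `nx.draw(graph, labels=labels, with_labels=True)` renders the graph using the word and tag text as node labels.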