Projects-Python

A practical demonstration of the scikit-learn library for building various classification and regression models.

Description

The goal of topic modeling is to find the topics present in a corpus. Each document in the corpus is made up of at least one topic, and often several. This notebook covers the steps for running Latent Dirichlet Allocation (LDA), one of many topic modeling techniques and one designed specifically for text data. To use a topic modeling technique, you need to provide (1) a document-term matrix and (2) the number of topics you would like the algorithm to pick up. Once the technique has been applied, your job as a human is to interpret the results and judge whether the mix of words in each topic makes sense. If it does not, you can try changing the number of topics, the terms in the document-term matrix, or the model parameters, or even try a different model.
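To make the first input concrete, here is a minimal sketch of a document-term matrix built with only the Python standard library. The three toy documents and the whitespace tokenizer are assumptions for illustration; the notebook itself works on the 20 Newsgroups posts and may preprocess them differently.

```python
from collections import Counter

# Toy corpus (hypothetical, for illustration only).
docs = [
    "the car engine needs a new part",
    "the rocket reached orbit",
    "a car on the road",
]

# Naive tokenization: lowercase and split on whitespace.
tokenized = [doc.lower().split() for doc in docs]

# Vocabulary: every distinct term, in a fixed (sorted) column order.
vocab = sorted({term for doc in tokenized for term in doc})

# Document-term matrix: one row per document, one column per term,
# each cell holding how often that term occurs in that document.
dtm = []
for doc in tokenized:
    counts = Counter(doc)
    dtm.append([counts[term] for term in vocab])

for row in dtm:
    print(row)
```

Each row sums to the number of tokens in its document. The second input, the number of topics, is a single integer chosen by the user.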

The data set is the 20 Newsgroups collection, and LDA is used to extract the topics naturally discussed in it.

The notebook uses Latent Dirichlet Allocation (LDA) from the Gensim package along with Mallet’s implementation (via Gensim). Mallet has an efficient implementation of LDA; it is known to run faster and to give better topic segregation.

Data set

The data can be obtained from: https://raw.githubusercontent.com/selva86/datasets/master/newsgroups.json

ankit013/Projects-Python
