Wikipedia Frequency Lookup

A simple Python script that finds the 20 most frequent words in an English Wikipedia article, along with each word's frequency percentage. You enter a search string, and the script uses the Wikipedia Search API to look up the article and return its top 20 words.
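As a rough illustration of the lookup step, the sketch below queries the public MediaWiki search endpoint for the best-matching article title. It is not the code in main.py: the helper names (`build_search_url`, `first_title`, `search_title`) and the User-Agent string are assumptions made for this example, and it uses only the standard library rather than whatever is listed in requirements.txt.

```python
import json
import urllib.parse
import urllib.request

# Public MediaWiki API endpoint for English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(query, limit=1):
    """Build a full-text search URL (hypothetical helper, not from main.py)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def first_title(response):
    """Return the title of the top search hit, or None if there are no hits."""
    hits = response.get("query", {}).get("search", [])
    return hits[0]["title"] if hits else None


def search_title(query):
    """Call the API over the network and return the best-matching title."""
    request = urllib.request.Request(
        build_search_url(query),
        # Assumed identifier; Wikimedia asks clients to send a User-Agent.
        headers={"User-Agent": "frequency-lookup-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return first_title(json.load(reply))
```

Calling `search_title("alan turing")` needs network access; the two helper functions can be exercised offline with a hand-written response dictionary.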

I built this to apply what I was learning and to play around with some libraries 📚. To remove stop words (such as "and", "the", "a", "an", and similar words) from the frequency table, simply add yes after your string.
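The frequency table and the stop-word option can be sketched roughly as follows. This is a minimal illustration, not the script's actual implementation: the `top_words` function, the abbreviated `STOP_WORDS` set, and the choice to compute percentages against the word count left after stop-word removal are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, abbreviated stop-word list; the script's real list may differ.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}


def top_words(text, limit=20, remove_stop_words=False):
    """Return (word, percentage) pairs for the most frequent words in text."""
    # Lowercase the text and keep alphabetic words, allowing one inner apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if remove_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    # Assumption: percentages are relative to the words that remain after filtering.
    return [
        (word, round(100 * count / total, 2))
        for word, count in counts.most_common(limit)
    ]


sample = "the cat and the dog and the bird"
print(top_words(sample, limit=2))
print(top_words(sample, remove_stop_words=True))
```

With stop words kept, "the" and "and" dominate the short sample; with them removed, only the content words remain in the table.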

Instructions to run

  • Clone project
git clone https://github.com/prabhakar267/wikipedia-frequency-lookup.git
cd wikipedia-frequency-lookup
  • Add virtual environment
pip install virtualenv
virtualenv venv
source venv/bin/activate
  • Install dependencies
pip install -r requirements.txt
  • Run script
python main.py <your-string> [yes]
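One way the command line above could be interpreted is sketched below. The `parse_args` helper is hypothetical and only illustrates the documented usage (a search string, optionally followed by yes); how main.py actually reads its arguments may differ, for example in how it treats multi-word strings.

```python
import sys


def parse_args(argv):
    """Split argv into (search string, remove_stop_words flag)."""
    args = argv[1:]
    if not args:
        raise SystemExit("usage: python main.py <your-string> [yes]")
    # Assumption: a trailing "yes" is the flag only when a search string precedes it.
    remove_stop_words = len(args) > 1 and args[-1].lower() == "yes"
    if remove_stop_words:
        args = args[:-1]
    # Assumption: unquoted multi-word input is joined into one search string.
    return " ".join(args), remove_stop_words


if __name__ == "__main__" and len(sys.argv) > 1:
    print(parse_args(sys.argv))
```

For example, `python main.py alan turing yes` would be read as the search string "alan turing" with stop-word removal enabled.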
