corpus
Here are 849 public repositories matching this topic...
Collects kanji usage frequency data from the Twitter Streaming API
Updated Jun 1, 2015 - JavaScript
A Naive Bayes classifier that uses the Bernoulli and multinomial models to classify text documents as ham or spam.
Updated Jun 21, 2015 - Java
An assortment of word-lists and micro dictionaries in English. Especially suited to English language learning tasks.
Updated Dec 2, 2015
Data for the DiMSUM shared task at SEMEVAL 2016
Updated Feb 8, 2016 - Python
Code for my BSc thesis: Cleaning of Parallel Texts for Machine Translation
Updated Feb 12, 2016 - Java
Pratham Books stories in Markdown format
Updated Apr 25, 2016
Displays a word cloud for each candidate for each day on which an election or caucus occurred, from 2/9 to 4/19
Updated Jun 9, 2016 - R
Data from a corpus of written Hawaiian
Updated Jun 27, 2016
Computer-generated poetry
Updated Jul 4, 2016 - CSS
Document clustering with PCA, implemented from scratch using NumPy and SciPy.
Updated Jul 9, 2016 - Python
A collection of corpus documents in Indonesian containing test cases for external plagiarism detection, following the PAN CLEF standard (http://www.uni-weimar.de/medien/webis/events/pan-11).
Updated Aug 8, 2016 - Python
Estonian TimeML Annotated Corpus (Eesti keele TimeML märgendatud korpus)
Updated Nov 1, 2016 - Python
Small data: some useful small datasets
Updated Dec 5, 2016
Persian stemming dataset for evaluating new stemmers
Updated Dec 16, 2016 - Pascal
Tool to convert the German Tiger corpus and other corpora in Tiger format to GATE
Updated Feb 3, 2017 - Groovy
Arabic Keyphrase Extraction Corpus
Updated Feb 14, 2017 - Shell