gdg-speech-classifier

A machine learning system that recognizes the word 'Google' in human speech.

How it works

We train a classifier on a set of WAV files using Mel-Frequency Cepstral Coefficients (MFCC) as features. There are two implementations of the classifier available:

Regularized logistic regression, trained with a conjugate gradient optimizer (fmincg).
Feed-forward neural network, trained with MATLAB's scaled conjugate gradient optimizer (trainscg).
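
To make the first option more concrete, here is a minimal sketch, in Python rather than the repository's MATLAB, of the cost and gradient for L2-regularized logistic regression. A gradient-based optimizer such as fmincg typically expects an objective that returns this kind of pair. The names `cost_and_gradient` and `lam`, and the choice to leave the bias term unregularized, are assumptions made for illustration and are not taken from the repository's code.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cost_and_gradient(theta, X, y, lam):
    """Return (cost, gradient) for L2-regularized logistic regression.

    theta: list of weights; theta[0] is the bias (not regularized here, by assumption).
    X:     list of feature rows, each starting with a constant 1.0 for the bias.
    y:     list of 0/1 labels.
    lam:   regularization strength (hypothetical parameter name).
    """
    m = len(y)
    eps = 1e-12  # guards against log(0)
    cost = 0.0
    grad = [0.0] * len(theta)
    for row, label in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, row)))
        cost -= label * math.log(h + eps) + (1 - label) * math.log(1 - h + eps)
        for j, x in enumerate(row):
            grad[j] += (h - label) * x
    # Average over examples, then add the penalty on the non-bias weights.
    cost = cost / m + lam / (2 * m) * sum(t * t for t in theta[1:])
    grad = [g / m for g in grad]
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return cost, grad
```

In this sketch each row of `X` would hold the MFCC-derived features for one recording, preceded by a constant 1.0.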

How to use

Import training and test data into the data folder. You can get some data from the Releases page. File names should follow the pronunciation_en_%label%.wav pattern.
Run either mainLogisticRegression.m or mainNeuralNetwork.m depending on which classifier you would like to try.
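
The naming pattern above can be checked before running either script. The following is a minimal sketch in Python (not part of the repository) that extracts the `%label%` part from a file name. How readData.m actually parses names is not documented here, so the regular expression and the helper name `label_from_filename` are assumptions.

```python
import re
from pathlib import Path

# Assumed reading of the pattern: everything between the fixed
# "pronunciation_en_" prefix and the ".wav" extension is the label.
FILENAME_PATTERN = re.compile(r"^pronunciation_en_(?P<label>.+)\.wav$")


def label_from_filename(path):
    """Return the label encoded in a file name, or None if it does not match."""
    match = FILENAME_PATTERN.match(Path(path).name)
    return match.group("label") if match else None
```

For example, `label_from_filename("data/pronunciation_en_google.wav")` yields `"google"`, while a file that does not follow the pattern yields `None`.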

YuriyGuts/gdg-speech-classifier

Folders and files

Latest commit

History

Repository files navigation

gdg-speech-classifier

How it works

How to use

About

Topics

Resources

Stars

Watchers

Forks

Languages