Machine Learning in JavaScript

Darren DeRidder / @73rhodes

Preview

machine learning

naive bayesian classifiers

node.js

About Me

@73rhodes • github/73rhodes • 51elliot.blogspot.com

Computer Systems Engineer

Real-time • AAA • Network Security • Mobile

Tech lead on Kindsight Mobile Security @ Alcatel

Mobile World Congress • Blackhat 2013

@ottawa_js organizer

Full Disclosure...

"I Am Not A Data Scientist"

(IANADS)

and that's ok!

There are lots of tools available for us mortals.

Naive Bayesian Classification

simple, yet surprisingly effective

Bayesian Filters Can...

  • filter out spam
  • figure out if a page is about apples (fruit) or computers
  • detect malware 🏴‍☠️
  • etc!

Bayes' Theorem


` P(A|B) = (P(B|A)P(A)) / (P(B)) = ...`

`= (P(B|A)P(A)) / ( P(B|A) P(A) + P(B|not A)(1-P(A)))`
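
A minimal sketch of the theorem in code, with P(B) expanded over the two cases "A" and "not A". The function name `posterior` and the spam numbers are made up for illustration; they are not from the talk.

```javascript
// Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B),
// where P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)).
function posterior(pBGivenA, pA, pBGivenNotA) {
  const evidence = pBGivenA * pA + pBGivenNotA * (1 - pA);
  return (pBGivenA * pA) / evidence;
}

// Hypothetical numbers, for illustration only:
// A = "message is spam", B = "message contains the word 'free'".
const pSpam = 0.5;          // P(A)
const pFreeGivenSpam = 0.6; // P(B|A)
const pFreeGivenHam = 0.2;  // P(B|not A)

const pSpamGivenFree = posterior(pFreeGivenSpam, pSpam, pFreeGivenHam);
console.log(pSpamGivenFree.toFixed(2)); // 0.75
```

With these assumed inputs, seeing the word raises the spam estimate from 50% to 75%.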

Binary Bayesian Classifier


`P(A) = ( prod_(i=1)^n P(A|W_i) ) / ( (prod_(i=1)^n P(A|W_i)) + (prod_(i=1)^n (1 - P(A|W_i))) )`


Or, in Plain English

"WTF?!"

Life is like...

a box of chocolates.

You never know what you're gonna get.

(But you can make a pretty good guess!)

A Simple Example

         Nuts   No Nuts
Round    25%    75%
Square   75%    25%
Dark     10%    90%
Light    90%    10%

What if we pick a round, light chocolate?

A Simple Example

A round, light chocolate...

                    Nuts   No Nuts   P(Nuts)   P(NoNuts)
Round               .25    .75       .25       .75
Square              .75    .25       -         -
Dark                .10    .90       -         -
Light               .90    .10       .90       .10
`prod_(i=1)^n P_i`                   .225      .075

The Results

`x = 0.225 / 0.075 = 3`

A round, light chocolate is 3 times more likely to have nuts.

(These are the odds: 3 to 1 that it has nuts.)

Binary Classification

Classify as "Nuts" or "No Nuts", with some level of certainty.

`P(N) = 0.225 / (0.225 + 0.075) = 0.75 = 75%`

(We're 75% sure this chocolate has nuts.)
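
The chocolate example can be sketched in a few lines of plain JavaScript. The probabilities come straight from the table; the names `pNutsGiven`, `products` and `classify` are made up for this sketch and are not part of any library.

```javascript
// P(Nuts | trait), copied from the chocolate table.
const pNutsGiven = { round: 0.25, square: 0.75, dark: 0.10, light: 0.90 };

// Multiply the per-trait probabilities for "Nuts" and for "No Nuts".
function products(traits) {
  let nuts = 1;
  let noNuts = 1;
  for (const trait of traits) {
    nuts *= pNutsGiven[trait];
    noNuts *= 1 - pNutsGiven[trait];
  }
  return { nuts, noNuts };
}

// Turn the two products into a ratio and a normalized probability.
function classify(traits) {
  const { nuts, noNuts } = products(traits);
  return {
    timesMoreLikely: nuts / noNuts,
    pNuts: nuts / (nuts + noNuts),
  };
}

const result = classify(['round', 'light']);
console.log(result.timesMoreLikely.toFixed(2)); // 3.00
console.log(result.pNuts.toFixed(2));           // 0.75
```

Swapping in `['square', 'dark']` flips the answer: the same arithmetic gives a 25% chance of nuts.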

Machine Learning in Node.js

dclassify

Optimized binary classifier for limited vocabularies.

Leverages "missing" traits to improve accuracy by ~10%.

Used in production...

dclassify: prepare data


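 const { Classifier, DataSet, Document } = require('dclassify');

 // Document(id, tokens): an id plus the traits observed in that item
 const item1 = new Document('item1', ['awful','basic','cautious']);
 const item2 = new Document('item2', ['awful','basic','cautious']);
 const item3 = new Document('item3', ['awful','delightful','energetic']);
 const item4 = new Document('item4', ['cautious', 'delightful']);
 const item5 = new Document('item5', ['energetic']);
 const item6 = new Document('item6', ['basic','delightful','energetic']);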
          

dclassify: curate the data


 const data = new DataSet();
 data.add('bad',  [item1, item2, item3]);    
 data.add('good', [item4, item5, item6]);
          

dclassify: train the classifier


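 // applyInverse: also weigh traits that are missing from a document
 // (option name as given in the dclassify README)
 const options = { applyInverse: true };
 const classifier = new Classifier(options);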
    
 classifier.train(data);
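
Roughly what a training step has to do, sketched in plain JavaScript on the same six items. This is an illustration of the idea only, not dclassify's actual implementation; `training`, `train` and `pBadGiven` are hypothetical names.

```javascript
// The same curated data as in the slides, as plain arrays of tokens.
const training = {
  bad: [
    ['awful', 'basic', 'cautious'],
    ['awful', 'basic', 'cautious'],
    ['awful', 'delightful', 'energetic'],
  ],
  good: [
    ['cautious', 'delightful'],
    ['energetic'],
    ['basic', 'delightful', 'energetic'],
  ],
};

// Count, per token, how many documents in each category contain it,
// then estimate P(bad | token) as the "bad" share of those documents.
function train(data) {
  const counts = {}; // token -> { bad: n, good: n }
  for (const [category, docs] of Object.entries(data)) {
    for (const tokens of docs) {
      for (const token of new Set(tokens)) {
        counts[token] = counts[token] || { bad: 0, good: 0 };
        counts[token][category] += 1;
      }
    }
  }
  const pBadGiven = {};
  for (const [token, c] of Object.entries(counts)) {
    pBadGiven[token] = c.bad / (c.bad + c.good);
  }
  return pBadGiven;
}

const pBadGiven = train(training);
console.log(pBadGiven.awful);                 // 1
console.log(pBadGiven.basic.toFixed(2));      // 0.67
console.log(pBadGiven.delightful.toFixed(2)); // 0.33
```

Note that 'awful' comes out at exactly 1, which would zero out the "good" product entirely. Real classifiers usually smooth these estimates so that no single token can decide the result on its own.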
          

dclassify: using the classifier


 // test tokens drawn from the training vocabulary
 const testDoc = new Document('testDoc', ['basic','cautious','energetic']);
 const result1 = classifier.classify(testDoc);

 console.log(result1);
          

Thanks

http://73rhodes.github.io/talks/MachineLearning