Machine learning is like highschool sex. Everyone says they do it, nobody really does, and no one knows what it actually is.
Mike Townsend on Twitter

Machine learning is probably the fastest-growing trend in the tech industry these days (sorry React!).

So, today we’re gonna write a simple program which, given a string, will tell us whether its mood is sad or happy.

Why JavaScript

As a web developer, I find JavaScript to be the most important, fastest-growing and most flexible language for web development.

There are so many good Machine Learning frameworks out there (TensorFlow, Caffe2, etc.), but if you come from a web development background, working with JavaScript may feel easier and more comfortable.

Of course, there are some limitations: performance, concurrency, JavaScript gotchas (never forget 1 - "1" // 0, it drives me crazy!), but I think that for the beginning, JS is a really good place to start.

Setting up our project

We’re gonna use Brain.js, the most popular JS-ML framework on GitHub.
I would recommend using Node 8.9.4 with NVM. It fully supports ES6 features (except modules, i.e. import and export) as well as async/await from ES2017.

So, let’s get started!
Simply make a new directory and initialize it with yarn:

$ mkdir my-nn && cd my-nn
$ yarn init -y

Now, let’s install Brain.js:

$ yarn add brain.js

Let’s make a source directory. We’re gonna need some functions:

$ mkdir src

And last but not least, create an index file:

$ touch index.js

Our folder structure will look like this:

node_modules/
src/
package.json
index.js

Perfect, we’re ready to go!

Our first Machine Learning program

Let’s take the first Brain.js example:

const brain = require('brain.js');

const net = new brain.NeuralNetwork();

net.train([{input: [0, 0], output: [0]},
           {input: [0, 1], output: [1]},
           {input: [1, 0], output: [1]},
           {input: [1, 1], output: [0]}]);

const output = net.run([1, 0]);  // => [0.987]

As you can see, we use two different Brain.js APIs:

net.train(), which trains our neural network so that it learns to predict the correct result, and net.run(), which simply runs the trained network on an input and returns the predicted value.
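To see how run() behaves across the whole XOR table, here is a minimal sketch. The checkXor helper and the xorCases list are hypothetical names made up for this illustration; the only thing assumed about the network is that it exposes run() and returns an array-like output, as in the example above.

```javascript
// The four XOR cases with the answer we expect for each.
const xorCases = [
  { input: [0, 0], expected: 0 },
  { input: [0, 1], expected: 1 },
  { input: [1, 0], expected: 1 },
  { input: [1, 1], expected: 0 }
];

// Hypothetical helper: runs every case through a trained network
// and rounds the single output value to 0 or 1.
const checkXor = net =>
  xorCases.map(({ input, expected }) => {
    const predicted = Math.round(net.run(input)[0]);
    return { input, expected, predicted, ok: predicted === expected };
  });

// Usage, assuming `net` is the trained network from the example above:
// console.log(checkXor(net));
```

Rounding is just one simple way to turn the network's fuzzy output (something like 0.987) into a hard yes/no answer.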

Working with strings

Don’t.

Seriously, working with strings in Brain.js is like driving a Maserati at 5mph.
It will let you do it, but it will be really slow (~2 mins for 20k iterations).

We want a Neural Network which takes a string, for example a Tweet, and tells us its mood (sad or happy?).

That kind of task is called emotion detection.

First of all, we need to convert our input to a numeric value. Let’s take a look at the following functions:

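// Turn a string into an array of numbers, one per character
// (character code divided by 255, so plain ASCII text lands between 0 and 1).
const encode = d => d.split('').map(c => c.charCodeAt(0) / 255)

// Encode every string in an array of strings.
const encodeData = data => data.map(d => encode(d))

encode("I am happy");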

/*
Outputs:
 [ 0.28627450980392155,
   0.12549019607843137,
   0.3803921568627451,
   0.42745098039215684,
   0.12549019607843137,
   0.40784313725490196,
   0.3803921568627451,
   0.4392156862745098,
   0.4392156862745098,
   0.4745098039215686 ]
*/

That is perfect! Brain.js works well with values like these.
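One caveat with this encoding: charCodeAt() returns UTF-16 code units, which can go up to 65535, so dividing by 255 only keeps values between 0 and 1 for basic Latin characters. Tweets often contain emoji and symbols. The sketch below shows one possible workaround; encodeClamped is a hypothetical variant written for this illustration, not something Brain.js provides.

```javascript
// Same idea as encode(), but caps each code unit at 255
// so every value stays between 0 and 1.
const encodeClamped = text =>
  text.split('').map(c => Math.min(c.charCodeAt(0), 255) / 255);

console.log(encodeClamped('ok'));        // two values between 0 and 1
console.log('€'.charCodeAt(0) / 255);    // greater than 1 with the plain division
console.log(encodeClamped('€'));         // [ 1 ]
```

Clamping throws away the difference between all the "exotic" characters, which is a crude trade-off, but it keeps the inputs in a predictable range.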

Training Data

Now, let’s navigate to our src/ folder and create a new folder called training-data:

$ cd src && mkdir training-data && cd training-data

Now, let’s create three files (d- is a simple prefix for data):

$ touch index.js
$ touch d-happy.js
$ touch d-sad.js

Now edit each file and add your training data:

d-happy.js


const happy = [
  {
    input: "I am happy",
    output: {happy: 1}
  },
  {
    input: "I feel fine",
    output: {happy: 1}
  },
  {
    input: "What a good day!",
    output: {happy: 1}
  }
]

module.exports = happy

d-sad.js

const sad = [
  {
    input: "I am sad",
    output: {sad: 1}
  },
  {
    input: "I feel bad",
    output: {sad: 1}
  },
  {
    input: "Such a bad day",
    output: {sad: 1}
  }
]

module.exports = sad

and index.js

const happy = require('./d-happy')
const sad   = require('./d-sad')

const moods = [
  ...happy,
  ...sad
];

module.exports = moods;


/*
We're using the amazing spread operator, which lets us spread one array into another.
We could also use Array.push() or Array.concat(), but the spread operator is just beautiful :)
Btw, that's what we get:
[ { input: 'I am happy',        output: { happy: 1 } },
  { input: 'I feel fine',       output: { happy: 1 } },
  { input: 'What a good day!',  output: { happy: 1 } },
  { input: 'I am sad',          output: { sad: 1 } },
  { input: 'I feel bad',        output: { sad: 1 } },
  { input: 'Such a bad day',    output: { sad: 1 } }
]
*/
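Before training, it can help to check that the merged dataset is balanced between the two moods. A minimal sketch, assuming the data has the shape shown above; countByLabel is a hypothetical helper and the small moods array here is inlined only to keep the snippet self-contained.

```javascript
// Hypothetical sanity check: count how many examples carry each label.
const countByLabel = data =>
  data.reduce((counts, { output }) => {
    Object.keys(output).forEach(label => {
      counts[label] = (counts[label] || 0) + 1;
    });
    return counts;
  }, {});

// Inlined sample in the same shape as the training data.
const moods = [
  { input: 'I am happy', output: { happy: 1 } },
  { input: 'I feel fine', output: { happy: 1 } },
  { input: 'I am sad', output: { sad: 1 } }
];

console.log(countByLabel(moods)); // { happy: 2, sad: 1 }
```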

Now guess what we need to do? We need to encode and serialize our data! So move back to our src/ folder and let’s create a new file:

$ touch serialize.js

Now open it with your favorite text editor and add the following code:


// One number per character: character code scaled by 255.
const encode = d => d.split('').map(c => c.charCodeAt(0) / 255)

const encodeData = data => {
  return data.map( d => {
    return {
        input:  encode(d.input),
        output: d.output
      }
  })
}

const serialize = data => encodeData(data)

module.exports = {
  serialize:  serialize,
  encode:     encode,
}

As you can see, we’re gonna export two functions. The first one serializes our whole dataset; the second one just encodes a single string. We’ll use the latter when we need to encode our input string.

Gotcha time!

Encoding is not enough.
We’re gonna pass our Neural Network a lot of strings with different lengths. Brain.js doesn’t like that (it will output NaN as a result, wtf lol), so we need to serialize our data and pad all the arrays to the same length.
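Where might that NaN come from? A plausible explanation, sketched here without Brain.js: a neuron computes a weighted sum over its inputs, and if the input array is shorter than the number of weights, the missing entries are undefined, which turns the whole sum into NaN. The weightedSum function is a simplified stand-in written for this illustration, not the library's actual code.

```javascript
// Simplified stand-in for what a single neuron does with its inputs.
const weightedSum = (weights, input) =>
  weights.reduce((sum, w, i) => sum + w * input[i], 0);

const weights = [0.5, 0.5, 0.5];

console.log(weightedSum(weights, [1, 1]));    // NaN, because input[2] is undefined
console.log(weightedSum(weights, [1, 1, 0])); // 1, padding with 0 fixes it
```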

Just add the following code at the top of the serialize.js file, then edit the serialize function and module.exports:

const fixLengths = (data) => {

  // Find the length of the longest encoded input.
  let maxLengthInput = -1;
  for (let i = 0; i < data.length; i++) {
    if (data[i].input.length > maxLengthInput) {
      maxLengthInput = data[i].input.length;
    }
  }

  // Pad every shorter input with zeros up to that length
  // (this changes the arrays in place).
  for (let i = 0; i < data.length; i++) {
    while (data[i].input.length < maxLengthInput) {
      data[i].input.push(0);
    }
  }

  return data;
}

// One number per character: character code scaled by 255.
const encode = d => d.split('').map(c => c.charCodeAt(0) / 255)

const encodeData = data => {

  return data.map( d => {

    return {
        input:  encode(d.input),
        output: d.output
      }
  })
}

const serialize = data => fixLengths(encodeData(data))

module.exports = {
  serialize:  serialize,
  encode:     encode,
  fixLengths: fixLengths
}

This is gonna solve our problems. For now 😅
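Note that fixLengths only equalizes the arrays it is given. A string encoded later for net.run() is not padded, so it may end up shorter or longer than the training inputs. One possible way to handle that is sketched below; padTo and inputSize are hypothetical names for this illustration, and 16 is simply the length of the longest training string above ("What a good day!").

```javascript
const encode = text => text.split('').map(c => c.charCodeAt(0) / 255);

// Hypothetical helper: pad with zeros or cut off the tail
// so the array has exactly `length` items.
const padTo = (encoded, length) => {
  const out = encoded.slice(0, length);
  while (out.length < length) {
    out.push(0);
  }
  return out;
};

const inputSize = 16; // longest training string: "What a good day!"

console.log(padTo(encode('I am sad'), inputSize).length);          // 16
console.log(padTo(encode('Nothing is not ok'), inputSize).length); // 16
```

Cutting off long strings loses information, so with a real dataset you would probably pick a larger fixed size up front rather than rely on the longest training example.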

Training our network

Now we need to train our network on that data! Open your index.js file and add the following code:

const brain      = require('brain.js')
const trainData  = require('./src/training-data')
const serializer = require('./src/serialize')
const net        = new brain.NeuralNetwork()

net.train(serializer.serialize(trainData), { log: true })

As you can see, we’re passing a second argument to the train() method:

{ log: true }

This prints the training progress to our terminal.
Try running our program:

$ node index.js

We’ll receive the following output:

iterations: 10, training error: 0.2639952050215581
iterations: 20, training error: 0.2634632080241264
iterations: 30, training error: 0.2627503794173378
iterations: 40, training error: 0.2618992898044093
iterations: 50, training error: 0.26085229846593605
iterations: 60, training error: 0.25953170291711275
iterations: 70, training error: 0.2578384270917648
iterations: 80, training error: 0.2556477642282941
iterations: 90, training error: 0.25280641798031445
iterations: 100, training error: 0.24913258946236203

...

Of course, our training error is a bit high. That’s because we’ve got a really poor dataset for training. Larger datasets will produce better results!
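Besides log, train() accepts a few other options that control how long and how aggressively the network trains. The object below is only a sketch: the option names come from the Brain.js README, and the values are what I believe the defaults to be, so double-check them against the version you installed.

```javascript
// Options object for net.train(data, trainingOptions).
// Values are assumed to match the Brain.js defaults; verify for your version.
const trainingOptions = {
  iterations: 20000,   // stop after this many training iterations...
  errorThresh: 0.005,  // ...or as soon as the training error drops below this
  log: true,           // print progress to the terminal
  logPeriod: 10,       // how many iterations between log lines
  learningRate: 0.3    // how strongly each iteration adjusts the weights
};

console.log(Object.keys(trainingOptions).join(', '));
```

With a dataset this small, tweaking these numbers will not work miracles, but they are the first knobs to try when the error stops going down.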

Moment of truth

Now we just need to test our trained Neural Network.
Let’s add the following code to our index.js file:

const brain      = require('brain.js')
const trainData  = require('./src/training-data')
const serializer = require('./src/serialize')
const net        = new brain.NeuralNetwork()

net.train(serializer.serialize(trainData))

const output = net.run(serializer.encode('Nothing is not ok'))

console.log(output)

Run the program and whoa! The network is pretty confident that the mood is sad:

{ happy: 0.07987350225448608, sad: 0.9199661612510681 }
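If you only care about the winning mood, you can pick the key with the highest score. A minimal sketch: topLabel is a hypothetical helper written for this article, and the output object is just the result shown above, copied in by hand.

```javascript
// Hypothetical helper: return the key with the highest score.
const topLabel = output =>
  Object.keys(output).reduce((best, key) =>
    (output[key] > output[best] ? key : best));

// The result from above, copied in by hand.
const output = { happy: 0.07987350225448608, sad: 0.9199661612510681 };

console.log(topLabel(output)); // sad
```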

Of course, you’re gonna need to train your Neural Network a lot more for more complex strings, and this article should be just your starting point 😇

Make sure to check out all the features of Brain.js, it’s totally worth the time!


This article is CC0 1.0 (Public Domain) licensed.