Machine Learning In A Nutshell ~ Behold The Wonders Of A Grade 10 Math Book
The tsunami of information that bombards us was supposed to sink us in a quagmire of bits and bytes and paralyze us with information overload. That was the scenario painted by Alvin Toffler in his unusually prescient book "Future Shock". The premise was that some three million years of evolution in a non-technological world left us poorly prepared to handle the onslaught of the information stream that assaults us almost every waking minute.
As it turns out, the computer that is responsible for creating the problem is now being used to solve the problem. If you scan the tech section of any publication, the words "Big Data", "Machine Learning", "Deep Learning" and "Artificial Intelligence" will jump out at you. This jargon all points to computing machines digesting the vast amounts of data that they produce and creating usable information.
Most of the data generated is generated by machines, and by itself it is junk. You can't learn much from any single piece of it. A thousand pieces of data may have valuable information in them, or they may not. Often the value in that huge collection lies in the exceptions to the average values, the data outliers. For example, a deviation from the usual buying pattern of consumers could signal the beginning of a new trend. These are called weak signals, and they may give a competitive edge to those data miners who are able to isolate them and capitalize on them. Another term that you will hear is "fat tails". This is data that doesn't fit a standard bell curve: extreme values turn up more often than the curve predicts, so the ends of the curve bulge when you plot it on a graph. Usually it means that something out of the ordinary is happening that could provide valuable intelligence to the data analyst. That information is not apparent from watching the big stream of machine-generated data go by.
So how does a machine actually learn? The old way of doing things was to store each piece in a database and then try to look it up. It was like going to the library and reading every index card on the subject that you are trying to look up. Needless to say, it doesn't work very well if you have millions of cards to go through. There had to be a better way, and that better way was the artificial neural network. It is the basis of much of modern machine learning.
An artificial neuron is a very simple thing, and is quite stupid actually. All that it can do is add, multiply, compute just one math function (a formula) and compare the result. However, this little virtual, self-learning thingie is the building block of every neural network. You can gang hundreds and even thousands of them together in a massively parallel system, and they can do very complex things like recognize faces and handwriting, find doorways for robots, and tease out the latest trends in footwear.
This is how it works. Let's suppose that you want to teach your machine to recognize the number 42, which according to the "Hitchhiker's Guide To The Galaxy" is the answer to the Ultimate Question of Life, the Universe, and Everything, as computed by the supercomputer Deep Thought. (The Earth, a huge organic computer, was built afterward to work out what the question actually was.)
You could do this with the simplest example of an artificial neural network. It is a single neuron consisting of an input and an output. All of the knowledge of recognizing the number 42 is stored simply as a number, in a value called the weight. And no, the weight is not 42. The weight is the numeric value that determines if the neural network hoists up a flag indicating that it has seen the number 42.
There is another hidden input number that is unchanging in value for all inputs, and it is called the bias. The bias is like a control number. A very simple analogy is a thermostat. In real life, a thermostat controls the temperature range at which a furnace will fire. In an artificial neural network, the bias has the same function: it determines the range at which the neuron will fire to indicate the number 42.
So when you present any number to the input, the neuron multiplies that number by a weight. It also multiplies the bias by a weight of its own. It adds the two together. Then it spoon-feeds the sum down the chute into the activator. This is a go/no-go threshold. The activator consists of a mathematical formula that defines a function. What is special about this activation function is that no matter what number you feed into it, it always puts out a number between zero and one, usually as a very long decimal. It is like a thermometer. The closer the input gets to the right answer, the closer to 1 the value that comes out of the activation function. If the output falls short of a cutoff close to one, it is a failure or a no-go. The neuron doesn't fire.
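The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. The bell-shaped activation `exp(-z*z)` is an assumption, chosen because it peaks at 1 for the right answer (many real networks use a sigmoid instead). The values `WEIGHT` and `BIAS_WEIGHT` and the 0.5 cutoff are hand-picked, hypothetical numbers.

```python
import math

def activation(z):
    """Bell-shaped activation (an assumption): 1.0 when z is 0, falling toward 0 as z moves away."""
    return math.exp(-z * z)

def neuron(x, weight, bias_weight, bias=1.0):
    """Multiply the input and the bias by their weights, add them, then activate."""
    z = x * weight + bias * bias_weight
    return activation(z)

# Hand-picked, hypothetical weights: the sum is zero exactly when x is 42.
WEIGHT = 1.0
BIAS_WEIGHT = -42.0

def fires(x, cutoff=0.5):
    """Go/no-go decision: the neuron fires only if its output reaches the cutoff."""
    return neuron(x, WEIGHT, BIAS_WEIGHT) >= cutoff

for x in (7, 41, 42, 43):
    print(x, neuron(x, WEIGHT, BIAS_WEIGHT), fires(x))
```

Notice that the weight is 1 and the bias weight is -42, so the knowledge of "42" really does live in the weights and not in the code.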
You don't even have to determine the weight yourself. The neuron can be trained. The best-known training method is called backpropagation. In training mode, you show it a whole bunch of numbers called a training set, each paired with the right answer: when the input number is 42, the neuron should respond with a value of 1, and for any other number it should show a zero at the output. When you run it and it gets the answer wrong, it adjusts the weights a little bit and tries again. You keep running the training set until it knows the right answer. It is that simple.
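Here is a rough sketch of that "get it wrong, nudge the weights, try again" loop. It rests on several assumptions. It uses the classic perceptron update rule rather than full backpropagation, which only becomes necessary once neurons are stacked in layers. It also uses a plain go/no-go step as the activator, and a lone neuron of that kind can only split the number line in two, so the sketch trains it on the simpler question "is the number 42 or more?". The names `scale`, `train`, `rate` and `max_epochs` are hypothetical, as is the input scaling.

```python
def step(z):
    """Go/no-go activator: 1 if the weighted sum is zero or more, else 0."""
    return 1 if z >= 0 else 0

def scale(x):
    """Squash inputs from 0..99 into roughly -1..1 (an assumed normalization)."""
    return (x - 50) / 50

def predict(x, weight, bias_weight):
    """Weighted input plus weighted bias (the bias itself is fixed at 1)."""
    return step(scale(x) * weight + 1.0 * bias_weight)

def train(training_set, rate=0.1, max_epochs=50_000):
    """Run the training set over and over, nudging the weights after each wrong answer."""
    weight, bias_weight = 0.0, 0.0
    for epoch in range(max_epochs):
        errors = 0
        for x, target in training_set:
            error = target - predict(x, weight, bias_weight)
            if error != 0:
                # Nudge both weights a little in the direction that reduces the error.
                weight += rate * error * scale(x)
                bias_weight += rate * error * 1.0
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the neuron knows the right answer.
            return weight, bias_weight, epoch
    raise RuntimeError("training did not settle within max_epochs")

# Training set: every number from 0 to 99, labeled 1 if it is 42 or more.
training_set = [(x, 1 if x >= 42 else 0) for x in range(100)]
weight, bias_weight, epochs = train(training_set)
print("learned weights:", weight, bias_weight, "after", epochs, "passes")
print(predict(41, weight, bias_weight), predict(42, weight, bias_weight))
```

Nobody typed the number 42 into the neuron. It found a weight and a bias weight that put the go/no-go line between 41 and 42 purely from the labeled examples.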
What makes this a powerful concept is that you can gang hundreds of neurons together, and machine learning can then do quite complex stuff. Behold the amazing Dark Arts of a Grade 10 math book.
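To give a taste of what ganging neurons together buys you, here is a small sketch with just three of them. One go/no-go neuron can only say "this number or bigger", but two of them feeding a third can single out 42 exactly. The weights are hand-picked, hypothetical values rather than trained ones, the step activator is an assumption, and the trick only holds for whole numbers.

```python
def step(z):
    """Go/no-go activator: 1 if the weighted sum is zero or more, else 0."""
    return 1 if z >= 0 else 0

def neuron(inputs, weights, bias_weight):
    """Weighted sum of the inputs plus the weighted bias (fixed at 1), then go/no-go."""
    total = sum(i * w for i, w in zip(inputs, weights)) + 1.0 * bias_weight
    return step(total)

def recognizes_42(x):
    """Three ganged neurons with hand-picked weights (whole-number inputs assumed)."""
    at_least_42 = neuron([x], [1.0], -41.5)  # fires for 42 and up
    at_least_43 = neuron([x], [1.0], -42.5)  # fires for 43 and up
    # The output neuron fires only when the first fired and the second did not.
    return neuron([at_least_42, at_least_43], [1.0, -1.0], -0.5)

print([x for x in range(100) if recognizes_42(x)])  # prints [42]
```

Each neuron is still doing nothing more than multiplying, adding and comparing. The cleverness comes entirely from how they are wired together.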