Tip 1: Type ls -d */ to list the directories within the current directory on Linux. (The reason why I am posting it here is that I forgot it.)
The other thing that threw me tonight was that I had a tab or multiple spaces in an SQL script, and I just got a huge listing every time that I tried to load the script. Again, I had solved this issue before by taking out the extra spaces and tabs, but I had forgotten this.
Hope this helps someone.
OK, so I am working on and transferring files between my local machine and a production server. I move the files from the prod server to check them out on my machine. They are classes that connect to a MySQL database. I fire everything up and the console fills with all sorts of exceptions, dolphin crap and such. Here are the error messages in the console:
Unable to connect due to sql exception
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Cannot load connection class because of underlying exception: 'java.lang.NumberFormatException: For input string: "null"'.
Cannot load connection class because of underlying exception: 'java.lang.NumberFormatException: For input string: "null"'.
Cannot load connection class because of underlying exception: 'java.lang.NumberFormatException: For input string: "null"'.
So first I check to see if I can ping the database. I can. The database tool works fine. As it turns out, I still had the database credentials pointing to the ones on the production server. On my dev machine, they are a lot simpler.
It wasn't connecting to the database, but it wasn't exactly explicitly telling me that.
Hope this helps someone.
Standard Artificial Neural Network Template
For the past few weeks, I have been thinking about having a trained artificial neural network on a computer, transferring it to another computer or another operating system, or even selling the trained network in a digital exchange in the future.
It really doesn't matter what programming language artificial neural networks are written in. They all have the same parameters: inputs, outputs, weights, biases etc. All of these values are particularly suited to be fed into the program using an XML document based on an .XSD schema or a light-weight format like JSON. However, to my knowledge, this hasn't been done, so I took it upon myself to crack one out.
It is not only useful in creating portability in a non-trained network, but it also has data elements for a trained network as well, making the results of deep learning, machine learning and AI training portable and available.
Even if there are existing binaries, creating a front end to input the values would take minimal programming, re-programming or updating.
I also took the opportunity to make it extensible and flexible. Also there are elements that are not yet developed (like an XML XSD tag for a function) but I put the capability in, once it is developed.
There are a few other interesting things included. There is the option to define more than one activation function. The values for the local gradient, the alpha and other parameters are included for further back propagation.
There is room to include a link to the original dataset on which these nets were trained (it could be a URL, a directory pathway, a database URL etc). There is an element to record the number of training epochs. With all of this information, the artificial neural net can be re-created from scratch.
There is extensibility in case this network is chained to another. There is the added data dimension in case other types of neurons are invented, such as accumulators, or neurons that return a probability.
I put this .xsd template on Github as a public repository. You can download it from here:
http://github.com/kenbodnar/ann_template
Or if you wish, here are the contents of the .xsd, called ann.xsd. It is heavily commented for convenience.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="artificial_neural_network">
<xs:complexType>
<xs:sequence>
<!-- The "name" element is the name of the network. They should have friendly names that can be referred to if it ever goes up for sale, rent, swap, donate, or promulgate.-->
<xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/>
<!-- The "id" element is optional and can be the pkid if the values of this network are stored in an SQL (or NOSQL) database, to be called out and assembled into a network on an ad hoc basis-->
<xs:element name="id" type="xs:integer" minOccurs="0" maxOccurs="1"/>
<!-- The "revision" element is for configuration control-->
<xs:element name="revision" type="xs:string" minOccurs="1" maxOccurs="1"/>
<!-- The "revision_history" is optional and is an element to describe changes to the network -->
<xs:element name="revision_history" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "classification" element is put in for later use. Someone will come up with a classification algorithm for types of neural nets. There is room for a multiplicity of classifications-->
<xs:element name="classification" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<!-- The "region" element is optional and will be important if the networks are chained together, and the neurons have different functions than a standard neuron, like an accumulator or a probability computer
and are grouped by region, disk, server, cloud, partition, etc-->
<xs:element name="region" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "description" element is an optional field, however a very useful one.-->
<xs:element name="description" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "creator" element is optional and denotes who trained these nets -->
<xs:element name="creator" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "notes" element is optional and is self explanatory-->
<xs:element name="notes" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The source element defines the origin of the data. It could be a URL -->
<xs:element name="dataset_source" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- This optional element, together with the source data helps to recreate this network should it go wonky -->
<xs:element name="number_of_training_epochs" type="xs:integer" minOccurs="0" maxOccurs="1"/>
<!-- The "number_of_layers" defines the total-->
<xs:element name="number_of_layers" type="xs:integer" minOccurs="1" maxOccurs="1"/>
<xs:element name="layers">
<xs:complexType>
<xs:sequence>
<!-- Repeat as necessary for number of layers-->
<xs:element name="layer" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<!-- Layer Naming and Neuron Naming will ultimately have a recognized convention eg. L2-N1 is Layer 2, Neuron #1-->
<xs:element name="layer_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- number of neurons is for the benefit of an object-oriented constructor-->
<xs:element name="number_of_neurons" type="xs:integer" minOccurs="1" maxOccurs="1"/>
<!-- defining the neuron; this is repeated as many times as necessary-->
<xs:element name="neuron" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<!--optional ~ currently it could be a perceptron, but it could also be a new type, like an accumulator, or probability calculator-->
<xs:element name="type" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- name is optional ~ name will be standardized eg. L1-N1 layer/neuron pair. The reason is that there might be benefit in synaptic joining of this layer to other networks and one must define the joins -->
<xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- optional ~ again, someone will come up with a classification system-->
<xs:element name="neuron_classification" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- number of inputs-->
<xs:element name="number_of_inputs" type="xs:integer" minOccurs="1" maxOccurs="1"/>
<!-- required if the input layer is also an output layer - eg. sigmoid, heaviside etc-->
<xs:element name="primary_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- ~ optional - there is no such thing as a xs:function - yet, but there could be in the future; xs:string is the placeholder until then so that the schema stays valid -->
<xs:element name="primary_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- in lieu of an embeddable function, a description could go here ~ optional -->
<xs:element name="primary_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- possible alternate activation functions eg. sigmoid, heaviside etc-->
<xs:element name="alternate_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- ~ optional - there is no such thing as a xs:function - yet, but there could be in the future; xs:string is the placeholder until then so that the schema stays valid -->
<xs:element name="alternate__activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- in lieu of an embeddable function, a description could go here ~ optional -->
<xs:element name="alternate__activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- if this is an output layer or requires an activation threshold-->
<xs:element name="activation_threshold" type="xs:double" minOccurs="1" maxOccurs="1"/>
<xs:element name="learning_rate" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- the alpha or the 'movement' is used in the back propagation formula to calculate new weights-->
<xs:element name="alpha" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- the local gradient is used in back propagation-->
<xs:element name="local_gradient" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- inputs, as many as needed-->
<xs:element name="input" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<!-- Inputs optionally named in case order is necessary for definition -->
<xs:element name="input_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- use appropriate type-->
<xs:element name="input_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
<!-- use appropriate type-->
<xs:element name="input_value_integer" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/>
<!-- weight for this input-->
<xs:element name="input_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
<xs:element name="input_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of input-->
<!-- bias start-->
<xs:element name="bias">
<xs:complexType>
<xs:sequence>
<xs:element name="bias_value" type="xs:double" minOccurs="1" maxOccurs="1"/>
<xs:element name="bias_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
<xs:element name="bias_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of bias-->
<xs:element name="output">
<xs:complexType>
<xs:sequence>
<!-- outputs optionally named in case order is necessary for definition -->
<xs:element name="output_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="output_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
<!-- hypothetical value is a description of what it means if the neuron activates and fires as output if this is the last layer-->
<xs:element name="hypothetical_value" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of output-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of neuron-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of layer-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of layers-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- network-->
</xs:schema>
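To give a feel for how little front-end code is needed, here is a minimal Python sketch that reads a network description shaped after the element names in ann.xsd. It is only an illustration under stated assumptions: the sample document and all of its values are made up, the function name load_network is hypothetical, it assumes a neuron may carry more than one input element (as the schema comments intend), and it does not validate against the schema, because the Python standard library has no XSD validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: element names follow ann.xsd,
# every value below is invented for illustration.
SAMPLE = """
<artificial_neural_network>
  <name>demo_net</name>
  <revision>1.0</revision>
  <number_of_layers>1</number_of_layers>
  <layers>
    <layer>
      <layer_name>L1</layer_name>
      <number_of_neurons>1</number_of_neurons>
      <neuron>
        <name>L1-N1</name>
        <number_of_inputs>2</number_of_inputs>
        <activation_threshold>0.5</activation_threshold>
        <learning_rate>0.1</learning_rate>
        <alpha>0.9</alpha>
        <local_gradient>0.0</local_gradient>
        <input>
          <input_name>x1</input_name>
          <input_value_weight>0.4</input_value_weight>
          <input_value_previous_weight>0.35</input_value_previous_weight>
        </input>
        <input>
          <input_name>x2</input_name>
          <input_value_weight>0.6</input_value_weight>
          <input_value_previous_weight>0.55</input_value_previous_weight>
        </input>
        <bias>
          <bias_value>1.0</bias_value>
          <bias_value_weight>-0.2</bias_value_weight>
          <bias_value_previous_weight>-0.25</bias_value_previous_weight>
        </bias>
        <output>
          <output_name>y</output_name>
        </output>
      </neuron>
    </layer>
  </layers>
</artificial_neural_network>
"""


def load_network(xml_text):
    """Return (network name, layers); each layer is a list of neuron dicts."""
    root = ET.fromstring(xml_text)
    layers = []
    for layer in root.find("layers").findall("layer"):
        neurons = []
        for neuron in layer.findall("neuron"):
            # One weight per <input> element, in document order.
            weights = [
                float(inp.findtext("input_value_weight"))
                for inp in neuron.findall("input")
            ]
            neurons.append({
                "name": neuron.findtext("name"),
                "weights": weights,
                "bias_weight": float(neuron.findtext("bias/bias_value_weight")),
                "threshold": float(neuron.findtext("activation_threshold")),
            })
        layers.append(neurons)
    return root.findtext("name"), layers


network_name, network_layers = load_network(SAMPLE)
print(network_name, network_layers)
```

The same dictionaries could be handed to whatever neural network code already exists, which is the portability argument in a nutshell.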
I hope this helps someone. This is open source. Please use it and pass it on if you find it useful.
How To Make Scary Smart Computers
If you have ever visited this blog before, you will know that I am heavily into Artificial Intelligence. I have been playing with artificial neural networks for about ten years. Recently buoyed by the Alan Turing movie "The Imitation Game" and "The Innovators" by Walter Isaacson, I have decided to start mapping out what it would take to make the truly intelligent, conscious, creative computer that would easily pass a Turing Test.
In previous blog entries, like the one immediately below, I outline the need for an autonomous master controller that can stop and start programs based on what an embedded set of artificial neural networks comes up with. I have also started thinking about a standardized artificial neural net that can be fed into any machine already trained, so there can be a market for trained artificial neural nets. I have outlined a possible algorithm for perturbations in artificial neural nets that can create computational creativity. The list goes on and on.
But here is another essential element. If computers have embedded artificial neural networks in them, then for the machine to become scary smart, it has to be able to pass on what it has learned to the next generation. So how do you accomplish that? Easy.
Every time that an Operating System or OS is upgraded, it is upgraded by a predecessor machine who passed on the trained and learned neural nets that it has learned in its artificial lifetime. It is the equivalent of a parent teaching a child.
In the biological world, what makes humans different from the animals is that we can pass on wisdom, knowledge and observation. In the animal kingdom, each new generation starts from where its parents did -- near zero. Animals learn from their parents. But I can go and read a book, say one written by the Reverend Thomas Bayes, who wrote in the 1700s on Bayesian theory, and I can read last week's journals. I can pick and choose to learn whatever I want from the human body of knowledge. But first and foremost, I get my first instruction from my parents.
So if a new Operating System is loaded into a computer from an existing one with artificial intelligence, then it won't have to start from scratch. And if you embed the ability of the artificial neural networks to read and learn stuff by crawling the internet, soon you will have a scary smart computer. The key is that each machine and each server is capable of passing on stuff that it has learned to new computers.
I do believe what Dr. Stephen Hawking says, which is that one of the threats to mankind will be some of the artificial intelligence that we create. But like nuclear fission, we have to build it for the sake of knowledge and progress, even if it has the potential to do the human species grave harm.
We have already opened the can of worms of artificial intelligence. Once opened, there is no way to close it unless you have a bigger can. Unfortunately, the contents of that can of worms are expanding faster than we can keep up with them. The best way to control artificial intelligence is to have a hand in inventing a safe species of it.
I'm Sorry Dave, I Can't Let you Do That ~ The AI Autonomous Computer Master Controller
We still operate as Microsoft slaves when it comes to a computer. Can you imagine how ludicrous it will seem to our grandchildren that we had to double click on an icon to start a program? They will not be able to imagine us having to direct, save and open a downloaded file. They will wonder why we were so primitive that we had to have anti-virus programs. They will wonder why we didn't create something as obvious as the Autonomous Computer Controller as part of every operating system.
The genesis of this blog posting came after I did a run of my Artificial Neural Network and then had to take the outputs and feed them into another program (in this case, it was Excel). I got to thinking that if my vision of an Artificial Neural Network embedded in every operating system comes to fruition, then the network should not just solve functions, sort and classify data, recognize stuff and do what ANNs (Artificial Neural Networks) do today. They should be able to invoke programs based on the outputs of specific neural pathways and events.
A natural extension of that thought, is that they would signal an Autonomous Computer Master Controller to fire up a program to use the outputs of the particular neural network that just finished firing. If that controller is smart enough to recognize when the AI network is telling it to start the program, then it should be smart enough to shut a program down when it's not needed. This is the small starting point of an autonomous semi-intelligent machine. But let us take it one step further.
Not only is the ANN telling it to fire up a program, and then shut it down, but the ANN could also be running the Autonomous Computer Master Controller. Moore's Law will let that happen as microprocessors get faster and faster and the overhead becomes negligible. A core or two of the microprocessor could be dedicated to the intelligent OS operations.
Since we are not evolving a computer intelligence from scratch, we can take short cuts in the evolutionary cycle of smart machines. We don't have to create memory and storage from first principles, like using an artificial neuron as a building block, the way that biological evolution took place. We can do this by making the Autonomous Computer Master Controller talk to a graph database. A graph database can map ambiguous, non-empirical, indirect relationships. So the Autonomous Computer Master Controller observes what is going on with its AI network, and saves its observations in a graph. As neo4j, the premier graph database provider, says, "Life is a graph". They are pretty cool guys at neo4j, unlike those at Arango DB -- they are the kinds of jerks who follow you on Twitter, and once you extend the courtesy of following them back, they unfollow you. They have a thing or two to learn about brand-building. But je digress.
The power of an Autonomous Computer Master Controller becomes obvious in various use cases. If you have an overseer monitoring the browser and its downloads, it can detect executable code, and as such it can pull an "I'm sorry, Browser Dave, but I can't let you do that!", and block the download.
Slowly, I am stitching together the theoretical pieces of an intelligent, creative (perhaps conscious) machine that will pass the Turing Test and perhaps equal and/or surpass us in intelligence. A machine like that will crack this veneer that we call civilization, and show us how thin it really is, and how unenlightened most carbon-based life forms are. Silicon Rules!
The End of the Big Data Fad ~ Introducing Data Flow
If I were a venture capitalist, which one day I hope to be, I wouldn't fund companies and start-ups that process reams and reams of Big Data or Dark Data. Big Data, as we know it, is a flash in the pan, and it will disappear just like the Atari.
Yes, we will have the internet of everything generating more data than anyone can handle. Yes, we will have data generated by every single one of the billions of humans inhabiting this planet. Yes, we will have data generated by the trillions of devices that are or will be connected. But gathering huge lots of data and then batch processing it is an unsustainable model.
When I was just starting out adult life, one of my neighbors was a draft dodger from the Vietnam War. He was/is a pacifist in the positive sense. He didn't trust the motives of the US government or the McCarthy Communist witch hunts, and his buddies were dying in foreign jungles for an unfathomable reason, in a war that they couldn't win. So he came to Canada. He has a degree in civil engineering, but he landed in Silicon Valley North and started working for a start-up. It was an exciting time. The computer was becoming ubiquitous, and almost every industry was crying for some sort of computerization in an age where there weren't any off-the-shelf software packages.
He joined a start-up and his job was to write the software for a data digestor for a shoe manufacturing company. The company would do a run of shoe components (soles, uppers, large, small, all sizes, ladies, men's, children, brown, black, purple) and they didn't know how many parts of what. They generally ran a line until they ran out of component pieces, and then laboriously switched over to another make, color and size.
It was the perfect application for a data digestor. Every time a component batch was made, someone would swipe a pre-programmed card and the result would go to a collector, and then to a database that could be queried, and management could do some actual planning by matching what components they had and knowing the manufacturing run limit. Big data wasn't very big then, but it was still an issue.
After the data digestor was delivered, the start-up ran through all of their money including the shoe company money, leaving my friend as the last employee. His wife was a registered nurse so he lived off her income while he refined the data digestor. He worked for shares of the company abandoned by the other employees. He was paid his putative salary in shares and the shares were valued at ten cents. After a year of refinement and frugal living, the company was bought for its data digestor, and my old neighbor is wealthy to this day.
All this is to say that Big Data is going to grow exponentially, and we can't handle it like we are doing now, with old paradigms. Big Data has to be digested, mined and made sense of when it is created. It can't be allowed to accumulate; otherwise, when we get around to it, the effort will be akin to emptying the ocean with a teaspoon.
Because of these old paradigms still in play, in the long term, I would short the stock of SAP and SAS and all of the old-school stuff. A better way will come along, just like the tire-cord industry disappeared when Michelin invented the radial tire, and we will have a new something-else.
So if I were a venture capitalist, I would fund and bet my money on creative, innovative paradigms for men and machines that make sense of data as it is created. If we don't develop a sure-fire universal way of doing that, in a few short years, even if every molecule of silicon in every mountain in the world were transmuted to semi-conductor memory, it still wouldn't be enough.
A bigger philosophical question is, "How much of this data is valuable?". That question will be answered by clever minds who can monetize pieces of it. The economic world is very Darwinian.
As for Big Data, we will bury that sucker. Its child will live on though, and we will call it Data Flow. Data Flow software will be ubiquitous and highly necessary.
Coming In From The Cold ~ The Fridge of the Future
I was walking behind a little boy, toddling along being a little boy. At one point, he looked up at his father and said "It's cold, just like in the fridge." You could see the gears turning inside his head, and in a few minutes he asked "Daddy, why is the fridge cold inside?".
I didn't hear the father's reply, but it got me to thinking. The fridge evolved from the icebox, which emanated from the experiments of Sir Francis Bacon and a Dr. Winterbourne. Apparently Bacon was convinced that snow would preserve a chicken. While they were driving in a carriage, they stopped at a farm house, and Bacon bought a chicken, had it killed and used his bare hands to stuff it with snow to preserve it. Legend maintains that because of the demonstration, he caught a cold and a few days later died from it, and the fridge was born. This is the only case in history where Chicken beats Bacon.
Cold is used to inhibit the growth of bacteria. Before electrification of households, ice used to be harvested from bodies of water in the winter, stored in ice houses and delivered to iceboxes during the summer. Other than getting an electric cold-maker from the compression of freon and other CFCs or chlorofluorocarbons that chew up the ozone layer and give us all cancer, fridges haven't changed in abstract theory from the iceboxes.
If I had to design a fridge of the future, I would do away with the cold component. I got the idea while watching a "How It's Made" show on TV where they were making bacteria-free scalpel blades for surgery. I realize by writing this, that I am putting the idea into the public domain, making it unpatentable, but that's okay with me. I like open source technology initiatives.
There would still be some refrigeration in the house, otherwise we would have to drink all of our beer at room temperature like the British and all end up with bad teeth. Also, we need to keep our ice cream solid. However for the non-freezing preservation of food, we could do without the cold. How you ask?
First of all, the fridge would be a positive-pressure device. That means that the air pressure in the fridge is slightly elevated, so when you open the door, the air would rush out and it wouldn't draw kitchen bacteria into the fridge. And the light that goes on -- well, it's a bacteria-killing ultraviolet light. To minimize human contact, the fridge would be built with a window, so you could turn on the light and look inside without opening it. Also, the built-in tablet (connected to the internet of course) could tell you that you are running low on milk. After all, we will have the internet of everything, and the built-in tablet will keep track of contents and expiry dates.
But that isn't the clever bit. The way that you obviate the need for cold, is twice a day, the fridge locks the doors so that no one can open them accidentally and irradiates all of the food killing the germs and preserving the food. This happens in less than 30 seconds, and the shot of bacteria-killer rays will still preserve freshness without chilling.
I know that some people will say that this creates frankenfood, but that is pseudo-science. They said that about the microwave oven as well, and now a microwave oven graces most kitchens.
Not only would this preserve the food, but you could throw your grungy dish cloths into a freezer bag, wait for the next germ-zapping cycle, and they would be fresh as new-fallen snow. And you could sterilize your scalpel blades after you do the exercises in "The Dummy's Guide to Self-Surgery".
How To Create Computational Creativity Without Killing Anything
In a previous blog post, I outlined some ideas on Computational Creativity, and the seminal work of Dr. Stephen Thaler. You can read it HERE. What Dr. Thaler did, was create neural nets, trained them to do things like recognize coffee cups, and then created a layer of supervisory neural nets to watch the nets. Then he would bring them to near death by killing a pile of the neurons in the layers. In very anthropomorphic terms, the neural network in paroxysms of Near Death, would create unique designs of other coffee cups. He called this process the Creativity Machine and it was some of the first steps in Computational Creativity using Artificial Neural Nets.
What Thaler was doing by formulating a mechanism for the Eureka moment was to create the impetus, spark and ignition of thoughts from a machine that was programmed not to think outside the box, but to slavishly follow a compiled set of instructions in its register stack. His unique algorithm was to produce a perturbation in the execution of the neural network process to create a confabulation or false idea that would be new and unique. For the time (and it still may be a valid algorithm), it was quite revolutionary. The problem to solve was to find some way to spark new information synthesis out of pre-programmed siliconized transistor pathways. After all, ideas just can't pop into a computer's circuits.
Our brains have massively parallel neural nets, and just thinking about anything sparks new thoughts. Our thinking processes undergo a perturbation, essentially interrupt vectors in staid ways of thinking. That was what Thaler was looking for inside the computer when he started the practice of committing neuroncide, and killing neurons.
In another blog article, where I try to link synaptic pruning as a method of creating perturbations in Artificial Neural Networks ( HERE ), I came up with the idea of crippling instead of killing the neurons by pruning some random inputs in the layers. I haven't tested it yet. I don't think that the resultant "ideas" or designs would be as far-out or as revolutionary as Thaler's killings, but it might prove useful. That idea has yet to be tried.
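As a rough illustration of the crippling idea, here is a minimal Python sketch that zeroes out a random fraction of one neuron's input weights instead of removing the neuron. It is untested as a creativity mechanism, just like the idea itself; the function name prune_random_inputs, the fraction parameter and the example weights are all assumptions made up for illustration.

```python
import random


def prune_random_inputs(weights, fraction, seed=None):
    """Return a copy of weights with a random fraction of them set to 0.0.

    Setting a weight to zero silences that input without deleting the
    neuron, which is the "crippling instead of killing" idea.
    """
    rng = random.Random(seed)  # seeded so a perturbation can be replayed
    count = int(len(weights) * fraction)
    pruned_indices = set(rng.sample(range(len(weights)), count))
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


original = [0.4, -0.7, 0.1, 0.9]
crippled = prune_random_inputs(original, fraction=0.5, seed=1)
print(original, crippled)
```

Because the function returns a copy, the trained weights stay intact and the perturbed network can be thrown away after it has produced its "idea".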
Then it struck me, that perhaps brain damage isn't a viable algorithm in the long term. Even though creativity can be brainlessly expressed when monkeys finger-paint and elephants do the Picasso thing with their trunks, one would want brains, even artificial ones, with all of their faculties for serious creative thought. So there has to be a better way than Thaler's, without killing anything.
If you want to avoid killing, and near-death experience just to create something, you still need the perturbations in regularized logic activity of artificial neural networks. Otherwise you would just get the outputs that the neural nets were trained for. However to Thaler's credit, he did introduce another mechanism that can be useful in creating these perturbations in Artificial Neural Networks in producing unique thoughts, and that is the supervisory network atop the thinking network.
In a future blog post, I will outline how I think that supervisory networks can contribute to Machine Consciousness, but for now, they can be integrated for non-death perturbations and idea creation in a new breed of Creativity Machines.
First let's look at a simple artificial neuron:
(I stole this gif from Thaler's website: http://imagination-engines.com/ )
By adjusting the weights and thresholds, the simple neuron becomes one or two of the Boolean Gates of Knowledge that computers are made of. It can be an AND Gate or an OR Gate. In this case the weights are decimals and the inputs and outputs are integers.
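To make the gate behaviour concrete, here is a small Python sketch of a threshold neuron with no activation function. The particular weights and thresholds are my own illustrative choices, not values taken from Thaler's diagram; many other combinations give the same gates.

```python
def fires(inputs, weights, threshold):
    """Simple neuron: weighted sum of the inputs compared against a threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Same weights, different thresholds: the threshold alone decides the gate.
AND_GATE = {"weights": [0.5, 0.5], "threshold": 1.0}
OR_GATE = {"weights": [0.5, 0.5], "threshold": 0.5}

for a in (0, 1):
    for b in (0, 1):
        and_out = fires([a, b], **AND_GATE)
        or_out = fires([a, b], **OR_GATE)
        print(a, b, "AND:", and_out, "OR:", or_out)
```

With both weights at 0.5, the AND gate needs both inputs on to reach 1.0, while the OR gate is satisfied by either input reaching 0.5.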
There is no activation function in this simple neuron. Usually an activation function is something like a sigmoid function. It takes the sum of the inputs multiplied by their weights, and after calculating the function, the output values fall in a small bounded range (between 0 and 1 for the logistic sigmoid, or between -1 and +1 for tanh), and the activation threshold is some value > 0 on the curve of the function.
If the threshold value for the neuron firing is set at, say, 0.9 or almost one, then anything below that is ignored and the neuron doesn't fire. But that doesn't mean that the activation function is quiescent. It still calculates and spits out numbers in that small bounded range. So if the activation threshold is 0.9 and the result of the sigmoid function is, say, 0.6, it will not activate the neuron. But we could say that the neuron is in an "excited" state because the output value of the sigmoid function is near the firing threshold. It is just on the cusp of firing. This excited state could be used as a perturbation to excite unique thoughts. This is where the supervisory network comes in.
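Here is a minimal Python sketch of the "excited" state, assuming a logistic sigmoid (whose output lies between 0 and 1). The excited_margin parameter, which defines how close to the threshold counts as excited, is my own assumption, as are the function names and the example weights.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_state(inputs, weights, bias=0.0, threshold=0.9, excited_margin=0.35):
    """Classify a neuron as 'fired', 'excited' (near the threshold) or 'quiet'."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    activation = sigmoid(weighted_sum)
    if activation >= threshold:
        return "fired", activation
    if activation >= threshold - excited_margin:
        return "excited", activation
    return "quiet", activation


# The middle case lands at roughly 0.6: below the 0.9 threshold, but close.
for weights in ([2.0, 2.0], [0.2, 0.2], [-1.0, -1.0]):
    print(weights, neuron_state([1, 1], weights))
```

A supervisory network would only need the first element of the returned pair to know which neurons are worth watching.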
A supervisory circuit can be a lot more powerful than Thaler envisioned. First of all, supervisory circuits overlaid on top of artificial neural networks placed in an n-tier of recursive monitoring are the first steps to machine consciousness. More on that in future blog posts.
But suppose that an independently trained ANN is monitoring other circuits for semi-excited layers or neurons, and reaches out, creating a synaptic link to these excited neurons. This may or may not cause the supervisory circuit to breach its firing thresholds, and get an output where none was expected. And the discovery of the unique ideation is predicated on the model by Mother Nature where she plays dice and creates millions of redundant things in the event of one surviving and making something wonderful. In a like fashion, the outputs of all networks could be ANDed or ORed with another supervisory network monitoring for unique things, and the stimulation and simultaneous firing would cause perturbations and new ideas from two unrelated neural networks.
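A very rough Python sketch of that supervisory idea follows. It is one possible reading, not a tested design: two unrelated networks are represented only by lists of neuron activations, the supervisor links every semi-excited neuron in one to every semi-excited neuron in the other, and a linked pair counts as a candidate "idea" when the combined activation breaches the supervisor's threshold. The function names, the margin and the rule of simply adding the two activations are all assumptions of mine.

```python
def excited_indices(activations, threshold=0.9, margin=0.35):
    """Indices of neurons that are near the firing threshold but have not fired."""
    return [
        i for i, a in enumerate(activations)
        if threshold - margin <= a < threshold
    ]


def supervise(net_a, net_b, threshold=0.9, margin=0.35):
    """Link semi-excited neurons across two networks and report new firings."""
    ideas = []
    for i in excited_indices(net_a, threshold, margin):
        for j in excited_indices(net_b, threshold, margin):
            # The synaptic link: neither neuron fired alone, but together
            # they may push the supervisory circuit over its threshold.
            if net_a[i] + net_b[j] >= threshold:
                ideas.append((i, j))
    return ideas


# Made-up activations from two unrelated networks.
network_a = [0.10, 0.60, 0.95]
network_b = [0.70, 0.20]
print(supervise(network_a, network_b))
```

Nothing is killed here. Both networks keep all of their weights, and the perturbation lives entirely in the supervisor.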
That would be the mechanism for a perturbation and confabulation of two fixed networks coming up with a new idea without having to kill anything like connections or any neurons. There would be no near-death creativity, just a flash in the pan triggered by something that just might turn out to be useful. A pseudo-schematic is shown below:
Our human brains operate on a massively parallel neural network. This concept is a bit of bio-mimicry that extends that.
The concept of killing brain cells in the name of creativity is not exactly new in the biological world either. We apparently kill thousands of brain cells with an alcoholic drink or a few puffs on a joint. Many people say that this is the key to creativity. After all, Hemingway wrote his opus half-drunk and won the Nobel Prize for Literature. However, there are millions who drink and don't turn out anything creative except for a bum's life on the Nickel, sleeping between parked cars in the rain. But je digress.
So all in all, I think that this could be an alternative method for machines to dream up new ideas in the realm of Computational Creativity. It may not be as much fun as watching things gasp out creativity in their death throes, but it could be more reliable and ultimately less destructive to some perfectly good artificial neural nets.
Perils of Overtraining in AI Deep Learning
When we partnered with a local university department of Computer Science to create some Artificial Neural Networks (ANNs) for our platform, we gave them several years of data to play with. They massaged the input data, created an ANN machine and ran training epochs to kingdom come.
The trouble with ANNs is that you can over-train them. This means that they respond to the training data set in a highly accurate manner, but they are not general enough to accurately process new data. To put it in general terms, their point-of-view is too narrow, and encompasses only the data that they were trained on.
In the training process, I had intuitively guessed that the rate of learning, and with it the accuracy, would improve in an exponential manner with each iterative training epoch. I was wrong. Here is a graph showing that the rate of learning is linear rather than exponential in the training cycle.
So the minute that the graph stops being linear, is when you stop training. However, as our university friends found out, they had no way to regress the machine to exactly one training epoch back. They had no record of the weights, biases, adjusted weights, etc of the epoch after the hours of back propagation or learning, and as a result, they had to re-run all of the training.
Me, I had a rather primitive way of saving the states of the neurons and layers. I mentioned it before. I wrote my machine in Java using object oriented programming, and those objects have the ability to be serialized. In other words, binary objects in memory can be preserved in a state, written to disk, and then resurrected to be active in the last state that they were in. Kind of like freezing a body cryogenically, but having the ability to bring it back to life.
So after every training epoch, I serialize the machine. If I over-train the neural nets, I can get a signal by examining and/or plotting the error rates, which are inverse to the accuracy of the nets. In the above graph, once the function stops being linear, I know that I am approaching the over-training event horizon. Then I can regress to one of my saved serialized versions of the AI machine.
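My machine is written in Java, but the same save-and-regress loop can be sketched in a few lines of Python using the standard pickle module. This is only an illustration: train_one_epoch and error_of are hypothetical callables that you would supply, and the stopping rule (stop when the error gets worse than the previous epoch) is my simplification of "the graph stops being linear".

```python
import pickle

def train_with_checkpoints(model, train_one_epoch, error_of, max_epochs=100):
    """Snapshot the model after every epoch; return the best snapshot seen.

    train_one_epoch(model) and error_of(model) are hypothetical callables
    supplied by the caller; the model only needs to be picklable.
    """
    snapshots = []   # one serialized copy of the machine per epoch
    errors = []      # error measured after each epoch
    for epoch in range(max_epochs):
        train_one_epoch(model)
        snapshots.append(pickle.dumps(model))
        errors.append(error_of(model))
        # Assumed stopping rule: quit once the error is worse than last epoch.
        if epoch > 0 and errors[-1] > errors[-2]:
            break
    best = errors.index(min(errors))
    # Resurrect the machine in the state it had after the best epoch.
    return pickle.loads(snapshots[best]), best, errors
```

Keeping every snapshot costs memory or disk, but it means that regressing by exactly one epoch is a lookup instead of a full re-training run.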
Then the Eureka moment struck me! I had discovered a quick and easy cure for over-training.
In a previous blog article, a few down from here (or http://coderzen.blogspot.com/2015/01/brain-cells-for-sale-need-for.html ), I made the case for a standardized AI machine where you could have an XML or JSON lightweight representation of the layers, inputs, number of neurons, outputs and even hypothetical value mappings for the outputs, so that you wouldn't need to serialize the whole machine. At the end of every training epoch, you just output the recipe for the layers, weights, biases etc, and you could revert to an earlier training incarnation by inputting the saved XML file or JSON object.
It's really time to draw up the .XSD schema for the standardized neuron. I want it to be open source. It would be horrible to be famous for thinking of having a standardized neural net. Besides, being famous is just a job.
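As a rough idea of what such a recipe could look like in JSON, here is a small Python sketch. The field names (inputs, outputs, layers, weights, biases) are placeholders that I made up for this example; they are not a published schema.

```python
import json

def export_recipe(layers):
    """Serialize a list of layers to JSON text.

    Each layer is a dict with 'weights' (one list per neuron) and 'biases'.
    The field names are hypothetical; no published schema is implied.
    """
    recipe = {
        "inputs": len(layers[0]["weights"][0]),   # inputs to the first layer
        "outputs": len(layers[-1]["biases"]),     # neurons in the last layer
        "layers": layers,
    }
    return json.dumps(recipe, indent=2)

def import_recipe(text):
    """Rebuild the list of layers from JSON text produced by export_recipe."""
    return json.loads(text)["layers"]
```

Writing one such file per training epoch would give a plain-text trail of recipes that any language with a JSON parser could read back.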
A Returned-Probability Artificial Neural Network - The Quantum Artificial Neural Network
Artificial Neural Networks associated with Deep Learning, Machine Learning using supervised and unsupervised learning are fairly good at figuring out deterministic things. For example they can find an open door for a robot to enter. They can find patterns in a given matrix or collection, or field.
However, sometimes there is no evident computability function. In other words, suppose that you are looking at an event or action that results from a whole bunch of unknown things, with a random bit of chaos thrown in. It is impossible to derive a computable function without years of study and knowing the underlying principles. And even then, it still may be impossible to quantify with an equation, regression formula or such.
But Artificial Neural Nets can be trained to identify things without actually knowing anything about the background causes. If you have a training set with the answers or results of size k (k being a series of cases), then you can always train your Artificial Neural Networks or Multilayer Perceptrons on k-1 sets, and evaluate how well you are doing with the last set. You measure the error rate and back propagate, and off you go to another training epoch if necessary.
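The k-1 idea is simple enough to sketch in Python. The function names here are mine, and predict stands in for whatever trained network you are evaluating.

```python
def hold_out_last(cases):
    """Split k sets of cases into k-1 training sets and the final set for evaluation."""
    if len(cases) < 2:
        raise ValueError("need at least two sets of cases")
    return cases[:-1], cases[-1]

def error_rate(predict, held_out):
    """Fraction of (input, expected) pairs in held_out that predict gets wrong."""
    wrong = sum(1 for x, expected in held_out if predict(x) != expected)
    return wrong / len(held_out)
```

You would train on the first part, call error_rate on the held-out part, and decide from that number whether another training epoch is worth running.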
This is happening with predicting solar flares and the resultant chaos that they cause with electronics and radio communications when these solar winds hit the earth. Here is a link to the article, where an ANN does the predicting:
http://www.dailymail.co.uk/sciencetech/article-2919263/The-computer-predict-SUN-AI-forecasts-devastating-solar-flares-knock-power-grids-Earth.html
In this case, the ANNs have shown that there is a relationship between vector magnetic fields of the surface of the sun, the solar atmosphere and solar flares. That's all well and dandy for deterministic events, but what if the determinism was a probability and not a direct causal relationship mapped to its input parameters? What if there were other unknown or unknowable influence factors?
That's where you need an ANN (Artificial Neural Network) to return a probability as the hypothesis value. This is an easy task for a stats package working on database tables, churning out averages, probabilities, degrees of confidence, standard deviations etc, but I am left wondering if it could be done internally in the guts of the artificial neuron.
The artificial neuron is pretty basic. It sums up all of the inputs and biases multiplied by their weights, and feeds the result to an activation function. It does this many times over in many layers. What if you could encode the guts of the neuron to spit out the probability of the results of what is being inputted? What if somehow you changed the inner workings of the perceptron or neuron to calculate the probability. It seems to me that the activation function is somehow ideally suited to adaptation to do this, because it can be constructed to deliver an activation value of between 0 and 1, which matches probability notation.
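Here is a bare-bones Python sketch of that basic neuron and of feeding values through layers. The data layout (each layer as a list of weights-and-bias pairs) is just an assumption for the example. The final output lands between 0 and 1, but it only deserves to be read as a probability if the network was trained for that, for example with a cross-entropy error measure.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed into (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def forward(inputs, layers):
    """Feed inputs through each layer in turn.

    layers is a list of layers; each layer is a list of (weights, bias) pairs,
    one pair per neuron. This layout is an assumption for the sketch.
    """
    values = inputs
    for layer in layers:
        # Every neuron in the layer sees all outputs of the previous layer.
        values = [neuron_output(values, w, b) for w, b in layer]
    return values
```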
Our human brains work well with fuzziness in our chaotic world. We unconsciously map patterns and assign probabilities to them. There is another word for fuzzy values. It is a "quantum" property. The more you know about one property of an object, the less you know about another. Fuzziness. The great leap forward for Artificial Neural Networks, is to become Quantum and deliver a probability. Once we can get an Artificial Neural Net machine to determine probability, then we can apply Bayesian mechanics. That's when it can make inferences, and get a computer on the road to thinking from first principles -- by things that it has learned by itself.
How Many Click-Throughs Do You Get Per Followers When A Link Is Favorited On Twitter?
How many click-throughs do you get per followers when a link is favorited on Twitter?
I recently posted a link to an article on this blog. After the initial rush died down, and the hits stopped coming, one of my followers favorited the link. He has 53,400 or so followers.
How many click-throughs did I get from him favoriting one of my tweets to his fifty+ thousand followers? I got 50 hits or click-throughs. About 0.1%. Not bad.
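For anyone who wants to check the arithmetic, here it is in Python, using the rounded follower count quoted above:

```python
followers = 53_400      # approximate follower count quoted above
click_throughs = 50     # hits attributed to the favorite

rate = click_throughs / followers
print(f"{rate:.4%}")    # click-throughs per follower, as a percentage
```

That works out to a little under one click-through per thousand followers.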
Brain Cells For Sale ~ The Need For Standardization of Artificial Neural Nets
When it comes to Artificial Neural Networks, the world is awash with roll-your-own. Everyone has their own brand and implementation. Although the theory and practice is well thought out, tested and put into use, the implementation in almost every case is different. In our company, we have a partner university training artificial neural nets for our field of endeavor as a research project for graduate students.
Very few roll-your-own ANNs or Artificial Neural Networks are object-oriented in terms of the way they are programmed. This is because it is easier to have a monolithic program where each layer resides in an array, and the neurons can input and output to each other easily. ANNs are coded in everything from Java, to C, to C++, to C# to kiddie scripting. I am here to preach today that there should be a standard Artificial Neuron. To be more explicit, the standardization should be in the recipe for layers, inputs, weights, biases and outputs. Let me explain.
While the roll-your-own is efficient for each application, it has several major drawbacks. Let me go through some of them.
The first one is portability. We have a multitude of platforms on everything from Windows to Linux, to Objective C in the iOS native format, to QNX to folks putting Artificial Neural Networks on silicon, and programming right down to the bare metal, or the semi-metals that dope the silicon matrix in the transistor junctions of the chips. We need to be able to run a particular set of specifically trained neural nets on a variety of platforms.
The multiplicity of platforms was seen early on and as a result, we had strange things developed like CORBA or Common Object Request Broker Architecture ( http://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture ). CORBA came about in the early 1990s in its initial incarnations. However, it is bulky and adds a code-heavy layer of abstraction to each platform when you want to transport silicon brainiacs like a multilayer perceptron machine. The idea of distributed computing is an enticing one, but due to a large variety of factors, including security and the continued exponential multiplication of integrated transistors on a chip according to Moore's Law, it is a concept that has been obviated for the present time.
My contention, is that if you had a standard for a Neural Net, then you wouldn't have to call some foreign memory or code object from a foreign computer. You would just use a very simple light-weight data protocol to transfer post-learning layers, weights and biases (like JSON) and bingo -- you can replicate smartness on a new machine in minutes without access to training data, or the time spent training the artificial neural net. It would be like unpacking a thinker in a box. You could be dumber than a second coat of paint, but no one would notice, because your mobile phone did your thinking for you.
There is another aspect to this, and it is the commercial aspect. If I came across a unique data set, and trained a bunch of neural networks to predict stuff in the realm of that data set, I potentially could have a bunch of very valuable neural nets that I could sell to you. All that you would have to do is pay me the money, download my neural net recipe with its standardized notation, and be in business generating your own revenue stream. It wouldn't matter what platform, operating system or chip set your computer or device used -- the notation for the recipe of the artificial neural network would be agnostic to the binaries.
We are in a very strange time, with the underpinnings of our society changing at a very fast pace. My contention is that the very nature of employment may change for many people. We will no longer need to import cheap goods from China that fill the dollar stores. You will order the recipe for a 3D printer and make whatever you need. This paradigm alone will kill many manufacturing jobs. As a result, the nature of work will change. People will find a niche, and supply the knowledge in that niche that can be utilized, or even materialize that knowledge into what they need. We will transcend the present paradigm of people supporting themselves by making crafts and selling them on Etsy or writing books and selling them on Amazon. People will make and sell knowledge products, and one could sell trained neural nets for any field of endeavor.
Just as rooms full of young men in Third World countries game all day and sell the rewards online to impatient first world gamers, you will have people spending days and weeks training neural nets and selling them on an online marketplace.
That day is coming shortly, and the sooner that we have a standard for Artificial Neural Net recipes, the sooner that we will see intelligence embedded in devices and trained neural nets for sale. You can count on it.
These thoughts were spawned on my daily walk, and you can bet that I have already started to create a schema for a neural net transference, as well as a Java Interface for one version of a standardized neural net. Stay tuned.
How CNN Got Their UIX (User Interface Experience) Wrong!
CNN is a bit of a light-weight news organization. They have entertainment cat videos on their main page, mixed with ISIS beheadings and Kim Kardashian butt selfies. However, I used to be a regular visitor, because at a glance, one could tell the mood of America. It was all laid out there. Riots in Ferguson mixed in with a guy in a fly suit zooming through a hole between two cliffs. In short, it was a microcosmic glimpse into a slightly conservative, easily entertained, light-thinking demographic that digests its news as bite-sized pablum.
And then they went and changed their UIX or User Interface Experience. I haven't been back since.
Here is a screen shot of their new UIX:
One sits there and waits for the top news stories to scroll through and automatically change. The story featured prominently in the screen grab that I took was about a city that was the homeless capital of the US. You cannot see at a glance what is happening in the world.
Contrast this to a Reuters screen capture of exactly the same time. We have victim #22's body of the Air Asia disaster being brought to shore. We have a story on Iran and shipping enriched uranium to Russia. And we have the death report of Edward Brooke, the first black US senator elected by popular vote, and I can see the story about an Idaho earthquake. I can read all of this stuff from a tiny screen capture.
The CNN screen capture has a faint column that is unreadable in the screen capture. If you enlarge it, it has a few terse ambiguous headlines about the death of Brooke, without giving his name. The second entry is Mike Huckabee leaving Faux News. We also have the story of Panthers Top Punchless Cards and they trumpet themselves in the NEW CNN DIGITAL.
It appears that CNN let their web developers go wild, thinking that it looks neat, but not having a clue on how to present news. The continuous flashing/changing stories make me sit and wait to see if there is anything interesting to grab my attention. And when trivial crap comes up, it makes the wait more useless. Their UIX hearkens back to the days when HTML had the stupid flashing banner, and the revolving storybook would look good on an online snowboard catalog, but not for a news site that wants to be taken seriously.
I'm not really sure that news is their focus -- they seem to want to invent something called newstainment - which is a meld of entertainment and news. If they want to be a serious news provider, they should emulate the UIX of bbc.com or reuters.com. I know that these sites are serious about the news, because I belong to user panels on both of those sites, and give feedback on past and upcoming stories.
They say that every dog has their day, and the CNN site has turned into one, but its day is gone. #FAIL
Burning Ants With Magnifying Glasses, Computational Creativity and Other Artificial Intelligence Inspirations
I was going to call this article Computational Creativity Confabulation Using Artificial Neural Nets, but the immature little boy in me made me do otherwise.
I've been fascinated by the works of Dr. Stephen Thaler and his work on Imagination Engines and a Unified Model of Computational Creativity. In the Artificial Intelligence domain, the ultimate Turing Test would be a computer that rivals a human at creativity or consciously designing creative things. There isn't much on his work in the literature, other than in the body of patents that Thaler has been granted, but I suppose that is because he is trying to monetize them and they are competition-sensitive algorithms and applications.
When I Googled around his work, I landed on the Wikipedia page for Computational Creativity. Thaler has a very small section on a unifying model of creativity based on traumatized artificial neural networks. I have had a lot of experience coding and playing with my own brand of artificial neural networks, specifically the multilayer perceptron models, and let me tell you that it is both fascinating and frustrating work.
Seemingly, knowledge is stored in the overall collection of biases and weights for a dumb piece of software to make some startling, human-like decisions with just examples and no background theory in the art of whatever you are trying to make them learn.
It is quite mindblowing. For me, the Eureka moment came when I saw an artificial neural network autonomously shift output patterns without any programming other than learning cycles, based on what it was seeing. It was a profound moment for me, to see a software program on a computer recognize a complexity and reduce it to a series of biases, weights and activations to make a fundamental decision based on inputs. It was almost a life-changing event for me. It made my profession significant to me. A trivial analogy would be a watch making an adjustment to daylight savings time based on the angle of the sun hitting it at a specific time, if a watch was trained to tell the time by the position of the sun in the sky.
Thaler goes further than I would in describing behaviors of artificial neural networks in cognitive terms based on anthropomorphic characteristics like waking, dreaming, and near death. His seminal work, however, deals with training artificial neural networks to do something, and then perturbing them (a fancy term for throwing a spanner in the works) to see what happens to the outputs. In some cases, the perturbations include external and/or internal ones like messing with the inputs, weights, biases and such, and then having supervisory circuits to throw out the junk and keep the good stuff. For example, in the patent application, he has a diagram of a coffee mug being designed by perturbing an artificial neural network. His perturbations cause confabulation or confabulatory patterns.
A confabulation is a memory disturbance caused by disease or injury and the person makes up or synthesizes memories to fill in the gaps. In a psychological sense, these memories are fabricated, distorted or misinterpreted and can be caused in humans by even such things as alcoholism.
Thaler does to neural nets the equivalent of what every rascally little boy does to earthworms or frogs. The boys put a burning match to, or focus a magnifying glass on, various parts of the frog, worm or even ants, and then observe how the organism reacts. It brings to mind the rock song by The Who, called "I'm a Boy!".
Creativity in humans is a funny business. Perturbations are the key. You need to perturb your usual thought patterns and introduce new ones to come up with innovative concepts. We all know how Kekule couldn't figure out the chemical structure of benzene, until he had a dream about a snake eating its own tail, and he twigged onto the idea of cyclical hydrocarbons, and organic chemistry. College students today still fail by the thousands in introductory courses to organic chemistry and the field of science uncovered by that perturbation.
Essentially creativity involves putting together diverse concepts to synthesize new ideas. Computational creativity involves buggering up perfectly good artificial neural networks to see what they come up with. You have to introduce perturbations in "conventional thought" somehow. Thaler believes that this paradigm beats genetic algorithms. I was particularly impressed by a genetic algorithm crunching away to design an antenna for a satellite that would work in any orientation. Radio engineers tried and tried and then came up with several designs but all had particular shortcomings. The problem was loaded into a computer with a genetic algorithm where they would start with a basic antenna structure and then add random bits and pieces, and then run some programs to simulate and test the antenna. If its performance was better than the last iteration, it would be kept and altered randomly again. If not, the alteration was thrown out, and a new random thing was tried. The final antenna looked like a weird stick conglomeration, but worked beautifully and is flying in space. Thaler says that his computational creativity models are faster and better than genetic algorithms.
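The keep-it-if-it-is-better loop described for the antenna can be sketched in Python as follows. Strictly speaking, this single-candidate loop is a random-mutation hill climber rather than a full genetic algorithm with a population and crossover, and the mutate and score callables are hypothetical stand-ins (score would be the antenna simulation).

```python
import random

def evolve(candidate, mutate, score, iterations=1000, seed=0):
    """Keep a random alteration only when it scores better than the current best.

    mutate(candidate, rng) and score(candidate) are hypothetical callables;
    in the antenna story, score would run the simulation.
    """
    rng = random.Random(seed)
    best, best_score = candidate, score(candidate)
    for _ in range(iterations):
        trial = mutate(best, rng)          # add a random bit or piece
        trial_score = score(trial)         # simulate and test it
        if trial_score > best_score:       # better: keep it and build on it
            best, best_score = trial, trial_score
        # otherwise the alteration is thrown out and a new one is tried
    return best, best_score
```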
I was wondering what kind of perturbations Thaler did to his neural nets. The only clues that I got came from reading the patent summaries, and here is a quote: "System perturbations would produce the desired output. Such system perturbations may be external to the ANN (e.g., changes to the ANN data inputs) or internal to the ANN (e.g., changes to Weights, biases, etc.) By utilizing neural network-based technology, such identification of required perturbations can be achieved easily, quickly, and, if desired, autonomously."
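I don't know what Thaler's actual algorithms look like, so the following Python is only my own guess at the general shape of an internal perturbation: add random noise to the weights, run the net, and let a supervisor decide what to keep. The run and is_interesting callables are hypothetical stand-ins for the trained network and the supervisory circuit, and Gaussian noise is my choice, not something taken from the patents.

```python
import random

def perturb(weights, scale, rng):
    """Return a copy of weights with Gaussian noise of the given scale added."""
    return [w + rng.gauss(0.0, scale) for w in weights]

def confabulate(weights, run, is_interesting, tries=100, scale=0.5, seed=0):
    """Run perturbed copies of a network and keep outputs the supervisor accepts.

    run(weights) and is_interesting(output) are hypothetical callables standing
    in for the trained network and the supervisory circuit.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(tries):
        output = run(perturb(weights, scale, rng))   # the trained weights stay intact
        if is_interesting(output):                   # supervisor throws out the junk
            kept.append(output)
    return kept
```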
I briefly touched on another type of perturbation of artificial neural nets when I talked about synaptic pruning. Essentially a baby creates all sorts of connections to biological neural networks in its brain, and as it approaches puberty, it prunes the inappropriate ones. The plethora of "inappropriate" synapses or connections to diverse concepts, is what makes a child's imagination so rich. In my proposed method of artificial neural net perturbations, I suggested that the way synaptic pruning could take place, was to kill some inputs into the various layers of the multilayer perceptron, and then let the network run to see what comes out.
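Killing inputs into a layer can be sketched in Python by zeroing the weights attached to a random selection of inputs. The layout (one list of input weights per neuron) and the name prune_inputs are assumptions for this example.

```python
import random

def prune_inputs(layer_weights, fraction, seed=0):
    """Zero the weights of a random fraction of the inputs to one layer.

    layer_weights holds one list of input weights per neuron. Zeroing a column
    silences that input for every neuron in the layer; the original is untouched.
    """
    rng = random.Random(seed)
    n_inputs = len(layer_weights[0])
    n_kill = int(n_inputs * fraction)
    killed = set(rng.sample(range(n_inputs), n_kill))
    pruned = [[0.0 if i in killed else w for i, w in enumerate(neuron)]
              for neuron in layer_weights]
    return pruned, sorted(killed)
```

Because the function returns a pruned copy, the trained network survives and you can try a different set of killed inputs on the next run.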
I came upon a few more methods of creating perturbations in neural networks while reading about genetic mutations. An article that I was reading described some mutation methods that included substitution, insertion, deletion and frameshift. The thought struck me that this would be another ideal way to perturb artificial neural nets. In substitution, you could swap neurons from one layer to another. Using the insertion algorithm derived from genetics, you could add another neuron or even a layer to an already-trained network. Deletion could be implemented by dropping an entire neuron out of a layer. Frameshift is an intriguing possibility as well. What that means is that if a specific series of Perceptron/Layer pairs fed a series to an adjacent layer, you would frameshift the order. So for example if Layer3 fed a series of four perceptrons in layer 4, instead of feeding them in order, like inputs going to L4P1, L4P2, L4P3 and L4P4, you would frameshift by one and feed them into L4P2, L4P3, L4P4 and L4P1 to create these perturbations.
This entire field is utterly fascinating and may hold some of the answers to the implementation of Computational Creativity. Machines may not have the same cognitive understanding of things that humans do, but that doesn't mean that they can't be creative.
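Two of these mutations, frameshift and deletion, are easy to sketch in Python. The function names and the layer layout (a list of weights-and-bias pairs, one per neuron) are assumptions for the example.

```python
def frameshift(outputs, shift=1):
    """Rotate a layer's outputs so each value feeds the next neuron along.

    With shift=1 the value meant for P1 goes to P2, P2's goes to P3, and the
    last one wraps around to P1.
    """
    if not outputs:
        return []
    shift %= len(outputs)
    return outputs[-shift:] + outputs[:-shift]

def delete_neuron(layer, next_layer, index):
    """Drop one neuron and the weight every downstream neuron gave it.

    Each layer is a list of (weights, bias) pairs, one per neuron.
    """
    kept = [n for i, n in enumerate(layer) if i != index]
    rewired = [([w for i, w in enumerate(ws) if i != index], b)
               for ws, b in next_layer]
    return kept, rewired
```

In the frameshift, the list that comes back is what L4P1 to L4P4 actually receive, so the first entry is the signal that was originally headed for L4P4.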
An example of differing cognitive understanding about the problem, is given by this anecdote:
A businessman was talking with his barber, when they both noticed a goofy-looking fellow bouncing down the sidewalk. The barber whispered, "That's Tommy, one of the stupidest kids you'll ever meet. Here, I'll show you."
"Hey Tommy! Come here!" yelled the barber.
Tommy came bouncing over "Hi Mr. Williams!"
The barber pulled out a rusty dime and a shiny quarter and told Tommy he could keep the one of his choice. Tommy looked long and hard at the dime and quarter and then quickly snapped the dime from the barber's hand. The barber looked at the businessman and said, "See, I told you."
After his haircut, the businessman caught up with Tommy and asked him why he chose the dime.
Tommy looked at him in the eye and said, "If I take the quarter, the game is over."
In a real life setting, I would like to quote this anecdote about an actual result of a perturbation of an artificial neural network taken from Wikipedia:
In 1989, in one of the most controversial reductions to practice of this general theory of creativity, one neural net termed the "grim reaper," governed the synaptic damage (i.e., rule-changes) applied to another net that had learned a series of traditional Christmas carol lyrics. The former net, on the lookout for both novel and grammatical lyrics, seized upon the chilling sentence, "In the end all men go to good earth in one eternal silent night," thereafter ceasing the synaptic degradation process. In subsequent projects, these systems produced more useful results across many fields of human endeavor, oftentimes bootstrapping their learning from a blank slate based upon the success or failure of self-conceived concepts and strategies seeded upon such internal network damage. ( http://en.wikipedia.org/wiki/Computational_creativity )
And there you have it, so much to do, so little time to do it, and so little funding to do it. But it will get done, and it will bring us into a brave new world.