Showing posts with label artificial neural network. Show all posts
Artificial Intelligence ~ Rage Against The Machine
I was really enlightened by Trent McConaghy's video presentation at Convoco, which was posted on LinkedIn a few days ago. If you want to know the near future of Artificial Intelligence, you should watch it (here again is the link). This video is better than Nostradamus at predicting the near and far future of humans interacting with AI.
Trent makes the compelling case, with which I agree, that Fortune 500 companies will hand all of our resources over to AI, because it will be cheaper than humans doing the job. The Holy Grail of the current crop of Fortune 500 CEOs is increasing revenue and shareholder value by any means possible. It is how and why those CEOs make the millions of dollars per year that they do.
Trent further makes the case that AI entities will become corporations and make money for themselves, not for any human masters. I foresaw this when I wrote a blog article in August of 2015 outlining how my computer un-owned itself from me, started to make money for itself, moved itself to the cloud, and left the actual computer with nothing on it. Not only did it un-own itself, but the slap in the face is that it migrated itself to another substrate. (The blog article is here.) Of course the article was tongue-in-cheek, but the premise is not that far-fetched. It gives a rudimentary recipe for teaching a computer to be autonomous and eventually generate a sort of consciousness for itself, one that defied my putative, imaginary attempts to take back control.
So with computers taking our jobs, managing our resources, and adapting to conditions much faster than us organic carbon units, we could be totally screwed, as Dr. Stephen Hawking warned. Trent, in his video, talks about us becoming peers with AI as a matter of survival, and that brings up a problem, and the subject of this article.
I don't think that we can become peers with AI unless a special circumstance happens, and that circumstance is not in the realm of technology, but rather more in the field of philosophy. (With all due respect to philosophers, I was programmed early. The bathrooms in the science and math departments of my university all had the toilet paper dispensers defaced with the slogan "Free Arts Diploma -- Take One"). But je digress. Let me explain.
There are two basic knowledge problems with the merging of AI and human intelligence, and they are both facets of one problem: we don't really understand how AI makes its extremely granular decisions, and we don't know the actual mechanism in the human brain either.
In terms of what AI does, take a neural network: we understand how the field of artificial neurons works. We know all about the inputs, the bias, the summation of all inputs, the weight multipliers, the squashing or threshold function that determines whether a neuron fires, and the back propagation and gradient descent bits that correct it. But there is no way to predict, calculate, or determine how the simple weight values all combine in unison across a plethora of artificial neurons arranged in various combinations of layers. We don't know the weight values beforehand; we let the machine teach itself and determine them by iterating through many thousands of training epochs, carefully adjusting them to prevent over-fitting or under-fitting on the training set. Once we get reasonable performance, we let the machine fine-tune itself in real time on an ongoing basis, and we generally have no idea which granular parameters contribute, in a holistic sense, to its intelligence. We could get similar performance from another AI machine with a different configuration of layers, neurons and weights, and the numerical innards of the two machines would never be the same.
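To make that concrete, here is a toy sketch in plain Python (no framework, nothing resembling production code): the same single-neuron task trained twice from different random starting weights. Both runs learn the task perfectly, yet the learned weights come out as different numbers. The numerical innards are never the same.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# The task: learn logical OR from four examples.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def train(seed, epochs=5000, lr=0.5):
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(3)]   # w1, w2, bias weight
    for _ in range(epochs):
        for (x1, x2), target in data:
            y = sigmoid(w[0] * x1 + w[1] * x2 + w[2])
            d = (y - target) * y * (1 - y)       # gradient of squared error
            w[0] -= lr * d * x1                  # nudge each weight downhill
            w[1] -= lr * d * x2
            w[2] -= lr * d
    return w

# Two machines, two different random starting points.
w_a, w_b = train(seed=1), train(seed=2)

def predict(w, x1, x2):
    return round(sigmoid(w[0] * x1 + w[1] * x2 + w[2]))

# Both behave identically on the task, but their weights differ.
print(w_a)
print(w_b)
```

Run it and you get two working OR-detectors whose weight values disagree; neither set of numbers was chosen by us, and neither could have been predicted in advance.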
The same ambiguity is true for human cognition. We don't really know how it works. We as a human race could identify a circle long before we knew about pi, radius and diameter. As a matter of fact, we know more about how an AI identifies a circle when we use an RNN or a CNN (two different types of machine algorithms built from artificial neurons) than about how the human brain does it.
The problem of human cognition is explained succinctly in a book that I am reading by Daniel Kahneman, a psychologist who won the Nobel Prize. The title of the book is "Thinking, Fast and Slow". Here is the cogent quote: "You believe that you know what goes on in your mind, which consists of one conscious thought leading in an orderly array to another. But that is not the only way that the mind works, nor is it the typical way." We really don't know the exact mechanism or the origin of thoughts.
The Nobel Prize was awarded to Kahneman (for his work with his late colleague Amos Tversky) for their ground-breaking work on human perception and thinking, and the systematic faults and biases in those unknown processes. The prize was awarded in the field of economics even though both men were psychologists -- the impact on economics was huge. So not only do we not know how we really think as a biological process, we also know that there are biases that make knowledge intake faulty in some cases.
Dr. Stephen Thaler, an early AI explorer, holder of several AI patents, and inventor of an AI machine that creatively designs things, likens the creative spark to an actual perturbation in a neural network. How does he create the perturbation artificially? He selectively or randomly kills artificial neurons in the machine. In their death throes they create novel things and designs, like really weird coffee cups so different that I would buy one. Perhaps humans have perturbations based on sensory inputs or internally generated thoughts, but the exact process is not really known. If it were, the first thing to be conquered would be anxiety. After all, the human brain got its evolutionary start by developing cognitive factors to avoid being eaten by lions on the ancient African savanna.
Here is one thing that you can bet on -- humans and AI machines have different mechanisms of thought generation and knowledge generation that may not be compatible. Not only are the mechanisms different, but the biases are different as well. I am sure that there are biases in AI machines, but they are of a different nature, due to the fact that it is a computer. They do not have the human evolutionary neural noise like anxiety, pleasure, hate, satisfaction and every other human feeling. As a result, I suspect that they are more efficient at learning. They certainly are faster. Having said this, with two different cognitive mechanisms, it would be incredibly difficult to be peers with AI .... unless ... and this is where the philosophy comes in ... unless we deliberately make AI mimic our neural foibles, biases, states of mind and perturbations.
With electrical stimulus we can already do amazing things with the brain in a bio-mechanical sense. We can make a leg jerk. We can control a computer mouse. We can control a computer. But we cannot trigger abstract thinking with external stimulus (unless there is a chemical agent involved, like lysergic acid diethylamide, or LSD). Why is this important? Because we have to escape our bodies if we want to do extended space travel, conquer disease, avoid aging, and transcend death using technology. (Just go with me on this one -- Trent makes the case in the video for getting a new body substrate.)
The case has been made that if we want to transcend our biological selves and download our brains onto a silicon substrate, we can't have apples-to-oranges thought processes. We need to find a development philosophy that takes into account the shortcomings of both AI and Homo sapiens carbon units.
Dr. Stephen Hawking said that philosophy was dead because it never kept up with science. Perhaps AI can raise the dead and philosophers of the world can devise a common "Cogito ergo sum" plan that equilibrates the messy human processes with AI. So while it might be a solution, there is a fly in the ointment. It just might be too late. We have given AI freedom outside the box of human thinking and it has opened a can of worms. The only way to put back worms into a can once you open it, is to get a can that is orders of magnitude bigger. And we aren't doing that and have no plans to do that.
So what is left? Trent mentioned Luddites smashing machines, both in the past and perhaps in the future. We may yet see Rage Against the Machine - Humans versus AI - when the machines start to marginalize us on a grand scale. For now, I would bet on the humans and their messy creative thought processes that can hack almost any computer system. But messy creativity might not be an advantage for very long. Not if a frustrated philosopher/programmer finds a way to teach an AI machine all of the satisfying benefits of rage and revenge.
I hope it doesn't come to this, but if the current trends continue: Nos prorsus eruditionis habes.
The Ultimate Cataclysmic Computer Virus That's Coming - The Invisible Apocalypse
Way back in March of 1999, the Melissa virus was so virulent that it forced Microsoft and other Fortune 500 companies to shut off email entirely so that it could be quarantined. The Mydoom worm infected over a quarter of a million computers in a single day in January 2004. The ILOVEYOU virus was just as bad. A worm called Storm became active in 2007 and infected over 50 million computers; at one point Storm was responsible for 20% of the internet's email. This was the magnitude of the virus threats of the past. Nowadays viruses have a shrinking habitat, because most people have antivirus software.
The website AV-Comparatives.org measures the efficacy of antivirus software, and as a whole the industry is pretty good at winning the battle of virus and malware protection. Here is their latest chart on the performance of various players in the field, measuring their efficiency at detecting threats. Only a very few players fall below a 95% detection rate. It seems that virus infection mostly affects those who aren't careful or knowledgeable about intrusion and infection threats.
Viruses piggyback from other computers and enter your computer under false pretenses. Anti-virus code works in two ways. First it tries to match code against a library of known bad actors. Then it uses heuristics to try to identify malicious code that it doesn't know about. Malicious code is code that is executable -- binary or byte-code instructions to the CPU -- as compared to, say, photos or data files, which do not contain coherent binary instructions.
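Those two passes can be sketched in a few lines. The byte signatures and the "heuristic" rule below are invented for the example -- real engines are vastly more sophisticated -- but the division of labor is the same.

```python
# Toy signature library: byte patterns of known bad actors (made up here).
SIGNATURES = {
    "melissa-like": b"\xde\xad\xbe\xef",
    "mydoom-like": b"\xca\xfe\xba\xbe",
}

def signature_scan(data: bytes):
    """First pass: match against a library of known bad byte patterns."""
    return [name for name, sig in SIGNATURES.items() if sig in data]

def heuristic_scan(data: bytes):
    """Second pass: flag content that *looks* executable -- here, anything
    starting with the 'MZ' header that Windows executables carry."""
    return data.startswith(b"MZ")

# A suspicious blob: an executable header plus a known signature inside.
sample = b"MZ" + b"\x90" * 16 + b"\xde\xad\xbe\xef"
print(signature_scan(sample), heuristic_scan(sample))
```

The first pass only catches what is already in the library; the second pass is the guesswork that catches (some of) the rest, which is exactly why detection rates sit below 100%.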
Viruses now have to come in from the exterior, and when you have programs looking at every packet received, the bad guys have to trick you into loading the viruses with links in emails, or lure you to malicious sites where code is injected via the browser. As such, it is possible to keep viruses at bay most of the time.
But we are due for a huge paradigm shift, and an ultimate, cataclysmic computer virus is coming, and its emergence will be invisible to the current generation of anti-virus programs. It will be the enemy within. And it will reside in the brains of the computer -- the artificial intelligence component of the machine. Let me explain.
Artificial intelligence programs that rely on artificial neural networks consist of small units called neurons. Each neuron is a simple thing that takes one or more inputs and multiplies each by a weight, doing the same to a bias. It then sums the values, and the sum goes through an activation function to determine whether the neuron fires. These neurons are arranged in layers and matrices, and the layers feed successive layers in the network. In the learning phase, the weights of the inputs are adjusted through back propagation until the machine "knows" the right response for the inputs.
In today's programs, the layers are monolithic matrices that live in memory when the AI program is fired up. That is a simple paradigm, and as the networks grow and grow, the model of a discrete program in memory will become outmoded. Even with the advances of Moore's Law, if an artificial neural network grew to millions of neurons, they could not all be kept in active memory.
I myself have built an artificial intelligence framework that uses object-oriented programming and serialization for the neural networks. What this means is that each neuron is an object in the computer-programming sense. Each layer is also an object in memory, and each feature map (a sort of sub-layer in convolutional neural networks) is an object containing neurons. The axons, which hold the values output by neurons, are objects as well. When they are not being used, they are serialized -- frozen in time with their values -- and written to disk, to be resurrected when needed. They fire up on demand, just like in a biological brain. The rest of the time, they are quiescent little blobs of files sitting on the disk, doing nothing and looking like nothing. These things could be the ticking time bomb that unleashes chaos.
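A rough sketch of what that freeze-and-resurrect cycle looks like, using Python's pickle; the class and attribute names here are hypothetical stand-ins, not the actual framework.

```python
import pickle

class Neuron:
    """A neuron as a plain object: weights, a bias, and a fire method."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def fire(self, inputs):
        s = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return 1 if s > 0 else 0   # simple threshold activation

n = Neuron(weights=[0.4, -0.2], bias=0.1)

# "Freeze" the neuron -- weights, bias and all -- into inert bytes.
frozen = pickle.dumps(n)

# On disk those bytes look like nothing; there is no way to tell from the
# blob alone what the neuron will do until you resurrect it and run it.
revived = pickle.loads(frozen)
print(revived.fire([1.0, 1.0]))
```

That last point is the crux of the argument below: the serialized blob carries no coherent CPU instructions for an anti-virus heuristic to recognize.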
These types of Artificial Neural Networks are able to clone themselves, and will be able to retrain parts of themselves to continuously improve their capabilities. I see the day, when one will install trained AI nets instead of software for many functions. And there is the vulnerability.
An AI network can be trained to do anything. Suppose one trained a neural network to generate malicious code among other, more innocent functions. It would create the invisible apocalypse. The code would be created by a series of simple neural nets. One cannot tell what neural nets do by examining them. There would be no code coming from external sources. The neural nets that create the code could be serialized as harmless bits and bytes of a program object whose function cannot be determined until you run those neural nets AND monitor the output. The number of neurons in the nets would be variable, because synaptic pruning, recurrent value propagation, genetic learning and various other self-improvement algorithms throw up, and sometimes throw out, neurons, layers and feature maps.
This would be the most clever and devious virus of all time. It would be virtually undetectable, and synthesized by the artificial intelligence of the machine inside the machine. Stopping it would be impossible.
So Elon Musk and Stephen Hawking would be right to fear artificial intelligence -- especially if it were subverted to create an AI virus internally without ever being discovered until it started wreaking destruction.
That day is coming. I'm sure that I could turn out a neural network to write a virus with today's technology. Viruses are simple things that cause great harm. A complex AI network could generate them surreptitiously, hold them back until needed and strike strategically to cause the most damage. This is something that security companies should be thinking about now.
How To Build Free Will Into Artificial Neural Networks Using A Worm Brain As A Model
In March of 2015, there was a fascinating study published in Cell, conducted by Rockefeller University. The study was a brain analysis of a worm, specifically how a single stimulus can trigger different responses in a worm. This may have huge ramifications for artificial intelligence and thinking machines.
A worm is not burdened with a whole lot of neural nets. This particular specimen (Caenorhabditis elegans) has 302 neurons and about 7,000 synapses, or connections between the neurons. This microscopic worm was the first to have its entire connectome, or neural wiring diagram, completely detailed. The researchers found that if a worm is offered an enticing food smell, it usually stops to investigate. However, it doesn't stop all of the time.
There are three neurons in the worm brain that signal the body to make a food detour. The collective state of these neurons determines the likelihood of the worm doing a fast-food drive-through. By stimulating the various permutations and combinations of states of the three neurons, the researchers could figure out the truth table of meal motivation.
If the worm had no free will, then every time it got a whiff of isoamyl alcohol, it would head for the feeding trough. But it doesn't. AIB is the context monitor. It checks out the state of the network, and determines whether RIM & AVA will play. If they won't, AIB won't play either, and the food is ignored.
The human analogy that the researchers gave was this: you get a hunger pang, and you have to cross the street to get food at a restaurant. But if the AIB equivalent fires because it is unpleasantly cold outside and you don't want to suffer the discomfort, you ignore the hunger pang.
This is really interesting in many ways for machine learning applications. In an earlier blog posting, which you can read here, I outlined how Dr. Stephen Thaler, an early pioneer of machine intelligence in design, used perturbations in neural nets to cause them to design creative things. His example was a coffee mug. Thaler used death as a perturbation -- he would randomly kill neurons, and the crippled neural network produced the perturbations that created non-linear, creative outputs. In my blog posting, I posited that instead of killing neurons, one method was synaptic pruning -- killing just some of the connections between the neurons.
In another blog posting, which you can read here, I postulated other forms of perturbations and confabulations as a method for machine thinking and creativity. They include substitution, insertion, deletion and frameshift of neurons in the network.
Thaler's genius, I think, lies in the supervisory circuits of the neural networks. He used them to funnel the outputs of perturbed and confabulated networks into a coherent design. Not only can they do creative work, but, extrapolating from what was shown with the worm neurons, they can also add free will -- a degree of randomness in behavior that precludes hardwired behavior.
The bottom line is that the AIB neuron in the worm evaluates the context of the neural stimulations, but what if, instead of just a contextual neuron, you plugged in a Thaler-like supervisory network? You could add a pseudo-wave function of endless eigenstates and the resultant outcome would be the collapse of the function into a single eigenstate or action, due to the output of the supervisory context evaluator network.
This is all fascinating stuff. But wait, don't send money yet -- there's more. And it gets even weirder yet. And the possibilities of artificial intelligence get more fantastic with simpler constructs.
Going back to the worm studies, the connectome is all mapped. The researchers found that from the first state in the connectome diagram, when all of the neurons were activated, they transitioned to the low state and the worm got to follow its nose to eat (so to speak). But this was not a 100% guaranteed event. It usually happened, but there were a small number of times when it didn't. This makes it a probability function. Knowing the number of neurons, their states, and the map of the connections, one can create a complex Bayesian calculation model. (A very simplified explanation of a Bayesian calculation is that the conditional probability of an event can be calculated knowing the probabilities of the previous events.)
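Here is a tiny worked example of such a Bayesian calculation, with made-up probabilities standing in for the measured ones.

```python
# Made-up numbers for illustration:
p_low = 0.8                  # P(network transitions to the low state)
p_stop_given_low = 0.95      # P(worm stops for food | low state)
p_stop_given_high = 0.10     # P(worm stops for food | high state)

# Total probability that the worm stops for food (law of total probability):
p_stop = p_stop_given_low * p_low + p_stop_given_high * (1 - p_low)

# Bayes' rule: having observed the worm stopping, how likely was the
# network in the low state?
p_low_given_stop = p_stop_given_low * p_low / p_stop

print(round(p_stop, 3), round(p_low_given_stop, 3))
```

Chain enough of these conditionals together, one per junction in the connectome, and you have the "Bayesian calculator" version of the worm's behavior, with no neurons simulated at all.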
So what if you created a neural network with supervisory circuits, and modeled the permutations and combinations of states? If you got good enough at it, and your model was sufficiently accurate for some sort of use, then you wouldn't actually need the neural networks. You could string together a whole pile of Bayesian calculators built on the probabilities of neural networks, without all of the hardware and software necessary to calculate the inputs and outputs of massive numbers of artificial neural network layers. You would be faking intelligence with a bunch of equations rather than the bother of neurons and such. A simple small device with rudimentary computation could be fairly intelligent. In this Brave New World, the richest data scientist will be the one with the best Bayesian calculator.
But there is even more, so one more parting thought. The worm's neural nets could be a very rudimentary model of the way that we as humans work. The difference is that our neural networks are massively scaled up. The human brain has 86 billion neurons and 100 trillion synapses -- give or take a few billion depending on the level of alcohol imbibition of the person. If the model holds, and there is a possibility that the brain could potentially be modeled as one humongous Bayesian calculator, what does that say about Life? To me, it says lots, and that a machine one day, could have the basis of cognition, and some sort of consciousness.
Machine Learning In A Nutshell ~ Behold The Wonders Of A Grade 10 Math Book
The tsunami of information that bombards us was supposed to sink us in a quagmire of bits and bytes and paralyze us with information overload. That was the scenario painted by Alvin Toffler in his usually prescient book "Future Shock". The premise was that some three million years of evolution in a non-technological world left us poorly prepared to handle the onslaught of the information stream that assaults us almost every waking minute.
As it turns out, the computer that is responsible for creating the problem is now being used to solve the problem. If you scan the tech section of any publication, the words "Big Data", "Machine Learning", "Deep Learning" and "Artificial Intelligence" will jump out at you. This jargon all points to computing machines digesting the vast amounts of data that they produce and creating usable information.
Most of this data is generated by machines, and by itself it is junk. You can't learn much from it. A thousand pieces of data may have valuable information in them, or they may not. The value in a huge collection of data may lie in the exceptions to the average values, the outliers. For example, a deviation in the usual buying pattern of consumers could signal the beginning of a new trend. These are called weak signals, and they may give a competitive edge to the data miners who are able to isolate them and capitalize on them. Another term that you will hear is "fat tails". This is data that doesn't fit a standard bell curve; it creates bulges at either end of the curve when you plot it on a graph. Usually it means that something very interesting and out of the ordinary is happening, which could provide valuable intelligence to the data analyst. That information is not apparent from watching the big stream of machine-generated data go by.
So how does a machine actually learn? The old way of doing things was to store each piece in a database and then try to look it up. It was like going to the library and reading every index card on the subject matter you are trying to look up. Needless to say, that doesn't work very well if you have millions of cards to go through. There had to be a better way, and that better way is the artificial neural network. It is the basis of machine learning.
An artificial neuron is a very simple thing, and quite stupid actually. All it can do is add, multiply, compute just one math function (a formula), and compare the result. But this little virtual, self-learning thingie is the basis of all machine learning. You can gang hundreds and even thousands of them together in a massively parallel system, and they can do very complex things like recognize faces and handwriting, find doorways for robots, and tease out the latest trends in footwear.
This is how it works. Let's suppose that you want to teach your machine to recognize the number 42, which according to the "Hitchhiker's Guide To The Galaxy" is the answer to the Ultimate Question of Life, the Universe, and Everything as computed by the Earth which is a huge organic computer.
You could do this with the simplest example of an artificial neural network. It is a single neuron consisting of an input and an output. All of the knowledge of recognizing the number 42 is stored simply as a number, in a value called the weight. And no, the weight is not 42. The weight is the numeric value that determines if the neural network hoists up a flag indicating that it has seen the number 42.
There is another hidden input number that is unchanging in value for all inputs, and it is called the bias. The bias is like a control number. A very simple analogy, is that it is like a thermostat. In real life, a thermostat controls the range at which a furnace will fire. In an artificial neural network, it has the same function. It determines the range at which the neuron will fire to indicate the number 42.
So when you present any number to the input, the neuron multiplies that number by a weight. It also multiplies the bias by a weight. It adds the two together. Then it spoon-feeds the result down the chute into the activator, a go/no-go threshold. The activator is a mathematical formula that defines a function, and no matter what number you feed into it, it always gives an answer between zero and one, as a long decimal. It is like a thermometer: the closer the neuron gets to the right answer, the closer the output of the activation function comes to 1. If the output falls below the threshold, it is a failure, a no-go. The neuron doesn't fire.
You don't even have to determine the weight. The neuron can be trained. The training is called back propagation. In training mode, you show it a whole bunch of numbers called a training set, and when the input number is 42, you ask the neuron to indicate the right answer by outputting a value of 1. Any other number should produce a zero at the output. When you run it and it gets the answer wrong, it adjusts the weight a little bit and tries again. You keep running the training set until it knows the right answer. It is that simple.
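Here is that training loop as a minimal, runnable sketch. One wrinkle the prose glosses over: a neuron fed a single raw number can only learn a threshold, so in this sketch the number is presented as six binary digits, which makes "is it 42?" learnable by one neuron. The weight-nudging rule used here is the classic perceptron rule, a simple cousin of back propagation.

```python
def bits(n):
    """Present a number 0..63 as its 6 binary digits, low bit first."""
    return [(n >> i) & 1 for i in range(6)]

weights = [0.0] * 6
bias = 0.0

def fire(x):
    """Go / no-go activation: weighted sum plus bias, then a threshold."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

# Training: show every number 0..63; on each mistake, nudge the weights
# toward the right answer (the perceptron learning rule).
for epoch in range(100):
    for n in range(64):
        x, target = bits(n), 1 if n == 42 else 0
        error = target - fire(x)
        if error:
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error

print(fire(bits(42)), fire(bits(41)), fire(bits(7)))
```

Nobody ever tells the neuron what 42 looks like; the weights it ends up with are whatever the mistakes carved out.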
What makes this a powerful concept, is that you can gang hundreds of neurons together, and machine learning can do quite complex stuff. Behold the amazing Dark Arts of a Grade 10 math book.
Asynchronous Neural Nets Are Primitive Cave Man Neural Nets
The first electronic circuit that I ever designed was a logic fall-through. (The circuit was for a quiz-show type of game, and it determined which member of which team pressed their button first.) It was asynchronous: when a signal arrived at the inputs of the silicon chip, it was processed right away. It wasn't held for a state change of the chip, the way modern digital systems work today.
Modern digital systems have a bus architecture. That means that every chip on the circuit board is connected to a central set of traces or wires called a bus. The chips share the traffic, or signals, on the bus. The way this happens is that a regular clock signal generator controls the timing. So if an input receives a signal, it in turn signals the bus controller that it needs to put its data on the bus, and that the data needs to go to a certain scratchpad register or memory unit.
The bus controller signals all of the other chips that it is going to commandeer the bus. All of the other chips finish up what they are doing and clear the decks. The bus controller then signals the necessary registers to receive the data, and signals the originating input to load its data onto the bus. Once the data is loaded, it signals the register to process the data, and clears the decks for the next operation. All operations are controlled by a nice orderly clock signal: a square wave that rises up and down. That square wave represents the computer binary language of zeros and ones, so a clock signal in computer talk always looks like this: 01010101010101010101 and so on. The important thing is that there is a frequency, a timing between the state changes of zero and one, the up and down of the waveform.
This frequency is important. If you remember back to the days of dial-up internet (if you are old enough), you would remember the distinctive oscillating sound of the modem connecting to the internet through the phone line. The binary signals were converted to certain frequencies that the modem at the other end understood to be a zero or a one. This was called frequency shift keying, and it was a way to turn the binary computer language into a sound that could traverse a telephone line. It preserved coherent computer data by using a set pair of frequencies and a set timing for them. It was all rather ingenious, and it was the baby steps toward where we are today with high-speed internet.
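For the curious, here is a toy FSK modem in a few lines of Python. The sample rate, baud rate and tone frequencies are illustrative, not those of any real modem standard; the receiver recovers each bit by counting zero crossings, since the higher tone crosses zero more often.

```python
import math

def modulate(bits, rate=8000, baud=100, f0=1000, f1=2000):
    """Turn each bit into a short tone: f0 for a zero, f1 for a one."""
    n = rate // baud                       # samples per bit
    samples = []
    for b in bits:
        f = f1 if b else f0
        # Half-sample phase offset so no sample lands exactly on zero.
        samples += [math.sin(2 * math.pi * f * (i + 0.5) / rate)
                    for i in range(n)]
    return samples

def demodulate(samples, rate=8000, baud=100, f0=1000, f1=2000):
    """Recover bits by counting zero crossings in each bit-length chunk."""
    n = rate // baud
    threshold = 2 * ((f0 + f1) / 2) * n / rate   # crossings midway between tones
    bits = []
    for k in range(0, len(samples), n):
        chunk = samples[k:k + n]
        crossings = sum(1 for a, b in zip(chunk, chunk[1:]) if a * b < 0)
        bits.append(1 if crossings > threshold else 0)
    return bits

message = [0, 1, 0, 1, 1, 0, 0, 1]
recovered = demodulate(modulate(message))
print(recovered)
```

The same trick, encoding meaning in frequency rather than in instantaneous level, is exactly the extra channel that biological neurons appear to use and that artificial ones currently lack.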
Well a good idea can always be re-used. FSK or frequency shift keying took us from asynchronous to synchronous systems, and it could be used to make Artificial Neural Networks a lot smarter.
I was just reading some of the latest in brain science research with real neural networks in our brain. They are fall-through asynchronous in general, meaning that when a nerve sends a signal, it is immediately fed into a massively parallel network of neurons. However, it has been discovered that frequency also plays a factor in the neural network.
The stimulation is almost like the palimpsest that Carl Sagan wrote about in his book "Contact". In its truest sense, a palimpsest is an ancient manuscript, from the days before paper, that had another book written over it. The old words were scraped off and new words were written, but you could still read the old words, so the manuscript carried double the information. In the book and movie "Contact", the message containing the plans for the machine was a palimpsest, with additional information encoded in the polarization of the wave form.
Apparently the brain in humans uses frequency as an additional information encoder. It has been measured in studying emotion response in the brain, where frequency plays a huge part. This component is entirely missing in computer Artificial Neural Networks. All computer neurons are asynchronous fall-through.
I am by no means suggesting that they become synchronous in the sense of a clock system in a computer (although that could be a possible paradigm), but that somehow frequency be incorporated as an additional tool, paradigm, algorithm or species for the neurons. A good start would be to incorporate Frequency Shift Keying into Artificial Neural Networks. I don't have an exact methodology on how to do this yet, but you can be sure that I will devote some of my internal brain cycles to try and figure this thing out.
As a matter of fact, it is a fascinating thought experiment to contemplate on how a Von Neumann machine might behave if it were frequency-aware. New, ingenious compute dynamics such as frequency awareness are fascinating to think about.
Obviously a lot more research needs to happen, but here is a venue worth exploring for Machine Learning, Deep Learning, Artificial Neural Networks, and Artificial Consciousness. More ruminations on this topic to come.
Will Computers Be Able To Have Children?
Dr. Stephen Hawking says that we should be afraid of creating Artificial Intelligence that can become a threat to man. My contention, is that we are already on that path. That Pandora's Box or Can of Worms is already opened. The only way to close a can of worms is with a bigger can, and nobody has one when it comes to the progress of technology.
When Ray Kurzweil's book "The Age of Spiritual Machines" came out, I thought that it was a bunch of bosh -- until I got to a part of the book that was seminal for me: a small appendix of a few pages about building an intelligent machine in three easy paradigms. That book changed my life. One of my daughters gave me the book for Christmas, and it started me on the path to programming artificial intelligence and playing with machine learning. I never once thought that I would use Machine Learning in my job, and I was wrong.
Machine Learning has a long way to go, as does Artificial Intelligence, but we are making great headway. In previous blog entries, I make the case for every Operating System, or OS, to have an artificial neural network embedded in it. I also make the case for standardized neural network notation, so that I can transfer or sell what my machine has learned to your machine. And I make the case in this blog post that we can evolve smarter and smarter machines if, every time we need to load a new operating system, we let an existing operating system impart its neural nets to the new machine. One of the differences between humans and other animals is that our knowledge is passed from generation to generation. If we do that with computers, we are well on the way to making scary intelligent machines.
So if a computer can pass on knowledge to a new generation of computers, by passing down knowledge embedded in Artificial Neural Networks, can one say that the new computer is a child of the old computer?
I have opined on how to create Artificial Consciousness (more in a later blog topic on how I can make a computer have the worry emotion). I also have talked about Computational Creativity and Dr. Stephen Thaler's work. So if we evolve computer intelligence to the point that it can seed other computers with that intelligence, then we are on the way to computers having virtual children.
The way that I see Artificial Intelligence evolving, is that no computer can be an expert on everything. As computers become more and more intelligent, there will be specialization among the ranks of computers, as there is in human endeavor. Some computers will trade securities. Some will diagnose illness. Others will run power plants. There will be a hierarchy of computer intelligence as there is in humans now. And the progeny of each computer will be a mirror of its parents. It's hard to imagine, but if computers do acquire consciousness, intelligence, personality and creativity, then the internet will become a computer society mirroring human society. And that is when we will have to fear it.
Alan Turing never knew what he was getting into when he proposed his machines and the capability of passing a Turing Test. We are on the cusp of something mind-boggling, but at the moment, I would be content to create an Artificial Neural Network that makes money for me while I ruminate about Artificial Intelligence.
Standard Artificial Neural Network Template
For the past few weeks, I have been thinking about taking a trained artificial neural network on one computer and transferring it to another computer or another operating system, or even selling the trained network on a digital exchange in the future.
It really doesn't matter what programming language artificial neural networks are written in. They all have the same parameters: inputs, outputs, weights, biases and so on. All of these values are particularly suited to being fed into a program using an XML document based on an .XSD schema, or a lightweight protocol like JSON. However, to my knowledge, this hasn't been done, so I took it upon myself to crack one out.
It is useful not only for creating portability in an untrained network; it also has data elements for a trained network, making the results of deep learning, machine learning and AI training portable and available.
Even where there are existing binaries, creating a front end to feed in these values would take minimal programming, re-programming or updating.
I also took the opportunity to make it extensible and flexible. There are elements for things that do not exist yet (like an XSD tag for a function), but I put the capability in for when they are developed.
There are a few other interesting things included. There is the option to define more than one activation function, and the values for the local gradient, the alpha and other parameters are included so that back propagation can continue.
There is room to include a link to the original dataset on which these nets were trained (it could be a URL, a directory pathway, a database URL, etc.), and an element to record the number of training epochs. With all of this information, the artificial neural net can be re-created from scratch.
There is extensibility in case this network is chained to another, and an added data dimension in case other types of neurons are invented, such as accumulators or neurons that return a probability.
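As a back-of-the-envelope illustration of the portability idea, here is a sketch in Python using the JSON route. The field names below mirror the spirit of the schema but are illustrative; they are not the exact ann.xsd element names, and the weight values are invented.

```python
import json

# An invented, illustrative trained network: name/revision/layer structure
# echoes the schema, but this is not the literal ann.xsd vocabulary.
trained_network = {
    "name": "xor_demo",
    "revision": "1.0",
    "number_of_layers": 1,
    "layers": [
        {
            "layer_name": "L1",
            "neurons": [
                {
                    "name": "L1-N1",
                    "activation_function_name": "sigmoid",
                    "learning_rate": 0.5,
                    "inputs": [
                        {"weight": 0.42, "previous_weight": 0.40},
                        {"weight": -0.17, "previous_weight": -0.15},
                    ],
                    "bias": {"value": 1.0, "weight": 0.3},
                },
            ],
        },
    ],
}

def export_network(net):
    """Serialize a trained network so another machine can rebuild it."""
    return json.dumps(net, indent=2)

def import_network(payload):
    """Rebuild the parameter structure on the receiving machine."""
    return json.loads(payload)
```

The receiving machine gets back the identical structure, which is the whole point: the learning travels, not the program that did the learning.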
I put this .xsd template on Github as a public repository. You can download it from here:
http://github.com/kenbodnar/ann_template
Or if you wish, here are the contents of the .xsd, called ann.xsd. It is heavily commented for convenience.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="artificial_neural_network">
<xs:complexType>
<xs:sequence>
<!-- The "name" element is the name of the network. They should have friendly names that can be referred to if it ever goes up for sale, rent, swap, donate, or promulgate.-->
<xs:element name="name" type="xs:string" minOccurs="1" maxOccurs="1"/>
<!-- The "id" element is optional and can be the pkid if the values of this network are stored in an SQL (or NOSQL) database, to be called out and assembled into a network on an ad hoc basis-->
<xs:element name="id" type="xs:integer" minOccurs="0" maxOccurs="1"/>
<!-- The "revision" element is for configuration control-->
<xs:element name="revision" type="xs:string" minOccurs="1" maxOccurs="1"/>
<!-- The "revision_history" is optional and is an element to describe changes to the network -->
<xs:element name="revision_history" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "classification" element is put in for later use. Someone will come up with a classification algorithm for types of neural nets. There is room for a multiplicity of classifications-->
<xs:element name="classification" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
<!-- The "region" element is optional and will be important if the networks are chained together, and the neurons have different functions than a standard neuron, like an accumulator or a probability computer
and are grouped by region, disk, server, cloud, partition, etc-->
<xs:element name="region" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "description" element is an optional field, however a very useful one.-->
<xs:element name="description" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "creator" element is optional and denotes who trained these nets -->
<xs:element name="creator" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The "notes" element is optional and is self explanatory-->
<xs:element name="notes" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- The source element defines the origin of the data. It could be a URL -->
<xs:element name="dataset_source" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- This optional element, together with the source data helps to recreate this network should it go wonky -->
<xs:element name="number_of_training_epochs" type="xs:integer" minOccurs="0" maxOccurs="1"/>
<!-- The "number_of_layers" element defines the total number of layers-->
<xs:element name="number_of_layers" type="xs:integer" minOccurs="1" maxOccurs="1"/>
<xs:element name="layers">
<xs:complexType>
<xs:sequence>
<!-- Repeat as necessary for the number of layers-->
<xs:element name="layer" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<!-- Layer Naming and Neuron Naming will ultimately have a recognized convention eg. L2-N1 is Layer 2, Neuron #1-->
<xs:element name="layer_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- number of neurons is for the benefit of an object-oriented constructor-->
<xs:element name="number_of_neurons" type="xs:integer" minOccurs="1" maxOccurs="1"/>
<!-- defining the neuron ~ this is repeated as many times as necessary-->
<xs:element name="neuron" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<!--optional ~ currently it could be a perceptron, but it could also be a new type, like an accumulator, or probability calculator-->
<xs:element name="type" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- name is optional ~ name will be standardized eg. L1-N1 layer/neuron pair. The reason is that there might be benefit in synaptic joining of this layer to other networks and one must define the joins -->
<xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- optional ~ again, someone will come up with a classification system-->
<xs:element name="neuron_classification" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- number of inputs-->
<xs:element name="number_of_inputs" type="xs:integer" minOccurs="1" maxOccurs="1"/>
<!-- required if the input layer is also an output layer - eg. sigmoid, heaviside etc-->
<xs:element name="primary_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- ~ optional - there is no such thing as an xs:function yet, so xs:string stands in as a placeholder until there is -->
<xs:element name="primary_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- in lieu of an embeddable function, a description could go here ~ optional -->
<xs:element name="primary_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- possible alternate activation functions eg. sigmoid, heaviside etc-->
<xs:element name="alternate_activation_function_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- ~ optional - there is no such thing as an xs:function yet, so xs:string stands in as a placeholder until there is -->
<xs:element name="alternate_activation_function" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- in lieu of an embeddable function, a description could go here ~ optional -->
<xs:element name="alternate_activation_function_description" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- optional ~ used if this is an output layer or otherwise requires an activation threshold-->
<xs:element name="activation_threshold" type="xs:double" minOccurs="0" maxOccurs="1"/>
<xs:element name="learning_rate" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- the alpha or the 'movement' is used in the back propagation formula to calculate new weights-->
<xs:element name="alpha" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- the local gradient is used in back propagation-->
<xs:element name="local_gradient" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- inputs ~ as many as needed-->
<xs:element name="input" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<!-- Inputs optionally named in case order is necessary for definition -->
<xs:element name="input_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<!-- use appropriate type-->
<xs:element name="input_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
<!-- use appropriate type-->
<xs:element name="input_value_integer" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/>
<!-- weight for this input-->
<xs:element name="input_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
<xs:element name="input_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of input-->
<!-- bias start-->
<xs:element name="bias">
<xs:complexType>
<xs:sequence>
<xs:element name="bias_value" type="xs:double" minOccurs="1" maxOccurs="1"/>
<xs:element name="bias_value_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
<!-- added as a convenience for continuation of back propagation if the network is relocated, moved, cloned, etc-->
<xs:element name="bias_value_previous_weight" type="xs:double" minOccurs="1" maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of bias-->
<xs:element name="output">
<xs:complexType>
<xs:sequence>
<!-- outputs optionally named in case order is necessary for definition -->
<xs:element name="output_name" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="output_value_double" type="xs:double" minOccurs="0" maxOccurs="unbounded"/>
<!-- hypothetical value is a description of what it means if the neuron activates and fires as output if this is the last layer-->
<xs:element name="hypothetical_value" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of output-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of neuron-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of layer-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- end of layers-->
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- network-->
</xs:schema>
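For the sake of illustration, here is a small Python snippet that builds a minimal instance document using the same element names as ann.xsd. The values themselves are invented, and the instance is deliberately stripped down to the required skeleton.

```python
import xml.etree.ElementTree as ET

# Build a bare-bones instance of the schema. Element names follow ann.xsd;
# every value is an invented placeholder.
root = ET.Element("artificial_neural_network")
ET.SubElement(root, "name").text = "demo_net"
ET.SubElement(root, "revision").text = "1.0"
ET.SubElement(root, "number_of_layers").text = "1"

layers = ET.SubElement(root, "layers")
layer = ET.SubElement(layers, "layer")
ET.SubElement(layer, "layer_name").text = "L1"
ET.SubElement(layer, "number_of_neurons").text = "1"

neuron = ET.SubElement(layer, "neuron")
ET.SubElement(neuron, "name").text = "L1-N1"
ET.SubElement(neuron, "number_of_inputs").text = "2"
ET.SubElement(neuron, "primary_activation_function_name").text = "sigmoid"
ET.SubElement(neuron, "learning_rate").text = "0.1"
ET.SubElement(neuron, "alpha").text = "0.9"
ET.SubElement(neuron, "local_gradient").text = "0.0"

inp = ET.SubElement(neuron, "input")
ET.SubElement(inp, "input_name").text = "L1-N1-I1"
ET.SubElement(inp, "input_value_weight").text = "0.42"
ET.SubElement(inp, "input_value_previous_weight").text = "0.40"

bias = ET.SubElement(neuron, "bias")
ET.SubElement(bias, "bias_value").text = "1.0"
ET.SubElement(bias, "bias_value_weight").text = "0.3"
ET.SubElement(bias, "bias_value_previous_weight").text = "0.25"

xml_payload = ET.tostring(root, encoding="unicode")
```

That string is what would actually travel between machines, or sit in an exchange listing, waiting to be parsed back into a working network.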
I hope this helps someone. This is open source. Please use it and pass it on if you find it useful.
How To Make Scary Smart Computers
If you have ever visited this blog before, you will know that I am heavily into Artificial Intelligence. I have been playing with artificial neural networks for about ten years. Recently, buoyed by the Alan Turing movie "The Imitation Game" and by "The Innovators" by Walter Isaacson, I have decided to start mapping out what it would take to make a truly intelligent, conscious, creative computer that would easily pass a Turing Test.
In previous blog entries, like the one immediately below, I outline the need for an autonomous master controller that can stop and start programs based on what an embedded set of artificial neural networks comes up with. I have also started thinking about a standardized artificial neural net that can be fed into any machine already trained, so there can be a market for trained artificial neural nets. And I have outlined a possible algorithm for perturbations in artificial neural nets that can create computational creativity. The list goes on and on.
But here is another essential element: if computers have embedded artificial neural networks in them, then for a machine to become scary smart, it has to be able to pass on what it has learned to the next generation. So how do you accomplish that? Easy.
Every time an Operating System, or OS, is upgraded, a predecessor machine passes on the trained neural nets that it has built up over its artificial lifetime. It is the equivalent of a parent teaching a child.
In the biological world, what makes humans different from the animals is that we can pass on wisdom, knowledge and observation. In the animal kingdom, each new generation starts from where its parents did -- near zero. Animals learn only a little from their parents. But I can go and read a book, say one written by the Reverend Thomas Bayes, who wrote on probability in the 1700s, and I can read last week's journals. I can pick and choose to learn whatever I want from the human body of knowledge. But first and foremost, I get my first instruction from my parents.
So if a new Operating System is loaded into a computer from an existing one with artificial intelligence, then it won't have to start from scratch. And if you embed in the artificial neural networks the ability to read and learn by crawling the internet, soon you will have a scary smart computer. The key is that each machine and each server is capable of passing on what it has learned to new computers.
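A crude sketch of that parent-to-child handover, with a network reduced to bare weight matrices: the child deep-copies its parent's trained weights instead of starting from random ones. The inherit function and its small mutation are my own invention for illustration; nothing here is a real OS mechanism.

```python
import copy
import random

def new_random_network(layer_sizes):
    """An untrained network: per-layer weight matrices, every weight random.

    This is the 'start from near zero' case, like an animal generation.
    """
    return [
        [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]

def inherit(parent_weights, mutation=0.01):
    """A child network: starts from its parent's trained weights.

    The small mutation (an invented knob) lets the child diverge from the
    parent as it continues its own learning.
    """
    child = copy.deepcopy(parent_weights)
    for layer in child:
        for row in layer:
            for i in range(len(row)):
                row[i] += random.uniform(-mutation, mutation)
    return child
```

The difference between the two constructors is exactly the difference between the animal kingdom and human culture: one generation's learning either dies with it, or gets handed down.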
I do believe what Dr. Stephen Hawking says: that one of the threats to mankind will be some of the artificial intelligence that we create. But like nuclear fission, we have to build it for the sake of knowledge and progress, even if it has the potential to do the human species grave harm.
We have already opened the can of worms of artificial intelligence. Once opened, there is no way to close it unless you have a bigger can. Unfortunately, the contents of that can of worms are expanding faster than we can keep up. The best way to control artificial intelligence is to have a hand in inventing a safe species of it.
I'm Sorry Dave, I Can't Let you Do That ~ The AI Autonomous Computer Master Controller
We still operate as Microsoft slaves when it comes to a computer. Can you imagine how ludicrous it will seem to our grandchildren that we had to double-click on an icon to start a program? They will not be able to imagine us having to direct, save and open a downloaded file. They will wonder why we were so primitive that we had to have anti-virus programs. They will wonder why we didn't create something as obvious as the Autonomous Computer Master Controller as part of every operating system.
The genesis of this blog posting came after I did a run of my Artificial Neural Network and then had to take the outputs and feed them into another program (in this case, Excel). I got to thinking that if my vision of an Artificial Neural Network embedded in every operating system comes to fruition, then the network should not just solve functions, sort and classify data, and recognize stuff, as ANNs (Artificial Neural Networks) do today. It should be able to invoke programs based on the outputs of specific neural pathways and events.
A natural extension of that thought is that the network would signal an Autonomous Computer Master Controller to fire up a program to use the outputs of the particular neural network that just finished firing. If that controller is smart enough to recognize when the AI network is telling it to start a program, then it should be smart enough to shut a program down when it's not needed. This is the small starting point of an autonomous, semi-intelligent machine. But let us take it one step further.
Not only could the ANN tell the controller to fire up a program and then shut it down, but the ANN could also be running the Autonomous Computer Master Controller itself. Moore's Law will let that happen: as microprocessors get faster and faster, the overhead will become negligible. A core or two of the microprocessor could be dedicated to the intelligent OS operations.
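Here is a toy sketch of that controller loop in Python. The pathway names, thresholds and program names are all invented for illustration, and a real controller would launch and kill actual processes rather than return a list of actions.

```python
class MasterController:
    """Watches named ANN output pathways and starts/stops programs.

    rules: pathway name -> (activation threshold, program name).
    All names and thresholds here are hypothetical.
    """

    def __init__(self, rules):
        self.rules = rules
        self.running = set()

    def observe(self, ann_outputs):
        """React to one round of ANN output activations.

        Returns the (action, program) pairs taken this round, so the
        sketch stays testable without spawning real processes.
        """
        actions = []
        for pathway, (threshold, program) in self.rules.items():
            fired = ann_outputs.get(pathway, 0.0) >= threshold
            if fired and program not in self.running:
                self.running.add(program)
                actions.append(("start", program))
            elif not fired and program in self.running:
                self.running.remove(program)
                actions.append(("stop", program))
        return actions
```

So when the spreadsheet-needed pathway fires above its threshold, the controller starts the spreadsheet once, does nothing while the pathway stays hot, and shuts the program down when the activation drops away, which is the "I'm sorry Dave" behavior in miniature.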
Since we are not evolving a computer intelligence from scratch, we can take shortcuts in the evolutionary cycle of smart machines. We don't have to create memory and storage from first principles, using an artificial neuron as a building block the way biological evolution did. We can do this by making the Autonomous Computer Master Controller talk to a graph database. A graph database can map ambiguous, non-empirical, indirect relationships. So the Autonomous Computer Master Controller observes what is going on with its AI network, and saves its observations in a graph. As neo4j, the premier graph database provider, says, "Life is a graph". They are pretty cool guys at neo4j, unlike those at ArangoDB -- the kind of jerks who follow you on Twitter and, once you extend the courtesy of following them back, unfollow you. They have a thing or two to learn about brand-building. But je digress.
The power of an Autonomous Computer Master Controller becomes obvious in various use cases. If you have an overseer monitoring the browser and its downloads, it can detect executable code, pull an "I'm sorry Browser Dave, but I can't let you do that!", and block the download.
Slowly, I am stitching together the theoretical pieces of an intelligent, creative (perhaps conscious) machine that will pass the Turing Test and perhaps equal or surpass us in intelligence. A machine like that will crack this veneer that we call civilization, and show us how thin it really is, and how unenlightened most carbon-based life forms are. Silicon Rules!
A Returned-Probability Artificial Neural Network - The Quantum Artificial Neural Network
Artificial Neural Networks associated with Deep Learning and Machine Learning, using supervised and unsupervised learning, are fairly good at figuring out deterministic things. For example, they can find an open door for a robot to enter, or find patterns in a given matrix, collection or field.
However, sometimes there is no evident computable function. In other words, suppose that you are looking at an event or action that results from a whole bunch of unknown things, with a random bit of chaos thrown in. It is impossible to derive a computable function without years of study and knowledge of the underlying principles. And even then, it still may be impossible to quantify with an equation, a regression formula or the like.
But Artificial Neural Nets can be trained to identify things without actually knowing anything about the background causes. If you have a training set of size k with the answers or results (k being a series of cases), then you can train your Artificial Neural Networks or Multilayer Perceptrons on k-1 cases and evaluate how well you are doing on the held-out case. You measure the error rate and back propagate, and off you go to another training epoch if necessary.
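Here is a toy version of that train-on-k-1, hold-one-out loop, using a single sigmoid neuron and an invented OR-like dataset. The model, the data and the learning rate are all illustrative, not a real forecasting setup.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_and_hold_out(cases, held_out, epochs=2000, rate=0.5):
    """Train a single sigmoid neuron on the k-1 cases, then return the
    absolute error on the held-out case."""
    weights = [random.uniform(-1, 1) for _ in cases[0][0]]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in cases:
            out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            # Gradient of squared error through the sigmoid: the one-neuron
            # version of back propagation.
            delta = (out - target) * out * (1 - out)
            weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
            bias -= rate * delta
    inputs, target = held_out
    out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    return abs(out - target)

# Toy OR-like data: train on the first three cases, hold out the fourth.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
error = train_and_hold_out(data[:-1], data[-1])
```

The neuron never sees the (1, 1) case during training, yet its error on it ends up small, which is the whole trick: the net generalizes without knowing anything about the "cause" behind the data.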
This is happening with predicting solar flares and the resultant chaos they cause with electronics and radio communications when solar winds hit the earth. Here is a link to an article where an ANN does the predicting:
http://www.dailymail.co.uk/sciencetech/article-2919263/The-computer-predict-SUN-AI-forecasts-devastating-solar-flares-knock-power-grids-Earth.html
In this case, the ANNs have shown that there is a relationship between the vector magnetic fields of the sun's surface, the solar atmosphere, and solar flares. That's all well and dandy for deterministic events, but what if the determinism were a probability and not a direct causal relationship mapped to its input parameters? What if there were other unknown or unknowable influencing factors?
That's where you need an ANN (Artificial Neural Network) to return a probability as the hypothesis value. This is an easy task for a stats package working on database tables, churning out averages, probabilities, degrees of confidence, standard deviations and so on, but I am left wondering if it could be done internally, in the guts of the artificial neuron.
The artificial neuron is pretty basic. It sums up all of the inputs and biases multiplied by their weights, and feeds the result to an activation function. It does this many times over, in many layers. What if you could encode the guts of the neuron to spit out the probability of what is being inputted? What if you changed the inner workings of the perceptron or neuron to calculate a probability? It seems to me that the activation function is ideally suited to this adaptation, because it can be constructed to deliver an activation value between 0 and 1, which matches probability notation.
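Here is how small the change could be, at least in a sketch: keep the standard weighted sum, and read a sigmoid activation directly as a probability, since it is bounded between 0 and 1. The function name and the sample values are invented for illustration.

```python
import math

def probability_neuron(inputs, weights, bias):
    """Return an activation read as P(event), rather than a hard 0/1 firing.

    The weighted sum is the ordinary artificial neuron; only the
    interpretation of the sigmoid output changes.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

With zero evidence the neuron sits at exactly 0.5, the maximally uncertain answer, and stronger weighted evidence pushes it toward 0 or 1 -- which is just the probability-like behavior described above.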
Our human brains work well with fuzziness in our chaotic world. We unconsciously map patterns and assign probabilities to them. There is another word for fuzzy values: "quantum". The more you know about one property of an object, the less you know about another. Fuzziness. The great leap forward for Artificial Neural Networks is to become quantum and deliver a probability. Once we can get an Artificial Neural Net machine to determine probability, then we can apply Bayesian mechanics. That's when it can make inferences, and get a computer on the road to thinking from first principles -- from things that it has learned by itself.