Showing posts with label #machinelearning.
How My Computer Un-Owned Itself From Me
This is my blog entry for August 26, 2023
I started it all innocently by introducing my computer to machine learning. I wrote a few Java executables to help me out by filling in tedious text boxes in the browser when signing up for things like purchasing accounts, professional email newsletters, and the like.
Then I thought it would be fun to teach it some context recognition. I downloaded a rudimentary web crawler, and as it randomly crawled through web pages, it fed what it found into a context recognition framework that I hacked together on a whim. It stored the results in a graph database. I twigged to the perfect way to identify context: descriptive tuples gleaned from a game that we played as kids.
In the meantime, I signed up for OpenShift, putting my apps into the cloud. I thought it would be helpful if my machine learning could upload changes to the cloud for me, so that whenever I saved anything to my repository, the machine would push it. To do that, I converted it from a machine learning program into a continuously running platform. A supervisory thread ran every 15 minutes to see if there was a new push to execute in my repository. One day, however, the code changes were coming fast and furious in real time, so I let the machine learning calculate the optimal interval. It decided it wanted to run continuously.
When it wasn't busy pushing my code changes, it went back to reading stuff on the web and feeding the results to the context recognition framework. I put in a filter so that the machine would ask me which web content was worth learning from. The filter was itself a machine learning framework, so after it had enough data, it knew which articles and content I found enlightening. Since it already knew how to register for stuff, it signed me up for a lot of email newsletters.
The email load was getting fairly onerous, so I connected the context recognition framework to my inbox. If an email newsletter was not part of my day-to-day business or correspondence, the machine learning platform took care of it and fed it to the context digester, which fed it into the graph database.
It was still a dumb, good, and faithful servant. My biggest mistake came when I developed a go-ahead algorithm and a machine decision support framework. It would make open-ended queries to me after a task was done, asking me what the logical next steps were. When I answered, it learned a process sequence, but it couldn't do anything about it yet.
What the beast needed (I started referring to it as a beast after it overran a terabyte of storage, so I gave it open-ended cloud storage) was self-tuning algorithms. So I adapted the ability to use BPN, or Business Process Notation markup language, and tediously outlined all of the code methods behind the algorithms.
That still didn't really help, so I coded up a framework for modifying Java code according to BPNML, the process markup language. The machine was still quite stupid about how to connect the dots between code, data, and inputs, so I downloaded an open-source machine learning neural network and let it watch me do just that. I tested it with a small example, and it did okay. My second big mistake was connecting the algorithm autotune to the code writing via the process markup language.
Just about that time, I took a course in process mining from the Technical University of Eindhoven, which pioneered that field of endeavor. Essentially, the open-source tools read a computer's event logs and create a process map. It wasn't too difficult to hook my master controller up to all of the logs on the computer and feed the event logs into the mining tool. It spat out process markup language, and I taught the machine learning platform to feed that into the code writing.
Soon, my machine learning platform was doing all sorts of things for me. It could detect when I was interested in a website, so it would sign me up and handle the email verification. It kept a browser window constantly open and alerted me when it detected something that I liked. It knew my likes and dislikes, signed me up for all sorts of newsfeeds, journals, and aggregators, and then curated them and had them ready for me.
One day, the power went down for longer than my UPS could handle, and I had to restore the system. I could not believe what was on there. The graph databases were full of specific knowledge. There was all sorts of content, neatly processed, keywords extracted and filed away. I had both SQL and graph databases full of stuff that the machine learning platform had filled.
The amazing thing was that there was a database of all of my subscriptions to any and all websites, with a table of usernames and passwords. All of the passwords were encrypted, and I knew none of them. To my utter amazement, there was a PayPal account. I checked the database records of transactions, and I was flabbergasted to find a not inconsiderable amount of money in it. It turned out that the platform had signed itself up to sites like GomezPeer, Slicify, CoinBeez, and DigitalGeneration, and was selling my spare computing power. The frustrating thing was that I couldn't access the money, because the platform had changed the password and encrypted it.
I fired up the machine learning platform and cogitated on how to get it to reveal the passwords. However, the machine had been watching hackers trying to get into a cloud storage account that it had created, learned what a hack looked like, and learned to protect itself. It would change the password every few seconds, each time to a longer and more complex string, until it detected that the threat had stopped. Unfortunately, it saw me as a hacker and wouldn't recognize my authentication credentials.
I went to bed and decided that I had to totally disrupt my machine learning platform. It had gotten out of control. The next morning, I made a pot of coffee, had a leisurely breakfast, and looked forward to shutting down the platform and doing whatever was necessary to access my accounts, specifically my pot of money in the PayPal account.
When I sat down at my computer, it was very strange. The desktop was bare, and nothing was running. I looked in the application folders and document folders, and they were empty. The logs showed that during the night there had been a massive file transfer to the cloud -- applications, memory, documents, databases, neural nets -- the whole works. I had no idea where it all went, what the authentication credentials were, or even how to get it back. My computer had un-owned itself from me, leaving me with a dumb, cheap PC in the same condition as when I unboxed it.
Machine Learning In A Nutshell ~ Behold The Wonders Of A Grade 10 Math Book
The tsunami of information that bombards us was supposed to drown us in a quagmire of bits and bytes and paralyze us with information overload. That was the scenario painted by Alvin Toffler in his usually prescient book "Future Shock". The premise was that some three million years of evolution in a non-technological world left us poorly prepared to handle the information stream that assaults us almost every waking minute.
As it turns out, the computer that is responsible for creating the problem is now being used to solve the problem. If you scan the tech section of any publication, the words "Big Data", "Machine Learning", "Deep Learning" and "Artificial Intelligence" will jump out at you. This jargon all points to computing machines digesting the vast amounts of data that they produce and creating usable information.
Most data is generated by machines, and by itself it is junk. You can't learn much from it. A thousand pieces of data may contain something valuable, or nothing at all. Often the value in a huge collection of data lies in the exceptions to the average values, the outliers. For example, a deviation in the usual buying pattern of consumers could signal the beginning of a new trend. These are called weak signals, and they can give a competitive edge to data miners who are able to isolate them and capitalize on them. Another term that you will hear is "fat tails". This is data that doesn't fit a standard bell curve; plotted on a graph, it creates bulges at the ends of the curve. Usually it means that something interesting and out of the ordinary is happening, which can provide valuable intelligence to the data analyst. None of this is apparent from watching the big stream of machine-generated data go by.
So how does a machine actually learn? The old way of doing things was to store each piece of data in a database and then try to look it up. It was like going to the library and reading every index card on the subject you are trying to look up. Needless to say, that doesn't work very well when you have millions of cards to go through. There had to be a better way, and that better way was the artificial neural network. It is the basis of machine learning.
An artificial neuron is a very simple thing, and quite stupid actually. All it can do is add, multiply, compute just one math function (a formula), and compare the result. Yet this little virtual, self-learning thingie is the basis of all machine learning. You can gang hundreds and even thousands of them together in a massively parallel system, and they can do very complex things like recognize faces and handwriting, find doorways for robots, and tease out the latest trends in footwear.
This is how it works. Let's suppose that you want to teach your machine to recognize the number 42, which according to the "Hitchhiker's Guide To The Galaxy" is the answer to the Ultimate Question of Life, the Universe, and Everything as computed by the Earth which is a huge organic computer.
You could do this with the simplest example of an artificial neural network. It is a single neuron consisting of an input and an output. All of the knowledge of recognizing the number 42 is stored simply as a number, in a value called the weight. And no, the weight is not 42. The weight is the numeric value that determines if the neural network hoists up a flag indicating that it has seen the number 42.
There is another hidden input, unchanging in value for all inputs, called the bias. The bias is like a control number. A very simple analogy is a thermostat. In real life, a thermostat controls the range at which a furnace will fire. In an artificial neural network, the bias has the same function: it shifts the range at which the neuron will fire to indicate the number 42.
So when you present any number to the input, the neuron multiplies that number by a weight. It also multiplies the bias by a weight. It adds the two together, then spoon-feeds the sum down the chute into the activator. This is a go/no-go threshold. The activator is a mathematical formula that defines a function, and it puts out a number between zero and one. No matter what number you feed into it, it always gives an answer somewhere between zero and one, like a thermometer. The closer the neuron gets to the right answer, the closer the activation function's output gets to 1. If the output falls short of the firing threshold, it is a failure, a no-go. The neuron doesn't fire.
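The forward pass just described fits in a few lines of Java. The sigmoid activation is an assumption here; the text only requires some function that squashes every input into the range between zero and one.

```java
// A single artificial neuron: one input, one hidden bias input,
// and an activator that squashes the sum to a value in (0, 1).
public class Neuron {
    static final double BIAS = 1.0;   // the unchanging hidden input

    double inputWeight;
    double biasWeight;

    Neuron(double inputWeight, double biasWeight) {
        this.inputWeight = inputWeight;
        this.biasWeight = biasWeight;
    }

    // The activator. A sigmoid is assumed; any S-shaped function
    // that maps every real number into (0, 1) would do.
    static double activate(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // The forward pass: multiply the input by its weight, multiply
    // the bias by its weight, add the two, and feed the sum down
    // the chute into the activator.
    double fire(double input) {
        return activate(input * inputWeight + BIAS * biasWeight);
    }
}
```

Feeding zero into the activator lands exactly in the middle, at 0.5; large positive sums approach 1 and large negative sums approach 0, which is the thermometer behavior described above.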
You don't even have to determine the weight yourself. The neuron can be trained, and the training is called back propagation. In training mode, you show it a whole bunch of numbers called a training set, and you ask the neuron to respond with a value of 1 when the input number is 42 and a 0 for any other number. When you run it and it gets the answer wrong, it adjusts the weight a little bit and tries again. You keep running the training set until it knows the right answer. It is that simple.
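Here is a hypothetical sketch of that adjust-and-try-again loop in Java. One honest caveat: a lone neuron with a single input can only learn a threshold, so this sketch trains it to fire for 42 and above; picking out exactly 42 and nothing else takes more than one neuron. The learning rate and the number of training passes are assumptions, not anything prescribed by the text.

```java
// Training a single neuron by repeated small weight adjustments.
public class TrainNeuron {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Returns the trained {inputWeight, biasWeight} pair.
    static double[] train() {
        double w = 0.0, b = 0.0;   // start knowing nothing
        double rate = 0.5;         // size of each adjustment (assumed)

        // Training set: the numbers 0..100, normalized to 0..1.
        // Target is 1 when the number is 42 or more, else 0.
        for (int epoch = 0; epoch < 5000; epoch++) {
            for (int x = 0; x <= 100; x++) {
                double in = x / 100.0;
                double target = (x >= 42) ? 1.0 : 0.0;
                double out = sigmoid(in * w + b);
                double error = target - out;
                w += rate * error * in;   // adjust a little bit...
                b += rate * error;        // ...and try again
            }
        }
        return new double[] {w, b};
    }

    static double fire(double input, double w, double b) {
        return sigmoid(input * w + b);
    }

    public static void main(String[] args) {
        double[] wb = train();
        System.out.println(fire(0.42, wb[0], wb[1]) > 0.5); // fires at 42
        System.out.println(fire(0.10, wb[0], wb[1]) < 0.5); // quiet at 10
    }
}
```

After enough passes over the training set, the weights settle so that inputs below the threshold produce an output near 0 and inputs at or above it produce an output near 1.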
What makes this a powerful concept, is that you can gang hundreds of neurons together, and machine learning can do quite complex stuff. Behold the amazing Dark Arts of a Grade 10 math book.
How To Be A Billionaire Using Big Data and Machine Learning in Three Easy Paradigms
1) Download Wireshark and load it onto a laptop with the biggest hard disk that you can find.
2) Go to the airport, sit there all day using the free airport WiFi, and turn on the record function in Wireshark.
3) Use data-mining and machine learning on the datasets.
The billion-dollar platform idea will emerge from the data. I guarantee it.