Process Mining From Event Logs -- An Untapped Resource And Wave of The Future
A couple of years ago, I was searching for untapped horizons in data mining, and I came across a course given by Professor Wil van der Aalst, who pioneered the technology of mining business processes from server event logs. Naturally, I signed up for the course. It was a fascinating course, not only for its in-depth and non-trivial treatment of gleaning knowledge from data, but also because it got my creative juices flowing about where else the technique could be applied. I was so intrigued with the possibilities that I created a Google Scholar alert for Professor van der Aalst's publications. The latest alert arrived on January 31st, for a paper entitled "Connecting databases with process mining". The link is here: http://repository.tue.nl/858271 It was this paper that triggered this article.
I am a huge proponent of AI, Machine Learning and Analytics. In Machine Learning, you gather large datasets, clean the data, split the data into smaller sets for training and evaluation, and then train a model over hundreds, perhaps thousands, of training epochs until its accuracy on the task crosses an appropriate threshold. Machine intelligence is a huge field of endeavor, and it is becoming a major part of everyday life. However, it is time-consuming to teach the machine and get it right. Professor van der Aalst's area of expertise can provide a better way. Let me explain:
My particular interest is that I am building a semantic blockchain to record all of the data coupled to vehicles, autonomous or not. Blockchain, of course, is an immutable data ledger that serves as a source of truth, is itself autonomous in operation, disintermediates third parties and is outage-resistant. Autonomous vehicles will, by law, be required to log every move, keep records of their software revisions, and keep records of events such as post-crash behavior.
I immediately saw the possibilities of using this data. Suppose that you are in an autonomous vehicle, and that vehicle has never been on a tricky roadway that you need to navigate to get to your destination. Your car doesn't know the route parameters, but thousands of other autonomous vehicles have driven it, including many with the same kind of operating system and software as yours. With the connected car, your vehicle would know its GPS coordinates and could query a system for the driving details of the piece of roadway that is unknown to its computer. Instead of requiring intense computation to navigate, the vehicle could download a recipe of driving features.
Rather than garnering those instructions from repeated training epochs in machine learning, one could apply process mining to the logs to extract the knowledge required. There are already semantic methods of communicating processes, from decision trees to Petri nets, and if the general process were already known to the machine, it would reduce the computational load. As a matter of fact, each vehicle could have a process mining module to extract high-level algorithms for the roads that it drives regularly. That in itself would reduce the computational load on the vehicle. It would know in advance where the stop signs are, for example, and you wouldn't have YouTube videos of self-driving cars going through red lights and stop signs.
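To make the idea a little more concrete, here is a minimal sketch in Python of one of the basic building blocks of process mining: counting which activity directly follows which in an event log, and then reading off the most frequent path. It is only an illustration under my own assumptions. The trip IDs, the activity names and the helper functions are all hypothetical, and this is not the method from Professor van der Aalst's paper.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (trip_id, sequence_number, activity).
# The trips and activity names are invented for illustration only.
EVENT_LOG = [
    ("trip-1", 1, "approach_intersection"),
    ("trip-1", 2, "decelerate"),
    ("trip-1", 3, "stop_at_sign"),
    ("trip-1", 4, "proceed"),
    ("trip-2", 1, "approach_intersection"),
    ("trip-2", 2, "decelerate"),
    ("trip-2", 3, "stop_at_sign"),
    ("trip-2", 4, "proceed"),
    ("trip-3", 1, "approach_intersection"),
    ("trip-3", 2, "decelerate"),
    ("trip-3", 3, "yield_to_pedestrian"),
    ("trip-3", 4, "stop_at_sign"),
    ("trip-3", 5, "proceed"),
]


def build_traces(event_log):
    """Group events by trip and order them by sequence number."""
    traces = defaultdict(list)
    for trip_id, _seq, activity in sorted(event_log, key=lambda e: (e[0], e[1])):
        traces[trip_id].append(activity)
    return dict(traces)


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for activities in traces.values():
        for a, b in zip(activities, activities[1:]):
            counts[(a, b)] += 1
    return counts


def most_common_path(dfg, start):
    """Greedily follow the most frequent successor from a start activity.

    Stops when there is no unvisited successor, so it cannot loop forever.
    """
    path = [start]
    seen = {start}
    current = start
    while True:
        successors = [
            (count, b)
            for (a, b), count in dfg.items()
            if a == current and b not in seen
        ]
        if not successors:
            return path
        _count, current = max(successors)
        path.append(current)
        seen.add(current)


traces = build_traces(EVENT_LOG)
dfg = directly_follows(traces)
recipe = most_common_path(dfg, "approach_intersection")

for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
print("Most common path:", " -> ".join(recipe))
```

The resulting path is the kind of "recipe" I have in mind: a compact description of what vehicles usually do on a stretch of road, derived from the logs alone with no training epochs. A real system would need far richer events and a proper discovery algorithm, but the principle is the same.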
It goes a lot further than autonomous vehicles. This concept of creating high-level machine processes from event logs can be applied to fields as diverse as robotic manufacturing and cloud server monitoring, and to numerous other fields where human operators or real-world human judgement are required.
Process mining could replace machine learning in a lot of instances, or it could supplement it in a mix of technologies. The aim is the same: aggregating data into information and integrating information into knowledge, both for humans and machines.
This process mining business reminds me of the history behind Bayesian inference. In the 18th century, the Reverend Thomas Bayes worked out the equations relating probability and prior belief. They sat on a dusty shelf for roughly 200 years before they were re-purposed for machine inference. I think that Professor van der Aalst's methodologies will be re-purposed for things yet unimagined, and that it will not take 200 years for them to come to fruition.
Professor van der Aalst's next course in process mining begins online on February 20th. Here is the link: