Showing posts with label process map. Show all posts
Event Logs, Process Mining and Artificial Intelligence
In my course on process mining from the Eindhoven University of Technology in the Netherlands, a person on the course forum asked where to get event logs for process mining. This was the question posted:
Anyhow, as I was watching the lecture on Guidelines for Event Logging, I was struck by the question that usually occurs to me in such courses: But how to do it in practice?
I'm assuming that logging for the Internet of Things is part of the Things that Make Up that Internet. But otherwise? I absolutely abhor having to program, never struck me as that interesting. So how is it done in practice? Do you guys have preset functions/libraries? In case a human needs to log their behaviour, how do you ensure compliance - that they don't forget etc. etc.?
I'd love to hear more on that!
I took it upon myself to reply, and this is what I said:
I am a technical architect (and Chief Technology Officer !) for an eCommerce platform that deals with high-dollar value goods marketed in exclusive circles. We have had the benefit of creating the technology so we created event logs for everything. Here is an example:
1) When you log in, we record the time, the username, the IP address the login came from, and whether the user was on a desktop computer or a mobile platform.
2) When you check your messages on our system, they are marked as read with a timestamp. That creates another event log.
3) When you go to view offerings, what you look at is recorded, so we can gather data on what the user likes to buy.
4) If the user is a seller, we record what he uploads in a database table, and every entry has a timestamp column recording when the data was added.
5) Each sale is recorded along with a timestamp.
6) Each log out is recorded, along with a timestamp.
All of this is very easy to do, because when we create a database table to store data, we construct it such that each entry (called a row) has a timestamp column where the computer puts the NOW() date/time when the data is recorded. This automagically creates the event logs for us, for all tasks, even tasks that are considered non-process related (which really add value to our processes).
When you are online, every time data is recorded, a timestamp goes along with it. We have event logs for everything.
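The pattern described above can be sketched in a few lines. This is a minimal, illustrative example using SQLite with a default timestamp column; the table and column names are hypothetical, not our actual schema:

```python
import sqlite3

# In-memory database standing in for the platform's store
# (names are illustrative, not a real production schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE login_events (
        id        INTEGER PRIMARY KEY,
        username  TEXT NOT NULL,
        ip        TEXT,
        device    TEXT,
        ts        TIMESTAMP DEFAULT CURRENT_TIMESTAMP  -- the automatic event log
    )
""")

# An ordinary insert: no timestamp supplied, the database records one itself.
conn.execute(
    "INSERT INTO login_events (username, ip, device) VALUES (?, ?, ?)",
    ("alice", "203.0.113.7", "mobile"),
)
conn.commit()

row = conn.execute("SELECT username, device, ts FROM login_events").fetchone()
print(row)  # the ts column was filled in automatically
```

Every table built this way becomes an event log as a side effect of ordinary inserts.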
But you can also have paper-based event logs that can be transcribed. For example, we looked at an auto repair shop and did some very rudimentary process mining while we were building a business app for the company. They took appointments and recorded the time of the call and the time the customer was going to bring the car into the shop. The service writer then recorded the details on an invoice/service sheet when the customer arrived, and pushed the invoice into a time clock, stamping the arrival time. We then looked at the mechanic's time sheet to see what hours he billed for the job. And we knew when the customer picked the car up, because the payment invoice was timestamped (usually with the cash register receipt). They had a complete event log on various bits of paper floating around the business, and once the business was computerized, they could determine the bottlenecks (which turned out to be waiting for ordered parts). This was my first experience of event logs in non-computerized form. Since then, I timestamp every database table that I construct -- even the ones that store metadata and mined data, such as the standard deviation of an aggregate of characteristics of our top buyers and sellers.
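Once the paper timestamps are transcribed, finding the bottleneck is a matter of measuring the gaps between consecutive steps. Here is a hedged sketch with made-up repair-shop data (the case ids, activities, and times are invented for illustration):

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transcribed paper log: (case id, activity, timestamp).
events = [
    ("job1", "car arrives",   "2024-03-01 08:00"),
    ("job1", "parts ordered", "2024-03-01 09:00"),
    ("job1", "repair starts", "2024-03-02 13:00"),
    ("job1", "picked up",     "2024-03-02 15:00"),
    ("job2", "car arrives",   "2024-03-03 08:30"),
    ("job2", "parts ordered", "2024-03-03 09:00"),
    ("job2", "repair starts", "2024-03-04 16:00"),
    ("job2", "picked up",     "2024-03-04 17:30"),
]

fmt = "%Y-%m-%d %H:%M"

# Group events into per-case traces, ordered by time.
cases = defaultdict(list)
for case, act, ts in events:
    cases[case].append((act, datetime.strptime(ts, fmt)))

# Collect waiting times (in hours) between each pair of consecutive steps.
gaps = defaultdict(list)
for trace in cases.values():
    trace.sort(key=lambda e: e[1])
    for (a1, t1), (a2, t2) in zip(trace, trace[1:]):
        gaps[(a1, a2)].append((t2 - t1).total_seconds() / 3600)

# The step pair with the largest average wait is the bottleneck.
bottleneck = max(gaps, key=lambda k: sum(gaps[k]) / len(gaps[k]))
print(bottleneck)  # ('parts ordered', 'repair starts')
```

With this toy data, the longest average wait sits between ordering parts and starting the repair, matching what the shop's paper trail showed.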
There is now a burgeoning field of NoSQL graph databases (we like Neo4j) that can map semantic and/or fuzzy relationships very easily, and this course has taught me to timestamp graph nodes and edges to monitor significant but transient process relationships in the business milieu.
Hope this helps,
Most people do not realize how many internet tracks they leave that are event logs when they do ordinary things online. The Germans have a word for this. It is not a nice word. It translates to "digital slime".
Event logs are ubiquitous, and it is my contention that all of these digital tracks and Big Data will lead to a plethora of training data for artificial intelligence. Artificial neural nets need concrete examples to learn from, iterating and re-iterating to "get smart". Process mining will be huge in that respect. Process mining is the first step for computers to learn human behavior. Neural net machines and multilayer perceptrons will pore over process maps gleaned from human behavior and learn to mimic and reproduce expert behavior in a far more repeatable fashion than humans can.
Process Mining, Data Mining, Explicit & Implicit Events
The course in process mining given by Professor Wil van der Aalst of the Eindhoven University of Technology in the Netherlands has opened my eyes to a few elements in data mining that I had not considered.
At first blush, the course looks like it would be quite useful for finding bottlenecks in processes like order fulfillment, patient treatment in a hospital, service calls, or a manufacturing environment, and it is. But to an eCommerce platform builder like myself, it can provide amazing insights that I had never thought of before taking this course.
Professor van der Aalst has introduced a layer of abstraction (or perhaps a double layer of abstraction) in defining any process with a Petri net derived from an event log. Here is an example of a Petri net (taken from Wikipedia):
The P's are places and the T's are transitions. In the theoretical and abstract model, tokens (the black dots) mark various spots in the process. Tokens are consumed by transitions, and regenerated when they arrive at the next place. The arrival of a token at a specific place records an explicit step in the process. So how did this help me?
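The token game described above is simple enough to simulate directly. Here is a minimal sketch of a toy two-transition net (not the Wikipedia example): a transition is enabled when every input place holds a token, and firing it consumes those tokens and produces tokens at the output places:

```python
# Minimal Petri-net token game. A transition is enabled when every input
# place holds a token; firing consumes those tokens and produces new ones.
marking = {"p1": 1, "p2": 0, "p3": 0}

# transition -> (input places, output places); a toy sequential net
transitions = {
    "t1": (["p1"], ["p2"]),
    "t2": (["p2"], ["p3"]),
}

def enabled(t):
    inputs, _ = transitions[t]
    return all(marking[p] > 0 for p in inputs)

def fire(t):
    assert enabled(t), f"{t} is not enabled"
    inputs, outputs = transitions[t]
    for p in inputs:
        marking[p] -= 1   # token consumed by the transition
    for p in outputs:
        marking[p] += 1   # token regenerated at the next place

fire("t1")            # token moves p1 -> p2
print(enabled("t2"))  # True: the token's arrival at p2 enables t2
fire("t2")
print(marking)        # {'p1': 0, 'p2': 0, 'p3': 1}
```

Each firing is exactly one explicit, recordable event, which is why Petri nets map so naturally onto event logs.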
I do data mining to enhance the revenue stream on our eCommerce platform (see the blog entry below this one). Previous data mining efforts on my part dealt with implicit events. Sure, we had an event log, but we looked at the final event of, say, a customer purchasing something, and tried to find associations that drove the purchase (attributes or resources like price, color, time of day, past buys of the customer, etc.). The customer's act of making the purchase was captured in the event logs, using timestamps of various navigations, but all of the events leading to the purchase were implicit events that we never measured. With the event logs, we have explicit behaviors, and using those event logs, we can define the purchase process for each customer. So we started making process maps of the online events that led to the purchase. In short, we began to look at the explicit events.
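A first cut at such a process map is the directly-follows relation: count how often one activity immediately follows another within a session. This sketch uses an invented clickstream (the session ids and activity names are hypothetical):

```python
from collections import Counter, defaultdict

# Hypothetical clickstream event log: (session, activity), already time-ordered.
log = [
    ("s1", "login"), ("s1", "view item"), ("s1", "read messages"),
    ("s1", "view item"), ("s1", "purchase"),
    ("s2", "login"), ("s2", "view item"), ("s2", "purchase"),
]

# Group the flat log into one trace per session.
traces = defaultdict(list)
for session, activity in log:
    traces[session].append(activity)

# Directly-follows counts: the skeleton of a discovered process map.
dfg = Counter()
for trace in traces.values():
    for a, b in zip(trace, trace[1:]):
        dfg[(a, b)] += 1

print(dfg[("view item", "purchase")])  # 2: both sessions end the same way
```

The heavy edges in this graph are the explicit paths customers actually take toward a purchase, rather than associations inferred after the fact.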
Where will this take us? It will show us the activities and processes leading to a high-value event for us (a purchase). We isolate high-value process events, and by mapping customer behavior to those events, we can evaluate and refine which customers will end up making an online purchase. Then we can treat those customers with kid gloves.
In essence, we can gain insight into the probability of an online purchase when a new customer starts creating events in our event logs that indicate behavior leading to a purchase. This data is extremely valuable: now we can put this customer on our valued customer list, and using other data mining techniques, we can suggest other things the customer is interested in and get more sales.
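The simplest version of that probability estimate is a frequency count over historical sessions: of the sessions that contained a given activity, what fraction ended in a purchase? A minimal sketch, with invented session data:

```python
# Hypothetical completed sessions (activity sequences, all made up).
sessions = [
    ["login", "view item", "purchase"],
    ["login", "view item", "view item", "purchase"],
    ["login", "read messages"],
    ["login", "view item"],
]

def purchase_rate_given(activity):
    """P(purchase | session contained the given activity), by frequency."""
    relevant = [s for s in sessions if activity in s]
    if not relevant:
        return 0.0
    return sum("purchase" in s for s in relevant) / len(relevant)

print(purchase_rate_given("view item"))  # 2 of 3 such sessions purchased
```

A live customer whose partial trace matches a high-rate pattern is a candidate for the valued customer list; richer models would condition on the whole prefix rather than a single activity.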
To recap, we now can measure explicit behaviors instead of implicit behaviors based on such limited metrics as past buying behaviors. We add a whole new dimension in enhancing the shopping experience for our users, and thereby enhancing our bottom line revenue stream.
As in life, often in data mining, it pays to pay attention to the explicit things. Process mining is an incredibly efficient way to deduce explicit behaviors that lead to desired outcomes on our platforms.