Process Mining, Data Mining, Explicit & Implicit Events
The course in process data mining given by Professor Wil van der Aalst from the Eindhoven University of Technology in the Netherlands, has opened my eyes to a few elements in data mining that I had not considered.
At first blush, the course looks like it would be quite useful for determining bottle necks in processes like order fulfillment, or patient treatment in a hospital, service calls or a manufacturing environment, and it is. But to an eCommerce platform builder like myself, it can provide amazing insights that I had never thought of before taking this course.
Professor van der Aalst has introduced a layer of abstraction or perhaps a double layer of abstraction in defining any process with a Petri Net derived from an event log. Here is an example of a Petri Net (taken from Wikipedia) :
The P's are places and the T's are transitions. In the theoretical and abstract model, tokens (the black dots) mark various spots in the process. Tokens are consumed by transitions, and regenerated when they arrive at the next occurring place. The arrival of a token at a specific place, records an explicit behavior in the transition. So how did this help me?
I do data mining to enhance revenue stream on our eCommerce platform. (See the blog entry below this one). Previous data mining efforts on my part dealt with implicit events. Sure we had an event log, but we looked at the final event of say a customer purchasing something, and tried to find associations that drove the purchase (attributes or resources like price, color, time of day, past buys of the customer, etc). The customer's act of making the purchase was captured in the event logs, using timestamps of various navigations, but all of the events leading to purchased were implicit events that we never measured. With the event logs, we have explicit behaviors, and using those event logs, we can define the purchase process for each customer. So we start making process maps of the online events that led to the purchase. In short, we began to look at the explicit events.
Where will this take us? It will show us the activities and processes leading to a high value event for us (a purchase). What it does, is that we isolate high value process events, and by mapping customer behavior to those events, we can evaluate and refine which customers will end up making an online purchase. So we can treat those customers in a special way with kid gloves.
In essence, we can gain insight into the probability of an online purchase if a new customer starts creating events in our event logs, which indicates behavior that leads to a purchase. This data is extremely valuable, as now we can put this customer on our valued customer list, and using other data mining techniques, we can suggest other things that the customer is interested in and get more sales.
To recap, we now can measure explicit behaviors instead of implicit behaviors based on such limited metrics as past buying behaviors. We add a whole new dimension in enhancing the shopping experience for our users, and thereby enhancing our bottom line revenue stream.
As in life, often in data mining, it pays to pay attention to the explicit things. Process mining is an incredibly efficient way to deduce explicit behaviors that lead to desired outcomes on our platforms.