Data mining and process mining are two of the hottest concepts in discussions of big data.
Big data is everywhere, and the analysis of that data has given rise to a range of named techniques, including data mining and process mining.
Both data mining and process mining fall under the umbrella of business intelligence. Business intelligence refers to techniques and tools that are used to analyse large amounts of digital data and extract valuable business knowledge from it. That holds for data mining techniques as well as process mining techniques, albeit with different perspectives on the analysis and on the results they produce.
Let's look at some of the similarities between the two:
- Both techniques are used to analyse amounts of data that are too large to analyse manually
- Both techniques produce information that can be used for making business decisions
- Both techniques rely on "mining", where algorithms traverse large volumes of data looking for patterns and relationships
These similarities are to be expected, as both techniques can be categorised as business intelligence. But, as mentioned before, the two techniques have different perspectives and goals.
Differences between Data Mining and Process Mining:
- Data mining techniques use multi-dimensional views (cubes) of the data, which can be drilled up and down to different levels of aggregation. For example, a sale of a product could have the related dimensions price, product category, customer, region, country, area, day, month, quarter, year, and so on. It is then possible to slice the cube and look at the data and its aggregates in various ways.
- Data mining techniques are primarily used to find patterns in large data sets. With data mining it may be possible to find that certain categories of customers demand a certain product, that the customers who most frequently buy product A are also the ones who most often buy product B, or that products placed at a specific location in the shop during an advertising campaign are also the ones that sell best. I remember an English department store which, through data mining techniques, found out that the customers who shopped the most were also those who most often bought a special Italian cheese that otherwise did not sell well. Traditionally, retailers would try to remove products with very low turnover rates and replace them with products with better sales - the problem is that removing goods according to that principle could lead the best customers to look somewhere else (for the special Italian cheese).
- The input to data mining is tables of data
- Process mining is not used to find patterns of relationships in the data, but rather to find process relationships in the data. These relationships provide an overview of the processes and the activities within them, of deviations, and of process performance such as throughput, bottlenecks and discrepancies.
- The perspective of process mining is not on patterns in the data but on the processes the data represents.
- The goal of process mining is to find information about the business processes
- The input to process mining analysis is event logs, audit trails, and timestamped data and events recorded in the IT systems.
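The difference in input and perspective listed above can be made concrete with a small sketch. It assumes toy data invented purely for illustration: a table of shopping baskets for the data mining side, and an event log of (case id, activity, timestamp) records for the process mining side. The function names are hypothetical and do not come from any particular tool. Pair counting and the "directly follows" relation are used here only as simple stand-ins for the much richer algorithms found in real products.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data, invented for illustration only.
# Data mining input: a table, here one row (set of products) per basket.
BASKETS = [
    {"bread", "italian cheese", "wine"},
    {"bread", "milk"},
    {"italian cheese", "wine"},
    {"bread", "italian cheese", "wine"},
]

# Process mining input: an event log of (case id, activity, timestamp).
EVENT_LOG = [
    ("order-1", "receive order", 1),
    ("order-2", "receive order", 2),
    ("order-1", "check credit", 3),
    ("order-1", "ship goods", 5),
    ("order-2", "ship goods", 6),
    ("order-3", "receive order", 7),
    ("order-3", "check credit", 8),
    ("order-3", "ship goods", 9),
]


def co_occurrence_counts(baskets):
    """Data mining view: count how often two products share a basket."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def directly_follows_counts(event_log):
    """Process mining view: count how often one activity is directly
    followed by another within the same case."""
    traces = defaultdict(list)
    # Order events by timestamp, then group them into one trace per case.
    for case_id, activity, _timestamp in sorted(event_log, key=lambda e: e[2]):
        traces[case_id].append(activity)
    counts = Counter()
    for activities in traces.values():
        for current, following in zip(activities, activities[1:]):
            counts[(current, following)] += 1
    return counts


if __name__ == "__main__":
    print(co_occurrence_counts(BASKETS).most_common())
    print(directly_follows_counts(EVENT_LOG).most_common())
```

In this toy data the first function surfaces a pattern in the data (Italian cheese and wine are often bought together), while the second surfaces a property of the process: one order went straight from "receive order" to "ship goods", skipping the credit check.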
Process mining: bridging data mining, big data and business process management
Process mining is the "missing link" between data mining and traditional BPM (Business Process Management). Data mining provides valuable insights through analysis of data, but is generally not concerned with processes. This is where process mining comes into the picture: it offers the same kind of benefits as data mining when working with processes and process improvements.
Process mapping can be done with mining techniques instead of brown-paper workshops and interviews. And process performance analysis can be done on existing data using mining techniques, without first collecting data through work studies.
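As a rough illustration of performance analysis on data that already exists, the sketch below computes the throughput time per case and the average waiting time between consecutive activities. The event log, the timestamps (in hours) and the function names are all hypothetical, and a real analysis would read the events from the IT systems rather than from a hard-coded list.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical event log: (case id, activity, timestamp in hours).
EVENT_LOG = [
    ("case-1", "register claim", 0),
    ("case-1", "assess claim", 4),
    ("case-1", "pay claim", 30),
    ("case-2", "register claim", 1),
    ("case-2", "assess claim", 3),
    ("case-2", "pay claim", 27),
]


def build_traces(event_log):
    """Group events per case, ordered by timestamp."""
    traces = defaultdict(list)
    for case_id, activity, timestamp in sorted(event_log, key=lambda e: e[2]):
        traces[case_id].append((activity, timestamp))
    return traces


def throughput_times(event_log):
    """Time from the first to the last event of each case."""
    return {
        case_id: events[-1][1] - events[0][1]
        for case_id, events in build_traces(event_log).items()
    }


def mean_waiting_times(event_log):
    """Average elapsed time between each pair of consecutive activities."""
    waits = defaultdict(list)
    for events in build_traces(event_log).values():
        for (current, t_current), (following, t_following) in zip(events, events[1:]):
            waits[(current, following)].append(t_following - t_current)
    return {step: mean(durations) for step, durations in waits.items()}


if __name__ == "__main__":
    print(throughput_times(EVENT_LOG))
    waits = mean_waiting_times(EVENT_LOG)
    # The step with the longest average wait is a bottleneck candidate.
    print(max(waits, key=waits.get))
```

With this toy log, the step from "assess claim" to "pay claim" has by far the longest average wait, which is the kind of bottleneck a work study would otherwise have to uncover by observation.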