New Algorithm Can Streamline Data Restoration Of Event Logs

The high accuracy of the new data restoration algorithm could make it useful not only in current enterprises but also in future AI applications

Process models and optimisation processes rely on the quality of data. Missing data can lead to models that generate an incorrect analysis. In a new study, researchers from Pusan National University, South Korea, have developed an improved algorithm that uses correlations between existing information to restore missing data in an event log with a high degree of accuracy.

Digitalisation has enabled businesses to record their operations in event logs, where each activity in a business process is recorded as data with attributes such as a timestamp and an event name. These logs are helpful because they give an overview of the operations and can be used to develop process models that optimise the business process. However, the optimisation is only as good as the data stored, and event logs with missing events lead to poor analysis and poor models.

In a collaborative study, researchers from Pusan National University, South Korea, including Dr. Sunghyun Sim and Prof. Hyerim Bae, along with Prof. Ling Liu from the Georgia Institute of Technology, have developed a method that can restore missing data in an event log. The study, published in IEEE Transactions on Services Computing, describes imputation methods that exploit correlations between available data to recover missing information. “Since data is collected from multiple perspectives in numerous information systems, there is a relationship between the collected data. Starting with this point, our study suggested a method of restoring missing event values by utilising the relationship among entities in the event log, which can overcome human error or system,” said Dr. Sim.

In event logs, events have attributes that are linked to other events in “single event” or “multiple event” relationships. In the former case, each attribute of an event corresponds to a unique attribute in another event. Based on this relationship, the researchers developed a systematic event imputation (SEI) method that restores a missing value by simply referring to the available value it is linked to.
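To make the single-event case concrete, here is a minimal Python sketch of restoring a value through a one-to-one link. It is an illustration under stated assumptions, not the authors' SEI implementation: the toy log, the attribute names `activity` and `system_code`, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy event log; attribute names are illustrative only.
events = [
    {"case": "c1", "activity": "Receive order", "system_code": "A01"},
    {"case": "c1", "activity": "Ship goods", "system_code": "B07"},
    {"case": "c2", "activity": "Receive order", "system_code": None},
    {"case": "c2", "activity": None, "system_code": "B07"},
]

def build_one_to_one(events, src, dst):
    """Map src values to dst values, keeping only unambiguous pairs."""
    seen = defaultdict(set)
    for e in events:
        if e[src] is not None and e[dst] is not None:
            seen[e[src]].add(e[dst])
    # A src value linked to more than one dst value is not a single-event link.
    return {k: next(iter(v)) for k, v in seen.items() if len(v) == 1}

def impute_linked(events, src, dst):
    """Fill missing dst values by looking up the linked src value."""
    mapping = build_one_to_one(events, src, dst)
    repaired = []
    for e in events:
        e = dict(e)  # copy so the original log is left untouched
        if e[dst] is None and e[src] in mapping:
            e[dst] = mapping[e[src]]
        repaired.append(e)
    return repaired

# Repair in both directions of the link.
repaired = impute_linked(events, "activity", "system_code")
repaired = impute_linked(repaired, "system_code", "activity")
```

Values that map to more than one counterpart are deliberately left unfilled here, since a plain lookup cannot resolve them.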

However, in the latter case where attributes have multiple correspondences, a simple matching of attributes is not possible. For such situations, a multiple event imputation (MEI) method was developed where missing events are first estimated and used to create event sequences or event chains. These sequences can be compared with an event log without missing data to restore the missing event attributes.
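The idea of comparing candidate event chains against complete data can be sketched roughly as follows. This is a simplified stand-in rather than the published MEI procedure: it assumes missing events are isolated (their neighbours are known), scores each candidate by how often the resulting three-event chain occurs in complete traces, and uses hypothetical trace data and function names.

```python
from collections import Counter

# Hypothetical complete traces (sequences of activity names).
complete_traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
]

def chain_counts(traces, n=3):
    """Count every run of n consecutive events seen in the complete traces."""
    counts = Counter()
    for t in traces:
        padded = ["<start>"] + t + ["<end>"]
        for i in range(len(padded) - n + 1):
            counts[tuple(padded[i:i + n])] += 1
    return counts

def impute_by_chain(trace, traces):
    """Replace each None with the candidate whose 3-event chain is most frequent."""
    counts = chain_counts(traces)
    candidates = sorted({a for t in traces for a in t})
    padded = ["<start>"] + list(trace) + ["<end>"]
    for i, ev in enumerate(padded):
        if ev is None:
            def score(c):
                return counts[(padded[i - 1], c, padded[i + 1])]
            best = max(candidates, key=score)
            # Leave the gap open when no complete trace supports any candidate.
            if score(best) > 0:
                padded[i] = best
    return padded[1:-1]

print(impute_by_chain(["register", None, "approve", "pay"], complete_traces))
print(impute_by_chain(["register", "check", None], complete_traces))
```

A fuller treatment would also need to handle runs of consecutive missing events, which this sketch leaves unfilled.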

These imputation methods were applied simultaneously by a bagging recurrent event imputation (BREI) algorithm, which uses bootstrap sampling and recurrent event imputation (REI) to repair the event log. In tests with real-world event logs, the researchers found that their algorithm improved restoration accuracy by 10–30 per cent compared to existing restoration algorithms. Moreover, it restored data with almost 90 per cent accuracy even when more than half of the log was missing.
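The bagging step, meaning bootstrap sampling followed by a vote, can be illustrated with a small Python sketch. It is not the BREI algorithm itself: the base imputer here is a deliberately simple "most frequent successor" rule chosen for brevity, the recurrent (REI) part is omitted, and the traces, function names and parameters (`n_models`, `seed`) are assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical complete traces; the 8:2 split is illustrative only.
complete_traces = (
    [["register", "check", "approve"]] * 8
    + [["register", "check", "reject"]] * 2
)

def fit_successors(traces):
    """Base imputer: most frequent next event for each event in the sample."""
    follow = {}
    for t in traces:
        for prev, nxt in zip(t, t[1:]):
            follow.setdefault(prev, Counter())[nxt] += 1
    return {prev: c.most_common(1)[0][0] for prev, c in follow.items()}

def bagged_impute(prev_event, traces, n_models=25, seed=0):
    """Vote over imputers fitted on bootstrap samples (drawn with replacement)."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        sample = rng.choices(traces, k=len(traces))  # one bootstrap sample
        guess = fit_successors(sample).get(prev_event)
        if guess is not None:
            votes[guess] += 1
    return votes.most_common(1)[0][0] if votes else None

# Impute the event that follows a known event.
print(bagged_impute("register", complete_traces))
print(bagged_impute("check", complete_traces))
```

Voting across resampled models is what makes the estimate less sensitive to any single unrepresentative sample; the base imputer could be swapped for a stronger one without changing the bagging loop.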

Apart from optimising business processes, the researchers are optimistic that such an algorithm can be extended to other applications that rely on the quality of data. One promising avenue lies in improving the data fed to AI systems, where the method has the potential to accelerate the development of AI technologies. “It is possible to improve the performance of artificial intelligence by improving the quality of data in its learning process. The algorithm will also help prevent model malfunction by improving the quality of data it collects in real-time in a real-time environment,” said Prof. Bae.

The high accuracy of the new algorithm, together with its versatility, makes its widespread application in industry likely in the near future.