06.04.2020

Machine learning and artificial intelligence – pure hype (?) part 2

I hope you enjoyed the first article in our series about machine learning. If you missed it, no problem: you can read it here.

The blog series covers the following topics. This second part focuses on sections 2 and 3:

  1. Introduction: From global galactic issues to audit, ICS, data analytics and risk management
  2. Building bridges: why initial disappointment may set in – from your data to machine learning algorithms
  3. Content is king (here too): Creativity comes later – only actual use will create added value.
  4. Three examples of use: a – Old wine in new bottles; b – A new approach; c – Doing things differently
  5. Summary

 

2 Building bridges: why initial disappointment may set in

Machine learning algorithms are not entirely new, but the topicality of the subject, as mentioned in the introduction, means they are currently attracting huge interest. Many software manufacturers and IT companies are responding with corresponding products. Amazon, for example, offers relevant tools, services and suitable algorithms through AWS (Amazon Web Services).

We, too, are working intensively with the ACL Robotics solution from Galvanize. It recently added the K-MEANS algorithm as a clustering method (in addition to the train and predict commands, which themselves use a set of algorithms). Our colleague Moritz has also written a series of articles on this topic with an integrated workshop; you will find part 1 here.

But why did I write “why initial disappointment may set in” in the heading of this section? Let’s look at an example. After the K-MEANS algorithm was introduced, customers contacted us who wanted to start analysing straight away. They wanted to know whether they could use it to cluster their business partner master data, the text-based descriptions of their audit findings or their accounting transactions, and what the result would look like. Disillusionment set in when they realised that the K-MEANS algorithm requires numeric attributes as its input.

A transformation process is therefore needed in order to be more flexible with regard to the underlying data. As a first necessary step, you need to convert the data, and how you do so depends on what you intend to analyse and on the selected algorithm. We can assist you with such transformations: the analysis is then no longer based on tables and fields, but on “derivatives”. These contain the information held in selected fields, but in a form that ML algorithms can use directly. This information is thus almost totally detached from the semantics of the original data and is optimised for use by specific algorithms.
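To make the idea of a "derivative" more tangible, here is a minimal sketch in Python of one possible transformation. It is not the method used in ACL Robotics or in our content. The field names (country, payment_terms_days, has_tax_id) and the helper build_derivative are purely illustrative assumptions. Categorical values are one-hot encoded, numeric values are scaled to the range 0 to 1, and yes/no flags become 1.0 or 0.0.

```python
# Hypothetical vendor master records; the field names are illustrative only.
records = [
    {"country": "DE", "payment_terms_days": 30, "has_tax_id": True},
    {"country": "FR", "payment_terms_days": 60, "has_tax_id": False},
    {"country": "DE", "payment_terms_days": 0, "has_tax_id": True},
]


def build_derivative(rows, categorical, numeric, boolean):
    """Return (column_names, vectors) where every value is a float in [0, 1]."""
    # One-hot encoding: one output column per distinct category value.
    categories = {f: sorted({r[f] for r in rows}) for f in categorical}
    # Min-max scaling needs the smallest and largest value of each numeric field.
    bounds = {f: (min(r[f] for r in rows), max(r[f] for r in rows)) for f in numeric}

    columns = [f"{f}={v}" for f in categorical for v in categories[f]]
    columns += list(numeric) + list(boolean)

    vectors = []
    for r in rows:
        vec = [1.0 if r[f] == v else 0.0 for f in categorical for v in categories[f]]
        for f in numeric:
            lo, hi = bounds[f]
            # A constant field carries no information, so it is mapped to 0.0.
            vec.append((r[f] - lo) / (hi - lo) if hi > lo else 0.0)
        vec += [1.0 if r[f] else 0.0 for f in boolean]
        vectors.append(vec)
    return columns, vectors


columns, vectors = build_derivative(
    records,
    categorical=["country"],
    numeric=["payment_terms_days"],
    boolean=["has_tax_id"],
)
for name, column in zip(columns, zip(*vectors)):
    print(name, column)
```

The resulting vectors no longer "look like" vendor records, which is exactly the detachment from the original semantics described above. Free-text fields such as audit findings would need a separate text-processing step that this sketch does not cover.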

In summary, you need to be aware that you may not be able to use your data directly in machine learning algorithms. If you have the right know-how, you can perform the necessary transformation yourself, or you can use our pre-configured content. As well as being optimised from a technical perspective, the content has also been tailored and optimised to specific applications.

In other words, we create these datasets (derivatives) for areas such as vendor master records, purchasing transactions, postings in financial accounting or material movements. During this process we include those attributes in the transformation which we believe are relevant for subsequent analyses, and we use suitable methods to transform them so that the resulting dataset is “ML-ready”.

 

3 Content is king – only actual use will create added value

In the previous section, I explained the relevance of transformation. Some algorithms require the data to be preprocessed before they can be analysed using ML methods. As discussed, we take care to ensure that the content of the data is also optimised during preprocessing. We include the data that, in our experience, are semantically relevant; we process and clean them where necessary and then transform them into a corresponding derivative, which can then be used directly in clustering algorithms such as K-MEANS.
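For readers who want to see why numeric input matters, the following is a minimal sketch of plain k-means (Lloyd's algorithm) in Python. It is an illustration only and makes no claim about how the K-MEANS command in ACL Robotics is implemented. The function k_means, the sample values and the choice of the first k points as starting centroids are assumptions made to keep the example short and repeatable.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def k_means(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical "ML-ready" derivative: two scaled numeric attributes per record.
derivative = [
    [0.10, 0.20], [0.15, 0.25], [0.12, 0.18],
    [0.90, 0.80], [0.85, 0.95], [0.88, 0.90],
]
labels, centroids = k_means(derivative, k=2)
print(labels)
print(centroids)
```

Both steps rely on distances and averages, which is why every attribute has to be numeric before the algorithm can work with it.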

Viewed positively, there are almost no limits to creativity here when it comes to further processing. From a critical perspective, however, one could question the specific added value gained from it.

Obviously, it is fascinating to obtain a completely new view of your own data. Once the data are attractively visualised, you might notice unexpected patterns that look really exciting.

But the next question is: what do you do with the results? One of the themes of last year’s Gartner Data Analytic Summit was, for example, “from insight to action”, in other words deriving and implementing specific actions from the results of data analyses. This is hard if the patterns (e.g. the identified clusters) are difficult to interpret. A seemingly exciting image of clusters is created, but no one knows exactly what the clusters mean, let alone which specific actions can be derived from them.
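One simple way to make clusters easier to interpret is to translate them back into the original attributes, for instance by comparing each cluster's average with the overall average. The sketch below assumes a hypothetical result set in which each row carries a cluster label plus two illustrative attributes (invoice_amount, days_to_payment). The helper profile_clusters is likewise an assumption and not part of any product.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical clustering result: original attributes plus a cluster label.
rows = [
    {"cluster": 0, "invoice_amount": 120.0, "days_to_payment": 28},
    {"cluster": 0, "invoice_amount": 150.0, "days_to_payment": 32},
    {"cluster": 1, "invoice_amount": 9800.0, "days_to_payment": 3},
    {"cluster": 1, "invoice_amount": 10200.0, "days_to_payment": 5},
]


def profile_clusters(rows, attributes):
    """Summarise each cluster by size, mean and difference to the overall mean."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["cluster"]].append(r)
    overall = {a: mean(r[a] for r in rows) for a in attributes}

    profile = {}
    for label, members in sorted(groups.items()):
        summary = {"size": len(members)}
        for a in attributes:
            cluster_mean = mean(m[a] for m in members)
            summary[a] = {"mean": cluster_mean, "diff_to_overall": cluster_mean - overall[a]}
        profile[label] = summary
    return profile


profile = profile_clusters(rows, ["invoice_amount", "days_to_payment"])
for label, summary in profile.items():
    print(label, summary)
```

In this made-up data, one cluster contains large invoices that are paid unusually quickly. A description like that can be handed to a process owner, which a picture of coloured dots alone cannot.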

 

Tangible added value is best generated by having a clear idea of your objective:

  • We want to analyse whether two shared service centres which map the same processes also exhibit homogeneous accounting behaviour. Comparing the data from these two shared service centres may also identify anomalies which could not be seen as such when observing one centre alone.
  • We want to be able to measure the quality of master data as an indicator and thus make it possible to compare different datasets.
  • Customers should be clustered in terms of processes, not simply within the context of a traditional ABC analysis. This can create an understanding of different groups of customers, making it possible to consider adapting your own processes to different customer groups.

These are examples of specific applications and resulting actions that you can trigger using ML algorithms such as the K-MEANS cluster algorithm.
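To illustrate what a clear objective can look like in practice, take the master data quality indicator from the list above. The sketch below uses a deliberately simple measure, the share of required fields that are populated, as a stand-in. It is one possible indicator, not the one used in our content. The datasets, field names and the function completeness are hypothetical.

```python
def completeness(records, required_fields):
    """Share of required fields that are populated, between 0.0 and 1.0."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for r in records
        for f in required_fields
        if r.get(f) not in (None, "")  # missing, None and empty string count as unfilled
    )
    return filled / total


required = ["name", "tax_id", "bank_account"]

# Two hypothetical vendor master datasets, e.g. from two company codes.
dataset_a = [
    {"name": "Vendor 1", "tax_id": "T-001", "bank_account": "B-001"},
    {"name": "Vendor 2", "tax_id": "", "bank_account": "B-002"},
]
dataset_b = [
    {"name": "Vendor 3", "tax_id": None, "bank_account": ""},
    {"name": "Vendor 4", "tax_id": "T-004"},
]

score_a = completeness(dataset_a, required)
score_b = completeness(dataset_b, required)
print(f"Dataset A: {score_a:.2f}, dataset B: {score_b:.2f}")
```

Because the result is a single number per dataset, different datasets become directly comparable, and a low score points to a concrete action: cleaning up the fields that are missing.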

 

Do not miss the last part of our blog series! In the next article we will show you the benefits of machine learning using three different examples.

