10.03.2014

CCM (Part 3) - 5 things you should know - and 2 things you bet­ter not for­get - when build­ing a CCM da­ta an­a­ly­tic en­vi­ron­ment

This post is the third of a multi-part article which is titled “5 things you should know - and 2 things you better not forget - when building a data analytic environment”. It is about data analytics in general, but with a focus on building bigger solutions like a CCM (Continuous Controls Monitoring Environment).

The full article contains of several parts. The first five parts describe attention points that you should be aware of:

  1. Know what you want to know (aka list analytic questions)
  2. Know your systems (aka identify data sources)
  3. Know your data (aka examine data structures)
  4. Know your Analytic Tools (aka understand technical possibilities and limits)
  5. Know your customers (aka think about who receives the results and what they will do with it)

Part six and seven are about two things that you probably already know, but we emphasize their importance in the context of data analytics:

  1. There is a price tag to it (aka budget)
  2. You need someone who can do it (aka resources)

Part 1 & 2 have already been covered (they are linked above, so you can go back anytime there). This is now part 3, “Know your data”:

Five things you should know, continued (3/5)

We do continue with the third aspect out of the five things that I see as “essential”. The first step, labelled “Know what you want to know”, was about a getting proper business understanding and stating as-detailed-as-possible questions that can actually be answered. Only by knowing your business well you can make a good list of analytic questions. The area we were interested in was “Compliance”, the topic “Payments to critical countries” and the detail questions that were raised we stated as follows:

  • “Identify payment transactions that have been done in context of partners like vendors, or customers, or even where an expense account is directly debited.
  • Create a list of these transactions and identify the payee’s bank country for those payments.
  • Match these detail transactions versus a list of countries that are seen as critical because they are on the embargo list, known as tax haven or having a Corruption Perception Index > x.”

In step two we talked about the importance to “Know your systems”. We found out that our payment data is on an SAP® system “SY2”, running SAP® ERP ECC 6.0, and containing the financial transactions for all companies worldwide.

Picture 1 – Identifying the correct system

Based on that we found out about the data ownership, and then followed the process of requesting a user with appropriate read-only rights to access the data.

But which data do we need to access, actually? This will be part of this article “Know your data”.

Aspect 3: Know your data

Knowing your data requires business process and IT know-how. If you remember this picture from the last article, we talked about the three tiers of an application, and pointed out, that the data gets stored in a database.

Picture 2 – Three-tier-Application

That database contains a set of tables where the data is stored. A table contains information structured in columns (or fields), and rows (or lines). For example, the vendor addresses in a table could look like this:

Picture 3 – Structured data in a table

There are three lines in this table. Each line consists of several fields or columns, which help to structure the data. Like stated, in one database there are probably a lot of tables. On an SAP® system like SAP® ERP ECC 6.0, there will be ten thousands of such tables. So which data – as concrete as possible – do we need?

Again we go top down, asking two questions:

  1. In which tables is the information stored we need for our analytic questions?
  2. Which concrete fields do we (have to) use?

Picture 4 – From database over tables to fields

Identifying the tables can be a pretty time consuming task, as it is usually not self-evident. It helps to understand, that often certain process elements or process steps can be mapped to a certain table. The picture below shows some SAP® tables which contain data of the financial module SAP®-FI. Sometimes data is stored redundantly, so there may be more than one table where the same information can be found.

Picture 5 – P2P Process mapped to tables

If there is an open vendor item in SAP®-FI, it can be found in SAP®-FI GL (tables BKPF and BSEG) and parallel SAP®-FI AP (table BSIK). A payment which is performed by the Automated Payment Run is recorded in the tables REGUH and REGUP; the cleared item and the payment as well go to BKPF/BSEG of SAP®-FI and BSAK of SAP®-AP.

I don’t want to overwhelm you with table names, but the message here is that out of thousands of data base tables the relevant ones need to be picked. From a technical point of view, there are many ways to find out about the underlying tables in a database. One might be, to pick up a good book describing it. Let me mention here, that we wrote a book about it two years ago, where you can find that information. It can be found here if you are interested, but not available in English (yet). If you want to travel the online road, and look table information up on the internet, there is for example www.erpgenie.com which offers table information for some SAP® elements. Also, there are ways within SAP® to identify some table names. I cannot go into detail here, as it would exceed this blog post by far. Below there is a screenshot which shows you that the vendors bank account table LFBK for example was identified by pressing F1 in a certain SAP® GUI field of transaction XK03 (Display Vendor), and going to the “Technical Information” there.

Picture 6 – “Technical Information” screen from SAP® ERP ECC 6.0

So there are there are several ways how the correct table names can be found. Basically it is a mix of techniques and Know-How that comes to that.

What will be defined as “relevant” depends on the analytic task or tasks which need to be performed. If we analyze the processes Purchase-to-Pay and Order-to-Cash fully, we need a set of around 200 tables for example. If we are only looking at Duplicate Payments, 8-10 tables may be enough.

Also you should be aware, that maybe not every aspect of the process is stored in a table. Take payments for example, which have been done manually, triggered outside the system, not making use of the Automated Payment Run in SAP®. One of the biggest risks there is, amongst the question, how it was authorized, is, where the money went to. Bank data of the payee is not recorded for those transactions, which makes the manual stuff causing a lot of fogginess and intransparency.

For our example “Payments to critical countries”, I will take the most obvious tables for a quick approach. They will be good enough to transport the message. Of course, going more into detail is possible; it just requires more data.

 

 

Element / Process step
Table
Comment
Vendor masta dataLFA1, LFBKLFA1 contains the global vendor master data; LFBK the bank accounts for vendors.
Customer master dataKNA1, KNBKKNA1 contains the global customer masta data; KNBK the bank accounts for customers.
Payment run transactionsREGUHIn REGUH you can find the payees bank information, like bank account, bank key and bank country. This is recorded on level of each payment transaction that was done to such accounts.
Manual Payment Transactions? Maybe you have tables in place, which contain the bank statements. There might be a chance to pull that information from, even for manual payment transactions.

Picture 7 – Examine data structure: Table level

 

 

After identifying the tables, we need to discuss, which fields we do want to use for our analytic approach. This requires an understanding of how to answer the analytic question, and what exactly should be analyzed.

We are interested in “Payments to critical countries”. We found out, that in table REGUH we can have a look at the payment transactions and the payees bank information. Now it is worth to mention that the table in fact contains three country fields that could be interesting: One field is the named “LAND1” and contains the country with the residence of the vendor or customer; another one is “ZLAND”, which contains the country where the payee resides – keep in mind vendor or customer and payee can differ. A third one is called ZBNKS, and this actually holds the country key of the bank where the money was sent to. So which one do we want to compare to our list of critical countries, that we made based on CPI (Corruption Perception Index, or an embargo list like HADDEX)? Depending on your decision you would answer the question

 

  1. Which payments were made to vendors who are located in a “critical” country?
  2. Which payments were made to payees who are located in a “critical” country?
  3. Which payments were made to a bank account which is located in a “critical” country, no matter where the vendor or payee is located?

We could that example expand to the master data tables as well, but I guess you get the idea. Just like we stated that we should formulate our analytic questions in step one as detailed as possible, this now needs to be tied together with the data we want to access. Even if we decide to go for a comparison of field “ZBNKS”, we may still be interested in listing the other country fields for FYI reasons.

 

 

Element / Process step
Table
Field(s)
Comment
Vendor master dataLFA1LIFNR Vendor number
Vendor master dataLFA1LANDVendor country of residence
Vendor bank accountsLFBKLIFNRVendor number
Vendor bank accountsLFBKBANKSVendor bank country
Customer master dataKNA1KUNNRCustomer number
Customer master dataKNA1LAND1 Customer country of residence
Customer bank accountsKNBKBANKSCustomer bank country
Customer bank accountsKNBK KUNNRCustomer number
Payment run transactionsREGUHLIFNR Vendor number
Payment run transactionsREGUHKUNNRCustomer number
Payment run transactionsREGUHLAND1Vendor country of residence
Payment run transactionsREGUHZLAND Payees country of residence
Payment run transactionsREGUHZBNKSPayees bank country
.........

Picture 8 – Examine data structure: Field level

 

In addition to identifying the necessary tables and fields it is important to gather know-how about the data model and relationships between those tables you selected. As an example, a customer or vendor could have none, one or several bank accounts; this means that for one vendor or customer number there could be several entries in the tables KNBK and LFBK - or none at all, if payments are done by check only. Relationships between tables in a relational database are created by Primary and Secondary Key field relations. Without going into detail too much it is worth to mention that a basic understanding about such table relations and the data model are important as well to cover a topic thoroughly and belongs to the aspect of “Know your data”.

At the end of step three, there should be an as-complete-as-possible list of tables and fields which we will take for our (continuous) analytic approaches. It will not only contain fields that are absolutely necessary for the analysis, like the country fields. There will be a lot of fields we need to be able to work with the results afterwards, like what vendor or customer number is it, what is the name of the business partner, what is the name of the payee, etc. So that list might get pretty voluptuous, the more you go into detail. Please keep in mind, that because of data privacy aspects, but also because of performance and usability reasons, the principle of „data minimization“ should be considered: Find out what’s necessary, but stay focused when defining the data and avoid a big overhead.

If we take our 5 things as checklist, currently we managed to did the first two steps:

Picture 9 – The progress we made so far

Now that we examined the data structures and defined the data, we are able to start analyzing it. For this we need to understand, what the technical possibilities are of the analytic tools. On the other hand, we also need to know about the limitations. What aspects you should consider there will be discussed in our next article “Understand technical possibilities and limitations” – so see you next week, same time, same place. :-)

I hope you enjoyed this third part of our CCM articles! For any comments or questions, feel free to write us at info@dab-gmbh.de.

To contact the author you can also use LinkedIn or XING (you may have to login first before you can access these links).

 

LinkedIn: http://de.linkedin.com/pub/stefan-wenig/54/1b8/b30

XING: https://www.xing.com/profile/Stefan_Wenig2?sc_o=mxb_p


Comments (0)
Be the first who comments this blog entry.
Blog login

You are not logged in. Please log in to comment this blog entry.

go to Login