Data about data: Mining meta information from SAP®
In this blog post, I will discuss ways in which you can use data from SAP® systems to obtain information about your data. Although the article is fairly technical because it deals with meta data, I will use brief examples to illustrate specific issues. To begin with, I will describe the starting point, before drawing on selected examples to explain how the Data Dictionary can be used, and then ending with a brief conclusion.
Many questions which we would like to answer with the help of data analytics are connected with day-to-day work. These can, for example, concern business aspects (“What were our sales for each commodity group?”), questions about the internal control system (“Were manual payments properly authorised?”), about data quality (“Are there duplicates for supplier or customer master data?”), or about fraud detection or prevention (“Have there been unusual/frequent changes of bank account details for vendors?”).
To answer these questions using data analytics, we generally use a combination of analysis methods on the data, such as a cleansing of the bank details field, followed by a compression based on vendor number and bank account details, with subsequent counting of the number of different vendors for each identical bank account.
For this, we need to be familiar with the database. Which tables exist, and which contain the relevant information for the current analysis? Which are the relevant key fields to link the tables to each other, and which other fields contain the information which needs to be considered in the analysis?
Let’s use the example of changes to supplier bank account details:
Among other things, the analysis requires the name of the supplier and information such as supplier number and address, as well as various deletion and block indicators. These can be found in Table LFA1 in SAP®. The bank account information can also be useful. This is stored in the table LFBK, and relevant IBAN bank account details in the table TIBAN. If you want to identify changes, these are localized in the change history tables CDHDR and CDPOS, along with a wealth of other change documents. However, this level of detail is still not enough – at field level, you need to ascertain the names of the relevant data fields, and also (where applicable) information on the data types. The classical supplier number, as stored in LFA1 and LFBK, has the name LIFNR, for example, and in terms of characteristics can contain alphanumeric values and special characters (virtually a text field). The field LOEVM shows whether a supplier is globally blocked. Change date and change time are stored in the fields UDATE and UTIME of the change history tables.
In summary, therefore, the questions are often as follows:
- Which tables are there, and what information do they contain?
- Which individual fields does each table contain?
- Which data types and other technical attributes do these fields have?
The Data Dictionary as a starting point
It is helpful that SAP® also stores large quantities of meta information in table form, in the Data Dictionary. Many of these tables start appropriately with the letters DD. After explaining some basics (“T for texts, not T for technology”), I will provide examples of some tables (DD02T, DD07T, DD03L) as they can be very useful in the context of data analytics and the questions outlined above. As always, many of the data follow the system described in the text, although as is often the case with SAP®, there may be some exceptions.
Basics: T for texts, not T for technology
Tables which deal with a certain theme often occur in pairs. Roughly speaking, one of the tables contains the technical settings, and in order to avoid redundancy, the second table contains the (language-dependent) texts and descriptions. If, for example, we consider document types in accounting, table T003 stores the document types together with technical information, such as which account types are permitted, or which number ranges are assigned, etc. In contrast, table T003T contains the texts for the document types dependent on the language key.
This system is also often used for the tables in the Data Dictionary: it has tables with technical settings (e.g. DD02L, with the name “SAP® Tables”) and counterparts which end in the letter T, which contain the texts (e.g. DD02T, “R/3-DD: Texts for SAP tables”).
Example DD02T - Texts for SAP tables
One of our most frequently used tables is DD02T. It lists the tables which are contained in the current SAP® system, including the (where applicable specific) Z tables. We often use DD02T to determine the significance of the tables.
The following screenshot shows an extract from the table DD02T filtered using “LF” in order to show the most important tables in terms of suppliers. We also filtered on the basis of short table names, as our experience shows that these are often more relevant and closer to the core processes. The screenshot was created using the analytics software ACL™ Analytics for Windows; data extraction was performed using the extraction software “dab:Exporter”.
Example DD03L - Table fields
At a more detailed information level, Table DD03L gives a list of table fields. In addition to simply providing a list of fields for each table, this also gives information on a field’s position within a table, the field's data type, whether it is a key field, its field length, and the field domains to which it relates.
The following screenshot again shows the ACL™ analytics software with DD03L filtered using the table change document headers CDHDR:
Example DD07T - DD: texts for domain fixed values
This is a useful table if, as indicated by its name, you are interested in texts for the domain fixed values. In terms of content, not all values of interest within the context of data analytics are, as in the case of the document types described above, transferred into separate tables for each attribute via foreign key relationships. SAP® contains a wealth of information, such as block or delete keys, transaction or document types, which can be found bundled in this table, probably on account of their purely descriptive nature. DD07T thus contains texts for such attributes on a cross-module basis, depending on the language key (also contained in the table).
I hope that despite this article’s rather technical nature, you will nevertheless find it useful. Utilising this type of meta information can be very influential and help in grasping new topics and issues.