22.07.2015
Anton Grening
Author: Anton Grening

ACL An­a­lyt­ics™ – Re­la­tions: Dif­fer­ent types with its ad­van­tag­es and pit­falls

The last blogpost within this ACL Analytics™ series is going to get its teeth into relations between tables and wants to take up the issue described in our last entry (Relations: Simple, indirect and through combined text field). First, we will learn about different types of relations between tables and see what kind of questions can be solved very simple by using them. I am going to show you as well, what errors could happen while working with relations and how to avoid them best.

 

As you already know, ACL™ is perfectly suited for analyzing several tables combined. Therefor so-called cardinalities used in data modeling are playing an essential role. Cardinalities describe the relation between two tables; to be more exact the relation between the records in two tables (Primary and Secondary table). The records can be identified by a unique identifier, we usually talk about (one or more) key field(s). This may sound very technical, but is basically very simple, if we illustrate three basic types with simple examples:

 

 

1:1

Each record of the first table can only be assigned to one record of the second table and vice versa.

Example (Vendor master list / Vendor evaluation data): Each vendor who is listed in the first table has only one vendor evaluation entry. Vice versa, one vendor evaluation belongs to exactly one vendor.

1:n

One record of the first table can be assigned to multiple records in the second table; however a record of the second table is only assignable to one record in the first table.

Example (Vendor master list / Vendor purchase orders): A purchasing document belongs to exactly one vendor (=the vendor the goods were ordered from). But this vendor can be assigned to several purchasing documents (maybe there were 10 purchase orders in 2015 for that vendor already).

n:m

Records in the first table can be assigned to multiple records in the second table; but one record in the second table could be assigned to multiple records in the first table as well.

Example (List of materials we need for production / List of all available materials per vendor from our procurement department): One material can be obtained from not only one but a couple of vendors; and one vendor delivers several materials.

 

 

In most cases, you have to deal with 1:n relations like in our example in the previous blogpost. There we used a relation to add additional information from other tables related through a key field (e.g. extend purchasing documents by the vendor name).

In ACL™, when creating a relation, it is important to consider from which table you start building a relation. In this context, I already gave you a rule of thumb: Join your tables always starting from transaction data. That means you need to find the table with multiple records for each key field and connect it to the table with the one record per key field.

In our example this would be:

 

Cardinality1:n
Data Categorymaster data:transaction data
Table Contentvendor:purchasing document(s)
Common Key Field(s)VendorID:Vendor_Account_Number
Key Field Status

Unique

(each vendor occurs in the list only 1 time)

:

Up to n times

(a vendor can have no purchase orders, 1 purchase order or many purchase orders)

 

Now this was easy, wasn’t it? However in some cases you may not know by heart whether the tables contain master data or transaction data; or maybe it will not be possible to classify the tables by this categories at all. Let’s have a look at a second example, looking at the tables “PurchasingDocument_Item” and “PurchasingDocument_Header”. If you want to combine those two tables (because maybe the payment terms are in the header table, however the quantity and price are stored on line item level), you need to find out which table will be primary (this is the one you will start with the relation), and which one will be secondary.

A tip on how you can check the key field status of a table easily before creating a relation - Use the ACL™ command for duplicates search:

Figure 1: ACL™ duplicates search for checking key fields

ACL™ will tell you if duplicates are detected. In our example, the header table does not contain any duplicates based on the PO number; the line item table however does. This means, the table with the duplicates (PurchasingDocument_Item) will be the primary table you can use to define relations with other tables, where the common key field (Document_Number) appears only once – that means no duplicates - in the secondary table.
In the previous blogpost of this series, we did it right and started with the purchasing document items table to add information from the header table (or to be more precise, used this table to get connected to a further table – indirect relation).
But what would happen if doing a relation in the wrong way? Let’s start with the PurchasingDocument_Header and see what happens:

Figure 2: Relate purchasing document header table with items table wrong

In this example, I started with the PurchasingDocument_Header table as primary table, and used the key field Document_Number in the header table to connect the PurchasingDocument_Item table. If you then want to add additional fields (e.g. plant and net price) in order to sum or classify them, you need to know that your results might be wrong, because you always just get the first entry from the items table for each key field in the header table. All the other item entries will be lost. Let’s suppose, the header table contains 10 purchase orders and the item table 30 purchase order line items. The amount of the entries in the header table won’t change after creating the relation. That means, it will still be 10 orders, which will just be extended by the additional information from the items table. But if there is more than one line item in the second table, ACL™ needs to make a decision which information will be added. It acts pretty straight forward, adding always the content of the first line it finds from the items table. Therefore 20 items from 30 order lines would not be included into the analysis. By comparing the two tables it’s easy to spot the problem:

Figure 3: Comparison of the two tables

So obviously it is crucial what table to start with! For not losing any lines always start with the table which has multiple records for the common key field (the n table in our 1:n cardinality). In the abstract example above this means adding information to the 30 order lines from the table with the header information. Orders containing several line items would get this addition redundantly but in this way no information would get lost.

So we can use Relations in ACL™ to combine tables, for example adding information from a second table to the primary table. However they also can be used to answer questions about missing entries: Are there any records (key fields entries) in table A that do not appear in table B? On a concrete example this could be the issue about vendors not having any purchase document at all. (Maybe you want to clean your data, and get rid of all unused vendor master records, setting them to deleted.) To answer this you need to act against our rule of thumb and assign the relation in the master data table linked to the transaction data table, so from the vendor table to the purchase document table. It looks like a workaround, but it is a valid data analytic approach for identifying entries which are in the primary table, but not in the secondary table.

Figure 4: Relate vendor master data table with table PurchasingDocument_Header

But building the relation is not enough to answer that question, as – we discussed it in the first example – the number of records in the primary table will not change, no matter if or how many matching records are in the secondary table. To be able to tell, we have to add the key field Document_Number from the table PurchasingDocument_Header. This will help us in the first place identifying lines that do have a connection / corresponding entry. (At least one purchasing document for this vendor) Now if we invert this logic, this also can be used to figure out which vendors does not have any orders: If the vendor does not have any entry in the purchase documents table (purchase orders) at all, then the added field in the vendor master data table will stay blank. As a rule, the key field used for building the relation is the most appropriate field for this procedure, because it must be ensured that the field is always filled in the linked table if a record exists. Otherwise our check on empty field content might not be reliable enough. Relations are a powerful tool in ACL Analytics™ and can be used in many ways, but you have to approach this very deliberately. Errors may not be as obvious as in other approaches and may be distorting results. But if considering the mentioned tips and examples like the rule of thumb (Join your tables always starting from transaction data/your n-table) or always double-check your key fields with the duplicates search command, relations are powerful and can be used safely to connect tables and analyze their field together.

 

This concludes my first ACL™ Analytics blog series and I hope I could give you some help working with ACL™. If you have questions or comments, feel free to contact us at info@dab-gmbh.de


Comments (0)
Be the first who comments this blog entry.
Blog login

You are not logged in. Please log in to comment this blog entry.

go to Login