Data Privacy – More than just playing by the rules
Data analytics, especially in the context of Big Data, is often directly associated with the term "data privacy". In Germany, for example, an amendment to the Federal Data Protection Act has caused a lot of discussion about its interpretation and handling. Triggered by that development, the German Institute of Internal Auditors (IIR, Institut Interne Revision e.V.), in cooperation with the Society for Data Privacy and Data Security (GDD, Gesellschaft für Datenschutz und Datensicherung e.V.), created a position paper. [http://www.diir.de/fileadmin/fachwissen/downloads/09DIIRDatenanalyseWeb.pdf]
This paper contains some scenarios on how to balance data analytics within a company (especially for Internal Audit) against data privacy protection and data security, especially when personal data of users is involved.
However, data privacy protection is not only necessary in the context of employee data. There are many more settings in which it is useful or even mandatory to protect data from being read in clear text, even when it has nothing to do with employees, for example:
- Data of business partners (customers and/or vendors)
- Material data and parts lists
Before we take a more detailed look, we should explain the technical aspects using an example.
The first table below shows a list of SAP© user names together with the value of each single transaction that was executed by the user. The second table shows the analytic result, which is a consolidation per user, giving us the total count and total value for each employee.
|MARTIN|| 0.10 €|
|MARTIN|| 156,357.65 €|
Transactions in clear text
|User||Total Count||Total value|
|MRIEDL|| 2||10,000.00 €|
|MBAUMGARTNER|| 2||12,573.10 €|
Analytic result in clear text
To protect the employees' data (as required by the Federal Data Protection Act, for example), these user names could be encrypted. Ideally this happens automatically, already during the data extraction from SAP©, with suitable tools (for example the dab:Exporter, or more precisely its add-on dab:PrivacyProtection). The user names are then no longer available in clear text to the person who is analyzing the data. Below you can see the same tables as before, but with the user names encrypted:
|3!ky34JLYX|| 5,000.00 €|
|vAC3klMNO|| 0.10 €|
|3!ky34JLYX|| 5,000.00 €|
|vAC3klMNO|| 156,357.65 €|
|User||Total count||Total value|
|vAC3klMNO||3|| 156,359.74 €|
Analytic result encrypted
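The idea behind the encrypted tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the way dab:PrivacyProtection works internally: it swaps in a keyed hash (HMAC-SHA256) as a deterministic pseudonym, and the key, the function names and the sample rows are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from decimal import Decimal

# Hypothetical key; in practice it would be kept away from the analyst.
SECRET_KEY = b"demo-key-kept-outside-the-analysis-environment"


def pseudonymize(user_name, key=SECRET_KEY):
    """Map a user name to a stable, unreadable token (keyed hash, one-way)."""
    digest = hmac.new(key, user_name.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]  # shortened for readability; keep more digits for large data sets


def consolidate(transactions):
    """Return {user: (total count, total value)} for (user, value) rows."""
    totals = defaultdict(lambda: [0, Decimal("0")])
    for user, value in transactions:
        totals[user][0] += 1
        totals[user][1] += value
    return {user: (count, total) for user, (count, total) in totals.items()}


# Illustrative rows, loosely based on the tables above.
clear_rows = [
    ("MRIEDL", Decimal("5000.00")),
    ("MARTIN", Decimal("0.10")),
    ("MRIEDL", Decimal("5000.00")),
    ("MARTIN", Decimal("156357.65")),
]

# Mask the user column once, at "extraction" time, then analyze as usual.
masked_rows = [(pseudonymize(user), value) for user, value in clear_rows]
clear_result = consolidate(clear_rows)
masked_result = consolidate(masked_rows)
```

Because the same user name always yields the same token, the consolidation per user gives the same counts and totals on the masked rows as on the clear-text rows; only the names themselves are no longer readable.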
This example shows two major advantages of having proper encryption in place:
- Sensitive data such as employee user names is no longer in plain text and does not show up in the raw data.
- It is nevertheless still possible to analyze the data, because each user name is always mapped to the same encrypted value, so the values remain unique per user. Even SoD tests (Segregation of Duties) are still possible: the same person changing a bank account and releasing the associated invoice will still show up as a result in the related analytics.
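To illustrate the second point, here is a minimal sketch of an SoD check that runs purely on encrypted user values. The record layouts (vendor, invoice number, user token) and all sample values are hypothetical; real SAP© tables look different.

```python
from collections import defaultdict


def find_sod_conflicts(bank_changes, invoice_releases):
    """Find invoices released by a user who also changed that vendor's bank data.

    bank_changes:     iterable of (vendor, user_token)
    invoice_releases: iterable of (invoice, vendor, user_token)
    Returns a list of (vendor, invoice, user_token) tuples.
    """
    changed_by = defaultdict(set)
    for vendor, user_token in bank_changes:
        changed_by[vendor].add(user_token)

    conflicts = []
    for invoice, vendor, user_token in invoice_releases:
        # Comparing tokens is enough; the clear-text name is never needed.
        if user_token in changed_by.get(vendor, set()):
            conflicts.append((vendor, invoice, user_token))
    return conflicts


# Hypothetical, already encrypted sample data.
bank_changes = [("V100", "tok_a"), ("V200", "tok_b")]
invoice_releases = [
    ("INV-1", "V100", "tok_c"),
    ("INV-2", "V100", "tok_a"),  # same token changed V100's bank account
    ("INV-3", "V200", "tok_a"),
]

conflicts = find_sod_conflicts(bank_changes, invoice_releases)
```

The check only tests whether two tokens are equal, which is why it keeps working after the user names have been replaced.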
If the analysis returns suspicious transactions, there is still the option of decrypting single records with dedicated tools and an authorized user. However, mass data analytics based on plain-text employee data without any given suspicion is no longer possible.
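How such a controlled look-up of single records might be organized is sketched below. Note the assumptions: instead of true decryption, this sketch uses a protected look-up table from token to clear name, and the class name, the role name and the logging are hypothetical, not a description of any particular tool.

```python
import hashlib
import hmac


class PseudonymVault:
    """Holds the token -> clear name mapping outside the analyst's reach."""

    def __init__(self, key, authorized):
        self._key = key
        self._authorized = set(authorized)
        self._clear_names = {}
        self.access_log = []  # (requested_by, token, reason)

    def register(self, user_name):
        """Create the token for a user name and remember the mapping."""
        token = hmac.new(
            self._key, user_name.encode("utf-8"), hashlib.sha256
        ).hexdigest()[:12]
        self._clear_names[token] = user_name
        return token

    def reveal(self, token, requested_by, reason):
        """Resolve one token, only for authorized users, and log the access."""
        if requested_by not in self._authorized:
            raise PermissionError(f"{requested_by} may not re-identify records")
        self.access_log.append((requested_by, token, reason))
        return self._clear_names[token]


vault = PseudonymVault(b"demo-key", authorized={"DATA_PROTECTION_OFFICER"})
token = vault.register("MARTIN")
revealed = vault.reveal(token, "DATA_PROTECTION_OFFICER", "suspicious transaction")
```

Resolving one token at a time, with a stated reason and a log entry, mirrors the distinction made above: individual records can be followed up on suspicion, while bulk analysis on clear names stays out of reach for the analyst.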
As mentioned at the beginning of this article, data encryption is not restricted to scenarios like employee data privacy. There are other settings where protecting your data can be extremely important:
- Think about analytics in the (public) health care sector. Usually it is necessary to assign transactions unambiguously to one particular patient without having his or her name and address available in plain text.
- In banking, customer data may need special protection, yet it has to be analyzed on a regular basis to comply with a large number of laws and regulations.
- Business secrets, such as material data listed in the parts lists of copyrighted products, or customer / vendor master data in the aviation, spaceflight or defense industries, are further cases where the data as a whole (or parts of it) needs to be protected.
Long story short: whenever data needs to be analyzed based on fields which are sensitive, data encryption can add value. It provides a level of safety in the context of data analytics, especially when data is handed to third parties for analysis. Ideally the data is already encrypted during the extraction process, as every unencrypted file on a hard drive may be a risk.
Data encryption, data security and data privacy are not always a must, but they should be taken into account every time data analytics are performed.