Business-driven analytics for data at any level, from reference data to master data up to big data, depends on the user understanding both the strategy for how the data was intended to be used and how its quality is controlled, whether automatically or manually. This holds whether you are diving directly into your ERP to support an IT project or accessing data via a BDW or data lake for predictive analytics: the user needs to know what the data means and how it is controlled.
Before we get started, it is important to define the data dictionary (or metadata repository, rules repository, super glossary, encyclopedia...) broadly. If you think of your data dictionary only as a technical metadata repository, you are, at a minimum, missing opportunities to leverage knowledge to increase your business agility and reduce risk (see the blog What’s In It For Me?).
A modern data dictionary of course has many components beyond just the technical metadata. Other components include glossaries, accountability or RACI matrices, ownership at the field and table levels, knowledge management workflows and so forth. But the dictionary also has rules and standards. I am often asked, "What is the difference between a rule (or control) and a standard?"
Our experience over the past 10+ years has driven us to separate rules and controls from standards.
Rules are important: they are what the rules engines powering our modern master data management programs are built on. You may ask why you need separate documentation for them when they are obviously built into the system for anyone with access to see. I’ll cover that later in this blog.
Business rules are granular. They represent the correct answer for a very specific set of circumstances. An example might look like: if it’s Business “A” in Malaysia and a raw material, then the tax category is “X”; but in Brazil, for Business “B” and raw material “Y”, the answer is “7”. Rules, as commonly documented, do not contain the “guiding principle” or background for WHY “7” is correct for Brazil, and may not even say what “7” tells SAP to do when there is an order. Only the testable field, the selection criteria and the answer are noted, as that is what is used in the automation. You might have more than 50 rules/controls for Base Unit of Measure, depending on different countries, product lines, material types, whether they are active or process controls, and whether they are used for continuous data quality monitoring. All 50 are important, but to understand the field holistically in support of a new project, you would need to pool them all and then try to work out the strategy and principles before the new project integration.
Rules are very powerful for automation, but they are not the Data Standard. Data Standards for a given data field/element tell the story of that element all in one spot.
The standard will include, for example, all countries, all businesses and all documentable circumstances. And a really good standard will define the principles behind how the field is used, so that when new circumstances, new countries, planets or businesses need to use the field, the principles will guide them to use existing values correctly or create new ones for the next set of automated business rules (Business ZZ on planet Q for a finished good uses 9i).
A field-level standard is not complete until it defines the use of that field for all material types (or customer account groups, etc.) under all circumstances. It needs to define the consumption processes at least at a high level (or link out to a fully functional data architecture system). Governance over the standard itself is key (last review date, who reviewed it, review comments, grade). Standards also need to address the maintenance processes: Where is the field maintained? By what role? What is the system of record, and what are the systems of consumption? Is it transported between systems, and if so, by what IDoc? (Important to know for inter-system data integrity quality checking.) If the data in this field is transformed as it is consumed, the standard is a great place to permanently document the data transformation schema.
In short, a good standard covers the basic information we are all used to having to recreate at the start of a project or when training new personnel: pulling it out of people’s experience or from archaic middleware, consolidating the truth from 25 separate spreadsheet “dictionaries”, or reverse-engineering the standard by combining 50 “rules” and looking for the complete truth in them. More often, all of the above are needed to see what’s going on.
A standard cannot be graded as fully complete until it has covered all variations, at least for all required attributes of the standard, but even a standard complete for only a few material types is far better than the scenario above.
I have found Ronald Ross’s Business Rules Manifesto to be a great guide to business rules, but it doesn’t address the standard. The DAMA DMBOK is another good source on business rules, but less so for standards.
In a very general way, you can compare the documentation at the several levels like this: glossary entries give the business meaning of a term, the standard tells the full story of the element with its principles, governance and maintenance, and the rules capture the granular, testable answers that drive the automation.
Connectivity for the standard is fundamental. One should connect to the standard from data management workflows, data architecture systems, external systems (e.g. Vendor Master Data Self-service) and especially all training that leverages the field. The data standard should be seen as the single source of truth for the data element, used by Business AND IT for all projects and daily business needs.
This single source of truth is problematic for some when one considers how the data may be transformed as it moves between systems.
My response is this: when you are required to define data in a system-independent way, you lose the system-specific knowledge that is important when dealing with system (or business) conflicts, unless you capture it for every system; and if you do that in one single record, it gets very convoluted very quickly. Taking some key fields up to a higher, system-independent view is easily done using several methods and structures once the system-specific foundation is laid. Getting the standards right at the physical model first is key.
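One way to picture the "physical model first" approach is to capture the standard per system and then derive a system-independent view on top of those records. The sketch below is a toy illustration of that layering; the system names, values and fields are hypothetical.

```python
# System-specific standards come first: each record keeps the detail
# needed to resolve conflicts in that system. (Hypothetical example data.)
system_standards = {
    "ERP": {"element": "Base Unit of Measure",
            "values": {"KG", "EA"},
            "notes": "Maintained at material creation."},
    "CRM": {"element": "Unit of Measure",
            "values": {"KG", "EA", "CS"},
            "notes": "CS is transformed to EA on the inbound interface."},
}

def system_independent_view(standards: dict) -> dict:
    # The higher-level view keeps only what the systems agree on;
    # the per-system records retain the conflict-resolving detail.
    shared = set.intersection(*(s["values"] for s in standards.values()))
    return {"element": "Unit of Measure", "shared_values": shared}

view = system_independent_view(system_standards)
print(sorted(view["shared_values"]))
# -> ['EA', 'KG']
```

The consolidation is cheap once the per-system records exist; going the other direction, from a single abstract record back down to system specifics, is where the knowledge loss occurs.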
This keeps it simple as knowledge is consolidated.
There are many more parts and functions for a modern data dictionary but this is enough for now.
I hope this drives some discussion; just because we have been doing it this way, and have the applications to support it, does not mean it’s the only way.