Do you feel you need to go digital so you don’t get left behind? Do you want to set the foundations of your data management programme and introduce data governance? Do you want a jump start on your customer satisfaction, cost reduction or improvement programmes? Where do you start?
Below is a great high-level summary of what you need to set up for good Data Management.
It looks like a lot, I agree, but it becomes manageable if you break it down into small chunks: think of the eating-an-elephant analogy, or even an apple. We don’t try to do it all at once. You will most likely be surprised (and impressed) to find that pockets of your organisation are already carrying out Data Management and good Data Governance activities, which can be held up as examples for the rest of the organisation to follow and learn from.
There are many articles and blog posts out there on setting up a data management team and I will list some of my favourites in the references at the end of this post. Here I am going to write a sentence or two on each category and why it is useful to be included in your programme.
Data Management Maturity – Capability Maturity Model Integration
Data Management Strategy
Determine your strategy: what do you want to achieve, or what problem will you solve? What is your plan for achieving it? This will help form the business case and support the justification for your funding.
Obtain a C-suite sponsor: this is the person who will advocate for the work you plan to do and its benefits. Select a Steering Committee of business and IT stakeholders who have the authority to make decisions, and agree a decision-making process and quorum. Form your Data Management Team. This may include existing employees with subject matter expertise and people with a keen interest in Data Management (even if they are not aware of the term). They may already be carrying out these functions in their day-to-day activities, e.g. validating reports or checking the accuracy of customer details and orders.
Plan and communicate how you will keep the business updated (at all levels) on your progress, and promote the quick wins you achieve to aid buy-in for your greater goal.
Data governance ensures data is consistent and trustworthy. This is critical as more organizations rely on data to make business decisions, optimize operations, create new products and services, and improve profitability.
The initial step in implementing a data governance framework involves defining the owners or custodians of the data assets in the enterprise. This role is called data stewardship.
Processes must then be defined to cover how the data will be created or obtained, stored, archived, backed up and protected from mishaps, theft or attacks. Don’t forget your retention policy for deleting data you no longer need. A set of standards and procedures must be developed that define how the data is to be used by authorized personnel. Moreover, a set of controls and audit procedures must be put in place to ensure ongoing compliance with internal data policies and external regulations, and to guarantee that data is used in a consistent manner across multiple enterprise applications.
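To make the retention policy point concrete, here is a minimal sketch of how a retention rule could be checked. The categories, retention periods and record layout are all made up for illustration; your own periods should come from your policies and the regulations that apply to you.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category (illustrative values only).
RETENTION = {
    "marketing_contact": timedelta(days=2 * 365),
    "invoice": timedelta(days=7 * 365),
}

# Hypothetical records; in practice these would come from your own systems.
records = [
    {"id": 1, "category": "marketing_contact", "created": date(2015, 1, 1)},
    {"id": 2, "category": "invoice", "created": date(2015, 1, 1)},
    {"id": 3, "category": "marketing_contact", "created": date(2018, 1, 1)},
]


def due_for_deletion(items, today):
    """Return the ids of records that are older than their category's retention period."""
    return [
        item["id"]
        for item in items
        if today - item["created"] > RETENTION[item["category"]]
    ]


print(due_for_deletion(records, today=date(2018, 6, 1)))
```

Passing the date in as a parameter, rather than reading the clock inside the function, keeps the check repeatable and easy to audit.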
A Business Glossary enables data stewards to build and manage a common business vocabulary and make it available across an organization. This ensures we all share the same understanding in discussions and when making decisions.
A Data Dictionary includes various information about data, including relationships, constraints and rules, sources and usage, to name a few. This documentation is used by database users and developers to understand the data and its structures. It can be in the form of a simple document or a special repository accessed by a dedicated tool. Having a Data Dictionary will help with tracing lineage: knowing where your data came from and where it goes. This is key for GDPR compliance, which depends on understanding where all your personal data is held.
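As a rough illustration of the "simple document" end of the scale, a Data Dictionary can start life as a small structured list. The sketch below is only one possible shape: the field names, systems and the `personal_data` flag are my own assumptions, not a standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """One data dictionary record describing a single field."""
    name: str
    description: str
    data_type: str
    source: str                                            # where the data comes from
    used_by: List[str] = field(default_factory=list)       # where the data goes
    constraints: List[str] = field(default_factory=list)   # rules the data must obey
    personal_data: bool = False                            # assumed flag for GDPR reviews


# Hypothetical entries for illustration only.
data_dictionary = [
    DictionaryEntry(
        name="customer_email",
        description="Contact email address supplied at sign-up",
        data_type="string",
        source="web_signup_form",
        used_by=["crm", "marketing_reports"],
        constraints=["unique", "not null"],
        personal_data=True,
    ),
    DictionaryEntry(
        name="order_total",
        description="Total value of an order including tax",
        data_type="decimal",
        source="order_system",
        used_by=["finance_reports"],
        constraints=[">= 0"],
    ),
]


def personal_data_locations(entries: List[DictionaryEntry]) -> Dict[str, List[str]]:
    """Map each personal-data field to every system recorded as holding it."""
    return {
        entry.name: [entry.source, *entry.used_by]
        for entry in entries
        if entry.personal_data
    }


print(personal_data_locations(data_dictionary))
```

Even a list this small answers the lineage question for each field: where it came from (`source`) and where it goes (`used_by`).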
Metadata Management – Metadata describes other data. It provides information about a certain item’s content. For example, an image may include metadata that describes how large the picture is, the image resolution, when the image was created, etc.
When you search for documents, photographs or a webpage, the metadata tags help you find what you are looking for.
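A small sketch of the idea: each item carries metadata (type, creation date, size, tags), and a search only needs the metadata, not the content. The catalogue entries and the `find_by_tags` helper are hypothetical.

```python
from datetime import date

# Hypothetical catalogue: each entry is metadata about a file, not the file itself.
catalogue = [
    {
        "file": "board_report_2018.pdf",
        "type": "document",
        "created": date(2018, 3, 1),
        "tags": {"finance", "board"},
    },
    {
        "file": "office_opening.jpg",
        "type": "image",
        "created": date(2017, 6, 12),
        "width_px": 4000,
        "height_px": 3000,
        "tags": {"event", "office"},
    },
    {
        "file": "office_floorplan.png",
        "type": "image",
        "created": date(2016, 9, 30),
        "width_px": 1200,
        "height_px": 800,
        "tags": {"office", "facilities"},
    },
]


def find_by_tags(items, required_tags):
    """Return the file names whose metadata contains every required tag."""
    wanted = set(required_tags)
    return [item["file"] for item in items if wanted <= item["tags"]]


print(find_by_tags(catalogue, {"office"}))
print(find_by_tags(catalogue, {"office", "event"}))
```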
Data Quality, Cleansing, Metrics & Strategy
Effective data quality management requires data monitoring and cleansing. In general, data quality maintenance involves updating/standardizing data and de-duplicating records to create a single data view.
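Here is a minimal sketch of standardizing and de-duplicating, assuming customer records with just a name and an email, and assuming the email address is a good enough key to spot duplicates. Real matching rules are usually more involved.

```python
# Hypothetical raw customer records with inconsistent formatting.
raw_records = [
    {"name": "  jane  SMITH", "email": "Jane.Smith@Example.com "},
    {"name": "Jane Smith", "email": "jane.smith@example.com"},
    {"name": "raj patel", "email": "raj@example.com"},
]


def standardise(record):
    """Tidy whitespace and casing so that equivalent values compare equal."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def deduplicate(records):
    """Keep the first record seen for each standardised email address."""
    unique = {}
    for record in map(standardise, records):
        unique.setdefault(record["email"], record)
    return list(unique.values())


single_view = deduplicate(raw_records)
for customer in single_view:
    print(customer)
```

Standardizing before comparing is what lets the two "Jane Smith" rows be recognised as the same customer.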
Key data quality components to consider measuring & monitoring for rectification are: Completeness, Accuracy, Credibility, Timeliness, Consistency, Integrity.
A good data quality strategy benefits the business and your customers in numerous ways, from reducing the cost of errors to improving customer relationships by providing customers with correct information.
Profiling – can be used alongside data quality work to examine the data for purposes such as determining accuracy and completeness. It also makes outliers easy to identify.
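As a sketch of what a basic profile might look like, the example below measures completeness for a column and flags outliers. The sample rows are invented, and the outlier rule (more than five times the median absolute deviation from the median) is just one simple choice among many.

```python
from statistics import median

# Hypothetical order data; None marks a missing value.
rows = [
    {"customer": "A", "order_total": 20.0},
    {"customer": "B", "order_total": 22.0},
    {"customer": None, "order_total": 19.0},
    {"customer": "D", "order_total": 21.0},
    {"customer": "E", "order_total": 500.0},
]


def completeness(items, column):
    """Return the fraction of rows that have a non-empty value in the column."""
    if not items:
        return 0.0
    filled = sum(1 for row in items if row.get(column) not in (None, ""))
    return filled / len(items)


def outliers(values, threshold=5.0):
    """Flag values far from the median, measured in median absolute deviations."""
    centre = median(values)
    mad = median(abs(value - centre) for value in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [value for value in values if abs(value - centre) > threshold * mad]


totals = [row["order_total"] for row in rows if row["order_total"] is not None]
print("customer completeness:", completeness(rows, "customer"))
print("order_total outliers:", outliers(totals))
```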
Effective management of providers helps assure quality data delivery. Most organizations acquire data from either external or internal sources. It is frequently found that the performance of sources (e.g., the timeliness and completeness of data provided) does not meet the organization’s expectations. Similarly, data quality can be sub-optimal for the organization’s needs. The questions for the organization in this process area address best practices for: defining data sourcing requirements, acquiring and providing data that meets quality expectations, managing agreements, and interacting with providers.
Platform & Architecture
A well-designed architecture is of prime importance in making sure that data remains trustworthy and manageable in the new world of digital and business transformation. Start with some questions: how and where is data collected? How will it be stored? How will the data be used for analytics, and how will it be made available for decision making?
Data Architecture is as much a business decision as it is a technical one, as new business models and entirely new ways of working drive the need for new types of architectures and data platforms that support a variety of use cases (beyond relational), such as ever-expanding NoSQL systems, IoT, Big Data and Data Lakes.
A few of my favourite platforms:
Yellowfin – An end-to-end analytics platform which delivers everything you need in order to understand why faster. https://www.yellowfinbi.com/platform
Alteryx – Provides analysts with an intuitive workflow for self-service data analytics that leads to deeper insights and includes machine learning. https://www.alteryx.com/products
Informatica Axon (Diaku) – Data governance platform designed to assist financial institutions as well as other industries with complex regulatory compliance environments. https://www.informatica.com/products.html#fbid=3YCokAeqD2T
Collibra – Focuses on automating data management processes by providing business-focused applications where collaboration and ease-of-use come first. https://compass.collibra.com/display/DOC/Product+Overview
There are many supporting processes that can be identified. One of my favourites is:
A post by the University of Cambridge on the basic but critical task of naming and organising files. Without consistent naming conventions you are lost when trying to locate a file, especially if you do not have good metadata.
I hope this short guide helps you set up or realign your data programme for success. Remember small bites of the elephant!
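Once you have agreed a naming convention, checking it can be automated. The sketch below assumes a made-up convention of `YYYY-MM-DD_short-description_vNN.ext` in lower case; it is not taken from the Cambridge post, so swap in whatever pattern your organisation agrees on.

```python
import re

# Assumed convention (illustrative only): date, hyphenated description, two-digit version.
NAMING_PATTERN = re.compile(
    r"\d{4}-\d{2}-\d{2}_[a-z0-9]+(?:-[a-z0-9]+)*_v\d{2}\.[a-z0-9]+"
)


def follows_convention(filename):
    """Return True when the whole file name matches the agreed pattern."""
    return NAMING_PATTERN.fullmatch(filename) is not None


for name in [
    "2018-05-25_customer-orders_v01.csv",
    "Final report (new) v2.docx",
    "2018-05-25_Customer_Orders.csv",
]:
    print(name, "->", follows_convention(name))
```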
Image credit: CGAP