Given the importance of data governance, it is hardly surprising that there are technologies designed to help with your efforts. They come in various shapes and sizes, with price tags to match. In this post we’ll take a look at their general benefits and moving parts to help you decide how they might feature in your implementation.
Depending on the size of your data governance program, you may find significant benefit from the use of a dedicated data governance platform. Some high-level reasons for implementing a platform to assist with your data governance are outlined below.
The platform will provide functionality such as portals, approval workflows and repositories of information around data governance all from one location. This allows for better communication and collaboration across the program and organisational transparency around roles and responsibilities. Governance silos are avoided and universal approaches are more readily available.
Data governance platforms have various tools to assist with taking the ‘spade work’ out of the program implementation. These are becoming more effective with the addition of AI elements to products.
Your data management efforts will benefit from the various data cataloguing, quality and lineage tools that these platforms provide. In fact your data management team may well have already invested in these tools for just that purpose. Providing these insights in one platform will significantly increase productivity and ‘quality of life’ of data management teams.
The main purposes of these platforms are to ease the implementation of the data governance program and to assist with data management across the estate. Too often, however, companies rush out and purchase a platform far too early in the program. The product then sits unused, with a noticeable price tag attached. From a data governance perspective there is little point in having a platform before your strategy, principles, policies and at least some of your processes are well defined. To state what is often missed: wait until you really have things to govern and a genuine need for the platform.
In the meantime, gain an understanding of the return on investment of the data governance program through its agreed budget and business capability improvement. Look at how your data governance processes would benefit from the functionality on offer and go through a thorough product selection process. Any vendor selection process can take some time, with sales teams ordinarily prioritising potentially larger prospects. You may also find platform offerings evolving during the process as new tooling and integrations are added.
With the above points in mind, there are various areas of functionality that really add to the value of the platform implementation.
This subject area has a well established selection of products that identify, process and improve the quality of data. Some vendors are stronger than others in their offerings here, but the overall functionality they offer can be summed up as:
Where a vendor has an accompanying ETL or data processing toolset, it is generally integrated into the above areas. Data observability, for example, benefits from greater lineage detail in products that provide lineage natively within the platform.
Not all vendors leverage the underlying data processing compute resources already available for data quality tasks, often offloading the work to their own bespoke data management platform infrastructure. This can prove costly and time consuming compared with using existing resources such as big data platforms. How efficiently data quality processing is executed should be carefully considered when selecting a suitable platform.
This critical subject in any data governance program has many touch points and facets to consider. There are numerous ways that your chosen platform can assist with this, with varying functionality in this area across vendors.
Some key items are listed below:
Awareness of, and access to, data products and assets are key considerations for a successful data governance initiative. Data governance platforms leverage various repositories and portals to assist in data awareness and understanding across the organisation.
Here are some of the approaches taken:
Growing demands on data management teams to help derive business value from data pose significant challenges in an expanding data landscape. To remain competitive, associated costs and productivity need to be carefully managed. There are a number of benefits that can be gained from a well implemented and populated data governance platform, such as:
The following components are essential to any data governance platform.
This is the main repository for metadata and definitions of elements of data management and governance. Centralising this information for transparent access is hugely beneficial in any but the most simplistic of programs.
Most products incorporate the three areas of metadata generally included within data governance, being:
This metadata will be the lifeblood of much of the program information, and can play a significant part in data management simplification through metadata-driven processes. This component should be scrutinised regarding content capability, ease of use and metadata discovery.
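As a rough illustration of what a metadata-driven process can look like, the sketch below validates records purely from column definitions held as metadata, so adding a new check means editing metadata rather than code. The `COLUMN_METADATA` structure, its keys and the `validate_record` helper are all hypothetical and do not reflect any particular vendor's repository format.

```python
# Hypothetical column metadata, as it might be exported from a metadata repository.
COLUMN_METADATA = {
    "customer_id": {"type": int, "nullable": False},
    "country": {"type": str, "nullable": True},
}


def validate_record(record, metadata):
    """Check one record against the metadata and return a list of issues."""
    issues = []
    for column, spec in metadata.items():
        value = record.get(column)
        if value is None:
            # Missing values are only a problem for non-nullable columns.
            if not spec["nullable"]:
                issues.append(f"{column}: missing required value")
            continue
        if not isinstance(value, spec["type"]):
            issues.append(f"{column}: expected {spec['type'].__name__}")
    return issues
```

Calling `validate_record({"customer_id": 1, "country": "GB"}, COLUMN_METADATA)` returns an empty list, while a record with no `customer_id` is reported as missing a required value.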
Populating the metadata repository efficiently will provide much needed detail on the various data assets, sources and related processing that are embedded within data management products across the organisation.
Population is initiated through the following methods:
Connectivity to metadata sources varies although most vendors capture the most common cases. The ease with which metadata can be onboarded should be a key consideration with respect to your current and future data estate.
Many programs will use technology such as SharePoint for document collaboration across the program. Some products do have their own portal for hosting the various documents on policies, processes, charters, role definitions and other items of interest to organisational data governance. Either way, a portal for all things data governance is strongly recommended for efficient communication and knowledge sharing.
In addition to the above, many vendors offer additional elements within their data governance suite of products. Depending on your particular priorities these may make the list of ‘must haves’ in your considerations.
Data access considerations require no real introduction and are front and centre for any security-aware data management team. Many organisations are adopting what is often termed a ‘zero-trust’ approach to data access. This involves closely monitoring privileges to all data sources and ensuring that data access is tracked through various auditing components. This is particularly challenging given the multitude of data stores that exist in even a modestly sized organisation. Document stores are one area that is often overlooked, as auditing their access is difficult to achieve without close integration with vendor document store products.
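To make the auditing idea concrete, here is a minimal sketch of a deny-by-default check that compares access events against explicitly granted privileges and flags anything not covered. The `AccessEvent` type, the `GRANTS` table and the user and source names are illustrative assumptions only; a real platform would draw both from its own audit and entitlement stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    user: str
    source: str
    action: str


# Hypothetical privileges: (user, data source) -> set of permitted actions.
GRANTS = {
    ("alice", "sales_db"): {"read"},
    ("bob", "contracts_store"): {"read", "write"},
}


def find_violations(events, grants):
    """Return events whose action has no explicit grant (deny by default)."""
    return [
        event
        for event in events
        if event.action not in grants.get((event.user, event.source), set())
    ]
```

Under these assumptions, a write to `sales_db` by `alice`, or any access by a user with no entry in `GRANTS`, would be returned as a violation for review.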
With the increasing use of AI models in companies of all sizes come associated management challenges. In light of recent regulatory expectations around the suitability of AI and ML outputs for use in business processes, the performance of these models is particularly important. Model management tools address the following:
Given the generally high priority of data quality within a data governance program, often with considerable value-add attached, many will promote this component to a key constituent. As mentioned above, functionality and implementation vary across vendors and as such should be given careful consideration.
This can provide considerable productivity benefits for data catalogue-related activities. Categorising and labelling various assets and their attributes according to predefined or newly discovered classifications of data is accomplished through data profiling of samples and applying various matching algorithms. Suggestions on data quality rules based on classifications and attribute types are also easily generated. As you can appreciate, automating these repetitive and time consuming tasks frees up data professionals to apply themselves elsewhere in more specialised activities.
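A heavily simplified sketch of this kind of automation is shown below: it profiles a sample of column values against a few regular-expression patterns, assigns the best-matching classification, and proposes a data quality rule from it. The patterns, the 0.8 match threshold and the function names are assumptions made for illustration; commercial platforms use far richer matching techniques than this.

```python
import re

# Hypothetical classification patterns, kept deliberately simple.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "iso_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def classify_column(samples, threshold=0.8):
    """Return the label whose pattern matches most sampled values, or None."""
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return None
    best_label, best_rate = None, 0.0
    for label, pattern in PATTERNS.items():
        rate = sum(1 for v in values if pattern.fullmatch(v)) / len(values)
        if rate > best_rate:
            best_label, best_rate = label, rate
    # Only accept the classification if enough of the sample agrees with it.
    return best_label if best_rate >= threshold else None


def suggest_rule(column, label):
    """Propose a data quality rule for a classified column."""
    if label is None:
        return None
    return f"{column}: every non-empty value should match the '{label}' pattern"
```

A sample in which four of five values look like email addresses would be labelled `email`, and `suggest_rule("contact", "email")` would then propose a matching format rule for a data steward to accept or reject.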
Generative AI tools offer assistance beyond automating the low-hanging tasks. Through models focused on the data estate, profiles of organisational data and known applications and outputs, large bodies of information become considerably easier to understand. Data discovery challenges, such as finding good candidate attributes for customer loyalty indicators, become more tractable.
This ability to reason around and through your data improves transparency and productivity. There is less need to drill into great detail to understand what’s out there. Marketed as the ‘next generation of data governance tools’, these data context-aware assistants are driven by models trained using metadata and data profiling. As they provide more insight and improve data-related understanding, so their models benefit. More focused and relevant model input further improves accuracy and applicability in a self-learning fashion.
Data governance platforms may prove essential for some organisations in their quest for ‘data governance law and order’. Don’t assume however that you can throw technology at the challenge and sit back.
Without all the other elements we’ve been discussing, such as business capability understanding, stakeholder buy-in, strategy, policies and well structured teams, the platform will at best be something of a white elephant. The technology decisions should be amongst the last elements to put in place before you grow the program. Only with all other items defined will you truly know what you need from the platform. At this point you can decide on an implementation, confident that it will help drive the business capability improvement initiative that is data governance.
If you’d like to discuss any aspects of your data governance program, whether defining goals, deploying solutions or anywhere in between, please don’t hesitate to get in touch. Our data governance service is a flexible, coworking approach that provides assistance wherever you are on your journey.