Data Governance Platforms

By Nigel Meakins

This entry is part 6 of 6 in the series Data Governance

Given the importance of data governance, it is hardly surprising that there are technologies geared up to help with your efforts. They come in various shapes and sizes, with price tags to match. In this post we’ll take a look at their general benefits and moving parts to help you decide how they might feature in your implementation.

Why Consider a Data Governance Platform?

Depending on the size of your data governance program, you may find significant benefit from the use of a dedicated data governance platform. Some high-level reasons for implementing a platform to assist with your data governance are outlined below.

Centralised Data Governance Efforts

The platform will provide functionality such as portals, approval workflows and repositories of information around data governance all from one location. This allows for better communication and collaboration across the program and organisational transparency around roles and responsibilities. Governance silos are avoided and universal approaches are more readily available.

Increased Productivity of Regular Tasks

Data governance platforms have various tools to assist with taking the ‘spade work’ out of the program implementation. These are becoming more effective with the addition of AI elements to products.

Complementary Data Management Tools

Your data management efforts will benefit from the various data cataloguing, quality and lineage tools that these platforms provide. In fact your data management team may well have already invested in these tools for just that purpose. Providing these insights in one platform will significantly increase productivity and ‘quality of life’ of data management teams.

When to Implement a Data Governance Platform

The main purposes of these platforms are to ease the implementation of the data governance program and to assist with data management across the estate. Too often, however, companies rush out and purchase a platform far too early in the program. The product then sits unused, with a noticeable price tag attached. From a data governance point of view there is little or no point in having a platform without your strategy, principles, policies and at least some of your processes being well defined. To state what is often missed here: wait until you really have things to govern and a genuine need for the platform.

In the meantime, gain an understanding of the return on investment of the data governance program through its agreed budget and business capability improvement. Look at how your data governance processes would benefit from the functionality on offer and go through a thorough product selection process. Any vendor selection process can take some time, with sales teams ordinarily prioritising potentially larger prospects. You may also find platform offerings evolving during the process as new tooling and integrations are added.

Functionality and Value-Add

With the above points in mind, there are various areas of functionality that really add to the value of the platform implementation.

Data Quality

This subject area has a well established selection of products that identify, process and improve the quality of data. Some vendors are stronger than others in their offerings here, but the overall functionality they offer can be summed up as:

  • Data Profiling for identification of problem areas for data quality efforts
  • Data Rectification for assisting with upstream data quality correction, through either direct processing or via data validity reports and exports
  • Data Observability to help with operational elements of data quality and related challenges

Where a vendor has an accompanying ETL or data processing toolset, this is generally integrated into the above areas. Data observability, for example, benefits from greater lineage detail in products that provide lineage natively within the platform.

Not all vendors use the compute resources already available in your data processing estate for data quality tasks; many offload the work to their own bespoke platform infrastructure. This can prove costly and time-consuming compared with using existing resources such as big data platforms. This aspect of efficient data quality processing should be carefully considered when selecting a suitable platform.
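To make the data profiling idea from this section a little more concrete, here is a minimal sketch of the sort of statistics a profiler might gather for a single column. It is not taken from any vendor's product; the function name `profile_column`, the chosen statistics and the sample `country` values are all assumptions for illustration.

```python
from collections import Counter
from typing import Any


def profile_column(values: list[Any]) -> dict[str, Any]:
    """Return simple profiling statistics for one column of sampled values."""
    total = len(values)
    # Treat None and empty strings as missing (an assumption for this sketch)
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "row_count": total,
        "null_ratio": (total - len(non_null)) / total if total else 0.0,
        "distinct_count": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Hypothetical sample of a 'country' column with missing and inconsistent values
sample = ["UK", "UK", "", None, "FR", "uk"]
profile = profile_column(sample)
print(profile)
```

A high null ratio, or a distinct count inflated by inconsistent casing such as "UK" and "uk", is the kind of signal that points data quality efforts at a problem area.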

Data Security and Privacy

This critical subject in any data governance program has many touch points and facets to consider. There are numerous ways that your chosen platform can assist with this, with varying functionality in this area across vendors.

Some key items are listed below:

  • Definition of roles and assigned personnel
  • Identification and classification of data assets for security and privacy considerations
  • Auditing of data access, generally from digesting various access logs from data management products and document management systems

Data Transparency

Awareness of, and access to, data products and assets are key considerations for a successful data governance initiative. Data governance platforms leverage various repositories and portals to assist in data awareness and understanding across the organisation.

Here are some of the approaches taken:

  • Data Catalogues for definitions of data assets and products, business glossaries and approval taxonomies
  • Metadata capture of existing data assets through trawlers and scanners that either pull or push metadata from known sources
  • Lineage of data assets and products determined from data processing and management operations
  • A ‘Data Marketplace’ concept for certifying, publishing and promoting data assets across the organisation
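As a rough illustration of lineage being determined from processing operations, the sketch below treats each operation as a (source, target) pair and walks backwards to find everything that feeds a given asset. The asset names and the `upstream_assets` function are hypothetical; real platforms derive these edges from ETL definitions, query logs and similar sources.

```python
from collections import defaultdict


def upstream_assets(operations: list[tuple[str, str]], asset: str) -> set[str]:
    """Return every asset that feeds `asset`, directly or indirectly."""
    parents: dict[str, set[str]] = defaultdict(set)
    for source, target in operations:
        parents[target].add(source)

    seen: set[str] = set()
    stack = [asset]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:  # guard against revisiting shared or cyclic paths
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical processing operations recorded as (source, target) pairs
operations = [
    ("crm.customers", "staging.customers"),
    ("web.orders", "staging.orders"),
    ("staging.customers", "mart.customer_value"),
    ("staging.orders", "mart.customer_value"),
]
lineage = upstream_assets(operations, "mart.customer_value")
print(sorted(lineage))
```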

Operational Efficiency

The growing demands on data management teams to help derive business value from data pose significant challenges in an expanding data landscape. To remain competitive, associated costs and productivity need to be carefully managed. There are a number of benefits that can be gained from a well implemented and populated data governance platform, such as:

  • Identification of duplicated processes and data assets
  • Monitoring of processing performance and associated resource usage
  • Visibility of processes and associated outputs to help derive associated business value

Key Components

The following components are essential to any data governance platform.

Metadata Catalogue

This is the main repository for metadata and definitions of elements of data management and governance. Centralising this information for transparent access is hugely beneficial in all but the simplest of programs.

Most products incorporate the three areas of metadata generally included within data governance, being:

  • Operational metadata, providing data provenance, processing and data observability information
    • Generally more detailed when coupled with integrated data processing platforms from the same vendor
  • Business metadata, such as glossaries, definitions of metrics and other standards
  • Technical metadata, describing data assets and their classifications, data sources, transformation mappings and similar structured and semi-structured data definitions

This metadata will be the lifeblood of much of the program information, and can play a significant part in data management simplification through metadata-driven processes. This component should be scrutinised regarding content capability, ease of use and metadata discovery.
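One way to picture the three metadata areas is as sections of a single catalogue entry per asset. The structure below is a minimal sketch under that assumption; the class name `CatalogueEntry`, its fields and the example values are illustrative and do not reflect any particular product's schema.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record holding the three metadata areas for one asset."""
    asset_name: str
    technical: dict[str, str] = field(default_factory=dict)    # sources, column types, mappings
    business: dict[str, str] = field(default_factory=dict)     # glossary terms, metric definitions
    operational: dict[str, str] = field(default_factory=dict)  # provenance, run and observability details

    def missing_areas(self) -> list[str]:
        """List the metadata areas that have not yet been populated."""
        return [
            area
            for area in ("technical", "business", "operational")
            if not getattr(self, area)
        ]


entry = CatalogueEntry(
    asset_name="customer",
    technical={"source": "crm_db", "customer_id": "INTEGER"},
    business={"definition": "A party that has placed at least one order"},
)
print(entry.missing_areas())
```

A check like `missing_areas` hints at how content capability might be scrutinised: an entry with no operational metadata tells you little about provenance or processing.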

Metadata Discovery

Populating the metadata repository efficiently will provide much-needed detail on the various data assets, sources and related processing embedded within the data management products across the organisation.

Population is initiated through the following methods:

  • User-initiated discovery requests, targeting a specific source of metadata
  • Automated scans via metadata crawler services to collate metadata from various sources

Connectivity to metadata sources varies, although most vendors cover the most common cases. The ease with which metadata can be onboarded should be a key consideration with respect to your current and future data estate.
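For a feel of what a targeted discovery scan does, the sketch below pulls technical metadata (table names, column names and declared types) from a single SQLite database using Python's standard `sqlite3` module. SQLite is chosen here purely because it needs no setup; the function name `scan_sqlite_metadata` and the example tables are assumptions, and vendor crawlers will use their own connectors.

```python
import sqlite3


def scan_sqlite_metadata(conn: sqlite3.Connection) -> dict[str, list[tuple[str, str]]]:
    """Collect (column name, declared type) pairs for each user table in one SQLite source."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    metadata: dict[str, list[tuple[str, str]]] = {}
    for table in tables:
        # PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        metadata[table] = [(col[1], col[2]) for col in columns]
    return metadata


# Hypothetical source system, created in memory for the example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
discovered = scan_sqlite_metadata(conn)
print(discovered)
```

An automated crawler is, in essence, this kind of scan run on a schedule across many registered sources, with the results written to the catalogue.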

Collaboration Portal

Many programs will use technology such as SharePoint for document collaboration across the program. Some products do have their own portal for hosting the various documents on policies, processes, charters, role definitions and other items of interest to organisational data governance. Either way, a portal for all things data governance is strongly recommended for efficient communication and knowledge sharing.

Extended Components

In addition to the above, many vendors offer additional elements within their data governance suite of products. Depending on your particular priorities these may make the list of ‘must haves’ in your considerations.

Data Access Auditing Services

Data access considerations require no real introduction and are front and centre for any security-aware data management team. Many organisations are adopting what is often termed a ‘zero-trust’ approach to data access. This involves closely monitoring privileges to all data sources, and ensuring that data access is tracked through various auditing components. This is particularly challenging given the multitude of data stores that exist in even a modest-sized organisation. Document stores are one area that is often overlooked, with the auditing of their access being difficult to achieve without close integration into vendor document store products.
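The core of an access audit can be sketched as comparing logged access events against granted privileges and flagging anything without a matching grant. The example below is a simplified illustration only; the `AccessEvent` structure, the grant tuples and the asset names are all hypothetical, and real access logs vary considerably between products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One hypothetical access log record, already parsed from its source format."""
    user: str
    asset: str
    action: str  # e.g. "read" or "write"


def find_unauthorised(
    events: list[AccessEvent], grants: set[tuple[str, str, str]]
) -> list[AccessEvent]:
    """Return events whose (user, asset, action) has no matching grant."""
    return [e for e in events if (e.user, e.asset, e.action) not in grants]


# Hypothetical privileges and log entries spanning a database and a document store
grants = {
    ("alice", "sales_db.orders", "read"),
    ("bob", "hr_share/contracts", "read"),
}
events = [
    AccessEvent("alice", "sales_db.orders", "read"),
    AccessEvent("alice", "hr_share/contracts", "read"),
    AccessEvent("bob", "hr_share/contracts", "write"),
]
flagged = find_unauthorised(events, grants)
for event in flagged:
    print(event)
```

The hard part in practice is not this comparison but getting consistent events out of every data store in the first place, which is where platform integrations earn their keep.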

AI Model Management

With the increasing use of AI models in companies of all sizes come associated management challenges. In light of recent regulatory expectations around the suitability of AI and ML outputs for use in business processes, the performance of these models is particularly important. Model management tools address the following:

  • Understanding model trends and accuracies over time
  • Determining computational performance for cost-effective computational resource usage
  • Addressing input dataset and model parameter traceability, for reasons such as compliance, privacy and ethics
  • Avoidance of bias such as towards demographic groups, particularly regarding protected characteristics such as race, age, disability and sexual orientation, through understanding of model parameter data distributions
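As one simple example of a bias check, the sketch below compares the rate of positive model outcomes across groups, a measure often called the demographic parity difference. This is only one of many possible checks and is not something the platforms discussed here are confirmed to use; the function name, group labels and predictions are made up for illustration.

```python
from collections import defaultdict


def selection_rates(groups: list[str], predictions: list[int]) -> dict[str, float]:
    """Return the share of positive predictions (1) for each group."""
    totals: dict[str, int] = defaultdict(int)
    positives: dict[str, int] = defaultdict(int)
    for group, prediction in zip(groups, predictions):
        totals[group] += 1
        positives[group] += int(prediction)
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical age bands and binary model outputs (1 = approved)
groups = ["18-30", "18-30", "18-30", "18-30", "60+", "60+", "60+", "60+"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, predictions)
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap does not prove unfair treatment on its own, but it is a prompt to look more closely at the input data distributions for the groups concerned.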

Data Quality Engine

Given the generally high priority of data quality within a data governance program, often with considerable value-add attached, this component will for many be promoted to a key constituent. As mentioned above, functionality and implementation vary across vendors and as such should be given careful consideration.

AI Assistance

This can provide considerable productivity benefits for data catalogue-related activities. Categorising and labelling various assets and their attributes according to predefined or newly discovered classifications of data is accomplished by profiling samples of the data and applying various matching algorithms. Suggestions for data quality rules based on classifications and attribute types are also easily generated. As you can appreciate, automating these repetitive and time-consuming tasks frees up data professionals to apply themselves elsewhere in more specialised activities.
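The classification step can be approximated, in a very reduced form, by matching sampled values against known patterns and suggesting a rule for the winning label. The sketch below uses plain regular expressions as a rule-based stand-in for the matching algorithms a real product would apply; the patterns, the 0.8 threshold and the rule wording are all assumptions.

```python
import re
from typing import Optional

# Deliberately loose, illustrative patterns; not production-grade validators
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "uk_postcode": re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$", re.IGNORECASE),
}

# Hypothetical data quality rule suggestions keyed by classification
SUGGESTED_RULES = {
    "email": "value must match an email pattern",
    "uk_postcode": "value must match a UK postcode pattern",
}


def classify(samples: list[str], threshold: float = 0.8) -> Optional[str]:
    """Return the first label whose pattern matches at least `threshold` of non-empty samples."""
    values = [s for s in samples if s]
    if not values:
        return None
    for label, pattern in PATTERNS.items():
        matches = sum(1 for v in values if pattern.match(v))
        if matches / len(values) >= threshold:
            return label
    return None


# Hypothetical sample from an unlabelled attribute, including one bad value
samples = [
    "a@example.com", "b@example.org", "c@example.net",
    "d@example.com", "e@example.org", "not-an-email",
]
label = classify(samples)
rule = SUGGESTED_RULES.get(label)
print(label, "->", rule)
```

Tolerating a share of non-matching values matters here, since the attributes most in need of a quality rule are exactly those with some bad data in them.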

Generative AI

Generative AI tools offer assistance beyond automating the low-hanging tasks. Through models focused on the data estate, profiles of organisational data, and known applications and outputs, understanding large bodies of information becomes considerably simpler. Data discovery challenges, such as finding good candidate attributes for customer loyalty indicators, become easier to tackle.

This ability to reason around and through your data improves transparency and productivity. There is less need to drill into great detail to understand what’s out there. Marketed as the ‘next generation of data governance tools’, these data context-aware assistants are driven by models trained using metadata and data profiling. As they provide more insight and improve data-related understanding, so their models benefit. More focused and relevant model input further improves accuracy and applicability in a self-learning fashion.

Conclusion

Data governance platforms may prove essential for some organisations in their quest for ‘data governance law and order’. Don’t assume, however, that you can throw technology at the challenge and sit back.

Without all the other elements we’ve been discussing, such as business capability understanding, stakeholder buy-in, strategy, policies and well structured teams, the platform will at best be something of a white elephant. The technology decisions should be amongst the last elements to put in place before you grow the program. Only with all other items defined will you truly know what you need from the platform. At this point you can decide on an implementation, confident that it will help drive the business capability improvement initiative that is data governance.

Implementing Your Data Governance Solutions

If you’d like to discuss any aspects of your data governance program, whether defining goals, deploying solutions or anywhere in between, please don’t hesitate to get in touch. Our data governance service is a flexible, coworking approach that provides assistance wherever you are on your journey.

About the author

Nigel Meakins

Having worked for many years in the world of data and analytics, I enjoy following new innovations and understanding how best to apply them within business. I have a broad technical skill set and an acute awareness of how to make Agile work on data projects. Working at all levels and in a variety of roles on projects, I help our clients understand how the latest technology can be applied to realise greater value from their data.
