With so many data breaches and data privacy issues, it would be an understatement to say that organisations need a solid and effective data governance policy. Data governance practices are essential to ensure that data is optimised for any use. There’s a mandate among global tech behemoths to put into practice a data governance policy that restricts access to data and governs which information should be made available and to whom. Organisations that aim to be like IBM or Microsoft need to add data governance to their strategy that will enable them to move forward while at the same time managing risk at an acceptable level.
The goal should be to keep data flowing so that it can drive value. Whether the aim is innovation in new products, insight into processes, or shorter cycle times, the point is to keep that flow going well enough that the organisation attains its targets while minimising risk. Communication and prioritisation are essential as well, no matter the size of your team. Companies must also understand the sensitivity of the data, how it’s protected and managed, and why it’s collected.
Having a thorough understanding of data can help organisations prioritise specific data collection, make better decisions, scale efficiently, and save money.
Data governance best practices are evolving rapidly, and only by keeping your finger on the pulse of the data industry can you prepare your governance strategy to succeed.
Here we look at how big tech companies introduce scalable, automated controls and leverage modern foundations to transform data governance.
Microsoft embraces and promotes a data culture mindset. Rather than viewing data governance as a blocking function, Microsoft sees data governance modernisation as a way to democratise data responsibly and to power the broader digital transformation. Microsoft is building its data governance controls into its centralised analytics infrastructure and analytics processes. “We are transforming how we provide data governance, to introduce scalable, automated controls for data architecture, lifecycle health, and advancing its appropriate use,” Microsoft wrote in a blog post.
Its data governance strategy is developed with five goals in mind:
- Reduce data duplication and sprawl by building a single Enterprise Data Lake (EDL) for high-quality, secure, and trusted data.
- Connect data from disparate silos in a way that creates opportunities to use that data in ways not possible in a siloed approach.
- Power responsible data democratisation across Microsoft.
- Drive efficiency gains in the processes Microsoft employs to gather, manage, access, and use data.
- Meet or exceed compliance and regulatory requirements without compromising Microsoft’s ability to create exceptional products.
Its approach to modern data governance has two key components. First, Microsoft embedded clear data standards and built them into its application development process. This move helps it to automate and proactively manage data governance issues and data policy compliance. Second, it leverages the EDL platform to centralise and systemically scan and monitor the data.
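As a rough illustration of what building data standards into the development process can look like, the sketch below checks a dataset definition against a small set of rules before it is accepted. The rule set, the field names and the `check_standards` helper are all hypothetical; this is not Microsoft’s implementation, only a minimal example of an automated compliance check.

```python
from dataclasses import dataclass, field

# Hypothetical standards for illustration only; not Microsoft's actual rules.
REQUIRED_DATASET_FIELDS = ("owner", "retention_days")
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential", "restricted"}


@dataclass
class Column:
    name: str
    classification: str = ""  # empty string means "not labelled"


@dataclass
class DatasetSpec:
    name: str
    metadata: dict = field(default_factory=dict)
    columns: list = field(default_factory=list)


def check_standards(spec: DatasetSpec) -> list:
    """Return human-readable violations; an empty list means compliant."""
    violations = []
    # Every dataset must declare an owner and a retention period.
    for key in REQUIRED_DATASET_FIELDS:
        if not spec.metadata.get(key):
            violations.append(f"{spec.name}: missing required metadata '{key}'")
    # Every column must carry a recognised sensitivity label.
    for col in spec.columns:
        if col.classification not in ALLOWED_CLASSIFICATIONS:
            violations.append(
                f"{spec.name}.{col.name}: missing or unknown classification"
            )
    return violations


if __name__ == "__main__":
    spec = DatasetSpec(
        name="orders",
        metadata={"owner": "sales-data"},
        columns=[Column("email", "confidential"), Column("notes")],
    )
    for violation in check_standards(spec):
        print(violation)
```

A check like this could run in a build pipeline so that a dataset without an owner, a retention period or column labels is flagged before it reaches the shared data lake.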
Microsoft’s data governance framework helps organisations better understand data protocols, align data strategy with business goals and outcomes, and secure data as it moves rapidly into the cloud.
To keep pace, organisations need to make privacy decisions in real time. That takes automation. IBM’s strategy of running an end-to-end flow supported intensively by automated processes and artificial intelligence has helped it scale rapidly to address new regulations. Essentially, privacy regulation boils down to ensuring that organisations can safeguard the personally identifiable (PI) data they collect.
IBM, which has robust data governance models and is also one of the biggest vendors of data governance solutions, infuses AI into every business process. It also runs a central data and AI platform across the company, which it began using for privacy by building a governance framework that delivers actionable information in real time while ensuring regulatory compliance.
Pegged as a quality control discipline for introducing rigour and discipline to the process of managing, using, improving and protecting organisational data, the IBM data governance model can significantly improve the quality and integrity of the company’s data through inter-organisational collaboration and policy-making.
According to the IBM model, the core disciplines outlined are Data Quality Management, Information Security & Privacy and Information Life-Cycle Management, and the supporting disciplines are Data Architecture, Audit Information & Logging and Classification & Metadata. The key enablers are organisational structure, awareness, policy and stewardship.
The other critical technical aspect that IBM prioritises is its hybrid cloud framework.
A hybrid cloud environment gives increased flexibility concerning regulatory compliance through a mix of predefined policies and run-time automation, coupled with AI.
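The combination of predefined policies and run-time automation can be sketched as a lookup table that is consulted each time a workload is placed. The data classes, environment names and the `allowed_targets` function below are assumptions made for illustration; they do not describe IBM’s framework, and the AI component is left out entirely.

```python
# Hypothetical predefined policy: which environments may hold each class of data.
PLACEMENT_POLICY = {
    "public": {"public_cloud", "private_cloud", "on_prem"},
    "personal": {"private_cloud", "on_prem"},
    "regulated": {"on_prem"},
}


def allowed_targets(classification: str, available: list) -> list:
    """Decide at run time where data of a given class may be placed.

    Unknown classifications get an empty policy, so they are denied everywhere.
    """
    permitted = PLACEMENT_POLICY.get(classification, set())
    return sorted(permitted & set(available))


if __name__ == "__main__":
    print(allowed_targets("personal", ["public_cloud", "on_prem"]))
```

Keeping the policy as data rather than code means a new regulation can be handled by editing the table, while the run-time check stays unchanged.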
Apple CEO Tim Cook said at a privacy conference that achieving great data governance standards is “not only a possibility, it is also a responsibility… Technology’s potential is, and always must be, rooted in the faith people have in it.”
After being implicated in several data privacy-related scandals, from spying on customers via the Siri app to massive data breaches, Apple has not only reframed its approach to data privacy but is also making data privacy and trust a key selling point.
For several years now, Apple has been limiting how much data apps can collect and use. First, Apple gave its users the option to turn off location tracking; then it required app providers on iOS to spell out what data they collect through “nutrition labels” in the App Store. In 2021, it went a step further by announcing App Tracking Transparency (ATT), a feature that requires apps to request permission from users before tracking them across other apps and websites.
Apple has a cross-functional approach to privacy governance that covers all areas of the company and includes both customer and employee data. The Legal Team has a Senior Director in charge of Privacy and Law Enforcement Compliance. Apple also has a Privacy Engineering Team that partners with the Privacy Legal Team and dedicated Product Counsel to design products from the ground up to protect customer privacy.
Apple also has a Privacy Steering Committee that sets privacy standards for teams across Apple and acts as an escalation point for privacy compliance issues, either deciding them or escalating them further. It also oversees instances where data for which Apple is responsible is managed or hosted by a third party on Apple’s behalf and reviews those third parties through audits and documentation.
Further, those employees who have access to Apple customer data and personal information have to undergo an additional Privacy and Security Training course on a bi-annual basis or in response to updated laws such as the GDPR.
Oracle states that data governance “does not come together all at once,” and that an iterative approach is needed. Oracle’s data quality solutions are based on the principle of optimising and leveraging information as an enterprise asset. The computing giant’s Enterprise Data Governance solution helps identify, secure, manage and even discover sensitive data in the database.
Key data governance capabilities are enabled by Oracle Enterprise Metadata Manager (OEMM) and Oracle Enterprise Data Quality (EDQ).
From visualisations to sensitive database discovery results to automatic metadata discovery jobs, Oracle’s data governance functions improve the quality, accessibility and security of the core enterprise asset. The company’s data governance policy covers establishing enterprise data strategies, identifying the right stakeholders, assigning accountabilities, and reporting the status of data-focused initiatives.
The tech behemoth is now at the optimised level, where data governance is core to its business processes and projects. Decisions are informed by data that provides quantifiable benefit, cost and risk analysis, and processes and policies are firmly established, adopted and continually revised to reflect business goals and objectives.
Oracle pushes for the adoption of an ongoing program and a continuous improvement process. OEMM harvests metadata from Data Marts, Data Warehouses, Extract Transform Load, Data Integration, Business Intelligence, and Big Data/Hadoop tools, allowing easy high-level visualisation in metadata analysis and fast, straightforward data flow and lineage analysis.
To help prevent unintended data use within the organisation, Oracle has integrated Data Governance and SOA Governance or Application Services Governance within the Oracle API Platform Cloud Service. Process owners provide the subject matter expertise required to understand the meaning of data within the context of their processes. In contrast, data owners bring an understanding of the processes and metrics using their data sets.
Google indexes the internet, and that means collecting huge amounts of data. Google has published a good deal of research on data governance in the academic community, including the Goods whitepaper, which describes Google’s data catalog (made available to the world as GCP Data Catalog), and the whitepaper that describes BigQuery, a commercial product that allows organisations to do big data analytics.
Since Google aggregates a lot of data, it makes sure to comply with privacy principles. Don’t collect what you don’t need. Eliminate personal data that is irrelevant.
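The first of these principles, not collecting what you don’t need, can be approximated in code with an allow-list applied before a record is stored. The field names and the `minimise` helper below are hypothetical and are not taken from any Google tool; the sketch only shows the idea.

```python
# Hypothetical allow-list: the only fields this (imaginary) pipeline needs.
NEEDED_FIELDS = frozenset({"order_id", "country", "amount"})


def minimise(record: dict, needed=NEEDED_FIELDS) -> dict:
    """Keep only allow-listed fields; everything else is dropped before storage."""
    return {key: value for key, value in record.items() if key in needed}


if __name__ == "__main__":
    raw = {"order_id": 1, "email": "user@example.com", "amount": 5}
    print(minimise(raw))
```

An allow-list is used rather than a block-list so that any new field added upstream is excluded by default until someone decides it is actually needed.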
Google’s core competency in data privacy and data governance is expressed in the tools it builds and brings to the public in Google Cloud Platform (GCP).
As an entity that makes money from Android, YouTube, ads, and GCP, Google faces challenges: it is heavily scrutinised and must adhere to the regulations of every country it operates in. However, Google produces products that any enterprise could use and brings those lessons into tools such as BigQuery, Data Catalog, and other big data capabilities that it provides to the public.
Data Catalog is integrated into GCP so that as soon as you create a new data set, it appears in Data Catalog without any interaction or registration. That makes it possible to set up alerts and monitor all new data additions without building complicated machine learning modules and utilities to detect and classify that data.
Learnings: Things to consider when planning your data governance strategy
- Build standards into your existing process and implement them as engineering solutions. By approaching data governance during the design phase of the larger Enterprise Data strategy, you will be able to institutionalise “governance by design” into the engineering DNA — and apply it to data at every touchpoint.
- Consider implementing a modern data foundation with integrated toolsets.
- People and processes are just as important as tools and infrastructure.
Data privacy is evolving from a regulatory/compliance issue into a strategic one. Establishing a robust but adaptable data governance practice that positions data privacy as an asset will likely elevate your data strategy while paving the way for future ethical data monetisation efforts and AI development. Data privacy and innovation are not necessarily at odds. Instead, when taken together, they can serve as guideposts, lighting the way towards future growth and technological advancement.