The data that organizations generate is increasingly complex and rich in business intelligence. As organizations become more data-driven, they need to be more agile and prudent in how they manage, analyze, and distribute that data.
Implementing a data mesh architecture allows an organization to leverage a decentralized data management system that aligns with the current organizational structure. A data mesh distributes the data burden to experts at the domain data source. Keeping domain experts close to the data and analytics throughout most of the domain data lifecycle is an efficient and cost-conscious strategy for optimizing data management.
In the following post, I will explain the data mesh concept, how it addresses current data architecture problems, and the four principles of a data mesh.
What is a Data Mesh?
Zhamak Dehghani coined the term data mesh in 2019 and later expanded on it in her book Data Mesh: Delivering Data-Driven Value at Scale.
Dehghani defines data mesh as a decentralized sociotechnical approach to sharing, accessing, and managing analytical data in complex and large-scale environments within or across organizations.
If we break down this definition, we get four main themes inherent to data mesh architecture.
Decentralized: Moving from one central management system that serves as the aggregator and distributor of data to multiple domains or business units each managing data they create and source. These decentralized domains serve as the aggregators and distributors.
Sociotechnical: An evolved approach to data management that encompasses the interactions between people, technical architecture, and solutions.
Share & Access: A data platform that can consistently extract utility and value out of data and make it available to users across the organization when needed.
Manage analytical data: Mature management and collaborative governance from multiple stakeholders are needed to provide consistency, security, and quality of business intelligence.
What problems does a data mesh address?
For data-driven enterprises, a traditional centralized data architecture can become a bottleneck when business units need data ingested, processed, and distributed quickly.
Below is an overview of how a data mesh addresses the common challenges that data-driven organizations have with traditional architecture.
Centralized and siloed management and ownership: A data mesh takes a distributed and decentralized approach to data management at the source, taking the burden of ingesting, processing, and serving data off a central location.
Speed and agility: Decentralizing data management to the data source improves efficiency and reduces bottlenecks and bureaucratic red tape, because the data is processed and distributed by the domain or business unit experts who produce it.
Value and utility: Domain source experts consistently develop and serve data products to a platform based on organizational needs.
The themes and problems described above translate into four core principles that work together to make the mesh structure function within an organization.
Data Mesh Principles
Data mesh leverages four principles that have a symbiotic relationship within a data-driven enterprise information management strategy.
Domain Ownership: Domains are business units within an organization; each domain takes on ownership and management of the data it creates and sources.
Data as a Product: Domain data is developed and shared as a product for internal and external consumers.
Self-Serve Platform: The platform houses the products developed by the domains, allowing for product shareability and accessibility across domains and functions.
Federated Governance: Governance of the domains, products, and platform is decentralized and shared among stakeholders such as domain representatives, legal, compliance, and security.
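To make the four principles a little more concrete, here is a minimal Python sketch of how they might fit together. Every name in it (DataProduct, Platform, has_owner, has_description) is a hypothetical illustration and does not come from Dehghani's book or from any data mesh product; a real platform would be far richer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DataProduct:
    """Data as a Product: a dataset published with an owner and a description."""
    name: str
    domain: str        # Domain Ownership: the business unit that owns the data
    owner_email: str
    description: str


# Federated Governance (hypothetical form): a policy is a check every product must pass.
Policy = Callable[[DataProduct], bool]


@dataclass
class Platform:
    """Self-Serve Platform: a shared catalog that any domain can publish to."""
    policies: List[Policy] = field(default_factory=list)
    catalog: Dict[str, DataProduct] = field(default_factory=dict)

    def publish(self, product: DataProduct) -> None:
        # Collect the names of all policies the product fails.
        failed = [p.__name__ for p in self.policies if not p(product)]
        if failed:
            raise ValueError(f"{product.name} violates policies: {failed}")
        self.catalog[product.name] = product

    def find_by_domain(self, domain: str) -> List[DataProduct]:
        return [p for p in self.catalog.values() if p.domain == domain]


def has_owner(product: DataProduct) -> bool:
    """Illustrative policy: a product must name a contact."""
    return "@" in product.owner_email


def has_description(product: DataProduct) -> bool:
    """Illustrative policy: a product must describe itself."""
    return bool(product.description.strip())
```

In this sketch a domain team would call `Platform(policies=[has_owner, has_description]).publish(...)` with its own product. The domain keeps ownership, the platform provides the shared catalog, and the governance checks run automatically at publish time.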
The four principles are what make the data mesh architecture function. Each principle also relies on the others to offset the challenges it could create on its own.
Here is how the interplay of the principles ensures optimal functionality within the organization.
Domain Ownership depends on:
- Data as a Product to prevent data siloing
- Self-Serve Platform to empower domain teams
- Federated Governance to increase engagement and reduce domain isolation
Data as a Product depends on:
- Self-Serve Platform to reduce the cost of ownership
- Federated Governance to get higher-order value by interconnecting data products distributed by one or more domain teams
Federated Governance depends on the Self-Serve Platform to enforce a consistent and reliable policy.
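The dependencies listed above can also be written down as a small lookup table, which makes it easy to see which principle the others lean on most. This is only an illustrative encoding of the list; the names DEPENDS_ON, supporters_of, and relied_on_by are hypothetical, and the empty entry for the Self-Serve Platform simply reflects that no dependencies are listed for it here.

```python
from typing import Dict, List

# principle -> {principle it depends on: what that dependency provides}
DEPENDS_ON: Dict[str, Dict[str, str]] = {
    "Domain Ownership": {
        "Data as a Product": "prevent data siloing",
        "Self-Serve Platform": "empower domain teams",
        "Federated Governance": "increase engagement and reduce domain isolation",
    },
    "Data as a Product": {
        "Self-Serve Platform": "reduce the cost of ownership",
        "Federated Governance": "interconnect data products for higher-order value",
    },
    "Federated Governance": {
        "Self-Serve Platform": "enforce consistent and reliable policy",
    },
    "Self-Serve Platform": {},  # no dependencies listed in the text above
}


def supporters_of(principle: str) -> List[str]:
    """Principles that the given principle depends on."""
    return sorted(DEPENDS_ON[principle])


def relied_on_by(principle: str) -> List[str]:
    """Principles that depend on the given principle."""
    return sorted(p for p, deps in DEPENDS_ON.items() if principle in deps)
```

Calling `relied_on_by("Self-Serve Platform")` returns the other three principles, which matches the list: the platform is the one piece every other principle depends on.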
While the four main principles provide the framework for the data mesh to function, a shift in organizational thinking and perspective is required to ensure the data mesh adheres to the principles.
Below are six categories that are essential for a change in how the organization thinks about data management.
Organization perspective: The organization moves from a centralized model to a decentralized data ownership model. In a decentralized model, the business domains are the data owners.
Architectural perspective: The architecture moves beyond collecting data in lakes and monolithic warehouses. The architecture uses a distributed approach to accessing data products through standardized protocols.
Technological perspective: Data is no longer treated as a by-product of code. Data and the code that maintains it become one cohesive and adaptable entity.
Operational perspective: Data governance takes on a federated approach that relies on policies expressed and enforced computationally, replacing a centralized, top-down, and often human-administered approach.
Principal perspective: Data evolves from an asset to a product for internal and external users.
Infrastructure perspective: Removes the bifurcated approach to data and analytics that contains one system for analytics processing and another for transactional processing of operational applications. The infrastructure evolves to integrate transactional and analytical processing physically or virtually under a unified platform.
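As a rough illustration of the architectural perspective, the sketch below shows what "accessing data products through standardized protocols" could look like in Python. The DataProductPort interface, the two example products, and their in-memory rows are all assumptions made for this example; a real mesh would define its own interface and serve data from actual storage.

```python
from typing import Dict, Iterable, List, Protocol


class DataProductPort(Protocol):
    """Hypothetical standard interface that every data product exposes."""

    def schema(self) -> Dict[str, str]: ...

    def read(self) -> Iterable[Dict[str, object]]: ...


class SalesOrders:
    """Product owned by a hypothetical sales domain (in-memory demo data)."""

    def schema(self) -> Dict[str, str]:
        return {"order_id": "int", "amount": "float"}

    def read(self) -> Iterable[Dict[str, object]]:
        return [{"order_id": 1, "amount": 20.0}, {"order_id": 2, "amount": 5.5}]


class SupportTickets:
    """Product owned by a hypothetical support domain (in-memory demo data)."""

    def schema(self) -> Dict[str, str]:
        return {"ticket_id": "int", "status": "str"}

    def read(self) -> Iterable[Dict[str, object]]:
        return [{"ticket_id": 7, "status": "open"}]


def column_names(product: DataProductPort) -> List[str]:
    """Works for any product that follows the interface, whatever its domain."""
    return list(product.schema())


def row_count(product: DataProductPort) -> int:
    """Counts rows without knowing how the owning domain stores them."""
    return sum(1 for _ in product.read())
```

The consumer functions never need to know which domain built a product or how it stores data; they depend only on the shared interface, which is the point of moving away from a single monolithic warehouse.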
Expected Outcome
Data mesh architecture is a solution for data-driven enterprises that consistently rely on data to evolve, improve, make decisions, and manage lines of business. A data mesh can allow an organization to stay agile, extract value and improve the utility of data, and allow for malleability as the organization evolves.
Want to understand the four principles better?
Over the following weeks, we will publish blog posts that cover each principle in detail. Subscribe to the F4 Blog to be notified when new posts are published.