Data mesh is an approach to data management that helps organizations become more agile and deliberate in how they manage, analyze, and distribute data.
Four main principles are fundamental in a data mesh. The first principle is domain ownership, and the second is data as a product, both of which were covered in previous blog posts.
This post will cover the third principle, the self-serve data platform. This platform is the organization’s data conduit: it streamlines the creation, deployment, and maintenance of data products by domain owners and makes those products accessible and shareable.
Organizations benefit by reducing product development and ownership costs and empowering the domain teams to optimize the utility of data products. Likewise, the platform ensures consistency of data product attributes like quality and compliance through automated governance policies.
What is a Self-Serve Data Platform?
A self-serve platform is a set of technologies and the underlying infrastructure that simplify otherwise complex decentralized data management and sharing. This platform allows organizations to attain key objectives like:
Autonomous team productivity: The ability to enable a team to complete their work with a sense of autonomy and without involving another team.
Exchange of trusted and governed data products: Smooth exchange of data products with a certified level of data quality from the provider to the consumer. The platform keeps quality, security, compliance, and process consistent by defining and enforcing governance policies.
Accelerate time to value: The platform should abstract technical complexity and provide elegant APIs to decrease the knowledge ramp-up required for domain teams and the steps required to deploy data products.
Scalable and flexible sharing: Data sharing is enabled across internal and external consumers, such as a network of partners. To reach this breadth of sharing, designing for secure interoperability and integration with other platforms is required.
Innovation culture: The platform frees domain teams from data management activities that do not contribute to innovation, so they can focus on data discovery, exploration, and analysis to uncover valuable insights to be shared.
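As a rough illustration of the "abstract technical complexity" objective, the sketch below shows how a platform might let a domain team describe a data product declaratively and derive the deployment steps from that description. Every name here (DataProductSpec, plan_deployment, the step wording) is hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataProductSpec:
    """Hypothetical declarative description written by a domain team."""
    name: str
    domain: str
    owner: str
    output_format: str = "parquet"  # assumed default, for illustration only


def plan_deployment(spec: DataProductSpec) -> List[str]:
    """Expand one spec into the platform-side steps the team no longer scripts by hand."""
    return [
        f"provision storage for {spec.domain}/{spec.name} ({spec.output_format})",
        f"register {spec.name} in the catalog with owner {spec.owner}",
        f"attach governance policies to {spec.name}",
    ]
```

A domain team would only fill in the spec, for example DataProductSpec(name="orders", domain="sales", owner="sales-team"). In this sketch, the storage, catalog, and governance steps stay behind the platform interface.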
What Problems Does a Self-Serve Platform Address?
Data mesh architecture allows organizations to develop valuable and useful data products. Knowledge experts create these products within business domains across the organization.
However, the actual value and utility of the data products can only be realized through cost-effective and efficient shareability and discoverability across the organization.
For this reason, a self-serve platform is key to unlocking the actual value of the data mesh. Below are four specific problems the platform addresses.
Shareability and discoverability: The platform allows domain teams to seamlessly share their data products across the organization. Likewise, knowledge workers can easily access and find useful data products on the platform.
Duplication of efforts from domains: Domain teams are empowered to share and use data products on the platform, which reduces duplication, specifically across cross-functional domain teams.
Costs of operation: The self-serve platform integrates with existing data platforms and shares standard capabilities with them. Additionally, it reduces cognitive load across domains and lets generalist technologists, rather than specialists, work throughout the system.
Product inconsistency and incompatibility: The platform provides domain-agnostic infrastructure and services, including the execution of policies that ensure the consistency and compatibility of each data product.
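To make the shareability and discoverability point concrete, here is a minimal in-memory sketch of a shared catalog where domains publish data products and anyone can search them. The Catalog and CatalogEntry names and the keyword-matching rule are assumptions made for illustration; a real platform would sit on top of a dedicated data catalog.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record for one data product."""
    name: str
    domain: str
    description: str
    tags: FrozenSet[str] = frozenset()


class Catalog:
    """Toy shared catalog: one place to publish and find data products."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def publish(self, entry: CatalogEntry) -> None:
        # Rejecting duplicate names nudges teams to reuse an existing
        # product instead of rebuilding it in another domain.
        if entry.name in self._entries:
            raise ValueError(f"data product {entry.name!r} already exists")
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> List[str]:
        """Return names of products whose description or tags match the keyword."""
        kw = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if kw in e.description.lower() or kw in {t.lower() for t in e.tags}
        )
```

Because every domain publishes to the same catalog, a consumer in one domain can find a product built in another, and a second attempt to publish a product under an existing name fails.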
Self-Serve Data Platform Role in Data Mesh
Within a data mesh architecture, the self-serve data platform is designed to optimize and integrate with existing technologies within a data architecture. Integration examples include data storage, processing frameworks, query languages, data catalogs, and pipeline workflow management.
In the book Data Mesh: Delivering Data-Driven Value at Scale, author Zhamak Dehghani outlines six platform characteristics. These self-serve platform characteristics optimize the existing platforms and allow for the functionality of the data mesh architecture.
Serves autonomous domain-oriented teams: Enables domain engineering teams in building, sharing, and using data products.
Manages autonomous interoperable data products: The platform works with data products to make them discoverable, usable, trustworthy, and secure for end users throughout the product life cycle.
Integrated platform of operational and analytical capabilities: The platform provides a connected experience for domains, bringing together the operational and analytical constituents of the data architecture.
Designed for a generalist majority: The platform promotes interoperability between different technologies and reduces the need for proprietary languages and experiences designed for specialists. It incentivizes and enables experienced generalist developers.
Favors decentralized technologies: The platform supports the decentralized data architecture of a data mesh. It also centralizes the tasks that help remove friction from domains and technologies.
Domain agnostic: The platform is designed to enable all domain teams by balancing domain-agnostic capabilities with domain-specific data modeling, processing, and sharing across the organization.
The Challenges of a Self-Serve Data Platform
Below is the primary challenge of the self-serve platform and the mesh principle that addresses the challenge.
Consistent product quality and policy adherence: As with the domain ownership and data as a product principles, the self-serve platform needs to support consistent product quality, security, privacy, and legal compliance. This challenge is addressed through the fourth mesh principle of federated computational governance, which provides an operating model to balance domain autonomy with organization-wide interoperability of a data mesh.
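One way to picture governance policies being automated by the platform is to express them as small checks that run against a data product's metadata. The sketch below is an assumption-laden illustration: the policy names, the metadata keys (owner, contains_pii, freshness_hours), and the evaluate function are all hypothetical. Organization-wide policies always run, and a domain may add its own on top without being able to replace the global ones.

```python
from typing import Callable, Dict, List, Optional

Policy = Callable[[dict], bool]

# Hypothetical organization-wide policies applied to every data product.
GLOBAL_POLICIES: Dict[str, Policy] = {
    "has_owner": lambda p: bool(p.get("owner")),
    "pii_is_classified": lambda p: "contains_pii" in p,
}


def evaluate(product: dict, domain_policies: Optional[Dict[str, Policy]] = None) -> List[str]:
    """Return the scoped names of all policies the product fails."""
    scopes = [("global", GLOBAL_POLICIES), ("domain", domain_policies or {})]
    return [
        f"{scope}:{name}"
        for scope, policies in scopes
        for name, check in policies.items()
        if not check(product)
    ]
```

A domain that cares about freshness could pass its own check, such as {"fresh_within_24h": lambda p: p.get("freshness_hours", float("inf")) <= 24}, and the platform would report its failures alongside any global ones.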
What’s Next After the Self-Serve Platform?
The self-serve data platform allows for the shareability and discoverability of data products created by business domains.
The data platform brings the data mesh architecture together: it reduces the costs of decentralized data ownership, lessens the complexity of data infrastructure, diminishes the need for technology specialists, and automates quality and data policies through governance.
The fourth principle, federated computational governance, ensures system engagement, mitigates domain isolation, administers product standardization, and addresses platform operational needs. We will cover this principle in our next blog post.