August 30, 2024
A guest post from Fabrício Ceolin, DevOps Engineer at Comet. Inspired by the growing demand…
I decided to write a series of blog posts on current topics: the elements of data governance that I have been thinking about, reading about, and following for a while. Even though the names are new, the ideas are not. They draw on several disciplines and consist mainly of new usage patterns focused on strengthening agility and scalability.
Data Fabric is an approach to working with data in globally consistent ways.
Let’s try to understand the reason behind the word “fabric” with the help of an analogy. Keep in mind that, in physics, the fabric of the universe is essentially and globally consistent. Now let’s continue…
According to the data fabric approach:
Like any other known data management architecture, data fabric starts from data sources. A data source may belong to sales, manufacturing, finance, health, R&D… In short, I am talking about a field-specific data source: the domain of the data.
Regardless of the domain, the data fabric must be consistent across all its components: a consistent data source, consistent integration, consistent metadata/catalog, consistent orchestration… This is the essence of the data fabric. Think of what we call the “fabric of the universe”: it is built from a smallest consistent unit, which is the atom for some and something smaller for others. For us, the data fabric is the smallest unit of data management that is globally consistent in itself.
This global consistency must be ensured for each data source separately. This is where the novelty of this approach comes from.
Data fabric is a self-consistent approach in which the organization follows a central data management strategy that covers all sub-data sources and can benefit from tools and methods determined according to the needs of the data source. Data fabric needs metadata management maturity.
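To make the idea of one central strategy covering every data source more concrete, here is a minimal sketch of a central metadata catalog. It is only an illustration under my own assumptions: the class names `DatasetEntry` and `CentralCatalog`, the metadata fields, and the example datasets are hypothetical and do not come from any particular data fabric product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetEntry:
    """Metadata record for one dataset (all field names are illustrative)."""
    name: str
    domain: str    # e.g. "sales", "finance"
    owner: str
    schema: tuple  # column names


@dataclass
class CentralCatalog:
    """One organization-wide catalog: the central strategy in miniature."""
    required_fields: tuple = ("name", "domain", "owner", "schema")
    entries: dict = field(default_factory=dict)

    def register(self, entry: DatasetEntry) -> None:
        # The same metadata rule applies to every source, whatever its domain.
        missing = [f for f in self.required_fields if not getattr(entry, f)]
        if missing:
            raise ValueError(f"{entry.name or '<unnamed>'}: missing metadata {missing}")
        self.entries[entry.name] = entry

    def by_domain(self, domain: str) -> list:
        # Look up dataset names for one domain through the single catalog.
        return sorted(e.name for e in self.entries.values() if e.domain == domain)


catalog = CentralCatalog()
catalog.register(DatasetEntry("orders", "sales", "sales-team", ("order_id", "amount")))
catalog.register(DatasetEntry("invoices", "finance", "finance-team", ("invoice_id", "total")))
```

Every source, whatever its domain, goes through the same `register` check, which is one simple way to picture why this approach leans so heavily on metadata management maturity.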
With its data fabric approach, Jaguar Land Rover has built an interconnected view of supply and demand data and used it to solve critical business challenges efficiently.
Advantages:
Disadvantages:
Treating data fabric as centralized data management across an organization may not be beneficial. So, let’s examine what a new concept, data mesh, is.
Data mesh allows us to keep benefiting from advantages such as consistency while addressing the disadvantages of the data fabric.
Data mesh is a new data architecture and governance approach that enables units or cross-functional teams to decentralize and manage their own data domains while collaborating to maintain data quality and consistency across the organization.
The word “fabric” can also mean texture, or actual cloth. So think about your clothes!
Yes, you are still reading an article about data governance. Are your clothes all made of the same fabric? No, each has a different fabric, weave, and texture. Even a single outfit can combine more than one fabric and weave. Its good looks come from the harmony of textures and colors, and if you like the outfit as a whole, you invest in it and wear it. Different fabrics also suit different seasons; we cannot spend an entire year in an outfit made of one type of fabric. Data is the same: we need one fabric for financial data, another for logistics data, another for automotive manufacturing data, and yet another for health data. It is from this idea that data mesh is born. If you take the data fabrics you have built separately for each sector, each with its own tools, and connect them with a network, you get a patchwork made of pieces of fabric. The stitching between the fabrics is the mesh!
Data mesh changes the scope of the data fabric. It keeps each sectoral data fabric consistent and says that if every piece of the patchwork is consistent within itself, the whole stitched-together result will be harmonious and consistent too.
The math says it, too. In other words, with data mesh we address data fabric’s lack of agility and innovation while moving from local consistency to proper global consistency. This approach is very similar to microservice architecture in software. It is also becoming the most effective way to ensure interoperability. The formation of data silos is prevented, and data mesh gives us efficient, fast, decentralized, federated, and simultaneous interoperability. How does it do this? Let’s continue by understanding the four basic principles.
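Data mesh as a concept was first introduced by Zhamak Dehghani in 2019 and is shaped by four fundamental principles: domain-oriented decentralized data ownership, data as a product, a self-serve data platform, and federated computational governance.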
When we place the four basic principles and the elements of these principles into the data mesh architecture, we obtain a mechanism as follows. Data mesh needs governance maturity rather than metadata maturity. I plan to cover each component of the data mesh mechanism one by one in another article.
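As a rough illustration of how such a mechanism could fit together, the sketch below models domain-owned data products that are published through a shared registry and checked against global policies written as code. Everything here is an assumption made for the example: the names `DataProduct`, `GLOBAL_POLICIES`, `violations`, and `MeshRegistry`, as well as the three policies, are hypothetical and not part of any standard data mesh tooling.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataProduct:
    """A dataset published by the domain team that owns it (hypothetical model)."""
    name: str
    domain: str
    owner: str
    schema: dict  # column name -> type name
    contains_pii: bool = False
    pii_masked: bool = False


# Federated governance: rules agreed across domains, expressed as code.
Policy = Callable[[DataProduct], bool]

GLOBAL_POLICIES: dict[str, Policy] = {
    "has_owner": lambda p: bool(p.owner),
    "has_schema": lambda p: len(p.schema) > 0,
    "pii_is_masked": lambda p: (not p.contains_pii) or p.pii_masked,
}


def violations(product: DataProduct) -> list[str]:
    """Return the names of the global policies this product fails."""
    return [name for name, check in GLOBAL_POLICIES.items() if not check(product)]


class MeshRegistry:
    """Stand-in for a self-serve platform: any domain can publish, same checks for all."""

    def __init__(self) -> None:
        self.products: dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        failed = violations(product)
        if failed:
            raise ValueError(f"{product.domain}/{product.name} fails: {failed}")
        self.products[f"{product.domain}/{product.name}"] = product


mesh = MeshRegistry()
mesh.publish(DataProduct("shipments", "logistics", "logistics-team", {"id": "int"}))
```

Each team keeps ownership of its own product, while the shared policies play the role of the stitching: they are the only part every domain has to agree on.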
Netflix, Dominos, Ducati, and J.P. Morgan can be examined as examples of organizations that implement data mesh.
I had the chance to attend the “Data and Analytics Summit” organized by Gartner on May 22–24, 2023. One hundred twenty (120) sessions were held at this event, run only by Gartner experts.
This summit consisted of conferences, workshops, roundtable events, ask-an-expert sessions, one-on-one meetings, bake-off meetings, and exhibition areas. Despite a hectic schedule, I had the opportunity to follow all sessions related to data fabric and data mesh. Here, I am sharing some of what I learned on the subject. Let’s start with Gartner’s famous Hype Cycle. Data mesh is just beginning to become hype, and data fabric is a little closer to the plateau. The expected time to maturity for both has been cut to 5–10 years.
Gartner experts also present a weighting matrix, based on their research, to help you decide which approach to choose for your organization. If you are hearing these concepts for the first time, you do not need to be afraid; as you can see, the rate of organizations that can implement them is relatively low. The Enterprise Maturity indicator is based on more than 1,400 Gartner customer interactions in 2022. Raising your organization’s level of preparation, in terms of metadata and governance maturity, will make a significant contribution.
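To show what a weighting-based comparison can look like in practice, here is a small sketch. It does not reproduce Gartner's matrix: the criteria, weights, fit values, and organization scores are all made-up placeholders. The only thing it takes from this article is the idea that data fabric leans on metadata maturity while data mesh leans on governance maturity.

```python
# All criteria, weights, and scores below are made-up placeholders,
# not Gartner's published values.
WEIGHTS = {"metadata_maturity": 0.5, "governance_maturity": 0.5}

# How strongly each approach relies on each criterion (0..1, hypothetical).
FIT = {
    "data_fabric": {"metadata_maturity": 1.0, "governance_maturity": 0.4},
    "data_mesh": {"metadata_maturity": 0.4, "governance_maturity": 1.0},
}


def readiness(approach: str, org_scores: dict) -> float:
    """Weighted sum of the organization's maturity, scaled by the approach's reliance on it."""
    return sum(
        WEIGHTS[criterion] * FIT[approach][criterion] * org_scores[criterion]
        for criterion in WEIGHTS
    )


# A hypothetical organization: strong metadata practice, weaker governance.
org = {"metadata_maturity": 0.8, "governance_maturity": 0.3}
scores = {approach: readiness(approach, org) for approach in FIT}
best = max(scores, key=scores.get)
```

With these placeholder numbers the data fabric option scores higher, simply because the example organization is stronger on metadata than on governance; different inputs would tip the result the other way.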
In conclusion, data mesh and data fabric offer different advantages and disadvantages, and choosing the right approach for your organization depends on several factors, such as organizational structure and culture, technical maturity, and data governance and security requirements.
While the data mesh approach emphasizes decentralized data ownership and management, data fabric advocates a centralized data platform to ensure data quality, consistency, and security.
To select the best approach, it may be advisable to run pilot projects that assess how well each one suits your organization.
Ultimately, the best approach will align with your organization’s goals, resources, and strategic direction and provide users with relevant data and insights to make data-driven decisions. In addition, it should give flexibility to use advanced analytical and artificial intelligence approaches.
So, with what tools can we apply these governance approaches? Keep following my future content.
Feel free to follow me on GitHub and Twitter for more content!