Understanding Data Fabric

Srinivasan Sundararajan

In its recently announced technology trends in data management, Gartner has introduced the concept of “Data Fabric”. Here is the link to the document: Top Trends in Data and Analytics for 2021: Data Fabric Is the Foundation (gartner.com).

According to Gartner, the data fabric approach can enhance traditional data management patterns and replace them with a more responsive approach. Because it is key to an enterprise data management strategy, this article looks at data fabric in more detail.

What is Data Fabric?

Today’s enterprise data stores and data volumes are growing rapidly. Data fabric aims to simplify the management of enterprise data sources and make it easier to extract insights from them. A data fabric has the following attributes:

  • Connects to multiple data sources
  • Provides data discovery across data sources
  • Stores metadata and data catalog information about the data sources
  • Provides data ingestion capabilities, including data transformation
  • Offers a data lake and data storage option
  • Stores multi-modal data, both structured and unstructured
  • Integrates data across clouds
  • Includes an inbuilt graph engine to link data and expose complex relationships
  • Supports data virtualization to integrate data that need not be physically moved
  • Provides data governance and data quality management
  • Includes an inbuilt AI/ML engine for machine learning capabilities
  • Shares data both within an enterprise and across enterprises
  • Offers easy-to-configure workflows without much coding (a low-code environment)
  • Supports comprehensive use cases like Customer 360, Patient 360, and more

As evident, data fabric aims to provide a superset of all desired data management capabilities under a single unified platform, making it an obvious choice for the future of enterprise data management.

Data Virtualization

While most of the above capabilities are part of existing enterprise data management platforms, one important capability that distinguishes a data fabric platform is data virtualization.

Data virtualization creates a data abstraction layer by connecting, gathering, and transforming data silos to support real-time and near real-time insights. It gives you direct access to transactional and operational systems in real time, whether on-premises or in the cloud.

A basic implementation of data virtualization queries an external data source natively, without actually moving the data. For example, a Hadoop HDFS data source can be queried from a data fabric platform so that the external data can be integrated with other data.
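As a rough illustration of the idea (not any specific vendor's API; all class and source names below are hypothetical), a virtualization layer can expose external sources through a uniform query interface and pull rows only at query time, rather than staging a copy:

```python
class ExternalSource:
    """Wraps a remote system (e.g. HDFS files, a cloud database)."""
    def __init__(self, name, fetch):
        self.name = name
        self._fetch = fetch  # callable returning an iterable of rows

    def rows(self):
        # Rows are pulled lazily at query time; nothing is staged locally.
        yield from self._fetch()


class DataFabric:
    """Registry of external sources queried in place (data virtualization)."""
    def __init__(self):
        self.sources = {}

    def register(self, source):
        self.sources[source.name] = source

    def query(self, source_name, predicate):
        # A federated query evaluated against the live source.
        return [r for r in self.sources[source_name].rows() if predicate(r)]


# Simulated "HDFS" source; a real connector would stream from the cluster.
def hdfs_orders():
    return [
        {"order_id": 1, "amount": 250},
        {"order_id": 2, "amount": 990},
    ]

fabric = DataFabric()
fabric.register(ExternalSource("hdfs_orders", hdfs_orders))
big_orders = fabric.query("hdfs_orders", lambda r: r["amount"] > 500)
print(big_orders)  # [{'order_id': 2, 'amount': 990}]
```

In a real platform, the fetch callable would stream results over the external system's native protocol (an HDFS or JDBC connector, for instance), and the results could then be joined with data from other registered sources.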


While this kind of external data source access has been available for a while, data fabric also aims to solve the performance issues associated with data virtualization. Some of the techniques used by data fabric platforms are:

  • Pushing some computations down to the external source to optimize the overall query
  • Scaling out compute resources by providing parallelism
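These two techniques can be sketched in a few lines of Python. This is a toy model under the assumption that each remote partition can evaluate a filter locally (predicate pushdown) and that independent partitions can be scanned in parallel; it is not a real connector:

```python
from concurrent.futures import ThreadPoolExecutor

def remote_scan(partition, predicate=None):
    """Simulated external partition scan. With pushdown, the predicate
    runs at the source, so fewer rows cross the network."""
    return partition if predicate is None else [r for r in partition if predicate(r)]

# Two partitions of an external table (illustrative data).
partitions = [
    [{"id": 1, "amount": 100}, {"id": 2, "amount": 900}],
    [{"id": 3, "amount": 700}, {"id": 4, "amount": 50}],
]
pred = lambda r: r["amount"] > 500

# Pushdown: filtering happens inside each remote scan, and the scans of
# independent partitions run in parallel (scale-out).
with ThreadPoolExecutor() as pool:
    results = pool.map(lambda p: remote_scan(p, predicate=pred), partitions)
matches = [r for part in results for r in part]
print(matches)  # only rows with amount > 500 come back
```

The payoff is that only two of the four rows ever leave the "remote" side, and the two partition scans do not wait on each other.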

Multi-Cloud

As explained earlier, another critical capability of data fabric platforms is the ability to integrate data from multiple cloud providers. This capability is still in its early stages, as different cloud platforms have different architectures and no uniform way of connecting to one another. However, this feature will mature in the coming years.

Advanced Use Cases 

Data fabric should support advanced use cases like Customer 360, Product 360, etc. These are comprehensive views of all the linkages between enterprise data entities, typically implemented using graph technologies. Since data fabric supports graph databases and graph queries as an inherent feature, these advanced linkages are part of the data fabric platform.
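As a minimal sketch of the idea, a Customer 360 view can be assembled by traversing a graph of linked entities. Real platforms use a dedicated graph engine and query language; the example below substitutes a plain adjacency list and breadth-first traversal, with all entity names invented for illustration:

```python
from collections import deque

# Hypothetical linkages between a customer and related enterprise records.
edges = {
    "customer:42": ["order:1001", "ticket:77"],
    "order:1001": ["product:widget"],
    "ticket:77": [],
    "product:widget": [],
}

def customer_360(start):
    """Breadth-first traversal collecting everything linked to a customer."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)

print(customer_360("customer:42"))
```

The unified view is simply the connected component around the customer node; a graph engine performs the same traversal declaratively and at scale.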

Data Sharing  

Data fabric platforms should also focus on data sharing, not only within the enterprise but also across enterprises. While a focus on API management helps with data sharing, this functionality has to be enhanced further, as data sharing also needs to take care of privacy and other data governance needs.
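Conceptually, a governance-aware sharing step might look like the following sketch, where a hypothetical consent registry is consulted before any record leaves the enterprise. The registry, names, and record shape are all illustrative assumptions, not a real API:

```python
# Hypothetical consent registry: which requesters may receive whose data.
consent = {"enterprise-A": {"partner-B"}}

def share(owner, requester, record):
    """Release a record only if the owner has consented to the requester."""
    if requester not in consent.get(owner, set()):
        raise PermissionError(f"{requester} has no consent from {owner}")
    return record  # in practice: audited, possibly masked, then transferred

print(share("enterprise-A", "partner-B", {"metric": 42}))  # allowed
```

A request from a party not in the registry would raise an error instead of silently leaking data, which is the essence of the governance requirement.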

Data Lakes 

While earlier platforms similar to data fabric used the enterprise data warehouse as their backbone, data fabric uses a data lake as its backbone. A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data, and run different types of analytics – from dashboards and visualizations to big data processing, real-time analytics, and machine learning to guide better decisions.

Data Fabric Players

At the time of writing this article, Gartner has not published a Magic Quadrant for data fabric platforms. However, there is a report from Forrester that ranks data fabric platforms in the form of a Forrester Wave.

Some of the key platforms mentioned in that report are:

  • Talend
  • Oracle
  • SAP
  • Denodo Technologies
  • Cambridge Semantics
  • Informatica
  • Cloudera
  • Infoworks

While the detailed explanation and architecture of these platforms can be covered in a subsequent article, Talend’s data fabric platform is one example that packages these building blocks into a single suite.


Enterprises can also consider building their own data fabric platform by combining the best features of various individual components. For example, from the Microsoft ecosystem perspective:

  • SQL Server Big Data Clusters has data virtualization capabilities
  • Azure Purview has data governance and metadata management capabilities
  • Azure Data Lake Storage provides data lake capabilities
  • Azure Cosmos DB has a graph database engine
  • Azure Data Factory has data integration features
  • Azure Machine Learning and SQL Server have machine learning capabilities

However, as evident, we are yet to see strong products and platforms in the area of multi-cloud data management, especially data virtualization across cloud providers in a performance-focused manner.

About the Author –

Srini is the Technology Advisor for GAVS. He is currently focused on Healthcare Data Management Solutions for the post-pandemic Healthcare era, using the combination of Multi-Modal databases, Blockchain, and Data Mining. The solutions aim at Patient data sharing within Hospitals as well as across Hospitals (Healthcare Interoperability) while bringing more trust and transparency into the healthcare process using patient consent management, credentialing, and zero-knowledge proofs.

Reimagining ITSM Metrics

Rama Vani Periasamy

In an IT organization, what is measured as success? Predominantly, it inclines towards Key Performance Indicators, internally focused metrics, SLAs, and other numbers. Why don’t we shift our performance reporting towards the ‘value’ delivered to our customers, along with the contractually agreed service levels? The success of any IT operation comes from defining what it can do to deliver value, and publishing the value delivered is the best way to celebrate that success.

It has been a concern that people in service management overlook value as trivial and often don’t deliver any real information about the work they do. In other words, the value they have created goes unreported, and the focus lies only on SLA-driven metrics and contractual obligations. This may be because they are more comfortable with the conventional way of demonstrating the SLA targets achieved. And this eventually prevents a business partner from playing a more strategic role.

“Watermelon reporting” is a phrase used to describe a service provider’s performance reporting. The SLA reports show that the service provider has adhered to the agreed service levels and met all contractual service level targets; it looks ‘green’ on the outside, just like a watermelon. However, the level of service perceived by the service consumer does not reflect the reported ‘green’ status (it might actually be ‘red’, like the inside of a watermelon). And the service provider continues to report on metrics that do not address the pain points.

This misses the whole point about understanding what success really means to a consumer. We tend to overlook valuable data – the data that shows how an organization, as a service provider, is delivering value and helping the customer achieve their business goals.

The challenge here is that often consumers have underdeveloped, ambiguous and conflicting ideas about what they want and need. It is therefore imperative to discover the users’ unarticulated needs and translate them into requirements.

For a service provider, a meaningful way of reporting success would focus on outcomes rather than outputs, which is very much in tandem with ITIL 4. This creates a demand for better reporting and analysis of delivery, performance, customer success, and value created.

Consider a healthcare provider: the reduced time spent retrieving a patient’s history during surgery can be a key business metric, while the number of incidents created and the number of successful changes may be secondary. As a service provider, understanding how its services support such business metrics adds meaning to the service delivered and enables value co-creation.

It is vital that a strong communication avenue is established between the customer and the service provider teams to understand the context of the customer’s business. To a large extent, this helps the service provider teams prioritize what they do based on what is critical to the success of the customer or service consumer. More importantly, it enables the provider to become a true partner to its customers.

Taking the service desk as an example, service desk engineers fix printers and laptops and reset passwords. These activities may not provide business value directly, but they help mitigate loss or disruption to a service consumer’s business activities. The other principal part of service desk activity is responding to service requests. This is very much an area where the business value delivered to customers can be measured using ITSM.

Easier said than done; but how, and what, business value is to be reported? Here are some examples that are good enough to get started.

1. Productivity
Assume that every time a laptop problem is fixed within the SLA, it allows the customer to get back to work and be productive. Value can be measured here by the cost reduction, considering the employee cost per hour and the time spent by the IT team to fix the laptop.
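This cost-reduction measure can be made concrete with a small worked example. The hourly rate and fix times below are illustrative assumptions, not benchmarks:

```python
def productivity_cost(employee_cost_per_hour, hours_down):
    """Cost of an employee being unable to work for hours_down hours."""
    return employee_cost_per_hour * hours_down

# Assume an employee costs 60 per hour. An SLA-compliant fix takes
# 0.5 h of downtime; without it, downtime might stretch to 4 h.
cost_with_sla = productivity_cost(60, 0.5)   # 30.0
cost_without = productivity_cost(60, 4)      # 240
value_delivered = cost_without - cost_with_sla
print(value_delivered)  # 210.0 of productivity preserved per incident
```

Reporting "210 of productivity preserved per laptop incident" speaks to the business in a way that "SLA met" alone does not.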

How long does it take for the service provider to provide what a new employee needs to be productive? This measure of how long it takes to get people set up with the required resources, and whether this lead time matches the level of agility the business requires, equates to business value.

2. Continual Service Improvement (CSI)

Measuring value becomes meaningless when there is no CSI. Measuring the cost of fixing an incident plus the loss of productivity, then identifying and providing solutions to reduce those costs or avoid the incidents altogether, is where CSI comes into play.
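The same arithmetic extends to the CSI case: the total cost of an incident is the fix cost plus the productivity loss, and the value of an improvement is the cost of the incidents it avoids. All figures below are illustrative assumptions:

```python
def incident_cost(fix_hours, engineer_rate, downtime_hours, employee_rate):
    """Total cost of one incident: engineer time plus employee downtime."""
    return fix_hours * engineer_rate + downtime_hours * employee_rate

cost = incident_cost(fix_hours=1.5, engineer_rate=40,
                     downtime_hours=2, employee_rate=60)  # 60.0 + 120 = 180.0
incidents_avoided = 10  # e.g. per quarter, after a CSI improvement
print(cost * incidents_avoided)  # 1800.0: the business case for the fix
```

Putting a number like this next to the SLA figures turns a CSI proposal from a cost item into a quantified saving.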

Here are some key takeaways:

  • Make reporting meaningful by demonstrating the value delivered and co-created, uplifting your operations to a more strategic level.
  • Speak to your customers to capture their requirements in terms of value and enable value co-creation as partners.
  • Your report may wind up in the trash not because you have reported the wrong metrics, but because it reports data that is of little importance to your audience.

Reporting value may seem challenging, and it really is, but that should not stop you. Keep reporting your SLAs and metrics, but add more insight to them. Keep an eye on your outcomes and prevent your IT service operations from turning into a watermelon!


About the Author –

Rama is a part of the Quality Assurance group, passionate about ITSM. She loves reading and traveling.
To break the monotony of life and to share her interest in books and travel, she blogs and curates at www.kindleandkompass.com