Data fabric
https://www.ibm.com/topics/data-fabric
https://xie.infoq.cn/article/6c7888d32d56033c018d643e6
https://zhuanlan.zhihu.com/p/450450744
Data fabric is an architecture that facilitates the end-to-end integration of various data pipelines and cloud environments through the use of intelligent and automated systems.
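New terms keep appearing these days. Data fabric is essentially data governance or data integration, but with a stronger emphasis on using intelligent and automated systems.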
Over the last decade, developments within hybrid cloud, artificial intelligence, the internet of things (IoT), and edge computing have led to the exponential growth of big data, creating even more complexity for enterprises to manage. This has made the unification and governance of data environments an increasing priority as this growth has created significant challenges, such as data silos, security risks, and general bottlenecks to decision making.
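Over the past decade, with the emergence of IoT, hybrid cloud, and edge computing, both the scale and the complexity of data have grown substantially, making data unification and governance more important and more challenging.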
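- Data silos
- Serious data quality problems
- Inefficient data delivery and duplicated data development
- Increasingly serious security and compliance problems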
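Simply put, a data fabric uses data technologies such as data federation, AI-based active metadata, knowledge graphs, and semantic enrichment to connect data, access it across data sources, and deliver it, thereby reducing data silos.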
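In particular, data virtualization connects data at the compute layer rather than the storage layer, "building a bridge between data processing engines and data consumers". Connecting data this way also avoids continually creating new data silos.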
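To make the idea of connecting data at the compute layer concrete, here is a minimal sketch. It is only an illustration, not how any particular data fabric product works: the two sources (`orders_db`, `crm_records`) and the `customer_spend` view are hypothetical names invented for this example. The point is that the join happens when the consumer asks for the data, and no combined copy is written to a new store.

```python
import sqlite3

# Hypothetical source 1: an operational database (an in-memory SQLite DB here).
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (2, 35.5), (1, 60.0)],
)

# Hypothetical source 2: a CRM export, held as plain records.
crm_records = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]


def customer_spend():
    """Virtual view: join both sources at query time, without persisting a copy."""
    names = {r["customer_id"]: r["name"] for r in crm_records}
    rows = orders_db.execute(
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    for customer_id, total in rows:
        yield {"name": names.get(customer_id), "total": total}


result = list(customer_spend())
```

Because the view is computed on demand, consumers always see the current state of both sources, and there is no third dataset that could drift into becoming another silo.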
Architecturally, it has six layers:
- Data Management layer: This is responsible for data governance and security of data.
- Data Ingestion Layer: This layer begins to stitch cloud data together, finding connections between structured and unstructured data.
- Data Processing: The data processing layer refines the data to ensure that only relevant data is surfaced for data extraction.
- Data Orchestration: This critical layer conducts some of the most important jobs for the data fabric—transforming, integrating, and cleansing the data, making it usable for teams across the business.
- Data Discovery: This layer surfaces new opportunities to integrate disparate data sources. For example, it might find ways to connect data in a supply chain data mart and customer relationship management data system, enabling new opportunities for product offers to clients or ways to improve customer satisfaction.
- Data Access: This layer allows for the consumption of data, ensuring the right permissions for certain teams to comply with government regulations. Additionally, this layer helps surface relevant data through the use of dashboards and other data visualization tools.
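As a rough illustration of what the Data Discovery layer might do, the sketch below proposes join candidates between two datasets by measuring how much their column values overlap. This is one simple heuristic chosen for the example, not a description of how real data fabric tools discover relationships. The datasets, column names, and the `threshold` parameter are all assumptions.

```python
def overlap(a, b):
    """Jaccard similarity of two columns' distinct values."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def join_candidates(left, right, threshold=0.5):
    """Return (left_col, right_col, score) tuples whose values overlap enough."""
    found = []
    for lname, lvals in left.items():
        for rname, rvals in right.items():
            score = overlap(lvals, rvals)
            if score >= threshold:
                found.append((lname, rname, score))
    # Highest-scoring candidates first.
    return sorted(found, key=lambda t: -t[2])


# Hypothetical column samples from a supply chain data mart and a CRM system.
supply_chain = {
    "cust_no": [101, 102, 103, 104],
    "sku": ["A1", "B2", "C3", "A1"],
}
crm = {
    "customer_id": [101, 102, 103, 105],
    "segment": ["smb", "ent", "smb", "ent"],
}

candidates = join_candidates(supply_chain, crm)
```

Here `cust_no` and `customer_id` share three of five distinct values, so they are the only pair suggested as a possible link between the two systems, even though the column names differ.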
Of these, data orchestration is the key to the fabric.
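The three jobs attributed to the orchestration layer (cleansing, transforming, integrating) can be sketched as a small pipeline. This is a toy example under stated assumptions: the record fields, the `regions` lookup, and all function names are hypothetical, and a real orchestrator would also handle scheduling, retries, and monitoring.

```python
def cleanse(rows):
    """Drop records without an id and trim whitespace from names."""
    return [
        {**r, "name": r["name"].strip()}
        for r in rows
        if r.get("id") is not None
    ]


def transform(rows):
    """Normalize names to one casing so records from different sources match."""
    return [{**r, "name": r["name"].title()} for r in rows]


def integrate(rows, regions):
    """Enrich each record with a region from a second (hypothetical) source."""
    return [{**r, "region": regions.get(r["id"], "unknown")} for r in rows]


def run_pipeline(rows, regions):
    """Run the steps in a fixed order: cleanse, then transform, then integrate."""
    return integrate(transform(cleanse(rows)), regions)


raw = [
    {"id": 1, "name": "  acme corp "},
    {"id": None, "name": "orphan"},
    {"id": 2, "name": "GLOBEX"},
]
regions = {1: "EMEA"}

clean = run_pipeline(raw, regions)
```

Keeping each step a plain function makes the order explicit and lets each step be tested or replaced on its own.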
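Which technologies does a fabric mainly rely on?
This figure lays it out fairly clearly: augmented knowledge graphs, intelligent data integration, self-service data, and so on.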
The six capability pillars of Data Fabric
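Gartner's comparison of Data Fabric and Data Mesh
Data Mesh focuses more on people and process than on technical architecture, whereas Data Fabric is a technical architecture approach that handles the complexity of data and metadata in an intelligent way.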