The data lineage solution provided by Apache NiFi proves to be an excellent tool for auditing a data pipeline.
Apache NiFi is a tool for moving and manipulating data from a multitude of sources. It provides a no-code, graphical approach to configuring real-time data streaming, ingestion, and management solutions for a variety of use cases, and it is highly configurable.
Data provenance documents the inputs, entities, systems, and processes that influence data of interest, in effect providing a historical record of the data and its origins. NiFi's Data Provenance capability, also referred to as data lineage, tracks and monitors data flows from beginning to end, so we can understand exactly what happens to each piece of data that is received, including every transformation applied to it. The Provenance Repository holds all of the provenance event data.
For each FlowFile we are given a directed graph that shows when it was received, when it was modified, when it was routed in a particular way, and when and where it was sent, as well as which component performed the action. From this view a user can analyze individual events in detail, observe the attributes modified as part of an event, and replay or retry events. As long as the provenance data has not been aged off and the referenced content is still available in the content repository, any FlowFile can be replayed from any point in the flow.
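The lineage graph described above can be pictured with a small in-memory model. The sketch below is illustrative only and does not use NiFi's API: the `ProvenanceEvent` class, the `lineage` and `lineage_edges` helpers, and the sample data are all hypothetical. It also assumes the simplest case, in which a FlowFile's history is a single chain of events with no forks, joins, or clones.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ProvenanceEvent:
    """Hypothetical, simplified stand-in for one provenance event."""
    event_id: int
    event_type: str      # e.g. "RECEIVE", "ATTRIBUTES_MODIFIED", "ROUTE", "SEND"
    flowfile_uuid: str   # which FlowFile the event belongs to
    component: str       # component that performed the action
    timestamp: float     # when the event happened


def lineage(events: Iterable[ProvenanceEvent], flowfile_uuid: str) -> List[ProvenanceEvent]:
    """Return the events of one FlowFile, ordered by time (event id breaks ties)."""
    matching = [e for e in events if e.flowfile_uuid == flowfile_uuid]
    return sorted(matching, key=lambda e: (e.timestamp, e.event_id))


def lineage_edges(events: Iterable[ProvenanceEvent], flowfile_uuid: str) -> List[Tuple[int, int]]:
    """Return directed edges (earlier event id, later event id) along the chain."""
    chain = lineage(events, flowfile_uuid)
    return [(a.event_id, b.event_id) for a, b in zip(chain, chain[1:])]


# Made-up sample data: one FlowFile received, modified, routed, and sent.
EVENTS = [
    ProvenanceEvent(3, "ROUTE", "ff-1", "RouteOnAttribute", 12.0),
    ProvenanceEvent(1, "RECEIVE", "ff-1", "ListenHTTP", 10.0),
    ProvenanceEvent(2, "ATTRIBUTES_MODIFIED", "ff-1", "UpdateAttribute", 11.0),
    ProvenanceEvent(4, "SEND", "ff-1", "PutSFTP", 13.0),
    ProvenanceEvent(5, "RECEIVE", "ff-2", "ListenHTTP", 10.5),
]

if __name__ == "__main__":
    for event in lineage(EVENTS, "ff-1"):
        print(event.timestamp, event.event_type, event.component)
    print(lineage_edges(EVENTS, "ff-1"))
```

Each edge connects two consecutive events of the same FlowFile, which is enough to answer the audit questions raised above: when the data arrived, what changed it, how it was routed, and where it was sent.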
One of the most important features of NiFi is built-in support for data provenance. NiFi provides a robust interface for monitoring data as it moves through the configured system, as well as the ability to view data provenance at each step. Users can also observe failed queues and view or download the input and output content claims of an event. Apache NiFi originated from the NSA Technology Transfer Program in the autumn of 2014.
The provenance data is also crucial to another key feature of NiFi: allowing the user to replay FlowFiles.
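Replay depends on two conditions: the provenance event must not have been aged off, and the content it references must still be present in the content repository. The following is a minimal sketch of that check, not NiFi's implementation. The `EventRecord` class, the `can_replay` function, and the dictionary and set used as stand-ins for the two repositories are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical provenance record pointing at a piece of stored content."""
    event_id: int
    content_claim: Optional[str]  # identifier of the stored content, if any


def can_replay(event_id: int,
               provenance_index: Dict[int, EventRecord],
               content_repository: Set[str]) -> bool:
    """Return True only if both replay conditions hold for the given event."""
    event = provenance_index.get(event_id)
    if event is None:
        # The provenance data for this event has been aged off.
        return False
    if event.content_claim is None:
        # The event does not reference any stored content.
        return False
    # The referenced content must still exist in the content repository.
    return event.content_claim in content_repository


# Made-up sample state: event 2's content is gone, event 3 was aged off.
PROVENANCE = {
    1: EventRecord(1, "claim-a"),
    2: EventRecord(2, "claim-b"),
}
CONTENT = {"claim-a"}

if __name__ == "__main__":
    for event_id in (1, 2, 3):
        print(event_id, can_replay(event_id, PROVENANCE, CONTENT))
```

In this model, retention of provenance records and retention of content are independent, which is why both lookups are needed before a replay can be offered.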
NiFi automatically records, indexes, and makes available the provenance data as objects flow through the system. As a result, NiFi enables easy collection, curation, analysis and action on any data anywhere (edge, cloud, data center) with built-in end-to-end security and provenance.
NiFi became an official Apache Project in July of 2015. NiFi has been in development for 8 years.