Overview
Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that automates the movement and tracking of data between systems. It is based on flow-based programming concepts and provides a web-based user interface for managing dataflows in real time.
In this instructor-led, live training, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi.
By the end of this training, participants will be able to:
- Understand NiFi’s architecture and dataflow concepts.
- Develop extensions using NiFi and third-party APIs.
- Develop their own custom Apache NiFi processor.
- Ingest and process real-time data from disparate and uncommon file formats and data sources.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Requirements
- Java programming experience.
- Experience with Maven.
Audience
- Developers
- Data engineers
Course Outline
Introduction
- Data at rest vs data in motion
Overview of Big Data Tools and Technologies
- Hadoop (HDFS and MapReduce) and Spark
Installing and Configuring NiFi
Overview of NiFi Architecture
Development Approaches
- Application development tools and mindset
- Extract, Transform, and Load (ETL) tools and mindset
Design Considerations
Components, Events, and Processor Patterns
Exercise: Streaming Data Feeds into HDFS
Error Handling
Controller Services
Exercise: Ingesting Data from IoT Devices using Web-Based APIs
Exercise: Developing a Custom Apache NiFi Processor using JSON
Testing and Troubleshooting
Contributing to Apache NiFi
Summary and Conclusion