Overview
Apache Sqoop is a command line interface for moving data between relational databases and Hadoop. Apache Flume is distributed software for collecting, aggregating, and moving large amounts of streaming data such as logs. Using Sqoop and Flume, users can transfer data between systems and ingest big data into storage architectures such as Hadoop.
This instructor-led, live training (online or onsite) is aimed at software engineers who wish to use Sqoop and Flume for transferring data between systems.
By the end of this training, participants will be able to:
- Ingest big data with Sqoop and Flume.
- Ingest data from multiple data sources.
- Move data from relational databases to HDFS and Hive.
- Export data from HDFS to a relational database (both directions are sketched after this list).
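As a rough illustration of the last two objectives, a Sqoop import and export pair might look like the following. This is a minimal sketch; the connection string, credentials, table, and directory names are placeholders, not part of the course material:

    # Import a MySQL table into HDFS (hypothetical names throughout)
    sqoop import \
      --connect jdbc:mysql://dbhost/sales \
      --username etl --password-file /user/etl/.db-pass \
      --table orders \
      --target-dir /data/orders

    # Export HDFS data back into a relational table
    sqoop export \
      --connect jdbc:mysql://dbhost/sales \
      --username etl --password-file /user/etl/.db-pass \
      --table orders_summary \
      --export-dir /data/orders_summary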
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange it.
Requirements
- Experience with SQL
Audience
- Software Engineers
Course Outline
Introduction
Sqoop and Flume Overview
- What is Sqoop?
- What is Flume?
- Sqoop and Flume features
Preparing the Development Environment
- Installing and configuring Apache Sqoop
- Installing and configuring Apache Flume (a setup sketch follows this list)
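Setup details vary by Hadoop distribution; the following is a minimal sketch assuming standalone binary tarballs, with illustrative version numbers and paths:

    # Unpack the tarballs (versions and paths are illustrative)
    tar -xzf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /opt
    tar -xzf apache-flume-1.9.0-bin.tar.gz -C /opt

    # Put both tools on the PATH
    export SQOOP_HOME=/opt/sqoop-1.4.7.bin__hadoop-2.6.0
    export FLUME_HOME=/opt/apache-flume-1.9.0-bin
    export PATH=$PATH:$SQOOP_HOME/bin:$FLUME_HOME/bin

    # Both tools locate an existing Hadoop installation via the environment
    export HADOOP_HOME=/opt/hadoop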
Apache Flume
- Creating an agent
- Using spool sources, file channels, and logger sinks
- Working with events
- Accessing data sources (see the example agent configuration below)
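A minimal agent configuration tying these pieces together might look like this. The agent and component names (a1, src1, ch1, sink1) are arbitrary and the directories are placeholders:

    # agent.conf — one agent with a spooling-directory source,
    # a durable file channel, and a logger sink for testing
    a1.sources  = src1
    a1.channels = ch1
    a1.sinks    = sink1

    # Spooling-directory source: ingests files dropped into a watched directory
    a1.sources.src1.type = spooldir
    a1.sources.src1.spoolDir = /var/spool/flume-in
    a1.sources.src1.channels = ch1

    # File channel: buffers events on disk so they survive agent restarts
    a1.channels.ch1.type = file
    a1.channels.ch1.checkpointDir = /var/flume/checkpoint
    a1.channels.ch1.dataDirs = /var/flume/data

    # Logger sink: prints events to the agent's log, handy while developing
    a1.sinks.sink1.type = logger
    a1.sinks.sink1.channel = ch1

The agent would then be started with flume-ng agent --conf conf --conf-file agent.conf --name a1.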
Apache Sqoop
- Importing MySQL to HDFS and Hive
- Using Sqoop jobs (examples below)
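For example, a Hive import and a reusable saved job could be sketched as follows (the database, table, and job names are hypothetical):

    # Import a MySQL table straight into a Hive table
    sqoop import \
      --connect jdbc:mysql://dbhost/sales \
      --username etl -P \
      --table customers \
      --hive-import --hive-table sales.customers

    # Save the same import as a named job, then execute it on demand
    sqoop job --create daily_customers -- import \
      --connect jdbc:mysql://dbhost/sales \
      --username etl --table customers \
      --hive-import --hive-table sales.customers
    sqoop job --exec daily_customers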
Data Ingestion Pipelines
- Building pipelines
- Fetching data
- Ingesting data to HDFS (see the sink sketch below)
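As a sketch of this final step, the logger sink from the earlier Flume example could be swapped for an HDFS sink so that events land in Hadoop (the path and roll settings are illustrative):

    # HDFS sink: writes buffered events into date-partitioned HDFS directories
    a1.sinks.sink1.type = hdfs
    a1.sinks.sink1.channel = ch1
    a1.sinks.sink1.hdfs.path = /data/incoming/%Y-%m-%d
    a1.sinks.sink1.hdfs.fileType = DataStream
    a1.sinks.sink1.hdfs.rollInterval = 300
    # Needed because the path uses time escapes and events may lack timestamps
    a1.sinks.sink1.hdfs.useLocalTimeStamp = true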
Summary and Conclusion