Talend Open Studio for Big Data is an open-source ETL tool for processing big data. It includes a development environment for connecting to Big Data sources and targets and for running jobs without writing code.
This instructor-led, live training (online or onsite) is aimed at technical professionals who wish to deploy Talend Open Studio for Big Data to simplify the process of reading and crunching through Big Data.
By the end of this training, participants will be able to:
- Install and configure Talend Open Studio for Big Data.
- Connect with Big Data systems such as Cloudera, Hortonworks, MapR, Amazon EMR, and Apache Hadoop.
- Understand and set up Open Studio’s big data components and connectors.
- Configure parameters to automatically generate MapReduce code.
- Use Open Studio’s drag-and-drop interface to run Hadoop jobs.
- Prototype big data pipelines.
- Automate big data integration projects.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Requirements
- An understanding of relational databases
- An understanding of data warehousing
- An understanding of ETL (Extract, Transform, Load) concepts
Audience
- Business intelligence professionals
- Database professionals
- SQL Developers
- ETL Developers
- Solution architects
- Data architects
- Data warehousing professionals
- System administrators and integrators
Course Outline
Overview of “Open Studio for Big Data” Features and Architecture
Setting up Open Studio for Big Data
Navigating the UI
Understanding Big Data Components and Connectors
Connecting to a Hadoop Cluster
Reading and Writing Data
Processing Data with Hive and MapReduce
Analyzing the Results
Improving the Quality of Big Data
Building a Big Data Pipeline
Managing Users, Groups, Roles, and Projects
Deploying Open Studio to Production
Monitoring Open Studio
Summary and Conclusion