Pentaho Data Integration Fundamentals Training Course

Overview

Pentaho Data Integration is an open-source data integration tool for defining jobs and data transformations.

In this instructor-led, live training, participants will learn how to use Pentaho Data Integration’s powerful ETL capabilities and rich GUI to manage an entire big data lifecycle and maximize the value of data within their organization.

By the end of this training, participants will be able to:

  • Create, preview, and run basic data transformations containing steps and hops
  • Configure and secure the Pentaho Enterprise Repository
  • Harness disparate sources of data and generate a single, unified version of the truth in an analytics-ready format
  • Provide results to third-party applications for further processing

Audience

  • Data analysts
  • ETL developers

Format of the course

  • Part lecture, part discussion, with exercises and heavy hands-on practice

Requirements

  • An understanding of relational databases
  • An understanding of data warehousing
  • An understanding of ETL (Extract, Transform, Load) concepts

Course Outline

Introduction

Installing and Configuring Pentaho

Overview of Pentaho Features and Architecture

Understanding Pentaho’s In-Memory Caching

Navigating the User Interface

Connecting to a Data Source

Configuring the Pentaho Enterprise Repository

Transforming Data

Viewing the Transformation Results

Resolving Transformation Errors

Processing a Data Stream

Reusing Transformations

Scheduling Transformations

Securing Pentaho

Integrating with Third-party Applications (Hadoop, NoSQL, etc.)

Analytics and Reporting

Pentaho Design Patterns and Best Practices

Troubleshooting

Summary and Conclusion
