Presto for Data Science Training Course

Overview

Presto is a distributed SQL query engine for big data analytics. With Presto, users can query data where it lives and combine data from multiple systems in a single query.

This instructor-led, live training (online or onsite) is aimed at data scientists who wish to query big data sources with Presto.

By the end of this training, participants will be able to:

  • Apply key Presto concepts to optimize modern big data systems.
  • Use Presto to run exabyte-scale warehouses.
  • Clone data to a proprietary data storage system.
  • Work with existing analytics and BI tools such as R and Tableau.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized version of this course, please contact us.

Requirements

  • Experience with SQL

Audience

  • Data Scientists

Course Outline

Introduction

Presto and Query Engines

  • What is Presto?
  • ANSI SQL

Preparing the Development Environment

  • Setting up a sandbox and Presto
  • Connecting Tableau
  • Connecting R

In-Place Analysis

  • Working with connectors
  • Benchmarking with TPC-H

SQL Concepts

  • Retrieving data
  • Combining data sources
  • Using SQL functions
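
As a rough illustration of the "Combining data sources" topic, the sketch below builds a query that joins tables from two different catalogs. It relies on Presto's three-part table naming (catalog.schema.table). The catalog, schema, table, and column names are hypothetical, and the helper functions are illustrative only, not part of any Presto client library.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a three-part Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql(orders: str, customers: str) -> str:
    """Return SQL that joins an orders table to a customers table.

    The column names (customer_id, id, name, total) are assumptions
    made for this example.
    """
    return (
        "SELECT c.name, sum(o.total) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name\n"
        "ORDER BY revenue DESC"
    )


# Hypothetical catalogs: one backed by Hive, one backed by MySQL.
sql = federated_join_sql(
    qualified("hive", "sales", "orders"),
    qualified("mysql", "crm", "customers"),
)
print(sql)
```

The helpers do no identifier quoting or escaping, so they should only be given trusted names. The resulting string would be submitted through whichever Presto client the sandbox provides.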

Advanced SQL Concepts

  • Working with Bollinger Bands
  • Accessing data
  • Filtering data
  • Migrating data sources
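
For readers unfamiliar with Bollinger Bands: they consist of a moving average plus and minus a multiple of the moving standard deviation. The sketch below is a minimal reference computation, assuming the common defaults of a 20-period window and a multiplier of 2 (the course itself does not specify these). The table and column names in the SQL string (prices, day, close) are hypothetical; the SQL shows one way the same calculation might be written with window functions.

```python
from statistics import fmean, pstdev


def bollinger_bands(prices, window=20, k=2.0):
    """Return a (middle, upper, lower) tuple for each full window of prices."""
    bands = []
    for end in range(window, len(prices) + 1):
        chunk = prices[end - window:end]
        middle = fmean(chunk)            # simple moving average
        spread = k * pstdev(chunk)       # k population standard deviations
        bands.append((middle, middle + spread, middle - spread))
    return bands


# A possible SQL equivalent using window functions (hypothetical table).
# Unlike the Python function, this also emits rows for the first 19 days,
# where the window is only partially filled.
BOLLINGER_SQL = """
SELECT day,
       close,
       avg(close) OVER (ORDER BY day ROWS BETWEEN 19 PRECEDING AND CURRENT ROW) AS middle,
       avg(close) OVER (ORDER BY day ROWS BETWEEN 19 PRECEDING AND CURRENT ROW)
         + 2 * stddev_pop(close) OVER (ORDER BY day ROWS BETWEEN 19 PRECEDING AND CURRENT ROW) AS upper,
       avg(close) OVER (ORDER BY day ROWS BETWEEN 19 PRECEDING AND CURRENT ROW)
         - 2 * stddev_pop(close) OVER (ORDER BY day ROWS BETWEEN 19 PRECEDING AND CURRENT ROW) AS lower
FROM prices
"""

if __name__ == "__main__":
    for middle, upper, lower in bollinger_bands([1, 2, 3, 4, 5], window=3):
        print(f"{lower:.3f} {middle:.3f} {upper:.3f}")
```

The Python version is useful as a check on small samples before running the windowed query against a large table.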

Summary and Conclusion
