Scaling Data Analysis with Python and Dask Training Course

Overview

Dask is a flexible, high-performance Python library for parallel computing. It scales and accelerates big data processing built on other Python data science libraries, such as Pandas, NumPy, and scikit-learn.

This instructor-led, live training (online or onsite) is aimed at data scientists and software engineers who wish to use Dask with the Python ecosystem to process and analyze large datasets at scale.

By the end of this training, participants will be able to:

Set up the environment to start building big data processing workflows with Dask and Python.
Explore the features, libraries, tools, and APIs available in Dask.
Understand how Dask uses parallel computing to accelerate Python workloads.
Learn how to scale the Python ecosystem (NumPy, SciPy, and Pandas) using Dask.
Optimize the Dask environment to maintain high performance in handling large datasets.

Format of the Course

Interactive lecture and discussion.
Lots of exercises and practice.
Hands-on implementation in a live-lab environment.

Course Customization Options

To request a customized training for this course, please contact us.

Requirements

Experience with data analysis
Python programming experience

Audience

Data scientists
Software engineers

Course Outline

Introduction

Overview of Dask features and advantages
Parallel computing in Python

Getting Started

Installing Dask
Dask libraries, components, and APIs
Best practices and tips

Scaling NumPy, SciPy, and Pandas

Dask arrays examples and use cases
Chunks and blocked algorithms
Overlapping computations
SciPy stats and LinearOperator
NumPy slicing and assignment
DataFrames and Pandas
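
As a taste of the "chunks and blocked algorithms" topic above, here is a minimal sketch of the general idea: split the data into chunks, compute a small partial result per chunk in parallel, then combine the partials. It deliberately uses only the Python standard library rather than Dask's own API, and the helper names (split_into_chunks, blocked_mean) are hypothetical, chosen for this illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Cut a sequence into consecutive blocks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def blocked_mean(values, chunk_size=4):
    """Compute a mean block by block: per-chunk (sum, count), then combine."""
    chunks = split_into_chunks(values, chunk_size)
    # Each chunk is reduced independently, so the work can run in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda c: (sum(c), len(c)), chunks))
    # Combine the small per-chunk results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


data = list(range(10))
print(blocked_mean(data, chunk_size=4))
```

The point of the pattern is that no step needs the whole dataset at once: each worker sees one chunk, and only the small partial results are brought together at the end.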

Dask Internals and Graphical UI

Supported interfaces
Scheduler and diagnostics
Analyzing performance
Graph computation

Optimizing and Deploying Dask

Setting up adaptive deployments
Connecting to remote data
Debugging parallel programs
Deploying Dask clusters
Working with GPUs
Deploying Dask on cloud environments

Troubleshooting

Summary and Next Steps
