Azure Databricks is a unified data analytics platform for storing, processing, and visualizing large volumes of data from multiple sources. It provides a collaborative environment for building, deploying, and managing data analytics workloads.
This instructor-led, live training (online or onsite) is aimed at data scientists and developers who wish to set up, deploy, and manage data analytics solutions using Databricks.
By the end of this training, participants will be able to:
- Set up and configure Databricks.
- Understand how Databricks and Apache Spark work together.
- Load and transform data in Databricks.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange it.
Requirements
- Basic understanding of data analytics
- Knowledge of Apache Spark
Audience
- Data Engineers
- Data Scientists
Introduction
- Overview of Databricks and Apache Spark
- Understanding the Databricks architecture
Setting up the Environment
- Setting up and configuring Databricks
- Navigating the Databricks user interface
- Creating a Databricks workspace
Working with Data in Databricks
- Connecting to an Apache Spark data source
- Understanding the basics of columns and data types
- Managing the file system from notebooks
Managing Jobs and Clusters
- Creating and configuring clusters
- Creating jobs from notebooks
- Running jobs
- Viewing jobs and job details
Using Delta Lake in Databricks
- Loading data into Delta Lake
- Managing data in Delta Lake
- Managing Databricks security
- Managing backup and recovery
Summary and Next Steps