Apache Spark is a distributed processing engine for analyzing very large data sets. It can process data in batches and in real time, and it also supports machine learning, ad-hoc queries, and graph processing. .NET for Apache Spark is a free, open-source, cross-platform big data analytics framework that supports applications written in C# or F#.
This instructor-led, live training (online or onsite) is aimed at developers who wish to carry out big data analysis using Apache Spark in their .NET applications.
By the end of this training, participants will be able to:
- Install and configure Apache Spark.
- Understand how .NET implements Spark APIs so that they can be accessed from a .NET application.
- Develop data processing applications using C# or F# that can handle data sets measured in terabytes and petabytes.
- Develop machine learning features for a .NET application using Apache Spark capabilities.
- Carry out exploratory analysis using SQL queries on big data sets.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us to arrange it.
Requirements
- .NET programming experience using C# or F#
Overview of Apache Spark Features and Architecture
- Apache Spark modules: Spark SQL, Spark Streaming, MLlib, GraphX
- RDDs, DataFrames, driver and worker nodes, the DAG (directed acyclic graph), etc.
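To make the RDD and DAG topics above more concrete, here is a minimal sketch in plain Python of lazy evaluation: transformations only record a step, and nothing runs until an action is called. This is not Spark's API or the course's own material. The class `LazyDataset` and its methods are hypothetical names chosen for illustration, and the "DAG" here is just a linear chain of steps on a single machine.

```python
from typing import Callable, Iterable, List, Optional


class LazyDataset:
    """Hypothetical stand-in for an RDD: records steps, runs them on demand."""

    def __init__(self, source: Iterable, steps: Optional[List[Callable]] = None):
        self._source = source
        self._steps = steps or []  # recorded lineage (a linear chain of steps)

    def map(self, fn):
        # Transformation: returns a new dataset; nothing is computed yet.
        return LazyDataset(self._source, self._steps + [lambda it: map(fn, it)])

    def filter(self, pred):
        # Transformation: also lazy.
        return LazyDataset(self._source, self._steps + [lambda it: filter(pred, it)])

    def collect(self):
        # Action: only now is the recorded chain actually executed.
        data = iter(self._source)
        for step in self._steps:
            data = step(data)
        return list(data)


calls = []


def square(x):
    calls.append(x)  # record when the work really happens
    return x * x


ds = LazyDataset(range(1, 6)).map(square).filter(lambda v: v % 2 == 1)
pending = len(calls)   # still 0: building the chain did no work
result = ds.collect()  # [1, 9, 25]
```

In Spark itself the recorded plan is also split across worker nodes and scheduled by the driver, which this single-process sketch leaves out.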
Setting up Apache Spark on .NET
- Preparing the Java VM
- Running .NET for Apache Spark using .NET Core
- Creating a sample .NET console application
- Adding the Spark driver
- Initializing a SparkSession
- Executing the application
- Building a data preparation pipeline
- Performing ETL (Extract, Transform, and Load)
- Building a machine learning model
- Preparing the data
- Training a model
- Processing streaming data in real time
- Case study: monitoring sensor data
- Working with Spark SQL
- Analyzing structured data
- Plotting results
- Using third-party tools to visualize results
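As a rough illustration of the kind of exploratory SQL query covered in the topics above, the sketch below aggregates a handful of sensor readings with a GROUP BY. It uses Python's built-in SQLite module as a stand-in, not Spark SQL, so it shows only the shape of the query rather than how Spark executes it. The table name, column names, and sample values are all hypothetical.

```python
import sqlite3

# Hypothetical sensor readings: (sensor_id, temperature in Celsius).
readings = [
    ("s1", 20.0),
    ("s1", 22.0),
    ("s2", 30.0),
    ("s2", 34.0),
    ("s2", 35.0),
]

# Load the rows into an in-memory table (the "load" step of a tiny ETL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id TEXT, temperature REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", readings)

# Exploratory query: reading count and average temperature per sensor.
query = """
    SELECT sensor_id, COUNT(*) AS n, AVG(temperature) AS avg_temp
    FROM readings
    GROUP BY sensor_id
    ORDER BY sensor_id
"""
summary = conn.execute(query).fetchall()
conn.close()

for sensor_id, n, avg_temp in summary:
    print(f"{sensor_id}: {n} readings, average {avg_temp:.1f} C")
```

The resulting rows could then be handed to a plotting tool of choice for visualization.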
Summary and Conclusion