SMACK is a collection of data platform softwares, namely Apache Spark, Apache Mesos, Apache Akka, Apache Cassandra, and Apache Kafka. Using the SMACK stack, users can create and scale data processing platforms.
This instructor-led, live training (online or onsite) is aimed at data scientists who wish to use the SMACK stack to build data processing platforms for big data solutions.
By the end of this training, participants will be able to:
- Implement a data pipeline architecture for processing big data.
- Develop a cluster infrastructure with Apache Mesos and Docker.
- Analyze data with Spark and Scala.
- Manage unstructured data with Apache Cassandra.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
- An understanding of data processing systems
- Data Scientists
SMACK Stack Overview
- What is Apache Spark? Apache Spark features
- What is Apache Mesos? Apache Mesos features
- What is Apache Akka? Apache Akka features
- What is Apache Cassandra? Apache Cassandra features
- What is Apache Kafka? Apache Kafka features
- Scala syntax and structure
- Scala control flow
Preparing the Development Environment
- Installing and configuring the SMACK stack
- Installing and configuring Docker
- Using actors
- Creating a database for read operations
- Working with backups and recovery
- Creating a stream
- Building an Akka application
- Storing data with Cassandra
- Reviewing connectors
- Working with clusters
- Creating, publishing, and consuming messages
- Allocating resources
- Running clusters
- Working with Apache Aurora and Docker
- Running services and jobs
- Deploying Spark, Cassandra, and Kafka on Mesos
- Managing data flows
- Working with RDDs and dataframes
- Performing data analysis
- Handling failure of services and errors
Summary and Conclusion