Category: Explain basic virtualization concepts

  • Apache Spark SQL Training Course

    Overview Spark SQL is Apache Spark’s module for working with structured data. Spark SQL gives Spark information about the structure of the data as well as the computation being performed, and Spark can use this information to perform optimizations. Two common uses for Spark SQL are to execute SQL queries and to read data […]


  • Alluxio: Unifying Disparate Storage Systems Training Course

    Overview Alluxio is an open-source virtual distributed storage system that unifies disparate storage systems and enables applications to interact with data at memory speed. It is used by companies such as Intel, Baidu and Alibaba. In this instructor-led, live training, participants will learn how to use Alluxio to bridge different computation frameworks with storage systems […]


  • Magellan: Geospatial Analytics on Spark Training Course

    Overview Magellan is an open-source distributed execution engine for geospatial analytics on big data. Implemented on top of Apache Spark, it extends Spark SQL and provides a relational abstraction for geospatial analytics. This instructor-led, live training introduces the concepts and approaches for implementing geospatial analytics and walks participants through the creation of a predictive analysis […]


  • Hortonworks Data Platform (HDP) for Administrators Training Course

    Overview Hortonworks Data Platform (HDP) is an open-source Apache Hadoop support platform that provides a stable foundation for developing big data solutions on the Apache Hadoop ecosystem. This instructor-led, live training (online or onsite) introduces Hortonworks Data Platform (HDP) and walks participants through the deployment of a Spark + Hadoop solution. By the end of this […]


  • Spark for Developers Training Course

    Overview OBJECTIVE: This course will introduce Apache Spark. The students will learn how Spark fits into the Big Data ecosystem, and how to use Spark for data analysis. The course covers the Spark shell for interactive data analysis, Spark internals, Spark APIs, Spark SQL, Spark Streaming, machine learning, and GraphX. AUDIENCE: Developers / Data […]


  • Apache Drill Performance Optimization and Debugging Training Course

    Overview Apache Drill is a schema-free, distributed, in-memory columnar SQL query engine for Hadoop, NoSQL, and other cloud and file storage systems. The power of Apache Drill lies in its ability to join data from multiple data stores using a single query. Apache Drill supports numerous NoSQL databases and file systems, including HBase, MongoDB, […]


  • Apache Accumulo Fundamentals Training Course

    Overview Apache Accumulo is a sorted, distributed key/value store that provides robust, scalable data storage and retrieval. It is based on the design of Google’s BigTable and is powered by Apache Hadoop, Apache ZooKeeper, and Apache Thrift. This instructor-led, live course covers the working principles behind Accumulo and walks participants through the development of a […]


  • Apache ActiveMQ Training Course

    Overview Apache ActiveMQ is an open source message broker written in Java. Course Outline: Understanding message-oriented middleware and JMS; Connecting to ActiveMQ; ActiveMQ message storage; Securing ActiveMQ; Creating Java applications with ActiveMQ; Integrating ActiveMQ with application servers; ActiveMQ messaging for other languages; Advanced client options; Tuning ActiveMQ for performance; Administering and monitoring ActiveMQ.


  • Zeppelin for Interactive Data Analytics Training Course

    Overview Apache Zeppelin is a web-based notebook for capturing, exploring, visualizing and sharing Hadoop and Spark based data. This instructor-led, live training introduces the concepts behind interactive data analytics and walks participants through the deployment and usage of Zeppelin in a single-user or multi-user environment. By the end of this training, participants will be able […]


  • Apache Hama Training Course

    Overview Apache Hama is a framework based on the Bulk Synchronous Parallel (BSP) computing model and is primarily used for Big Data analytics. In this instructor-led, live training, participants will learn the fundamentals of Apache Hama as they step through the creation of a BSP-based application and a vertex-centric program using the Apache Hama framework. […]
