Understand Big Data and the components of the Hadoop ecosystem, such as HDFS. In this Apache Spark training module, you will learn about Hadoop cluster architecture, get an introduction to Spark, and see the difference between batch processing and real-time processing.
Learn the basics of Scala that are required for programming Spark applications. In this Apache Spark course module, you will also learn about the basic constructs of Scala, such as variable types, control structures, and collections such as Array, ArrayBuffer, Map, and List.
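As a taste of the constructs named in this module, here is a minimal sketch in plain Scala (no Spark required). The object name `CollectionsSketch` and the helper `wordCounts` are hypothetical names chosen for illustration, not part of the course material.

```scala
import scala.collection.mutable.ArrayBuffer

object CollectionsSketch {
  // Hypothetical helper: counts how often each word occurs and
  // returns the result as an immutable Map.
  def wordCounts(words: Seq[String]): Map[String, Int] =
    words.groupBy(identity).map { case (word, group) => (word, group.size) }

  // Control structure: a for-comprehension with a guard keeps even numbers only.
  def evens(numbers: Seq[Int]): List[Int] =
    (for (n <- numbers if n % 2 == 0) yield n).toList

  def main(args: Array[String]): Unit = {
    // Variable types: val is immutable, var can be reassigned.
    val greeting: String = "hello"
    var counter: Int = 0
    counter += 1

    // Array: fixed length, but elements can be updated in place.
    val fixed: Array[Int] = Array(1, 2, 3)
    fixed(0) = 10

    // ArrayBuffer: a growable, mutable sequence.
    val growable = ArrayBuffer(1, 2, 3)
    growable += 4

    // List: an immutable linked list.
    val langs: List[String] = List("scala", "java", "scala")

    println(s"$greeting $counter")
    println(fixed.mkString(","))
    println(evens(growable.toSeq))
    println(wordCounts(langs))
  }
}
```

Keeping `wordCounts` and `evens` as small pure functions makes them easy to try out in the Scala REPL before they are reused inside a Spark job.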
In this Scala course module, you will learn about object-oriented programming and functional programming techniques in Scala.
Understand Apache Spark and learn how to develop Spark applications. At the end, you will learn how to perform data ingestion using Sqoop.
Gain insight into Spark RDDs and the RDD operations used to implement business logic (transformations, actions, and functions performed on RDDs).
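The difference between transformations and actions can be sketched as follows. This is a minimal example, assuming the `spark-core` library is on the classpath and that a local master (`local[*]`) is acceptable; the object name `RddSketch`, the helper `parseSale`, and the sample sales records are hypothetical and used only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSketch {
  // Hypothetical business-logic function: turns "region,amount" into a pair.
  // Kept pure so it can be checked without starting Spark.
  def parseSale(line: String): (String, Double) = {
    val parts = line.split(",")
    (parts(0).trim, parts(1).trim.toDouble)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: run locally, using all available cores.
    val conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample records, distributed as an RDD.
      val lines = sc.parallelize(Seq("north,10.0", "south,5.5", "north,2.5"))

      // Transformations: lazy, they only describe the computation.
      val totals = lines
        .map(parseSale)
        .filter { case (_, amount) => amount > 0 }
        .reduceByKey(_ + _)

      // Action: collect() triggers execution and returns results to the driver.
      totals.collect().foreach { case (region, total) =>
        println(s"$region -> $total")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Nothing is computed until `collect()` is called; the `map`, `filter`, and `reduceByKey` calls only build up the lineage of the RDD.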
In this Apache Spark online training module, you will learn about Spark SQL, which is used to process structured data with SQL queries. You will work with DataFrames and Datasets in Spark SQL and with the different kinds of SQL operations that can be performed on DataFrames. You will also learn about Spark and Hive integration.
Learn why machine learning is needed, and get to know different machine learning techniques and algorithms as well as Spark MLlib.
Implement various algorithms supported by MLlib such as Linear Regression, Decision Tree, Random Forest and many more.
Understand Kafka and its architecture. Also learn about the Kafka cluster and how to configure different types of Kafka clusters. Get introduced to Apache Flume, its architecture, and how it is integrated with Apache Kafka for event processing. At the end, learn how to ingest streaming data using Flume.
Work with Spark Streaming, which is used to build scalable, fault-tolerant streaming applications. Also learn about DStreams and the various transformations performed on streaming data. You will get to know commonly used streaming operators such as sliding window operators and stateful operators.
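To make the idea of a sliding window concrete, here is a simplified model in plain Scala. It does not use the Spark Streaming API: each inner sequence stands in for one batch of events, and the window length and slide interval are counted in batches rather than expressed as time durations. The names `WindowSketch` and `windowedCounts` are hypothetical.

```scala
object WindowSketch {
  // Simplified model of a windowed count.
  // batches:   one inner Seq per batch interval, holding that batch's events
  // windowLen: how many consecutive batches each window covers
  // slide:     how many batches the window advances between results
  def windowedCounts(
      batches: Seq[Seq[String]],
      windowLen: Int,
      slide: Int
  ): List[Map[String, Int]] =
    batches
      .sliding(windowLen, slide)
      .map { window =>
        // Merge the batches in this window, then count events per key.
        window.flatten.groupBy(identity).map { case (k, v) => (k, v.size) }
      }
      .toList

  def main(args: Array[String]): Unit = {
    // Made-up event stream split into four batches.
    val batches = Seq(Seq("a"), Seq("b", "a"), Seq("a"), Seq("b"))
    // A window of 3 batches that slides forward 1 batch at a time.
    windowedCounts(batches, windowLen = 3, slide = 1).foreach(println)
  }
}
```

Because consecutive windows overlap, the same event contributes to more than one result, which is the behaviour that distinguishes a sliding window from processing each batch on its own.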
In this Apache Spark and Scala training module, you will learn about the different streaming data sources, such as Kafka and Flume. At the end of the module, you will be able to create a Spark Streaming application.