Course Provider
What will you learn in this course?
- Hadoop: History, Architecture, Hadoop Components, HDFS Architecture, HDFS Operations
- Hands-on Exercises on Jigsaw Lab
- MapReduce Concept
- MapReduce Architecture
- YARN and MapReduce Internals
- Hands-on Exercises
- Hive Architecture and Components
- Data Storage in Hive
- Data Types in Hive
- Hive Query Language Features
- Partitions in Hive
- Joins in Hive
- Advanced features: Handling JSON and XML formats in Hive
- Hands-on Exercises
- HBase Overview and Architecture
- Data Model
- Bulk Data Upload to HBase
- Oozie Introduction and Overview
- Oozie Workflow
- Coordinator in Oozie
- Need for Visualizations
- Importance of Big Data Visualization
- Big Data Visualization Tools: Tableau Products
- Tableau Installation and Workspace Overview
- Working with Tableau
- Creating interactive dashboards
- Integrating Tableau with Hadoop
- Exploratory Data Analysis for Data Science using R
- Skill Type: Emerging Tech
- Domain: Big Data Analytics
- Course Category: Popular Tech Topics
- Course Price: INR 4,999
- Course Duration: 73 Hours
Why should you take this course?
- Understand what Hadoop, the Hadoop Distributed File System (HDFS), and MapReduce are, and gain working knowledge of how to use them
- Be able to work with HDFS and perform all basic operations
- Run MapReduce jobs with a given pre-compiled jar file and check the output
- Get introduced to Spark concepts and set up the environment
- Gain a good understanding of RDDs and working knowledge of RDD operations
- Get an overview of Spark architecture, performance tuning concepts, job submission, and job management
- Understand and know how to use Spark Streaming, Spark SQL, DataFrames, and their APIs
- Understand Spark MLlib through practical examples
Who should take this course?
IT professionals who want to move into data engineering
Curriculum
- Learn data structures in R and derive key insights through exploratory data analysis in R
- Get introduced to Spark concepts and set up the environment
- Gain a good understanding of RDDs and working knowledge of RDD operations
- Get an overview of Spark architecture, performance tuning concepts, job submission, and job management
- Understand and know how to use Spark Streaming, Spark SQL, DataFrames, and their APIs
- Understand Spark MLlib through practical examples
- FutureSkills Prime badge that can be added to your LinkedIn profile
- Certificate by Jigsaw Manipal on course completion
Tools you will learn in this course
- Core Spark
- DataFrames
- RDD Operations
- Shared Variables
- Spark MLlib
- Spark SQL