Big Data With Hadoop

What is Big Data with Hadoop?

Hadoop is an open-source software project that enables the distributed processing of large data sets across clusters of commodity servers. Hadoop changes the economics and the dynamics of large-scale computing. The impact of Big Data with Hadoop training and expertise can be boiled down to four salient characteristics. For this reason, Hadoop-certified professionals are now in huge demand all over the world, with excellent pay packages and growth potential where the sky is the limit.

Our Approach:

Students are at the top of our priority list, and we always make sure that every student is given the best training possible. To provide the best training, all our training modes have been made interactive sessions. Students can choose from four training modes depending on their requirements, with different methods available for individuals as well as for corporates. Unlike most online trainings today, our online trainings are interactive sessions similar to our classroom trainings: the student connects to our live virtual classroom, where they are able to interact with the trainer.

We provide some of the best professional training in the industry. The courses are run by experts with ample industry experience in the subject matter, and they are kept up to professional standards with the latest industry updates. Contact our team at Jenrac Technologies for all your queries.

By the end of this training you will:
- Understand the core concepts of the Big Data module.
- Be able to apply the knowledge learned to progress in your career as a Big Data Developer.

Essential:
Minimum knowledge of an object-oriented language such as Core Java, Python, or Ruby.

Recommended/Additional:
Experience in the above-mentioned languages is recommended but not essential.

Classroom Training: Instructor-led training in our dynamic learning environment, based at our office in West London. The classroom is fitted with all the essential amenities needed to ensure a comfortable training experience, and with this training you will have the opportunity to network with other learners, share experiences, and develop social interaction.

Online: Unlike most organisations, our online training is a tutor-led system similar to the classroom-based training in every aspect, making it more convenient for students in any location around the world, as well as more cost-effective.

Onsite: This training is designed specifically for corporate clients who wish to train their staff in different technologies. Clients can tailor the duration of the course to their requirements, and the training can be delivered in-house, at a location of your choice, or online.

Customised one-to-one: A tailored course for students looking for the tutor's undivided attention at all times. The duration and contents of the course are customised specifically to suit the student's requirements. In addition, the timings of the training can also be customised based on the availability of both the tutor and the student.

Contractors can expect to earn between £300 and £500 per day, depending on experience. Permanent roles on average offer a salary of between £30,000 and £60,000 per annum, again depending on the experience required for the job.

Although there is no guarantee of a job on course completion, most students are able to find a suitable position within a few weeks of successfully completing the course. As part of our placement service, we offer CV reviewing, in which your CV is reviewed by our experts and essential modifications are recommended so that your CV perfectly suits the kind of training you have taken.

Course Preview

• Introduction to Big Data
• Introduction to Hadoop
• The relationship between Big Data and Hadoop
• Why Hadoop?
• Components of the Hadoop Ecosystem
o Storage Components
o Processing Components
• Overview of the complete training

• HDFS Introduction
• HDFS layout
• Importance of HDFS in Hadoop
• HDFS Features
• Storage aspects of HDFS
» Block
» Configuring block size
» Difference between Default and Configurable Block size
» Design Principles of Block Size
• HDFS Architecture
• HDFS Daemons and their Functionalities
» NameNode
» Secondary Name Node
» DataNode
• HDFS Use cases
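To make the storage aspects above concrete, here is a small illustrative sketch (plain Python, not the real HDFS client API) of how a file is carved into fixed-size blocks, assuming the 128 MB default block size of Hadoop 2.x and later:

```python
# Illustrative sketch only: shows how HDFS-style fixed-size blocks
# would be carved out of a file. Not the real HDFS API.

DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the HDFS default

def split_into_blocks(file_size_bytes, block_size=DEFAULT_BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block."""
    blocks = []
    offset = 0
    while offset < file_size_bytes:
        length = min(block_size, file_size_bytes - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

# A 300 MB file needs three blocks: two full 128 MB blocks
# and one final 44 MB block (the last block may be smaller).
blocks = split_into_blocks(300 * 1024 * 1024)
print(len(blocks))                       # 3
print(blocks[-1][1] // (1024 * 1024))    # 44
```

Passing a different `block_size` mirrors the configurable block size discussed above: a larger block means fewer blocks (and fewer mappers) for the same file.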

• What is Map Reduce?
• Map Reduce use cases
• Map Reduce functionalities
• Importance of Map Reduce in Hadoop
• Processing Daemons of Hadoop
 » Job Tracker
 » Task Tracker
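As a minimal illustration of the Map Reduce model, the classic word count can be expressed as a map phase, a shuffle/sort phase, and a reduce phase. This is a plain-Python sketch of the idea only; real Hadoop jobs implement Mapper and Reducer classes against the Java API:

```python
from collections import defaultdict

# Plain-Python sketch of the Map Reduce model (not Hadoop's Java API).

def mapper(line):
    # Emit a (word, 1) pair for every word in an input line.
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(mapped_pairs):
    # Group all values by key, as the framework does between map and reduce.
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped

def reducer(key, values):
    # Sum the counts emitted for one word.
    return (key, sum(values))

lines = ["Hadoop is fast", "Hadoop is scalable"]
mapped = [pair for line in lines for pair in mapper(line)]
counts = dict(reducer(k, v) for k, v in shuffle(mapped).items())
print(counts["hadoop"])  # 2
```

The same three-stage shape underlies every Map Reduce job covered in this module; only the mapper and reducer logic change.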

• Input Split
 » Role of Input Split in Map Reduce
 » InputSplit Size Vs Block Size
 » InputSplit Vs Mappers
• How to write a basic Map Reduce Program
 » Driver Code
 » Mapper Code
 » Reducer Code
• Driver Code
 - Importance of Driver Code in a Map Reduce program
 - How to Identify the Driver Code in Map Reduce program
 - Different sections of Driver code
• Mapper Code
 - Importance of Mapper Phase in Map Reduce
 - How to Write a Mapper Class?
 - Methods in Mapper Class
• Reducer Code
 - Importance of Reduce phase in Map Reduce
 - How to Write Reducer Class?
 - Methods in Reducer Class
• Input and Output Formats in Map Reduce
• Map Reduce API (Application Programming Interface)
 - New API
 - Deprecated API
• Combiner in Map Reduce
 - Importance of combiner in Map Reduce
 - How to use the combiner class in Map Reduce?
 - Performance tradeoffs with respect to the Combiner
• Partitioner in Map Reduce
 - Importance of Partitioner class in Map Reduce
 - How to use the Partitioner class in Map Reduce
 - Hash Partitioner functionality
 - How to write a custom Partitioner
• Joins - in Map Reduce
 - Map Side Join
 - Reduce Side Join
 - Performance Trade Off
• How to debug Map Reduce jobs in Local and Pseudo-distributed mode
• Introduction to Map Reduce Streaming
• Data localization in Map Reduce
• Secondary Sorting Using Map Reduce
• Job Scheduling
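The partitioner topic above can also be sketched in plain Python (Hadoop's actual HashPartitioner is a Java class in the Map Reduce API): each key is routed to a reducer by hashing it modulo the number of reducers, and a custom partitioner simply substitutes its own routing rule. The first-letter rule below is a hypothetical example, not a Hadoop built-in:

```python
# Plain-Python sketch of Hadoop's partitioning idea; the real
# HashPartitioner is a Java class in the Map Reduce API.

def hash_partitioner(key, num_reducers):
    # Route a key to a reducer bucket; identical keys always land
    # on the same reducer, which is what makes reduce-side grouping work.
    return hash(key) % num_reducers

def custom_partitioner(key, num_reducers):
    # Hypothetical custom rule: words starting a-m go to reducer 0,
    # n-z to reducer 1 (assumes exactly 2 reducers).
    return 0 if key[0].lower() <= "m" else 1

keys = ["apple", "mango", "zebra", "apple"]
# The same key always maps to the same partition within a run:
assert hash_partitioner("apple", 4) == hash_partitioner("apple", 4)
print([custom_partitioner(k, 2) for k in keys])  # [0, 0, 1, 0]
```

A custom partitioner like this is how secondary sorting and skew handling are implemented: the routing rule controls which reducer sees which keys.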

• Apache PIG Introduction
• Introduction to PIG Data Flow Engine
• Map Reduce vs. PIG
• PIG use cases
• Data Types in PIG
• Basic PIG programming
• Modes of Execution in PIG
- Local Mode
- Map Reduce Mode
• Execution Mechanisms
- Grunt Shell
- Script
- Embedded
• Transformations in PIG
• User defined functions in PIG
• Word Count concept in PIG
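The word-count concept listed above is usually the first PIG exercise. The Pig Latin dataflow (LOAD, then FOREACH with TOKENIZE, then GROUP, then COUNT) can be mimicked in plain Python to show what each relational step produces; this is illustrative only, not Pig itself:

```python
from collections import Counter

# Plain-Python mimic of the classic Pig word-count dataflow.
# In Pig Latin this would be LOAD -> FOREACH/TOKENIZE -> GROUP -> COUNT.

lines = ["pig runs on hadoop", "pig compiles to map reduce"]

# FOREACH ... GENERATE FLATTEN(TOKENIZE(line)): one row per word.
words = [w for line in lines for w in line.split()]

# GROUP words BY word, then COUNT each group.
counts = Counter(words)

print(counts["pig"])  # 2
```

The point of the comparison with Map Reduce: the same job that took a mapper, a shuffle, and a reducer in Java collapses into a few declarative Pig Latin statements, which PIG then compiles down to Map Reduce jobs.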

• Introduction to Apache Hive
• Importance of HIVE in Hadoop
• Hive Architecture
- Driver
- Compiler
- Executor (Semantic Analyzer)
• Metastore in Hive
- Importance of the Hive Metastore
- Embedded Metastore configuration
- External Metastore configuration
- Communication mechanism with the Metastore
• Hive Integration with Hadoop
• Hive Query Language (HiveQL)
• Configuring Hive with a MySQL Metastore
• SQL vs. HiveQL
• Data Slicing Mechanisms
- Partitions In Hive
- Buckets In Hive
- Partitioning Vs Bucketing
- Real Time Use Cases
• Collection Data Types in HIVE
- Array
- Struct
- Map
• User Defined Functions (UDFs) in HIVE
- Importance of UDFs in HIVE
• Hive Serializer/Deserializer
• HIVE – HBASE Integration
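The partitioning-vs-bucketing distinction above can be illustrated with a small sketch. Hive performs these steps internally when a table is declared with PARTITIONED BY and CLUSTERED BY ... INTO n BUCKETS; the column names and bucket count below are hypothetical examples, and the code is plain Python, not Hive:

```python
# Illustrative sketch of Hive's data-slicing mechanisms.
# Column names (country, user_id) and the bucket count are hypothetical.

rows = [
    {"country": "UK", "user_id": 101},
    {"country": "UK", "user_id": 206},
    {"country": "IN", "user_id": 103},
]

# Partitioning: one directory per distinct column VALUE
# (e.g. .../country=UK/ on disk), so a filter on country
# can skip whole directories.
partitions = {}
for row in rows:
    partitions.setdefault(row["country"], []).append(row)

# Bucketing: a FIXED number of files per partition, chosen by
# hashing a clustering column (here user_id) modulo the bucket count.
NUM_BUCKETS = 4
def bucket_of(row):
    return row["user_id"] % NUM_BUCKETS

print(sorted(partitions))              # ['IN', 'UK']
print([bucket_of(r) for r in rows])    # [1, 2, 3]
```

The practical rule of thumb this illustrates: partition on low-cardinality columns you filter by (partition count grows with distinct values), and bucket on high-cardinality columns you join or sample on (bucket count stays fixed).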

• Introduction to SQOOP
• Uses of SQOOP
• Connecting to a MySQL database
• SQOOP commands
- Import
- Export
- Evaluation
- Codegen etc…
• Joins in SQOOP
• Export to MySQL
• Export to HBase