Big Data Hadoop

Big Data Hadoop Training

What is Big Data?

Big Data refers to large volumes of data of different types (structured, semi-structured, and unstructured) that are generated at high velocity and are difficult to process.

The major problem with big data is how to analyse it efficiently.

What is Hadoop?

Hadoop is an open-source framework released under the Apache License, and it is one solution to the big data problem. It distributes data across commodity hardware and processes it efficiently.
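
To make the idea of distributed processing concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps that Hadoop's MapReduce model follows. It runs on a single machine and does not use Hadoop itself. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample blocks are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(block):
    """Emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle_phase(mapped_pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    """Run map on each block independently, then shuffle and reduce the results."""
    mapped = []
    for block in blocks:  # on a real cluster each block would be handled on a different node
        mapped.extend(map_phase(block))
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input: each string stands in for one block of a larger file.
blocks = ["big data hadoop", "hadoop stores big data", "hadoop processes data"]
counts = word_count(blocks)
print(counts)
```

Because each block is mapped independently, the map step could run on many machines at once; only the shuffle step needs to bring values for the same key together.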

List of companies working on Hadoop

Many companies use Hadoop to manage data. Some of them are as follows:

  • eBay
  • Facebook
  • Twitter
  • LinkedIn
  • Yahoo
  • Adobe
  • Infosys
  • IIT Hyderabad
  • Cognizant
  • Accenture

Introduction to Big Data and Hadoop

  • Introduction and Rise of Big Data
  • Compare Hadoop vs traditional systems
  • Core components of Hadoop
  • Understanding Hadoop Master-Slave Architecture
  • Understanding HDFS Architecture
  • Learn about NameNode, DataNode, Secondary NameNode
  • Anatomy of Read and Write data on HDFS
  • MapReduce Architecture Flow
  • Learn about JobTracker and TaskTracker
  • Hadoop Modes

Hadoop Clusters and MapReduce

  • Hadoop Terminal Commands
  • Cluster Configuration
  • Web Ports
  • Hadoop Configuration Files
  • Reporting, Recovery
  • MapReduce in Action
  • Overview of the MapReduce Framework
  • Use cases of MapReduce
  • MapReduce Architecture
  • Anatomy of MapReduce Program
  • Mapper/Reducer Class, Driver code
  • Understand Combiner and Partitioner
  • Write your own Partitioner
  • Writing Map and Reduce in Python
  • Map side/Reduce side Join
  • Distributed Join
  • Distributed Cache
  • Counters
  • Joining Multiple datasets in MapReduce
  • MapReduce internals
  • Understanding Input Format
  • Custom Input Format
  • Using Writable and Comparable

JUnit and MRUnit Testing Framework

  • Understanding Output Format
  • Sequence Files
  • JUnit and MRUnit Testing Frameworks

Hive and HiveQL

  • What is Hive
  • Hive DDL - Create/Show Database
  • Hive DDL - Create/Show/Drop Tables
  • Hive DML - Load Files & Insert Data
  • Hive SQL - Select, Filter, Join, Group By
  • Hive Architecture & Components
  • Difference between Hive and RDBMS
  • Multi-Table Inserts
  • Joins
  • Grouping Sets, Cubes, Rollups
  • Custom Map and Reduce scripts
  • Hive SerDe
  • Hive UDF
  • Hive UDAF

Sqoop, Flume and Oozie

  • Sqoop - How Sqoop works
  • Sqoop Architecture
  • Flume - How it works
  • Flume Complex Flow - Multiplexing
  • Oozie - Simple/Complex Flow
  • Oozie Service/ Scheduler
  • Use Cases - Time and Data triggers

HBase and ZooKeeper

  • When/Why to use HBase
  • HBase Architecture/Storage
  • HBase Data Model
  • HBase Families/Column Families
  • HBase Master
  • HBase vs RDBMS
  • Access HBase Data
  • What is ZooKeeper
  • ZooKeeper Data Model
  • ZNode Types
  • Sequential ZNodes
  • Installing and Configuring
  • Running ZooKeeper
  • ZooKeeper use cases

YARN

  • Hadoop 1.0 Limitations
  • MapReduce Limitations
  • HDFS 2: Architecture
  • HDFS 2: High availability
  • HDFS 2: Federation
  • YARN Architecture
  • Classic vs YARN
  • YARN multi-tenancy
  • YARN Capacity Scheduler

Hadoop Project

  • Demo of 2 sample projects.
  • Twitter Project: Which Twitter users get the most retweets? Who is influential in our industry? Analyze Twitter data using Flume and Hive.
  • Sports Statistics: Given a dataset of runs scored by players, use Flume and Pig to process the data and find the runs scored and balls played by each player.
  • NYSE Project: Calculate the total volume of each stock using Sqoop and MapReduce.
  • Introduction to Text Mining
  • Sentiment Analysis
  • Human Behaviour Analysis
  • Introduction to R Programming
  • Twitter Case Study Using RHadoop

TRAINING FEATURES

  • 8 to 10 students per batch
  • Complete study material
  • Special focus on practical work
  • Trainer with 9+ years of industry experience
  • Projects are handled by the trainer

Available Discounts

  • Flat 10% discount on one-time payment.
  • Flat 5% discount if joining on the same day as the demo or enquiry.
  • Special discount for group joining.
  • Got someone's reference? Get a flat ₹500 discount.
  • Flat 10% discount for our old students.

Trainer Profile

Mrs Isha Malhotra

Hadoop Corporate Trainer at Tech Altum

Total 5 years of experience as a software developer. Has worked as a software developer and part-time corporate trainer at Tech Altum for the last 3+ years.
Has trained 200+ students and working professionals to date.

  • 09015041412

Upcoming Demo

Course Date Time
Big Data Hadoop Demo

Venue

Tech Altum
501, Om Complex, Sec 15
P.O. Box: 201301
Noida, Uttar Pradesh
201301
India