
Hadoop Training Material


Introduction to Big Data and Hadoop:-
 Big Data Introduction
 Hadoop Introduction
 What is Hadoop? Why Hadoop?
 Hadoop History
 Components of Hadoop
 HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper and so on…
 What is the scope of Hadoop?
Deep Dive into HDFS (for Storing the Data):-
 Introduction of HDFS
 HDFS Design
 HDFS role in Hadoop
 Features of HDFS
 Daemons of Hadoop and their functionality
o Name Node
o Secondary Name Node
o Job Tracker
o Data Node
o Task Tracker
 Anatomy of File Write
 Anatomy of File Read
 Network Topology
o Nodes
o Racks
o Data Center
 Parallel Copying using DistCp
 Basic Configuration for HDFS
 Data Organization
o Blocks
o Replication
 Rack Awareness
 Heartbeat Signal
 How to Store the Data into HDFS
 How to Read the Data from HDFS
 Accessing HDFS (Introduction of Basic UNIX commands)
 CLI commands
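
The Blocks and Replication topics above come down to simple arithmetic, which can be sketched as follows. This is an illustration only, not part of the course material: the 128 MB block size and replication factor of 3 are assumed values (a real cluster sets them through dfs.blocksize and dfs.replication), and block_layout is a hypothetical helper name.

```python
import math

# Assumed values for illustration; a real cluster configures these in
# hdfs-site.xml (dfs.blocksize and dfs.replication).
BLOCK_SIZE_MB = 128
REPLICATION = 3


def block_layout(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (number of blocks, size of last block in MB, raw storage used in MB)."""
    if file_size_mb <= 0:
        return 0, 0, 0
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    # The last block only holds the remaining bytes; it is not padded.
    last_block_mb = file_size_mb - (num_blocks - 1) * block_size_mb
    # Every block is stored `replication` times across the cluster.
    raw_storage_mb = file_size_mb * replication
    return num_blocks, last_block_mb, raw_storage_mb


blocks, last, raw = block_layout(500)
print(blocks, last, raw)  # prints: 4 116 1500
```

Under these assumptions a 500 MB file is split into three full blocks plus one 116 MB block, and the three replicas together occupy 1500 MB of raw disk.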
MapReduce using Java (Processing the Data):-
 Introduction to MapReduce
 MapReduce Architecture
 Data flow in MapReduce
o Splits
o Mapper
o Partitioning
o Sort and shuffle
o Combiner
o Reducer
 Difference between Block and InputSplit
 Role of RecordReader
 Basic Configuration of MapReduce
 MapReduce life cycle
o Driver Code
o Mapper
o Reducer
 How MapReduce Works
 Writing and Executing the Basic MapReduce Program using Java
 Submission and Initialization of a MapReduce Job
 File Input/Output Formats in MapReduce Jobs
o Text Input Format
o Key Value Input Format
o Sequence File Input Format
o NLine Input Format
 Joins
o Map-side Joins
o Reduce-side Joins
 Word Count Example
 Partition MapReduce Program
 Side Data Distribution
o Distributed Cache (with Program)
 Counters (with Program)
o Types of Counters
o Task Counters
o Job Counters
o User Defined Counters
o Propagation of Counters
 Job Scheduling
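
The data flow listed in this module (splits, mapper, combiner, partitioning, sort and shuffle, reducer) can be traced with a small in-memory word count. This is a plain Python simulation rather than the Hadoop Java API the course uses; every function name is hypothetical, and CRC32 stands in for the key hash that a real partitioner would take from the key object.

```python
import zlib
from collections import defaultdict


def mapper(line):
    """Map: emit (word, 1) for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Combine: pre-aggregate one map task's output to reduce shuffle traffic."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(key, num_reducers):
    """Partition: pick the reducer for a key (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Reduce: sum all counts that arrived for one key."""
    return key, sum(values)


def run_word_count(splits, num_reducers=2):
    # One bucket per reducer; each maps key -> list of values (the shuffle).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split is processed by one map task
        mapped = [pair for line in split for pair in mapper(line)]
        for word, count in combiner(mapped):
            buckets[partition(word, num_reducers)][word].append(count)
    result = {}
    for bucket in buckets:  # each bucket is processed by one reduce task
        for key in sorted(bucket):  # keys reach the reducer in sorted order
            word, total = reducer(key, bucket[key])
            result[word] = total
    return result


# Two splits: the first holds one record, the second holds two.
splits = [["big data big"], ["data hadoop", "big hadoop"]]
counts = run_word_count(splits)
print(counts)
```

Every occurrence of a given word lands in the same bucket because the partition depends only on the key, which is what lets each reduce task produce a final total on its own.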
PIG:-
 Introduction to Apache PIG
 Introduction to PIG Data Flow Engine
 MapReduce vs. PIG in detail
 When should PIG be used?
 Data Types in PIG
 Basic PIG programming
 Modes of Execution in PIG
o Local Mode
o MapReduce Mode
 Execution Mechanisms
o Grunt Shell
o Script
o Embedded
 Operators/Transformations in PIG
 PIG UDFs with Program
 Word Count Example in PIG
 Differences between MapReduce and PIG
SQOOP:-
 Introduction to SQOOP
 Use of SQOOP
 Connecting to a MySQL database
 SQOOP commands
o Import
o Export
o Eval
o Codegen, etc.
 Joins in SQOOP
 Export to MySQL
 Export to HBase
HIVE:-
 Introduction to HIVE
 HIVE Metastore
 HIVE Architecture
 Tables in HIVE
o Managed Tables
o External Tables
 Hive Data Types
o Primitive Types
o Complex Types
 Partition
 Joins in HIVE
 HIVE UDFs and UDAFs with Programs
 Word Count Example
HBASE:-
 Introduction to HBASE
 Basic Configurations of HBASE
 Fundamentals of HBase
 What is NoSQL?
 HBase Data Model
o Table and Row
o Column Family and Column Qualifier
o Cell and its Versioning
 Categories of NoSQL Databases
o Key-Value Database
o Document Database
o Column Family Database
 HBASE Architecture
o HMaster
o Region Servers
o Regions
o MemStore
o Store
 SQL vs. NOSQL
 How HBASE differs from RDBMS
 HDFS vs. HBase
 Client-side buffering or bulk uploads
 HBase Designing Tables
 HBase Operations
o Get
o Scan
o Put
o Delete
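
The HBase data model and the four operations listed above can be pictured with a toy in-memory table. This is a sketch for orientation only and not the HBase client API: ToyTable and its methods are hypothetical names, timestamps are passed in by hand, and keeping 3 versions per cell is an assumed setting.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for one HBase table (not the HBase client API)."""

    def __init__(self, column_families, max_versions=3):
        self.column_families = set(column_families)
        self.max_versions = max_versions  # assumed setting for this sketch
        # row key -> (family, qualifier) -> list of (timestamp, value), newest first
        self.rows = defaultdict(dict)

    def put(self, row, family, qualifier, value, timestamp):
        """Write one cell version; column families must be declared up front."""
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        versions = self.rows[row].setdefault((family, qualifier), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # keep only the newest versions

    def get(self, row, family, qualifier):
        """Return the newest value of one cell, or None if it is absent."""
        versions = self.rows.get(row, {}).get((family, qualifier))
        return versions[0][1] if versions else None

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order, start inclusive and stop exclusive."""
        for row in sorted(self.rows):
            if start_row <= row < stop_row:
                yield row

    def delete(self, row):
        """Remove a whole row if it exists."""
        self.rows.pop(row, None)


table = ToyTable(["info"])
table.put("user1", "info", "city", "Pune", timestamp=1)
table.put("user1", "info", "city", "Delhi", timestamp=2)
table.put("user2", "info", "city", "Agra", timestamp=1)
print(table.get("user1", "info", "city"))  # prints: Delhi
print(list(table.scan("user1", "user2")))  # prints: ['user1']
```

A cell is addressed by row key, column family and column qualifier, and a read returns the version with the newest timestamp unless an older one is requested.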
MongoDB:-
 What is MongoDB?
 Where to Use?
 Configuration on Windows
 Inserting data into MongoDB
 Reading data from MongoDB
Cluster Setup:-
 Downloading and installing Ubuntu 12.x
 Installing Java
 Installing Hadoop
 Creating Cluster
 Increasing and Decreasing the Cluster Size
 Monitoring the Cluster Health
 Starting and Stopping the Nodes

Zookeeper:-
 Introduction to Zookeeper
 Data Model
 Operations
OOZIE:-
 Introduction to OOZIE
 Use of OOZIE
 Where to use?
Flume:-
 Introduction to Flume
 Uses of Flume
 Flume Architecture
o Flume Master
o Flume Collectors
o Flume Agents
Project Explanation with Architecture

                                   
