Big Data & Hadoop Developer Certification Program
Course Deliverables:
- Live virtual classroom sessions on weekends
- Access to voice-based e-learning content
- E-book: Big Data and Hadoop Developer
- Access to virtual machines, use cases, and test cases throughout the training
- Hands-on learning approach using real-life case studies and examples
- Case studies, exam tips, and reference material
- Trainer support via email and telephone
- Trainers are industry experts and Cloudera/Hortonworks certified professionals
- Sample questions for preparation
- Pre- and post-course assignments
- Practice projects
- Certification exam anytime within the course duration
Become an expert in Big Data Hadoop by gaining hands-on knowledge of MapReduce, Hadoop architecture, Pig and Hive, Flume, and Oozie (the Apache workflow scheduler). Also, get familiar with HBase, ZooKeeper, and Sqoop concepts while working on industry-based use cases and projects.
This big data course also prepares you for the Cloudera CCA175 certification with simulation exams and real-life projects. The Cloudera certification is the most sought-after big data certification in the industry. After completing the EduCivic Big Data and Hadoop training, you will be exam-ready for the Cloudera certification and job-ready for your next Big Data assignment.
New job opportunities in Big Data and Hadoop are opening up enormous scope for IT professionals. According to a recent study, there will be 181,000 Big Data roles in the U.S. in 2018. By 2020, the Big Data & Hadoop market is estimated to grow at a compound annual growth rate (CAGR) of 58%, surpassing $16 billion.
MAPREDUCE and HDFS
- Introduction to BIG DATA and Its characteristics
- 4 V’s of BIG DATA
- What is Hadoop?
- Why Hadoop?
- Core Components of Hadoop
- Intro to HDFS and its Architecture
- HDFS commands
- Intro to MAPREDUCE
- Versions of HADOOP
- What is a daemon?
- Hadoop daemons
- What is the NameNode?
- What is a DataNode?
- What is the Secondary NameNode?
- What is the JobTracker?
- What is a TaskTracker?
- Read/Write operations in HDFS
- Complete overview of Hadoop 1.x and its architecture
- Rack awareness
- Introduction to Block size
- Introduction to replication factor
- Introduction to the heartbeat signal
- MAPREDUCE Architecture
- What is the Mapper phase?
- What is the shuffle and sort phase?
- What is the Reducer phase?
- What is a split?
- Difference between Block and split
- Intro to first wordcount program using MAPREDUCE
- Different classes for running MAPREDUCE program using Java
- Mapper class
- Reducer Class and its role
- Driver class
- Submitting the wordcount MAPREDUCE program
- Going through the Job’s system output
- Intro to Partitioner with example
- Intro to Combiner with example
- Intro to Counters and their types
- Different types of counters
- Joins in HADOOP
- Map-side joins
- Reduce-side joins
- Different types of input/output formats in HADOOP
- Use cases for HDFS and MapReduce programs using Java
- Schedulers in Hadoop
- FIFO, Fair, and capacity schedulers
- Speculative Execution in Hadoop
- Difference between Hadoop 1.x and Hadoop 2.x
- Intro to YARN and its core components (RM, NM, AMs, and AM)
- YARN architecture
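As a preview of the wordcount topic above, the Mapper, shuffle and sort, and Reducer phases can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not Hadoop code: it runs on one machine, uses no Hadoop API, and the function names (mapper, shuffle_and_sort, reducer, word_count) are hypothetical names chosen for this example. In the course itself these roles are played by Java Mapper, Reducer, and Driver classes.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_and_sort(pairs):
    """Shuffle and sort phase: group values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key, values):
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Driver: wire the three phases together (assumed single-process setup)."""
    mapped = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle_and_sort(mapped))


if __name__ == "__main__":
    sample = ["big data and hadoop", "hadoop and pig"]
    print(word_count(sample))
```

On a real cluster the map and reduce calls run in parallel on different nodes, and the framework performs the grouping step between them; the sketch only shows how data flows from one phase to the next.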
PIG
- Intro to PIG
- Why PIG?
- The difference between MAPREDUCE and PIG
- When to go with MAPREDUCE?
- When to go with PIG?
- PIG datatypes
- What is a field in PIG?
- What is a tuple in PIG?
- What is a bag in PIG?
- Intro to the Grunt shell
- Local Mode
- MAPREDUCE mode
- Running PIG programs
- PIG Script
- Intro to PIG UDFs
- Writing PIG UDF using Java
- Registering PIG UDF
- Running PIG UDF
- Different types of UDFs in PIG
- WordCount program using PIG script
- Use cases for PIG scripts
HIVE
- Intro to HIVE
- Why HIVE?
- History of HIVE
- Difference between PIG and HIVE
- HIVE data types
- Complex data types
- What is the Metastore, and why is it important?
- Different types of tables in HIVE
- Managed tables
- External tables
- Running HIVE queries
- Intro to HIVE partitions
- Intro to HIVE Buckets
- How to perform joins using HIVE queries
- Intro to HIVE UDFs
- Different types of UDFs in HIVE
- Running HIVE queries for wordcount example
- Use cases for HIVE
HBASE
- Intro to HBASE
- Intro to NoSQL database
- Sparse and dense concepts in RDBMS
- Intro to columnar/column oriented database
- Core architecture of HBase
- Why HBase?
- HDFS vs HBase
- Intro to Regions, RegionServer, and HMaster
- Limitations of HBase
- Integration of Hive with HBase
- HBase commands
- Use cases for HBASE
FLUME
- Intro to Flume
- Intro to sources, sinks, Flume master, and Flume agents
- Importance of Flume agents
- Live demo of copying log data into HDFS
SQOOP
- Intro to Sqoop
- Importing RDBMS data into HDFS and exporting HDFS data to an RDBMS
- Intro to incremental imports and their types
- Use cases for importing MySQL data into HDFS
ZOOKEEPER
- Intro to Zookeeper
- Zookeeper operations
OOZIE
- Intro to Oozie
- What is job.properties?
- What is workflow.xml?
- Scheduling the jobs in Oozie
- Scheduling MapReduce, HIVE, and PIG jobs using Oozie workflows
- Setting up VMware for Hadoop
- Installing all Hadoop components
- Intro to Hadoop Distributions
- Intro to Cloudera and its major components
- Discussion of different projects
From the course:
- Learn to write complex MapReduce code on both MRv1 and MRv2 (YARN) and understand Hadoop architecture.
- Perform analytics and learn the high-level scripting frameworks Pig and Hive.
- Gain a full understanding of the Hadoop system and its advanced elements, such as Flume and Oozie (the Apache workflow scheduler).
- Get familiar with other concepts: HBase, ZooKeeper, and Sqoop.
- Get hands-on expertise in numerous Hadoop cluster configurations and environments.
- Learn about optimization and troubleshooting.
- Acquire in-depth knowledge of Hadoop architecture by learning about the Hadoop Distributed File System (HDFS 1.0 and HDFS 2.0).
- Work on a real-life project built to industry standards.
Who should attend:
- Architects and developers who design, develop and maintain Hadoop-based solutions
- Data Analysts, BI Analysts, BI Developers, SAS Developers and related profiles who analyze Big Data in Hadoop environment
- Consultants who are actively involved in a Hadoop Project
- Experienced Java software engineers who need to understand and develop Java MapReduce applications for Hadoop 2.0.
On successful completion of the course and its requisites, the candidate will receive the Big Data Hadoop Developer Certification accredited by the Ministry of Skill Development and Entrepreneurship, Govt. of India.
About Course Advisor
- More than 20 years of experience in IT Architecting, Designing and Developing Solutions in Java/JEE and various other technologies
- Has been working with Big Data technologies since 2011 and with Java since 2003
- Has trained more than 5,000 professionals and students, and has worked with corporates such as Capgemini, Alcatel-Lucent, and Wipro
- Cloudera Certified Developer for Apache Hadoop, Hortonworks Certified Apache Hadoop 2.0 Developer (Java), and Sun Certified Java Professional
- Passionate about Big Data (Hadoop and its ecosystem), active in the Hadoop community, and maintains an active blog on Big Data and related technologies
- Has worked on cloud computing platforms (Amazon Web Services, Microsoft Azure), virtualization, and Linux technologies
- Has been involved in formulating proposals, a Performance Engineering CoE, technology audits, and other activities
- Expertise in EAI (Enterprise Application Integration) and Service Oriented Architecture (SOA)
- Extensive experience in consulting roles for customers in different domains, both within and outside India
Why choose EduCivic as your training partner?
- A company run by a renowned, award-winning team. Team members have received prestigious awards such as the President Award. All members are IIT and IIM alumni and have done remarkable work in the field of education.
- Our proven training and coaching methodology has set us apart from the competition. We focus on delivering up-to-date and complete knowledge to our learners. To that end, we have started a One 2 One learning method under the Trainer At Home mode of learning.
- 65+ Countries: Our proven training methodology is available all around the globe. Our courses have satisfied individuals as well as corporates across regions that include the US, APAC, the Middle East, India, and Europe. The training courses are globally recognised and are prepared with international education standards in mind.
- 100+ certified world-class instructors: Our large faculty of experienced trainers are industry practitioners who have held senior positions in global MNCs. The knowledge they impart is highly practical and world class.
- Global Accreditations: Courses provided are aligned with globally renowned names like the Project Management Institute (USA), the American Society for Quality (USA), Scrum Alliance, APMG, EC-Council, GARP, CompTIA, IIBA, AXELOS, (ISC)²®, and others.
- Service: EduCivic believes in and continuously strives for the highest level of customer satisfaction. We offer post-training assistance to our clients for six months after they complete their training, and we continuously guide them in career enhancement.
- 500+ Organizations: We have satisfactorily met the requirements of over 500 organizations, including well-known names such as TCS, Wipro, Cognizant, BOA, Times Group, Samsung, and many others.
- 150+ Courses: We provide cutting-edge solutions through a variety of training formats (Classroom, At Home, and Online). Our customized training courses are available across multiple business units, including Project Management, Quality Management, Agile Management, IT and IT Security, Big Data, technology, and a lot more.