{"data":{"coursesJson":{"slug":"hadoop","title":"Apache Hadoop","pyramid":"Big Data","heroText":"Write Once, Read Many Times!","heroImage":"/courses/hadoop/icon.png","aboutTopic":"About Apache Hadoop","aboutText1":"Apache Hadoop® is built for big data, insights, and innovation. Learn more today. A cost-effective solution with simple programming models.","aboutText2":"Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.","aboutPoints":["Highly Reliable","Distributed Processing","Cost-Effective Solution"],"aboutImage":"/courses/hadoop/about.png","techTitle":"Hadoop Topics","techSubtitle":"The following topics are covered under Hadoop.","techTitle1":"HDFS","techTitle2":"YARN","techTitle3":"MAPREDUCE","techTitle4":"APACHE PIG","techTitle5":"APACHE HIVE","techTitle6":"APACHE HBASE","techDesc1":"The Hadoop Distributed File System is the core component, or backbone, of the Hadoop ecosystem.","techDesc2":"Think of YARN as the brain of your Hadoop ecosystem. It handles all processing activities by allocating resources and scheduling tasks.","techDesc3":"MapReduce is a software framework that helps in writing applications that process large data sets using distributed and parallel algorithms inside the Hadoop environment.","techDesc4":"Pig has two parts: Pig Latin, the language, and the Pig runtime, the execution environment. You can think of them as analogous to Java and the JVM.","techDesc5":"Facebook created Hive for people who are fluent in SQL, so Hive makes them feel at home while working in the Hadoop ecosystem.","techDesc6":"HBase is an open-source, non-relational, distributed database. In other words, it is a NoSQL database.","courseSubtitle":"The following are the course contents offered for Big Data / Apache Hadoop.","courseContents":["INTRODUCTION","Hadoop Fundamentals","Introduction to Hadoop 2 and its Environment","An Introduction to the Architecture of Hadoop 2","HDFS","Creating and Configuring a Simple Hadoop 2 Cluster","Planning for and Creating a Fully Distributed Cluster","MAPREDUCE","HADOOP SETUP","Practical Tips and Techniques","SQOOP","APACHE PIG","FLUME","APACHE HIVE","Project: Hadoop Implementation"],"subContents":[["Introduction to Big Data and Hadoop","Getting Started with Hadoop","Introduction to the Big Data Stack and Spark"],["The Motivation for Hadoop","Hadoop Overview","Data Storage: HDFS","Distributed Data Processing: MapReduce","Data Processing and Analysis: Pig","Data Integration: Sqoop & Flume","Other Hadoop Data Tools & Ecosystem","Hive as a Data Warehouse","HBase as NoSQL","Oozie for Workflow Management & Scheduling"],["Cluster Computing and Hadoop Clusters","Hadoop Components and the Hadoop Ecosphere","What Do Hadoop Administrators Do?","Key Differences between Hadoop 1 and Hadoop 2","Distributed Data Processing: MapReduce and Spark","Data Integration: Apache Sqoop","Key Areas of Hadoop Administration"],["Distributed Computing and Hadoop","Hadoop 2 Architecture","Data Storage – the Hadoop Distributed File System"],["HDFS – Hadoop Distributed File System","HDFS Architecture","Hadoop 1.x Components","Namenode","Fault Tolerance & High Availability","Failure Handling – FSImage","HDFS Commands"],["Hadoop Distributions and Installation Types","Understanding the Configuration Files","Configuration Property Names and Values","Setting Up a Portable Hadoop File System","Setting Up a Pseudo-Distributed Hadoop 2 Cluster","Performing the Initial Hadoop Configuration","Operating the New Hadoop Cluster","Hands-On Exercise"],["Planning Your Hadoop Cluster","Going from a Single Rack to Multiple Racks","Creating a Multi-Node Cluster","Modifying the Hadoop Configuration","Starting Up the Cluster","Configuring Hadoop Services","Hands-On Exercise"],["MapReduce Anatomy","MapReduce Examples","Running MapReduce Programs in Hadoop","Hadoop 2.x Components","Block Size and Performance","YARN","Hadoop 2.x vs Hadoop 1.x","Hands-On Exercise"],["Single-Node Setup","Hands-On Exercise","Multi-Node Setup","Scaling a Hadoop Cluster Up/Down","Replication Distribution and Automatic Discovery","Hands-On Exercise"],["Using Combiners","Reducing Intermediate Data with Combiners","Using the Distributed Cache","Logging","Splittable File Formats","Determining the Optimal Number of Reducers","Map-Only MapReduce Jobs","Hands-On Exercise"],["SQOOP Introduction & Architecture","Importing RDB Data to HDFS","Importing RDB Data to Hive"],["Apache Pig Introduction","Apache Pig Setup","Apache Pig Commands","FILTER","Structured (including XML/JSON) Data Processing Using Apache Pig","Parameter Substitution","Macros in Pig","Unstructured Data Processing Using Apache Pig","Best Practices for Pig","Pig UDF","Pig Advanced"],["Flume Introduction","Flume with Local","Flume with HDFS","Flume with Hive","Flume with HBase"],["Apache Hive – Introduction","Apache Hive – Setup","Managed Tables & External Tables","Apache Hive – Commands"],["Unstructured Data Handling with Big Data Tools","Hands-On Use Case PoC","Best Practices for Monitoring a Hadoop Cluster","Using Logs and Stack Traces for Monitoring and Troubleshooting","Using Open-Source Tools to Monitor a Hadoop Cluster"]]}},"pageContext":{"isCreatedByStatefulCreatePages":false,"id":"e994b0b3-e704-5cd4-8d0d-969745c83b9d"}}