Hadoop Spark Hive Big Data Admin Class Bootcamp Course NYC
Learn the installation and architecture of Hadoop, Hive, Spark, and other tools. Handle structured and unstructured data.
Course Description
Introduction to the Hadoop Big Data Course
- Introduction to the Course
Top Ubuntu commands
Understand NameNode, DataNode, YARN and Hadoop Infrastructure
Hadoop Install
- Hadoop Installation & HDFS Commands
- Java-based MapReduce
# Hadoop 2.7 / 2.8.4
Learn HDFS commands
Setting up Java for MapReduce
Intro to Cloudera Hadoop and studying for Cloudera certification
SQL and NoSQL
- SQL, Hive and Pig Installation (RDBMS world and NoSQL world)
- More Hive and Sqoop (Sqoop and Hive on Cloudera)
- JDBC drivers.
- Pig
- Intro to NoSQL, MongoDB, HBase Installation
Understanding different databases
Hive:
- Hive Partitions and Bucketing
- Hive External and Internal Tables
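As a rough illustration of the partitioning and bucketing topic above, the following Python sketch shows how rows might be grouped into partition directories and bucket numbers. It is not Hive code: the column names `country` and `user_id`, the helper names, and the modulo rule for integer keys are assumptions made for the example, and Hive's real hashing and file naming depend on the column type and Hive version.

```python
from collections import defaultdict


def bucket_for(user_id: int, num_buckets: int) -> int:
    """Pick a bucket for an integer key (simplified: value modulo bucket count)."""
    return user_id % num_buckets


def layout(rows, num_buckets):
    """Group user ids by (partition directory, bucket number)."""
    files = defaultdict(list)
    for row in rows:
        # A partition column becomes a directory such as country=US.
        partition_dir = f"country={row['country']}"
        bucket = bucket_for(row["user_id"], num_buckets)
        files[(partition_dir, bucket)].append(row["user_id"])
    return dict(files)


# Hypothetical sample rows, used only to show the grouping.
rows = [
    {"user_id": 1, "country": "US"},
    {"user_id": 2, "country": "US"},
    {"user_id": 3, "country": "US"},
    {"user_id": 4, "country": "IN"},
]

for key, user_ids in sorted(layout(rows, 2).items()):
    print(key, user_ids)
```

Partitioning narrows a query to whole directories, while bucketing spreads the rows inside each partition across a fixed number of files.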
Spark Scala Python
- Spark Installations and Commands
- Spark Scala Sheets
- Hadoop Streaming Python MapReduce
- PySpark (Python basics) and RDDs
Running spark-shell and importing data from CSV files
PySpark – Running RDD
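The map and reduce steps behind the Hadoop Streaming and RDD topics above can be sketched in plain Python as a word count. This is a minimal in-process sketch, not a Hadoop Streaming job: in a real job the mapper and reducer would be separate scripts reading standard input, and the function names `mapper`, `reducer`, and `word_count` are chosen here for illustration only.

```python
from itertools import groupby


def mapper(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(sorted_pairs):
    """Sum the counts per word; input must already be sorted by word."""
    for word, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    # Sorting stands in for the shuffle phase, which delivers each key's
    # pairs to the reducer together.
    return dict(reducer(sorted(mapper(lines))))


# Hypothetical input lines, used only to show the flow.
counts = word_count(["Spark and Hadoop", "spark shell"])
print(counts)
```

The same shape appears in PySpark as `flatMap` to split lines into words, `map` to build the pairs, and `reduceByKey` to sum them.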
Mid Term Projects
- Pull data from an online CSV file and move it to Hive using Hive import
- Pull data from spark-shell and run MapReduce on the Fox News front page
- Create data in MySQL and move it to HDFS using Sqoop
- Using Jupyter (Anaconda) and SparkContext, run a count on a file that contains the Fox News front page
- Save raw data using comma, space, tab, and pipe delimiters, and load it into SparkContext and spark-shell
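For the delimiter project above, one possible starting point is to write and re-read the same rows with each delimiter using Python's standard `csv` module, before loading the files into Spark. This is only a sketch: the sample rows and the helper names `write_delimited` and `read_delimited` are assumptions for illustration, and the text is kept in memory rather than written to HDFS.

```python
import csv
import io

# The four delimiters named in the project description.
DELIMITERS = {"comma": ",", "space": " ", "tab": "\t", "pipe": "|"}


def write_delimited(rows, delimiter):
    """Serialize rows to text, quoting fields that contain the delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_delimited(text, delimiter):
    """Parse delimited text back into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample data, used only to show the round trip.
rows = [["id", "headline"], ["1", "Top story"]]

for name, delimiter in DELIMITERS.items():
    text = write_delimited(rows, delimiter)
    print(name, repr(text), read_delimited(text, delimiter) == rows)
```

Note that with the space delimiter the field `Top story` is quoted on output, which is why the round trip still returns the original rows.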
Broadcasting Data – stream of data
Kafka Message Broadcasting