Hello, and welcome to Hadoop Data Management with Hive, Pig, and SAS. I’m Johnny Starling, and I’ll be your instructor for this course.
In my 20-year career at SAS, I’ve held a variety of positions in Research and Development, Technical Support, and Sales and Marketing, all focused on data management. In my current position as a Principal Training Consultant, I work with customers who use SAS Data Management solutions to drive value from their investment, especially with emerging technologies like Hadoop, Event Stream Processing, and Data Virtualization.
This course provides the tools and knowledge to help you face the challenges of working effectively with large volumes of structured and unstructured data in Hadoop. During this course, you will learn about the Hadoop infrastructure, how to organize structured data in tabular format using Apache Hive, and how to analyze that data using the Hive Query Language (HiveQL). You will also learn the Apache Pig scripting language to perform batch processing tasks such as Extract-Transform-Load, data preparation, and analytics. And of course, you’ll learn to use these tools to work effectively with SAS software.
To get the most out of this course, you should have some basic experience with and understanding of SQL. Some prior programming experience may also be needed for advanced topics like user-defined functions. So let’s get started.
This course does not require any data setup before you begin, but you might want to start the Virtual Lab so you are ready to do exercises. As you work through Chapter 4, follow the instructions to create practice data for that chapter.
When you come to a demo, click the Open Demo Steps button to view a PDF of the demonstration steps that you can follow.
Course Curriculum
1.1a What Is Big Data
1.1b Where Does Big Data Come From
1.1c Big Data Management
1.1d Big Data Processing
1.2a History of Hadoop
1.2c Hadoop Cluster Nodes
1.2b Hadoop Ecosystem
1.2d Hadoop Architectures
1.2e Distributing Data on Hadoop
1.2f Demo Data Preparation for Hadoop File System (HDFS)
1.2h What Is Hadoop File System (HDFS)
1.2i Demo Loading Data into HDFS
1.2k What Is Hadoop User Experience (HUE)
1.2l Demo Exploring the Hadoop User Experience (HUE) Web Interface
1.2n What Is MapReduce
1.2o Managing Hadoop Resources
1.2p What Is Sqoop
1.2q Importing and Exporting Data Using Sqoop
1.2r Demo Importing Data into HDFS Using Sqoop
2.1b Hive Architecture and Modules
2.1c Accessing Hive Using Beeline and HUE
2.1d Demo Accessing Hive Using Beeline and HUE
2.2a Hive Data Types
2.2b Hive Databases
2.2c Hive Tables
2.2d Hive Partitioned Tables
2.2e Demo Working with Partitioned Hive Tables
2.3a HiveQL Operators
2.3c HiveQL Statements to Load Data
2.3d HiveQL SELECT Statements
2.3e HiveQL Inner and Outer Joins
2.3f HiveQL ORDER BY and GROUP BY Clauses
2.3g Demo Querying Data Using HiveQL
3.0 Pig and Pig Latin Introduction
3.1b Pig Capabilities and Components
3.1c Basic Pig Scripts
3.1d Pig and HiveQL Program Comparison
3.1e Pig Execution in Hue and Grunt
3.1f Demo Executing Pig in Grunt
3.2a Pig Data Types
3.2b Pig Fields and Expressions
3.2c Pig Operators
3.2d Pig Keywords for Reading, Writing, and Filtering
3.2e Case Sensitivity and Diagnostics
3.2f Demo Reading Data in HDFS Using Pig
3.2h Limiting Pig Output
3.2i Splitting or Combining Aliases
3.2j Ordering and Grouping Data
3.2k Analyzing Results
3.2m Keywords for Joining Aliases
3.2n Keywords for Controlling Data Sizes and Processing
3.2p Special Join Operators
3.2q Parameter Substitution
3.3a Pig Built-in Functions
3.3b PiggyBank User-Defined Functions
3.3c Apache DataFu User-Defined Functions
3.4a Recommendations and Best Practices
3.4b Demo Pig Script Process Analysis
4.1a SAS Interfaces for Hadoop
4.1b SAS Data Integration Studio and SAS Data Loader for Hadoop
4.1c SAS Interfaces for Hadoop Architecture
4.2a Base SAS Programming for Hadoop
4.2b Base SAS Programming for Hadoop: Examples
4.2c Demo SAS Programming to Read and Write Data in HDFS and Execute Pig Scripts, Part 1
4.2d Demo SAS Programming to Read and Write Data in HDFS and Execute Pig Scripts, Part 2
4.2h SAS/ACCESS Interface to Hadoop
4.2i SQL Pass-Through Method
4.2j Demo Using SQL Pass-Through
4.2l Optimizing and Tracing LIBNAME Implicit Pass-Through
4.2m Demo Using the SAS/ACCESS LIBNAME Method
4.3a SAS Data Integration Studio Objects and Interface
4.3b Hadoop Transformations in SAS Data Integration Studio
4.3c Demo Using Hadoop Transformations in SAS Data Integration Studio Jobs, Part 1
4.3d Demo Using Hadoop Transformations in SAS Data Integration Studio Jobs, Part 2
4.3e Demo Using Hadoop Transformations in SAS Data Integration Studio Jobs, Part 3
4.3f Demo Using Hadoop Transformations in SAS Data Integration Studio Jobs, Part 4
4.3g Demo Using Hadoop Transformations in SAS Data Integration Studio Jobs, Part 5
4.3h Demo Using Hadoop Transformations in SAS Data Integration Studio Jobs, Part 6
4.4a What Is SAS DS2
4.4b Basic Syntax of SAS DS2 Programs
4.4c SAS DS2 Group Summarization
4.4d Threading a SAS DS2 Program
4.4e Demo Executing DS2 Threads in the Hadoop Cluster to Summarize Data
4.5a SAS In-Memory Analytics Interfaces for Hadoop
4.5b SAS In-Memory Analytics Components in a Hadoop Environment
4.5c SAS In-Memory Analytics Products and Engines
4.5d SAS High-Performance Analytics Grid
4.5e SAS LASR Analytic Grid
4.5f Demo Using SAS High-Performance Procedures and the SASHDAT Library Engine, Part 1
4.5g Demo Using SAS High-Performance Procedures and the SASHDAT Library Engine, Part 2
4.5h Demo Using SAS High-Performance Procedures and the SASHDAT Library Engine, Part 3
Course Module 2 Big Data Programming and Loading, Section Hadoop
The Apache Hadoop Project Introduction
Chapter 1: The Apache Hadoop Project - Objectives
1.2g Exercise Loading Data onto the Hadoop Name Node
1.2g Exercise Solution Loading Data onto the Hadoop Name Node
1.2j Exercise Loading Data onto HDFS
1.2j Exercise Solution Loading Data onto HDFS
1.2m Exercise Exploring the HUE Web Interface
1.2m Exercise Solution Exploring the HUE Web Interface
1.2s Exercise Importing Data into HDFS Using Sqoop
1.2s Exercise Solution Importing Data into HDFS Using Sqoop
2.0b Objectives
2.0a Hive and HiveQL Introduction
2.1a What Is Hive and HiveQL?
2.1e Exercise Executing HiveQL in Beeline and HUE
2.1e Exercise Solution Executing HiveQL in Beeline and HUE
2.3b HiveQL Functions
2.3h Exercise Executing Hive Queries to Access Data
2.3h Exercise Solution Executing Hive Queries to Access Data
3.0b Objectives
3.1a What Is Pig and Pig Latin
3.1g Exercise Executing a Pig Script in Grunt
3.1g Exercise Solution Executing a Pig Script in Grunt
3.2g Exercise Basic Pig Operators for Reading, Filtering, and Writing Data
3.2g Exercise Solution Basic Pig Operators for Reading, Filtering, and Writing Data
3.2l Exercise Aggregating Output with Pig Operators
3.2l Exercise Solution Aggregating Output with Pig Operators
3.2o Exercise Using Pig Operators to Combine and Control Data Processing
3.2o Exercise Solution Using Pig Operators to Combine and Control Data Processing
4.0 SAS and Hadoop Introduction
4.0b SAS and Hadoop Objectives
4.2e Creating Data for Chapter 4
4.2f Exercise Copying a File from the SAS Server to HDFS
4.2f Exercise Solution Copying a File from the SAS Server to HDFS
4.2g Exercise Copying a File from HDFS to the SAS Server
4.2g Exercise Solution Copying a File from HDFS to the SAS Server
4.2k Exercise Executing SQL Pass-Through to Query Tables in Hive
4.2k Exercise Solution Executing SQL Pass-Through to Query Tables in Hive
4.2n Exercise Creating Hive Tables Using the SAS/ACCESS LIBNAME Engine
4.2n Exercise Solution Creating Hive Tables Using the SAS/ACCESS LIBNAME Engine
4.3i Exercise Use a Pig Transformation to Count Unique Words in Moby Dick
4.3i Exercise Solution Use a Pig Transformation to Count Unique Words in Moby Dick
Most Commonly Used Commands for Hive, Pig, and Linux
Hadoop Data Management with Hive, Pig, and SAS
2.2f Exercise Creating and Loading Partitioned HDFS and Hive Structures
2.2f Exercise Solution Creating and Loading Partitioned HDFS and Hive Structures
4.3j Exercise Using Hive Transformations to Execute HiveQL Statements
4.3j Exercise Solution Using Hive Transformations to Execute HiveQL Statements
4.5i Demo Starting a LASR Analytic Server Session and Using the SASIOLA Engine and the IMSTAT Procedure