Performing Data Engineering on Microsoft HDInsight

Online Training by NetCom Learning

Details

The main purpose of the Performing Data Engineering on Microsoft HDInsight course is to give students the ability to plan and implement big data workflows on HDInsight.

This course is based on the objectives of course version 20775A.

Note: An eLearning module (Microsoft On Demand, MOD) is also available for this course.

Outline

  1. Getting Started with HDInsight
    1. What is Big Data?
    2. Introduction to Hadoop
    3. Working with MapReduce Function
    4. Introducing HDInsight
    5. Lab : Working with HDInsight
    6. Provision an HDInsight cluster and run MapReduce jobs
  2. Deploying HDInsight Clusters
    1. Identifying HDInsight cluster types
    2. Managing HDInsight clusters by using the Azure portal
    3. Managing HDInsight Clusters by using Azure PowerShell
    4. Lab : Managing HDInsight clusters with the Azure Portal
    5. Create an HDInsight cluster that uses Data Lake Store storage
    6. Customize HDInsight by using script actions
    7. Delete an HDInsight cluster
  3. Authorizing Users to Access Resources
    1. Non-domain Joined clusters
    2. Configuring domain-joined HDInsight clusters
    3. Manage domain-joined HDInsight clusters
    4. Lab : Authorizing Users to Access Resources
    5. Prepare the Lab Environment
    6. Manage a non-domain joined cluster
  4. Loading data into HDInsight
    1. Storing data for HDInsight processing
    2. Using data loading tools
    3. Maximising value from stored data
    4. Lab : Loading Data into your Azure account
    5. Load data for use with HDInsight
  5. Troubleshooting HDInsight
    1. Analyze HDInsight logs
    2. YARN logs
    3. Heap dumps
    4. Operations Management Suite
    5. Lab : Troubleshooting HDInsight
    6. Analyze HDInsight logs
    7. Analyze YARN logs
    8. Monitor resources with Operations Management Suite
  6. Implementing Batch Solutions
    1. Apache Hive storage
    2. HDInsight data queries using Hive and Pig
    3. Operationalize HDInsight
    4. Lab : Implement Batch Solutions
    5. Deploy HDInsight cluster and data storage
    6. Use data transfers with HDInsight clusters
    7. Query HDInsight cluster data
  7. Design Batch ETL solutions for big data with Spark
    1. What is Spark?
    2. ETL with Spark
    3. Spark performance
    4. Lab : Design Batch ETL solutions for big data with Spark
    5. Create an HDInsight Cluster with access to Data Lake Store
    6. Use HDInsight Spark cluster to analyze data in Data Lake Store
    7. Analyzing website logs using a custom library with Apache Spark cluster on HDInsight
    8. Managing resources for Apache Spark cluster on Azure HDInsight
  8. Analyze Data with Spark SQL
    1. Implementing iterative and interactive queries
    2. Perform exploratory data analysis
    3. Lab : Performing exploratory data analysis by using iterative and interactive queries
    4. Build a machine learning application
    5. Use Zeppelin for interactive data analysis
    6. View and manage Spark sessions by using Livy
  9. Analyze Data with Hive and Phoenix
    1. Implement interactive queries for big data with interactive Hive
    2. Perform exploratory data analysis by using Hive
    3. Perform interactive processing by using Apache Phoenix
    4. Lab : Analyze data with Hive and Phoenix
    5. Implement interactive queries for big data with interactive Hive
    6. Perform exploratory data analysis by using Hive
    7. Perform interactive processing by using Apache Phoenix
  10. Stream Analytics
    1. Stream Analytics
    2. Process streaming data from Stream Analytics
    3. Managing Stream Analytics jobs
    4. Lab : Implement Stream Analytics
    5. Process streaming data with Stream Analytics
    6. Managing Stream Analytics jobs
  11. Implementing Streaming Solutions with Kafka and HBase
    1. Building and Deploying a Kafka Cluster
    2. Publishing, Consuming, and Processing data using the Kafka Cluster
    3. Using HBase to store and Query Data
    4. Lab : Implementing Streaming Solutions with Kafka and HBase
    5. Create a virtual network and gateway
    6. Create a Storm cluster for Kafka
    7. Create a Kafka producer
    8. Create a streaming processor client topology
    9. Create a Power BI dashboard and streaming dataset
    10. Create an HBase cluster
    11. Create a streaming processor to write to HBase
  12. Develop big data real-time processing solutions with Apache Storm
    1. Persist long-term data
    2. Stream data with Storm
    3. Create Storm topologies
    4. Configure Apache Storm
    5. Lab : Developing big data real-time processing solutions with Apache Storm
    6. Stream data with Storm
    7. Create Storm Topologies
  13. Create Spark Streaming Applications
    1. Working with Spark Streaming
    2. Creating Spark Structured Streaming Applications
    3. Persistence and Visualization
    4. Lab : Building a Spark Streaming Application
    5. Installing Required Software
    6. Building the Azure Infrastructure
    7. Building a Spark Streaming Pipeline

NetCom Learning helps build innovative learning organizations in the workplace by structuring a smarter workforce, supporting learning changes, and driving learning growth. Since 1998 we have been empowering organizations to reach optimal performance results and address challenges by managing all aspects of organizational learning.

With a team of dedicated and knowledgeable learning professionals having deep subject-matter expertise, NetCom Learning has serviced over 80% of Fortune 100 companies. We've helped over 10,000 organizations achieve their business goals by offering a full complement of Managed Learning Services, including IT and business training, curriculum design and content development, learning delivery and administration, consulting and advisory services, and management of learning technology.

NetCom Learning is honored to be recognized by Microsoft as its Worldwide Training Partner of the Year and by TrainingIndustry.com as a Top 20 IT Training Company for three years. We offer over 1,500 courses to organizations around the world, encompassing all your technology and business training needs. From authorized or foundational training to specialized certifications, we empower organizations to achieve a competitive advantage. In addition, we provide customers with a choice of customized private programs, convenient scheduling options, flexible delivery formats, and continuous learning support, all tailored towards minimizing the skills gap and gaining maximum business impact for your organization. ...
