
M20773 Analyzing Big Data with Microsoft R

This 3-day instructor-led course teaches students to use Microsoft R Server to create and run analyses on large datasets, and shows how to use it in big data environments such as a Hadoop or Spark cluster or a SQL Server database.

Accredited course for Continuing Education of Pedagogical Staff

Course length: 3 days

Dates

Date                     Place   Language  Price (without VAT)  Availability
03/13/2019 - 03/15/2019  Prague  cs        18 200 CZK           Free date
05/20/2019 - 05/22/2019  Prague  cs        18 200 CZK           Free date
  • Students will be able to

    • Explain how Microsoft R Server and Microsoft R Client work.
    • Use R Client with R Server to explore big data held in different data stores.
    • Visualize data by using graphs and plots.
    • Transform and clean big data sets.
    • Implement options for splitting analysis jobs into parallel tasks.
    • Build and evaluate regression models generated from big data.
    • Create, score, and deploy partitioning models generated from big data.
    • Use R in the SQL Server and Hadoop environments.
  • Course requirements

      Knowledge of common statistical methods and data analysis best practices. Basic knowledge of the Microsoft Windows operating system. Working knowledge of relational databases.
  • This course is intended for

    This course is intended for people who wish to analyze large datasets within a big data environment and for developers who need to integrate R analyses into their solutions.

  • Literature

    All participants will get original Microsoft student materials.

  • Hardware

    Classrooms are equipped with high-performance computers with Internet access and the possibility of wireless connection.

  • Syllabus

    Module 1: Microsoft R Server and R Client

    • Lesson: What is Microsoft R Server
    • Lesson: Using Microsoft R Client
    • Lesson: The ScaleR functions
    • Lab: Exploring Microsoft R Server and Microsoft R Client

    Module 2: Exploring Big Data

    • Lesson: Understanding ScaleR data sources
    • Lesson: Reading data into an XDF object
    • Lesson: Summarizing data in an XDF object
    • Lab: Exploring Big Data

    Module 3: Visualizing Big Data

    • Lesson: Visualizing in-memory data
    • Lesson: Visualizing big data
    • Lab: Visualizing data

    Module 4: Processing Big Data

    • Lesson: Transforming Big Data
    • Lesson: Managing datasets
    • Lab: Processing big data

    Module 5: Parallelizing Analysis Operations

    • Lesson: Using the RxLocalParallel compute context with rxExec
    • Lesson: Using the RevoPemaR package
    • Lab: Using rxExec and RevoPemaR to parallelize operations

    Module 6: Creating and Evaluating Regression Models

    • Lesson: Clustering Big Data
    • Lesson: Generating regression models and making predictions
    • Lab: Creating a linear regression model

    Module 7: Creating and Evaluating Partitioning Models

    • Lesson: Creating partitioning models based on decision trees
    • Lesson: Test partitioning models by making and comparing predictions
    • Lab: Creating and evaluating partitioning models

    Module 8: Processing Big Data in SQL Server and Hadoop

    • Lesson: Using R in SQL Server
    • Lesson: Using Hadoop Map/Reduce
    • Lesson: Using Hadoop Spark
    • Lab: Processing big data in SQL Server and Hadoop

     

  • Dependencies

    Business Intelligence

OKsystem a.s.