Data Mining & Machine Learning with R Training Course

Overview

R is a free, open-source programming language for statistical computing, data analysis, and graphics. It is used by a growing number of managers and data analysts in both corporations and academia, and it offers a wide variety of packages for data mining.

Requirements

This course is part of the Data Scientist skill set (Domain: Analytical Techniques and Methods).

Course Outline

Introduction to Data Mining and Machine Learning

  • Statistical learning vs. Machine learning
  • Iteration and evaluation
  • Bias-Variance trade-off

Regression

  • Linear regression
  • Generalizations and Nonlinearity
  • Exercises

Classification

  • Bayesian refresher
  • Naive Bayes
  • Discriminant analysis
  • Logistic regression
  • K-Nearest neighbors
  • Support Vector Machines
  • Neural networks
  • Decision trees
  • Exercises

Cross-validation and Resampling

  • Cross-validation approaches
  • Bootstrap
  • Exercises

Unsupervised Learning

  • K-means clustering
  • Examples
  • Challenges of unsupervised learning and beyond K-means

Advanced topics

  • Ensemble models
  • Mixed models
  • Boosting
  • Examples

Dimensionality reduction

  • Factor Analysis
  • Principal Component Analysis
  • Examples
