Data Mining and Analysis Training Course

Overview

Objective:

Delegates will be able to analyse big data sets, extract patterns, and choose the variables that most affect the results, so that they can build a model that produces reliable forecasts.

Course Outline

Data preprocessing
1. Data Cleaning
2. Data integration and transformation
3. Data reduction
4. Discretization and concept hierarchy generation
Statistical inference
1. Probability distributions, Random variables, Central limit theorem
2. Sampling
3. Confidence intervals
4. Statistical Inference
5. Hypothesis testing
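
As an illustration of the confidence-interval topic above, here is a minimal sketch in Python. It assumes a normal approximation for the sample mean (justified by the central limit theorem for reasonably large samples); the function name and the sample values are made up for the example and are not part of the course material.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the mean of `sample`.

    Assumption: the sample mean is treated as normally distributed.
    For small samples a Student t quantile would be the usual choice.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    centre = mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    std_err = stdev(sample) / sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * std_err, centre + z * std_err


# Hypothetical measurements, invented for this example.
sample = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2, 4.8, 5.0]
low, high = mean_confidence_interval(sample)
print(f"95% interval for the mean: ({low:.3f}, {high:.3f})")
```

Raising `confidence` widens the interval, which is the trade-off between certainty and precision that hypothesis testing builds on.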
Multivariate linear regression
1. Specification
2. Subset selection
3. Estimation
4. Validation
5. Prediction
Classification methods
1. Logistic regression
2. Linear discriminant analysis
3. K-nearest neighbours
4. Naive Bayes
5. Comparison of Classification methods
Neural Networks
1. Fitting neural networks
2. Issues in training neural networks
Decision trees
1. Regression trees
2. Classification trees
3. Trees Versus Linear Models
Bagging, Random Forests, Boosting
1. Bagging
2. Random Forests
3. Boosting
Support Vector Machines and Flexible Discriminants
1. Maximal Margin classifier
2. Support vector classifiers
3. Support vector machines
4. SVMs for two and more classes
5. Relationship to logistic regression
Principal Components Analysis
Clustering
1. K-means clustering
2. K-medoids clustering
3. Hierarchical clustering
4. Density based clustering
Model Assessment and Selection
1. Bias, Variance and Model complexity
2. In-sample prediction error
3. The Bayesian approach
4. Cross-validation
5. Bootstrap methods
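
As an illustration of the cross-validation topic above, here is a minimal sketch of k-fold cross-validation in plain Python. The helper names, the simple least-squares line used as the model, and the toy data are all assumptions made for the example. The folds are contiguous and unshuffled to keep the sketch short; in practice the data would normally be shuffled first.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_val_mse(xs, ys, k=5):
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Toy data, invented for this example: an exact straight line.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
print(cross_val_mse(xs, ys, k=5))
```

Because each observation is predicted by a model that never saw it, the averaged error estimates out-of-sample performance rather than in-sample fit.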
