Overview
Participants who complete this training will gain a practical, real-world understanding of Data Science and its related technologies, methodologies and tools.
Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class.
The course starts with an introduction to the fundamental concepts of Data Science, then progresses to the tools and methodologies used in the field.
Audience
- Developers
- Technical analysts
- IT consultants
Format of the Course
- Part lecture, part discussion, exercises and heavy hands-on practice
Note
- To request customized training for this course, please contact us.
Requirements
- A general understanding of database concepts
- A basic understanding of statistics
Course Outline
Introduction
- The Data Science Process
- Roles and responsibilities of a Data Scientist
Preparing the Development Environment
- Libraries, frameworks, languages and tools
- Local development
- Collaborative web-based development
Data Collection
- Different Types of Data
- Structured
- Local databases
- Database connectors
- Common formats: XLSX, XML, JSON, CSV, …
- Unstructured
- Clicks, sensors, smartphones
- APIs
- Internet of Things (IoT)
- Documents, pictures, videos, sounds
- Case study: Collecting large amounts of unstructured data continuously
Data Storage
- Relational databases
- Non-relational databases
- Hadoop: Distributed File System (HDFS)
- Spark: Resilient Distributed Dataset (RDD)
- Cloud storage
Data Preparation
- Ingestion, selection, cleansing, and transformation
- Ensuring data quality – correctness, meaningfulness, and security
- Exception reports
Languages used for Preparation, Processing and Analysis
- R language
- Introduction to R
- Data manipulation, calculation and graphical display
- Python
- Introduction to Python
- Manipulating, processing, cleaning, and crunching data
Data Analytics
- Exploratory analysis
- Basic statistics
- Draft visualizations
- Understanding the data
- Causality
- Features and transformations
- Machine Learning
- Supervised vs. unsupervised
- When to use what model
- Natural Language Processing (NLP)
Data Visualization
- Best Practices
- Selecting the right chart for the right data
- Color palettes
- Taking it to the next level
- Dashboards
- Interactive Visualizations
- Storytelling with data
Summary and Conclusion