Machine Learning for BI, PART 1: Data Profiling

English | MP4 | AVC 1280×720 | AAC 44 kHz 2ch | 2h 14m | 807 MB

Demystify the world of machine learning & build core data science skills, without writing a single line of code

If you’re excited to explore data science & machine learning but anxious about learning complex programming languages or intimidated by terms like “naive Bayes”, “logistic regression”, “KNN” and “decision trees”, you’re in the right place.

This course is PART 1 of a 4-PART SERIES designed to help you build a strong, foundational understanding of machine learning:

  • PART 1: QA & Data Profiling
  • PART 2: Classification
  • PART 3: Regression & Forecasting
  • PART 4: Unsupervised Learning

This course makes data science approachable to everyday people, and is designed to demystify powerful machine learning tools & techniques without trying to teach you a coding language at the same time.

Instead, we’ll use familiar, user-friendly tools like Microsoft Excel to break down complex topics and help you understand exactly HOW and WHY machine learning works before you dive into programming languages like Python or R. Unlike most data science and machine learning courses, this one won’t ask you to write a SINGLE LINE of code.

In this Part 1 course, we’ll introduce the machine learning landscape and workflow, and review critical QA tips for cleaning and preparing raw data for analysis, including variable types, empty values, range & count calculations, table structures, and more.

We’ll cover univariate analysis with frequency tables, histograms, kernel densities, and profiling metrics, then dive into multivariate profiling tools like heat maps, violin & box plots, scatter plots, and correlation:

Section 1: Machine Learning Intro & Landscape

Machine learning process, definition, and landscape

Section 2: Preliminary Data QA

Variable types, empty values, range & count calculations, left/right censoring, etc.

Section 3: Univariate Profiling

Histograms, frequency tables, mean, median, mode, variance, skewness, etc.

Section 4: Multivariate Profiling

Violin & box plots, kernel densities, heat maps, correlation, etc.

Throughout the course, we’ll introduce real-world scenarios designed to solidify key concepts and tie them back to actual business intelligence case studies. You’ll use profiling metrics to clean up product inventory data for a local grocery, explore Olympic athlete demographics with histograms and kernel densities, visualize traffic accident frequency with heat maps, and much more.

If you’re ready to build the foundation for a successful career in data science, this is the course for you.

What you’ll learn

  • Build foundational machine learning & data science skills, without writing complex code
  • Use intuitive, user-friendly tools like Microsoft Excel to introduce & demystify machine learning tools & techniques
  • Prepare raw data for analysis using QA tools like variable types, range calculations & table structures
  • Analyze datasets using common univariate & multivariate profiling metrics
  • Describe & visualize distributions with histograms, kernel densities, heat maps and violin plots
  • Explore multivariate relationships with scatterplots and correlation

Table of Contents

Getting Started
1 Course Structure & Outline
2 READ ME Important Notes for New Students
3 About this Series
4 DOWNLOAD Course Resources
5 Setting Expectations

ML Intro & Landscape
6 Intro to Machine Learning
7 When is ML the right fit?
8 The Machine Learning Process
9 The Machine Learning Landscape

Preliminary Data QA
10 Introduction
11 Why QA?
12 Variable Types
13 Empty Values
14 Range Calculations
15 Count Calculations
16 Left & Right Censored Data
17 Table Structure
18 CASE STUDY Preliminary QA
19 BEST PRACTICES Preliminary QA

Univariate Profiling
20 Introduction
21 Categorical Variables
22 Discretization
23 Nominal vs. Ordinal
24 Categorical Distributions
25 Numerical Variables
26 Histograms & Kernel Densities
27 CASE STUDY Histograms
28 Normal Distribution
29 CASE STUDY Normal Distribution
30 Univariate Data Profiling
31 Mode
32 Mean
33 Median
34 Percentile
35 Variance
36 Standard Deviation
37 Skewness
38 BEST PRACTICES Univariate Profiling

Multivariate Profiling
39 Introduction
40 Categorical-Categorical
41 CASE STUDY Heat Maps
42 Categorical-Numerical
43 Multivariate Kernel Densities
44 Violin Plots
45 Box Plots
46 Limitations of Categorical Distributions
47 Numerical-Numerical
48 Correlation
49 Correlation vs. Causation
50 Visualizing Third Dimension
51 CASE STUDY Correlation
52 BEST PRACTICES Multivariate Profiling
53 Looking Ahead

Wrapping Up
54 BONUS LECTURE