English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 180 lectures (8h 41m) | 3.17 GB
Learn how to use Python & Pandas to gather, clean, explore and analyze data for Data Science and Machine Learning
This is a hands-on, project-based course designed to help you master the core building blocks of Python for data science.
We’ll start by introducing the fields of data science and machine learning, discussing the difference between supervised and unsupervised learning, and reviewing the data science workflow we’ll be using throughout the course.
From there we’ll do a deep dive into the data prep & EDA steps of the workflow. You’ll learn how to scope a data science project, use Pandas to gather data from multiple sources and handle common data cleaning issues, and perform exploratory data analysis using techniques like filtering, grouping, and visualizing data.
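As a rough illustration of the gather, clean, and explore steps described above, here is a minimal Pandas sketch. The data, column names, and cleaning choices (title-casing text, imputing with the median) are assumptions made for this example and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical listening data; columns and values are invented for illustration.
raw_csv = io.StringIO(
    "customer,plan,signup_date,minutes\n"
    "Ana,Premium,2023-01-05,120\n"
    "Ben,basic,2023-02-10,\n"
    "Ben,basic,2023-02-10,\n"
    "Cy,Basic,2023-03-15,45\n"
)

# Gather: read a flat file into a DataFrame.
df = pd.read_csv(raw_csv)

# Clean: convert data types, standardize inconsistent text,
# drop duplicate rows, then impute missing values with the median.
df["signup_date"] = pd.to_datetime(df["signup_date"])
df["plan"] = df["plan"].str.title()
df = df.drop_duplicates().reset_index(drop=True)
df["minutes"] = df["minutes"].fillna(df["minutes"].median())

# Explore: filter rows, then group and aggregate.
basic = df[df["plan"] == "Basic"]
avg_minutes = df.groupby("plan")["minutes"].mean()

print(df)
print(avg_minutes)
```

Dropping duplicates before imputing keeps the repeated row from skewing the median; whether to impute or remove missing values at all depends on the dataset.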
Throughout the course, you’ll play the role of a Jr. Data Scientist for Maven Music, a streaming service that’s been struggling with customer churn. You’ll apply each new skill by using Python to gather, clean, and explore the company’s data and provide insights about its customers.
Last but not least, you’ll practice preparing data for machine learning models by joining multiple tables, adjusting row granularity, and engineering useful fields and features.
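The three preparation steps named above (joining tables, adjusting row granularity, and engineering features) could look something like the following sketch. The two tables, their columns, and the derived feature are hypothetical and only stand in for the kind of data the course project might use.

```python
import pandas as pd

# Hypothetical tables; names and values are invented for illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "plan": ["Basic", "Premium", "Basic"],
})
listens = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "minutes": [30, 45, 60, 10, 20, 30],
})

# Adjust row granularity: one row per customer
# instead of one row per listening session.
per_customer = (
    listens.groupby("customer_id")
    .agg(total_minutes=("minutes", "sum"), sessions=("minutes", "count"))
    .reset_index()
)

# Join the tables into a single table.
model_df = customers.merge(per_customer, on="customer_id", how="left")

# Engineer a new feature, then turn the text column into
# numeric dummy variables so the table is fully numeric.
model_df["minutes_per_session"] = model_df["total_minutes"] / model_df["sessions"]
model_df = pd.get_dummies(model_df, columns=["plan"], dtype=int)

print(model_df)
```

The result has one numeric, non-null row per customer, which is the shape most machine learning libraries expect as input.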
COURSE OUTLINE:
Intro to Data Science
Introduce the field of data science, review essential skills, and introduce each phase of the data science workflow
Scoping a Project
Review the process of scoping a data science project, including brainstorming problems and solutions, choosing techniques, and setting clear goals
Gathering Data
Read flat files into a Pandas DataFrame in Python, and review common data sources & formats, including Excel spreadsheets and SQL databases
Cleaning Data
Identify and convert data types, find and fix common data issues like missing values, duplicates, and outliers, and create new columns for analysis
Exploratory Data Analysis
Explore datasets to discover insights by sorting, filtering, and grouping data, then visualize it using common chart types like scatterplots & histograms
MID-COURSE PROJECT
Put your skills to the test by cleaning, exploring, and visualizing data from a brand-new data set containing Rotten Tomatoes movie ratings
Preparing for Modeling
Structure your data so that it’s ready for machine learning models by creating a numeric, non-null table and engineering new features
FINAL COURSE PROJECT
Apply all the skills learned throughout the course by gathering, cleaning, exploring, and preparing multiple data sets for Maven Music
What you’ll learn
- Master the core building blocks of Python for data science BEFORE applying machine learning algorithms
- Scope data science projects by clearly defining the goals, techniques, and data sources needed for your analysis
- Import and export flat files, Excel workbooks, and SQL database tables using Pandas
- Clean data by converting data types, handling common data issues, and creating new columns for analysis
- Perform exploratory data analysis (EDA) by sorting, filtering, grouping, and visualizing data to discover patterns and insights
- Prepare data for machine learning models by joining tables, aggregating rows, and applying feature engineering techniques
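For a sense of what importing and exporting with Pandas involves, here is a small sketch that round-trips a DataFrame through CSV text and an in-memory SQLite database. The table name and data are made up for the example. Excel is left out because `pd.read_excel` needs an extra engine such as openpyxl installed.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({"artist": ["A", "B"], "plays": [10, 20]})

# Flat file: export to CSV text, then read it back.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
from_csv = pd.read_csv(buffer)

# SQL: write to an in-memory SQLite table, then query it.
conn = sqlite3.connect(":memory:")
df.to_sql("plays", conn, index=False)
from_sql = pd.read_sql("SELECT artist, plays FROM plays WHERE plays > 10", conn)
conn.close()

print(from_csv)
print(from_sql)
```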
Table of Contents
Getting Started
Course Introduction
About This Series
Course Structure & Outline
READ ME Important Notes for New Students
DOWNLOAD Course Resources
Introducing the Course Project
Setting Expectations
Intro to Data Science
Section Introduction
What is Data Science
Data Science Skill Set
What is Machine Learning
Common Machine Learning Algorithms
Data Science Workflow
Step 1 Scoping a Project
Step 2 Gathering Data
Step 3 Cleaning Data
Step 4 Exploring Data
Step 5 Modeling Data
Step 6 Sharing Insights
Data Prep & EDA
Key Takeaways
Intro to Data Science
Scoping a Project
Section Introduction
Project Scoping Steps
Think Like an End User
Brainstorm Problems
Brainstorm Solutions
Supervised vs Unsupervised Learning
Identify Data Requirements
Data Structures
Model Features
Data Sources
Data Scope
Summarize the Scope
Key Takeaways
Scoping a Project
Installing Jupyter Notebook
Section Introduction
Why Python
Installing Anaconda
Launching Jupyter Notebook
The Notebook Interface
Edit vs Command Mode
The Code Cell
The Markdown Cell
Helpful Resources & Key Takeaways
Installing Jupyter Notebook
Gathering Data
Section Introduction
Data Gathering Process
Data Sources
Structured vs Unstructured Data
The Pandas DataFrame
Reading Flat Files
DEMO Reading Flat Files
Reading Excel Files
Connecting to a SQL Database
Quickly Exploring a DataFrame
ASSIGNMENT Gathering Data
SOLUTION Gathering Data
Key Takeaways
Gathering Data
Cleaning Data
Section Introduction
Data Cleaning Overview
Data Types
Converting to DateTime
Converting to Numeric
DEMO Converting Data Types
ASSIGNMENT Converting Data Types
SOLUTION Converting Data Types
Data Issues Overview
Finding Missing Data
DEMO Finding Missing Data
Handling Missing Data
Removing Missing Data
Imputing Missing Data
Resolving Missing Data
ASSIGNMENT Missing Data
SOLUTION Missing Data
Finding Inconsistent Text & Typos
Handling Inconsistent Text & Typos
Updating Values Based on a Logical Condition
Mapping Values
Cleaning Text
ASSIGNMENT Inconsistent Text & Typos
SOLUTION Inconsistent Text & Typos
Finding Duplicate Data
Handling Duplicate Data
ASSIGNMENT Duplicate Data
SOLUTION Duplicate Data
Finding Outliers
Histograms
Box Plots
Standard Deviation
Handling Outliers
DEMO Review Cleaned Data
ASSIGNMENT Outliers
SOLUTION Outliers
Creating New Columns
Creating Numeric Columns
DEMO Creating Numeric Columns
ASSIGNMENT Creating Numeric Columns
SOLUTION Creating Numeric Columns
Creating DateTime Columns
DEMO Creating DateTime Columns
ASSIGNMENT Creating DateTime Columns
SOLUTION Creating DateTime Columns
Creating Text Columns
DEMO Creating Text Columns
ASSIGNMENT Creating Text Columns
SOLUTION Creating Text Columns
Key Takeaways
Cleaning Data
Exploratory Data Analysis
Section Introduction
Exploratory Data Analysis Overview
Filtering
DEMO Filtering
Sorting
DEMO Sorting
Grouping
DEMO Grouping
ASSIGNMENT Exploring Data
SOLUTION Exploring Data
Data Visualization Overview
Data Visualization with Pandas
DEMO Data Visualization with Pandas
Pair Plots
DEMO Pair Plots
Distributions
DEMO Distributions
Common Distributions
The Normal Distribution
ASSIGNMENT Distributions
SOLUTION Distributions
Scatter Plots
DEMO Scatter Plots
Correlations
DEMO Correlations
ASSIGNMENT Correlations
SOLUTION Correlations
Data Visualization in Practice
EDA Tips
Key Takeaways
Exploratory Data Analysis
Mid-Course Project
Mid-Course Project Overview
SOLUTION Exploring Data
SOLUTION Creating New Columns
SOLUTION Visualizing Data
Preparing for Modeling
Section Introduction
Case Study Preparing for Modeling
Data Prep for EDA vs Modeling
Model Preparation Steps
Creating a Single Table
Appending
DEMO Appending
Joining
DEMO Joining
Types of Joins
DEMO Types of Joins
DEMO Creating a Single Table
ASSIGNMENT Creating a Single Table
SOLUTION Creating a Single Table
Preparing Rows for Modeling
DEMO Preparing Rows for Modeling
ASSIGNMENT Preparing Rows for Modeling
SOLUTION Preparing Rows for Modeling
Preparing Columns for Modeling
Dummy Variables
DEMO Dummy Variables
Preparing DateTime Columns
DEMO Preparing DateTime Columns
ASSIGNMENT Prepare Columns for Modeling
SOLUTION Prepare Columns for Modeling
Feature Engineering
Feature Transformations
Feature Scaling
Proxy Variables
Feature Engineering Tips
ASSIGNMENT Feature Engineering
SOLUTION Feature Engineering
PREVIEW Applying Algorithms
Key Takeaways
Preparing for Modeling
Final Course Project
Final Project Overview
SOLUTION Gathering Data
SOLUTION Cleaning Data
SOLUTION Exploratory Data Analysis
SOLUTION Preparing for Modeling
Wrapping Up
BONUS LESSON