Python Data Science with Pandas: Master 12 Advanced Projects

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 14.5 Hours | 5.41 GB

Work with Pandas, SQL Databases, JSON, Web APIs & more to master your real-world Machine Learning & Finance Projects

Welcome to the first advanced and project-based Pandas Data Science Course!

This Course starts where many other courses end: You can write some Pandas code, but you are still struggling with real-world Projects because:

  • Real-World Data is typically not provided in a single file or a few text/Excel files -> more advanced Data Importing Techniques are required.
  • Real-World Data is large, unstructured, nested and unclean -> more advanced Data Manipulation and Data Analysis/Visualization Techniques are required.
  • Many easy-to-use Pandas methods work best with relatively small and clean Datasets -> real-world Datasets require more General Code (incorporating other Libraries/Modules).

No matter if you need excellent Pandas skills for Data Analysis, Machine Learning or Finance purposes, this is the right Course for you to get your skills to Expert Level! Master your real-world Projects!

This Course covers the full Data Workflow A-Z:

  • Import (complex and nested) Data from JSON files.
  • Import (complex and nested) Data from the Web with Web APIs, JSON and Wrapper Packages.
  • Import (complex and nested) Data from SQL Databases.
  • Store (complex and nested) Data in JSON files.
  • Store (complex and nested) Data in SQL Databases.
  • Work with Pandas and SQL Databases in parallel (getting the best of both worlds).
  • Efficiently import and merge Data from many text/CSV files.
  • Clean large and messy Datasets with more General Code.
  • Clean, handle and flatten nested and stringified Data in DataFrames.
  • Know how to handle and normalize Unicode strings.
  • Merge and Concatenate many Datasets efficiently.
  • Scale and Automate data merging.
  • Explanatory Data Analysis and Data Presentation with advanced Visualization Tools (advanced Matplotlib & Seaborn).
  • Test the Performance Limits of Pandas with advanced Data Aggregations and Grouping.
  • Data Preprocessing and Feature Engineering for Machine Learning with simple Pandas code.
  • Use your Data 1: Train and test Machine Learning Models on preprocessed Data and analyze the results.
  • Use your Data 2: Backtesting and Forward Testing of Investment Strategies (Finance & Investment Stack).
  • Use your Data 3: Index Tracking (Finance & Investment Stack).
  • Use your Data 4: Present your Data with Python in a nicely looking HTML format (Website Quality).
    and many more…
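
As a rough illustration of the import, flatten and store steps listed above, here is a minimal sketch. The records, column names and table name are invented for this example and are not taken from the course material; it assumes a recent Pandas version that provides pd.json_normalize (1.0 or later).

```python
import json
import sqlite3

import pandas as pd

# Hypothetical nested records, loosely shaped like a movie API response.
records = [
    {"id": 1, "title": "Movie A", "collection": {"name": "Saga"},
     "genres": [{"name": "Action"}, {"name": "Drama"}]},
    {"id": 2, "title": "Movie B", "collection": {"name": None},
     "genres": [{"name": "Comedy"}]},
]

# Flatten nested dicts into columns such as "collection_name".
movies = pd.json_normalize(records, sep="_")

# Lists cannot be stored in an SQLite column directly, so stringify them.
movies["genres"] = movies["genres"].apply(json.dumps)

con = sqlite3.connect(":memory:")
try:
    # DataFrame -> SQL table, then SQL query -> DataFrame.
    movies.to_sql("movies", con, index=False)
    restored = pd.read_sql("SELECT * FROM movies", con)
finally:
    con.close()

# Turn the stringified JSON back into Python lists of dicts.
restored["genres"] = restored["genres"].apply(json.loads)
```

Stringifying the list column before storing it is only one possible design; a separate genres table joined on the movie id would be the more relational alternative.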

What you’ll learn

  • Advanced Real-World Data Workflows with Pandas you won’t find in any other Course.
  • Working with Pandas and SQL Databases in parallel (getting the best out of two worlds)
  • Working with APIs, JSON and Pandas to import large Datasets from the Web
  • Bringing Pandas to its Limits (and beyond…)
  • Machine Learning Application: Predicting Real Estate Prices
  • Finance Applications: Backtesting & Forward Testing Investment Strategies + Index Tracking
  • Feature Engineering, Standardization, Dummy Variables and Sampling with Pandas
  • Working with large Datasets (millions of rows/columns)
  • Working with completely messy/unclean Datasets (the standard case in the real world)
  • Handling stringified and nested JSON Data with Pandas
  • Loading Data from Databases (SQL) into Pandas and vice versa
  • Loading JSON Data into Pandas and vice versa
  • Web-Scraping with Pandas
  • Cleaning large & messy Datasets (millions of rows/columns)
  • Working with APIs and Python Wrapper Packages to import large Datasets from the Web
  • Explanatory Data Analysis with large real-world Datasets
  • Advanced Visualizations with Matplotlib and Seaborn
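
To give a feel for the Standardization and Dummy Variables items in the list above, here is a small sketch in plain Pandas. The house data and column names are made up for illustration and do not come from the course's Real Estate dataset.

```python
import pandas as pd

# Hypothetical toy data: one numerical and one categorical feature plus a target.
houses = pd.DataFrame({
    "area": [50.0, 80.0, 110.0],
    "location": ["inland", "coast", "inland"],
    "price": [100, 200, 240],
})

# Separate the features from the target column.
features = houses.drop(columns="price")

# Standardization (z-score) using the sample standard deviation (ddof=1).
features["area"] = (features["area"] - features["area"].mean()) / features["area"].std()

# Dummy variables: one 0/1 column per category.
features = pd.get_dummies(features, columns=["location"], dtype=int)
```

In a real project the mean and standard deviation would typically be computed on the training set only and then reused for the test set.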

Table of Contents

Getting Started
1 Course Overview (don’t skip!)
2 Tips How to get the most out of this Course (don’t skip!)
3 FAQ Your Questions answered
4 How to download and install Anaconda for Python coding
5 Jupyter Notebooks – let’s get started
6 How to work with Jupyter Notebooks

Project 1 Explanatory Data Analysis & Data Presentation (Movies Dataset)
7 Project Overview
8 What are the most successful Franchises
9 The most successful Directors
10 The most successful Actors (Part 1)
11 The most successful Actors (Part 2)
12 Now it’s your turn (Homework)
13 Downloads (Project 1)
14 Project Brief for Self-Coders
15 Data Import from csv file and first Inspection
16 The best and the worst movies… (Part 1)
17 The best and the worst movies… (Part 2)
18 Which Movie would you like to see next
19 What are the most common Words in Movie Titles, Taglines and Overviews
20 Are Franchises more successful

Project 2 Data Import – Working with APIs and JSON (Movies Dataset)
21 Project Overview
22 Importing and Storing the Movies Dataset (Best Practice)
23 Importing and Storing the Movies Dataset (Real World Scenario)
24 Downloads (Project 2)
25 What is JSON
26 Importing Data from JSON files
27 JSON and Orientation/Formats
28 What is an API – The Movie Database API
29 Working with APIs and JSON (Part 1)
30 How to work with your own API-KEY
31 Working with APIs and JSON (Part 2)

Project 3 Data Cleaning – Tidy up messy Datasets (Movies Dataset)
32 Project Overview
33 How to clean Columns with DateTime Information
34 How to clean String Text Columns
35 How to remove Duplicates
36 Handling Missing Values & Removing Observations/Rows
37 Final Steps
38 Downloads (Project 3)
39 First Steps
40 Dropping irrelevant Columns
41 How to handle stringified JSON columns (Part 1)
42 How to handle stringified JSON columns (Part 2)
43 How to flatten nested Columns
44 How to clean Numerical Columns (Part 1)
45 How to clean Numerical Columns (Part 2)

Project 4 Merging, Cleaning & Transforming Data (Movies Dataset)
46 Project Overview
47 Downloads (Project 4)
48 Getting the Datasets
49 Preparing the Data for Merge
50 Merging the Data (Left Join)
51 Cleaning and Transforming the new Cast Column
52 Cleaning and Transforming the new Crew Column
53 Final Steps

Project 5 Working with Pandas and SQL Databases (Movies Dataset)
54 Project Overview
55 Final Case Study
56 Downloads (Project 5)
57 What is a Database SQL
58 How to create an SQLite Database
59 How to load Data from DataFrames into an SQLite Database
60 How to load Data from SQLite Databases into DataFrames
61 Some simple SQL Queries
62 Some more SQL Queries
63 Join Queries

Project 6 Importing & Concatenating many files (Baby Names Dataset)
64 Project Overview
65 Excursus Saving Memory – Categorical Features
66 Downloads (Project 6)
67 Getting the Data from the Web
68 Importing one File & Understanding the Data Structure (easy case)
69 Importing & merging many Files (easy case)
70 Final Steps
71 Importing one File & Understanding the Data Structure (complex case)
72 The glob module
73 Importing & merging many Files (complex case)

Project 7 Explanatory Data Analysis & Advanced Visualization (Baby Names)
74 Project Overview
75 Why does a Name’s Popularity suddenly change (Part 1)
76 Why does a Name’s Popularity suddenly change (Part 2)
77 Persistent vs. Spike-Fade Names
78 Most Popular Unisex Names
79 Downloads (Project 7)
80 First Inspection The most popular Names in 2018
81 Evergreen Names (1880 – 2018)
82 Advanced Data Aggregation
83 What are the most popular Names of all Times
84 General Trends over Time (1880 – 2018)
85 Creating the Features Popularity and Rank
86 Visualizing Name Trends over Time

Project 8 Data Preprocessing & Feature Engineering for Machine Learning
87 Project Overview
88 Training the ML Model (Random Forest)
89 Evaluating the Model on the Test Set
90 Feature Importance
91 Downloads (Project 8)
92 Data Import and first Inspection
93 Data Cleaning and Creating additional Features
94 Which Factors influence House Prices
95 Advanced Explanatory Data Analysis with Seaborn
96 Feature Engineering – Part 1
97 Feature Engineering – Part 2
98 Splitting the Data into Train and Test Set

Project 9 Data Import – Web Scraping, APIs & Wrappers (US Stocks)
99 Project Overview
100 Downloads (Project 9)
101 Web Scraping – the Dow Jones Constituents
102 Normalizing Unicode Strings and Getting the Ticker Symbols
103 Download and Installation of an API Wrapper Package
104 Loading and Saving Historical Stock Prices

Project 10 (Finance Stack) Backtesting Investment Strategies (US Stocks)
105 Project Overview
106 Backtesting the Perfect Strategy (…in case you can predict the future…)
107 Downloads (Project 10)
108 Importing the Data
109 Data Visualization & Returns
110 Backtesting a simple Momentum Strategy
111 Backtesting a simple Contrarian Strategy
112 More complex Strategies & Backtesting vs. Fitting
113 Simple Moving Averages (SMA)
114 Backtesting Simple Moving Averages (SMA) Strategies

Project 11 (Finance Stack) Index Tracking and Forward Testing (US Stocks)
115 Project Overview
116 Forward Testing (Part 1)
117 Forward Testing (Part 2)
118 Downloads (Project 11)
119 Importing & Merging the Data
120 Transforming the Data
121 Explanatory Data Analysis (Risk, Return & Correlations)
122 Index Tracking – Introduction
123 Index Tracking – Selecting the Tracking Stocks
124 Index Tracking – A simple Tracking Portfolio
125 Index Tracking – The optimal Tracking Portfolio

Project 12 Explanatory Data Analysis and Seaborn Visualization (Olympic Games)
126 Project Overview
127 Aggregating and Ranking
128 Summer Games vs. Winter Games – does Geographical Location matter
129 Men vs. Women – do Culture & Religion matter
130 Do Traditions matter
131 Downloads (Project 12)
132 Data Import and first Inspection
133 Merging and Concatenating
134 Data Cleaning (Part 1)
135 Data Cleaning (Part 2)
136 What are the most successful countries of all times
137 Do GDP, Population and Politics matter
138 Statistical Analysis and Hypothesis Testing with scipy

Extra Project Prepare yourself for the Future – Pandas Version 1.0
139 Intro and Overview
140 The NEW StringDtype
141 The NEW nullable BooleanDtype
142 Addition of the ignore_index parameter
143 Removal of prior Version Deprecations
144 How to update Pandas to Version 1.0
145 Downloads for this Section
146 Important Recap Pandas Display Options (Changed in Version 0.25)
147 info() method – new and extended output
148 NEW Extension dtypes (nullable dtypes) Why do we need them
149 Creating the NEW extension dtypes with convert_dtypes()
150 NEW pd.NA value for missing values
151 The NEW nullable Int64Dtype

Appendix Pandas Crash Course
152 Intro to Tabular Data Pandas
153 Downloads for this Section
154 Create your very first Pandas DataFrame (from csv)
155 Pandas Display Options and the methods head() & tail()
156 First Data Inspection
157 Built-in Functions, Attributes and Methods with Pandas
158 Selecting Columns
159 Selecting one Column with the dot notation
160 Zero-based Indexing and Negative Indexing
161 Selecting Rows with iloc (position-based indexing)
162 Slicing Rows and Columns with iloc (position-based indexing)
163 Position-based Indexing Cheat Sheets
164 Selecting Rows with loc (label-based indexing)
165 Slicing Rows and Columns with loc (label-based indexing)
166 Label-based Indexing Cheat Sheets
167 First Steps with Pandas Series
168 Analyzing Numerical Series with unique(), nunique() and value_counts()
169 Analyzing non-numerical Series with unique(), nunique(), value_counts()
170 Sorting of Series and Introduction to the inplace parameter
171 Filtering DataFrames by one Condition
172 Filtering DataFrames by many Conditions (AND)
173 Filtering DataFrames by many Conditions (OR)
174 Creating Columns based on other Columns
175 User-defined Functions with apply(), map() and applymap()
176 Data Visualization with Matplotlib
177 GroupBy – an Introduction
178 Understanding the GroupBy Object
179 Splitting with many Keys
180 split-apply-combine explained
181 split-apply-combine applied
182 Data with DateTime Information – Part 1
183 Data with DateTime Information – Part 2
184 Data with DateTime Information – Part 3
185 Data with DateTime Information – Part 4

What’s next
186 Get your special BONUS here!