Did you take a statistics class in high school or college and feel like you don't remember anything you learned? Have you felt intimidated about taking a statistics class in the past? Do you want or need to improve your data literacy skills? If you answered yes to any of these questions, this beginner-friendly series is for you!
Our Curriculum Developers will walk you through the new Master Statistics with Python skill path, leading you through a step-by-step data analysis from start to finish. We’ll be streaming live on Tuesdays at 4pm ET from January 12 through March 2, 2021.
We'll start by loading, exploring, and preparing a dataset for analysis in Python. We'll then cover descriptive statistics and data visualizations that can help you learn about your data and understand possible trends and relationships. Finally, we'll jump into the world of inferential statistics and cover hypothesis testing methods that are useful for understanding populations that we cannot observe. In the process, we'll get to know the NumPy and pandas libraries (among others), and demonstrate how to use Jupyter notebooks. By the end of the series, you'll be prepared to plan, implement, and interpret your own data analysis in Python.
How to watch
Every Tuesday at 4pm ET, we’ll be streaming our weekly session to YouTube, Twitch, Twitter, and Facebook. Each session will be approximately 1 hour. You can find out more about what will be covered in each of the sessions on the Codecademy Events Page. You can also add the events to your calendar so you don’t forget.
What we’ll cover
Each week, we’ll be covering a module or set of modules from the Master Statistics with Python skill path. This is a Pro course. The stream is free to everyone, but to complete the Path on your own, you’ll need to have a Codecademy Pro or Pro Student membership. For anyone who would like to code along with us off-platform, we recommend downloading Anaconda prior to the first session. You can also find all the code we'll be going through in our lessons on GitHub.
Check out the schedule below to learn more about what we’ll be covering each week.
Summary Statistics and Visualizations Part I
January 12, 2021 at 4pm ET
In this session, we'll walk through the process of loading a tabular dataset into Python, inspecting the data, and taking a first look at some of the variables. We'll also cover basic Python and pandas syntax and discuss some of the different kinds of data you might encounter.
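If you'd like a preview of what loading and inspecting a tabular dataset can look like, here is a minimal sketch with pandas. The tiny dataset is hypothetical and invented for illustration (it is not the dataset used in the session), and it is read from an in-memory string so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical data standing in for a real CSV file.
csv_text = """name,age,species
Rex,4,dog
Tom,7,cat
Ada,2,dog
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# First look: the first rows, the column types, and the category counts.
print(df.head())
print(df.dtypes)
print(df["species"].value_counts())
```

With a real file, you would pass its path to `pd.read_csv` instead of the `io.StringIO` wrapper.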
Watch the replay here:
Summary Statistics and Visualizations Part II
January 19, 2021 at 4pm ET
In this session, we'll continue to investigate our data with summary statistics and some basic data visualizations, using the Python libraries NumPy, pandas, matplotlib, and Seaborn. We'll also discuss how to choose an appropriate summary statistic to answer a particular question.
Watch the replay here:
Associations between Variables
January 26, 2021 at 4pm ET
In this session, we'll cover ways of assessing a relationship between two variables, using both summary statistics and data visualizations. For example, how could we use clinical trial data to get a sense of whether a vaccine appears to work?
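As a rough illustration of the vaccine question, one way to summarize a relationship between two categorical variables is a contingency table. The counts below are hypothetical, made up purely for this sketch, and the column names `group` and `infected` are our own choices.

```python
import pandas as pd

# Hypothetical trial results, invented for illustration only.
trial = pd.DataFrame({
    "group": ["vaccine"] * 100 + ["placebo"] * 100,
    "infected": ["yes"] * 5 + ["no"] * 95 + ["yes"] * 20 + ["no"] * 80,
})

# Raw counts of each outcome within each group.
counts = pd.crosstab(trial["group"], trial["infected"])

# Row-wise proportions: the infection rate within each group.
rates = pd.crosstab(trial["group"], trial["infected"], normalize="index")

print(counts)
print(rates)
```

Comparing the rates across rows gives a first sense of whether the two variables appear to be associated. It does not tell us whether the difference could be due to chance, which is where hypothesis testing comes in.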
Introduction to Hypothesis Testing: The Central Limit Theorem
February 2, 2021 at 4pm ET
In this session, we'll introduce inferential statistics and hypothesis testing by learning about the central limit theorem (CLT). The CLT is the mathematical theory behind a number of commonly used hypothesis tests, and we will demonstrate it using simulation (no math-y formulas)!
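To give a flavor of demonstrating the CLT by simulation, here is a minimal sketch using NumPy. The exponential population, the sample size of 50, and the helper name `sample_means` are all assumptions chosen for illustration, not necessarily what the session uses.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# A skewed, clearly non-normal "population" (exponential with mean 2.0).
population = rng.exponential(scale=2.0, size=100_000)


def sample_means(pop, sample_size, n_samples, rng):
    """Draw n_samples random samples and return the mean of each one."""
    samples = rng.choice(pop, size=(n_samples, sample_size), replace=True)
    return samples.mean(axis=1)


means = sample_means(population, sample_size=50, n_samples=5_000, rng=rng)

# The sample means center on the population mean...
print(population.mean(), means.mean())
# ...and their spread is close to population std / sqrt(sample size).
print(population.std() / np.sqrt(50), means.std())
```

Plotting a histogram of `means` would show a roughly bell-shaped distribution even though the population itself is heavily skewed.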
Hypothesis Testing: Simulating a Binomial Test
February 9, 2021 at 4pm ET
In this session, we'll implement our first hypothesis test, but we'll do it by writing our own simulation-based function in Python. The code might get a little tricky here, but the lessons learned will be invaluable for any statistician, data scientist, or data analyst.
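For a sense of what a simulation-based binomial test might look like, here is one possible sketch. The function name `simulated_binomial_test`, its parameters, and the example numbers are hypothetical. The two-sided p-value here counts simulated outcomes at least as far from the expected count as the observed one, which is one of several reasonable conventions.

```python
import numpy as np


def simulated_binomial_test(observed_successes, n, null_prob,
                            n_sims=10_000, seed=0):
    """Estimate a two-sided p-value for a binomial test by simulation."""
    rng = np.random.default_rng(seed)

    # Simulate the number of successes in n trials under the null hypothesis.
    simulated = rng.binomial(n, null_prob, size=n_sims)

    # How far is the observed count from what the null hypothesis expects?
    expected = n * null_prob
    observed_distance = abs(observed_successes - expected)

    # Proportion of simulations at least as extreme as the observed count.
    return np.mean(np.abs(simulated - expected) >= observed_distance)


# Hypothetical example: 41 successes in 500 trials, null success rate of 10%.
p_value = simulated_binomial_test(observed_successes=41, n=500, null_prob=0.10)
print(p_value)
```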
Hypothesis Testing: Significance Thresholds and Multiple Hypothesis Tests
February 16, 2021 at 4pm ET
In this session, we'll discuss some of the problems that can arise when hypothesis testing is misused. We'll cover error types and investigate what can go wrong when a single study involves multiple tests.
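To see why multiple tests can be a problem, consider the chance of at least one false positive across several tests. The sketch below assumes the tests are independent and that every null hypothesis is true. The function name `familywise_error_rate` is our own.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance threshold alpha."""
    return 1 - (1 - alpha) ** n_tests


# With a 0.05 threshold, the risk grows quickly as tests are added.
for n_tests in (1, 5, 20):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

Under these assumptions, running 20 tests at a 0.05 threshold gives roughly a 64% chance of at least one false positive.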
Hypothesis Testing for an Association
February 23, 2021 at 4pm ET
In this session, we'll turn our attention to hypothesis tests that can be used to evaluate an association between two variables among a population that we can't observe. If we have time, we'll even simulate our own two-sample t-test.
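One common way to simulate a two-sample comparison is a permutation test on the difference in means. Note that this is a related technique rather than a t-test itself, and it may differ from the approach taken in the session. The function name and example data below are hypothetical.

```python
import numpy as np


def permutation_test_mean_diff(group_a, group_b, n_perms=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means
    by repeatedly shuffling the group labels."""
    rng = np.random.default_rng(seed)
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)

    observed = abs(group_a.mean() - group_b.mean())
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)

    count = 0
    for _ in range(n_perms):
        # Shuffling mimics a world where group membership does not matter.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:n_a].mean() - shuffled[n_a:].mean())
        if diff >= observed:
            count += 1
    return count / n_perms


# Hypothetical measurements for two groups.
p_value = permutation_test_mean_diff([12, 15, 11, 14, 13], [18, 17, 20, 16, 19])
print(p_value)
```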
A/B Testing: Planning and Implementing an A/B test
March 2, 2021 at 4pm ET
In this session, we'll talk about experimental design and sample size determination in the context of an A/B test. A/B testing is often used by marketers to compare two versions of a website or product to determine if one is better for business.
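As a rough sketch of sample size determination, the function below uses a standard normal-approximation formula for comparing two proportions. This is one common approach and not necessarily the method used in the session. The function name, its parameters, and the example conversion rates are assumptions for illustration.

```python
import math
from statistics import NormalDist


def ab_test_sample_size(baseline_rate, target_rate, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided test
    comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    # Sum of the two binomial variances, one per group.
    variance = (baseline_rate * (1 - baseline_rate)
                + target_rate * (1 - target_rate))
    effect = target_rate - baseline_rate

    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical example: detecting a lift from a 10% to a 12% conversion rate.
n_per_group = ab_test_sample_size(baseline_rate=0.10, target_rate=0.12)
print(n_per_group)
```

Smaller differences between the two versions require larger samples, which is why it helps to decide on a minimum detectable effect before the test begins.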