Catherine Zhou manages the data science team at Codecademy. She’s been working at Codecademy for three years and as a data scientist for ten. Because the job title of “Data Scientist” is relatively new, Catherine’s title has ranged over the years from quant to researcher to business intelligence analyst to engineer, but through it all she’s been doing the work in the field that’s known today as “Data Science.”
If you’re not sure what a data scientist does, you’re not alone. In an interview with Lillian from the Codecademy Community Team, Catherine shares:
“I think there are a lot of misconceptions about what data science is, what skills it takes to be a data scientist, and the role of data scientists in the workplace. Data science, as it exists today, is a pretty amorphous space. This can cause confusion and mismatched expectations between data scientists and non-data scientists.”
“It also can lead to imposter syndrome and confusion about career trajectory (I talk to so many people working in data science who aren’t sure if they can call themselves data scientists!). I'm optimistic that this will improve over time, but in the meantime, a big part of my job includes educating others about the field of data science and aligning on expectations.”
In this article, Catherine provides some insight into the world of data science, talks about her day-to-day, and helps us answer the elusive question, “What does a data scientist do?”
But first… what is data science?
The first step to understanding what a data scientist does is to understand what data science is. The following definition comes straight out of Code Foundations, a Codecademy Career Path designed to provide an overview of the main applications of programming.
“Data gives us information about the way the world works. And information can carry meaning - from a click telling us what someone likes, to toxins in the water signaling a health concern. But data is meaningless unless we do something with it. That’s where data science comes in.
“Data science enables us to take data and transform it into meaningful information that can help us make decisions. Data science is interdisciplinary and combines other well-known fields such as probability, statistics, analytics, and computer science. The work that you might do could range from writing up reports to building machine learning models. No matter what your interests are, data science is applicable - because these days, we have data on everything!”
What does a data scientist do?
In our interview with Catherine, she explains the skills required for data scientists. “Data science is defined as sort of the intersection between statistics, software engineering, and domain or business knowledge. So you have to have a little bit of coding skills, a little bit of statistics skills, and a little bit of knowledge about your business.”
Catherine says she thinks the piece of data science that’s the most overlooked is the business or domain side. As a data scientist, you’ll be working with specific stakeholders, teams, or parts of an organization to use data to help them achieve their goals or answer their questions.
“A lot of companies are trying to figure out how best to integrate data people into their organizations,” Catherine says. As a result, data scientists contribute to strategy, decision making, and the implementation of analysis in a wide range of ways. As you can imagine, the role of a data scientist may look very different depending on what company you’re working for and what business domain you’re working in!
What skills do data scientists need?
That said, there are a number of skills that are shared by data scientists across the board. If you’re thinking of becoming a data scientist, you’ll want to build your skillset in the following areas:
- Descriptive and inferential statistics
- Programming (specifically SQL, and Python or R)
- A passion for diving deep into the data for the specific field you plan to work in
Of course, there’s a whole lot to learn in each of these areas. But Catherine explains that you shouldn’t feel like you have to learn it all:
“I’m always humbled by how much more I have to learn. Originally when I broke into the field I felt really overwhelmed and felt a lot of imposter syndrome about having to learn a lot. But I realized that when you work in data analysis or statistics you end up specializing in one part of it.
“You might specialize in predictive analysis; you might specialize in reporting; you might specialize in machine learning or artificial intelligence. There are so many subsets — usually data scientists will focus on one thing and get really good at it.”
A day in the life of a (remote) data scientist
In her interview with Lillian, Catherine shared a glimpse into what a typical day looked like as a data scientist at Codecademy before we shifted to remote work. Here’s her update, in her own words, on how that has changed in recent months since we started working remotely.
Growing the team while working from home
Data Science at Codecademy has grown a lot over the last few months, and we’ve added quite a few new team members. A few people started right before we shut our offices down, and it’s been a challenge figuring out how to move their onboarding online. As a manager, it's my job to make sure my team feels focused and engaged, which can be really tough during a pandemic. These days, I spend most of my time thinking about how to collaborate effectively through a crisis, and how the company and I can support our team and our learners. The hardest part is taking everything day by day, since we’re not able to plan ahead.
Meetings and written communication
Meetings and communication play a bigger role in my day-to-day than ever before. The DS team gets together regularly to align on projects, share work, and hold virtual pair programming sessions. We try to document everything in writing these days, to make sure we’re communicating effectively and no decision is lost or unclear. We keep written how-to wikis for debugging and investigations, results of experiments and analyses, team planning docs, and project plans. We do regular code review, and get together to demo and share work for feedback. Maintaining these processes is time-consuming, but documenting knowledge well will serve us for years to come. In many ways, shifting to remote work allowed us to really think about which processes were working well and which weren’t, and to be very intentional about our channels of communication and project prioritization.
Hands-on data science work
At most tech companies, technical managers are expected to do a mix of IC (individual contributor) and people management work. As a manager, I reserve about 20% of my time, or about one day a week, for hands-on data science work. Examples of projects we've worked on include building a model of learner engagement and retention, redesigning our database and ETLs to improve speed and performance, building internal data tools for other teams, and designing experiments and choosing the right statistical tests to evaluate results. Lately, our team has been spending a lot of time thinking about developing frameworks to scale Data Science work. This means automating repetitive or manual work through scripting.
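To make "choosing the right statistical tests to evaluate results" a little more concrete, here is a minimal Python sketch of one common choice for an A/B experiment, a two-proportion z-test. The function name, the sample counts, and the choice of test are assumptions made purely for illustration; they do not describe Codecademy's actual experiments or tooling.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one conversion rate."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that the groups do not differ
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, computed with the error function
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts: 120 of 1000 learners convert in control, 150 of 1000 in the variant
z, p = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A test like this suits large samples and a yes/no outcome; a different metric (say, minutes spent learning) or a small sample would call for a different test, which is part of the judgment the role involves.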
Unplugging from work
Working effectively during a pandemic can be really difficult, and I try to have empathy for myself and my coworkers when trying to strike the right balance between work and home life. We’re lucky to work in an industry and role that can be done from home. I adopted two cats, named Mu and Sigma after the parameters of the normal distribution, and I try to signal a close to my workday by playing with them, calling a friend, cooking dinner, or reading a book.
Is data science right for you?
When asked if she always wanted to be a data scientist, Catherine shared, “This might be weird, but I was also really into probabilistic thinking and used to think about how it applied to my day-to-day decisions. I would try to calculate things like: if I miss this traffic light, what are the chances I’ll miss the next two lights? How much longer would that lengthen my commute?”
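Catherine's traffic-light question can be sketched in a few lines of Python. This is only a rough illustration under strong assumptions: every light is treated as independent, and the miss probability and average wait below are made-up numbers, not figures from the interview.

```python
def chance_of_missing_next(lights, p_miss):
    """Probability of missing every one of the next `lights` lights, assuming independence."""
    return p_miss ** lights

def expected_extra_minutes(lights, p_miss, wait_minutes):
    """Expected added commute time: each light costs `wait_minutes` with probability `p_miss`."""
    return lights * p_miss * wait_minutes

p_miss = 0.4        # assumed chance of catching a red at any single light
wait_minutes = 1.5  # assumed average wait at a red light

print(chance_of_missing_next(2, p_miss))
print(expected_extra_minutes(2, p_miss, wait_minutes))
```

Real traffic lights are often timed relative to one another, so missing one can change the odds at the next. Relaxing the independence assumption is exactly the kind of refinement a data scientist would start asking about.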
If you’ve found yourself trying to make similar calculations, are curious about analyzing human behavior, or get excited about using data to uncover interesting or surprising information, a career in data science may be in your future!