The Full Stack Data Scientist

Chris Schon
Apr 04, 2020

What a data scientist should know to build end-to-end data science solutions

Stack Overflow recently released its 2019 developer survey. It was full of interesting insights into everything from developers’ preferred technologies to their optimism about the future.

It made me think about the role of data science in technology and the skills a data scientist needs to integrate into the wider ecosystem. Developers have coined the term ‘full stack’ for a developer who is comfortable working on all aspects of web development.

What would be the equivalent for data science?
Most respondents (51.9%) identify their roles as ‘full-stack developers’, with ‘data scientist or machine learning specialist’ taking up 7.9% of responses. 
Other data-related roles include data or business analyst (7.7%), data engineer (7.2%) and scientist (4.4%).

Stack Overflow survey 2019 ‘Developer Types’

Since many data scientists don’t have the luxury of support from large teams of developers, they must be able to build things and perform tasks that aren’t traditionally thought of as part of their role.

This could relate to business analysis, data engineering, DevOps, database management and web development. I would consider a data scientist who is capable in all these areas to be a full stack data scientist. It’s not an option in the survey, yet… :)

The ability to build end-to-end solutions is the best way to prepare yourself for any role or project, work with a variety of teams, and ensure your insights bring value to the business. I believe that to do this, you must have good knowledge of each of these areas:

💼 Business analysis. A sound understanding of the requirements, available data and goals of a project.

🏛 Infrastructure. The ability to efficiently design, deploy and work with a wide range of technologies and data management systems.

🚂 ETL. Data scientists should be able to build effective data processing pipelines so that their models and analysis are easily maintained.

💡 Machine learning. Extensive knowledge of techniques to build intelligent systems.

🖥 DevOps. Version control, deployment and monitoring of solutions are made easier by tools like Git, Docker and Airflow.

📱 Web app & API development. Building simple web applications and API endpoints will make it easier to integrate insights into other applications.

📊 Data visualisation. Create intuitive visualisations using a variety of tools.

The aim of this series is to cover each of these areas. If we are showcasing a particular tool, the post will walk through a GitHub repository.

The first part of the series is already live! Check out The Full Stack Data Scientist Part 1: Productionise Your Models with Django APIs.


