When hiring for a new data scientist role on your team, it's tempting to picture the perfect candidate: someone who has years of experience in machine learning and AI, makes eye-catching charts and graphs (that are also perfectly accurate and intuitively insightful!), and builds Python packages on the weekends.
This elusive "unicorn" data scientist, also referred to as an "end-to-end data scientist" or a "full-stack data scientist," can do it all. From data processing to model building, from communication to strategy, they embody the T-shaped professional to a tee.
But unicorn data scientists are like the mythical 10x engineers. They do exist. We've all heard stories. But they are rare. And when you are building a team, it can be unrealistic to wait for, or chase after, a unicorn data scientist, especially if you have a strict hiring window.
However, there is a clear path to building a unicorn data team. Instead of searching for an individual who can do it all, you can reliably build an effective, high-impact data team.
The Dependency Chart of a Data Science Team
Let's first consider a "typical" data science team made up of a few data scientists. The standard responsibilities of this team fall into two broad areas:
- Performing data analytics to identify patterns and trends, and reporting those insights to stakeholders
  - Examples: product usage dashboards, sales reports, user behavior studies
- Using data to build models that power organizational decision making or are embedded in product applications
  - Examples: recommendation engines, classification models, customized user experiences
In order to accomplish these goals, a data science team needs support from other parts of the organization:
- You can't do data science without data, so the first step is often obtaining access to the right datasets. But data in its original format is rarely enough. Getting insight often requires combining disparate datasets and performing a large amount of data cleaning and transformation. When the data scale is significant, the data engineering effort scales up with it.
- After performing analytics and arriving at certain insights, it's up to the Business team(s) to make the ultimate decision and drive action.
- If the work is about building models, a data science team often needs support from the Engineering team to deploy the model. If the model is meant to be embedded in the product, then in addition to model deployment, there's also the work of productionizing the model and integrating it with the product.
Build a Capable, High-Impact Data Team
The key to building a unicorn data team is to recognize the skills required to deliver data science work end-to-end:
- Data engineering
- Analytics
- Models & algorithms
- Driving business decisions
- Deployment & productionization
You can design an effective team by assembling the roles that bring these skills to the table.
The Data Engineer
Skills: SQL, Python, ETL, data warehousing
Responsibilities: Design data pipelines, optimize data pipeline performance, ensure data quality
Look for someone who:
- Gets excited about scaling database performance
- Has a deep understanding of common database architecture
- Is likely also a fan of "Designing Data-Intensive Applications"
The Data Analyst
Skills: SQL, statistical analysis, data visualization, R or Python (either is excellent for this role)
Responsibilities: Perform exploratory data analysis, create dashboards, provide data-driven insights
Look for someone who:
- Has a keen eye for patterns in complex datasets, aka a "data detective"
- Can back up a conclusion with data and compelling visualizations
- Is familiar with the scientific process of hypothesis testing and statistical analysis
The ML / AI Engineer
Skills: Python, deep learning frameworks, MLOps
Responsibilities: Develop and deploy machine learning models, design algorithms for business or product use cases
Look for someone who:
- Stays current with the latest AI research
- Has strong Python proficiency, especially with ML and AI frameworks
- Is comfortable with the full ML lifecycle, from data prep to deployment
The Project Manager
Skills: Stakeholder management, business acumen, communication
Responsibilities: Define project scope, coordinate team efforts, ensure alignment with business goals
Look for someone who:
- Fluently speaks both the technical language and the business language
- Has a track record of building alignment across multiple teams
- Can break down a big project into smaller tasks, and manage timelines and workloads
As the team manager, you are likely the one who plays this role.
Create an Environment where Future Unicorns Emerge
As you assemble a unicorn data team by seeking out talent for specific roles, it's also important to consider the skill and career growth of your team members. More often than not, motivated data professionals will expand their skillset. They are eager to learn new tools and take on new challenges. A data engineer might, over time, transform into an ML / AI engineer. A data analyst might build on their communication skills and turn out to be an excellent fit for project management tasks. Given the right support, this individual growth will also make your team stronger.
While we can't count on hiring a unicorn data scientist, we can count on building a unicorn data team using the recipe above. Moreover, we can also count on the eventual development of a unicorn data scientist. For that, there are really only two ingredients:
- The will to continuously learn
- Time
So as you build a data team, don't wait for a unicorn data scientist to come along. Instead, build a team that can grow and thrive. Support the up-skilling of your team members, and very likely the unicorn data scientist that you have been waiting for will slowly emerge from your own team!