From Novice To Data Hero: Top Resources For Becoming A Citizen Data Scientist
It is estimated that 90% of the world's data was generated in the last two years alone (sources: Statista, Bernard Marr & Co.). The total volume of data has grown roughly 65-fold, from just 2 zettabytes (2 trillion GB) in 2010 to an estimated 129 zettabytes (129 trillion GB) in 2023. Yes, that is a HUGE number!

With that much data being generated, data literacy has become an essential skill, empowering individuals to navigate, interpret, and harness this vast sea of information for informed decision-making and to protect their digital footprint in an increasingly data-driven world.
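As a small exercise in data literacy, the growth factor and gigabyte equivalents implied by the 2 ZB and 129 ZB figures can be checked in a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where 1 zettabyte is 10^21 bytes and 1 gigabyte is 10^9 bytes; the function and variable names are illustrative only.

```python
# Sanity-check the data-growth figures, assuming decimal (SI) units.
BYTES_PER_ZB = 10**21  # 1 zettabyte in bytes
BYTES_PER_GB = 10**9   # 1 gigabyte in bytes


def zb_to_gb(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


data_2010_zb = 2    # 2010 figure quoted in the text
data_2023_zb = 129  # 2023 figure quoted in the text

# Ratio of the 2023 volume to the 2010 volume
growth_factor = data_2023_zb / data_2010_zb

print(f"2010: {zb_to_gb(data_2010_zb):,} GB")
print(f"2023: {zb_to_gb(data_2023_zb):,} GB")
print(f"Growth: {growth_factor:.1f}x")
```

Under these assumptions, one zettabyte is one trillion gigabytes, and the two endpoint figures imply growth of about 64.5x.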
But First, What Is a Citizen Data Scientist?
This is the second post in a 2-part blog series on the Citizen Data Scientist. In part 1, we discussed in detail the emergence of this relatively new role in many large organizations, the genesis of the trend, and the challenges the role faces.
Citizen Data Scientist = Business expert + Data Scientist - PhD in data
The Decentralization of Data Expertise
Data is rapidly becoming a universal artifact in every team across organizations. There's a growing trend of data teams becoming decentralized, with each functional team requiring its own data expert. These experts possess the skills to work with and analyze data for quick decision-making without relying heavily on a centralized data science team. This shift reflects the increasing importance of data literacy across all business functions. As a result, many are taking on the role of citizen data scientists within their teams, bridging the gap between domain expertise and data analysis.
10 Essential Skills Required To Become A Citizen Data Scientist
These skills combine technical proficiency with business knowledge and soft skills, allowing citizen data scientists to bridge the gap between advanced analytics and business operations, even without formal data science training.
Top 10 Online Courses to Develop Your Skills as a Citizen Data Scientist
- Coursera - Data Science Specialization by Johns Hopkins University
- edX - Data Science Essentials by Microsoft
- DataCamp - Introduction to R (free courses available)
- Google Analytics Academy - Various courses on data analytics
- Kaggle - Intro to Machine Learning
- Stanford Online - Statistical Learning (free course materials)
- FutureLearn - Data Science: Visualization
- IBM Cognitive Class - Data Science Fundamentals
- Harvard Online - Data Science: R Basics
- YouTube - StatQuest with Josh Starmer (channel for statistics and machine learning)
Top 10 Books to Level Up Your Data Skills
- Data Science for Business by Foster Provost and Tom Fawcett
- Storytelling with Data by Cole Nussbaumer Knaflic
- Naked Statistics by Charles Wheelan
- The Art of Statistics by David Spiegelhalter
- Predictive Analytics by Eric Siegel
- Data Smart by John W. Foreman
- Python for Data Analysis by Wes McKinney
- R for Data Science by Hadley Wickham and Garrett Grolemund
- The Signal and the Noise by Nate Silver
- Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schönberger and Kenneth Cukier
Top 10 Podcasts to Enhance Your Data Knowledge
- Data Skeptic
- Linear Digressions
- DataFramed
- The O'Reilly Data Show
- Data Stories
- Partially Derivative
- Not So Standard Deviations
- Data Crunch
- Analytics on Fire
- The Data Scientist Show
Top 10 YouTube Channels to Learn Visually
Top 5 Data Science Communities
Kaggle
Kaggle is a well-known platform for data science competitions, where users can share datasets, explore machine learning models, and participate in challenges. It offers a supportive community for both beginners and experienced data scientists.
Reddit
Reddit hosts several active subreddits focused on data science, such as r/datascience, r/machinelearning, and r/dataisbeautiful. These communities provide a platform for discussions, sharing resources, and seeking advice from peers.
IBM Data Science Community
This community offers expert insights, discussions, and resources related to data science challenges. It's a great place to connect with industry professionals and stay updated on the latest trends.
Data Science Central
Data Science Central is one of the largest online communities for data scientists. It features forums, blogs, and articles, making it a valuable resource for networking and learning about industry trends.
Open Data Science
This community focuses on collaboration among data scientists, engineers, and students. It offers a variety of resources, including articles, tutorials, and events, fostering an inclusive environment for learning and sharing.
Tools of the Trade
While there are dozens of tools available to data professionals, we have narrowed the list to the most popular ones and rated each on ease of use, cost, and skill level required:
Category | Tool | Ease of Use (1 Most Difficult - 5 Easiest) | Cost ($/year) | Skill Level |
---|---|---|---|---|
Cleaning | OpenRefine | 4 | Free | Beginner |
Cleaning | Trifacta | 3 | (5,000-50,000) | Intermediate |
Cleaning | Talend | 3 | (1,000-200,000) | Intermediate |
Cleaning | Alteryx | 2 | (5,200-80,000) | Advanced |
Cleaning | Dataiku | 3 | (20,000-200,000) | Intermediate |
Analysis | Excel | 3 | (70-160) | Beginner |
Analysis | R | 2 | Free | Advanced |
Analysis | Python | 3 | Free | Intermediate |
Analysis | SAS | 2 | (8,000-210,000) | Expert |
Analysis | SPSS | 3 | (1,200-7,500) | Intermediate |
Visualization | Tableau | 4 | (70-840) | Intermediate |
Visualization | Power BI | 4 | (120-9,000) | Intermediate |
Visualization | QlikView | 3 | (1,500-35,000) | Advanced |
Visualization | Looker | 3 | (3,000-5,000) | Intermediate |
Visualization | D3.js | 1 | Free | Expert |
Cleaning, Analysis, Visualization | Querri | 5 | (900+) | Beginner |
Note:
- Ease of Use is rated on a scale of 1-5, with 5 being the easiest to use.
- Cost is given as a range (min-max) or average per year in USD. Some tools have wide ranges due to different editions or licensing models.
- Skill Level is categorized as Beginner, Intermediate, Advanced, or Expert.
- These rankings are subjective and may vary based on individual experiences and specific use cases.
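For readers who want to narrow the table down to their own situation, the rows can be filtered with a few lines of Python. This is only a sketch: the rows are copied from the table above, while the `shortlist` function and its default thresholds are hypothetical choices made for illustration, not a recommendation method from this article.

```python
# Rows copied from the table: (category, tool, ease of use, cost in $/year, skill level).
TOOLS = [
    ("Cleaning", "OpenRefine", 4, "Free", "Beginner"),
    ("Cleaning", "Trifacta", 3, "5,000-50,000", "Intermediate"),
    ("Cleaning", "Talend", 3, "1,000-200,000", "Intermediate"),
    ("Cleaning", "Alteryx", 2, "5,200-80,000", "Advanced"),
    ("Cleaning", "Dataiku", 3, "20,000-200,000", "Intermediate"),
    ("Analysis", "Excel", 3, "70-160", "Beginner"),
    ("Analysis", "R", 2, "Free", "Advanced"),
    ("Analysis", "Python", 3, "Free", "Intermediate"),
    ("Analysis", "SAS", 2, "8,000-210,000", "Expert"),
    ("Analysis", "SPSS", 3, "1,200-7,500", "Intermediate"),
    ("Visualization", "Tableau", 4, "70-840", "Intermediate"),
    ("Visualization", "Power BI", 4, "120-9,000", "Intermediate"),
    ("Visualization", "QlikView", 3, "1,500-35,000", "Advanced"),
    ("Visualization", "Looker", 3, "3,000-5,000", "Intermediate"),
    ("Visualization", "D3.js", 1, "Free", "Expert"),
    ("Cleaning, Analysis, Visualization", "Querri", 5, "900+", "Beginner"),
]


def shortlist(tools, min_ease=4, skill_levels=("Beginner",)):
    """Return names of tools rated at least `min_ease` whose skill level is in `skill_levels`."""
    return [
        name
        for _category, name, ease, _cost, skill in tools
        if ease >= min_ease and skill in skill_levels
    ]


def free_tools(tools):
    """Return names of tools whose listed cost is 'Free'."""
    return [name for _category, name, _ease, cost, _skill in tools if cost == "Free"]


print(shortlist(TOOLS))
print(free_tools(TOOLS))
```

Changing `min_ease` or `skill_levels` adjusts the shortlist, for example `shortlist(TOOLS, min_ease=3, skill_levels=("Beginner", "Intermediate"))` for a wider net.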
As organizations shift towards a decentralized model of data expertise, the role of citizen data scientists is becoming increasingly vital. This is where tools like Querri come into play. Designed to empower users with minimal technical background, Querri allows team members to extract insights from data seamlessly. By leveraging such intuitive tools, teams can enhance their decision-making processes without relying solely on specialized data teams.
If you would like to try out Querri, sign up for the free trial (no credit card required).