From Statistics To Data Science

By | September 2, 2016

If you have a background in statistics, you may be in a good position to capitalise on the current demand for data scientists. Data science jobs are often well paid, and practitioners are in high demand. One way to move from statistics to data science is to build on your pure statistics skills by familiarising yourself with one of the commonly used statistical analysis tools, such as R.

Get Into R

R is a freely available, open source language for statistical analysis that is widely used in both public and private sector organisations. R uses scripts to load, analyse and visualise data, such as in these examples from Wikipedia:

[Figure: plots from the lm example]
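To give a feel for what such a script looks like, here is a minimal sketch along the lines of the lm example pictured above. The data values and variable names are invented for illustration; lm(), summary(), par() and plot() are all part of base R.

```r
# Illustrative data only: x is 1 to 6 and y is its square.
x <- 1:6
y <- x^2

# Fit a straight line y = a + b * x by ordinary least squares.
model <- lm(y ~ x)

# Print the coefficients, residuals and goodness-of-fit statistics.
print(summary(model))

# Draw the four standard diagnostic plots in a 2 x 2 grid.
par(mfrow = c(2, 2))
plot(model)
```

Because y is deliberately not linear in x, the diagnostic plots show a clear curved pattern in the residuals, which is the kind of thing these plots are meant to reveal.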

R is perhaps overkill for smaller, simpler projects, but comes into its own when you are working with large datasets and complicated analysis.

You’ll find many guides to getting started with R on the internet – I have found the “Software Carpentry” series of tutorials to be particularly useful in the past.

Read more about using R for data science.

Find a Problem

Data science is about problem solving and supporting decision making. Starting to learn R is a good step towards understanding tools available to the data scientist, but you should also explore these overarching aims.

A good starting point is to think of a problem you have or a decision you need to make yourself. It doesn’t have to be a big or important one: what matters is that you start thinking about how you can provide insight. A simple example might be deciding how much milk or other groceries to buy, and how often. Alternatively, you may find you can add insight to your sporting activities, such as your running times or the steps recorded by your Fitbit.
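As a rough sketch of how small such a first project can be, the milk question might start like this. The diary values and the carton size are hypothetical, made up purely for illustration; you would replace them with your own records.

```r
# Hypothetical diary: litres of milk used on each day of two weeks.
# These numbers are invented for illustration, not real measurements.
milk_used <- c(0.5, 0.4, 0.6, 0.5, 0.3, 0.8, 0.9,
               0.5, 0.5, 0.6, 0.4, 0.3, 0.7, 1.0)

# Average daily consumption in litres.
daily_mean <- mean(milk_used)

# Assumed carton size in litres; change this to match what you buy.
carton_litres <- 2

# How many days one carton lasts at the average rate of use.
days_per_carton <- carton_litres / daily_mean

# Summarise the spread of daily use, then report the two headline figures.
print(summary(milk_used))
cat("Average use:", round(daily_mean, 2), "litres per day\n")
cat("One carton lasts about", round(days_per_carton, 1), "days\n")
```

Even a summary this simple turns a vague habit into a number you can act on, and it raises natural follow-up questions, such as whether weekends differ from weekdays.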

If you have supportive or inquisitive friends you may also find you can apply your growing data science skills to some of their problems or decisions, too.

Show Me The Data

As your familiarity with data science grows, you may wish to look to larger data sets. Thankfully, recent years have seen huge quantities of public and government-held data released into the public domain. I have made my own small list of data sources here, but there are many others, and a quick Google search is likely to turn up all sorts of public data.

As well as providing interesting subjects and problems to explore, these data sets will often be large and come in a range of different formats, allowing you to stretch and expand your R knowledge.

A Means to An End

Data science, coding, graphs and visualisations are sexy, but they are still just a means to an end. Data science supports decision makers and decision making; it provides insight into complex problems and helps to identify and highlight key issues.

Providing insight and supporting decision making should be your focus when thinking about moving into data science. Sexy graphs and analysis are great as long as they work towards this.

Read more about having impact with analysis