R or Python? Which is the better programming language for data analysis?

By Aaron Heera

 

More and more, professionals are finding that data analysis is a key part of their jobs. From marketing to broadcast media, journalism and healthcare, the increased availability of data has created an emphasis on analytics-driven decision making.

 

Most of us began our journeys into the world of data science with spreadsheet programmes such as Microsoft Excel or Google Sheets, often moving on to proprietary statistical analysis software such as SPSS, learned at college or university.

 

While tools such as Excel and SPSS are powerful and well suited to what they do, they are seriously limited in their ability to handle large datasets and to rerun a previous analysis on fresh data. Also, neither Excel nor SPSS (nor SAS, Stata, or any of their competitors) has a large community of developers creating new tools.

 

So, when you've reached the limit of what off-the-shelf tools have to offer, where should you turn next? The most popular programming languages used by data scientists are Python and R, so it makes sense to learn one, or both, of those. But which should you opt for first?

 

Both R and Python are open source and free to use. R is a specialist statistical analysis tool, whereas Python is a general-purpose programming language. For anyone undertaking large-scale data analysis or creating complex visualisations, both are incredibly useful.

 

But if you only have time to learn one new language, should it be Python or R? From a purely practical perspective, most people consider Python to be the faster and easier of the two to get to grips with. So, if your time is limited, Python may be the better option for you.

 

On the other hand, your choice may be decided by the type of work that you are most likely to undertake. Python is often best suited to data manipulation and to carrying out repeated tasks on multiple datasets. R, by contrast, is excellent for ad hoc analyses and one-time explorations of single datasets.
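To make the idea of a repeated task on multiple datasets concrete, here is a minimal Python sketch that uses only the standard library. The dataset names, the "sales" column and the summarise function are hypothetical, invented purely for illustration, and the two small in-memory CSV strings stand in for what would normally be separate files.

```python
import csv
import io
import statistics


def summarise(csv_text, column):
    """Return the count, mean and median of one numeric column in a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Hypothetical monthly datasets (assumed data, standing in for real files).
datasets = {
    "january": "region,sales\nnorth,120\nsouth,80\neast,100\n",
    "february": "region,sales\nnorth,150\nsouth,90\neast,120\n",
}

# The same analysis is applied to every dataset without any manual rework.
summaries = {name: summarise(text, "sales") for name, text in datasets.items()}

for name, stats in summaries.items():
    print(name, stats)
```

When a fresh dataset arrives, adding one more entry to the dictionary is enough to rerun the identical analysis on it, which is the kind of reproducibility that is hard to get from a spreadsheet.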

 

In an ideal world, of course, every data analyst would learn both, but sadly we don't all have that kind of time.


Written by Aaron Heera

I’m Aaron Heera and I work with customers, placing IT Consultants/Contractors/Freelancers in roles such as Project Managers, Developers, Programme Managers, Network Engineers, Testers and Architects. I work with clients across Sweden, the Netherlands and Belgium in the Telecoms, Media and Technology sector. Did you know? Aaron is usually found at Glastonbury Festival every summer and even made the front page of the BBC website playing a self-created game of Welly Cricket on the Legendary Hill.