Believe it or not, Excel is still my go-to analysis tool a lot of times, because it’s great at what it does. I’m a shortcut fiend, so I can do things pretty quickly. So when do I opt for R? People have asked me this many times. Here is my unofficial checklist I loop through in my head to decide whether to Excel or not to Excel:

- Is the data **not well structured** or PivotTable-ready? Does it have a **lot of stuff within cells** that needs to get broken out?

  If yes, **R**, unless I can work my Excel magic to clean it up.

- Is this a **quick and dirty one-time analysis**, including quick visuals?

  If yes, then **Excel**, as long as the data is not gigantic.

- Do I need anything **beyond basic statistical analysis**: regression, clustering, text mining, time series analysis, etc.?

  If yes, then **R**. No contest.

- Do I have to **crunch a few disparate datasets** to do my work?

  Depends on complexity. If the data sets are small and a simple VLOOKUP can handle it, then **Excel**. If more than three tables, most likely **R**. If VLOOKUP'ing more than 1-2 columns from each table, also **R**.

- Is it something I will want to **share in a web-based, interactive format** that is nice to look at?

  **R**, with the Shiny framework.

- Do I want **unique and beautiful visuals** the world has rarely seen?

  **R**.
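To make the disparate-datasets point concrete: where Excel needs one VLOOKUP per column per table, a single `merge()` in R pulls over every column at once. Here is a minimal sketch with made-up tables (the names `orders` and `customers` are illustrative, not from the post):

```r
# Hypothetical stand-ins for two ranges you would otherwise VLOOKUP between
orders    <- data.frame(cust_id = c(1, 2, 2, 3),
                        amount  = c(10, 25, 5, 40))
customers <- data.frame(cust_id = c(1, 2, 3),
                        name    = c("Acme", "Beta", "Gamma"),
                        region  = c("East", "West", "East"))

# One left join brings in name AND region together -- no per-column lookups
joined <- merge(orders, customers, by = "cust_id", all.x = TRUE)
print(joined)
```

With three or more tables you just chain additional `merge()` calls, which is exactly where spreadsheet lookups start to buckle.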

While I was learning R, I used a hybrid approach: doing the heavy-lifting data prep work in R, then using the write.csv() function to send my data frames back to Excel for visuals and basic analysis. Over time, I have learned to do more complete analyses in R, from beginning to end.
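That hybrid workflow looks roughly like this; the data frame here is a made-up stand-in for whatever messy export you would normally clean up, and the output path is just a temp file for illustration:

```r
# Stand-in for messy source data (in practice this comes from read.csv on an export)
raw <- data.frame(region = c("East", "West", "East", "West"),
                  amount = c(10, 25, 5, 40))

# Heavy lifting in R: aggregate amounts by region
clean <- aggregate(amount ~ region, data = raw, FUN = sum)

# Send the tidy result back to Excel-land as a CSV (no row-number column)
out <- file.path(tempdir(), "for_excel.csv")
write.csv(clean, out, row.names = FALSE)
```

From there, Excel opens the CSV directly for charting and PivotTables.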

I hope this helps! What scenarios did I miss?

## Leave a Reply

4 Comments on "Excel vs R: When to use what"

Nice post, John.

I use a lot of Excel when I have to present scenarios and change a "final" table in front of clients. Looking at the data is easier in Excel than in R, but of course I'm referring to a "final, condensed, aggregated" dataset (under ~30,000 rows, I guess?).

Knowing how to conduct a PCA in R does not mean you understand the dataset or can play with human-generated scenarios on it in a spreadsheet.

In that sense, R is overrated. There's a reason the industry still relies heavily on spreadsheets: they are easy, and they support the creation of scenarios and conversation about the data.

R code will read as gibberish in a meeting room with the CEO, unless you've spent days or weeks building a Shiny dashboard. And even then, you can be caught off guard by a simple request like "change parameter x, please".

What you have said about scenario analysis is true, but then R is meant for statisticians and written by statisticians. Using it for a simple thing like that is like using a thermonuclear bomb to kill a fly.

R truly shines when one actually has to use statistics to infer information, say building neural networks, forecasting, or the like.

Also, when working in a file with more than 1,000 line items where a bird's-eye view is necessary, R works well for me (this is my personal opinion, and given that I am proficient in R, it is biased).

Thanks for the input, guys. You’re both right. I will never argue against use of spreadsheets because I still use them on a daily basis, more than R, in fact.

Over time, each person figures out what works for them and when. To this day, I still pull out R as my secret weapon at work. The important thing is to diversify your skill set so you can build the confidence to think that nothing is impossible (at least as far as data work goes).

It’s apples and oranges. Spreadsheets are for displaying small data sets and doing simple calculations on them. That’s where Excel and the like shine. For advanced statistical analysis on larger data sets you need R. There is a small intersection between the two (e.g. joining tables), but they serve very different purposes. R is not suitable for changing a “final” table in front of clients, because advanced statistical analysis needs more time and concentration than a meeting with clients could provide.