Excel vs R: When to use what

Believe it or not, Excel is still my go-to analysis tool a lot of times, because it’s great at what it does.  I’m a shortcut fiend, so I can do things pretty quickly. So when do I opt for R?  People have asked me this many times.  Here is my unofficial checklist I loop through in my head to decide whether to Excel or not to Excel:

    1. Is the data not well structured or PivotTable-ready?  Does it have a lot of stuff within cells that needs to get broken out?
      If yes, then R, unless I can work my Excel magic to clean it up.
    2. Is this a quick and dirty one-time analysis?  Including quick visuals.
      If Yes, then Excel, as long as the data is not gigantic.
    3. Do I need anything beyond basic statistical analysis?  Regression, clustering, text mining, time series analysis, etc
      If Yes, then R.  No contest.
    4. Do I have to crunch a few disparate datasets to do my work?
      Depends on complexity.  If data sets are small and a simple vlookup can handle it, then Excel.  If more than three tables, most likely R.  If more than 1-2 columns vlookup’ing from each table, also R.
    5. Something I will want to share in a web-based, interactive format that is nice to look at?
      R with the Shiny framework
    6. Unique and beautiful visuals the world has rarely seen?
      R.

While I was learning R, I used a hybrid approach … doing the heavy-lifting data prep work in R, then using the write.csv() function to send my data frames back to Excel for visuals and basic analysis.  Over time, I have learned to do more complete analysis in R, from beginning to end.

I hope this helps!  What scenarios did I miss?

Leave a Reply

2 Comments on "Excel vs R: When to use what"

avatar

Dan
Guest
Dan
4 months 6 days ago

Nice post, John.
I use a lot of Excel when I have to present scenarios and change a “final” table in front of clients. Looking at the data is easier in Excel than in R, but of course I’m referring to a “final, condensed, aggregated” dataset (< ~30000 I guess?).
Knowing how to conduct a PCA analysis in R does not mean you understand the dataset and can play with human-generated scenarios on it in a spreadsheet.
For that sense, R is overrated. There's a reason the industry still relies heavily on spreadsheets. They are easy. They support the creation of scenarios and conversation about the data.
R programming will be gibbreish in front of a meeting room with the CEO. Unless you spent days/weeks working on a Shiny dashboard. And still you can be surprised with a simple question on "change a parameter x, please".

Mohammad Kashif
Guest
Mohammad Kashif
1 month 28 days ago

What you have said is true about scenario analysis, but then R is meant for statisticians and written by statisticians. Using it for a simple think like that is like using a thermonuclear bomb to kill a fly.
R shines truly when one actually have to use statistics to infer information, say creation of neural networks or forecasting or the likes.
Also when working in a file with more then 1000 line items when it is necessary to have a birds eye view then R works well for me (this is my personal thought and given that I am proficient in R this is biased).

wpDiscuz