Excel vs R: When to use what

Believe it or not, Excel is still my go-to analysis tool much of the time, because it’s great at what it does.  I’m a shortcut fiend, so I can do things pretty quickly.  So when do I opt for R?  People have asked me this many times.  Here is the unofficial checklist I loop through in my head to decide whether to Excel or not to Excel:

    1. Is the data not well structured or PivotTable-ready?  Does it have a lot of values packed into single cells that need to be broken out?
      If yes, then R, unless I can work my Excel magic to clean it up.
    2. Is this a quick-and-dirty, one-time analysis?  Including quick visuals.
      If yes, then Excel, as long as the data is not gigantic.
    3. Do I need anything beyond basic statistical analysis?  Regression, clustering, text mining, time series analysis, etc.?
      If yes, then R.  No contest.
    4. Do I have to crunch a few disparate datasets to do my work?
      Depends on complexity.  If the data sets are small and a simple VLOOKUP can handle it, then Excel.  If more than three tables, most likely R.  If I’m pulling more than one or two columns from each table via VLOOKUP, also R (see the join sketch after this list).
    5. Something I will want to share in a web-based, interactive format that is nice to look at?
      R, with the Shiny framework.
    6. Unique and beautiful visuals the world has rarely seen?
      R.
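
To make item 4 concrete, here is a minimal sketch of what replacing a chain of multi-column VLOOKUPs with joins can look like in base R.  The data frames and columns (orders, customers, regions) are made up for illustration; merge() is the only piece doing real work.

    # Three small, hypothetical tables that would otherwise need chained VLOOKUPs
    orders <- data.frame(
      order_id    = 1:4,
      customer_id = c(101, 102, 101, 103),
      amount      = c(250, 400, 125, 900)
    )
    customers <- data.frame(
      customer_id = c(101, 102, 103),
      name        = c("Acme", "Globex", "Initech"),
      region_id   = c(1, 2, 1)
    )
    regions <- data.frame(
      region_id = c(1, 2),
      region    = c("East", "West")
    )

    # Each merge() pulls in every column of the lookup table at once,
    # so there is no separate VLOOKUP per column, and chaining a second
    # merge() handles the third table.
    result <- merge(orders, customers, by = "customer_id", all.x = TRUE)
    result <- merge(result, regions, by = "region_id", all.x = TRUE)
    result

One line per lookup table, and every column comes along for the ride.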

While I was learning R, I used a hybrid approach: doing the heavy-lifting data prep work in R, then using the write.csv() function to send my data frames back to Excel for visuals and basic analysis.  Over time, I have learned to do the complete analysis in R, from beginning to end.
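
For the curious, here is a minimal sketch of that hand-off, assuming a hypothetical raw export with amount and category columns; write.csv() is the only function the paragraph above actually names.

    # Read a hypothetical raw export and do the heavy lifting in R
    raw <- read.csv("raw_export.csv", stringsAsFactors = FALSE)

    # Example prep: drop incomplete rows and aggregate by category
    clean <- raw[complete.cases(raw), ]
    by_category <- aggregate(amount ~ category, data = clean, FUN = sum)

    # Hand the tidy result back to Excel for PivotTables and charts
    write.csv(by_category, "summary_for_excel.csv", row.names = FALSE)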

I hope this helps!  What scenarios did I miss?

3 Comments on "Excel vs R: When to use what"

Dan
Guest
6 months 5 days ago

Nice post, John.
I use a lot of Excel when I have to present scenarios and change a “final” table in front of clients. Looking at the data is easier in Excel than in R, but of course I’m referring to a “final, condensed, aggregated” dataset (fewer than ~30,000 rows, I guess?).
Knowing how to run a PCA in R does not mean you understand the dataset or can play with human-generated scenarios on it in a spreadsheet.
In that sense, R is overrated. There's a reason the industry still relies heavily on spreadsheets. They are easy. They support the creation of scenarios and conversation about the data.
R code will be gibberish in a meeting room with the CEO, unless you have spent days or weeks building a Shiny dashboard. Even then, you can be caught off guard by a simple request like "change parameter x, please."

Mohammad Kashif
Guest
3 months 26 days ago

What you have said is true about scenario analysis, but then R is meant for statisticians and written by statisticians. Using it for a simple task like that is like using a thermonuclear bomb to kill a fly.
R truly shines when you actually have to use statistics to infer information, say building neural networks, forecasting, or the like.
Also, when working with a file of more than 1,000 line items where a bird's-eye view is necessary, R works well for me (this is my personal opinion, and given that I am proficient in R, it is biased).

John
Admin
12 hours 53 minutes ago

Thanks for the input, guys. You’re both right. I will never argue against the use of spreadsheets, because I still use them on a daily basis, more than R, in fact.

Over time, each person figures out what works for them and when. To this day, I still pull out R as my secret weapon at work. The important thing is to diversify your skill set so you can build the confidence that nothing is impossible (at least as far as data work goes).
