# Calculate percent of column in R

You want to calculate the percent of a column in R, as you would in a PivotTable. Here are two ways: (1) using base R, (2) using the dplyr library. If you are dealing with many columns at once, you can also go with method (3): automating with a loop.

Let’s say our data frame is named fruits.
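To follow along, the data frame can be rebuilt from the result table at the end of this post. This is a minimal sketch: the values for `fruit`, `weight`, and `cost` are copied from that table, but the `data.frame()` call itself is an assumption about how the original data was created.

```r
# Sample data reconstructed from the result table in this post.
# The construction is an assumption; only the values come from the post.
fruits <- data.frame(
  fruit  = c("Apples", "Bananas", "Oranges", "Mangoes",
             "Pineapples", "Watermelons", "Canteloupes"),
  weight = c(9, 18, 4, 19, 10, 18, 5),
  cost   = c(27, 54, 8, 76, 50, 18, 6.5),
  stringsAsFactors = FALSE
)
```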

### Ordinary / manual method

Solution 1: With base R functions
```
# Divide each value by its column total
fruits$weight_pct = fruits$weight / sum(fruits$weight)
fruits$cost_pct = fruits$cost / sum(fruits$cost)
```

We’re manually creating two columns using standard dollar sign notation.

Solution 2: With dplyr functions
```
library(dplyr)

fruits = mutate(fruits,
                weight_pct = weight / sum(weight),
                cost_pct = cost / sum(cost))
```

Here we used `mutate()`. This is my preferred method because (1) it’s simpler to type, and (2) dplyr is great and lets you string together more commands easily.
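As a sketch of what stringing commands together can look like, the same calculation can be applied to several columns in one `mutate()` call. This assumes dplyr 1.0.0 or later, which provides `across()`. The three-row data frame is a shortened, hypothetical stand-in for `fruits`.

```r
library(dplyr)

# Shortened stand-in for the fruits data frame (hypothetical subset)
fruits <- data.frame(
  fruit  = c("Apples", "Bananas", "Oranges"),
  weight = c(9, 18, 4),
  cost   = c(27, 54, 8)
)

# Apply x / sum(x) to each listed column and name the results <column>_pct
fruits <- fruits %>%
  mutate(across(c(weight, cost), ~ .x / sum(.x), .names = "{.col}_pct"))
```

The `.names` argument controls the new column names, so the original `weight` and `cost` columns are kept.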

### With Loops for Automation

If you had dozens or hundreds of columns, you would not want to do this one column at a time. Instead, you can use a loop (`for()`, `while()`) or an apply function like `sapply()`/`lapply()`.

Solution 1: with sapply() or lapply()
```
sapply(names(fruits)[-1], function(x) {
  fruits[paste0(x, "_pct")] <<- fruits[x] / sum(fruits[x])
})
```

We loop through a vector of column names (excluding the first column) and create new columns on the fly. Note the `<<-` notation: it assigns to `fruits` in the enclosing (here, global) environment. A regular `<-` inside the function would only change a local copy of the data frame.

Solution 2: with a for() loop
```
for (col in names(fruits)[-1]) {
  fruits[paste0(col, "_pct")] = fruits[col] / sum(fruits[col])
}
```

Once again we looped through a vector of column names excluding the first one, creating columns on the fly. The nice things about this compared with using apply are that (a) it does not print anything, and (b) you can use the regular assignment operator (`=` or `<-`) instead of `<<-`. lapply/sapply might be faster in some cases though.
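Both loops rely on `names(fruits)[-1]`, which assumes the first column is the only non-numeric one. A hedged variation is to pick the numeric columns by type instead. The helper name `num_cols` is made up for this sketch, and the data frame is a shortened stand-in for `fruits`.

```r
# Shortened stand-in for the fruits data frame (hypothetical subset)
fruits <- data.frame(
  fruit  = c("Apples", "Bananas", "Oranges"),
  weight = c(9, 18, 4),
  cost   = c(27, 54, 8)
)

# Pick the numeric columns by type rather than by position
num_cols <- names(fruits)[sapply(fruits, is.numeric)]

# Add one <column>_pct column per numeric column
for (col in num_cols) {
  fruits[[paste0(col, "_pct")]] <- fruits[[col]] / sum(fruits[[col]])
}
```

Because `num_cols` is computed once before the loop starts, the newly added `_pct` columns are not processed again.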

### Result: Percent of column

All the above solutions result in this table:

```
        fruit weight cost weight_pct   cost_pct
1      Apples      9 27.0 0.10843373 0.11273486
2     Bananas     18 54.0 0.21686747 0.22546973
3     Oranges      4  8.0 0.04819277 0.03340292
4     Mangoes     19 76.0 0.22891566 0.31732777
5  Pineapples     10 50.0 0.12048193 0.20876827
6 Watermelons     18 18.0 0.21686747 0.07515658
7 Canteloupes      5  6.5 0.06024096 0.02713987
```
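The `_pct` columns above hold proportions between 0 and 1 rather than values on a 0 to 100 scale. A small sketch of how one might check that a column adds up to 1 and convert it for display is shown here. The variable names `weight_pct` and `weight_percent` are chosen for illustration, and the weights are copied from the table.

```r
# Weights copied from the result table
weight <- c(9, 18, 4, 19, 10, 18, 5)

# Proportion of the column total, as in the solutions above
weight_pct <- weight / sum(weight)

# Sanity check: the proportions should add up to 1 (within floating-point tolerance)
stopifnot(isTRUE(all.equal(sum(weight_pct), 1)))

# Convert to a 0-100 scale, rounded to one decimal place, for display
weight_percent <- round(100 * weight_pct, 1)
```

Rounded values may not add up to exactly 100, so it is usually safer to keep the unrounded proportions for any further calculation.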

3 Comments on "Calculate percent of column in R"

Geoff Urland
2 years 11 months ago

If you’re really emulating a pivot table and grouping lots of rows for each fruit together, couldn’t you also do this with table() and prop.table()?

Something like:

    prop.table(table(fruits$fruit, fruits$weight), 2)

thelatemail
1 year 4 months ago

I would really discourage any R user, new or otherwise, from using <<- in the way described above. Without going into technical details, it is not as simple as presented here and can give unexpected results.*

Safer and more compact code would be something like:

    # save the variables you want percentages for
    vars <- c("weight", "cost")
    # then make the new variables: loop with lapply and assign only once
    fruits[paste0(vars, "_pct")] <- lapply(fruits[vars], prop.table)

\* Thomas Lumley of the R Core Development team on using `<<-`: "The Evil and Wrong use is to modify variables in the global environment." Read more here: https://stat.ethz.ch/pipermail/r-help/2011-April/275905.html

Zuz
7 months 21 days ago

Is there a way to calculate the % when the names column has more than one row for each fruit, and we want the % of weight to be based on the total weight of each fruit separately?