Find nth largest or smallest in a group

You want to identify the nth largest or smallest item in a group using R. For example, to pick out just two rows, the heaviest item of each type, from the table below:

[Image: example table for finding the nth largest or smallest item in a group in R]

Any time there is some by-group processing, I almost always stick with the dplyr package because of its so-called window operations. Below are a few techniques:

Let’s say our data frame is named stuff.
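The post does not show how stuff was built, so here is a minimal sketch that reconstructs it from the rows printed further down. The column names and values come from that printed output; the construction itself (a plain data.frame() call) is an assumption, not necessarily how the original data was loaded.

```r
# Assumed reconstruction of `stuff`, using the rows printed later in this post
stuff <- data.frame(
  type = c(rep("Fruits", 7), rep("Vegetables", 5)),
  name = c("Mangoes", "Bananas", "Watermelons", "Pineapples", "Apples",
           "Canteloupes", "Oranges", "Brussel Sprouts", "Spinach",
           "Asparagus", "Mushrooms", "Cabbage"),
  weight = c(19, 18, 18, 10, 9, 5, 4, 20, 15, 11, 8, 4),
  stringsAsFactors = FALSE  # keep type and name as plain character columns
)
```

With this in place, the snippets below can be run as written.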

Solution 1: Simply get the min/max
library(dplyr)
# Keep the row(s) whose weight equals the maximum within each type
group_by(stuff, type) %>%
  filter(weight == max(weight))


        type            name weight
1     Fruits         Mangoes     19
2 Vegetables Brussel Sprouts     20

This gets right to the point. We set the data frame up for a grouped operation using group_by(), then keep the row(s) where weight is equal to the max weight. Because of the group_by(), max(weight) is computed within each type. If several rows tie for the maximum, all of them are returned.
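The same pattern works for the smallest item by swapping max() for min(). The sketch below assumes stuff holds the rows printed in this post (rebuilt here so the snippet runs on its own); the name lightest is just an illustrative variable.

```r
library(dplyr)

# Assumed reconstruction of `stuff` from the rows printed in this post
stuff <- data.frame(
  type = c(rep("Fruits", 7), rep("Vegetables", 5)),
  name = c("Mangoes", "Bananas", "Watermelons", "Pineapples", "Apples",
           "Canteloupes", "Oranges", "Brussel Sprouts", "Spinach",
           "Asparagus", "Mushrooms", "Cabbage"),
  weight = c(19, 18, 18, 10, 9, 5, 4, 20, 15, 11, 8, 4),
  stringsAsFactors = FALSE
)

# Keep the row(s) whose weight equals the minimum within each type
lightest <- stuff %>%
  group_by(type) %>%
  filter(weight == min(weight)) %>%
  ungroup()  # drop the grouping so later steps are not grouped by accident

print(lightest)
```

For this data that keeps Oranges for Fruits and Cabbage for Vegetables, each with a weight of 4.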

Solution 2: More flexible if needed
Perhaps we don’t need the smallest or largest within a group, but the 3rd smallest or the top 5 within each group. In that case we can use this more flexible approach:

# Rank weights within each type, heaviest first (ties share the average rank)
group_by(stuff, type) %>%
  mutate(rank = rank(desc(weight)))


         type            name weight rank
1      Fruits         Mangoes     19  1.0
2      Fruits         Bananas     18  2.5
3      Fruits     Watermelons     18  2.5
4      Fruits      Pineapples     10  4.0
5      Fruits          Apples      9  5.0
6      Fruits     Canteloupes      5  6.0
7      Fruits         Oranges      4  7.0
8  Vegetables Brussel Sprouts     20  1.0
9  Vegetables         Spinach     15  2.0
10 Vegetables       Asparagus     11  3.0
11 Vegetables       Mushrooms      8  4.0
12 Vegetables         Cabbage      4  5.0

Here we created a new column using the rank() function; wrapping weight in desc() makes rank 1 the heaviest item. Now we can filter what we’d like from here. E.g., adding filter(rank <= 3) to the pipeline will get you the top 3 within each group. Note that rank() has a ties.method argument that controls how ties are handled: Bananas and Watermelons are tied, so the default ("average") gives both a rank of 2.5.
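To make the filtering step and the effect of ties.method concrete, here is a rough sketch. It assumes stuff holds the rows printed above (rebuilt here so the snippet runs on its own), and the column names rank_avg, rank_min and rank_asc are illustrative, not from the original code.

```r
library(dplyr)

# Assumed reconstruction of `stuff` from the rows printed in this post
stuff <- data.frame(
  type = c(rep("Fruits", 7), rep("Vegetables", 5)),
  name = c("Mangoes", "Bananas", "Watermelons", "Pineapples", "Apples",
           "Canteloupes", "Oranges", "Brussel Sprouts", "Spinach",
           "Asparagus", "Mushrooms", "Cabbage"),
  weight = c(19, 18, 18, 10, 9, 5, 4, 20, 15, 11, 8, 4),
  stringsAsFactors = FALSE
)

ranked <- stuff %>%
  group_by(type) %>%
  mutate(
    rank_avg = rank(desc(weight)),                       # default: ties get the average rank
    rank_min = rank(desc(weight), ties.method = "min"),  # ties share the lowest rank
    rank_asc = rank(weight, ties.method = "first")       # ascending; ties broken by row order
  ) %>%
  ungroup()

top3           <- filter(ranked, rank_avg <= 3)  # top 3 heaviest per type
second_largest <- filter(ranked, rank_min == 2)  # 2nd heaviest, tied rows included
third_smallest <- filter(ranked, rank_asc == 3)  # exactly one 3rd-lightest row per type

print(third_smallest)
```

One thing to watch: with the default "average" method, filter(rank_avg == 2) finds no Fruits row at all, because the two tied items both sit at 2.5. Picking "min" or "first" avoids that when you need an exact nth position.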
