## Grouping and finding mean in R dataframe

Problem Description:

I have a table like this:

genderTime | Salary | Grade |
---|---|---|

ftMale | 32000 | G |

ptMale | 15500 | DP |

ptMale | 37500 | H |

ptFemale | 31500 | G |

ftFemale | 37400 | H |

ftMale | 36000 | G |

ptFemale | 31000 | G |

ptFemale | 16000 | DP |

ptMale | 37000 | H |

I have the Grades as a factor where grades = DP, G, H

I need to create a table that gives me the **mean** salary by grade for full-time males (ftMale) and part-time males (ptMale) like so:

Grade | Full-time Male | Part-time male |
---|---|---|

DP | 0 | 15500 |

G | 34000 | 0 |

H | 0 | 37250 |

My dataset is a lot larger than this and there would be means for part time and full time in each grade. Any help would be much appreciated!

Thanks

## Solution – 1

Group by `Grade`

and then summarize your data by getting the mean of `Salary`

while making sure that you are subsetting to the relevant `ftMale/ptMale`

values.

```
library(tidyverse)
df |>
group_by(Grade) |>
summarize(full_time_male = mean(Salary[genderTime == "ftMale"]),
part_time_male = mean(Salary[genderTime == "ptMale"]))
```

## Solution – 2

Another dplyr approach:

```
df %>%
group_by(genderTime, Grade) %>%
summarise(Salary = mean(Salary, na.rm=TRUE)) %>%
ungroup() %>%
pivot_wider(names_from = genderTime, values_from = Salary, values_fill = 0) %>%
select("Grade", "ftMale", "ptMale")
# A tibble: 3 × 3
Grade ftMale ptMale
<chr> <dbl> <dbl>
1 H 0 37250
2 G 34000 0
3 DP 0 15500
```

Just another (slightly different) dplyr solution

```
df %>%
filter(genderTime %in% c("ftMale", "ptMale")) %>%
group_by(genderTime, Grade) %>%
summarise(Salary = mean(Salary, na.rm=TRUE)) %>%
pivot_wider(names_from = genderTime, values_from = Salary, values_fill = 0)
```

The same using full R base:

```
reshape(aggregate(Salary ~ genderTime + Grade,
subset = genderTime %in% c("ftMale", "ptMale"),
FUN=mean,
data=df),
direction = "wide",
idvar = "Grade",
timevar = "genderTime")
```