## Keep the corresponding value of another column after using groupby in a column

Problem Description:

After using groupby and agg to find the max and the min, I would like to keep the value of another column corresponding to said Max and Min.

What I have:

Order | Missing | Date |
---|---|---|

Order 1 | Missing 1 | 2002 |

Order 1 | Missing 2 | 2003 |

Order 2 | Missing 3 | 2004 |

Order 2 | Missing 4 | 2005 |

Order 2 | Missing 5 | 2006 |

Order 3 | Missing 6 | 2007 |

What I do

```
calculation = df.groupby(by=('Order')).agg(Max=('Date','max'),Min=('Date','min')).reset_index()
calculation['difference'] = calculation['Max']-calculation['Min']
```

What I get

Order | Min | Max | difference |
---|---|---|---|

Order 1 | 2002 | 2003 | 1 year |

Order 2 | 2004 | 2006 | 2 years |

Order 3 | 2007 | 2007 | 0s |

What I want,

Order | Min | Max | difference | Missing Min | Missing Max |
---|---|---|---|---|---|

Order 1 | 2002 | 2003 | 1 year | Missing 1 | Missing 2 |

Order 2 | 2004 | 2006 | 2 years | Missing 3 | Missing 5 |

Order 3 | 2007 | 2007 | 0s | Missing 6 | Missing 6 |

## Solution – 1

I would use separate `groupby`

and a `merge`

:

```
g = df.groupby(by='Order')
(pd.merge(df.loc[g['Date'].idxmin()]
.rename({'Date': 'Min', 'Missing': 'Missing Min'}, axis=1),
df.loc[g['Date'].idxmax()]
.rename({'Date': 'Max', 'Missing': 'Missing Max'}, axis=1),
on='Order')
.assign(difference=lambda d: d['Max']-d['Min'])
)
```

Output:

```
Order Missing Min Min Missing Max Max difference
0 Order 1 Missing 1 2002 Missing 2 2003 1
1 Order 2 Missing 3 2004 Missing 5 2006 2
2 Order 3 Missing 6 2007 Missing 6 2007 0
```

## Solution – 2

maybe try to add additional group by calculations/columns, similiar to how you created "Max" and "Min" columns from "Date". (I apologize, I am not able to debug code right now).:

```
calculation = df.groupby(by=('Order')).agg(Max=('Date','max'),Min=('Date','min'), Missing_Min=('Missing','min'), Missing_Max=('Missing','max')).reset_index()
calculation['difference'] = calculation['Max']-calculation['Min']
```

## Solution – 3

You need a `merge`

. Two more lines in your code would get you what you want.

```
calculation["Missing Min"] = calculation.merge(
df, left_on=["Order", "Min"], right_on=["Order", "Date"]
)["Missing"]
calculation["Missing Max"] = calculation.merge(
df, left_on=["Order", "Max"], right_on=["Order", "Date"]
)["Missing"]
```

print(calculation):

```
Order Max Min difference Missing Min Missing Max
0 Order 1 2003 2002 1 Missing 1 Missing 2
1 Order 2 2006 2004 2 Missing 3 Missing 5
2 Order 3 2007 2007 0 Missing 6 Missing 6
```