How to add a row cell value based on other rows cell values in Pandas?

Problem Description:

I have a pandas dataframe like the one below:

Id  Date        Aid
0   2022-01-01  one
1   2022-01-01  two
2   2022-01-05  one
3   2022-01-06  three
4   2022-01-02  one
5   2022-01-01  one

I would like to add a column "Counter" which, for each row, contains the number of rows that have the same "Aid" and a "Date" earlier than or equal to that row's "Date".

So, for row 4, "Counter" = 3 (rows 0, 4, 5).

The result should be:

Id  Date        Aid     Counter
0   2022-01-01  one     2
1   2022-01-01  two     1
2   2022-01-05  one     4
3   2022-01-06  three   1
4   2022-01-02  one     3
5   2022-01-01  one     2

Is there a way to do this count in pandas?

The original dataframe contains millions of rows, so efficiency is very important.

Solution – 1

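Use rank with method='max', grouped by "Aid". Within each group, the max rank of a date equals the number of rows whose date is earlier than or equal to it, which is exactly the requested count: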

pd.to_datetime(df['Date']).groupby(df['Aid']).rank(method='max').astype('int')

output:

0    2
1    1
2    4
3    1
4    3
5    2
Name: Date, dtype: int32

Assign the output to a "Counter" column:

df.assign(Counter=pd.to_datetime(df['Date']).groupby(df['Aid']).rank(method='max').astype('int'))

result:

Id  Date        Aid     Counter
0   2022-01-01  one     2
1   2022-01-01  two     1
2   2022-01-05  one     4
3   2022-01-06  three   1
4   2022-01-02  one     3
5   2022-01-01  one     2
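
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample dataframe from the question and compares the rank-based result with a slow brute-force count. The names dates, counter, result and expected are only illustrative, and the brute-force loop is there purely as a check on small data, not something to run on millions of rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5],
    "Date": ["2022-01-01", "2022-01-01", "2022-01-05",
             "2022-01-06", "2022-01-02", "2022-01-01"],
    "Aid": ["one", "two", "one", "three", "one", "one"],
})

# Parse the dates once so they compare chronologically.
dates = pd.to_datetime(df["Date"])

# Max rank within each "Aid" group = rows in that group with Date <= this row's Date.
counter = dates.groupby(df["Aid"]).rank(method="max").astype("int64")
result = df.assign(Counter=counter)

# Brute-force check straight from the problem statement (quadratic, small data only).
expected = [
    int(((df["Aid"] == aid) & (dates <= d)).sum())
    for aid, d in zip(df["Aid"], dates)
]

print(result)
print(result["Counter"].tolist() == expected)
```

Note that this assumes every "Date" value parses to a valid date; a missing date would get a NaN rank, and the integer cast would then fail.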