How to apply a user defined function between rows in pandas using both rows values?

Problem Description:

I have two rows of data in a pandas DataFrame and want to apply a function to each column separately, using the values from both rows, e.g.

import pandas as pd    
df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})
    x   z   i   j   y
0   1   2   3   4   5
1   2   6   12  20  30

The function is something like the row 2 value minus the row 1 value, divided by the row 2 value, computed for each column separately, e.g.

(row2-row1)/row2

so I can get the following

0.5  0.667   0.75   0.8   0.833

Based on the following links

how to apply a user defined function column wise on grouped data in pandas

https://www.geeksforgeeks.org/apply-a-function-to-each-row-or-column-in-dataframe-using-pandas-apply/

https://pythoninoffice.com/pandas-how-to-calculate-difference-between-rows

Groupby and apply a defined function – Pandas

I tried the following

df.apply(lambda x,y: (x + y)/y, axis=0)

This does not work: the lambda expects a second argument y, but apply passes only one argument (the column) to the function.
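
To see why the two-argument lambda fails, here is a minimal sketch that assumes the same df as above. With axis=0, apply calls the function once per column and hands it that column as a single Series. The helper inspect and the names received and raised are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# Record what apply hands to the function: one Series per column.
received = []

def inspect(col):
    received.append((col.name, len(col)))
    return col.sum()

df.apply(inspect, axis=0)

# A lambda that requires two positional arguments cannot be called this way,
# because apply supplies only the column.
raised = False
try:
    df.apply(lambda x, y: (x + y) / y, axis=0)
except TypeError:
    raised = True
```

Each entry in received pairs a column name with the length of the Series that was passed in, which here is 2 (both rows at once).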

df.diff()

This works, but it is not exactly the function I want.

Does anyone know how to achieve the result I expect?

Solution – 1

df.diff(1).div(df)

output

     x     z     i    j     y
0  NaN   NaN   NaN  NaN   NaN
1  0.5  0.67  0.75  0.8  0.83
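
If the all-NaN first row is unwanted, one possible follow-up (a sketch, not part of the answer as posted) is to drop it or to select the second row directly. The names ratio, without_nan and last_row are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# diff() subtracts the previous row; div(df) then divides element-wise
# by the current row, giving (row2 - row1) / row2 in the second row.
ratio = df.diff().div(df)

# The first row has no predecessor, so it is all NaN; drop it.
without_nan = ratio.dropna(how="all")

# Or take the second row directly as a Series indexed by column name.
last_row = ratio.iloc[1]
```

Because diff works on every consecutive pair of rows, the same expression should also extend to frames with more than two rows.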

I answered based on your short example. If I have misunderstood something, please extend your example and I will answer again.

Solution – 2

After testing many things, I found that the lambda does not need two variables (x, y). It takes just one, which can be treated as a vector holding all the values in the column, so the following solved the issue

df.apply(lambda x: (x[1] - x[0]) / x[1], axis=0)

This avoids having a result with NaN in the first row.
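
A variant of the same idea, assuming exactly two rows as in the question: x[1] and x[0] look values up by index label, which works here because the index labels are 0 and 1. Using .iloc selects by position whatever the labels are, and the same arithmetic can also be written on whole rows without apply. The relabelled frame df2 is a hypothetical example.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "z": [2, 6], "i": [3, 12], "j": [4, 20], "y": [5, 30]})

# Positional access inside apply: independent of the index labels.
by_apply = df.apply(lambda col: (col.iloc[1] - col.iloc[0]) / col.iloc[1], axis=0)

# Same computation on whole rows; Series arithmetic aligns on column names.
by_rows = (df.iloc[1] - df.iloc[0]) / df.iloc[1]

# Hypothetical frame with string row labels: the positional version still works.
df2 = df.copy()
df2.index = ["before", "after"]
relabelled = (df2.iloc[1] - df2.iloc[0]) / df2.iloc[1]
```

All three results are Series indexed by column name, so they can be compared or combined directly.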
