Pandas dataframe row operation with a condition

Problem Description:

I have a dataframe with information about a stock that looks like this:

Product ID  Initial stock  Initial unit cost  Reference  Quantity  Unit cost  Current stock
a           5              22                 Purch.     4         24         9
a           5              22                 Purch.     8         21         17
a           5              22                 Sale       -4        25         13
a           5              22                 Purch.     10        20         23
a           5              22                 Sale       -15       22         8
b           14             3.5                Sale       10        4          4
b           14             3.5                Purch.     20        3          24
b           14             3.5                Sale       5         4          19
b           14             3.5                Purch.     2         3.5        21
c           27             1                  Purch.     100       0.95       127
c           27             1                  Purch.     3         1.1        130
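
To experiment with this data, the table above can be rebuilt as a DataFrame. This is only a minimal sketch: the column names are copied from the table headers, the values are copied as shown in the question, and the variable name df is an assumption.

```python
import pandas as pd

# Sample data copied from the table in the question (values as shown there).
df = pd.DataFrame({
    "Product ID": ["a"] * 5 + ["b"] * 4 + ["c"] * 2,
    "Initial stock": [5] * 5 + [14] * 4 + [27] * 2,
    "Initial unit cost": [22.0] * 5 + [3.5] * 4 + [1.0] * 2,
    "Reference": ["Purch.", "Purch.", "Sale", "Purch.", "Sale",
                  "Sale", "Purch.", "Sale", "Purch.",
                  "Purch.", "Purch."],
    "Quantity": [4, 8, -4, 10, -15, 10, 20, 5, 2, 100, 3],
    "Unit cost": [24, 21, 25, 20, 22, 4, 3, 4, 3.5, 0.95, 1.1],
    "Current stock": [9, 17, 13, 23, 8, 4, 24, 19, 21, 127, 130],
})
print(df)
```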

Each row represents a purchase/sale of a certain product. Quantity represents the number of units purchased/sold at a given Unit cost. Current stock is the remaining stock after the purchase/sale. For every product, I want to calculate the Weighted Average Cost (WAC) after each sale/purchase. The procedure is the following:

  • For the first row of every product: if Reference == 'Purch.', WAC = (Initial stock * Initial unit cost + Quantity * Unit cost) / Current stock. Otherwise, WAC = Initial unit cost.

  • For every following row i: if Reference[i] == 'Purch.', WAC[i] = (Current stock[i-1] * WAC[i-1] + Quantity[i] * Unit cost[i]) / Current stock[i]. Otherwise, WAC[i] = WAC[i-1].
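
The two rules above collapse into a single recurrence if the "previous" stock and WAC are seeded with the initial stock and initial unit cost. The following plain-Python sketch shows this for product a; the function name running_wac and the tuple layout are assumptions made for illustration only.

```python
def running_wac(initial_stock, initial_cost, rows):
    """Return the WAC after each row of one product.

    rows: list of (reference, quantity, unit_cost, current_stock) tuples.
    """
    # Seeding with the initial values makes the first row a regular step.
    prev_stock, prev_wac = initial_stock, initial_cost
    result = []
    for reference, quantity, unit_cost, current_stock in rows:
        if reference == "Purch.":
            wac = (prev_stock * prev_wac + quantity * unit_cost) / current_stock
        else:
            wac = prev_wac  # a sale leaves the WAC unchanged
        result.append(wac)
        prev_stock, prev_wac = current_stock, wac
    return result


# Rows of product "a" from the question.
rows_a = [
    ("Purch.", 4, 24, 9),
    ("Purch.", 8, 21, 17),
    ("Sale", -4, 25, 13),
    ("Purch.", 10, 20, 23),
    ("Sale", -15, 22, 8),
]
wac_a = running_wac(5, 22, rows_a)
print([round(w, 2) for w in wac_a])  # [22.89, 22.0, 22.0, 21.13, 21.13]
```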

The next table shows what I’m looking for (WAC column and how to calculate it):

Product ID  Initial stock  Initial unit cost  Reference  Quantity  Unit cost  Current stock  (how to) WAC            WAC
a           5              22                 Purch.     4         24         9              (5*22 + 4*24)/9         22.89
a           5              22                 Purch.     8         21         17             (9*22.89 + 8*21)/17     22
a           5              22                 Sale       -4        25         13                                     22
a           5              22                 Purch.     10        20         23             (13*22 + 10*20)/23      21.13
a           5              22                 Sale       -15       22         8                                      21.13
b           14             3.5                Sale       10        4          4                                      3.5
b           14             3.5                Purch.     20        3          24             (4*3.5 + 20*3)/24       3.08
b           14             3.5                Sale       5         4          19                                     3.08
b           14             3.5                Purch.     2         3.5        21             (19*3.08 + 2*3.5)/21    3.12
c           27             1                  Purch.     100       0.95       127            (27*1 + 100*0.95)/127   0.96
c           27             1                  Purch.     3         1.1        130            (127*0.96 + 3*1.1)/130  0.96

How would you do it using Pandas? I’ve tried to use a groupby and a cumsum, but I don’t know how to introduce the "if" statement. After that, I want to summarize the information and just get the Product ID along with the final Stock and WAC, just like this:

Product ID  Current stock  WAC
a           8              21.13
b           21             3.12
c           130            0.96

Thank you in advance!

Solution – 1

I hope I understood your question correctly.

Code:

import pandas as pd

# Candidate WAC for purchase rows; sale rows stay empty (NaN) for now.
# Note: this uses the initial stock and initial unit cost on every row, so it
# reproduces the expected WAC only for the first row of each product; later
# rows need the previous row's stock and WAC.
df['(how to) WAC'] = df.apply(
    lambda row: (row['Initial stock'] * row['Initial unit cost']
                 + row['Quantity'] * row['Unit cost']) / row['Current stock']
    if row['Reference'] == 'Purch.' else None,
    axis=1)

# WAC: within each product, carry the last computed value forward over the empty rows.
# If a product starts with an empty row, fall back to its Initial unit cost.
df['WAC'] = df.groupby(['Product ID'])['(how to) WAC'].ffill().fillna(df['Initial unit cost'])

# Group by product and display the last row of each
df.groupby('Product ID').tail(1)[['Product ID', 'Current stock', 'WAC']]
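
The forward-fill step can be seen in isolation on a small toy frame. This is a sketch only: the frame toy and its column candidate are made up for illustration, with empty values standing in for sale rows.

```python
import pandas as pd

# Toy data: None marks rows where no value was computed (e.g. sales).
toy = pd.DataFrame({
    "Product ID": ["a", "a", "a", "b", "b"],
    "Initial unit cost": [22.0, 22.0, 22.0, 3.5, 3.5],
    "candidate": [22.89, None, 22.0, None, 3.08],
})

# ffill() runs per product, so the first row of "b" does not inherit from "a";
# it is still empty afterwards and gets the Initial unit cost instead.
filled = (
    toy.groupby("Product ID")["candidate"]
    .ffill()
    .fillna(toy["Initial unit cost"])
)
print(filled.tolist())  # [22.89, 22.89, 22.0, 3.5, 3.08]
```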

Solution – 2

You can write a function and call it with apply on the grouped dataframe.

I would try something like this:

import pandas as pd

def calc_wac(df_):
  df_ = df_.copy()
  cs_wac = 0  # running total value of the stock (Current stock * WAC)
  prev_stock = None  # Current stock of the previous row in this group
  for counter, (idx, row) in enumerate(df_.iterrows()):
    if counter == 0:
      if row['Reference'] == 'Purch.':
        cs_wac += row['Initial stock'] * row['Initial unit cost'] + row['Quantity'] * row['Unit cost']
      else:
        cs_wac += row['Current stock'] * row['Initial unit cost']
    elif row['Reference'] == 'Purch.':
      cs_wac += row['Quantity'] * row['Unit cost']
    else:
      # A sale keeps the WAC, so the total value shrinks in proportion to the stock
      cs_wac *= row['Current stock'] / prev_stock
    prev_stock = row['Current stock']
    df_.loc[idx, 'WAC'] = cs_wac / row['Current stock']
  # Summary for this product: final stock and final WAC
  return pd.Series({'Current stock': row['Current stock'], 'WAC': cs_wac / row['Current stock']})

This should return the summarized information when you call:

df.groupby('Product ID').apply(calc_wac)

If you want the full dataframe instead, change the function's return statement to return df_.
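
As a rough end-to-end sketch of the full-dataframe variant, the snippet below adds a WAC column per product and then takes the last row of each group as the summary. The helper name add_wac is hypothetical, only products b and c from the question are included to keep it short, and it loops over the groups directly, which is just one way to do it.

```python
import pandas as pd


def add_wac(group):
    """Return a copy of one product's rows with a WAC column (hypothetical helper)."""
    group = group.copy()
    # Seed the recurrence with the product's initial stock and unit cost.
    prev_stock = group["Initial stock"].iloc[0]
    prev_wac = group["Initial unit cost"].iloc[0]
    wacs = []
    for ref, qty, cost, stock in zip(group["Reference"], group["Quantity"],
                                     group["Unit cost"], group["Current stock"]):
        if ref == "Purch.":
            prev_wac = (prev_stock * prev_wac + qty * cost) / stock
        # On a sale the WAC is carried over unchanged.
        prev_stock = stock
        wacs.append(prev_wac)
    group["WAC"] = wacs
    return group


# Products "b" and "c" from the question.
df = pd.DataFrame({
    "Product ID": ["b"] * 4 + ["c"] * 2,
    "Initial stock": [14] * 4 + [27] * 2,
    "Initial unit cost": [3.5] * 4 + [1.0] * 2,
    "Reference": ["Sale", "Purch.", "Sale", "Purch.", "Purch.", "Purch."],
    "Quantity": [10, 20, 5, 2, 100, 3],
    "Unit cost": [4, 3, 4, 3.5, 0.95, 1.1],
    "Current stock": [4, 24, 19, 21, 127, 130],
})

# Full frame with a WAC column, then the last row per product as the summary.
full = pd.concat([add_wac(g) for _, g in df.groupby("Product ID")])
summary = full.groupby("Product ID")[["Current stock", "WAC"]].last()
print(summary.round(2))
```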
