How to transpose certain columns in a pandas dataframe
Problem Description:
Currently have a dataframe like this:
RepName | Hours | Reason |
---|---|---|
John | 4 | Research |
John | 15 | Training |
Matt | 6 | Project Labor |
Matt | 10 | Training |
I want to transpose each reason as a column with the Hours values as values like so:
RepName | Research | Training | Project Labor |
---|---|---|---|
John | 4 | 15 | 0 |
Matt | 0 | 10 | 6 |
I tried to transpose and melting the dataframe and couldn’t figure it out. Still sort of new to this, any help would be appreciated.
Solution – 1
You can do this with pd.pivot_table
like so:
>>> df.pivot_table(index="RepName", columns="Reason", values="Hours", fill_value=0)
Reason Project Labor Research Training
RepName
John 0 4 15
Matt 6 0 10
Full example:
import pandas as pd
input_data = pd.DataFrame(
{
"RepName": ["John", "John", "Matt", "Matt"],
"Hours": [4, 15, 6, 10],
"Reason": ["Research", "Training", "Project Labor", "Training"],
}
)
output_data = pd.DataFrame(
{
"Project Labor": [0, 6],
"Research": [4, 0],
"Training": [15, 10],
},
index=["John", "Matt"],
)
pivoted = input_data.pivot_table(
index="RepName",
columns="Reason",
values="Hours",
fill_value=0,
)
assert output_data.equals(pivoted)
Solution – 2
This is a .pivot
operation. The point of a pivot is to turn some subset of rows into their own columns.
out = df.pivot(index='RepName', columns='Reason', values='Hours').fillna(0)
Reason Project Labor Research Training
RepName
John 0.0 4.0 15.0
Matt 6.0 0.0 10.0
print(out)
Reason Project Labor Research Training
RepName
John 0.0 4.0 15.0
Matt 6.0 0.0 10.0
If you’re unfamiliar with pandas.Index
that has names, you can get a more familiar result by removing the name from the column index and resetting the row index:
out = (
df.pivot(index='RepName', columns='Reason', values='Hours')
.rename_axis(columns=None)
.reset_index()
.fillna(0)
)
print(out)
RepName Project Labor Research Training
0 John 0.0 4.0 15.0
1 Matt 6.0 0.0 10.0