How to transpose certain columns in a pandas dataframe

How to transpose certain columns in a pandas dataframe

Problem Description:

Currently have a dataframe like this:

RepNameHoursReason
John4Research
John15Training
Matt6Project Labor
Matt10Training

I want to transpose each reason as a column with the Hours values as values like so:

RepNameResearchTrainingProject Labor
John4150
Matt0106

I tried to transpose and melting the dataframe and couldn’t figure it out. Still sort of new to this, any help would be appreciated.

Solution – 1

You can do this with pd.pivot_table like so:

>>> df.pivot_table(index="RepName", columns="Reason", values="Hours", fill_value=0)
Reason   Project Labor  Research  Training
RepName
John                 0         4        15
Matt                 6         0        10

Full example:

import pandas as pd

input_data = pd.DataFrame(
    {
        "RepName": ["John", "John", "Matt", "Matt"],
        "Hours": [4, 15, 6, 10],
        "Reason": ["Research", "Training", "Project Labor", "Training"],
    }
)

output_data = pd.DataFrame(
    {
        "Project Labor": [0, 6],
        "Research": [4, 0],
        "Training": [15, 10],
    },
    index=["John", "Matt"],
)

pivoted = input_data.pivot_table(
    index="RepName",
    columns="Reason",
    values="Hours",
    fill_value=0,
)

assert output_data.equals(pivoted)

Solution – 2

This is a .pivot operation. The point of a pivot is to turn some subset of rows into their own columns.

out = df.pivot(index='RepName', columns='Reason', values='Hours').fillna(0)
Reason   Project Labor  Research  Training
RepName                                   
John               0.0       4.0      15.0
Matt               6.0       0.0      10.0

print(out)
Reason   Project Labor  Research  Training
RepName                                   
John               0.0       4.0      15.0
Matt               6.0       0.0      10.0

If you’re unfamiliar with pandas.Index that has names, you can get a more familiar result by removing the name from the column index and resetting the row index:

out = (
    df.pivot(index='RepName', columns='Reason', values='Hours')
    .rename_axis(columns=None)
    .reset_index()
    .fillna(0)
)

print(out)
  RepName  Project Labor  Research  Training
0    John            0.0       4.0      15.0
1    Matt            6.0       0.0      10.0
Rate this post
We use cookies in order to give you the best possible experience on our website. By continuing to use this site, you agree to our use of cookies.
Accept
Reject