Data tables often come in a format that makes sense to the human who created the table, but that’s difficult for analysis. To make analysis easier we can reshape the data into a more computer-friendly form. Pandas is a python data analysis library and in this post I reshape pandas data with melt.

A Reshape Example

Simple Dataframe

Firstly, let’s create a simple dataframe in pandas:

data = {'weekday': ["Monday", "Tuesday", "Wednesday", 
         "Thursday", "Friday", "Saturday", "Sunday"],
        'Person 1': [12, 6, 5, 8, 11, 6, 4],
        'Person 2': [10, 6, 11, 5, 8, 9, 12],
        'Person 3': [8, 5, 7, 3, 7, 11, 15]}
df = pd.DataFrame(data, columns=['weekday',
        'Person 1', 'Person 2', 'Person 3'])

This dataframe is similar to how we might have recorded the data by hand or in a spreadsheet, and could represent test scores of three different people on each day of the week.There is a row for each day of the week, and a separate column for each person:

	weekday	Person 1	Person 2	Person 3
0	Monday	12	10	8
1	Tuesday	6	6	5
2	Wednesday	5	11	7
3	Thursday	8	5	3
4	Friday	11	8	7
5	Saturday	6	9	11
6	Sunday	4	12	15

Fine For Humans – Not So Good For Analysis

While this simple dataframe structure is fine for a human reader, if we wanted to create use ‘groupby‘ operations or begin to normalise this dataframe to we would need to do some reshaping. We could reshape the dataframe to have a column containing person information, and another column to contain that person’s score on a particular day, like this:

	weekday	Person	Score

Reshape With Melt

It is of course possible to reshape a data table by hand, by copying and pasting the values from each person’s column into the new ‘person’ column. This would take a a long time even for this small dataframe, and would be prone to errrors. A much better idea is to reshape the dataframe with melt:

melted = pd.melt(df, id_vars=["weekday"], 
                 var_name="Person", value_name="Score")

Here we have set the variables (columns) that we want to leave unaffected. Variables not included in this list will become rows in a new column (which has the name given by “var_name”). The values belonging to the original rows/columns are found in a new column with a name given by “value_name”, and the output dataframe now has three rows for each day of the week – one for each of person 1, 2, and 3. This new form of the dataframe is now more ‘normalised’ and can have groupby operations applied to it.

	weekday	Person	Score
0	Monday	Person 1	12
1	Tuesday	Person 1	6
2	Wednesday	Person 1	5
3	Thursday	Person 1	8
4	Friday	Person 1	11
5	Saturday	Person 1	6
6	Sunday	Person 1	4
7	Monday	Person 2	10
8	Tuesday	Person 2	6
9	Wednesday	Person 2	11
10	Thursday	Person 2	5
11	Friday	Person 2	8
12	Saturday	Person 2	9
13	Sunday	Person 2	12
14	Monday	Person 3	8
15	Tuesday	Person 3	5
16	Wednesday	Person 3	7
17	Thursday	Person 3	3
18	Friday	Person 3	7
19	Saturday	Person 3	11
20	Sunday	Person 3	15

Reshape Pandas Data With Melt

A Reshape Example

Simple Dataframe

Fine For Humans – Not So Good For Analysis

Reshape With Melt

Related

One thought on “Reshape Pandas Data With Melt”