One of the most common actions while cleaning data or doing exploratory data analysis (EDA) is manipulating/fixing/renaming column names. So in this post, we will explore various methods of renaming columns of a Pandas dataframe.
Two ways of modifying column titles
There are two main ways of altering column titles:
1.) the columns attribute and
2.) the rename method.
Columns attribute
If we have our labelled DataFrame already created, the simplest way to overwrite the column labels is to assign a new list of names to the columns attribute of the DataFrame object.
For example, given a DataFrame with two columns, we can replace the column titles/labels by adding the following line:
df.columns = ['Column_title_1','Column_title_2']
A limitation of this technique is that the new list must contain a name for every column in the DataFrame, in order. To change the name of just one column, we would still have to write out all the other names unchanged. The rename method outlined below is more versatile and works for renaming all columns or just specific ones.
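As a minimal sketch of the assignment approach, the snippet below builds a small two-column DataFrame (the data and the original labels "a" and "b" are made up for illustration), overwrites both labels, and then shows that a list of the wrong length is rejected.

```python
import pandas as pd

# Hypothetical two-column DataFrame, used only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Overwrite every label at once; the list needs one entry per column.
df.columns = ["Column_title_1", "Column_title_2"]
new_labels = list(df.columns)

# A list with the wrong number of names raises a ValueError.
try:
    df.columns = ["only_one_name"]
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(new_labels, mismatch_rejected)
```

Because the names are matched by position, this approach suits cases where every column is being relabelled anyway.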
Rename method
The other technique for renaming columns is to call the rename method on the DataFrame object, then pass a dictionary mapping existing labels to new labels to the columns parameter:
df.rename(columns={0 : 'Title_1', 1 : 'Title2'}, inplace=True)
It's important to note that since the rename method renames existing labels, each entry must specify the existing label first, followed by the new label, as shown in the example above.
Please note that we pass True for the inplace parameter here in order to update the existing DataFrame. Without this, the function call returns a newly created DataFrame instead and leaves the original unchanged.
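The difference between the two calling styles can be sketched as follows. The DataFrame here is a made-up example with the default integer column labels 0 and 1, chosen to match the dictionary keys used above.

```python
import pandas as pd

# Hypothetical DataFrame with default integer column labels 0 and 1.
df = pd.DataFrame([[1, 2], [3, 4]])

# Without inplace=True, rename returns a new DataFrame; df keeps its labels.
renamed = df.rename(columns={0: "Title_1", 1: "Title2"})
original_labels = list(df.columns)
renamed_labels = list(renamed.columns)

# With inplace=True, df itself is modified and the call returns None.
result = df.rename(columns={0: "Title_1", 1: "Title2"}, inplace=True)
inplace_labels = list(df.columns)

print(original_labels, renamed_labels, result, inplace_labels)
```

Since the in-place call returns None, writing df = df.rename(..., inplace=True) would replace df with None; use either the returned DataFrame or inplace=True, not both.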
Change a single column header name
If you need to rename a specific column, you can use the df.rename() function and list only the columns to be renamed. Not all the columns have to be renamed when using the rename method:
df.rename(columns={'old_column_name': 'new_column_name'}, inplace=True)
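A short, self-contained sketch of a single-column rename is shown below. The second column name ("other") and the data are assumptions added for illustration. It also shows that, with the default settings, a key that matches no existing column is silently ignored.

```python
import pandas as pd

# Hypothetical DataFrame; "other" is an illustrative second column.
df = pd.DataFrame({"old_column_name": [1, 2], "other": [3, 4]})

# Only the column named in the dictionary is renamed.
df.rename(columns={"old_column_name": "new_column_name"}, inplace=True)
labels_after = list(df.columns)

# By default, keys that match no column are ignored rather than raising.
unchanged = df.rename(columns={"missing_column": "x"})
labels_unchanged = list(unchanged.columns)

print(labels_after, labels_unchanged)
```

If a misspelled column name should be treated as an error instead, rename accepts errors="raise", which raises a KeyError for labels that are not found.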