Delete column/row from a Pandas dataframe using .drop() method

February 2, 2020

While working with data in Pandas, you will often want to drop one or more columns or rows from a dataframe, typically because they are not needed for further analysis. There are several ways to achieve this, but the most direct one is the .drop() method.

.drop()

The .drop() function allows you to delete/drop/remove one or more columns from a dataframe. It also can be used to delete rows from Pandas dataframe.

DataFrame.drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

As you can see above, .drop() function has multiple parameters. So to use the function correctly, we need to understand what these parameters do.

Parameters:

Self - The dataframe that .drop() is called on. You never pass it yourself; writing df.drop(...) supplies it automatically.

Next come the optional parameters:

Labels - Index or column labels to drop.      

Axis - Whether to drop labels from the index (0 or 'index') or  columns (1 or 'columns').        

Index - Alternative to specifying axis: index=labels is equivalent to labels, axis=0.

Columns - Alternative to specifying axis: columns=labels is equivalent to labels, axis=1.

Level - For MultiIndex, the level from which the labels will be removed.  

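Inplace - If True, drops the specified labels from the DataFrame itself and returns None. If False, returns a new DataFrame without the specified labels and leaves the original unchanged.

Errors - If 'ignore', errors for missing labels are suppressed and only existing labels are dropped. If 'raise' (the default), a missing label raises a KeyError.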

Optional parameters have the following default values:

labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise'
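As a rough illustration of these parameters, the sketch below uses a tiny throwaway dataframe. The column names and the variable names without_year, same_result and ignored are made up for this example only.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Jon', 'Mia'], 'year': [2012, 2018]})

# inplace=False is the default: drop() returns a new DataFrame, df is untouched
without_year = df.drop(labels='year', axis=1)

# columns= is a keyword alternative to labels= combined with axis=1
same_result = df.drop(columns='year')

# errors='ignore' skips labels that do not exist instead of raising KeyError
ignored = df.drop(columns=['year', 'no_such_column'], errors='ignore')

print(list(df.columns))            # ['name', 'year']
print(list(without_year.columns))  # ['name']
```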

.drop() examples for dropping a column/columns

Let us see some examples of dropping or removing columns from pandas dataframe.

Create dataframe

Create a simple dataframe from a dictionary of lists, with the column names name, year, orders and town.

import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)
df

Example #1 : Delete a single column using just the column name

Pandas lets you delete columns from a dataframe with the .drop() method. To remove a column permanently by name, use df.drop(['your_column_name'], axis=1, inplace=True).

To drop a single column from a pandas dataframe, pass the name of the column to the drop function, either as a single label or as a one-element list.

Remember the labels parameter? On its own it does not say whether the labels refer to rows or columns; that is controlled by the axis parameter. To specify that we want to drop a column, we need to provide axis=1 as an argument to the drop function. If you don't provide axis=1, the .drop() function defaults to axis=0 and looks for the label among the rows instead, which normally raises a KeyError because no row has that name.

df.drop(['your_column_name'], axis=1)

Don't get caught by the default behaviour of the inplace parameter. After you run df.drop(['year'], axis=1), the original dataframe remains unchanged, because the call returns a new dataframe. If you display df again, it will still contain the column that you passed to .drop(). To remove the column from df itself, either provide one more parameter, inplace=True, or assign the result back with df = df.drop(['year'], axis=1).

Your command should look like the following:

df.drop(['your_column_name'], axis=1, inplace=True) or

df.drop(columns=['your_column_name'], inplace=True)

It's better to pass arguments by keyword, for example columns=... rather than a bare list plus axis=1. Doing so makes your code easier to read.
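To make the axis and inplace pitfalls concrete, here is a minimal sketch using the example dataframe and its year column. The variable names trimmed and result are only illustrative.

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# Without axis=1 (or columns=), pandas looks for a ROW labelled 'year' and fails
try:
    df.drop(['year'])
except KeyError as exc:
    print('KeyError:', exc)

# Default inplace=False: returns a copy without 'year', df keeps the column
trimmed = df.drop(columns=['year'])
print('year' in df.columns)   # True

# inplace=True: modifies df itself and returns None
result = df.drop(columns=['year'], inplace=True)
print('year' in df.columns)   # False
```

Because an inplace call returns None, writing df = df.drop(..., inplace=True) would replace your dataframe with None; use either inplace=True or reassignment, not both.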

Example #2 : Remove a column with column index

It is also possible to drop a column using its index rather than its name. IMHO it is harder to read code when the index is used.

df.drop(df.columns[[index_column]], axis = 1, inplace = True)

In our example, we are deleting the year column, which has index 1. It is the second column in the dataframe, because Python indexing starts from zero.
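A minimal sketch of this, assuming the example dataframe from above, with index_column as a placeholder variable for the position:

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

index_column = 1  # position of the 'year' column, counting from zero

# df.columns[[1]] looks up the label at that position; drop() then removes by label
df.drop(df.columns[[index_column]], axis=1, inplace=True)

print(list(df.columns))  # ['name', 'orders', 'town']
```

Note that the position is only used to look up the column name. If several columns shared that name, all of them would be dropped.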

Example #3 : Delete multiple columns using the column name

Pandas .drop() function can also be used to remove multiple columns. To do so, one simply needs to provide names of columns that should be deleted. Here is an example of dropping two columns from our simple dataframe.

Your command should look like:

df.drop(['your_column1', 'your_column2'], axis=1, inplace = True) or

df.drop(columns=['your_column1', 'your_column2'], inplace = True)

Dataframe before .drop() is used:

Dataframe after .drop() is used:
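The before and after states can be reproduced with a short sketch like the one below. The choice of year and town as the two columns to drop is an assumption made for illustration, as are the variable names before and after.

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
before = pd.DataFrame(data)

# Keep the original and return a new dataframe without the two columns
after = before.drop(columns=['year', 'town'])

print(before)  # four columns: name, year, orders, town
print(after)   # two columns: name, orders
```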

Example #4 : Delete multiple columns using the column index

You can also delete multiple columns with column index using the command:

df.drop(df.columns[[index_column1, index_column2]], axis=1, inplace = True)
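As a hedged example, assuming the same dataframe and picking positions 1 and 3 (the year and town columns) purely for illustration:

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# Positions 1 and 3 correspond to 'year' and 'town'
index_column1, index_column2 = 1, 3
df.drop(df.columns[[index_column1, index_column2]], axis=1, inplace=True)

print(list(df.columns))  # ['name', 'orders']
```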

.drop() examples for dropping a row(s)

In Pandas, it is also easy to drop rows of a dataframe. We can use the same .drop() function to delete rows.

To drop one or more rows from a Pandas dataframe, we need to specify the index label(s) of the rows to be dropped and the axis=0 argument. Here the axis=0 argument specifies that we want to drop rows instead of columns. Remember that axis=0 is the default value for the .drop() function, so it is optional.

Example #1 : Delete row with its index

To delete a specific row from a dataframe with its index use the following command:

df.drop([row_index], axis=0, inplace = True) or

df.drop([row_index], inplace = True)
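For instance, a minimal sketch that drops the row with index 3 (Ted) from the example dataframe could look like this. The row chosen is arbitrary.

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

row_index = 3  # index label of the row to remove (Ted)
df.drop([row_index], inplace=True)

print(df['name'].tolist())  # ['Jon', 'Mia', 'Tony', 'Maria']
print(list(df.index))       # [0, 1, 2, 4]
```

The remaining rows keep their original index labels, so there is now a gap at 3. If you want consecutive numbering again, df.reset_index(drop=True) returns a renumbered copy.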

Example #2 : Delete row with its custom index

You can also set a custom index on your dataframe. In this case, to drop a specific row, you will need to use its custom index label.

Use one of the following commands:

df.drop(index='custom_index', inplace = True) or

df.drop('custom_index', axis=0, inplace = True) or

df.drop('custom_index', inplace = True) or

df.drop(['custom_index'], inplace = True)

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [7, 20, 27, 9, 3],
}
df = pd.DataFrame(data, index=['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'])
df

In the above dataframe, the towns serve as the index.

Let's delete a row with custom index Glasgow.
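A minimal sketch of that step, using the index= keyword (the other forms listed above should behave the same way):

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [7, 20, 27, 9, 3],
}
df = pd.DataFrame(
    data,
    index=['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
)

# Remove the row whose index label is 'Glasgow' (Ted's row)
df.drop(index='Glasgow', inplace=True)

print(list(df.index))  # ['London', 'Birmingham', 'Manchester', 'Newcastle']
```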

Conclusion

Now you should have a better understanding of what the .drop() function does and be able to use it in order to drop/delete/remove rows/columns from your dataframe. There are many other ways of dropping rows/columns from a dataframe, but they are out of the scope of this article.
