Delete column/row from a Pandas dataframe using .drop() method
While working with data in Pandas, you will often want to drop one or more columns or rows from a dataframe, typically because they are not needed for further analysis. There are a couple of ways you can achieve this, but the best way to do this in Pandas is to use the .drop() method.
.drop()
The .drop() function allows you to delete one or more columns from a dataframe. It can also be used to delete rows from a Pandas dataframe.
DataFrame.drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
As you can see above, the .drop() function has multiple parameters. To use the function correctly, we need to understand what these parameters do.
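Parameters:
self - The dataframe the method is called on. You never pass it explicitly: in df.drop(...) it is df.
Next come the optional parameters:
labels - A single index or column label, or a list of labels, to drop.
axis - Whether to drop labels from the index (0 or 'index') or columns (1 or 'columns').
index - Row labels to drop. An alternative to specifying axis: index=labels is equivalent to labels, axis=0.
columns - Column labels to drop. An alternative to specifying axis: columns=labels is equivalent to labels, axis=1.
level - For MultiIndex, the level from which the labels will be removed.
inplace - If True, modifies the dataframe itself and returns None. If False, leaves the original dataframe unchanged and returns a new dataframe without the specified labels.
errors - If 'ignore', suppresses the error for labels that do not exist, and only existing labels are dropped. With the default 'raise', a missing label raises a KeyError.
Optional parameters have the following default behaviour:
labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise'
Although every one of these defaults to None or a fixed value, you must supply at least one of labels, index or columns, otherwise .drop() raises an error.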
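The inplace and errors parameters are the two that most often cause surprises, so here is a minimal sketch of how they behave. The two-column dataframe and the variable names (trimmed, result, unchanged) are made up for illustration and are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical two-column dataframe, used only for this illustration.
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# inplace=False (the default): drop() returns a new dataframe, df is untouched.
trimmed = df.drop(columns=['b'])
print(list(trimmed.columns))    # ['a']
print(list(df.columns))         # ['a', 'b']

# inplace=True: df itself is modified and drop() returns None.
result = df.drop(columns=['b'], inplace=True)
print(result)                   # None
print(list(df.columns))         # ['a']

# errors='raise' (the default): a label that does not exist raises KeyError.
try:
    df.drop(columns=['missing'])
except KeyError as exc:
    print('KeyError:', exc)

# errors='ignore': labels that do not exist are silently skipped.
unchanged = df.drop(columns=['missing'], errors='ignore')
print(list(unchanged.columns))  # ['a']
```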
.drop() examples for dropping a column/columns
Let us see some examples of dropping or removing columns from pandas dataframe.
Create dataframe
Create a simple dataframe with a dictionary of lists, and column names: name, year, orders, town.
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)
df
Example #1 : Delete a single column using just the column name
Pandas provides data analysts with a way to delete and filter a dataframe using the .drop() method. A column can be removed permanently by name with df.drop(['your_column_name'], axis=1, inplace=True).
To drop a single column from a Pandas dataframe, we pass the name of the column to the drop function, either as a one-element list or as a plain string.
Remember the axis parameter? The Pandas .drop() function can drop either columns or rows, and axis controls which. To specify that we want to drop a column, we need to provide axis=1 as an argument to the drop function. If you don't provide axis=1, then .drop() will default to axis=0. This means that the function will look for the label among the rows and not the columns.
df.drop(['your_column_name'], axis=1)
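To make the effect of the axis default concrete, here is a small sketch using two of the columns from our dataframe. The variable names raised and no_year are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
})

# With the default axis=0, pandas looks for 'year' among the row labels
# (0 to 4 here), finds nothing, and raises KeyError.
try:
    df.drop(['year'])
    raised = False
except KeyError:
    raised = True
print(raised)                 # True

# With axis=1 the label is looked up among the columns instead.
no_year = df.drop(['year'], axis=1)
print(list(no_year.columns))  # ['name']
```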
Don't get caught by the default behaviour of the inplace parameter. After you run df.drop(['year'], axis=1), the actual dataframe remains unchanged, because .drop() returns a new dataframe and leaves df as it was. So if you display your dataframe again, it will still show the column that you passed to the .drop() function. To remove a column permanently from your dataframe, you will need to provide one more parameter, inplace=True, or assign the result back with df = df.drop(['year'], axis=1).
Your command should look like the following:
df.drop(['your_column_name'], axis=1, inplace=True)
or
df.drop(columns=['your_column_name'], inplace=True)
When you use columns=, the axis argument is redundant and can be left out. It's better to name parameters explicitly with keywords such as columns=. Doing so will make reading your code easier.
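Applied to the dataframe created earlier, with year as the column to remove, the two forms can be compared side by side. This is a sketch; the names by_axis and by_keyword are only for illustration.

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# The positional form with axis=1 and the keyword form give the same result.
by_axis = df.drop(['year'], axis=1)
by_keyword = df.drop(columns=['year'])
print(by_axis.equals(by_keyword))  # True
print(list(df.columns))            # ['name', 'year', 'orders', 'town']

# inplace=True changes df itself.
df.drop(columns=['year'], inplace=True)
print(list(df.columns))            # ['name', 'orders', 'town']
```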
Example #2 : Remove a column with column index
It is also possible to drop a column using its index rather than its name. IMHO it is harder to read code when the index is used.
df.drop(df.columns[[index_column]], axis = 1, inplace = True)
In our example, to delete the column year we would set index_column to 1, because year is the second column in the dataframe. Don't forget that Python indexing starts from zero.
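A runnable sketch of this, using the dataframe from the start of the article. The intermediate variable label is only there to show what df.columns[[1]] evaluates to.

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# df.columns[[1]] looks up the label stored at position 1, here 'year'.
label = df.columns[[1]]
print(list(label))       # ['year']

# The drop itself still happens by label.
df.drop(label, axis=1, inplace=True)
print(list(df.columns))  # ['name', 'orders', 'town']
```

Because the position is only used to look up the label, a dataframe with two columns sharing the same name would lose both of them.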
Example #3 : Delete multiple columns using the column name
The Pandas .drop() function can also be used to remove multiple columns. To do so, one simply needs to provide the names of the columns that should be deleted. Here is an example of dropping two columns from our simple dataframe.
Your command should look like:
df.drop(['your_column1', 'your_column2'], axis=1, inplace = True)
or
df.drop(columns=['your_column1', 'your_column2'], inplace = True)
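As a sketch of what the dataframe looks like before and after, here the columns year and town are dropped from our example dataframe. The choice of these two columns is an assumption made for illustration.

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# Before: four columns.
print(df.shape)          # (5, 4)
print(list(df.columns))  # ['name', 'year', 'orders', 'town']

df.drop(columns=['year', 'town'], inplace=True)

# After: only the two remaining columns, all five rows intact.
print(df.shape)          # (5, 2)
print(list(df.columns))  # ['name', 'orders']
```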
Example #4 : Delete multiple columns using the column index
You can also delete multiple columns with column index using the command:
df.drop(df.columns[[index_column1, index_column2]], axis=1, inplace = True)
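For instance, with positions 1 and 3 (an arbitrary choice for this sketch), the command removes year and town from our example dataframe:

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# Positions 1 and 3 correspond to the labels 'year' and 'town'.
labels = df.columns[[1, 3]]
print(list(labels))      # ['year', 'town']

df.drop(labels, axis=1, inplace=True)
print(list(df.columns))  # ['name', 'orders']
```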
.drop() examples for dropping a row(s)
In Pandas, it is also easy to drop rows of a dataframe. We can use the same .drop() function to delete rows.
To drop one or more rows from a Pandas dataframe, we need to specify the index label(s) of the rows to be dropped and the axis=0 argument. Here the axis=0 argument specifies that we want to drop rows instead of dropping columns. Remember that this is the default value for the .drop() function, so it is optional.
Example #1 : Delete row with its index
To delete a specific row from a dataframe with its index use the following command:
df.drop([row_index], axis=0, inplace = True)
or
df.drop([row_index], inplace = True)
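A short sketch with our example dataframe, dropping the row labelled 1 (the choice of row is arbitrary):

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [15, 20, 27, 9, 3],
    'town': ['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
}
df = pd.DataFrame(data)

# Drop the row whose index label is 1 (Mia).
df.drop([1], inplace=True)

# The remaining rows keep their original labels; nothing is renumbered.
print(list(df.index))    # [0, 2, 3, 4]
print(list(df['name']))  # ['Jon', 'Tony', 'Ted', 'Maria']
```

Note that the value passed to .drop() is an index label, not a position. After the drop above, there is no longer a row labelled 1, so running the same command again would raise a KeyError. If you want the rows renumbered from zero afterwards, df.reset_index(drop=True) does that.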
Example #2 : Delete row with its custom index
You can also set a custom index on your dataframe. In this case, to drop a specific row, you will need to use its custom index label.
Use one of the following commands:
df.drop(index='custom_index', inplace = True)
or
df.drop('custom_index', axis=0, inplace = True)
or
df.drop('custom_index', inplace = True)
or
df.drop(['custom_index'], inplace = True)
data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [7, 20, 27, 9, 3],
}
df = pd.DataFrame(data, index=['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'])
df
In the dataframe above, the town names serve as the index.
Let's delete a row with custom index Glasgow.
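A minimal runnable sketch of that deletion, using the town-indexed dataframe defined above:

```python
import pandas as pd

data = {
    'name': ['Jon', 'Mia', 'Tony', 'Ted', 'Maria'],
    'year': [2012, 2018, 2017, 2014, 2019],
    'orders': [7, 20, 27, 9, 3],
}
df = pd.DataFrame(
    data,
    index=['London', 'Birmingham', 'Manchester', 'Glasgow', 'Newcastle'],
)

# Drop the row whose index label is 'Glasgow' (Ted's row).
df.drop('Glasgow', inplace=True)

print(list(df.index))    # ['London', 'Birmingham', 'Manchester', 'Newcastle']
print(list(df['name']))  # ['Jon', 'Mia', 'Tony', 'Maria']
```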
Conclusion
Now you should have a better understanding of what the .drop() function does and be able to use it to drop rows and columns from your dataframe. There are many other ways of dropping rows and columns from a dataframe, but they are out of the scope of this article.