Drop all duplicate rows across multiple columns in Python Pandas

The pandas drop_duplicates function is great for "uniquifying" a dataframe. However, its take_last keyword argument (True or False) always keeps one row from each group of duplicates, whereas I would like to drop every row that is duplicated across a subset of columns. Is this possible?

    A   B   C
0   foo 0   A
1   foo 1   A
2   foo 1   B
3   bar 1   A

As an example, I would like to drop the rows that match on columns A and C, so this should drop rows 0 and 1 and leave rows 2 and 3.
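For reference, here is a minimal sketch of one way to get that result. It assumes a pandas version in which drop_duplicates and duplicated accept the keep parameter (which replaced take_last); passing keep=False treats every member of a duplicate group as a duplicate. The variable names df, result and mask are only illustrative.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "bar"],
    "B": [0, 1, 1, 1],
    "C": ["A", "A", "B", "A"],
})

# keep=False drops every row whose (A, C) pair occurs more than once,
# instead of keeping the first or last occurrence.
result = df.drop_duplicates(subset=["A", "C"], keep=False)

# Equivalent approach: build a boolean mask of all duplicated rows and invert it.
mask = df.duplicated(subset=["A", "C"], keep=False)
result_via_mask = df[~mask]

print(result)  # rows 2 and 3 remain
```

The mask form can be handy if you also want to inspect the rows that were dropped, via df[mask].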

