Pandas ‘count(distinct)’ equivalent

I am using Pandas as a database substitute as I have multiple databases (Oracle, SQL Server, etc.), and I am unable to make a sequence of commands to a SQL equivalent.

I have a table loaded in a DataFrame with some columns:

YEARMONTH, CLIENTCODE, SIZE, etc., etc.

In SQL, to count the amount of different clients per year would be:

SELECT count(distinct CLIENTCODE) FROM table GROUP BY YEARMONTH;

And the result would be

201301    5000
201302    13245

How can I do that in Pandas?

10 Answers
10

Leave a Comment