How to change the datetime format in Pandas

My dataframe has a DOB column (example format 1/1/2016) which by default gets converted to Pandas dtype 'object'. Converting this to date format with df['DOB'] = pd.to_datetime(df['DOB']), the date gets converted to: 2016-01-26 and its dtype is: datetime64[ns]. Now I want to convert this date format to 01/26/2016 or any other general date format. How …
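
One common way to get a display format such as 01/26/2016 is to render the datetime column as strings with the .dt.strftime accessor. The sketch below is only an illustration: the sample dates and the DOB_str column name are made up for this example, and the result is a column of plain strings, no longer datetime64 values.

```python
import pandas as pd

# Hypothetical sample data; the DOB values are invented for illustration.
df = pd.DataFrame({"DOB": ["1/26/2016", "2/3/2017"]})

# Parse the strings into real datetimes, stating the input format explicitly.
df["DOB"] = pd.to_datetime(df["DOB"], format="%m/%d/%Y")

# Render the datetimes as MM/DD/YYYY text. DOB_str is a hypothetical name;
# the new column holds strings, so date arithmetic should stay on DOB.
df["DOB_str"] = df["DOB"].dt.strftime("%m/%d/%Y")

print(df)
```

Keeping the original datetime column next to the formatted one leaves sorting and date arithmetic working, with the string column used only for display or export.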

Check if a value exists in pandas dataframe index

I am sure there is an obvious way to do this, but I can't think of anything slick right now. Basically, instead of raising an exception, I would like to get True or False to see if a value exists in a pandas df index.

    import pandas as pd
    df = pd.DataFrame({'test':[1,2,3,4]}, index=['a','b','c','d'])
    df.loc['g'] # (should give False)

…
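
A membership test with the in operator on df.index returns a boolean instead of raising, which seems to be what the question is after. A minimal sketch using the frame from the excerpt; the names has_g, has_a, and found are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"test": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# Membership test against the index labels; no exception is raised.
has_g = "g" in df.index
has_a = "a" in df.index
print(has_g, has_a)

# To check several candidate labels at once, build an Index of candidates
# and ask which of them appear in df.index.
found = pd.Index(["a", "g"]).isin(df.index)
print(found)
```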

What is the most efficient way of counting occurrences in pandas?

I have a large (about 12M rows) DataFrame df with say:

    df.columns = ['word','documents','frequency']

So the following ran in a timely fashion:

    word_grouping = df[['word','frequency']].groupby('word')
    MaxFrequency_perWord = word_grouping[['frequency']].max().reset_index()
    MaxFrequency_perWord.columns = ['word','MaxFrequency']

However, this is taking an unexpectedly long time to run:

    Occurrences_of_Words = word_grouping[['word']].count().reset_index()

What am I doing wrong here? Is there a better way …
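
Two approaches often suggested for counting rows per value are Series.value_counts() and GroupBy.size(), neither of which needs a column selected from the grouped frame. The sketch below uses a tiny made-up frame with the column names from the excerpt; the data and the Occurrences column name are assumptions for illustration, and how either option performs on 12M rows would need to be measured.

```python
import pandas as pd

# Hypothetical small sample with the same column names as the question.
df = pd.DataFrame({
    "word": ["apple", "pear", "apple", "fig", "apple", "pear"],
    "documents": [1, 1, 2, 2, 3, 3],
    "frequency": [3, 1, 5, 2, 4, 7],
})

# Option 1: count occurrences of each distinct word directly.
counts = df["word"].value_counts()

# Option 2: number of rows in each group, without selecting a column.
sizes = df.groupby("word").size()

# Turn the result into a two-column frame; "Occurrences" is a made-up name.
occurrences = sizes.reset_index(name="Occurrences")
print(occurrences)
```

Note that size() counts every row in a group, while count() counts only non-null values, so the two can differ when the data has missing entries.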