datetime dtypes in pandas read_csv

I'm reading in a CSV file with multiple datetime columns. I need to set the data types when reading the file, but datetimes appear to be a problem. For instance:

    headers = ['col1', 'col2', 'col3', 'col4']
    dtypes = ['datetime', 'datetime', 'str', 'float']
    pd.read_csv(file, sep='\t', header=None, names=headers, dtype=dtypes)

Running this gives an error: TypeError: data …
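One common way to approach the question above is to leave the date columns out of dtype and name them in the parse_dates argument of read_csv instead. The sketch below assumes a small in-memory sample in place of the real file; the sample rows and the dict form of dtype are illustrative assumptions, not part of the original question.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for the real file.
sample = io.StringIO(
    "2021-01-01\t2021-01-02 10:30:00\tabc\t1.5\n"
    "2021-02-01\t2021-02-02 11:45:00\tdef\t2.5\n"
)

headers = ["col1", "col2", "col3", "col4"]

df = pd.read_csv(
    sample,
    sep="\t",
    header=None,
    names=headers,
    dtype={"col3": str, "col4": float},  # dtypes for the non-date columns only
    parse_dates=["col1", "col2"],        # date columns are parsed rather than typed via dtype
)

print(df.dtypes)
```

Passing dtype as a dict keyed by column name keeps each type next to the column it applies to, which a plain list of type names does not.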

Can pandas automatically read dates from a CSV file?

Today I was pleasantly surprised to find that, when reading data from a file, pandas is able to recognize the types of the values:

    df = pandas.read_csv('test.dat', delimiter=r"\s+", names=['col1', 'col2', 'col3'])

This can be checked as follows:

    for i, r in df.iterrows():
        print(type(r['col1']), type(r['col2']), type(r['col3']))

In particular integers, floats and strings …
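As a rough illustration of the check described above, the sketch below inspects df.dtypes instead of iterating over rows, then converts a date-like column explicitly with pd.to_datetime. The sample data and the date format are assumptions made for the example; by default read_csv leaves date-like text as strings unless it is told to parse it.

```python
import io

import pandas as pd

# Hypothetical whitespace-separated sample standing in for test.dat.
sample = io.StringIO(
    "2013-10-01 1.5 foo\n"
    "2013-10-02 2.5 bar\n"
)

df = pd.read_csv(sample, sep=r"\s+", names=["col1", "col2", "col3"])

# Column-level types after reading: col1 is still text at this point.
print(df.dtypes)

# Explicit conversion of the date-like column (format is an assumption).
df["col1"] = pd.to_datetime(df["col1"], format="%Y-%m-%d")
print(df.dtypes)
```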

Find column whose name contains a specific string

I have a dataframe with column names, and I want to find the one whose name contains a certain string but does not exactly match it. I'm searching for 'spike' in column names like 'spike-2', 'hey spike', 'spiked-in' (the 'spike' part is always contiguous). I want the column name to be returned as a string or …
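A minimal sketch of one way to do this, assuming a plain substring test is enough: a list comprehension over df.columns. The extra column names "no" and "other" are hypothetical and only there to show that non-matching names are skipped.

```python
import pandas as pd

# Hypothetical frame; only the column names matter here.
df = pd.DataFrame(columns=["hey spike", "spike-2", "spiked-in", "no", "other"])

# Keep every column name that contains the substring "spike".
spike_cols = [col for col in df.columns if "spike" in col]
print(spike_cols)

# If a single name is wanted, take the first match (None when nothing matches).
first_match = spike_cols[0] if spike_cols else None
print(first_match)
```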

Multiple aggregations of the same column using pandas GroupBy.agg()

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times? Example dataframe:

    import pandas as pd
    import datetime as dt
    import numpy as np

    np.random.seed(0)
    df = pd.DataFrame({
        "date": [dt.date(2012, x, 1) for x in range(1, 11)],
        "returns" …
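One possibility, sketched under assumptions: select the column from the groupby and pass agg() a list of functions, which produces one output column per function. The "returns" values, the "dummy" grouping column, and the use of "sum" and "mean" in place of f1 and f2 are all hypothetical, since the original example is cut off.

```python
import datetime as dt

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * rng.standard_normal(10),  # hypothetical values
    "dummy": np.repeat(1, 10),                  # hypothetical grouping key
})

# Apply two aggregations to the same column in a single agg() call.
result = df.groupby("dummy")["returns"].agg(["sum", "mean"])
print(result)
```

Custom callables can be placed in the same list, in which case the output column takes the function's name.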