Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried: df = df[(df['col'] < -0.25) or (df['col'] > 0.25)] But I get the error: Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all() …
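A minimal sketch of one common way around this error, assuming the intent is an element-wise OR over the two comparisons. Python's `or` asks for a single truth value from each Series, whereas the `|` operator combines them row by row. The sample values below are hypothetical; only the column name 'col' and the bounds come from the question.

```python
import pandas as pd

# Hypothetical sample data; 'col' mirrors the column name in the question.
df = pd.DataFrame({"col": [-0.5, -0.1, 0.0, 0.2, 0.3]})

# Element-wise OR uses `|` instead of `or`. Each comparison keeps its own
# parentheses because `|` binds more tightly than `<` and `>`.
outside = df[(df["col"] < -0.25) | (df["col"] > 0.25)]

print(outside)
```

The same pattern applies to `&` for element-wise AND and `~` for element-wise NOT.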

How to filter Pandas dataframe using ‘in’ and ‘not in’ like in SQL

How can I achieve the equivalents of SQL's IN and NOT IN? I have a list with the required values. Here's the scenario: df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']}) countries_to_keep = ['UK', 'China'] # pseudo-code: df[df['country'] not in countries_to_keep] My current way of doing this is as follows: df = pd.DataFrame({'country': ['US', 'UK', 'Germany', …
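One possible approach, sketched here with the data from the question, is `Series.isin`, which returns a boolean mask that can be negated with `~` for the NOT IN case. The variable names `in_rows` and `not_in_rows` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"country": ["US", "UK", "Germany", "China"]})
countries_to_keep = ["UK", "China"]

# SQL-style IN: keep rows whose value appears in the list.
in_rows = df[df["country"].isin(countries_to_keep)]

# SQL-style NOT IN: negate the boolean mask with `~`.
not_in_rows = df[~df["country"].isin(countries_to_keep)]

print(in_rows)
print(not_in_rows)
```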

Filter pandas DataFrame by substring criteria

I have a pandas DataFrame with a column of string values. I need to select rows based on partial string matches. Something like this idiom: re.search(pattern, cell_in_question) returning a boolean. I am familiar with the syntax of df[df['A'] == "hello world"] but can't seem to find a way to do the same with a partial …
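A hedged sketch of partial matching with `Series.str.contains`, which behaves much like `re.search` applied to each cell. The sample strings and the pattern are assumptions made for illustration; only the column name 'A' comes from the question. Passing `na=False` treats missing values as non-matches so the result can be used directly as a mask.

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
df = pd.DataFrame({"A": ["hello world", "say hello", "goodbye", None]})

# Plain substring match; regex=False disables pattern interpretation.
partial = df[df["A"].str.contains("hello", regex=False, na=False)]

# Regular-expression match, similar in spirit to re.search per cell.
starts_with_hello = df[df["A"].str.contains(r"^hello", regex=True, na=False)]

print(partial)
print(starts_with_hello)
```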

Creating an empty Pandas DataFrame, then filling it?

I'm starting from the pandas DataFrame docs here: http://pandas.pydata.org/pandas-docs/stable/dsintro.html I'd like to fill the DataFrame iteratively with values in a time-series style calculation. So basically, I'd like to initialize the DataFrame with columns A, B and timestamp rows, all 0 or all NaN. I'd then add initial values and go over this data …
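A rough sketch of the setup described, assuming a daily timestamp index of four rows. The dates, the initial values, and the doubling rule are all hypothetical stand-ins, since the question's actual calculation is not shown. A scalar passed to the `DataFrame` constructor is broadcast to every cell.

```python
import numpy as np
import pandas as pd

# Hypothetical timestamp index; the real dates are not given in the question.
index = pd.date_range("2024-01-01", periods=4, freq="D")

# Frames pre-filled with zeros or NaN.
zeros = pd.DataFrame(0.0, index=index, columns=["A", "B"])
nans = pd.DataFrame(np.nan, index=index, columns=["A", "B"])

# Set initial values, then derive each row from the previous one.
# The doubling rule is only a placeholder for the real calculation.
zeros.iloc[0] = [1.0, 2.0]
for i in range(1, len(zeros)):
    zeros.iloc[i] = zeros.iloc[i - 1].to_numpy() * 2

print(zeros)
```

For large frames, collecting rows in a plain Python list and building the DataFrame once at the end is usually faster than assigning row by row.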