Deleting DataFrame row in Pandas based on column value

I have the following DataFrame:

             daysago  line_race  rating        rw    wrating
line_date
2007-03-31        62         11      56  1.000000  56.000000
2007-03-10        83         11      67  1.000000  67.000000
2007-02-10       111          9      66  1.000000  66.000000
2007-01-13       139         10      83  0.880678  73.096278
2006-12-23       160         10      88  0.793033  69.786942
2006-11-09       204          9      52  0.636655  33.106077
2006-10-22       222          8      66  0.581946  38.408408
2006-09-29       245 …
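The excerpt is cut off before it states which rows should go, so the following is only a minimal sketch of the general technique named in the title: filtering with a boolean mask. The condition (`line_race` equal to 9) and the `UNWANTED` name are assumptions made for illustration, and only a few of the rows above are reproduced.

```python
import pandas as pd

# A few rows copied from the excerpt above; the dates form the index.
df = pd.DataFrame(
    {
        "daysago": [62, 83, 111, 139],
        "line_race": [11, 11, 9, 10],
        "rating": [56, 67, 66, 83],
    },
    index=pd.to_datetime(["2007-03-31", "2007-03-10", "2007-02-10", "2007-01-13"]),
)
df.index.name = "line_date"

# Hypothetical condition: remove every row whose line_race equals 9.
UNWANTED = 9

# Keep the rows where the condition is False (boolean mask).
kept = df[df["line_race"] != UNWANTED]

# Equivalent form that passes the index labels of the unwanted rows to drop().
kept_via_drop = df.drop(df.index[df["line_race"] == UNWANTED])
```

Both forms return a new DataFrame and leave `df` unchanged; the mask version is usually the shorter one to read.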

How do I expand the output display to see more columns of a Pandas DataFrame?

Is there a way to widen the display of output in either interactive or script-execution mode? Specifically, I am using the describe() function on a Pandas DataFrame. When the DataFrame is five columns (labels) wide, I get the descriptive statistics that I want. However, if the DataFrame has any more columns, the statistics are suppressed …
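One way this is commonly handled in current pandas versions is through the display options. The sketch below assumes a hypothetical 12-column frame named `wide`; the width value of 250 is an arbitrary choice, not a recommendation.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with more columns than the excerpt's five.
wide = pd.DataFrame(
    np.arange(60, dtype=float).reshape(5, 12),
    columns=[f"col{i}" for i in range(12)],
)

# Temporarily lift the column limit and allow longer output lines.
# max_columns=None means "show every column".
with pd.option_context("display.max_columns", None, "display.width", 250):
    rendered = repr(wide.describe())
    print(rendered)

# To change the settings for the whole session instead, the same option
# names can be passed to pd.set_option, e.g.
# pd.set_option("display.max_columns", None)
```

`option_context` restores the previous settings when the block exits, which keeps the change from leaking into unrelated output.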

Writing a pandas DataFrame to CSV file

I have a dataframe in pandas which I would like to write to a CSV file. I am doing this using:

df.to_csv('out.csv')

And getting the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u03b1' in position 20: ordinal not in range(128)

Is there any way to get around this easily (i.e. I have unicode characters …
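A minimal sketch of one possible approach: `to_csv` accepts an `encoding` argument, so the encoding can be stated explicitly. The frame contents and the temporary output path below are assumptions made so the example can run on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame containing a non-ASCII character (Greek alpha, U+03B1).
df = pd.DataFrame({"symbol": ["\u03b1", "b"], "value": [1, 2]})

# Write into a temporary directory so the sketch leaves no stray file behind.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")

# State the encoding explicitly instead of relying on the default.
df.to_csv(out_path, encoding="utf-8", index=False)

# Read the file back with the same encoding to check the round trip.
round_trip = pd.read_csv(out_path, encoding="utf-8")
```

The `u'\u03b1'` form in the error message suggests a Python 2 environment; on Python 3 the default encoding for `to_csv` is already UTF-8, so the explicit argument mainly documents intent.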

Convert list of dictionaries to a pandas DataFrame

I have a list of dictionaries like this:

[{'points': 50, 'time': '5:00', 'year': 2010},
 {'points': 25, 'time': '6:00', 'month': "february"},
 {'points':90, 'time': '9:00', 'month': 'january'},
 {'points_h1':20, 'month': 'june'}]

And I want to turn this into a pandas DataFrame like this:

      month  points  points_h1  time  year
0       NaN      50        NaN  5:00  2010
1  february      25        NaN …
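As a minimal sketch, the `DataFrame` constructor accepts a list of dictionaries directly and fills keys missing from a record with NaN. The variable names `records` and `df` are chosen here for illustration, and the column order may differ from the one shown in the excerpt depending on the pandas version.

```python
import pandas as pd

# The list of dictionaries from the excerpt above.
records = [
    {"points": 50, "time": "5:00", "year": 2010},
    {"points": 25, "time": "6:00", "month": "february"},
    {"points": 90, "time": "9:00", "month": "january"},
    {"points_h1": 20, "month": "june"},
]

# Each dictionary becomes one row; the union of all keys becomes the columns.
df = pd.DataFrame(records)

# Optional: put the columns in alphabetical order.
df_sorted = df[sorted(df.columns)]
```

Because some cells are missing, integer columns such as `points` are stored as floats unless converted afterwards.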

How to deal with SettingWithCopyWarning in Pandas

Background

I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now the application is emitting many new warnings. One of them looks like this:

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_index,col_indexer] = value instead
quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE

I want to know …
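The excerpt does not show how `quote_df` was created, so the following is only a sketch under assumptions: it supposes `quote_df` was derived from a selection on another frame (called `raw` here, a hypothetical name) and uses a made-up value for `TVOL_SCALE`. It illustrates two habits that usually avoid the warning: taking an explicit copy, and assigning through `.loc` as the warning text suggests.

```python
import pandas as pd

TVOL_SCALE = 100  # hypothetical value; the excerpt does not give the real one

# Hypothetical source frame that quote_df is derived from.
raw = pd.DataFrame(
    {"TVol": [1200.0, 3400.0, 560.0], "price": [10.1, 9.8, 10.4]}
)

# Take an explicit copy when deriving a frame from a selection, so later
# assignments clearly target an independent object.
quote_df = raw[raw["price"] > 10].copy()

# Assign through .loc (all rows, one column), as the warning recommends.
quote_df.loc[:, "TVol"] = quote_df["TVol"] / TVOL_SCALE
```

With the explicit copy in place, the assignment changes `quote_df` only and `raw` keeps its original values.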

“Large data” workflows using pandas [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers. Closed 3 months ago. The community reviewed whether to reopen this question 3 months ago and left it closed: Original close reason(s) were not resolved. …

How to drop rows of Pandas DataFrame whose value in a certain column is NaN

I have this DataFrame and want only the records whose EPS column is not NaN:

>>> df
                 STK_ID  EPS  cash
STK_ID RPT_Date
601166 20111231  601166  NaN   NaN
600036 20111231  600036  NaN    12
600016 20111231  600016  4.3   NaN
601009 20111231  601009  NaN   NaN
601939 20111231  601939  2.5   NaN
000001 20111231  000001  NaN   NaN

…i.e. something like …
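A minimal sketch of two equivalent ways to keep only the rows with a value in `EPS`. For brevity it uses a plain integer index rather than the `(STK_ID, RPT_Date)` index shown above, and reproduces only some of the rows; both simplifications are assumptions of the example.

```python
import numpy as np
import pandas as pd

# Simplified version of the frame in the excerpt (default integer index).
df = pd.DataFrame(
    {
        "STK_ID": ["601166", "600036", "600016", "601939"],
        "EPS": [np.nan, np.nan, 4.3, 2.5],
        "cash": [np.nan, 12, np.nan, np.nan],
    }
)

# Drop rows that have NaN in the EPS column; NaN in other columns is ignored.
with_eps = df.dropna(subset=["EPS"])

# Equivalent boolean-mask form.
with_eps_mask = df[df["EPS"].notna()]
```

`dropna` returns a new frame by default, so `df` itself still contains all four rows.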

How to add a new column to an existing DataFrame?

I have the following indexed DataFrame with named columns and a row index of non-contiguous numbers:

          a         b         c         d
2  0.671399  0.101208 -0.181532  0.241273
3  0.446172 -0.243316  0.051767  1.577318
5  0.614758  0.075793 -0.451460 -0.012493

I would like to add a new column, 'e', to the existing data frame and do not want to change anything in …
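The excerpt is truncated before it says where the values for 'e' come from, so this sketch assumes they are available as a Series keyed by the same index labels (2, 3, 5). The numbers in `e_values` are made up for illustration, and only columns `a` and `b` are reproduced.

```python
import pandas as pd

# Two of the columns from the excerpt, with its non-contiguous index.
df = pd.DataFrame(
    {
        "a": [0.671399, 0.446172, 0.614758],
        "b": [0.101208, -0.243316, 0.075793],
    },
    index=[2, 3, 5],
)

# Hypothetical values for the new column, aligned on the index labels.
e_values = pd.Series([-0.335485, -1.166658, -0.385571], index=[2, 3, 5])

# assign() returns a new frame with the extra column and leaves df untouched.
with_e = df.assign(e=e_values)

# Plain column assignment does the same thing in place on a copy.
in_place = df.copy()
in_place["e"] = e_values
```

Because a Series is matched on index labels rather than position, the non-contiguous index is preserved and each value lands on the intended row.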