Counting unique values in a column in pandas dataframe like in Qlik?

If I have a table like this:

    df = pd.DataFrame({
        'hID': [101, 102, 103, 101, 102, 104, 105, 101],
        'dID': [10, 11, 12, 10, 11, 10, 12, 10],
        'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
        'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
    })

I can do count(distinct hID) in Qlik to …
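As a rough sketch of the pandas counterpart to a distinct count, `Series.nunique()` counts the distinct values in one column, and combining it with `groupby` gives a distinct count per group. The per-`dID` grouping is only an assumed illustration, since the excerpt is cut off before it says what the count should be grouped by.

```python
import pandas as pd

df = pd.DataFrame({
    "hID": [101, 102, 103, 101, 102, 104, 105, 101],
    "dID": [10, 11, 12, 10, 11, 10, 12, 10],
    "uID": ["James", "Henry", "Abe", "James", "Henry", "Brian", "Claude", "James"],
    "mID": ["A", "B", "A", "B", "A", "A", "A", "C"],
})

# Number of distinct hID values in the whole column.
distinct_hids = df["hID"].nunique()

# Assumed illustration: distinct hID values within each dID group.
hids_per_did = df.groupby("dID")["hID"].nunique()

print(distinct_hids)
print(hids_per_did)
```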

A column-vector y was passed when a 1d array was expected

I need to fit RandomForestRegressor from sklearn.ensemble.

    forest = ensemble.RandomForestRegressor(**RF_tuned_parameters)
    model = forest.fit(train_fold, train_y)
    yhat = model.predict(test_fold)

This code worked until I added some preprocessing of the data (train_y). The error message says: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example …
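The warning asks for a target of shape `(n_samples,)` rather than `(n_samples, 1)`. A minimal sketch of that reshaping with NumPy's `ravel()` follows; the `train_y` values are made up for illustration, and the sketch assumes the preprocessing step left `train_y` as a two-dimensional column.

```python
import numpy as np

# Hypothetical target after preprocessing: a column vector of shape (3, 1).
train_y = np.array([[1.0], [2.0], [3.0]])

# Flatten to a 1-D array of shape (3,), the shape the warning asks for.
train_y_1d = train_y.ravel()

print(train_y.shape, "->", train_y_1d.shape)
```

If `train_y` is a single-column pandas DataFrame instead, `train_y.values.ravel()` should give the same flat array.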

NumPy or Pandas: Keeping array type as integer while having a NaN value

Is there a preferred way to keep the data type of a numpy array fixed as int (or int64 or whatever), while still having an element inside listed as numpy.NaN? In particular, I am converting an in-house data structure to a Pandas DataFrame. In our structure, we have integer-type columns that still have NaN’s (but …
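For context, NaN is a floating-point value, so a plain NumPy array that contains it is stored as float. One possible route on the pandas side, sketched here on the assumption that a reasonably recent pandas is available, is the nullable `"Int64"` extension dtype (capital I), which keeps integers and marks missing entries as `pd.NA`.

```python
import numpy as np
import pandas as pd

# A NumPy array holding NaN falls back to a float dtype.
float_arr = np.array([1, np.nan, 3])
print(float_arr.dtype)

# pandas' nullable integer dtype keeps the values as integers
# and represents the missing entry as pd.NA.
int_series = pd.Series([1, pd.NA, 3], dtype="Int64")
print(int_series.dtype)
print(int_series)
```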

Suppress Scientific Notation in Numpy When Creating Array From Nested List

I have a nested Python list that looks like the following: my_list = [[3.74, 5162, 13683628846.64, 12783387559.86, 1.81], [9.55, 116, 189688622.37, 260332262.0, 1.97], [2.2, 768, 6004865.13, 5759960.98, 1.21], [3.74, 4062, 3263822121.39, 3066869087.9, 1.93], [1.91, 474, 44555062.72, 44555062.72, 0.41], [5.8, 5006, 8254968918.1, 7446788272.74, 3.25], [4.5, 7887, 30078971595.46, 27814989471.31, 2.18], [7.03, 116, 66252511.46, 81109291.0, 1.56], [6.52, 116, …
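Scientific notation here is a matter of how NumPy prints the array, not of the stored values. One way to force fixed-point output, sketched on the first three rows of the list above, is to pass an explicit float formatter to `np.array2string`; the two-decimal format is an assumption chosen to match the input data.

```python
import numpy as np

# First three rows of the nested list from the question.
my_list = [
    [3.74, 5162, 13683628846.64, 12783387559.86, 1.81],
    [9.55, 116, 189688622.37, 260332262.0, 1.97],
    [2.2, 768, 6004865.13, 5759960.98, 1.21],
]
arr = np.array(my_list)

# Format every float as fixed-point with two decimals (assumed precision).
text = np.array2string(arr, formatter={"float_kind": lambda v: f"{v:.2f}"})
print(text)
```

The same `formatter` dictionary can be passed to `np.set_printoptions` to change the default for every array printed afterwards.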

How to do exponential and logarithmic curve fitting in Python? I found only polynomial fitting

I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic). I use Python and NumPy, and for polynomial fitting there is a function polyfit(). But I found no such functions for exponential and logarithmic fitting. Are there any? Or how to solve …
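One common workaround, sketched below on synthetic noise-free data, is to linearise the model and reuse `np.polyfit` with degree 1: for y = a·exp(b·x), fit log(y) against x; for y = a + b·log(x), fit y against log(x). The coefficients and data are invented for illustration. On noisy data the log transform changes how errors are weighted, so a nonlinear fitter such as `scipy.optimize.curve_fit` may be a better choice.

```python
import numpy as np

x = np.linspace(1.0, 5.0, 20)

# Exponential model y = a * exp(b * x), with assumed a = 2.0, b = 0.5.
y_exp = 2.0 * np.exp(0.5 * x)
# log(y) = log(a) + b * x is linear in x, so a degree-1 polyfit recovers b and log(a).
b_exp, log_a = np.polyfit(x, np.log(y_exp), 1)
a_exp = np.exp(log_a)

# Logarithmic model y = a + b * log(x), with assumed a = 1.5, b = 3.0.
y_log = 1.5 + 3.0 * np.log(x)
# y is linear in log(x), so fit against the transformed x values.
b_log, a_log = np.polyfit(np.log(x), y_log, 1)

print(a_exp, b_exp)
print(a_log, b_log)
```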