scikit-learn Archives

Programming, Python

IT Nursery

Random state (Pseudo-random number) in Scikit learn

I want to implement a machine learning algorithm in scikit-learn, but I don’t understand what the random_state parameter does. Why should I ...

June 3, 2022

Programming, Python

How to extract the decision rules from scikit-learn decision-tree?

Can I extract the underlying decision rules (or ‘decision paths’) from a trained decision tree as a textual list? Something like: ...

June 3, 2022

Programming, Python

A column-vector y was passed when a 1d array was expected

I need to fit RandomForestRegressor from sklearn.ensemble.

    forest = ensemble.RandomForestRegressor(**RF_tuned_parameters)
    model = forest.fit(train_fold, train_y)
    yhat = model.predict(test_fold)

This code always worked until I ...

May 31, 2022

Programming, Python

pandas dataframe columns scaling with sklearn

I have a pandas dataframe with mixed type columns, and I’d like to apply sklearn’s min_max_scaler to some of the columns. Ideally, I’d ...

May 30, 2022

Programming, Python

Is it possible to specify your own distance function using scikit-learn K-Means Clustering?

Is it possible to specify your own distance function using scikit-learn K-Means Clustering?

May 29, 2022

Programming, Python

Find p-value (significance) in scikit-learn LinearRegression

How can I find the p-value (significance) of each coefficient?

    lm = sklearn.linear_model.LinearRegression()
    lm.fit(x, y)

May 26, 2022

Programming, Python

Is there a library function for Root mean square error (RMSE) in python?

I know I could implement a root mean squared error function like this:

    def rmse(predictions, targets):
        return np.sqrt(((predictions - targets) ** 2).mean())

What ...

May 26, 2022
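
The function quoted in the excerpt above relies on NumPy arrays. As a rough illustration of the same calculation, here is a minimal sketch that uses only the Python standard library. The input checks and the sample values are assumptions added for the example, not part of the original question.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error of two equal-length sequences of numbers."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(predictions) == 0:
        raise ValueError("at least one value is required")
    # Square each element-wise difference, average the squares, take the root.
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only: both errors are 1, so the mean square is 1.
print(rmse([2.0, 4.0], [1.0, 3.0]))
```

With NumPy arrays the one-line version from the excerpt computes the same quantity. This sketch just spells out the square, mean, and root steps.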

pandas, Programming

How to split data into 3 sets (train, validation and test)?

I have a pandas dataframe and I wish to divide it into 3 separate sets. I know that using train_test_split from sklearn.cross_validation, one ...

May 26, 2022
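
One possible way to think about the three-way split asked about above is to shuffle the row positions once and cut them into three groups. The sketch below uses only the standard library. The helper name `split_indices`, the 60/20/20 fractions, and the seed are hypothetical choices made for illustration. They do not come from the original post.

```python
import random


def split_indices(n_rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle row positions 0..n_rows-1 and cut them into three groups.

    The default fractions and the seed are illustrative assumptions.
    Whatever is left after the train and validation cuts becomes the test set.
    """
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    train_end = int(train_frac * n_rows)
    val_end = train_end + int(val_frac * n_rows)
    return indices[:train_end], indices[train_end:val_end], indices[val_end:]


train_idx, val_idx, test_idx = split_indices(10)
print(len(train_idx), len(val_idx), len(test_idx))
```

For a pandas dataframe, each list of positions could then be passed to `DataFrame.iloc` to select the matching rows.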

Programming, Python

sklearn error ValueError: Input contains NaN, infinity or a value too large for dtype(‘float64’)

I am using sklearn and having a problem with the affinity propagation. I have built an input matrix and I keep getting the ...

May 26, 2022

Save classifier to disk in scikit-learn

How do I save a trained Naive Bayes classifier to disk and use it to predict data? I have the following sample program ...

May 24, 2022