StandardScaler in preprocessing
Standardize features by removing the mean and scaling to unit variance.
scaler = StandardScaler() creates a scaler that, once fitted, exposes .transform
with_std : bool, default=True
with_mean : bool, default=True
copy : bool, default=True
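>>> from sklearn.preprocessing import StandardScaler
>>> data = [[0, 0], [0, 0], [1, 1], [1, 1]]  # toy data for illustration
>>> scaler = StandardScaler()
>>> print(scaler.fit(data))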
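As a minimal sketch of what "removing the mean and scaling to unit variance" amounts to, the snippet below compares the transformed output against a manual computation from the fitted statistics. The toy array is an assumption chosen for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 samples, 2 features.
data = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]])

scaler = StandardScaler()
scaled = scaler.fit_transform(data)

# The same standardization done by hand from the fitted statistics.
manual = (data - scaler.mean_) / scaler.scale_

print(scaled)
print(np.allclose(scaled, manual))
```

Each column of the result should have a mean close to 0 and a standard deviation close to 1.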
Attributes
----------
scale_ : ndarray of shape (n_features,) or None
Per-feature relative scaling of the data to achieve zero mean and unit variance. Generally this is calculated using np.sqrt(var_). If a variance is zero, unit variance cannot be achieved, and the data is left as-is, giving a scaling factor of 1. scale_ is equal to None when with_std=False.
New in version 0.17: scale_
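The zero-variance rule can be checked with a small sketch. The array below is a made-up example in which the second column is constant.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: the second column is constant, so its variance is zero.
X = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])

scaler = StandardScaler().fit(X)
print(scaler.var_)    # per-feature variance
print(scaler.scale_)  # sqrt of the variance, with 1.0 for the constant column

# With variance scaling disabled, no scaling factor is stored.
no_std = StandardScaler(with_std=False).fit(X)
print(no_std.scale_)  # None
```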
mean_ : ndarray of shape (n_features,) or None
The mean value for each feature in the training set. Equal to None when with_mean=False.
var_ : ndarray of shape (n_features,) or None
The variance for each feature in the training set. Used to compute scale_. Equal to None when with_std=False.
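A short sketch of the fitted statistics, using an invented array. For the None case of mean_ it disables both with_mean and with_std, because some scikit-learn versions appear to still compute mean_ for dense input when only with_mean=False is set (it is needed to obtain the variance). Treat that detail as an assumption to verify against your installed version.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 3 samples, 2 features.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

scaler = StandardScaler().fit(X)
print(scaler.mean_)  # per-feature mean
print(scaler.var_)   # per-feature population variance (ddof=0)

# With variance scaling disabled, var_ is not computed.
centered_only = StandardScaler(with_std=False).fit(X)
print(centered_only.var_)

# With both steps disabled, no statistics are stored.
identity = StandardScaler(with_mean=False, with_std=False).fit(X)
print(identity.mean_, identity.var_)
```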
n_features_in_ : int
Number of features seen during fit.
New in version 0.24.
feature_names_in_ : ndarray of shape (n_features_in_,)
Names of features seen during fit. Defined only when X has feature names that are all strings.
New in version 1.0.
n_samples_seen_ : int or ndarray of shape (n_features,)
The number of samples processed by the estimator for each feature. If there are no missing samples, n_samples_seen_ is an integer; otherwise it is an array of dtype int. If sample_weight is used, it is a float (if there is no missing data) or an array of dtype float that sums the weights seen so far. It is reset on new calls to fit, but increments across partial_fit calls.
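The counting behavior described above can be sketched as follows. The batches and the array with a missing value are invented for illustration, and the intermediate variable names are assumptions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical batches with no missing values.
batch_1 = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
batch_2 = np.array([[7.0, 8.0], [9.0, 10.0]])

scaler = StandardScaler()
scaler.partial_fit(batch_1)
scaler.partial_fit(batch_2)
seen_after_partial = scaler.n_samples_seen_  # accumulates across partial_fit
print(seen_after_partial)

scaler.fit(batch_2)
seen_after_fit = scaler.n_samples_seen_      # fit starts counting from scratch
print(seen_after_fit)

# A NaN in the second column means the counts differ per feature.
with_gap = np.array([[1.0, np.nan], [2.0, 3.0], [3.0, 4.0]])
per_feature = StandardScaler().fit(with_gap).n_samples_seen_
print(per_feature)
```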
sklearn.feature_selection.f_regression
Univariate linear regression tests returning F-statistic and p-values.
Quick linear model for testing the effect of a single regressor, sequentially for many regressors.
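A minimal usage sketch on synthetic data, where only the first regressor actually drives the target. The data-generating process, seed, and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_selection import f_regression

# Synthetic data: y depends on column 0 only; columns 1 and 2 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

# One F-statistic and one p-value per regressor, each tested on its own.
f_stat, p_values = f_regression(X, y)
print(f_stat)
print(p_values)
```

The first regressor should show by far the largest F-statistic and a very small p-value, while the two noise columns should not stand out.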