1.Classification
2.Regression
3.Clustering
4.Dimensionality reduction
5.Model selection
6.Preprocessing
1.sklearn.base: Base classes and utility functions
2.sklearn.cluster: Clustering
3.sklearn.cluster.bicluster: Biclustering
4.sklearn.covariance: Covariance Estimators
5.sklearn.model_selection: Model Selection
6.sklearn.datasets: Datasets
7.sklearn.decomposition: Matrix Decomposition
8.sklearn.dummy: Dummy estimators
9.sklearn.ensemble: Ensemble Methods
10.sklearn.exceptions: Exceptions and warnings
11.sklearn.feature_extraction: Feature Extraction
12.sklearn.feature_selection: Feature Selection
13.sklearn.gaussian_process: Gaussian Processes
14.sklearn.isotonic: Isotonic regression
15.sklearn.kernel_approximation: Kernel Approximation
16.sklearn.kernel_ridge: Kernel Ridge Regression
17.sklearn.discriminant_analysis: Discriminant Analysis
18.sklearn.linear_model: Generalized Linear Models
19.sklearn.manifold: Manifold Learning
20.sklearn.metrics: Metrics
21.sklearn.mixture: Gaussian Mixture Models
22.sklearn.multiclass: Multiclass and multilabel classification
23.sklearn.multioutput: Multioutput regression and classification
24.sklearn.naive_bayes: Naive Bayes
25.sklearn.neighbors: Nearest Neighbors
26.sklearn.neural_network: Neural network models
27.sklearn.calibration: Probability Calibration
28.sklearn.cross_decomposition: Cross decomposition
29.sklearn.pipeline: Pipeline
30.sklearn.preprocessing: Preprocessing and Normalization
31.sklearn.random_projection: Random projection
32.sklearn.semi_supervised: Semi-Supervised Learning
33.sklearn.svm: Support Vector Machines
34.sklearn.tree: Decision Trees
35.sklearn.utils: Utilities
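To show how a few of the modules listed above fit together, here is a minimal sketch. The choice of dataset (iris), estimator (LogisticRegression), split ratio and random_state are assumptions made only for illustration; they are not taken from these notes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# sklearn.datasets: load a small built-in dataset (illustrative choice)
X, y = load_iris(return_X_y=True)

# sklearn.model_selection: hold out a quarter of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# sklearn.pipeline + sklearn.preprocessing + sklearn.linear_model:
# standardize the features, then fit a classifier
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Mean accuracy on the held-out samples
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Putting the scaler inside the pipeline means its statistics are learned from the training split only and then reused on the test split.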
from sklearn import preprocessing
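Standardization functions
Standardize the data to zero mean and unit variance along the given axis (this centers and rescales the data; it does not make it normally distributed)
preprocessing.scale(X, axis=0, with_mean=True, with_std=True, copy=True)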
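A small sketch of what scale does with its default arguments. The array X below is made-up toy data, used only to make the effect visible.

```python
import numpy as np
from sklearn import preprocessing

# Toy data (made up for illustration): 3 samples, 2 features
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# axis=0 (the default): every column is standardized independently
X_scaled = preprocessing.scale(X)

col_means = X_scaled.mean(axis=0)  # close to 0 for every column
col_stds = X_scaled.std(axis=0)    # close to 1 for every column

# The same result computed by hand: subtract the column mean,
# then divide by the column standard deviation
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled)
```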
Scale the data to a fixed range, [0, 1] by default
preprocessing.minmax_scale(X, feature_range=(0, 1), axis=0, copy=True)
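A brief sketch of minmax_scale with the default range and with a custom feature_range. The input values are invented for illustration.

```python
import numpy as np
from sklearn import preprocessing

# Toy data (made up for illustration): 3 samples, 2 features
X = np.array([[1.0, -5.0],
              [2.0, 0.0],
              [4.0, 5.0]])

# Default range: each column is mapped onto [0, 1]
X_01 = preprocessing.minmax_scale(X)

# Custom range: each column is mapped onto [-1, 1]
X_pm1 = preprocessing.minmax_scale(X, feature_range=(-1, 1))

# The same mapping written out by hand for the [0, 1] case
X_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(X_01)
print(X_pm1)
```

Because the column minimum and maximum define the mapping, a single extreme value compresses all other values in that column into a narrow band.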
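Scale each feature by its maximum absolute value, preserving sign, so the result lies in [-1.0, 1.0]. Because the data is not shifted or centered, sparsity is preserved and scipy.sparse input is supported (scale also accepts sparse input, but only with with_mean=False)
preprocessing.maxabs_scale(X, axis=0, copy=True)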
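A minimal sketch of maxabs_scale applied to a scipy.sparse matrix. The matrix contents are made up, and CSR is just one of the sparse formats one might use here.

```python
import numpy as np
from scipy import sparse
from sklearn import preprocessing

# Toy sparse matrix (made up for illustration): 3 samples, 3 features
X_sparse = sparse.csr_matrix(np.array([[0.0, -4.0, 0.0],
                                       [2.0, 0.0, 0.0],
                                       [0.0, 2.0, 5.0]]))

# Each column is divided by its largest absolute value (2, 4 and 5 here)
X_scaled = preprocessing.maxabs_scale(X_sparse)

# Zeros stay zeros, so the number of stored entries does not change
print(sparse.issparse(X_scaled), X_scaled.nnz)
print(X_scaled.toarray())
```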
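Scale the data using the interquartile range (IQR), i.e. the range between the first quartile (25th percentile) and the third quartile (75th percentile)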