Post Archive (January 2021) - lightsong

Computing with scikit-learn of sklearn

Abstract: Computing with scikit-learn https://scikit-learn.org/stable/computing.html This chapter covers the computational performance issues that come up when using sklearn. Strategies to scale computationally: bigger data ...

posted @ 2021-01-30 18:46 lightsong Views (227) Comments (0) Recommendations (0)

Model persistence of sklearn

Abstract: Model persistence https://scikit-learn.org/stable/modules/model_persistence.html Once a model has been trained, how do you save it for later use? That is model persistence. After training a scikit-learn model, ...

posted @ 2021-01-29 14:35 lightsong Views (221) Comments (0) Recommendations (0)
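To make the model persistence entry above concrete, here is a minimal sketch that saves and restores a fitted model with Python's standard `pickle` module. The iris dataset and `LogisticRegression` are illustrative choices, not taken from the post; it assumes scikit-learn is installed.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Illustrative data and model (assumptions for this sketch, not from the post).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize the fitted model to bytes; pickle.dump(model, file) would write to disk instead.
payload = pickle.dumps(model)

# Restore the model later and check that it predicts exactly like the original.
restored = pickle.loads(payload)
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(same_predictions)
```

Only unpickle data you trust, and ideally load it with the same scikit-learn version that produced it.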

Unsupervised dimensionality reduction of sklearn

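Abstract: Unsupervised dimensionality reduction https://scikit-learn.org/stable/modules/unsupervised_reduction.html Dimensionality reduction in the unsupervised learning setting, for cases where the number of features is very high. Before the supervised learning step, an unsupervised ...

posted @ 2021-01-28 17:11 lightsong Views (211) Comments (0) Recommendations (0)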

Preprocessing data of sklearn

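Abstract: Preprocessing data https://scikit-learn.org/stable/modules/preprocessing.html The preprocessing package provides utility functions and transformer classes that convert feature vectors into a representation better suited to downstream estimators. Learning algorithms generally benefit from standardizing the data. If outliers are present in the data ...

posted @ 2021-01-26 16:52 lightsong Views (356) Comments (0) Recommendations (0)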
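The standardization mentioned in the preprocessing entry above can be sketched with `StandardScaler`. The small array is made-up sample data, and the sketch assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up training data: 3 samples, 3 features (an assumption for illustration).
X_train = np.array([[1.0, -1.0, 2.0],
                    [2.0, 0.0, 0.0],
                    [0.0, 1.0, -1.0]])

# Learn the per-feature mean and standard deviation, then rescale the data.
scaler = StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)

# After scaling, each column has zero mean and unit variance.
column_means = X_scaled.mean(axis=0)
column_stds = X_scaled.std(axis=0)
print(column_means, column_stds)
```

The fitted `scaler` can then be applied to new data with `scaler.transform`, so test data is rescaled using the statistics learned from the training set.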

Semi-supervised Classification on a Text Dataset of sklearn

Abstract: Semi-supervised Classification on a Text Dataset https://scikit-learn.org/stable/auto_examples/semi_supervised/plot_semi_supervised_newsgroups.html#sp ...

posted @ 2021-01-24 12:16 lightsong Views (420) Comments (0) Recommendations (0)

Topic extraction with Non-negative Matrix Factorization and Latent Dirichlet Allocation of sklearn

Abstract: Topic extraction with Non-negative Matrix Factorization and Latent Dirichlet Allocation https://scikit-learn.org/stable/auto_examples/applications/plo ...

posted @ 2021-01-23 00:41 lightsong Views (182) Comments (0) Recommendations (0)

Classification of text documents using sparse features of sklearn

Abstract: Classification of text documents using sparse features https://scikit-learn.org/stable/auto_examples/text/plot_document_classification_20newsgroups.ht ...

posted @ 2021-01-22 12:56 lightsong Views (251) Comments (0) Recommendations (0)

Sample pipeline for text feature extraction and evaluation of sklearn

Abstract: Sample pipeline for text feature extraction and evaluation https://scikit-learn.org/stable/auto_examples/model_selection/grid_search_text_feature_extr ...

posted @ 2021-01-21 17:01 lightsong Views (146) Comments (0) Recommendations (0)

Clustering text documents using k-means of sklearn

Abstract: Clustering text documents using k-means https://scikit-learn.org/stable/auto_examples/text/plot_document_clustering.html#sphx-glr-auto-examples-text-p ...

posted @ 2021-01-21 16:56 lightsong Views (190) Comments (0) Recommendations (0)

Feature extraction of sklearn

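Abstract: Feature extraction https://scikit-learn.org/stable/modules/feature_extraction.html Extract features, in a format supported by machine learning algorithms, from datasets of text or images. The sklearn.feature_extraction module ca ...

posted @ 2021-01-21 16:46 lightsong Views (181) Comments (0) Recommendations (0)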

Column Transformer with Heterogeneous Data Sources -- of sklearn

Abstract: Column Transformer with Heterogeneous Data Sources https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer.html#sphx-glr-auto-ex ...

posted @ 2021-01-19 14:48 lightsong Views (213) Comments (0) Recommendations (0)

Column Transformer with Mixed Types -- of sklearn

Abstract: Column Transformer with Mixed Types https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html#sphx-glr-auto-examp ...

posted @ 2021-01-19 12:54 lightsong Views (247) Comments (0) Recommendations (0)

Pipelines and composite estimators of sklearn

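Abstract: Pipelines and composite estimators https://scikit-learn.org/stable/modules/compose.html Transformers are usually combined with classifiers, regressors, or other estimators to build a composite estimator (which can be thought of as a combined model). This technique is called a pipeline: Pipel ...

posted @ 2021-01-18 16:27 lightsong Views (254) Comments (0) Recommendations (0)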
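As a rough illustration of the composite estimator described in the pipelines entry above, the sketch below chains one transformer and one classifier with `Pipeline`. The step names `"scale"` and `"clf"`, the iris dataset, and the choice of estimators are assumptions made for this example; scikit-learn must be installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# A transformer followed by a final estimator; step names are arbitrary labels.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# fit() fits the scaler, transforms X, then fits the classifier on the result.
pipe.fit(X, y)

# predict() applies the same scaling before calling the classifier.
predictions = pipe.predict(X[:5])
step_names = [name for name, _ in pipe.steps]
print(step_names, predictions)
```

Treating the whole chain as one estimator means a single `fit`/`predict` call covers every step, and the same object can be passed to cross-validation or grid search.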

Out-of-core classification of text documents of sklearn

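Abstract: Strategies to scale computationally: bigger data https://scikit-learn.org/stable/computing/scaling_strategies.html To meet the demands of very large sample counts and computation speed, for the traditional approach (load the data into memory - ...

posted @ 2021-01-15 16:43 lightsong Views (452) Comments (0) Recommendations (0)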

Working With Text Data of sklearn

Abstract: Working With Text Data https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html#working-with-text-data Analyze a collection of text documents covering 20 different topics. Includes ...

posted @ 2021-01-14 17:09 lightsong Views (237) Comments (0) Recommendations (0)

docstring of python

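Abstract: Sphinx usage https://brendanhasz.github.io/2019/01/05/sphinx.html#file-hierarchy Sphinx can automatically extract docstrings from Python source files to generate documentation. Docstrings cover the comments attached to functions and classes. My understanding: Sphinx ...

posted @ 2021-01-12 16:54 lightsong Views (187) Comments (0) Recommendations (0)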
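For the docstring entry above, a small sketch of what a documentation tool has to work with: docstrings attached to a function, a class, and a method, read back with the standard `inspect` module. The names `add` and `Counter` are hypothetical and exist only for this example.

```python
import inspect


def add(a, b):
    """Return the sum of a and b."""
    return a + b


class Counter:
    """Count how many times increment() is called."""

    def __init__(self):
        self.count = 0

    def increment(self):
        """Increase the count by one and return the new value."""
        self.count += 1
        return self.count


# inspect.getdoc returns the cleaned-up docstring of an object.
function_doc = inspect.getdoc(add)
class_doc = inspect.getdoc(Counter)
method_doc = inspect.getdoc(Counter.increment)
print(function_doc, class_doc, method_doc, sep="\n")
```

The same text is also available as the `__doc__` attribute of each object.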

Manifold learning of sklearn

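Abstract: Manifold learning https://scikit-learn.org/stable/modules/manifold.html#locally-linear-embedding Manifold learning is an approach to nonlinear dimensionality reduction. Its algorithms rest on the idea that the dimensionality of many datasets is only artificially high, not intrinsically high. PCA ...

posted @ 2021-01-12 12:41 lightsong Views (181) Comments (0) Recommendations (0)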

Visualizing the stock market structure of sklearn

Abstract: Visualizing the stock market structure https://scikit-learn.org/stable/auto_examples/applications/plot_stock_market.html#stock-market This example uses several unsupervised learning techniques, ...

posted @ 2021-01-11 17:01 lightsong Views (321) Comments (0) Recommendations (0)

covariance of sklearn

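Abstract: Covariance estimation https://scikit-learn.org/stable/modules/covariance.html# The covariance matrix can be seen as an estimate of the shape of a dataset's scatter. My understanding: the more high correlation coefficients the matrix contains, the more concentrated the data distribution is; the fewer, the more dispersed. For example, among the individual features, the ...

posted @ 2021-01-07 16:57 lightsong Views (317) Comments (0) Recommendations (0)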
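To ground the covariance entry above, here is a plain-Python sketch of the sample covariance matrix and the correlation coefficient derived from it. The helper `covariance_matrix` and the toy data are hypothetical, written for this example; scikit-learn's own estimators are not used here.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1) of a list of equal-length rows."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(dims)
        ]
        for i in range(dims)
    ]


# Toy data in which the second feature is exactly twice the first.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
cov = covariance_matrix(data)

# Correlation = covariance divided by the product of the standard deviations.
correlation = cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])
print(cov, correlation)
```

Because the two features are perfectly linearly related, the correlation is 1 and all points fall on a single line, which is the extreme case of the "concentrated" distribution the entry describes.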

Several Common Statistical Concepts

Abstract: Arithmetic Mean: the mean describes the overall average level of the data. https://www.investopedia.com/terms/a/arithmeticmean.asp What Is the Arithmetic Mean? The arithmetic mean is the s ...

posted @ 2021-01-06 17:29 lightsong Views (623) Comments (0) Recommendations (0)
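As a quick worked example for the arithmetic mean entry above: the mean is the sum of the values divided by how many there are. The sample values are made up for illustration.

```python
from statistics import mean

# Made-up sample values (an assumption for illustration).
values = [2, 4, 6, 8]

# By hand: sum of the values divided by their count.
manual_mean = sum(values) / len(values)

# The standard library computes the same quantity.
library_mean = mean(values)
print(manual_mean, library_mean)
```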

statistical learning -- putting_together of sklearn

Abstract: Pipelining https://scikit-learn.org/stable/tutorial/statistical_inference/putting_together.html#pipelining Some estimators transform data and some predict from it. The two kinds can be combined; this is the pipeline ...

posted @ 2021-01-04 00:19 lightsong Views (135) Comments (0) Recommendations (0)

statistical learning -- Unsupervised learning of sklearn

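Abstract: Unsupervised learning https://scikit-learn.org/stable/tutorial/statistical_inference/unsupervised_learning.html The goal of unsupervised learning is to find representations of the data and to explore its structure. seeking r ...

posted @ 2021-01-03 23:47 lightsong Views (183) Comments (0) Recommendations (0)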

statistical learning -- Model selection of sklearn

Abstract: Model selection https://scikit-learn.org/stable/tutorial/statistical_inference/model_selection.html#score-and-cross-validated-scores Model selection consists of two parts: (1) choosing ...

posted @ 2021-01-03 22:59 lightsong Views (162) Comments (0) Recommendations (0)

Stay Hungry,Stay Foolish!

lightsong

{Web: [React, Vue, NodeJS, HTTP]，DevOps:[Jenkins,Docker,K8S], Languages:[Python, JS, C, Lua, Shell, Groovy]}, AI:[LLM, langchain，langraph]
