python - Post category - qqhfeng16

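Top 15 Python libraries for data science in 2017

Summary: Top 15 Python libraries for data science in 2017. 2017-05-22, Python程序员 (WeChat ID: pythonbuluo; account description: the most professional Python community, with daily posts, free e-books, live tutoring, resource downloads, and assorted tools). I have commissioned "维权骑士" (rightknights.c

posted @ 2017-06-03 17:10 qqhfeng16 Views (394) Comments (0) Recommendations (0)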

PCA dimensionality reduction, step by step

Summary: import numpy as np. Step 1, raw values: X1 = 0.9 2.4 1.2 0.5 0.3 1.8 0.5 0.3 2.5 1.3; X2 = 1 2.6 1.7 0.7 0.7 1.4 0.6 0.6 2.6 1.1. Step 2, compute the means: mean of X1 = 1.17, np.mean(x1)=1.170000000

posted @ 2016-08-18 21:51 qqhfeng16 Views (667) Comments (0) Recommendations (0)
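
The summary above is cut off after the mean, so the following is only a minimal sketch of how the remaining PCA steps are usually done with NumPy, using the X1 and X2 values from the summary. The variable names and the choice of `np.linalg.eigh` are assumptions, not necessarily what the original post uses.

```python
import numpy as np

# Raw values taken from the summary above.
x1 = np.array([0.9, 2.4, 1.2, 0.5, 0.3, 1.8, 0.5, 0.3, 2.5, 1.3])
x2 = np.array([1, 2.6, 1.7, 0.7, 0.7, 1.4, 0.6, 0.6, 2.6, 1.1])

# One row per sample, one column per variable: shape (10, 2).
X = np.column_stack((x1, x2))

# Step 2: column means, then center the data.
mean = X.mean(axis=0)
centered = X - mean

# Covariance matrix of the two variables (2 x 2).
cov = np.cov(centered, rowvar=False)

# Eigen-decomposition; eigh suits symmetric matrices and
# returns the eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# Keep the eigenvector with the largest eigenvalue and project onto it.
top = eigvecs[:, -1]
projected = centered @ top  # 10 values: the data reduced to one dimension

print("means:", mean)
print("variance kept:", eigvals[-1] / eigvals.sum())
```

The variance of `projected` equals the largest eigenvalue, which is why that eigenvector is the one to keep when reducing two dimensions to one.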

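Python urllib module: urlopen() and urlretrieve() in detail

Summary: 1. The urlopen() method. urllib.urlopen(url[, data[, proxies]]) creates a file-like object that represents a remote URL; you then work with that object as you would with a local file to fetch the remote data. The url parameter is the location of the remote data, usually a web address; the data parameter is the data submitted to the URL via POST (anyone who has worked with the web

posted @ 2016-08-18 21:02 qqhfeng16 Views (17636) Comments (1) Recommendations (0)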
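
The signature in the summary is the Python 2 one. As a rough sketch, assuming Python 3, the same two functions live in `urllib.request`. The example reads a temporary local file through a `file://` URL so that it runs without network access; the file names are made up for illustration.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen, urlretrieve

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical local file standing in for the remote resource.
    source = Path(tmp) / "source.txt"
    source.write_bytes(b"hello urllib\n")
    url = source.as_uri()  # a file:// URL

    # urlopen returns a file-like object: read it like a local file.
    with urlopen(url) as response:
        body = response.read()

    # urlretrieve copies the resource to a local path and
    # returns (path, headers).
    target = Path(tmp) / "copy.txt"
    saved_path, headers = urlretrieve(url, str(target))
    copied = target.read_bytes()

print(body)
```

With an `http://` or `https://` URL the calls look the same; passing `data=` to `urlopen` turns the request into a POST, as the summary describes.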

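numpy.linalg.eig

Summary: 1. Transposing works on two-dimensional arrays but has no effect on one-dimensional arrays. 2. Understand how eigenvalues correspond to eigenvectors.

posted @ 2016-08-18 00:13 qqhfeng16 Views (2991) Comments (0) Recommendations (0)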
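
Both points in the summary can be checked in a few lines. This is a small sketch with a made-up 2 x 2 matrix; the key fact is that `np.linalg.eig` returns the eigenvectors as columns, so column `i` pairs with eigenvalue `i`.

```python
import numpy as np

# 1. Transposing a 1-D array changes nothing; a 2-D array swaps its axes.
v = np.array([1, 2, 3])
row = np.array([[1, 2, 3]])
print(v.T.shape)    # same shape as v
print(row.T.shape)  # rows and columns swapped

# 2. Eigenvalue w[i] belongs to the COLUMN vecs[:, i], not the row vecs[i].
m = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, vecs = np.linalg.eig(m)
for i in range(len(w)):
    # Check the defining relation m @ x == lambda * x for each pair.
    print(np.allclose(m @ vecs[:, i], w[i] * vecs[:, i]))
```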

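Python floats carry roughly 17 significant digits by default. What should we do when a calculation needs more precision than that?

Summary: 1. Use string formatting (not recommended): >>> a = "%.30f" % (1/3) >>> a '0.333333333333333314829616256247' The digits are displayed, but they are not accurate; the trailing digits are usually meaningless. 2. For high precision, use the decimal module together with getcontext ...

posted @ 2016-08-17 23:43 qqhfeng16 Views (3394) Comments (0) Recommendations (0)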
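
The second approach is truncated in the summary, so here is a minimal sketch of what using `decimal` with `getcontext` typically looks like, assuming Python 3. The precision of 30 is chosen only to mirror the `%.30f` example.

```python
from decimal import Decimal, getcontext

# Formatting a float only prints more digits of the binary approximation.
formatted = "%.30f" % (1 / 3)

# Set the working precision (significant digits) for Decimal arithmetic.
getcontext().prec = 30
third = Decimal(1) / Decimal(3)

print(formatted)  # digits past roughly the 16th are noise
print(third)      # 30 correct significant digits
```

Build the operands from integers or strings, for example `Decimal("0.1")`. `Decimal(0.1)` would inherit the float's rounding error.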

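A hard-to-follow clustering-based classification algorithm

Summary: 1. Extract SURF features from every image (the number of feature rows varies per image, but the number of columns is fixed at 70). 2. Split the images into two groups, one for training and one for testing. 3. Merge all training images into one large matrix and cluster that matrix into 30 features. 4. Pass each image through the clustering function to infer which groups it belongs to (if there are fewer than 30 groups, pad the rest with 1). 5. Each image is then represented by 30 features

posted @ 2016-08-11 23:34 qqhfeng16 Views (1801) Comments (0) Recommendations (0)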

numpy.concatenate

Summary: import numpy as np a = np.array([[1, 2], [3, 4]]) a.shape Out[3]: (2, 2) b = np.array([[5, 6]]) b.shape Out[5]: (1, 2) np.concatenate((a, b)) Out[6]:

posted @ 2016-08-11 23:00 qqhfeng16 Views (11373) Comments (0) Recommendations (1)
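
The summary stops before the output. As a short sketch with the same `a` and `b`: the default `axis=0` stacks rows, while `axis=1` needs `b` transposed so that the row counts match. The names `rows` and `cols` are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])  # shape (2, 2)
b = np.array([[5, 6]])          # shape (1, 2)

# Default axis=0: append b as a new row. Column counts must match.
rows = np.concatenate((a, b))
print(rows)  # [[1 2] [3 4] [5 6]]

# axis=1: append as a new column. b.T has shape (2, 1), matching a's rows.
cols = np.concatenate((a, b.T), axis=1)
print(cols)  # [[1 2 5] [3 4 6]]
```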

Image compression with KMeans

Summary: # -*- coding: utf-8 -*- """ Created on Thu Aug 11 18:54:12 2016 @author: Administrator """ import numpy as np import matplotlib.pyplot as plt from skl

posted @ 2016-08-11 20:51 qqhfeng16 Views (579) Comments (0) Recommendations (0)

sklearn.utils.shuffle, a random shuffling utility: shuffles the given sequences and returns new, randomly reordered copies

Summary: Shuffle arrays or sparse matrices in a consistent way. This is a convenience alias to resample(*arrays, replace=False) to do random permutations of the

posted @ 2016-08-11 20:27 qqhfeng16 Views (11308) Comments (0) Recommendations (0)
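
"In a consistent way" means that every array passed in receives the same permutation, so samples and labels stay paired. A minimal sketch, assuming scikit-learn is installed; the toy data is made up so that the pairing is easy to verify.

```python
import numpy as np
from sklearn.utils import shuffle

# Toy data: row i of X is [2*i, 2*i + 1] and its label is i.
X = np.arange(10).reshape(5, 2)
y = np.array([0, 1, 2, 3, 4])

# Both arrays get the same random permutation; the originals are untouched.
X_shuffled, y_shuffled = shuffle(X, y, random_state=0)

print(X_shuffled)
print(y_shuffled)
```

Fixing `random_state` makes the permutation reproducible between runs.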

Feature selection

Summary: # -*- coding: utf-8 -*- """ Created on Wed Aug 10 20:26:15 2016 @author: qqhfeng """ # Module 1: VarianceThreshold, selecting features ''' Feature selector that removes all low-variance features. This feature selectio...

posted @ 2016-08-10 20:44 qqhfeng16 Views (444) Comments (0) Recommendations (0)

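Feature preprocessing

Summary: # -*- coding: utf-8 -*- """ Spyder Editor This is a temporary script file. """ import numpy as np from sklearn.preprocessing import StandardScaler # Module 1: standardization # Nondimensionalization converts data of different scales to a common scale. Common methods include standardization and interval scaling...

posted @ 2016-08-10 20:28 qqhfeng16 Views (598) Comments (0) Recommendations (0)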

On yaha Chinese word segmentation (segment the Chinese text, then turn it into vectors with TfidfVectorizer)

Summary: https://github.com/jannson/yaha

posted @ 2016-08-10 09:21 qqhfeng16 Views (1728) Comments (0) Recommendations (0)

About: cross_validation.scores

Summary: # -*- coding: utf-8 -*- """ Created on Wed Aug 10 08:10:35 2016 @author: Administrator """ ''' About: cross_validation.scores. Here cross_validation.scores is not the scores of cross_validation itself, but the classification function's (clf and svm in this post) scor...

posted @ 2016-08-10 08:34 qqhfeng16 Views (718) Comments (0) Recommendations (0)

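list vs. array explained (finally a bit clearer)

Summary: # -*- coding: utf-8 -*- """ Created on Tue Aug 09 23:04:51 2016 @author: Administrator """ import numpy as np ''' A list is a built-in Python data type. The elements of a list need not share a type, whereas all elements of an array must have the same type. What a list stores is the data's stor...

posted @ 2016-08-10 00:20 qqhfeng16 Views (3985) Comments (0) Recommendations (1)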
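
The type difference described in the summary is easy to see directly. A small sketch with made-up values: a list keeps each element's own type, while a NumPy array converts everything to one common dtype, which is also why arithmetic behaves differently.

```python
import numpy as np

# A list may mix types freely; each element keeps its own type.
mixed = [1, "two", 3.0]
print([type(item).__name__ for item in mixed])  # ['int', 'str', 'float']

# An array has a single dtype: the ints are upcast to float here.
arr = np.array([1, 2, 3.5])
print(arr.dtype)  # float64

# The same operator means different things.
repeated = [1, 2, 3] * 2           # list repetition
doubled = np.array([1, 2, 3]) * 2  # element-wise multiplication
print(repeated)
print(doubled)
```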

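pipeline (applying a chain of steps in sequence)

Summary: # Pipeline has no predict function of its own; it uses the predict function of the last estimator in the pipeline. Applies transforms to the data, and the predict method of the final estimator. Valid only if the final estimator impleme

posted @ 2016-08-09 22:59 qqhfeng16 Views (708) Comments (0) Recommendations (0)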
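
To make the point about `predict` concrete, here is a minimal sketch, assuming scikit-learn is installed. The step names, the toy data, and the choice of `StandardScaler` plus `LogisticRegression` are illustrative assumptions, not taken from the post.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up, well-separated two-class data.
X = np.array([[0.0, 10.0], [1.0, 12.0], [2.0, 14.0],
              [8.0, 30.0], [9.0, 32.0], [10.0, 34.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# predict() on the pipeline transforms X with every earlier step,
# then calls predict() on the final estimator.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
pipe.fit(X, y)
pipeline_pred = pipe.predict(X)

# The same thing done by hand, step by step.
scaler = StandardScaler().fit(X)
clf = LogisticRegression().fit(scaler.transform(X), y)
manual_pred = clf.predict(scaler.transform(X))

print(pipeline_pred, manual_pred)
```

If the last step were a transformer without `predict`, calling `pipe.predict` would fail, which is what "valid only if the final estimator implements predict" refers to.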

RandomizedSearchCV vs. GridSearchCV (the difference: how the parameter candidates are chosen)

Summary: RandomizedSearchCV took 8.64 seconds for 20 candidates parameter settings. [mean: 0.78075, std: 0.00987, params: {'bootstrap': True, 'min_samples_leaf'

posted @ 2016-08-09 22:54 qqhfeng16 Views (5383) Comments (0) Recommendations (0)

VotingClassifier

Summary: scores : array of float, shape=(len(list(cv)),) Array of scores of the estimator for each run of the cross validation. About scores: http://scikit-learn.or

posted @ 2016-08-09 22:37 qqhfeng16 Views (2843) Comments (0) Recommendations (0)

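Python's zip function

Summary: # -*- coding: utf-8 -*- """ Created on Tue Aug 09 22:17:32 2016 @author: Administrator """ # Python's zip function # zip accepts any number of sequences (including 0 or 1) as arguments and returns a list of tuples. It is hard to describe in words, so go straight to the examples: # Note: the result of zip is a list # Example 1: zip's...

posted @ 2016-08-09 22:29 qqhfeng16 Views (741) Comments (0) Recommendations (0)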
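
The examples are truncated in the summary, so the following is a short sketch of the cases it mentions. The post describes Python 2, where `zip` returns a list. This sketch assumes Python 3, where `zip` returns an iterator, hence the `list(...)` wrappers. The sample values are made up.

```python
names = ["a", "b", "c"]
nums = [1, 2, 3]

pairs = list(zip(names, nums))     # [('a', 1), ('b', 2), ('c', 3)]
empty = list(zip())                # [] with zero sequences
single = list(zip(nums))           # [(1,), (2,), (3,)] with one sequence
uneven = list(zip(names, [1, 2]))  # stops at the shortest input
unzipped = list(zip(*pairs))       # [('a', 'b', 'c'), (1, 2, 3)]

print(pairs, empty, single, uneven, unzipped, sep="\n")
```

Passing `*pairs` back into `zip` reverses the operation, which is a handy way to split a list of pairs into two sequences.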

A decision tree example

Summary: # -*- coding: utf-8 -*- """ Created on Tue Aug 09 16:15:03 2016 @author: Administrator """ import numpy as np import pandas as pd from sklearn.tree import DecisionTreeClassifier from sklearn.cross_...

posted @ 2016-08-09 18:01 qqhfeng16 Views (488) Comments (0) Recommendations (0)

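An example of random forest samples and classification targets

Summary: An example of random forest samples and classification targets. Notes: 1. The target has three or more classes (logistic classification can handle only two). 2. The independent variable X is organized by rows. 3. The dependent variable y is organized as a column (each value corresponds to one row of X). 4. Leave everything else to the program. # -*- coding: utf-8 -*- """ Created on Tue Aug 09 17:40:04 2016 @author: Administrator """ #...

posted @ 2016-08-09 18:00 qqhfeng16 Views (2111) Comments (0) Recommendations (0)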
