PerfectData - cnblogs (博客园)

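Getting Started with Docker: Installation (CentOS), Images, and Containers

Summary: Introduction to Docker. What is Docker? The official description: Docker is the company driving the container movement and the only container platform provider to address every application ac Read more

posted @ 2019-03-23 22:19 PerfectData Views (375) Comments (0) Recommendations (0)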

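Switching the Docker Toolbox Registry Mirror

Summary: 1. Docker Toolbox installation: omitted. Aliyun accelerator address: https://jbriwmh3.mirror.aliyuncs.com 2. Switching Docker Toolbox to a mirror in China: the default Docker Toolbox source downloads slowly and may produce errors, so this post records how to switch Docker Toolbox to a mirror in China Read more

posted @ 2019-03-23 22:12 PerfectData Views (485) Comments (0) Recommendations (0)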

Using the urllib Library 4: create headers

Summary:
import urllib.request
import urllib.parse
url = "https://www.baidu.com/"
# Plain request
response = urllib.request.urlopen(url)
print(response.read().decode())
# Spoofed header
Read more

posted @ 2019-02-16 15:48 PerfectData Views (194) Comments (0) Recommendations (0)
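
The summary above stops just as it reaches the header-spoofing step. A minimal sketch of that step, assuming the goal is simply to send a browser-like User-Agent; the helper names `build_request` and `fetch` and the User-Agent string are made up for illustration and are not taken from the post:

```python
import urllib.request

# Hypothetical User-Agent string, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request object that carries a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Send the request and decode the body as UTF-8 (needs network access)."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    req = build_request("https://www.baidu.com/")
    # urllib capitalizes header names, so the key is stored as "User-agent".
    print(req.header_items())
```

Building the `Request` separately from sending it keeps the header logic easy to inspect without touching the network.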

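Using the urllib Library 3: get html

Summary:
import urllib.request
import urllib.parse
# https://www.baidu.com/s?ie=UTF-8&wd=中国
# Make the "中国" part of the URL above dynamic, encode it, and fetch the HTML page
# 1 Build the URL
wd = input("Enter a search term: ")
url
Read more

posted @ 2019-02-16 15:47 PerfectData Views (493) Comments (0) Recommendations (0)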
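
The URL-building step named in the summary above could look roughly like this. It is a sketch under the assumption that the query string is assembled with `urllib.parse.urlencode`; the helper name `build_search_url` is hypothetical, and the rest of the original code is cut off:

```python
import urllib.parse

BASE_URL = "https://www.baidu.com/s"


def build_search_url(keyword):
    """Percent-encode the keyword and append it to the search URL."""
    query = urllib.parse.urlencode({"ie": "UTF-8", "wd": keyword})
    return BASE_URL + "?" + query


if __name__ == "__main__":
    print(build_search_url("中国"))
```

The resulting URL can then be passed to `urllib.request.urlopen` to fetch the HTML page.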

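Using the urllib Library 2: parse

Summary:
import urllib.parse
# urllib.parse usage covers three functions: quote, unquote, urlencode
# quote is the URL-encoding function. A URL may contain only a limited set of ASCII characters (letters, digits, and a few symbols such as the underscore); Chinese characters and most other symbols are not allowed, so quote percent-encodes them into a form a URL can carry
img_ur
Read more

posted @ 2019-02-16 15:45 PerfectData Views (313) Comments (0) Recommendations (0)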
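
A short sketch of the three functions the summary above lists. The sample strings and the helper name `demo` are illustrative, not taken from the post:

```python
from urllib.parse import quote, unquote, urlencode


def demo():
    """Return sample results for quote, unquote, and urlencode."""
    encoded = quote("中国")            # percent-encode the UTF-8 bytes
    decoded = unquote(encoded)         # reverse the encoding
    query = urlencode({"wd": "中国", "page": 1})  # dict -> query string
    return encoded, decoded, query


if __name__ == "__main__":
    for item in demo():
        print(item)
    # quote and urlencode treat spaces differently:
    print(quote("a b"))             # a%20b
    print(urlencode({"q": "a b"}))  # q=a+b
```

`quote` works on a single string, while `urlencode` takes a mapping and joins the encoded pairs with `&`, which is usually what a query string needs.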

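Using the urllib Library 1: request

Summary: urllib is a library that can mimic a browser sending requests, and it ships with Python. In Python 3, urllib is split into submodules, including urllib.request and urllib.parse Read more

posted @ 2019-02-16 15:39 PerfectData Views (331) Comments (0) Recommendations (0)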

HBase Operations, Part 1

Summary:
package Hbase;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
impo...
Read more

posted @ 2018-12-18 13:44 PerfectData Views (153) Comments (0) Recommendations (0)

Methods for Generating Random Strings

Summary:
package beifeng.hadoop;

import java.util.Random;
import org.apache.commons.lang.RandomStringUtils;

/**
 * Three Methods to generate random string.
 */

public clas...
Read more

posted @ 2018-12-17 22:17 PerfectData Views (1370) Comments (0) Recommendations (0)
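
The Java code in the summary above is cut off before any of its three methods appears, so the following is not a port of it. It is only a rough Python analogue of one common approach, picking characters at random from a fixed alphabet; the function name `random_string` and its parameters are assumptions:

```python
import random
import string

# Alphabet of ASCII letters and digits, similar in spirit to an
# "alphanumeric" random string.
ALPHANUMERIC = string.ascii_letters + string.digits


def random_string(length, alphabet=ALPHANUMERIC, rng=None):
    """Build a string of `length` characters drawn from `alphabet`."""
    if rng is None:
        rng = random.Random()
    return "".join(rng.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(random_string(16))
```

Passing a seeded `random.Random` makes the output reproducible, which helps when generating test data. For tokens or passwords the `secrets` module would be the safer choice.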

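MapReduce: Map Join

Summary: 1. Introduction. Reduce Join exists because the map phase cannot see all the fields needed for the join: the fields for a single key may be spread across different map tasks. Reduce-side join is very inefficient because the shuffle phase has to transfer a large amount of data. Map Join is an optimization for the following scenario: of the two tables to be joined, one table Read more

posted @ 2018-12-15 23:16 PerfectData Views (423) Comments (0) Recommendations (0)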
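
The summary above is cut off before it describes the scenario in full. The sketch below assumes the usual map-side join setup, where one of the two tables is small enough for every mapper to hold in memory. It is plain Python rather than Hadoop code, and the name `map_join` and the sample data are made up for illustration:

```python
def map_join(big_records, small_table):
    """Join each (key, value) record against an in-memory lookup table.

    small_table is a dict mapping key -> value; it stands in for the small
    table that each mapper would load before processing its input split.
    Records with no matching key are dropped (inner join).
    """
    for key, value in big_records:
        if key in small_table:
            yield key, small_table[key], value


if __name__ == "__main__":
    departments = {10: "sales", 20: "engineering"}          # small table
    employees = [(10, "alice"), (20, "bob"), (30, "carol")]  # large table
    for row in map_join(employees, departments):
        print(row)
```

Because the lookup happens inside the map step, no records need to be grouped by key afterwards, which is what lets this kind of join skip the shuffle.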

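MapReduce: Reduce Join

Summary: 1. Introduction. The main idea of Reduce Join is as follows: in the map phase, the map function reads both files, File1 and File2. To tell apart the key/value pairs from the two sources, each record is given a tag, for example tag=0 for records from File1 and tag=2 for records from File2. That is, the main task of the map phase is, for the different files, ... Read more

posted @ 2018-12-15 22:17 PerfectData Views (601) Comments (0) Recommendations (0)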
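
The tagging idea in the summary above can be simulated in plain Python. The tag values 0 and 2 follow the summary. The shuffle and reduce steps are an assumption based on how a reduce-side join usually works, since the summary is cut off before it reaches them, and all function names are hypothetical:

```python
from collections import defaultdict
from itertools import product

TAG_FILE1 = 0  # tag values as given in the summary
TAG_FILE2 = 2


def map_phase(file1, file2):
    """Emit (key, (tag, value)) so the reducer can tell the sources apart."""
    for key, value in file1:
        yield key, (TAG_FILE1, value)
    for key, value in file2:
        yield key, (TAG_FILE2, value)


def shuffle(pairs):
    """Group tagged values by key, standing in for the MapReduce shuffle."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_phase(groups):
    """For each key, pair every File1 value with every File2 value."""
    for key in sorted(groups):
        left = [v for tag, v in groups[key] if tag == TAG_FILE1]
        right = [v for tag, v in groups[key] if tag == TAG_FILE2]
        for l_value, r_value in product(left, right):
            yield key, l_value, r_value


def reduce_join(file1, file2):
    return list(reduce_phase(shuffle(map_phase(file1, file2))))


if __name__ == "__main__":
    users = [(1, "alice"), (2, "bob")]
    orders = [(1, "order-a"), (1, "order-b"), (3, "order-c")]
    print(reduce_join(users, orders))
```

Every record from both inputs passes through `shuffle` here, which mirrors why the reduce-side join moves so much data compared with a map-side join.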
