欣欣姐

March 24, 2021

Note: ZooKeeper must be installed before Kafka. For the detailed installation steps, see https://www.cnblogs.com/cstark/p/14573395.html. Download from the official site https://zookeeper.apache.org/releases.html#download Read More

posted @ 2021-03-24 15:26 欣欣姐 Views(71) Comments(0) Diggs(0)

ZooKeeper cluster installation and deployment

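Background: ZooKeeper is a distributed, open-source coordination service for distributed applications and an important component of Hadoop and HBase. It provides consistency services for distributed applications, including configuration maintenance, naming service, distributed synchronization, and group services. Installation notes: download from the official site https://zookeeper.apache.org/re Read More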

posted @ 2021-03-24 14:28 欣欣姐 Views(77) Comments(0) Diggs(0)

March 22, 2021

Spark stream processing

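Spark Streaming: Spark Streaming can integrate many data sources, such as Kafka, HDFS, and Flume, and even plain TCP sockets. The processed data can be stored in a file system or a database, or displayed on a dashboard. The basic principle of Spark Streaming is to split the real-time input data stream into time slices (on the order of seconds), and then Read More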
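
The time-slice splitting described in the excerpt above can be sketched in plain Python. This is only an illustration of the idea, not Spark's implementation: the function name `split_into_batches`, the `(timestamp, value)` event format, and the one-second interval are all assumptions made for the example.

```python
from collections import defaultdict


def split_into_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive time slices.

    Hypothetical helper for illustration; it is not part of the Spark API.
    """
    batches = defaultdict(list)
    for timestamp, value in events:
        # Integer division assigns each event to the slice that contains it.
        slice_index = int(timestamp // batch_interval)
        batches[slice_index].append(value)
    # Return the non-empty slices in time order.
    return [batches[i] for i in sorted(batches)]


# Assumed sample input: timestamps in seconds, arbitrary payloads.
events = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (2.5, "d"), (2.7, "e")]
batches = split_into_batches(events, batch_interval=1.0)
print(batches)
```

Each inner list stands for one small batch that could then be handled as ordinary batch data.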

posted @ 2021-03-22 11:33 欣欣姐 Views(384) Comments(0) Diggs(0)

March 18, 2021

Common Linux operations

Common Linux operations 1. View running processes on Linux [root@master ~]# ps -ef | grep spark* # view running Spark programs [root@master ~]# history | grep scp # view earlier runs of a given command [root@master ~]# p Read More

posted @ 2021-03-18 17:49 欣欣姐 Views(111) Comments(0) Diggs(0)

March 17, 2021

Common PySpark usage

Using the parsing of a log file as an example, where part of the sample log file is: 2021-03-09 06:54:21,907 [http-nio-16680-exec-6-43:tUxRo338DAxy6xpj] INFO [m.u.g.s.l.ThreadLocalLogHandler] - request url: /w Read More
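
As a rough sketch of how a line in this format might be split into fields, the plain-Python snippet below uses a regular expression. The field names (`timestamp`, `thread`, `level`, `logger`, `message`) and the helper `parse_log_line` are assumptions made for illustration, not the post's own code, and the sample line is the truncated excerpt exactly as shown.

```python
import re

# Assumed layout: "<timestamp> [<thread>] <LEVEL> [<logger>] - <message>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<level>[A-Z]+) +"
    r"\[(?P<logger>[^\]]+)\] - "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# The sample is the truncated excerpt from the listing, kept as-is.
sample = (
    "2021-03-09 06:54:21,907 [http-nio-16680-exec-6-43:tUxRo338DAxy6xpj] "
    "INFO [m.u.g.s.l.ThreadLocalLogHandler] - request url: /w"
)
parsed = parse_log_line(sample)
print(parsed)
```

In PySpark, a function like this could be passed to `rdd.map(parse_log_line)` after loading the file with `sc.textFile(...)`, followed by a filter that drops the `None` results.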

posted @ 2021-03-17 14:23 欣欣姐 Views(755) Comments(0) Diggs(0)

March 15, 2021

Installing and configuring Spark on Windows in detail

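Background: First, when writing Spark code in Python, you need to set up a local Spark environment on Windows, then upload the finished .py file to the Hadoop cluster and invoke it there. Second, when applying Spark to machine learning, it is often more convenient to work on Windows. Components to prepare: 1. Python 3.6.7 2. JDK (this article uses Read More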

posted @ 2021-03-15 17:18 欣欣姐 Views(6292) Comments(0) Diggs(0)

PySpark pitfalls: "Python worker failed to connect back" and "an integer is required"

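Pay close attention to versions during installation. On my first installation, the Python version was 3.8 and the Spark version was 3.1.1, and after installing, every PySpark "action" statement failed with the error Python worker failed to connect back. I tried many approaches and none of them solved the problem. In the end, the Spark version had to be Read More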

posted @ 2021-03-15 16:20 欣欣姐 Views(4334) Comments(0) Diggs(0)

March 11, 2021

-bash: /usr/bin/yum: /usr/bin/python: bad interpreter: No such file or directory

Problem: after installing Python 3.x on Linux, running yum fails with -bash: /usr/bin/yum: /usr/bin/python: bad interpreter: No such file or directory. Solution: modify the following two files: /usr/bin/yum /usr/libexec/urlgrabber-ext-d Read More

posted @ 2021-03-11 15:26 欣欣姐 Views(6869) Comments(0) Diggs(0)

Using a cursor in Oracle to retrieve a result set

In Oracle, you cannot use select directly to return a result set; you must use a cursor. create or replace procedure test_proc is v_date date; -- declare a variable cursor cur is select * from cdmdata.uenshks_trade_pos_ Read More

posted @ 2021-03-11 10:58 欣欣姐 Views(284) Comments(0) Diggs(0)

March 10, 2021

RMAN backup and recovery

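1 Database backup methods 1.1 Classification of database backup methods 1.2 Description of database backup methods Logical backup: backing up data through a logical export. A logical backup can only dump the data as of the moment of the backup, so recovery can only restore the data saved at backup time. A logical backup cannot help with data written between the backup point and the failure point, so it suits tables that rarely change. If, through a logical ... Read More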

posted @ 2021-03-10 14:07 欣欣姐 Views(2163) Comments(0) Diggs(0)
