欣欣姐

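June 29, 2021

1. Hive architecture: You can answer from your day-to-day experience with Hive, or, with reference to the figure below, from the angles of data ingestion, parsing, metadata management, and data storage.
2. Characteristics of Hive: This question mainly tests how well you understand Hive's overall use cases. Only once you know Hive's characteristics can you apply it deliberately, in the scenarios where it fits, in real projects. You can approach it from the following four ...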

posted @ 2021-06-29 11:40 欣欣姐

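June 28, 2021

Classic Kafka Questions

Basic questions
1. What is Apache Kafka? Apache Kafka is a distributed stream-processing framework used to build real-time streaming applications. One of its core capabilities is especially well known: it is widely used as an enterprise-grade messaging engine. Be sure to establish its standing as a stream-processing framework first, since that leaves the interviewer with a professional impression.
2. What is a consumer group? A consumer group is a Ka ...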

posted @ 2021-06-28 18:06 欣欣姐

June 22, 2021

Python-Excel: Setting Fonts and Alignment

The cells need to be merged and centered:
from openpyxl import load_workbook
from openpyxl.styles import Font, colors, Alignment
import os
os.chdir(r'C:\Users\86159\Desktop\file')
exce ...

posted @ 2021-06-22 10:14 欣欣姐

June 21, 2021

Python Excel Cells and Styles

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Only affects the encoding of Chinese text in the current file
# Filename : Write_excel_Format.py
import os
import time
import xlwt
# Check whether TestData exists in the current directory ...

posted @ 2021-06-21 18:29 欣欣姐

June 3, 2021

Python: Merging Values That Share the Same Key

Merge the entries that share the same key into a single dict object:
lis=[('hadoop', 'hadoop1'), ('hadoop', 'hadoop2'), ('flume', 'flume1'), ('flume', 'flume2'), ('hadoop', 'hadoop3'), ('flink', ' ...
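Since the excerpt is cut off before the post's own solution, here is a minimal sketch of one common way to do this grouping, using `collections.defaultdict` from the standard library. The function name `merge_by_key` and the `sample` list are illustrative assumptions; `sample` contains only the complete pairs visible in the excerpt, and this is not necessarily the approach the original post takes.

```python
from collections import defaultdict


def merge_by_key(pairs):
    """Group the values of (key, value) pairs under their shared key."""
    merged = defaultdict(list)
    for key, value in pairs:
        # Each key accumulates its values in the order they appear.
        merged[key].append(value)
    return dict(merged)


# Illustrative data: only the complete pairs visible in the excerpt.
sample = [
    ('hadoop', 'hadoop1'), ('hadoop', 'hadoop2'),
    ('flume', 'flume1'), ('flume', 'flume2'),
    ('hadoop', 'hadoop3'),
]

print(merge_by_key(sample))
# {'hadoop': ['hadoop1', 'hadoop2', 'hadoop3'], 'flume': ['flume1', 'flume2']}
```

A `defaultdict(list)` avoids the explicit "is this key already present?" check that a plain dict would need.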

posted @ 2021-06-03 17:49 欣欣姐

May 27, 2021

Common Hive Operations

1. Creating a partitioned table in Hive:
create external table bmal.wall_log_url ( log_time string, log_key string, url_detail string, url_briefly string, url_action string, time_ ...

posted @ 2021-05-27 14:21 欣欣姐

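Scheduled Import from HDFS into a Hive Partitioned Table

Process: the code is written as a shell script and run on a schedule with crontab.
1. First load each day's data into a temporary table, mal.wall_log_url_tmp; this can be an internal (managed) table.
2. Then load the data from the temporary table into the target table, mal.wall_log_url.
#!/bin/sh
# upload logs to hd ...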

posted @ 2021-05-27 11:48 欣欣姐

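May 25, 2021

Scheduled Upload of Local Linux Log Files to HDFS

Background: the project needs to upload local files to HDFS on a schedule and store them in directories organized by time. On the 1st of each month a directory for that month is created, and each day's data for the month is stored under it.
Implementation logic: check the current date. If it is the 1st of the month, first create the month's directory and then store the data in it; otherwise, put the data directly into that directory.
export PA ...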
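The date check described above can be sketched as follows. This is only an illustration in Python rather than the post's shell script: the function name `build_upload_commands`, the `%Y%m` directory naming, and the example paths are all assumptions. The function only builds the `hdfs dfs -mkdir` and `hdfs dfs -put` argument lists, so it can be inspected without a Hadoop installation.

```python
from datetime import date


def build_upload_commands(today, local_file, hdfs_root):
    """Return the hdfs commands for one day's upload, as argument lists."""
    # Assumed layout: one directory per month, e.g. <hdfs_root>/202105
    month_dir = f"{hdfs_root}/{today:%Y%m}"
    commands = []
    if today.day == 1:
        # First day of the month: create that month's directory first.
        commands.append(["hdfs", "dfs", "-mkdir", "-p", month_dir])
    # Every day: put the local file into the month's directory.
    commands.append(["hdfs", "dfs", "-put", local_file, month_dir])
    return commands


# Hypothetical paths, for illustration only.
for cmd in build_upload_commands(date(2021, 5, 1), "/tmp/example.log", "/logs"):
    print(" ".join(cmd))
```

To actually run the commands, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where `hdfs` is on the `PATH`.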

posted @ 2021-05-25 18:17 欣欣姐

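Fixing the "command not found" Error When a Shell Script Runs hadoop Commands

Background: local files need to be uploaded to HDFS on a schedule. To make this easier, I wrote a shell script to do the scheduled upload. The code is as follows, in a file named mkdir_file.sh:
export PATH =/opt/soft/hadoop-2.7.7/bin
DAY=`date +%d`
if [ $DAY -eq 1 ]
then
hdfs ...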

posted @ 2021-05-25 18:06 欣欣姐

May 24, 2021

Appending a Timestamp Suffix When Copying Files in Linux

Use `date +%y%m%d`, for example:
mkdir log_`date +%Y%m%d`
tar cfvz /tmp/bak.`date +%y%m%d`.tar.gz /etc
cp /opt/data/wfbmall/16/wfbmall.log /opt/data/wfmall/16/histo ...
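For comparison, the same idea can be sketched in Python, where `strftime`-style format codes play the role of `date +%Y%m%d`. The function name `copy_with_date_suffix` and the file names are illustrative assumptions, and the demo runs inside a temporary directory so it touches no real paths.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def copy_with_date_suffix(src, dest_dir, now=None):
    """Copy src into dest_dir, appending a .YYYYMMDD suffix to its name."""
    now = now or datetime.now()
    src = Path(src)
    dest = Path(dest_dir) / f"{src.name}.{now:%Y%m%d}"
    # Create the destination directory if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    return dest


# Demo in a throwaway directory with a hypothetical log file.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "example.log"
    log.write_text("sample line\n")
    copied = copy_with_date_suffix(log, Path(tmp) / "history")
    print(copied.name)  # example.log.<today's date as YYYYMMDD>
```

Passing an explicit `now` makes the suffix reproducible, which is handy when backfilling copies for earlier days.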

posted @ 2021-05-24 14:50 欣欣姐
