Failed to rollback to checkpoint/savepoint hdfs. Cannot map checkpoint/savepoint state for operator to the new program, because the operator is not available in the new program. If you want to allow to skip this, you can set the --allowNonRestoredState option on the CLI
https://www.cnblogs.com/gxgd/p/12673927.html
Luckily my stats are aggregated by the hour, so I could just reset the offset back to the previous on-the-hour point, but...
ContinuousEventTimeTrigger
Respect the system!
Design notes for the Redis subid approach
Running Flink locally
Caused by: java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
https://stackoverflow.com/questions/54106187/apache-flink-java-lang-noclassdeffounderror
This usually means the Flink dependencies are declared with provided scope; when running locally in the IDE, include provided-scope dependencies in the run configuration (or drop the provided scope).
Configuring files for IDEA/Git to ignore on commit
1. Path:
IntelliJ IDEA --> Preferences --> Editor --> File Types
2. On the right you will see "ignore files and folders"; add *.iml;.idea;target; separated by semicolons (use ASCII semicolons), then click OK to save.
http://www.mamicode.com/info-detail-2884088.html
Use System.out.println with caution
It has performance problems: println is a synchronized method.
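Roughly what java.io.PrintStream.println looks like in OpenJDK 8 (paraphrased), which is why heavy use in hot paths serializes threads:
// Paraphrased from OpenJDK 8's java.io.PrintStream:
// the whole call synchronizes on the stream object.
public void println(String x) {
    synchronized (this) {
        print(x);
        newLine();
    }
}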
public interface FlatMapFunction<T, O> extends Function, Serializable {
void flatMap(T value, Collector<O> out) throws Exception;
}
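A minimal example of implementing it (a made-up word-splitting case):
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.util.Collector;

// Splits each incoming line into words and emits them one by one.
public class LineSplitter implements FlatMapFunction<String, String> {
    @Override
    public void flatMap(String value, Collector<String> out) throws Exception {
        for (String word : value.split("\\s+")) {
            out.collect(word);
        }
    }
}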
influxdb
Switch the network to guest
https://www.cnblogs.com/inaruto/p/11168588.html
git clone git://mirrors.ustc.edu.cn/homebrew-core.git/ /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core --depth=1
brew install telnet
ctrl+c to interrupt and speed things up
rz
https://www.cnblogs.com/zzhaolei/p/11068018.html
flink hbase sink with 'connector.type' = 'hbase' fails with:
Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSinkFactory...'
Missing dependency: flink-hbase_2.11:1.9.0
In Redis, a Zset is the sorted version of a Set, with a special score acting as a secondary key that determines the ordering. A Set has a command to check directly whether a value exists; a Zset does not, but you can check membership indirectly through the score (a missing member has no score).
Suppose the requirement is: determine whether a user has viewed a merchant's products today. To be clear, this is a real-time metric, the date parameter is simply today, and we do not need the product details, only True or False. With more than ten million users and over a hundred thousand merchants, Redis is a good storage choice: the key is a prefix + userId + day, and the value is a Set or Zset holding the merchant's unique identifier, such as shopId or shopOwnerId.
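A minimal sketch of this with Jedis (the key prefix and expiry are made up for illustration):
import redis.clients.jedis.Jedis;

public class ShopViewMark {
    // Record that the user viewed one of this shop's products on the given day.
    public static void markViewed(Jedis jedis, String userId, String day, String shopId) {
        String key = "shop_view:" + userId + ":" + day;   // hypothetical key prefix
        jedis.sadd(key, shopId);
        jedis.expire(key, 2 * 24 * 3600);                 // keep the key for two days, adjust as needed
    }

    // True if the user viewed any product of this shop that day.
    public static boolean hasViewed(Jedis jedis, String userId, String day, String shopId) {
        String key = "shop_view:" + userId + ":" + day;
        return jedis.sismember(key, shopId);
        // With a Zset instead: jedis.zadd(key, score, shopId) to write,
        // and jedis.zscore(key, shopId) != null to test membership.
    }
}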
influxdb
When inserting, note that apart from the single space required between the tag set and the field set, a space anywhere else causes an error such as <missing tag key>; if there is a space right after the measurement name, the error is <invalid field format>.
> insert business_monitor appVersion=1, platform=ios timeout=1
ERR: {"error":"unable to parse 'business_monitor appVersion=1, platform=ios timeout=1': invalid field format"}
> insert business_monitor, appVersion=1, platform=ios timeout=1
ERR: {"error":"unable to parse 'business_monitor, appVersion=1, platform=ios timeout=1': missing tag key"}
> insert business_monitor,appVersion=1,platform='ios' timeout=1 (success!)
Another issue is type mismatch: the project mostly uses string types, so manual inserts need double quotes around the value.
> insert business_monitor,appVersion=1,platform=ios api_timeout=1
ERR: {"error":"partial write: field type conflict: input field \"api_timeout\" on measurement \"business_monitor\" is type float, already exists as type string dropped=1"}
> insert business_monitor,appVersion=1,platform=ios api_timeout="1" (success!)
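When writing from code instead of the CLI, a sketch with the influxdb-java client (URL, credentials, database and retention policy are placeholders); passing the field as a Java String matches the double-quoted manual insert above:
import java.util.concurrent.TimeUnit;
import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Point;

public class InfluxWriteDemo {
    public static void main(String[] args) {
        InfluxDB influxDB = InfluxDBFactory.connect("http://127.0.0.1:8086", "user", "password");
        Point point = Point.measurement("business_monitor")
                .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
                .tag("appVersion", "1")
                .tag("platform", "ios")
                .addField("api_timeout", "1")   // string field, same as api_timeout="1" in the CLI
                .build();
        influxDB.write("business_monitor", "autogen", point);  // database, retention policy, point
        influxDB.close();
    }
}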
Flink fails to start with UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
The checkpoint storage path is set to HDFS, but the related dependency cannot be found. At first the error may be about hadoop.configure not being found; after adding hadoop-common it changes to 'Hadoop is not in the classpath/dependencies'.
Fix: download the Hadoop dependency from the official downloads page (https://flink.apache.org/downloads.html) and add it to the project.
When Flink writes JSON into HBase, the escape characters need to be stripped first. One line of code takes care of it:
StringEscapeUtils.unescapeJava
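Roughly like this, with the Commons Lang StringEscapeUtils (the sample escaped JSON is made up):
import org.apache.commons.lang3.StringEscapeUtils;

public class UnescapeDemo {
    public static void main(String[] args) {
        String escaped = "{\\\"orderId\\\":\\\"123\\\"}";        // escaped JSON as read from upstream
        String json = StringEscapeUtils.unescapeJava(escaped);   // -> {"orderId":"123"}
        System.out.println(json);
    }
}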
Flink fails to start with:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/runtime/state/StateBackend
    at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment$.getExecutionEnvironment(StreamExecutionEnvironment.scala:763)
    at test$.main(test.scala:11)
    at test.main(test.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.runtime.state.StateBackend
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
When all the jars are imported, this is almost always a dependency conflict. Pay special attention to the Flink version number and do not pull in multiple Flink versions, e.g. flink.version 1.9.0 and flink.version 1.9.1 at the same time.
Set an expiration time (TTL) on HBase data
desc 'BUSINESS_TABLE'
disable "BUSINESS_TABLE"
alter "BUSINESS_TABLE",NAME=>'cf',TTL=>'2592000'
alter 'BUSINESS_TABLE', NAME => 'cf', COMPRESSION => 'SNAPPY'
enable "BUSINESS_TABLE"
Common HBase commands
list
count 'BUSINESS_MONITOR_TABLE'
scan 'BUSINESS_TABLE',{LIMIT=>10}
scan 'BUSINESS_TABLE',{LIMIT=>10, STARTROW=>'api_15862663102990', ENDROW=>'api_15862663102999'}
truncate 'BUSINESS_TABLE'
scan 'BUSINESS_TABLE',{LIMIT=>5,FILTER=>"PrefixFilter('api_15859')"} // prefix match; not recommended, prefer STARTROW range queries
get 'BUSINESS_TABLE','api_1586266310299011'
create 'BUSINESS_TABLE', 'cf' // create the table
// set TTL and compression
desc 'BUSINESS_TABLE'
disable "BUSINESS_TABLE"
alter "BUSINESS_TABLE",NAME=>'cf',TTL=>'1296000'
alter 'BUSINESS_TABLE', NAME => 'cf', COMPRESSION => 'SNAPPY'
enable "BUSINESS_TABLE"
Creating a secondary index with Phoenix
1. Create a table with the same name in Phoenix, e.g. create table "sentinel"("ROW" varchar primary key,"record"."app" VARCHAR ,"record"."blockQps" VARCHAR, "record"."count" VARCHAR, "record"."exceptionQps" VARCHAR, "record"."gmtCreate" VARCHAR, "record"."gmtModified" VARCHAR, "record"."passQps" VARCHAR, "record"."resource" VARCHAR, "record"."rt" VARCHAR, "record"."successQps" VARCHAR, "record"."`timestamp`" VARCHAR );
2. Run a SELECT to check whether the data comes through from HBase; if rows come back, the mapping to the HBase table worked.
3. Create the index, e.g. CREATE INDEX "indx_sentinel_time_app_res" ON "sentinel"("record"."`timestamp`","record"."app","record"."resource");
A few things to note:
1. timestamp is a keyword, so it must be wrapped in backticks.
2. Identifiers are upper-cased by default; if you need lower case, wrap them in double quotes "".
3. Use EXPLAIN to check whether the index actually takes effect; for example, the column order also affects whether the index is used.
Common InfluxDB commands; they are basically the same as standard SQL. For installation and startup, see the Alibaba docs.
use business_monitor
drop measurement business_monitor // drops the measurement named 'business_monitor'; requires admin privileges, normally use delete instead
delete from business_monitor
select count(*) from business_monitor // only counts fields; tags, which can only be used as where conditions, are not counted
time plus the tag keys make a point unique; if two points share the same time and tags, only one of them is stored
Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
The fix:
Append the following to the database connection URL: ?useUnicode=true&characterEncoding=utf-8&useSSL=false
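For example, with a plain JDBC connection (host, database and credentials are placeholders):
import java.sql.Connection;
import java.sql.DriverManager;

public class MysqlConnectDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://127.0.0.1:3306/business_db"
                + "?useUnicode=true&characterEncoding=utf-8&useSSL=false";   // disables SSL, silences the warning
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            System.out.println("connected: " + !conn.isClosed());
        }
    }
}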
HBase reports the following error; it may be caused by HBase performance problems:
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException:
Flink: take a savepoint and start a job from an earlier savepoint:
./bin/flink savepoint 5ccf69bfa3a64cc1fdfe0d71e99fe956 hdfs://x.x.x.x:8020/business/flink/flink19-checkpoints/dapanonflink19/unshipped-orders/5ccf69bfa3a64cc1fdfe0d71e99fe956/
./bin/flink run -s hdfs://x.x.x.x:8020/business/flink/flink19-checkpoints/dapanonflink19/unshipped-orders/5ccf69bfa3a64cc1fdfe0d71e99fe956/savepoint-5ccf69-b0ab40a70138 -c com.xxxx.UnshippedOrder /app/flink-1.9.1/bak/realtime-shop-unshipped-order-1.0-SNAPSHOT.jar
A lot of commands on the Mac need sudo, so in a fit of frustration I chmod 777'd the files I touch most often, and then the JMS I had configured before logging in broke with this error:
It is required that your private key files are NOT accessible by others
Fix:
chmod 600 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub
Can't be too open~
Two classes sit in the same module of the same project yet could not reference each other: the import could not be resolved, and re-importing with mvn did not help. It felt like it shouldn't matter, since this is not a dependency on another jar. Then I remembered I had migrated a common module in this project before, so there was probably a stale cache; clearing it fixed the problem: File -> Invalidate Caches/Restart.
Runs fine locally, but fails after packaging and submitting to the Flink cluster:
Flink real-time data warehouse
Multiple left joins can inflate the row count, so watch out for duplicates; for real-time deduplication I currently rely on Flink State (a sketch follows).
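A minimal sketch of that kind of State-based dedup (event and key types are placeholders; key the stream by the dedup key first):
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Use after keyBy(...) on the dedup key: only the first record per key is emitted.
public class DedupFunction extends RichFlatMapFunction<String, String> {

    private transient ValueState<Boolean> seen;

    @Override
    public void open(Configuration parameters) {
        ValueStateDescriptor<Boolean> descriptor =
                new ValueStateDescriptor<>("seen", Boolean.class);
        seen = getRuntimeContext().getState(descriptor);
    }

    @Override
    public void flatMap(String value, Collector<String> out) throws Exception {
        if (seen.value() == null) {   // first time this key shows up
            seen.update(true);
            out.collect(value);
        }
        // duplicates are dropped; consider a state TTL so the state doesn't grow forever
    }
}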