ELK log collection: a deep pitfall when rsyslog tails files through symlinks
In production we use rsyslog to tail local files and collect Redis and memcached slow logs, pushing them to an ELK cluster for analysis. These logs roll to a new file each day, and every night at 00:05 a cron job repoints a fixed symlink at the new day's file. Recently we noticed that a lot of log entries were going missing, and the investigation turned up a deep pitfall. First, the current configuration:
The rsyslog configuration tails the fixed (symlinked) paths:
```
local6.* @r-log.cms.sina.cn:1514

main_queue(
    queue.workerthreads="10"      # threads to work on the queue
    queue.dequeueBatchSize="1000" # max number of messages to process at once
    queue.size="50000"            # max queue size
)

module(load="imfile")
module(load="omkafka")
$PreserveFQDN on

############################ php curl log #####################################
input(
    type="imfile"
    File="/data1/ms/log/php_common/php_slow_log"
    Tag="tag1"
    Facility="local5"
    freshStartTail="on"
    deleteStateOnFileDelete="off"
    reopenOnTruncate="on"  ## after rotation/truncation rsyslog stops reading the file; this option makes it reopen
)
local5.* @r-log.cms.sina.cn:1515

########################## redis log ########################################
$template redislog7215,"%$myhostname%`redis7215`%msg%"
ruleset(name="redislog7215") {
    action(broker=["10.13.88.190:9092","10.13.88.191:9092","10.13.88.192:9092","10.13.88.193:9092"]
           type="omkafka" topic="redis-log" template="redislog7215" partitions.auto="on")
}
input(type="imfile" File="/data1/ms/log/front/redis7215.log" Tag=""
      ruleset="redislog7215" freshStartTail="on" reopenOnTruncate="on")

$template redislog7216,"%$myhostname%`redis7216`%msg%"
ruleset(name="redislog7216") {
    action(broker=["10.13.88.190:9092","10.13.88.191:9092","10.13.88.192:9092","10.13.88.193:9092"]
           type="omkafka" topic="redis-log" template="redislog7216" partitions.auto="on")
}
input(type="imfile" File="/data1/ms/log/front/redis7216.log" Tag=""
      ruleset="redislog7216" freshStartTail="on" reopenOnTruncate="on")

$template redislog7242,"%$myhostname%`redis7242`%msg%"
ruleset(name="redislog7242") {
    action(broker=["10.13.88.190:9092","10.13.88.191:9092","10.13.88.192:9092","10.13.88.193:9092"]
           type="omkafka" topic="redis-log" template="redislog7242" partitions.auto="on")
}
input(type="imfile" File="/data1/ms/log/front/redis7242.log" Tag=""
      ruleset="redislog7242" freshStartTail="on" reopenOnTruncate="on")

$template redislog7243,"%$myhostname%`redis7243`%msg%"
ruleset(name="redislog7243") {
    action(broker=["10.13.88.190:9092","10.13.88.191:9092","10.13.88.192:9092","10.13.88.193:9092"]
           type="omkafka" topic="redis-log" template="redislog7243" partitions.auto="on")
}
input(type="imfile" File="/data1/ms/log/front/redis7243.log" Tag=""
      ruleset="redislog7243" freshStartTail="on" reopenOnTruncate="on")

#################### mc .log ###############################
$template mc40016,"%$myhostname%`mc40016`%msg%"
ruleset(name="mc40016") {
    action(broker=["10.13.88.190:9092","10.13.88.191:9092","10.13.88.192:9092","10.13.88.193:9092"]
           type="omkafka" topic="cms-front-mc" template="mc40016" partitions.auto="on")
}
input(type="imfile" File="/data1/ms/log/front/memcached_get_err_40016.log" Tag=""
      ruleset="mc40016" freshStartTail="on" reopenOnTruncate="on")

$template mc40023,"%$myhostname%`mc40023`%msg%"
ruleset(name="mc40023") {
    action(broker=["10.13.88.190:9092","10.13.88.191:9092","10.13.88.192:9092","10.13.88.193:9092"]
           type="omkafka" topic="cms-front-mc" template="mc40023" partitions.auto="on")
}
input(type="imfile" File="/data1/ms/log/front/memcached_get_err_40023.log" Tag=""
      ruleset="mc40023" freshStartTail="on" reopenOnTruncate="on")
```
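As an aside, newer rsyslog versions can sidestep the daily symlink entirely: imfile accepts wildcards in `File` (supported since roughly v8.25 in inotify mode), so each day's file is picked up as soon as it appears. A sketch for one instance, reusing the ruleset names above; verify wildcard support in your installed version before relying on this:

```
input(type="imfile" File="/data1/ms/log/front/redis7215.*.log" Tag=""
      ruleset="redislog7215" freshStartTail="on" reopenOnTruncate="on")
```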
At 00:05 every night a cron job switches the symlinks; the crontab entry and script are as follows:
```
5 0 * * * root sh /usr/local/script/elkslowlog.sh > /dev/null 2>&1
```
```bash
#!/bin/bash
DATE=`date +%F`
DATE2=`date +%Y%m%d`
ln -sf /data1/ms/log/php_common/curl-$DATE /data1/ms/log/php_common/php_slow_log
ln -sf /data1/ms/log/front/redis7215.$DATE2.log /data1/ms/log/front/redis7215.log
ln -sf /data1/ms/log/front/redis7216.$DATE2.log /data1/ms/log/front/redis7216.log
ln -sf /data1/ms/log/front/redis7242.$DATE2.log /data1/ms/log/front/redis7242.log
ln -sf /data1/ms/log/front/redis7243.$DATE2.log /data1/ms/log/front/redis7243.log
ln -sf /data1/ms/log/front/memcached_get_err_40023.$DATE2.log /data1/ms/log/front/memcached_get_err_40023.log
ln -sf /data1/ms/log/front/memcached_get_err_40016.$DATE2.log /data1/ms/log/front/memcached_get_err_40016.log
```
Problem analysis: at first glance nothing looks wrong. But the same collection method works fine for the PHP curl php_slow_log, while the memcached and Redis logs that lose entries share one trait: low volume, sometimes only one entry per half hour. Thinking about that difference led to the answer. The busy curl log gets its first write almost immediately after midnight, so its file already exists when the cron job runs at 00:05; the quiet Redis and mc logs often have no entries yet at that moment. The symlink is therefore created pointing at a file that does not exist yet, i.e. a dangling link. When the day's log file is later actually created, rsyslog never detects any change behind the symlink, so the entire day's log is lost. A quick test confirmed exactly this behavior (details omitted).
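The failure mode can be reproduced without rsyslog at all. A minimal sketch (the temp-dir paths and file names are illustrative):

```shell
#!/bin/sh
dir=$(mktemp -d)

# Create the symlink before its target exists, as the 00:05 cron job
# does when the day's log has not been written yet.
ln -sf "$dir/redis.20240101.log" "$dir/redis.log"

[ -L "$dir/redis.log" ] && echo "symlink exists"
[ -e "$dir/redis.log" ] || echo "but it dangles: the target is missing"

# The first slow-log entry arrives later and creates the target file;
# only now does the same symlink resolve.
echo "slow query" > "$dir/redis.20240101.log"
[ -e "$dir/redis.log" ] && echo "the same symlink now resolves"

rm -rf "$dir"
```

A tail process that tried and failed to open the dangling link is at the mercy of its retry logic; in the setup above, rsyslog's imfile evidently never came back to the file.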
Solution: modify the script so that, before creating each symlink, it checks whether today's log file exists; if not, create it by writing an empty line with echo (the blank line does not affect collection), and only then switch the symlink. The revised script:
```bash
#!/bin/bash
DATE=`date +%F`
DATE2=`date +%Y%m%d`
echo >> /data1/ms/log/php_common/curl-$DATE && ln -sf /data1/ms/log/php_common/curl-$DATE /data1/ms/log/php_common/php_slow_log
echo >> /data1/ms/log/front/redis7215.$DATE2.log && ln -sf /data1/ms/log/front/redis7215.$DATE2.log /data1/ms/log/front/redis7215.log
echo >> /data1/ms/log/front/redis7216.$DATE2.log && ln -sf /data1/ms/log/front/redis7216.$DATE2.log /data1/ms/log/front/redis7216.log
echo >> /data1/ms/log/front/redis7242.$DATE2.log && ln -sf /data1/ms/log/front/redis7242.$DATE2.log /data1/ms/log/front/redis7242.log
echo >> /data1/ms/log/front/redis7243.$DATE2.log && ln -sf /data1/ms/log/front/redis7243.$DATE2.log /data1/ms/log/front/redis7243.log
echo >> /data1/ms/log/front/memcached_get_err_40023.$DATE2.log && ln -sf /data1/ms/log/front/memcached_get_err_40023.$DATE2.log /data1/ms/log/front/memcached_get_err_40023.log
echo >> /data1/ms/log/front/memcached_get_err_40016.$DATE2.log && ln -sf /data1/ms/log/front/memcached_get_err_40016.$DATE2.log /data1/ms/log/front/memcached_get_err_40016.log
```
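The repeated echo-then-link pairs could also be factored into one helper, which only creates the file when it is actually missing (so an existing log is not touched). A sketch; `ensure_and_link` and `LOGDIR` are illustrative names, not part of the original script:

```shell
#!/bin/bash
# Create the day's file first (if absent), then swap the symlink,
# so the link can never dangle.
ensure_and_link() {
    target="$1" link="$2"
    [ -e "$target" ] || echo > "$target"   # materialize today's file first
    ln -sf "$target" "$link"               # now the symlink always resolves
}

DATE=$(date +%F)
DATE2=$(date +%Y%m%d)
LOGDIR=/data1/ms/log

ensure_and_link "$LOGDIR/php_common/curl-$DATE" "$LOGDIR/php_common/php_slow_log"
for inst in redis7215 redis7216 redis7242 redis7243 \
            memcached_get_err_40016 memcached_get_err_40023; do
    ensure_and_link "$LOGDIR/front/$inst.$DATE2.log" "$LOGDIR/front/$inst.log"
done
```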
With that, the problem was fully resolved.