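Resolving CorruptSSTableException: java.io.EOFException caused by an unexpected power loss
Cause of the problem
After the server restarted, Cassandra's data was corrupted and the whole cluster became unavailable. The Cassandra version in use is 2.1.9.
Problem description
Running the startup command produces the following error: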
DEBUG 07:51:03 All segments have been unmapped successfully
INFO 07:51:03 Opening ./../data/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/system-size_estimates-ka-7382 (1293711 bytes)
ERROR 07:51:03 Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
Caused by: java.io.EOFException: null
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.8.0_45]
    at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.8.0_45]
    at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.8.0_45]
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[apache-cassandra-2.1.9.jar:2.1.9]
    ... 14 common frames omitted
DEBUG 07:51:03 INDEX LOAD TIME for ./../data/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/system-size_estimates-ka-7382: 0 ms.
DEBUG 07:51:03 Load metadata for ./../data/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/system-size_estimates-ka-7381
INFO 07:51:03 Opening ./../data/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/system-size_estimates-ka-7381 (1288730 bytes)
DEBUG 07:51:03 INDEX LOAD TIME for ./../data/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/system-size_estimates-ka-7381: 0 ms.
Solution
1. On a healthy node (the node must be running), run:
./nodetool ring | grep 192.168.66.149 | awk '{print $NF ","}' | xargs
Note: set initial_token (a cassandra.yaml setting) to the value returned by the previous step.
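The token extraction and the initial_token edit can be sketched as two small shell helpers. This is a minimal sketch under assumptions, not the original author's script: the function names tokens_for_node and set_initial_token are hypothetical, 192.168.66.149 is presumed to be the address of the damaged node, and the path to cassandra.yaml depends on your installation. Unlike the one-liner above, the first helper matches the address exactly and joins the tokens without a trailing comma.

```shell
# Hypothetical helper: read `nodetool ring` output on stdin and print the
# tokens owned by the node with address $1 as one comma-separated line.
tokens_for_node() {
    awk -v ip="$1" '$1 == ip { print $NF }' | paste -sd, -
}

# Hypothetical helper: write "initial_token: <tokens>" into the yaml file $1.
# The first existing initial_token line (commented out or not) is replaced;
# if there is none, the setting is appended. The untouched file is kept
# as "$1.orig" so the change can be reverted later.
set_initial_token() {
    file="$1"
    tokens="$2"
    cp -- "$file" "$file.orig" || return 1
    awk -v t="$tokens" '
        /^#?[ \t]*initial_token:/ && !seen { print "initial_token: " t; seen = 1; next }
        { print }
        END { if (!seen) print "initial_token: " t }
    ' "$file.orig" > "$file"
}

# Example usage (paths and address are assumptions):
#   tokens=$(./nodetool ring | tokens_for_node 192.168.66.149)
#   set_initial_token /path/to/conf/cassandra.yaml "$tokens"
```

Because the helper keeps a .orig copy next to the edited file, reverting the configuration later amounts to moving that copy back into place.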
3. Delete the system directory under the data directory
For example: /usr/local/cassandra2/apache-cassandra-2.1.9/data/data/system
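Deleting the directory cannot be undone. A more cautious variant, which is a substitution and not what the steps above describe, is to move the directory to a location outside Cassandra's data path so it can be inspected or restored later. The sketch below assumes Cassandra is stopped on this node; the function name set_aside_dir and the backup location are hypothetical.

```shell
# Hypothetical helper: move directory $1 into the existing directory $2,
# appending a timestamp to its name, and print the new location.
set_aside_dir() {
    src="${1%/}"
    dest_parent="${2%/}"
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
    [ -d "$dest_parent" ] || { echo "not a directory: $dest_parent" >&2; return 1; }
    dest="$dest_parent/$(basename "$src").$(date +%Y%m%d%H%M%S)"
    mv -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Example usage (the backup directory is an assumption and must already exist):
#   set_aside_dir /usr/local/cassandra2/apache-cassandra-2.1.9/data/data/system /var/backups/cassandra
```

Once the node has rejoined the cluster and been repaired, the set-aside copy can be removed.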
4. Start Cassandra
./cassandra
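Errors may be reported during startup, but Cassandra will carry on and rebuild the system keyspace. As long as the node starts successfully and joins the cluster, this is normal.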
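One way to confirm that the node has joined the cluster is to look for it in the output of nodetool status, where the state code UN means Up and Normal. The helper below is a rough sketch; the function name node_is_up_normal is hypothetical, and 192.168.66.149 is presumed to be the address of the restarted node.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# (exit status 0) only if the node with address $1 is listed in state UN.
node_is_up_normal() {
    awk -v ip="$1" '$1 == "UN" && $2 == ip { found = 1 } END { exit !found }'
}

# Example usage (address is an assumption):
#   ./nodetool status | node_is_up_normal 192.168.66.149 && echo "node has joined"
```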
5. Repair the data
Run nodetool:
nodetool repair
6. Restore the configuration to its original state and restart
This solution was adapted from: /usr/local/cassandra2/apache-cassandra-2.1.9/data/data/system