JVM-SandBox in Practice: Bug Fixing / Code Call Chain / Fault Injection

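JVM-SandBox Introduction

  • JVM-SandBox (the sandbox) is a non-intrusive runtime AOP solution for the JVM.

  • JVM-SandBox is an AOP framework that weaves classes dynamically on top of Instrumentation. It can enhance or replace target methods of a target application while the application is running, without a restart.

  • JVM-SANDBOX GitHub repository: project and design overview, download and installation, and so on.

  • Sandbox event overview

  • Reference article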

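Sandbox Features

  • Non-intrusive: the target application needs no restart and does not need to be aware of the sandbox.
  • Class isolation: the sandbox and its modules do not interfere with the classes of the target application.
  • Pluggable: the sandbox and its modules can be loaded and unloaded at any time without leaving traces in the target application.
  • Multi-tenant: a target application can have sandboxes of different tenants attached at the same time, each controlled independently.
  • High compatibility: supports JDK 6 through 11.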

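Common Sandbox Use Cases

  • Locating faults in production
  • Flow control for production systems
  • Simulating faults in production
  • Recording method requests and replaying results
  • Dynamic log printing
  • Monitoring and masking of security-sensitive information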

Environment Setup

# Download the latest JVM-SANDBOX release
wget http://ompc.oss-cn-hangzhou.aliyuncs.com/jvm-sandbox/release/sandbox-stable-bin.zip

# Unpack it
unzip sandbox-stable-bin.zip

# Work around problems running sandbox: use a sandbox shell function in place of sandbox.sh
sandbox ()
{
    cd /root/jvm-sandbox/sandbox/bin/ || return;
    ./sandbox.sh "$@";
    cd "$OLDPWD" || return
}

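There are two ways for sandbox to instrument a target process:

# Option 1: attach
# Instrument a target process that is already running
sandbox -p <target process pid>

# Option 2: agent
# Pass this option when starting the target process so that it is instrumented at startup
-javaagent:/root/jvm-sandbox/sandbox/lib/sandbox-agent.jar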

Scenario 1: Bug Fix

Instrumenting the target process

The problem the process currently has:

image

Start the target process:

[root@localhost Clock]# pwd
/root/jvm-sandbox/jvm-sandbox-master/Clock

[root@localhost Clock]# mvn exec:java -Dexec.mainClass="com.taobao.demo.Clock"
[INFO] Scanning for projects...
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO]   ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO] 
[INFO] -------------------< com.alibaba.jvm.sandbox:Clock >--------------------
[INFO] Building Clock 1.3.1
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ Clock ---
java.lang.IllegalStateException: STATE ERROR!
	at com.taobao.demo.Clock.checkState(Clock.java:13)
	at com.taobao.demo.Clock.report(Clock.java:31)
	at com.taobao.demo.Clock.loopReport(Clock.java:41)
	at com.taobao.demo.Clock.main(Clock.java:50)
	at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:254)
	at java.lang.Thread.run(Thread.java:748)
java.lang.IllegalStateException: STATE ERROR!
	at com.taobao.demo.Clock.checkState(Clock.java:13)
	at com.taobao.demo.Clock.report(Clock.java:31)
	at com.taobao.demo.Clock.loopReport(Clock.java:41)
	at com.taobao.demo.Clock.main(Clock.java:50)
	at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:254)
	at java.lang.Thread.run(Thread.java:748)
java.lang.IllegalStateException: STATE ERROR!
	at com.taobao.demo.Clock.checkState(Clock.java:13)
	at com.taobao.demo.Clock.report(Clock.java:31)
	at com.taobao.demo.Clock.loopReport(Clock.java:41)
	at com.taobao.demo.Clock.main(Clock.java:50)
	at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:254)
	at java.lang.Thread.run(Thread.java:748)
java.lang.IllegalStateException: STATE ERROR!
	at com.taobao.demo.Clock.checkState(Clock.java:13)
	......

Instrument the target process:

# Find the pid of the target process
[root@localhost ~]# ps -ef|grep java
root        3341    2001 30 22:08 pts/0    00:00:44 /opt/jdk1.8.0_171/bin/java -classpath /usr/local/maven/apache-maven-3.5.4/boot/plexus-classworlds-2.5.2.jar -Dclassworlds.conf=/usr/local/maven/apache-maven-3.5.4/bin/m2.conf -Dmaven.home=/usr/local/maven/apache-maven-3.5.4 -Dlibrary.jansi.path=/usr/local/maven/apache-maven-3.5.4/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/root/jvm-sandbox/jvm-sandbox-master/Clock org.codehaus.plexus.classworlds.launcher.Launcher exec:java -Dexec.mainClass=com.taobao.demo.Clock
root        3426    3380  0 22:10 pts/1    00:00:00 grep --color=auto java

# Instrument the target process with jvm-sandbox
[root@localhost ~]# sandbox -p 3341
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 43339
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

Common jvm-sandbox commands

# Option reference
[root@localhost target]# sandbox -m
./sandbox.sh: option requires an argument -- m

usage: ./sandbox.sh [h] [<p:> [vlRFfu:a:A:d:m:I:P:C:X]]

    -h : help
         Prints the ./sandbox.sh help

    -X : debug
         Prints debug message

    -p : PID
         Select target JVM process ID
....

# Attach to the target application
[root@localhost ~]# sandbox -p 3341
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH  # attach mode
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 43339  # random port of the jvm-sandbox server
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..  # home directory
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module  # directory of system-level modules
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;  # directories of user-level modules
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

# List the loaded modules
[root@localhost bin]# sandbox -p 3341 -l
sandbox-info        	ACTIVE  	LOADED  	0    	0    	0.0.4          	luanjia@taobao.com
sandbox-module-mgr  	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
sandbox-control     	ACTIVE  	LOADED  	0    	0    	0.0.3          	luanjia@taobao.com
total=3

# Detach (shut down the sandbox)
[root@localhost bin]# sandbox -p 3341 -S
jvm-sandbox[default] shutdown finished.
# Attach again
[root@localhost bin]# sandbox -p 3341
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 46203  # the random port has changed
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

# Restart the sandbox
[root@localhost bin]# sandbox -p 3341 -R
[root@localhost bin]# sandbox -p 3341
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 46203  # the random port is unchanged
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

# Use a fixed port
[root@localhost bin]# sandbox -p 3341 -S
jvm-sandbox[default] shutdown finished.
[root@localhost bin]# sandbox -p 3341 -P 8001
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 8001  # the specified port
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

Writing the module

Create a Java project named clock-tinker and point its parent at sandbox-module-starter to simplify the configuration:

<parent>
    <groupId>com.alibaba.jvm.sandbox</groupId>
    <artifactId>sandbox-module-starter</artifactId>
    <version>1.2.0</version>
</parent> 

Write the fix code:

image

Package the module and place it in the user-level module directory:

# Package the finished module
[root@localhost clock-tinker]# pwd
/root/jvm-sandbox/jvm-sandbox-master/clock-tinker
[root@localhost clock-tinker]# mvn clean package
...
[root@localhost clock-tinker]# cd target/
[root@localhost target]# ls
apidocs      clock-tinker-1.2.0.jar                        clock-tinker-1.2.0-sources.jar  maven-archiver
archive-tmp  clock-tinker-1.2.0-jar-with-dependencies.jar  generated-sources               maven-status
classes      clock-tinker-1.2.0-javadoc.jar                javadoc-bundle-options

# Place the module in the user-level module directory
[root@localhost target]# mkdir /root/.sandbox-module
[root@localhost target]# cp clock-tinker-1.2.0-jar-with-dependencies.jar ~/.sandbox-module/

# Re-attach to the target process
[root@localhost target]# sandbox -p 4384 -S
jvm-sandbox[default] shutdown finished.
[root@localhost target]# sandbox -p 4384 -P 8001
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 8001
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

# Check whether the custom module was loaded successfully
[root@localhost target]# sandbox -p 4384 -l
sandbox-info        	ACTIVE  	LOADED  	0    	0    	0.0.4          	luanjia@taobao.com
broken-clock-tinker 	ACTIVE  	LOADED  	0    	0    	UNKNOW_VERSION 	UNKNOW_AUTHOR  # the custom module
sandbox-module-mgr  	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
sandbox-control     	ACTIVE  	LOADED  	0    	0    	0.0.3          	luanjia@taobao.com
total=4

# Repair the problem in the target process: -d <module name>/<command name>
[root@localhost target]# sandbox -p 4384 -d broken-clock-tinker/repairCheckState

image


Scenario 2: DEBUG TRACE


Code call chain

Module code (official example):

image

Start the target process:

[root@localhost Clock]# pwd
/root/jvm-sandbox/jvm-sandbox-master/Clock
[root@localhost Clock]# mvn exec:java -Dexec.mainClass="com.taobao.demo.Clock"

Place the debug module that ships with jvm-sandbox in the user-level module directory:

[root@localhost example]# pwd
/root/jvm-sandbox/sandbox/example
[root@localhost example]# ls
sandbox-debug-module.jar
[root@localhost example]# cp sandbox-debug-module.jar ~/.sandbox-module/

Attach to the target process:

# Attach to the target process
[root@localhost example]# sandbox -p 5816
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 39181
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

# Check that the required module is loaded: debug-trace
[root@localhost example]# sandbox -p 5816 -l
debug-ralph         	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
debug-exception-logger	ACTIVE  	LOADED  	1    	5    	0.0.2          	luanjia@taobao.com
sandbox-info        	ACTIVE  	LOADED  	0    	0    	0.0.4          	luanjia@taobao.com
broken-clock-tinker 	ACTIVE  	LOADED  	0    	0    	UNKNOW_VERSION 	UNKNOW_AUTHOR
sandbox-module-mgr  	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
debug-trace         	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
debug-lifecycle     	ACTIVE  	LOADED  	0    	0    	0.0.1          	luanjia@taobao.com
sandbox-control     	ACTIVE  	LOADED  	0    	0    	0.0.3          	luanjia@taobao.com
debug-watch         	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
debug-servlet-access	ACTIVE  	LOADED  	0    	0    	0.0.2          	luanjia@taobao.com
total=10

Print the call chain of a given class and method:

# -d '<module name>/<command name>?class=<target class>&method=<target method>'
[root@localhost example]# sandbox -p 5816 -d 'debug-trace/trace?class=com.taobao.demo.Clock&method=report'
[##################################################]FINISH(cCnt=1,mCnt=1)
tracing on [com.taobao.demo.Clock#report].
Press CTRL_C abort it!
`---+Tracing for : com.taobao.demo.Clock.report by com.taobao.demo.Clock.main()
    `---+[49,49ms]Enter : com.taobao.demo.Clock.report(...);
        +---[31,6ms]com.taobao.demo.Clock:checkState(@31)[throw java.lang.IllegalStateException]
        `---[49,0ms]throw:java.lang.IllegalStateException()

`---+Tracing for : com.taobao.demo.Clock.report by com.taobao.demo.Clock.main()
    `---+[2,2ms]Enter : com.taobao.demo.Clock.report(...);
        +---[2,1ms]com.taobao.demo.Clock:checkState(@31)[throw java.lang.IllegalStateException]
        `---[2,0ms]throw:java.lang.IllegalStateException()
......

# -d '<module name>/<command name>?class=<target class>&method=<target method>'
[root@localhost bin]# sandbox -p 5816 -d 'debug-trace/trace?class=java.io.PrintStream&method=println'
[##################################################]FINISH(cCnt=1,mCnt=10)
tracing on [java.io.PrintStream#println].
Press CTRL_C abort it!
`---+Tracing for : org.fusesource.jansi.FilterPrintStream.println by com.taobao.demo.Clock.main()
    `---+[10,10ms]Enter : org.fusesource.jansi.FilterPrintStream.println(...);
        +---[3,1ms]java.lang.String:valueOf(@238)
        +---[5,2ms]org.fusesource.jansi.FilterPrintStream:print(@240)
        `---[5,0ms]org.fusesource.jansi.FilterPrintStream:newLine(@241)

`---+Tracing for : org.fusesource.jansi.FilterPrintStream.println by com.taobao.demo.Clock.main()
    `---+[15,15ms]Enter : org.fusesource.jansi.FilterPrintStream.println(...);
        +---[1,1ms]java.lang.String:valueOf(@238)
        +---[15,14ms]org.fusesource.jansi.FilterPrintStream:print(@240)
        `---[15,0ms]org.fusesource.jansi.FilterPrintStream:newLine(@241)
......

Line numbers in the call chain

Extend the feature above so that the call chain also reports source line numbers:

image

image

        final EventWatcher watcher = new EventWatchBuilder(moduleEventWatcher)
                .onClass(cnPattern).includeSubClasses()
                .onBehavior(mnPattern)
                .onWatching()
                .withCall()
                .withLine()
......
......
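                    @Override
                    protected void beforeLine(Advice advice, int lineNum) {
                        // Print "<runtime class of the target>: @<line number>: <method name>" for each line reached.
                        // advice.getTarget() is assumed to be non-null here, i.e. the traced method is an instance method.
                        String result = advice.getTarget().getClass().getCanonicalName() + ": @"
                                + lineNum + ": "
                                + advice.getBehavior().getName();
                        printer.println(result);
                    }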

Repackage the module:

[root@localhost sandbox-debug-module]# mvn clean package
[root@localhost sandbox-debug-module]# cd target
[root@localhost target]# cp sandbox-debug-module-1.3.1-jar-with-dependencies.jar ~/.sandbox-module/

Add some code to the target process to make the demo more visible:

image

Start the target process:

[root@localhost Clock]# mvn exec:java -Dexec.mainClass="com.taobao.demo.Clock"

Instrument with sandbox:

[root@localhost sandbox-debug-module]# ps -ef|grep java
root        7460    6372 78 02:20 pts/2    00:01:00 /opt/jdk1.8.0_171/bin/java -classpath /usr/local/maven/apache-maven-3.5.4/boot/plexus-classworlds-2.5.2.jar -Dclassworlds.conf=/usr/local/maven/apache-maven-3.5.4/bin/m2.conf -Dmaven.home=/usr/local/maven/apache-maven-3.5.4 -Dlibrary.jansi.path=/usr/local/maven/apache-maven-3.5.4/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/root/jvm-sandbox/jvm-sandbox-master/Clock org.codehaus.plexus.classworlds.launcher.Launcher exec:java -Dexec.mainClass=com.taobao.demo.Clock
root        7495    6415  0 02:22 pts/3    00:00:00 grep --color=auto java

[root@localhost sandbox-debug-module]# sandbox -p 7460
                    NAMESPACE : default
                      VERSION : 1.3.3
                         MODE : ATTACH
                  SERVER_ADDR : 0.0.0.0
                  SERVER_PORT : 37937
               UNSAFE_SUPPORT : ENABLE
                 SANDBOX_HOME : /root/jvm-sandbox/sandbox/bin/..
            SYSTEM_MODULE_LIB : /root/jvm-sandbox/sandbox/bin/../module
              USER_MODULE_LIB : /root/jvm-sandbox/sandbox/sandbox-module;~/.sandbox-module;
          SYSTEM_PROVIDER_LIB : /root/jvm-sandbox/sandbox/bin/../provider
           EVENT_POOL_SUPPORT : DISABLE

[root@localhost target]# sandbox -p 7460 -d 'debug-trace/trace?class=com.taobao.demo.Clock&method=*'

image


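Scenario 3: Fault Injection (Debug Ralph)

Who cares about what:

  • Fault injection from an operations perspective: not concerned with business logic, only with system reliability (operations teams tend to prefer SystemTap, which can inject lower-level faults).
  • Fault injection from a testing perspective: concerned with business logic, so faults should be injected in a way that carries business meaning.

Fault injection can be automated:

  1. Trace every function that may throw an exception, such as file or network I/O reads and writes.
  2. Inject faults into the functions that are prone to exceptions, to verify business reliability.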
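
The two steps above can be illustrated without the sandbox. The sketch below is plain JDK code, not the JVM-SandBox mechanism: it wraps an interface in a java.lang.reflect.Proxy that throws a chosen exception from one named method, so a test can check whether the caller degrades gracefully. FaultInjector, Storage and readOrDefault are hypothetical names made up for this illustration.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

/** Hypothetical helper: injects a fault into one method of an interface. */
final class FaultInjector {

    private FaultInjector() {
    }

    /**
     * Returns a proxy that throws fault.get() when the method named
     * methodName is called and delegates every other call to target.
     */
    @SuppressWarnings("unchecked")
    static <T> T inject(Class<T> type, T target, String methodName,
                        Supplier<? extends Throwable> fault) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[]{type},
                (proxy, method, args) -> {
                    if (method.getName().equals(methodName)) {
                        throw fault.get(); // the injected fault
                    }
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the target's own exception
                    }
                });
    }

    /** Business code under test: falls back to a default on I/O failure. */
    static String readOrDefault(Storage storage, String key, String fallback) {
        try {
            return storage.read(key);
        } catch (IOException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        Storage healthy = key -> "value-of-" + key;
        Storage broken = inject(Storage.class, healthy, "read",
                () -> new IOException("injected fault"));

        System.out.println(readOrDefault(healthy, "a", "fallback"));
        System.out.println(readOrDefault(broken, "a", "fallback"));
    }
}

/** Hypothetical I/O-like dependency whose failures we want to simulate. */
interface Storage {
    String read(String key) throws IOException;
}
```

The sandbox achieves a comparable effect by rewriting bytecode at runtime, which is why it needs neither an interface nor a code change in the target application.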

Module writing: intermediate

DebugRalphModule.java: "Wreck-It Ralph", fault injection (delay, circuit breaking, concurrency limiting, TPS limiting)

image

The debug-ralph module was already loaded in Scenario 2:

image

Make the target process throw a specified exception

# -d 'debug-ralph/wreck?class=<class name>&method=<method name>&type=<exception type>'
[root@localhost bin]# sandbox -p 7460 -d 'debug-ralph/wreck?class=com.taobao.demo.Clock&method=report&type=NullPointException'

image
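
Hand-typed query strings are easy to get wrong: a missing "=" is enough to lose a parameter. As a small convenience, the sketch below assembles the -d argument in the format used by the commands in this article from a module name, a command name and ordered parameters. SandboxCommand is a hypothetical helper, not part of JVM-SandBox; it only builds the string and does not escape special characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical builder for the argument passed to "sandbox -p <pid> -d". */
final class SandboxCommand {

    private final String module;
    private final String command;
    // LinkedHashMap keeps the parameters in insertion order
    private final Map<String, String> params = new LinkedHashMap<>();

    SandboxCommand(String module, String command) {
        this.module = module;
        this.command = command;
    }

    /** Adds one name=value pair; rejects empty names and null values. */
    SandboxCommand param(String name, String value) {
        if (name == null || name.isEmpty() || value == null) {
            throw new IllegalArgumentException("parameter needs a name and a value");
        }
        params.put(name, value);
        return this;
    }

    /** Renders "module/command" or "module/command?k1=v1&k2=v2". */
    String toDebugArgument() {
        String path = module + "/" + command;
        if (params.isEmpty()) {
            return path;
        }
        StringJoiner query = new StringJoiner("&", path + "?", "");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String argument = new SandboxCommand("debug-ralph", "wreck")
                .param("class", "com.taobao.demo.Clock")
                .param("method", "report")
                .param("type", "NullPointException")
                .toDebugArgument();
        // The pid is left as a placeholder on purpose
        System.out.println("sandbox -p <pid> -d '" + argument + "'");
    }
}
```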


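Modifying method arguments

The method is as follows:

    /**
     * Change a method argument
     *
     * @param index       index of the method argument (starting from 0)
     * @param changeValue the new value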
     * @return this
     * @since {@code sandbox-api:1.0.10}
     */
    public BeforeEvent changeParameter(final int index,
                                       final Object changeValue) {
        argumentArray[index] = changeValue;
        return this;
    }

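Implementation idea: how can jvm-sandbox be combined with SonarQube?

  1. Write a module that records beforeLine events (see the example in Scenario 2 above).
  2. Save the class, method, line, file and related information to a trace log.
  3. Associate test cases with the trace records.
  4. Extract the information from the trace log and generate a custom coverage file for Sonar (see the example below).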
<coverage version="1">
  <file path="src/main/java/com/acme/basic/HelloWorld.java">
    <lineToCover lineNumber="13" covered="true"/>
    <lineToCover lineNumber="8" covered="false"/>
  </file>
  <file path="xources/hello/WithConditions.xoo">
    <lineToCover lineNumber="3" covered="true" branchesToCover="2" coveredBranches="1"/>
  </file>
</coverage>
  5. Upload the analysis results with sonar-scanner.
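
The coverage-file step can be sketched as follows. It assumes the trace log has already been parsed into (file path, line number) pairs for executed lines, and that the set of coverable lines per file is known from elsewhere. The hypothetical CoverageReport class below renders that data in the coverage format shown above. The class name, method names and input model are all assumptions made for illustration, and branch attributes are left out.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical collector that renders line hits as a coverage XML file. */
final class CoverageReport {

    // file path -> (line number -> covered?), sorted for stable output
    private final Map<String, Map<Integer, Boolean>> files = new TreeMap<>();

    /** Declares a line as coverable; it stays uncovered until it is hit. */
    void coverable(String path, int line) {
        files.computeIfAbsent(path, p -> new TreeMap<>()).putIfAbsent(line, false);
    }

    /** Records a line seen in the trace log, for example from beforeLine. */
    void hit(String path, int line) {
        files.computeIfAbsent(path, p -> new TreeMap<>()).put(line, true);
    }

    /** Renders all recorded files and lines as one XML document. */
    String toXml() {
        StringBuilder xml = new StringBuilder("<coverage version=\"1\">\n");
        for (Map.Entry<String, Map<Integer, Boolean>> file : files.entrySet()) {
            xml.append("  <file path=\"").append(escape(file.getKey())).append("\">\n");
            for (Map.Entry<Integer, Boolean> line : file.getValue().entrySet()) {
                xml.append("    <lineToCover lineNumber=\"").append(line.getKey())
                        .append("\" covered=\"").append(line.getValue()).append("\"/>\n");
            }
            xml.append("  </file>\n");
        }
        return xml.append("</coverage>\n").toString();
    }

    /** Escapes the characters that are not allowed in an XML attribute. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        CoverageReport report = new CoverageReport();
        String path = "src/main/java/com/acme/basic/HelloWorld.java";
        report.coverable(path, 8);
        report.coverable(path, 13);
        report.hit(path, 13); // line 13 appeared in the trace log
        System.out.print(report.toXml());
    }
}
```

Keeping one report per test case would preserve the link between test cases and trace records described in step 3.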
posted @ 2022-06-25 00:50  Juno3550