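JVM attach process and underlying implementation
One of the key techniques behind RASP is the Java agent. Through an agent you obtain an implementation of the Instrumentation interface, and that inst object is what you use to modify bytecode.
A Java agent can be loaded at JVM startup with the -javaagent option, or at runtime by attaching to the target process and naming the jar to load. In the runtime case, execution enters the agentmain method defined in that jar and runs its logic.
The source code analyzed below comes from OpenJDK 8; other versions may be implemented differently.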
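To make the two entry points concrete, here is a minimal sketch of an agent class. The class name SketchAgent and the describe helper are hypothetical and only for illustration; a real agent jar would also declare the class through the Premain-Class and Agent-Class manifest attributes and would normally make the class public.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent class; the name SketchAgent is an assumption for illustration.
final class SketchAgent {
    // Records which entry point ran last, so the behaviour is easy to observe.
    static volatile String lastEntryPoint = "none";

    // Entry point used when the JVM is started with -javaagent:<jar>[=args].
    public static void premain(String agentArgs, Instrumentation inst) {
        lastEntryPoint = "premain";
        System.out.println(describe("premain", agentArgs, inst));
    }

    // Entry point used when the jar is loaded into a running JVM through attach.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        lastEntryPoint = "agentmain";
        System.out.println(describe("agentmain", agentArgs, inst));
    }

    // Builds a short summary; inst may be null when called outside a real agent load.
    static String describe(String entry, String agentArgs, Instrumentation inst) {
        String loaded = (inst == null)
                ? "n/a"
                : String.valueOf(inst.getAllLoadedClasses().length);
        return entry + " args=" + agentArgs + " loadedClasses=" + loaded;
    }
}
```

In a real agent, the inst argument is where calls such as addTransformer would go; the sketch only inspects it.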
Attach side
The following uses the attach API shipped with the JDK to walk through the whole attach process.
The code below attaches a jar to a specified JVM process:
// On JDK 8, com.sun.tools.attach lives in tools.jar, which must be on the classpath.
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.io.File;
import java.util.Optional;

String agentFilePath = "/Desktop/MyFirstAgent/target/MyFirstAgent-1.0-SNAPSHOT-jar-with-dependencies.jar";
String applicationName = "MyApplication";
// Iterate over all JVMs and take the first one whose display name matches our application name
Optional<String> jvmProcessOpt = VirtualMachine.list()
        .stream()
        .filter(jvm -> {
            System.out.println("jvm: " + jvm.displayName());
            return jvm.displayName().contains(applicationName);
        })
        .map(VirtualMachineDescriptor::id)
        .findFirst(); // stays empty instead of throwing when nothing matches
if (!jvmProcessOpt.isPresent()) {
    System.err.println("Target Application not found");
    return;
}
File agentFile = new File(agentFilePath);
try {
    String jvmPid = jvmProcessOpt.get();
    System.out.println("Attaching to target JVM with PID: " + jvmPid);
    VirtualMachine jvm = VirtualMachine.attach(jvmPid);
    // Load the agent jar into the target JVM, then release the connection
    jvm.loadAgent(agentFile.getAbsolutePath());
    jvm.detach();
    System.out.println("Attached to target JVM and loaded Java agent successfully");
} catch (Exception e) {
    throw new RuntimeException(e);
}
From the running Java processes, the code picks the one whose JVM display name contains the name we specified, attaches to it, and loads the given agent into that JVM.
VirtualMachine.attach(jvmPid)
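In the JDK, attach is carried out by an AttachProvider. Which provider is used depends on the platform; each one is a concrete subclass of AttachProvider whose attachVirtualMachine method takes the target pid. On macOS, BsdAttachProvider does the work.
It executes new BsdVirtualMachine. Stepping into the constructor shows the attach logic: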
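- First check that both arguments are non-null and convert the pid from a string to an integer.
- findSocketFile looks in the tmp directory for a .java_pid<pid> file. If it does not exist, the createAttachFile method is called, which creates a .attach_pid<pid> file in the working directory of the target pid ("/proc/" + pid + "/cwd/" + fn) or in the temporary directory.
- Once the .attach_pid<pid> file has been created, the native method sendQuitTo is called to send SIGQUIT to the target process.
jdk/src/solaris/native/sun/tools/attach/LinuxVirtualMachine.c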
/*
 * Class:     sun_tools_attach_LinuxVirtualMachine
 * Method:    sendQuitTo
 * Signature: (I)V
 */
JNIEXPORT void JNICALL Java_sun_tools_attach_LinuxVirtualMachine_sendQuitTo
  (JNIEnv *env, jclass cls, jint pid)
{
    if (kill((pid_t)pid, SIGQUIT)) {
        JNU_ThrowIOExceptionWithLastError(env, "kill");
    }
}
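- Then enter a loop that checks every 200 ms whether the .java_pid<pid> file exists in the tmp directory. The total wait time is set by System.getProperty("sun.tools.attach.attachTimeout").
- Once the .java_pid<pid> file exists, create a socket and connect it to that file. Both steps are native methods, socket and connect.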
jdk/src/solaris/native/sun/tools/attach/LinuxVirtualMachine.c
/*
 * Class:     sun_tools_attach_LinuxVirtualMachine
 * Method:    socket
 * Signature: ()I
 */
JNIEXPORT jint JNICALL Java_sun_tools_attach_LinuxVirtualMachine_socket
  (JNIEnv *env, jclass cls)
{
    int fd = socket(PF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) {
        JNU_ThrowIOExceptionWithLastError(env, "socket");
    }
    return (jint)fd;
}
/*
 * Class:     sun_tools_attach_LinuxVirtualMachine
 * Method:    connect
 * Signature: (ILjava/lang/String;)I
 */
JNIEXPORT void JNICALL Java_sun_tools_attach_LinuxVirtualMachine_connect
  (JNIEnv *env, jclass cls, jint fd, jstring path)
{
    jboolean isCopy;
    const char* p = GetStringPlatformChars(env, path, &isCopy);
    if (p != NULL) {
        struct sockaddr_un addr;
        int err = 0;
        addr.sun_family = AF_UNIX;
        strcpy(addr.sun_path, p);
        if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == -1) {
            err = errno;
        }
        if (isCopy) {
            JNU_ReleaseStringPlatformChars(env, path, p);
        }
        /*
         * If the connect failed then we throw the appropriate exception
         * here (can't throw it before releasing the string as can't call
         * JNI with pending exception)
         */
        if (err != 0) {
            if (err == ENOENT) {
                JNU_ThrowByName(env, "java/io/FileNotFoundException", NULL);
            } else {
                char* msg = strdup(strerror(err));
                JNU_ThrowIOException(env, msg);
                if (msg != NULL) {
                    free(msg);
                }
            }
        }
    }
}
The socket function creates a Unix domain socket (UDS) with int fd = socket(PF_UNIX, SOCK_STREAM, 0);
The connect function establishes the connection with connect(fd, (struct sockaddr*)&addr, sizeof(addr)).
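The file-based handshake described in the steps above can be sketched in plain Java. This is a simplified model, not the JDK code: the class and method names are hypothetical, the SIGQUIT step is replaced by a Runnable supplied by the caller, and only the file creation, the 200 ms polling loop, and the cleanup of the trigger file are shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical model of the attach-side handshake; not the JDK implementation.
final class AttachHandshakeSketch {
    static final long DELAY_MS = 200; // polling interval mentioned in the text

    static Path socketFile(Path tmpDir, int pid) {
        return tmpDir.resolve(".java_pid" + pid);
    }

    // Polls for the socket file; returns null if it never appears within the timeout.
    static Path waitForSocketFile(Path tmpDir, int pid, long timeoutMs) {
        long retries = timeoutMs / DELAY_MS;
        Path path = socketFile(tmpDir, pid);
        long i = 0;
        do {
            try {
                Thread.sleep(DELAY_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
            if (Files.exists(path)) {
                return path;
            }
            i++;
        } while (i <= retries);
        return null;
    }

    // signalTarget stands in for sendQuitTo(pid) in this sketch.
    static Path handshake(Path tmpDir, int pid, long timeoutMs, Runnable signalTarget) {
        Path socket = socketFile(tmpDir, pid);
        if (Files.exists(socket)) {
            return socket; // the target is already listening
        }
        Path trigger = tmpDir.resolve(".attach_pid" + pid);
        try {
            Files.createFile(trigger);
            try {
                signalTarget.run();
                return waitForSocketFile(tmpDir, pid, timeoutMs);
            } finally {
                Files.deleteIfExists(trigger); // the trigger file is only needed once
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the signal step in as a Runnable keeps the sketch runnable without a second JVM; a real client would send SIGQUIT there and then connect to the returned path.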
jvm.loadAgent(agentFile.getAbsolutePath())
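VirtualMachine declares the abstract method loadAgent; the concrete work is done by the implementation class beneath it. Specifically, it ends up calling this.execute("load", var1, var2 ? "true" : "false", var3), where var1 is the string "instrument", var2 is false, and var3 is the path of the jar.
execute is also abstract and its implementation is platform-specific. In the Bsd implementation it connects to the socket file found earlier, first writes the string "1", and then writes the arguments above into the socket.
"1" is the ATTACH_PROTOCOL_VER defined by the JVM. A comment in the read_request method in hotspot/src/os/linux/vm/attachListener_linux.cpp explains the request format: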
// The request is a sequence of strings so we first figure out the
// expected count and the maximum possible length of the request.
// The request is:
//   <ver>0<cmd>0<arg>0<arg>0<arg>0
// where <ver> is the protocol version (1), <cmd> is the command
// name ("load", "datadump", ...), and <arg> is an argument
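As an illustration of that wire format, the following sketch builds the byte sequence for a request. It assumes that each "0" in the comment is a single NUL byte and that unused argument slots are sent as empty strings; the class name AttachRequestEncoder is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder for the request layout <ver>0<cmd>0<arg>0<arg>0<arg>0.
final class AttachRequestEncoder {
    static final String PROTOCOL_VERSION = "1";
    static final int ARG_COUNT = 3; // three <arg> slots, as in the comment

    static byte[] encode(String cmd, String... args) {
        if (args.length > ARG_COUNT) {
            throw new IllegalArgumentException("at most " + ARG_COUNT + " arguments");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, PROTOCOL_VERSION);
        writeString(out, cmd);
        for (int i = 0; i < ARG_COUNT; i++) {
            // Assumption: missing arguments are padded with empty strings.
            writeString(out, i < args.length ? args[i] : "");
        }
        return out.toByteArray();
    }

    // Writes the string bytes followed by one NUL terminator.
    private static void writeString(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        out.write(0);
    }
}
```

For example, encode("load", "instrument", "false", "/tmp/agent.jar") yields the bytes of "1", "load", "instrument", "false" and "/tmp/agent.jar", each followed by a NUL byte. The jar path here is a made-up value.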
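After the request is written, the return value is read to determine whether the load succeeded. The returned var13 is a wrapper around the socket: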
if (var7 != 0) {
    String var8 = this.readErrorMessage(var13);
    var13.close();
    if (var7 == 101) {
        throw new IOException("Protocol mismatch with target VM");
    } else if (var1.equals("load")) {
        throw new AgentLoadException("Failed to load agent library");
    } else if (var8 == null) {
        throw new AttachOperationFailedException("Command failed in target VM");
    } else {
        throw new AttachOperationFailedException(var8);
    }
} else {
    return var13;
}
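The status handling above can be modelled with a small sketch. It assumes the target VM replies with an ASCII decimal status terminated by a newline before any message text; that framing is an assumption of this example rather than something shown in the listing. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reader for the reply status; the newline framing is an assumption.
final class AttachReplySketch {
    // Reads digits up to the first '\n' (or end of stream) and parses them.
    static int readStatus(InputStream in) throws IOException {
        StringBuilder digits = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            digits.append((char) b);
        }
        if (digits.length() == 0) {
            throw new IOException("Premature EOF");
        }
        try {
            return Integer.parseInt(digits.toString());
        } catch (NumberFormatException e) {
            throw new IOException("Non-numeric status: " + digits);
        }
    }

    // Mirrors the branches of the listing above as plain labels.
    static String classify(int status, String cmd) {
        if (status == 0) {
            return "ok";
        }
        if (status == 101) {
            return "protocol mismatch";
        }
        if ("load".equals(cmd)) {
            return "agent load failed";
        }
        return "command failed";
    }
}
```

Whatever remains in the stream after the status line would be the error message or the command output.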
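Target JVM side
SIGQUIT signal handling
In hotspot/src/share/vm/runtime/os.cpp, the function signal_thread_entry handles operating system signals. When SIGQUIT arrives it first runs the attach check (!DisableAttachMechanism && AttachListener::is_init_trigger()); if that check fails, it prints the stack traces instead.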
// SIGBREAK is sent by the keyboard to query the VM state
#ifndef SIGBREAK
#define SIGBREAK SIGQUIT
#endif
// sigexitnum_pd is a platform-specific special signal used for terminating the Signal thread.
static void signal_thread_entry(JavaThread* thread, TRAPS) {
  os::set_priority(thread, NearMaxPriority);
  while (true) {
    int sig;
    {
      // FIXME : Currently we have not decieded what should be the status
      //         for this java thread blocked here. Once we decide about
      //         that we should fix this.
      sig = os::signal_wait();
    }
    if (sig == os::sigexitnum_pd()) {
      // Terminate the signal thread
      return;
    }
    switch (sig) {
      case SIGBREAK: {
        // Check if the signal is a trigger to start the Attach Listener - in that
        // case don't print stack traces.
        if (!DisableAttachMechanism && AttachListener::is_init_trigger()) {
          continue;
        }
        // Print stack traces
        // Any SIGBREAK operations added here should make sure to flush
        // the output stream (e.g. tty->flush()) after output.  See 4803766.
        // Each module also prints an extra carriage return after its output.
        VM_PrintThreads op;
        VMThread::execute(&op);
        VM_PrintJNI jni_op;
        VMThread::execute(&jni_op);
        VM_FindDeadlocks op1(tty);
        VMThread::execute(&op1);
        Universe::print_heap_at_SIGBREAK();
        ...
        break;
      }
      ...
    }
  }
}
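is_init_trigger checks whether the file .attach_pid<pid> exists in the tmp directory and whether the file's owner is the same as the effective user of the current JVM process. If both hold, it calls AttachListener's init method.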
// If the file .attach_pid<pid> exists in the working directory
// or /tmp then this is the trigger to start the attach mechanism
bool AttachListener::is_init_trigger() {
  if (init_at_startup() || is_initialized()) {
    return false;               // initialized at startup or already initialized
  }
  char path[PATH_MAX + 1];
  int ret;
  struct stat st;
  snprintf(path, PATH_MAX + 1, "%s/.attach_pid%d",
           os::get_temp_directory(), os::current_process_id());
  RESTARTABLE(::stat(path, &st), ret);
  if (ret == 0) {
    // simple check to avoid starting the attach mechanism when
    // a bogus user creates the file
    if (st.st_uid == geteuid()) {
      init();
      return true;
    }
  }
  return false;
}
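The same decision can be expressed as a small Java sketch, which may be easier to read than the C++ version. It is an analogy only: the class name is hypothetical, the caller passes in the principal that plays the role of the effective user, and Files.getOwner stands in for the st_uid == geteuid() comparison.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserPrincipal;

// Hypothetical Java analogy of the trigger check; not part of HotSpot.
final class InitTriggerSketch {
    static boolean isInitTrigger(Path tmpDir, long pid, UserPrincipal self,
                                 boolean alreadyInitialized) {
        if (alreadyInitialized) {
            return false; // nothing to trigger a second time
        }
        Path trigger = tmpDir.resolve(".attach_pid" + pid);
        if (!Files.exists(trigger)) {
            return false; // a plain SIGQUIT, so the caller prints stack traces
        }
        try {
            // Only accept a trigger file owned by the expected user.
            return Files.getOwner(trigger).equals(self);
        } catch (IOException e) {
            return false;
        }
    }
}
```

The ownership test is what stops another local user from starting the attach mechanism by dropping a file into the shared tmp directory.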
hotspot/src/share/vm/services/attachListener.cpp
// Starts the Attach Listener thread
void AttachListener::init() {
  ...
  { MutexLocker mu(Threads_lock);
    JavaThread* listener_thread = new JavaThread(&attach_listener_thread_entry);
    ...
    Thread::start(listener_thread);
  }
}
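attach_listener_thread_entry is also a function in attachListener.cpp. According to its comments, it initializes the AttachListener, takes operations from a queue, and dispatches each one to a handler function based on the operation's type.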
AttachListener
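AttachListener has a different implementation on each platform. This section uses hotspot/src/os/linux/vm/attachListener_linux.cpp as the example; other platforms should follow the same approach, though details may differ.
Initialization
The AttachListener init method mentioned above leads to the initialization of LinuxAttachListener, which does essentially one thing: it creates the .java_pid<pid> socket file in the tmp directory.
At this point the attach request sent from the attach side has been answered by the JVM side, and the socket connection is set up.
After that, AttachListener::dequeue() takes out each command, which is then handled by the matching function.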
dequeue
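The dequeue method is an infinite loop that repeatedly calls accept to receive data sent over the socket. After verifying that the peer's uid and gid match its own euid and egid, it calls read_request to read the content from the socket and wrap it in an instance of the AttachOperation class.
The read_request method defines the format of the content that is sent.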
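A rough Java sketch of what the parsing step has to do is shown below. It models only the splitting and the version check, not the accept loop or the uid/gid verification, and it assumes the NUL-separated layout with one version field, one command field, and three argument fields. The class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for a NUL-separated attach request; not the HotSpot code.
final class AttachRequestParser {
    static final int EXPECTED_FIELDS = 5; // <ver>, <cmd>, and three <arg> slots

    // Returns the command name followed by its three arguments.
    static List<String> parse(byte[] request) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < request.length; i++) {
            if (request[i] == 0) {
                fields.add(new String(request, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // Every field must be NUL-terminated and the count must match.
        if (start != request.length || fields.size() != EXPECTED_FIELDS) {
            throw new IllegalArgumentException("malformed request");
        }
        if (!"1".equals(fields.get(0))) {
            throw new IllegalArgumentException("protocol mismatch");
        }
        return fields.subList(1, fields.size());
    }
}
```

Rejecting an unexpected version up front corresponds to the "Protocol mismatch with target VM" error seen on the attach side.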
Supported operations
static AttachOperationFunctionInfo funcs[] = {
  { "agentProperties",  get_agent_properties },
  { "datadump",         data_dump },
#ifndef SERVICES_KERNEL
  { "dumpheap",         dump_heap },
#endif  // SERVICES_KERNEL
  { "load",             JvmtiExport::load_agent_library },
  { "properties",       get_system_properties },
  { "threaddump",       thread_dump },
  { "inspectheap",      heap_inspection },
  { "setflag",          set_flag },
  { "printflag",        print_flag },
  { NULL,               NULL }
};
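The table above is a name-to-function lookup. The same dispatch idea can be sketched in Java with a map; the handlers here are placeholders that return strings, and the class name and message texts are hypothetical rather than taken from HotSpot.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatch table keyed by operation name; handlers are placeholders.
final class OperationDispatchSketch {
    private static final Map<String, Function<String[], String>> FUNCS = new LinkedHashMap<>();

    static {
        FUNCS.put("properties", args -> "system properties: " + System.getProperties().size());
        FUNCS.put("threaddump", args -> "live threads: " + Thread.getAllStackTraces().size());
        FUNCS.put("load", args -> "would load agent library " + args[0]);
    }

    // Looks up the handler by name and runs it, or reports an unknown operation.
    static String dispatch(String name, String... args) {
        Function<String[], String> handler = FUNCS.get(name);
        if (handler == null) {
            return "unrecognized operation: " + name;
        }
        return handler.apply(args);
    }
}
```

A lookup table keeps the listener loop independent of the individual commands, so adding an operation means adding one entry.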
load_agent_library
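To inject an agent, the request to send is "1" "load" "instrument" "false" "<jar path>".
false means the library name is not an absolute path, so the function locates the corresponding dynamic library itself and loads it. On macOS it resolves to jre/lib/libinstrument.dylib, that is, the lib path + "lib" + the argument + ".dylib".
If the dynamic library loads successfully, the function looks up Agent_OnAttach in it and calls it. If that call succeeds, the agent is added to the agent list.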
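The name mapping described above (a bare library name turned into a platform-specific file name) has a Java-level counterpart in System.mapLibraryName, which makes for a small illustration. The sketch below is not how HotSpot resolves the library; the class name and the libDir parameter are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of absolute vs. non-absolute agent library names.
final class AgentLibraryPathSketch {
    static Path resolve(String libDir, String name, boolean isAbsolute) {
        if (isAbsolute) {
            // The caller already supplied the full path to the library file.
            return Paths.get(name);
        }
        // Otherwise build the platform file name, for example libinstrument.dylib
        // on macOS or libinstrument.so on Linux, inside the library directory.
        return Paths.get(libDir, System.mapLibraryName(name));
    }
}
```

Calling resolve("jre/lib", "instrument", false) on macOS gives jre/lib/libinstrument.dylib, matching the file named in the text.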
Agent_OnAttach here is a function defined by JVMTI. For more information about JVMTI, see https://docs.oracle.com/en/java/javase/20/docs/specs/jvmti.html
It is implemented in jdk/src/share/instrument/InvocationAdapter.c
JNIEXPORT jint JNICALL
Agent_OnAttach(JavaVM* vm, char *args, void * reserved)
JPLIS stands for Java Programming Language Instrumentation Services
Further analysis involves the JVMTI implementation and the JPLIS mentioned above; see https://blog.csdn.net/sun_tantan/article/details/105786883