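O030. Launch and Shut Off Operations in Detail
This section analyzes the instance launch and shut off operations in detail, along with techniques for quickly locating useful information in the logs.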
Launch Instance
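This is the most important operation in Nova. Studying the launch operation closely helps us fully understand how the Nova sub-services coordinate and how they run. We already used the launch operation as the example when discussing each nova-* sub-service in detail, so we will not repeat that here and will only review the flow.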
Shut Off Instance
Below is the flowchart for the shut off instance operation.
① The user sends a shut-off request to nova-api
Check the log /opt/stack/logs/n-api.log
2019-05-23 21:21:22.174 DEBUG nova.api.openstack.wsgi [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] Action: 'action', calling method: <bound method ServersController._stop_server of <nova.api.openstack.compute.servers.ServersController object at 0x7fb2f89b4fd0>>, body: {"os-stop": null} from (pid=28283) _process_stack /opt/stack/nova/nova/api/openstack/wsgi.py:623
2019-05-23 21:21:22.213 DEBUG nova.compute.api [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] [instance: a0e2b485-f40c-43e4-beb6-049b6399f0ec] Going to try to stop instance from (pid=28283) force_stop /opt/stack/nova/nova/compute/api.py:2282
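② nova-api sends a shut-off message to Messaging
nova-api sends the message to Messaging. This step is not recorded explicitly in the log file, but the log still holds traces of it. Here we located the relevant entries by combining the Request ID with the code module.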
2019-05-23 21:21:22.325 DEBUG oslo_messaging._drivers.amqpdriver [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] CAST unique_id: 97bb0cf9fd5b42fbabf2de4218d4a8cb exchange 'nova' topic 'compute.DevStack-Controller' from (pid=28283) _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:550
2019-05-23 21:21:22.329 INFO nova.osapi_compute.wsgi.server [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] 10.12.31.241 "POST /v2.1/servers/a0e2b485-f40c-43e4-beb6-049b6399f0ec/action HTTP/1.1" status: 202 len: 337 time: 0.1873510
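The two n-api.log excerpts above record the body ({"os-stop": null}) and the path (POST /v2.1/servers/<instance id>/action) of the request the user sent in step ①. As a minimal sketch of what that request looks like on the wire, the following builds it with Python's standard library without sending it. The endpoint URL and the token value are placeholders that do not come from the logs; build_stop_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_stop_request(compute_endpoint, server_id, token):
    """Build (but do not send) the POST that asks nova-api to stop a server."""
    url = f"{compute_endpoint.rstrip('/')}/servers/{server_id}/action"
    # Python None serializes to JSON null, giving {"os-stop": null} as in the log.
    body = json.dumps({"os-stop": None}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # Keystone token; placeholder value below
        },
    )


# Assumed values: the endpoint and token are placeholders, not taken from the logs.
req = build_stop_request(
    "http://controller.example:8774/v2.1",
    "a0e2b485-f40c-43e4-beb6-049b6399f0ec",
    "PLACEHOLDER-TOKEN",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing req to urllib.request.urlopen would actually send it. In the log above nova-api answered with status 202, which means the request was accepted and the shutdown itself continues asynchronously in the following steps.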
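③ nova-compute receives the shut-off message from Messaging and performs the shutdown
This operation runs on the compute node, so we need to check the /opt/stack/logs/n-cpu.log log file. As above, we combine the Request ID with the code module to locate the log entries below.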
root@DevStack-Controller:/opt/stack/logs# cat n-cpu.log | grep req-748d953f-42d8-4853-9816-d44fbbcdbed7 | grep nova.compute.manager
2019-05-23 21:21:22.363 DEBUG oslo_concurrency.lockutils [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] Lock "a0e2b485-f40c-43e4-beb6-049b6399f0ec" acquired by "nova.compute.manager.do_stop_instance" :: waited 0.000s from (pid=4613) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
2019-05-23 21:21:22.364 DEBUG nova.compute.manager [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] [instance: a0e2b485-f40c-43e4-beb6-049b6399f0ec] Checking state from (pid=4613) _get_power_state /opt/stack/nova/nova/compute/manager.py:1184
2019-05-23 21:21:22.368 DEBUG nova.compute.manager [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] [instance: a0e2b485-f40c-43e4-beb6-049b6399f0ec] Stopping instance; current vm_state: active, current task_state: powering-off, current DB power_state: 1, current VM power_state: 1 from (pid=4613) do_stop_instance /opt/stack/nova/nova/compute/manager.py:2498
2019-05-23 21:21:25.826 DEBUG nova.compute.manager [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] [instance: a0e2b485-f40c-43e4-beb6-049b6399f0ec] Checking state from (pid=4613) _get_power_state /opt/stack/nova/nova/compute/manager.py:1184
2019-05-23 21:21:25.950 DEBUG oslo_concurrency.lockutils [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] Lock "a0e2b485-f40c-43e4-beb6-049b6399f0ec" released by "nova.compute.manager.do_stop_instance" :: held 3.587s from (pid=4613) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:282
root@DevStack-Controller:/opt/stack/logs# cat n-cpu.log | grep req-748d953f-42d8-4853-9816-d44fbbcdbed7 | grep nova.virt
2019-05-23 21:21:22.477 DEBUG nova.virt.libvirt.driver [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] [instance: a0e2b485-f40c-43e4-beb6-049b6399f0ec] Shutting down instance from state 1 from (pid=4613) _clean_shutdown /opt/stack/nova/nova/virt/libvirt/driver.py:2562
2019-05-23 21:21:25.819 INFO nova.virt.libvirt.driver [req-748d953f-42d8-4853-9816-d44fbbcdbed7 admin admin] [instance: a0e2b485-f40c-43e4-beb6-049b6399f0ec] Instance shutdown successfully after 3 seconds.
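The two grep pipelines above can also be expressed as a small script, which makes it easy to pull fields out of each entry. The sketch below assumes every entry follows the layout seen in these excerpts (timestamp, level, module, then a bracket starting with the Request ID); parse_line, select and elapsed_seconds are hypothetical helper names, and the sample lines are copied from the output above with the trailing "from (pid=...)" part trimmed.

```python
import re
from datetime import datetime

# Layout assumed from the excerpts: "<timestamp> <LEVEL> <module> [<req-id> ...] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<module>\S+) "
    r"\[(?P<req_id>req-[0-9a-f-]+)[^\]]*\] "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S.%f")
    return record


def select(lines, req_id, text):
    """Mimic `grep <req_id> | grep <text>`: text may appear anywhere in the line."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record and record["req_id"] == req_id and text in line:
            records.append(record)
    return records


def elapsed_seconds(records):
    """Seconds between the first and the last selected entry."""
    return (records[-1]["ts"] - records[0]["ts"]).total_seconds()


REQ_ID = "req-748d953f-42d8-4853-9816-d44fbbcdbed7"
CTX = "[" + REQ_ID + " admin admin]"
INSTANCE = "a0e2b485-f40c-43e4-beb6-049b6399f0ec"
SAMPLE = [
    f'2019-05-23 21:21:22.363 DEBUG oslo_concurrency.lockutils {CTX} Lock "{INSTANCE}" '
    'acquired by "nova.compute.manager.do_stop_instance" :: waited 0.000s',
    f"2019-05-23 21:21:22.477 DEBUG nova.virt.libvirt.driver {CTX} "
    f"[instance: {INSTANCE}] Shutting down instance from state 1",
    f'2019-05-23 21:21:25.950 DEBUG oslo_concurrency.lockutils {CTX} Lock "{INSTANCE}" '
    'released by "nova.compute.manager.do_stop_instance" :: held 3.587s',
]

hits = select(SAMPLE, REQ_ID, "nova.compute.manager")
for hit in hits:
    print(hit["ts"].time(), hit["module"], hit["message"])
print("elapsed:", elapsed_seconds(hits))
```

Note that both selected entries are emitted by oslo_concurrency.lockutils. They match only because the text nova.compute.manager appears inside the lock message, which is also why the grep output above contains lockutils lines. Comparing record["module"] instead of the whole line would restrict the result to entries written by that module itself.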
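Tips for Analyzing Logs
Quickly finding useful information in a log file is not easy for beginners. Logs contain many entries and a lot of content, especially once debug is enabled, so it is easy to feel overwhelmed and not know where to start.
Here are a few tips:
1. Narrow down the broad range first. For example, run tail -f on the log file before performing the operation so that new entries print in real time; the entries you need will then appear on screen right after the operation. You can also use timestamps to pin down the range of entries you need.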
2. Use code modules to quickly locate useful information. Each nova-* sub-service has its own characteristic code modules:
nova-api: nova.api.openstack.wsgi
nova-compute: nova.virt.libvirt.*
nova-scheduler: nova.scheduler.*
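3. Use the Request ID to find related log entries. In the logs above, we can use the Request ID req-748d953f-42d8-4853-9816-d44fbbcdbed7 to quickly locate other entries in n-api.log related to the shut off operation. Note also that a Request ID spans log files, which helps us find related entries in the log files of other sub-services.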
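Because a Request ID spans log files, the entries from several sub-services can be merged into a single timeline. The following is a minimal sketch of that idea, assuming every matching line starts with the 23-character timestamp seen in the excerpts above (YYYY-MM-DD HH:MM:SS.mmm); trace_request is a hypothetical helper name.

```python
def trace_request(logs, request_id):
    """Collect lines containing request_id from several logs, ordered by time.

    logs maps a log name to an iterable of lines (an open file works).
    Returns a list of (timestamp, log name, line) tuples.
    """
    hits = []
    for name, lines in logs.items():
        for line in lines:
            if request_id in line:
                # Assumption: the first 23 characters are the timestamp.
                hits.append((line[:23], name, line.rstrip("\n")))
    # Timestamps in this fixed-width format sort correctly as plain strings.
    hits.sort(key=lambda hit: hit[0])
    return hits


# Possible usage on the DevStack node (paths as used in this section):
#
#   paths = ["/opt/stack/logs/n-api.log", "/opt/stack/logs/n-cpu.log"]
#   logs = {path: open(path, errors="replace") for path in paths}
#   for ts, name, line in trace_request(
#           logs, "req-748d953f-42d8-4853-9816-d44fbbcdbed7"):
#       print(name, line)
```

The sort is stable, so entries with identical timestamps keep the order in which they were read. A timestamp window (tip 1) can be added by also comparing line[:23] against a start and an end string in the same format.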