[Repost] OpenStack: nova image-create, under the hood
I was trying to understand what kind of image nova image-create creates. It’s not entirely obvious from its help output, which says: "Creates a new image by taking a snapshot of a running server." But what kind of snapshot? Let’s find out.
nova image-create operations
The command is invoked as follows:
$ nova image-create fed18 "snap1-of-fed18" --poll
Drilling into nova’s source code (nova/virt/libvirt/driver.py), this is what image-create does:
- If the guest to be snapshotted is running, nova calls libvirt’s managedSave (the equivalent of virsh managedsave), which saves and stops a running guest so that it can be restarted later from the saved state.
- Next, it creates a qcow2 internal disk snapshot of the guest (now offline).
- Then it extracts the internal named snapshot from the qcow2 file, exports it in raw format, and places it temporarily in $instances_path/snapshots.
- It deletes the internal named snapshot from the qcow2 file.
- Finally, it uploads that image to the OpenStack glance service, which can be confirmed by running glance image-list.
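The sequence of steps can also be written down as a small Python sketch. It only builds the argument lists and runs nothing. The function name `cold_snapshot_commands` and its parameters are made up for this illustration and are not part of nova, which goes through the libvirt API and its own wrappers rather than these command lines. The glance upload step is left out, and the `virsh` and `qemu-img` invocations are the same ones used in the manual test in this post.

```python
def cold_snapshot_commands(domain, disk, snap_name, out_path, out_format="raw"):
    """Return the commands of a cold snapshot, in order, as argv lists.

    Hypothetical helper for illustration only; it does not execute anything.
    """
    return [
        # Save the guest state to disk and stop the guest.
        ["virsh", "managedsave", domain],
        # Create an internal qcow2 snapshot of the (now offline) disk.
        ["qemu-img", "snapshot", "-c", snap_name, disk],
        # Extract the named snapshot and convert it to the target format.
        ["qemu-img", "convert", "-f", "qcow2", "-O", out_format,
         "-s", snap_name, disk, out_path],
        # Delete the internal snapshot from the original disk.
        ["qemu-img", "snapshot", "-d", snap_name, disk],
        # Restart the guest from its saved state.
        ["virsh", "start", domain],
    ]


if __name__ == "__main__":
    for argv in cold_snapshot_commands("fed18", "fed18.qcow2", "snap1",
                                       "snap1-fed18.img"):
        print(" ".join(argv))
```

Returning plain argument lists keeps the sketch safe to run on a machine without libvirt or QEMU installed; the lists could be passed to `subprocess.run` on a real host.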
Update: Steps 2 and 4 above are now effectively removed with this upstream change:

Remove unnecessary steps for cold snapshots

Up until now when we created cold snapshots we were stopping the
instance, create an internal snapshot, extract the snapshot to a file
and then delete the internal snapshot before bringing up the instance.

If the instance is shut down, there's no concurrent writer, so the image
can be directly extracted without taking an internal snapshot first,
because the snapshot and the current state are the same.

In this patch the creation and deletion of the internal snapshot are
removed to eliminate the extra steps and optimize the creation of
snapshots a bit.
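Following the reasoning in that commit message, the extraction of an offline disk reduces to a single conversion with no snapshot name. The sketch below is an illustration of that idea and not the code of the patch; `extract_offline_disk_command` is a hypothetical name, and the function only builds the argument list.

```python
def extract_offline_disk_command(disk, out_path, out_format="raw"):
    """Build a qemu-img call that converts an offline qcow2 disk directly.

    Hypothetical helper: with the guest stopped there is no concurrent
    writer, so no internal snapshot (and no -s option) is involved.
    """
    return ["qemu-img", "convert", "-f", "qcow2", "-O", out_format,
            disk, out_path]


if __name__ == "__main__":
    print(" ".join(extract_offline_disk_command("fed18.qcow2",
                                                "snap1-fed18.img")))
```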
A simple test
To get a bit more clarity, let’s reproduce nova’s actions on a single qcow2 disk (with a running Fedora 18 OS) using libvirt’s shell, virsh, and QEMU’s qemu-img:
# Save the state and stop a running guest
$ virsh managedsave fed18

# Create a qemu internal snapshot
$ qemu-img snapshot -c snap1 fed18.qcow2

# Get information about the disk
$ qemu-img info fed18.qcow2

# Extract the internal snapshot,
# convert it to raw and export it to a file
$ qemu-img convert -f qcow2 -O raw -s \
    snap1 fed18.qcow2 snap1-fed18.img

# Get information about the new image
# extracted from the snapshot
$ qemu-img info snap1-fed18.img

# List out file sizes of the original
# and the snapshot
$ ls -lash fed18.qcow2 snap1-fed18.img

# Delete the internal snapshot
# from the original disk
$ qemu-img snapshot -d snap1 fed18.qcow2

# Again, get information of the original disk
$ qemu-img info fed18.qcow2

# Start the guest again
$ virsh start fed18
Thanks to Nikola Dipanov for helping me on where to look.
Update: A few things I forgot to mention (thanks again to Nikola for the comments): I was using libvirt and KVM as the underlying hypervisor technologies, with the OpenStack Folsom release.
http://blog.csdn.net/epugv/article/details/9832535
def snapshot(self, context, instance, image_href, update_task_state):
    """Create snapshot from a running VM instance.

    This command only works with qemu 0.14+
    """
    try:
        virt_dom = self._lookup_by_name(instance['name'])  # look up the domain
    except exception.InstanceNotFound:
        raise exception.InstanceNotRunning(instance_id=instance['uuid'])

    (image_service, image_id) = glance.get_remote_image_service(
        context, instance['image_ref'])
    try:
        base = image_service.show(context, image_id)  # base image of the instance
    except exception.ImageNotFound:
        base = {}

    # image_href is a URL that includes the information entered on the web page.
    _image_service = glance.get_remote_image_service(context, image_href)
    # Create an image_service object and parse an image id out of image_href.
    snapshot_image_service, snapshot_image_id = _image_service
    snapshot = snapshot_image_service.show(context, snapshot_image_id)

    metadata = {'is_public': False,
                'status': 'active',
                'name': snapshot['name'],
                'properties': {
                    'kernel_id': instance['kernel_id'],
                    'image_location': 'snapshot',
                    'image_state': 'available',
                    'owner_id': instance['project_id'],
                    'ramdisk_id': instance['ramdisk_id'],
                    }
                }
    if 'architecture' in base.get('properties', {}):
        arch = base['properties']['architecture']
        metadata['properties']['architecture'] = arch

    disk_path = libvirt_utils.find_disk(virt_dom)  # disk path
    source_format = libvirt_utils.get_disk_type(disk_path)  # format of the snapshot source

    image_format = CONF.snapshot_image_format or source_format

    # NOTE(bfilippov): save lvm as raw
    if image_format == 'lvm':
        image_format = 'raw'

    # NOTE(vish): glance forces ami disk format to be ami
    if base.get('disk_format') == 'ami':
        metadata['disk_format'] = 'ami'
    else:
        metadata['disk_format'] = image_format

    metadata['container_format'] = base.get('container_format', 'bare')

    snapshot_name = uuid.uuid4().hex

    (state, _max_mem, _mem, _cpus, _t) = virt_dom.info()
    state = LIBVIRT_POWER_STATE[state]

    # NOTE(rmk): Live snapshots require QEMU 1.3 and Libvirt 1.0.0.
    #            These restrictions can be relaxed as other configurations
    #            can be validated.
    # The version information and the source format decide between a live
    # and a cold snapshot.
    if self.has_min_version(MIN_LIBVIRT_LIVESNAPSHOT_VERSION,
                            MIN_QEMU_LIVESNAPSHOT_VERSION,
                            REQ_HYPERVISOR_LIVESNAPSHOT) \
            and not source_format == "lvm":
        live_snapshot = True
    else:
        live_snapshot = False

    # NOTE(rmk): We cannot perform live snapshots when a managedSave
    #            file is present, so we will use the cold/legacy method
    #            for instances which are shutdown.
    if state == power_state.SHUTDOWN:  # SHUTDOWN instances use the cold method
        live_snapshot = False

    # NOTE(dkang): managedSave does not work for LXC
    if CONF.libvirt_type != 'lxc' and not live_snapshot:
        if state == power_state.RUNNING or state == power_state.PAUSED:
            # This method will suspend a domain and save its memory
            # contents to a file on disk.
            virt_dom.managedSave(0)

    # Returns a snapshot object for the given image, carrying the backend format.
    snapshot_backend = self.image_backend.snapshot(disk_path,
                                                   snapshot_name,
                                                   image_type=source_format)

    if live_snapshot:
        LOG.info(_("Beginning live snapshot process"),
                 instance=instance)
    else:
        LOG.info(_("Beginning cold snapshot process"),
                 instance=instance)
        # Calls snapshot_create() of the format-specific class held by
        # snapshot_backend, which creates a snapshot in a disk image.
        # For qcow2 this uses qemu-img snapshot -c…
        snapshot_backend.snapshot_create()

    update_task_state(task_state=task_states.IMAGE_PENDING_UPLOAD)
    snapshot_directory = CONF.libvirt_snapshots_directory
    fileutils.ensure_tree(snapshot_directory)
    with utils.tempdir(dir=snapshot_directory) as tmpdir:
        try:
            out_path = os.path.join(tmpdir, snapshot_name)
            if live_snapshot:
                # NOTE (rmk): libvirt needs to be able to write to the
                #             temp directory, which is owned nova.
                utils.execute('chmod', '777', tmpdir, run_as_root=True)
                self._live_snapshot(virt_dom, disk_path, out_path,
                                    image_format)
            else:
                # Convert the image format and write the result to out_path.
                snapshot_backend.snapshot_extract(out_path, image_format)
        finally:
            if not live_snapshot:
                snapshot_backend.snapshot_delete()
            # NOTE(dkang): because previous managedSave is not called
            #              for LXC, _create_domain must not be called.
            if CONF.libvirt_type != 'lxc' and not live_snapshot:
                if state == power_state.RUNNING:
                    self._create_domain(domain=virt_dom)
                elif state == power_state.PAUSED:
                    self._create_domain(
                        domain=virt_dom,
                        launch_flags=libvirt.VIR_DOMAIN_START_PAUSED)
            LOG.info(_("Snapshot extracted, beginning image upload"),
                     instance=instance)

        # Upload that image to the image service
        update_task_state(task_state=task_states.IMAGE_UPLOADING,
                          expected_state=task_states.IMAGE_PENDING_UPLOAD)
        with libvirt_utils.file_open(out_path) as image_file:
            image_service.update(context,
                                 image_href,
                                 metadata,
                                 image_file)
            LOG.info(_("Snapshot image upload complete"),
                     instance=instance)
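The choice between the live and the cold path in snapshot() can be isolated into a small pure function. The following is a sketch under stated assumptions and not nova code: `plan_snapshot`, `SnapshotPlan`, the version tuples and the state strings are invented for the illustration, the minimum versions come from the NOTE(rmk) comment, and the hypervisor-type check that has_min_version also performs is left out.

```python
from collections import namedtuple

# Minimum versions quoted in the NOTE(rmk) comment of the driver code.
MIN_LIBVIRT_VERSION = (1, 0, 0)
MIN_QEMU_VERSION = (1, 3, 0)

# Hypothetical result type for this sketch.
SnapshotPlan = namedtuple("SnapshotPlan", ["live", "managed_save"])


def plan_snapshot(libvirt_version, qemu_version, source_format, state,
                  virt_type="kvm"):
    """Decide between a live and a cold snapshot.

    Versions are tuples such as (1, 0, 0); state is one of the strings
    "running", "paused" or "shutdown". Both encodings are assumptions of
    this sketch, not nova's own representation.
    """
    # Live snapshots need new enough libvirt and QEMU and a non-LVM disk.
    live = (libvirt_version >= MIN_LIBVIRT_VERSION
            and qemu_version >= MIN_QEMU_VERSION
            and source_format != "lvm")

    # A shut-down instance always takes the cold path.
    if state == "shutdown":
        live = False

    # Cold snapshots of active guests first save and stop the guest,
    # except on LXC, where managedSave does not work.
    managed_save = (virt_type != "lxc" and not live
                    and state in ("running", "paused"))
    return SnapshotPlan(live, managed_save)


if __name__ == "__main__":
    print(plan_snapshot((1, 0, 0), (1, 3, 0), "qcow2", "running"))
    print(plan_snapshot((0, 9, 13), (1, 2, 0), "qcow2", "running"))
```

Comparing version tuples keeps the check readable; a running guest on an older QEMU falls back to the cold path and therefore needs the managed save first.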
def _live_snapshot(self, domain, disk_path, out_path, image_format):
    """Snapshot an instance without downtime."""
    # Save a copy of the domain's running XML file
    xml = domain.XMLDesc(0)

    # Abort is an idempotent operation, so make sure any block
    # jobs which may have failed are ended.
    try:
        # Cancel the active block job on the given disk.
        domain.blockJobAbort(disk_path, 0)
    except Exception:
        pass

    def _wait_for_block_job(domain, disk_path):
        status = domain.blockJobInfo(disk_path, 0)  # progress of the job
        try:
            cur = status.get('cur', 0)
            end = status.get('end', 0)
        except Exception:
            return False

        if cur == end and cur != 0 and end != 0:
            return False
        else:
            return True

    # NOTE (rmk): We are using shallow rebases as a workaround to a bug
    #             in QEMU 1.3. In order to do this, we need to create
    #             a destination image with the original backing file
    #             and matching size of the instance root disk.
    src_disk_size = libvirt_utils.get_disk_size(disk_path)
    src_back_path = libvirt_utils.get_disk_backing_file(disk_path,
                                                        basename=False)
    disk_delta = out_path + '.delta'
    # Creates a COW image with the given backing file; this is done by
    # the command 'qemu-img', 'create', '-f', 'qcow2'……
    libvirt_utils.create_cow_image(src_back_path, disk_delta,
                                   src_disk_size)

    try:
        # NOTE (rmk): blockRebase cannot be executed on persistent
        #             domains, so we need to temporarily undefine it.
        #             If any part of this block fails, the domain is
        #             re-defined regardless.
        if domain.isPersistent():
            domain.undefine()

        # NOTE (rmk): Establish a temporary mirror of our root disk and
        #             issue an abort once we have a complete copy.
        # This is where the live snapshot is implemented; see figures 1 and 2 and
        # http://www.libvirt.org/html/libvirt-libvirt.html#VIR_DOMAIN_BLOCK_REBASE_COPY
        domain.blockRebase(disk_path, disk_delta, 0,
                           libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY |
                           libvirt.VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
                           libvirt.VIR_DOMAIN_BLOCK_REBASE_SHALLOW)

        while _wait_for_block_job(domain, disk_path):
            time.sleep(0.5)

        domain.blockJobAbort(disk_path, 0)
        libvirt_utils.chown(disk_delta, os.getuid())
    finally:
        self._conn.defineXML(xml)

    # Convert the delta (CoW) image with a backing file to a flat
    # image with no backing file.
    libvirt_utils.extract_snapshot(disk_delta, 'qcow2', None,
                                   out_path, image_format)
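The polling loop around _wait_for_block_job is easier to follow when it is separated from libvirt. The sketch below mirrors the same completion test (the copy is done once cur equals end and both are non-zero) against plain dictionaries. `block_job_in_progress`, `wait_for_block_job` and the injected `get_status` and `sleep` callables are names chosen for this illustration; in the driver the status comes from domain.blockJobInfo().

```python
import time


def block_job_in_progress(status):
    """Return True while the block job described by status is unfinished.

    status is expected to be a dict with 'cur' and 'end' keys. Anything
    that has no .get() method (for example None) is treated as "no job
    to wait for", as in the driver code.
    """
    try:
        cur = status.get("cur", 0)
        end = status.get("end", 0)
    except Exception:
        return False
    # Finished only when progress has reached a non-zero end position.
    return not (cur == end and cur != 0)


def wait_for_block_job(get_status, interval=0.5, sleep=time.sleep):
    """Poll get_status() until the job finishes; return the number of waits."""
    waits = 0
    while block_job_in_progress(get_status()):
        sleep(interval)
        waits += 1
    return waits


if __name__ == "__main__":
    # Fake progress reports standing in for domain.blockJobInfo().
    reports = iter([{"cur": 0, "end": 0},
                    {"cur": 512, "end": 1024},
                    {"cur": 1024, "end": 1024}])
    print(wait_for_block_job(lambda: next(reports), interval=0.01))
```

Passing the status source and the sleep function as arguments is a design choice of the sketch so that the loop can be exercised without a running guest.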