Linux Kernel Development (6)

Posted on 2011-06-16 17:21 by Teddy Yan

Timers and Time Management

The frequency of the system timer (the tick rate) is programmed on system boot based on a static preprocessor define, HZ.

The kernel defines the value in an architecture-specific header.

The consensus is that, at least on modern systems, HZ=1000 does not create unacceptable overhead, and the move to a 1000Hz timer has not hurt performance too much. Nevertheless, it is possible in 2.6 to compile the kernel with a different value for HZ.
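As a rough illustration of what the tick rate means in practice, the following user-space sketch converts between ticks and wall-clock time. The names SKETCH_HZ, tick_period_us(), ticks_to_seconds(), and seconds_to_ticks() are made up for this example and are not kernel interfaces; the value 1000 is simply the rate discussed above.

```c
#include <assert.h>

/* Hypothetical tick rate for this sketch; the real kernel sets HZ per architecture. */
#define SKETCH_HZ 1000UL

/* Length of one tick in microseconds: one second divided by the tick rate. */
unsigned long tick_period_us(unsigned long hz)
{
    assert(hz > 0);
    return 1000000UL / hz;
}

/* Convert a tick count to whole seconds (the remainder is truncated). */
unsigned long ticks_to_seconds(unsigned long ticks, unsigned long hz)
{
    assert(hz > 0);
    return ticks / hz;
}

/* Convert a duration in seconds to the equivalent number of ticks. */
unsigned long seconds_to_ticks(unsigned long seconds, unsigned long hz)
{
    return seconds * hz;
}
```

With a rate of 1000, each tick lasts 1000 microseconds, so a five-second timeout is 5000 ticks. At a rate of 100 the same timeout is only 500 ticks, which is why code should scale by the tick rate rather than hard-code tick counts.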

Memory Management

The important point to understand is that the page structure is associated with physical pages, not virtual pages.

Developers are often surprised that an instance of this structure is allocated for each physical page in the system.
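To get a feel for the cost of one descriptor per physical page, here is a small arithmetic sketch. The figures used in the note after it (4GB of RAM, 4KB pages, a 40-byte descriptor) are assumptions chosen for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of physical pages: total memory divided by the page size. */
uint64_t physical_page_count(uint64_t total_bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return total_bytes / page_size;
}

/* Memory consumed when every physical page carries one descriptor. */
uint64_t page_descriptor_bytes(uint64_t total_bytes, uint64_t page_size,
                               uint64_t descriptor_size)
{
    return physical_page_count(total_bytes, page_size) * descriptor_size;
}
```

Under those assumptions there are 1,048,576 pages, and their descriptors occupy 41,943,040 bytes (40MB), which is roughly 1% of the memory they describe.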

The kernel uses the zones to group pages of similar properties.

  • ZONE_DMA: This zone contains pages that can undergo DMA.
  • ZONE_DMA32: Like ZONE_DMA, this zone contains pages that can undergo DMA. Unlike ZONE_DMA, these pages are accessible only by 32-bit devices. On some architectures, this zone is a larger subset of memory.
  • ZONE_NORMAL: This zone contains normal, regularly mapped pages.
  • ZONE_HIGHMEM: This zone contains “high memory,” which are pages not permanently mapped into the kernel’s address space.

But if push comes to shove (say, if memory should get low), the kernel can dip its fingers in whatever zone is available and suitable.
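The fallback behavior can be pictured with a toy model. This is only a sketch: the enum, the struct, and sketch_alloc_page() are hypothetical, the zone bookkeeping is reduced to a free-page counter, and ZONE_DMA32 is left out for brevity. It assumes that a page from a more constrained zone is still suitable for a less constrained request, but not the other way around.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified zone identifiers, ordered from most to least constrained. */
enum sketch_zone { SK_ZONE_DMA, SK_ZONE_NORMAL, SK_ZONE_HIGHMEM, SK_ZONE_COUNT };

/* Hypothetical per-zone bookkeeping: just a count of free pages. */
struct sketch_zones {
    unsigned long free_pages[SK_ZONE_COUNT];
};

/*
 * Take one page, starting at the preferred zone and falling back toward
 * more constrained zones. Returns the zone used, or -1 if none has a page.
 */
int sketch_alloc_page(struct sketch_zones *z, enum sketch_zone preferred)
{
    assert(z != NULL);
    for (int zone = (int)preferred; zone >= 0; zone--) {
        if (z->free_pages[zone] > 0) {
            z->free_pages[zone]--;
            return zone;
        }
    }
    return -1;
}
```

A request that prefers SK_ZONE_NORMAL is served from there while pages remain and dips into SK_ZONE_DMA afterward. A request for SK_ZONE_DMA never falls back, because no other zone in this model is suitable for it.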

You’ve seen various examples of allocator flags in both the low-level page allocation functions and kmalloc().

The vmalloc() function works in a similar fashion to kmalloc(), except it allocates memory that is only virtually contiguous and not necessarily physically contiguous.
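The difference between virtually and physically contiguous memory can be sketched with a tiny page-table model. Everything here is hypothetical (the struct, sketch_translate(), and the 4KB page size); it is not how vmalloc() is implemented, only a picture of consecutive virtual pages backed by scattered physical frames.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096UL /* assumed page size */

/* Hypothetical mapping: virtual page i of the buffer lives in physical frame frames[i]. */
struct sketch_mapping {
    const unsigned long *frames;
    size_t npages;
};

/* Translate a byte offset inside the virtually contiguous buffer to a physical address. */
unsigned long sketch_translate(const struct sketch_mapping *m, unsigned long offset)
{
    size_t vpage = offset / SK_PAGE_SIZE;
    assert(vpage < m->npages);
    return m->frames[vpage] * SK_PAGE_SIZE + offset % SK_PAGE_SIZE;
}
```

With frames {7, 2, 9}, offsets 4095 and 4096 are adjacent from the caller's point of view, yet they land in frames 7 and 2 respectively. That is acceptable for software, but not for hardware that needs a physically contiguous buffer.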

In certain situations, only certain methods can be employed to allocate memory. For example, interrupt handlers must instruct the kernel not to sleep (because interrupt handlers cannot reschedule) in the course of allocating memory.

Slab Layer: A free list contains a block of available, already allocated data structures. When code requires a new instance of a data structure, it can grab one of the structures off the free list rather than allocating a sufficient amount of memory and setting it up for the data structure.
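A minimal user-space sketch of the free-list idea follows. It is not the slab allocator itself: the type names, the pool size of 4, and the functions sk_free_list_init(), sk_object_get(), and sk_object_put() are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SK_POOL_SIZE 4 /* hypothetical number of preallocated objects */

/* Hypothetical object type; the link field only matters while the object is free. */
struct sk_object {
    int payload;
    struct sk_object *next_free;
};

struct sk_free_list {
    struct sk_object storage[SK_POOL_SIZE]; /* preallocated objects */
    struct sk_object *head;                 /* first free object, or NULL */
};

/* Thread every preallocated object onto the free list. */
void sk_free_list_init(struct sk_free_list *fl)
{
    fl->head = NULL;
    for (size_t i = 0; i < SK_POOL_SIZE; i++) {
        fl->storage[i].next_free = fl->head;
        fl->head = &fl->storage[i];
    }
}

/* Pop an object off the free list; returns NULL when the list is empty. */
struct sk_object *sk_object_get(struct sk_free_list *fl)
{
    struct sk_object *obj = fl->head;
    if (obj != NULL)
        fl->head = obj->next_free;
    return obj;
}

/* Return an object to the free list so a later request can reuse it. */
void sk_object_put(struct sk_free_list *fl, struct sk_object *obj)
{
    assert(obj != NULL);
    obj->next_free = fl->head;
    fl->head = obj;
}
```

Getting and putting an object are both constant-time pointer updates, which is the appeal of the technique. The sketch has no locking and cannot grow or shrink with memory pressure.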

Statically Allocating on the Stack

User-space is afforded the luxury of a large, dynamically growing stack, whereas the kernel has no such luxury: the kernel’s stack is small and fixed. When each process is given a small, fixed stack, memory consumption is minimized, and the kernel need not burden itself with stack management code.

Single-Page Kernel Stacks

Single-page kernel stacks were introduced for two reasons. First, they save a page of memory per process. Second, and most important, as uptime increases it becomes increasingly hard to find two physically contiguous unallocated pages.

To summarize, kernel stacks are either one or two pages, depending on compile-time configuration options. The stack can therefore range from 4KB to 16KB. Historically, interrupt handlers shared the stack of the interrupted process. When single-page stacks are enabled, interrupt handlers are given their own stacks. In any case, unbounded recursion and alloca() are obviously not allowed.

Performing a large static allocation on the stack, such as of a large array or structure, is dangerous.
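The usual alternative is to allocate large scratch space dynamically. The sketch below shows the pattern in user space, with malloc() standing in for a kernel allocator; the function name sk_process_large_buffer() is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Risky pattern, shown only as a comment: a local array of tens of
 * kilobytes would overrun a small, fixed stack.
 *
 *     unsigned char buf[64 * 1024];
 */

/*
 * Safer pattern: allocate dynamically, check for failure, free when done.
 * Returns the last byte of the filled buffer, or -1 if allocation fails.
 */
int sk_process_large_buffer(size_t len, unsigned char fill)
{
    assert(len > 0);

    unsigned char *buf = malloc(len); /* stand-in for a kernel allocator */
    if (buf == NULL)
        return -1;

    memset(buf, fill, len);
    int last = buf[len - 1];

    free(buf);
    return last;
}
```

Unlike a stack overflow, a failed dynamic allocation is reported to the caller, so it can be handled.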

High Memory Mappings

void *kmap(struct page *page)
This function works on either high or low memory. If the page structure belongs to a page in low memory, the page’s virtual address is simply returned. If the page resides in high memory, a permanent mapping is created and the address is returned. The function may sleep, so kmap() works only in process context.

Per-CPU Allocations

Typically, per-CPU data is stored in an array. Each item in the array corresponds to a possible processor on the system.
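The array layout can be sketched in user space as follows. The names SK_NR_CPUS, sk_hits, sk_count_hit(), and sk_total_hits() are hypothetical, and the cpu argument stands in for the current processor number, which kernel code would obtain itself.

```c
#include <assert.h>

#define SK_NR_CPUS 4 /* hypothetical number of possible processors */

/* One counter slot per possible processor. */
static unsigned long sk_hits[SK_NR_CPUS];

/* Each processor touches only its own slot, indexed by its CPU number. */
void sk_count_hit(int cpu)
{
    assert(cpu >= 0 && cpu < SK_NR_CPUS);
    sk_hits[cpu]++;
}

/* A reader that wants the global figure sums every slot. */
unsigned long sk_total_hits(void)
{
    unsigned long total = 0;
    for (int cpu = 0; cpu < SK_NR_CPUS; cpu++)
        total += sk_hits[cpu];
    return total;
}
```

Because each processor writes only to its own slot, the update path needs no lock in this model. The cost moves to the reader, which has to visit every slot.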

Reasons for Using Per-CPU Data

There are several benefits to using per-CPU data. The first is the reduction in locking requirements. Depending on the semantics by which processors access the per-CPU data, you might not need any locking at all.

Second, per-CPU data greatly reduces cache invalidation. This occurs as processors try to keep their caches in sync. If one processor manipulates data held in another processor’s cache, that processor must flush or otherwise update its cache.
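Separate slots help with cache invalidation only if two processors' slots do not share a cache line. One way to arrange that, sketched here in C11, is to align each slot to the line size. The 64-byte line size, the slot count, and all names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define SK_CACHE_LINE 64 /* assumed cache line size in bytes */
#define SK_PAD_CPUS 4    /* hypothetical number of possible processors */

/* Aligning the member pads each slot out to a full (assumed) cache line. */
struct sk_percpu_counter {
    _Alignas(SK_CACHE_LINE) unsigned long value;
};

static struct sk_percpu_counter sk_counters[SK_PAD_CPUS];

/* Byte distance between the slots belonging to two processors. */
size_t sk_slot_distance(int cpu_a, int cpu_b)
{
    assert(cpu_a >= 0 && cpu_a < SK_PAD_CPUS);
    assert(cpu_b >= 0 && cpu_b < SK_PAD_CPUS);

    const char *a = (const char *)&sk_counters[cpu_a];
    const char *b = (const char *)&sk_counters[cpu_b];
    return (size_t)(b > a ? b - a : a - b);
}
```

Neighboring slots end up one full line apart, so a write by one processor does not invalidate the line that holds another processor's slot. The trade-off is wasted space, since each counter now occupies a whole line.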

The only safety requirement for the use of per-CPU data is disabling kernel preemption, which is much cheaper than locking, and the interface does so automatically.

Picking an Allocation Method
