A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes.
An asynchronous I/O operation does not cause the requesting process to be blocked.
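Version 4 of the Single UNIX Specification moved the general asynchronous I/O mechanism from the real-time extensions into the base specification. This mechanism addresses some of the limitations of the older asynchronous I/O facilities.
Before we look at the different ways to use asynchronous I/O, we need to discuss its costs. Asynchronous I/O complicates the design of our application, because we now have to deal with many operations that are in flight at the same time. A simpler approach may be to use multiple threads, which lets us write the program using a synchronous model while the threads run asynchronously with respect to each other.
We incur additional complexity when we use the POSIX asynchronous I/O interfaces:
- We have to worry about three sources of errors for every asynchronous operation: one associated with the submission of the operation, one associated with the result of the operation itself, and one associated with the functions used to determine the status of the asynchronous operations.
- The interfaces themselves involve a lot of extra setup and processing rules compared to their conventional counterparts.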
We can't really call the non-asynchronous I/O function calls "synchronous," because
although they are synchronous with respect to the program flow, they aren't
synchronous with respect to the I/O. Recall the discussion of synchronous writes in
Chapter 3. We call a write "synchronous" if the data we write is persistent when we
return from the call to the write function. We also can't differentiate the
conventional I/O function calls from the asynchronous ones by referring to the
conventional calls as the "standard" I/O calls, because this confuses them with the
function calls in the standard I/O library. To avoid confusion, we'll refer to the
read and write functions as the "conventional" I/O function calls in this section.
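- Recovering from errors can be difficult. For example, if we submit multiple asynchronous writes and one of them fails, how should we proceed? If the writes are related, we might have to undo the ones that succeeded.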
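The three error sources listed above can be told apart in code. The following is a minimal sketch, not a definitive implementation: the enum `err_source` and the helper `read_checked` are hypothetical names introduced here, and the helper waits for completion with aio_suspend only to keep the example short.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/* Which error source reported the failure (illustrative names). */
enum err_source { SRC_NONE, SRC_SUBMIT, SRC_RESULT, SRC_STATUS };

/*
 * Read up to len bytes at offset off and wait for completion.
 * On success *nread is the byte count; on failure *err is an errno value.
 */
enum err_source
read_checked(int fd, void *buf, size_t len, off_t off,
             ssize_t *nread, int *err)
{
    struct aiocb cb;
    const struct aiocb *list[1];
    int status;

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;
    list[0] = &cb;
    *nread = -1;
    *err = 0;

    /* Source 1: submitting the request can fail. */
    if (aio_read(&cb) < 0) {
        *err = errno;
        return SRC_SUBMIT;
    }

    /* Sleep until the request leaves the EINPROGRESS state. */
    while ((status = aio_error(&cb)) == EINPROGRESS)
        (void)aio_suspend(list, 1, NULL);

    /* Source 3: the status query itself can fail. */
    if (status == -1) {
        *err = errno;
        return SRC_STATUS;
    }

    /* Source 2: the operation ran but failed; status is its errno value. */
    if (status != 0) {
        *err = status;
        (void)aio_return(&cb);
        return SRC_RESULT;
    }

    *nread = aio_return(&cb);
    return SRC_NONE;
}
```

Whether an invalid descriptor is reported at submission or as the result of the operation depends on the implementation, so a caller should be prepared for either.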
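Table 14-7 Conditions that generate the SIGPOLL signal
Constant | Description |
S_INPUT | A message other than a high-priority message has arrived. |
S_RDNORM | An ordinary message has arrived. |
S_RDBAND | A message with a nonzero priority band has arrived. |
S_BANDURG | If specified together with S_RDBAND, the SIGURG signal is generated instead of SIGPOLL when a nonzero priority band message has arrived. |
S_HIPRI | A high-priority message has arrived. |
S_OUTPUT | The write queue is no longer full. |
S_WRNORM | Same as S_OUTPUT. |
S_WRBAND | A nonzero priority band message can be sent. |
In Table 14-7, "arrived" means "arrived at the read queue of the stream head."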
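The POSIX asynchronous I/O interfaces give us a consistent way to perform asynchronous I/O, regardless of the type of file. These interfaces were adopted from the real-time draft standard. Version 4 of the Single UNIX Specification moved them into the base specification, so all platforms are now required to support them.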
struct aiocb {
    int             aio_fildes;     /* file descriptor */
    off_t           aio_offset;     /* file offset for I/O */
    volatile void  *aio_buf;        /* buffer for I/O */
    size_t          aio_nbytes;     /* number of bytes to transfer */
    int             aio_reqprio;    /* priority */
    struct sigevent aio_sigevent;   /* signal information */
    int             aio_lio_opcode; /* operation for list I/O */
};

struct sigevent {
    int             sigev_notify;                   /* notify type */
    int             sigev_signo;                    /* signal number */
    union sigval    sigev_value;                    /* notify argument */
    void          (*sigev_notify_function)(union sigval); /* notify function */
    pthread_attr_t *sigev_notify_attributes;        /* notify attrs */
};
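SIGEV_NONE The process is not notified when the asynchronous I/O request completes.
SIGEV_SIGNAL The signal specified by the sigev_signo field is generated when the asynchronous I/O request completes. If the application has elected to catch the signal and has specified the SA_SIGINFO flag when establishing the signal handler, the signal is queued (if the implementation supports queued signals). The signal handler is passed a siginfo structure whose si_value field is set to sigev_value (again, only if SA_SIGINFO is used).
SIGEV_THREAD The function specified by the sigev_notify_function field is called when the asynchronous I/O request completes. The sigev_value field is passed as its only argument. The function is executed in a separate thread in a detached state, unless the sigev_notify_attributes field is set to the address of a pthread attribute structure that specifies other attributes for the thread.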
#include <aio.h>
int aio_read(struct aiocb *aiocb);
int aio_write(struct aiocb *aiocb);
Both return: 0 if OK, -1 on error
To force all pending asynchronous writes to persistent storage without waiting, we can set up an AIO control block and call the aio_fsync function.
#include <aio.h>
int aio_fsync(int op, struct aiocb *aiocb);
Returns: 0 if OK, -1 on error
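Because aio_fsync only queues the synchronization request, we still have to wait for it to finish before we can rely on the data being persistent. The sketch below shows one way to do that; `save_durably` and `wait_done` are hypothetical helpers, and the file mode 0600 is an arbitrary choice.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait for one request; return its final aio_error value. */
static int
wait_done(struct aiocb *cb)
{
    const struct aiocb *list[1];
    int err;

    list[0] = cb;
    while ((err = aio_error(cb)) == EINPROGRESS)
        (void)aio_suspend(list, 1, NULL);
    return err;
}

/*
 * Write len bytes to a new file at path and force them to stable storage.
 * Returns 0 on success, nonzero (usually an errno value) on failure.
 */
int
save_durably(const char *path, const void *data, size_t len)
{
    struct aiocb wr, sync;
    int fd, werr, serr;
    ssize_t nw;

    if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
        return errno;

    memset(&wr, 0, sizeof(wr));
    wr.aio_fildes = fd;
    wr.aio_buf = (void *)data;      /* aio_write only reads the buffer */
    wr.aio_nbytes = len;
    wr.aio_offset = 0;
    wr.aio_sigevent.sigev_notify = SIGEV_NONE;
    if (aio_write(&wr) < 0) {
        werr = errno;
        close(fd);
        return werr;
    }

    /* aio_fsync looks only at aio_fildes and aio_sigevent. */
    memset(&sync, 0, sizeof(sync));
    sync.aio_fildes = fd;
    sync.aio_sigevent.sigev_notify = SIGEV_NONE;
    serr = 0;
    if (aio_fsync(O_SYNC, &sync) < 0)
        serr = errno;               /* the sync was never queued */

    /* Both control blocks live on our stack, so wait before returning. */
    werr = wait_done(&wr);
    nw = aio_return(&wr);
    if (serr == 0) {
        serr = wait_done(&sync);
        (void)aio_return(&sync);
    }
    close(fd);

    if (werr != 0)
        return werr;
    if (nw != (ssize_t)len)
        return EIO;
    return serr;
}
```

A caller might use it as `save_durably("/tmp/aio_fsync_demo.txt", "hello\n", 6)` and treat any nonzero return as a failure.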
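#include <aio.h>
int aio_error(const struct aiocb *aiocb);
Returns: (see following)
0 The asynchronous operation completed successfully. We need to call the aio_return function to obtain the return value from the operation.
-1 The call to aio_error failed. In this case, errno tells us why.
EINPROGRESS The asynchronous read, write, or synchronization operation is still pending.
anything else Any other return value gives us the error code corresponding to the failed asynchronous operation.
#include <aio.h>
ssize_t aio_return(struct aiocb *aiocb);
Returns: (see following)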
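#include <aio.h>
int aio_suspend(const struct aiocb *const list[], int nent, const struct timespec *timeout);
Returns: 0 if OK, -1 on error
aio_suspend returns in one of three ways. If we are interrupted by a signal, it returns -1 with errno set to EINTR. If the time limit given by the timeout argument expires before any of the I/O operations complete, it returns -1 with errno set to EAGAIN (we can pass a null pointer for the timeout argument if we want to block without a time limit). If any of the I/O operations complete, aio_suspend returns 0. If all of the asynchronous I/O operations are already complete when we call aio_suspend, it returns without blocking.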
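The three ways aio_suspend can return map naturally onto a small wrapper. This is a sketch under assumptions: `wait_any_ms` and `demo_suspend_timeout` are hypothetical names, the demo relies on the implementation accepting aio_read on a pipe, and an interrupted wait simply restarts with the full timeout.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Wait up to ms milliseconds for any request in list to complete.
 * Returns 1 if one completed, 0 on timeout, -1 on any other error.
 */
int
wait_any_ms(const struct aiocb *const list[], int nent, long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    for (;;) {
        if (aio_suspend(list, nent, &ts) == 0)
            return 1;               /* at least one request finished */
        if (errno == EAGAIN)
            return 0;               /* time limit expired */
        if (errno != EINTR)
            return -1;
        /* Interrupted by a signal: retry with the full timeout. */
    }
}

/* Returns 0 if an empty pipe times out and then completes once fed. */
int
demo_suspend_timeout(void)
{
    static struct aiocb cb;         /* must outlive the request */
    static char byte;
    const struct aiocb *list[1];
    int p[2], first, second;

    if (pipe(p) < 0)
        return -1;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = p[0];
    cb.aio_buf = &byte;
    cb.aio_nbytes = 1;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;
    list[0] = &cb;
    if (aio_read(&cb) < 0)
        return -1;

    first = wait_any_ms(list, 1, 50);       /* nothing written yet */
    if (write(p[1], "x", 1) != 1)
        return -1;
    second = wait_any_ms(list, 1, 5000);    /* data is now available */

    if (first != 0 || second != 1)
        return -1;
    if (aio_error(&cb) != 0 || aio_return(&cb) != 1 || byte != 'x')
        return -1;
    close(p[0]);
    close(p[1]);
    return 0;
}
```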
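#include <aio.h>
int aio_cancel(int fd, struct aiocb *aiocb);
Returns: (see following)
AIO_ALLDONE All of the operations completed before the attempt to cancel them.
AIO_CANCELED All of the requested operations have been canceled.
AIO_NOTCANCELED At least one of the requested operations could not be canceled.
-1 The call to aio_cancel failed. The error code is stored in errno.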
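Since a cancel request may or may not succeed, the caller has to handle each return value and still collect the final status. The following sketch shows one possible approach; `cancel_and_settle` and `demo_cancel` are hypothetical names, and which outcome occurs in the demo depends on timing.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to cancel one request, then make sure it has finished either way.
 * Returns the final aio_error value (ECANCELED if it was canceled,
 * 0 if it completed normally), or -1 if aio_cancel itself failed.
 */
int
cancel_and_settle(struct aiocb *cb)
{
    const struct aiocb *list[1];
    int err;

    list[0] = cb;
    switch (aio_cancel(cb->aio_fildes, cb)) {
    case AIO_CANCELED:          /* aio_error now reports ECANCELED */
    case AIO_ALLDONE:           /* finished before we asked */
        break;
    case AIO_NOTCANCELED:       /* already in progress: wait for it */
        while (aio_error(cb) == EINPROGRESS)
            (void)aio_suspend(list, 1, NULL);
        break;
    default:
        return -1;
    }
    err = aio_error(cb);
    (void)aio_return(cb);       /* collect the status exactly once */
    return err;
}

/* Submit a read on a scratch file and immediately try to cancel it. */
int
demo_cancel(void)
{
    static struct aiocb cb;     /* must outlive the request */
    static char data[64];
    char path[] = "/tmp/aiocanXXXXXX";
    int fd, err;

    if ((fd = mkstemp(path)) < 0)
        return -1;
    unlink(path);
    if (write(fd, "payload", 7) != 7) {
        close(fd);
        return -1;
    }
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = data;
    cb.aio_nbytes = sizeof(data);
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;
    if (aio_read(&cb) < 0) {
        close(fd);
        return -1;
    }
    err = cancel_and_settle(&cb);
    close(fd);
    return err;
}
```

In the demo, either 0 or ECANCELED is a legitimate result, because the read may already have completed by the time we ask to cancel it.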
#include <aio.h>
int lio_listio(int mode, struct aiocb *restrict const list[restrict], int nent, struct sigevent *restrict sigev);
Returns: 0 if OK, -1 on error
In each AIO control block, the aio_lio_opcode field specifies whether the operation is a read (LIO_READ), a write (LIO_WRITE), or a no-op (LIO_NOP), which is ignored. A read is treated as if the corresponding AIO control block had been passed to the aio_read function. Similarly, a write is treated as if the AIO control block had been passed to aio_write.
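As an illustration of list I/O, the sketch below submits two reads with a single lio_listio call in LIO_WAIT mode and then checks each request individually. The names `read_two_blocks`, `demo_listio`, and `HALF` are hypothetical, as is the scratch file used by the demo.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define HALF 4

/*
 * Read two adjacent HALF-byte blocks of fd with one lio_listio call.
 * Returns 0 on success, -1 on failure.
 */
int
read_two_blocks(int fd, char *first, char *second)
{
    struct aiocb cbs[2];
    struct aiocb *list[2];
    const struct aiocb *one[1];
    ssize_t n;
    int i, err, rc = 0;

    memset(cbs, 0, sizeof(cbs));
    cbs[0].aio_buf = first;
    cbs[1].aio_buf = second;
    for (i = 0; i < 2; i++) {
        cbs[i].aio_fildes = fd;
        cbs[i].aio_offset = (off_t)i * HALF;
        cbs[i].aio_nbytes = HALF;
        cbs[i].aio_lio_opcode = LIO_READ;   /* treated like aio_read */
        cbs[i].aio_sigevent.sigev_notify = SIGEV_NONE;
        list[i] = &cbs[i];
    }

    /* LIO_WAIT: block until every request in the list has completed. */
    if (lio_listio(LIO_WAIT, list, 2, NULL) < 0)
        rc = -1;                /* still inspect each request below */

    for (i = 0; i < 2; i++) {
        one[0] = &cbs[i];
        while ((err = aio_error(&cbs[i])) == EINPROGRESS)
            (void)aio_suspend(one, 1, NULL);
        n = (err == -1) ? -1 : aio_return(&cbs[i]);
        if (err != 0 || n != HALF)
            rc = -1;
    }
    return rc;
}

/* Returns 0 if both halves of an 8-byte scratch file are read correctly. */
int
demo_listio(void)
{
    char path[] = "/tmp/aiolioXXXXXX";
    char a[HALF], b[HALF];
    int fd, rc;

    if ((fd = mkstemp(path)) < 0)
        return -1;
    unlink(path);
    if (write(fd, "ABCDEFGH", 8) != 8) {
        close(fd);
        return -1;
    }
    rc = read_two_blocks(fd, a, b);
    close(fd);
    if (rc == 0 && (memcmp(a, "ABCD", HALF) != 0 ||
                    memcmp(b, "EFGH", HALF) != 0))
        rc = -1;
    return rc;
}
```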
We can determine the value of AIO_LISTIO_MAX by calling the sysconf function
with the name argument set to _SC_AIO_LISTIO_MAX. Similarly, we can determine the
value of AIO_MAX by calling sysconf with the name argument set to _SC_AIO_MAX,
and we can get the value of AIO_PRIO_DELTA_MAX by calling sysconf with its
argument set to _SC_AIO_PRIO_DELTA_MAX.
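A short sketch of these queries follows. The helper names `aio_limit` and `print_aio_limits` are hypothetical; the sysconf names used are the POSIX ones (`_SC_AIO_LISTIO_MAX`, `_SC_AIO_MAX`, `_SC_AIO_PRIO_DELTA_MAX`). A return value of -1 with errno unchanged means the limit is indeterminate rather than that the call failed.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Print one limit and return the raw sysconf value. */
long
aio_limit(const char *label, int name)
{
    long v;

    errno = 0;
    v = sysconf(name);
    if (v == -1 && errno != 0)
        printf("%s: sysconf error\n", label);
    else if (v == -1)
        printf("%s: no definite limit\n", label);
    else
        printf("%s = %ld\n", label, v);
    return v;
}

/* Report the three run-time AIO limits discussed in the text. */
void
print_aio_limits(void)
{
    (void)aio_limit("AIO_LISTIO_MAX", _SC_AIO_LISTIO_MAX);
    (void)aio_limit("AIO_MAX", _SC_AIO_MAX);
    (void)aio_limit("AIO_PRIO_DELTA_MAX", _SC_AIO_PRIO_DELTA_MAX);
}
```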
The POSIX asynchronous I/O interfaces were originally introduced to provide real-time
applications with a way to avoid being blocked while performing I/O operations.
Now we’ll look at an example of how to use the interfaces.
We don’t discuss real-time programming in this text, but because the POSIX
asynchronous I/O interfaces are now part of the base specification in the Single UNIX
Specification, we’ll look at how to use them. To compare the asynchronous I/O
interfaces with their conventional counterparts, we’ll look at the task of translating a file
from one format to another.
The program shown in Figure 14.20 translates a file using the ROT-13 algorithm
(for more on ROT-13, see http://zh.wikipedia.org/wiki/ROT13)
that the USENET news system, popular in the 1980s, used to obscure text that might be
offensive or contain spoilers or joke punchlines. The algorithm rotates the characters 'a'
to 'z' and 'A' to 'Z' by 13 positions, but leaves all other characters unchanged.
Figure 14.20 Translate a file using ROT-13
#include "apue.h"
#include <ctype.h>
#include <fcntl.h>

#define BSZ 4096

unsigned char buf[BSZ];

/* Rotate letters by 13 positions; leave everything else unchanged. */
unsigned char
translate(unsigned char c)
{
    if (isalpha(c)) {
        if (c >= 'n')
            c -= 13;
        else if (c >= 'a')
            c += 13;
        else if (c >= 'N')
            c -= 13;
        else
            c += 13;
    }
    return(c);
}

int
main(int argc, char* argv[])
{
    int ifd, ofd, i, n, nw;

    if (argc != 3)
        err_quit("usage: rot13 infile outfile");
    if ((ifd = open(argv[1], O_RDONLY)) < 0)
        err_sys("can't open %s", argv[1]);
    if ((ofd = open(argv[2], O_RDWR|O_CREAT|O_TRUNC, FILE_MODE)) < 0)
        err_sys("can't create %s", argv[2]);

    /* read a block, translate it, write it; repeat until end of file */
    while ((n = read(ifd, buf, BSZ)) > 0) {
        for (i = 0; i < n; i++)
            buf[i] = translate(buf[i]);
        if ((nw = write(ofd, buf, n)) != n) {
            if (nw < 0)
                err_sys("write failed");
            else
                err_quit("short write (%d/%d)", nw, n);
        }
    }

    fsync(ofd);
    exit(0);
}
The I/O portion of the program is straightforward: we read a block from the input
file, translate it, and then write the block to the output file. We repeat this until we hit
the end of file and read returns zero. The program in Figure 14.21 shows how to
perform the same task using the equivalent asynchronous I/O functions.
Figure 14.21 Translate a file using ROT-13 and asynchronous I/O
#include "apue.h"
#include <ctype.h>
#include <fcntl.h>
#include <aio.h>
#include <errno.h>

#define BSZ 4096
#define NBUF 8

enum rwop {
    UNUSED = 0,
    READ_PENDING = 1,
    WRITE_PENDING = 2
};

struct buf {
    enum rwop     op;
    int           last;
    struct aiocb  aiocb;
    unsigned char data[BSZ];
};

struct buf bufs[NBUF];

/* Rotate letters by 13 positions; leave everything else unchanged. */
unsigned char
translate(unsigned char c)
{
    if (isalpha(c)) {
        if (c >= 'n')
            c -= 13;
        else if (c >= 'a')
            c += 13;
        else if (c >= 'N')
            c -= 13;
        else
            c += 13;
    }
    return(c);
}

int
main(int argc, char* argv[])
{
    int ifd, ofd, i, j, n, err, numop;
    struct stat sbuf;
    const struct aiocb *aiolist[NBUF];
    off_t off = 0;

    if (argc != 3)
        err_quit("usage: rot13 infile outfile");
    if ((ifd = open(argv[1], O_RDONLY)) < 0)
        err_sys("can't open %s", argv[1]);
    if ((ofd = open(argv[2], O_RDWR|O_CREAT|O_TRUNC, FILE_MODE)) < 0)
        err_sys("can't create %s", argv[2]);
    if (fstat(ifd, &sbuf) < 0)
        err_sys("fstat failed");

    /* initialize the buffers */
    for (i = 0; i < NBUF; i++) {
        bufs[i].op = UNUSED;
        bufs[i].aiocb.aio_buf = bufs[i].data;
        bufs[i].aiocb.aio_sigevent.sigev_notify = SIGEV_NONE;
        aiolist[i] = NULL;
    }

    numop = 0;
    for (;;) {
        for (i = 0; i < NBUF; i++) {
            switch (bufs[i].op) {
            case UNUSED:
                /*
                 * Read from the input file if more data
                 * remains unread.
                 */
                if (off < sbuf.st_size) {
                    bufs[i].op = READ_PENDING;
                    bufs[i].aiocb.aio_fildes = ifd;
                    bufs[i].aiocb.aio_offset = off;
                    off += BSZ;
                    if (off >= sbuf.st_size)
                        bufs[i].last = 1;
                    bufs[i].aiocb.aio_nbytes = BSZ;
                    if (aio_read(&bufs[i].aiocb) < 0)
                        err_sys("aio_read failed");
                    aiolist[i] = &bufs[i].aiocb;
                    numop++;
                }
                break;

            case READ_PENDING:
                if ((err = aio_error(&bufs[i].aiocb)) == EINPROGRESS)
                    continue;
                if (err != 0) {
                    if (err == -1)
                        err_sys("aio_error failed");
                    else
                        err_exit(err, "read failed");
                }

                /*
                 * A read is complete; translate the buffer
                 * and write it.
                 */
                if ((n = aio_return(&bufs[i].aiocb)) < 0)
                    err_sys("aio_return failed");
                if (n != BSZ && !bufs[i].last)
                    err_quit("short read (%d/%d)", n, BSZ);
                for (j = 0; j < n; j++)
                    bufs[i].data[j] = translate(bufs[i].data[j]);
                bufs[i].op = WRITE_PENDING;
                bufs[i].aiocb.aio_fildes = ofd;
                bufs[i].aiocb.aio_nbytes = n;
                if (aio_write(&bufs[i].aiocb) < 0)
                    err_sys("aio_write failed");
                /* retain our spot in aiolist */
                break;

            case WRITE_PENDING:
                if ((err = aio_error(&bufs[i].aiocb)) == EINPROGRESS)
                    continue;
                if (err != 0) {
                    if (err == -1)
                        err_sys("aio_error failed");
                    else
                        err_exit(err, "write failed");
                }

                /*
                 * A write is complete; mark the buffer as unused.
                 */
                if ((n = aio_return(&bufs[i].aiocb)) < 0)
                    err_sys("aio_return failed");
                if (n != bufs[i].aiocb.aio_nbytes)
                    err_quit("short write (%d/%d)", n, BSZ);
                aiolist[i] = NULL;
                bufs[i].op = UNUSED;
                numop--;
                break;
            }
        }

        if (numop == 0) {
            if (off >= sbuf.st_size)
                break;      /* nothing pending and nothing left to read */
        } else {
            /* sleep until at least one pending request completes */
            if (aio_suspend(aiolist, NBUF, NULL) < 0)
                err_sys("aio_suspend failed");
        }
    }

    /* queue a final sync; aio_fsync returns without waiting for it */
    bufs[0].aiocb.aio_fildes = ofd;
    if (aio_fsync(O_SYNC, &bufs[0].aiocb) < 0)
        err_sys("aio_fsync failed");
    exit(0);
}
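On Linux systems where the AIO functions live in librt, we need to add -lrt when linking this program; otherwise the linker reports errors such as "undefined reference to 'aio_xxx'". (Source: http://hi.baidu.com/catproste2012/item/04eab0ee76afe0d2eb34c914)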
Note that we use eight buffers, so we can have up to eight asynchronous I/O
requests pending. Surprisingly, this might actually reduce performance: if the reads
are presented to the file system out of order, it can defeat the operating system's
read-ahead algorithm.
Before we can check the return value of an operation, we need to make sure the
operation has completed. When aio_error returns a value other than EINPROGRESS
or −1, we know the operation is complete. Excluding these values (EINPROGRESS and −1), if the return value is
anything other than 0, then we know the operation failed. Once we’ve checked these
conditions, it is safe to call aio_return to get the return value of the I/O operation.
As long as we have work to do, we can submit asynchronous I/O operations.
When we have an unused AIO control block, we can submit an asynchronous read
request. When a read completes, we translate the buffer contents and then submit an
asynchronous write request. When all AIO control blocks are in use, we wait for an
operation to complete by calling aio_suspend.
When we write a block to the output file, we retain the same offset at which we read
the data from the input file. Consequently, the order of the writes doesn’t matter. This
strategy works only because each character in the input file has a corresponding
character in the output file at the same offset; we neither add nor delete characters in the
output file.
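The claim that write order doesn't matter when offsets are preserved can be checked with a small experiment. The sketch below uses synchronous pwrite calls in a deliberately scrambled order as a stand-in for asynchronous writes that complete out of order; `demo_scrambled_writes`, the chunk size, and the scratch file are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK  4
#define NCHUNK 3

/* Returns 0 if writing the chunks in scrambled order reproduces src. */
int
demo_scrambled_writes(void)
{
    static const char src[CHUNK * NCHUNK + 1] = "aaaabbbbcccc";
    static const int order[NCHUNK] = { 2, 0, 1 };  /* not file order */
    char path[] = "/tmp/aioordXXXXXX";
    char out[CHUNK * NCHUNK];
    int fd, i, rc = -1;

    if ((fd = mkstemp(path)) < 0)
        return -1;
    unlink(path);

    /* Each chunk goes to the same offset it came from. */
    for (i = 0; i < NCHUNK; i++) {
        off_t off = (off_t)order[i] * CHUNK;

        if (pwrite(fd, src + off, CHUNK, off) != CHUNK)
            goto done;
    }

    /* The file contents match src regardless of the write order. */
    if (pread(fd, out, sizeof(out), 0) == (ssize_t)sizeof(out) &&
        memcmp(out, src, sizeof(out)) == 0)
        rc = 0;
done:
    close(fd);
    return rc;
}
```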
We don’t use asynchronous notification in this example, because it is easier to use a
synchronous programming model. If we had something else to do while the I/O
operations were in progress, then the additional work could be folded into the for
loop. If we needed to prevent this additional work from delaying the task of translating
the file, however, then we might have to structure the code to use some form of
asynchronous notification. With multiple tasks, we need to prioritize the tasks before
deciding how the program should be structured.