Several Ways to Use cURL in PHP

1. Fetching a file that has no access control

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://localhost/mytest/phpinfo.php");
curl_setopt($ch, CURLOPT_HEADER, false);
// Return the response as a string; if this line is commented out, curl_exec() prints the response directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
?>
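The snippet above does not check whether the request succeeded. Here is a minimal sketch of one way to do that. It assumes that curl_exec() returns false on failure when CURLOPT_RETURNTRANSFER is set, and that curl_error() then describes the problem. The helper name fetch_page and the timeout values are illustrative and not part of the original post.

```php
<?php
// Hypothetical helper (not from the original post): fetch a URL and report failures.
function fetch_page(string $url, int $timeoutSeconds = 5): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds); // assumed value
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // assumed value

    $body = curl_exec($ch); // string on success, false on failure

    // The handle is released when $ch goes out of scope at the end of the function.
    return [
        'ok'     => $body !== false,
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no HTTP response arrived
        'body'   => $body === false ? '' : $body,
        'error'  => $body === false ? curl_error($ch) : '',
    ];
}

// Usage (not executed here):
// $page = fetch_page("http://localhost/mytest/phpinfo.php");
// if (!$page['ok']) { echo "Request failed: " . $page['error']; }
```

Returning an array keeps the transport error and the HTTP status separate, so the caller can tell "could not connect" apart from "connected but got a 404".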

2. Fetching through a proxy

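Why fetch through a proxy? Take Google as an example: if you scrape Google's data too frequently within a short period, your requests start to fail because Google restricts your IP address. When that happens, you can switch to a proxy and fetch again.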

<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://blog.51yip.com");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
curl_setopt($ch, CURLOPT_PROXY, "125.21.23.6:8080"); // the proxy address must be a quoted string
// curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'user:password'); // add this if the proxy requires a password
$result = curl_exec($ch);
curl_close($ch);
?>
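The idea of switching proxies once one address is blocked can be sketched as a simple loop over a proxy list. This is only an illustration under assumptions: the function names pick_proxy and fetch_via_proxies, the timeouts, and the extra proxy addresses (documentation-range placeholders) are not from the original post, and the list is assumed to be non-empty when pick_proxy is called.

```php
<?php
// Placeholder proxy list: the first entry is the address used in the post,
// the others are documentation-range addresses standing in for real proxies.
$proxies = ['125.21.23.6:8080', '192.0.2.10:8080', '192.0.2.11:3128'];

// Pick a proxy round-robin by attempt number (assumes a non-empty list).
function pick_proxy(array $proxies, int $attempt): string
{
    return $proxies[$attempt % count($proxies)];
}

// Try each proxy once, in order; return the first body received, or null if all fail.
function fetch_via_proxies(string $url, array $proxies): ?string
{
    for ($attempt = 0; $attempt < count($proxies); $attempt++) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_PROXY, pick_proxy($proxies, $attempt));
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed value
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);       // assumed value
        $body = curl_exec($ch);
        if ($body !== false) {
            return $body;
        }
    }
    return null;
}

// Usage (not executed here):
// $html = fetch_via_proxies("http://blog.51yip.com", $proxies);
```

This only treats a transport failure as a reason to move on. A site that blocks an IP may still answer with an HTTP error page, so a real scraper would probably also inspect the status code.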

3. Fetching data after a POST submission

<?php
$ch = curl_init();
/* Note: the data to submit must not be a two-dimensional (or deeper) array.
 * This works: array('name'=>serialize(array('tank','zhang')),'sex'=>1,'birth'=>'20101010')
 * This raises an error: array('name'=>array('tank','zhang'),'sex'=>1,'birth'=>'20101010') */
$data = array('name' => 'test', 'sex' => 1, 'birth' => '20101010');
curl_setopt($ch, CURLOPT_URL, 'http://localhost/mytest/curl/upload.php');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_exec($ch); // CURLOPT_RETURNTRANSFER is not set, so the response is printed directly
?>

In upload.php, call print_r($_POST);. cURL then fetches whatever upload.php outputs: Array ( [name] => test [sex] => 1 [birth] => 20101010 )
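The code comment above notes that a nested array cannot be passed to CURLOPT_POSTFIELDS directly. Besides serialize(), one common workaround is to flatten the data with http_build_query() and send the resulting string. The sketch below assumes that approach; the helper names encode_post_fields and post_form are illustrative. Note that passing a string sends the body as application/x-www-form-urlencoded, whereas passing an array makes cURL send multipart/form-data.

```php
<?php
// Hypothetical helper: flatten (possibly nested) POST data into a URL-encoded string.
function encode_post_fields(array $data): string
{
    // The separator is passed explicitly so the result does not depend on php.ini settings.
    return http_build_query($data, '', '&');
}

// Hypothetical helper: POST the data and return the response body, or null on failure.
function post_form(string $url, array $data): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, encode_post_fields($data));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage (not executed here):
// $data = array('name' => array('tank', 'zhang'), 'sex' => 1, 'birth' => '20101010');
// echo post_form('http://localhost/mytest/curl/upload.php', $data);
```

On the receiving side, PHP parses keys such as name[0] and name[1] back into an array, so upload.php should see $_POST['name'] as array('tank', 'zhang').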

4. Simulating a login to sina

<?php

// Returns 1 if the login succeeded, 0 otherwise.
function checklogin( $user, $password )
{
    if ( empty( $user ) || empty( $password ) )
    {
        return 0;
    }
    $ch = curl_init( );
    curl_setopt( $ch, CURLOPT_REFERER, "http://mail.sina.com.cn/index.html" );
    curl_setopt( $ch, CURLOPT_HEADER, true ); // include response headers so the Location header can be checked
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
    curl_setopt( $ch, CURLOPT_USERAGENT, USERAGENT );
    curl_setopt( $ch, CURLOPT_COOKIEJAR, COOKIEJAR ); // cookies are saved to this file when the handle is closed
    curl_setopt( $ch, CURLOPT_TIMEOUT, TIMEOUT );
    curl_setopt( $ch, CURLOPT_URL, "http://mail.sina.com.cn/cgi-bin/login.cgi" );
    curl_setopt( $ch, CURLOPT_POST, true );
    // URL-encode both values so special characters do not corrupt the POST body
    curl_setopt( $ch, CURLOPT_POSTFIELDS, "logintype=uid&u=".urlencode( $user )."&psw=".urlencode( $password ) );
    $contents = curl_exec( $ch );
    curl_close( $ch );
    // A successful login redirects to /cgi/index.php?check_time=...
    if ( !preg_match( "/Location: (.*)\\/cgi\\/index\\.php\\?check_time=(.*)\n/", $contents, $matches ) )
    {
        return 0;
    }else{
        return 1;
    }
}

define( "USERAGENT", $_SERVER['HTTP_USER_AGENT'] );
define( "COOKIEJAR", tempnam( "/tmp", "cookie" ) );
define( "TIMEOUT", 500 );

echo checklogin("zhangying215","xtaj227");
?>
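The login code writes its session cookies to the COOKIEJAR file but does not show how to use them afterwards. Here is a minimal sketch of a follow-up request that reads the same file through CURLOPT_COOKIEFILE. The helper name fetch_with_cookies, the timeout, and the target URL in the usage comment are assumptions; the post does not say which page to request after logging in.

```php
<?php
// Hypothetical helper (not in the original post): request a page using saved cookies.
function fetch_with_cookies(string $url, string $cookieFile): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send cookies read from this file
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // write updated cookies back when the handle is released
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);             // assumed value
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage (not executed here; $inboxUrl is a placeholder for a page that requires login):
// if (checklogin($user, $password)) {
//     $html = fetch_with_cookies($inboxUrl, COOKIEJAR);
// }
```

Reading and writing the same file keeps the session current across several requests, which should matter if the site rotates its session cookie.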
posted @ 2016-11-16 17:51  宋子庆