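I have been reading Qiushibaike since 2008, and once I bought a smartphone I started reading it on the phone. The site has ads above and below the content, and I only want to read the posts, not the ads. Skipping them would also save some mobile data. So I wondered whether I could write a program that fetches the posts from Qiushibaike and strips out everything else, and ended up writing the code below. I hope the Qiushibaike folks won't hold it against me; this was only a small experiment.
Front-end file: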
<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="Default.aspx.cs" Inherits="WebTest._Default" EnableViewState="false" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>糗事百科</title>
    <style type="text/css">
        body{margin:5px;font:12px arial,simsun;background:#fff;}
        img{border:none;}
        a{text-decoration:none;}
        .qiushi{margin:5px 0;padding:10px;border-bottom:1px solid #ece5d8;}
    </style>
</head>
<body><form id="bodyForm" runat="server"></form></body>
</html>
Code-behind:
protected void Page_Load(object sender, EventArgs e)
{
    string URI = "http://wap3.qiushibaike.com";
    // "param" carries the path of the page to fetch (paging and tag links); empty means the home page.
    string pageInfo = (Request.QueryString["param"] ?? string.Empty).Trim();
    URI = URI + pageInfo;

    // Render the cleaned-up posts inside the otherwise empty form.
    bodyForm.InnerHtml = Server.HtmlDecode(getQiushi(URI));
}
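Because Page_Load appends param to the base address unchanged, a value that does not begin with / could change which host gets contacted (for example, `@example.com/` would turn the intended host into the user-info part of the URL). The following is a minimal sketch of a guard, assuming every link the page rewrites is a site-relative path beginning with /. The `TargetUrl` class and its `Build` method are hypothetical names that do not appear in the original code.

```csharp
using System;

public static class TargetUrl
{
    private const string BaseUrl = "http://wap3.qiushibaike.com";

    // Hypothetical helper: returns the address to fetch for a given "param" value.
    public static string Build(string param)
    {
        string path = (param ?? string.Empty).Trim();

        // Assumption: valid values are site-relative paths such as "/hot/page/2".
        // Anything else falls back to the home page instead of being appended.
        if (!path.StartsWith("/", StringComparison.Ordinal))
        {
            path = string.Empty;
        }

        return BaseUrl + path;
    }
}
```

In Page_Load this would replace the manual concatenation, for example `string URI = TargetUrl.Build(Request.QueryString["param"]);`.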
getQiushi
// Needs at the top of Default.aspx.cs:
// using System.IO; using System.Net; using System.Text; using System.Text.RegularExpressions;
private string getQiushi(string URI)
{
    string resultstring;
    WebRequest request = WebRequest.Create(URI);
    // Dispose the response and the reader so the connection is released.
    using (WebResponse result = request.GetResponse())
    using (StreamReader sr = new StreamReader(result.GetResponseStream()))
    {
        resultstring = sr.ReadToEnd();
    }
    StringBuilder responseString = new StringBuilder();

    Regex regContent = new Regex("<div class=\"qiushi\">(?<content>[\\s\\S]+?)</div>"); // matches each post
    Regex regComment = new Regex("<p class=\"vote\">(?<content>[\\s\\S]+?)</p>", RegexOptions.IgnoreCase); // matches the vote/comment bar
    Regex regUserInfo = new Regex("<p class=\"user\">(?<content>[\\s\\S]+?)</p>", RegexOptions.IgnoreCase); // matches the poster info
    Regex regLinks = new Regex("(href=\")(/[^\"\\s]*)(\")", RegexOptions.IgnoreCase); // matches site-relative links
    Regex regPrevPage = new Regex("<a href=\"([^\"]*)\">上一页</a>"); // matches the "previous page" link
    Regex regNextPage = new Regex("<a href=\"([^\"]*)\">下一页</a>"); // matches the "next page" link
    Regex regBlankLine = new Regex(@"[\r\n]"); // matches line breaks

    MatchCollection mcContent = regContent.Matches(resultstring);
    Match mcPrevPage = regPrevPage.Match(resultstring);
    Match mcNextPage = regNextPage.Match(resultstring);
    // Point the paging links back at this page; the captured href becomes the "param" value.
    string prevPage = "<a href=\"?param=" + mcPrevPage.Groups[1].Value + "\">上一页</a> ";
    string nextPage = "<a href=\"?param=" + mcNextPage.Groups[1].Value + "\">下一页</a>";

    foreach (Match post in mcContent)
    {
        string content = post.Value;
        content = regComment.Replace(content, "");   // drop the vote/comment bar
        content = regUserInfo.Replace(content, "");  // drop the poster info
        content = regLinks.Replace(content, "href=\"?param=$2\""); // route links through this page
        content = regBlankLine.Replace(content, ""); // drop line breaks

        responseString.Append(content);
    }

    responseString.Append("<div style=\"text-align:center\">" + prevPage);
    responseString.Append(nextPage + "</div>");

    return responseString.ToString();
}
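To try the extraction step without contacting the site, the same regex approach can be run against a small sample string. This is only a sketch: the sample markup in the usage note is invented to mimic the class names the patterns look for, and `PostExtractor` and `Extract` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PostExtractor
{
    // Each post is assumed to sit in <div class="qiushi">...</div> with no nested <div>.
    private static readonly Regex Post =
        new Regex("<div class=\"qiushi\">[\\s\\S]+?</div>");

    // The vote bar and poster info are assumed to be <p class="vote"> / <p class="user">.
    private static readonly Regex Noise =
        new Regex("<p class=\"(vote|user)\">[\\s\\S]+?</p>", RegexOptions.IgnoreCase);

    // Returns every post block with the vote bar and poster info removed.
    public static List<string> Extract(string html)
    {
        List<string> posts = new List<string>();
        foreach (Match m in Post.Matches(html))
        {
            posts.Add(Noise.Replace(m.Value, ""));
        }
        return posts;
    }
}
```

For instance, passing `<div class="ad">buy</div><div class="qiushi"><p class="user">bob</p>Story one<p class="vote">12 up</p></div>` yields a single entry, `<div class="qiushi">Story one</div>`; the ad block is skipped because it never matches the post pattern.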
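The param parameter in Page_Load is there mainly to carry the target of the previous-page, next-page, and tag links. The basic features now work and the ads are gone, but comments cannot be viewed.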
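One detail worth noting about passing a path through param: if the path itself contains `?` or `&`, the unescaped value would be cut short when it is read back from the query string. A hedged sketch of rewriting links with the path URL-encoded is shown below; `LinkRewriter` and `Rewrite` are hypothetical names, and the pattern is a simplified variant of the link regex used in getQiushi. ASP.NET decodes query-string values when they are read, so Page_Load would still receive the original path.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkRewriter
{
    // Captures the path of site-relative links, i.e. href values starting with "/".
    private static readonly Regex RelativeHref =
        new Regex("href=\"(/[^\"\\s]*)\"", RegexOptions.IgnoreCase);

    // Rewrites href="/path" to href="?param=<URL-encoded path>".
    public static string Rewrite(string html)
    {
        return RelativeHref.Replace(html, m =>
            "href=\"?param=" + Uri.EscapeDataString(m.Groups[1].Value) + "\"");
    }
}
```

With this sketch, `<a href="/tag/funny">` becomes `<a href="?param=%2Ftag%2Ffunny">`, while absolute links are left alone because they do not start with /.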