Fetching and parsing web pages with superagent and cheerio (Node.js)

Install the dependencies:

npm install superagent cheerio --save

Synchronous-style code (the callback is wrapped in a Promise and consumed with async/await):

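const superagent = require('superagent')
const cheerio = require('cheerio')

// Fetch the page and resolve with the cleaned text of every matching link.
function getNews() {
  return new Promise((resolve, reject) => {
    superagent.get('https://a.b.c.cn/').end((err, res) => {
      if (err) {
        // Pass the original error along so the caller can see what failed.
        reject(err)
        return
      }
      // Parse the response HTML so it can be queried with CSS selectors.
      const $ = cheerio.load(res.text)
      const titles = []
      $('#blk_cjkjqcfc_011 a').each((index, item) => {
        const text = $(item).text()
        // Skip empty links and separator entries that end with '|'.
        if (text && !text.endsWith('|')) {
          titles.push(text.replace(/\n/g, ''))
        }
      })
      resolve(titles)
    })
  })
}

async function main() {
  try {
    const news = await getNews()
    console.log(news)
  } catch (err) {
    console.error('Failed to fetch news:', err)
  }
}

main()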
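
The filtering done inside the .each() callback can be tried without any network access. Below is a minimal sketch, assuming the same rules as the code above (skip empty text, skip text ending in '|', strip newlines). The helper name cleanLinkTexts and the sample input are hypothetical and not part of the original post.

```javascript
// Hypothetical helper (not in the original post): the same filtering that
// getNews() applies inside .each(), pulled out as a pure function.
function cleanLinkTexts(texts) {
  return texts
    // Skip empty strings and separator-style entries that end with '|'.
    .filter((text) => text && !text.endsWith('|'))
    // Strip every newline; a global regex also works on Node versions
    // older than 15, where String.prototype.replaceAll is unavailable.
    .map((text) => text.replace(/\n/g, ''))
}

// Made-up sample input for illustration only.
const sample = ['Headline one\n', '', 'Tech |', 'Headline\ntwo']
console.log(cleanLinkTexts(sample))
```

With this sample, only 'Headline one' and 'Headlinetwo' remain. Keeping the cleanup in a plain function like this makes it easy to check the rules before pointing the scraper at a real page.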

Posted 2023-02-19 23:01 by EGU0
