How do you import and export bookmarks in JS? Here's how I did it
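Preface
Anyone who has written a crawler with Node probably knows the Cheerio.js module. It is fast and flexible, and knowing jQuery is enough to get started, which makes it an indispensable tool when scraping web pages with Node.
After getting to know cheerio, I had a sudden idea: why not use it to implement bookmark import, which would also be a good way to get familiar with its API? So a while ago I built a first version with cheerio + Node: the bookmarks exported from the browser are uploaded from a front-end page to the server, the server uses cheerio to parse the HTML into JSON, and an API passes the data back to the front end.
I wasn't satisfied with that, though. Starting a Node service just for a single endpoint felt like overkill. Could I rely on local caching and build a purely front-end bookmark preview with import and export?
So I got to work. To import bookmarks, I read the HTML file with the browser's FileReader class, use cheerio to parse the DOM into JSON, and display the result on the front end as a menu. To export bookmarks, I again use cheerio to generate the corresponding DOM from the JSON, create a local URL for the file with URL.createObjectURL, and finally download the file through an a tag.
Below I share the complete implementation and the source code.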
Dependencies
- cheerio
- vite: 3.1
- vue: 3.2
- element-plus: 2.0
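Overview
This small demo is a Vue 3 project scaffolded with Vite. Apart from the layout, its core consists of two classes, FileSystem and HTMLSystem. The former provides file downloading and file reading; the latter converts between JSON and HTML. Everything else is ordinary layout and components, so this article focuses on these two parts.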
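Implementation
FileSystem:
- Read a file: take the data from element-plus's el-upload component and convert the result to a string
- Download a file: download a static resource from a given URL
- Convert local file content into a static address (a Blob URL)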
import type { UploadFile } from "element-plus/es/components/upload/src/upload.type";
import { defer } from "utils-lib-js";

export type readFileType = 'readAsArrayBuffer' | 'readAsBinaryString' | 'readAsDataURL' | 'readAsText'

export declare interface IFileSystem {
  readFile: (file: UploadFile, type?: readFileType, encoding?: string) => Promise<ProgressEvent<FileReader>>
  downloadFile: (url: string, name?: string) => void
  stringToBlobURL: (fileString: string) => string
}

export class FileSystem implements IFileSystem {
  /**
   * @description: Read a file uploaded from the front end
   * @param {UploadFile} file The uploaded file
   * @param {readFileType} type FileReader method used to read the file
   * @param {string} encoding Text encoding (only used by readAsText)
   * @return {Promise<ProgressEvent<FileReader>>}
   */
  readFile(file: UploadFile, type: readFileType = 'readAsText', encoding: string = 'utf-8') {
    const { promise, resolve, reject } = defer()
    const reader: FileReader = new FileReader();
    // Register the handlers before starting the read
    reader.onload = resolve
    reader.onerror = reject
    if (!file.raw) {
      reject(new Error('No raw file to read'))
    } else if (type === 'readAsText') {
      // Only readAsText accepts an encoding argument
      reader.readAsText(file.raw, encoding)
    } else {
      reader[type](file.raw)
    }
    return <Promise<any>>promise
  }
  /**
   * @description: Download a file
   * @param {string} url Resource path or URL
   * @param {string} name File name used for the download
   * @return {*}
   */
  downloadFile(url: string, name: string = 'file.txt') {
    // Simulate a click on a temporary <a download> element
    const link = document.createElement('a')
    link.href = url
    link.download = name
    const _evt = new MouseEvent('click')
    link.dispatchEvent(_evt)
  }
  /**
   * @description: Convert a string into a local Blob URL
   * @param {string} fileString File content
   * @return {string}
   */
  stringToBlobURL(fileString: string) {
    return URL.createObjectURL(new Blob([fileString], { type: "application/octet-stream" }))
  }
}
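readFile depends on `defer` from utils-lib-js, whose source is not shown in this article. Assuming it simply returns a promise together with its `resolve` and `reject` functions, a minimal equivalent might look like the sketch below. The `Deferred` interface and this `defer` implementation are my own assumptions for illustration, not the library's actual code.

```typescript
// Hypothetical stand-in for `defer` from utils-lib-js (assumed behavior).
interface Deferred<T> {
  promise: Promise<T>;
  resolve: (value: T | PromiseLike<T>) => void;
  reject: (reason?: unknown) => void;
}

function defer<T = unknown>(): Deferred<T> {
  // The executor runs synchronously, so both functions are assigned
  // before defer() returns.
  let resolve!: Deferred<T>["resolve"];
  let reject!: Deferred<T>["reject"];
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Usage: settle the promise from a callback-style API (a timer here).
const deferred = defer<string>();
setTimeout(() => deferred.resolve("loaded"), 0);
deferred.promise.then((value) => console.log(value)); // logs "loaded"
```

Because `reader.onload = resolve` hands the load event to `resolve`, the value readFile resolves with is the event itself, so the file text would be read from `event.target?.result`.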
HTMLSystem:
- HTML to JSON: parse the DOM tree and generate JSON data
- JSON to HTML: generate bookmark-format HTML tags from the standard JSON format
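For orientation, a browser export in the Netscape bookmark format looks roughly like the string below. This is a hand-written sample with made-up timestamps, and `sampleBookmarkHtml` is a hypothetical name. A folder is a `<DT>` holding an `<H3>` plus a nested `<DL>`, and a link is a `<DT>` holding an `<A>`, which is the structure a dl/dt/h3/a traversal relies on.

```typescript
// Illustrative sample only; real exports carry more attributes and entries.
const sampleBookmarkHtml = `<!DOCTYPE NETSCAPE-Bookmark-file-1>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
    <DT><H3 ADD_DATE="1660000000" LAST_MODIFIED="1660000500">Bookmarks bar</H3>
    <DL><p>
        <DT><A HREF="https://developer.mozilla.org/" ADD_DATE="1660000100">MDN</A>
        <DT><H3 ADD_DATE="1660000200" LAST_MODIFIED="1660000400">Tools</H3>
        <DL><p>
            <DT><A HREF="https://vitejs.dev/" ADD_DATE="1660000300">Vite</A>
        </DL><p>
    </DL><p>
</DL><p>`;

// Quick sanity check without a parser: count folders (<H3>) and links (<A>).
const folderCount = (sampleBookmarkHtml.match(/<H3\b/g) ?? []).length;
const linkCount = (sampleBookmarkHtml.match(/<A\b/g) ?? []).length;
console.log({ folderCount, linkCount });
```

Note that `<DT>` and `<p>` are never closed in this format, so an HTML parser has to infer where each entry ends.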
import { load, Cheerio, CheerioAPI, CheerioOptions } from 'cheerio'
import {
  createHtmlFolder,
  createHtmlFile,
  createBaseTemp
} from '@/config'
import { File, Folder } from "@/layout/menu/types";

// Signatures aligned with the class implementation
export declare interface IHTMLSystem<F = Folder | File, T = Cheerio<any>, I = CheerioAPI, FolderList = Array<F>> {
  count: number
  resetCount: () => void
  addCount: () => number
  initHTML: (html: string) => FolderList
  htmlToJson: (node: T, bookMarks: FolderList) => FolderList
  addToBookMarks: (node: T, list: FolderList) => unknown
  getNodeTitle: (node: T) => void
  getNodeInfo: (node: T, info: File) => File
  createInitHtml: (temp: string, opt?: CheerioOptions, isDoc?: boolean) => I
  initJSON: (json: FolderList) => string
  jsonToHtml: (bookMarks: FolderList) => string
  createFolder: (folder: Folder) => I
  createFile: (file: File) => I
  createElemChild: (node: T) => (it: F) => void
  checkIsFileOrFolder: (item: F) => 'folder' | 'file' | 'none'
}
export class HTMLSystem implements IHTMLSystem {
  count = 0
  /**
   * @description: Reset the id counter
   * @return {*}
   */
  resetCount = () => {
    this.count = 0;
  };
  /**
   * @description: Return the current id, then increment the counter
   * @return {number}
   */
  addCount = () => {
    return this.count++
  };
  /**
   * @description: Entry point for converting bookmark HTML to JSON
   * @param {string} html The bookmark HTML, already loaded as a string
   * @return {Array<Folder | File>}
   */
  initHTML(html: string) {
    const $ = load(html);
    const dl = $("dl").first();
    // Only the first top-level <dt> is used as the root node
    const dt = dl.children("dt").eq(0);
    return this.htmlToJson(dt, []);
  }
  /**
   * @description: Recursive function that converts HTML to JSON
   * @param {Cheerio} node Root node
   * @param {Array} bookMarks JSON data collected so far
   * @return {Array<Folder | File>}
   */
  htmlToJson = (node: Cheerio<any>, bookMarks: Array<Folder | File> = []) => {
    // Entries one level below the current node
    const childrenNodeDL = node.children("dl");
    const childrenNodeDT = childrenNodeDL.children("dt");
    const { item: dir, dirType } = this.addToBookMarks(node, bookMarks)
    // each() is used because the loop runs only for its side effects
    childrenNodeDT.each((i) => {
      const it = childrenNodeDT.eq(i)
      dirType === 'file' && this.addToBookMarks(it, dir.children)
      this.htmlToJson(it, dir.children);
    });
    return bookMarks;
  };
  /**
   * @description: Add a single entry to the JSON data
   * @param {Cheerio} node Node of the entry to add
   * @param {Array} list Bookmark JSON data
   * @return {{ item: Folder | File, list: Array<Folder | File>, dirType: 'folder' | 'file' | 'none' }}
   */
  addToBookMarks = (node: Cheerio<any>, list: Array<Folder | File> = []) => {
    const item = this.getNodeTitle(node);
    const dirType = this.checkIsFileOrFolder(item)
    switch (dirType) {
      case "folder":
        item.children = [];
      // Intentional fall-through: folders also get an id and are pushed
      case "file":
        item.id = this.addCount().toString()
        list.push(item)
        break;
    }
    return { item, list, dirType }
  }
  /**
   * @description: Determine whether an entry is a folder and parse its details
   * @param {Cheerio} node Node that holds the file or folder
   * @return {*}
   */
  getNodeTitle = (node: Cheerio<any>) => {
    const info: any = {};
    const title = node.children("h3");
    // No <h3> means the node is not a folder, so read the site name and URL;
    // otherwise it is a folder, so read title, add_date and last_modified
    return title.length === 0 ? this.getNodeInfo(node, info) : {
      ...info,
      title: title.text(),
      add_date: title.attr("add_date"),
      last_modified: title.attr("last_modified")
    };
  };
  /**
   * @description: Parse the details of a bookmark file (link)
   * @param {Cheerio} node Node that holds the file
   * @return {File}
   */
  getNodeInfo = (node: Cheerio<any>, info: File) => ({
    ...info,
    name: node.children("a").text(),
    href: node.children("a").attr("href") ?? '',
    icon: node.children("a").attr("icon") ?? '',
    add_date: node.children("a").attr("add_date")
  })
  /**
   * @description: Entry point for converting JSON to bookmark HTML
   * @param {Array} json Bookmark JSON produced by the functions above
   * @return {string}
   */
  initJSON(json: Array<Folder | File>) {
    return this.jsonToHtml(json);
  }
  /**
   * @description: Create a CheerioAPI for a new piece of markup
   * @param {string} temp Markup string
   * @param {*} opt Cheerio options
   * @param {*} isDoc Whether to generate a complete HTML document
   * @return {CheerioAPI}
   */
  createInitHtml = (temp: string, opt = { xml: true, xmlMode: true }, isDoc = false) => {
    const $ = load(temp, opt, isDoc);
    return $
  }
  /**
   * @description: Main function that converts JSON to bookmark HTML
   * @param {Array} bookMarks Bookmark JSON data
   * @return {string}
   */
  jsonToHtml = (bookMarks: Array<Folder | File> = []) => {
    const root = this.createInitHtml(`<div id="root">${createBaseTemp()}</div>`)("#root")
    bookMarks.forEach(this.createElemChild(root.children().first()))
    return root.children().toString()
  }
  /**
   * @description: Recursively build the DOM tree
   * @param {Cheerio} node Parent node
   * @return {void}
   */
  createElemChild = (node: Cheerio<any>) => (it: Folder | File) => {
    const type = this.checkIsFileOrFolder(it)
    switch (type) {
      case 'folder': {
        const folder = this.createFolder(it as Folder)
        node.append(folder("*"))
        // Always take the last <DL>, i.e. the one just appended, and put the
        // children into it, so earlier folders are not traversed again
        const { children } = it as Folder
        children.forEach(this.createElemChild(node.children("DL").last()))
        break
      }
      case 'file': {
        const file = this.createFile(it as File)
        node.append(file('*'))
        break
      }
      case 'none':
        throw new Error('Item is not Folder or File')
    }
  }
  /**
   * @description: Generate the markup for a folder
   * @param {Folder} folder A single folder entry
   * @return {CheerioAPI}
   */
  createFolder = (folder: Folder) => {
    const init = this.createInitHtml(createHtmlFolder(folder))
    return init
  }
  /**
   * @description: Generate the markup for a file (link)
   * @param {File} file A single file entry
   * @return {CheerioAPI}
   */
  createFile = (file: File) => {
    const init = this.createInitHtml(createHtmlFile(file))
    return init
  }
  /**
   * @description: Decide whether an entry is a folder or a file:
   * a title means folder, a name means file, anything else is 'none'
   * @param {Folder | File} item A single entry
   * @return {'folder' | 'file' | 'none'}
   */
  checkIsFileOrFolder = (item: Folder | File) => item.title ? 'folder' : item.name ? 'file' : 'none'
}
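The Folder and File types from "@/layout/menu/types" are not shown in this article. Judging only by the fields that getNodeTitle, getNodeInfo and addToBookMarks assign, the JSON tree probably looks something like the sketch below. The type names `BookmarkFolder` and `BookmarkFile`, the helper `flattenLinks`, and the sample values are all assumptions for illustration.

```typescript
// Hypothetical shapes inferred from the fields HTMLSystem assigns;
// the real definitions live in "@/layout/menu/types" and may differ.
interface BookmarkFile {
  id?: string;
  name: string;
  href: string;
  icon: string;
  add_date?: string;
}

interface BookmarkFolder {
  id?: string;
  title: string;
  add_date?: string;
  last_modified?: string;
  children: BookmarkNode[];
}

type BookmarkNode = BookmarkFolder | BookmarkFile;

// Same rule as checkIsFileOrFolder: a non-empty title means folder.
const isFolder = (node: BookmarkNode): node is BookmarkFolder =>
  "title" in node && Boolean(node.title);

// Collect every link in the tree, depth first.
function flattenLinks(nodes: BookmarkNode[], out: BookmarkFile[] = []): BookmarkFile[] {
  for (const node of nodes) {
    if (isFolder(node)) flattenLinks(node.children, out);
    else out.push(node);
  }
  return out;
}

// Made-up sample; ids follow the pre-order counter that starts at 0.
const sampleTree: BookmarkNode[] = [
  {
    id: "0",
    title: "Bookmarks bar",
    add_date: "1660000000",
    last_modified: "1660000500",
    children: [
      { id: "1", name: "MDN", href: "https://developer.mozilla.org/", icon: "", add_date: "1660000100" },
      {
        id: "2",
        title: "Tools",
        add_date: "1660000200",
        last_modified: "1660000400",
        children: [
          { id: "3", name: "Vite", href: "https://vitejs.dev/", icon: "", add_date: "1660000300" },
        ],
      },
    ],
  },
];

console.log(flattenLinks(sampleTree).map((link) => link.name)); // logs [ 'MDN', 'Vite' ]
```

A nested tree like this maps naturally onto a menu component, while a flattened list is handy for search.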
html-config:
In addition, generating the HTML requires a few template functions
import { File, Folder } from "@/layout/menu/types";
/**
 * @description: Default header template of a bookmark file
 * @param {string} name Bookmark file title
 * @return {string}
 */
export const createHtmlTemp = (name: string) => `<!DOCTYPE NETSCAPE-Bookmark-file-1>
<!-- This is an automatically generated file.
It will be read and overwritten.
DO NOT EDIT! -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>${name}</TITLE>
<H1>${name}</H1>
`
/**
 * @description: Generate the markup for a folder
 * @param {Folder} folder Folder data
 * @return {string}
 */
export const createHtmlFolder = (folder: Folder) => `
<DT/>
<H3 ADD_DATE="${folder.add_date}" LAST_MODIFIED="${folder.last_modified}">${folder.title}</H3>
${createBaseTemp()}
`
/**
 * @description: Generate the markup for a file (link)
 * @param {File} file File data
 * @return {string}
 */
export const createHtmlFile = (file: File) => `
<DT/>
<A HREF="${file.href}" ICON="${file.icon}" ADD_DATE="${file.add_date}">${file.name}</A>
`
/**
 * @description: Markup for an empty list
 * @return {string}
 */
export const createBaseTemp = () => `
<DL><p>
</DL><p>
`
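The templates interpolate values as they are, so a title or URL containing `"`, `<` or `&` could produce malformed markup. One possible safeguard, which is not part of the original project, is to escape values before interpolating them. The names `escapeHtml`, `LinkLike` and `createSafeHtmlFile` below are hypothetical, and the sketch assumes the generated string is later parsed by something that decodes entities.

```typescript
// Escape the characters that are significant in HTML text and attributes.
// "&" must be replaced first so the other entities are not double-escaped.
const escapeHtml = (value: string): string =>
  value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");

// Assumed minimal shape of a link entry (the real File type is not shown).
interface LinkLike {
  name: string;
  href: string;
  icon?: string;
  add_date?: string;
}

// Variant of createHtmlFile that escapes every interpolated value.
const createSafeHtmlFile = (file: LinkLike): string => `
<DT/>
<A HREF="${escapeHtml(file.href)}" ICON="${escapeHtml(file.icon ?? "")}" ADD_DATE="${escapeHtml(file.add_date ?? "")}">${escapeHtml(file.name)}</A>
`;

console.log(
  createSafeHtmlFile({ name: 'Tom & "Jerry"', href: "https://example.com/?a=1&b=2" })
);
```

Missing optional fields become empty attributes here instead of the literal text "undefined".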
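Closing notes
Final result: BookMarks
Source code: book_mark: a purely front-end tool that imports and exports HTML bookmarks and generates a bookmark navigation page
Finally, thank you for reading this far. If this article helped you, please show the author some support!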