Linux bash shell script batch download files All In One
solution
PDF crawler
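#!/bin/bash
# Download directory (must end with a slash)
downdir="/Users/xgqfrms-mbp/Documents/swift-ui/Memorize/000-xyz/pdfs/"
# Create the download directory if it does not exist yet; stop on failure
mkdir -p "$downdir" || exit 1
# $1 is the first argument passed to the script: the list file.
# read processes the file line by line; IFS=';' splits each line
# into the URL and the output filename.
while IFS=';' read -r url filename || [ -n "$url" ]
do
  # Skip blank lines
  [ -z "$url" ] && continue
  # Shell variables should be wrapped in double quotes
  echo "$url"
  # tr deletes the carriage return left by Windows (CRLF) line endings ✅
  filename=$(printf '%s' "$filename" | tr -d '\r')
  # Alternative: build the name from a counter
  # filename="l${index}.pdf"
  # curl performs the download; -o sets the output file
  curl "$url" -o "$downdir$filename"
done < "$1"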
# exit 0
# mkdir pdfs
$ bash ./auto-download-pdfs.sh cs193p.txt
# Or make the script executable and run it directly
$ chmod +x ./auto-download-pdfs.sh
$ ./auto-download-pdfs.sh cs193p.txt
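Each input line has the form url;filename. As a minimal sketch of how one such line can be split, the example below uses IFS=';' with read to separate the two fields and tr -d '\r' to strip a carriage return left by Windows-style line endings. The function name parse_line is hypothetical and only serves the illustration; nothing is downloaded.

```shell
#!/bin/bash
# parse_line is a hypothetical helper, used here only for illustration.
# It splits "url;filename" on the semicolon and strips any carriage return.
parse_line() {
  local url filename
  IFS=';' read -r url filename <<< "$1"
  filename=$(printf '%s' "$filename" | tr -d '\r')
  printf '%s\n%s\n' "$url" "$filename"
}

# Example with a CRLF-style line (the $'...' form inserts a real CR character).
parse_line $'https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l1.pdf;l1.pdf\r'
```

The first line printed is the URL and the second is the cleaned filename, which is what curl needs for its URL argument and its -o option.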
cs193p.txt
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l1.pdf;l1.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l2.pdf;l2.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l3.pdf;l3.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l4.pdf;l4.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l5.pdf;l5.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l6.pdf;l6.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l7.pdf;l7.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l8.pdf;l8.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l9.pdf;l9.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l10.pdf;l10.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l11.pdf;l11.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l12.pdf;l12.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l13.pdf;l13.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l14.pdf;l14.pdf
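Because every line names its own output file, a repeated filename would make one download overwrite another. The following is a small sketch for checking a list before running the script, assuming the url;filename format shown above. The function name find_duplicate_names and the example.com URLs are placeholders for illustration only.

```shell
#!/bin/bash
# find_duplicate_names is a hypothetical helper for illustration.
# It prints every output filename that appears more than once in the list file.
find_duplicate_names() {
  cut -d';' -f2 "$1" | tr -d '\r' | sort | uniq -d
}

# Self-contained demonstration with a temporary list file (placeholder URLs).
list=$(mktemp)
printf '%s\n' \
  'https://example.com/a.pdf;a.pdf' \
  'https://example.com/b.pdf;a.pdf' \
  'https://example.com/c.pdf;c.pdf' > "$list"
find_duplicate_names "$list"   # prints: a.pdf
rm -f "$list"
```

An empty result means every output filename in the list is unique.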
demos
CS193p PDFs, 2020 Spring L1 ~ L14
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l1.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l2.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l3.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l4.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l5.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l6.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l7.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l8.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l9.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l10.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l11.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l12.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l13.pdf
https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l14.pdf
Linux tr
command
tr translates or deletes characters
$ man tr > man-tr.md
$ cat man-tr.md
TR(1) User Commands TR(1)
NAME
tr - translate or delete characters
SYNOPSIS
tr [OPTION]... SET1 [SET2]
DESCRIPTION
Translate, squeeze, and/or delete characters from standard input, writing to standard output.
-c, -C, --complement
use the complement of SET1
-d, --delete
delete characters in SET1, do not translate
-s, --squeeze-repeats
replace each sequence of a repeated character that is listed in the last specified SET, with a
single occurrence of that character
-t, --truncate-set1
first truncate SET1 to length of SET2
--help display this help and exit
--version
output version information and exit
SETs are specified as strings of characters. Most represent themselves. Interpreted sequences are:
\NNN character with octal value NNN (1 to 3 octal digits)
\\ backslash
\a audible BEL
\b backspace
\f form feed
\n new line
\r return
\t horizontal tab
\v vertical tab
CHAR1-CHAR2
all characters from CHAR1 to CHAR2 in ascending order
[CHAR*]
in SET2, copies of CHAR until length of SET1
[CHAR*REPEAT]
REPEAT copies of CHAR, REPEAT octal if starting with 0
[:alnum:]
all letters and digits
[:alpha:]
all letters
[:blank:]
all horizontal whitespace
[:cntrl:]
all control characters
[:digit:]
all digits
[:graph:]
all printable characters, not including space
[:lower:]
all lower case letters
[:print:]
all printable characters, including space
[:punct:]
all punctuation characters
[:space:]
all horizontal or vertical whitespace
[:upper:]
all upper case letters
[:xdigit:]
all hexadecimal digits
[=CHAR=]
all characters which are equivalent to CHAR
Translation occurs if -d is not given and both SET1 and SET2 appear. -t may be used only when translating. SET2 is extended to length of SET1 by repeating its last character as necessary. Excess characters of SET2 are ignored. Only [:lower:] and [:upper:] are guaranteed to expand in ascending order; used in SET2 while translating, they may only be used in pairs to specify case conversion. -s uses the last specified SET, and occurs after translation or deletion.
AUTHOR
Written by Jim Meyering.
REPORTING BUGS
GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Report any translation bugs to <https://translationproject.org/team/>
COPYRIGHT
Copyright © 2020 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later
<https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent
permitted by law.
SEE ALSO
Full documentation <https://www.gnu.org/software/coreutils/tr>
or available locally via: info '(coreutils) tr invocation'
GNU coreutils 8.32 September 2020 TR(1)
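A few of the options described in the manual page can be tried directly. The commands below are a short sketch using only -d, -s, and plain translation as documented above; the input strings are made up for the demonstration.

```shell
#!/bin/bash
# -d: delete carriage returns, as the download script does for filenames.
printf 'l1.pdf\r\n' | tr -d '\r'                  # prints: l1.pdf

# Translation with character classes: lower case to upper case.
printf 'cs193p\n' | tr '[:lower:]' '[:upper:]'    # prints: CS193P

# -s: squeeze each run of repeated spaces into a single space.
printf 'a   b    c\n' | tr -s ' '                 # prints: a b c

# Plain translation: replace every semicolon with a space.
printf 'url;name\n' | tr ';' ' '                  # prints: url name
```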
emmet
Use VS Code Emmet syntax to generate the numbered URLs dynamically
$index.pdf
p{https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l$.pdf}*14
https://code.visualstudio.com/docs/editor/emmet
https://github.com/emmetio/emmet
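The same numbered list can also be produced in the shell instead of the editor. This is a minimal sketch, assuming the lectures are numbered 1 to 14 and that each output line should use the url;filename format of cs193p.txt. The variable name base and the function name make_list are illustrative, not part of the original script.

```shell
#!/bin/bash
# Base URL taken from the lecture links; make_list is a hypothetical helper name.
base="https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file"

# Print one "url;filename" line per lecture number from 1 to $1.
make_list() {
  local i
  for ((i = 1; i <= $1; i++)); do
    printf '%s/l%d.pdf;l%d.pdf\n' "$base" "$i" "$i"
  done
}

make_list 14
```

Redirecting the output, for example with make_list 14 > cs193p.txt, gives a list file that the download script can read.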
TODO
Node.js version
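const fs = require("fs");
const path = require("path");
const log = console.log;
// Note: the `request` package is callback-based, so `await request.get(...)`
// does not resolve to the response body. This version swaps in the global
// fetch that is built into Node.js 18+ and needs no extra dependency.
const folder = path.resolve(__dirname, '../pdf');
// log('folder', folder);
if (!fs.existsSync(folder)) {
  fs.mkdirSync(folder, { recursive: true });
}
async function downloadPDF(url, filename) {
  log('🚧 pdf downloading ...');
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`download failed: HTTP ${response.status}`);
  }
  // Read the body as raw bytes so the PDF is not corrupted by text decoding
  const pdfBuffer = Buffer.from(await response.arrayBuffer());
  fs.writeFileSync(filename, pdfBuffer);
  log('✅ pdf finished!');
}
const url = 'https://cs193p.sites.stanford.edu/sites/g/files/sbiybj16636/files/media/file/l1.pdf';
const filename = path.join(folder, 'cs193p-2021-l1.pdf');
// log('filename =', filename);
downloadPDF(url, filename).catch((err) => {
  // Report the failure and exit with a non-zero status
  console.error(err);
  process.exit(1);
});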
https://www.cnblogs.com/xgqfrms/p/16086580.html
npm package
$ npm i -g auto-download-files
https://www.npmjs.com/package/auto-download-files
Python version
//
refs
TypeScript & Node.js crawler All In One
https://www.cnblogs.com/xgqfrms/p/16086580.html
©xgqfrms 2012-2025
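Published at www.cnblogs.com/xgqfrms: accessible to registered users only!
Original article, copyright ©️xgqfrms. Reproduction prohibited 🈲️; infringement will be pursued ⚠️!
This article was first published on cnblogs (博客园). Author: xgqfrms. Original link: https://www.cnblogs.com/xgqfrms/p/16073509.html
Reproduction without authorization is prohibited; violators will be held liable!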