Write a regular expression that extracts the src URL of every img tag in an HTML source string
function getImgSrcs(htmlString) {
  // <img\b         the tag name, ending at a word boundary
  // [^>]*?\s       any earlier attributes, then whitespace (so data-src is skipped)
  // src\s*=\s*     the attribute name, with optional spaces around "="
  // (["'])(.*?)\1  a double- or single-quoted value; group 2 is the URL
  const regex = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/gi;
  let match;
  const srcs = [];
  while ((match = regex.exec(htmlString)) !== null) {
    srcs.push(match[2]);
  }
  return srcs;
}
// Example usage:
const html = `
<div>
<img src="image1.jpg" alt="Image 1">
<img src='image2.png' alt="Image 2">
<img src = "image3.gif" alt="Image 3">
<img src= "image4.webp" alt="Image 4">
<img data-src="data-image.svg">
<p>Some text here</p>
<img src="image5.jpg" alt="Image 5">
</div>`;
const imageSrcs = getImgSrcs(html);
console.log(imageSrcs); // Output: ['image1.jpg', 'image2.png', 'image3.gif', 'image4.webp', 'image5.jpg']
Explanation and Improvements:
- /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/gi: This is the core regular expression. Let's break it down:
  - <img\b: Matches the literal string <img followed by a word boundary, so a longer tag name such as <imgs> is not matched.
  - [^>]*?\s: Lazily matches any characters that are not a closing angle bracket (>), followed by one whitespace character. This skips over earlier attributes while staying inside the same img tag, and the required whitespace means src must begin a new attribute.
  - src\s*=\s*: Matches the attribute name src and the equals sign, with optional whitespace on either side.
  - (["']): The first capturing group. It matches the opening quote, either double or single.
  - (.*?)\1: The second capturing group lazily matches the src value, up to the backreference \1, which requires the same kind of quote that opened the value.
  - /gi: The g flag makes the regex global, so it finds all matches in the string, not just the first one. The i flag makes it case-insensitive, so <IMG SRC="..."> also matches.
- match[2]: The match object returned by regex.exec() contains the entire matched string at index 0, the opening quote at index 1, and the src value at index 2.
- Handles variations in the src attribute: The regex handles variations in spacing around the equals sign (=), such as src = "image3.gif" or src= "image4.webp", as well as single-quoted values. A simpler pattern such as /<img[^>]+src="([^">]+)"/g accepts only double quotes with no spaces around =, and it also matches data-src.
- Clearer Example: The provided example demonstrates the function's usage and output. It includes different spacing and quoting variations and an img tag with a data-src attribute (which is ignored).
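The exec() loop and numbered groups described above can also be written with String.prototype.matchAll and named capture groups. The following is a minimal sketch, assuming a runtime that supports ES2018 named groups and matchAll; the function name getImgSrcsMatchAll and the group names quote and src are illustrative choices, not part of the original answer.

```javascript
// Hypothetical variant of getImgSrcs; all names here are illustrative.
function getImgSrcsMatchAll(htmlString) {
  // Named groups: "quote" is the opening quote, "src" is the attribute value.
  // \k<quote> is a backreference that requires the same closing quote.
  const regex = /<img\b[^>]*?\ssrc\s*=\s*(?<quote>["'])(?<src>.*?)\k<quote>/gi;
  // matchAll requires the g flag and yields one match object per matching tag.
  return Array.from(htmlString.matchAll(regex), (m) => m.groups.src);
}

const sample = `<img src="a.jpg"><img src='b.png'><img src = "c.gif"><img data-src="d.svg">`;
console.log(getImgSrcsMatchAll(sample)); // the data-src tag is not included
```

Reading m.groups.src instead of a numeric index means the code does not need to change if another group is added to the pattern later.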
This version is more robust for extracting src attributes from typical HTML strings. It still has limits: it does not match unquoted values such as src=image.jpg, and it will also match img tags that appear inside comments or script text. For complex HTML parsing tasks, a dedicated HTML parser is generally recommended; the regex solution suits simpler cases and is often sufficient for front-end development needs.
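For the parser-based route mentioned above, one option in front-end code is the browser's built-in DOMParser. The sketch below assumes a browser environment; Node.js has no built-in DOMParser, so a third-party parser would be needed there. The function name getImgSrcsWithDom is hypothetical.

```javascript
// A sketch only: assumes a browser environment that provides DOMParser.
// getImgSrcsWithDom is a hypothetical name, not part of the original answer.
function getImgSrcsWithDom(htmlString) {
  const doc = new DOMParser().parseFromString(htmlString, 'text/html');
  // img[src] selects only img elements that carry a src attribute,
  // so <img data-src="..."> is skipped and commented-out tags are not elements.
  return Array.from(doc.querySelectorAll('img[src]'), (img) => img.getAttribute('src'));
}

// Guarded so the snippet does nothing outside a browser (for example in Node.js).
if (typeof DOMParser !== 'undefined') {
  console.log(getImgSrcsWithDom('<!-- <img src="old.jpg"> --><img src=new.jpg>'));
}
```

getAttribute('src') returns the attribute value as written in the markup, whereas the img.src property would return a resolved absolute URL; which one is wanted depends on the use case.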