HTMLParser Filters: A Brief Summary
1. Logical relations: AND, OR, NOT
AndFilter() Creates a new instance of an AndFilter.
AndFilter(NodeFilter[] predicates) Creates an AndFilter that accepts nodes acceptable to all given filters.
AndFilter(NodeFilter left, NodeFilter right) Creates an AndFilter that accepts nodes acceptable to both filters.
OrFilter() Creates a new instance of an OrFilter.
OrFilter(NodeFilter[] predicates) Creates an OrFilter that accepts nodes acceptable to any of the given filters.
OrFilter(NodeFilter left, NodeFilter right) Creates an OrFilter that accepts nodes acceptable to either filter.
NotFilter() Creates a new instance of a NotFilter.
NotFilter(NodeFilter predicate) Creates a NotFilter that accepts nodes not acceptable to the predicate filter.
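The AND / OR / NOT semantics can be sketched in plain Java. This is only a model under stated assumptions, not HTMLParser code: SimpleFilter, FilterLogic, allOf, anyOf and not are hypothetical names, and plain strings stand in for nodes.

```java
// Hypothetical stand-in for a node filter; NOT the library's NodeFilter.
interface SimpleFilter {
    boolean accept(String node);
}

final class FilterLogic {
    private FilterLogic() {}

    // AND: accept a node only if every given filter accepts it.
    static SimpleFilter allOf(SimpleFilter... filters) {
        return node -> {
            for (SimpleFilter f : filters) {
                if (!f.accept(node)) {
                    return false;
                }
            }
            return true;
        };
    }

    // OR: accept a node if at least one given filter accepts it.
    static SimpleFilter anyOf(SimpleFilter... filters) {
        return node -> {
            for (SimpleFilter f : filters) {
                if (f.accept(node)) {
                    return true;
                }
            }
            return false;
        };
    }

    // NOT: accept exactly the nodes the wrapped filter rejects.
    static SimpleFilter not(SimpleFilter filter) {
        return node -> !filter.accept(node);
    }
}
```

Because the combinators return filters themselves, they nest freely, e.g. FilterLogic.allOf(a, FilterLogic.not(b)). In this sketch an empty allOf() accepts everything and an empty anyOf() accepts nothing; whether the library's no-argument constructors behave the same way should be checked against its Javadoc.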
2. Content
StringFilter: simple and limited; for more complex matching use RegexFilter (regular expressions).
StringFilter() Creates a new instance of StringFilter that accepts all string nodes.
StringFilter(String pattern) Creates a StringFilter that accepts text nodes containing a string. The search is case-insensitive.
StringFilter(String pattern, boolean sensitive) Creates a StringFilter that accepts text nodes containing a string; sensitive selects case-sensitive comparison.
StringFilter(String pattern, boolean sensitive, Locale locale) Creates a StringFilter that accepts text nodes containing a string; locale is used for case conversion when the comparison is case-insensitive.
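What the sensitive and locale parameters mean for a substring search can be sketched as follows. This is a plain-Java model, not the library's code: SubstringMatch and contains are hypothetical names, and the upper-casing approach is an assumption about how a case-insensitive comparison might be done.

```java
import java.util.Locale;

final class SubstringMatch {
    private SubstringMatch() {}

    // Returns true if text contains pattern. When caseSensitive is false,
    // both sides are upper-cased with the given locale before comparing
    // (an assumption of this sketch).
    static boolean contains(String text, String pattern,
                            boolean caseSensitive, Locale locale) {
        if (caseSensitive) {
            return text.contains(pattern);
        }
        return text.toUpperCase(locale).contains(pattern.toUpperCase(locale));
    }
}
```

The locale matters for some languages: with Locale.forLanguageTag("tr"), Java upper-cases "i" to a dotted capital İ, so a case-insensitive search for "TITLE" in "title" fails under the Turkish locale and succeeds under Locale.ENGLISH.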
RegexFilter() Creates a new instance of RegexFilter that accepts string nodes matching the regular expression ".*" using the FIND strategy.
RegexFilter(String pattern) Creates a new instance of RegexFilter that accepts string nodes matching a regular expression using the FIND strategy.
RegexFilter(String pattern, int strategy) Creates a new instance of RegexFilter that accepts string nodes matching a regular expression, using the given strategy (RegexFilter.MATCH, RegexFilter.LOOKINGAT or RegexFilter.FIND).
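The difference between the matching strategies is easiest to see with java.util.regex directly. The sketch below assumes the strategies correspond to Matcher.matches(), Matcher.lookingAt() and Matcher.find(); RegexStrategies and its Strategy enum are hypothetical names for illustration, not the library's constants.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexStrategies {
    private RegexStrategies() {}

    // Hypothetical strategy names used only in this sketch.
    enum Strategy { MATCH, LOOKING_AT, FIND }

    static boolean accepts(String text, String regex, Strategy strategy) {
        Matcher m = Pattern.compile(regex).matcher(text);
        switch (strategy) {
            case MATCH:
                return m.matches();    // the whole text must match
            case LOOKING_AT:
                return m.lookingAt();  // a match must start at the beginning
            default:
                return m.find();       // a match may occur anywhere
        }
    }
}
```

For the text "price: 42 USD", the pattern \d+ is accepted only with FIND, the pattern price is accepted with LOOKING_AT and FIND, and only a pattern covering the entire text, such as price: \d+ USD, is accepted with MATCH.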
3. Tags
TagNameFilter(): filter by tag name, e.g. div, img, ...
NodeClassFilter(): filter by tag class, e.g. LinkTag.class, ...
HasAttributeFilter(): filter by attribute, e.g. HasAttributeFilter("class", "className")
LinkRegexFilter(): match link URLs against a regular expression
TagNameFilter() Creates a new instance of TagNameFilter.
TagNameFilter(String name) Creates a TagNameFilter that accepts tags with the given name.
NodeClassFilter() Creates a NodeClassFilter that accepts Html tags.
NodeClassFilter(Class cls) Creates a NodeClassFilter that accepts tags of the given class.
HasAttributeFilter() Creates a new instance of HasAttributeFilter.
HasAttributeFilter(String attribute) Creates a new instance of HasAttributeFilter that accepts tags with the given attribute.
HasAttributeFilter(String attribute, String value) Creates a new instance of HasAttributeFilter that accepts tags with the given attribute and value.
LinkRegexFilter(String regexPattern) Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern.
LinkRegexFilter(String regexPattern, boolean caseSensitive) Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern; caseSensitive selects case-sensitive matching.
LinkStringFilter(String pattern) Creates a LinkStringFilter that accepts LinkTag nodes containing a URL that matches the supplied pattern.
LinkStringFilter(String pattern, boolean caseSensitive) Creates a LinkStringFilter that accepts LinkTag nodes containing a URL that matches the supplied pattern; caseSensitive selects case-sensitive matching.
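The two forms of attribute filtering (attribute present, versus attribute present with a given value) can be modelled in plain Java. SimpleTag and TagChecks are hypothetical stand-ins, not HTMLParser classes; the sketch assumes attribute names are stored in lower case and that values are compared by exact equality.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical, minimal stand-in for a parsed tag.
final class SimpleTag {
    final String name;
    final Map<String, String> attributes; // keys assumed to be lower case

    SimpleTag(String name, Map<String, String> attributes) {
        this.name = name;
        this.attributes = attributes;
    }
}

final class TagChecks {
    private TagChecks() {}

    // Tag name match, ignoring case (div and DIV are the same tag).
    static boolean hasName(SimpleTag tag, String name) {
        return tag.name.equalsIgnoreCase(name);
    }

    // One-argument form: the attribute only has to be present.
    static boolean hasAttribute(SimpleTag tag, String attribute) {
        return tag.attributes.containsKey(attribute.toLowerCase(Locale.ROOT));
    }

    // Two-argument form: the attribute must be present with exactly this value.
    static boolean hasAttribute(SimpleTag tag, String attribute, String value) {
        return value.equals(tag.attributes.get(attribute.toLowerCase(Locale.ROOT)));
    }
}
```

With exact equality, a tag carrying class="className other" would not pass hasAttribute(tag, "class", "className") in this sketch. If that kind of partial match is needed, a regular-expression or substring check on the attribute value is the usual alternative.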
4. Hierarchy
HasParentFilter() Creates a new instance of HasParentFilter.
HasParentFilter(NodeFilter filter) Creates a new instance of HasParentFilter that accepts nodes with the direct parent acceptable to the filter.
HasParentFilter(NodeFilter filter, boolean recursive) Creates a new instance of HasParentFilter that accepts nodes with a parent acceptable to the filter; if recursive is true, any ancestor may match, not only the direct parent.
HasChildFilter() Creates a new instance of a HasChildFilter.
HasChildFilter(NodeFilter filter) Creates a new instance of HasChildFilter that accepts nodes with a direct child acceptable to the filter.
HasChildFilter(NodeFilter filter, boolean recursive) Creates a new instance of HasChildFilter that accepts nodes with a child acceptable to the filter; if recursive is true, any descendant may match, not only a direct child.
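The effect of the recursive flag can be shown on a tiny tree. The sketch below is a plain-Java model, not HTMLParser code: DomNode and Hierarchy are hypothetical names, and java.util.function.Predicate stands in for a node filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical minimal tree node with a parent link and a child list.
final class DomNode {
    final String name;
    DomNode parent;
    final List<DomNode> children = new ArrayList<>();

    DomNode(String name) {
        this.name = name;
    }

    DomNode add(DomNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

final class Hierarchy {
    private Hierarchy() {}

    // Non-recursive: only the direct parent is tested.
    // Recursive: every ancestor up to the root is tested.
    static boolean hasParent(DomNode node, Predicate<DomNode> filter, boolean recursive) {
        for (DomNode p = node.parent; p != null; p = p.parent) {
            if (filter.test(p)) {
                return true;
            }
            if (!recursive) {
                break;
            }
        }
        return false;
    }

    // Non-recursive: only direct children are tested.
    // Recursive: all descendants are tested, depth first.
    static boolean hasChild(DomNode node, Predicate<DomNode> filter, boolean recursive) {
        for (DomNode c : node.children) {
            if (filter.test(c)) {
                return true;
            }
            if (recursive && hasChild(c, filter, true)) {
                return true;
            }
        }
        return false;
    }
}
```

For a tree div > p > a, the a node has a div ancestor only when recursive is true, since its direct parent is p. Likewise the div node has an a descendant only in recursive mode.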