Generates Regular Expressions That Match A Set of Strings

GitHub - devongovett/regexgen: Generate regular expressions that match a set of strings

How does it work?

  1. Generate a Trie containing all of the input strings. A Trie is a tree structure where each edge represents a character.
  2. A trie can be seen as a tree-shaped deterministic finite automaton (DFA). Hopcroft's DFA minimization algorithm is used to merge the nondistinguishable states.
    • What is the difference between finite automata and finite state machines? First, in FSMs for circuit design the input signal is mostly assumed to be a bit (binary), whereas a finite automaton can have a general "abstract" alphabet of input symbols. Second, an FSM also generates an output associated with the state reached, also binary; in automata terminology this extension is called a Moore machine. Automata, however, have final (or accepting) states that signal that a favourable input has been read. Finally, FSMs are mostly deterministic, i.e., for each input in a given state there is exactly one next state, whereas automata theory also considers the nondeterministic variant, in which there may be a choice of where to move.
    • NFA stands for nondeterministic finite automaton.
  3. Convert the resulting minimized DFA to a regular expression. This is done using Brzozowski's algebraic method.
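To make steps 1 and 2 concrete, here is a minimal sketch, not the regexgen source. The names `buildTrie`, `minimize` and `countStates` are hypothetical. Because a trie is acyclic, the sketch merges equivalent states by comparing subtree signatures bottom-up, which is a simpler stand-in for Hopcroft's algorithm (the latter also handles DFAs with cycles).

```javascript
// Step 1: build a trie; each edge is labelled with one character.
function buildTrie(words) {
  const root = { accepting: false, edges: new Map() };
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!node.edges.has(ch)) {
        node.edges.set(ch, { accepting: false, edges: new Map() });
      }
      node = node.edges.get(ch);
    }
    node.accepting = true; // end of an input string
  }
  return root;
}

// Step 2 (simplified): two states are equivalent when they agree on
// "accepting" and on where every outgoing edge leads. Children are
// canonicalised first, so equal signatures mean equal behaviour.
function minimize(node, registry = new Map()) {
  const parts = [];
  for (const ch of [...node.edges.keys()].sort()) {
    const child = minimize(node.edges.get(ch), registry);
    node.edges.set(ch, child); // point at the canonical state
    parts.push([ch, child.id]);
  }
  const signature = JSON.stringify([node.accepting, parts]);
  if (!registry.has(signature)) {
    node.id = registry.size;
    registry.set(signature, node);
  }
  return registry.get(signature);
}

// Count the distinct states reachable from the root.
function countStates(root) {
  const seen = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const state = stack.pop();
    if (seen.has(state)) continue;
    seen.add(state);
    for (const child of state.edges.values()) stack.push(child);
  }
  return seen.size;
}

const trie = buildTrie(['foobar', 'foobaz', 'quxbar', 'quxbaz']);
const statesBefore = countStates(trie);
const statesAfter = countStates(minimize(trie));
console.log(statesBefore, statesAfter); // 15 9
```

The shared suffix `ba[rz]` is stored twice in the trie but only once after merging, which is what lets step 3 emit a compact pattern instead of a flat alternation of all inputs.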

Modification: there is no need to install Node.js; just open test.html:

<script src='jsesc.js'></script>
<script src='regenerate.js'></script>
<!-- This class extends the native ES6 Set class with some additional methods -->
<script src='set.js'></script>
<!-- This ES6 Map subclass calls the getter function passed to the constructor to initialize
     undefined properties when they are first retrieved. -->
<script src='map.js'></script>
<!-- Represents a state in a DFA. -->
<script src='state.js'></script>
<!-- Implements Hopcroft's DFA minimization algorithm. -->
<script src='minimize.js'></script>
<!-- Represents an alternation (e.g. `foo|bar`) -->
<script src='ast.js'></script>
<!-- Implements Brzozowski's algebraic method to convert a DFA into a regular expression pattern. -->
<script src='regex.js'></script>
<!-- A Trie represents a set of strings in a tree data structure where each edge represents a single character. -->
<script src='trie.js'></script>
<script>
/**
 * Generates a regular expression that matches the given input strings.
 * @param {Array<string>} inputs
 * @param {string} flags
 * @return {RegExp}
 */
function regexgen(inputs, flags) {
  let trie = new Trie;
  trie.addAll(inputs);
  return trie.toRegExp(flags);
}
function print(s) { document.write(s + '<br>'); console.log(s + '\n'); }
let args = [ 'a', 'aa', 'aaa'];
print(regexgen(args, ''));
</script>
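The comment above `map.js` describes a Map subclass that calls a getter function to initialize missing entries on first retrieval. A minimal sketch of such a class might look like the following. It is an illustration under assumptions, not the contents of `map.js`: the class name `DefaultMap` and the constructor signature are assumed here.

```javascript
// Assumed shape: a Map whose get() creates the value on first access.
class DefaultMap extends Map {
  constructor(defaultFactory, entries) {
    super(entries);
    this.defaultFactory = defaultFactory; // called with the missing key
  }

  get(key) {
    if (!super.has(key)) {
      super.set(key, this.defaultFactory(key));
    }
    return super.get(key);
  }
}

// Usage: collect transition targets per input symbol without
// checking whether the list already exists.
const transitions = new DefaultMap(() => []);
transitions.get('a').push(1);
transitions.get('a').push(2);
transitions.get('b').push(3);
console.log(transitions.get('a')); // [ 1, 2 ]
```

This pattern keeps graph-building code short, since every lookup is guaranteed to return a usable container.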

https://files.cnblogs.com/files/blogs/714801/regexgen-devongovett.7z 14KB, contains test.html and all the .js files

  • After downloading, it did not run; renaming Map to DefaultMap and Set to ExtendedSet fixed it.
  • => denotes an arrow function.
  • (?: subexpression ) defines a non-capturing group. Use a non-capturing group when you need parentheses for grouping but do not want to capture the subexpression.
  • Example | Documentation 1 | Documentation 2 | Try it yourself
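The two syntax notes above (arrow functions and non-capturing groups) can be seen together in a short sketch. The patterns and the helper name `matchesAll` are made up for illustration and are not output from regexgen.

```javascript
// A capturing group stores the matched subexpression in the result;
// a non-capturing group only groups.
const capturing = /(foo|bar)baz/;
const nonCapturing = /(?:foo|bar)baz/;

const m1 = 'foobaz'.match(capturing);
const m2 = 'foobaz'.match(nonCapturing);
console.log(m1.length, m1[1]); // 2 foo
console.log(m2.length);        // 1

// Arrow function: checks that a pattern matches every word in a list.
const matchesAll = (re, words) => words.every(w => re.test(w));
console.log(matchesAll(/^(?:foo|bar)baz$/, ['foobaz', 'barbaz'])); // true
```

Generated patterns tend to use `(?: ... )` because the groups exist only to scope alternation and quantifiers, so there is no reason to pay for captures.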
posted @ 2022-12-20 20:29  Fun_with_Words








