A Go library implementing an FST (finite state transducer), bookmarked for reference
https://github.com/couchbaselabs/vellum
Building an FST
To build an FST, create a new builder using the New() method. This method takes an io.Writer as an argument. As the FST is being built, data will be streamed to the writer as soon as possible. With this builder you MUST insert keys in lexicographic order. Inserting keys out of order will result in an error. After inserting the last key into the builder, you MUST call Close() on the builder. This will flush all remaining data to the underlying writer.
In memory:
var buf bytes.Buffer
builder, err := vellum.New(&buf, nil)
if err != nil {
	log.Fatal(err)
}
To disk:
f, err := os.Create("/tmp/vellum.fst")
if err != nil {
	log.Fatal(err)
}
// Close the file once the builder has finished writing to it.
defer f.Close()
builder, err := vellum.New(f, nil)
if err != nil {
	log.Fatal(err)
}
MUST insert keys in lexicographic order:
err = builder.Insert([]byte("cat"), 1)
if err != nil {
	log.Fatal(err)
}
err = builder.Insert([]byte("dog"), 2)
if err != nil {
	log.Fatal(err)
}
err = builder.Insert([]byte("fish"), 3)
if err != nil {
	log.Fatal(err)
}
// Close() flushes all remaining data to the underlying writer.
err = builder.Close()
if err != nil {
	log.Fatal(err)
}
Using an FST
After closing the builder, the data can be used to instantiate an FST. If the data was written to disk, you can use the Open() method to mmap the file. If the data is already in memory, or you wish to load/mmap the data yourself, you can instantiate the FST with the Load() method.
Load in memory:
fst, err := vellum.Load(buf.Bytes())
if err != nil {
	log.Fatal(err)
}
Open from disk:
fst, err := vellum.Open("/tmp/vellum.fst")
if err != nil {
	log.Fatal(err)
}
Get key/value:
val, exists, err := fst.Get([]byte("dog"))
if err != nil {
	log.Fatal(err)
}
if exists {
	fmt.Printf("contains dog with val: %d\n", val)
} else {
	fmt.Printf("does not contain dog\n")
}
Iterate key/values:
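itr, err := fst.Iterator(startKeyInclusive, endKeyExclusive)
for err == nil {
	key, val := itr.Current()
	fmt.Printf("contains key: %s val: %d\n", key, val)
	err = itr.Next()
}
// The iterator signals the end of the range with vellum.ErrIteratorDone,
// which is the normal way for the loop to finish and not a failure.
if err != nil && err != vellum.ErrIteratorDone {
	log.Fatal(err)
}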
How does the FST get built?
A full example of the implementation is beyond the scope of this README, but let's consider a small example where we want to insert 3 key/value pairs.
First we insert "are" with the value 4.
Next, we insert "ate" with the value 2.
Notice how the values associated with the transitions were adjusted so that by summing them while traversing we still get the expected value.
At this point, we see that state 5 looks like state 3, and state 4 looks like state 2. But, we cannot yet combine them because future inserts could change this.
Now, we insert "see" with value 3. Once it has been added, we know that states 5 and 4 can no longer change. Since they are identical to 3 and 2, we replace them.
Again, we see that states 7 and 8 appear to be identical to 2 and 3.
Having inserted our last key, we call Close() on the builder.
Now, states 7 and 8 can safely be replaced with 2 and 3.
For additional information, see the references at the bottom of this document.