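A survey of some algorithms for inverted indexes
The paper below studies intersection algorithms specifically for the inverted lists (sorted sets) used in search engines. Its divide-and-conquer approach resembles quicksort, and it is well worth reading: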
www.cs.ucr.edu/~stelo/cpm/cpm04/25_Baeza-yates.pdf
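
To make the quicksort analogy concrete, here is a minimal sketch of that divide-and-conquer idea: take the median of the smaller list as a pivot, binary-search for it in the larger list, then recurse on the left and right halves. It assumes both inputs are sorted lists of distinct integers; the function names `intersect` and `_intersect` are illustrative and do not come from the paper.

```python
from bisect import bisect_left


def intersect(a, b):
    """Intersect two sorted lists of distinct integers (illustrative sketch)."""
    out = []
    _intersect(a, 0, len(a), b, 0, len(b), out)
    return out


def _intersect(a, alo, ahi, b, blo, bhi, out):
    # Nothing to do once either range is empty.
    if alo >= ahi or blo >= bhi:
        return
    # Always pick the pivot from the smaller range.
    if ahi - alo > bhi - blo:
        a, alo, ahi, b, blo, bhi = b, blo, bhi, a, alo, ahi
    mid = (alo + ahi) // 2
    pivot = a[mid]
    # Binary search for the pivot inside the larger range.
    pos = bisect_left(b, pivot, blo, bhi)
    # Left halves first, so the output stays sorted.
    _intersect(a, alo, mid, b, blo, pos, out)
    found = pos < bhi and b[pos] == pivot
    if found:
        out.append(pivot)
    # Right halves: skip the pivot in both lists.
    _intersect(a, mid + 1, ahi, b, pos + (1 if found else 0), bhi, out)


print(intersect([1, 3, 5, 7, 9, 11], [2, 3, 4, 7, 8, 11, 20]))  # [3, 7, 11]
```

When one list is much shorter than the other, each pivot lookup costs only a logarithmic number of comparisons in the longer list, which is the case inverted-list intersection often hits.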
Summary of resources: https://github.com/TechConf/CodeMash2016/blob/master/Great%20Galloping%20Cuckoos-%20Algorithms%20Faster%20than%20log(n)/index.html
Key information:
## Comparisons of Set Intersections
<small>Excerpted from [Faster Adaptive Set Intersections for Text Searching](http://www.cs.toronto.edu/~tl/papers/wea06.pdf)</small>

Algorithm | # of comparisons
-----------|----------------:
Sequential | 119479075
Adaptive | 83326341
Small Adaptive | 68706234
Interpolation Sequential | 55275738
Interpolation Adaptive | 58558408
Interpolation Small Adaptive | 44525318
Extrapolation Small Adaptive | 50018852
Extrapolate Many Small Adaptive | 44087712
Extrapolate Ahead Small Adaptive | 43930174
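
The comparison counts above depend largely on how each algorithm searches a list for the next candidate element. As a rough illustration, the sketch below intersects two sorted lists with galloping (exponential) search, which, as far as I understand, is the search primitive that the adaptive variants start from before it is swapped for interpolation or extrapolation. It is a two-list simplification under that assumption, not the paper's implementation, and the names `gallop` and `gallop_intersect` are made up for this example.

```python
from bisect import bisect_left


def gallop(lst, target, lo):
    """Return the first index i >= lo with lst[i] >= target (len(lst) if none)."""
    n = len(lst)
    if lo >= n or lst[lo] >= target:
        return lo
    # Invariant from here on: lst[lo] < target.
    step = 1
    while lo + step < n and lst[lo + step] < target:
        lo += step
        step *= 2  # double the stride after every miss
    hi = min(lo + step, n)
    # The answer lies in (lo, hi]; finish with a binary search.
    return bisect_left(lst, target, lo + 1, hi)


def gallop_intersect(small, large):
    """Intersect two sorted lists; `small` is assumed to be the shorter one."""
    out = []
    pos = 0
    for x in small:
        pos = gallop(large, x, pos)
        if pos == len(large):
            break  # every remaining element of `small` exceeds all of `large`
        if large[pos] == x:
            out.append(x)
    return out


print(gallop_intersect([3, 7, 11], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]))  # [3, 7]
```

Because each search resumes from the previous position, the cost of a lookup grows with the logarithm of the distance skipped rather than with the full length of the longer list.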
## Resources: Sets
- [A Fast Set Intersection Algorithm for Sorted Sequences](http://www.cs.ucr.edu/~stelo/cpm/cpm04/25_Baeza-yates.pdf)
- [Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences](https://cs.uwaterloo.ca/~ajsaling/papers/paper-spire.pdf)
- [Experimental Comparison of Set Intersection Algorithms for Inverted Indexing](http://ceur-ws.org/Vol-1003/58.pdf)
- [Fast Set Intersection in Memory](http://research.microsoft.com/pubs/142850/p255-dingkoenig.pdf)
- [Faster Adaptive Set Intersections for Text Searching](http://www.cs.toronto.edu/~tl/papers/wea06.pdf)
- [Faster Set Intersection with SIMD instructions by Reducing Branch Mispredictions](http://www.vldb.org/pvldb/vol8/p293-inoue.pdf)
- [SIMD Compression and the Intersection of Sorted Integers](http://arxiv.org/abs/1401.6399)
https://github.com/lemire/SIMDCompressionAndIntersection
Frequently mentioned:
https://github.com/Randl/CS/tree/master/Hwang-Lin