xgqfrms™, xgqfrms® : xgqfrms's official website of cnblogs! xgqfrms™, xgqfrms® : xgqfrms's official website of GitHub!

js & anti-crawl & crawler spam

demo & X-Sign


    , function(t, e, n) {
        "use strict";
        var r = n(126)
          , o = n.n(r)
          , i = "WSUDD"
          , a = "X"
          , s = "/fe_api/";
        e.a = {
            name: "crawler-spam",
            install: function(t, e) {
                e.isBrowser && e.http.interceptors.dispatch.use(function(t) {
                    return t.url.indexOf(s) > -1 && (t.headers["X-Sign"] = function(t, e) {
                        var n = arguments.length > 2 && void 0 !== arguments[2] ? arguments[2] : a
                          , r = t.url
                          , u = void 0 === r ? "" : r
                          , c = t.params
                          , f = t.paramsSerializer;
                        return u = u.slice(u.indexOf(s), u.length),
                        n === a ? "" + n + o()(e(u, c, f) + i) : ""
                    }(t, e.http.buildURL)),
                    t
                })
            }
        }
    }
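The minified module above is hard to read, so here is a de-minified sketch of the same logic. Assumptions: `hash` stands in for whatever module `n(126)` exports (the excerpt does not show which digest it is, so it is passed in as a parameter), `buildURL` below is a hypothetical stand-in for `e.http.buildURL`, and the names `xSign` and `signRequest` are mine, not the site's.

```javascript
// Constants taken from the minified excerpt.
const API_PREFIX = "/fe_api/"; // `s`
const SALT = "WSUDD";          // `i`
const VERSION = "X";           // `a`

// Hypothetical stand-in for e.http.buildURL: appends params as a query string.
function buildURL(path, params) {
  const query = new URLSearchParams(params || {}).toString();
  if (!query) return path;
  return path + (path.includes("?") ? "&" : "?") + query;
}

// Mirrors the inner function: version + hash(builtUrl + salt).
// `hash` is injected because the real digest module is not shown in the excerpt.
function xSign(config, buildURLFn, hash, version = VERSION) {
  const { url = "", params, paramsSerializer } = config;
  // Keep only the part of the URL that starts at "/fe_api/".
  const path = url.slice(url.indexOf(API_PREFIX));
  if (version !== VERSION) return "";
  return version + hash(buildURLFn(path, params, paramsSerializer) + SALT);
}

// Mirrors the interceptor: only /fe_api/ requests get an X-Sign header.
function signRequest(config, buildURLFn, hash) {
  if (config.url.indexOf(API_PREFIX) > -1) {
    config.headers["X-Sign"] = xSign(config, buildURLFn, hash);
  }
  return config;
}
```

To reproduce a real header value you would still need to find out which digest module 126 implements (step 3 below) and pass it in as `hash`.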

  1. find the crawler-spam js file name & get the json data

js & XHR

  2. open Sources, set breakpoints

js files

  3. debug, find the logic

step by step, trace back to the source

  4. mock / fake the request, crawl the data
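For the breakpoint and debug steps, one option (a sketch, not the only way) is to wrap the method that sets request headers and pause when the header name is `X-Sign`; the call stack then leads back to the signing code. `hookMethod` is a hypothetical helper name, and the browser usage in the comment assumes the site sends its requests through `XMLHttpRequest`.

```javascript
// Wrap obj[methodName] so that onCall runs before every call.
// Returns a function that restores the original method.
function hookMethod(obj, methodName, onCall) {
  const original = obj[methodName];
  obj[methodName] = function (...args) {
    onCall(args, this);
    return original.apply(this, args);
  };
  return function restore() {
    obj[methodName] = original;
  };
}

// Browser console usage (assumption: requests go through XMLHttpRequest):
//
// const restore = hookMethod(XMLHttpRequest.prototype, "setRequestHeader", ([name, value]) => {
//   if (name === "X-Sign") {
//     console.trace("X-Sign =", value); // the stack trace points at the signing code
//     debugger;                          // pauses here while DevTools is open
//   }
// });
// ...reload the data, inspect the call stack, then call restore().
```

Walking up the call stack from the paused frame is the "trace back to the source" part: it should end at the interceptor that computes the header.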

blogs

crawler-spam

https://www.xiaohongshu.com/page/hot
https://www.xiaohongshu.com/explore

https://www.edificeautomotive.com/blog/2016/02/26/ghost-and-crawler-spam/

referral exclusion list

https://support.google.com/analytics/answer/2795830?hl=en

Referral exclusions

https://www.liquidlight.co.uk/blog/crawler-spam-referrals-how-to-filter-them-out-from-google-analytics/


hack methods

github

https://github.com/topics/xiaohongshu

https://github.com/lonngxiang/xiaohongshu-spider

https://github.com/vinchu/xiaohongshu-2

https://github.com/No-bb-just-do-it/xiaohongshu

npm

pm formula-static/@xhs/launcher



©xgqfrms 2012-2020

For articles published on www.cnblogs.com: only registered users are allowed access!

