A simple search engine demo.
https://github.com/bobbyz3g/Chihiro
A full-stack application demo built on Elasticsearch.
Scrapy crawls the data and feeds it into Elasticsearch.
Django serves the web search interface and calls Elasticsearch's search API.
Introduction
Chihiro is a simple search engine demo. It shows you how to build a search engine website using Scrapy, Django, and Elasticsearch.
Chihiro consists of ChihiroSearch and ChihiroSpider.
- ChihiroSearch: website backend.
- ChihiroSpider: Spiders.
Problem
The example above is not fully packaged with Docker.
The repo below improves on this.
https://github.com/fanqingsong/Chihiro
Running the project takes only the following commands:
docker-compose build
docker-compose up
Related topics
Scrapy selectors
https://scrapy-chs.readthedocs.io/zh-cn/latest/topics/selectors.html
Crawl target
http://quotes.toscrape.com/page/1/
elasticsearch
https://elasticsearch-py.readthedocs.io/en/v7.12.0/api.html
https://elasticsearch-dsl.readthedocs.io/en/latest/index.html
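The two halves of the pipeline described in the introduction (Scrapy writing into Elasticsearch, Django querying it) can be sketched with plain dictionaries. This is only an illustration under stated assumptions: the index name `quotes` and the field names `text` and `author` are hypothetical and are not taken from the Chihiro repo, and the helper names `to_bulk_actions` and `build_search_body` are made up for this example.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical index name; the real project may use a different one.
INDEX_NAME = "quotes"


def to_bulk_actions(
    items: Iterable[Dict[str, Any]], index: str = INDEX_NAME
) -> Iterator[Dict[str, Any]]:
    """Wrap scraped items in the action format that elasticsearch.helpers.bulk accepts."""
    for item in items:
        yield {"_index": index, "_source": dict(item)}


def build_search_body(keyword: str, page: int = 1, page_size: int = 10) -> Dict[str, Any]:
    """Build a paginated full-text query over the assumed `text` and `author` fields."""
    if page < 1:
        raise ValueError("page must be >= 1")
    return {
        "query": {"multi_match": {"query": keyword, "fields": ["text", "author"]}},
        "from": (page - 1) * page_size,  # offset of the first hit on this page
        "size": page_size,
    }
```

With the 7.x client linked above, the spider side would pass `to_bulk_actions(items)` to `elasticsearch.helpers.bulk(es, ...)`, and the Django view would pass the query to `es.search(index=INDEX_NAME, body=build_search_body("life"))`. Keeping the dictionaries in pure functions makes them testable without a running cluster.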
Scrapy-Redis is a powerful open source Scrapy extension that enables you to run distributed crawls/scrapes across multiple servers and scale up your data processing pipelines.
https://www.baeldung.com/linux/docker-cmd-multiple-commands
6. Run Multiple Commands With a Shell Script
Sometimes, we need to do more complex processing, instead of chaining a few commands. If this is the case, we can create a shell script that contains all the necessary logic, and copy it to the container’s filesystem. We can use the Dockerfile COPY directive to copy the shell script file to our container’s filesystem.
Let’s create a simple Dockerfile for our image:
FROM ubuntu:latest
COPY startup.sh .
CMD ["/bin/bash","-c","./startup.sh"]
We’ll use the COPY directive to copy the shell script to the container’s filesystem. We’ll execute the script with the CMD directive.
Next, we’ll create the startup.sh shell script:
#!/bin/bash
echo $HOME
date
The above script prints our home directory in the container and the current date. An important note is that we should grant the execute privilege to the shell script on the host machine. This is because the execute privilege will be transferred to our container when we copy the file. Otherwise, the container won’t be able to execute the script when it starts.
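Granting the execute privilege on the host is normally a one-line `chmod +x startup.sh`. For a build helper written in Python, the same step could look roughly like the sketch below; the function name `ensure_executable` is hypothetical, and the script assumes it runs on a Unix-like host in the directory that holds `startup.sh`.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Add execute permission for user, group, and others.

    Returns True if the file mode was changed, False if it was already executable.
    """
    mode = os.stat(path).st_mode
    wanted = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if wanted == mode:
        return False
    os.chmod(path, wanted)
    return True


if __name__ == "__main__":
    # Run on the host before `docker build`, so COPY carries the execute bit along.
    if os.path.exists("startup.sh"):
        ensure_executable("startup.sh")
```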