Deploying Monstache as a MongoDB-to-Elasticsearch sync service (CentOS)
Environment and versions
- CentOS 6.5
- Elasticsearch 7.6.2
- MongoDB 4.2
- Monstache 6.5.2
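Connecting to the servers
- Beijing office server list
- ES server address (the server chosen for this deployment)
- Location of the saved session data (shell connection data)
- On Windows: d:\NetSarang Computer\6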
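Automating SSH login to CentOS with a Python script
- Install OpenSSH on CentOS
- Pitfall: the DNS configuration is missing on CentOS
vim /etc/sysconfig/network-scripts/ifcfg-eth0
# add DNS1=8.8.8.8 and set ONBOOT=yes
service network restart  # restart networking
- Load the auto-login script through Xshell
- Use paramiko for automated SSH interaction with the remote server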
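The paramiko item above can be sketched roughly as follows. This is a minimal illustration, not the script used in these notes: the function name `run_remote` and its optional `client` parameter are assumptions introduced here, and the third-party paramiko package is assumed to be installed.

```python
def run_remote(host, user, password, command, port=22, client=None):
    """Run one command over SSH and return (exit_code, stdout, stderr).

    `client` is a hypothetical injection point so the function can be
    exercised without a real server; by default a paramiko client is used.
    """
    if client is None:
        import paramiko  # third-party package, imported lazily

        client = paramiko.SSHClient()
        # Accept unknown host keys automatically: convenient on an internal
        # network, but it skips host-key verification.
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(hostname=host, port=port, username=user,
                   password=password, timeout=10)
    try:
        _stdin, stdout, stderr = client.exec_command(command)
        out = stdout.read().decode()
        err = stderr.read().decode()
        exit_code = stdout.channel.recv_exit_status()
        return exit_code, out, err
    finally:
        client.close()
```

A call such as `run_remote("172.16.5.123", "es", password, "service network restart")` would then run the restart remotely; the user name and command in that call are placeholders.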
ES deployment
ES server deployment and configuration
- Add a user
groupadd es
useradd -m -g es es
passwd es
es
# change the owner and group of the installation directory
chown -R es:es /opt/elasticsearch-7.6.2
- Default deployment location on CentOS
/opt/elasticsearch-7.6.2
- Complete elasticsearch.yml configuration
[root@1 elasticsearch-7.6.2]# cat config/elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /path/to/data
#
# Path to log files:
#
path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
#network.host: 192.168.0.1
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["node-1"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
bootstrap.system_call_filter: false
- Complete bin/elasticsearch startup script
#!/bin/bash

# CONTROLLING STARTUP:
#
# This script relies on a few environment variables to determine startup
# behavior, those variables are:
#
#   ES_PATH_CONF -- Path to config directory
#   ES_JAVA_OPTS -- External Java Opts on top of the defaults set
#
# Optionally, exact memory values can be set using the `ES_JAVA_OPTS`. Note that
# the Xms and Xmx lines in the JVM options file must be commented out. Example
# values are "512m", and "10g".
#
#   ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch

export JAVA_HOME=/opt/elasticsearch-7.6.2/jdk/
export PATH=$JAVA_HOME/bin:$PATH

source "`dirname "$0"`"/elasticsearch-env

if [ -z "$ES_TMPDIR" ]; then
  ES_TMPDIR=`"$JAVA" -cp "$ES_CLASSPATH" org.elasticsearch.tools.launchers.TempDirectory`
fi

ES_JVM_OPTIONS="$ES_PATH_CONF"/jvm.options
ES_JAVA_OPTS=`export ES_TMPDIR; "$JAVA" -cp "$ES_CLASSPATH" org.elasticsearch.tools.launchers.JvmOptionsParser "$ES_JVM_OPTIONS"`

# added: use the bundled JDK if it is executable, otherwise fall back to the java on PATH
if [ -x "$JAVA_HOME/bin/java" ]; then
  JAVA="/opt/elasticsearch-7.6.2/jdk/bin/java"
else
  JAVA=`which java`
fi

# manual parsing to find out, if process should be detached
if ! echo $* | grep -E '(^-d |-d$| -d |--daemonize$|--daemonize )' > /dev/null; then
  exec \
    "$JAVA" \
    $ES_JAVA_OPTS \
    -Des.path.home="$ES_HOME" \
    -Des.path.conf="$ES_PATH_CONF" \
    -Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
    -Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
    -Des.bundled_jdk="$ES_BUNDLED_JDK" \
    -cp "$ES_CLASSPATH" \
    org.elasticsearch.bootstrap.Elasticsearch \
    "$@"
else
  exec \
    "$JAVA" \
    $ES_JAVA_OPTS \
    -Des.path.home="$ES_HOME" \
    -Des.path.conf="$ES_PATH_CONF" \
    -Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
    -Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
    -Des.bundled_jdk="$ES_BUNDLED_JDK" \
    -cp "$ES_CLASSPATH" \
    org.elasticsearch.bootstrap.Elasticsearch \
    "$@" \
    <&- &
  retval=$?
  pid=$!
  [ $retval -eq 0 ] || exit $retval
  if [ ! -z "$ES_STARTUP_SLEEP_TIME" ]; then
    sleep $ES_STARTUP_SLEEP_TIME
  fi
  if ! ps -p $pid > /dev/null ; then
    exit 1
  fi
  exit 0
fi

exit $?
- Complete JVM and garbage-collection configuration file (jvm.options)
cat /opt/elasticsearch-7.6.2/config/jvm.options
## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
#8-13:-XX:+UseCMSInitiatingOccupancyOnly
8-13:-XX:+UseG1GC
#8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1 GC is only supported on JDK version 10 or later
# to use G1GC, uncomment the next two lines and update the version on the
# following three lines to your version of the JDK
# 10-13:-XX:-UseConcMarkSweepGC
# 10-13:-XX:-UseCMSInitiatingOccupancyOnly
14-:-XX:+UseG1GC
14-:-XX:G1ReservePercent=25
14-:-XX:InitiatingHeapOccupancyPercent=30

## JVM temporary directory
-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
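The version prefixes in jvm.options (`8:`, `8-13:`, `14-:`) decide which lines a given JDK actually receives, which is easy to misread when editing the GC section. The sketch below shows how such a prefix could be interpreted. The function name `option_for_jdk` is an assumption made here, and this is an illustration of the prefix syntax, not Elasticsearch's own parser.

```python
import re

# Optional "<from>[-[<to>]]:" prefix in front of a JVM option.
_PREFIX = re.compile(r"^(\d+)(-)?(\d+)?:(.*)$")


def option_for_jdk(line, jdk_major):
    """Return the JVM option on `line` if it applies to `jdk_major`, else None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    match = _PREFIX.match(line)
    if match is None:
        return line  # no version prefix: applies to every JDK
    low, dash, high, option = match.groups()
    low = int(low)
    if dash is None:
        upper = low          # "8:"    -> exactly JDK 8
    elif high is None:
        upper = None         # "14-:"  -> JDK 14 and later
    else:
        upper = int(high)    # "8-13:" -> JDK 8 through 13
    if jdk_major < low or (upper is not None and jdk_major > upper):
        return None
    return option
```

For example, with JDK 13 (chosen here only as an example version), `option_for_jdk("8-13:-XX:+UseG1GC", 13)` returns the option, while `option_for_jdk("14-:-XX:+UseG1GC", 13)` returns `None`.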
Troubleshooting
- Common installation error: ERROR: [2] bootstrap checks failed
ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
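To see which limit the bootstrap check is reading, the open-file limit can be inspected from the account that starts Elasticsearch. The sketch below is a small check meant to be run as the es user; the function name `nofile_status` is an assumption made here. A commonly used remedy, which these notes do not spell out, is to raise the `nofile` soft and hard limits for the es user in /etc/security/limits.conf and log in again.

```python
import resource

REQUIRED_NOFILE = 65535  # threshold quoted in the bootstrap error


def nofile_status(required=REQUIRED_NOFILE):
    """Return (soft, hard, ok) for the current process's open-file limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    ok = soft == resource.RLIM_INFINITY or soft >= required
    return soft, hard, ok


if __name__ == "__main__":
    soft, hard, ok = nofile_status()
    print(f"soft={soft} hard={hard} sufficient={ok}")
```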
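Startup methods
- Boot-time startup via rc.local
vim /etc/rc.d/rc.local
# Elasticsearch cannot be started as root, so run it as the es user created earlier
su - es -c "/opt/elasticsearch-7.6.2/bin/elasticsearch -d"
- Alternative: an init.d script
# create the script file under /etc/rc.d/init.d/
cd /etc/rc.d/init.d/
vim elasticsearch.sh
#!/bin/sh
#chkconfig: 2345 80 90
#description: Elasticsearch boot-time startup script
su - es -c "/opt/elasticsearch-7.6.2/bin/elasticsearch -d"
# save and quit (:wq), then register the script
chmod +x elasticsearch.sh
chkconfig --add elasticsearch.sh
chkconfig elasticsearch.sh on
reboot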
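After a reboot it is useful to confirm that the startup script really brought Elasticsearch up. The sketch below polls the HTTP port until the node answers. The function name `wait_for_es`, the retry count, and the delay are assumptions made here; the address is the ES server used elsewhere in these notes.

```python
import json
import time
import urllib.error
import urllib.request


def wait_for_es(url="http://172.16.5.123:9200", attempts=30, delay=2.0):
    """Poll the ES root endpoint; return the parsed JSON banner or None."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                return json.loads(response.read().decode("utf-8"))
        except (urllib.error.URLError, OSError, ValueError):
            # Not reachable yet (or not JSON): wait and retry.
            if attempt + 1 < attempts:
                time.sleep(delay)
    return None
```

A `None` result after all attempts suggests the service did not start, in which case the Elasticsearch logs are the next place to look.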
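Writing scripts
Monstache deployment
- Installation: follow the same procedure as for the Windows version
- Installation path on CentOS for this deployment
/opt/monstache/
- Enable start at boot
vim /etc/rc.d/rc.local
# add the following line
/opt/monstache/linux-amd64/monstache -f /opt/monstache/linux-amd64/config.toml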
- Complete configuration file config.toml
cat /opt/monstache/linux-amd64/config.toml
mongo-url = "mongodb://172.16.5.124:27017"
elasticsearch-urls = ["http://172.16.5.123:9200"]
elasticsearch-max-conns = 10
# namespaces to match, written as database.collection (e.g. aaa.bbb, where aaa is the MongoDB database and bbb the collection)
namespace-regex = 'spider.*'
# direct-read-namespaces = ["common.parent_info","common.child_info"]
dropped-collections = true
# propagate dropped databases in MongoDB as index deletes in Elasticsearch
dropped-databases = true
# resume syncing from the timestamp of the last sync
resume = true
# ES cluster name
# cluster-name = 'tzg'
# update documents in ES instead of overwriting them
index-as-update = true
verbose = true
# the log files are created automatically
[logs]
info = "/var/logs/monstache/info.log"
warn = "/var/logs/monstache/warn.log"
error = "/var/logs/monstache/error.log"
trace = "/var/logs/monstache/trace.log"
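The `namespace-regex` value is matched against namespaces of the form `database.collection`. The sketch below uses Python's `re` module to approximate that matching for this simple pattern, assuming the pattern is applied unanchored (as Go regular expressions are by default). The helper name and all candidate namespaces except `common.parent_info` are hypothetical.

```python
import re


def matching_namespaces(pattern, namespaces):
    """Return the namespaces ("database.collection") that `pattern` matches."""
    regex = re.compile(pattern)
    # re.search is unanchored: the pattern may match anywhere in the string.
    return [ns for ns in namespaces if regex.search(ns)]


candidates = ["spider.articles", "spider.users", "myspider.logs", "common.parent_info"]

print(matching_namespaces("spider.*", candidates))
# ['spider.articles', 'spider.users', 'myspider.logs']
print(matching_namespaces(r"^spider\.", candidates))
# ['spider.articles', 'spider.users']
```

Under that assumption, an anchored pattern such as `^spider\.` would be the stricter choice if only collections in the `spider` database should be synced.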
MongoDB deployment with a replica set enabled
-
- By default, MongoDB runs using the mongod user account and uses the following default directories:
/var/lib/mongo (the data directory)
/var/log/mongodb (the log directory)
- Default location of the configuration file
/etc/mongod.conf
- Complete configuration file
# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0 #172.16.5.124 # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.

#security:

#operationProfiling:

#replication:
replication:
  replSetName: "rs0"
#net:
#  bindIp: 0.0.0.0
#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:
- Note: if bindIp is set to 0.0.0.0 from the start, initializing the replica set fails. Set bindIp: 172.16.5.124 first, initialize the replica set, and only then change it to bindIp: 0.0.0.0
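The notes do not show the initialization step itself. One way to do it from Python is sketched below, under several assumptions: the third-party pymongo package is installed, mongod is already running with `replSetName: "rs0"`, and a single-member set is wanted. The function names are made up for this example.

```python
def replica_set_config(name, hosts):
    """Build the document passed to the replSetInitiate command."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


def initiate_replica_set(host="172.16.5.124", port=27017, name="rs0"):
    """Initiate a single-member replica set on the given mongod."""
    from pymongo import MongoClient  # third-party package, imported lazily

    # directConnection lets the client talk to a node that is not yet
    # part of an initialized replica set.
    client = MongoClient(host, port, directConnection=True)
    try:
        return client.admin.command(
            "replSetInitiate", replica_set_config(name, [f"{host}:{port}"])
        )
    finally:
        client.close()
```

Naming the member explicitly as `172.16.5.124:27017` keeps the replica set configuration tied to the address Monstache connects to in `mongo-url`.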
- Enable start at boot
service mongod start
chkconfig mongod on  # enabled for runlevels 2345 by default
# if you want to use this command,
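- Uninstalling
- To remove MongoDB completely, you must remove the MongoDB applications, the configuration files, and any directories containing data and logs. The steps below are all required for a complete removal.
- Warning: the following steps remove MongoDB completely, including its configuration files and all database files. The process is irreversible, so back up your configuration and data files before carrying it out.
- I. Stop MongoDB
sudo service mongod stop
- II. Remove the MongoDB packages
Remove all MongoDB packages that were previously installed:
sudo yum erase $(rpm -qa | grep mongodb-org)
- III. Remove the data and log files
Remove the MongoDB database and log directories:
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo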
-
- Note: the official tutorial has a small problem; one setting needs to be changed
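Common errors and how to handle them
- Recovering after an unclean database shutdown
- https://blog.csdn.net/u010028869/article/details/50698689
- https://blog.csdn.net/u010647035/article/details/81434774
- Note: the best way to install is the procedure from the official site; only a few pitfalls need working around
- Avoid killing the mongod process directly; stop it with:
service mongod stop
- Problems you may run into when starting the database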
Scripts
- To be determined...
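Testing ES connectivity with curl
# list all indices in ES
curl "172.16.5.123:9200/_cat/indices?v"
# browsers URL-encode the query automatically; in a terminal it must be encoded by hand
# URL-encoded form of 中国 ("China") = %e4%b8%ad%e5%9b%bd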
curl "172.16.5.123:5000/search?q=%e4%b8%ad%e5%9b%bd&s=0"
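The percent-encoded keyword in the curl command above does not have to be worked out by hand. A minimal sketch using only the Python standard library is shown below; the helper name `search_url` is an assumption made here, while the host, port, and the `q` and `s` parameters are taken from the curl example.

```python
from urllib.parse import quote, urlencode


def search_url(keyword, start=0, base="http://172.16.5.123:5000/search"):
    """Build the search URL with the keyword percent-encoded as UTF-8."""
    return f"{base}?{urlencode({'q': keyword, 's': start})}"


print(quote("中国"))        # %E4%B8%AD%E5%9B%BD
print(search_url("中国"))   # http://172.16.5.123:5000/search?q=%E4%B8%AD%E5%9B%BD&s=0
```

Python emits uppercase hex digits; percent-encoding is case-insensitive, so this is equivalent to the lowercase form used in the curl command.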
References:
Installation, creating users, writing an automated install script
https://www.cnblogs.com/qq931399960/p/10425803.html
Replica set setup, start at boot, scripts
https://www.cnblogs.com/qq931399960/p/10428702.html
A man can be destroyed but not defeated