Deploying Monstache as a MongoDB-to-Elasticsearch Sync Service (CentOS)

Environment and versions

CentOS 6.5
Elasticsearch 7.6.2
MongoDB 4.2
Monstache 6.5.2

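Connecting to the server

Beijing office server list
ES server address; this is the server used for this deployment
Where the connection data is saved (shell session data)
On Windows: d:\NetSarang Computer\6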

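Automatic SSH login to CentOS with a Python script

Install OpenSSH on CentOS

vim /etc/sysconfig/network-scripts/ifcfg-eth0

Add DNS1=8.8.8.8
Set ONBOOT=yes
service network restart
# restarts the network service so the changes take effect

Load the auto-login script through Xshell
Use paramiko for automated interactive SSH sessions with the remote server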

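Deploying ES

ES server deployment and configuration

Add a user

groupadd es
useradd -m -g es es
passwd es
# enter the password at the prompt (es was used here)
# Change the owner and group of the installation directory
chown -R es:es /opt/elasticsearch-7.6.2

Default deployment location on CentOS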
```
/opt/elasticsearch-7.6.2
```
Official download guide
Point ES at its bundled Java and configure garbage collection
Configure ES

Complete elasticsearch.yml configuration

[root@1 elasticsearch-7.6.2]# cat config/elasticsearch.yml 
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /path/to/data
#
# Path to log files:
#
path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
#network.host: 192.168.0.1
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["node-1"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
bootstrap.system_call_filter: false

Complete bin/elasticsearch startup script


#!/bin/bash

# CONTROLLING STARTUP:
#
# This script relies on a few environment variables to determine startup
# behavior, those variables are:
#
#   ES_PATH_CONF -- Path to config directory
#   ES_JAVA_OPTS -- External Java Opts on top of the defaults set
#
# Optionally, exact memory values can be set using the `ES_JAVA_OPTS`. Note that
# the Xms and Xmx lines in the JVM options file must be commented out. Example
# values are "512m", and "10g".
#
#   ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch
export JAVA_HOME=/opt/elasticsearch-7.6.2/jdk/
export PATH=$JAVA_HOME/bin:$PATH


source "`dirname "$0"`"/elasticsearch-env

if [ -z "$ES_TMPDIR" ]; then
  ES_TMPDIR=`"$JAVA" -cp "$ES_CLASSPATH" org.elasticsearch.tools.launchers.TempDirectory`
fi

ES_JVM_OPTIONS="$ES_PATH_CONF"/jvm.options
ES_JAVA_OPTS=`export ES_TMPDIR; "$JAVA" -cp "$ES_CLASSPATH" org.elasticsearch.tools.launchers.JvmOptionsParser "$ES_JVM_OPTIONS"`


# Added: prefer the bundled JDK when it is present
if [ -x "$JAVA_HOME/bin/java" ]; then
	JAVA="/opt/elasticsearch-7.6.2/jdk/bin/java"
else
	JAVA=`which java`
fi

# manual parsing to find out, if process should be detached
if ! echo $* | grep -E '(^-d |-d$| -d |--daemonize$|--daemonize )' > /dev/null; then
  exec \
	"$JAVA" \
	$ES_JAVA_OPTS \
	-Des.path.home="$ES_HOME" \
	-Des.path.conf="$ES_PATH_CONF" \
	-Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
	-Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
	-Des.bundled_jdk="$ES_BUNDLED_JDK" \
	-cp "$ES_CLASSPATH" \
	org.elasticsearch.bootstrap.Elasticsearch \
	"$@"
else
  exec \
	"$JAVA" \
	$ES_JAVA_OPTS \
	-Des.path.home="$ES_HOME" \
	-Des.path.conf="$ES_PATH_CONF" \
	-Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
	-Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
	-Des.bundled_jdk="$ES_BUNDLED_JDK" \
	-cp "$ES_CLASSPATH" \
	org.elasticsearch.bootstrap.Elasticsearch \
	"$@" \
	<&- &
  retval=$?
  pid=$!
  [ $retval -eq 0 ] || exit $retval
  if [ ! -z "$ES_STARTUP_SLEEP_TIME" ]; then
	sleep $ES_STARTUP_SLEEP_TIME
  fi
  if ! ps -p $pid > /dev/null ; then
	exit 1
  fi
  exit 0
fi

exit $?

Complete JVM garbage collection configuration file

cat /opt/elasticsearch-7.6.2/config/jvm.options
## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
#8-13:-XX:+UseCMSInitiatingOccupancyOnly
8-13:-XX:+UseG1GC
#8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1 GC is only supported on JDK version 10 or later
# to use G1GC, uncomment the next two lines and update the version on the
# following three lines to your version of the JDK
# 10-13:-XX:-UseConcMarkSweepGC
# 10-13:-XX:-UseCMSInitiatingOccupancyOnly
14-:-XX:+UseG1GC
14-:-XX:G1ReservePercent=25
14-:-XX:InitiatingHeapOccupancyPercent=30

## JVM temporary directory
-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m

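Configuration file explained

Troubleshooting

Elasticsearch startup error: unable to install syscall filter (the elasticsearch.yml above works around it with bootstrap.system_call_filter: false)

Common installation error: ERROR: [2] bootstrap checks failed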

ERROR: [2] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
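
The message above means the user running ES is allowed too few open files. A minimal sketch of a pre-flight check, assuming bash; `nofile_ok` is a hypothetical helper name, and 65535 is the threshold quoted in the error message:

```shell
# Hypothetical helper: succeeds when an open-file limit meets the minimum.
# $1 = limit as reported by `ulimit -n`, $2 = required minimum (default 65535).
nofile_ok() {
  local limit="$1" required="${2:-65535}"
  [ "$limit" = "unlimited" ] && return 0
  [ "$limit" -ge "$required" ]
}

# Check the limit of the current shell; run this as the es user.
current="$(ulimit -n)"
if nofile_ok "$current"; then
  echo "open-file limit ${current} is sufficient"
else
  echo "open-file limit ${current} is too low for ES"
fi
```

If the limit is too low, the usual fix is to raise nofile for that user in /etc/security/limits.conf (lines of the form `es soft nofile 65536` and `es hard nofile 65536`) and then log in again so the new limit applies.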

Starting ES

Start-on-boot script

vim /etc/rc.d/rc.local
# ES cannot be started as root; run it as the es user created earlier
su - es -c "/opt/elasticsearch-7.6.2/bin/elasticsearch -d"

Alternative approach

# Create the script file under /etc/rc.d/init.d/
cd /etc/rc.d/init.d/
vim elasticsearch.sh
# Contents of elasticsearch.sh:
#!/bin/sh
#chkconfig: 2345 80 90
#description: ES start-on-boot script
su - es -c "/opt/elasticsearch-7.6.2/bin/elasticsearch -d"

# Save and quit (:wq), then register the script
chmod +x elasticsearch.sh
chkconfig --add elasticsearch.sh
chkconfig elasticsearch.sh on
reboot

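Writing scripts

Writing a shell startup script for Elasticsearch

Deploying Monstache

Download
Install; the Windows installation guide applies here as well
Install path on CentOS for this deployment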
```
/opt/monstache/
```

Start on boot

vim /etc/rc.d/rc.local
# Add the following line
/opt/monstache/linux-amd64/monstache -f /opt/monstache/linux-amd64/config.toml
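
When Monstache is launched from rc.local, Elasticsearch may not be accepting connections yet. A minimal sketch of a wait-then-start wrapper, assuming bash and that curl is installed; `retry` is a hypothetical helper name, and the attempt count and delay are arbitrary:

```shell
# Hypothetical helper: run a command until it succeeds.
# $1 = maximum attempts, $2 = seconds between attempts, rest = the command.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1   # give up after the last allowed attempt
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Possible use in rc.local (commented out; needs a reachable ES):
#   retry 30 2 curl -fsS -o /dev/null http://172.16.5.123:9200 \
#     && /opt/monstache/linux-amd64/monstache -f /opt/monstache/linux-amd64/config.toml &
```

The trailing `&` in the commented example puts the whole wait-and-start sequence in the background, so rc.local is not held up while Monstache runs.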

Complete config.toml configuration file

cat /opt/monstache/linux-amd64/config.toml 

mongo-url = "mongodb://172.16.5.124:27017"
elasticsearch-urls = ["http://172.16.5.123:9200"]
elasticsearch-max-conns = 10
namespace-regex = 'spider.*'      # namespaces to sync, matched as a regex against "aaa.bbb" (aaa = MongoDB database, bbb = collection)
# direct-read-namespaces = ["common.parent_info","common.child_info"]
dropped-collections = true

# propagate dropped databases in MongoDB as index deletes in Elasticsearch
dropped-databases = true

resume = true # resume syncing from the timestamp of the last sync

# cluster-name = 'tzg'  # ES cluster name

# update existing ES documents instead of overwriting them
index-as-update = true
verbose = true

[logs]
info = "/var/logs/monstache/info.log"
warn = "/var/logs/monstache/warn.log"
error = "/var/logs/monstache/error.log"
trace = "/var/logs/monstache/trace.log"
# the log files are created automatically

Deploying MongoDB and enabling a replica set

Official installation guide
Official guide to enabling a replica set
- By default, MongoDB runs using the mongod user account and uses the following default directories:
```
/var/lib/mongo (the data directory)
/var/log/mongodb (the log directory)
```
- Default location of the configuration file
```
/etc/mongod.conf
```

Complete configuration file

# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0 #172.16.5.124  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.


#security:

#operationProfiling:

#replication:
replication:
   replSetName: "rs0"
#net:
#   bindIp: 0.0.0.0
#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:

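Note: if bindIp: 0.0.0.0 is configured from the start, enabling the replica set fails with an error. Set bindIp: 172.16.5.124 first, and switch to bindIp: 0.0.0.0 only after the replica set has been initialized successfully.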
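
The two-step bindIp change can be scripted. A minimal sketch, assuming bash and a mongod.conf laid out like the one above; `set_bind_ip` is a hypothetical helper name, and it only rewrites an uncommented bindIp line:

```shell
# Hypothetical helper: rewrite the active (uncommented) bindIp value in a
# mongod.conf-style file. $1 = config file, $2 = new address.
set_bind_ip() {
  local file="$1" ip="$2" tmp
  tmp="$(mktemp)" || return 1
  # Replace only the value; indentation and trailing comments are kept.
  # Writing back with cat keeps the owner and mode of the original file.
  sed -E "s|^([[:space:]]*bindIp:[[:space:]]*)[^[:space:]#]+|\1${ip}|" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  local status=$?
  rm -f "$tmp"
  return "$status"
}

# Intended sequence (commented out; needs root and a running mongod):
#   set_bind_ip /etc/mongod.conf 172.16.5.124 && service mongod restart
#   mongo --host 172.16.5.124 --eval 'rs.initiate()'
#   set_bind_ip /etc/mongod.conf 0.0.0.0 && service mongod restart
```

Commented-out lines such as `#   bindIp: 0.0.0.0` are left untouched because the pattern requires bindIp to be the first non-blank text on the line.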

Start on boot

service mongod start
chkconfig mongod on # enabled for runlevels 2345 by default
# If you want to use this command,

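Uninstalling
- To remove MongoDB completely, you must remove the MongoDB applications themselves, the configuration files, and any directories containing data and logs. The steps below are all required for a complete removal.
- Warning: these steps completely remove MongoDB, including its configuration files and all database files. The process is irreversible, so back up your configuration and data files before running them.
- I. Stop MongoDB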
```
sudo service mongod stop
```
- II. Remove the MongoDB packages
  Remove all MongoDB packages that were previously installed
```
sudo yum erase $(rpm -qa | grep mongodb-org)
```
- III. Remove the data and log files
  Remove the MongoDB databases and log files
```
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
```
Note: the official guide has a small problem; one setting needs to be changed
- https://blog.csdn.net/zhou_438/article/details/85241013
- If the approach above does not work, try this one
- https://www.cnblogs.com/wangxiaoqiangs/p/9024894.html

Common problems and fixes

Recovering after an unclean database shutdown
- https://blog.csdn.net/u010028869/article/details/50698689
- https://blog.csdn.net/u010647035/article/details/81434774
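- Note: installing the way the official site describes is the best option; there are only a few pitfalls to work around
- Avoid killing the mongod process directly; stop it with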
```
service mongod stop
```
Problems you may run into when starting the database
- Mongo Restart Error — /var/run/mongodb/mongod.pid exists

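Scripts

To be determined...

Testing ES connectivity with curl

# List all indices in ES
curl "172.16.5.123:9200/_cat/indices?v"
# Browsers URL-encode the query automatically; in a terminal it has to be encoded by hand
# URL-encoded form of 中国: %e4%b8%ad%e5%9b%bd
curl "172.16.5.123:5000/search?q=%e4%b8%ad%e5%9b%bd&s=0"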
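
The hand-encoding step can be done in the shell itself. A minimal sketch, assuming a UTF-8 terminal and the standard od, tr and sed tools; `urlencode_bytes` is a hypothetical helper name, and it percent-encodes every byte, including plain ASCII, which is verbose but still valid:

```shell
# Hypothetical helper: percent-encode every byte of its argument.
urlencode_bytes() {
  # od prints the bytes as hex pairs, tr joins them, sed prefixes each pair with %.
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n' | sed 's/../%&/g'
}

# Build the search URL used above without encoding the query by hand.
query="$(urlencode_bytes '中国')"
echo "172.16.5.123:5000/search?q=${query}&s=0"
```

curl can also do the encoding itself: `curl -G --data-urlencode "q=中国" --data-urlencode "s=0" "172.16.5.123:5000/search"` sends the same query as a GET request.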

References:
Installation, creating the user, writing an automated install script
https://www.cnblogs.com/qq931399960/p/10425803.html
Replica set configuration, start on boot, scripts
https://www.cnblogs.com/qq931399960/p/10428702.html

Posted 2021-01-06 12:37 by Bob-Dylan
