SaltStack multi-master
I. Test environment:
1. Salt version:
[root@master master]# salt --versions-report
  Salt: 2015.5.10
  Python: 2.7.5 (default, Nov 6 2016, 00:28:07)
  Jinja2: 2.7.2
  M2Crypto: 0.21.1
  msgpack-python: 0.4.8
  msgpack-pure: Not Installed
  pycrypto: 2.6.1
  libnacl: Not Installed
  PyYAML: 3.10
  ioflo: Not Installed
  PyZMQ: 14.3.1
  RAET: Not Installed
  ZMQ: 3.2.5
  Mako: Not Installed
  Tornado: Not Installed
  timelib: Not Installed
  dateutil: Not Installed
2. OS version:
[root@master master]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
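II. Setting up a primary and a standby master with Salt
1. Install the new master server
2. Copy the master keys (master.pem and master.pub) into the corresponding directory on the new master
3. Start the salt-master process on the new master
4. Update the minion configuration files
5. Restart the minions
6. Accept the minion keys on the new master
7. Run test.ping against the minion from both salt-masters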
[root@master master]# salt -L 192.168.163.13 test.ping
192.168.163.13:
    True
[root@standby minions]# salt -L 192.168.163.13 test.ping
192.168.163.13:
    True
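Notes:
The main point to watch when configuring multi-master is that every master must use the same private key. The private key is generated automatically the first time a master starts, so when configuring multi-master, be sure to copy the old master's private key into the corresponding directory before starting the new master.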
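One way to confirm that both masters really ended up with the same key pair is to compare file digests instead of eyeballing the files. The following is a minimal sketch; the function names `file_digest` and `same_key` are illustrative and not part of Salt.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so the helper does not load the whole file at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_key(path_a, path_b):
    """Return True when both files have identical contents."""
    return file_digest(path_a) == file_digest(path_b)
```

Running `file_digest` on master.pem on each master and comparing the two hex strings by hand avoids copying the private key a second time just to check it.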
Modify the minion configuration file:
master:
  - saltmaster1.example.com
  - saltmaster2.example.com
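When many minions need the same master list, the block can be generated rather than typed by hand. This is only a sketch using the standard library; `render_master_block` is a made-up helper name, and it assumes the host names contain no characters that would need YAML quoting.

```python
def render_master_block(masters):
    """Render the `master` option as a YAML list, keeping the given order."""
    if not masters:
        raise ValueError("at least one master is required")
    lines = ["master:"]
    for host in masters:
        # Two-space indent, dash, space: the usual YAML list item layout.
        lines.append("  - {0}".format(host))
    return "\n".join(lines) + "\n"
```

Calling `render_master_block(["saltmaster1.example.com", "saltmaster2.example.com"])` returns the same four-line block shown in the minion configuration.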
III. Problems encountered:
1. When testing with salt-call on the minion (with the primary master stopped beforehand), the following was observed:
[root@standby salt]# salt-call test.ping
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 1 of 4)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 2 of 4)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 3 of 4)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 4 of 4)
[WARNING ] Attempted to authenticate with master 192.168.199.39 and failed
[WARNING ] Master ip address changed from 192.168.199.39 to 192.168.163.13
local:
    True

The result above was obtained with auth_tries in the minion configuration set to 4 (the default is 7). With the value set to 3 and both the primary and the standby master stopped, the result is:

[root@standby minion]# salt-call test.ping
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 1 of 3)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 2 of 3)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 3 of 3)
[WARNING ] Attempted to authenticate with master 192.168.199.39 and failed
[WARNING ] Master ip address changed from 192.168.199.39 to 192.168.163.13
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 1 of 3)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 2 of 3)
[INFO ] SaltReqTimeoutError: after 60 seconds. (Try 3 of 3)
[WARNING ] Attempted to authenticate with master 192.168.163.13 and failed
[ERROR ] An un-handled exception was caught by salt's global exception handler:
AttributeError: 'SMinion' object has no attribute 'functions'
Traceback (most recent call last):
  File "/usr/bin/salt-call", line 11, in <module>
    salt_call()
  File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 227, in salt_call
    client.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/call.py", line 71, in run
    caller.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 236, in run
    ret = self.call()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 107, in call
    if fun not in self.minion.functions:
AttributeError: 'SMinion' object has no attribute 'functions'
Traceback (most recent call last):
  File "/usr/bin/salt-call", line 11, in <module>
    salt_call()
  File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 227, in salt_call
    client.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/call.py", line 71, in run
    caller.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 236, in run
    ret = self.call()
  File "/usr/lib/python2.7/site-packages/salt/cli/caller.py", line 107, in call
    if fun not in self.minion.functions:
AttributeError: 'SMinion' object has no attribute 'functions'
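The logs suggest that salt-call waits for every attempt to time out before it moves on to the next master, so the blocking time grows with auth_tries. A rough estimate can be sketched as below, assuming each attempt hits the full 60-second timeout seen in the log and that masters are tried one after another; the function and parameter names are illustrative only.

```python
def seconds_before_giving_up(auth_tries, timeout_seconds=60, unreachable_masters=1):
    """Estimate how long salt-call blocks on masters that do not answer.

    Assumption: every attempt runs into the full timeout, and unreachable
    masters are tried sequentially, as in the log output above.
    """
    if auth_tries < 1 or timeout_seconds < 0 or unreachable_masters < 0:
        raise ValueError("arguments must be positive")
    return auth_tries * timeout_seconds * unreachable_masters
```

With auth_tries set to 4 and one master down, this gives 240 seconds before the minion switches to the standby; with the default of 7 it would be 420 seconds, which is why lowering auth_tries shortens the failover delay.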
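2. Data sharing between masters:
Masters do not share information with each other. Minion public keys must be accepted on every master, and files must be shared manually or kept consistent with a tool such as git so that the file_roots directories match.
The directories that need to be kept in sync are: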
Minion keys:
- /etc/salt/pki/master/minions
- /etc/salt/pki/master/minions_pre
- /etc/salt/pki/master/minions_rejected
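Note: sharing the /etc/salt/pki/master directory directly is strongly discouraged. Allowing access to the master.pem key from outside creates a serious security risk.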
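To check whether the key directories have drifted apart, the file names in the three directories can be compared. The sketch below assumes both directory trees are reachable as local paths (for example a copy fetched from the other master into a scratch directory); `list_keys` and `diff_keys` are hypothetical helper names, and only file names are compared, not key contents.

```python
import os

KEY_DIRS = ("minions", "minions_pre", "minions_rejected")


def list_keys(pki_root):
    """Map each key directory name to the set of minion IDs found in it."""
    result = {}
    for name in KEY_DIRS:
        path = os.path.join(pki_root, name)
        # A missing directory is treated as empty.
        result[name] = set(os.listdir(path)) if os.path.isdir(path) else set()
    return result


def diff_keys(primary_root, standby_root):
    """Return {dir: (only_on_primary, only_on_standby)} for dirs that differ."""
    primary = list_keys(primary_root)
    standby = list_keys(standby_root)
    differences = {}
    for name in KEY_DIRS:
        only_primary = sorted(primary[name] - standby[name])
        only_standby = sorted(standby[name] - primary[name])
        if only_primary or only_standby:
            differences[name] = (only_primary, only_standby)
    return differences
```

An empty dictionary means the two masters list the same minion IDs in all three directories.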
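4. Minion keys can be kept in sync in one of the following ways:
Option 1 (a cron entry that pulls the accepted keys with rsync):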
*/10 * * * * rsync -av --progress --delete --timeout=30 root@192.168.199.39:/etc/salt/pki/master/minions/ /etc/salt/pki/master/minions/
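The cron entry covers only the minions directory. If the pending and rejected keys should follow as well, the same rsync call can be built for each directory. The sketch below only builds the argument lists and runs nothing; `rsync_commands` is an illustrative name, and `--progress` is left out on the assumption that nobody reads the output of a cron job.

```python
PKI_ROOT = "/etc/salt/pki/master"
KEY_DIRS = ("minions", "minions_pre", "minions_rejected")


def rsync_commands(primary_host, user="root", timeout=30):
    """Build one rsync argument list per key directory (nothing is executed)."""
    commands = []
    for name in KEY_DIRS:
        path = "{0}/{1}/".format(PKI_ROOT, name)
        source = "{0}@{1}:{2}".format(user, primary_host, path)
        commands.append(
            ["rsync", "-av", "--delete", "--timeout={0}".format(timeout), source, path]
        )
    return commands
```

Each returned list can be handed to `subprocess.check_call`. Because of `--delete`, a key that exists only on the receiving master is removed on the next run, so keys should be accepted on the master that acts as the rsync source.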
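Option 2: modify the salt-key source code:
When the primary master accepts a key, synchronize it to the standby master. The standby's IP is set in the configuration file, and the accept counts as successful only when both sides have been synchronized.
Delete minions only with salt-key -d, or combine that with the rsync approach, so that minion keys are never removed with rm.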
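The "successful only when both sides succeed" rule can be sketched independently of salt-key itself. In the sketch below all three callables are hypothetical hooks supplied by the caller, not Salt APIs, and undoing the local accept when the standby fails is an assumption about how the rule would be enforced.

```python
def accept_everywhere(minion_id, accept_local, accept_remote, undo_local):
    """Accept a key locally and on the standby; undo the local accept on failure.

    accept_local, accept_remote and undo_local are hypothetical hooks.
    Returns True only when both accepts succeeded.
    """
    if not accept_local(minion_id):
        return False
    try:
        remote_ok = accept_remote(minion_id)
    except Exception:
        # Treat any error while reaching the standby as a failed sync.
        remote_ok = False
    if not remote_ok:
        undo_local(minion_id)
        return False
    return True
```

The design choice here is to leave both masters unchanged when the standby cannot be reached, rather than letting the key lists diverge.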
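5. Files such as file_roots and pillar_roots can be kept in git.
It is worth mentioning that with the Salt version used in this experiment, minion.py needs to be modified:
a minion registers with the master that has the smaller IP first and cannot register in the order you specify.
The change is as follows:
In minion.py the cause is the line for master in set(self.opts['master']):, because a set does not keep the order in which the masters were configured. The patched class iterates over the list directly: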
# Excerpt from salt/minion.py; copy, sys, time, log, Minion, MinionBase and
# SaltClientError are already imported or defined in that module.
class MultiMinion(MinionBase):
    '''
    Create a multi minion interface, this creates as many minions as are
    defined in the master option and binds each minion object to a respective
    master.
    '''
    # timeout for one of the minions to auth with a master
    MINION_CONNECT_TIMEOUT = 5

    def __init__(self, opts):
        super(MultiMinion, self).__init__(opts)

    def minions(self):
        '''
        Return a dict of minion generators bound to the tune_in method

        dict of master -> minion_mapping, the mapping contains:

            opts: options used to create the minion
            last: last auth attempt time
            auth_wait: time to wait for next auth attempt
            minion: minion object
            generator: generator function (non-blocking tune_in)
        '''
        if not isinstance(self.opts['master'], list):
            log.error(
                'Attempting to start a multimaster system with one master')
            sys.exit(salt.defaults.exitcodes.EX_GENERIC)
        ret = {}
        # The original line (commented out below) iterated over a set, which
        # changed the order of the masters; iterate over the list instead.
        for master in self.opts['master']:
        # for master in set(self.opts['master']):
            s_opts = copy.deepcopy(self.opts)
            s_opts['master'] = master
            s_opts['multimaster'] = True
            ret[master] = {'opts': s_opts,
                           'last': time.time(),
                           'auth_wait': s_opts['acceptance_wait_time']}
            try:
                minion = Minion(
                    s_opts,
                    self.MINION_CONNECT_TIMEOUT,
                    False,
                    'salt.loader.{0}'.format(master))
                ret[master]['minion'] = minion
                ret[master]['generator'] = minion.tune_in_no_block()
            except SaltClientError as exc:
                log.error('Error while bringing up minion for multi-master. Is master at {0} responding?'.format(master))
        return ret
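Iterating over the list directly also keeps any duplicate entries that set() would have removed. If deduplication is still wanted without losing the configured order, a small helper along these lines could be used; `ordered_unique` is an illustrative name and not part of Salt.

```python
def ordered_unique(masters):
    """Drop duplicate entries while keeping the first-seen order."""
    seen = set()
    result = []
    for master in masters:
        if master not in seen:
            seen.add(master)
            result.append(master)
    return result
```

The loop in the patched class could then read `for master in ordered_unique(self.opts['master']):`, which is a suggestion only and was not part of the change tested in this experiment.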