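In Orchestrator, automatic failure recovery can be switched on or off per MySQL cluster, and there is also a single global switch (global recovery disable).
This post covers the basic implementation of the global switch (global recovery disable).
The sections below walk through it layer by layer.
1. DB layer
The DB layer defines a table, global_recovery_disable, that stores the state of the global switch.
Table definition: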
CREATE TABLE IF NOT EXISTS global_recovery_disable (
disable_recovery tinyint unsigned NOT NULL COMMENT 'Insert 1 to disable recovery globally',
PRIMARY KEY (disable_recovery)
) ENGINE=InnoDB DEFAULT CHARSET=ascii
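Correspondingly, there are functions that operate on this table: query the state, insert the record (disables recovery globally), and delete the record (re-enables recovery).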
func IsRecoveryDisabled() (disabled bool, err error) {}
func DisableRecovery() error {}
func EnableRecovery() error {}
2. raft synchronization layer
To keep the global switch state in sync across the nodes of an Orchestrator cluster, raft command handlers are defined:
func (applier *CommandApplier) disableGlobalRecoveries(value []byte) interface{} {}
func (applier *CommandApplier) enableGlobalRecoveries(value []byte) interface{} {}
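The idea behind these handlers can be sketched as follows: every node applies the same sequence of commands and so ends up with the same switch state. The ApplyCommand dispatcher, the command names, and the in-memory recoveryDisabled field are assumptions made for illustration; the real handlers write to the backend table instead.

```go
package main

import "fmt"

// CommandApplier here is a simplified stand-in: it keeps the switch in
// memory, whereas the real applier updates the backend database.
type CommandApplier struct {
	recoveryDisabled bool
}

// ApplyCommand routes a replicated command to its handler.
// The op strings are assumed names used only in this sketch.
func (applier *CommandApplier) ApplyCommand(op string, value []byte) interface{} {
	switch op {
	case "disable-global-recoveries":
		return applier.disableGlobalRecoveries(value)
	case "enable-global-recoveries":
		return applier.enableGlobalRecoveries(value)
	}
	return fmt.Errorf("unknown command op: %s", op)
}

func (applier *CommandApplier) disableGlobalRecoveries(value []byte) interface{} {
	applier.recoveryDisabled = true
	return nil
}

func (applier *CommandApplier) enableGlobalRecoveries(value []byte) interface{} {
	applier.recoveryDisabled = false
	return nil
}

func main() {
	// The same ordered command log is applied on every node.
	commands := []string{
		"disable-global-recoveries",
		"enable-global-recoveries",
		"disable-global-recoveries",
	}
	nodes := []*CommandApplier{{}, {}, {}}
	for i, node := range nodes {
		for _, op := range commands {
			node.ApplyCommand(op, nil)
		}
		fmt.Printf("node %d: recovery disabled = %v\n", i, node.recoveryDisabled)
	}
}
```

Because both commands are idempotent, replaying the log after a restart leaves a node in the same state.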
3. API layer
The HTTP layer exposes the corresponding endpoints for external use.
Query the switch state
// CheckGlobalRecoveries checks whether recoveries are disabled globally
func (this *HttpAPI) CheckGlobalRecoveries(params martini.Params, r render.Render, req *http.Request) {}
Disable recoveries globally
// DisableGlobalRecoveries globally disables recoveries
func (this *HttpAPI) DisableGlobalRecoveries(params martini.Params, r render.Render, req *http.Request, user auth.User) {}
Enable recoveries globally
// EnableGlobalRecoveries globally enables recoveries
func (this *HttpAPI) EnableGlobalRecoveries(params martini.Params, r render.Render, req *http.Request, user auth.User) {}
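As a rough illustration of how three such endpoints fit together, the sketch below uses the standard net/http package rather than martini. The URL paths, the plain-text responses, and the recoverySwitch type are assumptions for this example, and the authorization check implied by the auth.User parameter is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// recoverySwitch is a hypothetical in-memory holder of the global state.
type recoverySwitch struct {
	mu       sync.Mutex
	disabled bool
}

// newAPI wires one read-only endpoint and two state-changing endpoints.
func newAPI(s *recoverySwitch) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/check-global-recoveries", func(w http.ResponseWriter, r *http.Request) {
		s.mu.Lock()
		defer s.mu.Unlock()
		if s.disabled {
			fmt.Fprint(w, "disabled")
			return
		}
		fmt.Fprint(w, "enabled")
	})
	// set returns a handler that forces the switch to the given state.
	set := func(disabled bool) http.HandlerFunc {
		return func(w http.ResponseWriter, r *http.Request) {
			s.mu.Lock()
			defer s.mu.Unlock()
			s.disabled = disabled
			fmt.Fprint(w, "OK")
		}
	}
	mux.HandleFunc("/api/disable-global-recoveries", set(true))
	mux.HandleFunc("/api/enable-global-recoveries", set(false))
	return mux
}

// call issues an in-process GET request and returns the response body.
func call(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	api := newAPI(&recoverySwitch{})
	fmt.Println(call(api, "/api/check-global-recoveries"))
	call(api, "/api/disable-global-recoveries")
	fmt.Println(call(api, "/api/check-global-recoveries"))
	call(api, "/api/enable-global-recoveries")
	fmt.Println(call(api, "/api/check-global-recoveries"))
}
```

In a raft setup, the two state-changing handlers would publish a command to the cluster rather than write locally, so that every node sees the change.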
4. snapshot layer
Read the record from the table and write it into the snapshot:
func CreateSnapshotData() *SnapshotData {
snapshotData := NewSnapshotData()
... ...
snapshotData.RecoveryDisabled, _ = IsRecoveryDisabled()
... ...
}
Restore the state from the snapshot back into the table:
func (this *SnapshotDataCreatorApplier) Restore(rc io.ReadCloser) error {
snapshotData := NewSnapshotData()
... ...
// recovery disable
{
SetRecoveryDisabled(snapshotData.RecoveryDisabled)
}
... ...
}
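5. Automatic failure recovery
Before an automatic recovery is started, the global switch is checked. If recovery is disabled globally and the recovery is not forced, the function returns immediately without going any further: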
func executeCheckAndRecoverFunction(analysisEntry inst.ReplicationAnalysis, candidateInstanceKey *inst.InstanceKey, forceInstanceRecovery bool, skipProcesses bool) (recoveryAttempted bool, topologyRecovery *TopologyRecovery, err error) {
... ...
// Check for recovery being disabled globally
if recoveryDisabledGlobally, err := IsRecoveryDisabled(); err != nil {
// Unexpected. Shouldn't get this
log.Errorf("Unable to determine if recovery is disabled globally: %v", err)
} else if recoveryDisabledGlobally {
if !forceInstanceRecovery {
log.Infof("CheckAndRecover: Analysis: %+v, InstanceKey: %+v, candidateInstanceKey: %+v, "+
"skipProcesses: %v: NOT Recovering host (disabled globally)",
analysisEntry.Analysis, analysisEntry.AnalyzedInstanceKey, candidateInstanceKey, skipProcesses)
return false, nil, err
}
log.Infof("CheckAndRecover: Analysis: %+v, InstanceKey: %+v, candidateInstanceKey: %+v, "+
"skipProcesses: %v: recoveries disabled globally but forcing this recovery",
analysisEntry.Analysis, analysisEntry.AnalyzedInstanceKey, candidateInstanceKey, skipProcesses)
}
... ...
}
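6. Dashboard page
On the Dashboard, the switch is viewed and toggled by calling the HTTP endpoints.
As shown in the figure below:
[Figure: global recovery switch on the Dashboard]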