【Azure 微服务】Service Fabric 部署时遇见了VMExtensionProvisioningError错误: Multiple VM extensions failed to be provisioned on the VM

问题描述

Deployment  Azure Service Fabric 时,遇见了VMExtensionProvisioningError, 全文如下:

Deployment Name: 385A084F35B2BC55-xxxx
Operation Id: 096CED
Operation: Create
Correlation Id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Status: Failed
Error:
Code: VMExtensionProvisioningError
Message: Multiple VM extensions failed to be provisioned on the VM. 
Please see the VM extension instance view
for other failures. The first extension failed due to the error:
VM Extension 'RunPowerShell_Frontdoor' is marked as failed since it depends upon the VM Extension 'ServiceFabricNodeVmExt_vmNodeType1Name' which has failed.
Help Link: https://aka.ms/ev2/errors/troubleshooting

 

问题解答

第一步:收集日志

需要远程连接(RDP)到SF的节点中, 在目录 C:\WindowsAzure\Logs\Plugins 中收集VM Extension的安装日志

 

第二步:分析日志

在 Microsoft.Azure.ServiceFabric.ServiceFabricNode plugin 日志中,发现是由于在获取证书失败而出现的错误。

[09/09/2021 12:32:01.86] Executing: C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11\ServiceFabricExtensionHandler.exe enable
[09/09/2021 12:32:03.62] Execution Complete.
######
Execution Output:
1> 2021-09-09T12:32:01.9973303Z: Information: Starting...
1> 2021-09-09T12:32:01.9973303Z: Information:   MachineName:       Frontdoor000000
1> 2021-09-09T12:32:01.9973303Z: Information:   CurrentDirectory:  C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11
1> 2021-09-09T12:32:01.9973303Z: Information:   UserName:          SYSTEM
1> 2021-09-09T12:32:01.9973303Z: Information: Created mutex; this is the only instance of this process running
1> 2021-09-09T12:32:02.3723445Z: Information: HandlerManifest Version: 1
1> 2021-09-09T12:32:02.4348520Z: Information: [HandlerEnvironment]
1> 2021-09-09T12:32:02.4348520Z: Information:   LogFolder:     C:\WindowsAzure\Logs\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11
1> 2021-09-09T12:32:02.4348520Z: Information:   ConfigFolder:  C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11\RuntimeSettings
1> 2021-09-09T12:32:02.4348520Z: Information:   StatusFolder:  C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11\Status
1> 2021-09-09T12:32:02.4348520Z: Information:   HeartbeatFile: C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11\Status\HeartBeat.Json
1> 2021-09-09T12:32:02.4348520Z: Information:   DeploymentId:  xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
1> 2021-09-09T12:32:02.4348520Z: Information:   RoleName:      _Frontdoor_0
1> 2021-09-09T12:32:02.4348520Z: Information:   Instance:      _Frontdoor_0
1> 2021-09-09T12:32:02.5442327Z: Information: Opening config file: C:\Packages\Plugins\Microsoft.Azure.ServiceFabric.ServiceFabricNode\1.1.0.11\RuntimeSettings\1.settings
1> 2021-09-09T12:32:02.5598642Z: Information: Config specified client cert FindBySubjectName 'xxxxx.cluster.prod.ngpproxy.microsoftonline.cn' (store name = My)
1> 2021-09-09T12:32:02.5754780Z: Error: ERROR: Microsoft.Azure.ServiceFabric.Extension.Core.AgentException: Cannot find certificate with 'FindBySubjectName' 'xxxxx.cluster.prod.ngpproxy.microsoftonline.cn' in store 'My'    at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadClientCertificate(String findValue, String certStoreName, X509FindType findBy)    at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadCertificateWrapper(CertificateSettings certificateSetting)
1> 2021-09-09T12:32:02.5754780Z: Warning: Heartbeat: Not Ready: Cannot find certificate with 'FindBySubjectName' 'xxxxx.cluster.prod.ngpproxy.microsoftonline.cn' in store 'My'
1> 2021-09-09T12:32:02.5754780Z: Error: Heartbeat file updated, with error msg
1> 2021-09-09T12:32:02.5911044Z: Error: ERROR: Microsoft.Azure.ServiceFabric.Extension.Core.AgentException: Cannot find certificate with 'FindBySubjectName' 'xxxxx.cluster.prod.ngpproxy.microsoftonline.cn' in store 'My'    at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadClientCertificate(String findValue, String certStoreName, X509FindType findBy)    at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadCertificateWrapper(CertificateSettings certificateSetting)    at Microsoft.Azure.ServiceFabric.Extension.Core.Common.PublicSettingsExtensions.AllowAccessToCerts(PublicSettings publicSettings, ICertificateUtility certificateUtility)    at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.ValidateDeployment(HandlerSettings settings)    at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.InstallService()    at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.InstallServiceWithRetry()    at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.Enable()    at Microsoft.Azure.ServiceFabric.Extension.Handler.Program.Main(String[] args) in X:\bt\1162492\repo\src\HandlerExe\Program.cs:line 52


Execution Error:
Microsoft.Azure.ServiceFabric.Extension.Core.AgentException: Cannot find certificate with 'FindBySubjectName' 'xxxxx.cluster.prod.ngpproxy.microsoftonline.cn' in store 'My'
   at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadClientCertificate(String findValue, String certStoreName, X509FindType findBy)
   at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadCertificateWrapper(CertificateSettings certificateSetting)
   at Microsoft.Azure.ServiceFabric.Extension.Core.Common.PublicSettingsExtensions.AllowAccessToCerts(PublicSettings publicSettings, ICertificateUtility certificateUtility)
   at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.ValidateDeployment(HandlerSettings settings)
   at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.InstallService()
   at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.InstallServiceWithRetry()
   at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.Enable()
   at Microsoft.Azure.ServiceFabric.Extension.Handler.Program.Main(String[] args) in X:\bt\1162492\repo\src\HandlerExe\Program.cs:line 52

最最关键的信息就是 Microsoft.Azure.ServiceFabric.Extension.Core.AgentException: Cannot find certificate with 'FindBySubjectName' 'xxxxx.cluster.prod.ngpproxy.microsoftonline.cn' in store 'My' 。

 

第三步:继续查找不能找到证书的问题

在获取证书的日志中,发现SF是从KV(Key Vault: 密钥保管库)中获取的证书,而在第5步获取auth challenge时,从18:59分02秒 到 19:04分45秒,耗时超过了5 分钟。

  1. 2021-09-09 18:59:02: <info> [CertificateManager] Beginning refresh for: https://xxxxx-chinanorth-kv.vault.azure.cn/secrets/cluster
  2. 2021-09-09 18:59:02: <info> [WindowsCertificateStore] attempting to open store 'LocalMachine\MY'
  3. 2021-09-09 18:59:02: <debug> [WindowsCertificateStore] opening the 'LocalMachine' store..
  4. 2021-09-09 18:59:02: <debug> [WindowsCertificateStore] store opened successfully.
  5. 2021-09-09 18:59:02: <info> [KeyVaultClient] Getting new auth challenge
  6. 2021-09-09 19:04:45: <error> [CertificateManager] Refreshing 'https://xxxxx-chinanorth-kv.vault.azure.cn/secrets/cluster' failed with KeyVault exception: KeyVaultHttpClient.GetChallenge. - http_exception: what = 'WinHttpSendRequest: 12030: The connection with the server was terminated abnormall' code = 'windows:12030'
  7. 2021-09-09 19:04:45: <info> [CertificateManager] Beginning refresh for:

由此可以推断,由于网络的延迟导致部署SF时,向Key Vault请求认证,由于请求超时而导致部署失败。

 

在多次重试重新部署SF,部署成功。

 

posted @ 2021-10-05 22:01  路边两盏灯  阅读(150)  评论(0编辑  收藏  举报