Problems and Solutions When Deploying oVirt in VMware

2021/02 Author: ihunter

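Preface

Quite a few readers have reported problems deploying oVirt inside a VMware virtual machine. I tried it myself and, sure enough, there are pitfalls everywhere. For those with limited hardware who can only learn oVirt inside VMware, here are the problems I ran into and how I solved them.

Note: although it can be made to work, running oVirt inside VMware is still not recommended; use it for learning only. This article covers the HostedEngine deployment method.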

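Environment

Hardware: Thinkpad W510, 4 cores / 8 threads, 16 GB RAM;

VMware: VMware Workstation 12 (VMware Workstation 15 is known to have the same problems; VMware vSphere was not tested, so if you hit problems there, try the solutions in this article);

Host VM (acting as the oVirt node): 2 CPUs with 4 cores each, 8 GB RAM, 200 GB disk;

oVirt version: 4.3.9;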

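Notes:

  1. All of the solutions below must be run over ssh on the host VM;

  2. Make sure every other step works correctly; if you are unsure, consult the tutorials on this site or the official documentation, and pay particular attention to the hostname mappings in /etc/hosts;

  3. If you are deploying oVirt on VMware Workstation, I recommend applying every solution below before you start, and only then running the deployment;

  4. Deployment differs between oVirt versions and the internal details change often, so later releases may no longer have these problems;

  5. Performance inside a VM is limited and can cause unexpected problems, so be prepared to try several times.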

Problems and Solutions

  • Stuck at Get local VM IP, followed by an error:

[ INFO ] TASK [ovirt.hosted_engine_setup : Get local VM IP]

 

[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 90, "changed": true, "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:03:01:30 | awk '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.050951", "end": "2020-04-23 09:57:00.043571", "rc": 0, "start": "2020-04-23 09:56:59.992620", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}


Solution:

In /usr/share/ansible/roles/ovirt.hosted-engine-setup/tasks/bootstrap_local_vm/02_create_local_vm.yml, find the "virt-install" section and change "--machine pc-i440fx-rhel7.6.0" to "--machine pc-i440fx-rhel7.1.0".
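
This edit can also be scripted. The following is a minimal sketch and not part of the original post: the helper names `patch_machine_type` and `patch_file` and the `.bak` backup copy are my own assumptions, and it does nothing more than a plain string replacement of the two machine-type strings named in the solution (the option is spelled `--machine`, with two hyphens).

```python
from pathlib import Path

# The two strings come from the solution text; everything else is illustrative.
OLD_MACHINE = "--machine pc-i440fx-rhel7.6.0"
NEW_MACHINE = "--machine pc-i440fx-rhel7.1.0"


def patch_machine_type(text: str) -> str:
    """Return the task-file text with the virt-install machine type swapped."""
    return text.replace(OLD_MACHINE, NEW_MACHINE)


def patch_file(path: Path) -> bool:
    """Patch a file in place, keeping a .bak copy; return True if it changed."""
    original = path.read_text()
    patched = patch_machine_type(original)
    if patched == original:
        return False  # nothing to do (already patched, or string not present)
    path.with_name(path.name + ".bak").write_text(original)
    path.write_text(patched)
    return True
```

On the host VM you would call `patch_file(Path(...))` with the path of the task file given above; a return value of `False` means the old machine type was not found.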


  • Stuck at Inject network configuration with guestfish:

[ INFO ] TASK [ovirt.hosted_engine_setup : Generate DHCP network configuration for the engine VM]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Generate static network configuration for the engine VM, IPv4]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Generate static network configuration for the engine VM, IPv6]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Inject network configuration with guestfish]

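Solution:

In /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/create_target_vm/03_hosted_engine_final_tasks.yml, find the "guestfish" sections (note that there are two) and add the guestfish environment variable LIBGUESTFS_BACKEND_SETTINGS: force_tcg to each. This setting makes libguestfs run its appliance with software emulation (TCG) instead of KVM.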
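
Because the variable has to be added in more than one place, it is easy to miss one. As a rough check (a sketch under stated assumptions, not part of the oVirt role), the following splits a task file at each `- name:` line and lists the tasks that mention a tool but do not contain the setting. The function names and the text-based splitting are hypothetical simplifications: it does not parse YAML, and it expects the setting to be written exactly as `LIBGUESTFS_BACKEND_SETTINGS: force_tcg`.

```python
from typing import List

# Exact spelling assumed; a quoted value such as "force_tcg" would not match.
SETTING = "LIBGUESTFS_BACKEND_SETTINGS: force_tcg"


def split_tasks(text: str) -> List[List[str]]:
    """Group lines into tasks, starting a new group at every '- name:' line."""
    tasks: List[List[str]] = []
    for line in text.splitlines():
        if line.lstrip().startswith("- name:"):
            tasks.append([line])
        elif tasks:
            tasks[-1].append(line)
    return tasks


def tasks_missing_setting(text: str, tool: str = "guestfish") -> List[str]:
    """Return the names of tasks that mention `tool` but lack the setting."""
    missing = []
    for task in split_tasks(text):
        body = "\n".join(task)
        if tool in body and SETTING not in body:
            name = task[0].split("- name:", 1)[1].strip()
            missing.append(name)
    return missing
```

An empty result for `tool="guestfish"` suggests both places were patched; the same check can be run with `tool="virt-copy-out"` for the fixes further down.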


  • Stuck at Extract /etc/hosts from the Hosted Engine VM:

[ INFO ] TASK [ovirt.hosted_engine_setup : Inject network configuration with guestfish]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Extract /etc/hosts from the Hosted Engine VM]

Solution:

In /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/create_target_vm/03_hosted_engine_final_tasks.yml, find the "virt-copy-out" section and add the environment variable LIBGUESTFS_BACKEND_SETTINGS: force_tcg.


  • Stuck at Wait for the host to be up:

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Always revoke the SSO token]

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Wait for the host to be up]

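Important:

This may be a VM performance issue: adding the host is slow, the task times out, and later steps then fail. It is also possible that the add-host operation was never executed at all, in which case we need to add the host manually. If you are not sure which, log in to the engine and check; the details follow.

First, lengthen the timeout for waiting on the host.

Open /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/bootstrap_local_vm/05_add_host.yml, find the "Wait for the host to be up" section, and change "retries: 120" to "retries: 360".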
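
A blind search-and-replace of "retries: 120" could touch other tasks in the same file, so a scripted version should limit itself to the named task. The sketch below is one way to do that; the function name `raise_retries` and the line-based matching are my own assumptions, and it treats every `- name:` line as the start of a new task rather than parsing YAML.

```python
def raise_retries(text: str,
                  task_name: str = "Wait for the host to be up",
                  old: int = 120,
                  new: int = 360) -> str:
    """Change 'retries: old' to 'retries: new' only inside the named task."""
    lines = text.splitlines(keepends=True)
    in_task = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("- name:"):
            # Entering a new task; remember whether it is the one we want.
            in_task = task_name in stripped
        elif in_task and stripped == f"retries: {old}":
            lines[i] = line.replace(f"retries: {old}", f"retries: {new}")
    return "".join(lines)
```

Read the file, pass its text through `raise_retries`, and write the result back; tasks with other names keep their original retry count.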


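Then, when the deployment reaches this step, pay close attention.

On your local machine (the one running your browser), add a hostname mapping for the node (this host VM) to the hosts file, for example: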

192.168.8.221 node221.com

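Then open the engine VM's temporary admin console by hostname. Note that the address is https://node221.com:6900: the port is 6900, and the hostname is that of your node, not the engine. Once it opens, click "Administration Portal" and log in with the admin account; the password is the one you entered earlier on the deployment page.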


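Then go to "Compute" -> "Hosts" and check whether the host list contains a host that is currently being installed. If the list is empty, click "New" at the top right to add the host manually, entering its name, address, and root password.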


After clicking OK, just wait until the host becomes active and the deployment page is no longer stuck.

  • Stuck at Check engine VM health:

[ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]

[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 180, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.901148", "end": "2020-04-28 05:22:06.475416", "rc": 0, "start": "2020-04-28 05:22:05.574268", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=10895 (Tue Apr 28 05:21:55 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=10895 (Tue Apr 28 05:21:55 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"node221.com\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"35c7eb12\", \"local_conf_timestamp\": 10895, \"host-ts\": 10895}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=10895 (Tue Apr 28 05:21:55 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=10895 (Tue Apr 28 05:21:55 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"node221.com\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"35c7eb12\", \"local_conf_timestamp\": 10895, \"host-ts\": 10895}, \"global_maintenance\": false}"]}

[ INFO ] TASK [ovirt.hosted_engine_setup : Check VM status at virt level]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if engine VM is not running]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Get VDSM's target engine VM stats]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Convert stats to JSON format]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Get target engine VM IP address from VDSM stats]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if Engine IP is different from engine's he_fqdn resolved IP]

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn engine222.com resolves to 192.168.8.222. If you are using DHCP, check your DHCP reservation configuration"}

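Solution:

In the /usr/libexec/vdsm/hooks/before_vm_start directory, create a new file named "50_vm_rhel7_1_0" and make it executable (VDSM only runs hook scripts that are executable). The file content is: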

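#!/usr/bin/python2
# VDSM before_vm_start hook: downgrade the machine type of a VM that is
# about to start from rhel7.6.0 to rhel7.1.0.

import hooking

OLD_MACHINE = 'pc-i440fx-rhel7.6.0'
NEW_MACHINE = 'pc-i440fx-rhel7.1.0'

# Read the libvirt domain XML that VDSM is about to start.
domxml = hooking.read_domxml()

os_elem = domxml.getElementsByTagName('os')[0]
type_elem = os_elem.getElementsByTagName('type')
# getAttribute returns '' when the attribute is missing, so no KeyError here.
if type_elem and type_elem[0].getAttribute('machine') == OLD_MACHINE:
    type_elem[0].setAttribute('machine', NEW_MACHINE)
    # Alternative: type_elem[0].removeAttribute('machine')

# Hand the (possibly modified) XML back to VDSM.
hooking.write_domxml(domxml)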
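
The hook depends on VDSM's `hooking` module, so it cannot be run on its own. To try the rewrite logic in isolation, the sketch below applies the same attribute change to a hand-written sample. It assumes the object returned by `hooking.read_domxml()` behaves like an `xml.dom.minidom` document, which the `getElementsByTagName` and `setAttribute` calls in the hook suggest; the function name `downgrade_machine` and the sample XML are invented for illustration and are far smaller than a real domain definition.

```python
from xml.dom import minidom

OLD_MACHINE = "pc-i440fx-rhel7.6.0"
NEW_MACHINE = "pc-i440fx-rhel7.1.0"

# Invented, minimal stand-in for a libvirt domain definition.
SAMPLE_XML = (
    '<domain type="kvm"><os>'
    '<type arch="x86_64" machine="pc-i440fx-rhel7.6.0">hvm</type>'
    '</os></domain>'
)


def downgrade_machine(domxml):
    """Apply the hook's machine-type rewrite to a minidom document."""
    os_elem = domxml.getElementsByTagName("os")[0]
    type_elem = os_elem.getElementsByTagName("type")
    if type_elem and type_elem[0].getAttribute("machine") == OLD_MACHINE:
        type_elem[0].setAttribute("machine", NEW_MACHINE)
    return domxml


def machine_of(domxml) -> str:
    """Return the machine attribute of the first <type> element."""
    return domxml.getElementsByTagName("type")[0].getAttribute("machine")
```

Calling `machine_of(downgrade_machine(minidom.parseString(SAMPLE_XML)))` shows the value after the rewrite; a document with a different machine type passes through unchanged.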

  • Stuck at Copy engine logs:

[ INFO ] TASK [ovirt.hosted_engine_setup : Add additional gluster hosts to engine]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Add additional glusterfs storage domains]

[ INFO ] skipping: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Fetch logs from the engine VM]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Set destination directory path]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Create destination directory]

[ INFO ] changed: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Find the local appliance image]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Set local_vm_disk_path]

[ INFO ] ok: [localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Give the vm time to flush dirty buffers]

[ INFO ] ok: [localhost -> localhost]

[ INFO ] TASK [ovirt.hosted_engine_setup : Copy engine logs]

Solution:

In /usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/fetch_engine_logs.yml, find "Copy engine logs" and add the environment variable LIBGUESTFS_BACKEND_SETTINGS: force_tcg to the virt-copy-out task.


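Note: whenever a deployment fails, I recommend running the following cleanup steps before deploying again:

1. Kill any qemu VMs that may still be running

pkill qemu-kvm

2. Run the ovirt-hosted-engine cleanup command (enter y to confirm)

ovirt-hosted-engine-cleanup

3. Remove temporary files

rm -rf /var/tmp/localvm*

4. Remove log files

rm -rf /var/log/ovirt-hosted-engine-setup/*

5. Clear the storage used to hold the engine (here, NFS on the local machine)

rm -rf /data/images/nfs/*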
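
Since these steps are repeated after every failed attempt, they can be collected in one place. This is only a sketch: the names `CLEANUP_COMMANDS` and `run_cleanup` are hypothetical, the commands are copied verbatim from the steps above, and the last path is the author's example NFS location, which you must replace with your own. It defaults to a dry run that only reports what would be executed, because the `rm -rf` lines are destructive.

```python
import subprocess
from typing import List

# Commands copied from the cleanup steps; adjust the NFS path to your setup.
CLEANUP_COMMANDS: List[str] = [
    "pkill qemu-kvm",
    "ovirt-hosted-engine-cleanup",
    "rm -rf /var/tmp/localvm*",
    "rm -rf /var/log/ovirt-hosted-engine-setup/*",
    "rm -rf /data/images/nfs/*",
]


def run_cleanup(dry_run: bool = True) -> List[str]:
    """Run (or, by default, only list) the cleanup commands in order."""
    handled = []
    for cmd in CLEANUP_COMMANDS:
        if not dry_run:
            # check=False: pkill exits non-zero when no process matched.
            # stdin is inherited, so the cleanup tool can still prompt for y.
            subprocess.run(cmd, shell=True, check=False)
        handled.append(cmd)
    return handled
```

Call `run_cleanup()` first to review the list, and `run_cleanup(dry_run=False)` as root on the host VM only once the paths are right.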

