03:09:37 <Yumeng> #startmeeting openstack-cyborg
03:09:38 <openstack> Meeting started Thu Nov 26 03:09:37 2020 UTC and is due to finish in 60 minutes.  The chair is Yumeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:09:39 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:09:41 <openstack> The meeting name has been set to 'openstack_cyborg'
03:09:46 <Yumeng> #topic Roll call
03:09:57 <Yumeng> #info Yumeng
03:10:09 <xinranwang> #info xinranwang
03:10:10 <swp20> #info swp20
03:10:30 <Yumeng> #topic Agenda
03:11:41 <Yumeng> #topic vgpu
03:13:33 <Yumeng> swp20: pls continue
03:14:00 <Yumeng> are you saying the detach failure in hotplug is because something is missing in cirros?
03:14:36 <swp20> yeah, the cirros vm process crashes when detaching the gpu device
03:15:06 <swp20> the un-hotplug does not really succeed, in fact.
03:15:58 <swp20> the cirros image is not supported, and centos works well.
03:18:13 <Yumeng> so vgpu hotplug is not supported in cirros but supported in centos, right?
03:18:51 <swp20> i am not sure.
03:18:58 <swp20> i mean re-hotplug
03:19:27 <Yumeng> ok
03:19:43 <Yumeng> did you find out why un-hotplug is not successful?
03:19:59 <swp20> attach, detach and reattach
03:20:19 <swp20> i searched the vm log
03:21:12 <swp20> there is a process crash problem.
03:23:01 <Yumeng> is it an occasional case or does it crash every time?
03:23:41 <swp20> it happens with high probability
03:23:51 <Yumeng> ok
03:24:32 <Yumeng> Has this crash ever happened in CentOS?
03:25:10 <swp20> haven't met it yet.
03:25:25 <Yumeng> ok. got that.
03:25:46 <swp20> cool
03:26:08 <Yumeng> looks like hotplug is image sensitive.
03:26:32 <Yumeng> Thanks wenping for sharing
03:26:33 <swp20> maybe the driver is important.
03:27:09 <Yumeng> do you mean nvidia virtualization driver?
03:27:37 <swp20> no, i mean the driver in the image
03:28:03 <swp20> gpu is not well supported in cirros
03:28:14 <swp20> including vgpu
03:28:39 <swp20> you can test vgpu detach with 'virsh detach-device'
03:29:30 <Yumeng> yes, the VFIO mdev driver is very important. the nvidia virtualization driver version must match the image version
03:29:48 <Yumeng> ok. will try when I get time
03:30:01 <swp20> cool
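
The attach, detach and reattach test that swp20 describes could be scripted roughly as below. This is only a sketch: the domain name, the XML file path and the helper names are hypothetical, and it assumes the vGPU is exposed to libvirt as an mdev hostdev. By default it only prints the virsh commands instead of running them.

```python
import subprocess
from typing import List


def mdev_hostdev_xml(mdev_uuid: str) -> str:
    """Return libvirt <hostdev> XML describing one mediated device (vGPU)."""
    return (
        "<hostdev mode='subsystem' type='mdev' model='vfio-pci'>\n"
        "  <source>\n"
        f"    <address uuid='{mdev_uuid}'/>\n"
        "  </source>\n"
        "</hostdev>\n"
    )


def hotplug_cycle_commands(domain: str, xml_path: str) -> List[List[str]]:
    """Build the attach -> detach -> reattach sequence as virsh argument lists."""
    return [
        ["virsh", "attach-device", domain, xml_path, "--live"],
        ["virsh", "detach-device", domain, xml_path, "--live"],
        ["virsh", "attach-device", domain, xml_path, "--live"],
    ]


def run_cycle(domain: str, xml_path: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in hotplug_cycle_commands(domain, xml_path):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "cirros-vm" and "/tmp/vgpu.xml" are placeholder names for illustration.
    run_cycle("cirros-vm", "/tmp/vgpu.xml")
```

A zero exit status from virsh may not by itself show that the guest released the device, so checking the VM log after the detach step, as swp20 did, is still worthwhile.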
03:30:13 <Yumeng> I also have a vGPU issue to discuss with you
03:30:20 <Yumeng> about the vGPU support
03:30:26 <swp20> yep
03:30:58 <swp20> i think arq bind time is the better point
03:31:03 <swp20> to create the mdev
03:31:42 <swp20> attach_handle is too early
03:32:20 <swp20> and the maintenance task is heavy
03:32:20 <Yumeng> yes, I also think so.
03:33:58 <swp20> so let's confirm this.
03:34:07 <Yumeng> xinranwang what do you think?
03:35:29 <Yumeng> Sylvain prefers creating the mdev when generating the attach_handle. sean and gibi are fine with either
03:36:14 <Yumeng> from my perspective, I also prefer creating the mdev at arq bind
03:37:24 <xinranwang> if gpu's type is determined, the max number of vfs is also determined, right?
03:37:51 <Yumeng> yes
03:38:15 <Yumeng> but if it is changed, we need to delete all the created ones and create new ones
03:38:19 <xinranwang> if we do not create mdev at attach_handle generation step, how many vfs should we report?
03:38:26 <Yumeng> even if they were never used.
03:40:11 <Yumeng> xinranwang: the maximum number
03:40:36 <Yumeng> in the inventory, we always report the maximum number
03:40:43 <xinranwang> ok, got it.
03:41:39 <xinranwang> it seems creating the mdev during binding is more efficient. we just create it when we use it.
03:42:41 <Yumeng> yes, that's also what I mentioned in the nova spec.
03:42:42 <xinranwang> does mdev creation take much time?
03:42:48 <Yumeng> not much.
03:43:12 <xinranwang> will it fail in some cases?
03:44:28 <Yumeng> I tested in my env, but not with a large number of VMs. creating an mdev is very fast
03:44:50 <Yumeng> but not sure what the result is with a large number of VMs
03:45:16 <xinranwang> mdev creation is a serial task, i think.
03:46:04 <xinranwang> anyway, i think creating it at the binding step is more efficient, if there is no obvious gap.
03:46:06 <Yumeng> creation failures are very infrequent.
03:46:30 <Yumeng> haven't met one yet
03:47:14 <Yumeng> xinranwang: ok. cool
03:47:24 <Yumeng> So we agreed on the binding step.
03:47:33 <Yumeng> I will go back to sync with nova guys
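
The approach agreed above (always report the maximum number in the inventory, but create an mdev only when an ARQ is bound) can be illustrated with a small in-memory model. This is a sketch under assumptions, not Cyborg code: the class and method names are hypothetical, and the real sysfs write is replaced by a comment.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VGpuPool:
    """Toy model of one physical GPU with a fixed vGPU type (hypothetical)."""

    max_instances: int  # fixed once the vGPU type is chosen
    mdevs: Dict[str, str] = field(default_factory=dict)  # arq_uuid -> mdev_uuid

    def inventory_total(self) -> int:
        """The reported inventory is always the maximum, created or not."""
        return self.max_instances

    def bind(self, arq_uuid: str) -> str:
        """Create the mdev lazily, at ARQ bind time."""
        if arq_uuid in self.mdevs:
            return self.mdevs[arq_uuid]  # binding is idempotent in this model
        if len(self.mdevs) >= self.max_instances:
            raise RuntimeError("no free vGPU instance on this device")
        mdev_uuid = str(uuid.uuid4())
        # A real driver would create the device here, e.g. by writing the UUID
        # to the kernel's mdev "create" file in sysfs. Omitted in this sketch.
        self.mdevs[arq_uuid] = mdev_uuid
        return mdev_uuid

    def unbind(self, arq_uuid: str) -> None:
        """Drop the mdev when the ARQ is unbound."""
        self.mdevs.pop(arq_uuid, None)


if __name__ == "__main__":
    pool = VGpuPool(max_instances=4)
    pool.bind("arq-1")
    print(pool.inventory_total(), len(pool.mdevs))
```

With this shape, changing the vGPU type only requires cleaning up mdevs that are actually in use, rather than deleting and recreating a full set that may never have been used.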
03:47:58 <Yumeng> ok. nothing else from my side.
03:48:18 <Yumeng> Is there anything else you guys want to mention?
03:49:41 <xinranwang> nothing from my side
03:51:23 <Yumeng> ok~
03:51:32 <Yumeng> lunch time~~
03:51:41 <xinranwang> lol
03:51:46 <xinranwang> bon appetit
03:52:14 <Yumeng> haha
03:52:27 <Yumeng> so let's wrap up today's meeting
03:52:41 <Yumeng> #endmeeting