15:04:20 #startmeeting openstack-cyborg-driver
15:04:21 Meeting started Mon Mar 18 15:04:20 2019 UTC and is due to finish in 60 minutes. The chair is shaohe_feng_. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:04:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:04:25 The meeting name has been set to 'openstack_cyborg_driver'
15:04:39 let's wait for a minute.
15:05:39 #info shaohe_feng_
15:05:40 Fine.
15:06:03 Hi all
15:06:07 Hi xinran.
15:06:11 Sorry I'm late
15:06:16 #info xinranwang
15:06:19 evening xinranwang
15:06:24 #info wangzhh
15:06:35 hi shaohe_feng_ wangzhh
15:06:53 we have not held this meeting for a long time.
15:07:24 #link https://wiki.openstack.org/wiki/Meetings/CyborgDriverTeamMeeting#Agenda_for_next_meeting_:_Mar_18th.2C_2019
15:07:28 here is the agent.
15:07:50 s/agent/agenda
15:07:57 Hi guys
15:08:04 Hi, uncle Li.
15:08:16 Li_Liu: morning uncle Li.
15:08:19 Hi, xiaohei~~
15:08:28 hi shaohe
15:08:38 you guys wanna do a zoom meeting instead?
15:08:38 I want to introduce some hardware accelerators.
15:09:10 1. the currently known types of accelerator card
15:09:45 as we all know, cyborg will support mdev and pci cards.
15:10:11 but now I find there are 2 other kinds of hardware card we can support.
15:10:35 one is ip over PCIE, the other is USB.
15:10:47 i see
15:11:06 wangzhh: do you know these two kinds of cards?
15:11:08 can they fit into our current design?
15:11:29 not sure, so we need to discuss more with them.
15:11:56 I don't know much about ip over pcie, what does that mean?
15:12:29 I think it's a remote case
15:12:55 PCI over ethernet?
15:13:14 Li_Liu: yes.
15:13:43 #link https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/vca-2-visual-compute-accelerator-product-brief.pdf
15:13:53 Li_Liu: No, IP over pci.
15:14:05 from the operating system's point of view, it's still a pci device, right?
15:14:45 Li_Liu: it is a pci device, but you communicate with it by IP.
15:14:52 Li_Liu: it is a local pci card.
15:15:16 such as the vca2 card, see link above,
15:15:34 you mean other hosts can communicate with it over ethernet?
15:16:29 So, actually, it is a pci device?
15:16:31 oh, the local host communicates with the local card over PCIE.
15:16:49 from os view.
15:17:50 wangzhh: from the os view, you may see it as a new kind of device with a new driver.
15:18:18 there's another card from a meeting I attended last week; this seems to be a common approach for some cards.
15:18:58 we can dig more into this kind of card.
15:19:52 for usb cards, the movidius AI card is of this kind.
15:20:10 I think we just need to make sure of 2 things: 1. can os-acc attach it like all the other devices, 2. can the resource fit into our current data model
15:20:53 yes.
15:20:55 as long as these two requirements can be met, we should be good
15:21:26 It makes sense.
15:21:48 I think the usb devices can satisfy these two requirements.
15:22:01 shaohe_feng_ have they finalized the resource structure yet?
15:23:10 Li_Liu: usb, yes.
15:23:31 just a reminder about these 2 kinds of devices.
15:23:31 How about the other one?
15:24:20 wangzhh: I haven't looked into it well yet.
15:24:35 OK, let's go ahead.
15:24:48 OK.
15:24:52 Re-enumeration of hardware cards
15:25:17 most of us know the issue of Re-enumeration.
15:25:52 the issue we discussed last week?
15:26:16 no, but this is a common issue.
15:26:43 the bus of a hardware card may change after we resize the hardware and reboot.
15:27:12 this seems to be a big problem for accelerator management in cyborg.
15:28:01 I have discussed it with Yongli, the main PCI device contributor in nova.
15:28:29 he says nova does not allow resizing hardware
15:29:35 you mean add/remove device after reboot?
15:29:54 yes
15:30:04 unless all VMs are evicted from this node.
15:30:30 Li_Liu: wangzhh: xinranwang: what do you think?
15:31:00 Or do you have any good ideas for hardware resize?
15:31:33 Wuu, IMO, it's better to change status to error or offline in cyborg.
15:31:58 And let the operator sync it manually.
15:32:17 Let's say before restart we have 3 cards, after restart we now have 4 cards
15:32:44 I think the driver can find out which one is the new one, right?
15:32:47 We can supply a tool or api for the operator to update it.
15:32:50 If we plug in a new card on server, and reboot. But the hw resource assigned to an instance does not change.
15:33:17 will the bdf change? if so, that should be an issue.
15:33:20 Li_liu, as xinran said.
15:33:57 the bdf might change, but we don't need to guarantee giving the user the card with the same bdf
15:34:12 the bdf may change; the bus-port of a usb device may also change.
15:34:18 just give the user a card of the same type
15:35:27 The trickiest thing is how to handle the resources that have already been assigned.
15:36:18 if the user has done some work on the old hw, that will be a loss.
15:37:12 in that case, the operator has to notify the user to back up first
15:37:35 size operator should know when the resizing is happening
15:37:44 since*
15:38:50 In 99% of the scenarios tho, I don't think it matters anyway
15:38:57 What about power failure?
15:39:28 how will nova record the hw resource from cyborg? there should be a field on the nova instance to record this. is this attach_handle_uuid?
15:39:28 if a power failure happens, the device should not be resized, right?
15:40:56 if you hotplug hardware before the failure happens, things are also bad.
15:42:33 Li, If we just reboot the server, the bus wont't change?
15:42:43 *won't
15:43:03 lol... as I said.. if operator wants to do this... he/she needs to notify users...
15:43:14 if you do not resize hardware, the bus won't change.
15:43:16 Li_Liu: yes.
15:43:31 wangzhh: no, it will not change
15:43:36 shaohe_feng_, Got it.
15:43:36 wangzhh, I think a simple reboot should not change the bdf
15:43:48 bios just scans the pci tree
15:44:04 if nothing new is inserted, it should not change
15:44:07 live migrate the VM to another host.
15:44:50 that's more complex...
15:45:21 scheduler filter should deal with this part. shaohe_feng_
15:46:19 the data center can scale its hardware. For example, they want to support more AI cards in their existing hosts.
15:47:27 OK, let's keep this issue in mind, maybe we can find a good way to solve it
15:47:31 go ahead.
15:47:52 multi-level resources support
15:48:18 now I want to support a new multi-level card.
15:48:29 similar to fpga card.
15:49:15 for example, there is one region in a card but 4 functions in a region.
15:49:16 sure, to support new cards. as long as it can meet the requirements I mentioned earlier
15:49:38 4 different functions?
15:49:42 there are 3 requirements:
15:50:29 Li_Liu: in my new card, they are the same function, but for fpga, they may be different functions. fpga is more complex.
15:51:03 1. we should know the topology of this device.
15:52:05 2. user can apply for any level of the resources; for example, a whole region or just one function.
15:52:37 3. avoid fragmentation
15:53:35 Li_Liu: cyborg now satisfies the former 2 requirements, right?
15:54:11 shaohe_feng_ it should
15:54:18 Ok, great.
15:54:26 what about 3?
15:54:27 cyborg was designed with these in mind
15:54:36 good.
15:54:49 the 3rd one is related to the scheduling algorithm
15:55:10 we might need to work with nova weigher for that
15:55:22 Li_Liu: that needs cyborg's help.
15:55:31 that's for sure
15:55:47 let me elaborate
15:56:01 3 regions
15:56:03 cyborg can provide a weigher-like mechanism and work with nova
15:56:20 one region with 4 functions.
15:56:43 User 1 applies for one function from region 1
15:58:14 user 2 wants 2 more functions. I expect cyborg to allocate them from region 1 instead of region 2/3.
15:58:48 user 3 wants one more function; it is also from region 1.
15:59:49 the allocation should not scatter among regions 1, 2 and 3
16:00:28 they should be concentrated in 1 region.
16:00:29 that should be easy to do. a weigher would do the job
16:01:24 so user 4 can apply for the remaining 2 whole regions.
16:02:08 Li_Liu: OK, is there a weigher mechanism for it now?
16:02:17 not yet
16:02:27 we can plan this
16:02:42 OK, good.
16:02:44 coz I think numa scheduling also needs this feature
16:02:56 this is useful.
16:03:15 for sure
16:03:40 I will add this to T release planning
16:03:59 there's a common scenario for this feature.
16:04:10 Li_Liu: good, thanks.
16:04:26 npnp
16:04:57 AoB?
16:05:02 I need to pick up my lunch now, you guys can go ahead. don't stay too late.. :P
16:05:08 Li_Liu: wangzhh: xinranwang ?
16:05:16 I am all good
16:05:26 good.
16:05:32 glad to talk with you.
16:05:42 Me, too.
16:06:10 let's end the meeting.
16:06:11 i am fine with that. NUMA should also need the similar mechanism
16:06:33 thanks all.
16:06:55 #endmeeting
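
Note on the fragmentation example (3 regions, 4 functions each): the behaviour asked for in the meeting could be sketched roughly as below. This is only an illustration under assumptions, not cyborg or nova code. The `Region` class and the `pick_region` and `allocate` helpers are hypothetical names made up for this note, and the rule used (prefer the region that fits the request and has the fewest free functions) is one simple packing heuristic, not an agreed design.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical model of one region on a multi-level card."""
    name: str
    total: int = 4  # functions per region, as in the meeting example
    used: int = 0

    @property
    def free(self) -> int:
        return self.total - self.used


def pick_region(regions, wanted):
    """Return the region that fits `wanted` functions and has the fewest free ones."""
    candidates = [r for r in regions if r.free >= wanted]
    if not candidates:
        return None
    # Fewest free functions first; on a tie, min() keeps the earlier region.
    return min(candidates, key=lambda r: r.free)


def allocate(regions, wanted):
    """Reserve `wanted` functions in the chosen region and return its name."""
    region = pick_region(regions, wanted)
    if region is None:
        raise RuntimeError("no region can satisfy the request")
    region.used += wanted
    return region.name
```

With three empty regions, requests for 1, 2 and 1 functions all land in the first region, so the two later whole-region requests (4 functions each) still succeed on the untouched regions, which is the outcome described for user 4.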