03:02:26 #startmeeting openstack-cyborg
03:02:27 Meeting started Thu Oct 24 03:02:26 2019 UTC and is due to finish in 60 minutes. The chair is Sundar. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:02:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:02:30 The meeting name has been set to 'openstack_cyborg'
03:02:43 #topic Who's here
03:02:46 o/
03:02:48 seems you did not finish last meeting
03:02:54 Hi all
03:02:58 #info s_shogo
03:03:02 o/
03:03:02 #info chenke
03:03:05 0/
03:03:12 hi all~
03:03:14 #info shaohe_feng
03:03:19 #info Yumeng
03:03:24 Hi y'all
03:03:30 #topic Agenda
03:03:40 https://wiki.openstack.org/wiki/Meetings/CyborgTeamMeeting#Agenda
03:03:42 what does y' means?
03:03:57 :) "you all"
03:04:22 Anything to add to the agenda?
03:04:30 hi all
03:04:37 Hey xinranwang
03:04:49 #topic Summit and PTG
03:05:26 First, re. project update at the summit, I got clarification after the last meeting.
03:05:39 will miss the Team dinner :')
03:05:41 It is actually on. But it is not recorded as video.
03:06:27 Sorry to have mis-stated before I got the clarification. Yumeng, I will create the slides and we can work on them.
03:07:01 shaohe_feng: we will get you a good dinner in Beijing ;)
03:07:33 Thanks
03:07:48 you will visit Beijing?
03:07:58 after PTG
03:08:05 PTG: #link http://ptg.openstack.org/
03:08:18 Sundar: ok.
03:08:26 All the info for PTG is here.
03:08:45 It turns out that nearly all projects get only a table, not a separate room.
03:09:21 http://ptg.openstack.org/ptg.html Diable is a table, not a room
03:09:25 *Diablo
03:09:48 Not sure how well that works, but that's the case for most projects
03:11:35 Etherpad: The current etherpad (https://etherpad.openstack.org/p/cyborg-ptg-ussuri) will need to be merged into the final one: https://etherpad.openstack.org/p/shanghai-ptg-cyborg
03:11:49 We have three days
03:11:49 I will take care of it, once we have enough content in the current etherpad
03:12:14 chenke: True -- but that is misleading because Wednesday includes the summit too
03:12:36 Yes.
03:12:48 I have activities on Wednesday, such as office hours for Cyborg
03:13:23 Also, the project onboarding is part of the PTG this time. So, we will be spending some time talking to newcomers who want to contribute to Cyborg
03:13:42 I suspect it will really be 2 days of discussion
03:14:44 It seems that time is sufficient.
03:14:54 Finally: the team photo shoot is part of the PTG: on Thu at 2 pm (after lunch). But that is only 10-15 min
03:15:19 Does anybody have any questions or comments on the PG or the summit?
03:15:21 *PTG
03:17:35 OK, moving on
03:17:43 #topic Doc patches
03:18:24 The documentation has merged into stable/train -- yay! Thanks, xinranwang, Yumeng, chenke and all for proposing/reviewing it.
03:18:43 Great.
03:19:00 Now Train has truly left the station for Cyborg. ;)
03:20:02 I see some new patches from Yumeng and others for docs. Yumeng, you don't expect this to be backported to Train, right?
03:21:27 sundar: no, no need to backport to Train. what do you think?
03:21:32 We need to check how this differs from other API doc that got merged
03:21:42 https://review.opendev.org/#/c/690539/
03:22:35 Your patch has more detail.
03:23:51 OK, anything else on docs? We need similar content for ARQs, etc., I suppose.
03:23:56 I took keystone api-ref as a reference, they have deprecated and current APIs
03:24:27 here is the keystone api-ref: https://docs.openstack.org/api-ref/identity/
03:24:31 take a look
03:24:38 Cool.
03:24:54 #topic Other patches
03:25:19 https://review.opendev.org/#/c/685542/
03:25:58 My question is, should we expend more effort on the ksa_adapter and older methods, or should we aim to move to openstacksdk?
03:27:09 According to eric's advice. He suggests we use openstacksdk.
03:28:33 So, what should be the status of this patch?
03:28:50 It is easier to use than before.
03:29:23 If your devstack env test ok.
03:29:47 I suggest to merge this patch and this https://review.opendev.org/#/c/690509/2.
03:30:19 Ok, I'll try it out. The 2nd one is definitely an improvement
03:30:31 Please review this patch too, after async bind merged, GPU is not supported. https://review.opendev.org/#/c/688239/
03:30:37 Ye.
03:30:49 Sundar: Thanks.
03:31:20 And there is a bug in placement client, please review this one too :) https://review.opendev.org/#/c/688231/
03:31:31 Sundar: I think we can merge patch 685542, no more effort on ksa_adapter and openstacksdk for now.
03:32:36 xinranwang, Yumeng: yes to both
03:33:31 Any other patches to be highlighted today? I know there are a bunch of others that need review.
03:35:04 #topic AoB
03:35:21 If there is nothing else, we can get back 25 minutes.
03:36:11 Will we support Orchestrator components (or system IAAS software tools) also want to leverage accelerators?
03:36:25 These components may run on the same host node with VM guest.
03:36:39 Seems this is out of cyborg control. (should we add a reservation mechanism for them?)
03:36:44 Could you elaborate or provide links?
03:37:11 I have attended a meeting about accelerators.
03:38:12 Get the information, not only VM need accelerator, but maybe some other applications run on the host node also need accelerators
03:39:02 It is not common in an OpenStack cluster to have VMs and non-VM processes running on the same compute node.
03:39:51 I mean the non-VM processes are openstack required components
03:40:03 Not sure you know RDT.
03:40:24 Yes, I know RDT. Are you referring to RMD?
03:40:34 No
03:40:37 RDT
03:40:53 the use case is:
03:41:04 Divide cache into 2 parts.
03:41:29 Resource Director Technology (RDT) is a set of Intel processor features for cache/memory management. They can be used in OpenStack in different ways.
03:41:42 1 parts are reserved for system IAAS software run on the node
03:42:04 another parts are reserved for VM.
03:42:49 That part is outside Cyborg though -- it relates to cache/memory and should be configured by the operator. How does it relate to accelerators?
03:43:17 We defined, cyborg can only allocate accelerators to nova, right?
03:43:48 I means like RDT.
03:43:58 we have 5 accelerators.
03:44:11 we divided into 2 parts.
03:45:20 1 parts have 1 acc, it is out of cyborg control. It is reserved for system IAAS software. Admin allocate them
03:45:56 another parts have the remaining 4 acc, cyborg allocate them to nova.
03:46:09 similar like RDT use case.
03:46:38 on we can simple define cyborg.
03:47:20 if cyborg run on a host node. we not allow to allocate accelerators to system IAAS software any more.
03:47:38 we can note this on cyborg docs.
03:47:44 If the admin wants to keep aside some accelerators, he/she can update Placement to mark that resource provider as in-use.
03:48:22 From Cyborg's POV it created the RP but Placement will never pick that RP because it sees it as in-use
03:49:26 As we know, cinder has enabled qat driver, I guess shaohe want to know if Cyborg can help cinder manage this kind of accelerator, is that what you want to say, shaohe_feng?
03:49:44 not only cinder
03:49:53 also for some VPN on the host.
03:50:02 what does POV means?
03:50:02 and storage
03:50:08 yes, cinder is just an example
03:50:12 chenke: point of view
03:50:17 DPDK, SPDK also need.
03:50:25 yes, it is an example
03:50:49 so does that means, let cyborg run first before other system IAAS software, it report resources to placement first
03:51:34 and admin reserve some accelerators, and start the remain system IAAS softwares
03:52:11 and admin reserve some accelerators to the remain system IAAS softwares
03:52:57 Not sure where this discussion is headed. If Cinder is using a device, operator should not configure Cyborg driver for that device. If the admin wants to dedicate some networking/storage devices for DPDK/SPDK and they were claimed by Cyborg, the admin could still mark them in placement, as I said.
03:53:01 We can add this usage as a guideline in cyborg doc.
03:54:04 Sundar: can we add this to the cyborg doc?
03:54:17 shaohe_feng: It may be better to wait for concrete usage by operators and get their feedback, instead of adding hypothetical use cases in the docs -- they could be confusing.
03:54:19 maybe DPDK as example.
03:55:10 I am also interested in disabling to allocate some accelerators, as maintenance.
03:55:43 s_shogo: Only when no VM is using that device, right?
03:55:56 Sundar: yes
03:56:00 Maybe we had better not allow DPDK to use accelerators any more. This can make things more simple.
03:56:09 Currently, operator plays this role.
03:57:18 For example, had better to avoid DPDK to use some network accelerator.
03:57:50 Then things is simple.
03:57:53 s_shogo: Would the admin-Placement method work for now? We had plans to add health reporting and enabling/disabling devices. Once we have the /v2/devices API, we could add this functionality. Seems ok?
03:58:46 shaohe_feng: Cyborg does not support network resources today.
03:59:02 yes, just a example.
03:59:35 QAT can be used for some network accelerations
03:59:47 and it can be used by DPDK.
04:00:00 not sure cyborg will support QAT
04:00:23 Sundar: Yes, the approach is good. I want to discuss the methods continuously with example usecase for maintenance and that's requirements.
04:00:26 Or cyborg just plan to manage only VM used accelerators.
04:01:37 shaohe_feng: Do you have specific knowledge that somebody wants to have several QATs in one host, some for VMs, and some for DPDK on the host?
04:02:45 s_shogo: Sure, let's make this a discussion topic at the PTG. Your I could create an etherpad with the problem statement and some ideas.
04:03:15 *You or I
04:04:25 usually, they use kolla to deploy the system IAAS softwares include (placement), How do kolla know what's accelerators in placement?
04:04:30 Sundar : Thanks, I'll update the etherpad (part of the topic is already written)
04:05:31 My suggestion: let's get Nova integ done and make something functional. Then we can start talking about more advanced use cases.
04:05:52 We are over the time. Is there anything else?
04:06:49 Yes, the meeting tells us DPDK can use accelerators. But they does not say DPDK can not live with VM on same host.
04:07:10 Ok.
04:07:36 seems, we did not support much advanced use cases at present.
04:07:40 step by step
04:08:10 shaohe_feng: Let's take this offline.
04:08:16 Good. Thanks a lot, everybody. Have a good day!
04:08:20 #endmeeting