03:02:58 #startmeeting openstack-cyborg
03:02:59 Meeting started Thu Aug 22 03:02:58 2019 UTC and is due to finish in 60 minutes. The chair is Sundar. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:03:00 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:03:02 The meeting name has been set to 'openstack_cyborg'
03:03:26 Agenda: https://wiki.openstack.org/wiki/Meetings/CyborgTeamMeeting#Agenda
03:03:32 #topic Roll call
03:03:45 #info Coco_gao
03:03:52 #info Yumeng
03:03:54 #info s_shogo
03:04:11 #info chenke
03:04:14 #info Li_Liu
03:04:52 Welcome, all
03:05:09 #topic Python 3
03:05:54 There has been some feedback from TC to fix Python 3 tests and make them voting jobs
03:06:10 We should minimally try to get the Python 3 stuff fixe
03:06:12 *fixed
03:06:37 s_shogo: Thanks for your patches. Is it possible to prioritize it? We can even test with the patches
03:07:08 #info xinranwang
03:07:52 I'm waiting for the P1-P9 patches to merge; after that, I'll remove the WIP from the py3 patch.
03:08:47 We can test and update the Py3 patch with the P5-P9 patches. I agree we need to speed up the merging of those patches. But we don't need to wait for that to update Py3 fixes, right?
03:09:23 ok, I'll delete the WIP and request review today.
03:09:48 Thanks
03:10:12 I intend to commit the patch as "non-voting". Are there any problems?
03:10:56 We all voted in a past IRC meeting that it should be non-voting for now. So I agree.
03:11:08 Ok.
03:11:09 Do you need help with testing with the patches? You could do: 'git review -d 670470' and then 'git review -x 667524'
03:11:26 That will pull the P5-P9 patches and your patch in one branch
03:12:02 Thanks, I'll do that.
03:12:13 Great. Thanks a lot, s_shogo :)
03:12:19 #topic Merging patches
03:12:48 As s_shogo pointed out, a lot of things are blocked on merging the P5-P9 patches. Can you all please prioritize that?
03:13:05 Agree
03:13:20 Sure.
03:14:09 About the P5 patch: a lot of patches depend on it.
03:14:23 Great. Xinran's patches on Nova notification and Placement are important too. Once they also merge, we can enable tempest CI, and that is what we need to kick off Nova integ.
03:14:58 hi Sundar: I noticed you gave a -2 on P5: Basic changes for API layer. https://review.opendev.org/#/c/670466/ Are there any problems?
03:15:00 Coco_gao: Agreed. I have tested the P5-P9 with fake and FPGA drivers.
03:15:00 Yumeng also needs to modify the database
03:15:44 Sundar: do you want to discuss vendor_id, vendor_name stuff, which we discussed yesterday?
03:15:59 Yumeng: It is a procedural -2 to prevent the patch series from merging piece by piece. It will be cleaner if we all review all the patches as a whole and then the whole thing merges around the same time.
03:16:22 Same reasoning as with the -2 for https://review.opendev.org/631242 in Nova
03:17:05 Please review the placement report patch; it has less dependency on the P5-P6 patches. :)
03:17:11 Sundar: got it. thanks
03:17:14 So can we make a priority list for the patches?
03:18:15 Coco_gao: I would suggest: P5-P9, Nova notification, Placement report. What do you all think?
03:18:32 Not saying other patches are less important :)
03:18:51 Just trying to unblock as much as we can and get the ball back to Nova developers' court.
03:19:17 I think some of the patches had dependency issues so I am asking about the priority.
03:19:22 I think we can review placement starting now, because it has less dependency on your patches.
03:19:47 notification after p5-p9
03:20:05 Both dependencies and our Train release goal matter.
03:20:12 xinranwang: Not disagreeing, but Placement needs to be tested with P5-P9 patches, right?
03:20:24 No
03:20:46 P5-P9, Nova notification, Placement report. What about python3?
03:20:49 we can run the placement patch without nova-cyborg integration
03:21:29 Ok, we could do partial testing with Cyborg driver -> agent -> conductor (db) -> Placement.
Without using Nova. Is that what you are saying?
03:21:44 Yes, exactl.
03:21:51 *exactly
03:21:58 OK, got that.
03:22:00 Agreed.
03:22:35 Also, placement provides a command line such as 'openstack resource provider list'; we can check the result directly with it.
03:23:11 Ok, then order with dependencies is: [ P5-P9, Placement report], then nova notification. Sounds good?
03:23:31 Great.
03:23:34 yes.
03:24:16 yes, great
03:24:17 I see there are several patches which had conflict problems.
03:24:36 Will that be solved if we merge P5-P9?
03:25:17 Coco_gao_: In https://review.opendev.org/#/q/status:open+project:openstack/cyborg+branch:master, I see 4 merge conflicts and they are all old patches only for reference. Rest should be good.
03:25:18 Seems good to me because I know where to start my reviews.
03:26:13 Sundar wangzhh: I noticed that the GPU driver reports the vendor ID and fills the vendor field with the ID. Is there any possibility that we fill the vendor field with the vendor name?
03:27:11 IMHO, we need the vendor name for the device somewhere, so that we can create the traits with the vendor name, as listed in https://review.opendev.org/gitweb?p=openstack/cyborg-specs.git;f=specs/train/approved/cyborg-nova-placement.rst;hb=refs/changes/45/603545/7#l179
03:27:14 We can get the vendor name from the vendor ID, right?
03:27:44 Since the vendor ID is unique
03:28:01 Yes, but the driver report does not include the vendor name as a field today
03:28:14 Of course, but I suggest recording the vendor ID in the db and translating it to a name if needed.
03:28:39 Perhaps it will be easiest to add vendor name as a separate field in the driver_objects.device, to minimize changes to GPU driver and other drivers
03:29:14 OK.
03:29:26 Agree that we should keep vendor ID
03:29:38 each driver should know about the mapping relation between id and name, but cyborg-agent/conductor don't
03:30:11 The Cyborg driver for a vendor could even hardcode the vendor name, if it will handle only one vendor.
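A minimal sketch of the ID-to-name mapping discussed above, assuming the driver keeps a small lookup table of its own. The names VENDOR_ID_TO_NAME and vendor_name_from_id are hypothetical and are not taken from the Cyborg code; 0x8086 (Intel) and 0x10de (NVIDIA) are the standard PCI vendor IDs.

```python
# Hypothetical driver-side table: PCI vendor ID -> short vendor name.
# Only the driver needs this; agent and conductor just store what it reports.
VENDOR_ID_TO_NAME = {
    "0x8086": "intel",
    "0x10de": "nvidia",
}


def vendor_name_from_id(vendor_id):
    """Return the vendor name for a PCI vendor ID string, or None if unknown.

    Accepts IDs with or without the '0x' prefix, in any letter case.
    """
    key = vendor_id.lower()
    if not key.startswith("0x"):
        key = "0x" + key
    return VENDOR_ID_TO_NAME.get(key)
```

Keeping the ID as the stored value and deriving the name on demand matches the suggestion to record the vendor ID in the db; a driver that handles a single vendor could skip the table and return a constant.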
03:30:35 So IMO, driver should report vendor name anyway.
03:30:54 xinranwang: Agreed. Proposal: Add vendor name to driver_objects.device . Any objections?
03:30:56 Sundar, It is hardcoded in driver now.
03:31:08 Agreed
03:31:26 no objection.
03:31:45 Btw, I find driver name in the deployable table, but no assignment.
03:32:04 Yes, I translate the vendor ID to name for FPGA in my DEMO
03:32:05 Agree. The object.device will not add vendor name? Only driver_objects.device will?
03:32:54 it can add a vendor name
03:33:24 Coco_gao_: Good point. It will be easier to add it to device object too, for use in ext-arq binding for creating traits
03:33:24 but maybe more driver info, such as driver version.
03:34:06 Do we need to add that to the db then? Otherwise, how do we get it in the device object by querying?
03:34:07 So just add name in object, not in db, right?
03:34:09 My question is do you think we should keep vendor name in object or database?
03:34:35 for a long term. maybe we can translate the ID to name at present.
03:34:37 DB
03:34:58 vendor ID is already in DB?
03:35:03 Sundar: what does the driver name you added in the deployable table stand for?
03:35:22 coco yep
03:35:27 If you don't add the vendor name to db, the translation from id to name may have to be outside the driver too, which is not good
03:35:32 Coco_gao_: yes, in device tables
03:36:09 Coco_gao_: yes. already have vendor_id in db device table
03:36:42 About the ext-arq binding, do we need to query DB for vendor name?
03:36:44 xinranwang: It is the driver that is used for programming and device updates. At discovery time, we know which deployable has which driver. So, we can use that info to locate the driver needed for programming.
03:37:00 https://github.com/shaohef/cyborg/commit/7e353bf09259a9b4cfba3763f3a99eb1adfaaa2b#diff-7ca60ed56d570ebe3490
03:37:11 Sundar:
03:37:20 there is no assignment of this field
03:37:24 Coco_gao_: For ext_arq binding, we are querying db for the deployable now.
We can extend that to query the device and its vendor name too
03:37:38 maybe we can extract vendor name from this field?
03:37:40 here is what I do, just for a temporary solution
03:38:19 driver_name = db_deployable.driver_name
03:39:44 xinranwang: The intention is that, when the agent discovers a device from a driver, it sends the driver name to the conductor and has it persisted.
03:40:39 shaohe_feng: That is what we are doing today. We could treat the driver name and vendor name as the same, at least for Train. Is that what I am hearing from many of you?
03:41:24 Sundar: so what will it look like? intel_fpga_driver? or 0x8086 like shaohe's demo does.
03:41:40 https://github.com/shaohef/cyborg/blob/profile-ctrl/cyborg/accelerator/drivers/fpga/base.py#L23
03:42:16 for fpga driver, I maintain a map, both vendor ID and name can work.
03:42:40 Sundar: yes, we need a driver name
03:43:17 xinranwang: just 'intel', for example. That is what we use to load a driver module. The intel_fpga_driver is the name in setup.cfg.
03:44:08 so we should fill driver_name with 'intel' for example?
03:44:28 if so, we can extract vendor name from driver_name.
03:44:54 Yes, xinranwang. To summarize, I think we are agreeing to update the driver_name field in the deployable and use that as 'vendor name' to create the traits as per https://review.opendev.org/gitweb?p=openstack/cyborg-specs.git;f=specs/train/approved/cyborg-nova-placement.rst;hb=refs/changes/45/603545/7#l179
03:45:02 Sounds good?
03:45:21 Then no need to add vendor_name to any object
03:45:44 yes
03:46:56 Great. If others have objections, please LMK. Let's move on in the interest of time
03:47:04 #topic python cyborg client
03:47:44 I'm working on that with the P5-P9 patches in advance.
03:48:12 Thanks, s_shogo. I am wondering if we have the time to update the cyborg client to v2 API?
03:48:28 So zhenghao's patch should use Vendor ID and then update deployable vendor name?
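A rough sketch of how driver_name could double as the vendor name when building a trait, along the lines agreed above. The helper vendor_trait and the CUSTOM_<TYPE>_<VENDOR> layout are assumptions for illustration only; the linked spec is the authority on the actual trait format. The one constraint relied on here is Placement's rule that custom traits start with CUSTOM_ and contain only A-Z, 0-9 and underscore.

```python
import re


def vendor_trait(device_type, driver_name):
    """Build a Placement-style custom trait from a device type and driver name.

    Hypothetical helper: the CUSTOM_<TYPE>_<VENDOR> layout is assumed,
    not confirmed by the meeting log.
    """
    raw = "CUSTOM_{}_{}".format(device_type, driver_name).upper()
    # Replace anything Placement would reject in a custom trait name.
    return re.sub(r"[^A-Z0-9_]", "_", raw)


# driver_name as stored on the deployable, e.g. 'intel'
trait = vendor_trait("fpga", "intel")
```

With driver_name filled as 'intel', no separate vendor_name field is needed on any object, which is the outcome the discussion settled on for Train.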
03:48:47 Coco_gao_: Yes
03:48:48 update to deployable table with vendor name
03:48:54 Sounds good.
03:49:21 It already has a driver name. At least today, driver name and vendor name are the same.
03:49:36 s_shogo: Maybe we need somebody to help you with the client?
03:49:43 Sundar: ok, I think so. I'll report the prospects for the v2 API work in every IRC meeting.
03:50:12 s_shogo: Thanks a lot. Appreciate it. :) Please do ask if you need help.
03:50:49 BTW, TC has proposed: https://review.opendev.org/#/c/675317/
03:50:50 Sundar: Ok, Thanks.
03:51:25 They are proposing an intermediate release of the client. IMHO, it is useless because it is currently v1 based, has no Python 3 fixes and so nothing new from Stein release
03:51:47 We may as well club the client update with Train release. What do you all think?
03:53:47 Coco_gao_, wangzhh, xinranwang, shaohe_feng, Li_Liu, all: ^
03:54:39 Agree if we have time. I don't know whether we can make that happen, are you optimistic about that?
03:54:51 what does club mean....
03:55:13 xinranwang: I mean combine client update with rest of Train release.
03:55:54 Coco_gao_: Yes, I share that concern. Once s_shogo lets us know if it can happen by Sep, we can decide.
03:56:25 Yes, nice to have.
03:57:02 Yes, it will probably be lower priority than Nova integ, Python 3, RBAC/security etc. Anybody thinks otherwise?
03:57:14 Agree.
03:57:33 #topic Other important Train tasks
03:57:48 should the async bind be in this release?
03:58:04 wangzhh: Could you give us an update on RBAC, whenever you have the time?
03:58:21 shaohe_feng: Yes. It is needed for Nova integ
03:58:33 I notice current code does not create deployable with driver_name, does it? Maybe need one patch on it?
03:58:41 OK.
03:58:52 xinranwang: Yes, it is a gap
03:58:54 Then we will add 2 more states for bind.
03:59:15 ARQ_BIND_STARTED and ARQ_DELETING
03:59:23 BIND_STARTED is already being added in my patches
03:59:32 But I agree
03:59:33 ARQ_INITIAL, ARQ_BIND_STARTED, ARQ_BOUND, ARQ_DELETING, ARQ_UNBOUND, ARQ_BIND_FAILED
03:59:59 there will be 7 states
04:00:11 Use “->” for transform, “X->” for can’t transform.
04:00:15 TING X-> ARQ_BOUND
04:00:20 ARQ_BIND_FAILED
04:00:26 any comments on it?
04:00:28 Yes. Between bind and unbind, making bind async is more important. We don't do much in unbind today
04:00:34 Sundar: I can help on it. I will submit a patch by today to fix that.
04:00:53 xinranwang: Excellent. Thanks :)
04:00:56 ^ Sundar Coco_gao_ xinranwang wangzhh Yumeng
04:01:09 Sundar: np
04:01:10 shaohe_feng: I agree
04:01:10 We'd better merge the old P5-P9 patches first, and then discuss what the states are?
04:01:36 Maybe a new patch for states update?
04:02:17 I think shaohe_fend will propose a new patch for async un/bind
04:02:25 *shaohe_feng
04:02:31 That's good.
04:02:38 wangzhh: Can you please let us know about RBAC?
04:02:50 btw, please review this https://review.opendev.org/#/c/677436/ when you get time :)
04:02:51 Yes. After discussion, I should also implement the code. So I'm afraid there is no time left for me
04:02:54 maybe he is offline
04:03:08 Test in my local env
04:03:13 we should discuss it ASAP
04:03:31 you can arrange a meeting to discuss that.
04:03:41 probably submit patch tomorrow
04:03:42 any comments on the states transform above?
04:04:03 shaohe_feng: I agree with your states. I think they will suffice. Let's start with them and have people code review it?
04:04:05 ^ Sundar Coco_gao_ xinranwang wangzhh Yumeng
04:04:37 I will also prioritize it for next week's IRC call
04:04:41 As good as my expectation, thank you.
04:05:03 Hopefully, P5-P9 patches would have merged by then :)
04:05:21 shaohe_feng: no objections ^^
04:05:30 Let's set a deadline for P5-P9 reviews?
04:05:30 #topic AoB
04:05:46 Coco_gao_: That would be welcome!
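A sketch of how the ARQ states listed above could be checked in code. The six state names come from the log; the transition table is an assumption, since the transition list in the log is truncated and does not spell out which moves are allowed. The names ALLOWED_TRANSITIONS and can_transition are hypothetical.

```python
# State names as listed in the meeting; the allowed moves below are
# illustrative guesses, not the transitions proposed in the meeting.
ALLOWED_TRANSITIONS = {
    "ARQ_INITIAL": {"ARQ_BIND_STARTED", "ARQ_DELETING"},
    "ARQ_BIND_STARTED": {"ARQ_BOUND", "ARQ_BIND_FAILED", "ARQ_DELETING"},
    "ARQ_BOUND": {"ARQ_UNBOUND", "ARQ_DELETING"},
    "ARQ_BIND_FAILED": {"ARQ_UNBOUND", "ARQ_DELETING"},
    "ARQ_UNBOUND": {"ARQ_BIND_STARTED", "ARQ_DELETING"},
    "ARQ_DELETING": set(),  # assumed terminal: nothing follows a delete
}


def can_transition(current, new):
    """Return True if moving from `current` to `new` is in the table."""
    return new in ALLOWED_TRANSITIONS.get(current, set())
```

A table like this keeps the async bind path explicit: ARQ_BIND_STARTED is the only way into ARQ_BOUND, so a bind that has not been started cannot be reported as complete.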
04:05:51 shaohe no objections
04:06:07 So far so good.
04:06:17 hey, btw, I would like to introduce my colleague chenke to you; he will join us and contribute to cyborg.
04:06:21 What deadline do we want to set?
04:06:26 so are you satisfied with the state transform?
04:06:33 end of this week?
04:06:44 including the weekend.
04:06:45 Welcome, chenke :)
04:06:51 welcome cheke
04:06:58 welcome chenke
04:06:59 welcome
04:07:06 welcome
04:07:06 I have already seen your patch, welcome chenke.
04:07:08 Coco_gao_: Sounds good to me. Others good with that?
04:07:09 Hello everyone, I am new here, I hope to contribute to cyborg.
04:07:24 sorry s/cheke/ chenke
04:07:30 Thanks.
04:07:31 chenke, could you please tell us your background or skill set?
04:07:39 Ok.
04:07:42 Previously, I mainly contributed to the watcher project.
04:08:26 Ok, sounds good
04:08:57 I have been working on watcher for two years. Now I am a core for watcher.
04:09:23 Sounds really good
04:09:29 I am happy to contribute to cyborg.
04:09:30 In terms of other Train tasks, we continue to work on enabling tempest. The https://review.opendev.org/677436 is needed to enable upstream tempest CI
04:09:30 Glad you can join us.
04:11:00 chenke, BTW, we have a Storyboard: https://storyboard.openstack.org/#!/project/openstack/cyborg . Please feel free to see if you find anything interesting that you can take up. You can reach us on IRC or me by email (sundar.nadathur@intel.com)
04:11:27 I will do it. Thanks.
04:11:43 Great. wangzhh: Thanks for your response on RBAC too.
04:11:54 Anything else for today?
04:12:00 No
04:12:07 no
04:12:31 no from my side
04:12:38 no
04:12:39 Good night Sundar
04:12:43 Thanks for a productive discussion, everybody. Have a good day (or night)!
04:12:48 bye
04:12:51 bye
04:12:52 #endmeeting