03:02:58 <Sundar> #startmeeting openstack-cyborg
03:02:59 <openstack> Meeting started Thu Aug 22 03:02:58 2019 UTC and is due to finish in 60 minutes.  The chair is Sundar. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:03:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:03:02 <openstack> The meeting name has been set to 'openstack_cyborg'
03:03:26 <Sundar> Agenda: https://wiki.openstack.org/wiki/Meetings/CyborgTeamMeeting#Agenda
03:03:32 <Sundar> #topic Roll call
03:03:45 <Coco_gao_> #info Coco_gao
03:03:52 <Yumeng> #info Yumeng
03:03:54 <s_shogo> #info s_shogo
03:04:11 <chenke> #info chenke
03:04:14 <Li_Liu> #info Li_Liu
03:04:52 <Sundar> Welcome, all
03:05:09 <Sundar> #topic Python 3
03:05:54 <Sundar> There has been some feedback from TC to fix Python 3 tests and make them voting jobs
03:06:10 <Sundar> We should minimally try to get the Python 3 stuff fixe
03:06:12 <Sundar> *fixed
03:06:37 <Sundar> s_shogo: Thanks for your patches. Is it possible to prioritize it? We can even test with the patches
03:07:08 <xinranwang> #info xinranwang
03:07:52 <s_shogo> I'm waiting for the P1-P9 patches to merge; after that, I'll remove the WIP from the py3 patch.
03:08:47 <Sundar> We can test and update the Py3 patch with the P5-P9 patches. I agree we need to speed up the merging of those patches. But we don't need to wait for that to update Py3 fixes, right?
03:09:23 <s_shogo> ok, I'll delete the WIP and request review today.
03:09:48 <Coco_gao_> Thanks
03:10:12 <s_shogo> I intend to commit the patch as "non-voting". Are there any problems?
03:10:56 <Sundar> We all voted in a past IRC meeting that it should be non-voting for now. So I agree.
03:11:08 <s_shogo> Ok.
03:11:09 <Sundar> Do you need help with testing with the patches? You could do: 'git review -d 670470' and then 'git review -x 667524'
03:11:26 <Sundar> That will pull the P5-P9 patches and your patch in one branch
03:12:02 <s_shogo> Thanks, I'll do that.
03:12:13 <Sundar> Great. Thanks a lot, s_shogo :)
03:12:19 <Sundar> #topic Merging patches
03:12:48 <Sundar> As s_shogo pointed out, a lot of things are blocked on merging the P5-P9 patches. Can you all please prioritize that?
03:13:05 <Coco_gao_> Agree
03:13:20 <wangzhh> Sure.
03:14:09 <Coco_gao_> About the P5 patch: a lot of patches depend on it.
03:14:23 <Sundar> Great. Xinran's patches on Nova notification and Placement are important too. Once they also merge, we can enable tempest CI, and that is what we need to kick off Nova integ.
03:14:58 <Yumeng> hi Sundar: I noticed you gave a -2 on P5: Basic changes for API layer. https://review.opendev.org/#/c/670466/    are there any problems?
03:15:00 <Sundar> Coco_gao: Agreed. I have tested the P5-P9 with fake and FPGA drivers.
03:15:00 <Coco_gao_> Yumeng also needs to modify the database
03:15:44 <xinranwang> Sundar:  do you want to discuss vendor_id, vendor_name stuff, which we discussed yesterday?
03:15:59 <Sundar> Yumeng: It is a procedural -2 to prevent the patch series from merging piece by piece. It will be cleaner if we all review all the patches as a whole and then the whole thing merges around the same time.
03:16:22 <Sundar> Same reasoning as with the -2 for https://review.opendev.org/631242 in Nova
03:17:05 <xinranwang> Please review the Placement report patch; it has fewer dependencies on the P5-P6 patches. :)
03:17:11 <Yumeng> Sundar: got it . thanks
03:17:14 <Coco_gao_> So can we make the priority list about the patches?
03:18:15 <Sundar> Coco_gao: I would suggest: P5-P9, Nova notification, Placement report. What do you all think?
03:18:32 <Sundar> Not saying other patches are less important :)
03:18:51 <Sundar> Just trying to unblock as much as we can and get the ball back to Nova developers' court.
03:19:17 <Coco_gao_> I think some of the patches had dependency issues so I am asking about the priority.
03:19:22 <xinranwang> I think we can review placement from now, cause it has less dependency on your patches.
03:19:47 <xinranwang> notification after p5-p9
03:20:05 <Coco_gao_> Both the dependencies and our Train release goal matter.
03:20:12 <Sundar> xinranwang: Not disagreeing, but Placement needs to be tested with P5-P9 patches, right?
03:20:24 <xinranwang> No
03:20:46 <Coco_gao_> P5-P9, Nova notification, Placement report.  What about python3?
03:20:49 <xinranwang> we can run placement patch without nova-cyborg integration
03:21:29 <Sundar> Ok, we could do partial testing with Cyborg driver -> agent -> conductor (db) -> Placement. Without using Nova. Is that what you are saying?
03:21:44 <xinranwang> Yes, exactl.
03:21:51 <xinranwang> *exactly
03:21:58 <Coco_gao_> OK, got that.
03:22:00 <Sundar> Agreed.
03:22:35 <xinranwang> Also, Placement provides command-line tools such as 'openstack resource provider list'; we can check the result directly with them.
03:23:11 <Sundar> Ok, then order with dependencies is: [ P5-P9, Placement report], then nova notification. Sounds good?
03:23:31 <Coco_gao_> Great.
03:23:34 <chenke> yes.
03:24:16 <xinranwang> yes,  great
03:24:17 <Coco_gao_> I see there are several patches which have merge conflicts.
03:24:36 <Coco_gao_> Will that be solved if we merge P5-P9?
03:25:17 <Sundar> Coco_gao_: In https://review.opendev.org/#/q/status:open+project:openstack/cyborg+branch:master, I see 4 merge conflicts and they are all old patches only for reference. Rest should be good.
03:25:18 <Coco_gao_> Seems good to me because I know where to start my reviews.
03:26:13 <xinranwang> Sundar wangzhh: I noticed that the GPU driver reports the vendor ID and fills the vendor field with that ID. Is there any possibility of filling the vendor field with the vendor name instead?
03:27:11 <Sundar> IMHO, we need the vendor name for the device somewhere, so that we can create the traits with the vendor name, as listed in https://review.opendev.org/gitweb?p=openstack/cyborg-specs.git;f=specs/train/approved/cyborg-nova-placement.rst;hb=refs/changes/45/603545/7#l179
03:27:14 <Coco_gao_> We can get the vendor name from vendor ID right?
03:27:44 <Coco_gao_> Since the vendor ID is unique
03:28:01 <Sundar> Yes, but the driver report does not include the vendor name as a field today
03:28:14 <wangzhh> Of course, but I suggest recording the vendor ID in the db and translating it to a name if needed.
03:28:39 <Sundar> Perhaps it will be easiest to add vendor name as a separate field in the driver_objects.device, to minimize changes to GPU driver and other drivers
03:29:14 <Coco_gao_> OK.
03:29:26 <Coco_gao_> Agree that we should keep vendor ID
03:29:38 <xinranwang> each driver should know about the mapping relation between id and name, but cyborg-agent/conductor don't
03:30:11 <Sundar> The Cyborg driver for a vendor could even hardcode the vendor name, if it will handle only one vendor.
03:30:35 <xinranwang> So IMO, driver should report vendor name anyway.
03:30:54 <Sundar> xinranwang: Agreed. Proposal: Add vendor name to  driver_objects.device . Any objections?
03:30:56 <wangzhh> Sundar, It is  hardcoded in driver now.
03:31:08 <wangzhh> Agreed
03:31:26 <xinranwang> no objection.
03:31:45 <xinranwang> Btw, i find driver name in deployable tables, but no assignment.
03:32:04 <shaohe_feng> Yes, I translate the vendor ID to name for FPGA in my DEMO
03:32:05 <Coco_gao_> Agree. The object.device will not add vendor name? Only driver_objects.device will?
03:32:54 <shaohe_feng> it can add a vendor name
03:33:24 <Sundar> Coco_gao_: Good point. It will be easier to add it to device object too, for use in ext-arq binding for creating traits
03:33:24 <shaohe_feng> but maybe more driver infos, such as driver version.
03:34:06 <Sundar> Do we need to add that to the db then? Otherwise, how do we get it in the device object by querying?
03:34:07 <wangzhh> So just add name in object, not in db, right?
03:34:09 <Coco_gao_> My question is do you think we should keep vendor name in object or database?
03:34:35 <shaohe_feng> for a long term.  maybe we can translate the ID to name at present.
03:34:37 <shaohe_feng> DB
03:34:58 <Coco_gao_> vendor ID is already in DB?
03:35:03 <xinranwang> Sundar: what does the driver name you added in the deployable table stand for?
03:35:22 <wangzhh> coco  yep
03:35:27 <Sundar> If you don't add the vendor name to db, the translation from id to name may have to be outside the driver too, which is not good
03:35:32 <xinranwang> Coco_gao_: yes, in device tables
03:36:09 <Yumeng> Coco_gao_:yes.  already have vendor_id  in db device table
03:36:42 <Coco_gao_> About the ext-arq binding, do we need to query the DB for the vendor name?
03:36:44 <Sundar> xinranwang: It is the driver that is used for programming and device updates. At discovery time, we know which deployable has which driver. So, we can use that info to locate the driver needed for programming.
03:37:00 <shaohe_feng> https://github.com/shaohef/cyborg/commit/7e353bf09259a9b4cfba3763f3a99eb1adfaaa2b#diff-7ca60ed56d570ebe3490
03:37:11 <xinranwang> Sundar:
03:37:20 <xinranwang> there is no assignment of this field
03:37:24 <Sundar> Coco_gao_: For ext_arq binding, we are querying the db for the deployable now. We can extend that to query the device and its vendor name too
03:37:38 <xinranwang> maybe we can extract vendor name from this field?
03:37:40 <shaohe_feng> here is what I do, just for a temporary solution
03:38:19 <shaohe_feng> driver_name = db_deployable.driver_name
03:39:44 <Sundar> xinranwang: The intention is that, when the agent discovers a device from a driver, it sends the driver name to the conductor and has it persisted.
03:40:39 <Sundar> shaohe_feng: That is what we are doing today. We could treat the driver name and vendor name as the same, at least for Train. Is that what I am hearing from many of you?
03:41:24 <xinranwang> Sundar:  so what will it look like? intel_fpga_driver?  or 0x8086 like shaohe's demo does.
03:41:40 <shaohe_feng> https://github.com/shaohef/cyborg/blob/profile-ctrl/cyborg/accelerator/drivers/fpga/base.py#L23
03:42:16 <shaohe_feng> for fpga driver, I maintain a map, both vendor ID and name  can work.
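
The map shaohe_feng describes, where either a vendor ID or a vendor name resolves to the same name, might be sketched as below. This is a minimal illustration only, not the code in the linked branch: the names VENDOR_ID_TO_NAME and resolve_vendor_name are hypothetical, and the single entry uses the 0x8086 / 'intel' pair mentioned in this meeting.

```python
# Hypothetical sketch: a per-driver map from PCI vendor ID to vendor name.
# Names here are illustrative and not taken from the Cyborg code base.
VENDOR_ID_TO_NAME = {
    "0x8086": "intel",  # the ID/name pair discussed in this meeting
}


def resolve_vendor_name(id_or_name: str) -> str:
    """Return the vendor name for either a vendor ID or a vendor name."""
    key = id_or_name.strip().lower()
    if key in VENDOR_ID_TO_NAME:
        # Input was a vendor ID such as "0x8086".
        return VENDOR_ID_TO_NAME[key]
    if key in VENDOR_ID_TO_NAME.values():
        # Input was already a known vendor name.
        return key
    raise ValueError(f"unknown vendor: {id_or_name!r}")
```

Keeping the map inside the driver matches the point made above that each driver knows the ID-to-name relation while the agent and conductor do not.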
03:42:40 <shaohe_feng> Sundar: yes,  we need a driver name
03:43:17 <Sundar> xinranwang: just 'intel', for example. That is what we use to load a driver module. The intel_fpga_driver is the name in setup.cfg.
03:44:08 <xinranwang> so we should fill driver_name with 'intel' for example?
03:44:28 <xinranwang> if so, we can extract the vendor name from driver_name.
03:44:54 <Sundar> Yes, xinranwang. To summarize, I think we are agreeing to update the driver_name field in the deployable and use that as 'vendor name' to create the traits as per https://review.opendev.org/gitweb?p=openstack/cyborg-specs.git;f=specs/train/approved/cyborg-nova-placement.rst;hb=refs/changes/45/603545/7#l179
03:45:02 <Sundar> Sounds good?
03:45:21 <Sundar> Then no need to add vendor_name to any object
03:45:44 <xinranwang> yes
03:46:56 <Sundar> Great. If others have objections, please LMK. Let's move on in the interest of time
03:47:04 <Sundar> #topic python cyborg client
03:47:44 <s_shogo> I'm working on that with the P5-P9 patches in advance.
03:48:12 <Sundar> Thanks, s_shogo. I am wondering if we have the time to update the cyborg client to v2 API?
03:48:28 <Coco_gao_> So zhenghao's patch should use Vendor ID and then update deployable vendor name?
03:48:47 <Sundar> Coco_gao_: Yes
03:48:48 <Coco_gao_> update to deployable table with vendor name
03:48:54 <Coco_gao_> Sounds good.
03:49:21 <Sundar> It already has a driver name. At least today, driver name and vendor name are the same.
03:49:36 <Sundar> s_shogo: Maybe we need somebody to help you with the client?
03:49:43 <s_shogo> Sundar: ok, I think so. I'll report on the outlook for the v2 API work at every IRC meeting.
03:50:12 <Sundar> s_shogo: Thanks a lot. Appreciate it. :)  Please do ask if you need help.
03:50:49 <Sundar> BTW, TC has proposed: https://review.opendev.org/#/c/675317/
03:50:50 <s_shogo> Sundar: Ok, Thanks.
03:51:25 <Sundar> They are proposing an intermediate release of the client. IMHO, it is useless because it is currently v1 based, has no Python 3 fixes and so nothing new from Stein release
03:51:47 <Sundar> We may as well club the client update with Train release. What do you all think?
03:53:47 <Sundar> Coco_gao_, wangzhh, xinranwang, shaohe_feng, Li_Liu, all: ^
03:54:39 <Coco_gao_> Agree if we have time. I don't know whether we can make that happen, are you optimistic about that?
03:54:51 <xinranwang> what's club mean....
03:55:13 <Sundar> xinranwang: I mean combine client update with rest of Train release.
03:55:54 <Sundar> Coco_gao_: Yes, i share that concern. Once s_shogo lets us know if it can happen by Sep, we can decide.
03:56:25 <xinranwang> Yes, nice to have.
03:57:02 <Sundar> Yes, it will probably be lower priority than Nova integ, Python 3, RBAC/security etc. Anybody thinks otherwise?
03:57:14 <Coco_gao_> Agree.
03:57:33 <Sundar> #topic Other important Train tasks
03:57:48 <shaohe_feng> should async bind be in this release?
03:58:04 <Sundar> wangzhh: Could you give us an update on RBAC, whenever you have the time?
03:58:21 <Sundar> shaohe_feng: Yes. It is needed for Nova integ
03:58:33 <xinranwang> I notice current code does not create deployable with driver_name, does it? Maybe need one patch on it?
03:58:41 <shaohe_feng> OK.
03:58:52 <Sundar> xinranwang: Yes, it is a gap
03:58:54 <shaohe_feng> Then we will add 2 more states for bind.
03:59:15 <shaohe_feng> ARQ_BIND_STARTED and ARQ_DELETING
03:59:23 <Sundar> BIND_STARTED is already being added in my patches
03:59:32 <Sundar> But I agree
03:59:33 <shaohe_feng> ARQ_INITIAL, ARQ_BIND_STARTED, ARQ_BOUND, ARQ_DELETING, ARQ_UNBOUND, ARQ_BIND_FAILED
03:59:59 <shaohe_feng> there will be 7 states
04:00:11 <shaohe_feng> Use “->” for an allowed transition, “X->” for a disallowed one.
04:00:15 <shaohe_feng> TING X->  ARQ_BOUND
04:00:20 <shaohe_feng> ARQ_BIND_FAILED
04:00:26 <shaohe_feng> any comments on it?
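
The ARQ states listed above lend themselves to an explicit transition table. The sketch below is illustrative only: the state names come from shaohe_feng's list, but the full set of allowed transitions was not captured in this log, so the ALLOWED table is an assumption. The only rule taken from the discussion is the visible fragment suggesting that a deleting ARQ cannot become ARQ_BOUND.

```python
from enum import Enum


class ArqState(Enum):
    # State names as listed in the meeting.
    INITIAL = "ARQ_INITIAL"
    BIND_STARTED = "ARQ_BIND_STARTED"
    BOUND = "ARQ_BOUND"
    DELETING = "ARQ_DELETING"
    UNBOUND = "ARQ_UNBOUND"
    BIND_FAILED = "ARQ_BIND_FAILED"


# ASSUMPTION: this table is a guess for illustration, not the agreed design.
ALLOWED = {
    ArqState.INITIAL: {ArqState.BIND_STARTED, ArqState.DELETING},
    ArqState.BIND_STARTED: {
        ArqState.BOUND, ArqState.BIND_FAILED, ArqState.DELETING,
    },
    ArqState.BOUND: {ArqState.UNBOUND, ArqState.DELETING},
    ArqState.BIND_FAILED: {ArqState.UNBOUND, ArqState.DELETING},
    ArqState.UNBOUND: {ArqState.BIND_STARTED, ArqState.DELETING},
    ArqState.DELETING: set(),  # terminal here: DELETING X-> ARQ_BOUND
}


def can_transition(src: ArqState, dst: ArqState) -> bool:
    """Return True if moving from src to dst is allowed by the table."""
    return dst in ALLOWED[src]
```

A table like this would let an async bind reject an update that arrives after deletion has started, which is the kind of case the extra states are meant to cover.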
04:00:28 <Sundar> Yes. Between bind and unbind, making bind async is more important. We don't do much in unbind today
04:00:34 <xinranwang> Sundar:  I can help on it. I will submit a patch by today to fix that.
04:00:53 <Sundar> xinranwang: Excellent. Thanks :)
04:00:56 <shaohe_feng> ^ Sundar Coco_gao_ xinranwang wangzhh Yumeng
04:01:09 <xinranwang> Sundar: np
04:01:10 <Sundar> shaohe_feng: I agree
04:01:10 <Coco_gao_> We'd better merge the old patch P5-P9 first, and then discuss what are the states?
04:01:36 <Coco_gao_> Maybe a new patch for states update?
04:02:17 <Sundar> I think shaohe_fend will propose a new patch for async un/bind
04:02:25 <Sundar> *shaohe_feng
04:02:31 <Coco_gao_> That's good.
04:02:38 <Sundar> wangzhh: Can you please let us know about RBAC?
04:02:50 <xinranwang> btw, please review this https://review.opendev.org/#/c/677436/ when you got time :)
04:02:51 <shaohe_feng> Yes. After discussion, I should also implement the code. So I'm afraid there is no time left for me
04:02:54 <Coco_gao_> maybe he is offline
04:03:08 <wangzhh> Test in my local env
04:03:13 <shaohe_feng> we should discuss it ASAP
04:03:31 <Coco_gao_> you can arrange a meeting to discuss that.
04:03:41 <wangzhh> probably  submit patch tomorrow
04:03:42 <shaohe_feng> any comments on the state transitions above?
04:04:03 <Sundar> shaohe_feng: I agree with your states. I think they will suffice. Let's start with them and have people code review it?
04:04:05 <shaohe_feng> ^ Sundar Coco_gao_ xinranwang wangzhh Yumeng
04:04:37 <Sundar> I will also prioritize it for next week's IRC call
04:04:41 <Coco_gao_> As good as my expectation, thank you.
04:05:03 <Sundar> Hopefully, P5-P9 patches would have merged by then :)
04:05:21 <Yumeng> shaohe_feng: no objections ^^
04:05:30 <Coco_gao_> Let's set a deadline for P5-P9 reviews?
04:05:30 <Sundar> #topic AoB
04:05:46 <Sundar> Coco_gao_: That would be welcome!
04:05:51 <wangzhh> shaohe no objections
04:06:07 <Coco_gao_> So far so good.
04:06:17 <Yumeng> hey, btw, I would like to introduce my colleague chenke to you; he will join us and contribute to cyborg.
04:06:21 <Sundar> What deadline do we want to set?
04:06:26 <shaohe_feng> so are you satisfied with the state transitions?
04:06:33 <Coco_gao_> end of this week?
04:06:44 <Coco_gao_> including the weekend.
04:06:45 <Sundar> Welcome, chenke :)
04:06:51 <shaohe_feng> welcome cheke
04:06:58 <xinranwang> welcome chenke
04:06:59 <wangzhh> welcome
04:07:06 <s_shogo> welcome
04:07:06 <Coco_gao_> I have already seen your patch, welcome chenke.
04:07:08 <Sundar> Coco_gao_: Sounds good to me. Others good with that?
04:07:09 <chenke> Hello everyone, I am new here, I hope to contribute to cyborg.
04:07:24 <shaohe_feng> sorry s/cheke/ chenke
04:07:30 <chenke> Thanks.
04:07:31 <Sundar> chenke, could you please tell us your background or skill set?
04:07:39 <chenke> Ok.
04:07:42 <chenke> Previously, I mainly contributed to the watcher project.
04:08:26 <Sundar> Ok, sounds good
04:08:57 <chenke> I have been working on watcher for two years. I am now a core for watcher.
04:09:23 <Coco_gao_> Sounds really good
04:09:29 <chenke> I am happy to contribute to cyborg.
04:09:30 <Sundar> In terms of other Train tasks, we continue to work on enabling tempest. The https://review.opendev.org/677436 is needed to enable upstream tempest CI
04:09:30 <Coco_gao_> Glad you can join us.
04:11:00 <Sundar> chenke, BTW, we have a Storyboard: https://storyboard.openstack.org/#!/project/openstack/cyborg . Please feel free to see if you find anything interesting that you can take up. You can reach us on IRC or me by email (sundar.nadathur@intel.com)
04:11:27 <chenke> I will do it. Thanks.
04:11:43 <Sundar> Great. wangzhh: Thanks for your response on RBAC too.
04:11:54 <Sundar> Anything else for today?
04:12:00 <Coco_gao_> No
04:12:07 <wangzhh> no
04:12:31 <Yumeng> no from my side
04:12:38 <chenke> no
04:12:39 <Coco_gao_> Good night Sundar
04:12:43 <Sundar> Thanks for a productive discussion, everybody. Have a good day (or night)!
04:12:48 <Yumeng> bye
04:12:51 <xinranwang> bye
04:12:52 <Sundar> #endmeeting