15:02:37 #startmeeting openstack-cyborg
15:02:38 Meeting started Wed Aug 30 15:02:37 2017 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:42 The meeting name has been set to 'openstack_cyborg'
15:02:57 one comment for zhuli you capture rpc connection errors and log them in the common service code, but you don't change the exit code, this makes automated checks difficult :(
15:03:00 I'
15:03:17 I'm not really sure what to do to fix it, since there seems to be some graceful exit code there I didn't want to break
15:04:22 zhuli around ?
15:04:22 i will let him know if he's not here
15:04:36 #topic patch discussion
15:04:49 yep, I'm here
15:04:52 jkilpatr, I can test it on devstack
15:05:03 crushil that'd be great
15:05:17 zhuli did you get justin's question ?
15:05:18 I will add a devstack plugin too soon. Hopefully this week
15:06:35 thx crushil, looking forward to it
15:06:49 I will confirm the rpc exit code issue later
15:08:00 thx zhuli
15:08:16 jkilpatr, I will test it on devstack too if I have time
15:09:38 so any questions on zhuli's three new patches ?
15:09:55 commit messages!
15:10:00 other than commit msg fixing as usual
15:10:04 lol
15:10:04 haha I was just saying
15:10:24 looks good to me otherwise, I think I commented on at least two, I'll have to go and look at the third.
15:10:49 I think I need to fill in the commit messages before they are merged
15:10:56 yep
15:12:20 zhuli just git commit --amend every time after git commit -a -m :P
15:12:42 so we got one more new patch, from a colleague of mine
15:12:47 #link https://review.openstack.org/#/c/498690/
15:13:12 this is a new spec (targeting Queens) for SPDK driver
15:13:23 for high perf NVMe SSDs as accelerators
15:13:39 zhipeng, thx
15:14:08 zhuli no problem
15:14:20 everyone plz help review the spdk spec when you got time
15:14:32 altho this is not Pike priority
15:15:12 so this involves cyborg reconfiguring other openstack services
15:15:25 not really I think
15:15:34 don't we need to do some config in cinder?
15:15:36 can we do that live?
15:15:51 no we only do that for spdk
15:16:05 ok then I'm less concerned about the scope getting too big then.
15:16:58 the premise of this scenario is that Cinder will use an accelerated backend
15:16:58 so spdk is in the config anyways
15:17:04 yep
15:17:16 just don't want to obligate us to restart other services
15:17:24 definitely
15:18:32 so the conductor is in charge of nova placement api interaction?
15:18:39 we should start looking into that.
15:18:56 I was gonna discuss with you about that
15:19:04 okey let's move on to the next topic
15:19:18 #topic Denver PTG TC presentation prep
15:20:41 so I will do a project update presentation to TC on Sep 12th
15:20:41 I did an internal check with our TC member, and got several really good suggestions
15:20:41 for us to better present the project to the TC
15:20:57 oh cool we're presenting to TC
15:21:04 So one thing would be an end to end workflow demo of how Nova works with Cyborg
15:21:14 could be just bare minimum functionality
15:21:21 but a demo of such would be great
15:21:49 well we need some sort of driver and I'm not even sure if we can whitelist and load a pci device live, there's some docs for it.
15:21:59 but it looks dusty to me.
15:22:18 jkilpatr we could do it without the driver, in my opinion
15:22:58 we could have the generic driver just respond with success every time the request does get through there
15:24:40 the main point is on the Nova Cyborg interaction
15:24:40 jkilpatr crushil zhuli I think what we could do is that
15:24:40 Justin could help add the report.py to the agent
15:24:40 jaypipes mentioned to me that we could just copy that python file from nova compute
15:24:49 with necessary modification
15:25:13 and then it could interact with Placement, which could interact with Nova
15:27:13 zhuli could help with nova api hack, which could talk to cyborg api
15:27:13 well let's just do a best effort, see what we could have before PTG :)
15:27:28 sounds good ?
15:31:32 okey I will take that silence as an OK :P
15:31:37 sorry dual meetings
15:31:54 what does report.py do?
15:32:27 * jkilpatr finishes reading history
15:32:52 jkilpatr no worries
15:32:52 another feedback I got is that we lack unit/function tests
15:32:52 zhuli has provided some on the api/db
15:32:54 jkilpatr and crushil could you guys help on the conductor/agent and driver side as well ?
15:32:55 you mean the cyborg api should respond to something like a detach/attach action from nova ?
15:33:05 zhuli yes
15:33:18 ok
15:33:18 jkilpatr it reports the resource information to the placement api
15:33:19 ok I think that's doable, I need to debug some things though, zhuli when someone sends an api request does it actually go out on rpc to the conductor? We need that to work for an instance request so that I can test this.
15:34:44 okey, back to the unit/function testing issue
15:35:30 jkilpatr, yes the api will send rpc message to conductor
15:35:52 well we can run the deployment playbook and make some requests against it for functional tests. Unit testing wise there isn't much business logic anywhere but the API at the moment so tests will have to come with functionality
15:36:04 zhipeng, do we need the tempest as well ?
15:36:31 jkilpatr / crushil could you guys help on the respective modules, to add the test files ?
15:37:31 sure, I'll do that as I work on stuff
15:37:44 zhuli not at the moment
15:37:44 jkilpatr that makes sense
15:37:46 ok the last issue is about the documentation, I could help with the releasenote, but we also need a brief user guide and dev guide
15:37:50 in the doc folder
15:38:25 I will take a stab at it and you guys could help review them
15:39:01 so those were the three major pieces of feedback I got
15:40:09 sure we can doc stuff. not sure how much of this we can get done before ptg, but we can make an effort. it may also go very fast at ptg itself
15:40:44 of course there were also suggestions on having a client for CLI interface
15:40:44 having a specific hardware to showcase
15:40:57 but I doubt that was something we could do before PTG
15:41:12 jkilpatr exactly
15:42:41 so the tests and docs are first priority for us
15:42:41 e2e workflow is a best effort
15:42:41 after we could merge the tests and docs next week, I will submit the official application patch on governance
15:42:41 to kick start the conversation
15:43:46 #action jkilpatr and crushil helps add unit/function tests
15:44:05 #action zhipeng will help with the docs/releasenotes
15:46:02 #info these should be wrapped up before the end of next week
08:29:24 hi
08:29:54 do we have the discussion related to Cyborg ?
08:45:33 Hi
08:46:01 abhinav what do you mean by discussion ?
08:36:44 hi jkilpatr, I have discussed the transport_url misconfiguration issue with my colleagues, it seems that most OpenStack projects like Nova etc. work in the same way as Cyborg, they just do nothing and end up with a blocking rpc connection, the reason may be that oslo.messaging, which reads the transport_url config, should also be in charge of the exception stuff, so there is not a common way to fix it on the user side.
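Editor's sketch, not part of the log: one possible way to give automated checks a failing exit code for the transport_url problem discussed above, assuming a pre-flight probe before service start is acceptable. It uses only the Python standard library and does not touch oslo.messaging. The names `broker_reachable` and `main`, the default port 5672, and the single-host URL form are assumptions made for illustration.

```python
import socket
import sys
from urllib.parse import urlsplit


def broker_reachable(transport_url, timeout=3.0):
    """Return True if a TCP connection to the broker named in transport_url succeeds.

    Hypothetical helper: only a single-host URL such as
    rabbit://user:pass@host:5672/ is handled; 5672 is assumed if no port is given.
    """
    try:
        parts = urlsplit(transport_url)
        host = parts.hostname
        port = parts.port or 5672  # .port raises ValueError on a malformed port
    except ValueError:
        return False
    if not host:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def main(transport_url):
    """Return a process exit code: 0 if the broker answers, 1 otherwise."""
    if not broker_reachable(transport_url):
        print("cannot reach the message broker, check transport_url", file=sys.stderr)
        return 1
    return 0
```

A wrapper script could call `sys.exit(main(url))` before launching the service, so a CI job sees a non-zero status instead of a process that blocks on the connection.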
08:55:30 I think we could stay the same if it doesn't block your further work
15:00:23 \o
15:11:04 No meeting today?
15:27:24 Just read the email
15:34:07 o/
15:34:14 nah looks like we're just going to meet at ptg then
16:29:55 Cool
16:30:10 Btw our talk got accepted as a lightning talk
00:48:32 crushil that is a great news !
12:56:43 Hi, Can anybody kindly help to merge this? https://review.openstack.org/#/c/500862/ , another oslo.db commit is blocked by this one. Thanks very much.
13:14:22 zhouyaguo, got it. I'll let zhipengh[m] see it before landing though, good catch
13:14:37 zhipengh[m], crushil depending on the outcome of hurricane irma I may get stuck in NC
13:14:39 http://www.npr.org/sections/thetwo-way/2017/09/07/549121378/hurricane-irma-blasts-past-puerto-rico-with-180-mph-winds-risk-rises-for-florida
13:14:47 depends on if my flight is grounded early tuesday morning.
13:15:07 I'll keep yall informed :(
13:19:52 jkilpatr: Thank you very much.
13:22:15 jkilpatr Jesus ~ take care man, always safety first
13:22:45 jkilpatr: Be careful and take care yourself
13:23:16 Hi yaguo, first time see you here :)
13:23:34 Which company are u from, if I may ask ?
13:23:56 China UnionPay, :) , glad to see huawei man.
13:24:08 Hey
13:24:15 Nice to meet you :)
13:24:39 Is your patch specific to Cyborg or is it a community wide fix ?
13:24:42 yeah. glad to meet you too. :)
13:25:04 a wide fix. i am working on oslo.db debt cleanup
13:25:38 i have to make sure no deprecated class is used in cyborg.
13:25:57 Ah got it
13:25:59 cyborg seems like quite young.
13:26:01 THX !
13:26:16 Very young :)
13:26:48 maybe i can get involved more in the future. :)
13:27:09 I just gave a speech on huawei connect meeting yesterday. :)
13:27:58 Really ? I was at HC as well lol
13:28:23 Look forward to have you work on cyborg if you have the volume :)
13:30:57 My speech is on the 1st day of meeting. it's about huawei SDN fabric applied in UnionPay.
:)
13:31:55 awesome man, I was mending the OpenStack and OpenSDS booth lol, no time for session
13:32:39 Sure, i'll dive into it and maybe cyborg is interesting. :D
13:33:37 jkilpatr, I fly in Sunday. So, I guess I might be fine
13:33:50 crushil, where are you flying from?
13:34:12 RDU
13:36:55 crushil, you'll almost certainly be fine, on the other hand you better clean up your fridge etc in case of prolonged power outages while you're gone.
13:37:24 zhipengh[m]: yeah, cool, real engineer.
13:37:43 jkilpatr, Gotcha. My wife and dogs are still going to be here which is kinda scary
13:38:09 crushil, my parents are riding it out in south FL :( the worst we're going to get here is flooding (probably)
13:39:22 jkilpatr, Dang. Sorry about that. What part are they in South Florida?
13:39:30 West Palm
16:17:14 zhipeng, what exactly do we want out of updated Cyborg documentation?
16:17:24 * jkilpatr is trying to figure out what to write
16:18:31 something like https://github.com/openstack/blazar/tree/master/doc/source
16:19:00 ah
16:19:01 I can do that.
16:28:22 :)
17:45:47 https://docs.google.com/presentation/d/1RyDDVMBsQndN-Qo_JInnHaj_oCY6zwT_3ky_gk1nJMo/edit?usp=sharing
17:46:01 btw this is the presentation I prepared for tmr TC session
17:46:10 not sure if slide is needed, but just in case
20:29:52 btw tmr cyborg presentation on TC meeting
20:29:54 https://etherpad.openstack.org/p/queens-PTG-TC-SWG
20:30:05 line 63 is cyborg
20:55:14 ttk2[m] is it okey for you that we start our team meeting tmr afternoon ?
21:00:40 I'll be there.
21:02:22 okey then let's start early tmr afternoon
21:03:15 oh tmr 11:40 am we will have team photo shoot
21:03:44 everyone could make it ?
21:03:53 I can
21:04:16 crushil you could grab scott again lol
21:05:12 Scott is not here
21:05:22 It's just me this time
21:05:32 Justin should be here. :)
21:07:07 okey :P
16:11:16 ttk2[m]
16:11:21 1:00pm ok ?
17:16:04 Team Photo 11:40 at lobby front door
18:12:24 @zhipengh:matrix.org: sorry flight got delayed by three hours.
18:12:42 Just getting in now.
18:13:44 No problem
18:13:55 We resched the photo to 2:30pm :)
18:14:10 And let's meet at the lobby at 1:00pm
18:14:18 Grab some lunch now !
18:19:24 That's the plan trying to meet up with my coworkers.
18:23:12 Maybe two? Sorry for all the trouble. This airport is far from the hotel.
18:23:32 No problem
18:23:51 Me and Rushil will just be in the lobby
18:24:09 (nodding off)
18:54:24 Reminder : afternoon meeting starts 2:00pm at front lobby
19:42:32 zhipengh: https://etherpad.openstack.org/p/cyborg-dev-docs
19:56:35 jkilpatr zhipengh[m] Will be there asap
20:09:26 https://wiki.openstack.org/wiki/Nova/pci_hotplug
20:10:20 http://www.vscaler.com/gpu-passthrough
20:25:31 https://etherpad.openstack.org/p/cybrog-api-workflow
20:48:05 https://github.com/openstack/blazar/blob/master/doc/source/architecture.rst
20:48:14 for doc reference
21:34:18 http://status.openstack.org/zuul/
22:37:34 jkilpatr http://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/
02:24:35 zhuli There?
02:24:53 yep
02:27:45 I mean online, not at the PTG spot
02:32:47 Yup, that
02:32:55 *that's what I meant
02:34:32 How are you testing functionally with https://review.openstack.org/#/c/501656/ ? Does this commit only create the functional testing framework?
02:34:39 zhuli ^^
02:39:46 It does the real API functional testing, I think I'd better detail the commit message :)
02:53:02 What about the unit tests added here https://review.openstack.org/#/c/498410/ ?
02:53:51 The unit tests don't seem to be testing units.
03:04:39 zhuli ^^
03:16:50 crushil, I just updated the functional test commit msg, sorry for late reply
14:07:07 so if I'm reading the project-config repo right the only testing they run is devstack based (well integration testing obviously they run linters and unit tests for everyone)
14:07:45 if we want tripleo testing I guess we can do it on RDO cloud, I'll ask how to do that.
16:06:42 jkilpatr that does not affect what you mentioned as a patch for the OpenStack CI system right ? For the integration testing ?
16:07:20 zhipeng, well I'm trying to figure out what goes where, it looks like we will be able to get resource it's just a process that's going to span multiple repos, I'm talking to some of the people responsible for that now.
16:07:37 okey !
16:07:50 btw plz help review zhuli's patches
16:08:02 i will have another doc patch up shortly
16:08:12 will do
16:08:53 jkilpatr Can you look at https://review.openstack.org/#/c/501825/ ?
16:25:19 crushil, done
16:40:59 zhipeng Are you still in the same room?
16:41:25 zhipeng Is jkilpatr there too?
16:41:25 i'm at Nova room
16:41:31 ok
16:41:44 Guess I'll stay in the Ironic room
16:47:15 I'm in tripleo
16:47:51 new doc patch coming up https://review.openstack.org/503746 !
16:49:20 It's in merge conflict
16:49:45 i know ..
16:49:55 working on it
16:52:31 should be ok now
16:57:13 Left a review comment
16:58:34 Check it now
16:59:25 zhipengh[m] I don't see a response. :)
17:01:59 Check it again :)
17:08:20 zhipeng Done. Please check https://review.openstack.org/#/c/501825/ again
17:09:25 thx man !
17:19:57 we are very productive !
17:20:14 crushil have you taken a look again at https://review.openstack.org/#/c/498410/ ?
17:22:55 zhipeng I will fix the Unit testing and functional testing patches
17:23:08 Expect something by end of day today
17:24:24 Cool !
18:01:32 zhipengh[m] Nova 1-3 is placement groups discussion.
Hope you're attending that
18:06:21 Yes I will
18:06:37 Thanks for the heads up man
21:59:52 just created a local patch for official application, snippets: https://gist.github.com/hannibalhuang/d9f9d79c1505453097baf305e4eb0dc9
22:00:06 as soon as we are ready, I could push the button and go :)
22:00:26 Define ready
22:00:43 git commit ready
22:01:39 I mean after we have the unit tests and functional tests merged, are we going to push for being an official project?
22:01:52 yep
22:01:59 that would be the sign of ready
22:02:13 well I will cut a stable branch first
22:02:28 then I could push up the application patch to openstack/governance
22:02:32 ask for a formal vote
22:06:25 of course I also need to write a long ass commit-msg
22:07:26 zhipeng Alright
22:07:34 Will try to wrap the tests by tomorrow
22:07:41 no hurry
22:07:56 I think we already beat the expectation of what we could achieve at PTG :)
22:07:59 I want to so that we can file for our official application
22:08:05 yep
22:08:16 target late tmr or early Fri
22:09:17 That is what a few hours together in the same room can get you
22:09:52 :P
22:18:41 but please do check the snippet see if typos there
22:18:56 I will put up the commit-msg snippet later
22:19:10 so that we could review it and modify it before my submission
23:34:08 crushil, when you get a minute could you explain how to use your devstack plugin? trying to push the devstack check and gate job for cyborg
23:36:14 jkilpatr You basically add enable_plugin cyborg https://github.com/openstack/cyborg in your local.conf
23:39:35 crushil, should that be found in the cwd or somewhere else?
23:39:39 * jkilpatr doesn't use devstack at all
23:39:58 What do you mean by cwd?
23:41:20 crushil, pwd, the same directory you're running devstack from etc
23:41:30 Ah
23:41:43 So, you do a git clone devstack
23:41:52 With the appropriate url
23:42:01 And then you go to devstack/local.conf
23:42:22 If there is no sample local.conf
23:42:40 You need to create it by copying it from sample_local.conf
23:42:56 You should find ample examples if you google it
23:45:11 jkilpatr I am headed to dinner now
03:08:12 I discussed with dims over dinner , he suggested that we submit the application patch on Monday morning, that will guarantee it will be everyone's top list
03:08:27 Thurs night or Friday, people will forget
03:32:34 ttk2[m] There?
15:13:11 Hi, are any Cyborg team members present here at the PTG? I'm sitting outside the "Vail" room and it's empty.
17:01:38 Hi jangutter , sorry we wrapped up most of the discussion on Tuesday , I forgot to send an email out
17:02:31 Is there any topic you want to discuss ? We are still around PTG
17:07:10 ttk2: could u help land https://review.openstack.org/#/c/498410/ ?
17:07:41 sure I'll look it over
17:11:06 zhipengh[m] We met with jangutter
17:13:10 crushil oh great ! What did you guys discuss about ?
17:15:03 zhipengh[m] Bunch of things ranging from what jangutter's company is doing with accelerators, what Cyborg's objective is and how we can work together
17:17:34 Cool, which company does he work for ?
18:07:38 crushil ttk2 kind reminder that we will have the team interview from 1:00-1:30pm
18:07:59 zhipengh[m] We know. We were trying to find you
18:08:32 I'm just sitting in front of ballroom a like an old man...
19:22:13 zhipengh[m] jkilpatr What do you want cyborg.conf.sample to look like?
19:22:43 from what perspective ?
19:23:43 zhipengh[m] For instance ironic.conf.sample looks like this https://github.com/openstack/ironic/blob/master/etc/ironic/ironic.conf.sample
19:26:03 i think we could have a much shorter one ...
19:26:22 so as we discussed we will focus on pci-e first
19:26:25 zhipeng I agree.
But what all do you want me to put in there
19:26:34 that'd be our hardware type to start with
19:27:05 I would like to put in a cyborg conf sample as part of the devstack process
19:27:23 protocol wise i can't think of anything particularly needed for accelerators
19:27:34 like for ironic that'd be ipmi or redfish
19:27:40 Yup
19:28:09 at the moment there is no standard mgmt protocol for FPGAs or GPUs I guess
19:29:17 we could have some stuff under the api section
19:29:34 i think zhuli mandated some number for port and host ip etc
19:30:18 ok
19:30:27 god ironic's conf is long~~~
19:30:32 Haha
19:30:46 We need it for devstack plugin
19:30:49 we could also have some stuff under conductor and deploy
19:30:52 ya i know
19:31:05 I think we could just set up very basic parameters to begin with
19:31:13 and we add more stuff later
19:31:51 I used an empty one. It starts the services, but then doesn't do anything
19:32:28 okey then we could just add what we have for hardware type, api, conductor and deploy, that should be enough
19:34:16 justin finally tamed the jenkins dragon
19:34:57 jkilpatr ttk2[m] add infra cores to the patch, it'll move faster
19:37:07 zhipeng: gimme a second to look over the results and move it out of wip
19:40:52 So, I am going to wait till we have an official sample .conf
19:41:27 to what, merge your devstack changes?
19:41:35 who are the infra cores anyways?
19:42:02 Yup
19:42:30 Or I can push it with an empty sample conf which we can fill up later
20:00:11 Andreas ?
20:00:26 crushil sounds good either way
20:00:40 Ok
21:15:12 https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management
21:15:25 cyborg mentioned at line 172
21:15:38 and also just mentioned at the meeting
21:17:57 realized that for Queens we also need to extend placement api in order to receive our request ...
21:31:53 oh just confirmed no need for placement api extension
21:36:37 also see https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management line 106
21:43:14 zhipeng jkilpatr This devstack plugin is taking a while on CentOS
21:43:23 no problem
21:43:43 that is not blocking factor right ?
21:43:54 You can go to TC today
21:43:57 And point to the patches
21:44:27 zhipeng, from the context it looks like "this affinity stuff won't make queens"
21:44:27 I will submit the patch on Monday morning like dims suggested
21:44:37 lol
21:44:53 What's the affinity stuff?
21:44:57 sorry @108
21:45:15 wrong line number since people wrote more stuff above it ...
21:48:36 zhipeng What document are you looking at?
21:49:09 https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management
21:49:25 line 108
21:50:10 zhipeng Ah
00:57:25 ttk2: don't forget to help land zhuli unit test patch
00:57:46 ah started reading it and never finished, I'll get to it
01:17:13 ttk2: u also need workflow +1 lol
01:19:35 I just ran it real quick and poked around before merging
01:20:10 Ah I see
01:20:40 Just in case u forget, coz it happens to me a lot of the times lol
17:14:22 hi guys, I've put the commit-msg of my local patch on official application at https://gist.github.com/hannibalhuang/b4a2522e8918649953793ca7196136e7
17:14:44 also the patch itself at https://gist.github.com/hannibalhuang/d9f9d79c1505453097baf305e4eb0dc9
17:15:00 plz take a look at it when you got time, so that we could fine tune the writing for TC review