15:13:59 <zhipeng> #startmeeting openstack-cyborg
15:13:59 <openstack> Meeting started Wed Jun 28 15:13:59 2017 UTC and is due to finish in 60 minutes.  The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:14:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:14:03 <openstack> The meeting name has been set to 'openstack_cyborg'
15:14:33 <zhipeng> #topic Roll Call
15:14:54 <NokMikeR> #info Michael Rooke, Nokia
15:15:03 <zhipeng> #info Howard Huang
15:16:16 <NokMikeR> Must be summer :)
15:16:37 <ttk2[m]> #info Justin Kilpatrick, RedHat
15:16:43 <ttk2[m]> Sorry on a plane today.
15:16:57 <zhipeng> wow, still wifi ?
15:17:17 <ttk2[m]> Just landed.
15:17:44 <zhipeng> you work too hard :)
15:18:03 <zhipeng> #topic development progress
15:18:06 <ttk2[m]> Anyway, I'd like to get that accelerator object stub merged so that I can start setting up the db components of the conductor. Messaging and RPC are working.
15:18:27 <ttk2[m]> The agent needs the dummy driver merged at least so that I can play around with integrating those two.
15:19:05 <ttk2[m]> Oh, also: I figured out the config module; it loads transport URLs and works in a real cloud env.
15:19:16 <zhipeng> crushil how's the driver patch going ?
15:19:32 <zhipeng> ttk2[m] config module ?
15:20:02 <ttk2[m]> Oslo config.
15:20:22 <zhipeng> okay
15:21:22 <ttk2[m]> Essentially I have all the basics worked out except the db, so all the various components can talk to each other; after that it's just a matter of writing business logic.
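[For reference on the config/RPC discussion above, a minimal sketch of how oslo.config and oslo.messaging could be wired together the way ttk2[m] describes: the transport URL is read from the loaded config file and an RPC client targets the conductor. The topic name 'cyborg-conductor' and the helper name are illustrative assumptions, not taken from the patches under review.]

    # Sketch only: transport_url comes from the parsed config file, and the
    # RPC client points at a conductor topic. Topic/version values here are
    # assumptions for illustration, not the actual Cyborg settings.
    from oslo_config import cfg
    import oslo_messaging as messaging

    CONF = cfg.CONF

    def get_conductor_client(argv=()):
        # Parse cyborg.conf (or whatever is passed via --config-file) so that
        # CONF.transport_url points at the message bus, e.g.
        # transport_url = rabbit://user:password@controller:5672/
        CONF(list(argv), project='cyborg')
        transport = messaging.get_transport(CONF)
        target = messaging.Target(topic='cyborg-conductor', version='1.0')
        return messaging.RPCClient(transport, target)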
15:28:42 <zhipeng> ttk2[m] so looking at the conductor patch
15:28:47 <zhipeng> #link https://review.openstack.org/#/c/472662/
15:29:10 <zhipeng> I think, except for a need to modify the patch title, it looks good to me to merge
15:30:01 <zhipeng> also the api spec could be merged at this point I guess
15:30:06 <zhipeng> #link https://review.openstack.org/445814
15:34:25 <ttk2[m]> API spec is good. And I'll get the db stuff into the conductor and then mark it as ready.
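[For context on the accelerator object stub that feeds the conductor's db layer, a rough sketch in the oslo.versionedobjects style used by other OpenStack projects; the field names are guesses for illustration, not the schema in the patch under review.]

    # Illustrative only: an Accelerator versioned object stub. Field names
    # are assumptions, not the merged Cyborg schema.
    from oslo_versionedobjects import base as object_base
    from oslo_versionedobjects import fields as object_fields

    @object_base.VersionedObjectRegistry.register
    class Accelerator(object_base.VersionedObject):
        VERSION = '1.0'

        fields = {
            'uuid': object_fields.UUIDField(),
            'name': object_fields.StringField(nullable=True),
            'device_type': object_fields.StringField(),   # e.g. GPU, FPGA
            'vendor': object_fields.StringField(nullable=True),
        }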
15:34:38 <zhipeng> cool
15:35:18 <NokMikeR> does the api cover listing an application running on the accelerator?
15:36:05 <zhipeng> we list accelerators, but i doubt we will list applications
15:36:12 <zhipeng> could you give a use case on that ?
15:36:50 <NokMikeR> A GPU has a running application; you list the GPU to see if it's 100% free, discover it's not, and hence skip to the next one.
15:37:12 <NokMikeR> That's more fine-grained than simply discovering/listing the accelerator itself.
15:37:23 <ttk2[m]> isn't that the same as listing usage?
15:37:43 <ttk2[m]> I'm planning on usage metric collection, but I'm not sure if/how we expose that through the API rather than just to the operator/internally
15:38:39 <zhipeng> i think we could definitely have that metric for the scheduler to use, but it might be just used internally
15:38:57 <zhipeng> and it would be difficult to model it in the resource provider, because it is dynamic
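[To make the "internal-only usage metric" idea concrete, one possible shape for a per-device report a driver could hand to the agent/scheduler without exposing it through the API; every key here is a hypothetical example, not anything from the spec.]

    # Hypothetical internal usage report; none of these keys come from the
    # Cyborg API spec, they only illustrate the kind of dynamic data that is
    # hard to model in a resource provider.
    sample_usage_report = {
        'accelerator_uuid': 'example-accelerator-uuid',
        'utilization_pct': 73.5,                      # instantaneous device load
        'memory_used_mb': 2048,
        'attached_instances': ['instance-uuid-a', 'instance-uuid-b'],
        'collected_at': '2017-06-28T15:38:00Z',
    }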
15:39:28 <NokMikeR> so operations like deleting an accelerator only occur when there are no running applications (i.e. nothing left) on the accelerator?
15:40:57 <zhipeng> i think it would be up to the users to make that judgement (check that nothing is left running)
15:41:07 <ttk2[m]> that's easy enough to do as we have to control scheduling based on usage anyways
15:41:15 <zhipeng> cyborg will just receive a request to detach the accelerator for whatever purpose
15:41:22 <NokMikeR> ok
15:41:33 <ttk2[m]> On the other hand, it would be a pretty bad user experience if an idling VM kept the accelerator from being changed.
15:42:07 <NokMikeR> or updated with something else, since there's a leftover app using resources that should be allocated to the new app, etc.
15:42:37 <NokMikeR> maybe these are issues outside of cyborg, not sure.
15:51:42 <ttk2[m]> so managing instances attached to the accelerators is only partly Cyborg's job
15:52:54 <ttk2[m]> I mean we'll list instances that are attached to or running on various accelerators, but deciding if the work being done is useful is out of scope. And in the case of shared accelerators we're going to need to rely on them being able to report who is doing what, since there's probably a scheduler running below us for that.
15:55:02 <zhipeng> a sched running below us ?
15:58:39 <ttk2[m]> zhipeng: like if we had GPU virtualization and three VMs sharing a GPU
15:59:38 <ttk2[m]> there's a scheduler at the driver level taking workloads from the three virtual GPUs and muxing them onto the real hardware. We may be able to see how much the actual hardware is utilized, but unless the driver tells us which VM is using what percentage of the total resources, we're in the dark beyond "these VMs are hooked up to this physical GPU"
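[A sketch of the visibility gap described here, under the assumption that Cyborg always knows which instances are attached to a physical device but only gets per-instance usage if the vendor driver reports it; the class and method names are hypothetical, not part of any Cyborg driver interface.]

    # Hypothetical driver interface for a shared (virtualized) accelerator.
    import abc

    class SharedAcceleratorDriver(abc.ABC):

        @abc.abstractmethod
        def list_attached_instances(self, device_id):
            """Return the instance UUIDs hooked up to this physical device."""

        def get_per_instance_usage(self, device_id):
            """Map instance UUID -> share of the device it consumes.

            Defaults to "unknown": only drivers whose underlying hardware
            scheduler exposes per-VM usage would override this.
            """
            return None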
16:06:08 <zhipeng> agree
16:06:31 <zhipeng> so much of this kind of scheduling will happen under the hood, below Cyborg
16:06:46 <zhipeng> unless the driver is able to report it to Cyborg
16:14:16 <zhipeng> #endmeeting