14:01:10 <shaohe_feng> #startmeeting openstack-cyborg-driver
14:01:11 <openstack> Meeting started Mon Aug 13 14:01:10 2018 UTC and is due to finish in 60 minutes.  The chair is shaohe_feng. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:13 <shaohe_feng> #topic Roll Call
14:01:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:15 <openstack> The meeting name has been set to 'openstack_cyborg_driver'
14:01:40 <shaohe_feng> #info shaohe_feng
14:02:16 <edleafe> #info ed leafe
14:05:13 <Sundar> #info Sundar
14:05:54 <shaohe_feng> Sundar, edleafe morning
14:06:10 <edleafe> good UGT morning to you!
14:06:36 <shaohe_feng> :)
14:06:43 <Sundar> Good day, Shaohe
14:07:50 <shaohe_feng> edleafe, Sundar now only we three on line.
14:08:28 <Li_Liu> #info Li_Liu
14:08:40 <Li_Liu> do we still have the meeting today?
14:08:41 <Sundar> shaohe: I had sent an email yesterday to openstack-dev. Shall we talk about that?
14:09:01 <shaohe_feng> Sundar: OK, go ahead.
14:09:16 <shaohe_feng> Li_Liu: morning.
14:09:46 <shaohe_feng> good evening,  xinran
14:09:48 <Li_Liu> shaohe_feng: good evening :P
14:09:58 <xinran> Hi all
14:10:33 <shaohe_feng> Sundar: You can talk about your email
14:10:40 <Sundar> It was apparently decided in some meeting that, to record the discovered devices, Cyborg agent will call Cyborg REST API. Also, to allocate and deallocate accelerators. If so, that will make the API public and has many disadvantages.
14:11:45 <Sundar> Among other things, it means it is not internal to Cyborg any more. Any user can call it. So, we should authenticate. Even if we open it only to operators, it is still error-prone. We can just keep it Cyborg-internal, right?
14:12:43 <xinran> When Cyborg agent will call restful API?
14:13:38 <shaohe_feng> Cyborg agent should not  call Cyborg REST API.
14:13:53 <Li_Liu> Agent should stay with rpc
14:13:59 <Sundar> xinran: Rest API is meant for things that can be accessed by external users, operators or other services.
14:14:44 <xinran> Sundar:  can you give us an example when agent call restful api
14:15:31 <Sundar> shaohe, Li_Liu: Agreed. I am referring to #link https://etherpad.openstack.org/p/cyborg-rocky-development (Line 44)
14:15:38 <wangzhh> Hi all. Test connection.
14:16:03 <xinran> IMHO agent should not call restful api
14:16:32 <Li_Liu> Sundar, that is exposed by cyborg-api, not meant to be used by Agent
14:17:46 <Sundar> Li_Liu: Was that agreed only for GET, not for PUT/POST?
14:18:19 <shaohe_feng> wangzhh: good evening.
14:19:10 <wangzhh> shaohe_feng: good evening. :)
14:19:15 <Li_Liu> Sundar, PUT/POST are open to admin I think, in case they want to tune some of the deployables
14:20:18 <Sundar> Li_Liu: the issue there is that we will have to maintain backwards compat for REST APIs. That is going to constrain our development and/or cause upgrade issues.
14:21:13 <Li_Liu> wangzhh, do you have comments on this?
14:22:00 <wangzhh> Sundar,  what do you mean about "we will have to maintain backwards compat for REST APIs. That is going to constrain our development and/or cause upgrade issues."
14:23:11 <wangzhh> Can u give me an example which current APIs can not handle?
14:23:32 <Sundar> wangzhh: In #link https://etherpad.openstack.org/p/cyborg-rocky-development (Line 44), if we allow PUT/POST in addition to get, we may have such considerations
14:25:15 <wangzhh> We have PATCH now. If we want to update deployable. we can use it.
14:25:46 <shaohe_feng> sure we should allow PUT/POST,  that depends on
14:26:09 <Sundar> wangzhh: Can you elaborate?
14:26:35 <wangzhh> shaohe_feng: Do we need create deployable via API?
14:27:52 <shaohe_feng> That depends on.
14:28:07 <Li_Liu> wangzhh, maybe we should allow that for now
14:28:23 <shaohe_feng> at present, agent creates deployable.
14:28:30 <Li_Liu> to let admin able to add stuff in manually
14:29:02 <Sundar> Li_Liu, shaohe: can you explain why that is necessary? And how we will ensure backwards compat and avoid upgrade issues?
14:29:04 <wangzhh> Sundar: https://github.com/openstack/cyborg/blob/719f3dee01b6f0b0f4f2ce9b45ffd4978eeac287/cyborg/api/controllers/v1/deployables.py#L217
14:30:03 <shaohe_feng> Sundar: we should allow users to update some attribution of a deployable
14:30:30 <wangzhh> I agree that users need to update something.
14:31:20 <Sundar> By users, I hope you mean operators. End users (tenants) should not be touching this
14:31:22 <wangzhh> But can user create a new deployable? In which case? Could u give some example?
14:31:55 <Li_Liu> The users here should mean admin/operators
14:32:08 <shaohe_feng> ^ IMHO,  Li_Liu, the intention of your attribution design should allow user to update them, right?
14:32:20 <Li_Liu> yes, indeed
14:32:30 <wangzhh> Agree.
14:32:50 <Sundar> If operators need a manual way, can't we provide a script of some kind? Before opening up REST APIs, we need to understand their implications.
14:33:06 <Li_Liu> I agree, right now Create api for deployables might not be very useful
14:34:15 <Li_Liu> Sundar, I assume the script you mean is cyborg-pythonclient?
14:34:16 <shaohe_feng> Sundar: yes, at present it means admin/operations, Unless there are sufficient reasons that end users should update them
14:35:19 <wangzhh> Sundar, I think update is necessary. For example, I need change my deployable's name. Some attr which we can't collect from driver.
14:35:44 <Sundar> Li_Liu: isn't pythonclient a wrapper around the REST API?
14:36:19 <Li_Liu> Yes
14:36:23 <Sundar> wangzhh: Sure, we can provide some script that not guarantees backwards compat till it all matures
14:36:40 <Sundar> *does not guarantee
14:37:32 <Sundar> Perhaps @edleafe can comment but, my understand is that REST APIs need to be honored in future releases
14:37:42 <Sundar> *understanding
14:38:26 <Li_Liu> So your suggestion is to take out the deployable creation rest api?
14:38:46 <edleafe> Sundar: it has nothing to do with REST - any published API should not be removed or modified
14:39:43 <Sundar> edleafe: What APIs do we publish to users apart from REST? RPC APIs are internal and can evolve across releases, right?
14:39:46 <edleafe> A public API is a contract with your users
14:40:26 <edleafe> I'm speaking more generally
14:40:57 <edleafe> Whether your API is REST, RPC, SOAP, or anything else: public APIs need to be maintained
14:42:16 <Sundar> If, say, Cyborg offers a RPC API to Nova or os-acc, we can evolve that in future releases based on mutual understanding, right? IOW, that doesn't count as 'public' - exposed to users
14:43:04 <edleafe> I'm not following. If another service is relying on an API, then it's public
14:43:25 <wangzhh> Sundar, if we use script. It's hard to integrate with other project or software.
14:43:50 <wangzhh> For example, If we want to update attr by horizon?
14:44:47 <wangzhh> Command line and script is not friendly. :) IMHO.
14:44:58 <xinran> If other service interact with cyborg there may be the rest ful apis
14:45:13 <wangzhh> Agree with xinran.
14:45:53 <Sundar> wangzhh: Before we make something public and incur all the costs for all future releases, let us keep it minimal. If it is not absolutely essential to do something with public APIs, we should look at alternatives first
14:46:40 <Sundar> edleafe: Has nova/os-vif/neutron RPC APIs never changed? I thought those have evolved too
14:47:53 <edleafe> Sundar: I don't really know - I haven't worked with Neutron or os-vif
14:47:58 <Li_Liu> Sundar, I am thinking about it in the other direction tho. I think there is no harm to open couple rest apis and maintain them. As we couldn't confine what the users can/want to do in their scenarios.
14:48:48 <Li_Liu> By offering something more, can open up more opportunities for them.
14:48:54 <edleafe> Li_Liu: you should always think very carefully before publishing an API
14:49:08 <Sundar> Li_Liu: That means keeping your deployables structure backwards compatible? Before we know all the use cases?
14:49:16 <edleafe> Li_Liu: https://blog.leafe.com/api-longevity/
14:50:19 <shaohe_feng> Yes, API should be carefully, as edleafe said it is a contract :)
14:51:02 <Li_Liu> For sure, I am not saying we should do it without a thought :)
14:51:38 <wangzhh> Nova have API version  to handle these issue.
14:51:57 <Sundar> shaohe: Not just the API. It will constrain how you can evolve the data structure that we call Deployable. How do we add new fields, drop old fields, or change the meaning/type/format of any field?
14:54:44 <Coco_gao> Hi
14:55:06 <shaohe_feng> Sundar: yes, we need a API version for them.
14:55:09 <edleafe> Sundar: you're describing alpha development, where things can be added/removed/changed. Is that how you would describe Cyborg's current state?
14:55:10 <wangzhh> CoCo, Good evening. :)
14:55:16 <shaohe_feng> Coco_gao: evening.
14:55:47 <Sundar> wangzhh, shaohe, IIUC, having API versions does not mean that older versions can be dropped. They are still public
14:56:17 <Coco_gao> #evening or morning, everyone.
14:57:40 <Sundar> edleafe: Not quite, I am just advocating minimal public APIs till Cyborg has got some adoption and use cases are known and well-supported
14:57:40 <edleafe> Sundar: exactly. The point of API versioning is that the default never changes, but users can opt in to a newer version
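The opt-in versioning edleafe describes can be sketched as follows. This is only an illustration under stated assumptions: the version numbers, the function names, and the idea that an attributes key appears in 1.1 are all hypothetical and are not taken from Cyborg's actual API.

```python
# Hypothetical sketch: names and version numbers are illustrative,
# not Cyborg's real API.
DEFAULT_VERSION = (1, 0)   # served when the client sends no version header
LATEST_VERSION = (1, 2)    # newest behaviour; clients must opt in explicitly


def parse_version(header_value):
    """Turn a header value such as '1.2' into a comparable tuple (1, 2)."""
    major, minor = header_value.strip().split(".")
    return int(major), int(minor)


def negotiate(header_value=None):
    """Return the version to serve, or raise ValueError if unsupported."""
    if header_value is None:
        # The default never moves, so existing clients keep working.
        return DEFAULT_VERSION
    if header_value.strip() == "latest":
        return LATEST_VERSION
    requested = parse_version(header_value)
    if not DEFAULT_VERSION <= requested <= LATEST_VERSION:
        raise ValueError("unsupported API version: %s" % header_value)
    return requested


def show_deployable(deployable, header_value=None):
    """Render a deployable; 'attributes' is only exposed from 1.1 on."""
    version = negotiate(header_value)
    body = {"uuid": deployable["uuid"], "name": deployable["name"]}
    if version >= (1, 1):
        body["attributes"] = deployable.get("attributes", {})
    return body
```

A client that sends no version keeps getting the 1.0 shape, while a client that asks for 1.1 or later sees the extra key. Every older version still has to be served, which is the maintenance cost Sundar raises.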
14:57:48 <shaohe_feng> Sundar: yes. They are still public.  For the end users's app still use older versions, depends on older  instead of latest.
14:57:51 <wangzhh> Yep. LTS. But it can be dropped after a deprecation declaration.
14:58:06 <Li_Liu> As edleafe said, Cyborg is still at its very early dev stage
14:58:47 <Sundar> When you introduce API v2 with many changes, you still have to maintain v1 at least for some time. That means handling upgrades, conversions without breaking old functionality.
14:59:15 <edleafe> I would think that given Cyborg's early development status, that any APIs you create are labeled as alpha, with a note that they may be changed in the future
14:59:35 <edleafe> Sundar: some would contend that you have to maintain v1 forever.
14:59:41 <edleafe> (not me, though!)
15:00:13 <Sundar> edleafe: If we can mark REST APIs as alpha, that will be great
15:00:38 <Li_Liu> I think that's what we should be doing for now
15:00:48 <wangzhh> Sundar, all that mean that  we should do with thought. Instead of not to do.
15:01:31 <edleafe> Sundar: http://specs.openstack.org/openstack/api-wg/guidelines/api_interoperability.html#new-or-experimental-services-and-versioning
15:02:46 <Sundar> Thanks, edleafe
15:02:49 <Sundar> Folks, shall we agree to mark Deployables API as alpha (subject to change without backwards compat) for now?
15:03:26 <Li_Liu> I vote yes
15:04:27 <wangzhh> Agree with Sundar. :) :) :) :)
15:04:46 <shaohe_feng> agree
15:05:28 <xinran> agree
15:05:31 <Coco_gao> I agree, I think maybe the alpha is the version when cyborg finish interactions with nova.  Before that, cyborg is not workable by users.
15:06:06 <Sundar> Great! Thanks, guys. :)
15:06:17 <shaohe_feng> Sundar: you should take more time/thought on how to define the API. Post your ideas and let us discuss together.
15:07:24 <shaohe_feng> Sundar: have you consider the API that nova how to apply an accelerators from cyborg?
15:08:45 <Sundar> shaohe: Yes, that should be part of the os-acc spec. We are still in the stage of getting agreement on the overall workflows and interactions. But I agree we should document that in detail
15:09:12 <shaohe_feng> Sundar: and also the parameters that API supports.
15:10:09 <Sundar> Sure. Please see https://review.openstack.org/#/c/577438/6/doc/source/specs/rocky/approved/compute-node.rst Line 525 for example
15:11:31 <shaohe_feng> Sundar: OK, for example, as a user I want to apply a accelerator,  accelerator type is FPGA instead of GPU, vendor is intel, model is A10, and it's function is Crypto.
15:12:01 <shaohe_feng> Sundar: maybe more parameters will be extended.
15:13:16 <Li_Liu> shaohe_feng, use the attribute in deployables, if the fields are not supported
15:13:24 <shaohe_feng> Sundar: these parameters, nova will extract for your defined traits/RC in the flavor
15:13:34 <Sundar> Operators will define flavors based on the attributes you mentioned, and users will pick one of the flavors. The accelerator requirement is conveyed through extra specs, and that is part of these function signatures
15:13:57 <shaohe_feng> Li_Liu: Yes, I already use attribute  :)
15:14:01 <Li_Liu> right now, when creating Deployables any parameters that are not native to deployable, will be added to the attribute_list
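The behaviour Li_Liu describes, where parameters that are not native to a deployable end up in attribute_list, might look roughly like the sketch below. The set of native field names and the key/value shape of each attribute entry are assumptions made for illustration, not Cyborg's actual schema.

```python
# Assumed set of "native" deployable fields, for illustration only.
NATIVE_FIELDS = {"name", "type", "vendor", "board", "host"}


def split_deployable_params(params):
    """Separate native fields from extras that become attribute entries."""
    native = {}
    attribute_list = []
    # Sort keys so the resulting attribute_list order is deterministic.
    for key, value in sorted(params.items()):
        if key in NATIVE_FIELDS:
            native[key] = value
        else:
            attribute_list.append({"key": key, "value": value})
    return native, attribute_list


# shaohe_feng's example request: an Intel A10 FPGA with a crypto function.
native, attribute_list = split_deployable_params(
    {"name": "fpga0", "type": "FPGA", "vendor": "intel",
     "model": "A10", "function": "crypto"}
)
```

Here model and function are not in the assumed native set, so they land in attribute_list instead of being rejected.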
15:14:58 <shaohe_feng> yes. I did use attribute as this way.
15:15:42 <shaohe_feng> Sundar: also can we support to apply batch accelerators in that API?
15:16:12 <Sundar> shaohe: if you mean that a single VM may want more than one accelerator, sure.
15:16:30 <shaohe_feng> Sundar: for example, I want 2 accelerator,   both accelerator type is FPGA, vendor is intel, model is A10, and it's function is Crypto.
15:16:58 <Coco_gao> Have we make an agreement on users' input,  parameters or sections we may support by now?
15:17:02 <Sundar> Currently, the API gets a list of device RPs selected by Nova, and extra specs. But that is not ideal. Still trying to think of a better way
15:17:05 <shaohe_feng> Sundar: or I want 2 different accelerators, one is GPU another is FPGA?
15:17:53 <Sundar> shaohe: All that should be possible. The currently proposed APIs can technically handle them, but we can probably improve upon them
15:18:27 <shaohe_feng> Coco_gao: Sundar is working on them.
15:18:28 <shaohe_feng> Sundar when will we see your API define?
15:19:00 <shaohe_feng> Sundar: one api to apply 2 accelerators instead of two api call?
15:19:17 <Sundar> shaohe: I am updating the os-acc spec. It should be out by this week, if not in next couple of days.
15:19:54 <shaohe_feng> Sundar: nice.
15:19:57 <Sundar> shaohe: The current API looks something like: prepareVANs(device_rp[], extra_specs, instance_info)
15:20:02 <Coco_gao> Great jobs.
15:20:27 <Li_Liu> ok
15:20:32 <Sundar> The extra specs will contain the details of the user request, like: CUSTOM_ACCELERATOR_FPGA=2 and the traits
15:20:52 <Sundar> The device_rp[] will contain the device resource providers selected by Nova for each of those accelerators
15:21:08 <shaohe_feng> Sundar: anyway, the api should well match the user the expect accelerators
15:21:18 <Li_Liu> device_rp is a list, and extra_specs will be applicable to all the RPs in the list?
15:21:54 <Sundar> The problem is, how do we correlate each device_rp to a specific accelerator in extra specs. If we can associate each user request with its own device RP, that will simplify our lives
15:23:51 <Li_Liu> can we make extra_specs a list as well?
15:23:56 <Sundar> Li_Liu: yes. The extra specs may contain groups based on granular request syntax #link https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/granular-resource-requests.html
15:24:13 <Sundar> The extra specs is a pre-defined thing coming from Nova.
15:24:35 <Li_Liu> ah, ok
15:24:36 <Sundar> Not sure if we can modify it. Still looking into that
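To make the granular request syntax concrete, here is a minimal sketch that groups flavor extra specs by their numbered suffix (resourcesN:... and traitN:...), so each accelerator request can be looked at separately. The helper name and the trait and GPU resource class names are hypothetical. The sketch does not solve the open question of mapping each group to a device RP.

```python
import re

# Matches keys such as 'resources1:CUSTOM_ACCELERATOR_FPGA' or 'trait:HW_X'.
_KEY_RE = re.compile(r"^(resources|trait)(\d*):(.+)$")


def group_extra_specs(extra_specs):
    """Group resource and trait requests by their granular group suffix."""
    groups = {}
    for key, value in extra_specs.items():
        match = _KEY_RE.match(key)
        if match is None:
            continue  # not a placement-related extra spec
        kind, suffix, name = match.groups()
        group = groups.setdefault(suffix, {"resources": {}, "traits": []})
        if kind == "resources":
            group["resources"][name] = int(value)
        elif value == "required":
            group["traits"].append(name)
    return groups


# Hypothetical flavor: one Intel FPGA plus one GPU, as in shaohe_feng's example.
extra_specs = {
    "resources1:CUSTOM_ACCELERATOR_FPGA": "1",
    "trait1:CUSTOM_FPGA_INTEL": "required",
    "resources2:CUSTOM_ACCELERATOR_GPU": "1",
    "hw:cpu_policy": "dedicated",
}
groups = group_extra_specs(extra_specs)
```

Each numbered group then describes one accelerator request. An unnumbered key such as resources:VCPU would fall into the group with the empty suffix.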
15:25:31 <openstackgerrit> OpenStack Release Bot proposed openstack/cyborg master: Update reno for stable/rocky  https://review.openstack.org/591423
15:26:15 <shaohe_feng> I have already use resource groups to support batch accelerators :)
15:26:59 <shaohe_feng> Sundar: reminder, have you notified Rodrigo that you and Coco_gao are refactoring the new returning data of the driver?
15:27:25 <Sundar> Oh yes.
15:27:34 <shaohe_feng> Good
15:27:46 <Sundar> We are currently looking at making the current driver patch work with Zuul
15:27:52 <shaohe_feng> Coco_gao: how is the refactor going on.
15:28:43 <Coco_gao> yeah, I am working on it right now. Still need to test the code by now.
15:29:10 <Sundar> Coco_gao: we probably need to sync up. Are you using OVOs?
15:29:33 <shaohe_feng> Coco_gao: good. Guess other developers are depending on your change.
15:29:56 <shaohe_feng> Sundar: have resolve the OPAE package install in Zuul?
15:29:59 <wangzhh> Coco:Good job.  Plz don't forget test with my GPU driver.
15:30:34 <Coco_gao> Our Datacenter is going to move to other location,  so the test work need to be done several days later.
15:31:31 <Sundar> shaohe: Rodrigo and I discussed the pkg install, and he is making changes
15:31:31 <Coco_gao> Sundar, you mean VersionedObject?
15:31:39 <Sundar> Coco_gao: yes
15:32:17 <Sundar> We should update the spec before submitting the patch?
15:32:17 <Coco_gao> yes,  I am using VersionedObject
15:32:59 <shaohe_feng> Sundar: good.
15:33:05 <Sundar> I am still backlogged on os-acc. I'll update the driver/agent spec. Coco_gao, I can send you an early version for review
15:33:17 <shaohe_feng> Sundar: you should sync up with Coco_gao well.
15:33:51 <Sundar> Yes
15:33:51 <shaohe_feng> Li_Liu: do we have deadline for this refactor?
15:34:08 <Coco_gao> I don't know
15:34:29 <Coco_gao> maybe we can discuss the deadline.
15:35:41 <Li_Liu> shaohe_feng, if you wanna catch the Rocky release, then Aug 20 - Aug 24
15:35:53 <Li_Liu> but if not, there's not hard deadline on this
15:36:00 <Coco_gao> Sundar, any detailed questions or informations , plz send me an email(gaojh4@lenovo.com)
15:36:13 <shaohe_feng> yes, Your working is very important. for other drivers base on your work.
15:36:21 <Sundar> Coco_gao: sure
15:36:52 <Li_Liu> well, yes, we need to save time for reviews as well(if you wanna catch the R release)
15:37:50 <shaohe_feng> Li_Liu:  I need to change the agent code, base on your new pf/vf model.
15:38:05 <Coco_gao> shaohe_feng, I know, maybe this week I will give an patch on the object first.
15:38:06 <Sundar> R release is past code freeze, right?
15:38:15 <Li_Liu> ok, let me know if you need any help/discussion
15:38:59 <Li_Liu> we can have ZOOM meeting anytime for discussion until Rocky releases
15:39:08 <shaohe_feng> OK.
15:40:21 <Coco_gao> then shaohe can update the agent code, others can update the driver code.
15:41:22 <shaohe_feng> Sundar,  Li_Liu, will we not touch the API code until R release any more?
15:41:40 <shaohe_feng> even Sundar have define the new API, right?
15:42:01 <Li_Liu> I think so
15:42:09 <Li_Liu> no time for that
15:42:48 <Sundar> Agreed
15:43:04 <Coco_gao> Hi, shaohe, I think maybe zhenghao can help update agent code, since we need to test gpu driver together.
15:43:37 <wangzhh> Yep. shaohe_feng: Can I take it?
15:44:08 <Coco_gao> If the agent code isn't finished, we can't test our code separately.
15:45:05 <shaohe_feng> that's great
15:45:18 <shaohe_feng> 3ks,  wangzhh
15:45:55 <wangzhh> You're welcome :)
15:46:43 <Coco_gao> Everyone, if I have any questions I will contact you directly and thank you for you support and patience~
15:47:24 <Li_Liu> Thank you Coco :)
15:47:38 <Li_Liu> Thanks a lot everyone for the hard work
15:47:57 <shaohe_feng> cool, Coco.
15:48:15 <wangzhh> No problem. ;)
15:48:25 <Coco_gao> Time to bed if we don't have other questions.
15:48:52 <shaohe_feng> yes, enough sleep keep girl beauty.
15:49:08 <wangzhh> ヾ(・ω・`。)  Good night.
15:49:10 <shaohe_feng> Any thing wants to discuss?
15:49:10 <Sundar> Good night or good day, everybody
15:49:20 <shaohe_feng> OK, let's end the meeting.
15:49:37 <Li_Liu> Have a good nite guys
15:49:51 <wangzhh> ヾ( ̄▽ ̄)Bye~Bye~
15:49:52 <shaohe_feng> Thank you Li_Liu
15:49:59 <shaohe_feng> #endmeeting