19:01:20 #startmeeting Ironic
19:01:20 #chair devananda
19:01:20 Welcome everyone to the Ironic meeting.
19:01:20 Meeting started Mon Dec 9 19:01:20 2013 UTC and is due to finish in 60 minutes. The chair is NobodyCam. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:21 no logging?
19:01:21 weird
19:01:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:24 The meeting name has been set to 'ironic'
19:01:26 Current chairs: NobodyCam devananda
19:01:27 NobodyCam: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
19:01:36 slow bot ...
19:01:38 wow just really slow
19:01:48 Of course the agenda can be found at:
19:01:48 #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_next_meeting
19:01:57 #topic Greetings, roll-call and announcements
19:01:57 Who's here for the Ironic Meeting?
19:02:11 me
19:02:14 :)
19:02:18 me
19:02:26 +
19:02:30 o/
19:02:35 :)
19:02:41 welcome all!!!!
19:02:49 !
19:02:54 a few quick announcements
19:02:57 announcements
19:03:01 go devananda
19:04:08 - we have a client release on PyPI
19:04:08 #link https://pypi.python.org/pypi/python-ironicclient
19:04:08 w00t
19:04:08 and its already landed in OoO's setup-clienttools
19:04:08 and we have an I-1 release tagged
19:04:08 https://launchpad.net/ironic/+milestone/icehouse-1
19:04:08 #link https://launchpad.net/ironic/+milestone/icehouse-1
19:04:08 #link https://review.openstack.org/#/c/60879
19:04:22 though we didn't go through the whole release process this time -- no tarball, etc -- this updates our status on LP and closes all the FixCOmmitted bugs, etc
19:04:44 so from here on, we need to be mindful of targeting work to the release cycles
19:05:12 awesome progress everyone!!!!
19:05:18 if you think you'll fix a bug (or think the bug *needs* to be fixed) by a given milestone, please tag it accordingly
19:05:40 same for blueprints
19:05:48 that's all my announcements :)
19:05:49 :)
19:06:03 any other anniuncements?
19:06:19 who is not going to be here over the holiday?
19:06:36 *announcements
19:06:38 :-p
19:06:47 I'm off from 21 of december to 2 of january
19:06:54 What holiday?
19:07:02 romcheg1: lol
19:07:06 going to brazil for xmas and new years
19:07:31 (I mean 23 of december, 21 is saturday)
19:07:31 lucasagomes: now thats a holiday
19:07:34 NobodyCam: Ah, we just don't have those ones :)
19:07:41 hp is off but I should be around most of the time.. I will be away 28-2nd
19:07:44 dkehn, :)
19:08:07 lucasagomes: i just bumped 1259269 to I-2. did you mean to target it for I-1?
19:08:13 ok moving on
19:08:18 #topic Outstanding, in-progress or Action Item updates
19:08:26 will be around, except for the obvious days, but I will be here
19:08:51 devananda, I've a review in progress for that bug already
19:08:59 romcheg1: any updates on devstack & tempest?
19:09:06 There are updates
19:09:26 Today I got +2 for my patch to Infra-config
19:09:33 #link https://review.openstack.org/#/c/53917/
19:09:48 lucasagomes: sure. but I-1 is already closed. we can't target bugs for a closed release (unless the intent is to do a backport... which is overkill right now)
19:10:12 I also moved Ironic jobs to the experimental pipeline _both_ for tempest and ironic
19:10:18 devananda: can you backport from 0.0.1 release?
19:10:24 lol
19:10:25 clarkb suggesed to do that :)
19:10:42 romcheg1: ya that makes perfect sense
19:11:14 NobodyCam: sorry, side conversation there -- i was referring to the Ironic Icehouse-1 release (which was tagged over the weekend) not the client 0.0.1. lucas and I moved that to another room so we dont distract here any more
19:11:26 devananda, gotcha ah cheers for bumping it to i2 :)
19:11:26 romcheg1: awesome!
19:11:35 :)
19:11:43 ahh
19:11:49 I hope we will have devstack tests for Ironic before the new year :)
19:12:04 romcheg1, fingers crossed :)
19:12:05 romcheg1: you're just using the fake driver, right?
19:12:16 romcheg1: to do basic api tests (CRUD of resources)
19:12:25 yes, I do
19:12:35 cool
19:12:43 Need to merge the basic ecosystem first
19:12:48 yea
19:12:54 Then I'll be able to think about more advanced testing
19:13:11 but we are getting there.. :)
19:13:12 I will talk with infra soon about resuming the work on devstack-gate testing with pxe+ssh driver
19:13:36 that is ,in theory, possible within a single devstack-gate VM, but it will take some reworking of devstack-gate scripts
19:13:38 We will need TripleO cloud for that
19:13:53 not necessarily ;)
19:14:07 not so cool :)
19:14:34 on the consistent hashing, just a brief update
19:14:54 some bits have landed, and I need to rework some of the patches based on feedback (thanks everyone who's reviewed them!)
19:15:19 should be able to do that today or tomorrow
19:15:19 nova driver updates.
19:15:42 woo hoo :)
19:16:09 I pushed a new patch to the nova driver review friday
19:16:36 that should allow a deploy from nova to go thru the steps
19:16:56 also should power on / off the node
19:17:10 thou deploy is still just a log "CAll deploy here"
19:17:40 fyi:
19:17:43 NobodyCam, :( now the trigger became a blocker
19:18:04 #link https://review.openstack.org/#/c/51328/
19:18:06 I gotta submit a review for that soon
19:18:10 lucasagomes: is that work (add deploy to client) blocked on anything right now?
19:18:40 devananda, not to trigger it
19:18:55 but we still need pxe to control the power on/off (based on that diagram)
19:19:23 #link https://docs.google.com/drawings/d/1azAWh0ZfhDqEUsC14ZEBawbnAmdQ2_yl3CfOdDbPvOk/edit
19:19:31 lucasagomes: that one ??^^^
19:19:40 btw, /state/provision will be the trigger is that correct?
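[Editor's aside: the "consistent hashing" update above refers to mapping nodes onto conductors so that when a conductor joins or leaves, only a small fraction of nodes get reassigned. The following is a minimal sketch of that idea, not Ironic's actual implementation; the class and host names are illustrative.]

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring: maps node UUIDs to conductor hosts."""

    def __init__(self, hosts, replicas=64):
        # Each host gets several virtual points on the ring so the
        # key space is spread evenly across hosts.
        self._ring = []
        for host in hosts:
            for i in range(replicas):
                point = self._hash('%s-%d' % (host, i))
                self._ring.append((point, host))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_host(self, node_uuid):
        # Walk clockwise to the first virtual point at or after the key,
        # wrapping around the ring if necessary.
        point = self._hash(node_uuid)
        idx = bisect.bisect(self._ring, (point,)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(['conductor-a', 'conductor-b', 'conductor-c'])
host = ring.get_host('5a6c2f10-1b8e-4f6d-9e0a-123456789abc')
```

The payoff of this scheme is that rebuilding the ring without a dead conductor only moves the nodes that hashed to that conductor; everything else keeps its existing assignment.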
19:19:50 NobodyCam, that's correct :)
19:20:11 we should add all the state changes to that
19:20:15 lucasagomes: it does -- https://review.openstack.org/#/c/50409/
19:20:51 devananda, ah I think I missed something then
19:20:54 so I think we are fine
19:20:57 looks like most are already there
19:20:59 I will add the trigger
19:21:02 awesome
19:21:06 awesome
19:21:29 any Q/c on nova-driver?
19:21:36 lucasagomes: states/power && states/provision -- based on our last discussion, I think thats fine.
19:22:00 magic :)
19:22:12 lucasagomes: we should also start thinking very soon about how to avoid making incompatible changes in our API and client lib
19:22:28 romcheg1: yuriyz: anything on Integration testing scheme?
19:22:33 devananda: ++
19:22:46 devananda, yea, we were fine before just adding stuff
19:22:56 but I had to refactor the name of 2 attributes on the API side
19:23:09 like moving /nodes/xxx/state --> /nodes/xxx/states. this is an incompat change.
19:23:24 oh yea that as well :/
19:23:29 right
19:23:44 we can easily alias it, or have /state return a redirect to /states
19:24:04 devananda: should changes to api be accompanied by client changes too?
19:24:13 that makes sense, now that we have a release we should be more strict on those changes
19:24:18 technically, i dont think we need to be careful about this until Icehouse is actually released
19:24:25 As soon as we have devstack tests running for every patch, accidentally breaking API will be harder
19:24:35 romcheg1: ++ :)
19:24:39 romcheg1, indeed
19:25:01 However, we cannot cover 100% of the API now
19:25:02 and that ^ :)
19:25:28 any updates on the Integration testing scheme?
19:25:44 (it's on the agenda)
19:25:47 NobodyCam: yes, changes to the API service should be accompanied by changes to the client. but that's not really the whole story
19:26:04 Hm, I'm not sure I put it to the Agenda
19:26:17 I told everything in the beginning of the meeting
19:26:30 devananda: so at min reff in the commit moessage to accompaning api / client change?
19:26:35 NobodyCam: we can't just go making an incompatible change to the API once we have tests in tempest, and same once we have testign for the nova-ironic driver.
19:26:48 romcheg1: great !!!
19:26:53 by "cant" i mean, literally, we CANT. Gerrit won't let us :)
19:27:07 Ah, that was one of my coleagues, who put that there
19:27:15 vkozhukalov: Are you around?
19:27:36 devananda: ahh :)
19:27:50 I will poke him tomorrow
19:28:01 romcheg1: his in #ironic
19:28:09 romcheg1, yes, I am
19:28:22 welcome vkozhukalov
19:28:54 vkozhukalov: hi! we just started chatting about this in -ironic, i think.
19:29:11 vkozhukalov: I was check on updates to Integration testing scheme
19:29:38 checking even
19:29:40 right now we with agordeev are working on this
19:30:00 we wasted a lot of time just to understand how to make it
19:30:08 vkozhukalov: awesome :)
19:30:48 ok we'll move on
19:30:52 I think we'll send pull request with some kind of functional smoke test tomorrow or day after
19:31:12 we may have covered most of this but:
19:31:21 #topic Python-IronicClient
19:31:42 any thing else on the client?
19:32:01 idk if there's much to cover here :) apart from the release and some changes to reflect the api changes
19:32:10 that was said already
19:32:18 it's on PyPI. it's in tripleo-image-elements. and some small changes in the pipeline for it.
19:32:22 that's about it, i think :)
19:32:30 ya
19:32:37 is there docn for the client?
19:32:46 there's --help :)
19:32:53 lucasagomes: did you write a man page for it?
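[Editor's aside: the aliasing option floated above for the /nodes/xxx/state --> /nodes/xxx/states rename ("have /state return a redirect to /states") could look roughly like the sketch below. This is a generic WSGI illustration, not Ironic's actual Pecan-based API code; the handler and paths are made up.]

```python
def redirect_deprecated_state(app):
    """WSGI middleware: 301-redirect .../state to the renamed .../states."""
    def middleware(environ, start_response):
        path = environ.get('PATH_INFO', '')
        if path.endswith('/state'):
            # Old clients ask for the pre-rename resource name; send them
            # to the new one instead of breaking them with a 404.
            start_response('301 Moved Permanently',
                           [('Location', path + 's')])
            return [b'']
        return app(environ, start_response)
    return middleware


def api(environ, start_response):
    # Stand-in for the real /nodes/<uuid>/states handler.
    start_response('200 OK', [('Content-Type', 'application/json')])
    return [b'{"provision_state": "active"}']


application = redirect_deprecated_state(api)
```

A redirect like this keeps old clients working through a deprecation window, which matters less before the first stable release (as noted above) but becomes mandatory once tempest gates on the API.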
19:33:03 devananda: we should create a wiki page for it
19:33:05 rloo, I promess I will revive that man page review
19:33:17 devananda, I have one review abandoned I will revive it
19:33:22 I need to find osme time
19:33:25 NobodyCam: i dont see the use of a wiki page for the client
19:33:36 ok
19:33:45 #link https://review.openstack.org/#/c/52902/
19:33:52 I've been using --help
19:33:58 ha ha, yes, I know there's help, just wanted to know if we're responsible for more than that :-)
19:34:21 ah
19:34:23 rloo, I think so
19:34:34 :)
19:34:38 i am not aware, as far as the client, taht we need more
19:34:50 rloo: however, we will need deployer docs for the ironic services
19:35:04 in that case, forget the man page?
19:35:13 yes, deployer docs would be good.
19:35:26 rloo: I would forget the man page
19:35:35 the last sprint (feature freeze window after I-3) is meant for bug fixing and doc writing, but anything before that is great, too
19:35:42 *would NOT...
19:35:46 :-p
19:35:53 I think that any docs would be welcome
19:36:12 lucasagomes: yes!!! ++
19:36:16 yes, docs are great, but useless if they are out of date.
19:36:23 :-p
19:36:29 rloo, indeed
19:36:42 good to move on?
19:36:59 mordred: in general, do / should client packages install a man page? it looks like they don't.
19:37:15 I think swift has a manpage
19:37:19 maybe the only one
19:37:37 https://github.com/openstack/python-swiftclient/blob/master/doc/manpages/swift.1
19:37:42 yea, let's move on. I dont think we need a man page. and if it's not automatically generated from code, it's just goign to get stale and therefor we should not have it
19:37:46 lucasagomes: in general https://github.com/openstack/swift/tree/master/doc/manpages
19:37:48 I would think it good to have a man page
19:37:53 #topic Nova-driver
19:38:04 we've covered most of this
19:38:10 notmyname, thanks :)
19:38:13 notmyname: is that manpage for the swift client or server?
19:38:55 nvm. both :)
19:39:08 any ting on the Nova driver? Q/ C?
19:39:34 moving on then
19:39:47 #topic API discussion
19:40:10 I think we already covered deploy in the API
19:40:17 lucasagomes: devananda: anything else
19:40:19 :-p
19:40:23 there's one topic about the API, which I put on the FFT but might worth talking here
19:40:27 devananda, the lock break
19:40:32 lucasagomes: yea, go for it.
19:40:37 we need to address that
19:40:48 so is the API right to expose a way to break the lock of a node?
19:40:57 if yes, where it should live on the API
19:41:29 #link https://review.openstack.org/#/c/55549/
19:41:32 lucasagomes: are you thinking say from nova driver if some thing like a deploy time out occurs?
19:41:53 IMO only break existing lock, not expose
19:42:23 read my comments on the patch
19:42:50 devananda: you should install a manpage if you, you know have one. otherwise, I doubt that "man ironic" is going to do much good for anyone
19:43:31 if we allow a TaskManager lock to be broken by a simple API call alone, it will be too easy to abuse.
19:43:36 mordred: thanks
19:43:56 who has access to ironic commands. anyone?
19:44:16 lets switch to :
19:44:24 #topic open discussion / FFT
19:44:40 open mic
19:44:41 if we create an API endpoint that allows POST or PATCH to unset the reservation, but we never expose that there is such a resource, it's rather confusing
19:44:53 rloo: ironic is, today, admin-only
19:45:02 rloo: which means anyone with the admin bit set in keystone
19:45:11 so 'too easy to abuse' by admin folks.
19:45:15 rloo: "if you build it, they will come"
19:45:33 NobodyCam, it's about if the conductors die while helding the lock of the node
19:45:36 we can't trust admin to be careful...
19:45:40 rloo: right
19:45:48 lucasagomes: yes. so, here's an idea
19:45:55 yuriyz, you mean like having it hidden?
19:46:09 yuriyz, we need to expose it to be able to break via the API
19:46:17 lucasagomes: what if we allow a lock to be broken IFF the conductor holding it is dead (last heartbeat > heartbeat_timeout ago)
19:46:27 admin have already *status fields
19:46:29 devananda, ++, we need some sanity check
19:46:50 yuriyz: ^ and see my last comment on your patch (from friday)
19:47:07 yuriyz: admin does not currently see any indication of reservation
19:47:07 lucasagomes: devananda: how would nova driver handle a time out during a deploy?
19:47:27 NobodyCam: nova will destroy the instance
19:47:45 and that would break the lock??
19:48:06 NobodyCam: this can actually be triggered at a higher layer in nova's stack. the nova-ironic driver could be sitting in a wait loop, and the scheduler could send a destroy() call down to it
19:48:08 yuriyz, yea, we should expose that reservation field as an attribute on the API object so when people GET /nodes/ they can see the reservation
19:48:16 NobodyCam: this is, last I checked, still an issue with noav-baremetal ....
19:48:21 we need to be able to GET -> document before PATCH -> document
19:48:45 NobodyCam: that won't currently break any lock. there's no way to interrupt an ongoing task in ironic. and I don't think we should worry about that now().
19:49:27 devananda: ack... on the NOW() but should be thought about ... imo
19:49:30 NobodyCam: eventually, we will want to be able to interrupt a deploy, or a format, or whatever ironic is doign that is interruptable (eg, NOT a firmware update)
19:50:07 I got one action last week to write a bp about it, it's on my todo list
19:50:15 the issue at hand is: how do we allow conductor-B to regain control of a node formerly controlled by the now-dead-conductor-A
19:50:23 I can see breakable and unbrackable locks.. ie. a firmware update you would not with to stop because a admin set a time value to low
19:50:51 s/with/want/
19:51:08 * devananda resists the over-engineering-tendencies
19:51:14 :)
19:51:20 just food for thought
19:51:39 hehe let's make one condition for now... if the node is locked by a conductor which is _dead_ the lock can be breaked
19:51:42 broke*
19:52:06 we know the conductor is dead based on the hearbeats
19:52:09 :) lucasagomes + I agree
19:52:19 we need to know first, where to expose it
19:52:27 simple solution seems to me: expose 'reservation' on the GET document. allow PATCH to clear that field only, and add a check to ensure it can only be cleared if the conductor is dead
19:52:43 yuriyz: what do you think ^ ?
19:52:49 devananda, agree
19:52:50 devananda, +1 IF reservation is part of the document
19:53:09 lucasagomes: taht was the first part of my sentence :)
19:53:22 haha yea
19:53:24 ++ then
19:53:33 NobodyCam: ?
19:53:34 was the impulse :P
19:53:38 devananda: could the secondary conductor for a node notice the primary is dead and do the take/ break from the periodic task?
19:53:53 NobodyCam: yes. in principle, we can automate this and never expose it on the API
19:54:03 :)
19:54:03 yuriyz, could change ur patch to add the reservation to the node api object please?
19:54:04 that _is_ another option
19:54:14 ok
19:54:20 but automatic lock breaking was pretty strongly undesired at the summit
19:54:27 and so i'm hesitant to introduce it
19:54:33 (i think there were some good reasons against it)
19:55:04 do we have these arguments on a etherpad somewhere?
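[Editor's aside: the rule the group converges on above -- expose 'reservation' on GET, let PATCH clear that field only, and allow the clear only if the holding conductor's heartbeat has gone stale -- could be sketched roughly as below. Names and data structures are illustrative, not Ironic's actual conductor or API code.]

```python
import time

HEARTBEAT_TIMEOUT = 60  # seconds; illustrative default


class LockHeldError(Exception):
    """Raised when the reservation holder is still heartbeating."""


def break_lock(node, conductors, now=None):
    """Clear node['reservation'] only if the holding conductor is dead.

    `node` is a dict whose 'reservation' field names a conductor host;
    `conductors` maps host -> timestamp of its last heartbeat.
    """
    now = time.time() if now is None else now
    holder = node.get('reservation')
    if holder is None:
        return node  # nothing to break
    last_seen = conductors.get(holder)
    # A conductor that never registered, or whose last heartbeat is
    # older than the timeout, is considered dead.
    if last_seen is not None and now - last_seen <= HEARTBEAT_TIMEOUT:
        raise LockHeldError('conductor %s is still alive' % holder)
    node['reservation'] = None
    return node
```

In the API this check would back the "PATCH may clear 'reservation' only" endpoint; the alternative discussed above, a secondary conductor breaking the lock automatically from a periodic task, would use the same liveness test but was set aside given the summit objections to automatic lock breaking.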
19:55:32 I dont recall them getting put on the ep
19:56:17 we may wan to see if we can gather them up and make a EP for lock breaking
19:57:38 any ting else in that last three minutes
19:57:44 gah
19:57:48 yuriyz: i think all the requirements for you to add the "check that conductor which holds teh lock is really dead, before breaking the lock" are already merged
19:57:55 anything else in the last three minutes
19:58:36 yuriyz: actually, nvm. there is one more patch coming: https://review.openstack.org/#/c/59795/
19:58:50 yuriyz: that'll change the way you get the list of active conductors
19:59:31 last minute of official time
19:59:35 I'll work on that today
19:59:46 so how can we prevent, or have such discussions, before a patch has gotten so far?
20:00:14 rloo: we can bring ideas up in IRC or on the ML, which I think we do
20:00:16 or maybe that's the nature of the beast. just wondering.
20:00:25 rloo: but sometimes code speaks more clearly than anything else
20:00:35 ok going to wrap up the meeting
20:00:41 thank you all!!!
20:00:45 before someone writes code, should they discuss in IRC?
20:00:55 rloo: it's not a bad idea ;)
20:00:56 rloo, I think it depends on the change
20:01:04 great meeting
20:01:14 good meeting indeed -- let's move back to -ironic
20:01:16 thanks everyone!
20:01:20 cheers
20:01:21 rloo: can we move this to #ironic
20:01:25 sure
20:01:27 :)
20:01:32 thank you all
20:01:35 #endmeeting