19:04:31 #startmeeting infra
19:04:31 Meeting started Tue Apr 28 19:04:31 2015 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:04:32 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:04:35 The meeting name has been set to 'infra'
19:04:46 https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:04:59 O/
19:05:02 #topic Announcements
19:05:28 jeblair and I will be in France tomorrow and thursday, so may be even more offline - and then he's sticking around for friday as well
19:05:30 make sure you add summit session topic ideas/preferences to the etherpad
19:05:34 #link https://etherpad.openstack.org/p/infra-liberty-summit-planning
19:05:56 jeblair said he plans to organize/finalize what's there at the end of the week or early next
19:06:01 so time is of the essence
19:06:24 #topic Actions from last meeting
19:06:38 #link http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-04-21-19.01.html
19:06:52 #action fungi check our cinder quota in rax-dfw
19:06:57 o/
19:07:02 i have not done that, so popping it back into the list
19:07:16 #topic Priority Specs
19:07:34 skipping these--we'll have plenty of new ones to start putting on the list once we summit
19:07:43 #topic Priority Efforts
19:07:54 i'll try to get through these quickly...
19:08:01 #topic Priority Efforts (Swift logs)
19:08:22 so... we're doing more of these i gather...
19:08:28 saw some bugs recentlyish?
19:08:37 something about swift not being reachable
19:08:49 we need to add support to os-loganalyze to serve non-log files still
19:08:55 we tried this but had to revert
19:08:56 there were those jobs clarkb spotted yesterday hung for hours after (or while) trying to upload logs
19:09:10 so we will probably need to increase our testing before continuing
19:09:14 o/
19:09:31 jhesketh: did you get the paste i provided of the traceback for the non-log serving problem?
19:09:43 fungi: I haven't seen that, any chance you can link it here?
19:09:44 fungi: I think they were due to network issues
19:09:50 if not, i can likely dig the paste url out of my irc history
19:09:51 fungi: ditto, I haven't seen that
19:10:07 finding
19:10:12 the upload stuff has been hardened a little last night to retry a few times before failing
19:10:27 https://review.openstack.org/#/c/178199/
19:12:49 I will review that after the meeting
19:12:51 i'm not having a ton of luck tracking down the paste, so probably to do it after the meeting
19:12:51 #link https://review.openstack.org/#/c/178199/
19:13:01 er, better to
19:13:09 okay
19:13:26 i'll need to triangulate based on timing of the revert and associated conversation in channel
19:13:35 so we probably can switch over some more jobs with simple logs but some of them will depend on fixing static file serving
19:13:48 otherwise it's progressing well and not much new to report (afaik)
19:14:31 okay, cool
19:14:51 #topic Priority Efforts (Nodepool DIB)
19:15:17 all the required DIB changes are merged
19:15:29 there's one last fix that i'd like to merge before cutting a release
19:15:41 and it's just waiting on CI
19:16:03 on the bindep front, i got back to manual testing to make sure it's doing what we want before tagging. turned up a minor parsing bug which is easy to fix but i'm struggling to figure out how to add a regression test. should have that up soon after the meeting
19:16:07 There was talk about us being able to test consolidated images once the dib release is cut?
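A minimal sketch of the kind of retry wrapper described for the log uploads above ("retry a few times before failing"); the function names, attempt count, and delay here are made up for illustration and are not the actual change in review 178199.

```python
import time


def upload_with_retries(upload_file, path, attempts=3, delay=5):
    """Call upload_file(path), retrying a few times before failing.

    upload_file stands in for whatever performs the actual swift PUT and
    is assumed to raise an exception on failure (hypothetical interface).
    """
    for attempt in range(1, attempts + 1):
        try:
            return upload_file(path)
        except Exception as exc:
            if attempt == attempts:
                # out of retries, let the failure propagate to the job
                raise
            print("upload of %s failed (%s); retrying in %ss (attempt %d of %d)"
                  % (path, exc, delay, attempt, attempts))
            time.sleep(delay)
```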
19:16:33 consolidated images?
19:16:41 clarkb: one-image-to-rule-them-all
19:16:44 the image that boots in all the places
19:16:55 well, there are two aspects
19:17:07 we need nodepool shade + image uploads which is merge conflicted iirc
19:17:10 one is "boot image of type X in all the places"
19:17:18 also not sure if my -1 there was ever addressed
19:17:25 one is "stop having multiple types"
19:17:43 Yes, it might be nice to switch over to the new dib images (which use *-minimal) in hp first?
19:17:45 I have not gone back to the nodepool patches because we've been landing more testing for things in shade
19:18:05 I'll re-address the nodepool patches on the plane this week
19:18:36 greghaynes: I think we actually want to add a new image type for both clouds first - then run tests on it via experimental
19:18:48 sgtm
19:18:58 I can play with that
19:19:00 since there are some differences in image content and we want to make sure those don't break tests
19:19:01 cool
19:19:09 oh, still need centos-6 nodes
19:19:17 #link https://review.openstack.org/171286
19:19:18 fungi: we support those too
19:19:31 mordred, which kind of tests are you talking about?
19:19:33 i mean my review to add centos-6 nodes
19:19:35 oh it lost my +2
19:19:55 yeah, i had to rebase because it started to merge-conflict on the log config
19:19:57 yolanda: we need to run devstack-gate on the new nodes
19:20:05 i've still got some more manual bindep vetting to do, so might turn up a few other minor bugs. also i'll probably add a trivial --version option before releasing. should have the remaining reviews up today sometime
19:20:08 mordred, then it's quite related to my epc
19:20:10 spec
19:20:18 yolanda: to make sure that the switch to the ubuntu-minimal doesn't kill things
19:20:35 yolanda: yes - but this time I think we want to do it by hand
19:20:45 but it would be covered by your spec in the future for sure
19:20:46 mordred, how can this fit there? https://review.openstack.org/139598
19:20:46 well and I am not sure we would make use of that spec upstream
19:21:01 I know jeblair is largely against that type of testing unless his opinion has changed
19:21:12 yah - let me rephrase
19:21:14 (he doesn't want us to go without image updates due to an automated process)
19:21:20 I don't want to conflate the two
19:21:33 right now, we're tantalizingly close to getting dib-nodepool landed
19:21:33 right, his concerns are validated by the fact that we used to have this
19:21:40 and it was terrible
19:21:45 agreed they are separate concerns, I think anytime you bring on a new image you need to vet it directly
19:22:26 we ran devstack smoke tests on new images before marking them available, and nondeterministic bugs in the tests meant that we often failed to update images
19:22:37 clarkb, so you prefer validating an image manually each time, prior to using it?
19:22:38 yah - and even though these are replacements, they are _new_ images
19:22:52 yolanda: when it is a new image yes
19:23:02 a new image, means new distro?
19:23:10 yolanda: new build process
19:23:20 ok, it's different scope then
19:23:22 could be new distro but in this case it's changing how we build the image
19:23:28 agree
19:24:00 so i need to do sort of that but in a daily way, but it's for downstream issues
19:24:18 yah. and in the downstream case the automated testing daily makes total sense
19:24:30 yep, mirrors updating, packages, etc
19:24:53 maybe? i wouldn't be surprised if false negatives in those tests regularly left you with stale images
19:25:21 fungi, i wanted to add some retry to discard false positives as well
19:25:25 but it can happen, yes
19:25:35 fungi: possibly - but in this case, downstream is also the vendor for the os images
19:25:44 fungi: so actually trapping all of the things is important
19:26:02 i suppose simple declarative tests might work, like "here is the list of packages i require to be in an installed state"
19:26:06 like the reason jeblair does not like this for upstream is the exact same reason it's a good idea for them downstream
19:26:18 For those kinds of tests there's some new DIB test functionality we should use
19:26:21 which we can test per-commit
19:26:23 as opposed to functional tests like "do nova unit tests complete successfully on this node"
19:26:25 greghaynes: woot
19:26:30 fungi: ++
19:26:38 fungi: similar to how we use the ready script
19:26:47 so - I'd like to suggest that now is probably not the right time to design downstream image testing
19:26:47 fungi: eg can I resolve a name in dns
19:26:51 greghaynes, can i know more about it? maybe you can explain it to me later
19:26:55 yolanda: yep
19:27:06 thx
19:27:55 so... anything else nodepool+dib related?
19:28:00 just one last thing
19:28:08 I would like to see us try to do one thing at a time, not 5 at a time
19:28:18 so if we can draft up a priority list that would be good
19:28:22 otherwise debugging is going to be unfun
19:28:33 clarkb: I believe I agree with you, but can you be more specific?
19:28:41 at least in places where the work intersects
19:28:43 bindep -> nodepool shade uploads -> ubuntu-minimal for example
19:28:59 right. totally agree
19:29:00 some of that involves non-impacting prep work which can happen in parallel
19:29:28 but, like, actually putting bindep into production use would be something we'd need to coordinate against other potentially impacting changes
19:29:42 clarkb: let's make a priority list / sequence
19:29:51 clarkb: I've got one in my head - but it's possible you do not share my brain
19:30:05 or starting to use shade to upload would be a potentially impacting thing we'd need to coordinate
19:30:10 mordred: that would be a neat trick, we don't need a list now but ya let's do that
19:30:17 ++
19:30:17 minimizing the change sounds like a good idea
19:30:26 biggest question I think ...
19:30:31 In theory, storyboard is good for this?
19:30:57 is going to be "do we do ubuntu-minimal to hpcloud with current nodepool, then shade uploads to hp, then add dib uploads to rax"
19:31:27 or - "do we do shade uploads with current ubuntu image, then ubuntu-minimal to hpcloud, then add dib uploads to rax"
19:31:46 but those three are a sequence
19:31:57 and doing 2 of them at once will be too complex
19:32:07 IMO shade uploads first because using one image across rax and hpcloud is far more important than ubuntu-minimal
19:32:18 we can't do one image without ubuntu-minimal
19:32:25 as far as the nodepool+dib plan goes, bindep is mainly just the bit which will allow us to not have to care about dib'ing the bare-.* workers. it can come last-last
19:32:41 the entire thing behind ubuntu-minimal is making an image that we can boot in rax
19:33:05 it's the reason rax is at the end of the list both times - but there are two changes to the plumbing that we can do and test before adding rax to the mix
19:33:38 Could we just do ubuntu-minimal, without any other changes?
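To make the "simple declarative tests" / ready-script idea above concrete, a rough sketch of a node check under assumed details: the package list, the hostname, and the dpkg-based check are illustrative only, not an existing infra script.

```python
import socket
import subprocess

REQUIRED_PACKAGES = ["git", "openssl", "python-dev"]  # example list only


def missing_packages(packages):
    """Return the packages dpkg does not report as installed."""
    missing = []
    for pkg in packages:
        rc = subprocess.call(["dpkg", "-s", pkg],
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL)
        if rc != 0:
            missing.append(pkg)
    return missing


def dns_resolves(name="git.openstack.org"):
    """Check that a hostname can be resolved from this node."""
    try:
        socket.gethostbyname(name)
        return True
    except socket.error:
        return False


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing or not dns_resolves():
        raise SystemExit("image validation failed, missing packages: %s"
                         % ", ".join(missing))
```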
19:33:40 fungi: ++
19:33:47 *-minimal, yes we could
19:33:48 SpamapS: yes. that is a thing we can do
19:34:03 which I am very much a fan of :) get something totally "done"
19:34:24 (it's possible we did not succeed with "don't make the list now")
19:34:35 er, yeah
19:34:46 So, if we look at it as an airport where we have only one runway, and we need to land, and give things time to taxi/clear ... ubuntu-minimal.. shade uploads.. bindep?
19:35:00 have rax as thing 3
19:35:11 ubuntu-minimal ... shade uploads ... uploads to rax ... bindep
19:35:17 ahhh right
19:35:22 that way we're just testing that shade doesn't break HP in that step
19:35:23 I keep forgetting those are two different things. :)
19:35:30 one facilitates the other
19:35:32 where "bindep" is probably better phrased as "drop bare-.* workers"
19:35:36 fungi: ++
19:35:47 ubuntu-minimal ... shade uploads ... uploads to rax ... drop bare-* workers
19:35:53 ok, somebody write that down
19:36:03 bindep is one of the things we're doing to get us to the point where we won't be needing/using those
19:36:17 yup. but there are other things to
19:36:19 too
19:36:35 the first set is "stop having hp and rax have different images for devstack nodes"
19:36:45 the second is "stop having non-devstack nodes"
19:37:18 oh - while we're on the topic - if everyone hasn't added openstack-infra/glean to their watch list already
19:37:20 please do
19:37:25 I expect zero new patches to it
19:37:31 but bugs happen
19:37:46 implementation order: switch to ubuntu-minimal, then do shade uploads, then add uploads to rax, finally drop bare-.* workers
19:37:52 is that something we're agreeing on?
19:37:56 I agree.
19:37:58 clarkb?
19:38:22 I mostly agree, still not a fan of diving into the replace-cloud-init-full-stop plan
19:38:24 never have been
19:38:29 to be clear - that's "switch to ubuntu-minimal for the devstack nodes currently built by dib on hp"
19:38:38 clarkb: sure. we just literally don't have another option on the table
19:38:42 I think our goal was to make an image work in rax and we forgot that a while back and made this far more complicated than it needs to be
19:38:51 mordred: we can use nova agent, we can use patched cloud-init
19:39:06 clarkb: yeah. we can go back to the drawing board now that we have all the work finished
19:39:13 I'm not saying that
19:39:18 I am merely saying my stance has not changed
19:39:22 fair
19:39:22 so asking me what I think is a nop
19:40:01 minimal images are fine
19:40:09 but I think they are being used as a proxy in the larger more complicated thing
19:40:22 which concerns me because that means there are many ways they can break
19:40:29 but the only way to find out is to use them
19:40:45 so I won't stand in the way of that
19:40:46 yah. although that's also my concern with adding nova-agent to nodes in hpcloud
19:41:03 clarkb: have I mentioned I hate this particular problem?
19:41:09 and then replace them with sanely-operating cloud-init later if we can confirm that it is
19:41:21 but i'm fine with using the solution we have
19:41:31 until it's proven not to be a good solution
19:41:59 #agree implementation order: switch to ubuntu-minimal, then start using shade to upload, then add rackspace upload support, finally drop bare-.* workers
19:42:12 fungi: (I think it's #agreed)
19:42:14 ?
19:42:18 or no, I'm wrong
19:42:21 * mordred shuts up
19:42:21 i agree ;)
19:42:31 the meeting header says #agreed
19:42:50 #agreed implementation order: switch to ubuntu-minimal, then start using shade to upload, then add rackspace upload support, finally drop bare-.* workers
19:43:01 hopefully that took
19:43:07 for the record "switch to ubuntu-minimal" == "create a new image type based on ubuntu-minimal, verify tests work on that, then switch things to use it"
19:43:21 yup
19:43:21 it does not mean "switch the current devstack-dib nodes to use ubuntu-minimal"
19:43:34 #topic Priority Efforts (Migration to Zanata)
19:43:38 o/
19:43:43 so things are moving along steadily
19:44:12 cinerama and I have submitted several bugs upstream (see line 165 of https://etherpad.openstack.org/p/zanata-install) and I've forwarded them along to carlos at redhat/zanata, where they're working to prioritize them
19:44:28 and carlos will be at the summit
19:44:33 yay we're helpful!
19:44:57 yup. currently we're at the point with the client work where we're adapting the existing proposal scripts and working around some missing features in the zanata client like percentage completion
19:45:23 I want to point out that zanata does string formatting parsing and can error/warn if translators change the expected formatting operators
19:45:28 which I think is the coolest feature ever
19:45:42 i will be proposing a change for that and everyone can jump in with suggestions etc & things i've overlooked
19:45:48 I've been chatting with daisy about whether we want the translations session in i18n track or infra, no decision yet
19:45:57 er, translations tools
19:46:03 I put some details in our summit etherpad
19:46:09 clarkb: dude. that's awesome
19:46:29 does it support new-style python {} formats?
19:46:38 mordred: I didn't check that, does support old style
19:46:43 cool
19:46:49 that is pretty amazing
19:47:41 in spite of java ;) I think we did go with the right tool in zanata, it's got some great features and with upstream willing to add/adjust features for us, things are going well
19:48:14 that's excellent to hear
19:48:21 it's always great to have a supportive upstream
19:48:47 * zaro is envious
19:49:06 * mordred hands zaro a pie
19:49:52 that's all from us I think :)
19:50:46 yay! more progress!
19:51:06 #topic Priority Efforts (Downstream Puppet)
19:51:14 anything new here since last week?
19:51:25 we've got a few people joining the effort
19:51:27 some patches still pending reviews from my side
19:51:31 and some new patches
19:51:40 i didn't have time to work on anything new
19:51:42 we have a lot of reviews to do
19:51:49 indeed
19:51:52 please :)
19:51:56 I'll make that a priority this week
19:52:01 release season always leaves us swamped
19:52:06 pleia2, thx
19:52:11 i'm hoping next week will start to improve again
19:52:20 * pleia2 nods
19:52:26 i saw asselin's topic for the summit, i'd love this one
19:52:37 if it takes place i'll be joining the effort
19:52:40 quite a few reviews in the pipeline
19:52:47 yolanda, thanks
19:52:55 I'll go review things there too
19:52:57 this is my #1 https://review.openstack.org/#/c/171359/
19:53:16 * asselin_ hopes we get through much of the review before the summit and deal with any challenges there
19:53:18 asselin, so i still have my concerns on the way these things are tested, but we can iterate later
19:53:20 nibalizer: good to know
19:53:20 nibalizer: is the in-tree-hiera patch part of this?
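The format-string validation described in the Zanata topic above (warning when a translation changes the expected formatting operators) can be pictured roughly as below. This is not Zanata's implementation (Zanata is Java); it is only a small Python illustration of the idea, with a deliberately simplified placeholder pattern for old-style % formatting.

```python
import re

# simplified pattern for old-style placeholders like %s, %d, %(name)s
PLACEHOLDER = re.compile(
    r'%(?:\([^)]+\))?[#0\- +]?\d*(?:\.\d+)?[sdifgGxXeEou%]')


def placeholders(message):
    """Return the sorted list of format placeholders in a message."""
    return sorted(PLACEHOLDER.findall(message))


def changes_formatting(source, translation):
    """True if the translation drops, adds, or alters a placeholder."""
    return placeholders(source) != placeholders(translation)


# e.g. a translation that turns %(count)d into %(count)s would be flagged
assert changes_formatting("%(count)d files uploaded",
                          "%(count)s fichiers envoyés")
assert not changes_formatting("Hello, %s!", "Bonjour, %s !")
```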
19:53:32 I plan to start back up on the pipeline this week, and review stuff
19:53:53 mordred: i think in tree hiera can help, but we're not sure exactly what that looks like yet
19:54:01 yolanda, we should meet offline to discuss that
19:54:18 simplifying o_p::server makes it easier to separate downstream consumable from upstream immutable
19:54:22 asselin, we need to talk in Vancouver :)
19:54:33 nibalizer: totally. I was just wondering how much we should push on the hiera config change patch
19:54:44 nibalizer, agree, i've been focusing my efforts on that, isolate functionality, move to modules
19:54:47 mordred: :shrug:
19:54:58 nibalizer: I will be honest, I have avoided that change because the previous one related to it broke so badly
19:55:02 asselin_: yea I'd like to see more code in the modules than in openstackci
19:55:12 clarkb: understandable
19:55:14 nibalizer: and I don't feel I have the time to commit to unbreaking the entire world should something go bad again
19:55:51 clarkb: yea... not sure what the happy path forward on that is
19:55:58 nibalizer: I think smaller changes
19:56:02 nibalizer: don't move everything all at once
19:56:03 nibalizer, agree... I'd like to baby-step in that direction
19:56:07 at a minimum, we shouldn't land it before kilo release day ;)
19:56:13 clarkb: i didn't move everything all at once
19:56:24 nibalizer: it's >700 lines changed
19:56:31 which change are you talking about?
19:56:37 nibalizer: 171359
19:56:42 your #1 change
19:56:48 clarkb: no, so that's git being confused
19:56:54 there is one file that is 300 lines
19:56:58 and another file that is 10 lines
19:56:58 yes I understand
19:57:01 and they are becoming the same file
19:57:04 with no inline changes
19:57:31 but every single node is affected
19:57:41 and the last time we did this was basically the same scenario
19:58:04 clarkb: well, so my answer to that is that i'm a little frustrated
19:58:09 we agreed in the spec to do these refactors
19:58:14 I am not saying don't do the refactor
19:58:15 then the changes sit for a while
19:58:22 and things that conflict get merged in
19:58:22 sounds like you need to stage the servers for the change, vs all of them at once
19:58:23 I am saying let's break it up and do it piece by piece
19:58:29 which will also avoid conflicts
19:58:47 pabelanger: unfortunately puppet doesn't really grok that
19:58:55 clarkb: let's talk about this outside of the meeting
19:59:08 yeah - TC is one minute away
19:59:08 nibalizer: sure
19:59:09 well, we could disable puppet agent everywhere and then try bringing the change in machine by machine manually
19:59:09 we can find a way that works to go forward
19:59:12 clarkb, ya, trying to think of a way to do that
19:59:14 but that's painful
19:59:41 okay, thanks everyone. we can try to invert the priority topics list next week to get to what we missed this time
20:00:01 #endmeeting