19:04:31 <fungi> #startmeeting infra
19:04:31 <openstack> Meeting started Tue Apr 28 19:04:31 2015 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:04:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:04:35 <openstack> The meeting name has been set to 'infra'
19:04:46 <fungi> https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:04:59 <greghaynes> O/
19:05:02 <fungi> #topic Announcements
19:05:28 <mordred> jeblair and I will be in France tomorrow and thursday, so may be even more offline - and then he's sticking around for friday as well
19:05:30 <fungi> make sure you add summit session topic ideas/preferences to the etherpad
19:05:34 <fungi> #link https://etherpad.openstack.org/p/infra-liberty-summit-planning
19:05:56 <fungi> jeblair said he plans to organize/finalize what's there at the end of the week or early next
19:06:01 <fungi> so time is of the essence
19:06:24 <fungi> #topic Actions from last meeting
19:06:38 <fungi> #link http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-04-21-19.01.html
19:06:52 <fungi> #action fungi check our cinder quota in rax-dfw
19:06:57 <krtaylor> o/
19:07:02 <fungi> i have not done that, so popping it back into the list
19:07:16 <fungi> #topic Priority Specs
19:07:34 <fungi> skipping these--we'll have plenty of new ones to start putting on the list once we summit
19:07:43 <fungi> #topic Priority Efforts
19:07:54 <fungi> i'll try to get through these quickly...
19:08:01 <fungi> #topic Priority Efforts (Swift logs)
19:08:22 <fungi> so... we're doing more of these i gather...
19:08:28 <fungi> saw some bugs recentlyish?
19:08:37 <mordred> something about swift not being reachable
19:08:49 <jhesketh> we still need to add support to os-loganalyze to serve non-log files
19:08:55 <jhesketh> we tried this but had to revert
19:08:56 <fungi> there were those jobs clarkb spotted yesterday hung for hours after (or while) trying to upload logs
19:09:10 <jhesketh> so we will probably need to increase our testing before continuing
19:09:14 <asselin_> o/
19:09:31 <fungi> jhesketh: did you get the paste i provided of the traceback for the non-log serving problem?
19:09:43 <clarkb> fungi: I haven't seen that, any chance you can link it here?
19:09:44 <jhesketh> fungi: I think they were due to network issues
19:09:50 <fungi> if not, i can likely dig the paste url out of my irc history
19:09:51 <jhesketh> fungi: ditto, I haven't seen that
19:10:07 <fungi> finding
19:10:12 <jhesketh> the upload stuff has been hardened a little last night to retry a few times before failing
19:10:27 <jhesketh> https://review.openstack.org/#/c/178199/
19:12:49 <clarkb> I will review that after the meeting
19:12:51 <fungi> i'm not having a ton of luck tracking down the paste, so probably to do it after the meeting
19:12:51 <jhesketh> #link https://review.openstack.org/#/c/178199/
19:13:01 <fungi> er, better to
19:13:09 <jhesketh> okay
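The hardening jhesketh mentions — retrying the swift upload a few times before failing — might be sketched roughly like this. This is illustrative only, not the code in the linked review; `upload_fn` and the attempt/delay values are hypothetical:

```python
import time

def upload_with_retries(upload_fn, attempts=3, delay=5):
    """Retry a (hypothetical) upload callable before giving up.

    attempts and delay are illustrative defaults, not what the
    actual review uses.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return upload_fn()
        except Exception as err:  # in practice you'd catch network errors only
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error
```

The point is that transient network blips to swift no longer fail the log upload outright, while persistent failures still surface after the final attempt.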
19:13:26 <fungi> i'll need to triangulate based on timing of the revert and associated conversation in channel
19:13:35 <jhesketh> so we can probably switch over some more jobs with simple logs but some of them will depend on fixing static file serving
19:13:48 <jhesketh> otherwise it's progressing well and not much new to report (afaik)
19:14:31 <fungi> okay, cool
19:14:51 <fungi> #topic Priority Efforts (Nodepool DIB)
19:15:17 <greghaynes> all the required DIB changes are merged
19:15:29 <greghaynes> there's one last fix that i'd like to merge before cutting a release
19:15:41 <greghaynes> and it's just waiting on CI
19:16:03 <fungi> on the bindep front, i got back to manual testing to make sure it's doing what we want before tagging. turned up a minor parsing bug which is easy to fix but i'm struggling figuring out how to add a regression test. should have that up soon after the meeting
19:16:07 <greghaynes> There was talk about us being able to test consolidated images once the dib release is cut?
19:16:33 <clarkb> consolidated images?
19:16:41 <mordred> clarkb: one-image-to-rule-them-all
19:16:44 <greghaynes> the image that boots in all the places
19:16:55 <mordred> well, there are two aspects
19:17:07 <clarkb> we need nodepool shade + image uploads which is merge conflicted iirc
19:17:10 <mordred> one is "boot image of type X in all the places"
19:17:18 <clarkb> also not sure if my -1 there was ever addressed
19:17:25 <mordred> one is "stop having multiple types"
19:17:43 <greghaynes> Yes, it might be nice to switch over to the new dib images (which use *-minimal) in hp first?
19:17:45 <mordred> I have not gone back to the nodepool patches because we've been landing more testing for things in shade
19:18:05 <mordred> I'll re-address the nodepool patches on the plane this week
19:18:36 <mordred> greghaynes: I think we actually want to add a new image type for both clouds first - then run tests on it via experimental
19:18:48 <greghaynes> sgtm
19:18:58 <greghaynes> I can play with that
19:19:00 <mordred> since there are some differences in image content and we want to make sure those don't break tests
19:19:01 <mordred> cool
19:19:09 <fungi> oh, still need centos-6 nodes
19:19:17 <fungi> #link https://review.openstack.org/171286
19:19:18 <mordred> fungi: we support those too
19:19:31 <yolanda> mordred, which kind of tests are you talking about?
19:19:33 <fungi> i mean my review to add centos-6 nodes
19:19:35 <clarkb> oh it lost my +2
19:19:55 <fungi> yeah, i had to rebase because it started to merge-conflict on the log config
19:19:57 <mordred> yolanda: we need to run devstack-gate on the new nodes
19:20:05 <fungi> i've still got some more manual bindep vetting to do, so might turn up a few other minor bugs. also i'll probably add a trivial --version option before releasing. should have the remaining reviews up today sometime
19:20:08 <yolanda> mordred, then it's quite related to my spec
19:20:18 <mordred> yolanda: to make sure that the switch to the ubuntu-minimal doesn't kill things
19:20:35 <mordred> yolanda: yes- but this time I think we want to do it by hand
19:20:45 <mordred> but it would be covered by your spec in the future for sure
19:20:46 <yolanda> mordred, how can this fit there? https://review.openstack.org/139598
19:20:46 <clarkb> well and I am not sure we would make use of that spec upstream
19:21:01 <clarkb> I know jeblair is largely against that type of testing unless his opinion has changed
19:21:12 <mordred> yah - let me rephrase
19:21:14 <clarkb> (he doesn't want us to go without image updates due to an automated process)
19:21:20 <mordred> I don't want to conflate the two
19:21:33 <mordred> right now, we're tantalizingly close to getting dib-nodepool landed
19:21:33 <fungi> right, his concerns are validated by the fact that we used to have this
19:21:40 <fungi> and it was terrible
19:21:45 <clarkb> agreed they are separate concerns, I think anytime you bring on a new image you need to vet it directly
19:22:26 <fungi> we ran devstack smoke tests on new images before marking them available, and nondeterministic bugs in the tests meant that we often failed to update images
19:22:37 <yolanda> clarkb, so you prefer validating an image manually all days prior to using that?
19:22:38 <mordred> yah - and even though these are replacements, they are _new_ images
19:22:52 <clarkb> yolanda: when it is a new image yes
19:23:02 <yolanda> a new image, means new distro?
19:23:10 <clarkb> yolanda: new build process
19:23:20 <yolanda> ok, it's different scope then
19:23:22 <clarkb> could be new distro but in this case its changing how we build the image
19:23:28 <yolanda> agree
19:24:00 <yolanda> so i need to do something like that, but in a daily way, and it's for downstream issues
19:24:18 <mordred> yah. and in the downstream case the automated testing daily makes total sense
19:24:30 <yolanda> yep, mirrors updating, packages, etc
19:24:53 <fungi> maybe? i wouldn't be surprised if false negatives in those tests regularly caused you to end up with stale images
19:25:21 <yolanda> fungi, i wanted to add some retry to discard false positives as well
19:25:25 <yolanda> but it can happen, yes
19:25:35 <mordred> fungi: possibly - but in this case, downstream is also the vendor for the os images
19:25:44 <mordred> fungi: so actually trapping all of the things is important
19:26:02 <fungi> i suppose simple declarative tests might work, like "here is the list of packages i require to be in an installed state"
19:26:06 <mordred> like the reason jeblair does not like this for upstream is the exact same reason it's a good idea for them downstream
19:26:18 <greghaynes> For those kinds of tests theres some new DIB test functionality we should use
19:26:21 <greghaynes> which we can test per-commit
19:26:23 <fungi> as opposed to functional tests like "do nova unit tests complete successfully on this node"
19:26:25 <mordred> greghaynes: woot
19:26:30 <clarkb> fungi: ++
19:26:38 <clarkb> fungi: similar to how we use the ready script
19:26:47 <mordred> so - I'd like to suggest that now is probably not the right time to design downstream image testing
19:26:47 <clarkb> fungi: eg can I resolve a name in dns
19:26:51 <yolanda> greghaynes, can i know more about it? maybe you can explain it to me later
19:26:55 <greghaynes> yolanda: yep
19:27:06 <yolanda> thx
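The kind of declarative, ready-script-style checks fungi and clarkb describe — "here is the list of packages i require to be in an installed state" plus a DNS sanity check — might look like this minimal sketch. The function names and the idea of passing in an already-parsed installed-package list are hypothetical, not an existing infra tool:

```python
import socket

def missing_packages(required, installed):
    """Declarative check: which required packages are absent.

    `installed` would come from parsing package-manager output
    (e.g. `dpkg -l`) on the node being validated.
    """
    return sorted(set(required) - set(installed))

def can_resolve(name):
    """Functional sanity check: can this node resolve a DNS name?"""
    try:
        socket.gethostbyname(name)
        return True
    except socket.gaierror:
        return False
```

Checks like these are cheap and largely deterministic, in contrast to full devstack smoke tests whose nondeterministic failures used to block image updates.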
19:27:55 <fungi> so... anything else nodepool+dib related?
19:28:00 <clarkb> just one last thing
19:28:08 <clarkb> I would like to see us try to do one thing at a time, not 5 at a time
19:28:18 <clarkb> so if we can draft up a priority list that would be good
19:28:22 <clarkb> otherwise debugging is going to be unfun
19:28:33 <mordred> clarkb: I believe I agree with you, but can you be more specific?
19:28:41 <fungi> at least in places where the work intersects
19:28:43 <clarkb> bindep -> nodepool shade uploads -> ubuntu-minimal for example
19:28:59 <mordred> right. totally agree
19:29:00 <fungi> some of that involves non-impacting prep work which can happen in parallel
19:29:28 <fungi> but, like, actually putting bindep into production use would be something we'd need to coordinate against other potentially impacting changes
19:29:42 <mordred> clarkb: let's make a priority list / sequence
19:29:51 <mordred> clarkb: I've got one in my head - but it's possible you do not share my brain
19:30:05 <fungi> or starting to use shade to upload would be a potentially impacting thing we'd need to coordinate
19:30:10 <clarkb> mordred: that would be a neat trick, we don't need a list now but ya let's do that
19:30:17 <mordred> ++
19:30:17 <SpamapS> minimizing the change sounds like a good idea
19:30:26 <mordred> biggest question I think ...
19:30:31 <SpamapS> In theory, storyboard is good for this?
19:30:57 <mordred> is going to be "do we do ubuntu-minimal to hpcloud with current nodepool, then shade uploads to hp, then add dib uploads to rax"
19:31:27 <mordred> or - "do we do shade uploads with current ubuntu image, then ubuntu-minimal to hpcloud, then add dib uploads to rax"
19:31:46 <mordred> but those three are a sequence
19:31:57 <mordred> and doing 2 of them at once will be too complex
19:32:07 <clarkb> IMO shade uploads first because using one image across rax and hpcloud is far more important than ubuntu-minimal
19:32:18 <mordred> we can't do one image without ubuntu-minimal
19:32:25 <fungi> as far as the nodepool+dib plan goes, bindep is mainly just the bit which will allow us to not have to care about dib'ing the bare-.* workers. it can come last-last
19:32:41 <mordred> the entire thing behind ubuntu-minimal is making an image that we can boot in rax
19:33:05 <mordred> it's the reason rax is at the end of the list both times - but there are two changes to the plumbing that we can do and test before adding rax to the mix
19:33:38 <SpamapS> Could we just do ubuntu-minimal, without any other changes?
19:33:40 <mordred> fungi: ++
19:33:47 <greghaynes> *-minimal, yes we could
19:33:48 <mordred> SpamapS: yes. that is a thing we can do
19:34:03 <greghaynes> which I am very much a fan of :) get something totally "done"
19:34:24 <mordred> (it's possible we did not succeed with "don't make the list now")
19:34:35 <fungi> er, yeah
19:34:46 <SpamapS> So, if we look at it as an airport where we have only one runway, and we need to land, and give things time to taxi/clear ... ubuntu-minimal.. shade uploads.. bindep?
19:35:00 <mordred> have rax as thing 3
19:35:11 <mordred> ubuntu-minimal ... shade uploads ... uploads to rax ... bindep
19:35:17 <SpamapS> ahhh right
19:35:22 <mordred> that way we're just testing that shade doesn't break HP in that step
19:35:23 <SpamapS> I keep forgetting those are two different things. :)
19:35:30 <mordred> one facilitates the other
19:35:32 <fungi> where "bindep" is probably better phrased as "drop bare-.* workers"
19:35:36 <mordred> fungi: ++
19:35:47 <mordred> ubuntu-minimal ... shade uploads ... uploads to rax ... drop bare-* workers
19:35:53 <SpamapS> ok, somebody write that down
19:36:03 <fungi> bindep is one of the things we're doing to get us to the point where we won't be needing/using those
19:36:17 <mordred> yup. but there are other things too
19:36:35 <mordred> the first set is "stop having hp and rax have different images for devstack nodes"
19:36:45 <mordred> the second is "stop having non-devstack nodes"
19:37:18 <mordred> oh - while we're on the topic - if everyone hasn't added openstack-infra/glean to their watch list already
19:37:20 <mordred> please do
19:37:25 <mordred> I expect zero new patches to it
19:37:31 <mordred> but bugs happen
19:37:46 <fungi> implementation order: switch to ubuntu-minimal, then do shade uploads, then add uploads to rax, finally drop bare-.* workers
19:37:52 <fungi> is that something we're agreeing on?
19:37:56 <mordred> I agree.
19:37:58 <mordred> clarkb?
19:38:22 <clarkb> I mostly agree, still not a fan of diving into the replace-cloud-init-full-stop plan
19:38:24 <clarkb> never have been
19:38:29 <mordred> to be clear - that's "switch to ubuntu-minimal for the devstack nodes currently built by dib on hp"
19:38:38 <mordred> clarkb: sure. we just literally don't have another option on the table
19:38:42 <clarkb> I think our goal was to make image work in rax and we forgot that a while back and made this far more complicated than it needs to be
19:38:51 <clarkb> mordred: we can use nova-agent, we can use a patched cloud-init
19:39:06 <mordred> clarkb: yeah. we can go back to the drawing board now that we have all the work finished
19:39:13 <clarkb> I'm not saying that
19:39:18 <clarkb> I am merely saying my stance has not changed
19:39:22 <mordred> fair
19:39:22 <clarkb> so asking me what I think is a no-op
19:40:01 <clarkb> minimal images are fine
19:40:09 <clarkb> but I think they are being used as a proxy in the larger more complicated thing
19:40:22 <clarkb> which concerns me because that means there are many ways they can break
19:40:29 <clarkb> but the only way to find out is to use them
19:40:45 <clarkb> so I won't stand in the way of that
19:40:46 <mordred> yah. although that's also my concern with adding nova-agent to nodes in hpcloud
19:41:03 <mordred> clarkb: have I mentioned I hate this particular problem?
19:41:09 <fungi> and then replace them with a sanely-operating cloud-init later if we can confirm that it operates sanely
19:41:21 <fungi> but i'm fine with using the solution we have
19:41:31 <fungi> until it's proven not to be a good solution
19:41:59 <fungi> #agree implementation order: switch to ubuntu-minimal, then start using shade to upload, then add rackspace upload support, finally drop bare-.* workers
19:42:12 <mordred> fungi: (I think it's #agreed)
19:42:14 <mordred> ?
19:42:18 <mordred> or no, I'm wrong
19:42:21 * mordred shuts up
19:42:21 <fungi> i agree ;)
19:42:31 <clarkb> the meeting header says #agreed
19:42:50 <fungi> #agreed implementation order: switch to ubuntu-minimal, then start using shade to upload, then add rackspace upload support, finally drop bare-.* workers
19:43:01 <fungi> hopefully that took
19:43:07 <mordred> for the record "switch to ubuntu-minimal" == "create a new image type based on ubuntu-minimal, verify tests work on that, then switch things to use it"
19:43:21 <clarkb> yup
19:43:21 <mordred> it does not mean "switch the current devstack-dib nodes to use ubuntu-minimal"
19:43:34 <fungi> #topic Priority Efforts (Migration to Zanata)
19:43:38 <pleia2> o/
19:43:43 <pleia2> so things are moving along steadily
19:44:12 <pleia2> cinerama and I have submitted several bugs upstream (see line 165 of https://etherpad.openstack.org/p/zanata-install) and I've forwarded them along to carlos at redhat/zanata, where they're working to prioritize them
19:44:28 <pleia2> and carlos will be at the summit
19:44:33 <mordred> yay we're helpful!
19:44:57 <cinerama> yup. currently we're at the point with the client work where we're adapting the existing proposal scripts and working around some missing features in the zanata client like percentage completion
19:45:23 <clarkb> I want to point out that zanata does string formatting parsing and can error/warn if translators change the expected formatting operators
19:45:28 <clarkb> which I think is the coolest feature ever
19:45:42 <cinerama> i will be proposing a change for that and everyone can jump in with suggestions etc & things i've overlooked
19:45:48 <pleia2> I've been chatting with daisy about whether we want the translations session in i18n track or infra, no decision yet
19:45:57 <pleia2> er, translations tools
19:46:03 <pleia2> I put some details in our summit etherpad
19:46:09 <mordred> clarkb: dude. that's awesomes
19:46:29 <mordred> does it support new-style python {} formats?
19:46:38 <clarkb> mordred: I didn't check that, does support old style
19:46:43 <mordred> cool
19:46:49 <fungi> that is pretty amazing
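A simplified sketch of the validation clarkb describes — checking that translators preserve the format specifiers from the source string. This is not Zanata's actual implementation, just an illustration of the idea for old-style `%` formatting; it compares the multiset of specifiers, so it would not catch reordered positional arguments:

```python
import re

# Matches old-style Python format specifiers: %s, %d, %(name)s, etc.
FORMAT_RE = re.compile(
    r'%(?:\([^)]*\))?[#0\- +]*(?:\d+)?(?:\.\d+)?[diouxXeEfFgGcrs%]')

def placeholders(text):
    """Return the sorted list of format specifiers in a string."""
    return sorted(FORMAT_RE.findall(text))

def translation_ok(source, translation):
    """True if the translation preserves the source's format specifiers."""
    return placeholders(source) == placeholders(translation)
```

A translation tool can run a check like this per string and flag (or reject) translations that drop or mangle a specifier, which would otherwise crash at runtime when the string is interpolated.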
19:47:41 <pleia2> in spite of java ;) I think we did go with the right tool in zanata, it's got some great features and with upstream willing to add/adjust features for us, things are going well
19:48:14 <mordred> that's excellent to hear
19:48:21 <mordred> it's always great to have a supportive upstream
19:48:47 * zaro is envious
19:49:06 * mordred hands zaro a pie
19:49:52 <pleia2> that's all from us I think :)
19:50:46 <fungi> yay! more progress!
19:51:06 <fungi> #topic Priority Efforts (Downstream Puppet)
19:51:14 <fungi> anything new here since last week?
19:51:25 <asselin_> we've got a few people joining the effort
19:51:27 <yolanda> some patches still pending reviews from my side
19:51:31 <asselin_> and some new patches
19:51:40 <yolanda> i didn't have time to work on anything new
19:51:42 <pleia2> we have a lot of reviews to do
19:51:49 <fungi> indeed
19:51:52 <yolanda> please :)
19:51:56 <pleia2> I'll make that a priority this week
19:52:01 <fungi> release season always leaves us swamped
19:52:06 <yolanda> pleia2, thx
19:52:11 <fungi> i'm hoping next week will start to improve again
19:52:20 * pleia2 nods
19:52:26 <yolanda> i saw asselin's topic for the summit, i'd love this one
19:52:37 <yolanda> if it takes place i'll be joining the effort
19:52:40 <nibalizer> quite a bit of reviews in the pipeline
19:52:47 <asselin_> yolanda, thanks
19:52:55 <mordred> I'll go review things there too
19:52:57 <nibalizer> this is my #1 https://review.openstack.org/#/c/171359/
19:53:16 * asselin_ hopes we get through much of the review before the summit and deal with any challenges there
19:53:18 <yolanda> asselin, so i still have my concerns about the way these things are tested, but we can iterate later
19:53:20 <pleia2> nibalizer: good to know
19:53:20 <mordred> nibalizer: is the in-tree-hiera patch part of this?
19:53:32 <pabelanger> I plan to start back up on the pipeline this week, and review stuff
19:53:53 <nibalizer> mordred: i think in tree hiera can help, but we're not sure exactly what that looks like yet
19:54:01 <asselin_> yolanda, we should meet offline to discuss that
19:54:18 <nibalizer> simplifying o_p::server makes it easier to separate downstream consumable from upstream immutable
19:54:22 <yolanda> asselin, we need to talk in Vancouver :)
19:54:33 <mordred> nibalizer: totally. I was just wondering how much we should push on the hiera config change patch
19:54:44 <yolanda> nibalizer, agree, i've been focusing my efforts on that, isolate functionality, move to modules
19:54:47 <nibalizer> mordred: :shrug:
19:54:58 <clarkb> nibalizer: I will be honest I have avoided that change because the previous one related to it broke so badly
19:55:02 <nibalizer> asselin_: yea I'd like to see more code in the modules than in openstackci
19:55:12 <nibalizer> clarkb: understandable
19:55:14 <clarkb> nibalizer: and I don't feel I have the time to commit to unbreaking the entire world should something go bad again
19:55:51 <nibalizer> clarkb: yea... not sure what the happy path forward on that is
19:55:58 <clarkb> nibalizer: I think smaller changes
19:56:02 <clarkb> nibalizer: don't move everything all at once
19:56:03 <asselin_> nibalizer, agree...I'd like to baby step in that direction
19:56:07 <fungi> at a minimum, we shouldn't land it before kilo release day ;)
19:56:13 <nibalizer> clarkb: i didn't move everything all at once
19:56:24 <clarkb> nibalizer: its >700 lines changed
19:56:31 <nibalizer> which change are you talking about?
19:56:37 <clarkb> nibalizer: 171359
19:56:42 <clarkb> your #1 change
19:56:48 <nibalizer> clarkb: no, that's git being confused
19:56:54 <nibalizer> there is one file that is 300 lines
19:56:58 <nibalizer> and another file that is 10 lines
19:56:58 <clarkb> yes I understand
19:57:01 <nibalizer> and they are becoming the same file
19:57:04 <nibalizer> with no inline changes
19:57:31 <clarkb> but every single node is affected
19:57:41 <clarkb> and the last time we did this was basically the same scenario
19:58:04 <nibalizer> clarkb: well so my answer to that is that im a little frustrated
19:58:09 <nibalizer> we agreed in spec to do these refactors
19:58:14 <clarkb> I am not saying don't do the refactor
19:58:15 <nibalizer> then the changes sit for a while
19:58:22 <nibalizer> and things that conflict get merged in
19:58:22 <pabelanger> sounds like you need to stage the servers for the change, vs all of them at once
19:58:23 <clarkb> I am saying let's break it up and do it piece by piece
19:58:29 <clarkb> which will also avoid conflicts
19:58:47 <clarkb> pabelanger: unfortunately puppet doesn't really grok that
19:58:55 <nibalizer> clarkb: lets talk about this out of the meeting
19:59:08 <mordred> yeah - TC is one minute away
19:59:08 <clarkb> nibalizer: sure
19:59:09 <fungi> well, we could disable puppet agent everywhere and then try bringing the change in machine by machine manually
19:59:09 <nibalizer> we can find a way that works to go forward
19:59:12 <pabelanger> clarkb, ya, trying to think of a way to do that
19:59:14 <fungi> but that's painful
19:59:41 <fungi> okay, thanks everyone. we can try to invert the priority topics list next week to get to what we missed this time
20:00:01 <fungi> #endmeeting