19:02:23 #startmeeting infra
19:02:24 Meeting started Tue May 26 19:02:23 2015 UTC and is due to finish in 60 minutes. The chair is jeblair. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:27 #link agenda https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:28 #link previous meeting http://eavesdrop.openstack.org/meetings/infra/2015/infra.2015-05-05-19.03.html
19:02:28 The meeting name has been set to 'infra'
19:02:28 o/
19:02:34 #topic Actions from last meeting
19:02:35 fungi check our cinder quota in rax-dfw
19:02:39 o/
19:02:55 i love that we're task tracking that here ;)
19:03:00 o/
19:03:07 ohai!
19:03:14 \o/
19:03:15 trying that right now ;)
19:03:25 cool, we'll tune back in later :)
19:03:31 #topic Infrastructure project process changes (jeblair)
19:03:32 #link https://review.openstack.org/182811
19:03:37 ls
19:03:41 whoops
19:03:43 sorry
19:03:57 o/
19:04:06 jeblair: ooh, I like the sound of infra council
19:04:10 beagles, that happens :)
19:04:13 do you get hooded robes?
19:04:15 i inserted this topic before the usual bits in the meeting because i think normally i would expect us to start talking about priority efforts, etc, now...
19:04:37 mtreinish: yeah, we've just about run out of names for groups of people around here :)
19:04:46 mtreinish: hoodies?
19:05:01 jeblair: I think we need robes
19:05:15 24264/51200gb sata, 1124/25600gb ssd
19:05:38 * fungi has no idea why cinderclient insists on a redundant tenant id for quota-usage
19:05:49 jeblair: heh, hoodies aren't quite as cool
19:05:49 anyway, what i'd actually like to do is spend the next week working on getting all of that in place and hopefully running a spec through that process by next week
19:05:54 i updated that change this morning in a way that i think addresses all the comments
19:06:08 fungi: w00t, thx!
19:06:39 so if folks could take a look at https://review.openstack.org/182811 today, that would be great
19:07:03 and in general, does that sound like a good plan to proceed?
19:07:08 jeblair, i like the updates
19:07:30 wfm
19:07:34 did I miss the discussion where we decided we needed a council?
19:07:46 or is this it?
19:07:51 Indeed I think it addresses most of my concern which was the broadness of focus that it seemed was required to make progress toward infra core.
19:07:52 anteaya: 182811 is the discussion
19:08:54 I had a chance to read it over the weekend, seems like a sane approach
19:08:55 I have opened the change and it will be next up in my review queue
19:09:20 it definitely feels like it formalizes a lot of what we already have in place informally (for current situations like jjb, project-config, devstack-gate, et cetera)
19:09:28 jeblair: great updates. +2
19:09:32 fungi: that's how I felt upon reading it too
19:09:56 and empowers those groups to give them more say in overall direction of the infra project
19:09:58 so i'll spend the week working on the acl changes needed for that, and working on the specs repo
19:10:07 woot
19:10:43 i also have this up, which is marginally related
19:10:45 #link https://review.openstack.org/183337
19:12:47 anything else on this subject?
19:13:06 jeblair: super excited about both changes
19:13:27 thx
19:13:33 #topic Priority Efforts (Upgrading Gerrit)
19:14:04 o/
19:14:05 upstream thinks they have narrowed down the jgit problem
19:14:08 I heard that someone may have found a possible cause for this?
19:14:09 i think there's a test case we want to test out. upload large files with lots of changes?
19:14:17 #link https://git.eclipse.org/r/48288
19:14:21 zaro: and shorten the diff timeout
19:14:27 yeah, i stuck this on here because i had an idea at the summit, which i communicated to zaro but wanted to discuss here
19:14:28 that seems to be the current change to watch which supposedly addresses this
19:14:37 all things that can be done on review-dev it sounds like
19:14:46 tl;dr is if you get a diff timeout while processing a packfile jgit will treat that as being a corrupted pack and remove it from the list
19:14:54 which is not good
19:14:59 which was, that if it is triggered by the diff timeout, we may be able to trigger on review-dev (without moving production data) just by uploading some changes with huge diffs
19:15:02 nibalizer: exactly
19:15:10 i was wondering if we should just get a copy of review data instead?
19:15:11 yeah, all i/o exceptions are not created equal it seems
19:15:48 jeblair: ++ but also shorten the timeout
19:15:52 zaro: our current plan was to sync review -> review-dev; but i'm suggesting we might be able to reproduce without doing that and it might be simpler. i'm okay with either approach.
19:15:57 jeblair: zaro but that should be very easy to prove on review-dev
19:15:57 clarkb: good idea
19:16:28 ok. i was just thinking that we might need to do that anyway
19:16:44 i mean run test against review data
19:16:49 zaro: yeah, but maybe keep working on that in the background after this?
19:17:11 sure. i said i would test this but just haven't gotten around to it yet.
19:17:14 ahh, right, drop cache.diff.timeout to something like 1s or lower
19:17:24 will start it today.
19:17:27 defaults to 5s according to the docs
19:17:55 yup and the logs say we hit it at 5s on review.o.o
19:18:02 zaro: cool, no problem. it would be crazy if you had found time to do it since the summit. :) i just wanted to discuss it here so that it wasn't just something we talked about over lunch that one time. :)
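The subsecond diff-timeout experiment described above would be a small gerrit.config change on review-dev. A sketch, assuming the stock gerrit.config cache syntax; the 500 ms value is illustrative (Gerrit time settings accept units down to milliseconds, with this one defaulting to 5 seconds):

```ini
# /home/gerrit2/review_site/etc/gerrit.config (path illustrative)
[cache "diff"]
  # Drop from the 5 s default so large diffs reliably time out,
  # letting us try to reproduce the jgit "corrupted pack" behavior.
  timeout = 500 ms
```

Gerrit would need a restart for the new timeout to take effect.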
19:18:07 it's set in milliseconds iirc so easy to do subsecond values
19:18:10 so that's suspiciously coincidental
19:18:33 yeah, seems to support subsecond values
19:19:00 jeblair: agreed, good to discuss.
19:19:34 #action zaro configure review-dev for subsecond diff timeouts and test that "large" diffs trigger the jgit error
19:20:09 anything else regarding priority efforts we should discuss?
19:20:50 no new blockers i'm aware of
19:21:37 mostly trying to figure out where i left things before the conference amnesia set in
19:21:39 maybe the swift uploads?
19:21:43 I think I discussed everything out at summit, so I'm good
19:21:47 we are still not passing through non log data cleanly
19:22:03 and it seems like every time we try to tackle that there is a suggestion to do more and more unrelated work
19:22:15 i think there was discussion at summit to stand up a phabricator. new priority effort?
19:22:20 it's not bad work, it just doesn't get us closer to the goal of hosting data that isn't logs in swift
19:22:27 zaro: would probably be a new spec
19:22:35 * mordred has spec for that on his TDL
19:22:36 o/
19:22:38 #topic Priority Efforts (Swift logs)
19:23:02 clarkb: yeah, i spoke with jhesketh about that -- i think he's writing that change?
19:23:02 quick update is: we need a way to have os-loganalyze pass through data in swift somehow
19:23:09 jeblair: he is
19:23:31 but what has happened is we went from that to saying we should have os-loganalyze be a devstack plugin so we can run integration tests with swift against changes to it
19:23:51 which has led to "let's fix devstack's handling of requirements so that os-loganalyze can have requirements not in global reqs"
19:24:14 that sounds like a swell idea, but i'm not sure we need to block on that. so let's have a chat with jhesketh when he's around and see if we can't separate those two efforts.
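The pass-through idea under discussion could be as simple as dispatching on the request path: anything that doesn't ask for log analysis gets handed through from swift untouched. A minimal sketch, assuming os-loganalyze is a WSGI app; the class name and the prefix check are illustrative, not the actual os-loganalyze code:

```python
# Hypothetical WSGI dispatcher: only "analyze" paths get the log
# treatment, everything else is served raw. Names are illustrative.

def should_htmlify(path):
    """Return True for paths that want log markup applied."""
    return path.startswith('/htmlify/')

class PassthroughFilter:
    def __init__(self, htmlify_app, raw_app):
        # htmlify_app: the existing log-markup WSGI app
        # raw_app: a trivial app that streams the swift object as-is
        self.htmlify_app = htmlify_app
        self.raw_app = raw_app

    def __call__(self, environ, start_response):
        if should_htmlify(environ.get('PATH_INFO', '')):
            return self.htmlify_app(environ, start_response)
        return self.raw_app(environ, start_response)
```

Because the decision is made before any content processing, non-log artifacts (tarballs, images) would never pass through the markup code path at all.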
19:24:41 right, I think the work is valuable, it's just not helping us solve the problem of hosting files in swift
19:24:46 clarkb: there should be a call in devstack to install from pip without g-r
19:25:00 mtreinish: sdague made it sound like there wasn't one that would work, but maybe there is
19:25:10 yeah, i think it cropped up because the last time that change was attempted it broke serving logs, so "get some testing" was a bit of a knee-jerk reaction to not breaking again
19:25:42 clarkb: hmm ok, I can take a look at it in a bit
19:25:45 fungi: and that would be great if it wasn't snowballing into rewriting half the infrastructure
19:26:09 i think sdague is away this week?
19:26:14 yes. he's out
19:26:14 keep in mind that jhesketh is in aus so it’s currently 5:26am for him
19:26:21 tchaypo: and for you too
19:26:32 tchaypo: I'm guessing we can blame jetlag for you being awake? :)
19:26:39 yes, but not everyone is an early bird like me
19:26:45 heh
19:26:46 no, I usually get up about this time anyway
19:26:55 but I think there is a trivial change we can make to os-loganalyze which is: if the root url is not htmlify/ then pass data through untouched
19:27:20 clarkb: sounds pretty trivial
19:27:22 and hopefully that is not controversial and we can just make that change to os-loganalyze with the testing that we do have today (which we do have)
19:27:44 clarkb: cool, let's discuss that with jhesketh and mtreinish later, and do that if we can
19:27:52 i take it the earlier attempt which got reverted was nontrivial and so had unanticipated side-effects
19:28:08 fungi: yes, it was much more "correct" at the expense of being more complicated
19:29:03 jeblair: sounds good
19:29:19 #topic Open discussion
19:29:37 the rename requests are starting to pile up
19:29:47 mordred: so... "action mordred write a spec to move infra bugs to maniphest" ?
19:29:58 yah.
I have several spec writing tasks for this week
19:30:01 that's one of them
19:30:02 fungi: oh, yeah, should put scheduling those on the agenda
19:30:08 #action mordred write a spec to move infra bugs to maniphest
19:30:31 fungi: I think we should finish up my "rename projects" playbook before we do any sets that involve the puppet or chef repos
19:30:52 mordred: great idea
19:30:55 i just saw a governance change that looks like ironic-discoverd may also want to change its name
19:30:55 Should we add infra-cloud as a priority effort to the agenda?
19:31:09 fungi: maybe I'll grab you east coast morning and we can figure out how to test it
19:31:17 SpamapS: i'll write a spec for infra cloud this week
19:31:25 jeblair: splendid
19:31:32 I thought the documentation was the spec?
19:31:34 not sure how i feel about the stackforge/ironic-discoverd -> openstack/ironic-discoverd rename when https://review.openstack.org/185442 is also proposed
19:31:45 mordred: should be possible to use review-dev et al as a test bed
19:31:49 seems like we should maybe avoid renaming that twice in a short timeframe
19:31:52 SpamapS: since the technical bits are happening over in system-config, i believe it will just describe the process and should be simpler
19:31:54 anteaya: ^
19:32:05 I mentioned this, but wrt infra-cloud docs, the tense on https://review.openstack.org/#/c/180796/ is getting confusing
19:32:22 maybe we can even this all out while writing the spec
19:32:23 * mordred is also going to write a spec on shade - because it needs one
19:32:27 pleia2: ++
19:32:31 something like "we will run a cloud, it will do this, these are the major steps that will happen, etc"
19:32:34 pleia2: I think it's because it's part docs, part spec right now
19:32:38 mordred: yeah
19:32:45 pleia2: and should be all docs
19:32:45 pleia2: It is supposed to be present tense, but a spec would definitely make it easier to write in future tense.
19:32:53 SpamapS: nods
19:32:56 yeah, i'll look at that too and try to help sort it out
19:33:16 also specs tend to read more like logical tests than docs. "Infra-cloud shall xxx" can apply to the future or the present.
19:33:44 I feel an infra spec spec coming on ...
19:33:46 #action jeblair write infra-cloud spec with SpamapS
19:34:06 jeblair: while we're on that topic
19:34:39 we started poking at getting full inventories for things - there are scripts to do this - but I'm not sure where/if they should live
19:34:49 fungi, mordred: so let's kick rename discussion till next week's meeting, and poke the ironic folks about updating the change for discoverd
19:35:09 mordred: if we ever get a new region, we might use them again, yeah?
19:35:10 at the moment, they're adhoc "please troll the machines for datas" scripts - so I'm thinking doing them live and checking them in later as docs is appropriate?
19:35:13 jeblair: yah
19:35:17 they're useful in general
19:35:27 also what do you do with the resulting data? throw that in a git repo too?
19:35:29 mordred: or stick them in the tools/ dir?
19:35:39 (of system-config)
19:35:46 yah
19:35:49 tools/ dir seems like a good home
19:35:52 or the playbooks dir (they're playbooks)
19:36:14 mordred: one issue is they are now tied to the machine-information.json schema
19:36:28 greghaynes: well, that's still likely useful
19:36:32 not sure how we want to deal with that interface, but it's important that we pull ipmi info from somewhere
19:36:35 greghaynes: as we'll want things to consume that in the future
19:36:39 nibalizer: i imagine the resulting data should either show up in documentation ("this is the hardware we have") or config files ("do these things on this hardware")
19:36:42 If we make the data public we need to exclude things like ipmi passwords
19:36:42 mordred: ^ ?
19:36:48 sure, if we want to say machine-information.json is the schema for that
19:36:50 jeblair: yes
19:36:56 And if we don't we need to store that separately
19:37:02 greghaynes: I think it probably will be - but we'll need a thing that merges passwords from hiera
19:37:28 jeblair: i've updated the project renames section of the agenda with ironic-inspector, and also noted that whoever originally added that item linked the stackforge->openstack governance change for the repo, not a project-config change
19:37:28 the data all wants to be public, we probably want to write a thing that turns it into nice looking sphinx docs tbh - and also ipmi passwords want to go into hiera
19:37:30 however
19:37:39 remember that you can always just call 'hiera keyname' from shell if that makes hacking easier
19:37:42 I don't think we need to deal with hiera ipmi password merge in the first pass
19:37:55 yeah, isn't hiera keyed by hostname too?
19:38:03 greghaynes: not in the way this will be using it
19:38:07 ok
19:38:18 greghaynes: the values will all be associated with the hostname of the bastion host, most likely
19:38:19 greghaynes: you can call 'hiera keyname filter=value filter=value'
19:38:32 "these are the ipmi passwords for the machines that bastion host A wants"
19:38:36 if you need to inspect like that, but infra-hiera is amazingly flat++
19:38:52 yea, so then key them by mac like we are now?
19:38:55 mordred: you could use a hash in there to line up with the hosts
19:39:07 and then the public data (the machine list) will get merged with that hiera data and the results will be puppet writing a machine-information.json file to disk on each bastion host
19:39:19 it's possible this is not the right forum to design this
19:39:21 mordred: so bastion: { ipmi: { host1: passwd, host2: passwd } }
19:39:29 clarkb: yes.
that's exactly right
19:39:46 Asking about that because we don't have permanent hostnames for these yet
19:39:50 I mainly brought it up to point out that there may be some useful scripts being hacked live at the moment that want to end up in system-config
19:40:09 we can move to #infra though
19:40:10 greghaynes: right. that's why I think hiera merging is premature currently - these aren't REALLY part of the world yet
19:40:13 probably want the serial of the box as the key for 'host1/host2' there, as that is the thing that will remain constant, but yeah, not a thing to meeting-ize.
19:41:24 jeblair: just arriving. Did you mention irc-meetings?
19:41:53 * jeblair has the irc lag
19:42:13 ttx: no, but i have been working on cleaning that up
19:42:24 ttx: we didn't quite get everything right, but it's almost there :)
19:42:53 ttx: i manually fixed the current state on eavesdrop, should be okay to declare it in production and start publicising it
19:43:19 Cool. I'll sync with tonyb and make sure he has all the meetings in
19:43:48 yeah, http://eavesdrop.openstack.org/ is looking usable now
19:44:03 oh, needed a force-reload
19:44:20 nice job everyone
19:44:28 fungi, nice!
19:44:37 ++
19:44:40 well "nice" is not the way to describe that current page
19:44:52 I bet it renders great in w3m
19:44:55 that's worthy of a nice :)
19:44:55 hehe
19:44:57 it's "ugly", hopefully enough for someone with skillz to fix it
19:45:09 HTML 1.0 FTW
19:45:18 it's ugly enough to encourage someone to beautify it very soon, i'm sure
19:45:23 pabelanger: yes, guess who wrote that template
19:45:29 * taron's design senses are tingling
19:45:44 At the very minimum a TOC would be nice :)
19:46:00 could probably use some bootstrap or something of the sort as well
19:46:07 effective workflow: build something intentionally ugly, publicize it, wait for improvements from irritated web designers
19:46:26 pabelanger: timrc any grafana things to talk about?
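Before the topic moved on: the merge step sketched in the hiera discussion above (public machine list plus per-bastion ipmi passwords, keyed by serial, written out as machine-information.json on each bastion host) could look roughly like this. All names and schema details are assumptions for illustration, not the real machine-information.json schema:

```python
# Hypothetical merge of public inventory with private hiera data.
# 'serial' as the join key follows the suggestion above that serials
# are the stable identifier before permanent hostnames exist.
import json

def merge_inventory(machines, ipmi_passwords):
    """machines: public list of dicts, each with a 'serial' key.
    ipmi_passwords: {serial: password} pulled from hiera (private)."""
    merged = []
    for machine in machines:
        entry = dict(machine)  # don't mutate the public data
        password = ipmi_passwords.get(machine['serial'])
        if password is not None:
            entry['ipmi_password'] = password
        merged.append(entry)
    return merged

# Example: what puppet might write to machine-information.json
machines = [{'serial': 'ABC123', 'mac': 'aa:bb:cc:dd:ee:ff'}]
secrets = {'ABC123': 'example-ipmi-password'}
document = json.dumps(merge_inventory(machines, secrets), indent=2)
```

The point of the split is that only the merged file on the bastion host ever contains passwords; the inventory that lands in git or sphinx docs stays password-free.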
19:46:40 free time in the infra meeting is such a treat, jeblair++ for wizardry
19:46:45 Nope. Not on my end, pabelanger is doing all the work :)
19:46:54 Hmm, grafyaml is up in governance right now
19:47:03 so, that is moving forward
19:47:11 all hail the new grafyaml overlords
19:47:15 timrc: excellent, we can profit from his perspiration
19:47:23 eww
19:47:32 i also know that timrc has been kicking the tires on a public grafana module which is sweet
19:47:35 * timrc slowly backs away from nibalizer
19:47:48 pabelanger: though we can add it to project-config before governance rubber-stamps since jeblair has already +1'd
19:48:04 I also have puppet-grafana rolling too: https://review.openstack.org/#/c/179208/
19:48:07 The bfraser-grafana is pretty decent.
19:48:08 but some work left to do on it
19:48:20 fungi, roger
19:48:39 pabelanger, Did you figure out a way to add datasources and organizations programmatically? Was not sure if those things could be added through the API.
19:48:51 If not, that could present a bootstrapping challenge.
19:49:13 timrc, no, not yet. I need to do some sql magic to make them bits work. Right now, it is a race to log in as admin / admin to change the password once puppet launches the node
19:49:35 pabelanger: does the node need to be launched to generate the admin user?
19:49:48 pabelanger, You can control those settings from the puppet module.
19:49:48 pabelanger: like, could the puppet install pre-seed an admin user in the db?
19:50:19 mordred, I need to check if the package supports that, I don't think so right now.
19:50:28 pabelanger, So if you deploy with bfraser-grafana you can create a grafana.conf which allows you to set things like a secret key, admin credentials, default org, etc.
19:50:36 we _could_ add support into the package for that pretty easy with dbconfig
19:50:58 timrc, Interesting.
I just googled a puppet-grafana module, not sure which I am using
19:51:05 Ooh
19:51:07 using the same
19:51:12 you're using the same :)
19:51:12 ya: https://review.openstack.org/#/c/179208/5/modules.env
19:51:20 hi
19:51:23 Ya, so need to just expose the bits for that
19:51:58 pabelanger, https://forge.puppetlabs.com/bfraser/grafana <-- that works pretty well
19:52:31 timrc, Yup, using the same. Just need to see how to provision the admin user from the config file now
19:52:31 But yeah, I _think_ we should add the endpoints to add datasources and organizations if they do not exist, rather than doing something hacky. But that's my own opinion.
19:52:39 ++
19:52:50 So that is upstream grafana work.
19:53:15 Yup, don't see that as an issue
19:53:35 how are you going to handle that? pull requests? or forking the module?
19:53:56 I would prefer pull requests. But I've not worked with that community yet.
19:54:03 we should always try to work upstream
19:54:04 jeblair: should future irc-meetings changes get picked up automatically? Or will you have to manually publish a few more times?
19:54:12 Ya, I don't want to fork the module
19:54:13 i worked with a pair of puppet modules, and it was super easy to engage people
19:54:36 ttx: until https://review.openstack.org/185677 and https://review.openstack.org/185678 land, changes to irc-meetings will probably actually break the site
19:54:53 ttx: so we should go ahead and merge those and make sure they work
19:55:07 * ttx has a quick look
19:55:10 yolanda, This will be upstream grafana work. I do not think at this time we have any upstream puppet work to do re: grafana.
19:55:23 ah ok
19:55:30 But hey, these endpoints could actually already exist. We just need to confirm one way or the other.
19:56:30 An open question before we run out of time. Is there any common place, e.g. blah.o.o, where one can store this fedora image used by the magnum dsvm functional test I mentioned in channel this morning.
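On the open question of whether those Grafana endpoints already exist: Grafana's HTTP API does expose datasource creation via POST /api/datasources, which would avoid both the SQL magic and the module fork. A hedged sketch; the base URL, credentials, and payload values are placeholders, and whether a given Grafana version accepts this exact payload should be verified against its API docs:

```python
# Sketch: provision a Grafana datasource over the HTTP API rather
# than poking the database directly. Values are illustrative.
import base64
import json
import urllib.request

def basic_auth_header(user, password):
    token = base64.b64encode(f'{user}:{password}'.encode()).decode()
    return 'Basic ' + token

def datasource_payload(name, graphite_url):
    # Field names follow the Grafana datasource API.
    return {
        'name': name,
        'type': 'graphite',
        'url': graphite_url,
        'access': 'proxy',
        'isDefault': True,
    }

def create_datasource(grafana_url, user, password, payload):
    # POST /api/datasources; a 409-style response would indicate the
    # datasource already exists, which idempotent tooling must handle.
    req = urllib.request.Request(
        grafana_url.rstrip('/') + '/api/datasources',
        data=json.dumps(payload).encode(),
        headers={'Content-Type': 'application/json',
                 'Authorization': basic_auth_header(user, password)},
        method='POST')
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This still leaves the bootstrapping race on the admin password, which is the part that would need pre-seeding via the package or the puppet module.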
19:56:47 jeblair: approved https://review.openstack.org/#/c/185678/
19:56:48 rbradfor_: if it is created by a job in our system, it can be published to tarballs.o.o
19:56:52 and +1ed the other one
19:56:58 rbradfor_: this happens with, eg, heat and trove images
19:57:11 rbradfor_: after you left clarkb suggested that magnum and heat should collaborate on using one fedora image if possible
19:57:12 timrc, looks like challenging work
19:57:19 Not sure the timelines that grafana has, but we could do both. Inject data into SQL, if we find the API is taking too long to merge. And honestly, I have no idea what is needed in grafana to add support for it. But it doesn't hurt to look upstream and see
19:57:22 jeblair: great, I was going to suggest that.
19:57:34 pabelanger, Here is the start of our grafana module downstream: http://paste.openstack.org/show/238414/
19:57:48 rbradfor_: yes my suggestion was that magnum determine what they need and determine how to make the existing fedora image caching work for magnum
19:57:48 fungi I'll reach out to somebody at heat and see what I can determine.
19:57:57 rbradfor_: but step 0 there is "what does magnum need from this image"
19:58:02 rbradfor_: but since we lack insight currently into how that image is produced, i have no idea why it's special and not just the fedora image we already cache on our workers
19:58:26 clarkb, the image is used as the base for kubernetes containers
19:58:34 rbradfor_: yes but why is it special?
19:58:43 rbradfor_: we already cache a half dozen images that could be used for containers
19:58:50 e.g., preinstalled kubernetes packages? customized kernel?
19:58:54 (give or take, I am not sure of the exact number)
19:58:58 timrc, Nice, looks similar, so that is good.
19:59:21 clarkb, I am not that familiar. Can you point me to the images that are cached so I can ask of the project.
19:59:42 let's follow that up in #infra
19:59:45 thanks everyone!
19:59:47 #endmeeting