19:00:01 #startmeeting infra
19:00:02 Meeting started Tue Oct 9 19:00:01 2018 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:05 The meeting name has been set to 'infra'
19:00:07 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:00:18 i suppose i should take a look at this thing; just a sec ;)
19:00:30 o/
19:00:33 I'm sort of around too
19:00:37 #topic Announcements
19:00:52 I did trim the agenda and add an entry for you earlier today, or was that last night?
19:00:54 clarkb: any important announcements?
19:01:09 i think it was today, or at least that's when i saw the update notification
19:01:14 I don't think so, other than summit is fast approaching, then holidays, then it's 2019 and wow time flies
19:01:35 something something trusty eol
19:03:08 yeah, um, don't forget to book flights/hotel and register if you're planning to go to berlin with us!
19:03:10 i am here too
19:03:23 #topic Actions from last meeting
19:03:54 #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-09-25-19.01.html Minutes from last meeting
19:04:00 i see nothing, so i think we're good
19:04:18 #topic Specs approval
19:04:32 nothing mentioned on the agenda up for a council vote
19:05:04 i still owe an alternative implementation writeup for the letsencrypt/acme spec
19:05:06 I think I meant to follow up on the 3rd party ci spec
19:05:15 oh, sure we can do that
19:05:17 but then the gate fires happened and I was very distracted
19:05:31 yes, unless people's minds have changed due to intervening conferences, etc, i think that's ready
19:06:11 I've written a note to follow up after the current conference
19:06:29 #link https://review.openstack.org/563849 Direction setting for 3rd Party CI
19:07:13 are we good putting this up for council vote through thursday?
19:07:17 ++
19:08:08 #info The "Direction setting for 3rd Party CI" spec is open for council voting until 19:00 UTC Thursday, October 11.
19:08:19 any objections?
19:08:50 if you have any concerns, be sure to comment on the review in the next 48 hours!
19:09:18 #topic Priority Efforts - A Task Tracker for OpenStack
19:10:01 i don't really have any storyboard updates this week, and diablo_rojo is in the middle of conducting an upstream institute training in some far-flung corner of the world right now
19:10:42 oh, we can hyperlink task footers from gerrit if anyone wants to review the corresponding system-config changes:
19:11:14 #link https://review.openstack.org/607699 Hyperlink task footers
19:11:18 (and its parent)
19:11:45 also diablo_rojo has posted an attachments spec she'd appreciate some reviews on:
19:11:58 #link https://review.openstack.org/607377 [WIP] StoryBoard Story Attachments
19:12:38 and anyone interested in storyboard, we'll (probably) have our regular weekly meeting tomorrow at 19:00 utc in this same channel
19:12:50 or find us in #openstack-storyboard any other time
19:12:52 * AJaeger apologizes for missing the bell
19:13:11 #topic Priority Efforts - Update Config Management
19:13:31 i see a couple of possible subtopics mentioned here with no associated proposer
19:13:39 topic:puppet-4 and topic:update-cfg-mgmt
19:14:11 fungi - isn't the correct irc channel #storyboard? --- without openstack prefix?
19:14:24 ssbarnea_: yep! fingers failed me
19:14:28 thanks for the correction
19:14:42 i see 20 open changes in the first topic and 11 in the second
19:14:55 i guess this bullet item is here to remind us to review these changes
19:15:17 the other possible subtopic on the agenda says "Zuul as CD engine"
19:15:25 anyone here who added that?
19:16:06 maybe corvus? i think that may have involved containery-things?
19:16:27 okay, well, zuul as a cd engine seems like a great idea. i know we've talked about it and are starting to do it, so...
19:16:49 if i added it, it was a while ago :)
19:17:01 cool, so maybe nothing we need to cover on this in the meeting today unless you're aware of something
19:17:41 not immediately -- i think the next steps from that (coming out of the ptg) are that if we want to have zuul trigger its own config updates, we need to move the entire project-creation process to ansible
19:17:59 that sounds like what i remember
19:18:44 oh, I have changes that add the zuulcd user to bridge.o.o which should have the correct topic on them
19:18:50 which is another prestep for that, I think
19:19:05 and a sensible choice anyway, since part of why we originally started using ansible (before zuul v2.5) was because we'd stretched the nearly nonexistent orchestration facilities of puppet to their breaking point with our new-project automation
19:19:50 clarkb: ah yep, thanks
19:19:54 #link https://review.openstack.org/604925 Add zuul user to bridge.openstack.org
19:20:05 it's in topic:update-cfg-mgmt
19:20:14 so, yes!
19:20:37 i plan to also keep exploring the containerisation of graphite.o.o ... which is sort of related, because it's all role based and you could see it being part of a CD workflow
19:21:05 that sounds great
19:21:29 i similarly have a to-do item to try out the "official" mailman3 container image with a mind to that
19:21:40 basically redo my mm3 poc with their container
19:21:50 anything else on this for now?
19:22:02 we have a fullish assortment of general topics waiting, so let's see how many i can plow through in the next 38 minutes
19:22:32 #topic OpenDev
19:22:42 we are opendev. any questions?
19:22:44 ;)
19:22:57 there's some nameserver changes still out there which need reviews, i guess?
19:23:00 I completely failed to schedule a meeting with the foundation
19:23:13 looks like those merged
19:23:14 I think we were somewhat surprised at the zuul interest last week and that consumed our time at ansiblefest
19:23:29 oh, except one
19:23:40 #link https://review.openstack.org/605092 Add opendev nameservers
19:23:49 i've lost the thread on opendev nameservers; maybe i can pick them up next week, but if anyone else wants to push forward on them in the interim you're welcome to :)
19:24:42 looks like we got ahead of ourselves with the firewall rules and either need public slave nameservers deployed first or need to do some manual bootstrapping of the hidden master
19:25:01 I think corvus mentioned there would be manual firewall bootstrapping
19:25:37 i'd love to work on moving this forward, but i'm hesitant to commit to anything else for the next few days until i get more caught up
19:25:44 puppet will deploy things; the axfrs just won't work immediately until a human edits the firewall
19:26:06 well, for the moment we need some workarounds to get 605092 in that case
19:26:22 er, to get 605092 passing integration tests
19:26:29 ah
19:26:59 yeah, i guess that needs to be split into 2 commits
19:28:10 do we want to talk about anything else specific to the opendev domain, rebranding, renaming of services, et cetera today?
19:28:53 or save that for when clarkb gets a marketing and promotion discussion set up first so we can figure out our initial messaging (as discussed at the ptg)?
19:29:15 i'm good till then
19:29:34 i consider everyone else's deafening silence as tacit approval
19:29:37 ;)
19:29:58 #topic discuss silencing rsync from console logs (ssbarnea)
19:30:07 #link https://review.openstack.org/#/q/topic:quiet-rsync+(status:open+OR+status:merged) silencing rsync from console logs
19:30:34 so... some ansible rsyncs are quite verbose by default and they spam console logs, in some extreme cases with a single command taking ~40% of the total console length.
19:31:02 however, unless we save that somewhere we have no way to diagnose artifact collection failures, right?
19:31:05 I know for sure that I've seen some tasks doing rsync on ara reports that were huge
19:31:16 tripleo does that, for sure
19:31:42 fungi: --quiet does not hide errors, only success.
19:31:48 maybe openstack-ansible too?
19:32:20 fungi, i know there are some more places,....
19:32:22 ssbarnea_: sure, but --quiet also doesn't tell you what files you didn't collect because you put them in the wrong directory or had an overly-picky regex
19:33:06 fungi: non-quiet doesn't tell you what you didn't collect either.
19:33:20 and what you collected is visible at the destination anyway.
19:33:42 i doubt we have tasks removing them after collection
19:34:08 i'm mostly trying to represent the other side in this argument but i don't feel super strongly either way. the changes in question are just for openstack-specific jobs, not other zuul jobs. it's been suggested that we could make the console log less verbose by default with collapsible sections instead. does this address your concerns?
19:34:32 fungi: we could make the --quiet optional? like enabled by default and disabled for special cases (like a debug build)
19:35:03 can we make a rule (lines/task) for what counts as reasonable usage and what counts as spam? if we spot a task as spammy, we silence it?
19:35:44 i think nobody would complain if there are 10-20 lines, but when it goes >100 it starts to become spam.
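As a rough illustration of the quiet-rsync idea discussed above (this is a sketch, not the actual patches under topic:quiet-rsync), an Ansible synchronize task could pass --quiet through rsync_opts by default while leaving a debug escape hatch. The variable name collect_logs_quiet below is hypothetical; zuul.executor.log_root and ansible_user_dir are the standard variables such log-collection tasks use:

    # Illustrative sketch only: quiet rsync output by default, full output when debugging.
    # "collect_logs_quiet" is a made-up variable, not one defined by the real jobs.
    - name: Pull job logs back to the executor
      synchronize:
        src: "{{ ansible_user_dir }}/logs/"
        dest: "{{ zuul.executor.log_root }}/"
        mode: pull
        rsync_opts: "{{ ['--quiet'] if collect_logs_quiet | default(true) else [] }}"

As noted in the discussion, --quiet suppresses the per-file success listing but still surfaces rsync errors, so failed transfers would remain visible.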
19:36:12 and while I like ARA, I find its number of files to be super-spam.
19:36:40 usually the ones i end up wanting to look at, personally, are docs/website publication job logs, because invariably i've screwed up where i put files or how i matched them
19:36:58 oh yeah, those are always fun ;(
19:37:19 also release jobs, because of release artifacts being skipped due to incorrect naming
19:38:42 ideally we should dump the list of files to a separate log file and avoid using the console, but sadly for us ansible has no such support.
19:39:04 well, also there's a catch-22
19:39:12 since we need to collect the file containing the list of files?
19:39:42 i am not sure if the zuul special callback can do some smart filtering for this specific case.
19:39:45 sort of a Gödel's completeness problem i guess
19:40:20 if we hack the callback to avoid console output, we can still have the full list captured by ara as json. still not sure how easy this is to implement.
19:41:09 yeah, i'm unfortunately not sure we have the attention of the right people in the meeting today to come to a conclusion on this topic
19:41:26 or at least nobody's really speaking up
19:42:06 fungi: in this case let's move on, we can postpone it.
19:42:09 unless someone has something they want to add on this, i can move on to your next topic
19:42:35 this next topic isn't really phrased as a topic. i'll see what i can come up with
19:42:55 #topic ask.openstack.org with tag #openstack-infra (ssbarnea)
19:43:32 so... in the past i and others have alerted people who ask infra questions on the openstack ask site that we don't really do infra support there and directed them to irc or the ml
19:43:40 i think "just drop the logs", when people then think "hey, i've used that before", makes it a hard argument for people to vote on. but an intermediate thing, like having it logged separately somehow, certainly makes it easier to come at
19:43:46 ssbarnea_: are you volunteering to be our ask.o.o answerer?
19:44:22 fungi: yes, I am; we can use a specific topic openstack-infra there.
19:44:53 the idea is to provide support, but only for things that are specific to us.
19:45:37 i have no objection to you seeking out seemingly infra-oriented questions on that site and giving answers... my main concern (having dealt with it for years) is that it's like a cross between a wiki and a blog where answers get outdated and rot and become attractive nuisances
19:45:59 we have so far answered the "common" questions in the infra-manual
19:46:35 yes, at times when i've answered infra-related questions on ask.o.o i've tried to stick to providing hyperlinks to our documentation
19:46:57 stackoverflow never becomes obsolete; the wiki or docs are much harder to update. yep, hyperlinks are good.
19:46:57 be that infra-manual or tool-specific docs
19:47:12 and sometimes we wrote another section for our docs so that we can link next time ;)
19:47:42 we can just experiment for a few weeks and see if it works or not.
19:47:43 i'm not sure i'd even considered people asking infra questions there. i can subscribe to that tag, but I think we have long-standing issues knowing that subscriptions on ask.o.o don't quite work properly
19:48:07 so anyway, i see no problem with people who are interested in dealing with ask.o.o curating and answering/updating/combining/closing questions there
19:48:27 if you need moderator access and don't have it yet, let me know and i can figure out how to elevate you to a mod
19:48:39 we need to update it all during the next sprint ... i think most of the puppet has been updated ... and then maybe subscriptions will work better and be easier to follow?
19:49:05 but in general i'm going to focus on answering questions in irc and on mailing lists and getting answers to common questions into relevant documentation, as AJaeger mentioned
19:49:20 is it based on osqa or something else? i had prev exp with osqa.
19:49:39 * fungi has no idea what osqa is
19:50:06 fungi: https://github.com/dzone/osqa
19:50:18 fungi, agree with your comments - if there're volunteers like ssbarnea_, let him try it...
19:50:38 ssbarnea_: ahh, no. see site footer: This site is powered by Askbot. http://askbot.com/
19:50:38 ssbarnea_: no, it's askbot
19:51:12 the author used to be contracted to help us support it, but he had less and less time to work on that over the years
19:51:44 ssbarnea_: if you'd like to take on upgrades, i'd be happy to liaise with you, as i've done a bunch of puppet for the xenial update, but the service (and all support bits) needs updating
19:52:10 there's also a mail out there about issues with it from when i looked into it
19:52:13 this site basically came about 1. because we had old spam-filled web forums full of misinformation we wanted to close down and 2. stackoverflow wouldn't give the osf the time of day when they asked about setting up a zone or whatever there to corral the openstack-related questions
19:52:29 ianw: let me first see if it's worth investing in it; without enough users it would be close to useless.
19:53:28 6.5 minutes remaining
19:53:39 #topic Upgrade sprint next week (clarkb)
19:53:48 "Should be a good time in the cycle to do some upgrades and we need to move off trusty."
19:53:59 he's hopefully getting ready to go on stage right now
19:54:06 ssbarnea_: http://lists.openstack.org/pipermail/openstack-dev/2018-April/129078.html for reference
19:54:27 so this is more of a reminder. people who want to help work on upgrading our servers off trusty... let's do that
19:54:37 ya, very shortly. I just wanted to see if we could get concerted effort around this next week
19:54:44 #topic Mirror/network issues at mirror.us-west-1.packethost.openstack.org (ssbarnea)
19:55:00 we _think_ this one is due to hypervisor hosts running out of disk. we don't control them
19:55:30 last night i "preallocated" the rest of the rootfs on this server by writing zeroes to a file until we ran out of space, then deleting it
19:55:42 fungi: so what can we do to avoid these errors? other than praying
19:55:49 yeah ... although with the sprint, i'm not sure how much heroic effort we want to put into puppet things vs container things. but the container stuff is all green-fields ATM, so we need to design some sort of process around that
19:56:03 we had two outages yesterday where i found it in a "shutoff" state according to nova before the preallocation was done
19:56:18 it's been up since, so hopefully this has helped
19:56:53 in this case i think it's due to it being a small, dedicated cloud rather than a public cloud provider where they tend to size and track workloads a little more effectively
19:57:21 not that we don't see occasional unannounced outages for our control plan servers in large public clouds as well
19:57:28 er, control plane
19:57:56 going to go to open discussion for this last topic
19:58:03 #topic Open discussion
19:58:10 fungi: speaking of small dedicated clouds, have you heard any more on the armci cloud?
19:58:17 ianw: nada
19:58:20 someone added something here which looks more like a bug report:
19:58:39 "logs.openstack.org does not return a character encoding for served files, thus confusing the browser into rendering logs as garbage due to defaulting to US-ASCII instead of UTF-8. This can be avoided by setting 'Content-Type: text/plain; charset=utf-8' instead of the current 'Content-Type: text/plain'. Firefox inspect message: The character encoding of the HTML document was not declared. The document will
19:58:41 render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol. job-output.txt.gz"
19:58:50 it was me
19:59:04 not sure where I should have raised it
19:59:38 #link https://git.openstack.org/cgit/openstack-infra/puppet-openstackci/tree/templates/logs.vhost.erb
19:59:46 that's the configuration it's using
20:00:11 there's also a logs-dev.vhost.erb we can use to test changes (it points to the same docroot)
20:00:12 fungi: ok, in this case I will make a storyboard story for it
20:00:28 and we're out of time
20:00:35 and try to fix it
20:00:46 thanks, ssbarnea_!
20:00:51 and thanks everyone! discussion can continue in #openstack-infra
20:00:59 #endmeeting
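For the packethost mirror workaround described under the mirror topic, "preallocating" the rootfs amounted to filling the remaining free space with zeroes and then removing the file, so that a thin-provisioned hypervisor backing store allocates the full disk up front. Something along these lines (the exact path and block size are illustrative, not what was actually run):

    # fill the rest of the root filesystem with zeroes, then free the space again;
    # dd exits nonzero with "No space left on device", which is expected here
    sudo dd if=/dev/zero of=/opt/preallocate bs=1M || true
    sudo rm /opt/preallocate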
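For the character-encoding report raised during open discussion, one plausible fix in the logs vhost template linked above would be Apache's AddDefaultCharset directive, which appends a charset parameter to text/plain and text/html responses. This is a sketch of the idea, not the change that was ultimately filed in storyboard:

    # inside the logs.openstack.org virtual host definition (e.g. logs.vhost.erb):
    # declare UTF-8 so browsers stop falling back to a default encoding for plain-text logs
    AddDefaultCharset utf-8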