Tuesday, 2020-05-12

*** diablo_rojo has joined #opendev-meeting00:17
*** diablo_rojo has quit IRC04:25
*** openstackstatus has quit IRC13:53
*** openstackstatus has joined #opendev-meeting13:53
*** ChanServ sets mode: +v openstackstatus13:53
clarkbwe'll get started with our weekly meeting in a couple minutes18:59
clarkb#startmeeting infra19:01
openstackMeeting started Tue May 12 19:01:03 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
*** openstack changes topic to " (Meeting topic: infra)"19:01
openstackThe meeting name has been set to 'infra'19:01
clarkb#link http://lists.opendev.org/pipermail/service-discuss/2020-May/000023.html Our Agenda19:01
zbro/19:01
clarkb#topic Announcements19:01
*** openstack changes topic to "Announcements (Meeting topic: infra)"19:01
clarkbThis one didn't make it on the agenda because I had simply forgotten it. But OpenStack is looking to do its big ussuri release tomorrow morning UTC time19:01
clarkbit is a good time to be slushy right now19:02
ianwo/19:02
funginow i want a lime slushy19:02
clarkbBased on IRC discussions earlier they plan to start the process around 10:00 UTC tomorrow19:02
clarkb#topic Actions from last meeting19:03
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)"19:03
fungii'll try to be around for the great button-pushing19:03
clarkb#link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-05-19.01.txt minutes from last meeting19:03
clarkbI took an action to prep a PTG etherpad of ideas19:03
clarkb#link https://etherpad.opendev.org/p/opendev-virtual-ptg-june-2020 PTG Ideas19:03
clarkbI did that and sent email to the service-discuss list about it19:03
clarkb#topic Priority Efforts19:04
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)"19:04
clarkb#topic Update Config Management19:04
*** openstack changes topic to "Update Config Management (Meeting topic: infra)"19:04
clarkbI think the gerritbot work is still a todo.19:04
clarkbI may have time to help with that as things settle after the openstack release19:05
clarkbOn the Zuul for CD front I think we continue to learn new things19:06
clarkbFirst up is that some of our system-config-run jobs don't do a great job of verifying the installation that ansible/docker/etc perform is functional19:06
clarkbin particular the system-config-run-zuul job wasn't checking zuul was actually running. corvus has been looking into that and has found that one of our major issues there is uid conflicts between container users and host users19:07
clarkbcorvus: is your plan around that still to create a "containers" user rather than a different named user with the same uid across all the services?19:07
clarkbcalling it out because it's a tricky problem and eyeballs on the situation are a good thing19:08
ianwon nodepool don't we ensure a nodepool user with uid 10001 or something?19:10
clarkbI'm not sure I can fully describe it myself, but basically zuul, nodepool, zookeeper and so on set container users. We then try to map the same uid to the user that owns config files and stuff outside the container that we bind mount into the container19:10
clarkbianw: yup, but that same uid is used for many other services19:10
mordredyeah - I think that's the plan19:10
clarkbso the issue becomes we can't have a zuul and a nodepool and a zookeeper user with the same uid on the host side to own all those bind mountable things19:10
mordredmake a user on each of the hosts that is not called zuul but called something else, like "containers" or something, and use that user for zuul and nodepool19:11
clarkbthere is a separate issue where the zuul user on test nodes is already precreated so that zuul can ssh in and that uid doesn't match the uid the zuul containers use19:11
mordredexactly19:11
mordredso making a general service user should fix all of these19:11
clarkband having a separate "containers" user addresses both issues19:11
mordred++19:11
*** diablo_rojo has joined #opendev-meeting19:12
corvussorry late19:12
clarkbwe also split services by host currently anyway so we don't immediately create a larger attack surface by doing that19:12
ianwi guess each container only has mapped in it what it should be seeing, but it does mean there's no host-level separation of configs, secrets etc19:13
clarkbbut if there are other concerns now is probably a good time to bring them up otherwise I expect we'll start trying to roll that out soon19:13
corvus#link run zuul, nodepool, zk as 'container' user: https://review.opendev.org/72695819:13
corvusthat's not ready yet19:13
corvusbut yes, is the current plan19:13
clarkbianw: correct, but we split the hosts up anyway so we should still have that separation19:13
ianwisn't there like a remapping option?  so all the containers would standardise on 10001 say, and then when you start them you map that to a specific user on the host?19:15
clarkbianw: I think only if you remap root onto those uids19:15
clarkbat least that seemed to be what the conclusion was the other day19:15
corvusi think it remaps a range19:16
corvusso if our uid in the image were, say, something less than 10001, that might be a more attractive option19:16
* diablo_rojo sneaks in late19:16
clarkbcorvus: does the range have to start at 0 ?19:16
corvusthough, i suppose ballooning to 30003 reserved uids still is only worrisome if we colocate...19:16
corvusclarkb: i think so19:16
clarkbgot it19:16
corvusi'm fuzzy on this, but my understanding is that 0..N inside the container maps to X..X+N on the host19:17
ianwyeah, i'm only reading https://docs.docker.com/engine/security/userns-remap/ for the first time19:17
fungiyes, i read up on it and that's what i got too19:17
fungiyou basically carve up uid/gid ranges on the host context and then the base is added to the uids and gids in the container when mapping to the host context19:18
fungiso it's less id remapping, and more id partitioning19:18
corvussince our zuul user on the host is 1000, we can't map 10001 in the container to 1000 on the host.  but we could map zuul in the container to $something on the host at, say, uid 20001, and nodepool in the container to $somethingelse on the host at uid 30001....19:19
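
The range arithmetic discussed above (ids 0..N inside the container landing on X..X+N on the host) can be sketched in a few lines of Python. This is only an illustration of the partitioning idea, not Docker's implementation. The function name `host_id` and the per-service bases and range size are hypothetical values picked for the example, not anything the team has configured.

```python
def host_id(container_id: int, base: int, count: int) -> int:
    """Map an id inside the container to its host id for one subordinate range.

    base  -- first host id of the range (the X in X..X+N)
    count -- number of ids the range covers
    """
    if not 0 <= container_id < count:
        raise ValueError(
            f"id {container_id} is outside the mapped range 0..{count - 1}"
        )
    return base + container_id


# Hypothetical, non-overlapping ranges; each must be large enough to
# include the in-image uid 10001 mentioned in the discussion.
RANGES = {"zuul": (100000, 65536), "nodepool": (200000, 65536)}

for service, (base, count) in RANGES.items():
    print(service, host_id(10001, base, count))
```

With these assumed ranges the same in-image uid 10001 ends up as two distinct host uids, which is the partitioning effect described above. Note that the offset is always added, so a container uid can never be mapped onto an arbitrary existing host uid such as 1000.
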
clarkbpersonally I like the simplicity of the container user plan19:19
clarkbI think we are somewhat resistant to the downsides of that plan due to existing design and not needing an extra layer of abstraction is nice19:20
corvusmy inclination is to not worry too much about that, just do 'container: 10001' on the host for all 3, and if we hit a colocation issue before we get around to doing everything in k8s, deal with it then19:20
fungiis it a concern if the zuul-web and zuul-scheduler containers on the same host use the same uid/gid got their processes?19:20
fungier, for their processes19:21
corvusno worse than now19:21
clarkbya they already do that19:21
fungiright, i'm aware that's what's happening now, just making sure that's acknowledged as something we're cool with19:21
clarkbI think so19:22
fungii believe it matches how we ran pre-containers anyway19:22
clarkbyes19:22
corvussorry, i forgot time works like that.  i meant pre-containers when i said 'now'19:22
fungithis might become more of a concern for servers like eavesdrop, where we run several bots under different users currently?19:23
clarkbalright we probably don't need to solve this problem during the meeting. But I did want to call it out so people can raise concerns or propose other ideas19:23
corvusapparently that's more correctly called "the before now time"19:23
fungithe long ago19:23
clarkbI expect we'll have plenty of time at the end of the meeting too if we want to swing back around to this19:23
fungisure19:23
clarkbbut lets keep going through the agenda to be sure we cover other topics too (and we don't have much so really should have time later)19:23
clarkbThe other Zuul CD thing I wanted to call out is the system-config zuul job reorg landed19:24
clarkbthis means you may need to rebase any changes to those jobs you had outstanding. Sorry about that, but I think it does help make system-config's jobs a lot easier to read and reason about now19:24
clarkbwe tried to land it when there were fewer outstanding changes that would conflict, but some may have gotten caught by it19:24
clarkb#topic OpenDev19:25
*** openstack changes topic to "OpenDev (Meeting topic: infra)"19:25
clarkbThings have been so busy with all the CD type changes lately that I haven't gotten around to reaching out about advisory council membership19:26
clarkbI'm hoping once the openstack release is done then I'll be able to do some of that19:26
clarkbThat was all I had on the opendev front. Anything else to add?19:26
clarkbfungi: the auth/identity service spec is still not up right?19:27
clarkbI want to make sure we don't miss that once it is up19:27
funginope, i've been distracted by other stuff and haven't had a chance to distill the pile of prose i pulled from those various e-mail threads19:28
fungiit's a bit of a jumble right now19:28
fungiand a lot of it is probably outdated19:28
clarkbthanks, just want to make sure I hadn't missed it19:28
fungii could push it up as is, but it's not likely very reviewable yet19:29
fungijust a patchwork of random comments right now19:29
clarkb#topic General Topics19:30
*** openstack changes topic to "General Topics (Meeting topic: infra)"19:30
clarkbThe virtual ptg is fast approaching19:30
clarkb#link https://virtualptgjune2020.eventbrite.com Register if you plan to attend.19:30
clarkb#link https://etherpad.opendev.org/p/opendev-virtual-ptg-june-2020 PTG Ideas19:30
clarkbThe two things to do at this point are register if you plan to attend and submit ideas to the etherpad19:31
clarkbI think it would also be great if people can indicate subjects that they are interested in so that we can try and schedule them during timezone appropriate slots19:31
corvusis anyone interested in using meetpad?19:31
fungisure!19:31
clarkbcorvus: I am19:31
corvusok cool -- anything we need to do before then?19:32
clarkbcorvus: I think diablo_rojo plans to give it a good testing on friday this week19:32
fungii need to dig out a microphone19:32
clarkbI planned to dial into that and participate as part of the data collecting19:32
corvusdiablo_rojo, diablo_rojo_phon: count me in, i like to talk to people19:32
fungiand yeah, we'll probably get some user feedback from the openstack ussuri celebration19:32
corvussometimes19:32
corvusa little19:32
fungii make an exception for people i like ;)19:33
clarkbI think all of the known issues have been addressed. Including http -> https redirect and the valid etherpad name characters update19:33
diablo_rojocorvus, can't wait!19:33
clarkbI tested that -'s are now valid yesterday19:33
clarkb(they are)19:33
diablo_rojoAnd to see the cake I am planning to bake?19:34
corvusooooh19:34
diablo_rojoWill be my first foray into sculpting cake.19:35
clarkband based on how that goes I think we can schedule some other test times as necessary19:35
clarkbalso it looks like the schedule is up and we got our requested hours19:36
fungii have trouble bringing myself to sculpt cake because i think about all that trimmed cake going to waste (or rather, which i'll be compelled to eat separately while sculpting)19:36
diablo_rojofungi, it won't go to waste, have no fear19:36
clarkb#link http://ptg.openstack.org/ptg.html The schedule is up19:36
clarkbThat takes us to fungi's standing wiki item. Anything new there fungi?19:37
funginope!19:37
clarkb#topic Open Discussion19:37
*** openstack changes topic to "Open Discussion (Meeting topic: infra)"19:37
clarkbAs expected we've got additional time today. Feel free to resume the uid and containers discussion19:38
clarkbfungi: the point about eavesdrop is a good one and that might be a reasonable example case to consider since you are correct we'll end up running different bots there with different levels of access to services19:39
ianwi've got a couple of low priority but outstanding things for nodepool19:40
clarkbfungi: perhaps in that case we can simply set users to non overlapping uids since they are not generally reconsumable images (eg it's opendev specific images)19:40
clarkbfungi: in the case of zuul and nodepool and zk we consume them from external (though in the case of zuul and nodepool: friendly) sources19:41
ianwhttps://review.opendev.org/721509 , https://review.opendev.org/724452 , https://review.opendev.org/723782 , https://review.opendev.org/726032 ; fairly random things but that have popped up during recent work19:41
corvusclarkb: we could also not build the user into the image19:41
mordredwell - I think the main issue is that zuul and nodepool and friends expect to have read-write access to dirs on the host19:41
corvusmordred: i don't think that's an expectation that comes from the zuul image; that's just how we're using them19:42
mordredcorvus: indeed19:42
mordredcorvus: also, I'm not sure building the user into the image is getting us much, tbh19:42
clarkbmordred: in the case of eavesdrop services we don't want shared read access for credentials though19:42
clarkbmordred: or at least ideally we would avoid that19:42
clarkband ya if we can simply avoid setting a user in the image and set that at runtime I think that solves the eavesdrop problem well19:43
mordredthis is, incidentally, why we originally wrote the container spec to say that we should use podman ...19:43
corvusmordred: yes, maybe for opendev images, we should avoid creating users, try that out for a bit, then revisit whether it's actually necessary to have a zuul user in the images at all19:43
mordredso that we could just run the containers as non-root in the first place and run them as a specific user19:43
mordredof course, we ran in to compose issues with podman sadly19:43
mordredcorvus: ++19:43
mordredcorvus: my hunch is that if we stop putting users in the images, then we can do run --uid=$SOME_UID - whatever matches on the host - and things should be fine19:44
mordredother than fingergw and priv dropping inside of the container19:45
fungian option there is to not do priv dropping, and proxy the well-known port to one in the untrusted range19:45
fungisystemd can "just do that" or we could use a tcpproxy or something19:46
fungior grant the necessary cap to the container so it can bind to privileged ports19:46
clarkbor configure the uid to drop to19:46
clarkbwhich is basically what we do today except the uid is fixed, right?19:46
fungiclarkb: problem there is the container needs to start as root so it can open the low-numbered listening socket19:47
corvuswe may still be able to run that container as root and let it drop privs without having an /etc/passwd entry19:48
fungialso true19:48
fungiit doesn't create files or anything, so picking an arbitrary nonzero uid/gid for that is likely fine19:48
fungiit's a questionable practice, but if it's made configurable as part of the container invocation that's probably good enough to solve any real-world collisions19:51
mordredfungi: docker can just do the port mapping actually if we chose to go that direction19:51
mordredit's not even a thing we have to engineer19:51
fungioh, nifty19:52
fungiregardless, that falls into the class of "don't need to bother with priv dropping"19:52
mordredyeah. there are definitely options - or we can do the corvus thing, run the container as root and drop to an arbitrary uid in the container19:52
corvuswell, that's if we don't do host networking19:52
corvuswhich means firewalls19:52
fungiif we do host networking and systemd is starting the container then we could presumably use it to open the listening socket and hand it to the fingergw as a local fd (though that might need a couple additional lines in the fingergw implementation to support not expecting a real network socket)19:54
fungiassuming the process we're starting can properly inherit file descriptors from its parent19:55
fungii don't know how many of these traditional unix assumptions containers tend to break19:55
clarkbfungi: I have no idea if you can inherit file descriptors like that. Presumably yes, but it is a good question19:58
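
For reference, the descriptor hand-off fungi describes resembles systemd's socket-activation convention, in which the service manager sets `LISTEN_PID` and `LISTEN_FDS` in the environment and passes listening sockets starting at descriptor 3. The Python sketch below shows roughly what the receiving side could look like. It is not the fingergw implementation, the function names are hypothetical, and whether a given container runtime actually passes the descriptors through to the process is exactly the open question above.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in systemd's convention


def inherited_fds(environ, pid):
    """Return descriptor numbers handed over by the parent, or [] if none."""
    # LISTEN_PID must name this process, otherwise the fds are not for us.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def listening_socket(port):
    """Use an inherited listening socket if present, else bind our own."""
    fds = inherited_fds(os.environ, os.getpid())
    if fds:
        # Wrap the already-bound descriptor; no privileged bind is needed here.
        return socket.socket(fileno=fds[0])
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock
```

In this arrangement the process never binds the low-numbered port itself, so it would not need to start as root; the fallback branch only exists so the sketch still runs when nothing was inherited.
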
clarkbthat takes us to the end of our hour block19:58
clarkbthank you everyone! we'll see you here next week19:58
fungithanks clarkb!!!19:58
clarkb#endmeeting19:58
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"19:58
openstackMeeting ended Tue May 12 19:58:27 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:58
openstackMinutes:        http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-12-19.01.html19:58
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-12-19.01.txt19:58
openstackLog:            http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-12-19.01.log.html19:58
