Thursday, 2018-12-06

openstackgerrit  Paul Belanger proposed openstack-infra/nodepool master: Include host_id for openstack provider  https://review.openstack.org/623107  00:08
pabelangercorvus: clarkb, mordred: Shrews: ^ first 1/2 to collect host_id for openstack providers, this is to better help openstack-infra collect information for the current jobs timing out in a cloud. Interested in your thoughts, feedback.00:09
clarkbpabelanger: well its not a feature in any of the clouds right?00:10
clarkbor does it pull from the api instead?00:10
clarkbah ya it is in the api interesting00:10
pabelangeryah, looking at openstacksdk, we should get it00:11
pabelangernodepool dsvm test should help confirm00:11
clarkbpabelanger: a common ish thing for me when debugging is to take the test node id and grep for that in the launcher debug log00:11
clarkbthat gets me lines like 2018-12-05 17:08:28,545 DEBUG nodepool.NodeLauncher-0000956882: Waiting for server 0b056afb-88e9-4d0f-8b3c-13f8363d7af2 for node id: 0000956882 and 2018-12-05 17:08:58,596 DEBUG nodepool.NodeLauncher-0000956882: Node 0000956882 is running [region: BHS1, az: nova, ip: 158.69.66.132 ipv4: 158.69.66.132, ipv6: 2607:5300:201:2000::576]00:12
clarkbif we added the uuid and the host_id to the second line there, that would be a major win for me I think00:12
pabelangerclarkb: kk, current patch doesn't log host_id, but should add it00:15
pabelangerwill do that in ps200:15
clarkbpabelanger: if you do expose it on the zuul side too adding in the instance uuid to the zuul side would be helpful too I think00:16
clarkbnot sure if that is already there00:16
jheskethpanda: perhaps long term there can be enough automation to actually run through the playbooks, but for now I was planning on preparing all the playbooks and tasks locally and spitting out the ansible-playbook commands that the user would need to run. The user can then modify the playbooks and set up an inventory to match their local environment.00:17
jheskethTo do it fully automatically we'd have to build in extra flags to point to hosts etc, and/or build in cloud launching functionality. Which is something I'd like to see, but as a part 2 or separate tool even. eg, you give the tool your cloud credentials and it does the rest. But it'd need to know a lot more about the image building00:18
pabelangerclarkb: k, I'll look at uuid also00:23
clarkbpabelanger: the nice thing about those two messages is I get commonly needed info (uuid, ip addrs, etc)00:26
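
The log-line change clarkb is asking for could look roughly like the sketch below. This is a hedged illustration, not the patch under review; the node attribute names (external_id for the server UUID, host_id) are assumptions based on the discussion.

    import logging

    log = logging.getLogger("nodepool.NodeLauncher")

    def log_node_running(node):
        # Sketch only: 'node' is assumed to expose the fields already shown in
        # the launcher log above, plus external_id (server UUID) and host_id,
        # the two additions discussed here.
        log.debug(
            "Node %s is running [region: %s, az: %s, ip: %s ipv4: %s, ipv6: %s, "
            "uuid: %s, host_id: %s]",
            node.id, node.region, node.az, node.interface_ip,
            node.public_ipv4, node.public_ipv6, node.external_id, node.host_id)
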
*** manjeets_ has joined #zuul01:49
*** manjeets has quit IRC01:51
*** bhavikdbavishi has joined #zuul02:41
*** bhavikdbavishi1 has joined #zuul02:44
*** bhavikdbavishi has quit IRC02:45
*** bhavikdbavishi1 is now known as bhavikdbavishi02:45
*** rlandy|bbl is now known as rlandy03:09
*** rlandy has quit IRC03:10
openstackgerrit  Paul Belanger proposed openstack-infra/nodepool master: Include host_id for openstack provider  https://review.openstack.org/623107  03:12
*** bjackman has joined #zuul04:28
bjackmanIs there a way to get your config-project changes tested pre-merge in a post-review pipeline? I tried but it didn't work, not sure if this is because of config error on my part or just the way Zuul is05:59
bjackmanAh OK, I think the real answer to my question is that where I have shared config that I want to be tested pre-merge, that should go in a shared untrusted project (equivalent to the zuul-jobs one)06:35
openstackgerrit  Tristan Cacqueray proposed openstack-infra/zuul master: web: update status page layout based on screen size  https://review.openstack.org/622010  06:43
*** goern has quit IRC06:58
*** goern has joined #zuul07:08
*** bhavikdbavishi has quit IRC07:13
*** gtema has joined #zuul07:32
openstackgerrit  Tobias Henkel proposed openstack-infra/zuul master: Report tenant and project specific resource usage stats  https://review.openstack.org/616306  07:33
*** pcaruana has joined #zuul07:58
*** pcaruana is now known as muttley07:58
openstackgerrit  Tristan Cacqueray proposed openstack-infra/zuul master: web: refactor jobs page to use a reducer  https://review.openstack.org/621396  08:06
openstackgerrit  Tristan Cacqueray proposed openstack-infra/zuul master: web: refactor job page to use a reducer  https://review.openstack.org/623156  08:06
openstackgerrit  Tristan Cacqueray proposed openstack-infra/zuul master: web: refactor tenants page to use a reducer  https://review.openstack.org/623157  08:06
*** themroc has joined #zuul08:48
*** AJaeger has quit IRC08:49
*** AJaeger has joined #zuul08:51
*** bhavikdbavishi has joined #zuul08:55
*** sshnaidm|afk has quit IRC09:45
*** sshnaidm|afk has joined #zuul09:46
*** bhavikdbavishi has quit IRC09:49
*** electrofelix has joined #zuul10:04
*** dkehn has quit IRC10:05
*** sshnaidm|afk is now known as sshnaidm10:12
*** sshnaidm has quit IRC10:33
*** sshnaidm has joined #zuul10:34
*** jesusaur has quit IRC11:27
*** jesusaur has joined #zuul11:31
*** bhavikdbavishi has joined #zuul11:48
*** sshnaidm is now known as sshnaidm|bbl12:08
*** dkehn has joined #zuul12:39
*** bjackman has quit IRC12:42
*** gtema has quit IRC12:46
*** bjackman has joined #zuul12:47
*** rlandy has joined #zuul12:58
*** muttley has quit IRC13:08
*** bjackman has quit IRC13:09
*** muttley has joined #zuul13:21
*** muttley has quit IRC13:25
*** muttley has joined #zuul13:26
*** muttley has quit IRC13:29
*** pcaruana has joined #zuul13:34
*** pcaruana has quit IRC13:39
*** rfolco has quit IRC13:41
*** rfolco has joined #zuul13:41
*** gtema has joined #zuul13:42
*** pcaruana has joined #zuul13:43
*** pcaruana has quit IRC13:47
*** bhavikdbavishi has quit IRC13:53
*** gtema has quit IRC13:53
*** smyers_ has joined #zuul13:57
*** smyers has quit IRC13:57
*** smyers_ is now known as smyers13:57
Shrewscorvus: tobiash: fwiw, i don't think https://review.openstack.org/622403 made much impact. I'm still seeing lots of empty nodes being left around (but thankfully cleaned up now)14:06
tobiashShrews: ok, so maybe we should consider switching to sibling locks14:07
tobiashbut that would be a harder transition and might require a complete synchronized zuul + nodepool upgrade and shutdown14:08
Shrewsyes, a bit more involved to do that14:08
Shrewsbut at least not urgent now14:08
Shrewsat least we've learned something new about using zookeeper!  :)14:10
Shrewschild locks + znode deletion == bad news14:10
tobiashyepp :)14:13
*** gtema has joined #zuul14:26
*** smyers has quit IRC14:32
*** smyers has joined #zuul14:32
*** gtema has quit IRC14:44
*** sshnaidm|bbl is now known as sshnaidm14:51
mordredtobiash: that reducers stack is really nice14:55
mordredgah14:55
mordredtristanC: ^^14:55
mordredt <tab> is a fail :)14:55
tobiash:)14:55
mordredtobiash: I approved the stack except for the last 214:56
tobiashmordred: k, I'll check that out latest14:57
tobiashlater14:57
mordred++14:57
tobiashmordred: lgtm but I'm not feeling competent enough to +a it.15:02
*** njohnston_ is now known as njohnston15:10
mordredtobiash: yeah. these javascripts are just about at the edge of my brain abilities15:17
ssbarnea|rover  can we do something to avoid zuul spam with "Waiting on logger"? as in http://logs.openstack.org/30/621930/2/gate/tripleo-ci-centos-7-standalone/4fb356a/job-output.txt.gz  15:18
mordredssbarnea|rover: we should probably instead figure out what broke the log streamer - do y'all reboot any of the VMs?15:27
mordredor, alternately, if you're doing iptables on the vms those could be blocking access to the log streamer daemon15:28
rlandyhello - I am testing out zuul static driver for use with some ready provisioned vms. I followed the nodepool.yaml configuration per https://zuul-ci.org/docs/zuul/admin/nodepool_static.html. The playbook setting up the multinode bridge fails - I think due to the fact that the private_ipv4 is set to null. The public_ipv4 value is populated with the 'name' ip. How can I get the static driver to set a private_ipv4?15:28
tobiashrlandy: the static driver only knows one ip address so you need to set the private_ipv4 in your job if you're depending on it (or maybe do a fallback when setting up the multinode bridge)15:30
rlandytobiash: ok - set_fact on hostvars[groups['switch'][0]]['nodepool']['private_ipv4']?15:32
rlandy  setting a fallback would mean editing this role: https://github.com/openstack-infra/zuul-jobs/blob/master/roles/multi-node-bridge/tasks/peer.yaml#L16  15:33
tobiashrlandy: you need the hostvars[]... cruft only for setting facts for a different machine15:33
rlandyand I am not sure other user will be open to my editing that for a static driver case15:33
rlandytobiash: yep - ack - thanks for your help15:33
*** jhesketh has quit IRC15:34
tobiashrlandy: yes, I'm not familiar with this role so someone else (mordred, AJaeger ?) might be of help with the multi-node-bridge role15:34
*** jhesketh has joined #zuul15:35
mordredrlandy: that role should work on nodes that don't have a private ip though - we have clouds that give us vms with no private ip15:37
mordredwe should check with clarkb when he wakes up15:37
rlandymordred: looking at the inventory, I saw the private_ipv4 set to null and I assumed that was the cause of the error. I could be wrong. I am testing it out again with private_ipv4 set15:39
mordredkk15:39
mordredit also might not be terrible to allow setting a private_ipv4 in the static driver since it's a value we provide for the dynamic nodes too15:40
rlandyor default to the public_ipv4 if private_ipv4 is null15:50
ssbarnea|rovermordred: i didn't do anything myself and I see the "[primary] Waiting on logger" error on multiple jobs during the last 7 days.  i don't know how to make a group by in logstash to identify a pattern.15:55
Shrewsi wonder why that role is using private_ipv4 and not interface_ip16:00
Shrewsthat's available in the inventory16:01
Shrewsmordred: do you know? ^^16:02
mordredShrews: the role uses private_ipv4 if it's there to establish the network bridge between nodes - that lets us always have a consistent network jobs can use regardless of provider differences16:04
Shrewsthat makes sense16:04
*** nilashishc has joined #zuul16:05
clarkbprivate ip is set to public ip if there is no private ip16:18
clarkbthe reason to use private over public when you have both is vxlan/gre were not reliable over nat16:18
clarkbrather than try and debug that I decided it was easier to just avoid the issue entirely16:19
clarkbwe can either make that ip behavior consistent in nodepool drivers or update the role to make that assumption instead16:19
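
A minimal sketch of the "make the driver consistent" option clarkb mentions: fall back to the public address (or the interface IP) when a provider reports no private one. The helper name is hypothetical; the attribute names follow the nodepool inventory fields discussed above.

    def effective_private_ipv4(node):
        # Hypothetical helper: providers such as the static driver may report
        # no private address, so fall back to the public one (and finally to
        # the interface IP) rather than leaving the field null.
        return node.private_ipv4 or node.public_ipv4 or node.interface_ip
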
*** themroc has quit IRC16:26
openstackgerrit  Monty Taylor proposed openstack-infra/zuul master: Read old json data right before writing new data  https://review.openstack.org/623245  16:30
pabelangermordred: Shrews: clarkb: in https://review.openstack.org/623107/ I'm trying to collect the host_id from wait_for_server in nodepool, but it seems to be empty: http://logs.openstack.org/07/623107/2/check/nodepool-functional-py35-src/46b9fb9/controller/logs/screen-nodepool-launcher.txt.gz#_Dec_06_04_00_07_442925 but 2 lines up I can in fact see hostId.  Any ideas why that would be?16:41
openstackgerrit  Monty Taylor proposed openstack-infra/zuul master: Add appending yaml log plugin  https://review.openstack.org/623256  16:47
mordredpabelanger: looking16:49
mordredpabelanger: no - that makes no sense16:51
mordredpabelanger: I'm landing so can't dig too deep for a few minutes - but I'm gonna put money on a bug :(16:52
pabelangermordred: okay, that is what I figured also. I can start to dig into it more locally today too16:52
mordredpabelanger: cool. I'm guessing something in the conn.compute.servers() -> to_dict() -> normalize_server() sequence16:54
mordredpabelanger: which is new and is the first step in making the shade layer consume the underlying sdk objects16:54
mordredalthough looking at it it seems like all the things are in place properly to make sure you'd end up with a host_id16:55
clarkbssbarnea|rover: mordred: those test nodes are very memory constrained; I wonder if OOMKiller is targeting that process if it gets invoked17:01
clarkbssbarnea|rover: do those jobs capture syslog? we should be able to check for OOMKiller there17:01
rlandyclarkb: wrt private_ipv4 for drivers that only define a public_ipv4, I am happy to put in a review to default the private_ipv4 value in the role but if making the behavior consistent is possible, I think that would be better17:12
pabelangermordred: ack, thanks for the pointers17:12
clarkbrlandy: we may want to do both things now that I think about it more. As much as possible, consistent driver behavior from nodepool is desirable, but the roles should manage when they aren't consistent (maybe someone is running older nodepool)17:13
Shrewstobiash: i think i found the race in test_handler_poll_session_expired. running for a bit locally before i push up the fix17:14
tobiashYay :)17:14
rlandyclarkb: understood. I'll put in the role change for my own testing at least. Currently I am hacking up the job definition which is not a good way to go17:15
corvusShrews: if the deleted state didn't help, should we revert that patch?  (but also, any idea why it didn't work?)17:27
openstackgerrit  David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_handler_poll_session_expired  https://review.openstack.org/623269  17:39
Shrewscorvus: i have no idea why it didn't work. as for removing it, the only upside to doing so is an easier upgrade path for operators (the code itself doesn't hurt anything afaict)17:40
corvus  tobiash: heads up on https://review.openstack.org/620285  17:41
Shrewscorvus: downgrading now might be tricky (we'd have to make sure there are no DELETED node states before we restarted)17:41
Shrewsan unlikely scenario, but we'd still have to check17:41
corvusShrews: hrm, maybe we keep DELETED in there for a little while?17:41
corvusmaybe until after the next release...17:41
corvussorry let me clarify17:41
corvusmaybe we should keep DELETED as an acceptable state, but remove the code which sets it17:42
corvusthen after the next release, also remove the state17:42
Shrewscorvus: it's possible that change is at least a little helpful, too. but hard to quantify17:42
corvusi think one of two plans make sense: 1) debug the DELETED patch and figure out how to make it work; or 2) agree that we should switch to sibling locks, remove the deleted state, and rely on the cleanup worker until we make the switch.17:43
Shrewsi think 2 is the real solution, but much harder to get there17:44
SpamapSlooks like /build/xxx doesn't know how to 404.17:45
corvusyeah, we'll need coordination between zuul and nodepool for that17:45
SpamapSit just... waits17:45
corvusSpamapS: :(17:45
SpamapSYa.. throw it on the bug pile? ;-)17:45
corvusSpamapS: there's sevear lines of code about returning a 404 in there17:45
corvuswow17:45
corvusseveral17:45
corvus  SpamapS: it does take a while, but http://zuul.openstack.org/api/build/foo returns 404  17:46
corvus  a while=3.7s  17:46
clarkbssbarnea|rover: mordred: Looking at logs for cases with Waiting on logger. This happens when the run playbook seems to die with "[Zuul] Log Stream did not terminate". Then the post-run playbook has the Waiting on logger errors, presumably because port 19885 is still held by the existing log stream daemon?17:47
tobiashcorvus: thanks, looking17:47
clarkbssbarnea|rover: mordred: It also seems that when this happens we have incomplete log stream for the run playbook, but ara shows that things kept going behind the scenes17:48
clarkbhttp://logs.openstack.org/25/620625/2/gate/tripleo-ci-centos-7-standalone/70949b6/job-output.txt.gz#_2018-12-06_17_21_48_790889 is an example. This particular job failed trying to run delorean17:48
SpamapShm I may not have waited 3.7s17:48
clarkbI don't find any OOMs so it is possibly a bug in zuul (with the cleanup of the streamer in run failing hence talking about it here and not in -infra)17:48
corvusSpamapS: i don't see an index on uuid; that's probably why the response is slow17:48
corvusso for us, it's 3.7 seconds for a table scan i guess17:49
SpamapSdef worth an index, but the UI still doesn't show the 40417:49
SpamapSLoading... forever17:49
corvus(yeah, the table scan part of it takes 2.39s for us)17:50
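
If the slow lookup really is the missing index, one fix could be an Alembic migration along these lines. The table and column names (zuul_build.uuid) are assumptions from the API discussion, and a real migration would also need to honor the configured table prefix.

    """Sketch of an Alembic migration adding an index on the build UUID column."""
    from alembic import op

    # Placeholder revision identifiers for illustration only.
    revision = 'add_build_uuid_idx'
    down_revision = None


    def upgrade():
        # Avoids the full table scan seen when looking up /api/build/<uuid>.
        op.create_index('zuul_build_uuid_idx', 'zuul_build', ['uuid'])


    def downgrade():
        op.drop_index('zuul_build_uuid_idx', table_name='zuul_build')
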
SpamapSIn fact /api/build/foo show Loading... too.. hmm17:50
SpamapSWhy is that an HTML page and not a json response?17:50
corvusSpamapS: oh, maybe you need to restart zuul-web?17:50
tobiashShrews, corvus: while reading that stats rework, I've a side question. How can zuul cancel a node request that is currently locked by a provider trying to fulfill it?17:51
SpamapSI haven't upgraded recently.17:51
corvusSpamapS: i've seen zuul-web get grumpy after a scheduler restart17:51
SpamapSoh, it seems to switch on Accept18:52
SpamapSBecause my browser is sending Accept html, it's sending me HTML17:52
Shrewstobiash: not currently possible17:53
corvusSpamapS: erm, we don't have anything like that.  you can hit the api in your browser17:53
corvustobiash, Shrews: wel... um, it looks like it actually just deletes the request out from under nodepool.17:53
SpamapShm, no it's not Accept.17:53
SpamapSI cannot hit the api in *my* browser17:53
Shrewscorvus: well, i mean, if we want to talk about out of bounds methods...17:53
SpamapS-> zuul.gdmny.co17:53
corvustobiash, Shrews: without any consideration of the lock.17:53
SpamapS(it's still not auth walled)17:54
tobiashah ok, that just works ;)17:54
SpamapSI probably messed something up in the translation from mod_rewrite to Nginx.17:54
Shrewscorvus: tobiash: is this something we are actively seeing then? i thought it was a hypothetical question17:54
SpamapSCurl'ing my api works17:54
SpamapSbut browsering it just shows Loading...17:54
corvusSpamapS: if i shift-reload i get an api response.17:54
tobiashShrews: it was a hypothetical question17:54
SpamapScorvus: *weird*17:54
corvusShrews: i'm sure this must happen in openstack-infra17:55
SpamapSAnd of course the javascript fetches are getting json17:55
corvusSpamapS: there may be something weird about the javascript service worker17:56
tobiashShrews, corvus: I'm also thinking how we would design this for the scheduler-executor interface17:56
SpamapSI'm guessing there's a header combination that gets you HTML.17:56
tobiashmaybe with two locks, a modify-znode-lock and a processing-lock17:56
* SpamapS is out of time to investigate though17:56
corvustobiash: okay you're getting way ahead of me here.  what's this have to do with the scheduler-executor interface?17:57
tobiashor to rephrase, locks for modifying the object, and a further ephemeral node that is held during processing17:57
tobiashcorvus: I'm thinking about the scale out scheduler17:58
corvustobiash: i know17:58
corvustobiash: oh, you're thinking we need a distinct lock for "i'm running the job" and a separate lock for modifying the job information17:58
tobiashso I thought, if the executor holds the lock during processing, how can we cancel a build?17:58
tobiashexactly17:58
tobiashactually that's the same with the node-requests that are now just deleted17:59
corvustobiash: i agree, the situations are similar.17:59
corvuswe could probably do either thing: 2 locks, or, accept that "delete node out from under the lock" is a valid API :)18:00
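
A rough kazoo sketch of the two-lock idea: a short-lived lock guards modifications to the request znode, while an ephemeral child advertises that a component is actively processing it, so cancellation can be signalled without deleting data out from under the processor. Paths, identifiers, and the "cancel" convention are all illustrative, not Zuul's or Nodepool's actual layout.

    from kazoo.client import KazooClient
    from kazoo.exceptions import NodeExistsError

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.start()

    request_path = '/example/requests/0000000001'  # illustrative path
    zk.ensure_path(request_path)

    # 1) Ephemeral "processing" marker: says this component is actively working
    #    on the request; it disappears automatically if the process dies.
    try:
        zk.create(request_path + '/processing', b'launcher-1', ephemeral=True)
    except NodeExistsError:
        pass  # someone else is already processing it

    # 2) Short-lived lock held only while reading/updating the request data, so
    #    another component can flag the request (e.g. create a 'cancel' child
    #    the processor watches) instead of deleting it out from under the lock.
    with zk.Lock(request_path + '/modify-lock', identifier='launcher-1'):
        data, _stat = zk.get(request_path)
        # ... modify the payload here ...
        zk.set(request_path, data)
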
corvusi'm not sure if the current situation with requests is on-purpose or accidental.  i'm not sure what nodepool will do at this point if the request disappears from under it.  especially with the cache changes.18:01
tobiashcorvus: yes, for node-requests, but for jobs on the executor we might want not to delete it but leave it within the pipeline (if we follow your suggestion that the executors take their jobs directly from the pipeline data in zk)18:02
tobiashcorvus: with the cache changes the object is removed from the list, but if some other code path currently processes it, it probably just gets errors when locking or saving the node18:03
Shrewscorvus: tobiash: well, nodepool explicitly looks for node requests to disappear during handling as an assumed error condition. we could just pretend that's a proper stop-now-please api18:03
corvustobiash: if we wanted to, i think we could get rid of canceled build records faster than we do now.  if we wanted to, we could just delete the build record and have the executor detect that and abort.  i'm not saying let's do it that way, but i do think it's an option.18:03
clarkbssbarnea|rover: mordred http://logs.openstack.org/25/620625/2/gate/tripleo-ci-centos-7-standalone/70949b6/logs/undercloud/var/log/journal.txt.gz#_Dec_06_16_09_23 I think maybe that is the issue. Running out of disk space? The log streaming reads off of disk and I could see where maybe the reads and the writes get said if we run out of disk?18:04
corvusShrews: retro-engineering!18:04
corvusShrews: the fact that we're only now thinking about it probably means it's working okay :)18:04
Shrews  http://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/driver/__init__.py#n680  18:04
corvusShrews: that almost looks purposeful18:04
Shrewsright?18:05
corvusShrews: i'm feeling generous; i'm going to assume we engineered it that way and forgot :)18:05
Shrewstotes18:05
Shrewscorvus: that's how i (now) remember it18:05
corvus++18:05
clarkbssbarnea|rover: mordred what is odd there is the job seems to start in that state so may be unrelated, however that could also explain why delorean failed possibly18:05
tobiashcorvus: ok, I think the delete will work in the executor case too18:07
corvustobiash: i think as part of this, we probably want to have the executor hold the lock on nodes in the future.  so scheduler deletes build record; executor detects that and aborts job and releases node locks.18:08
Shrewscorvus: tobiash: actually, we may want to move that nodepool check up a bit in the code. it will only reach that point if it's done launching all requested nodes18:09
corvusShrews: ++ should save some time18:09
Shrewsthe 'if not launchesComplete(): return' is above that18:09
corvusShrews: though... that may be complex18:10
Shrewsyeah18:10
Shrewsjust thinking of the consequences...18:10
corvusShrews: it's okay if we complete the request and then end up with some extra ready nodes.  but if we want to abort mid-launch it'll get messy.18:10
tobiashcorvus: probably makes sense, I'll think about it18:10
SpamapScorvus: btw, it's possible that my API being behind CloudFlare could cause weirdness.18:11
corvustobiash: i think the main driver there is -- it's really the executor using the nodes.  a distributed scheduler may get restarted at any time and should have no consequence to running jobs.  only if the executor running the job is restarted should the nodes be returned.18:11
SpamapSI already can't use encrypt_secret.py through it.18:11
SpamapS(CloudFlare blocks unknown user agents and you have to pay to whitelist things, something we'll do.. but.. not today ;)18:12
corvusSpamapS: you may want to ask tobiash about building the web dashboard without support for service workers and see if that fixes any weirdness.18:12
tobiashcorvus: totally correct, I'm just thinking about who should request the nodes. I think this should still be the pipeline processor18:12
corvustobiash: yes.  the hand-off to an executor will be a neat trick.  :)18:12
tobiashand the executor holding the lock on the nodes is absolutely the right thing18:13
tobiashyeah, so the scheduler requests it, but the executor that got the job accepts it and locks the nodes18:14
clarkbmordred: ssbarnea|rover ok where the run streaming stops in that job there is a nested ansible run which ara reports was interrupted; data will be inconsistent18:16
clarkband from that point forward we stop getting streaming. So something is happening there that affects more than just zuul18:16
openstackgerrit  Merged openstack-infra/zuul master: web: break the reducers module into logical units  https://review.openstack.org/621385  18:20
*** electrofelix has quit IRC18:26
*** cristoph_ has joined #zuul18:30
SpamapSSo... I'm about to submit a slack notifier role for Zuul... wondering if we should stand up a slack (they're free) just for running test jobs.18:31
SpamapSAlso.. Ansible 2.8 has added a threading mechanism to the slack module that would be super useful for threading based on buildset.... wondering how we're looking for catching up to Ansible any time soon.18:32
Shrewsit feels like we just caught up to ansible, like, last week18:34
Shrewswe need to slow their momentum  :)18:34
* Shrews plots an inside attack18:34
tobiashcorvus: I'm +2 on 62028518:38
tobiashcorvus: do we need to announce such a change on the mailing list?18:39
SpamapS<2.6 .. wasn't 2.6 like.. over a year ago?18:40
tobiashSpamapS: nope, we switched to 2.5 just one week before 2.6 has been released. And that was this year ;)18:41
tobiashthat has been merged in june... (https://review.openstack.org/562668)18:43
corvuswe need to find a volunteer for the support multiple ansible versions work18:43
clarkbSpamapS: openstack infra's testing of ansible 2.8 shows that handlers are going to all break unless things are changed in 2.8 before release18:44
tobiashI think I could at least support18:44
clarkbso ya ++ to multi ansible instead18:44
SpamapSOh right ok 2.6 was July18:52
SpamapSSeems like multi-ansible is a virtualenv+syntax challenge, yeah?18:53
tobiashSpamapS: plus possibility to pre-install18:54
tobiash(my zuul doesn't really have access to the internet)18:55
clarkbif we used the venv module we should be able to document steps or supply a script to preinstall venv virtualenvs for zuul as it would on demand18:55
SpamapSThere's some interesting things to think through like, do we have 1executor:manyansibles, or just 1:1 executor:ansible and make ansible version a thing executors subscribe to (like "hey I can do 2.6")18:55
tobiashSpamapS: also we need to inject different versions of the command module into different versions of ansible18:56
clarkbtobiash: that is the biggest challenge I think18:56
SpamapSIs there no way we can write one that works for all supported versions?18:56
openstackgerrit  David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_handler_poll_session_expired  https://review.openstack.org/623269  18:56
SpamapSand just always inject that in to the module path..18:56
tobiashSpamapS: it could be that the latest one by accident works with all but we don't know18:57
SpamapSAnyway, yeah, would be great to have multi-version support, especially with how fast Ansible seems to be moving/breaking.18:57
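
One shape the pre-install idea could take: a small script that stages one virtualenv per supported Ansible release, so an executor with no internet access never needs to build them on demand. The version list and install path are illustrative, not anything Zuul ships.

    import subprocess
    import venv
    from pathlib import Path

    # Illustrative: one virtualenv per Ansible version an executor should support.
    ANSIBLE_VERSIONS = ['2.5.15', '2.6.10', '2.7.4']
    BASE = Path('/var/lib/zuul/ansible')

    def install_ansible_venvs():
        for version in ANSIBLE_VERSIONS:
            env_dir = BASE / version
            venv.EnvBuilder(with_pip=True).create(env_dir)
            subprocess.run(
                [str(env_dir / 'bin' / 'pip'), 'install', f'ansible=={version}'],
                check=True)

    if __name__ == '__main__':
        install_ansible_venvs()
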
*** nilashishc has quit IRC19:04
openstackgerrit  Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null  https://review.openstack.org/623294  19:28
Shrewswow, i don't think this nodepool test ever worked properly  :(19:29
Shrewsanyone know of a way to have a mock.side_effect both execute code AND raise an exception? seems it's either one or the other19:34
openstackgerrit  Merged openstack-infra/zuul master: web: refactor info and tenant reducers action  https://review.openstack.org/621386  19:35
clarkbShrews: have it call a fake?19:36
clarkbthen have that raise itself19:36
Shrewsclarkb: that doesn't work19:36
Shrewsit can either call the fake, or raise an Exc, but not both it would seem19:37
clarkbShrews: the fake does the raise19:37
Shrewsclarkb: that didn't work in my test19:38
Shrewsthe raise is ignored19:38
Shrewsoh, there is something else wrong here. maybe that will work if i fix that19:41
clarkbShrews: http://paste.openstack.org/show/736784/ it works here19:41
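
For the record, the pattern in clarkb's paste is essentially this (names here are illustrative): when side_effect is a callable, mock runs it, and the callable itself can do the fake work and then raise.

    from unittest import mock

    calls = []

    def fake_poll(*args, **kwargs):
        # Do the fake work first...
        calls.append((args, kwargs))
        # ...then raise, so the caller sees both the side effect and the error.
        raise RuntimeError("session expired")

    handler = mock.MagicMock()
    handler.poll.side_effect = fake_poll

    try:
        handler.poll('node-0001')
    except RuntimeError:
        pass

    assert calls == [(('node-0001',), {})]
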
*** sshnaidm is now known as sshnaidm|afk19:43
openstackgerrit  David Shrewsbury proposed openstack-infra/nodepool master: Fix race in test_handler_poll_session_expired  https://review.openstack.org/623269  19:50
pabelangermordred: it seems we might already be passing in normalized data for server at: http://git.openstack.org/cgit/openstack/openstacksdk/tree/openstack/cloud/openstackcloud.py#n2144 because I can see host_id and has_config_drive data before we attempt to normalize again, which results in loss of data20:30
pabelangermordred: I am not familiar enough with code to figure out how to properly fix20:30
openstackgerrit  Tobias Henkel proposed openstack-infra/zuul master: WIP: Add spec for scale out scheduler  https://review.openstack.org/621479  20:35
mordredpabelanger: hrm. we shouldn't be double-normalizing :(20:35
mordredpabelanger: OH20:36
pabelangerOh, HAHA20:37
pabelanger  http://logs.openstack.org/07/623107/2/check/nodepool-functional-py35/e49051c/controller/logs/screen-nodepool-launcher.txt.gz#_Dec_06_03_59_24_677871  20:37
pabelangerthis actually works with 0.20.020:37
pabelangerbut is a bug in master20:37
mordred  remote:   https://review.openstack.org/623308 Deal with double-normalization of host_id  20:38
mordredpabelanger: ^^20:38
pabelangermordred: https://review.openstack.org/621585/ I think that is what broke it20:38
mordredpabelanger: I believe it's because what we're now starting from is an openstack.compute.v2.server.Server Resource object that we then run to_dict() on. the Resource object already coerces hostId into host_id - and the normalize function was only doing host_id = server.pop('hostId', None) - but there isn't a hostId in the incoming - only a host_id20:41
pabelangermordred: yes, exactly20:42
pabelangerpossible there is others, but haven't checked20:42
mordredpabelanger: so I Think that patch above will fix this specific thign - the next step is actually to make that normalize function go away completely20:42
mordredpabelanger: but I figure that's going to need slightly more care than a quick fix20:42
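
The quick fix mordred describes boils down to something like this sketch (not the exact content of 623308): accept either the raw 'hostId' key from the compute API or the 'host_id' key the SDK Resource layer has already coerced, so a second normalization pass doesn't drop the value.

    def pop_host_id(server):
        # 'server' may be a raw API dict ('hostId') or a dict produced from an
        # already-normalized SDK Resource ('host_id'); handle both.
        host_id = server.pop('hostId', None)
        if host_id is None:
            host_id = server.pop('host_id', None)
        return host_id
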
openstackgerrit  Merged openstack-infra/zuul master: web: add error reducer and info toast notification  https://review.openstack.org/621387  20:42
pabelangermordred: patch is missing trailing ) but worked20:44
pabelangermordred: clarkb: corvus: Shrews: okay, so https://review.openstack.org/623107/ is in fact working, if you'd like to review and confirm format of patch is something we want to actually do20:48
mordredpabelanger: yay! I have updated the patch to add the appropriate number of )s20:51
pabelangermordred: +220:51
mordred\o/20:52
openstackgerrit  Monty Taylor proposed openstack-infra/zuul master: Read old json data right before writing new data  https://review.openstack.org/623245  20:59
openstackgerrit  Monty Taylor proposed openstack-infra/zuul master: Add appending yaml log plugin  https://review.openstack.org/623256  20:59
fungilooks like that busy cycle for the executors lasted ~17 minutes21:05
fungier, wrong channel (sort of)21:05
SpamapSmordred: there are ways to make json append-only too you know.21:10
fungii assumed the point of 623256 was more expressing a preference for yaml instead of json21:13
fungii do find it sort of disjoint that ansible takes yaml input and returns json output21:14
clarkbfungi well yaml can be written to without first parsing the file21:16
clarkbso its memory overhead is better21:16
fungiahh, yeah, i didn't consider that angle21:16
corvusSpamapS: can you elaborate on your json thoughts?21:19
SpamapScorvus: so there are some parsers that can handle this string as a "json stream":   '{"field":1}\n{"field":2}\n'21:31
SpamapSWhich allows you to have append-only json21:31
SpamapSBut not all parsers do it21:31
corvusSpamapS: i think that's the crux -- that we want the output to be valid normal json, not special zuul json21:32
corvus(because we want this to be valid if the job dies at any point)21:33
SpamapSYeah, I could have sworn there was a standard for doing it but I can't find it, so I probably dreamed it.21:35
mordredSpamapS: yah - I originally looked for a standard ... everything I could find with streaming json was just people doing really weird stuff21:42
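
For illustration, the newline-delimited form SpamapS describes can be appended to and read back with just the standard library; each line is its own JSON document, which is also why corvus's point stands that the file as a whole is not plain JSON. The file name is illustrative.

    import json

    def append_record(path, record):
        # Append one complete JSON document per line; nothing existing is re-read.
        with open(path, 'a') as f:
            f.write(json.dumps(record) + '\n')

    def read_records(path):
        # Each non-empty line parses on its own, so only a partially written
        # trailing line can ever be invalid.
        with open(path) as f:
            return [json.loads(line) for line in f if line.strip()]

    append_record('stream.jsonl', {'field': 1})
    append_record('stream.jsonl', {'field': 2})
    print(read_records('stream.jsonl'))
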
mordredbut I figure - other than python's weird obsession with not including yaml support in the core language - everybody else seems to be able to parse it easily21:43
SpamapSmordred: so in yaml to make it appendable you just have to indent everything by one and start with a "- " and, all good... +121:46
mordredSpamapS: heh21:49
mordredSpamapS: no - actually just separate sections with --- ... to make it a multi-document file21:49
mordredSpamapS: it's actually k8s yaml files that gave me the idea21:50
SpamapSOh docs.. hm21:54
SpamapSYou'd be surprised how many yaml parsers do not support multi doc21:54
SpamapSMostly because they're short-sighted.21:54
SpamapS"make maps into {my language's version of dict} and lists into {my language version of list} and done"21:55
openstackgerrit  Merged openstack-infra/zuul master: Read old json data right before writing new data  https://review.openstack.org/623245  21:55
mordredSpamapS: at this point in my life, nothing surprises me22:03
mordredSpamapS: well, except for bojack getting zero golden globe nominations22:03
SpamapSI'm sure he has a long face.22:03
openstackgerrit  Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null  https://review.openstack.org/623294  22:04
openstackgerrit  Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null  https://review.openstack.org/623294  22:20
*** manjeets_ is now known as manjeets22:28
*** dkehn has quit IRC22:52
openstackgerrit  Ronelle Landy proposed openstack-infra/zuul-jobs master: WIP: Default private_ipv4 to use public_ipv4 address when null  https://review.openstack.org/623294  22:53
SpamapSHey, I'm setting up a Slack just to test the slack notifier role I've built to submit to zuul-roles. Who would like to be added to that slack? Anybody?22:57
*** cristoph_ has quit IRC22:58
*** dkehn has joined #zuul23:01
openstackgerrit  Paul Belanger proposed openstack-infra/nodepool master: Include host_id for openstack provider  https://review.openstack.org/623107  23:49
clarkbpabelanger: does ^ depend on a fix in the sdk lib?23:50
pabelangerclarkb: no, that was a failure with unreleased version of openstacksdk. Ones on pypi work23:52
pabelangerwe can add depends-on if we want however23:52
