Thursday, 2018-10-18

*** jangutter has quit IRC08:05
*** jangutter has joined #softwarefactory08:06
*** sshnaidm_ has joined #softwarefactory08:59
*** sshnaidm_ has quit IRC09:18
*** zoli is now known as zoli|lunch09:42
*** zoli|lunch is now known as zoli09:42
*** sshnaidm has joined #softwarefactory10:06
*** sshnaidm has quit IRC10:12
*** sshnaidm has joined #softwarefactory10:58
*** zoli is now known as zoli|afk11:34
*** zoli|afk is now known as zoli11:34
*** sshnaidm_ has joined #softwarefactory11:50
*** sshnaidm has quit IRC11:53
sfbenderMatthieu Huin created software-factory/sf-config master: Add (undocumented) login collision strategy option in cauth config  https://softwarefactory-project.io/r/14003 12:42
pabelangerfbo: tristanC: mhu: we've deleted a branch from a repo, but don't believe zuul saw the event, now we are getting merge conflict errors on ansible-network/cloud_vpn. Can you check the zuul merge logs and see why?  I also believe a full reconfigure of zuul may fix it, but have no way of triggering that currently15:52
mhupabelanger, I'm going to have a look15:55
pabelangerIt is likely this is a bug, since zuul didn't see the delete branch event, and didn't reload configuration15:57
pabelangerbut logs will show more15:57
mhuI found PR #50, I'm following the thread...16:00
pabelangeryah, PR54 also fails: https://github.com/ansible-network/cloud_vpn/pull/54 16:01
mhufound this on PR51, not sure if related or already fixed: https://pastebin.com/Sa152hVf 16:02
*** mhu has left #softwarefactory16:06
*** mhu has joined #softwarefactory16:06
mhuoops16:06
mhuhttps://pastebin.com/HVvQcnN3 16:06
mhuthat's all i see in the logs16:07
mhupabelanger, ^16:07
mhupabelanger, a reconfigure requires a restart? if so I'm not too keen on restarting the service now, there are 33 jobs queued on the rdo tenant16:10
pabelangermhu: okay, can you get me the relevant logs so we can report this in #zuul, because this looks to be a zuul ub16:13
pabelangerbug*16:13
pabelangermhu: no, you can do full-reconfigure from CLI16:13
*** sshnaidm_ is now known as sshnaidm16:14
mhuit's not documented in the CLI help? or is it something now?16:14
pabelangermhu: SIGHUP16:14
pabelangerlet me find docs16:14
mhuthanks16:14
pabelangermhu: https://zuul-ci.org/docs/zuul/admin/components.html#operation 16:15
*** atarakt has left #softwarefactory16:17
*** nhicher has joined #softwarefactory16:17
pabelangermhu: did the docs help?16:22
mhupabelanger, yeah but is that recent? the zuul-scheduler on sfio doesn't have this option16:23
mhuthe only option is "stop"16:23
pabelangermhu: yes, but you can kill -s SIGHUP <pid>16:24
pabelangerthat is the original way of doing it16:24
mhuright, sorry, I misread the doc16:24
mhuend of the day, etc16:24
pabelangerYah, need more coverage in NA :)16:25
mhuok, SIGHUP done16:27
pabelangermhu: you should see the reload happening in the logs16:27
pabelangerodd16:29
pabelangerhttps://ansible-network.softwarefactory-project.io/zuul/status 16:29
pabelangerRequest failed with status code 500 16:29
pabelangermhu: did zuul stop?16:31
pabelangerhttps://softwarefactory-project.io/grafana/d/000000001/zuul-status?orgId=1&from=now-1h&to=now&refresh=5s 16:31
mhuit didn't appear stopped in systemctl16:31
pabelangerI don't see any executors or mergers online16:31
mhuto be sure I restarted the scheduler and web16:31
pabelangermhu: Oh, you stop / started everything?16:32
mhuodd, I'm not even on the executors nor mergers16:32
mhujust the scheduler16:32
pabelangermhu: did you dump the queues first?16:32
mhuno ...16:32
pabelangerotherwise, we lose all the open patches that are running16:32
pabelangerokay, that is an issue then16:33
pabelangerwe should avoid doing that, as all open changes now need to be rechecked16:33
pabelangerokay, my PR looks right16:34
mhuahah, well there's that at least16:35
pabelangermhu: you'll have to notify rdo all open changes need to be rechecked16:35
mhuyup, going there16:35
pabelangermhu: https://docs.openstack.org/infra/system-config/zuulv3.html#restarting-the-scheduler 16:35
pabelangeris a good doc explaining how to do restarts safely16:35
pabelangeralso, dmsimard wrote a script upstream for infra, that dumped queues every minute, in case this happens16:36
pabelangerthen we have something to at least try and re-enqueue16:36
pabelangerI would strongly recommend adding it to SF.io16:36
mhualso I shouldn't be allowed near production systems past 6PM16:36
mhuand with that, I'm off, catch you later16:37
pabelangermatburt: there was a Zuul outage, see above. you might need to recheck open awx PRs16:39
pabelangermhu: tristanC: fbo: https://review.openstack.org/#/c/532955/ is the patch from dmsimard, can we please add it to SF.io zuul if missing16:40
matburtpabelanger hah we noticed16:54
pabelangermatburt: yah, sorry about that. Going to work with SF.io team to help protect more from total outage16:55
matburtmeh, it is what it is16:55
dmsimardpabelanger: I'm not sure if they're still saved in the zuul web dir, we might have moved them afterwards17:03
pabelangermhu: fbo: tristanC: we are still getting a merge conflict from zuul, can we debug please: https://github.com/ansible-network/cloud_vpn/pull/54 17:46
pabelangercheck pipeline works, but when we move to gate, fails17:50
*** ssbarnea_ has joined #softwarefactory20:03
*** ssbarnea_ has quit IRC21:53
