Saturday, 2022-03-05

corvus: zuul01 is back up along with apache, and lb is serving both [00:00]
clarkb: yup I'm back to hitting 01 [00:00]
corvus: we can make a dashboard the "default"... i could do that for the zuul status page? [00:01]
corvus: oh, or is that just for that account? [00:01]
corvus: in which case.. maybe fungi's idea :) [00:02]
corvus: oh there's an organization default_home_dashboard_path [00:03]
corvus: okay, that's not really working and i don't think it's worth sinking more time into :) [00:07]
opendevreview: Merged opendev/system-config master: Do more robust checks against zuul-web with haproxy  https://review.opendev.org/c/opendev/system-config/+/832141 [00:15]
fungi: yeah, https://grafana.opendev.org/dashboards seems like a more useful landing page than the root page [00:24]
Clark[m]: In theory we are doing the http checks now and I can still get https://zuul.opendev.org [01:20]
corvus: i will try the stop experiment again [01:23]
corvus: it's only downed finger... [01:24]
corvus: do we need to reload haproxy? [01:24]
corvus: hrm, i restarted haproxy and that hasn't improved things [01:25]
corvus: oooh [01:28]
corvus: Clark: apache returns 302 for options / and that's acceptable for haproxy -- at least for http [01:29]
corvus: checking the https logs now [01:29]
fungi: oh, i think it's our redirecting [01:30]
fungi: we probably need to check something other than / [01:30]
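
The 302 observation above matters because, as far as the HAProxy documentation describes it, a plain `option httpchk` treats any 2xx or 3xx response as a pass, so a redirect from `/` looks healthy. A minimal Python sketch of that default rule follows; the function name is hypothetical and this is an illustration of the rule, not HAProxy's code.

```python
def default_httpchk_passes(status_code: int) -> bool:
    """Mimic the documented default: any 2xx or 3xx status counts as healthy."""
    return 200 <= status_code <= 399


# A redirect passes, a client or server error does not.
for status in (200, 302, 400, 503):
    print(status, default_httpchk_passes(status))
```

This is why probing a path that returns a real 200 (rather than a redirect) gives a more meaningful signal.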
corvus: yeah... i'm not seeing the options request for https though... [01:30]
corvus: so i don't know what's going on there [01:30]
corvus: it may still be redirecting because of a hostname mismatch [01:31]
corvus: but i'd like to confirm [01:31]
fungi: option httpchk GET /wherever [01:32]
Clark[m]: Ah ya that is configurable [01:32]
Clark[m]: HEAD not GET is probably better [01:32]
fungi: option httpchk GET / HTTP/1.1\r\nHost: zuul.opendev.org [01:33]
corvus: also, btw, someone is crawling all the builds with python urllib [01:33]
fungi: that may be the replacement for logstash/elasticsearch [01:33]
Clark[m]: I'm doing dinner stuff so can't push anything now [01:33]
corvus: well, before we push anything, i still want to understand the https frontend [01:34]
corvus: i can't find any evidence of a check there [01:34]
Clark[m]: Ah [01:36]
fungi: i can't figure out how to manually emulate a browser when opening an ssl socket to apache on one of the schedulers [01:42]
fungi: no matter what i try to pass after opening the socket, apache sends back a 400 bad request [01:43]
Clark[m]: fungi: curl has a verbose mode that will show you all that iirc [01:44]
fungi: yeah, i'm able to use openssl s_client to connect to gitea01:3000 and send 'GET / HTTP/1.1\r\nHost: opendev.org' and get page content back [01:46]
fungi: can't seem to do something similar to zuul01 or zuul02 though [01:46]
corvus: only try zuul02 right now; 01 is down [01:49]
corvus: and /api/info is a good endpoint to get [01:49]
fungi: okay, i think i needed -noservername [01:50]
fungi: seems s_client may have been trying to do sni [01:50]
fungi: no, nevermind, i wasn't connecting to the host i meant to [01:51]
fungi: curl seems to be able to get content, yeah, setting --resolve zuul.opendev.org:443:104.130.246.31 [01:54]
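
The manual probing described above (connect to one backend by IP, but present the public hostname) can also be sketched in Python with only the standard library. This is a rough equivalent of the `curl --resolve` approach, not a tool anyone in the channel used; the helper names are hypothetical, and it assumes the backend presents a certificate valid for the hostname.

```python
import socket
import ssl


def build_request(hostname: str, path: str = "/api/info") -> bytes:
    """Build a minimal HTTP/1.1 GET request with an explicit Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_status_line(ip: str, hostname: str, path: str = "/api/info",
                      port: int = 443, timeout: float = 5.0) -> str:
    """Connect to ip:port, send hostname via SNI and Host, return the status line."""
    # Default context verifies the certificate against `hostname`.
    context = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            tls.sendall(build_request(hostname, path))
            response = b""
            # Read until the first line of the response is complete.
            while b"\r\n" not in response:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                response += chunk
    return response.split(b"\r\n", 1)[0].decode("iso-8859-1")


# Example call (needs network access), using the address from the log:
# print(fetch_status_line("104.130.246.31", "zuul.opendev.org"))
```

Note that the request ends with an empty line (`\r\n\r\n`) and carries a Host header, both of which HTTP/1.1 servers require.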
corvus: i'm trying various options with haproxy, and i still don't see any https healthchecks [01:56]
corvus:     option httpchk GET /api/info HTTP/1.1\r\nHost:\ zuul.opendev.org [01:58]
corvus:     server zuul01.opendev.org 104.130.246.57:443 check-ssl verify none [01:59]
corvus: that's the current config :/ [01:59]
fungi: is it actually connecting? [02:01]
corvus: i see no evidence we have ever executed an https health check, and therefore zuul01 is currently in the load balancer despite being down. [02:01]
corvus: this may not be working on the gitea lb either -- i don't think there's anything zuul-specific about this [02:02]
fungi: yeah, i'm trying to figure out if it logs health check failures at all [02:03]
corvus: it's time for dinner here, so i'm going to restore the config and bring zuul01 back up; but this is definitely worth more investigation [02:03]
fungi: looking at the syslog on gitea-lb01 [02:03]
corvus: fungi: it does -- you can see them on zuul-lb01 for finger (since that is failing health checks) [02:03]
fungi: and yes, it's getting lateish here too but i'll keep poking for a bit [02:03]
corvus: i'm looking at the apache logs for the actual checks, and i saw them for http but not https [02:03]
corvus: okay, haproxy and zuul01 should both be coming back up now; it may take a bit for zuul01 to be back in service [02:04]
Clark[m]: What is odd is how is it up at all without doing checks [02:10]
Clark[m]: Seems like it would have to check it is up first. We had issues in testing when it tried to verify ssl implying at least then it checked [02:11]
corvus: i'm wondering if it's silently falling back to tcp checks; we could probably confirm by downing apache (but i'm not going to do that right now because i'm no longer here) [02:14]
fungi: i'm wondering if we've accidentally configured it for passive checks [02:32]
fungi: still reading up on haproxy's active vs passive checks [02:33]
fungi: yeah [02:33]
fungi: Whereas an active health check continually polls the server with either a TCP connection or an HTTP request, a passive health check monitors live traffic for errors. You can enable this mode by adding the check, observe, error-limit, and on-error parameters to a server line [02:34]
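
The active/passive distinction quoted above can be illustrated with a toy Python model. This is a simplified sketch, not HAProxy's implementation: the class names are hypothetical, the `rise`, `fall`, and `error_limit` parameters are only loosely modelled on the HAProxy settings of the same names, and the passive model assumes the reaction to errors is to mark the server down.

```python
from typing import Callable


class ActiveCheck:
    """Polls the server itself; flips state after `fall` failures or `rise` successes."""

    def __init__(self, probe: Callable[[], bool], rise: int = 2, fall: int = 3):
        self.probe, self.rise, self.fall = probe, rise, fall
        self.up = True
        self.streak = 0  # consecutive probe results that disagree with current state

    def poll(self) -> bool:
        ok = self.probe()
        if ok == self.up:
            self.streak = 0
        else:
            self.streak += 1
            if self.streak >= (self.rise if ok else self.fall):
                self.up, self.streak = ok, 0
        return self.up


class PassiveCheck:
    """Never probes; counts consecutive errors observed in live traffic."""

    def __init__(self, error_limit: int = 10):
        self.error_limit = error_limit
        self.errors = 0
        self.up = True

    def observe(self, request_ok: bool) -> bool:
        self.errors = 0 if request_ok else self.errors + 1
        if self.errors >= self.error_limit:
            self.up = False
        return self.up


# Active: three failed probes take the server down, two good ones bring it back.
results = iter([False, False, False, True, True])
active = ActiveCheck(lambda: next(results))
print([active.poll() for _ in range(5)])
```

The key difference for the problem at hand: the passive model only learns anything when real traffic flows through it, whereas the active model generates its own requests, which is what should show up in the backend's access logs.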
fungi: since we're forwarding at layer 4 not 7, i don't think we can use passive health checks [02:34]
fungi: https://www.haproxy.com/blog/how-to-enable-health-checks-in-haproxy/ explains a lot of the possibilities [02:38]
fungi: i think we need to set "http-check connect ssl" [02:40]
fungi: it looks like the check-ssl parameter on the server lines may only apply to passive checks not active [02:41]
*** mazzy5098812929580851 is now known as mazzy509881292958085 [02:52]
fungi: happy to push a change to add that if others' readings agree [03:02]
Clark[m]: fungi the thing that confuses me is why is the ssl version different than the http version? [03:36]
Clark[m]: We are doing active checks with http but not https I guess because we don't tell it to connect with ssl? Your suggestion can't hurt [03:37]
*** mtreinish_ is now known as mtreinish [09:12]
