Wednesday, 2023-12-20

<opendevreview> Jeremy Stanley proposed opendev/system-config master: Downgrade haproxy image from latest to lts  https://review.opendev.org/c/opendev/system-config/+/903805 [13:45]
<frickler> fungi: are we fine to merge ^^? I would think re-applying the previous +2's should be fine [14:09]
<frickler> also, do we know which version of haproxy we were running on Saturday? 2.9.0 or 2.9.1? [14:09]
<frickler> ah, the haproxy:latest image on lb02 says HAPROXY_VERSION=2.9.1 [14:11]
*** d34dh0r5- is now known as d34dh0r53 [14:48]
<fungi> yes i think we can merge that (the servers involved are still in the disable list for now anyway) [14:49]
<fungi> and per the commit message it was 2.9.1 we were running when we observed the issue [14:49]
<fungi> i found an upstream bug report which looks more likely to be what we encountered than the previously suspected one, and it seems to have a fix merged upstream so probably the next lts version (whatever it ends up being) won't have the regression. at least here's hoping [14:50]
<fungi> the latest commit message has been updated with a link to the newer bug [14:51]
<frickler> fungi: I was watching that issue, but from the latest comments and the submitted fix I'm not sure how that would match our issue. I think we will need to try and reproduce this and submit our own issue, I'll see if I can get to that next week [16:00]
<fungi> agreed, on the surface it could be what we observed (pretty sure our services are all http/1.1 not 2.0? but i could be wrong) [16:01]
<Clark[m]> I think we can use a testinfra test case on the gitea deployment job to test it [16:22]
<Clark[m]> We have all the components there [16:22]
<fungi> the bigger challenge will be if it turns out to need a high volume of browsing activity to exhibit issues. for example it took more than a day for things to get bad enough for the zuul-lb to start exhibiting user-facing issues (though maybe we can infer the symptoms by analyzing open sessions tracked in haproxy?) [16:28]
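
One way the "analyzing open sessions" idea might be approached is to read the current-session counter (`scur`) out of haproxy's CSV statistics, which the stats socket returns for `show stat`. The following is only a sketch: how the CSV is obtained is left out, and the proxy and server names and all numbers in `sample` are made up for illustration, not taken from the real load balancers.

```python
import csv
import io


def current_sessions(stats_csv):
    """Map (proxy, server) to its current session count (the scur column)."""
    # haproxy's CSV header line starts with "# "; strip that so DictReader
    # sees plain column names.
    text = stats_csv.lstrip()
    if text.startswith("# "):
        text = text[2:]
    sessions = {}
    for row in csv.DictReader(io.StringIO(text)):
        scur = row.get("scur") or ""
        if scur.isdigit():
            sessions[(row["pxname"], row["svname"])] = int(scur)
    return sessions


# Hypothetical sample data, trimmed to a few columns for readability.
sample = (
    "# pxname,svname,qcur,qmax,scur,smax,slim,stot\n"
    "example_https,FRONTEND,,,3998,4000,4000,120000\n"
    "example_https,backend01,0,0,12,40,,60000\n"
)
print(current_sessions(sample))
```

Sampling this periodically and watching whether `scur` keeps climbing toward `slim` while traffic stays flat could hint at leaked sessions before users notice anything.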
<fungi> popping out to lunch with friends, but will return in an hour or so [16:30]
<Clark[m]> I think we can start with something as simple as "for x in range(5000): make web request" and we'd expect all to complete successfully [16:34]
<Clark[m]> If the issue is still present it will fail on the 4001st request [16:34]
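
The loop described above could look roughly like the sketch below. It is not the actual testinfra test: `fetch_status` and `first_failure` are hypothetical helper names, and the URL to hit (the load-balanced gitea endpoint in the test job) is left to the caller. The `fetch` parameter exists only so the loop can be exercised without a network.

```python
import urllib.request


def fetch_status(url, timeout=10):
    """Make one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def first_failure(url, count=5000, fetch=fetch_status):
    """Return the 1-based index of the first failed request, or None."""
    for attempt in range(1, count + 1):
        try:
            ok = fetch(url) == 200
        except OSError:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            ok = False
        if not ok:
            return attempt
    return None
```

A test would then assert that `first_failure(url)` is `None`. If the regression is present, the expectation from the discussion is that it returns 4001 instead.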
<Ramereth> fungi: FYI I think the issue we were having was related to not enough memory available. I just added a new arm64 node a few minutes ago which should help with that. Also it's a different model and is our first node using AlmaLinux 8. Let me know if you run into issues [17:31]
<Ramereth> I booted up a test vm and it seems to be working fine [17:31]
<Ramereth> looks like we already have some of your vms spinning up on it \o/ [17:34]
<fungi> Ramereth: thanks for the update! i'll go ahead and revert the temporary cap i put in place last week [18:26]
<fungi> Ramereth: oh, actually i didn't lower it for osuosl since it was intermittent, but i'll keep an eye on it. thanks again! [18:27]
<fungi> odd, grafana says "no data" for every graph in every dashboard we have, but i'm still able to query graphite directly and it has current data... i wonder if we broke the grafana configs somehow [18:33]
<fungi> the good news is we do still have data, it's just not showing up [18:34]
<frickler> hmm, that must be a very new regression then, I looked at the AFS dashboard during the meeting yesterday and it was still fine, now I can confirm the issue you're seeing [19:29]
<frickler> c1cb8b9db100   grafana/grafana-oss:latest   "/run.sh"   16 hours ago   Up 16 hours             grafana-docker_grafana_1 [19:30]
<frickler> docker log seems to show this for every request: logger=live t=2023-12-20T19:30:51.840943044Z level=warn msg="Request Origin is not authorized" origin=https://grafana.opendev.org host=localhost:3000 appUrl=http://grafana.opendev.org:3000/ allowedOrigins= [19:31]
* frickler is close to eoding, will check closer tomorrow if nobody beats me to it [19:32]
<fungi> aha, sounds like maybe grafana got more picky about cors headers? [19:50]
<fungi> great find! [19:50]
<fungi> https://hub.docker.com/r/grafana/grafana-oss/tags implies "latest" is now 10.2.3 and was probably previously 10.2.2 [19:53]
<fungi> nothing obvious in the changelog for 10.2.3 related to cors [19:55]
<fungi> though several things related to authentication [19:56]
<fungi> https://github.com/grafana/grafana/blob/main/CHANGELOG.md [19:56]
<fungi> or maybe we upgraded from 9.x to 10.x and this is now relevant: https://grafana.com/docs/grafana/latest/setup-grafana/configure-security/configure-security-hardening/#add-a-samesite-attribute-to-cookies [20:02]
<fungi> docker image list mentions 9.0.6 as the next most recent image tag we have cached on the server [20:03]
<fungi> mmm, hunting around, that warning seems to be non-fatal for at least some users, looks like setting live.allowed_origins = "https://grafana.opendev.org" would probably silence it if i could figure out where our grafana config lives [20:30]
<fungi> https://github.com/grafana/grafana/issues/36443 [20:30]
<fungi> so i'm back to having no idea what's breaking it yet [20:38]
<frickler> oh, right, that's only a warning. looking at the browser console is more helpful I think: Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://graphite.opendev.org/render. (Reason: header ‘x-grafana-device-id’ is not allowed according to header ‘Access-Control-Allow-Headers’ from CORS preflight response). [20:41]
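
The browser error above comes from the CORS preflight check: every non-safelisted header on the real request has to appear in the `Access-Control-Allow-Headers` value of the preflight response. The sketch below is a simplified model of that rule, not the browser's full algorithm. `blocked_headers` is a hypothetical helper, and the allow-list string in the usage note is an assumed example rather than the actual graphite configuration.

```python
# Request headers a browser may send without the server listing them
# (the CORS-safelisted request headers from the Fetch specification).
# Simplification: Content-Type is only safelisted for a few values.
SAFELISTED = {"accept", "accept-language", "content-language", "content-type"}


def blocked_headers(request_headers, allow_headers):
    """Return the request headers a preflight response would not permit."""
    allowed = {h.strip().lower() for h in allow_headers.split(",") if h.strip()}
    # Simplification: "*" only acts as a wildcard for requests without
    # credentials; this sketch treats it as always matching.
    if "*" in allowed:
        return []
    return [
        h for h in request_headers
        if h.lower() not in allowed and h.lower() not in SAFELISTED
    ]
```

With an assumed allow list of `"origin, authorization, accept"`, `blocked_headers(["Accept", "X-Grafana-Device-Id"], ...)` returns `["X-Grafana-Device-Id"]`. Appending `x-grafana-device-id` to the list makes it return `[]`.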
<frickler> and that is a header that was added recentish https://github.com/grafana/grafana/commit/1a281ac49dd6b9ee6964badb832918507bf8ba97#diff-70030faa245250908d55db47258ce505d474db1559995683f8df1a951504236fR24 [20:42]
<fungi> oh neat [20:44]
<fungi> seems we add Access-Control-Allow-Headers in playbooks/roles/graphite/templates/graphite-statsd.conf.j2 [20:45]
<frickler> actually this commit enabled it for OSS builds, so that matches better with the 10.2.3 timeframe https://github.com/grafana/grafana/commit/59bdff0280d52ca5d8918157d7697b9279b25501 [20:46]
<fungi> okay, so maybe we did upgrade from 10.2.2 [20:47]
<fungi> rather than 9.something [20:47]
<fungi> i saw that commit in the changelog but in skimming it seemed to just be adding info about anonymous user connections [20:48]
<fungi> "remove check for enterprise for `Device-Id` header in request" i guess that could be related to the now disallowed x-grafana-device-id [20:49]
<fungi> bingo: https://github.com/grafana/grafana/issues/79692 [20:50]
<opendevreview> Jeremy Stanley proposed opendev/system-config master: Temporarily pin Grafana to 10.2.2  https://review.opendev.org/c/opendev/system-config/+/904151 [20:58]
<fungi> pretty sure that issue is the same one we're seeing since we include templated variables and "special characters" like parentheses in basically all our queries [20:59]
<frickler> I'm not sure about that, we might need to report an issue ourselves or try to fix the issue on the graphite side in the long run. I have no idea why ianw added that header initially and whether it would be safe to simply add the new header grafana sets [21:04]
<frickler> also displayAnonymousStats only seems to disable displaying the stats, it will not disable sending the new header if my reading of the patch is correct [21:04]
<frickler> but pinning to 10.2.2 and verifying that the errors don't appear in the browser console there is a good first step [21:05]
<ianw> (i have no idea why i added that header ... :) [21:06]
<ianw> i imagine i copied from some nginx setup/graphite setup guide ... if it didn't come from puppet [21:07]
<tonyb> fwiw I approved the haproxy change [21:07]
<tonyb> now back to the grafana issue [21:07]
* frickler is really off now [21:09]
<fungi> thanks frickler! [21:10]
<ianw> so we just need 'x-grafana-device-id' in the cors response right? [21:21]
<JayF> fungi: I assume \join_subline is not an authorized advertiser in #opendev? [21:33]
<JayF> fungi: they are using their irc profile to advertise in a way that is fairly obvious to irccloud users, less-so to other users, and using potentially an opendev etherpad to do so as well [21:33]
<JayF> fungi: DM'd you that suspect link [21:33]
<JayF> fungi: just validating since this isn't an openstack channel before using my newly-minted irc hammer [21:34]
<opendevreview> Merged opendev/system-config master: Downgrade haproxy image from latest to lts  https://review.opendev.org/c/opendev/system-config/+/903805 [22:16]
<ianw> i added that manually just to test, restarted graphite and it seems to work now [22:19]
<fungi> ianw: 903805 or the cors response addition? [22:20]
<ianw> the cors header in the nginx config [22:20]
<fungi> er, right 903805 is the haproxy downgrade not the grafana downgrade [22:21]
<fungi> i'll wip 904151 in favor of the cors update [22:21]
<ildikov> Hi all, I have a quick question, in case anyone has experience with this. Is there a way to change the email address on an UbuntuOne account if the person forgot their password and doesn't have access to their email anymore? [22:23]
<opendevreview> Ian Wienand proposed opendev/system-config master: graphite: add grafana header to CORS allowed list  https://review.opendev.org/c/opendev/system-config/+/904154 [22:24]
<ianw> i actually think the other cors headers there don't need to be listed because they're on the always allow list [22:24]
<ianw> but on Dec 21 the smallest change that gets it working is probably the best :) [22:25]
<opendevreview> Ian Wienand proposed opendev/system-config master: graphite: add grafana header to CORS allowed list  https://review.opendev.org/c/opendev/system-config/+/904154 [22:30]
<fungi> ildikov: i think it requires contacting the ubuntuone admins. i'm not sure they would have any way to verify it's the same person though, and so quite likely they'll be told to just create a new account with their updated address and then contact the admins of all the systems they logged into with the old id to associate the new id with their accounts [22:34]
<fungi> (or just create new accounts everywhere with the new id) [22:35]
<Clark[m]> https://github.com/haproxy/haproxy/issues/2395 this looks like our haproxy issue [23:09]
