Friday, 2020-04-10

fungiyep, looks like your suggestion is compatible with ttx's poc, we can just reparent it00:00
corvusthere's a poc?00:00
fungihttps://review.opendev.org/71847900:00
fungionly applied to openstack/release-test repo for now00:00
funginot sure if it's even been triggered yet00:01
corvusneat, that's slightly unsafe due to the caveats i mentioned in my message00:01
corvusa hypothetical "x/nova" project could use that to overwrite the nova repo00:02
fungiyep, agreed00:02
fungiwell, at least fight with openstack/nova over it anyway00:02
*** mlavalle has quit IRC00:02
fungicorvus: clarkb had suggested that current zuul limitations on running jobs with secrets would require adding them to project-pipeline within a trusted config repo, is that not the case?00:04
clarkbin order to "share" the secret I thought that was required, though in this case maybe you can make a variant that changes the var?00:05
corvusfungi: i don't think so?  i think this is just like the doc promote jobs...00:05
fungior is it just the case that we'd need to do it that way to account for triggering on tag events as well (because we want to replicate those too when they're pushed, but tag events don't convey a branch and so the job won't get run if added to a branched project due to implicit branch matcher behavior)00:06
corvusfungi: that is the case00:06
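A minimal sketch of the behaviour fungi describes above, written as a toy model and not as Zuul's actual matcher code. It assumes a job defined in a branched project carries an implicit matcher for that branch, while the same job attached from a config repo carries none. The function and parameter names are hypothetical.

```python
from typing import Optional


def job_matches(event_branch: Optional[str],
                implicit_branch: Optional[str]) -> bool:
    """Toy model of an implicit branch matcher (hypothetical names)."""
    if implicit_branch is None:
        # No branch matcher at all, so every event matches.
        return True
    # A tag event conveys no branch, so it can never satisfy the matcher.
    return event_branch == implicit_branch


# Job variant defined on the master branch of a branched project.
print(job_matches("master", "master"))  # branch push: job runs
print(job_matches(None, "master"))      # tag push: job is skipped
# Same job attached from a config repo (assumed to carry no matcher).
print(job_matches(None, None))          # tag push: job runs
```

This is why the pipeline additions still need to live in the config repo if tag pushes are to be replicated.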
fungicorvus: got it, so yes we definitely need your regex idea, but also we'll still need to do the pipeline additions in the config repo00:07
fungireplying on-list00:07
clarkbfwiw I also suggested the job should be protected00:07
clarkbbut that didn't seem to end up in the latest patchset, I should've left comments in gerrit not irc00:07
corvusfungi, ttx, mnaser: feel free to ping me on stuff like this.  i wasn't intentionally ignoring that, i just had no idea any of this was happening.  been a bit busy lately.00:09
corvusi would recommend reverting 479 (jobs like that scare me), then redoing it as suggested on the ml00:10
fungisure, thanks! revert on the way00:11
openstackgerritJeremy Stanley proposed openstack/project-config master: Revert "Introduce job for granular GitHub mirroring"  https://review.opendev.org/71883900:14
fungicorvus: clarkb: mnaser: ttx: ^00:14
corvusyou might just be able to do "git_mirror_repository: {{ zuul.project.name }}" and that would be enough.  that will certainly work for openstack/*.  and would even work for foo/* if the project name were the same on github and that org also added the same user.  and if that weren't the case, it would fail harmlessly.  but if we want to avoid attempting to push to github in that case, then we'd need the00:15
corvusregex check.00:15
fungiyep, also your idea is more flexible anyway, since it would allow us to make versions for mirroring openstack/ namespace repos to a more diverse set of gh orgs if desired00:17
fungilike putting services under openstack-api and shared libraries under openstack-lib and so on00:17
fungiwhich was one of the ideas the folks who care about organizing things on github were todding around anyway00:17
fungier, tossing around00:18
mnasercorvus: I’m happy to help somewhat in this effort as it’s something that’s helpful for our opendev tenant00:18
fungii don't think todd was involved at all ;_00:18
openstackgerritMerged openstack/project-config master: Revert "Introduce job for granular GitHub mirroring"  https://review.opendev.org/71883900:32
*** ysandeep|out is now known as ysandeep00:34
mnaserpost jobs for manage-projects seem to have failed03:22
mnaserUnable to freeze job graph: Job infra-prod-remote-puppet-else depends on infra-prod-update-system-config which was not run.03:22
mnaserhttp://zuul.opendev.org/t/openstack/job/infra-prod-remote-puppet-else03:24
mnaserthat job's parent is infra-prod-service-base which depends on infra-prod-update-system-config running03:25
mnaserdoes this mean we need to include infra-prod-update-system-config in post for openstack/project-config?03:25
*** ysandeep is now known as ysandeep|off03:28
fungihrm, that's a good question... i don't *think* any of the deployment actions we trigger off project-config ought to implicitly depend on having latest system-config, but i could be wrong03:31
fungimordred?03:31
openstackgerritMohammed Naser proposed openstack/project-config master: Add infra-prod-update-system-config to deploy  https://review.opendev.org/71887703:31
mnaserfungi: ^ i pushed that up if that makes sense, it might be entirely wrong, but it's something to start from03:31
fungii mean, yes the way the jobs are defined right now that seems to be the case03:31
fungibut it may mean we want to change up how that one is parented03:32
mnaseryep03:32
mnaserit's a soft dependency when running in other projects but a hard one in openstack/system-config03:32
mnaserso maybe it might just be another variant with no dependencies03:33
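The hard versus soft distinction mnaser mentions can be sketched with a toy model. This is not Zuul's implementation; the data layout and function names are hypothetical, and the error text only imitates the message quoted above. A hard dependency on a job that is not part of the run makes the graph impossible to freeze, a soft one is ignored, and a variant with no dependencies avoids the question.

```python
from typing import Dict, List, Set, Tuple

# Per job: list of (dependency name, is_soft) pairs. Toy model only.
Deps = Dict[str, List[Tuple[str, bool]]]


def freeze_job_graph(selected: Set[str], deps: Deps) -> Set[str]:
    """Return the selected jobs, or raise if a hard dependency is absent."""
    for job in sorted(selected):
        for dep, soft in deps.get(job, []):
            if dep in selected or soft:
                # Present, or soft and therefore optional.
                continue
            raise ValueError(
                f"Job {job} depends on {dep} which was not run.")
    return selected


def can_freeze(selected: Set[str], deps: Deps) -> bool:
    """Convenience wrapper that reports success as a boolean."""
    try:
        freeze_job_graph(selected, deps)
    except ValueError:
        return False
    return True


run = {"infra-prod-remote-puppet-else"}
hard = {"infra-prod-remote-puppet-else":
        [("infra-prod-update-system-config", False)]}
soft = {"infra-prod-remote-puppet-else":
        [("infra-prod-update-system-config", True)]}
none = {"infra-prod-remote-puppet-else": []}

print(can_freeze(run, hard))  # hard dependency missing from the run
print(can_freeze(run, soft))  # soft dependency is skipped
print(can_freeze(run, none))  # variant with no dependencies
```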
*** sgw has quit IRC03:38
*** sgw has joined #opendev04:06
ianwclarkb: my preference for restoring suse would be to merge https://review.opendev.org/718299 and get building working without pip-and-virtualenv, and then the stack like https://review.opendev.org/#/q/status:open+project:zuul/zuul-jobs+branch:master+topic:ensure-pip04:18
ianwclarkb: i think that the primary motivation for suse is also devstack, which should work without pip-and-virtualenv; if it doesn't, i have a couple of patches up to devstack that would make it work, i haven't checked on their merge status04:19
openstackgerritIan Wienand proposed zuul/zuul-jobs master: ensure-pip: export ensure_pip_virtualenv_command  https://review.opendev.org/71822404:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-zuul-cloner: use ensure-pip  https://review.opendev.org/71788204:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: fetch-subunit-output test: use ensure-pip  https://review.opendev.org/71822504:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: ensure-tox: use ensure-pip role  https://review.opendev.org/71766304:53
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Update Fedora to 31  https://review.opendev.org/71765704:53
ianwclarkb: ^ thanks for comments.  responded inline, but on the order of installation -- i think we have to let the distribution own "pip" in the packaged case; in that case it doesn't matter if you install python3 or 2 first as only one package owns that generic namespace04:55
ianwit's when you're installing outside the system packages with get-pip.py that you end up fighting over /usr/local/bin/pip04:56
*** DSpider has joined #opendev05:48
*** dpawlik has joined #opendev06:20
AJaegerianw: did you see corvus' comment on https://review.opendev.org/718284 ?07:16
AJaegercorvus: in your email you say that the documentation promote job has something similar - could you point me to that one, please? I'm not seeing that directly07:31
*** tosky has joined #opendev07:32
AJaegermnaser, noonedeadpunk, after the removal of pip_install and repo_build, we have now failures in openstack tenant, please fix: http://zuul.opendev.org/t/openstack/config-errors07:35
*** rpittau|afk is now known as rpittau07:36
*** hashar has joined #opendev07:52
*** roman_g has quit IRC08:18
ttxcorvus: your proposal looks indeed a lot safer. I was operating under the assumption that the job would only be available from openstack/project-config's projects.yaml, and therefore openstack-infra reviewers could make sure it's not abused by a x/nova thing. Was that a bad assumption?08:28
AJaegerttx, for that you would need to mark it as such - for example using "protected"08:35
ttxAh ok, someone told me that the secret would not be available to children jobs08:37
ttxsince those would be defined in a different repo08:37
AJaegernot children - but you could still add it as is AFAIK08:38
openstackgerritBernard Cafarelli proposed openstack/project-config master: Update Grafana dashboards for stable Neutron releases  https://review.opendev.org/71867608:51
noonedeadpunkAJaeger: yeah, having a look08:57
noonedeadpunkI also noticed missing jobs today08:59
AJaegernoonedeadpunk: yeah, jobs cannot run with such an error ;(09:04
ttxre: meetpad, there are a number of Jitsi limitations that might block us for using it for big events09:09
ttxHard limit at 75 users in a room, with performance strongly degrading starting at 35: https://community.jitsi.org/t/maximum-number-of-participants-on-a-meeting-on-meet-jit-si-server/22273/1609:09
ttxWe need some mechanism to make sure only trusted people join the room, and can be kicked out. By default anyone can create a room in Jitsi, and then the first person in it is moderator and can set (and reset) a password. So we might need some way to communicate out-of-band password changes, or some chanserv-like bot to keep the room09:12
ttxFinally there is GDPR compliance, which is not there, but they are apparently working on it: https://community.jitsi.org/t/privacy-gdpr/26388/509:14
ttxcorvus: ^09:14
*** persia_ is now known as persia10:07
*** rpittau is now known as rpittau|bbl10:19
*** rpittau|bbl is now known as rpittau11:58
ttxcorvus, fungi: re: that github-mirroring, see http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014046.html12:03
fungittx: that jitsi issue seems to be about the insufficiency of gdpr compliance statements on the instance hosted at meet.jit.si12:31
fungithough i'll grant that if opendev is going to start putting gdpr statements on all of its services it would need to do so for meetpad as well12:32
fungii guess that issue is further asking for a gdpr statement template so that people who are comfortable publishing generic legal statements without getting a lawyer involved to write them can do so12:34
fungithough i have a feeling most organizations who feel like they need to publish a gdpr statement will also feel compelled to engage legal counsel to write it for them rather than trust the authors of the application12:35
openstackgerritThierry Carrez proposed openstack/project-config master: [check-approval] Use committer instead of owner  https://review.opendev.org/71899412:36
* fungi notes our gerrit instance also has "personal information" and doesn't say anywhere how it's used12:36
fungisame probably goes for anything interactive we host which people can log into or otherwise input data in (mailman, etherpad, ethercalc, mediawiki, zanata, limesurvey, ...)12:38
fungi(storyboard, lodgeit, askbot, asterisk, maybe our irc bots...)12:40
ttxindeed12:40
fungifor that matter, basically every service we run at least records the ip addresses which connect to it, and that alone is considered personal data which we'd probably at least be expected to explain we only look at for troubleshooting and diagnostic purposes12:42
fungiso i wouldn't consider lack of gdpr integration in jitsi a blocker for using it in opendev any more than lack of gdpr integration is a blocker for any of the other services we run12:43
mnaserfwiw i think the only issue is that it may mean a different story about using jitsi for something like an "openinfra event"12:45
mnaserbecause then that might introduce some sort of liability wrt to the OSF12:46
fungiif it's something the osf is saying is an official part of the event, i suppose... though we use ptgbot and etherpad and gerrit and other things in nearly as officially linked a capacity for past ptgs12:49
fungimnaser: were you saying you were interested in taking this task on? http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014046.html13:13
*** prometheanfire has quit IRC13:16
fungittx: and yeah, we were wrong about the replication job being restricted to adding in individual repos, though clarkb's suggestion in irc to mark it as a protected job would have mitigated that as well (i forgot he'd mentioned that when i was reviewing the change)13:25
fungii need to disappear to pick up our weekly grocery order, but should return in time to help prep for the maintenances13:46
fungibbiab13:46
mnaserfungi, ttx: yes, i can help with that, but i think i need a little bit of guidance/discussion with corvus around the implications of all of it because it's not fully clear to me yet14:00
corvuso/14:02
corvusmnaser, ttx: i think start with a job like ttx wrote, then use an empty nodeset, then do something like https://opendev.org/opendev/base-jobs/src/branch/master/zuul.d/jobs.yaml#L25214:04
corvusnote how docs_branch_path is tied to the secret; it's not just a job variable, which means it can't be overridden.  the person with the secret decides what the secret can be used for.14:05
corvusso do that to pass the git_mirror_repository variable to the role14:06
corvusthen, i think there are 2 options: 1) in openstack, set that variable to "{{ zuul.project.name }}", so that x/nova won't fight with openstack/nova.  that could mean that if someone doesn't have it set up correctly, it will attempt to push and fail.  that's probably fine.14:07
corvusbut if we don't want that to happen (ie, we're concerned about x/foo setting up replication with openstack's key and github seeing a bunch of failed pushes from that account and blocking it), then 2) add an extra task to the playbook for the job which does some kind of validation.14:08
corvushonestly, i think that's pretty low risk.  i would probably just do #1 and stop there.14:08
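The two options corvus lays out can be sketched as follows. This is only an illustration under stated assumptions: the allowed pattern and the function name are hypothetical, and in the real job the check would be a task in the playbook, not Python. Option 1 simply reuses the project name as the mirror target. Option 2 adds a validation step so that a project outside the expected namespace never attempts a push.

```python
import re
from typing import Optional

# Assumption: only repositories in the openstack/ namespace may be mirrored
# with this key. The pattern is illustrative, not one that was deployed.
ALLOWED = re.compile(r"openstack/[A-Za-z0-9._-]+")


def mirror_target(project_name: str, validate: bool = True) -> Optional[str]:
    """Return the repository to push to, or None to skip the push."""
    if validate and ALLOWED.fullmatch(project_name) is None:
        # Option 2: refuse early so no failed push ever reaches GitHub.
        return None
    # Option 1: the target is derived from the project name, so x/nova can
    # never be pointed at openstack/nova.
    return project_name


print(mirror_target("openstack/nova"))
print(mirror_target("x/nova"))
print(mirror_target("x/nova", validate=False))
```

Without validation the x/nova case would attempt a push to a repository of the same name and, as noted above, is expected to fail harmlessly.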
corvusAJaeger: ^14:08
mnaseryes i agree that 1 is enough14:09
corvusttx: re meetpad -- frickler pointed us to a system for horizontally scaling jitsi bridges for more capacity; i only briefly reviewed it, but i suspect we could do that if called upon14:10
corvusttx: i certainly agree that we would need extra support for managing rooms for larger events; though i wonder if for design-session style events if "trust everyone" might work?  but then again, zoom-bombing has been invented, so maybe not.  maybe the first-person-is-the-moderator would be enough for design sessions, and we ask leaders to join the room a bit early?14:13
corvusttx: if we wanted to have bots hold the room, i suspect that would be possible -- apparently this whole thing works based on xmpp, so maybe we could have ptgbot join the xmpp room for scheduled sessions, have privileges, and anoint moderators...14:14
corvus(right now, the xmpp server is hidden, but it can be exposed)14:14
corvusfungi: ^14:15
corvusttx, fungi: though also, i think we have a little more evaluation to do once we're done with the etherpad transition to see if we think it's feasible.  the last time we tried, it was "mostly working".14:16
AJaegercorvus: thanks14:31
openstackgerritMohammed Naser proposed opendev/base-jobs master: opendev-upload-git-mirror: add job  https://review.opendev.org/71903214:41
mnasercorvus, ttx, AJaeger, fungi ^ i think that is step 1, i'll try to work on pt 2 (aka in openstack/project-config) but landing that would be nice so the job becomes available14:42
mnaserhmmm14:43
mnaseri wonder if i should make git_mirror_repository variable passed through the secret14:43
corvusmnaser: yep i'm leaving a comment to that effect for clarity14:43
mnaserbecause most likely we won't be able to pass a 'root' value in a secret, afaik that's not possible14:43
mnaserok cool14:44
AJaegercorvus: I left a question: How can we enforce that its run in trusted repo only?14:44
AJaegershould the job be abstract?14:45
mordredAJaeger: I think the idea is that we don't worry about it (see corvus comment in scrollback starting with "then, I think there are 2 options")14:48
corvusAJaeger, mnaser: responded to both14:50
AJaegerthx14:51
*** rpittau is now known as rpittau|afk14:57
mordredinfra-root: I also put etherpad01.openstack.org into emergency - probably don't want puppet restarting it while we have it turned off15:29
clarkbmordred: we are doing review first right? Also it occurred to me we may need to add bup to our ansible for etherpad, review etc?15:31
mordredwell - we already have bup ansible15:37
mordredHOWEVER - the way we have it set up is a little strangeish and I do think we need to do it for new etherpad15:37
mordredbasically - we need to add the host to the backup group, run a pulse, then re-remove it15:38
mordred(at least that's what the comments in inventory/groups.yaml say)15:38
corvusthat is weird; why?15:38
mordredI don't know15:38
mordredI've been meaning to bring this up with folks to wrap our heads around it15:39
mordredcorvus: oh - sorry - process is slightly different ...15:39
fungiyeah, i noticed the bup run for the new etherpad server sent some weird cronspam15:39
mordredwe need to add the server to the backup group in general15:39
mordredbut we need to do one pulse with the backup-server uncommented15:39
mordredbecause we don't usually run ansible against the backup server normally15:40
clarkbya that makes sense15:40
corvusgot it15:40
mordredalso - this makes me think - should we be putting the db dumps from etherpad somewhere so that bup is backing them up?15:40
mordredor will they be fine?15:41
ttxcorvus: re: load, scaling bridges for more capacity IMHO only helps with the number of meetings, not capacity in a single meeting. We'll see in testing15:41
ttxcorvus: re: securing access, I think ideally we would support some dynamic creation of room names and post them on the ptgbot somehow15:41
mordredwe only have review-dev in backup right now15:41
ttxThat way the meeting lead would jump to a random room, then add the link to the PTGbot for others to join15:41
ttxMakes the work of Zoombombers a lot more difficult15:42
openstackgerritMonty Taylor proposed opendev/system-config master: Add review and etherpad to backup group  https://review.opendev.org/71903615:42
clarkbmordred: that is why we backup to the local fs, bup will find them there unless the path is in bups exclude list15:43
mordredcool15:43
mnaserbtw, new project addition is broken right now -- i pushed up https://review.opendev.org/#/c/718877/ but it's probably not the cleanest approach15:43
mordredwe should probably consider if there are additional hosts we should be backing up - because it looks like we only have review-dev in the backup list right now - at least til that patch ^^15:43
mnaserzuul error message: "Unable to freeze job graph: Job infra-prod-remote-puppet-else depends on infra-prod-update-system-config which was not run."15:43
clarkbmordred: well puppet has all the puppet things backing up. And because we didn't replace review.o.o its still backing up. However in general we should have ansible manage that15:44
clarkbmordred: we should backup a single gitea too probably15:44
corvusttx: yeah, that link to the jitsi forum had some interesting info on scaling; it all seems unclear.  (i wonder if we muted video for most participants if it would scale more.)15:44
fungiso... here's one possibility. etherpad has "read-only" view urls which are an unguessable hash, i wonder if we could tie meetpad to those15:44
mordredmnaser: poo. actually - remove update-system-config and add a dependencies: [] to manage-projects15:44
mnasermordred: oh gotcha15:45
mordredmnaser: (see service-nodepool)15:45
clarkbmordred: but yes we should ensure that new and old hosts don't lose their bup backups15:45
fungiahh, though i guess the idea is that the etherpad view in jitsi-meet is writeable by each client?15:45
fungii keep forgetting it's not just a view-only video stream there15:46
clarkbfungi: yes15:46
fungiso yeah, ignore that suggestion ;)15:46
openstackgerritMohammed Naser proposed openstack/project-config master: Drop dependencies from manage-projects  https://review.opendev.org/71887715:46
mnasermordred: updated, and then once that lands it'll be nice if someone can kick that off manually too15:47
corvusmnaser: if you remove the file matcher then add it back in a second change it'll run :)15:48
openstackgerritMohammed Naser proposed opendev/base-jobs master: opendev-upload-git-mirror: add job  https://review.opendev.org/71903215:50
mordredcorvus: speaking of - https://review.opendev.org/#/c/718827/15:52
corvusstatus notice review.opendev.org is being restarted for scheduled maintenance; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html15:55
corvushow's that?15:55
clarkblgtm15:56
corvusmordred: are you doing the typing?  you want to do a screen or anything, or just let us know everything worked? :)15:56
fungithat notice wfm16:00
corvusi'll wait until we have a mordred to send it16:01
AJaegermnaser, corvus , shouldn't opendev-upload-git-mirror show up in the docs?16:02
corvusdo we know if everything on the server is ready to go?  are all the docker changes in place, or do we need to do something to make that happen?16:02
clarkbcorvus: aiui we just have to stop and disable the systemd service then start the docker compose thing16:03
mnaserAJaeger: i wonder why they're not showing up..16:03
fungii assumed it was all in place because the changes i'm aware of since last attempt have merged and the server isn't in the emergency list16:03
clarkbbut we should have mordred confirm16:03
corvusAJaeger: yeah, need to add it to doc/source/jobs.rst16:03
AJaegermnaser: you need to add the job manually. Also, shouldn't we make the job abstract?16:03
corvusAJaeger: why abstract?16:03
mordredsorry - was reading a document - ready to go now16:04
corvus(it's runnable as-is)16:04
mnaseri'll make a follow up that adds all the jobs to base-jobs16:04
corvus#status notice review.opendev.org is being restarted for scheduled maintenance; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html16:04
openstackstatuscorvus: sending notice16:04
mordredyeah - I'll do the typing - let me set up a screen16:04
AJaegercorvus: really runnable as-is?16:04
-openstackstatus- NOTICE: review.opendev.org is being restarted for scheduled maintenance; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html16:04
corvusAJaeger: sure, just add secret :)16:04
AJaegercorvus: so, just passing in a secret but that could be done in project-stanza?16:04
mordredI'm in a root screen session on review16:04
corvusAJaeger: yep16:04
AJaegermnaser: so, ignore the request for abstract ;)16:05
fungiattached16:05
clarkbI'm attached as well16:05
corvusme too16:05
mnaserAJaeger: if you wanna +w, i'll push up a follow-up adding all docs for all jobs16:05
mordredfor stopping - is it systemctl stop gerrit?16:05
AJaegermnaser: ok16:05
fungimordred: yep, or service gerrit stop16:05
mordred(I gotta say this might be one of my favorite parts about the dockerification)16:05
AJaegermnaser: done16:05
fungibtw i love that systemd decided reversing the parameter order was no big deal16:06
* mordred will wait until we get the "done sending notice"16:06
* fungi forgot to add his <sarcasm> tag16:06
clarkbmordred: you'll want to disable the unit too `systemctl disable gerrit`16:06
openstackgerritMerged openstack/project-config master: Drop dependencies from manage-projects  https://review.opendev.org/71887716:06
clarkb(to prevent systemd from trying to manage it going forward)16:06
mnaserAJaeger: im going to see how we can make opendev/base-jobs linters get mad if you dont have a documented job16:06
mnaseror figure out how we do it in zuul/zuul-jobs16:06
mordredclarkb: yeah - I've got a todo list item to go remove the old units and stuff - but I'll do that right now16:06
openstackstatuscorvus: finished sending notice16:07
mordredok. here we go - everybody ready?16:07
corvus++16:08
* fungi braces for impact16:08
mordredok. gerrit seems to be down16:09
*** mlavalle has joined #opendev16:09
mordredstarting16:09
fungiyep, and disabled effectively16:09
corvuswhat's gerritcompose_shell?16:10
mnaserAJaeger: hmm i think that might involve writing a job/role to do the autodoc stuff as i might have to steal it from zuul/zuul-jobs, me has ENOTIME for that, so ill just manually add them16:10
mordredit's an extra container that does nothing for using for random exec tasks16:10
corvusgertty is happy, web site seems up16:11
clarkbso are we happy with that? the error log is clean16:11
mordredthere was a reason why just using the gerrit one wasn't always a good idea16:11
openstackgerritMerged opendev/system-config master: Collect logs from manage-projects runs  https://review.opendev.org/71882716:11
corvusmerges are apparently possible :)16:12
mordredyeah16:12
mordredthat was awesome :)16:12
clarkb[2020-04-10 16:11:54,207] [HookQueue-1] INFO  com.googlesource.gerrit.plugins.hooks.HookTask : hook[change-merged] output: FileNotFoundError: [Errno 2] No such file or directory: '/home/gerrit2/projects.yaml'16:12
clarkbthe hooks are broken16:12
mordredpoo. so we're missing a bind-mount16:12
clarkbbut general operation seems to be fine16:12
corvusalso /home/gerrit2/review_site/etc/gerrit.config16:12
openstackgerritMerged opendev/base-jobs master: opendev-upload-git-mirror: add job  https://review.opendev.org/71903216:13
corvuswhich is used by the welcome-message hook16:13
clarkb[2020-04-10 16:12:17,250] [HookQueue-1] INFO  com.googlesource.gerrit.plugins.hooks.HookTask : hook[patchset-created] output: FileNotFoundError: [Errno 2] No such file or directory: '/home/gerrit2/review_site/etc/gerrit.config'16:13
mordredshall we just manual that in to the compose file and do another restart?16:13
mordredthen followup with a change?16:13
clarkbmordred: how is it that the gerrit config isn't mounted?16:13
fungii'm fine with that16:13
openstackgerritMohammed Naser proposed opendev/base-jobs master: docs: add docs for all jobs  https://review.opendev.org/71904216:13
clarkbgerrit needs that to start so I think there may be more to this?16:14
fungiclarkb: i think it's just not at that exact path?16:14
mordredit's in a different path in the container16:14
clarkbfungi: ya so adding bind mounts won't fix that16:14
mordredit will16:14
clarkbunless we bind mount to the other path too?16:14
mnaserAJaeger: ^ fyi re adding docs16:14
fungiright, we can bindmount it to additional paths for now, and we can also fix the hook to use the same one the gerrit process does16:14
clarkbgot it16:15
corvuswhere's the path coming from?  can we do something other than double-bind-mount it?16:15
* fungi checks16:15
clarkbcorvus: I'm guessing jeepyb scripts hard code those paths16:15
mordredwe could do that16:15
clarkbbut maybe its configurable16:15
corvuslike can we update the scripts to use /var/gerrit/...16:15
mordredcorvus: yeah - I think we should update the scripts - that'll just take a review cycle16:15
fungiyeah, there's a ton of /home/gerrit2/... defaults in jeepyb16:15
fungimost seem to be overridable16:15
corvusi'm okay with the double-bind-mount just to see if there are any more problems16:16
mordredor maybe it doesn't matter since it'll be a restart to roll out the fixed version anyway?16:16
mordredlet's do that for now16:16
mordreddoes what I've got there look ok?16:16
corvusbut yeah, we're looking at more restarts no matter what16:16
mordredyeah16:16
mordredluckily they are _quick_ restarts16:16
fungijeepyb/cmd/welcome_message.py will need editing though, yes, it hard-codes BASE_DIR = '/home/gerrit2/review_site'16:16
clarkbif /home/gerrit2 doesn't exist in the container will the bind mounting process sort that out for us?16:16
mordredyeah16:16
clarkbalso I think it may be trying to get db connection info which is in the other secure.conf now?16:17
mordredclarkb: like that?16:17
clarkbfungi: ^ can you check really quickly to see what it is trying to get from gerrit.config and if it looks in secure.config (or whatever it is called) too?16:17
clarkbmordred: ya I expect its looking in both for the db connection info16:18
fungiyeah, seems jeepyb/gerritdb.py will obey envvars for GERRIT_CONFIG and GERRIT_SECURE_CONFIG but defaults those to /home/gerrit2/review_site/etc/gerrit.config and /home/gerrit2/review_site/etc/secure.config respectively16:18
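The lookup fungi describes amounts to an environment override with a hard-coded fallback. The sketch below is not jeepyb's code; the function name is hypothetical, and the /var/gerrit/etc/gerrit.config path in the usage lines is an assumption about where the file sits inside the container. Only the two variable names and the two default paths come from the discussion.

```python
import os
from typing import Mapping, Tuple

# Defaults as quoted in the discussion above.
DEFAULT_GERRIT_CONFIG = "/home/gerrit2/review_site/etc/gerrit.config"
DEFAULT_SECURE_CONFIG = "/home/gerrit2/review_site/etc/secure.config"


def config_paths(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (gerrit.config, secure.config) paths, honouring overrides."""
    return (
        env.get("GERRIT_CONFIG", DEFAULT_GERRIT_CONFIG),
        env.get("GERRIT_SECURE_CONFIG", DEFAULT_SECURE_CONFIG),
    )


# With nothing exported, the old host paths are used.
print(config_paths({}))
# A hook wrapper could export one variable; the container path is assumed.
print(config_paths({"GERRIT_CONFIG": "/var/gerrit/etc/gerrit.config"}))
```

The override only helps if the variables are actually set in the environment of the hook process, which is the open question raised next.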
corvusmordred: lgtm16:18
clarkbso having both is probably a good thing (otherwise we'll get failure on the other file when we restart)16:18
mordredok. cool16:18
mordredoh - we can set env vars?16:18
fungimordred: for at least some of this, yes16:18
clarkbmordred: fungi depends on whether or not gerrit will set those when calling hooks16:18
mordredwhy don't we do that for those real quick - can we do that for projects.yaml?16:18
mordredoh - good point16:18
fungianything which uses the jeepyb/gerritdb.py module and is breaking there at least16:19
mordredlet's do this - then investigate more16:19
clarkbmordred: wfm16:19
fungiright, we may need to export those in the hook script wrappers16:19
mordredI'm going to save that, do a compose down and up16:19
mordredfungi: that would work16:19
mordredk. seems down. starting16:20
fungibasically gerrit is configured to run some shell scripts and pass command-line parameters to them, which then in turn call jeepyb entrypoint wrappers with more specific command-line options set16:20
clarkbdid you ps to check if it was down?16:20
clarkberror log never showed it being down16:20
mordredI didn't - but the container was gone, so I don't think it could still be running right?16:21
clarkbthere is only one gerrit ps16:21
clarkbmordred: I think it could be if it got reparented all the way back up to systemd16:21
AJaegermnaser: I'll look into the linter for jobs in opendev/base-jobs16:21
fungiyeah, the only java processes i see are for the container16:21
clarkbbut in this case seems fine it just didn't log it like I expected it to16:21
mordredyeah- I wonder if we need to configure compose to send a graceful shutdown signal first16:21
corvusseems to be up again16:22
clarkbmordred: oh ya that may be it16:22
fungialso a spontaneous thought, don't need to discuss it in the middle of this, but #opendev-meeting might be good for maintenance windows too16:22
corvusfungi: wfm.  maybe we should do the 1700 maint there16:23
fungii'm up for that16:23
mordred++16:23
mordredwe could even startmeeting them so we can log todo list items16:23
openstackgerritMohammed Naser proposed openstack/project-config master: Revert "Revert "Introduce job for granular GitHub mirroring""  https://review.opendev.org/71904716:24
mordredprojects.yaml and projects.ini seem to be settable via env vars too16:24
mordredso I think we can not do the double-mount and instead just set those vars in the scripts16:24
mnasercorvus (when you're not in the middle of doing gerrit things), ttx, AJaeger, fungi ^ i believe that this is a good point for github mirroring job16:24
corvusjeepyb seems really unhappy?16:24
clarkb[2020-04-10 16:24:14,247] [HookQueue-1] INFO  com.googlesource.gerrit.plugins.hooks.HookTask : hook[patchset-created] output: fatal: not a git repository: '/home/gerrit2/review_site/git/openstack/project-config.git'16:24
fungihrm, yeah, looks like it's expecting the bare repos at the old path?16:25
openstackgerritAndreas Jaeger proposed opendev/base-jobs master: Check that all jobs are documented  https://review.opendev.org/71904816:25
AJaegermnaser: ^16:25
fungilonger term, so much of these gerrit hooks could be come zuul jobs16:26
fungier, become16:26
mordredsigh16:26
corvusit also seems to have problems with the gerrit config file?16:26
corvus[2020-04-10 16:25:51,996] [HookQueue-1] INFO  com.googlesource.gerrit.plugins.hooks.HookTask : hook[patchset-created] output: configparser.DuplicateOptionError: While reading from '<???>' [line 115]: option 'footer' in section 'trackingid "launchpad-bug"' already exists16:26
mordredyeah - I think we can do one more double-bind-mount for the git thing - but what's the issue with the gerrit config file16:26
mordredyeah- I agree - there are three footer options in section trackingid "launchpad-bug"16:27
mordredis it possible python3 configparser is just stricter?16:28
fungioh, yep i bet it doesn't like duplicate options16:29
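The suspicion is easy to reproduce: Python 3's configparser is strict by default and rejects a repeated option within a section. The fragment below is illustrative, with placeholder footer values and not the contents of the real gerrit.config, and the helper names are hypothetical.

```python
import configparser

# Illustrative fragment; the footer values are placeholders.
SAMPLE = """\
[trackingid "launchpad-bug"]
footer = Closes-Bug
footer = Partial-Bug
footer = Related-Bug
"""

SECTION = 'trackingid "launchpad-bug"'


def parses_strictly(text: str) -> bool:
    """True if the default (strict) parser accepts the text."""
    parser = configparser.ConfigParser()  # strict=True is the default
    try:
        parser.read_string(text)
    except configparser.DuplicateOptionError:
        return False
    return True


def last_footer(text: str) -> str:
    """Parse leniently; only the final duplicate value is retained."""
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(text)
    return parser.get(SECTION, "footer")


print(parses_strictly(SAMPLE))
print(last_footer(SAMPLE))
```

Passing strict=False avoids the exception but silently keeps only the last footer, so it would not be a faithful reading of a multi-valued Gerrit option.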
mordredwe could maybe rework the gerrit.config to list those as three different trackingid sections rather than one with 3 footers16:29
mordredthe download section is totally going to break it though16:30
fungithat would probably work since they're not tied to an "its" plugin16:30
fungithere's magic name glue between tracking ids and its plugins16:30
mordredmaybe we just need to rework that script to have its own config file that we write out16:30
mordredso that it doesn't have to read this one16:30
mordredin any case - that's not going to be a quick fix16:30
fungii'd rather rewrite it as a zuul job if we're doing that16:30
mordredyeah16:30
mordredcan we though? doesn't it need access to the gerrit db?16:31
fungioh, yeah for working out assignees16:31
corvusthat's not going to work long run anyway16:31
fungiright, it maps openids to lp usernames via gerrit db query and lp api call16:31
corvuswe should go ahead and add "not access gerrit db" to our requirements if we want to upgrade to nodetdb16:32
corvusnotedb16:32
mordredgood point16:32
fungii'm for hacking around it in the config for now if we can while we design something more effective long-term16:32
mordredin either case, I think we're going to need to once more shut down and start the old way :(16:33
fungiwe also never got task assignment implemented in the storyboard-its plugin, for similar reasons16:33
mordredbecause I don't know how to hack anything in place temporarily16:33
mordredmaybe we give up on task assignment16:33
fungiyeah, i'm willing to entertain that16:34
corvusi agree that we've arrived at restart with old method16:34
mordredbut - for now - let's shut back down, start the old way, and make a list of things we need to fix yes?16:34
clarkbmordred: yes, do you want to test graceful shutdown?16:34
fungiat least it'll be shorter than the last list16:34
clarkbI've got a change just about ready to push up based on reading the init script16:34
clarkboh its already stopped16:34
mordredclarkb: oh - sorry16:35
clarkbits ok will push it up for review when its up again :)16:35
mordred++16:35
openstackgerritAndreas Jaeger proposed opendev/base-jobs master: Check that all jobs are documented  https://review.opendev.org/71904816:35
mordredI added four bullet points to the bottom of https://etherpad.openstack.org/p/gerrit-2020-03-2016:36
AJaegermnaser: ok for me to update https://review.opendev.org/#/c/719042/ ?16:36
mordreddo they look good to people?16:37
corvusmordred: yeah, updated with slight mod16:38
clarkbya that looks good16:38
corvusi tested out the configparser issue and put in a solution16:39
corvus(strict=False)16:39
fungiooh, good find on strict=false royal purple!16:39
openstackgerritClark Boylan proposed opendev/system-config master: Use HUP to stop gerrit in docker-compose  https://review.opendev.org/71905116:40
mordredawesome. strict=false seems like a great short-term solution there16:40
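A minimal sketch of the configparser behavior discussed above. The section body is a simplified stand-in (the footer values are illustrative, not copied from the real gerrit.config), and the helper names `load` and `strict_parse_fails` are hypothetical. It shows that Python 3's default strict mode rejects a repeated option, while `strict=False` accepts it and keeps only the last value.

```python
import configparser

# Simplified stand-in for the gerrit.config section; values are illustrative.
SAMPLE = '''
[trackingid "launchpad-bug"]
footer = Closes-Bug:
footer = Partial-Bug:
footer = Related-Bug:
'''


def load(text, strict):
    """Parse config text; interpolation is disabled so '%' is never special."""
    parser = configparser.ConfigParser(strict=strict, interpolation=None)
    parser.read_string(text)
    return parser


def strict_parse_fails(text):
    """Return True if the default strict parser rejects duplicate options."""
    try:
        load(text, strict=True)
    except configparser.DuplicateOptionError:
        return True
    return False


if __name__ == "__main__":
    print(strict_parse_fails(SAMPLE))
    lenient = load(SAMPLE, strict=False)
    # With strict=False the later duplicate silently overrides earlier ones.
    print(lenient['trackingid "launchpad-bug"']["footer"])
```

Note the trade-off: the lenient parser does not error, but it also does not preserve all three footers, so it only helps a script that does not need every value.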
clarkbothers should compare ^ to the init script. The timeout is actually shorter now than I remembered16:40
fungiolive drab: i added something about the bare git repos since we saw jeepyb complain about that too16:40
clarkbbut I still went with 5 minutes because why not.16:40
mordred++16:40
clarkbalso it really looks like sighup is used to stop gerrit gracefully which is weird16:40
corvusthat is weird16:40
fungionly weird if you don't consider that the original signal was a call to "hangup" the line16:41
clarkbbut I also discovered it uses sigterm if not using start-stop-daemon16:41
clarkbour server has start stop daemon installed so pretty sure it uses the sighup path16:41
corvusfungi: i do remember that, but still think it's weird :)16:41
fungi(sighup will terminate shells)16:41
corvusi almost never want to stop gerrit when i hang up.16:41
fungibut yeah, i agree it's not a traditional signal i would expect for a graceful shutdown of a headless daemon16:42
clarkbyall should read the init script and check that I interpreted that correctly :)16:42
fungigranted, those usually end up being some sigusrN16:42
fungior just sigterm16:42
mordredso I think step one is just a jeepyb patch that updates the hardcoded filepaths - and maybe a patch to system-config to add envvars to set the paths?16:43
mordredand also strict=False16:43
openstackgerritAndreas Jaeger proposed opendev/base-jobs master: docs: add docs for all jobs  https://review.opendev.org/71904216:43
AJaegermnaser: ^16:43
fungimordred: i think so, yes. the hook scripts are in system-config16:43
fungiso we can export our preferred paths from there16:43
mordredok. I'll work on those patches16:43
mordredI think maybe adding an envvar override to the scripts where it's just hardcoded16:44
mordredto match the other ones16:44
mordredis the smallest jeepyb change16:44
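A minimal sketch of the envvar-override idea described above, not the actual jeepyb patch. The variable name `GERRIT_SITE` and the helper `git_dir_for` are hypothetical; the default path is the one shown in the hook error earlier in this log.

```python
import os

# Path the hooks currently assume (taken from the hook error in this log).
DEFAULT_BASE_DIR = "/home/gerrit2/review_site"


def git_dir_for(project, environ=None):
    """Return the bare repo path for a project.

    GERRIT_SITE is a hypothetical variable name used for illustration;
    when it is unset, the previously hardcoded location is used.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("GERRIT_SITE", DEFAULT_BASE_DIR)
    return os.path.join(base, "git", project + ".git")


if __name__ == "__main__":
    print(git_dir_for("openstack/project-config"))
    print(git_dir_for("openstack/project-config", {"GERRIT_SITE": "/var/gerrit"}))
```

Keeping the old path as the fallback means nothing changes for a non-container deployment that never sets the variable.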
fungithat may be all we need actually. i don't see where BASE_DIR in welcome_message.py ever even gets used16:48
*** prometheanfire has joined #opendev16:48
fungiit's used in jeepyb/cmd/notify_impact.py and jeepyb/cmd/update_blueprint.py and jeepyb/cmd/update_bug.py hard-coded (checking those out now)16:49
fungibut seems to be just cargo-culted cruft in jeepyb/cmd/welcome_message.py16:49
fungiyeah, i think those are going to need to grow some envvar lookups16:50
openstackgerritMonty Taylor proposed opendev/jeepyb master: Fix issues from rolling out containers  https://review.opendev.org/71905216:50
fungithey do git commands with --git-dir=BASE_DIR/git/...16:50
mordredfungi, clarkb, corvus : ^^16:50
mordredI think that should fix the git dir and strict parsing issues16:50
fungiheh, exactly what i was thinking!16:51
mordredoh - poo - update-bug16:51
openstackgerritAndreas Jaeger proposed opendev/base-jobs master: docs: add docs for all jobs  https://review.opendev.org/71904216:51
fungiright, that one too16:51
AJaegersorry, corvus for hurting your eyes!16:52
openstackgerritMonty Taylor proposed opendev/jeepyb master: Fix issues from rolling out containers  https://review.opendev.org/71905216:52
corvusAJaeger: i left a +2 :)16:52
AJaegercorvus: updated already16:53
*** dpawlik has quit IRC16:53
fungi-> #opendev-meeting for etherpad maintenance (6 minute warning!)16:54
openstackgerritMohammed Naser proposed openstack/project-config master: Revert "Revert "Introduce job for granular GitHub mirroring""  https://review.opendev.org/71904716:57
openstackgerritMonty Taylor proposed opendev/system-config master: Set env vars pointing to correct file locations  https://review.opendev.org/71905316:58
mordredinfra-root: ^^ there's the other half16:59
clarkbmordred: and we don't need similar on the manage project side because it is all config file driven already right?16:59
mordredyeah - it already works and does its own bind-mounts17:00
-openstackstatus- NOTICE: etherpad.openstack.org will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html17:01
AJaegermnaser: want to +2 719048  now that it passes - and +2 71904217:03
mnaserAJaeger: +2 +w17:05
AJaegerthx17:06
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Fix check_jobs_documented linter  https://review.opendev.org/71905417:09
openstackgerritMerged opendev/base-jobs master: docs: add docs for all jobs  https://review.opendev.org/71904217:10
*** hashar has quit IRC17:20
openstackgerritmatthew wagoner proposed opendev/system-config master: Fix typo on website  https://review.opendev.org/71905717:34
fungi!!! ^17:37
openstackfungi: Error: "!!" is not a valid command.17:37
fungihush, openstack, i wasn't talking to you17:37
mnaserso would it be okay to requeue manage-projects job into post or should i make a noop commit?17:44
mnaser(there was an issue with post jobs not working for manage project that didnt create repos)17:44
mnaserexample of repo that failed in post: https://review.opendev.org/#/c/718824/17:44
mnaserwe merged the fix but yeah, now something needs to kick it off17:45
*** diablo_rojo has joined #opendev17:45
-openstackstatus- NOTICE: The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC17:51
openstackgerritMonty Taylor proposed opendev/system-config master: Set env vars pointing to correct file locations  https://review.opendev.org/71905317:57
*** melwitt has quit IRC17:59
*** mugsie has quit IRC17:59
*** cmurphy has quit IRC17:59
*** Open10K8S has quit IRC17:59
openstackgerritMonty Taylor proposed opendev/jeepyb master: Fix issues from rolling out containers  https://review.opendev.org/71905218:01
*** melwitt has joined #opendev18:03
*** cmurphy has joined #opendev18:03
*** mugsie has joined #opendev18:03
*** Open10K8S has joined #opendev18:03
openstackgerritMonty Taylor proposed opendev/system-config master: Run ansible on the backup server  https://review.opendev.org/71907618:21
openstackgerritMonty Taylor proposed opendev/system-config master: Turn backup server back off  https://review.opendev.org/71907718:21
openstackgerritMonty Taylor proposed opendev/system-config master: Add review and etherpad to backup group  https://review.opendev.org/71903618:22
openstackgerritMonty Taylor proposed opendev/system-config master: Run ansible on the backup server  https://review.opendev.org/71907618:22
openstackgerritMonty Taylor proposed opendev/system-config master: Turn backup server back off  https://review.opendev.org/71907718:22
openstackgerritMonty Taylor proposed opendev/system-config master: Set env vars pointing to correct file locations  https://review.opendev.org/71905318:25
openstackgerritMonty Taylor proposed opendev/system-config master: Set env vars pointing to correct file locations  https://review.opendev.org/71905318:29
openstackgerritMerged opendev/system-config master: Change Etherpad default intro text  https://review.opendev.org/71876318:37
openstackgerritMonty Taylor proposed opendev/system-config master: Add root cron jobs to gerrit  https://review.opendev.org/71908818:40
openstackgerritMonty Taylor proposed opendev/system-config master: Add review and etherpad to backup group  https://review.opendev.org/71903618:46
openstackgerritMonty Taylor proposed opendev/system-config master: Run ansible on the backup server  https://review.opendev.org/71907618:46
openstackgerritMonty Taylor proposed opendev/system-config master: Turn backup server back off  https://review.opendev.org/71907718:46
mordredclarkb, corvus : ^^ had to fix a linter failure - we were checking group membership and I didn't update the fixture18:46
mordrednow that we're 100% zuul-driven - we _could_ remove the update-system-config playbook and push the zuul content to the zuulcd user's home dir in our prod playbook, making zuul driven runs run with the context zuul prepares, rather than just the tip18:49
corvusi like that18:49
mordredI mention this because thinking about that backup stack - if we were running from zuul state, we could land all three patches of that stack and be sure that we would have run once with each state18:49
corvus++18:49
corvusthat will also rely on the mutex of one right now, but we don't anticipate changing that any time soon18:50
mordredyeah18:50
mordredwe'd need to update the ansible config to point to /home/zuulcd/src/opendev.org/opendev/system-config for inventory and roles instead of /opt/system-config ... or we could make a symlink18:51
clarkbthinking out loud here, how does that impact our ability to run things outside of zuul (I think this may be something to think about either way actually)18:51
clarkbwith the existing setup if we put bridge in emergency it won't update things18:51
clarkbwith the proposed setup we'd probably want to have a separate git repo for this?18:51
mordredfor what?18:51
mnaserif anyone has some spare cycles, i'd appreciate: sudo zuul enqueue-ref --tenant openstack --trigger gerrit --pipeline post --project openstack/project-config --ref refs/heads/master --oldrev abc9da185343c691fa66585ee108586eff84a9f9 --newrev 395dfed08ca3c009d0040cc7c7a326ebd3dd0bf5 --- this should rerun the manage-projects and create the projects that were missed in the zuul misconfig18:52
mordredthis is only talking about the state of the system-config repo - it _is_ the separate repo :)18:52
clarkbmordred: running kick.sh for whatever reason18:52
mordredclarkb: yeah - I don't think that would be impacted18:52
clarkbmordred: right but it has group vars and playbooks and stuff18:52
mordredstill not impacted18:52
mordredthe only thing that would be impacted is if we had specific inventory state18:52
clarkbmordred: isn't it? if you are expecting one state when running kick.sh but zuul updates that state18:52
mordredclarkb: oh - yeah - sorry - I didn't say a followup18:53
clarkbtoday the way we address that is bridge in emergency aiui18:53
mordredI think we want to keep /opt/system-config as a location from which we can run adhoc things18:53
clarkband that will fix the git repo until removed18:53
clarkbmordred: got it18:53
clarkband zuul would have its own thing18:53
mordredand - we probably want to go ahead and install the inventory files in install-ansible18:53
mordredrather than pointing directly at the files in system-config18:53
mordredthat way it won't be weird18:53
mordredyup18:54
* mordred will work up a patch18:54
clarkbok I need to pop out for lunch things now18:54
clarkbback in a bit (but then probably out again to bike)18:54
*** moppy has joined #opendev18:54
* mordred is going to run manage-projects for mnaser18:54
mnasermordred: thank you :>18:55
mordredmnaser: k. check your projects - they ok?18:56
mnaser i see https://opendev.org/vexxhost/rbac-helm and https://opendev.org/vexxhost/documentation so yay18:56
mordred\o/18:57
mnaserthanks mordred  - ill watch for new project creations if the fix did the right thing in the future18:57
AJaegeranybody willing to review to move the documentation linter from zuul-jobs to opendev/base-jobs, please? https://review.opendev.org/#/c/719048/18:57
openstackgerritMerged opendev/system-config master: Fix typo on website  https://review.opendev.org/71905719:22
AJaegerand a sync back for the job linter for review, please - https://review.opendev.org/71905419:22
mordredclarkb: https://etherpad.opendev.org/p/clarkb-test is not loading for me - is it loading for you?19:28
openstackgerritMerged opendev/base-jobs master: Check that all jobs are documented  https://review.opendev.org/71904819:30
AJaegerhave a great easter weekend everybody!19:31
mordredAJaeger: you too!19:31
* AJaeger waves good night19:31
*** sgw has quit IRC19:36
clarkbmordred: hrm not loading for me either19:41
corvusi'll see if i can help it load by also trying19:42
corvusthe gerrit maint etherpad from earlier works: https://etherpad.opendev.org/p/gerrit-2020-03-2019:43
clarkbI wonder if the utf8 stuff is still weird there19:43
clarkblike maybe the db dump didnt dump that successfully?19:43
mordredyeah19:47
mordredthat dump would have had the mb4 settings active, where the previous one didn't19:47
*** moppy has quit IRC19:47
corvuswe probably should have dived the plan :)19:48
mordredmaybe we should try removing the most recent edit from that pad?19:48
*** moppy has joined #opendev19:48
clarkbya could roll back versions and see if it starts working19:48
corvusi'm worried about other pads though19:48
corvusbasically, is every pad with a pile-of-poo dead?19:48
clarkbthere is something poetic about that. I think rolling back versions of the test pad will help us confirm this19:50
corvusperhaps we will lose less data if we shut down the new server, and re-export the data19:50
corvuswe'll lose changes from this morning, but maybe people are not doing much today?19:50
clarkbhow would we export differently? that isnt clear to me19:50
mordredthe way I did it before19:50
mordredrun the dump on old etherpad server, copy the dump file over, then run it in on new server19:51
corvusmy understanding is that the export process we did in production is not what we tested, due to mysql settings being different.  also, perhaps running on a different host?  (this may be the reason it took longer?)19:51
clarkbI see19:51
mordredwe did today's process yesterday to find the time length ... but we didn't check for data integrity afterwards19:52
corvusoh, so that's not the reason it took longer19:52
mordredold etherpad is still shut down - how about I go ahead and start a dump, since it won't change regardless19:52
mordredwhile we investigate this19:52
corvusi'm concerned that if we do anything other than re-export, we will have a long tail of reports like "this pad from 2 summits ago where we had a 4-byte char doesn't work"19:53
fungiseems a reasonable next step, though we may also ought to temporarily shut production back down?19:53
fungijust so nobody makes changes and then gets surprised when they disappear19:53
corvusclarkb: if we did roll back recent edits of the test pad, what would that confirm?19:53
corvusfungi: i think so19:53
corvusfungi: we might need an announcement19:54
clarkbcorvus: that the recent utf8mb4 edits are likely to blame19:54
clarkbcorvus: if rolling back doesnt fix then probably something else is involved19:54
corvusclarkb: ack.  wfm19:54
fungii can draft an announcement (not in an etherpad), just a moment19:54
mordredyeah19:54
*** sgw has joined #opendev19:54
* mordred is running the dump from etherpad.openstack19:55
corvusfungi, clarkb: let's do the rollback test before shutting it down19:55
mordred++19:55
fungik19:55
corvusclarkb: you know how to do that?  are you in a good place to do it?19:55
clarkbI'm not in a good spot. Currently finishing lunch19:55
corvusmordred, fungi: ^ either of you?19:55
fungii can give it a try if someone tells me the pad name and desired revision19:55
clarkbin scrollback fungi had an api command for etherpad its that with a different url19:55
corvushttps://etherpad.opendev.org/p/clarkb-test19:56
corvusno idea the revision19:56
clarkbfungi: pad name is clarkb-test and revisions will be current -1 and work backward from there19:56
corvusis there a wy to find the current rev?19:56
fungiyeah, i've done rollbacks recently, i can also probe it with the get text call to find the right rev19:56
corvus(via api or db query?)19:56
corvuscool19:56
fungiyeah, i can get current rev number too19:56
fungietherpad api docs are pretty thorough19:56
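A rough sketch of how the N-1 rollback probing could be scripted. It only builds request URLs and does no network I/O. The API version string, the placeholder key, and the helper names are assumptions; `getRevisionsCount`, `getText` and `restoreRevision` are method names from the Etherpad HTTP API, but check the docs for the version your server exposes.

```python
from urllib.parse import urlencode

# Assumption: the server exposes this API version; adjust to match yours.
API_VERSION = "1.2.11"


def api_url(base, method, **params):
    """Build an Etherpad HTTP API URL for the given method and parameters."""
    return f"{base}/api/{API_VERSION}/{method}?{urlencode(params)}"


def rollback_candidates(current_rev, steps):
    """Revisions to try, newest first: N-1, N-2, ... (never below zero)."""
    return [current_rev - i for i in range(1, steps + 1) if current_rev - i >= 0]


if __name__ == "__main__":
    base = "https://etherpad.example.org"  # placeholder host
    key = "SECRET"                          # placeholder API key
    print(api_url(base, "getRevisionsCount", apikey=key, padID="clarkb-test"))
    for rev in rollback_candidates(10, 3):
        print(api_url(base, "restoreRevision", apikey=key, padID="clarkb-test", rev=rev))
```

Probing with `getText` at a given revision before calling `restoreRevision` avoids piling extra revisions onto the pad during the search.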
corvusmordred: ^ you want to pick up the announcement work?19:56
corvusmordred: or, if you are otherwise occupied, i can19:57
mordredcorvus: could you - I'm poking to make sure I'm dumping like I did before19:58
corvusmordred: on it19:58
corvuscorvus is writing announcement; fungi is rolling back test pad; mordred is exporting; clarkb is lunching19:59
fungioh yay, someone has defaced their wiki? https://github.com/ether/etherpad-lite/wiki/HTTP-API20:01
fungianyway i have some examples in my shell history on the old server to draw on20:02
mordredok - as an update- I am now dumping on etherpad01.opendev.org - which is what I did before. but I'm dumping on the _host_ not in the container20:03
fungigetRevisionsCount says 61520:03
mordredthe host does in fact have different my.cnf settings than the container20:03
corvushow does this look?  http://paste.openstack.org/show/791941/20:05
mordredalso - fwiw, last time we ran it in we had utf8 and utf8_bin set in the my.cnf for the container - then we loaded the dump, then we changed the settings to allow creating the mb4 chars20:06
mordredcorvus: yes, that looks good20:06
fungicorvus: lgtm20:06
corvusstatus notice Due to a database migration error, etherpad.opendev.org is offline until further notice.20:06
mordredyeah20:06
fungiclarkb: what revision did you want this rolled back to? or how can i identify it from the pad content?20:07
corvusfungi: keep doing N-120:07
fungiokay, can do, just a moment20:07
corvus#status notice Due to a database migration error, etherpad.opendev.org is offline until further notice.20:07
openstackstatuscorvus: sending notice20:07
-openstackstatus- NOTICE: Due to a database migration error, etherpad.opendev.org is offline until further notice.20:07
fungirestored it to revision 614 now20:08
corvusfungi: no joy for me20:08
mordrednope20:08
clarkbfungi: pad content wise mordred added some chinese characters that require 4 bytes20:08
clarkbits possible we have to go back more than one rev20:08
funginow at 61320:08
mordrednope20:08
corvusfungi: it may be a lot of revs, probably best for you to revert and test yourself :)20:08
fungidoing20:09
corvus(ftr i'm holding the email message until we actually confirm the problem and establish the end time)20:09
mordred++20:09
mordredI've put etherpad.opendev.org into the emergency file just to be safe20:10
openstackstatuscorvus: finished sending notice20:11
fungii've rewound 10 revs so far (down to 605) but will keep at it20:11
mordredok. dump is done, preparing to be able to restore20:12
mordredif we move forward, what I'd like to do is change the mysql settings for the container back to utf8 and utf8_bin20:13
clarkbfungi: fwiw I'm not sure what forces a rev to happen but if each character is a rev we might have to go back ~30?20:13
mordredthen do the restore, then change the settings back to utf8mb420:13
mordredsince that is the sequence we used last time and it worked20:14
fungiclarkb: i'm back 20 revs to 595 now20:14
corvusmordred: we may also want to get a dump of the new db after we shut it down20:14
mordredcorvus: ok20:14
corvusjust in case someone comes to us with "i lost 40 hours of work due to the rollback" and we want to try to help20:14
corvusi think it's unlikely, but it seems like a precaution we could take20:15
mordredI will dump that one with the current db settings20:16
corvussounds good.  but wait until we shut it down after fungi's experiment20:16
mordredyes20:16
fungii'm back 30 revs to 585 now20:17
fungistill no luck but can keep going20:17
mordredI think it's entirely likely it's _not_ mb4 related and that it might be any multi-byte char related - and may have been an inappropriate cross-encoding20:18
fungiand yeah, a rev can be as little as a single character added or removed20:18
corvusmordred: there are old multibyte chars in that pad, so that could make the experiment inconclusive20:19
mordredyeah20:19
fungii can try jumping back substantially further in that pad if we want20:19
mordredfungi: might as well20:19
clarkbfungi: sure its a test pad so this is what it is there for20:19
corvusmaybe we should just shut it down, save the new db (which would then be insurance against our multibyte hunch being wrong) and try the re-import?20:19
corvus^ (feel free to continue rollback experiment while considering this)20:20
mordredat this point probably a good idea20:20
mordredif the re-import doesn't work - we can always just turn old etherpad back on, apologize, and regroup for next week20:20
fungii'm starting to think the problem is not recent edits in that pad20:21
mordredyeah20:21
clarkbfwiw clarkb-test on etherpad-dev hasn't worked in years20:21
clarkbits not entirely abnormal for etherpad to decide its unhappy :/20:21
corvusclarkb: well, it worked on prod yesterday, right?20:21
mordredyeah20:21
clarkbcorvus: yes this particular one was fine yesterday20:22
corvusdoes anyone have any other pads with multibyte test chars?20:22
clarkbI'm just pointing out that we've had some sort of issue with pads breaking and never been able to narrow it down to a cause20:22
clarkbcorvus: I am not aware of any at the moment20:22
mordredI'm driving in root screen and ready to shut down - pending folks being ready for it20:22
corvusclarkb: ack; but without any other tests to triangulate, i think we have to assume this could affect any multibyte pad20:22
corvusi'm ready20:22
fungiyeah, i'm now getting an "internal error" from getText() calls to the api for that pad20:22
corvusfungi: ?20:22
fungii do not have another pad i can think of in that situation, no20:23
corvusclarkb, fungi: ready to shut down etherpad.opendev?20:23
fungii say let's proceed with the restore20:23
fungiyes20:23
clarkbya20:23
mordredok. I'm dumping the current state of the new db into /var/lib/mysql/etherpad-new.sql20:25
mordredthe new dump from old etherpad is etherpad.sql20:26
mordredonce this is done, I will restart mariadb with the utf8 settings back to the other settings, restore the dump, then we can restart one more time with the settings restored20:26
clarkbk20:27
clarkbI'm to keyboard now fwiw so let me know if I can help with anything at this point20:31
fungiand my food is here so i may want to step away and scarf it down20:35
rm_worksmcginnis: so on EM/EOL20:36
rm_worksmcginnis: the ocata branch of octavia was marked EM one year ago -- https://opendev.org/openstack/releases/commit/d57f810ad6555528a4af039b693453b551da4cf220:36
rm_worksame with pike: https://opendev.org/openstack/releases/commit/375a23874431bec0981e5545933e977d8f39800020:36
clarkbrm_work: that might be more appropriate for #openstack-release? (we are in the middle of some operational fun so keeping on topic right now is useful)20:36
rm_workah ok sorry :) will move20:37
fungithis situation also fits the incident response purpose for #opendev-meeting if we want to head back in there20:41
corvuslooks like the dump of the new db is done20:42
mordredno - that was me accidentally pasting text into the buffer20:42
corvusindeed20:43
corvusmysqldump proc is still running20:43
mordredI just pasted a particularly confusing chunk of text20:43
corvusat least it wasn't a chapter from your tell-all book20:44
fungithat could describe my typical friday afternoon20:44
mordredcorvus: I think it was20:44
*** sshnaidm|off has quit IRC20:46
mordrednow it's done20:47
corvuslooks like the dump of the new db is done20:47
mordredI'm going to swap the db settings restart mariadb then restore20:47
mordredwell - luckily that output is all just people with root already :)20:49
mordredso - the problem, I think - is that the old db was utf8mb4 at the table level, but we did the original dump with a connection setting of utf8 - that tells the interaction "I am speaking utf8, you are utf8mb4, please transcode for me"20:51
fungiaha20:52
mordredhrm. no - my explanation doesn't hold up20:52
mordredbecause what we did today should have worked the same - just eliding 2 transcodings20:52
mordredshrug. I'm confused20:53
corvusi guess we'll find out in a few minutes20:53
mordredthere's basically always three charsets - what the table is storing, what the server is speaking and what the client is speaking20:53
fungiright, without utf8mb4 set, people would have at most been able to enter 3-byte utf8 characters, which are a proper subset of 4-byte utf8 anyway20:53
mordredwhenever one doesn't match, mysql will translate from one to the other for you20:54
mordredthe combo that worked before was "table in utf8mb4, server in utf8, client in utf8" -> "client in utf8, server in utf8, table in utf8mb4"20:55
mordredwhat we tried today that did not work was "table in utf8mb5, server in utf8, client in utf8mb4" -> "client in utf8mb4, server in utf8mb4, table in utf8mb4"20:55
mordredif that makes sense20:56
mordredso - I think the issue is that the old server was configured to speak utf8 - but we were talking to it in utf8mb4 - so it did a step of transcoding that somehow became unhappy20:56
fungithere's a utf8mb5?21:00
fungior was that a typo above?21:00
mordredthat was a typo21:00
fungiokay, makes sense in that case21:01
mordredthe thing is while it _should_ all work, if etherpad had been writing things without the correct settings, it could have been writing bogus-ish data that would still get re-translated back out right-ish- and if we apply the same set of transformations and untransformations, the data stays the same21:02
mordredbut with an unbalanced set of transformations, assertions we're making about what the data actually is start to matter21:03
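An analogy in plain Python for the balanced versus unbalanced transformation point above. This is not MySQL's actual transcoding path; it assumes a simple UTF-8/Latin-1 mix-up, and the function names are hypothetical.

```python
def mis_transcode(text):
    """Encode as UTF-8 but read the bytes back as Latin-1 (a wrong assumption)."""
    return text.encode("utf-8").decode("latin-1")


def undo_mis_transcode(text):
    """Apply the exact inverse of mis_transcode."""
    return text.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    original = "caf\u00e9"
    stored = mis_transcode(original)      # what ends up "in the table"
    print(repr(stored))                   # bogus-ish data
    print(undo_mis_transcode(stored) == original)  # balanced: data survives
    print(stored == original)             # unbalanced: garbage is exposed
```

As long as every write is paired with the matching inverse on read, the stored garbage is invisible; dumping or reading through a different path skips the inverse and the mismatch shows up.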
corvusmordred: so it's the middle (server) later that you think did the deed, rather than the client difference21:04
corvuss/later/layer/21:04
mordredcorvus: yeah. *I think*21:04
mordredI *know* how I dumped it last time21:05
mordredI'm 95% positive I stored it before we swapped to mb4 - because I was actually worried about this21:05
mordredwhy that worry didn't translate to today I'm not sure21:05
corvusworry overload?21:06
mordredyeah21:06
mordredprolly21:06
clarkbI think its done21:18
clarkborwait maybe the window changed?21:19
clarkbno I thik its done21:19
corvushrm, screen is not responsive for me, but i agree there is no mysqldump process running21:19
mordredyes - done21:19
mordredI will now restart mariadb with the right settings21:19
corvusscreen woke up21:19
mordredand etherpad21:19
clarkbshould we try clarkb-test now?21:20
mordredwow. still broken :(21:20
fungihttps://etherpad.opendev.org/p/clarkb-test is still hanging for me, yeah21:21
corvusditto21:21
fungiwe're sure this exact state of the pad worked on the old server?21:21
fungimaybe we can compare the fields in the two databases?21:21
* fungi says with flippant wavy-hands typical of a friday evening21:22
clarkbfungi: unless someone broke it overnight21:22
corvuswe could probably restart the old server and directly access it?21:22
corvusi'll prepare to do that21:22
mordredkk21:22
clarkbcorvus: ya that would be one way to test it21:22
corvusdone; rejiggering my hosts file to test21:23
fungiif that works, then i give it 50/50 odds either the data is different in the imported db *or* new etherpad version isn't handling it like the old version did21:24
corvusit does not work -- chat loads but the main pad still hangs21:25
corvuslet me do a browser restart just to clear ambiguity21:25
mordredok. so it's entirely possible that old clarkb-test between the times of imports broke21:25
mordredand we're seeing that reflected here21:25
clarkbyes, the window is small but not from a computing perspective21:25
mordredwe were attempting to test writing mb4 chars to it21:26
mordredwe could have caused garbage to get written21:26
corvusconfirmed in both firefox and chromium21:26
corvusmordred: can you stop new server?21:26
mordreddone21:26
corvus(let's avoid more deltas in case we decide to roll *forward*)21:26
mordredI'm now unsure of how to validate success21:27
corvusi think at this point we could really use a pad with multibyte chars21:28
corvusmaybe we should go look at a pad index from the latest ptg21:28
mordredyeah. a pre-existing pad that people would expect to work21:28
mordredyeah21:28
corvushttps://wiki.openstack.org/wiki/PTG/Ussuri/Etherpads ?21:28
mordredhttps://wiki.openstack.org/wiki/PTG/Ussuri/Etherpads21:28
mordredyeah21:28
mordredcorvus: what's the ip for old etherpad?21:28
corvus23.253.238.66 etherpad.openstack.org21:29
fungietherpad01.openstack.org21:29
corvusis my hosts entry21:29
fungiwill also get you the old addresses21:29
fungias it's not been deleted from dns21:29
fungibut yeah, don't try to go to the hostname in a browser or you'll end up redirected obviously21:29
clarkbthe i18n pad seems like it may be what we want since different langauges use different chars more often21:29
corvushttps://etherpad.openstack.org/p/manila-shanghai-ptg-planning  has some chinese chars21:30
fungithey may be 3-byte, but it's worth checking at least21:31
mordredwant me to start new so we can check that one?21:31
corvusyeah, our best bet for a 4 byte would be a pile of poop21:31
fungialternative would be to figure out how it gets encoded in a mysqldump and then search the export for an example of poop21:31
mordredfwiw - I cannot diff the two dumps - diff runs out of memory21:32
fungieww21:33
fungii meant more like just diff the part of the dump for that pad, but yeah maybe that's not easy to isolate given how revisions are stored21:33
mordredyeah - I mean, the fact that the db format is ... INSANE ... doesn't help matters21:34
corvusmordred says poop a lot21:35
fungialso i've confirmed that those three han characters in the manila pad are 3-byte21:35
fungi\xe6\xb5\xa6, \xe6\xb1\x9f and \xe5\xae\xb4 respectively21:36
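A small sketch of the 3-byte versus 4-byte check, using the byte sequences quoted above and the pile-of-poo code point (U+1F4A9). The helper names are hypothetical. The rule it encodes is that MySQL's `utf8` charset only stores characters in the Basic Multilingual Plane (up to 3 bytes in UTF-8), while anything above U+FFFF needs `utf8mb4`.

```python
def utf8_width(ch):
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text):
    """True if any code point lies outside the BMP (4 bytes in UTF-8)."""
    return any(ord(ch) > 0xFFFF for ch in text)


if __name__ == "__main__":
    # The three han characters from the manila pad, by their raw bytes.
    han = b"\xe6\xb5\xa6\xe6\xb1\x9f\xe5\xae\xb4".decode("utf-8")
    print([utf8_width(ch) for ch in han], needs_utf8mb4(han))
    poo = "\U0001F4A9"
    print(utf8_width(poo), needs_utf8mb4(poo))
```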
corvusi don't think my "google for etherpad poop" idea is going to work21:36
mordredwell - also worth confirming that they're good - so that's still a good pad to check21:36
mordredcorvus: can you google for etherpad (insert poop emoji)21:36
mordredgoogling for: "💩" site:etherpad.openstack.org - did not work21:37
fungipretty sure google doesn't index our etherpad (probably can't)21:38
corvuswelp, i think the only idea i have now is to roll forward again and go test a bunch of etherpads21:38
fungii mean, they have a fake browser javascript renderer engine they use for indexing js-only site content, but i have doubts it would handle something like etherpad21:38
fungialternatively, create a new pad on the old server with poo in it, do a new db dump, import that...21:39
fungibut that's going to eat up a ton of time21:39
mordredI think I'm with corvus here - I can't think of anything more specific to test21:40
mordredmaybe before we do real quick - let's enable new etherpad and check that 3 byte page above21:40
mordredthat way if we roll forward and it doesn't work we'll know if rolling back will improve things21:40
clarkbthat sounds reasonable21:40
corvusheh, i found a convo from 2015, but it's about the clarkb-test etherpad21:40
mordredhahaha21:41
fungii'm in favor of just cranking it back up, and then maybe data mining the mysqldump we imported to try and identify pads with 4-byte characters in them21:41
* mordred starting real quick21:41
fungiat this point we don't have any evidence that it's actually broken21:41
mordredhttps://etherpad.opendev.org/p/manila-shanghai-ptg-planning works21:41
clarkbagreed21:41
clarkb*agreed, that etherpad works21:41
mordredso at least we're ok with the most common cases21:41
mordredI think at this point shut back down, re-roll forward - double check that one as a smoke test21:42
corvus++21:42
mordredand in the meantime we can try mining for more data21:42
mordredk. I'm about to rollforward to this afternoon's new dump - everyone ok with that?21:43
corvusyep21:43
corvusmordred: oh21:43
corvusmordred: what settings will you use to restore?21:43
mordredthe current ones - they're the ones we used to dump21:43
corvussounds good21:44
fungigo for it21:45
clarkbI got no better ideas21:47
corvusi'll stop the old etherpad server now21:48
*** diablo_rojo has quit IRC21:54
clarkbI need to get a bike ride in if I'm gonna get one in today. Back in a bit22:01
mordredhave fun!22:03
corvusdone22:09
mordredI agree22:11
mordredshall we start er up?22:11
corvusyep22:11
mordredok. the shanghai pad still works22:11
corvusagreed22:12
mordredI'm going to move both of the db dumps to /opt/db which is on the ephemeral - so that we don't keep low disk there22:13
corvuskk22:13
corvusi'll go through a bunch of the U ptg etherpad lists22:13
mordredI also just did a quick double-check and I'm comfortable that we have enough space on / for the db backups cron22:14
corvuseverything A-M looks good22:16
corvuscontinuing22:16
corvusand cinder had a lot of non-roman chars22:16
mordredgood for cinder!22:17
corvusok, i looked at every pad on https://wiki.openstack.org/wiki/PTG/Ussuri/Etherpads and they all load ok22:17
corvusi think we can just skip sending the email and send a status notice saying "all good"22:18
corvusfungi: are you around?22:18
fungiyep22:20
fungisounds good to me22:20
fungii'm working out how to maybe identify '💩' (b'\xf0\x9f\x92\xa9') in a mysqldump22:21
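One way to approach the search fungi describes is to scan the dump as raw bytes for any 4-byte UTF-8 sequence instead of grepping for a single emoji. This is a sketch under assumptions: the function name `find_four_byte_sequences` is hypothetical, and it presumes the mysqldump output contains raw UTF-8 bytes. If the dump was written with a 3-byte character set, such characters may already have been replaced, so finding nothing would not prove there were none in the original data.

```python
import re

# A 4-byte UTF-8 sequence starts with 0xF0-0xF4 followed by three continuation bytes.
FOUR_BYTE = re.compile(rb"[\xf0-\xf4][\x80-\xbf]{3}")


def find_four_byte_sequences(path, limit=10, chunk_size=1 << 20):
    """Return up to `limit` (byte_offset, text) pairs for 4-byte UTF-8 sequences."""
    hits = []
    offset = 0   # file offset of the first byte of `buf`
    tail = b""   # last 3 bytes of the previous buffer, kept for boundary matches
    with open(path, "rb") as fh:
        while len(hits) < limit:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            for match in FOUR_BYTE.finditer(buf):
                text = match.group().decode("utf-8", "replace")
                hits.append((offset + match.start(), text))
                if len(hits) >= limit:
                    break
            # Carry 3 bytes over so a sequence split across two reads is still seen.
            tail = buf[-3:]
            offset += len(buf) - len(tail)
    return hits
```

Reading in fixed-size chunks keeps memory use flat, which matters here given that diffing the two dumps ran out of memory.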
corvusstatus notice Maintenance on etherpad.opendev.org is complete and the service is available again22:23
corvusmordred: ^ should we go ahead and send that?22:23
fungithat looks good to me22:23
mordred++22:23
corvus#status notice Maintenance on etherpad.opendev.org is complete and the service is available again22:23
openstackstatuscorvus: sending notice22:23
-openstackstatus- NOTICE: Maintenance on etherpad.opendev.org is complete and the service is available again22:23
mordredfungi: I'm attempting to grep for 💩22:24
mordredno clue if it'll work that way22:24
mordredit did not find any :)22:25
openstackstatuscorvus: finished sending notice22:27
fungiout of curiosity, does anybody see how to set content styles in the etherpad ui now?22:28
fungiused to have a drop-down for heading levels along with a "code" style22:28
fungii wonder if that got replaced by functionality in the new skins22:28
mordrednope - I do not see that22:29
fungiat any rate, i've created https://etherpad.opendev.org/p/tANNPB4DN0J936odiiBj with 3 and 4 byte characters22:32
corvuslooks right to me22:33
corvusi think the loss of style dropdown is disappointing; i used that a lot22:34
mordredyeah - I think it's worth investigating to see if we can add it back somehow22:34
fungias did i. i suspect they decided it was incompatible with line numbering in the default theme22:34
fungiwe may just need a theme which is not as overblown as the colibris one they standardized on as their recommended default22:35
fungimordred: i don't think "rechecj" is going to do anything ;)22:37
mordredit should know what I mean22:37
mordredfungi: I was trying to find a plugin for the heading styles ... but wow the plugins page is like o_O22:43
mordredfungi: maybe https://www.npmjs.com/package/ep_headings_css ?22:43
funginow may also be a bad time to settle on plugins, since 1.8.1 (or .3?) is expected to break a bunch of them22:43
mordredneat!22:44
mordredfungi: we could restart with skinName: noSkin22:49
mordredfungi: we could restart with skinName: no-skin22:49
mordredwe're not configuring a skin right now22:49
mordredoh - actually colibris isn't supposed to be default until 2.022:50
mordredso I think we're still using no-skin22:50
fungiwe're on no-skin now, yes22:51
mordredfungi: colibris seems to have heading selector: https://pad.colibris-outilslibres.org/p/YAPAkpKBfh22:51
fungicolibris is what you get if you use their recommended configuration but we overwrite it with a config which doesn't specify a skin22:52
fungiso the in-code fallback is to no-skin22:52
fungiand yeah, i demoed the colibris skin on etherpad-dev briefly but everyone (me included) was, like, "eww"22:53
*** hashar has joined #opendev22:58
mordrednod. I think I missed that. I think it looks nice - but I'm likely weird23:01
mordredI can't find anything in the current source tree that would implement the style dropdown23:01
fungiyeah, the colibris theme seems to me like it's designed to satisfy people who want etherpad to be gdocs23:02
fungiso it has page margins and edges and all that23:02
mordredalthough I can't find anything in 1.7.0 that does either23:02
fungiand wastes a lot of space in the browser as a result23:02
mordredI do not know why old etherpad had that menu23:04
fungicould we have been using a plugin and forgotten?!?23:04
mordredaha!23:05
mordredep_headings23:05
mordredis in node_modules23:05
mordredetherpad_lite::plugin { 'ep_headings':23:06
mordredk. now let me learn how to enable that23:06
mordredwow. it's just an npm install23:07
mordredwell - we can have the headings dropdown back - but we will need to build our own image23:07
mordredso let's maybe wait for 1.8.3 and see what breaks before we do that?23:07
fungiyeah23:07
fungii wonder how existing pads which had headings set in them are rendering now23:08
mordredfungi: soooooooo23:11
mordredfungi: game to try something completely silly?23:11
fungisure23:11
mordredwe can bind-mount the plugin in23:11
mordredand in theory it should "just work"23:11
fungithat sounds way preferable to building new images ourselves23:12
*** hashar has quit IRC23:13
mordredfungi: I have put a copy of the ep_headings module into /etc/etherpad/node_modules23:14
mordredfungi: and in the root screen you can see the line I added to docker-compose.yaml23:14
mordredwe can restart etherpad real quick to see if that works - and if it does I'll work up a change to do it properly23:15
fungijumping back into that root screen now23:16
fungiahh, yep23:17
mordred(bottom line)23:17
fungiup for it23:17
mordredk. giving it a go23:17
fungiservice unabailable23:17
fungiunavailable23:17
fungiso maybe not23:18
mordredyeah. it's a permissions thing ...23:19
mordredthere it is23:20
mordredk. we've learned how to fake this :)23:20
*** tosky has quit IRC23:20
* mordred is going to leave etherpad.opendev.org in the emergency list until that's done in ansible23:21
fungiit works!23:21
mordredso - basic things I did - on my laptop in a random directory I ran "npm install ep_headings" - I tarred up the node_modules dir that produced, copied it over, untarred it in /etc/etherpad - and then touched /etc/etherpad/node_modules/ep_headings/.ep_initialized and chmodded it for write23:22
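The server-side half of the steps mordred lists can be sketched as a small script. This is an illustration only, not the change later proposed in Ansible: the function name `stage_plugin` is hypothetical, the tarball is assumed to contain a top-level `node_modules` directory produced by running `npm install ep_headings` elsewhere, and the `0o666` mode is an assumption, since the log only says the marker file was made writable.

```python
import tarfile
from pathlib import Path


def stage_plugin(tarball, etherpad_dir, plugin="ep_headings"):
    """Unpack a node_modules tarball and create the plugin's .ep_initialized marker."""
    etherpad_dir = Path(etherpad_dir)
    # The tarball is one we built ourselves, so it is treated as trusted input.
    with tarfile.open(tarball) as tar:
        tar.extractall(etherpad_dir)
    marker = etherpad_dir / "node_modules" / plugin / ".ep_initialized"
    marker.touch()
    # Assumption: world-writable; the log does not state the exact mode used.
    marker.chmod(0o666)
    return marker
```

On the real host the second argument would be `/etc/etherpad`; the resulting `node_modules` directory is then what gets bind-mounted into the container.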
fungineat23:23
fungiso if we decide to add other plugins, that's basically how we'd go about it?23:23
mordredyeah23:23
fungihere's hoping they don't break that one in the next release, at least23:24
mordredyeah23:24
*** DSpider has quit IRC23:26
clarkbmordred: fungi are we going to ansiblify that?23:43
mordredyup23:45
openstackgerritMonty Taylor proposed opendev/system-config master: Install ep_headings module  https://review.opendev.org/71912323:51
mordredclarkb, fungi :^^23:51
clarkb+223:52
openstackgerritMerged opendev/system-config master: Set env vars pointing to correct file locations  https://review.opendev.org/71905323:57
clarkboh it just occurred to me is ^ safe with the current running gerrit?23:58
clarkbmordred: ^23:58
clarkbreview is not in the emergency file so that may break us?23:59
