Saturday, 2020-05-30

fungi: corvus: thanks for looking into it, at least we can warn folks and offer options [00:05]
*** factor has joined #opendev [00:22]
*** stephenfin has quit IRC [04:42]
*** stephenfin has joined #opendev [04:53]
*** calcmandan has quit IRC [04:57]
*** calcmandan has joined #opendev [04:59]
*** stephenfin has quit IRC [05:02]
*** ravsingh has joined #opendev [05:22]
*** ravsingh has quit IRC [06:17]
*** sgw has quit IRC [06:26]
AJaeger: ianw: https://opendev.org/openstack/requirements/src/branch/stable/stein/.zuul.d/project.yaml#L8 https://opendev.org/openstack/requirements/src/branch/stable/train/.zuul.d/project.yaml#L7 https://opendev.org/openstack/requirements/src/branch/stable/ussuri/.zuul.d/project.yaml#L6 https://opendev.org/openstack/requirements/src/branch/stable/rocky/.zuul.d/project.yaml#L7 [06:26]
AJaeger: ianw: I suggest to move these master only jobs back to project-config so that removal is not forgotten... [06:26]
openstackgerrit: Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: test-playbooks: avoid warnings with shell/command  https://review.opendev.org/731605 [07:06]
openstackgerrit: Tobias Urdin proposed openstack/project-config master: Remove retired congress  https://review.opendev.org/731889 [07:22]
*** DSpider has joined #opendev [08:00]
*** moppy has quit IRC [08:01]
*** moppy has joined #opendev [08:01]
*** elod has quit IRC [09:02]
*** elod has joined #opendev [09:08]
openstackgerrit: Guillaume Chauvel proposed zuul/zuul-jobs master: ensure-twine: Update using same format as ensure-tox  https://review.opendev.org/731854 [09:39]
*** elod has quit IRC [09:57]
*** dpawlik has joined #opendev [10:04]
*** elod has joined #opendev [10:04]
*** dpawlik has quit IRC [10:43]
*** tosky has joined #opendev [11:06]
*** dpawlik has joined #opendev [11:18]
openstackgerrit: Monty Taylor proposed opendev/puppet-openstack_infra_spec_helper master: Install hosts and group files into service location  https://review.opendev.org/731583 [11:53]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/730991 [11:53]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Override bridge hostvars directly  https://review.opendev.org/731258 [11:53]
*** dpawlik has quit IRC [12:00]
*** dpawlik has joined #opendev [12:00]
*** dpawlik has quit IRC [12:03]
*** dpawlik has joined #opendev [12:03]
corvus: fungi, clarkb: jbg in #jitsi says xmpp muc names are case-insensitive, and suggests thinking about making etherpad the same to match. [13:15]
corvus: of course, we have 10 years of history in there...  but maybe we could do some db queries to see how many collisions there would be if we did that [13:16]
corvus: honestly, it's probably not a bad idea -- the case-sensitivity here isn't really getting us much benefit, and can provide minor confusion. [13:17]
fungi: i can run a query [13:20]
fungi: and yeah, we could basically just rename any with mixed case and then redirect to case-insensitive names from any case mix [13:21]
corvus: ya [13:21]
corvus: (i doubt this is something we'd want to do right before a ptg, but thinking ahead) [13:22]
fungi: if nothing else, getting a count of case-insensitive collisions will help inform our decision [13:26]
fungi: the hardest part will be fiddling with docker-compose to invoke mysqlclient ;) [13:26]
fungi: ahh, got it working [13:29]
fungi: i think there must be something special involved with getting interactive mode to work, but i can do noninteractive with -e just fine [13:29]
corvus: fungi: did you pass '-it' ? [13:30]
fungi: ahh, so there is a special flag for that ;) [13:32]
fungi: anyway, after reacquainting myself with etherpad's database schema, i think listing pads with the api will be more productive [13:32]
*** larainema has quit IRC [13:32]
corvus: fungi: oh, are pad names not a column? [13:36]
fungi: not even remotely [13:42]
fungi: it wants to use something like mongodb [13:43]
fungi: the etherpad-lite database has a single table called "store" with two columns, "key" and "value" [13:43]
fungi: that's the entirety of the db schema [13:43]
corvus: gotcha.  i was hoping it would be ('pad', 'key', 'value') [13:44]
corvus: that's clearly just crazytalk [13:44]
fungi: so it's a matter of working out what the patterns are for the key names and filtering with key like "%something%" [13:44]
fungi: but the rest api is entirely serviceable: https://github.com/ether/etherpad-lite/blob/master/doc/api/http_api.md#listallpads [13:44]
fungi: i've got that dumping to a json file now [13:45]
fungi: taking a while [13:45]
fungi: wget -qO- 'http://localhost:9001/api/1.2.13/listAllPads?apikey='$(sudo docker-compose -f /etc/etherpad-docker/docker-compose.yaml exec etherpad cat /opt/etherpad-lite/APIKEY.txt) > padlist.json [13:46]
fungi: for the record [13:46]
fungi: no idea how many there are, since we're just one api rev shy of when they recently implemented a getStats method [13:49]
fungi: our deployment supports up to api 1.2.13, and getStats was implemented for api 1.2.14 [13:49]
fungi: i was hoping we could get https://review.opendev.org/729029 in before the ptg, but timing was tight [13:50]
fungi: yeah, 1.3mb json file just for the list of pads [13:55]
fungi: parsing with python now [13:55]
fungi: >>> len(pads['data']['padIDs']) [13:56]
fungi: 58435 [13:56]
fungi: now to figure out what case collisions we've got [13:57]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/730991 [14:00]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Override bridge hostvars directly  https://review.opendev.org/731258 [14:00]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Stop cloning drupal puppet modules  https://review.opendev.org/731947 [14:00]
fungi: 2304 collisions of two or more pad names, out of 58435 total pads [14:07]
fungi: so it's not going to be trivial [14:07]
fungi: granted, a lot of these look like user error, for example [14:07]
fungi: ['upstream-Institute-shanghai-2019', 'upstream-institute-shanghai-2019'] [14:08]
fungi: ['watcher-Boston-Meetings', 'watcher-Boston-meetings', 'watcher-boston-meetings'] [14:08]
fungi: ['YVR-forum-fast-forward-upgrades', 'Yvr-forum-fast-forward-upgrades', 'yvr-forum-fast-forward-upgrades'] [14:08]
fungi: looking at the list, we'd likely be doing our users a favor by making pad names case-insensitive [14:09]
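
The log does not show the code used to count the collisions. As a minimal sketch of one way to find them, the snippet below groups pad IDs by their lowercased form and keeps the groups with two or more members. The function name `find_case_collisions` and the use of `str.lower()` for case folding are assumptions for illustration; the sample IDs are the ones quoted above, plus one made-up non-colliding name.

```python
from collections import defaultdict


def find_case_collisions(pad_ids):
    """Group pad IDs that differ only by letter case.

    Hypothetical helper: returns a dict mapping the lowercased name to the
    list of original IDs, for names shared by two or more IDs.
    """
    groups = defaultdict(list)
    for pad_id in pad_ids:
        # Assumption: plain lower() is an adequate case fold for pad names.
        groups[pad_id.lower()].append(pad_id)
    return {folded: names for folded, names in groups.items() if len(names) > 1}


# Pad IDs quoted in the log, plus one illustrative ID with no collision.
sample_ids = [
    "upstream-Institute-shanghai-2019",
    "upstream-institute-shanghai-2019",
    "watcher-Boston-Meetings",
    "watcher-Boston-meetings",
    "watcher-boston-meetings",
    "some-unique-pad",
]

for names in find_case_collisions(sample_ids).values():
    print(names)
```

Against the real data, the input would be the `pads['data']['padIDs']` list loaded from `padlist.json`, as in the Python session quoted earlier.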
fungi: lots also just have the welcome text in them. i wonder if there's a good way to cull those [14:10]
AJaeger: fungi: upstream-Institute-shanghai-2019 just has the default text, can you figure out via an API call whether a pad has content (at revision 0?) [14:11]
fungi: yeah, the getText method is what i'm using to spot check some of them [14:11]
fungi: it would take a while, but i could probably iterate over every padID and generate a checksum of the text, then look for checksum collisions to identify duplicate pads [14:12]
fungi: checksums for the various default texts we've had over time are likely to come out orders of magnitude higher than any others [14:13]
fungi: i've got a first attempt at that running now, will see how far it gets [14:37]
*** elod has quit IRC [15:02]
fungi: trying again. apparently without a retry in place, the odds that one of ~60k http calls will fail is nonzero ;) [15:10]
*** elod has joined #opendev [15:19]
mordred: fungi: doh [15:20]
fungi: hrm, nope, even then some calls fail, maybe broken pads. i'll just add a try/except around it, missing some pads isn't going to hurt for this purpose anyway [15:20]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/730991 [15:35]
*** dpawlik has quit IRC [15:44]
fungi: around half the pads (24136) have contents with the same exact checksum... that's our current default pad text [15:56]
fungi: we have a total of 29732 unique pad content checksums [15:58]
AJaeger: out of those 2304 collisions, how many are unique? [16:09]
AJaeger: so, should we delete those 24136 pads? [16:09]
AJaeger: guess, it's not worth the effort... [16:10]
fungi: there's a long tail of duplicate contents [16:15]
fungi: the most common text is what you see in https://etherpad.opendev.org/p/!! [16:16]
fungi: 24136 pads with that in them [16:16]
fungi: a representative of the next most common pad contents is https://etherpad.opendev.org/p/-3cNAME-OF-YOUR-BLUEPRINT [16:17]
fungi: so empty [16:17]
fungi: there are 1456 of those [16:17]
fungi: an example of the next most common is https://etherpad.opendev.org/p/-3Cdeploymentprocess-3E [16:19]
fungi: looks like a different default text [16:20]
fungi: there are 1132 of those [16:20]
fungi: then there's 400 like this one (another default text) https://etherpad.opendev.org/p/,,,,,,, [16:50]
AJaeger: you find some interesting URLs ;) [16:51]
fungi: those are just the ones sorting first [16:51]
fungi: 339 like https://etherpad.opendev.org/p/0Bcf9qsSUU where there's default text plus an abiword error [16:51]
fungi: 227 like https://etherpad.opendev.org/p/.xyz-a41837f9-5ea8-4652-a2ac-009c9 with another default content [16:52]
fungi: 114 with this default text https://etherpad.opendev.org/p/+43,-79 [16:52]
fungi: 112 which are empty like https://etherpad.opendev.org/p/18amNZslXZ but for some reason not the same checksum as the other empty ones [16:53]
fungi: 105 with default text plus some blank lines, like https://etherpad.opendev.org/p/0111 [16:54]
fungi: 53 more like that but with a different number of blank lines [16:55]
fungi: maybe i'll rerun this and strip leading/trailing whitespace to see if that helps condense the list [16:56]
*** dpawlik has joined #opendev [17:39]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/730991 [17:46]
fungi: just realized i had also been checksumming the full json payload from the getText query, not only the data.text subfield [18:00]
fungi: so between that and stripping leading/trailing whitespace i expect to get even more clear results [18:00]
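
The script itself is not in the log. The sketch below is one plausible shape for the corrected pass: hash only the whitespace-stripped `data.text` field of each getText response, count how often each checksum occurs, and skip pads whose call fails. SHA-256 is an assumption (the log does not name the checksum), and `fetch_text` is a hypothetical callable standing in for the HTTP request so the example runs without a server.

```python
import hashlib
from collections import Counter


def text_checksum(payload):
    """Hash only the stripped data.text field, not the whole JSON payload."""
    text = payload["data"]["text"].strip()
    # Assumption: SHA-256; any stable digest would serve the same purpose.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def count_duplicate_contents(pad_ids, fetch_text):
    """Count pads per content checksum, skipping pads that fail to load.

    fetch_text is a hypothetical callable that returns the parsed getText
    response for a pad ID and raises an exception on failure.
    """
    counts = Counter()
    skipped = []
    for pad_id in pad_ids:
        try:
            counts[text_checksum(fetch_text(pad_id))] += 1
        except Exception:
            # Missing a few broken pads does not matter for this survey.
            skipped.append(pad_id)
    return counts, skipped


# Stand-in responses (made up) in place of real getText calls.
fake_responses = {
    "pad-a": {"code": 0, "message": "ok", "data": {"text": "Welcome!\n"}},
    "pad-b": {"code": 0, "message": "ok", "data": {"text": "  Welcome!\n\n"}},
    "pad-c": {"code": 0, "message": "ok", "data": {"text": "real notes"}},
}

counts, skipped = count_duplicate_contents(
    ["pad-a", "pad-b", "pad-c", "pad-broken"], fake_responses.__getitem__
)
print(counts.most_common(1), skipped)
```

Stripping before hashing is what lets the "default text plus some blank lines" variants collapse into one bucket.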
fungi: should hopefully know soon [18:00]
fungi: yep, that's compressed the duplicates count curve [18:30]
*** dpawlik has quit IRC [18:37]
fungi: 24417 with just our current welcome text, 1790 are empty or contain only whitespace, 1144 and 408 with a couple different older default texts, 341 default text with an abiword error appended, 171 which consist solely of an empty bullet list entry, 23 with the first line of welcome text set as a heading style [18:39]
fungi: we could probably stand to delete all of those [18:40]
fungi: but the bigger question is, what's the overlap between that and the case-insensitive padID collisions... [18:41]
*** sgw has joined #opendev [19:02]
clarkb: fungi: the rocket should be flying over you nowish [19:27]
fungi: ooh, better than the one yesterday [19:30]
fungi: too bright out to see anything though [19:31]
*** sshnaidm has joined #opendev [19:40]
fungi: okay, so if we were to delete all those empty and default content pads, the number of actual case-insensitive padID collisions we'd have to deal with is still 504 [20:28]
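
How the overlap was computed is not shown. A rough sketch of that step, assuming the empty and default-content pad IDs have already been collected into a list: drop those IDs first, then regroup the remainder by lowercased name. The name `remaining_collisions` is hypothetical, and treating `watcher-Boston-meetings` as a junk pad is made up for the example; `upstream-Institute-shanghai-2019` was reported above as holding only default text.

```python
def remaining_collisions(pad_ids, junk_ids):
    """Return case-insensitive collision groups left after dropping junk pads."""
    junk = set(junk_ids)
    groups = {}
    for pad_id in pad_ids:
        if pad_id in junk:
            continue  # empty or default-content pad, assumed safe to delete
        groups.setdefault(pad_id.lower(), []).append(pad_id)
    return [names for names in groups.values() if len(names) > 1]


pad_ids = [
    "upstream-Institute-shanghai-2019",
    "upstream-institute-shanghai-2019",
    "watcher-Boston-Meetings",
    "watcher-Boston-meetings",
    "watcher-boston-meetings",
]
# Illustrative junk list; the second entry is an assumption, not from the log.
junk_ids = ["upstream-Institute-shanghai-2019", "watcher-Boston-meetings"]

for names in remaining_collisions(pad_ids, junk_ids):
    print(names)
```

Groups that shrink to a single surviving pad drop out of the result, which is why the count falls once the junk pads are excluded.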
fungi: which, yes, is more than we're going to deal with in a weekend [20:28]
fungi: odds are most of those are also dupes or trash, but they'll need more careful inspection [20:28]
fungi: i have a list [20:29]
openstackgerrit: Guillaume Chauvel proposed zuul/zuul-jobs master: ensure-twine: Check executable presence using shell+bin/bash  https://review.opendev.org/731854 [20:59]
openstackgerrit: Monty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/730991 [21:19]
openstackgerrit: Albin Vass proposed zuul/zuul-jobs master: Add focal support for ensure-pip  https://review.opendev.org/731993 [21:33]
openstackgerrit: Albin Vass proposed zuul/zuul-jobs master: Add focal support for ensure-pip  https://review.opendev.org/731993 [21:49]
openstackgerrit: Albin Vass proposed zuul/zuul-jobs master: Add ubuntu-focal testing  https://review.opendev.org/731995 [21:49]
*** DSpider has quit IRC [22:25]
*** tosky has quit IRC [23:22]