Monday, 2019-07-22

*** dchen has joined #openstack-infra00:02
*** armax has quit IRC00:09
*** tkajinam has quit IRC00:12
*** takamatsu has quit IRC00:18
<openstackgerrit> Merged opendev/irc-meetings master: Updated the PowerVM Chair Person Info  https://review.opendev.org/671929  00:18
*** dklyle has joined #openstack-infra00:28
*** bobh has joined #openstack-infra00:28
*** armax has joined #openstack-infra00:32
<openstackgerrit> Ian Wienand proposed opendev/system-config master: Add some pointers on the OpenDev PPA  https://review.opendev.org/670952  00:59
*** mgoddard has quit IRC01:01
*** mgoddard has joined #openstack-infra01:05
*** imacdonn has quit IRC01:17
*** imacdonn has joined #openstack-infra01:18
*** ruffian_sheep has joined #openstack-infra01:25
*** ruffian_sheep has quit IRC01:41
*** ruffian_sheep has joined #openstack-infra01:46
*** ruffian_sheep has quit IRC02:08
*** zul has quit IRC02:09
*** ruffian_sheep has joined #openstack-infra02:21
*** ruffian_sheep has quit IRC02:31
*** bobh has quit IRC02:34
*** ruffian_sheep has joined #openstack-infra02:37
*** ruffian_sheep27 has joined #openstack-infra02:38
*** ruffian_sheep has quit IRC02:41
*** ykarel|away has joined #openstack-infra02:43
*** ykarel|away has quit IRC02:56
*** yamamoto has joined #openstack-infra03:00
<ianw> i do not like the full set of POST_FAILURES i just got for that ... :/  03:04
<ianw> nor this -- [Mon Jul 22 00:45:20 2019] EXT4-fs (dm-2): Remounting filesystem read-only  03:05
<ianw> /srv/static/logs is offline ... looking into it  03:08
<ianw> i'm going to reboot the host with the mount commented out in fstab; it could do with new kernel updates anyway.  we can buffer logs to local disk while i presumably run a prolonged fsck  03:09
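The recovery plan above can be sketched as shell steps; the device and mount point are from the log, but the exact commands are assumptions, shown against a sample fstab so the commenting step actually runs (the real operation needs root):

```shell
# Sketch of ianw's recovery sequence; /dev/mapper/logs and /srv/static/logs
# are from the log above, everything else is illustrative.
cat > /tmp/fstab.demo <<'EOF'
/dev/mapper/logs /srv/static/logs ext4 defaults,noatime 0 2
EOF

# 1. Comment the damaged volume out so the host reboots cleanly.
sed -i 's|^/dev/mapper/logs|#&|' /tmp/fstab.demo
cat /tmp/fstab.demo

# 2. Buffer new logs on local disk, 3. run the long fsck in a detached
#    screen session (echoed only; both need root on the real host):
echo 'mount --bind /opt /srv/static/logs'
echo 'screen -dmS fsck fsck.ext4 -y /dev/mapper/logs'
```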
*** yamamoto has quit IRC03:10
*** yamamoto has joined #openstack-infra03:14
*** yamamoto has quit IRC03:21
*** yamamoto has joined #openstack-infra03:26
*** yamamoto has quit IRC03:26
*** yamamoto has joined #openstack-infra03:27
*** ykarel|away has joined #openstack-infra03:31
*** psachin has joined #openstack-infra03:32
*** bhavikdbavishi has joined #openstack-infra03:34
*** ruffian_sheep27 is now known as ruffian_sheep03:36
*** yamamoto has quit IRC03:37
*** yamamoto has joined #openstack-infra03:39
*** bhavikdbavishi1 has joined #openstack-infra03:41
*** bhavikdbavishi has quit IRC03:42
*** bhavikdbavishi1 is now known as bhavikdbavishi03:42
*** yamamoto has quit IRC03:44
*** yamamoto has joined #openstack-infra03:47
*** yamamoto has quit IRC03:48
*** yamamoto has joined #openstack-infra03:55
<ianw> #status log logs.o.o : /srv/static/logs bind mounted to /opt for buffering, recovery of /dev/mapper/logs proceeding in a root screen session  04:02
<openstackstatus> ianw: finished logging  04:02
*** udesale has joined #openstack-infra04:02
<ianw> infra-root: ^ fyi  04:02
<ianw> infra-root: note mounts recorded in /etc/fstab; will need to be restored  04:04
*** ykarel|away has quit IRC04:05
*** raukadah is now known as chandankumar04:10
<ianw> argh, i just realised i should have copied out the errors too; there were no i/o-type errors in the logs at the time, just some directory-related ext4 issues and then it went r/o per above at 00:45  04:15
*** jamesmcarthur has joined #openstack-infra04:21
*** ykarel|away has joined #openstack-infra04:23
<ianw> i think it's gone, oh well :/  04:24
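A hedged sketch of keeping those kernel messages next time: snapshot the ring buffer to a file, or make journald persistent (the journal directory is the usual Debian/Ubuntu default path; that step needs root, so it is only echoed):

```shell
# Snapshot the kernel ring buffer before it scrolls away; the redirect
# creates the file even on kernels where dmesg is restricted.
dmesg > /tmp/dmesg-snapshot.txt 2>/dev/null || true

# Making journald persist across reboots (root-only, echoed here):
echo 'mkdir -p /var/log/journal && systemctl restart systemd-journald'
```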
*** ramishra has joined #openstack-infra04:27
*** yamamoto has quit IRC04:30
<openstackgerrit> Merged opendev/system-config master: Add some pointers on the OpenDev PPA  https://review.opendev.org/670952  04:39
*** yamamoto has joined #openstack-infra04:46
*** threestrands has joined #openstack-infra05:00
*** jamesmcarthur has quit IRC05:04
*** Lucas_Gray has joined #openstack-infra05:06
*** jamesmcarthur has joined #openstack-infra05:06
*** jamesmcarthur has quit IRC05:10
*** Lucas_Gray has quit IRC05:44
*** lmiccini has joined #openstack-infra05:46
*** ykarel|away is now known as ykarel05:49
*** AJaeger has quit IRC05:53
<ianw> ahh, it's probably worth alerting that *old* logs aren't there for a bit while we reset  05:56
<openstackgerrit> Merged opendev/system-config master: Allow to rsync Centos Software Collections repo  https://review.opendev.org/671449  06:00
<ianw> #status alert Due to a failure on the logs.openstack.org volume, old logs are unavailable while partition is recovered.  New logs are being stored.  ETA for restoration probably ~Mon Jul 22 12:00 UTC 2019  06:01
<openstackstatus> ianw: sending alert  06:01
<ianw> well there you go, i send that and the fsck just finished  06:02
-openstackstatus- NOTICE: Due to a failure on the logs.openstack.org volume, old logs are unavailable while partition is recovered. New logs are being stored. ETA for restoration probably ~Mon Jul 22 12:00 UTC 2019  06:04
*** ChanServ changes topic to "Due to a failure on the logs.openstack.org volume, old logs are unavailable while partition is recovered. New logs are being stored. ETA for restoration probably ~Mon Jul 22 12:00 UTC 2019"  06:04
*** whoami-rajat has joined #openstack-infra06:05
<openstackstatus> ianw: finished sending alert  06:08
<ianw> i'm just copying buffered logs  06:13
*** udesale has quit IRC06:16
*** piotrowskim has joined #openstack-infra06:17
*** AJaeger has joined #openstack-infra06:18
*** kopecmartin|off is now known as kopecmartin06:18
*** AJaeger has quit IRC06:18
*** AJaeger has joined #openstack-infra06:19
*** ricolin__ is now known as ricolin06:20
<ianw> that's done.  i guess we're ahead of schedule :)  06:20
<ianw> #status ok logs.openstack.org volume has been restored.  please report any issues in #openstack-infra  06:20
<openstackstatus> ianw: sending ok  06:20
<AJaeger> thanks, ianw!  06:21
*** ChanServ changes topic to "Discussion of OpenStack Developer and Community Infrastructure | docs http://docs.openstack.org/infra/ | bugs https://storyboard.openstack.org/ | source https://opendev.org/opendev/ | channel logs http://eavesdrop.openstack.org/irclogs/%23openstack-infra/"  06:23
-openstackstatus- NOTICE: logs.openstack.org volume has been restored. please report any issues in #openstack-infra  06:23
*** jaosorior has joined #openstack-infra06:25
<openstackstatus> ianw: finished sending ok  06:28
*** markvoelker has quit IRC06:32
*** pcaruana has joined #openstack-infra06:32
*** jpena has joined #openstack-infra06:34
<openstackgerrit> Ian Wienand proposed opendev/system-config master: files.o.o : publish .log as text/plain  https://review.opendev.org/671963  06:37
<AJaeger> config-core, please review https://review.opendev.org/671117  06:43
*** udesale has joined #openstack-infra06:47
*** yamamoto has quit IRC06:47
*** dchen has quit IRC06:51
*** dchen has joined #openstack-infra06:51
*** odicha has joined #openstack-infra06:52
*** xek has joined #openstack-infra06:56
*** Goneri has joined #openstack-infra06:56
*** slaweq has joined #openstack-infra06:57
*** joeguo has quit IRC06:57
*** tesseract has joined #openstack-infra07:01
*** ginopc has joined #openstack-infra07:03
*** markvoelker has joined #openstack-infra07:04
*** rpittau|afk is now known as rpittau07:05
*** yamamoto has joined #openstack-infra07:07
*** rcernin has quit IRC07:07
*** jamesmcarthur has joined #openstack-infra07:07
*** pgaxatte has joined #openstack-infra07:08
*** tkajinam has joined #openstack-infra07:08
*** xek has quit IRC07:09
*** jbadiapa has joined #openstack-infra07:09
*** xek has joined #openstack-infra07:10
<ianw> AJaeger: 671117 -- how does changing the secret change the publishing path?  or did i miss another change?  07:14
<AJaeger> ianw: changing the secret lets it publish to a different server, see how the other jobs look. But let me double check...  07:16
<ianw> yeah, i can see that would allow it to write to /afs/openstack.org/docs instead of /afs/openstack.org/developer-docs ... but i'm not sure how it makes up the path  07:17
<AJaeger> ianw: check zuul.d/secrets.yaml, it has "path: /afs/.openstack.org/docs" for the new secret and "path: /afs/.openstack.org/developer-docs" for the old one  07:17
<AJaeger> That's the magic ;)  07:17
<ianw> ooohhhh, right, yep that explains it, thanks  07:17
<AJaeger> thanks for double checking!  07:18
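The mechanism AJaeger describes would look roughly like this in zuul.d/secrets.yaml; only the two `path` values are from the discussion, the secret names and surrounding keys are hypothetical:

```yaml
# Hypothetical sketch: the publish job writes wherever its secret's
# path points, so swapping the secret swaps the publication root.
- secret:
    name: afsdocs_secret
    data:
      path: /afs/.openstack.org/docs
- secret:
    name: afsdocs_developer_secret
    data:
      path: /afs/.openstack.org/developer-docs
```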
*** jaosorior has quit IRC07:18
*** iurygregory has joined #openstack-infra07:19
*** jtomasek has joined #openstack-infra07:21
*** jtomasek has quit IRC07:21
*** tosky has joined #openstack-infra07:21
*** jhesketh has quit IRC07:22
*** jtomasek has joined #openstack-infra07:22
*** jhesketh has joined #openstack-infra07:24
*** apetrich has joined #openstack-infra07:24
*** jamesmcarthur has quit IRC07:26
*** jamesmcarthur has joined #openstack-infra07:26
*** tkajinam has quit IRC07:36
*** ykarel is now known as ykarel|lunch07:45
*** pgaxatte has quit IRC07:51
*** lucasagomes has joined #openstack-infra07:54
*** pgaxatte has joined #openstack-infra07:54
*** jamesmcarthur has quit IRC07:56
*** jamesmcarthur has joined #openstack-infra07:58
*** iurygregory has quit IRC08:01
*** ralonsoh has joined #openstack-infra08:04
*** yolanda has joined #openstack-infra08:06
*** roman_g has joined #openstack-infra08:07
*** dtantsur|afk is now known as dtantsur08:13
*** jamesmcarthur has quit IRC08:13
*** hwoarang_ has quit IRC08:14
*** threestrands has quit IRC08:14
*** threestrands has joined #openstack-infra08:14
*** dchen has quit IRC08:14
*** kjackal has joined #openstack-infra08:22
*** zbr|out is now known as zbr08:22
*** odicha has quit IRC08:25
*** pkopec has joined #openstack-infra08:28
*** gtarnaras has joined #openstack-infra08:28
*** kjackal has quit IRC08:33
*** odicha has joined #openstack-infra08:35
*** kjackal has joined #openstack-infra08:36
*** ociuhandu has joined #openstack-infra08:37
*** ociuhandu has quit IRC08:39
*** ociuhandu has joined #openstack-infra08:39
*** iurygregory has joined #openstack-infra08:41
*** yamamoto has quit IRC08:42
*** Goneri has quit IRC08:45
*** yamamoto has joined #openstack-infra08:50
<openstackgerrit> Andreas Jaeger proposed openstack/openstack-zuul-jobs master: Install gettext for translation jobs  https://review.opendev.org/671992  08:53
*** jpena is now known as jpena|brb08:56
*** Goneri has joined #openstack-infra08:58
*** hwoarang has joined #openstack-infra09:01
*** mgoddard has quit IRC09:01
*** threestrands has quit IRC09:03
*** e0ne has joined #openstack-infra09:04
*** ykarel|lunch is now known as ykarel09:06
*** bhavikdbavishi has quit IRC09:11
*** ginopc has quit IRC09:15
*** ruffian_sheep has quit IRC09:24
*** priteau has joined #openstack-infra09:29
*** ginopc has joined #openstack-infra09:38
*** udesale has quit IRC09:40
*** udesale has joined #openstack-infra09:40
*** udesale has quit IRC09:42
*** udesale has joined #openstack-infra09:42
*** FlorianFa has joined #openstack-infra09:42
*** siqbal has joined #openstack-infra09:42
*** ociuhandu has quit IRC09:45
*** jpena|brb is now known as jpena09:46
*** ociuhandu has joined #openstack-infra09:48
*** adriancz has joined #openstack-infra09:49
*** pgaxatte has quit IRC10:02
*** yamamoto has quit IRC10:05
*** siqbal90 has joined #openstack-infra10:06
*** siqbal has quit IRC10:07
*** pgaxatte has joined #openstack-infra10:09
*** pgaxatte has quit IRC10:16
*** pgaxatte has joined #openstack-infra10:16
*** joeguo has joined #openstack-infra10:24
<stephenfin> fungi: RE: doc8, the email doesn't seem to be appearing in the archives yet, but Ian said he was happy to either transfer doc8 to his personal GitHub account (sigmavirus24) or to add you to the doc8 team in the PyCQA org so you can move it directly there  10:24
<stephenfin> fungi: Happy to help wherever I can. Just tell me what I've to do :)  10:25
*** udesale has quit IRC10:28
*** ccamacho has joined #openstack-infra10:29
*** ccamacho has quit IRC10:29
*** ccamacho has joined #openstack-infra10:29
*** yamamoto has joined #openstack-infra10:39
*** jaosorior has joined #openstack-infra10:41
*** jaosorior has quit IRC10:43
*** jaosorior has joined #openstack-infra10:44
*** joeguo has quit IRC10:45
*** ykarel is now known as ykarel|afk10:47
<AJaeger> config-core, please review https://review.opendev.org/671117 and https://review.opendev.org/671121 to further deprecate the api-site repo.  10:49
*** yamamoto has quit IRC10:49
<sshnaidm> where is the Dockerfile for the nodepool container? There is an error to fix..  10:53
*** siqbal90 has quit IRC10:55
*** siqbal has joined #openstack-infra10:55
*** joeguo has joined #openstack-infra10:56
<sshnaidm> as I understand it, nodepool containers are not used anywhere and not tested, otherwise it'd fail on first run..  10:58
*** joeguo has quit IRC10:59
*** joeguo has joined #openstack-infra10:59
*** ruffian_sheep has joined #openstack-infra11:02
<openstackgerrit> Sagi Shnaidman proposed zuul/nodepool master: Fix nodepool container failure  https://review.opendev.org/672012  11:04
*** rosmaita has joined #openstack-infra11:07
*** ruffian_sheep has quit IRC11:08
*** gtarnaras has quit IRC11:12
*** apetrich has quit IRC11:15
*** betherly has joined #openstack-infra11:19
*** yamamoto has joined #openstack-infra11:22
*** apetrich has joined #openstack-infra11:24
*** betherly has quit IRC11:24
*** rh-jelabarre has joined #openstack-infra11:28
*** jpena is now known as jpena|lunch11:33
*** jcoufal has joined #openstack-infra11:34
*** kaisers has quit IRC11:35
*** kaisers has joined #openstack-infra11:36
*** apetrich has quit IRC11:37
*** apetrich has joined #openstack-infra11:37
*** jaosorior has quit IRC11:38
*** yamamoto has quit IRC11:38
*** jaosorior has joined #openstack-infra11:39
*** kjackal has quit IRC11:39
*** gtarnaras has joined #openstack-infra11:40
*** kjackal has joined #openstack-infra11:41
*** psachin has quit IRC11:50
*** betherly has joined #openstack-infra11:50
*** yamamoto has joined #openstack-infra11:55
*** betherly has quit IRC11:55
*** ykarel|afk is now known as ykarel12:05
*** udesale has joined #openstack-infra12:07
*** joeguo has quit IRC12:08
*** lucasagomes has quit IRC12:14
*** lucasagomes has joined #openstack-infra12:19
*** _erlon_ has joined #openstack-infra12:21
*** rfolco|rover has joined #openstack-infra12:28
*** rlandy has joined #openstack-infra12:31
*** tdasilva has joined #openstack-infra12:33
*** dklyle has quit IRC12:36
*** david-lyle has joined #openstack-infra12:36
*** gtarnara_ has joined #openstack-infra12:39
*** yamamoto has quit IRC12:39
*** irclogbot_3 has quit IRC12:39
*** electrofelix has joined #openstack-infra12:40
*** yamamoto has joined #openstack-infra12:41
*** gtarnaras has quit IRC12:42
*** irclogbot_2 has joined #openstack-infra12:42
*** ykarel has quit IRC12:43
*** roman_g has quit IRC12:46
*** jpena|lunch is now known as jpena12:47
*** yamamoto has quit IRC12:54
*** jaosorior has quit IRC12:59
<fungi> ianw: thanks for taking care of the logs volume! sorry i wasn't on hand, had already fallen asleep by that point  12:59
<fungi> stephenfin: i'll check the moderation queue here in a moment  12:59
*** siqbal has quit IRC13:02
*** siqbal has joined #openstack-infra13:02
<openstackgerrit> Merged openstack/project-config master: Publish api-ref/api-guide to docs.o.o  https://review.opendev.org/671117  13:05
*** eharney has joined #openstack-infra13:10
*** mriedem has joined #openstack-infra13:11
*** yamamoto has joined #openstack-infra13:12
*** yamamoto has quit IRC13:14
<fungi> stephenfin: i didn't find anything in the openstack-discuss moderation queue from him. are you sure it was sent to the list address?  13:15
<stephenfin> fungi: Sorry, wasn't clear at all /o\ I was referring to the pycqa archives  13:16
<fungi> oh, got it  13:16
<stephenfin> I'm guessing my messages to that list haven't been approved. In any case, as noted he's onboard with moving to his account and letting him do the shuffle to PyCQA  13:18
<fungi> stephenfin: if he temporarily adds the openstackadmin user to the org, i can transfer the repo directly into it with that account  13:19
*** yamamoto has joined #openstack-infra13:19
*** aaronsheffield has joined #openstack-infra13:20
<stephenfin> fungi: I'll ask him to do that now (y)  13:20
<fungi> as soon as it's done i'll do the transfer and then he can remove the permission  13:20
<AJaeger> mnaser: thanks for reviewing, could you look at https://review.opendev.org/#/c/671118/ also, please?  13:20
*** zzehring has joined #openstack-infra13:26
*** siqbal has quit IRC13:31
<AJaeger> fungi, if you have time for review, please? ^  13:33
<fungi> sure, image encryption meeting wrapped up early, so i have a couple minutes  13:33
<fungi> heh, i shouldn't have tried to pull those depends-on entries in gertty  13:34
*** gtarnara_ has quit IRC13:35
*** needscoffee is now known as kmalloc13:36
<AJaeger> ;)  13:39
<fungi> fetching changes for openstack-manuals is almost as rough as fetching changes for nova  13:40
*** gtarnaras has joined #openstack-infra13:43
*** beekneemech is now known as bnemec13:43
<AJaeger> ;/  13:45
<AJaeger> fungi: here's another change without dependencies - installing gettext as part of the role - that you might be interested in (no urgency). https://review.opendev.org/671992 Thanks!  13:46
*** tesseract has quit IRC13:55
*** ykarel has joined #openstack-infra13:59
*** gtarnaras has quit IRC14:00
*** tesseract has joined #openstack-infra14:00
*** bhavikdbavishi has joined #openstack-infra14:00
*** gtarnaras has joined #openstack-infra14:01
*** yamamoto has quit IRC14:03
*** bhavikdbavishi1 has joined #openstack-infra14:04
*** bhavikdbavishi has quit IRC14:05
*** bhavikdbavishi1 is now known as bhavikdbavishi14:05
*** yamamoto has joined #openstack-infra14:05
<johnsom> I have a question about gerrit ACLs. I see that there are "included groups" in the GUI ACLs, but I don't see those in the ACL definitions in the project-config repo. What is the process to update the "included groups" in the gerrit ACLs?  14:05
<AJaeger> johnsom: just add them in the UI like you add individual users  14:06
<fungi> johnsom: someone with ownership of the group in gerrit makes those changes (either through the gerrit webui or api)  14:06
*** yamamoto has quit IRC14:06
<fungi> if the group is self-owned then any member of the group can do that  14:06
*** yamamoto has joined #openstack-infra14:07
<fungi> if the group is owned by another group, then a member of that other owner group has to do it  14:07
<fungi> you can find the group owner on the group's general properties page in the gerrit webui  14:07
<johnsom> Ah, ok, so I likely can manage those.  Thanks!  14:07
<fungi> you're welcome!  14:07
<fungi> if you need more details, let us know the name of the group and we can work out what's needed  14:07
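fungi's lookup can also be done over Gerrit's REST API; the `/groups/` endpoints are documented Gerrit API, but the group name and credentials below are hypothetical. Gerrit prefixes JSON responses with `)]}'` to defeat XSSI, so strip the first line before parsing (demonstrated here on a canned response):

```shell
# A real query would be something like:
#   curl -u user:http-pass "https://review.opendev.org/a/groups/some-group/detail"
# Canned response standing in for Gerrit's reply:
response=")]}'
{\"name\": \"some-group\", \"owner\": \"some-group-owners\"}"

# Drop the )]}' line, then parse the JSON and pull out the owner group.
owner=$(printf '%s\n' "$response" | tail -n +2 \
    | python3 -c 'import json,sys; print(json.load(sys.stdin)["owner"])')
echo "$owner"   # some-group-owners
```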
*** michael-beaver has joined #openstack-infra14:12
<openstackgerrit> Merged openstack/project-config master: Remove publish-openstack-manuals-developer-lang  https://review.opendev.org/671118  14:12
<openstackgerrit> Merged openstack/openstack-zuul-jobs master: Install gettext for translation jobs  https://review.opendev.org/671992  14:13
<openstackgerrit> Andreas Jaeger proposed openstack/openstack-zuul-jobs master: Add build releasenotes py3 template  https://review.opendev.org/672053  14:19
*** yamamoto has quit IRC14:22
*** smarcet has joined #openstack-infra14:22
*** eernst has joined #openstack-infra14:25
*** smarcet has quit IRC14:28
<clarkb> sshnaidm: questions like that are probably best in #zuul but I believe the answer is in the nodepool repo and that those images are used in the zuul quickstart job  14:43
<clarkb> ianw: thank you for taking care of that  14:45
<AJaeger> infra-root, could you copy most of http://files.openstack.org/developer-docs/api-ref/ to http://files.openstack.org/docs/api-ref/ , please? In that case I can give you a list of directories to copy...  14:48
<clarkb> AJaeger: this is part of the developer.o.o clean up?  14:50
<AJaeger> clarkb: yes  14:51
*** yamamoto has joined #openstack-infra14:51
<AJaeger> might be easier than waiting for all the new content to publish - but nothing that needs doing immediately.  14:51
* AJaeger can prepare a small script with dirs to copy  14:51
<clarkb> in theory we can also give AJaeger write perms on docs/ in afs?  14:52
<fungi> yeah, as long as he has kerberos working  14:53
*** jamesmcarthur has joined #openstack-infra14:53
<AJaeger> clarkb: that means I would need to set up AFS which is not worth it for the 2 times I need it...  14:53
<fungi> (and afs obviously)  14:53
*** jaosorior has joined #openstack-infra14:53
<clarkb> AJaeger: ya and my opensuse install stopped working (which is why I was so excited for kafs)  14:54
<fungi> i'm happy to do the file copies though, if i get a list/script containing them  14:54
<clarkb> me too (I'll just hop on one of our afs servers and do it from there)  14:55
<fungi> i have a meeting starting in 5 minutes, but could tackle it after  14:55
<AJaeger> thanks, fungi and clarkb - will provide a script in a few minutes  14:55
*** yamamoto has quit IRC14:56
<AJaeger> fungi, clarkb, http://paste.openstack.org/show/754722/ is the script - needs adjustment in line 43 for the AFS location. Thanks!  14:57
<AJaeger> fungi, clarkb, please figure out who does it ;)  14:57
<AJaeger> wait - I forgot api-guide...  14:58
<AJaeger> now with all content - http://paste.openstack.org/show/754723/  14:59
<AJaeger> needs three lines to adjust for AFS paths  14:59
<AJaeger> if you want me to edit the script, tell me - I'll be offline now for an hour or two and can update later.  15:01
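A hedged sketch of what a copy script like AJaeger's paste presumably does (the paste itself is not reproduced here): loop over a directory list and `cp -a` each one from the old AFS docs tree to the new one. All paths and names below are illustrative stand-ins:

```shell
# Stand-ins for /afs/.openstack.org/developer-docs/api-ref and
# /afs/.openstack.org/docs/api-ref; the real copy needs AFS write access.
SRC=/tmp/afs-demo/developer-docs/api-ref
DST=/tmp/afs-demo/docs/api-ref
DIRS="compute baremetal"   # illustrative; the real list is in the paste

mkdir -p "$SRC/compute" "$SRC/baremetal" "$DST"
echo demo > "$SRC/compute/index.html"

for d in $DIRS; do
    # -a keeps timestamps/permissions; skip anything already published
    [ -e "$DST/$d" ] || cp -a "$SRC/$d" "$DST/$d"
done
ls "$DST"
```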
*** jamesmcarthur has quit IRC15:03
*** jamesmcarthur has joined #openstack-infra15:04
*** ccamacho has quit IRC15:05
<openstackgerrit> Sagi Shnaidman proposed zuul/nodepool master: Fix nodepool container failure  https://review.opendev.org/672012  15:07
*** gyee has joined #openstack-infra15:08
*** jamesmcarthur has quit IRC15:08
*** smarcet has joined #openstack-infra15:08
*** odicha has quit IRC15:13
<clarkb> gitea01 and 02 have OOM'd since adding the swapfile  15:14
<clarkb> so 1GB swap isn't enough for all the demand there I guess  15:14
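Growing the swap file clarkb mentions would look roughly like this; the size and path are assumptions (a production host would want several GB), and activation needs root so it is only echoed:

```shell
SWAPFILE=/tmp/swapfile.demo
# Demo-sized file; something like count=8192 (8G) would fit the real hosts.
dd if=/dev/zero of="$SWAPFILE" bs=1M count=8 status=none
chmod 600 "$SWAPFILE"
# mkswap just writes the swap signature and works on a plain file.
command -v mkswap >/dev/null && mkswap "$SWAPFILE"
echo "swapon $SWAPFILE"   # root-only activation step
```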
<mordred> corvus: I have NO CLUE why the gerrit image isn't building in zuul. it's working locally  15:14
<mordred> corvus: I'm going to put in a hold  15:14
<clarkb> I'll trigger replication against them just to be sure nothing was missed. But I guess we should start looking at rebuilding those as the next step  15:15
<mordred> corvus: also - I'm not sure if this is intended design or not: the image build fails, but we don't see a failure until the post job tries to upload it to the changeset registry  15:15
<corvus> mordred: i didn't intend for that :(  15:16
<mordred> kk. I'll look into that after I figure out why it's not working in the first place  15:17
*** priteau has quit IRC15:20
*** gtarnaras has quit IRC15:23
*** ginopc has quit IRC15:24
*** yamamoto has joined #openstack-infra15:25
*** pgaxatte has quit IRC15:29
*** lpetrut has joined #openstack-infra15:30
*** gtarnaras has joined #openstack-infra15:30
*** ykarel is now known as ykarel|away15:35
<fungi> clarkb: were you already working on the api-site file copies, or shall i do them now?  15:43
<clarkb> fungi: sorry, got nerd-sniped digging into neutron's gate job changes so no, haven't pulled up afs credentials  15:44
<clarkb> you should go ahead and do them. I need breakfast too  15:44
*** kjackal has quit IRC15:45
<fungi> on it, thanks!  15:45
*** kjackal has joined #openstack-infra15:47
<fungi> AJaeger: i'm going to assume that line 55 of that paste should have been DIRS_GUIDE rather than DIRS and am adjusting the script accordingly  15:47
*** gfidente has joined #openstack-infra15:50
<AJaeger> fungi: yes, sorry  15:51
*** gtarnaras has quit IRC15:52
*** kjackal has quit IRC15:55
*** kjackal has joined #openstack-infra15:55
<fungi> AJaeger: also, some of those target directories already exist. is the cp still going to work fine or will we wind up copying to a subdirectory?  15:55
<AJaeger> fungi: only baremetal-inspection exists - want to remove it from the script?  15:56
<AJaeger> fungi: the other existing dir (network) I already removed and baremetal-inspection is new  15:56
<AJaeger> (newly published)  15:56
<fungi> cool, i've taken baremetal-introspection out of the list to copy  15:57
<AJaeger> thanks  15:57
<fungi> copying starting now  15:57
<AJaeger> \o/  15:58
<fungi> it will likely take a while  15:58
<fungi> since it's tromboning through my home broadband connection  15:59
<AJaeger> oh  15:59
<AJaeger> ok  15:59
<openstackgerrit> Merged zuul/nodepool master: Fix nodepool container failure  https://review.opendev.org/672012  16:01
*** jpena is now known as jpena|off16:02
*** e0ne has quit IRC16:05
<AJaeger> fungi: I see the first dirs on files.openstack.org ...  16:07
*** udesale has quit IRC16:08
<fungi> yeah, it's gotten to data-protection-orchestration so far  16:10
<clarkb> http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=66635&rra_id=all I think the spike there shows why 1GB swap isn't enough  16:13
<clarkb> If we are going to start replacing some of those servers we should stop adding new projects temporarily  16:13
<clarkb> AJaeger: ^ any concerns with that?  16:14
<AJaeger> clarkb: nothing in the queue for new projects - so go ahead.  16:15
<clarkb> Then we want to boot instances with an 80GB volume using the existing flavor of the other nodes. Will need to use the flag added in https://review.opendev.org/#/c/667548/ as we don't have ping6 installed on the image. Also need to remove the existing gitea0X's from the ansible inventory before booting the new instances (as they will conflict with launch node)  16:15
<clarkb> fungi: ^ is that something you were still interested in doing?  16:16
*** rascasoft has quit IRC16:16
*** lpetrut has quit IRC16:16
* clarkb writes change to remove gitea01 from inventory temporarily  16:16
*** jaosorior has quit IRC16:17
*** lucasagomes has quit IRC16:17
*** smarcet has left #openstack-infra16:17
<sshnaidm> I see more and more errors, especially from OVH BHS1, like: Image prepare failed: 401 Client Error: Unauthorized for url: http://mirror.bhs1.ovh.openstack.org:8082/v2/tripleomaster/centos-binary-nova-compute-ironic/blobs/sha256:6d3a23ca3a1378376ca4268c06d7c7da7b25358e69ff389475e5a30b78549fbb  16:17
<sshnaidm> it failed a few gate jobs  16:18
clarkbsshnaidm: can you link to the job logs?16:18
sshnaidmis there something to do with that?16:18
*** rascasoft has joined #openstack-infra16:18
fungiclarkb: sorry, stepped away for a sec, but yeah let me catch up and i can start work on replacing gitea0116:18
<sshnaidm> clarkb, http://logs.openstack.org/26/671526/4/gate/tripleo-ci-centos-7-undercloud-containers/5610169/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz?level=ERROR  16:18
*** dtantsur is now known as dtantsur|afk16:18
<sshnaidm> clarkb, or this: http://logs.openstack.org/26/671526/4/gate/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/a371bbe/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz?level=ERROR  16:19
<sshnaidm> clarkb, I have a few like that  16:19
<openstackgerrit> Clark Boylan proposed opendev/system-config master: Remove gitea01 from inventory so we can replace it  https://review.opendev.org/672083  16:20
<clarkb> fungi: I think something like ^ is the 0th step  16:20
*** rpittau is now known as rpittau|afk16:21
<clarkb> sshnaidm: those urls are just proxied to dockerhub. My best guess is that object was made private?  16:21
<clarkb> sshnaidm: its also possible they changed their cdn again and nothing is working  16:21
<sshnaidm> clarkb, nope, it's downloaded fine in other jobs  16:21
*** michael-beaver has quit IRC16:21
<clarkb> however the fact that we get back json saying we are not authorized implies to me that its private  16:22
<clarkb> sshnaidm: open the link in your browser  16:22
<clarkb> sshnaidm: its definitely not working generally  16:22
<clarkb> and if you request the same path from multiple mirrors you get the same result  16:22
<fungi> clarkb: also isn't there something manual we need to do about clearing cached facts after we remove it from the inventory?  16:23
*** tdasilva_ has joined #openstack-infra16:23
<clarkb> fungi: maybe? I'm not sure if that was strictly required or I was just confused because we actually use the global inventory when launching nodes  16:23
<fungi> ahh, okay. we can give it a shot and find out, i guess  16:24
<clarkb> sshnaidm: see also https://hub.docker.com/search?q=tripleomaster%2Fcentos-binary-nova-compute-ironic&type=image  16:24
<clarkb> fungi: ya it should be safe because the launch script creates a one off ssh key that won't be able to log into the existing server  16:25
*** piotrowskim has quit IRC16:26
*** tdasilva has quit IRC16:26
<sshnaidm> clarkb, it works there: http://logs.openstack.org/96/669596/4/gate/tripleo-ci-centos-7-containers-multinode/25ffdee/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz#_2019-07-22_14_15_47_219  16:26
<clarkb> sshnaidm: http://mirror.sjc1.vexxhost.openstack.org:8082/v2/tripleomaster/centos-binary-nova-compute-ironic/blobs/sha256:6d3a23ca3a1378376ca4268c06d7c7da7b25358e69ff389475e5a30b78549fbb currently fails  16:27
*** mattw4 has joined #openstack-infra16:27
<clarkb> so something may have changed?  16:27
<clarkb> did the image get deleted?  16:27
<clarkb> or switched to a non public image?  16:28
<sshnaidm> clarkb, https://hub.docker.com/r/tripleomaster/centos-binary-nova-compute-ironic  16:28
*** iurygregory has quit IRC16:28
<clarkb> huh, why doesn't that show up in a search?  16:29
<fungi> i don't have docker installed... does a docker pull of that directly from dockerhub work?  16:30
*** mattw4 has quit IRC16:30
<sshnaidm> clarkb, it was there at least for the last year, I don't think something changed, and other jobs pass on that.  16:31
*** mattw4 has joined #openstack-infra16:31
<sshnaidm> clarkb, "non authorized" errors usually come from a bad proxy, kinda misleading error message  16:31
<clarkb> sshnaidm: except our proxy shouldn't be writing the json  16:32
<clarkb> sshnaidm: if it was just an http status code I would agree, but that json must be coming from docker hub  16:32
<sshnaidm> clarkb, mm.. why does dockerhub answer instead of the proxy? or am I missing something  16:32
<clarkb> sshnaidm: you talk to the proxy, the proxy talks to dockerhub, docker hub returns a 401 + json document, proxy returns the 401 and json document to you  16:33
<clarkb> if it was just the proxy at fault I would expect only the 401 and not the json document because its a simple apache setup  16:33
<sshnaidm> clarkb, why does the proxy talk to dockerhub if it has this image cached?  16:33
<fungi> the cache eventually expires  16:34
<fungi> and has to be refreshed  16:34
<clarkb> yes the cache entries are only allowed to live for 24 hours or something  16:34
<clarkb> our proxy is not writing json documents  16:34
<fungi> (also, we have limited cache space on these, so have to expire cached objects to keep from overrunning available space)  16:34
<clarkb> it is possible that dockerhub changed their url paths on the backend and the proxy is no longer requesting things at valid addresses which could lead to this  16:35
<sshnaidm> hmm.. so in this case it doesn't use the cached image and turns to dockerhub which returns a json error? so it's a dockerhub issue..  16:35
<clarkb> sshnaidm: that is my current understanding of the problem (and the json document is the key to that because I'm 99% sure nothing in our apache config knows how to write out json for http status codes)  16:36
<sshnaidm> clarkb, is it a usual thing for dockerhub? I'm not really familiar with their backend  16:36
<clarkb> https://registry-1.docker.io/v2/tripleomaster/centos-binary-nova-compute-ironic/blobs/sha256:6d3a23ca3a1378376ca4268c06d7c7da7b25358e69ff389475e5a30b78549fbb ya that gives me the same result and is where we proxy you  16:37
<clarkb> sshnaidm: dockerhub is pretty bad at being proxy friendly  16:37
*** eernst has quit IRC16:38
<sshnaidm> clarkb, actually it is 401 error: HTTPError: 401 Client Error: Unauthorized for url: http...  16:38
*** kopecmartin is now known as kopecmartin|off16:38
<clarkb> sshnaidm: yes it is a 401 AND a json document  16:38
<clarkb> the AND a json document was the clue to me that it wasn't our proxy originating the error and the url above confirms it  16:39
<clarkb> so the problem is the destination we proxy requests to is no longer valid it appears  16:39
<clarkb> (at least for that image)  16:39
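clarkb's reasoning can be turned into a quick check: Apache's own error pages are HTML, while Docker registry errors are JSON documents shaped like `{"errors":[...]}`, so a JSON 401 body points at Docker Hub rather than the proxy. A sketch on canned bodies (the function name and sample responses are illustrative):

```shell
registry_body='{"errors":[{"code":"UNAUTHORIZED","message":"authentication required"}]}'
proxy_body='<html><body><h1>401 Unauthorized</h1></body></html>'

origin_of() {
    # JSON body -> the registry wrote the error; anything else -> the proxy.
    if printf '%s' "$1" | python3 -c 'import json,sys; json.load(sys.stdin)' 2>/dev/null; then
        echo registry
    else
        echo proxy
    fi
}

origin_of "$registry_body"   # registry
origin_of "$proxy_body"      # proxy
```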
<sshnaidm> seems like that, Host registry-1.docker.io  16:39
*** ykarel|away has quit IRC16:40
sshnaidmany ideas how we can prevent this?16:40
clarkbsomeone will need to do a docker pull and trace it out to see where requests are supposed to go to then update then proxy config16:40
clarkb(assuming it did move and this isn't dockerhub being broken)16:41
fungihttps://hub.docker.com/r/tripleomaster/centos-binary-nova-compute-ironic says it was updated 2 hours ago... was the failure for that sha256 before or after?16:41
clarkbsshnaidm: can you try the docker pull that fungi suggested locally without a proxy and see if that works?16:41
fungiif before, then this may be a wild goose chase16:41
clarkbfungi: well that blob shouldn't go away even if it is older16:42
clarkb(however I think that blob can be deleted and if that happened then ya)16:42
fungiahh, it retains old blobs on replace?16:42
clarkbfungi: ya my understanding is that it won't ever delete those blobs for you automatically16:43
clarkbbecause the idea is you can rollback or whatever to known good state16:43
clarkbhowever you as the user could elect to delete known bad blobs aiui16:43
sshnaidmclarkb, fungi  sorry, which command?16:43
sshnaidmif this image is 2 hours old, maybe it wasn't cached yet..16:43
clarkbsshnaidm: `docker pull tripleomaster/centos-binary-nova-compute-ironic` (and specify the sha maybe?)16:43
clarkbsshnaidm: the blob that is failing is a week old16:43
sshnaidmclarkb, it doesn't have "latest", need to specify a tag16:44
clarkbsshnaidm: ok however you need to do it16:44
clarkbthe idea is to test if it works at all16:44
*** eernst has joined #openstack-infra16:44
clarkbwe know the url above bypassing the proxy does not work16:44
clarkbwhat we don't know is if it is supposed to work via some other path or not16:44
sshnaidmworks fine: docker pull tripleomaster/centos-binary-nova-compute-ironic:69cab4cd5356e6c5314a103ca760d531d004dc5a_802178e916:44
fungiAJaeger: copies into /afs/.openstack.org/docs/api-guide/ have finished now.16:44
clarkbdoing a docker pull should indicate that to us16:45
fungiAJaeger: let me know if anything seems to have been obviously missed there16:45
clarkbsshnaidm: ok that implies the destination we are proxying to is no longer correct because dockerhub changed something16:45
clarkbsshnaidm: so the next step is to trace out that pull and see where requests are going16:45
clarkbsshnaidm: I think if you enable debug logging it will show up in your dockerd logs16:45
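[A note on the debug-logging suggestion above: on most distros dockerd debug logging is enabled by adding the `debug` key to the daemon config; the path `/etc/docker/daemon.json` and the restart command are the usual defaults but may vary by install.]

```json
{
  "debug": true
}
```

After a `systemctl restart docker`, the per-request trace (including which registry hostname each pull talks to) typically shows up in `journalctl -u docker`.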
sshnaidmclarkb, should it be in proxy logs?16:45
*** bhavikdbavishi has quit IRC16:45
sshnaidmah, you mean locally16:45
clarkbsshnaidm: no the proxy is requesting things at the bad location16:45
clarkbwe need someone to request it via working tooling and see what it talks to16:46
fungiclarkb: sshnaidm: well, the blob in the failure is sha256:6d3a23ca3a1378376ca4268c06d7c7da7b25358e69ff389475e5a30b78549fbb so not the same?16:46
*** roman_g has joined #openstack-infra16:46
fungior is that the same as 69cab4cd5356e6c5314a103ca760d531d004dc5a_802178e916:46
sshnaidmfungi, I hardly understand what this "sha256" is16:46
clarkbfungi: I think the sha you have is for a specific filesystem layer and the one sshnaidm has is for a manifest that includes potentially many filesystem layers16:46
fungisshnaidm: a checksum16:46
clarkbfungi: so it's possible they are the same but someone would have to check that manifest16:47
fungiahh16:47
clarkbs/the same/overlapping/16:47
sshnaidmfungi, clarkb let's take from logs: http://logs.openstack.org/26/671526/4/gate/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/a371bbe/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz?#_2019-07-22_09_29_50_58316:48
sshnaidmdocker pull docker.io/tripleomaster/centos-binary-swift-object:4cadc580aed3cde73c487f827f76bf7b92b4d1e5_10e135ca16:48
sshnaidmworks fine16:48
clarkbsshnaidm: ok you need to log what that does locally16:48
*** eernst has quit IRC16:48
clarkband compare it against what the proxy does16:49
clarkb(and update if necessary)16:49
clarkbreading their specs I don't think the api paths would have changed so it must be the hostname16:50
*** eernst has joined #openstack-infra16:50
clarkbhttps://docs.docker.com/registry/spec/api/#pulling-an-image16:51
clarkbhttps://registry-1.docker.io/v2/tripleomaster/centos-binary-nova-compute-ironic/manifests/current-tripleo-rdo gives me the same 401 and json document16:52
clarkbalso possible that some token must be generated, and straight-up anonymous access doesn't work anymore? however you'd expect docker pull from the client to figure that out regardless of destination16:53
clarkbpabelanger: ^ how did you trace this in the past?16:54
fungialso you'd think all proxied docker access would be failing similarly if that were the case16:54
*** igordc has joined #openstack-infra16:54
clarkbfungi: its possible that it is16:55
*** eernst has quit IRC16:55
fungihttps://registry-1.docker.io/v2/zuul/zuul/manifests/latest16:56
fungiis that the equivalent for the zuul/zuul container?16:57
*** eernst has joined #openstack-infra16:57
clarkbyes I think so16:57
clarkblevel=debug msg="Trying to pull tripleomaster/centos-binary-nova-compute-ironic from https://registry-1.docker.io v2" is what my dockerd says16:57
*** ricolin has quit IRC16:57
fungiseems that 401 json response is consistent for made-up manifests too https://registry-1.docker.io/v2/foo/bar/manifests/baz16:58
fungiso i suspect this is its general "i have no idea what you just asked me for" response16:58
sshnaidmsomething like that happens: https://paste.fedoraproject.org/paste/OgVKGHTKoLWUUYElUNGcug16:59
clarkbhttps://registry-1.docker.io/v2/tripleomaster/centos-binary-swift-object/manifests/4cadc580aed3cde73c487f827f76bf7b92b4d1e5_10e135ca which shows up in that log is a 401 for me17:01
fungiyep, i just saw/tried the same, with the same effect17:02
*** eernst has quit IRC17:02
mordredI see authz mentions in the log - are we _sure_ that nothing accidentally got pushed that's marked as needing authentication?17:02
clarkbmordred: no I don't think we are sure of that17:03
* mordred doesn't know - mostly just looking for things it might be17:03
mordrednod17:03
*** eernst has joined #openstack-infra17:03
clarkbmordred: as a sanity check you might want to try pulling zuul from one of our mirrors?17:03
AJaegerfungi: thanks, will double check in a bit - first look is fine17:04
clarkbfwiw nothing in their spec seems to say you should need auth to pull manifests17:05
*** e0ne has joined #openstack-infra17:05
*** jtomasek has quit IRC17:05
clarkbwww-authenticate header does say Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:tripleomaster/centos-binary-nova-compute-ironic:pull"17:08
*** eernst has quit IRC17:08
clarkband apparently that is how you are instructed on what to do if you get a 401 /me reads more17:08
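[The Www-Authenticate header quoted above tells the client where to get a token. A minimal sketch of parsing that challenge and building the token request URL per the registry token-auth flow — the header value is the exact one from the 401 response; a production parser would also handle quoted commas, which this split does not.]

```python
from urllib.parse import urlencode


def parse_www_authenticate(header):
    """Turn 'Bearer realm="...",service="...",scope="..."' into a dict."""
    scheme, _, params = header.partition(" ")
    assert scheme == "Bearer"
    fields = {}
    for part in params.split(","):
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip('"')
    return fields


def token_url(header):
    """Build the GET URL a client hits to obtain an anonymous pull token."""
    fields = parse_www_authenticate(header)
    query = urlencode({"service": fields["service"], "scope": fields["scope"]})
    return "%s?%s" % (fields["realm"], query)


header = ('Bearer realm="https://auth.docker.io/token",'
          'service="registry.docker.io",'
          'scope="repository:tripleomaster/centos-binary-nova-compute-ironic:pull"')
print(token_url(header))
```

GETting that URL anonymously returns a short-lived JSON token document, which matches the 5-minute tokens clarkb observes below.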
Shrewsfungi: so about the b0rked image leak problem we discussed last week... i see some examples where we repeatedly try deleting the upload, but then we give up for some reason and delete the zk record. So i was both right AND wrong about nodepool repeating the operation. I can't quite see that code path in the source. I may need to add some logging to the builders and restart them.17:10
*** eernst has joined #openstack-infra17:10
fungiShrews: at least that would explain why it seems to be infrequent17:10
clarkbI read that as you need a token scoped to that action. I would expect the dockerd/the thing pulling to request one of those?17:10
openstackgerritMonty Taylor proposed opendev/system-config master: Build a docker images of gerrit  https://review.opendev.org/67145717:11
clarkbhttps://auth.docker.io/token does generate 5 minute tokens for me17:11
openstackgerritMonty Taylor proposed opendev/system-config master: Build a docker images of gerrit  https://review.opendev.org/67145717:13
mordredclarkb, corvus: it's possible that might go green ^^ - I give it at least a 50/50 chance17:14
fungiclarkb: sshnaidm: mordred: one interesting detail... i searched logstash for occurrences of message:"Image prepare failed: 401 Client Error: Unauthorized for url" over the past 2 days and all 30 matches were for builds which ran in ovh-bhs1 or ovh-gra117:14
funginot sure if that's a clue17:14
*** eernst has quit IRC17:15
sshnaidmfungi, I see the same results, that's why I mentioned OVH bhs117:15
mordredwell ... if these are jobs using the intermediate registry, network flakiness with the proxy can cause the docker daemon to try to talk to dockerhub but using the credentials for the intermediate registry17:15
clarkbsshnaidm: fungi I'm wary of blaming ovh since we get the exact same behavior if talking directly to docker hub17:15
clarkbfwiw if I add an authorization header with the token content I still get a 401 so I'm probably doing that wrong17:16
mordredbut yeah - I think digging further into what clarkb is saying is more likely to bear fruit right now17:16
*** chandankumar is now known as raukadah17:18
clarkbthe rfc says proxies must not modify authorization headers if provided and I doubt apache breaks that which implies to me that dockerd isn't setting that up at all17:19
clarkbthe logs certainly don't make mention of it that I can see17:19
sshnaidmclarkb, maybe auth error is just a default when not finding the image17:19
clarkbsshnaidm: ya that could  be17:20
sshnaidmsecurity etc17:20
clarkbyou would expect `curl https://registry-1.docker.io/v2/` to work at a minimum reading their api spec17:21
*** eernst has joined #openstack-infra17:21
clarkbbut that also returns the same error17:21
clarkbhttps://status.docker.com/ claims no errors17:22
*** eernst has quit IRC17:25
*** panda has quit IRC17:25
*** panda has joined #openstack-infra17:26
clarkbhttps://quay.io/v2/ returns "true" instead of go away like https://registry-1.docker.io/v2/ so I'm not completely off base in how this is expected to work I think17:28
clarkbhttps://quay.io/v2/calico/node/manifests/master also returns data17:30
clarkbI think that means we've got the rough gist of how the api is supposed to work down17:30
clarkbI need to pop out for a bit and get a bike ride in before it turns into an oven outside17:30
clarkbI'll be back in a bit17:31
fungione thread is that the failures have been in providers with our old puppeted mirror hosts, not in providers with our newer ansible-only mirror hosts, so i'm comparing the two in hopes of spotting any potential differences in dockerhub proxying rules17:31
*** eernst has joined #openstack-infra17:32
clarkbfungi: fwiw if I switched the name out for an opendev mirror I got the same failure17:32
fungik17:33
fungisame error from docker pull or from direct url requests?17:33
fungialso, no, i don't see any obvious differences in configuration17:34
clarkbdirect url requests17:36
*** eernst has quit IRC17:37
fungithere are more consumers of these dockerhub v2 mirrors than just tripleo, right?17:37
*** eernst has joined #openstack-infra17:38
fungimessage:"401 Client Error: Unauthorized for url" has 517 hits in logstash currently, and all appear to be for tripleo-ci jobs17:39
clarkbzuul, our gitea stuff, and maybe kolla/loci/helm/airship?17:42
fungilooks like these exceptions are being raised when _copy_registry_to_registry() is calling requests with the broken url from tripleo_common/image/image_uploader.py17:42
*** eernst has quit IRC17:43
*** ralonsoh has quit IRC17:43
*** eernst has joined #openstack-infra17:45
*** ociuhandu_ has joined #openstack-infra17:47
fungiit builds source_config_url based on some info passed into that method, and then asks requests to fetch it17:49
*** ociuhandu has quit IRC17:49
fungiso... this doesn't look like `docker pull ...` failing17:49
fungicould it be that tripleo-common is making broken assumptions about dockerhub urls?17:50
*** eernst has quit IRC17:50
*** eernst has joined #openstack-infra17:51
*** ociuhandu_ has quit IRC17:52
*** rlandy has quit IRC17:52
clarkbmaybe the same ones we are running into doing direct requests17:53
fungigiven that it's forming its own dockerhub urls and querying those, yes perhaps17:56
fungihttps://opendev.org/openstack/tripleo-common/src/branch/master/tripleo_common/image/image_uploader.py#L125617:56
*** eernst has quit IRC17:56
*** diablo_rojo has joined #openstack-infra17:56
*** eernst has joined #openstack-infra17:58
*** tesseract has quit IRC18:01
*** eernst has quit IRC18:02
*** jtomasek has joined #openstack-infra18:04
*** electrofelix has quit IRC18:12
*** smarcet has joined #openstack-infra18:13
AJaegerfungi: my automatic tests for api-site rename all passed, so I declare it as done - thanks!18:14
*** pkopec has quit IRC18:16
fungiAJaeger: my pleasure, thanks for making it easy!18:16
*** pkopec has joined #openstack-infra18:17
*** diablo_rojo has quit IRC18:23
openstackgerritMonty Taylor proposed opendev/system-config master: Build a docker images of gerrit  https://review.opendev.org/67145718:38
donnydpower just went out here, and my generator is still not hooked up.18:42
donnyd:(18:42
donnydits on my list18:42
*** smarcet has quit IRC18:42
donnydI might get lucky and have it come back on in time... have about 25 minutes of backup UPS power18:43
*** smarcet has joined #openstack-infra18:47
gouthamrhi, i'm trying to convert a legacy job in openstack/manila to zuulv3; and had a question: How can i run something after devstack, but before tempest tests are kicked off?18:50
fungidonnyd: thanks for the heads up. should we emergency zero the quota there?18:51
donnydnot yet18:51
donnydif its not back on in about 3 minutes, then yes18:51
fungii mean, worst case zuul loses contact with nodes there and requeues the builds they were running18:52
donnydWell that never happens.. Its back on18:52
fungihah18:52
donnydWhen I have luck, its usually bad18:52
donnydI really need to get that generator hooked up.. LOL18:52
fungigouthamr: that might be a better question for #openstack-qa... the playbooks for devstack and tempest are maintained in their respective repos18:53
gouthamrah, will ask there, ty fungi!18:53
fungiof course!18:54
AJaegergouthamr: I suggest you write a new job from scratch ;)18:54
gouthamrAJaeger: trying to align to the rest of the projects, so we aren't a special snowflake :) so hope that's not necessary18:56
*** tdasilva_ has quit IRC18:56
AJaegergouthamr: talk with QA team. The framework is quite different and my understanding is that rewriting from scratch based on the new framework should be one option to evaluate.18:57
*** Vadmacs has joined #openstack-infra18:58
gouthamrAJaeger: sure thing. i'll do that, ty...18:59
clarkbfungi: any luck with the docker thing?18:59
clarkbI'm back now and can help18:59
fungiclarkb: no, my only theory so far is that sometimes tripleo-ci's image uploader builds invalid urls19:00
*** e0ne has quit IRC19:01
*** eernst has joined #openstack-infra19:01
fungi(the error isn't coming from a docker pull or anything like that as far as i can tell, just the urls tripleo-ci is assembling)19:01
clarkbfungi: ya the odd thing is those urls appear valid if I'm reading the dockerhub api spec correctly19:01
*** eernst has quit IRC19:05
*** smarcet has quit IRC19:07
*** eernst has joined #openstack-infra19:07
* clarkb configures local dockerd to go through proxy19:07
fungiare you using a local proxy to trace the urls it requests, or testing with one of the ci proxies?19:09
clarkbI was just gonna check if it works successfully to docker pull through one of our mirrors19:10
clarkbif it does then ya I think it is as you suspect: problem with their script19:10
fungiahh19:10
fungii have a feeling docker pull is working or else we'd have gotten a lot more complaints. the 401 errors from tripleo-ci jobs stretch back at least a week according to logstash19:11
clarkbya I just want to be sure of it19:12
*** eernst has quit IRC19:12
clarkbyup confirmed docker pull works19:13
clarkbsshnaidm: ^ so the problem is with your script I think19:13
clarkbsshnaidm: and talking directly to the backend with the urls your script fails on produces the same results19:13
clarkbso this isn't a problem of the proxy19:13
sshnaidmEmilienM, ^^19:13
*** eernst has joined #openstack-infra19:14
fungiyeah, it does seem that the 401 auth required errors are how dockerhub responds to any unknown url, so odds are the script is sometimes assembling an invalid combination of parameters for the url19:16
fungiperhaps wrong sha256 checksum?19:17
clarkbfungi: maybe? though that manifests url should work19:19
*** eernst has quit IRC19:19
clarkbpossible you need a token even if anonymous and docker pull does that19:20
*** eernst has joined #openstack-infra19:20
mordredcorvus, clarkb: GERRIT BUILD WORKED!!!!!19:21
mordredpaladox: thanks for the help - I borrowed a few of your settings19:21
*** rh-jelabarre has quit IRC19:21
paladoxyou're welcome :)19:21
clarkbanyone else want to review https://review.opendev.org/#/c/672083/ so we can get the ball rolling on gitea server replacements?19:22
*** rh-jelabarre has joined #openstack-infra19:22
mordredpaladox: sadly I also had to constrain it to 4G RAM and a single core - because any more than that it would either segfault or run out of memory19:22
paladox:(19:22
mordredclarkb: +A19:23
mordredpaladox: yeah. tell me about it19:23
clarkbmordred: they?19:23
paladoxI've only built gerrit on large ram machines19:23
*** eharney has quit IRC19:23
mordredpaladox: these are 8G VMs with 8 vcpus ... but that is apparently unhappy making19:24
mordredclarkb: they?19:24
paladoxwow19:24
clarkbmordred: from your message they would either segfault or run out of memory. wonder what they is19:24
paladoxmordred you have more power than the vps i did it on :P19:24
paladoxbut that did cause jenkins to OOM19:24
*** eernst has quit IRC19:25
mordredclarkb: oh - the bazel workers19:25
mordredclarkb: bazel something something19:25
mordredgets really grumpy on our vms for some reason19:25
mordredworks FINE on my chromebook19:25
AJaegermordred: are you reviewing service-types etc? Please review https://review.opendev.org/672139 and https://review.opendev.org/672136 and https://review.opendev.org/#/c/672131/ and https://review.opendev.org/#/c/672138/ - all change links from developer.o.o to docs.o.o19:26
clarkbmordred: clearly we should make a chromeos test image then19:26
mordredclarkb: wcpgw?19:26
mordredAJaeger: on it19:26
*** eernst has joined #openstack-infra19:26
paladoxmordred what segfault did it give you? I've never experienced a segfault with bazel.19:27
clarkbif anyone is wondering the tripleo rdo image for nova-compute-ironic is 1.784GB19:27
AJaegerthanks, mordred !19:27
*** igordc has quit IRC19:27
clarkbEmilienM: sshnaidm https://docs.docker.com/registry/spec/api/#api-version-check may be useful documentation19:28
clarkbspecifically it details what they claim the 401 not authorized is meant to mean and how you can deal with it19:29
fungiclarkb: wow... could the error they're getting be a normally unexercised code path which only triggers when a download initially fails? maybe unpredictable network performance in ovh triggering that is what causes it to only appear on builds in that provider? a stretch, i know19:29
clarkbEmilienM: sshnaidm https://docs.docker.com/registry/spec/auth/token/ that too19:29
clarkbfungi: perhaps?19:30
mordredpaladox: http://logs.openstack.org/57/671457/8/check/system-config-build-image-gerrit/300d08d/job-output.txt.gz#_2019-07-20_18_18_09_540913 is an example of a memory error ...19:30
*** igordc has joined #openstack-infra19:30
mordredpaladox: http://logs.openstack.org/57/671457/7/check/system-config-build-image-gerrit/95547e8/job-output.txt.gz#_2019-07-20_15_38_17_543596 is an example of one of the bazel worker processes just going away mid-process19:30
*** eernst has quit IRC19:31
clarkbfungi: though I think it likely that they just need to request a token first19:31
clarkbthough perhaps that is happening but failing, which leads the pull request to fail19:31
paladoxah19:31
sshnaidmclarkb, but we don't authenticate afaik, it's public containers19:31
openstackgerritMatthew Thode proposed openstack/diskimage-builder master: update version of open-iscsi that is installed on musl  https://review.opendev.org/67215219:32
clarkbsshnaidm: yes I know, my theory is they require tokens for anonymous access too19:32
clarkbI'm working to test that19:32
sshnaidmclarkb, that would be weird, give the fact in other cases it worked19:33
*** eernst has joined #openstack-infra19:33
sshnaidms/give/given19:33
sshnaidmwhat is interesting, it's same container in two last cases19:33
sshnaidmand same proxy19:33
paladoxmordred apparently we give the docker slave that the gerrit job runs on, 36gb of ram and 8cpu (https://tools.wmflabs.org/openstack-browser/server/integration-slave-docker-1052.integration.eqiad.wmflabs)19:34
mordredpaladox: that's incredible19:35
paladoxindeed, i didn't know that until now :)19:35
clarkbsshnaidm: remember the same error happens if you make that request to docker directly19:36
clarkbsshnaidm: so while yes the proxy may seem like a clue, that request fails even when made without a proxy19:36
clarkbyup confirmed19:37
*** eernst has quit IRC19:37
clarkbif I manually curl out a token and use that with authorization header it works fine19:37
clarkbsshnaidm: so I think you need to implement https://docs.docker.com/registry/spec/auth/token/19:38
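[A minimal sketch of the token dance the registry token-auth spec describes: on a 401, fetch an anonymous token from the realm given in Www-Authenticate, then retry the original request with an `Authorization: Bearer` header. The `fetch` callable is injected so the flow can be exercised without network access; all hostnames, the `fake_fetch` responses, and the token value are made up for illustration — a real client would back `fetch` with requests or urllib.]

```python
def get_with_token(url, fetch):
    """GET url; on a 401, obtain an anonymous token and retry with it.

    fetch(url, headers) -> (status, headers, body) is injected so this
    sketch runs offline.
    """
    status, headers, body = fetch(url, {})
    if status != 401:
        return status, body
    # Parse realm/service/scope out of the Www-Authenticate challenge.
    challenge = headers["Www-Authenticate"]
    params = dict(p.split("=", 1)
                  for p in challenge[len("Bearer "):].split(","))
    token_endpoint = "%s?service=%s&scope=%s" % (
        params["realm"].strip('"'),
        params["service"].strip('"'),
        params["scope"].strip('"'))
    t_status, _, t_body = fetch(token_endpoint, {})
    token = t_body["token"]
    status, _, body = fetch(url, {"Authorization": "Bearer " + token})
    return status, body


def fake_fetch(url, headers):
    """Stand-in registry: 401s unless a known Bearer token is presented."""
    if url.startswith("https://auth.example/token"):
        return 200, {}, {"token": "anon-token"}
    if headers.get("Authorization") == "Bearer anon-token":
        return 200, {}, {"schemaVersion": 2}
    return 401, {"Www-Authenticate":
                 'Bearer realm="https://auth.example/token",'
                 'service="registry.example",'
                 'scope="repository:foo/bar:pull"'}, None


print(get_with_token("https://registry.example/v2/foo/bar/manifests/latest",
                     fake_fetch))
```

This is the same 401-then-token-then-retry behavior that `docker pull` performs transparently, which would explain why the pull works while the raw requests from the tripleo script do not.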
*** Lucas_Gray has joined #openstack-infra19:39
sshnaidmclarkb, but how then it works in all other jobs?19:39
*** eernst has joined #openstack-infra19:39
sshnaidmlike in the same jobs, just different runs19:39
clarkbI can only tell you what I observe talking from my machine directly to dockerhub19:39
clarkbI reproduce the failure trivially and if I add a bearer token it works19:39
clarkbsshnaidm: it is possible that all of the other jobs are pulling data which is still cached19:40
clarkband only ovh has expired the cache on those objects so far19:40
clarkbif this was a recent change made by docker that would be possible19:40
*** jtomasek has quit IRC19:41
clarkband now I've successfully followed through: a GET against the object, then a GET against the URL that redirects me to19:46
clarkbsshnaidm: that is my best guess. Differences in cache expiration have exposed this in some regions and not others and for some images and not others19:47
clarkbsshnaidm: within ~24 hours when the entire cache is refreshed this will likely be a more global problem and you'll need to implement the bearer token stuff even for anonymous gets19:47
fungii have a feeling we may be caching those when they get requested through the proxy by a docker pull or similar tool which uses a token19:49
clarkbfungi: oh good point19:49
fungiand then subsequent direct requests get satisfied out of the cache until expiration19:49
fungiwhich could explain the intermittent nature of the failures19:50
clarkbthat certainly fits the behavior really well19:50
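[A toy model of the caching theory above, assuming a simple TTL cache in front of the registry: a token-bearing `docker pull` populates the cache, token-less direct requests are served from it until the entry expires, and then the 401 resurfaces — which would make the failures intermittent and region-dependent, since each mirror's cache expires independently. The TTL and URLs here are invented for illustration.]

```python
import itertools

clock = itertools.count()   # fake seconds, one tick per request
CACHE_TTL = 3
cache = {}                  # url -> (expiry, body)


def origin_get(url, token=None):
    """The registry itself: only serves requests carrying a token."""
    return (200, "blob-bytes") if token else (401, "UNAUTHORIZED json")


def proxied_get(url, token=None):
    """The mirror: serve from cache if fresh, else hit the origin."""
    now = next(clock)
    if url in cache and cache[url][0] > now:
        return 200, cache[url][1]          # cache hit, no auth needed
    status, body = origin_get(url, token)
    if status == 200:
        cache[url] = (now + CACHE_TTL, body)
    return status, body


url = "v2/tripleomaster/centos-binary-swift-object/blobs/sha256:abc"
print(proxied_get(url, token="pull-token"))  # docker pull primes the cache
print(proxied_get(url))                      # direct request: cache hit
print(proxied_get(url))                      # still cached
print(proxied_get(url))                      # entry expired: 401 resurfaces
```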
openstackgerritMerged opendev/system-config master: Remove gitea01 from inventory so we can replace it  https://review.opendev.org/67208319:50
*** e0ne has joined #openstack-infra19:50
*** eernst has quit IRC19:50
fungionce that ^ takes effect i can work on booting the replacement?19:51
clarkbyup19:51
*** pkopec has quit IRC19:51
*** eernst has joined #openstack-infra19:52
clarkbthe next steps are roughly: boot the new node, add it to inventory (but not haproxy) and remove it from the create-gitea-repos playbook (I forget the name; it should be in the docs), then recover from db backup, then click the button to make all repos, then replicate, then add to haproxy and remove the exception from the ansible playbook19:52
fungialso getting close to time for me to start making dinner, so it may happen after19:52
clarkbhopefully that is all clear in the docs and if not I'll help interpret them and fix them19:52
fungialso update dns entries19:53
clarkboh ya that too19:53
*** Lucas_Gray has quit IRC19:53
clarkbsshnaidm: an easy way to confirm would be to switch to docker hub directly rather than our mirror in your script19:54
*** eernst has quit IRC19:56
fungiclarkb: any special steps for booting a new gitea server? these go in vexxhost-sjc1 right? so bfv... you did mention at least adding the option to skip ping6 tests19:56
fungiahh, sorry, found https://docs.openstack.org/infra/system-config/gitea.html#deploy-a-new-backend19:57
* fungi should rtfm before asking questions19:57
*** smarcet has joined #openstack-infra19:58
*** eernst has joined #openstack-infra19:58
clarkbfungi: ya in sjc1 and I used an 80GB disk for gitea06 because the 30GB or whatever they are now is way too small, particularly if we want swap20:00
sshnaidmclarkb, heh.. dockerhub will fail in half of cases :) not the best server performance there20:00
*** pcrews has joined #openstack-infra20:00
clarkbfungi: you have to exclude ping6 because our minimal image is quite minimal and doesn't have that tooling. You should use the same image that gitea06 was booted off of (this has correct ext4 journal size)20:00
*** Vadmacs has quit IRC20:01
*** Goneri has quit IRC20:01
fungiawesome, thanks20:01
fungii guess i can just pass that in by uuid20:01
clarkbthat image is the one I uploaded from nodepool builders separately so that we could clean up the control plane nodepool builder stuff20:02
clarkbfungi: or by name; the image can't be deleted because it is in use20:02
fungiright20:02
clarkbI should double check the nodepool builder cleanup work was a success. I'll do that after lunch20:02
*** eharney has joined #openstack-infra20:02
*** jcoufal has quit IRC20:02
*** igordc has quit IRC20:03
*** eernst has quit IRC20:03
*** e0ne has quit IRC20:04
*** eernst has joined #openstack-infra20:05
*** EmilienM is now known as EmilienM|pto20:05
*** mattw4 has quit IRC20:08
*** igordc has joined #openstack-infra20:08
*** mattw4 has joined #openstack-infra20:08
*** eernst has quit IRC20:10
fungiany atypical parameters i need to add besides --boot-from-volume, --volume-size=80 and --ignore_ipv6?20:10
clarkbI dont think so20:10
fungicool20:11
*** eernst has joined #openstack-infra20:11
*** michael-beaver has joined #openstack-infra20:11
*** smarcet has left #openstack-infra20:28
*** tdasilva has joined #openstack-infra20:38
openstackgerritJames E. Blair proposed zuul/zuul master: WIP download and display a log file  https://review.opendev.org/67191220:42
*** xek has quit IRC20:44
corvusclarkb, fungi: i'm sticking my head up after being in a javascript hacking hole and have not been paying attention to scrollback; anything you need me for?20:44
*** xek has joined #openstack-infra20:45
fungicorvus: i don't think so. gonna try building a replacement gitea01 in a little while. we were also trying to confirm whether there was an issue with our dockerhub proxy20:47
clarkbcorvus: I'm goood. I think I managed to figure out the dockerhub oddity20:48
clarkbbasically you have to get a token even for anonymous access (not sure if that is new or not but tripleo ran into it)20:48
corvusclarkb: that sounds familiar?  i think that may be encoded into the docker roles.  also, that can vary based on which of the services you hit (it's a mess)20:49
clarkbcorvus: cool so not completely odd. Basically their jobs fail when trying to request the data directly with a python script in some cases20:49
clarkbcorvus: fungi's theory is that depending on what we have cached it may work at times and not work at others. And I was able to manually confirm using curl that the token was required20:50
clarkbpretty sure it isn't a fault of the proxies at least20:50
fungiyeah, tripleo is doing direct access to dockerhub via python-requests and custom-assembled urls20:50
*** joeguo has joined #openstack-infra20:52
corvuswe do some actions against "hub.docker.com" and others against "registry.hub.docker.com"20:52
corvusand that's not by accident20:53
corvuslook at this one: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/promote-docker-image/tasks/promote-retag-inner.yaml20:53
corvusbearer token against registry; JWT against hub20:53
clarkbah yup that includes the bearer token20:54
corvusthe list tags against 'hub.docker.com' here doesn't require auth: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/promote-docker-image/tasks/promote-cleanup.yaml20:54
openstackgerritClark Boylan proposed opendev/system-config master: Cleanup nodepool builder clouds.yaml  https://review.opendev.org/66501820:55
*** jeremy_houser has joined #openstack-infra20:55
clarkbgerrit said ^ cannot merge so I rebased it20:55
clarkbthat is the next step required in the cleanup of the nodepool builders20:56
clarkbcorvus: ^ if you have a moment to review that that would help20:56
clarkbcorvus: and also https://review.opendev.org/#/c/671858/ if you've got all the js paged in would be good20:56
clarkbmordred: fungi: you too on https://review.opendev.org/66501820:57
jeremy_housercurrently attempting to build a gate for my repository, where are parent projects like "openstack-tox-py36" defined? I'm trying to understand how .zuul.yaml hooks my tox.ini to run what I have written there20:57
clarkbjeremy_houser: openstack/openstack-zuul-jobs should have most if not all of the openstack- prefixed jobs in it20:58
jeremy_houserfantastic, thank you20:58
clarkbjeremy_houser: then zuul/zuul-jobs contains a lot of very generic stuff which the openstack-zuul-jobs may build on20:58
corvusjeremy_houser: this may be helpful if you ignore the zuulv2 bits: https://docs.openstack.org/infra/manual/zuulv3.html#how-jobs-are-defined-in-zuul-v321:00
corvusit's probably time to fold that into the rest of the document normally....21:01
*** whoami-rajat has quit IRC21:01
jeremy_houserthank you21:02
*** xek has quit IRC21:03
*** xek has joined #openstack-infra21:04
*** mattw4 has quit IRC21:05
*** mattw4 has joined #openstack-infra21:05
clarkband another trick is to use http://codesearch.openstack.org21:09
*** tdasilva_ has joined #openstack-infra21:09
*** pcaruana has quit IRC21:10
*** tdasilva has quit IRC21:12
*** jeremy_houser has quit IRC21:13
*** kjackal has quit IRC21:17
*** mattw4 has quit IRC21:20
clarkbcorvus: fungi ty ty for those reviews21:20
clarkbfungi: https://review.opendev.org/#/c/667474/ is the change I used to enroll new gitea06 into inventory. Note the delta in https://review.opendev.org/#/c/667474/1/playbooks/remote_puppet_git.yaml21:21
*** tdasilva_ has quit IRC21:22
*** eernst has joined #openstack-infra21:31
*** mattw4 has joined #openstack-infra21:32
smcginnisAnyone know what happened with this release job? http://lists.openstack.org/pipermail/release-job-failures/2019-July/001193.html21:44
smcginnisUnfortunately I need to drop, but will check back if anyone has any pointers.21:45
clarkbsmcginnis: zuul is claiming the merge failed. Is that job run against openstack/releases?21:45
smcginnisThat was a post-release job.21:45
clarkbya merge failure might more generally be categorized as git had a sad setting up the repo21:46
clarkbwhat repo does that job run against? openstack/releases?21:46
smcginnisI believe the tagging is run from openstack/releases, but then clones the repo being released to add the tag to it.21:47
clarkbok that helps as now I can search for the build in zuul21:47
clarkb(though maybe it didn't get that far given the merge failure message?)21:47
*** eernst has quit IRC21:48
smcginnisHard to tell I think. I guess it would have come from the openstack/releases repo since if it got past that initial setup we would at least have log messages showing the failure steps?21:48
clarkbyes21:48
clarkband sure enough there are no failures logged by zuul's builds page21:49
clarkbhttp://zuul.openstack.org/builds?project=openstack%2Freleases&pipeline=release-post21:49
*** eernst has joined #openstack-infra21:50
smcginnisMaybe try a reenqueue? Not sure what next steps are from here. Shouldn't really be any way for a merge failure to be an issue at that point, so I'm guessing it was a quirk.21:52
openstackgerritMerged opendev/system-config master: Cleanup nodepool builder clouds.yaml  https://review.opendev.org/66501821:52
smcginnisSorry, I really need to step away now. I'll be back later.21:52
clarkbI'm trying to figure out why it thought the merge failed21:53
clarkblooking at https://opendev.org/openstack/releases/commits/branch/master both manila tags in the last hour are present and accounted for21:55
*** eernst has quit IRC21:55
clarkbok it didn't fail against releases, it failed against manila21:55
*** eernst has joined #openstack-infra21:56
clarkbgitea doesn't show us git tag refs huh?21:57
* clarkb clones manila21:57
*** betherly has joined #openstack-infra22:00
*** eernst has quit IRC22:01
*** slaweq has quit IRC22:01
clarkbnow I am extra confused the git logs say that the release-post pipeline didn't match for manila22:02
*** eernst has joined #openstack-infra22:02
clarkbneither tag did22:03
clarkbaha that is because it wasn't a release-post failure ...22:03
clarkbthis is the tag pipeline failing22:03
*** betherly has quit IRC22:05
*** eernst has quit IRC22:07
*** eernst has joined #openstack-infra22:09
*** mattw4 has quit IRC22:10
clarkbcorvus: is there any easy way to map a gearman merger job to the merger that ran it?22:10
clarkbcorvus: the unique key seems to maybe not be unique ( I found evidence of the job running successfully after the failure a minute later than the merge failure is logged)22:11
clarkb1.5 minutes actually22:11
clarkbhttp://paste.openstack.org/show/754740/22:13
*** mattw4 has joined #openstack-infra22:13
*** eernst has quit IRC22:13
clarkboh I'm a derp22:14
clarkbthe logs had scrolled by and it showed me the retry I think but if I scroll up the failure is there :/22:14
*** eernst has joined #openstack-infra22:15
clarkbhttp://paste.openstack.org/show/754741/ too many connections to gerrit22:15
clarkbwe are leaking those I guess?22:16
fungiokay, evening sustenance has been prepared and consumed. catching up and then i'll look into the gitea01 replacement assuming no emergencies22:19
*** eernst has quit IRC22:19
clarkbI'm not seeing evidence of 64 connections between those two hosts (checking netstat -np --wide on both sides)22:19
clarkbchecking the gerrit connection list via gerrit cli now)22:19
fungiyeah, that's where the 64 limit comes in22:20
clarkbno evidence of the limit being hit there either22:21
clarkband the mergers don't run concurrently right?22:21
fungiconcurrent gerrit ssh api connections in its connections list are limited to 64 per account if22:21
fungiaccont id22:21
clarkboh account id22:21
fungii'm giving up typing accuratekly22:21
clarkbok so the problem may be the zuul user22:21
* clarkb greps for connection count by zuul user22:21
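The per-account tally clarkb is doing can be sketched as a one-liner over `gerrit show-connections` output. This is a hedged sketch: the sample lines below are made up, and the username column position (`$3` here) is an assumption that varies by Gerrit version.

```shell
# Hypothetical sketch: count SSH sessions per user from the output of
#   ssh -p 29418 review.opendev.org gerrit show-connections -n
# The sample data below is fabricated for illustration.
sample='a1b2c3   00:01:02   zuul      198.51.100.10
d4e5f6   00:02:10   zuul      198.51.100.11
778899   00:00:30   jenkins   203.0.113.5'
# Print a per-user session count, busiest user first
echo "$sample" | awk '{print $3}' | sort | uniq -c | sort -rn
```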
*** eernst has joined #openstack-infra22:22
fungithat's also configurable, but has proven a useful defense against runaway third-party ci systems killing gerrit22:22
clarkbthere are currently only 622:23
clarkbmaybe we should bump to 96?22:23
*** eernst has quit IRC22:23
clarkbgive us a bit more room, but at least for right now it seems fine22:23
*** eernst has joined #openstack-infra22:23
*** betherly has joined #openstack-infra22:23
clarkbdoesn't look like we currently set that configuration option so 64 must be the default22:26
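For reference, the option in question is Gerrit's `sshd.maxConnectionsPerUser`, which defaults to 64. The bump discussed here would look like this in `gerrit.config` (the file's location depends on the deployment):

```ini
[sshd]
  maxConnectionsPerUser = 96
```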
clarkbcorvus: we have 8 (zm) + 12 (ze) mergers and they each run serially right? so in theory 20 should be enough22:26
clarkbam I missing anything obvious for why we may need more?22:26
clarkbthose release jobs in particular don't rely on the zuul user's credentials right (that would bump up potential connection count at that time)22:27
corvusclarkb: hrm, i think the change to set up repos in parallel on the executor may end up using more than one connection22:28
corvusif that's correct then it could be one connection per starting-job from each executor22:30
*** betherly has quit IRC22:30
*** rcernin has joined #openstack-infra22:30
openstackgerritClark Boylan proposed opendev/system-config master: Increate gerrit user connection limit by 50%  https://review.opendev.org/67218822:30
clarkbcorvus: uhm thats hundreds potentially right?22:31
clarkbI don't think our 50% increase is sufficient in that case22:31
clarkbI've pushed that change though if we want to consider its merits there22:31
clarkblooks like our peak for starting builds is actually ~7 according to grafana22:32
clarkbshould the number be 12 * 7 (ze) + 8 (zm) then? That is actually really close to 9622:33
clarkb9222:33
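clarkb's arithmetic above can be written out explicitly; the per-host concurrency figures are the ones quoted in channel (from grafana), not anything guaranteed by Zuul:

```shell
# Connection budget using the numbers discussed above: 12 executors
# each starting up to ~7 jobs at once, plus 8 mergers running serially.
executors=12
peak_starting_builds=7
mergers=8
budget=$(( executors * peak_starting_builds + mergers ))
echo "$budget"   # prints 92, just under the proposed 96 limit
```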
fungidoes the job in question directly update git repositories from gerrit?22:34
corvusclarkb: yeah, that's what i'm thinking22:34
clarkbfungi: that isn't where the failure was from so not sure. The failure was on zm02 updating the manila repo to then run the jobs22:35
fungiahh, okay22:35
corvusthis has been in production for a few months, so if this is what caused it, it's probably pretty rare for us to get that high22:35
fungiwith that i agree22:35
clarkbcorvus: ya looking at grafana it seems rare that more than one executor would be starting more than a small number of jobs22:36
fungifirst i've seen it reported at the very least22:36
clarkbso 96 should actually be a good amount of breathing room22:36
fungii think that's global? but probably also still sufficiently low to catch runaway ci systems22:36
corvusthat may be something to think about for zuulv4 -- having executors/mergers coordinate to limit the overall number of connections22:36
clarkbya it appears to be global22:36
corvustobiash: ^ fyi22:37
fungialso we have conntract set to reject more than 100 concurrent ssh api sessions from the same source ip address22:37
fungier, conntrack22:37
clarkbfungi: which in the case of zuul shouldn't ever happen since it's 20 hosts limited to 64 (and probably 96 soon) connections22:37
fungiyup22:37
clarkbthat conntrack limit was in place to address the problem of leaky connectors22:37
clarkb(across multiple users)22:38
fungijust thinking in terms of where our various accidental denial of service mitigations are relative to one another22:38
fungi~100 api connections per address and ~100 api connections per account are fairly compatible22:39
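The conntrack-based per-address cap fungi mentions is typically expressed with iptables' `connlimit` match. This is a hedged sketch only; the chain name and the exact rule in opendev's firewall configuration are assumptions:

```
# Hypothetical iptables rule: reject new connections to Gerrit's SSH
# API (29418) once a single source address has more than 100 open
-A INPUT -p tcp --dport 29418 --syn -m connlimit --connlimit-above 100 -j REJECT --reject-with tcp-reset
```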
*** mriedem has quit IRC22:39
clarkbhttps://opendev.org/openstack/manila/src/tag/8.0.1 is the tag for which we need to rerun 'tag' pipeline jobs22:39
*** tkajinam has joined #openstack-infra22:39
clarkbshould we wait for smcginnis to return and confirm that is correct before proceeding?22:40
fungiwhat were the failed tag pipeline builds?22:40
clarkbno idea22:40
clarkbzuul didn't get that far because it couldn't get a config built22:40
clarkbpossibly none22:40
fungiif they weren't anything crucial, they probably are covered by the next tag (may be just release notes)22:40
clarkblooks like at least release-notes-jobs22:42
clarkband looking at other tags that is the only job22:44
*** rascasoft has quit IRC22:46
*** rascasoft has joined #openstack-infra22:47
openstackgerritJames E. Blair proposed zuul/zuul master: WIP download and display a log file  https://review.opendev.org/67191222:49
fungiyeah, that's the only one i'm aware of generally running in the tag pipeline22:49
*** tosky has quit IRC22:49
openstackgerritJames E. Blair proposed zuul/zuul master: WIP download and display a log file  https://review.opendev.org/67191222:49
clarkband we run the job there instead of release because it may be a tag that isn't for a release (in which case do release notes make sense?)22:50
fungiclarkb: i think it was because $foo-eol tags don't match pre-release nor release pipeline expressions22:51
clarkbah22:52
smcginnisBack and just caught up on scrollback.22:52
smcginnisSo sounds like we can just ignore that failure then?22:52
smcginnisJust wait until the next patch to get the release notes updated.22:52
clarkbsmcginnis: unless the release notes are critical and https://review.opendev.org/672188 should help prevent that from happening again (will need a gerrit restart too though)22:54
smcginnisThanks for digging in to that. I don't think it's critical. If someone comes complaining, we can do an update then to get them out there.22:55
fungii'm happy to reenqueue the ref in the tag pipeline if that happens, yes22:56
smcginnisOK, thanks! I'm guessing that will not be necessary, but good to know we can if needed.22:56
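The re-enqueue fungi offers would be done with Zuul's client on the scheduler host. This sketch only builds and echoes the command string rather than executing it; the option set (e.g. whether `--newrev` is also required) varies by Zuul version, so treat it as an assumption:

```shell
# Hypothetical re-enqueue of the manila 8.0.1 tag into the tag
# pipeline; echoed here instead of run, since it needs the scheduler.
cmd="zuul enqueue-ref --tenant openstack --trigger gerrit --pipeline tag --project openstack/manila --ref refs/tags/8.0.1"
echo "$cmd"
```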
*** eernst has quit IRC23:01
*** eernst has joined #openstack-infra23:08
*** eernst has quit IRC23:12
*** rh-jelabarre has quit IRC23:13
*** eernst has joined #openstack-infra23:14
*** eernst has quit IRC23:18
*** eernst has joined #openstack-infra23:20
clarkbdonnyd: looks like there are 2 timeouts in fn over the last 12 hours. At least one of those appears to have been due to unhappy cloud under test (so not sure we can blame fn in that case)23:22
clarkbdonnyd: http://logs.openstack.org/53/671853/2/check/tempest-slow-py3/b70e8e1/job-output.txt is the other. That one actually fails to set up devstack but then runs tempest anyway23:25
clarkbcurious23:25
clarkball that to say I don't think we can blame fn in either case which is a great improvement23:25
fungiit's a very determined job23:25
*** eernst has quit IRC23:25
clarkbgmann: any idea what is happening in http://logs.openstack.org/53/671853/2/check/tempest-slow-py3/b70e8e1/job-output.txt#_2019-07-22_18_59_50_267474 that failed to run devstack but then starts tempest anyway23:25
fungi#status log deleted gitea01.opendev.org instance from vexxhost-sjc1 in preparation for replacement23:25
openstackstatusfungi: finished logging23:25
*** eernst has joined #openstack-infra23:27
donnydclarkb: it seems that just a few jobs don't like to run on fn. I don't really have a good answer as to why. They should work. I have worked out the performance issues, so I am kinda thinking it may be an ipv6 only thing maybe23:28
*** dchen has joined #openstack-infra23:28
donnydAlso that same aio job for Osa fails too23:29
fungiclarkb: `openstack server show gitea06.opendev.org` has "image" as blank... suggestions?23:30
fungifrom the `openstack image list` output i'm guessing it's infra-ubuntu-bionic-minimal-20190612 (and not one of the two images simply named "ubuntu-bionic-minimal")23:31
*** eernst has quit IRC23:31
clarkbfungi: I think if you do a volume show on that host's volume you get the image name23:32
fungii suspect if i iterate through `openstack volume list` to see which volume that instance's uuid has in use and then check what image it's based on...23:32
clarkbfungi: the image would be on server show if it wasn't bfv23:32
fungiahh23:32
donnydI will keep my eyes on the failing jobs to see if it's just some particular ones, or if its totally random23:32
clarkbfungi: and ya the one I uploaded would've had a date iirc23:33
* clarkb checks bridge command history23:33
fungiyeah, luckily `openstack volume list` has a total of only 27 entries in that region23:33
clarkbfungi: ya the other two images are from february and are likely what mordred used23:35
clarkbinfra-ubuntu-bionic-minimal-20190612 is the one to use23:35
fungi| 921f1962-2046-4644-bc2e-ba22d7a4947f |23:36
fungi                                                   | in-use    |   80 | Attached to gitea06.opendev.org on /dev/vda                   |23:36
fungiso that would be the volume to track back from i guess23:36
clarkbya23:36
*** eernst has joined #openstack-infra23:36
clarkbvolume show on that volume and in a json blob should be the name of the image iirc23:36
fungiyup. confirmed23:36
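The lookup fungi just confirmed (volume → source image) can be scripted. The cloud-side command is shown commented out since it needs credentials; the canned JSON below is only an illustration of the `volume_image_metadata` shape cinder returns for boot-from-volume servers, using the image name from this discussion:

```shell
# The real query would be something like:
#   openstack volume show 921f1962-2046-4644-bc2e-ba22d7a4947f -f json
# Canned example of the relevant slice of that JSON:
cat > /tmp/volume.json <<'EOF'
{"volume_image_metadata": {"image_name": "infra-ubuntu-bionic-minimal-20190612"}}
EOF
# Extract the image the volume was created from
python3 -c 'import json; print(json.load(open("/tmp/volume.json"))["volume_image_metadata"]["image_name"])'
```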
*** eernst has quit IRC23:40
fungi/opt/system-config/launch/launch-node.py gitea01.opendev.org --flavor=v2-highcpu-8 --cloud=openstackci-vexxhost --region=sjc1 --image=infra-ubuntu-bionic-minimal-20190612 --boot-from-volume --volume-size=80 --ignore_ipv623:41
fungithat look right, clarkb?23:41
*** eernst has joined #openstack-infra23:42
clarkbyes23:43
*** eernst has quit IRC23:47
*** eernst has joined #openstack-infra23:49
*** aaronsheffield has quit IRC23:49
clarkbfungi: corvus have a quick moment for https://review.opendev.org/#/c/672188/ ? can try to sneak in a gerrit restart in the near future if that gets in23:50
fungiMultiple possible networks found, use a Network ID to be more specific23:50
fungibah, i guess i need to pick a network too23:50
clarkbfungi: oh you probably have to delete the volume too23:50
*** eernst has quit IRC23:51
clarkbI don't think launch node cleans up after itself in the bfv case23:51
fungiahh23:51
fungino entries in volume list mention gitea01 though23:51
fungialso `openstack network list` complains "got an unexpected keyword argument 'rate_limit'"23:52
clarkbya I think it tries but it races? I remember needing to clean it up once but I had a few failed attempts23:52
fungii guess it doesn't like our clouds.yaml on bridge.o.o?23:52
clarkbfungi: 0048fce6-c715-4106-a810-473620326cb0 | public I get that with my osc venv23:52
fungiahh23:53
fungi--network=0048fce6-c715-4106-a810-473620326cb0 seems to have gotten me further, yes23:54
fungiopenstack.exceptions.BadRequestException: BadRequestException: 400: Client Error for url: https://block-storage-sjc1.vexxhost.us/v3/462ecebbb6e34add9eeeae3936aa6cb9/volumes/ea7cd896-f78a-4e2a-ac98-8e62a7d275c2, Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be23:55
fungidisassociated from snapshots after volume transfer.23:55
fungii left my wizard's staff back at basecamp23:55
*** eernst has joined #openstack-infra23:55
fungi(and maybe i memorized the wrong spells)23:56
clarkbThat is a new one to me23:56
clarkbdoes a volume list show a volume?23:57
fungivolume list shows a few dozen volumes, but none mention attachment to gitea0123:57
fungii can just try the launch script again and see if this is repeatable23:58
clarkbfungi: ea7cd896-f78a-4e2a-ac98-8e62a7d275c223:58
fungitook a few minutes for it to get to that point the first time through though23:58
clarkbI think that is your volume but it is available so even more confusing23:58
fungiyeah, same error again23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!