Tuesday, 2019-07-16

openstackgerritIan Wienand proposed opendev/system-config master: Add mirror-update to run_all.sh  https://review.opendev.org/67092700:04
*** goldyfruit has joined #openstack-infra00:07
ianwhas anyone already debugged why CI jobs seem to want to run the cloud_launcher and access the clouds? e.g. -> http://logs.openstack.org/06/669006/1/check/system-config-run-mirror/0cbec98/bridge.openstack.org/ara-report/reports/6d295f4d-4aa0-4eb7-a6b8-a7fcd6e4c3da.html00:12
ianwcloud-launcher : Processing keypair infra-root-keys for openstackci-ovh BHS100:12
*** weifan has quit IRC00:13
*** weifan has joined #openstack-infra00:13
ianwalso we seem to have lost the "toggle ci button" ... i think something went in around that?00:14
ianwLoading failed for the <script> with source “https://review.opendev.org/static/hideci.js?e=31300f2c4db937f32384feb237fc356b”00:15
ianwoh, we have to touch the file or something ...00:15
corvusianw: run_cloud_launcher happens via cron (if the timing of the job is just right).  it's not actually part of the job, and even if it fails, zuul doesn't know about it.  it just shows up in the inner ara report because that comes from the host.00:18
*** weifan has quit IRC00:18
ianwcorvus: ahhh. ok that explains it ... :)00:18
corvusianw: you should be able to ignore it, though it is annoying.  we did something in the job to disable the run_all cron because it was racing the tested version of playbooks and that's no good.  we may be able to do the same for the cloud launcher00:19
ianw#status log touched /home/gerrit2/review_site/etc/GerritSiteHeader.html after merge of Id0cd8429ee5ce914aebbbc4a24bef9ebf675e21c00:19
openstackstatusianw: finished logging00:19
ianwso on the other thing, "Toggle Extra CI" is back00:20
ianw... but doesn't actually toggle anything for me :/00:20
fungiit should only toggle job results from not zuul now00:24
fungiso third-party ci00:25
ianwalso the page is horizontal scrolling for me now too00:25
fungihuh00:25
ianwfungi: hrm, i'm not seeing zuul results in the list00:25
ianwof comments00:25
fungiif i go to, say, https://review.opendev.org/665723 it works for me00:26
fungioh, wait, that's just the merged message00:29
clarkbit totally worked in chrome for me when I tested it :/00:29
fungiit looks like it's hiding the zuul vote comments entirely. did i misread the conditional in that script?00:30
ianwyeah, i'm seeing something like https://imgur.com/a/3drQ7mW00:30
fungii think the conditional... i think i see the problem00:31
fungipatch on the way00:31
*** aaronsheffield has quit IRC00:33
openstackgerritJeremy Stanley proposed opendev/system-config master: Correct hide logic for Zuul CI comments in Gerrit  https://review.opendev.org/67092800:33
fungiianw: clarkb: ^00:33
fungibull in a china shop, as i said ;)00:33
fungii had inadvertently applied it to the conditional for whether or not to display the comment, not the one for whether or not to hide it00:34
fungiso it was unconditionally hidden00:34
fungiclarkb: is it possible you made the same mistake i did and saw a zuul "change successfully merged" comment and thought all was good?00:35
*** weifan has joined #openstack-infra00:35
*** ijw has quit IRC00:36
clarkbmaybe? I thought I saw more than one but maybe I updated the wrong conditional when I did it manually or something?00:37
*** weifan has quit IRC00:37
ianwi'm shift-reloading https://review.opendev.org/#/c/665723/ and do see all the comments00:37
ianwalong with "extra ci" button, so that code is sort of active00:38
clarkbmaybe we need the check in both conditionals?00:38
fungino, that's me trying out that change on gerrit00:38
fungi670928 seems to basically just do what the old code did, so that's not the solution either00:38
fungiold working code, not the recent patch00:39
* fungi grumbles and looks closer00:39
*** jeremy_houser has quit IRC00:40
ianwfungi: oh, right, so you applied 670928 to the live site?  because i was looking at the .js and it had your change already, so that would explain that mystery :)00:41
fungiyep, fastest way i knew to see whether this would fix it... and it still doesn't (this basically just reverts it and does a no-op with the extra inverse match)00:42
clarkbseems like it works the first time00:43
clarkbbut then subsequent toggles break00:44
fungioh, yep, you're right00:44
fungithat function is being called from ci_page_loaded00:44
fungiindirectly00:44
fungioh, i think clarkb may be right about needing both00:45
fungiyep!00:46
fungiclarkb is smrt00:46
fungirevising patch00:47
donnydhttps://www.irccloud.com/pastebin/iUQpqcCI/00:47
openstackgerritJeremy Stanley proposed opendev/system-config master: Complete hide logic for Zuul CI comments in Gerrit  https://review.opendev.org/67092800:48
fungiclarkb: ianw: ^00:48
fungithat seems to work00:48
ianwfungi: is that live so can confirm there?00:48
fungiyep00:48
fungiuntil puppet undoes it anyway00:48
fungiyou'll need a force refresh thanks to all the caching in the world00:49
fungiand this explains why the earlier change seemed to work when clarkb tested. it did work, but only on initial load. then when toggling it disappeared00:49
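A minimal sketch of the failure mode discussed above, written in Python rather than the JavaScript of hideci.js. Every name here (`Comment`, `should_hide`, `render`, the "Zuul" account name) is hypothetical and not taken from the real script. The idea it illustrates is only this: if the "never hide Zuul" check lives in one code path, the page can look right on initial load and wrong after a toggle, so both paths should share one predicate.

```python
from typing import List, NamedTuple


class Comment(NamedTuple):
    author: str
    is_ci: bool  # True for any automated CI account, Zuul included


ZUUL = "Zuul"  # assumed account name, for illustration only


def should_hide(comment: Comment, hide_extra_ci: bool) -> bool:
    """Single predicate shared by the initial render and every later toggle."""
    if not comment.is_ci:
        return False  # human comments are always shown
    if comment.author == ZUUL:
        return False  # Zuul results are never treated as "extra" CI
    return hide_extra_ci  # third-party CI follows the toggle state


def render(comments: List[Comment], hide_extra_ci: bool) -> List[str]:
    """Return the authors whose comments would be visible."""
    return [c.author for c in comments if not should_hide(c, hide_extra_ci)]


comments = [
    Comment("alice", False),
    Comment("Zuul", True),
    Comment("Cisco CI", True),
]
visible_when_hidden = render(comments, hide_extra_ci=True)
visible_when_shown = render(comments, hide_extra_ci=False)
```

Because `render` is the only place that decides visibility, toggling back and forth cannot drift away from what the first page load showed.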
sgwcmurphy: I think I licked the osc issues, see: http://logs.openstack.org/63/670363/27/check/starlingx-obs-build/35f6f2b/job-output.txt.gz00:50
sgwsince I am not doing checks yet about if the spec files or _service files changed or if there are source code changes to regenerate the _service tarballs00:50
ianwyay, logs published @ http://mirror.ord.rax.opendev.org/logs/rsync-mirrors/ ... although need to convince apache that .log files are text it seems00:52
clarkbwhat is the difference between those two functions?00:58
clarkbfungi ^ called on different browser events maybe?00:58
fungiclarkb: i thought one was called on page load and one on button press00:59
fungibut that was mostly a guess, i'm really quite clueless about this stuff00:59
fungijust going by where the functions were being called from00:59
fungievents later in the script01:00
*** igordc has quit IRC01:01
*** yamamoto has joined #openstack-infra01:03
*** yamamoto has quit IRC01:06
*** yamamoto has joined #openstack-infra01:09
*** gyee has quit IRC01:14
*** yamamoto has quit IRC01:14
*** imacdonn has quit IRC01:15
*** imacdonn has joined #openstack-infra01:16
openstackgerritIan Wienand proposed opendev/system-config master: Publish .log files as text/plain  https://review.opendev.org/67093401:34
*** yamamoto has joined #openstack-infra01:48
*** yamamoto has quit IRC01:50
*** yamamoto has joined #openstack-infra01:50
*** armax has quit IRC01:53
*** apetrich has quit IRC01:57
openstackgerritMerged opendev/system-config master: Complete hide logic for Zuul CI comments in Gerrit  https://review.opendev.org/67092801:58
*** yamamoto has quit IRC02:01
*** yamamoto has joined #openstack-infra02:10
*** yamamoto has quit IRC02:11
*** yamamoto has joined #openstack-infra02:16
*** yamamoto has quit IRC02:20
*** rh-jelabarre has quit IRC02:22
*** tkajinam has quit IRC02:23
*** tkajinam has joined #openstack-infra02:24
*** bhavikdbavishi has joined #openstack-infra02:34
*** bhavikdbavishi1 has joined #openstack-infra02:37
*** bhavikdbavishi has quit IRC02:38
*** bhavikdbavishi1 is now known as bhavikdbavishi02:38
*** whoami-rajat has joined #openstack-infra02:42
*** hongbin has joined #openstack-infra03:00
*** logan- has quit IRC03:11
*** logan- has joined #openstack-infra03:14
*** factor has joined #openstack-infra03:20
*** yamamoto has joined #openstack-infra03:20
*** michael-beaver has quit IRC03:21
*** psachin has joined #openstack-infra03:26
*** ykarel|away has joined #openstack-infra03:30
*** armax has joined #openstack-infra03:37
*** hongbin has quit IRC03:42
*** udesale has joined #openstack-infra04:09
*** armax has quit IRC04:10
*** ykarel|away has quit IRC04:13
*** ykarel|away has joined #openstack-infra04:34
*** ykarel|away is now known as ykarel04:35
*** ramishra has joined #openstack-infra04:49
*** jamesmcarthur has quit IRC04:50
openstackgerritIan Wienand proposed opendev/system-config master: Disable cloud launcher cron job during CI  https://review.opendev.org/67094605:02
*** kjackal has joined #openstack-infra05:03
ianwcorvus: ^ thanks for the suggestion on that05:03
*** yamamoto has quit IRC05:05
*** igordc has joined #openstack-infra05:07
*** pcaruana has joined #openstack-infra05:08
*** weifan has joined #openstack-infra05:08
*** ricolin_ has joined #openstack-infra05:10
*** YaminiU has joined #openstack-infra05:16
YaminiUHello team, good morning. Our OpenStack CI has been failing for the last few jobs with the following error:05:17
YaminiUThe error appears to have been in '/tmp/tmpisx2rata/986191b8731c42ef85545adf29347fdd/trusted/project_0/git.zuul-ci.org/zuul-base-jobs/playbooks/base/pre.yaml': line 3, column 7, but may05:17
YaminiU2019-07-15 23:56:36.141581 | be elsewhere in the file depending on the exact syntax problem.05:17
YaminiU2019-07-15 23:56:36.141605 | The offending line appears to be:05:17
YaminiU2019-07-15 23:56:36.141628 |   roles:05:17
YaminiU2019-07-15 23:56:36.141639 |     - add-build-sshkey05:17
YaminiU2019-07-15 23:56:36.141652 |       ^ here05:17
YaminiU2019-07-15 23:56:36.142280 | PRE-RUN END RESULT_NORMAL: [trusted : git.zuul-ci.org/zuul-base-jobs/playbooks/base/pre.yaml@master]05:17
YaminiU2019-07-15 23:56:36.142988 | POST-RUN START: [untrusted : github.com/CiscoSystems/project-config-third-party-cinder/playbooks/dsvm-tempest-cisco-zonemanager-vm-job-post.yaml@master]05:17
YaminiU2019-07-15 23:56:37.998527 | PLAY [all]05:17
YaminiUhas anything changed in the paste05:17
YaminiUhttp://paste.openstack.org/show/754419/05:17
YaminiUthat is the error log paste link05:18
johnsomTemporary failure resolving 'mirror.regionone.fortnebula.opendev.org'05:20
johnsomhttp://logs.openstack.org/96/668996/4/check/octavia-v2-dsvm-scenario-ubuntu-bionic/530abf0/job-output.txt.gz#_2019-07-16_04_43_46_68029905:20
johnsominstance: ubuntu-bionic-fortnebula-regionone-000895258905:20
johnsomIt looks like fortnebula is having a DNS issue05:21
ianwjohnsom: hrm, or the host is having a network issue, that may be more likely :/05:21
ianwunbound may be unhappy05:22
johnsomOk, just figured I would give you all a heads up05:22
*** weifan has quit IRC05:23
johnsomYeah, looks like unbound was trying ipv6 without luck05:27
YaminiUhi team05:27
YaminiUcan anyone help05:27
YaminiUany known changes which went in05:27
*** jamesmcarthur has joined #openstack-infra05:29
YaminiUhttp://paste.openstack.org/show/754420/05:31
YaminiUthis is the full paste of the error05:31
*** yamamoto has joined #openstack-infra05:34
*** jamesmcarthur has quit IRC05:37
*** ykarel_ has joined #openstack-infra05:39
*** ykarel_ is now known as ykarel|meeting05:39
*** ykarel has quit IRC05:39
*** udesale has quit IRC05:39
*** udesale has joined #openstack-infra05:40
*** udesale has quit IRC05:42
ianwYaminiU: what's the change that is causing that?05:44
YaminiUi did not do any change05:44
YaminiUthe last 5 jobs are failing05:44
YaminiUbefore that it was passing05:44
YaminiUhttp://paste.openstack.org/show/754420/05:44
YaminiUthis is the full paste of the failure05:44
ianwi mean what review is this being reported on?05:45
YaminiUhttps://review.opendev.org/#/c/627941/05:45
YaminiUCisco05:45
ianwYaminiU: i would say, seeing the old url's in there, the cisco 3rd party ci might be hitting something like described in http://lists.zuul-ci.org/pipermail/zuul-discuss/2019-July/000971.html05:47
YaminiUoh ok05:49
YaminiUlet me try that05:49
YaminiUthanks for the quick response05:49
YaminiU[connection zuul-git] driver=git baseurl=https://git.zuul-ci.org/05:50
YaminiUi need to change this05:50
YaminiUright05:50
YaminiUto opendev url05:50
openstackgerritIan Wienand proposed opendev/system-config master: Add some pointers on the OpenDev PPA  https://review.opendev.org/67095205:54
AJaegerYaminiU: yes, you need to change the parameter, give me a second...05:55
YaminiUok05:55
AJaegerYaminiU: oh, ianw answered already, see http://lists.zuul-ci.org/pipermail/zuul-announce/2019-July/000043.html as well05:56
AJaegerYaminiU: see https://opendev.org/zuul/zuul/src/branch/master/doc/source/admin/examples/etc_zuul/zuul.conf#L23-L26 on what kind of connection to add05:56
YaminiUok05:56
YaminiUthanks05:56
*** udesale has joined #openstack-infra05:57
AJaegerYaminiU: you really should subscribe to zuul-announce if you run a Zuul v305:57
YaminiUcan you please give me the link05:57
YaminiUam new to this05:57
YaminiUi will subscribe for the same05:57
AJaegerYaminiU: See http://lists.zuul-ci.org/05:59
YaminiUthanks05:59
*** raukadah is now known as chandankumar06:03
*** kjackal has quit IRC06:24
*** kjackal has joined #openstack-infra06:29
*** ruffian_sheep has joined #openstack-infra06:32
*** dpawlik has joined #openstack-infra06:40
*** pgaxatte has joined #openstack-infra06:42
*** yamamoto has quit IRC06:51
*** yamamoto has joined #openstack-infra06:52
*** yamamoto has quit IRC06:52
*** iurygregory has joined #openstack-infra06:54
*** yamamoto has joined #openstack-infra06:58
*** yamamoto has quit IRC06:58
*** rcernin has quit IRC07:00
YaminiUAjaeger even after changing the url am seeing the same error with the new URL07:05
YaminiUhttp://paste.openstack.org/show/754422/07:05
YaminiUianw & Ajaeger even after changing the url am seeing the error with the new URL07:06
YaminiUhttp://paste.openstack.org/show/754422/07:06
*** rpittau|afk is now known as rpittau07:08
*** xek has joined #openstack-infra07:09
*** dtantsur|afk is now known as dtantsur07:12
*** igordc has quit IRC07:16
*** tosky has joined #openstack-infra07:21
*** lucasagomes has joined #openstack-infra07:21
AJaegeropendev.org/zuul-base-jobs surprises me, I would have expected opendev.org/zuul/zuul-base-jobs07:33
AJaegerYaminiU: can't help further...07:33
*** jamesmcarthur has joined #openstack-infra07:33
YaminiUeven i would have expected the same07:33
*** ricolin_ is now known as ricolin07:33
YaminiUi had given my base url as opendev.org/zuul07:34
YaminiUhttps://opendev.org/zuul07:34
*** ginopc has joined #openstack-infra07:35
*** pkopec has joined #openstack-infra07:36
*** jamesmcarthur has quit IRC07:38
*** rascasoft has quit IRC07:39
*** ykarel|meeting is now known as ykarel07:39
*** rascasoft has joined #openstack-infra07:40
*** ykarel_ has joined #openstack-infra07:43
*** priteau has joined #openstack-infra07:43
*** slaweq has joined #openstack-infra07:43
*** ykarel_ is now known as ykarel|lunch07:44
*** ykarel has quit IRC07:44
*** udesale has quit IRC07:46
*** udesale has joined #openstack-infra07:46
*** igordc has joined #openstack-infra07:47
YaminiUthere is a reference in the same discussion thread which says the base url should be https://opendev.org07:50
YaminiUand not https://opendev.org/zuul07:51
YaminiUis it the case07:51
*** yamamoto has joined #openstack-infra08:05
*** igordc has quit IRC08:06
*** YaminiU has quit IRC08:07
*** kopecmartin|off is now known as kopecmartin08:07
*** dchen has quit IRC08:11
*** ralonsoh has joined #openstack-infra08:13
*** yamamoto has quit IRC08:15
*** tkajinam has quit IRC08:25
*** yamamoto has joined #openstack-infra08:30
*** ykarel|lunch is now known as ykarel08:43
*** derekh has joined #openstack-infra08:44
*** ruffian_sheep has quit IRC08:46
*** ruffian_sheep has joined #openstack-infra08:54
*** panda has quit IRC08:54
*** jamesmcarthur has joined #openstack-infra08:56
*** panda has joined #openstack-infra08:57
*** jamesmcarthur has quit IRC09:00
*** nfakhir has quit IRC09:17
*** ricolin has quit IRC09:20
*** jamesmcarthur has joined #openstack-infra09:28
*** jamesmcarthur has quit IRC09:32
*** yamamoto has quit IRC10:00
*** nfakhir has joined #openstack-infra10:04
*** yamamoto has joined #openstack-infra10:05
*** yamamoto has quit IRC10:08
*** priteau has quit IRC10:10
*** ykarel is now known as ykarel|afk10:12
*** gfidente has joined #openstack-infra10:12
*** apetrich has joined #openstack-infra10:16
openstackgerritMonty Taylor proposed opendev/system-config master: Silence InsecureRequestWarning from urllib3  https://review.opendev.org/67100010:16
*** tosky__ has joined #openstack-infra10:17
*** tosky has quit IRC10:17
*** tosky__ is now known as tosky10:18
openstackgerritMonty Taylor proposed opendev/system-config master: Silence InsecureRequestWarning and password warning  https://review.opendev.org/67100010:19
openstackgerritTobias Henkel proposed zuul/zuul master: Add support for smart reconfigurations  https://review.opendev.org/65211410:19
*** udesale has quit IRC10:29
*** derekh has quit IRC10:29
*** derekh has joined #openstack-infra10:38
*** yamamoto has joined #openstack-infra10:38
*** YaminiU has joined #openstack-infra10:38
*** yamamoto has quit IRC10:41
YaminiUi have given the base url as mentioned in the thread: https://opendev.org/zuul10:43
YaminiUJob console starting...10:43
YaminiU2019-07-16 10:16:13.032990 | Running Ansible setup...10:43
YaminiU2019-07-16 10:16:19.128313 | PRE-RUN START: [trusted : opendev.org/zuul-base-jobs/playbooks/base/pre.yaml@master]10:43
YaminiU2019-07-16 10:16:20.497776 | ERROR! the role 'add-build-sshkey' was not found in /tmp/tmpqq0z4517/3ccef2ba127840e9a9f7c6552ff2d642/trusted/project_0/opendev.org/zuul-base-jobs/playbooks/base/roles:/tmp/tmpqq0z4517/3ccef2ba127840e9a9f7c6552ff2d642/work/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles:/tmp/tmpqq0z4517/3ccef2ba127840e9a9f7c6552ff2d642/trusted/project_0/opendev.org/zuul-base-jobs/playbooks/base10:43
YaminiU2019-07-16 10:16:20.497882 | The error appears to have been in '/tmp/tmpqq0z4517/3ccef2ba127840e9a9f7c6552ff2d642/trusted/project_0/opendev.org/zuul-base-jobs/playbooks/base/pre.yaml': line 3, column 7, but may10:43
YaminiU2019-07-16 10:16:20.497901 | be elsewhere in the file depending on the exact syntax problem.10:43
YaminiU2019-07-16 10:16:20.497930 | The offending line appears to be:10:43
YaminiU2019-07-16 10:16:20.497958 |   roles:10:43
YaminiU2019-07-16 10:16:20.497976 |     - add-build-sshkey10:43
YaminiU2019-07-16 10:16:20.497992 |       ^ here10:43
YaminiUfirst point: when it clones, it clones into opendev.org/zuul-base-jobs; there is no zuul in there10:43
YaminiUsecond point: it tries to get roles from the zuul-base-jobs folder, but actually the base jobs folder does not have any roles; it is the zuul-jobs folder which has all the roles10:43
YaminiUhttp://paste.openstack.org/show/754422/10:43
YaminiUpaste link of error10:43
openstackgerritTobias Henkel proposed zuul/zuul master: Add --check-config option to zuul scheduler  https://review.opendev.org/54216010:43
*** electrofelix has joined #openstack-infra10:45
*** yamamoto has joined #openstack-infra10:47
*** bhavikdbavishi has quit IRC10:51
*** yamamoto has quit IRC10:54
*** yamamoto has joined #openstack-infra10:56
*** yamamoto has quit IRC10:56
*** yamamoto has joined #openstack-infra10:56
*** pgaxatte has quit IRC11:01
*** kjackal has quit IRC11:08
openstackgerritSimon Westphahl proposed zuul/zuul master: Spec for allowing circular dependencies  https://review.opendev.org/64330911:08
*** kjackal has joined #openstack-infra11:09
*** ykarel|afk is now known as ykarel11:12
*** jamesmcarthur has joined #openstack-infra11:29
*** tesseract has joined #openstack-infra11:30
*** rh-jelabarre has joined #openstack-infra11:31
openstackgerritGhanshyam Mann proposed opendev/elastic-recheck master: Add query for test cold migration revert failure bug 1836595  https://review.opendev.org/67101311:32
openstackbug 1836595 in neutron "test_server_connectivity_cold_migration_revert failing" [Undecided,New] https://launchpad.net/bugs/183659511:32
openstackgerritMerged zuul/zuul master: Run cleanup playbooks in job thread  https://review.opendev.org/67088811:32
*** jamesmcarthur has quit IRC11:33
*** tdasilva has joined #openstack-infra11:44
openstackgerritTobias Henkel proposed zuul/zuul master: Add support for smart reconfigurations  https://review.opendev.org/65211411:49
openstackgerritTobias Henkel proposed zuul/zuul master: Add --check-config option to zuul scheduler  https://review.opendev.org/54216011:50
openstackgerritMerged zuul/zuul master: web: add OpenAPI documentation  https://review.opendev.org/53554111:52
*** yamamoto has quit IRC11:54
*** lpetrut has joined #openstack-infra11:57
*** jamesmcarthur has joined #openstack-infra12:03
AJaegerYaminiU: use https://opendev.org/zuul/zuul/src/branch/master/doc/source/admin/examples/etc_zuul/zuul.conf#L23-L26 as is as example - don't change the baseurl. opendev.org/zuul does not work12:04
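A rough sketch of why a baseurl of https://opendev.org/zuul produces the path opendev.org/zuul-base-jobs seen in the error above. It assumes a simplified model in which the on-disk project name is the hostname of the connection's baseurl joined with the project name from the tenant config; the helper names `canonical_project_name` and `clone_url` are hypothetical, and this is not Zuul's actual code.

```python
from urllib.parse import urlsplit


def canonical_project_name(baseurl: str, project: str) -> str:
    """Assumed model: hostname of the baseurl plus the configured project name."""
    return f"{urlsplit(baseurl).hostname}/{project}"


def clone_url(baseurl: str, project: str) -> str:
    """URL a plain git connection would clone from under this model."""
    return f"{baseurl.rstrip('/')}/{project}"


# Namespace folded into the baseurl: the clone URL still resolves, but the
# "zuul/" namespace is missing from the canonical name.
folded = canonical_project_name("https://opendev.org/zuul", "zuul-base-jobs")

# Namespace kept in the project name, baseurl left at the site root.
expected = canonical_project_name("https://opendev.org/", "zuul/zuul-base-jobs")
```

Under this model `folded` lacks the namespace while `expected` carries it, which is consistent with AJaeger's advice to leave the baseurl at the site root and put the namespace in the project names instead.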
*** Lucas_Gray has joined #openstack-infra12:06
*** jamesmcarthur has quit IRC12:08
*** bhavikdbavishi has joined #openstack-infra12:10
*** tdasilva has quit IRC12:12
*** yamamoto has joined #openstack-infra12:12
*** yamamoto has quit IRC12:12
*** tdasilva has joined #openstack-infra12:12
*** jamesmcarthur has joined #openstack-infra12:15
*** goldyfruit has quit IRC12:16
*** gtarnaras has joined #openstack-infra12:17
*** snapiri has quit IRC12:18
*** snapiri has joined #openstack-infra12:18
openstackgerritMerged zuul/zuul master: web: add tenant and project scoped, JWT-protected actions  https://review.opendev.org/57690712:26
*** lpetrut has quit IRC12:28
*** ruffian_sheep has quit IRC12:30
*** lpetrut has joined #openstack-infra12:32
*** ricolin has joined #openstack-infra12:35
*** ijw has joined #openstack-infra12:38
*** ykarel is now known as ykarel|afk12:39
*** pgaxatte has joined #openstack-infra12:42
*** ykarel_ has joined #openstack-infra12:43
*** ijw has quit IRC12:43
*** ykarel_ has quit IRC12:44
openstackgerritMerged zuul/zuul master: Allow operator to generate auth tokens through the CLI  https://review.opendev.org/63619712:45
*** ykarel|afk has quit IRC12:45
*** aaronsheffield has joined #openstack-infra12:48
*** YaminiU has quit IRC12:50
*** udesale has joined #openstack-infra12:52
*** tesseract has quit IRC12:53
*** yamamoto has joined #openstack-infra12:54
*** tesseract has joined #openstack-infra12:55
*** lpetrut has quit IRC12:57
*** yamamoto has quit IRC12:58
*** ginopc has quit IRC12:59
*** ekultails has joined #openstack-infra13:01
*** Lucas_Gray has quit IRC13:05
*** michael-beaver has joined #openstack-infra13:06
*** Lucas_Gray has joined #openstack-infra13:11
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Install system dependencies for tox-molecule  https://review.opendev.org/67102913:13
*** ykarel has joined #openstack-infra13:16
*** ricolin has quit IRC13:18
*** jamesmcarthur has quit IRC13:19
*** eharney has quit IRC13:22
coreycbhi infra, would someone be able to give some input on this review wrt retaining the py35 zuul unit test job for tempest? https://review.opendev.org/#/c/670639/13:23
*** goldyfruit has joined #openstack-infra13:28
*** gtarnaras has quit IRC13:29
*** kjackal has quit IRC13:30
*** liuyulong has joined #openstack-infra13:33
*** kjackal has joined #openstack-infra13:33
*** trident has quit IRC13:38
*** trident has joined #openstack-infra13:39
*** rh-jelabarre has quit IRC13:48
*** rh-jelabarre has joined #openstack-infra13:50
sshnaidm|roverIf I have job defined with specific branches, not including master: https://review.opendev.org/#/c/670168/2/zuul.d/standalone-jobs.yaml - can I still run it on master when specifying "branches: master" for job vars in pipeline config?13:51
sshnaidm|roverbecause it doesn't seem to work: https://review.opendev.org/#/c/670176/1/zuul.d/layout.yaml13:51
*** psachin has quit IRC13:52
clarkbcoreycb: there is no issue with that. Tempest is branchless and expects to function across the breadth of currently supported branches and their platforms14:08
coreycbclarkb: ok thanks for the input. yeah i wasn't sure if we are trying to get away from xenial or not.14:08
clarkbthat means testing python35 until those branches are untestable (which is a bit undefined I think but likely lines up with platform eol)14:09
coreycbclarkb: ok sounds good, we'll keep the py35 tests then14:09
jrosseri've seen a few of these in the last couple of days seeing a fair few jobs fail with this Client Error: Not Found for url: https://opendev.org/openstack/requirements/raw/f67ac75ebb1ac96e7f6a511d9c5e4c3de21c38d4/upper-constraints.txt14:15
*** yamamoto has joined #openstack-infra14:15
jrosserarg paste, but ykwim14:15
clarkbpossible we have another corrupt fs on one of the backends14:16
clarkbwe'll have to check the 8 against that url and see if any fail14:17
*** yamamoto has quit IRC14:22
*** eharney has joined #openstack-infra14:23
clarkbgitea08 404s on that url14:28
clarkbI'm not on real computer yet but maybe someone who is can check for fs problems on that host14:28
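The "check the 8 against that url" step could be scripted roughly as follows. This is a sketch under assumptions: the backend hostnames gitea01 through gitea08 on port 3000 are inferred from the names used in this log, and `backend_urls`, `http_status` and `failing_backends` are hypothetical helpers, not existing tooling.

```python
from typing import Callable, List, Optional
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

PATH = (
    "openstack/requirements/raw/commit/"
    "f67ac75ebb1ac96e7f6a511d9c5e4c3de21c38d4/upper-constraints.txt"
)


def backend_urls(path: str, count: int = 8) -> List[str]:
    """Build the same URL against each backend (hostnames are assumed)."""
    return [
        f"https://gitea{n:02d}.opendev.org:3000/{path}"
        for n in range(1, count + 1)
    ]


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code, or None if the host could not be reached."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:  # must come first: HTTPError subclasses URLError
        return err.code
    except URLError:
        return None


def failing_backends(
    path: str, fetch: Callable[[str], Optional[int]] = http_status
) -> List[str]:
    """List every backend URL that does not answer 200 for the given path."""
    return [url for url in backend_urls(path) if fetch(url) != 200]
```

Calling `failing_backends(PATH)` performs eight real requests; passing a stub as `fetch` lets the logic be exercised without touching the network.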
openstackgerritArtom Lifshitz proposed opendev/elastic-recheck master: Add query for bug 1836754  https://review.opendev.org/67105114:30
openstackbug 1836754 in OpenStack Compute (nova) "Conflict when deleting allocations for an instance that hasn't finished building" [Undecided,New] https://launchpad.net/bugs/183675414:30
*** icarusfactor has joined #openstack-infra14:32
clarkbsshnaidm|rover: I think the job may not exist in the context of master since that branch was excluded which means there is nothing to override14:33
sshnaidm|roverclarkb, so maybe better to include all, and then override it in each repo?14:33
clarkbor define it multiple times with the group matchers for each set of branches14:35
fungiclarkb: i'm checking it now14:35
*** factor has quit IRC14:35
corvusfungi, clarkb: lots of oom errors14:35
fungiyup14:35
*** dpawlik has quit IRC14:36
fungithere was a memory spike around 06:25 according to cacti14:37
corvusit chooses to kill git a lot, so it could have just aborted a replication push14:37
fungithat roughly corresponds with the 06:18-06:22 ooms in the dmesg14:38
corvusit killed 6 git processes and 1 gitea process14:38
*** pgaxatte has quit IRC14:38
fungii'm going to check the other gitea servers for similar issues14:38
*** pgaxatte has joined #openstack-infra14:39
fungibut on 08 it looks like it killed lots of stuff14:39
fungimysqld, containerd, git, sshd14:40
*** armax has joined #openstack-infra14:42
fungiahh, nevermind, those were just the triggering mallocs14:42
corvusfungi: really? i only see it killing git... what's..14:42
corvusok14:42
fungiit's all git processes killed14:42
fungiin each case14:42
corvusand yeah, i was wrong about the gitea proc too... it said "Kill process 21950 (gitea) score 94 or sacrifice child" but apparently went with 'sacrifice child' on that one because it killed git14:42
fungiat least in this event and the similar (smaller) one on wednesday14:42
fungiback on 2019-06-27 there was a bout of ooms and a pandoc process got killed14:43
fungisame back on 2019-06-1814:43
fungiout of the 8 gitea servers, only 02 has no oom events in dmesg14:45
fungisome have more, some less, various dates and times, almost always git processes are sacrificed14:45
fungii'm going to go out on a limb and assume 02 has merely been lucky14:46
clarkbdoesnt cacti show tons of available memory though?14:47
fungimost of the time14:47
clarkbcould we be hitting lower cgroup limits?14:47
fungithese look like sudden spikes14:47
fungihttp://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=66794&rra_id=1&view_type=&graph_start=1563201378&graph_end=156328777814:47
fungier, i meant http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=66794&rra_id=all14:48
*** ginopc has joined #openstack-infra14:48
fungi[Tue Jul 16 06:22:09 2019] Killed process 16582 (git) total-vm:1536656kB, anon-rss:729868kB, file-rss:0kB, shmem-rss:0kB14:49
fungithese servers have 8gb ram and no swap14:49
fungihrm, yeah that doesn't look close to 8gb on its own14:50
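To put a number on "doesn't look close to 8gb", the kill line quoted above can be parsed and converted. This is a small sketch; the regular expression covers only the fields visible in that one dmesg line, and `parse_oom_kill` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional, Union

OOM_KILL = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kill(line: str) -> Optional[Dict[str, Union[int, str]]]:
    """Extract pid, process name and memory figures (in kB) from a kill line."""
    match = OOM_KILL.search(line)
    if match is None:
        return None
    return {
        "pid": int(match["pid"]),
        "name": match["name"],
        "total_vm_kb": int(match["total_vm"]),
        "anon_rss_kb": int(match["anon_rss"]),
    }


line = (
    "[Tue Jul 16 06:22:09 2019] Killed process 16582 (git) "
    "total-vm:1536656kB, anon-rss:729868kB, file-rss:0kB, shmem-rss:0kB"
)
event = parse_oom_kill(line)
anon_rss_gib = event["anon_rss_kb"] / (1024 * 1024)  # kB to GiB
```

For this line the resident memory of the killed git process works out to roughly 0.7 GiB, well short of 8 GB on its own, which fits the observation that several processes together (or a short spike) exhausted the host.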
clarkbThey dont have the ephemeral drive that our launch scripts normally set up for swap but also most dont have the spare disk to add a swap file or device14:50
clarkbon 06 we could add a swap file without worry but the others need rebuilding with more disk14:50
clarkbdouble checking docker doesnt limit memory use by default14:51
clarkbunlikely to be cgroups then I guess14:51
fungiwell, regardless, the spike in cacti suggests the server did run out of available ram for ~1 poll14:52
*** bhavikdbavishi has quit IRC14:52
clarkbthe shortterm fix here is to force replication for gitea08 then?14:59
mordredclarkb: I think that seems reasonable14:59
fungihrm... 06:25 is when cron.daily is triggered... coincidence? probably, the ooms started around 06:18 if the dmesg timestamps are accurate. *:17 is when cron.hourly kicks off but there's nothing in it14:59
clarkbdaily would do logrotate?15:00
clarkb4:42 is the db backup15:00
clarkbI guess it could be an external actor's daily cron too15:00
*** eernst has joined #openstack-infra15:01
fungiohhh15:01
fungiyeah, that's a distinct possibility15:01
clarkbwe can probably check the gitea logs for  requests around that time period?15:01
fungithere is a corresponding spike in network traffic too15:01
fungiand cpu usage15:01
*** pkopec has quit IRC15:01
fungiand load average15:02
fungino corresponding spike on other gitea servers though, suggesting it was likely a single client address?15:02
mordredso maybe just someone did a thing that caused the memory/cpu/traffic spike and because of load balancing it happened to go to gitea0815:03
*** pkopec has joined #openstack-infra15:03
*** jamesmcarthur has joined #openstack-infra15:03
fungithat's what's looking likely so far15:04
sshnaidm|roverclarkb, trying to redefine jobs branches, but still don't see it's queued for this patch: https://review.opendev.org/#/c/671055/2/zuul.d/standalone-jobs.yaml - is anything wrong there?15:07
clarkbsshnaidm|rover: the job has to be in a pipeline queue too15:07
sshnaidm|roverclarkb, you mean this? https://review.opendev.org/#/c/671055/2/zuul.d/layout.yaml15:09
*** kjackal has quit IRC15:09
clarkbhrm not sure then15:10
clarkbfungi: mordred corvus should I go ahead and trigger gitea08 replication?15:16
*** eernst has quit IRC15:17
fungiclarkb: yeah, sorry, conference call has sapped my continued troubleshooting, hoping to get back to it shortly15:20
clarkbya juggling that too15:20
corvusclarkb: ++15:20
corvusclarkb, fungi, mordred: i found what's broken with parallel gitea repo creation -- it updates the "user" db table and sets "num_repos".  so if i do 10 creates in parallel, it updates the user table 10 times and sets num_repos to 1 each time.15:21
corvuswhen the system starts, it does a check, and so it fixes that.  that's why restarting caused it to go back to normal.15:22
corvushere's the increment: https://github.com/go-gitea/gitea/blob/master/models/org_team.go#L131-L13615:22
clarkbcorvus: that seems like a legit gitea bug15:22
corvushere's the call site: https://github.com/go-gitea/gitea/blob/master/models/repo.go#L132115:23
corvusand here's one more level up the stack where it creates the orm session: https://github.com/go-gitea/gitea/blob/master/models/repo.go#L1371-L137715:23
corvusso each creation is happening in its own session15:23
corvusthere's a Begin() in there, so maybe in its own transaction?15:24
mordredyeah - that sounds like a db layer bug - e.Incr is likely doing it go-side rather than db-side15:24
mordred(which is a common programming error by folks)15:24
mordredupdate table set foo=foo+1 is an atomic operation that's safe to do from multiple calling threads at once15:25
* mordred reads more go code15:25
corvusrepo.go:        if _, err = sess.Exec("UPDATE `user` SET num_repos=num_repos+1 WHERE id=?", newOwner.ID); err != nil {15:25
corvusmordred: ^ that exists elsewhere in gitea15:25
corvusmordred: maybe we should change that incr to that ^ ?15:25
*** chandankumar is now known as raukadah15:26
mordredmaybe so - I'm reading the xorm code to try to see what it thinks it's doing ... but yeah, that seems like a safe change to make if it's already in the codebase elsewhere15:26
corvusi'll start by checking i can repro on master :)15:27
mordred++15:28
mordredcorvus: oh - xorm seems to also have lunny as a committer - so if we do find a bug, we at least know a person15:28
mordredcorvus: I'm confused - the Incr call *seems* like it at least intends to do num_repos=num_repos+115:31
*** e0ne has joined #openstack-infra15:32
corvusmordred: huh.  could it be doing it in several isolated transactions, and as long as they come out the same, it's not a conflict?  i'm rusty here.15:34
mordredcorvus: I can't follow all the magic - I think it's worth changing to the sess.Exec and see if that fixes it15:34
corvusok.  i'm still building images, so it'll be a few mins15:35
mordredcorvus: it could do - if they're doing explicit begin/commit transactions. at least with mysql the answer to this pattern is to send the update call outside of an explicit transaction and let it be handled as an atomic operation15:35
mordredthat would also depend on isolation level15:36
*** Lucas_Gray has quit IRC15:36
*** gyee has joined #openstack-infra15:39
openstackgerritJeff Liu proposed zuul/zuul-operator master: [WIP] Verify Operator Pod Running  https://review.opendev.org/67039515:39
*** Lucas_Gray has joined #openstack-infra15:41
openstackgerritStephen Finucane proposed zuul/zuul master: web: Add warning about incompleteness of OpenAPI spec  https://review.opendev.org/67108615:41
*** diablo_rojo has joined #openstack-infra15:41
mordredcorvus: ok - no, it doesn't depend on isolation level - and update set x=x+1 should always do the right thing15:42
clarkbmordred: that is probably done without locks and instead uses a type that can always be updated atomically. yay databases solving problems for us15:43
clarkbreplication completed and https://gitea08.opendev.org:3000/openstack/requirements/raw/commit/f67ac75ebb1ac96e7f6a511d9c5e4c3de21c38d4/upper-constraints.txt exists now15:43
clarkbjrosser: ^ hopefully that addresses the problem in the short term15:44
jrosserclarkb: great thankyou, I’ll let you know if I see any more15:44
corvusmordred: ok, reproduced with local build of gitea master; will try modifying now15:45
mordredcorvus: woot15:46
*** eharney has quit IRC15:46
*** eharney has joined #openstack-infra15:46
clarkbI'll trigger replication for a backend at a time to ensure they are all in sync as of roughly today15:46
corvusmordred: i don't suppose you ran across a "get session from engine" method? :)15:47
*** igordc has joined #openstack-infra15:47
clarkbon gitea06 we can create a swapfile (it has more disk than the others)15:48
clarkband if that makes the OOMs go away there I guess we can roll that out broadly as we replace servers?15:48
corvusmordred: nm -- the "engine" variable is actually the session15:49
fungiclarkb: that seems like a reasonable next step. we may also want to consider some sort of per-client rate limiting if haproxy is capable of that15:49
corvusclarkb: ++15:49
mordredcorvus: I only found factory factories that factory the session generation factory15:49
mordredclarkb: ++15:49
corvusmordred: that's no good, i can only use a factory factory factory15:49
clarkbalso as a reminder I'm likely to not be around much of today as I'm taking advantage of people traveling to portland for oscon15:50
clarkbWill be around for the meeting though15:50
mordredcorvus: you could wrap the factory factory in a generator factory and then async callback that promise to a deferred factory generator15:50
fungiokay, so looking at cacti graphs the cpu/load spike also appears on gitea06 but does not seem to have much in the way of a corresponding network spike15:51
mordredclarkb: I will miss the meeting today ... I have not moved the github repos in openstack-infra yet - although I do have a non-working script started locally15:51
corvusmordred: oh shoot, i think i got something backwards15:52
mordredcorvus: ono15:52
corvusmordred: i didn't notice that addRepository was operating on the team table.  that *does* have the correct num_repos.  so it looks like Incr is working.... but where is the user getting updated...?15:54
mordredcorvus: oh. well - at least that makes it make more sense in some ways - I really couldn't figure out what was wrong with Incr15:55
clarkbre making swap I think I'll push an update to make_swap.sh that we can quickly read over and if that looks good manually run it on 06. Then if we merge that change the new nodes will all automagically get swap15:56
mordredcorvus: are we looking for where the "how many repos does this user have" setting gets updated?15:58
mordredcorvus: https://github.com/go-gitea/gitea/blob/master/models/repo.go#L130916:00
mordredcorvus: then https://github.com/go-gitea/gitea/blob/master/models/repo.go#L131216:00
mordredcorvus: it's that update16:00
*** pgaxatte has quit IRC16:00
mordredcorvus: we need to split that into two calls - one to do the update to set LastRepoVisibility (for which I think it's perfectly fine for it to race)16:01
mordredcorvus: and then an Incr call16:01
mordred(or a direct update)16:01
mordredand u.NumRepos++ should move after the updateUser call so that it doesn't get sucked in16:02
mordredcorvus: you want me to try to make a PR?16:02
corvusmordred: thank you! i did not see NumRepos i was looking for num_repos.  gimme a sec to catch up16:02
corvusmordred: yes, that would be great -- why don't you make a branch on github and let me test it before you actually open the pr16:03
corvusmordred: and i agree with your analysis16:04
mordredcorvus: http://paste.openstack.org/show/754445/ I think will do it16:06
mordredcorvus: but yes - I make branch now16:06
*** lucasagomes has quit IRC16:06
*** bobh has joined #openstack-infra16:07
*** Lucas_Gray has quit IRC16:07
openstackgerritClark Boylan proposed opendev/system-config master: Use swapfile if no extra device is present  https://review.opendev.org/67110216:08
clarkbinfra-root ^ something like that for automagically creating swap on node creation. I'll run that manually if that looks about right16:08
*** ricolin has joined #openstack-infra16:09
mordredcorvus: https://github.com/emonty/gitea/pull/new/fix-user-total-repo16:09
mordredcorvus: I'm guessing we need the Update(u) part of that - perhaps because that's how it knows what table to update?16:09
corvusmordred: is that a private repo?16:09
corvusoh, no i think that's a "create a PR" url you sent :)16:10
corvushttps://github.com/emonty/gitea/tree/fix-user-total-repo16:10
mordredoh - hahaha16:11
corvusmordred: isn't Update(u) still going to overwrite the num_repos?16:11
mordredcorvus: oh - hrm. maybe what we want there is Update(new User) like in the other call16:12
*** ginopc has quit IRC16:12
corvusmordred: wait, strike that16:12
corvusmordred: i meant the updateUser() call16:13
corvusmordred: does updateUser only update what's changed, or everything?  cause if it's everything, we could end up setting num_repos=0 in each thread before we execute the incr16:13
mordredcorvus: I would think it updates what's changed ... but let me go read16:14
corvuse.ID(u.ID).AllCols().Update(u)16:14
mordredcorvus: I pushed up a patch that does Update(new(User)) just to be sure on that front16:14
mordredcorvus: piddle16:14
corvusmordred: well, i don't know what xorm does with those methods16:14
mordredme either16:14
corvusmordred: yes i do16:14
corvusUPDATE `user` SET `lower_name` = ?, `name` = ?, `full_name` = ?, `email` = ?, `keep_email_private` = ?, `passwd` = ?, `passwd_hash_algo` = ?, `must_change_password` = ?, `login_type` = ?, `login_source` = ?, `login_name` = ?, `type` = ?, `location` = ?, `website` = ?, `rands` = ?, `salt` = ?, `language` = ?, `description` = ?, `last_login_unix` = ?, `last_repo_visibility` = ?, `max_repo_creation`16:14
corvus= ?, `is_active` = ?, `is_admin` = ?, `allow_git_hook` = ?, `allow_import_local` = ?, `allow_create_organization` = ?, `prohibit_login` = ?, `avatar` = ?, `avatar_email` = ?, `use_custom_avatar` = ?, `num_followers` = ?, `num_following` = ?, `num_stars` = ?, `num_repos` = ?, `num_teams` = ?, `num_members` = ?, `visibility` = ?, `diff_view_style` = ?, `theme` = ?, `updated_unix` = ? WHERE `id`=?16:14
corvusmordred: ^ show full processlist16:14
corvusso yeah, it's the bad thing16:15
mordredawesome. so we need to exclude num_repos16:15
corvusmordred: is it just the 2 things?  lastvis and numrepos?16:15
mordredactually - yeah - so we can just change that call16:15
mordredupdate coming16:16
corvus(and yeah, it looks to me like it's just the 2)16:16
mordredcorvus: do you think we use the mysql column name or the go variable name in the Cols argument?16:16
mordredI'm gonna go with mysql to start16:16
*** mattw4 has joined #openstack-infra16:17
corvusmordred: Incr uses mysql col name16:17
corvusso that sounds like a good bet16:17
mordredcorvus: ok. force-pushed to the branch16:18
mordredshould be the full story now :)16:18
corvusmordred: cool, i'll build that and try it out16:18
*** mattw4 has quit IRC16:19
*** mattw4 has joined #openstack-infra16:19
mordredfwiw - the other solution here would be to turn the original select into a select for update - which would put a row lock on the row in the db causing the other threads to block waiting which would reduce the concurrency but allow the select/update pattern16:19
mordredthat would be much harder to plumb in though, since the User comes from the request context16:22
*** ykarel has quit IRC16:25
*** hamzy has quit IRC16:25
*** jamesmcarthur has quit IRC16:29
*** jamesmcarthur has joined #openstack-infra16:31
corvusmordred: \o/ UPDATE `user` SET `last_repo_visibility` = ?, `updated_unix` = ? WHERE `id`=?16:33
*** tesseract has quit IRC16:33
corvusmordred: count looks correct16:33
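The concern about an all-columns update clobbering the counter, and the effect of restricting the update to the changed column, can be sketched like this. It is a hedged illustration with Python's stdlib sqlite3 and a hypothetical two-column `users` table; `update_user` is an invented helper that stands in for the ORM call and is not xorm's API. Workers are simulated sequentially, each holding a copy of the user row loaded before any repo was created.

```python
import sqlite3

# Hypothetical column list for the toy table below.
ALL_COLUMNS = ("last_repo_visibility", "num_repos")


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, "
        "last_repo_visibility INTEGER NOT NULL, num_repos INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO users VALUES (1, 0, 0)")
    return conn


def load_user(conn):
    row = conn.execute(
        "SELECT id, last_repo_visibility, num_repos FROM users WHERE id = 1"
    ).fetchone()
    return dict(zip(("id", "last_repo_visibility", "num_repos"), row))


def update_user(conn, user, cols):
    # Write only the named columns from the in-memory copy back to the row.
    assignments = ", ".join(f"{col} = ?" for col in cols)
    params = [user[col] for col in cols] + [user["id"]]
    conn.execute(f"UPDATE users SET {assignments} WHERE id = ?", params)


def create_repos(workers, cols):
    conn = make_db()
    # Each simulated worker holds a copy loaded while num_repos was still 0.
    copies = [load_user(conn) for _ in range(workers)]
    for user in copies:
        user["last_repo_visibility"] = 1
        update_user(conn, user, cols)  # may write back the stale num_repos
        conn.execute(
            "UPDATE users SET num_repos = num_repos + 1 WHERE id = ?",
            (user["id"],),
        )
    return load_user(conn)["num_repos"]


if __name__ == "__main__":
    print("all columns:", create_repos(3, ALL_COLUMNS))
    print("restricted:", create_repos(3, ("last_repo_visibility",)))
```

Writing every column resets the counter to the stale value before each increment, so the count stays at 1; writing only `last_repo_visibility` leaves the increments intact.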
corvusmordred: i think you're gtg on opening the pr16:34
*** electrofelix has quit IRC16:35
mordredcorvus: woot!16:37
*** rpittau is now known as rpittau|afk16:38
corvusmordred: i'm doing a full run of project creation locally, and i think i'm seeing the database be the bottleneck -- specifically this update (i mean, i think it was before, which is why i was able to actually see the update in the processlist).  i'm wondering if there's some bit of tuning we can do?  we're basically just running the mariadb image with all the defaults..16:38
corvusmordred: http://paste.openstack.org/show/754446/16:39
*** ijw has joined #openstack-infra16:39
corvusmordred: or -- am i seeing mariadb just waiting for the session commit?16:39
*** ijw has joined #openstack-infra16:39
mordredyeah - you might be just waiting for the session commit if they're not doing autocommit16:42
corvusmordred: https://github.com/go-gitea/gitea/blob/master/models/repo.go#L140116:42
corvusmordred: looks like the filesystem stuff happens after that update and before the commit16:43
corvusthat's probably the safest thing.  :(16:43
mordredyeah - but it'll definitely be a smidge of a bottleneck16:44
mordredstill - we should at least see correct results when we're done16:44
*** armax has quit IRC16:48
donnydclarkb: seems like I am still getting a timeout here or there.. but I am not sure it's infra related16:48
donnydhttp://logs.openstack.org/48/670848/1/check/tempest-full-rocky/60be484/job-output.txt16:48
donnydseems to be the same job that hits it16:48
corvusmordred: it looks like the upshot of that is that parallelized repo creation takes only 86% of the time of serialized; so it'll save us a couple minutes, but it's not huge.  however, parallelized repo updates can save us a lot of time.  so i think we can do serialized repo creation + parallelized repo updates and cut our time in half, and then once we upgrade to gitea with your patch, shave a few more16:48
corvusmins off.16:48
corvusmordred: i'll work on setting that up16:49
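The "serialized creation + parallelized updates" plan could look roughly like the sketch below. It is an assumption-laden outline, not the system-config implementation: `create_repo` and `update_repo` are hypothetical callables supplied by the caller, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def sync_repos(repos, create_repo, update_repo, max_workers=8):
    """Create repos one at a time, then update their settings in parallel.

    create_repo and update_repo are hypothetical callables taking a repo name.
    """
    # Creation stays serial to avoid concurrent writes to the owner's row.
    for repo in repos:
        create_repo(repo)
    # Updates are independent per repo, so a thread pool can fan them out.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(update_repo, repos))


if __name__ == "__main__":
    created = []
    results = sync_repos(
        ["openstack/nova", "openstack/neutron"],
        created.append,
        lambda repo: f"updated {repo}",
    )
    print(created, results)
```

`pool.map` returns results in input order, so the caller can pair each result with its repo even though the updates run concurrently.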
donnydnot sure if that is cpu, memory or storage bound... I got the storage performance problem all shored up16:50
*** gfidente has quit IRC16:51
*** gfidente has joined #openstack-infra16:51
*** ramishra has quit IRC16:51
openstackgerritMerged zuul/zuul master: web: Add warning about incompleteness of OpenAPI spec  https://review.opendev.org/67108616:52
donnydI can say for sure it's not storage bound this time around.16:53
donnyd   read: IOPS=21.9k, BW=85.7MiB/s (89.9MB/s)(3070MiB/35809msec)16:53
donnyd  write: IOPS=7334, BW=28.7MiB/s (30.0MB/s)(1026MiB/35809msec)16:53
donnydon a loaded hypervisor16:53
AJaegersshnaidm|rover: use debug: true on your check queue to figure out why jobs are run or not16:55
donnydand with a little larger block size I still have this left over   write: IOPS=460, BW=461MiB/s (483MB/s)(4096MiB/8894msec)16:55
*** hamzy has joined #openstack-infra16:55
*** derekh has quit IRC17:01
fungidonnyd: http://logs.openstack.org/48/670848/1/check/tempest-full-rocky/60be484/controller/logs/dstat-csv_log.txt.gz might be useful to see what performance the job was experiencing17:02
mordredcorvus: ++17:03
fungidonnyd: but also, you might consider filtering for build_name=tempest-full-rocky across all providers for timeouts17:04
donnydnot sure how to make heads or tails of that fungi17:04
fungicould be it's not just runs in fortnebula which are timing out, and that the job itself is just generally running dangerously close to its (currently configured) 2-hour timeout17:05
donnydseems to just be me17:05
donnydat least in the last 12 hours17:06
*** aluria has quit IRC17:06
donnydand the last 7 days17:06
fungidstat is like systat/sar, and the rows in that csv are points in time during the job, columns are various stats like cpu utilization, memory, disk read and write operations and bandwidth, et cetera17:07
*** udesale has quit IRC17:09
fungimight indicate that the job spent a long time with the guest at 100% cpu utilization or high system load count17:09
fungican also compare it to similar dstat files from other runs of the same job in different providers if that's of interest17:10
*** yamamoto has joined #openstack-infra17:11
*** kopecmartin is now known as kopecmartin|off17:11
*** dtantsur is now known as dtantsur|afk17:15
*** goldyfruit has quit IRC17:16
*** igordc has quit IRC17:17
*** e0ne has quit IRC17:20
openstackgerritAndreas Jaeger proposed openstack/project-config master: Publish api-ref/api-guide to docs.o.o  https://review.opendev.org/67111717:22
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove publish-openstack-manuals-developer-lang  https://review.opendev.org/67111817:22
openstackgerritJames E. Blair proposed opendev/system-config master: Use a thread pool to update gitea repos faster  https://review.opendev.org/67092017:32
corvusmordred: ^ that took 9m13s locally17:32
corvusmordred: i wonder if we ran the different orgs in parallel, if that would avoid the db contention (since we would be updating num_repos of different users)?17:33
corvuswe don't have a lot of orgs (atm), but even if we run x/ and openstack/ at the same time, maybe it would help17:33
corvusi'll see if i can give that a try17:33
*** armax has joined #openstack-infra17:34
*** weifan has joined #openstack-infra17:34
fungiahh, yeah, so serial creation within an org, but parallel between different orgs17:34
mordredcorvus: yeah - I could see that maybe helping - although my hunch says that since the db commit is waiting on the repo creation on disk, it's more that it _looks_ more like db contention than it actually is17:35
fungialso if you're limiting it to a fixed number of threads, sort by number of repos per org so you frontload the largest ones and do the smaller ones at the end as threads free up17:35
*** ijw has quit IRC17:36
fungithat ought to maximize packing efficiency to shave off a bit more time (and increasingly as we grow more orgs in the future)17:36
corvusmordred: yeah, i guess it depends on how efficient that disk work is -- can we get 2 of those going at the same time, or is it already at 100% utilization.  fungi ++ good point17:36
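The scheduling idea discussed here (serial creation within an org, parallel across orgs, largest orgs submitted first) might be sketched as follows. Everything in it is illustrative: `create_repo` is a hypothetical callable, the org and repo names in the usage example are placeholders, and the pool size is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def schedule_order(repos_by_org):
    # Largest orgs first, so the longest-running work starts as early as possible.
    return sorted(repos_by_org, key=lambda org: len(repos_by_org[org]), reverse=True)


def create_org_repos(org, repos, create_repo):
    # Serial within one org: only one writer touches that owner's counters.
    return [create_repo(org, name) for name in repos]


def create_all(repos_by_org, create_repo, max_workers=4):
    """Run one serial creation task per org, with orgs processed in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            org: pool.submit(create_org_repos, org, repos_by_org[org], create_repo)
            for org in schedule_order(repos_by_org)
        }
        return {org: future.result() for org, future in futures.items()}


if __name__ == "__main__":
    repos = {"x": ["a"], "openstack": ["nova", "neutron", "cinder"]}
    print(schedule_order(repos))
    print(create_all(repos, lambda org, name: f"{org}/{name}"))
```

Submitting the biggest orgs first means small orgs fill in as threads free up, which is the packing benefit described above; whether it helps in practice depends on how much of the time is spent on disk work rather than in the database.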
*** ijw has joined #openstack-infra17:36
fungithat only sprang to mind because we convinced gerrit upstream to implement a similar optimization in the reindex queuing17:38
* mordred afks for a bit - biab17:38
fungi(though in that case it was change refs per repo i think)17:38
*** ijw has quit IRC17:38
*** ijw has joined #openstack-infra17:39
*** ricolin has quit IRC17:40
*** jamesmcarthur has quit IRC17:43
*** igordc has joined #openstack-infra17:44
*** _Cyclone_ has quit IRC17:50
*** _Cyclone_ has joined #openstack-infra17:52
*** electrofelix has joined #openstack-infra18:00
*** tosky has quit IRC18:01
*** eharney has quit IRC18:02
*** electrofelix has quit IRC18:03
*** weifan has quit IRC18:04
*** weifan has joined #openstack-infra18:04
*** eharney has joined #openstack-infra18:06
*** jamesmcarthur has joined #openstack-infra18:07
*** weifan has quit IRC18:09
*** weifan has joined #openstack-infra18:09
*** dpawlik has joined #openstack-infra18:10
openstackgerritMatthieu Huin proposed zuul/zuul master: Zuul CLI: allow access via REST  https://review.opendev.org/63631518:13
openstackgerritMatthieu Huin proposed zuul/zuul master: Add Authorization Rules configuration  https://review.opendev.org/63985518:13
openstackgerritMatthieu Huin proposed zuul/zuul master: Web: plug the authorization engine  https://review.opendev.org/64088418:14
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove publish-openstack-manuals-developer-lang  https://review.opendev.org/67111818:14
*** goldyfruit has joined #openstack-infra18:22
*** jamesmcarthur has quit IRC18:24
*** weifan has quit IRC18:38
*** weifan has joined #openstack-infra18:38
*** jamesmcarthur has joined #openstack-infra18:42
donnyddoes anyone know if there is a way to be specific in a test job. IE I want X flavor on Y provider?18:43
*** weifan has quit IRC18:43
clarkbthere isnt18:44
clarkbwe can do provider specific flavors but generally avoid that to handle cloud outages18:44
*** artom has joined #openstack-infra18:47
artomclarkb, hey, continuing the conversation with donnyd, I'm the one who's driving for multi-NUMA-node flavors, in order to eventually test Nova NUMA-y in the gate (as opposed to 3rd party CI)18:48
artomNUMA integration is a well known gap in Nova18:48
artomHopefully the use case for provider-specific flavors makes more sense :)18:49
Shrewswow, that's the  2nd request for that feature today18:49
clarkbartom: so the problem is openstack has a long standing policy of only gating on things if more than one cloud can provide the test resources. This is because we have a long history of clouds disappearing18:49
artomShrews, oh yeah, who was the first? 'cuz this was talked about at Denver, so it's not new :)18:50
clarkbartom: If we can have more than one cloud provide the required functionality then great, but that won't be a provider-specific label; that will be the can-run-numa label18:50
fungiartom: specifically second case today where someone requested nodepool making decisions based on an intersection of node labels18:50
artomclarkb, ah, yeah, that makes sense18:50
artomclarkb, I know vexxhost have recently-ish turned on nested virt (needed for this stuff)18:51
clarkbwe can also reevaluate the policy, however as mentioned we have a long history of that policy being extremely worthwhile so I'm not sure I would vote to change it18:51
*** jamesmcarthur has quit IRC18:51
artomHopefully multi-NUMA-node flavors are coming18:51
artomclarkb, yeah, I see where you're coming from. We start gating on things, and then the cloud disappears and we're scrambling to "ungate"18:52
*** jamesmcarthur has joined #openstack-infra18:52
funginot blocking changes on the availability of a node type available from only one provider saves the project from having to disable jobs so they can merge code any time that provider is offline18:52
artomThough in this case, it would start as an experimental job triggered manually, so that might alleviate those concerns a bit18:52
*** jamesmcarthur_ has joined #openstack-infra18:53
fungiit's possible we could name the relevant node labels with a scary enough prefix that projects won't inadvertently add them to gate jobs?18:53
Shrewsfungi: use-me-and-your-project-is-deleted-ubuntu ?18:54
artom;_:18:54
*** jamesmcarthur_ has quit IRC18:54
*** mattw4 has quit IRC18:55
clarkbya if we want to start by proving it is feasible (as something we can take to other clouds potentially) I think we can set up a label for that18:55
clarkbthen avoid gating on that (and remember in openstack clean check means check is effectively a gate too so can't be check either)18:56
clarkbexperimental jobs would be fine18:56
fungior non-voting check jobs18:56
fungior jobs in the "silent" pipeline (if we still have that)18:56
*** jamesmcarthur has quit IRC18:57
clarkbthere exists an example in our nodepool configs with the vexxhost gpu labels fwiw18:57
*** jamesmcarthur has joined #openstack-infra18:58
clarkbhttps://opendev.org/openstack/project-config/src/branch/master/nodepool/nl03.openstack.org.yaml#L48-L53 and https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl03.openstack.org.yaml#L234-L251 if you want to try setting up something similar with donnyd18:58
*** eernst has joined #openstack-infra18:59
*** jamesmcarthur_ has joined #openstack-infra18:59
clarkbalso infra meeting time in a minute or two in #openstack-meeting18:59
*** eernst has quit IRC19:00
*** jamesmcarthur has quit IRC19:03
*** jamesmcarthur_ has quit IRC19:04
*** jamesmcarthur has joined #openstack-infra19:05
*** weifan has joined #openstack-infra19:05
*** jamesmcarthur_ has joined #openstack-infra19:06
*** tdasilva has quit IRC19:06
*** eharney has quit IRC19:08
*** jamesmcarthur has quit IRC19:10
*** weifan has quit IRC19:10
*** gfidente is now known as gfidente|afk19:10
*** diablo_rojo has quit IRC19:13
*** iurygregory has quit IRC19:16
*** jamesmcarthur_ has quit IRC19:25
*** jamesmcarthur has joined #openstack-infra19:28
*** jamesmcarthur_ has joined #openstack-infra19:29
openstackgerritMerged opendev/system-config master: Translate gitea project creation to python  https://review.opendev.org/67006019:31
*** jamesmcarthur_ has quit IRC19:32
mnaserwere there any issues in fortnebula-regionone by any chance?19:33
mnaser2019-07-16 14:05:14.31260219:33
mnaserhttp://logs.openstack.org/01/670601/2/gate/openstack-ansible-deploy-aio_metal-debian-stable/1c68e66/job-output.txt.gz -- Failed to connect to opendev.org at port 443: [Errno 0] Error19:33
*** jamesmcarthur_ has joined #openstack-infra19:33
*** jamesmcarthur has quit IRC19:33
*** jamesmcarthur has joined #openstack-infra19:34
clarkbmnaser: I'm not aware of any issues recently (and the most recent issues were job timeouts due to io contention) fn has really good network bw so maybe something related to being an ipv6 only cloud?19:35
mnaseryeah i was thinking that might be a possibility.. but we'd notice it more often if that was the case..19:36
clarkbfn is our only ipv6 only cloud currently19:36
clarkbso if you don't get scheduled there often may not pop up? but ya I think we may need more debugging info19:37
*** jamesmcarthur_ has quit IRC19:37
fungiyeah, until limestone is back in the mix anyway19:39
mnaseri mean19:41
mnaserwouldn't it fail in PRE if it cant traceroute to opendev?19:42
mnaserhttp://logs.openstack.org/01/670601/2/gate/openstack-ansible-deploy-aio_metal-debian-stable/1c68e66/zuul-info/zuul-info.debian-stretch.txt19:42
clarkbnot if the job breaks networking19:43
*** weifan has joined #openstack-infra19:44
clarkbwe can also ask donnyd to look at it from the cloud side19:44
donnydsure, what do you want to know19:44
clarkbdonnyd: there was network connectivity error from test VM to opendev.org at http://logs.openstack.org/01/670601/2/gate/openstack-ansible-deploy-aio_metal-debian-stable/1c68e66/job-output.txt.gz#_2019-07-16_14_18_16_891952 that timestamp is utc19:45
clarkb8924a87c8b8cdccd2b2123c7736c34321baf3e23bada68cdbde7887e was the hypervisor host id for that job19:45
donnyddoes opendev have v6 records?19:46
clarkbdonnyd: yes19:46
*** ijw has quit IRC19:47
donnydmnaser: I can take a look19:47
corvusclarkb: i can't find where we install vhd-util on builders19:49
*** weifan has quit IRC19:49
donnydI am still setting up full logging to capture things like this, but it would appear to me that it was pulling things from opendev.org to a certain point in the job, so not sure if it's networking infra side related or not19:50
donnydclarkb: is there a way to sideload a job. Something like, i think job x isn't running correctly on cloud y19:51
beisnerhey all, just wondering if there is a meeting that we need to discuss a project-config change, or if that just happens organically?  https://review.opendev.org/#/c/668681/19:51
clarkbbeisner: usually asking here is sufficient, particularly if AJaeger has already +2'd it19:52
clarkbdonnyd: not easily no. We can manually boot an instance and run things on it19:53
clarkbcorvus: looking19:53
*** jamesmcarthur has quit IRC19:53
*** jamesmcarthur has joined #openstack-infra19:53
donnydlooking at the logs  ( node_provider:"fortnebula-regionone" AND message:"Failed to connect" ) I don't see any other jobs with that issue19:54
clarkbcorvus: opendev/puppet-diskimage_builder/manifests/init.pp19:54
*** jamesmcarthur_ has joined #openstack-infra19:54
clarkbcorvus: we set the support_vhd flag to true19:54
*** petevg has joined #openstack-infra19:55
corvusoh i didn't check that one19:55
clarkbcorvus: beisner's https://review.opendev.org/#/c/668681/ change would be a good one to double test the gitea project creation changes when they are all in19:55
corvuswe'll have to remember that when we containerize19:55
clarkbI'd approve it but figure you are paying attention to that so might be better if you review and approve it instead19:56
beisnercool, thx peeps19:56
*** ijw has joined #openstack-infra19:57
*** jamesmcarthur has quit IRC19:58
fungii don't suppose anybody knows how to access a named ref via the gitea webui?20:00
clarkbfungi: like a tag?20:00
*** eernst_ has joined #openstack-infra20:01
*** mattw4 has joined #openstack-infra20:04
openstackgerritMerged opendev/system-config master: Add some logging to repo creation  https://review.opendev.org/67031720:04
*** eharney has joined #openstack-infra20:06
corvusi suspect branches and tags may be the only refs you can browse in the gui20:07
corvusearthquake!20:11
*** weifan has joined #openstack-infra20:15
corvusthere it is: https://earthquake.usgs.gov/earthquakes/eventpage/nc73225421/executive20:16
fungiclarkb: like refs/changes/41/671141/120:17
fungii want to persuade gitea to return raw content of a file with that ref (our election tooling used to do that with cgit, which allowed you to pass it in the h= parameter)20:18
fungier, a file at that ref20:18
fungifiddled with trying to pretend it was a branch, or a tag, or a commit20:21
*** kjackal has joined #openstack-infra20:23
*** hamzy has quit IRC20:30
*** bobh has quit IRC20:33
*** ralonsoh has quit IRC20:35
*** diablo_rojo has joined #openstack-infra20:37
*** rfolco is now known as rfolco_l8r20:39
*** jamesmcarthur_ has quit IRC20:45
*** sgw has quit IRC20:45
openstackgerritMerged opendev/system-config master: Run actual full project creation in gitea test  https://review.opendev.org/67031320:51
openstackgerritMerged opendev/system-config master: Improve idempotency of gitea-git-repos  https://review.opendev.org/67091920:51
*** eernst_ has quit IRC21:01
*** pcaruana has quit IRC21:01
*** dpawlik has quit IRC21:15
*** kjackal has quit IRC21:16
mordredcorvus: woot! project creation landed!21:22
openstackgerritJames E. Blair proposed opendev/system-config master: Provide better module return info from gitea create repos  https://review.opendev.org/67115921:27
openstackgerritJames E. Blair proposed opendev/system-config master: Parallelize repo creation by org  https://review.opendev.org/67116021:27
corvusmordred, clarkb, fungi: ^ that's a little more complex, but i believe that is the absolute best we can do right now with the constraints around repo creation in gitea21:27
*** kjackal has joined #openstack-infra21:27
*** joeguo has joined #openstack-infra21:27
mordredcorvus: looking - also - I got 1 approval already on the gitea PR21:28
corvusbasically, i'm pretty sure that's as fast as we can create repos, and the settings are interleaved within that such that they take no extra time21:28
fungiright on (to both of you)21:28
corvusmordred: depending on how much you want to flex database muscles, we could root-cause the thing i noted in there about the settings table.  or we could just say "databases are weird, <shrug>" :)21:29
corvus(the query that gets hung is "delete from repo_unit where ...", and i'm assuming that just does something extra with locking if there are 2 transactions doing that on an empty table)21:29
*** jtomasek has quit IRC21:29
*** gfidente|afk has quit IRC21:29
corvusoh, i forgot to give a time.  that's 7.5m on my desktop21:30
mordreduhg ... databases are weird shrug - although I almost want to dig in21:30
corvusso might be a bit closer to 10m under zuul21:30
corvusmordred: if you wanted to dig in, i'm guessing the next step would be a "show me the locks plz" command which i forgot?21:31
fungisometimes weird things are what make you want to dig in21:34
*** eernst has joined #openstack-infra21:35
*** eernst has quit IRC21:35
*** sgw has joined #openstack-infra21:38
*** armax has quit IRC21:43
*** kjackal has quit IRC21:46
*** eernst has joined #openstack-infra21:51
openstackgerritJames E. Blair proposed opendev/system-config master: Add gerrit to gitea job  https://review.opendev.org/67116221:54
*** pkopec has quit IRC21:54
corvusi'm pretty sure the step after that is to add a post pipeline job21:55
*** eernst has quit IRC21:55
mordredcorvus: we're dangerously close to having a really awesome thing here21:55
*** xek has quit IRC22:02
*** diablo_rojo has quit IRC22:13
corvusclarkb, fungi, mordred: can you look at my comments on https://review.opendev.org/651390 ?22:15
corvusi think everything there is straightforward except the second comment on .zuul.yaml line 648.22:15
corvusi'm not sure how we should proceed there (it turns out that the nameserver scenario may not be as simple as it first seems)22:16
*** weifan has quit IRC22:17
*** weifan has joined #openstack-infra22:17
*** weifan has quit IRC22:18
*** weifan has joined #openstack-infra22:18
*** weifan has quit IRC22:19
*** weifan has joined #openstack-infra22:19
corvusone option would be to say that zone repos need to be under entirely opendev root control (eg drop zuul-maint from zuul-ci.org)22:19
*** weifan has quit IRC22:19
*** weifan has joined #openstack-infra22:20
corvusanother would be to ask others to please not pwn the system; however, i think the issue there is less with trust (i trust the people) than just having too much access (zuul-maint shouldn't have to worry about being a vector for compromising bridge)22:20
*** weifan has quit IRC22:20
*** weifan has joined #openstack-infra22:20
*** weifan has quit IRC22:21
*** weifan has joined #openstack-infra22:21
*** weifan has quit IRC22:22
corvusanother would be to rethink how that job is run (eg, use a secret rather than an ssh key and use project-config to assign it to the zone repo's pipeline)22:22
*** weifan has joined #openstack-infra22:22
*** weifan has quit IRC22:22
*** weifan has joined #openstack-infra22:23
corvusanother would be to come up with some way to tell zuul to let the zonefile repo borrow the ssh key of system-config just for that one job22:23
*** weifan has quit IRC22:23
*** weifan has joined #openstack-infra22:23
*** weifan has quit IRC22:24
*** weifan has joined #openstack-infra22:24
mordredcorvus: putting zone files under opendev resonates more with me, and I think the zuul project choosing to host itself with opendev carries with it an existing assumption that the opendev admins aren't going to misuse their position of trust and put a hostname in zuul-ci.org domain that shouldn't be there22:25
*** weifan has quit IRC22:25
mordredcorvus: past that, most of the data in that file is actually opendev operational data22:25
corvusthat last one sounds hard, but it may not be -- i think we could say if a job had final:true and allowed-projects, then all of the allowed-projects get to borrow the ssh key of the defining repo.22:25
mordredso I don't know that, as a zuul-maint, I'd have a context for approving or not approving a patch to the repo vs. as an opendev-core22:25
*** betherly has joined #openstack-infra22:26
mordredsome of those words may be less relevant than others - those were just my first thoughts22:26
corvusmordred: well, "blog.zuul-ci.org CNAME wordpress.com" might be the sort of thing opendev doesn't care about, but that seems minor; i generally agree.22:26
mordredyou said wordpress22:27
corvusi said blog22:27
corvusoption 1 does have the advantage of being something we can do immediately22:27
mordredI think it also describes the actual current state more correctly22:28
mordredBUT - I could be swayed into supporting different positions - this isn't strongly or deeply held conviction or anything22:29
corvussure, i'd just like to have the option of a system where semi-overlapping groups can cooperate; we have that now (we trust zuul-maint and opendev to make changes to the zone)22:30
corvusit sort of seems a shame to drop that because of a technical issue that isn't actually related to dns22:30
*** betherly has quit IRC22:31
corvusi'd love for the openstack project to have the option to use the system; and they have lots of hostnames that aren't opendev22:31
*** whoami-rajat has quit IRC22:31
corvusi think options #1 and #4 are the most actionable; #2 is right out, #3 is meh -- i'd like to be able to use the ssh key system, so i'd rather make it better than give up on it.22:34
corvusso maybe we go with #1, and i backlog implementing #4 so we can use it if we want later.22:34
corvusmordred: ^?22:34
mordredcorvus: ++22:36
*** rcernin has joined #openstack-infra22:36
corvus#4 would actually let us drop the project-config key too; it's kinda growing on me22:38
fungicommented, but i also think that projects hosting their domains with opendev ought to be able to take advantage of our familiarity with bind and dns to okay their zonefile changes (even if we have jobs which at least catch outright breakage). it does turn us into a bottleneck for those though, and i agree it would be nice if teams were able to self-approve emergency dns changes for resources we're not hosting in cases where none of us are around to review22:39
*** ijw has quit IRC22:39
*** ijw has joined #openstack-infra22:43
corvusi left a followup too.22:43
corvusclarkb: ^ if you agree with all that, i think we've got next steps on that one.22:43
* mordred dinners22:45
corvusregarding adding gerrit to the gitea job... i'm wondering if maybe it's not necessary...22:45
corvusright now, we are running the gerrit/gitea playbook without the gerrit host, and that's fine because the gerrit plays just aren't running since there are no matching hosts22:46
corvusso we could make a second job just for gerrit, since there isn't really any interaction there  (unless we wanted to go so far as test replication from our fake gerrit to our fake gitea)22:47
corvuswe would trust that as long as the two jobs worked separately, a single job which ran the playbook with all the hosts would work22:47
corvusi think either approach would be worth trying (one combined test job or two split test jobs); i'm not sure which would be better22:48
openstackgerritJames E. Blair proposed opendev/system-config master: Add gerrit to gitea job  https://review.opendev.org/67116222:48
*** tkajinam has joined #openstack-infra22:53
*** slaweq has quit IRC22:55
*** betherly has joined #openstack-infra22:56
*** armax has joined #openstack-infra22:59
*** ekultails has quit IRC23:00
clarkbcorvus: fungi couldn't we use a secret in a job then let repos run that job?23:01
*** betherly has quit IRC23:01
clarkbthen they don't have access to change the secret or the job but do have access to trigger things to update23:01
*** weifan has joined #openstack-infra23:02
fungiclarkb: yeah, that was one of the options corvus mentioned above23:05
fungi"another would be to rethink how that job is run (eg, use a secret rather than an ssh key and use project-config to assign it to the zone repo's pipeline)"23:06
corvusand "#3 is meh -- i'd like to be able to use the ssh key system, so i'd rather make it better than give up on it."  is my current feeling on that23:07
*** weifan has quit IRC23:07
*** aaronsheffield has quit IRC23:07
*** goldyfruit has quit IRC23:08
clarkbin that situation opendev controls both the key and job and basically lets another repo trigger it?23:09
fungiyeah23:10
*** weifan has joined #openstack-infra23:14
clarkbya that would work23:16
*** diablo_rojo has joined #openstack-infra23:23
*** hamzy has joined #openstack-infra23:27
*** betherly has joined #openstack-infra23:29
*** betherly has quit IRC23:34
*** tobiash has quit IRC23:51
*** tobiash has joined #openstack-infra23:52
*** goldyfruit has joined #openstack-infra23:53
*** dchen has joined #openstack-infra23:56
