Monday, 2019-02-04

*** dtantsur|afk has quit IRC00:16
*** dsneddon__ has joined #oooq00:23
*** dsneddon__ has quit IRC00:36
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (3 more messages)00:50
*** dsneddon__ has joined #oooq01:08
*** dsneddon__ has quit IRC01:21
*** dsneddon__ has joined #oooq01:52
*** dsneddon__ has quit IRC02:05
*** dsneddon__ has joined #oooq02:33
*** dsneddon__ has quit IRC02:46
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (3 more messages)02:50
*** holser_ has joined #oooq03:06
*** holser_ has quit IRC03:13
*** dsneddon__ has joined #oooq03:19
*** dsneddon__ has quit IRC03:32
*** ykarel|away has joined #oooq03:34
*** ykarel|away has quit IRC03:36
*** ykarel|away has joined #oooq03:37
*** dsneddon__ has joined #oooq03:38
*** gkadam has joined #oooq03:48
*** dsneddon__ has quit IRC03:51
*** skramaja has joined #oooq03:52
*** udesale has joined #oooq04:07
*** dsneddon__ has joined #oooq04:22
*** dsneddon__ has quit IRC04:35
*** ratailor has joined #oooq04:41
*** ykarel|away is now known as ykarel04:50
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (3 more messages)04:50
*** ykarel has quit IRC04:54
*** ratailor has quit IRC04:58
*** ratailor has joined #oooq04:59
*** dsneddon__ has joined #oooq05:04
*** ykarel has joined #oooq05:10
*** dsneddon__ has quit IRC05:17
chkumar|ruckquique|rover|off: \o/05:33
*** dsneddon__ has joined #oooq05:44
*** dsneddon__ has quit IRC05:57
*** dsneddon__ has joined #oooq06:15
*** quique|rover|off is now known as quiquell|rover06:49
quiquell|roverchkumar|ruck: o/06:49
quiquell|roverchkumar|ruck: looks like tebroker is down06:49
quiquell|roverchkumar|ruck: all the centos nodes at RDO have issues06:49
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (3 more messages)06:50
*** matbu has joined #oooq06:57
chkumar|ruckquiquell|rover: can we restart it?06:58
quiquell|roverwill check in a few06:59
*** jfrancoa has joined #oooq07:05
quiquell|roverzbr|ssbarnea: I want to see the issues at libvirt before refactoring it, that's why I -1'd your review07:07
chkumar|ruckbrb going for lunch07:11
*** saneax has joined #oooq07:11
*** dsneddon__ has quit IRC07:19
*** sshnaidm|off is now known as sshnaidm07:30
sshnaidmchkumar|ruck, quiquell|rover zuul queue is 64 hours in rdo cloud :D07:33
sshnaidmit's down, right?07:34
*** dtrainor has joined #oooq07:35
*** dtrainor_ has joined #oooq07:36
*** dtrainor has quit IRC07:39
mariosquiquell|rover: you guys seen this http://logs.openstack.org/67/631067/45/check/tripleo-ci-centos-7-containers-multinode/600f57c/job-output.txt.gz#_2019-02-03_18_48_54_21416707:40
marioschkumar|ruck: ^07:40
mariosoh maybe is just on that patch  ( from https://review.openstack.org/#/c/631067/)07:41
mariosquiquell|rover: chkumar|ruck ^ all the jobs fail there likely from the patch07:41
mariossorry for alert07:41
quiquell|rovermarios: rdo centos nodes are screwed07:42
*** ratailor_ has joined #oooq07:42
*** ratailor has quit IRC07:44
*** dtrainor_ has quit IRC07:47
*** dtrainor has joined #oooq07:48
*** ykarel is now known as ykarel|lunch07:51
*** dsneddon__ has joined #oooq07:54
*** quiquell|rover is now known as quique|rover|brb07:57
chkumar|ruckmarios: tristin is working on that08:01
*** kopecmartin|off is now known as kopecmartin08:02
marioschkumar|ruck: which one rdo nodes you mean ? thanks quique|rover|brb :/ let me know if you need votes on fixes etc08:03
chkumar|ruckmarios: yes on rdo nodes, there is some problem with zookeeper08:04
marioschkumar|ruck: thanks08:04
chkumar|rucksshnaidm: Hello08:06
chkumar|rucksshnaidm: I am working on script to compare fs20/fs21 tempest tests results where should I keep it?08:06
*** dsneddon__ has quit IRC08:07
sshnaidmchkumar|ruck, what is purpose? To compare manually or to run it automatically?08:11
chkumar|rucksshnaidm: wes asked to have a script to compare the results so that we can clean up the skip list periodically08:12
chkumar|ruckI think whether the comparison runs manually or automatically, I have to grab the subunit files from fs021 and fs020 on each run08:12
chkumar|rucklet me finish the script and send it to the rdo-infra/ci-config08:13
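A minimal sketch of the kind of comparison discussed above, assuming the subunit streams from fs020 and fs021 have already been converted into plain mappings of test id to status. The function name, the status strings, and the sample data are hypothetical; parsing subunit itself is left out.

```python
def compare_results(fs020, fs021):
    """Map each test whose status differs to its (fs020, fs021) statuses.

    A test missing from one run is reported with None for that run.
    """
    differences = {}
    for test in sorted(set(fs020) | set(fs021)):
        left, right = fs020.get(test), fs021.get(test)
        if left != right:
            differences[test] = (left, right)
    return differences


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real job.
    fs020 = {"test_a": "success", "test_b": "skip"}
    fs021 = {"test_a": "success", "test_b": "success", "test_c": "fail"}
    for test, statuses in compare_results(fs020, fs021).items():
        print(test, statuses)
```

Tests that are skipped in one featureset but succeed in the other would be natural candidates to review when cleaning up the skip list.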
*** quique|rover|brb is now known as quiquell|rover08:16
*** dsneddon__ has joined #oooq08:35
*** dtrainor has quit IRC08:37
chkumar|ruckmarios: Hello08:38
chkumar|ruckmarios: We do not run standalone scenario jobs against TQE?08:38
chkumar|ruckmarios: I was trying to unskip telemetry tempest plugins tests08:38
chkumar|ruckmarios: https://review.openstack.org/#/c/604311/08:38
*** d0ugal has quit IRC08:39
marioschkumar|ruck: we do afaik /me checks08:39
marioschkumar|ruck: https://github.com/openstack-infra/tripleo-ci/blob/8776f84b0377cbc31c98d9548ec9719b4d17b09e/zuul.d/standalone-jobs.yaml#L12008:39
marioschkumar|ruck: we restrict it to the role though08:40
marioschkumar|ruck: so if you want it somewhere else propose a change in ^08:40
*** ykarel|lunch is now known as ykarel08:40
chkumar|ruckmarios: let me do that08:40
marioschkumar|ruck: ack yeah i just checked your change if you want it on the validate tempest role we need to add to files for the jobs you want08:41
chkumar|ruckmarios: yes on it08:41
*** tosky has joined #oooq08:42
*** ccamacho has joined #oooq08:43
*** jpena|off is now known as jpena08:45
*** rascasoft has joined #oooq08:47
*** dsneddon__ has quit IRC08:48
*** dtrainor has joined #oooq08:50
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (3 more messages)08:50
*** d0ugal has joined #oooq08:52
*** jtomasek has joined #oooq08:56
marioszbr|ssbarnea: wut is zbr please08:59
*** bogdando has joined #oooq09:01
zbr|ssbarneamarios: me picking a new irc nick in order to avoid clashing with sshnaidm :p -- i will remove the suffix after a while.09:03
marioszbr|ssbarnea: ah ok. well fwiw, it would be indeed better to make it a suffix, rather than the prefix it currently is. like i would go looking for ssba<tab> and it would find you09:04
mariosnow it does not09:04
marioszbr|ssbarnea: oh i see.. sshn clash09:05
marioszbr|ssbarnea: but really?09:05
mariosdoes it?09:05
zbr|ssbarneamarios: i didn't want to do it initially but the increase in mis-tagging convinced me. i think it is good to avoid overlapping of the first two characters.09:06
marioszbr|ssbarnea: sorisbarn has a certain ring to it09:12
quiquell|roversshnaidm: o/ what do we do with https://review.openstack.org/#/c/633444/ ?09:14
*** apetrich has joined #oooq09:19
sshnaidmquiquell|rover, libvirt job fails there, but jobs here pass: https://review.rdoproject.org/r/#/c/18558/09:21
*** dsneddon__ has joined #oooq09:21
sshnaidmquiquell|rover, I'm fine to merge it, need weshay to remove his -109:21
quiquell|roversshnaidm: but passing jobs there end up at a good hypervisor09:25
quiquell|roversshnaidm: I am positive that when we forced it it was working09:25
quiquell|roversshnaidm: but it's good if we at least have a log with the latest and bad hypervisor and success09:25
sshnaidmquiquell|rover, log of what?09:25
quiquell|roversshnaidm: for example this centos7  one is passing http://logs.rdoproject.org/58/18558/33/check/tripleo-ci-reproducer-centos-7-libvirt/fb7d554/tripleo-ci-reproducer/libguestf-env.sh09:26
quiquell|roversshnaidm: but there is no force_tcg because it was not needed09:26
sshnaidmquiquell|rover, I see09:27
quiquell|roversshnaidm: weshay has a point, let's ensure that we have both success and force_tcg09:32
quiquell|roversshnaidm: I am sure the latest issue is not related but let's confirm09:32
*** dsneddon__ has quit IRC09:34
*** derekh has joined #oooq09:34
mariossshnaidm: do you know where the extend time script is on beaker boxes i can't see it in /root09:42
sshnaidmmarios, hmm.. I don't find it too, I think it was somewhere in /opt09:48
chkumar|ruckquiquell|rover: I am taking a look at fedora 28 failure https://bugs.launchpad.net/tripleo/+bug/181451609:52
openstackLaunchpad bug 1814516 in tripleo "[Fedora 28] [standalone]'NoneType' object is not iterable while generating template" [Critical,Triaged]09:52
chkumar|ruckit is unrelated to my changes added in the gates, I am checking why it occurs09:52
quiquell|roverchkumar|ruck: let me check09:53
quiquell|roverchkumar|ruck: Maybe this is just missing a check for network_data09:54
quiquell|roverchkumar|ruck: and should fail or not iterate if None09:54
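The guard suggested above could look roughly like this. It is a sketch only: the function name and the strict flag are hypothetical, and this is not the actual template-generation code from the bug.

```python
def networks_or_empty(network_data, strict=False):
    """Return a list of networks, handling a missing (None) value.

    With strict=False a None value is treated as "no networks", so callers
    can iterate safely. With strict=True a clear error is raised instead of
    the opaque "'NoneType' object is not iterable".
    """
    if network_data is None:
        if strict:
            raise ValueError("network_data is not set")
        return []
    return list(network_data)


if __name__ == "__main__":
    for network in networks_or_empty(None):
        print(network)  # never reached: nothing to iterate
```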
sshnaidmmarios, maybe they don't exist anymore09:54
*** gkadam_ has joined #oooq09:55
*** gkadam has quit IRC09:58
*** chkumar|ruck has quit IRC09:58
*** holser_ has joined #oooq10:00
*** gkadam__ has joined #oooq10:02
*** dsneddon__ has joined #oooq10:04
*** gkadam_ has quit IRC10:05
*** chem has joined #oooq10:05
*** sanjayu_ has joined #oooq10:09
*** saneax has quit IRC10:10
*** dsneddon__ has quit IRC10:17
*** chandan_kumar has joined #oooq10:24
*** chandan_kumar is now known as chkumar|ruck10:24
*** dtrainor has quit IRC10:46
*** dsneddon__ has joined #oooq10:46
*** holser_ has quit IRC10:46
*** dtrainor has joined #oooq10:48
*** dtrainor_ has joined #oooq10:49
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (3 more messages)10:50
*** dtrainor has quit IRC10:52
*** chem has quit IRC10:53
*** holser_ has joined #oooq10:53
arxcruzchkumar|ruck: what do you mean need that to enable telemetry tests?10:53
chkumar|ruckarxcruz: sorry, commented10:57
*** dsneddon__ has quit IRC10:59
arxcruzchkumar|ruck: so, you're saying that on validate-tempest change you don't get all jobs running ?10:59
chkumar|ruckarxcruz: yes10:59
chkumar|ruckarxcruz: https://review.openstack.org/#/c/634644/2/zuul.d/standalone-jobs.yaml -> only it was running on roles/standlone11:00
quiquell|roversshnaidm: I don't see http://38.145.33.166/influxdb_stats11:14
quiquell|roversshnaidm: is tebroker down for good ?11:14
*** udesale has quit IRC11:14
quiquell|roversshnaidm: or we still have it to count stacks ?11:14
sshnaidmquiquell|rover, I think everything is screwed on rdo cloud11:16
chkumar|ruckquiquell|rover: sshnaidm mhu and cshatri are working on cleaning up the nodes11:16
chkumar|ruckit is more than 10000, I think11:17
quiquell|roverack11:17
sshnaidmquiquell|rover, try again  http://38.145.33.166/influxdb_stats11:17
quiquell|roverhuff it will take time11:17
sshnaidmquiquell|rover, works for me11:17
chkumar|ruckykarel: marios http://logs.openstack.org/11/604311/8/check/tripleo-ci-centos-7-scenario001-standalone/ed6f18f/logs/tempest.html telemetry issue still not fixed11:18
sshnaidmwow11:18
ykarelchkumar|ruck, yup that's what i anticipated, as i have seen this issue in poi from time to time11:18
sshnaidmchkumar|ruck, I think it's 1000 nodes, not 10 00011:18
chkumar|ruckoh sorry got extra 011:18
sshnaidmquiquell|rover, you can see ports_down=25 there, need to add a graph with ports also.11:19
sshnaidmACTIVE=0,BUILD=2,ERROR=998,DELETED=011:19
*** ccamacho has quit IRC11:20
quiquell|roversshnaidm: I think this is breaking grafana11:20
quiquell|roversshnaidm: yep11:24
sshnaidmquiquell|rover, how?11:24
quiquell|roversshnaidm: try to run telegraf --test --config telegraf.d/rdocloud.conf11:24
sshnaidmquiquell|rover, I'm not on the server..11:25
quiquell|roversshnaidm: best way is to do in vim a "goto" to go to the char11:25
quiquell|roversshnaidm: don't need the server11:25
quiquell|roversshnaidm: at a clone just run the command, it is in test mode11:25
sshnaidmquiquell|rover, can you paste the output?11:25
sshnaidmI don't have it installed11:26
quiquell|roversshnaidm: found the issue let me fix11:27
quiquell|roversshnaidm: total=1000,11:27
quiquell|roverextra comma11:27
quiquell|rovermaybe we can run the telegraf --test11:27
quiquell|roverat reviews11:27
quiquell|roverbut depends on input data so it's not reliable11:28
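A small sketch of how the field list could be built so that a trailing comma like the one in "total=1000," cannot appear. This assumes the stats are emitted as comma-separated key=value fields after a measurement name; the function names and the measurement name are hypothetical, and this is not the actual rdocloud.py code.

```python
def format_fields(fields):
    """Join key=value pairs with commas, leaving no trailing comma."""
    return ",".join("{}={}".format(key, value) for key, value in fields.items())


def format_line(measurement, fields):
    """Build one output line: the measurement name, a space, then the fields."""
    return "{} {}".format(measurement, format_fields(fields))


if __name__ == "__main__":
    # Counts taken from the chat above; the measurement name is made up.
    counts = {"ACTIVE": 0, "BUILD": 2, "ERROR": 998, "DELETED": 0, "total": 1000}
    print(format_line("rdocloud_servers", counts))
```

Joining the pairs in one step avoids appending a separator after each field, which is the usual way a stray comma ends up at the end of the line.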
*** dsneddon__ has joined #oooq11:28
quiquell|roversshnaidm: https://review.rdoproject.org/r/1869711:30
sshnaidmquiquell|rover, hmm.. but it worked anyhow11:32
panda|offarxcruz: https://review.openstack.org/634380 ready to be reviewed ?11:32
panda|offmarios: how's the design going ? I remember you posted some patch on friday but I don't know where they are ...11:33
*** dtrainor_ has quit IRC11:33
*** dtrainor has joined #oooq11:33
zbr|ssbarneaquiquell|rover: chkumar|ruck : i am unable to find the etherpad for the ruck/rover, please make a note that one hour before community meeting is LP-bug bashing session with jaosorior -- one of you should attend.11:38
zbr|ssbarneais a private session to sort LP bugs (likely BJ)11:39
zbr|ssbarneait worked quite nice for the last sprint, and i think we should make it a common practice.11:39
arxcruzpanda|off: not ready for review yet, i'm still working on it11:40
quiquell|roverzbr|ssbarnea: https://review.rdoproject.org/etherpad/p/ruckrover-unisprint511:40
quiquell|roverzbr|ssbarnea: this is new stuff ?11:40
panda|offarxcruz: ok thanks11:40
*** dsneddon__ has quit IRC11:42
*** ratailor__ has joined #oooq11:42
*** ratailor_ has quit IRC11:45
panda|offTengu: ping about https://review.openstack.org/63463511:45
quiquell|roversshnaidm: Do we have to load the new rdocloud.py at nodepool ?11:46
quiquell|roversshnaidm: or is infra as code ?11:46
Tengupanda|off: ?11:46
panda|offTengu: !11:46
Tengudoes it create some issues?11:46
panda|offTengu: no, I'd like to understand what it does, have time to explain ?11:47
Tengupanda|off: ah, yes, of course. I thought the commit message and embeded comments were sufficient.11:47
panda|offTengu: it is for the what and why, a bit less for the how.11:48
Tengupanda|off: so basically, until now, we didn't get any chance to have failed container stdout logs, hence no way to understand what's going on11:48
Tengupanda|off: for instance, I was pointed to a failed upgrade CI task, and saw a couple of container in Exited (1) state - but we actually don't have any way to know WHY11:48
Tengu-> the collect.yaml code didn't loop on the stopped containers, so no log, no debug, nothing.11:49
Tengumy patch intends to provide all the stdout for all containers, running or not. We might want to filter that in order to only get Exited (non-zero) if it creates some issues regarding disk space.11:49
Tengupanda|off: does it make sense?11:50
sshnaidmquiquell|rover, should upload it, but worth to check anyway11:50
panda|offTengu: ok, 1 doubt and three concerns: the doubt is if this keeps track of all the execution of the containers in the past. The concerns are 1) space 2) time 3) you're logging on a dir named after the container id, how do you expect to find what you're looking for in the logs11:52
Tengupanda|off: nope, we're logging in a dirname with the container name11:52
Tengupanda|off: that will loop on all containers from the output of "$engine ps -a" - so yeah, that might be "big", but not that big either. Time might be more a concern, although the main concern I have is space (hence the -w for analysis)11:53
zbr|ssbarneasshnaidm: can you please have a look at https://review.rdoproject.org/r/#/c/18699/2 ? (ovb cleanup filter fix). thanks.11:54
*** dsneddon__ has joined #oooq11:55
panda|offTengu: mmhh, sorry for the dumb question, but if you always use the same name, two containers that have diff id because diff execution will have the same name and you'll overwrite the logs ? I don't have a clear view of how docker ps looks at the end of a deployment ..11:55
Tengupanda|off: can't get the same name for container11:56
Tenguit's refused by the engine itself.11:56
Tenguso that solves this concern ;).11:56
Tengupanda|off: so, basically, we shouldn't get that much time difference with this new thing. names are unique for containers, so we won't override anything. and we keep a clean, readable directory tree as we are using the container name instead of id. So the main issue is the space, although on that part I'm not that worried11:59
Tengupanda|off: most of the application/runtime logs aren't present in the container stdout.11:59
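The approach described above (loop over every container from "$engine ps -a", running or not, and store its stdout in a directory named after the container) can be sketched as follows. This is an illustration, not the collect.yml change itself: the function names, the stdout.log file name, and the injectable runner are assumptions made for the example.

```python
import subprocess
from pathlib import Path


def run_command(cmd):
    """Run a command and return its stdout and stderr as one string."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return result.stdout


def collect_container_logs(engine, dest, runner=run_command):
    """Save the logs of every container, stopped ones included.

    engine is "docker" or "podman". Container names are unique per engine,
    so one directory per name cannot collide.
    """
    names = runner([engine, "ps", "-a", "--format", "{{.Names}}"]).split()
    for name in names:
        target = Path(dest) / name
        target.mkdir(parents=True, exist_ok=True)
        (target / "stdout.log").write_text(runner([engine, "logs", name]))
    return names
```

Passing the runner as a parameter keeps the loop testable without a container engine installed.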
zbr|ssbarneapanda|off or rfolco can you also add me to the invite for the tripleo-ci-community meeting, I see only you two added.12:01
*** jfrancoa is now known as jfrancoa|lunch12:01
panda|offTengu: is there a change dependent on this that tests this in jobs ?12:02
panda|offTengu: It may be beneficial to see this in action12:03
Tengu panda|off: hmmm nope, but we should get the logs present directly in the CI for that precise job12:03
Tengupanda|off: for instance: http://logs.openstack.org/35/634635/3/check/tripleo-ci-centos-7-standalone-os-tempest/e0066de/logs/undercloud/var/log/extra/podman/containers/12:03
zbr|ssbarneait seems that weshay organized this one and I cannot add myself to it. probably he did not allow us to add ourselves, not sure.12:04
Tenguzbr|ssbarnea: btw: here's the result for my patch on quickstart-extra and the new logging: http://logs.openstack.org/35/634635/3/check/tripleo-ci-centos-7-standalone-os-tempest/e0066de/logs/undercloud/var/log/extra/podman/containers/  :)12:04
zbr|ssbarneaTengu: i think that is kinda obvious that having folders does not help with the UX here.12:05
panda|offwhy is my gertty not showing this job ? :/12:05
Tenguzbr|ssbarnea: what do you mean? you'd rather get /var/log/extra/podman/container_name_stdout directly?12:05
Tengupanda|off: CI still not over, I'm picking it directly from zuul12:06
Tenguzbr|ssbarnea: or /var/log/extra/podman/stdouts/container_name.txt.gz ?12:07
*** dsneddon__ has quit IRC12:08
zbr|ssbarneaTengu: yes, i would but i see that (very) few do also have other files.  wait for feedback from others before acting on my question. i am in favour of a flatter structure.12:08
Tenguzbr|ssbarnea: yeah, so the other files are taken from the running containers, in the loop just above the one I'm adding.12:08
sshnaidmzbr|ssbarnea, commented12:08
sshnaidmzbr|ssbarnea, can I ask you just to fix the string and not to change anything? is it really so difficult to do?12:09
Tenguzbr|ssbarnea: https://review.openstack.org/#/c/634635/3/roles/collect-logs/tasks/collect.yml@271  (not sure I made the correct link, sorry)12:09
Tenguthat block loops "only" over running containers - and it wasn't that easy to make it loop also on non-running due to the "exec" calls.12:10
panda|offTengu: ok,I misunderstood the whole point probably, I see what you're doing. I'll ping again after you remove the -W12:13
panda|offTengu: you also probably already noticed there's a bit of nesting there, logs inside logs with the same content.12:14
Tengupanda|off: cool :). and thank you for your questions ;). Do you think I should update the commit message maybe?12:14
panda|offTengu: no, probably more a lack of knowledge on container deployment on my part12:15
Tengupanda|off: ok :). the nesting is indeed a "thing" - but I will probably avoid digging further in it. Especially if we can do the move toward 'service logging to stdout' - the whole collect.yaml thing will be much more different.12:17
Tenguzbr|ssbarnea: soo about disk space, any bad news?12:18
sshnaidmpanda|off, why not to add blackregex of tempest in required featureset here? https://review.openstack.org/#/c/634144/112:19
zbr|ssbarneaTengu: seems 104M reported at http://logs.openstack.org/35/634635/3/check/tripleo-ci-centos-7-standalone-os-tempest/e0066de/logs/log-size.txt12:20
*** ccamacho has joined #oooq12:20
Tenguzbr|ssbarnea: is it a problem?12:21
Tengu4.5M /home/zuul/workspace/logs/undercloud/var/log/extra/podman/containers  seems "nice" enough?12:21
zbr|ssbarneaTengu: no. is quite small. we can fix it later if we spot a issue and we can add a tail to it.12:21
Tengucool12:22
TenguSo I can drop the -w then.12:22
*** rf0lc0 is now known as rfolco12:26
quiquell|roverchkumar|ruck: who goes to the ci escalation ?12:30
chkumar|ruckquiquell|rover: /em taking care of12:30
*** holser_ has quit IRC12:31
quiquell|roverchkumar|ruck: ack12:31
quiquell|roverchkumar|ruck: thanks mate12:31
chkumar|ruckquiquell|rover: just need to some update on manila tempest failure12:31
chkumar|ruck*need some12:32
panda|offsshnaidm: it's a rapidly moving situation, I think featureset30 is destined to scenario007, but last time I checked was not ready to be moved to ML2/OVS, so it would require coordination to put everything in place at the right time12:32
sshnaidmpanda|off, the question is whether this test runs in jobs w/o ovn12:33
quiquell|roverchkumar|ruck: We need new version of ceph, previous one didn't have the fix12:35
quiquell|roverchkumar|ruck: they are building new one at koji and then we will update containers12:36
quiquell|roverykarel: ^ is that it about ceph version ?12:36
*** ccamacho has quit IRC12:36
chkumar|ruckquiquell|rover: Did ceph 12.2.10 get built in order to fix manila tempest issue?12:37
quiquell|roverchkumar|ruck: yep but It didn't have the fix12:38
ykarelquiquell|rover, who is buildling?12:38
*** ccamacho has joined #oooq12:40
*** jpena is now known as jpena|lunch12:40
quiquell|roverykarel: alphacc12:40
ykarelquiquell|rover, ack , maybe fultonj knows about gfidente12:40
ykareland his plan for ceph12:40
quiquell|roverfultonj: ack, better to say to alphacc to wait for gfidente ?12:41
chkumar|ruckquiquell|rover: thanks, updated :-)12:41
chkumar|ruckarxcruz: Hello12:41
chkumar|ruckarxcruz: Can we do a new release of python-tempestconf?12:42
quiquell|roverpanda|off: damn... rdo is down no periodics for fedora :-/12:42
arxcruzchkumar|ruck: why?12:42
*** dsneddon__ has joined #oooq12:42
chkumar|ruckarxcruz: current release does not have py3 version12:42
chkumar|ruckarxcruz: it is needed by afazekas12:43
chkumar|ruckarxcruz: https://github.com/openstack/releases/blob/147d14e6c93275b8145e6e9bf5e6b42ef26153e0/deliverables/_independent/python-tempestconf.yaml needs to be updated12:43
arxcruzchkumar|ruck: ok12:46
quiquell|roversshnaidm: are you taking care of https://review.rdoproject.org/r/#/c/18621  ?12:46
sshnaidmquiquell|rover, yeah, I'll check it12:48
quiquell|roversshnaidm: now it dumps daemon.json but I don't see it; it also dumps docker system info but the registry is not there12:49
quiquell|rover:-/12:49
quiquell|roversshnaidm: But I see the ansible tasks that write that running12:49
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades, tripleo-ci-centos-7-undercloud-containers @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq- (2 more messages)12:50
*** ratailor__ has quit IRC12:51
*** ccamacho has quit IRC12:51
sshnaidmquiquell|rover, where do you see it running?12:51
quiquell|roversshnaidm: I see "Write modified docker daemon config" changed12:52
quiquell|roversshnaidm: at ansible execution12:52
*** rlandy has joined #oooq12:53
sshnaidmquiquell|rover, you create it in pre.yml, right?12:53
quiquell|roversshnaidm: daemon.json at run.yaml12:54
sshnaidmquiquell|rover, https://github.com/rdo-infra/ansible-role-tripleo-ci-reproducer/blob/master/playbooks/tripleo-ci-reproducer/pre.yaml#L8812:54
*** dsneddon__ has quit IRC12:55
quiquell|roversshnaidm: Ahh yep we create at pre and modify at run I mean12:56
*** jfrancoa|lunch is now known as jfrancoa12:56
*** trown|outtypewww is now known as trown12:58
quiquell|roversshnaidm: do we update toolbox with changes at ci-config ?12:58
sshnaidmquiquell|rover, should be, doesn't it work?12:58
sshnaidmquiquell|rover, if no, need to check why12:58
quiquell|roversshnaidm: ci-config clone has old version there13:00
weshayquiquell|rover, chkumar|ruck morning.. so something fixed the centos krb5 issue?13:00
quiquell|roverweshay: noop is ok now so looks like13:01
quiquell|roverchkumar|ruck: was looking into it13:01
weshayya.. zbr|ssbarnea k k.. so thinking of preventative action there w/ me is a ++13:01
quiquell|roverweshay: have mariadb stuff ready, have change it a little take a look at your patches13:02
quiquell|roverweshay: if you are ok I will workflow it13:02
weshayk .. /me looks13:02
chkumar|ruckweshay: the error vanished, it might have happened that both krb5 and libkadm5 landed at the same time13:02
chkumar|ruckweshay: asked for help from centos-devel, no help13:02
chkumar|ruckweshay: https://bugs.centos.org/view.php?id=1577513:02
weshaychkumar|ruck, anything posted to centos-devel?13:03
chkumar|ruckweshay: I mean on Irc13:03
quiquell|roverpanda|off: this is the only periodic f28 job I have to check periodic-tripleo-ci-fedora-28-standalone-master ?13:04
weshayquiquell|rover, ah thanks... I forgot I left it in that state.. thanks man13:05
weshaybrb13:05
quiquell|roverweshay: no problem I have ideas to propagate that to queue size and noops13:06
quiquell|roverweshay: they are one shot13:06
quiquell|roverweshay: no need for series db13:06
quiquell|roverpanda|off: they are all skipped13:08
*** ykarel is now known as ykarel|away13:08
quiquell|roverpanda|off: since periodic-tripleo-fedora-28-master-containers-build is failing13:08
quiquell|roverpanda|off: So we don't have a good periodic jobs for f2813:09
quiquell|roverpanda|off: I think I am going to do a manual promotion to unblock upstream f28 job13:09
weshayah cool13:09
quiquell|roverweshay: we need a f28 promotion13:10
quiquell|roverweshay: but f28 periodic jobs are screwed13:10
quiquell|roverweshay: what do we do ?13:10
weshayquiquell|rover, same thing we do everyday quique..  https://goo.gl/images/E83dZR13:12
weshayget it fixed :)13:12
* weshay looks13:12
weshayquiquell|rover, rdo cloud was down most of the weekend13:12
quiquell|roverweshay: I know, but I think they were kind of WIP13:12
quiquell|roverweshay: Or do we have to fix this http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-fedora-28-master-containers-build/1800043/logs/undercloud/home/zuul/undercloud_install.log.txt.gz  ?13:12
weshaylooks like zuul in rdo-cloud is still f'd13:12
weshaysee jobs at 64hrs chkumar|ruck <---13:13
quiquell|roveryep13:13
weshayya.. rdo is still f'd13:13
weshayquiquell|rover, since when is the promoter running on a container?13:14
weshayhow did I not know this ... lolz13:15
weshayquiquell|rover, can we move to done? https://trello.com/c/vV4WR3r2/868-cixlp1814059tripleociproa-full-disk-at-promoter13:15
quiquell|roverweshay: what ?13:16
weshaychkumar|ruck, https://trello.com/c/xUy82HzR/869-cixlp1813913tripleociproa-periodic-tripleo-fedora-28-master-containers-build-failing-with-unauthorized-access-to-rdo-registry13:16
quiquell|roverweshay: promoter in a container ?13:16
quiquell|roverweshay: No, this is the docker images we push13:17
weshayhttps://review.rdoproject.org/r/#/c/18654/4/ci-scripts/infra-setup/roles/promoter/tasks/main.yml13:17
weshayoh13:17
weshaychkumar|ruck, when you mark https://trello.com/c/xUy82HzR/869-cixlp1813913tripleociproa-periodic-tripleo-fedora-28-master-containers-build-failing-with-unauthorized-access-to-rdo-registry13:17
quiquell|roverweshay: panda|off suggested to move to overlay2 so we can do parallel promotion13:17
weshayDFG OOOCI that means the root cause is due to OOOCI13:18
quiquell|roverweshay: remember we went from parallel to sequential because of that13:18
chkumar|ruckweshay: we fixed the fedora 28 issue, currently we need a manual promotion to know whether that issue is fixed or not13:18
weshaychkumar|ruck, help me understand how ^ unauthorized access is tripleo-ci's root cause13:18
weshaychkumar|ruck, panda|off we're still manually promoting fedora28 jobs right?13:19
chkumar|ruckweshay: updated the root cause13:19
weshaychkumar|ruck, /me still see's OOOCI13:19
weshaywhat is the root cause?13:20
*** ccamacho has joined #oooq13:20
weshaycan you explain that?13:20
*** ccamacho has quit IRC13:20
*** ccamacho has joined #oooq13:20
weshaychkumar|ruck, quiquell|rover who is joining the cix call?13:21
quiquell|roverweshay: chkumar|ruck13:22
panda|offquiquell|rover: look at the fedora28 with centos8 containers periodic job13:22
* panda|off about to finish the lunch13:22
quiquell|roverpanda|off: ahh ok13:22
chkumar|ruckweshay: when tripleo is getting deployed, it tries to read a file and hits an unclosed file error, which I fixed; apart from that it is trying to access the registry, which gives an unauthorized error13:23
chkumar|ruckweshay: I will be joining the cix call13:23
weshaychkumar|ruck, k.. thanks13:23
chkumar|ruckweshay: what should I put as the root cause here https://trello.com/c/xUy82HzR/869-cixlp1813913tripleociproa-periodic-tripleo-fedora-28-master-containers-build-failing-with-unauthorized-access-to-rdo-registry ?13:23
*** ykarel|away has quit IRC13:23
*** dsneddon__ has joined #oooq13:24
weshaychkumar|ruck, is the fix in tripleo-ci/tripleo-quickstart/extras or ci-scripts?13:25
chkumar|ruckweshay: fix is in python-tripleoclient13:25
weshayis the fix in tripleo-common or python-tripleoclient13:25
weshayya.. so then it's DF13:26
weshaywe can fix it of course13:26
weshaybut DFG = DF13:26
quiquell|roverpanda|off: there is a pass with http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-centos-7-containers-standalone-master/374f5f6/logs/undercloud/etc/yum.repos.d/delorean.repo.txt.gz13:26
weshaychkumar|ruck, make sense?13:26
quiquell|roverpanda|off: going to promote manually13:26
quiquell|roverweshay: ^13:26
weshay+113:26
chkumar|ruckweshay: yes13:26
weshayquiquell|rover, chkumar|ruck f28 IS manual promotions only13:26
weshayfor now13:26
weshayunless I've missed something13:27
quiquell|roverweshay: yes but we have to look at periodic-tripleo-ci-fedora-28-centos-7-containers-standalone-master13:27
weshayok..13:27
quiquell|roverweshay: that's what I mean panda|off just point it out to me13:27
weshaywhen I went to bed last night.. both upstream and rdo were broken.. so13:27
weshayIMHO.. today is turning out pretty well13:27
weshayjust rdo is broken13:27
quiquell|roverPufff I need your spirit :-/13:27
quiquell|roversshnaidm: I am going to remove te-broker stuff from creds doc13:28
zbr|ssbarneaweshay: chkumar|ruck got reply from centos team: they say it is our fault we got into this issue due to out of sync mirrors13:29
zbr|ssbarneait seems that this can happen when base and updates get out of sync. they told me that many people forget to sync both  and if only updates gets refreshed this can happen.13:30
zbr|ssbarneanow I am not sure which mirrors were used, but we clearly need to dig it and assure we avoid it in the future.13:30
*** chem has joined #oooq13:30
*** jpena|lunch is now known as jpena13:31
weshaychkumar|ruck, u hear me?13:31
*** ccamacho has quit IRC13:33
quiquell|roverchkumar|ruck: promoted f28, will re-run my reproducer13:35
*** ccamacho has joined #oooq13:36
*** dsneddon__ has quit IRC13:37
*** quiquell|rover is now known as quique|rover|lun13:38
*** quique|rover|lun is now known as quique|rover|lch13:38
sshnaidmquique|rover|lch, just |eat13:39
*** panda|off is now known as panda13:40
quique|rover|lchsshnaidm: silly me13:40
*** quique|rover|lch is now known as quique|rover|eat13:40
pandaquique|rover|lch: try also deliveroo13:40
*** ykarel has joined #oooq13:41
zbr|ssbarneapanda: deliveroooq13:41
chkumar|ruckquique|rover|eat: cool, thanks!13:43
weshayarxcruz, can you help get bz on this https://trello.com/c/DfQAjJrJ/819-cixlp1802971tripleociproa-tempest-volumebootpattern-and-basicops-running-concurrently-causing-timeouts13:45
arxcruzweshay: ok, checking13:45
weshayhttps://trello.com/c/ro5E87Dy/852-cixlp1809974tripleociproa-master-check-jobs-failing-undercloud-install-with-mistralclientapiexception-authorization-failed-canno13:46
weshayapetrich, good morning sir.. so we need an equiv bz open on that issue that blocks a release13:47
weshayapetrich, can you help me out there?13:47
weshayarxcruz, same w/ you.. the bz needs to be marked blocker13:47
weshayand trello card updated w/ the bz13:47
apetrichweshay, I can. just a moment that I'm on a meeting.13:47
weshayapetrich, ya.. no rush man.. thanks13:48
apetrichweshay, we have it https://bugzilla.redhat.com/show_bug.cgi?id=167006713:49
openstackbugzilla.redhat.com bug 1670067 in openstack-mistral "master check jobs failing undercloud install with "mistralclient...APIException Authorization failed: Cannot authenticate without an auth_url"" [High,Assigned] - Assigned to apetrich13:49
*** zul has joined #oooq13:49
weshayapetrich, rock.. update the card w/ it. .and ur off the hook13:50
apetrichweshay, but not sure that is a real blocker, because I was unable to reproduce and the gate wasn't locked by it; it happened a few times and after that almost all passed13:50
arxcruzweshay: well... the logs are missing, can we unskip the test to collect the logs?13:50
arxcruzwithout data, it's kinda hard to debug it13:51
arxcruz:/13:51
weshayapetrich, ok.. /me moves to done .. cut and paste these comments into the card :)13:51
apetrichweshay, aye we removed the blocker13:51
apetrichdoing that13:51
weshayarxcruz, what does fs21 say :)13:51
apetrichweshay, done13:52
weshayapetrich, thanks13:52
apetrichno worries :) I spent some time last week chasing that13:52
arxcruzweshay: it's passing :)13:53
arxcruzweshay: nevermind, network_basic_ops is failing13:54
arxcruzweshay: i'm deploying a fs021 to debug13:56
weshayarxcruz, ok.. that job will take all day to run.. but you know this13:56
arxcruzweshay: well, the failure on fs021 is different from the bug13:56
weshayarxcruz, still interested in cleaning up the skip list..13:56
weshayis that something you are thinking about?13:56
weshayre: what we discussed13:57
arxcruzweshay: that also, but problem is that the fs021 the logs are not showing the same failure as in the bug, so i don't know if it's related or not13:57
weshayarxcruz, ok.. if it's just not relevant any more.. then kill the card w/ our conversation13:58
weshayarxcruz, however... we really do need some movement / progress on leveraging fs21 to keep the skip list up to date13:59
arxcruzweshay: well, the test is skipped, we should at least unskip to see if the problem still remains13:59
arxcruzweshay: okay, i'll work on the python tool to check that13:59
weshayarxcruz, right.. but that is what fs21 is for13:59
weshayarxcruz, maybe we can chat after the scrum13:59
*** sanjayu_ has quit IRC13:59
arxcruzweshay: fs021 is failing with another error13:59
weshayk k14:00
rfolcoping scrum14:00
weshayarxcruz, perhaps we need to think about this a bit more re: how to efficiently monitor skipped tests14:00
rfolcoping scrum  marios, quiquell, sshnaidm, weshay, panda, rlandy, arxcruz, mwhahaha, rfolco, chkumar, ssbarnea, kopecmarti14:01
*** quique|rover|eat is now known as quiquell|rover14:02
*** ccamacho has quit IRC14:08
*** ccamacho has joined #oooq14:08
pandachkumar|ruck: arxcruz do you have time to review https://review.openstack.org/509728 ? we need to move it to standalone job, so it's ok for now if it's not using OStempest, we'll deal with that after the migration to standalone14:10
chkumar|ruckpanda: looking14:10
*** dsneddon__ has joined #oooq14:12
quiquell|roversshnaidm: dashboard working now thanks, what was the issue there ?14:15
sshnaidmquiquell|rover, idk, I think it's "rdo cloud" happened there14:16
quiquell|roversshnaidm: ack14:16
sshnaidmthere was outage before14:16
chkumar|rucksshnaidm: quiquell|rover https://review.rdoproject.org/r/#/c/18666/ anything more needed on this?14:16
sshnaidmchkumar|ruck, did you run it? does it work ok?14:17
*** rnoriega has joined #oooq14:18
chkumar|rucksshnaidm: I have run it in dry mode, it works. Do you want to delete some ports and report it?14:18
*** gkadam__ has quit IRC14:18
*** jjoyce has quit IRC14:19
sshnaidmchkumar|ruck, would be nice I think14:20
sshnaidmchkumar|ruck, but please make sure it deletes right ports :)14:21
chkumar|rucksshnaidm: let me create some resources in down state and14:21
*** jjoyce has joined #oooq14:22
sshnaidmchkumar|ruck, I think we have such on openstack-nodepool14:22
sshnaidmchkumar|ruck, just write them down and check that script deleted them, not something else14:22
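The check described above (note the ports that should go, run the cleanup, confirm that only those went) can be sketched as a set comparison. This is a minimal sketch, not the script under review: `check_deletions` and the port names are hypothetical, and the "before" and "after" lists are assumed to come from whatever port listing is used on the tenant.

```python
def check_deletions(before, after, expected_deleted):
    """Compare what a cleanup run removed against what it was meant to remove.

    before / after: port IDs listed before and after the cleanup run.
    expected_deleted: port IDs written down in advance as safe to delete.
    All names here are illustrative, not taken from the cleanup script.
    """
    before, after, expected = set(before), set(after), set(expected_deleted)
    actually_deleted = before - after
    return {
        # ports that should have been deleted but are still there
        "missed": sorted(expected - actually_deleted),
        # ports that disappeared although nobody asked for it
        "unexpected": sorted(actually_deleted - expected),
    }


if __name__ == "__main__":
    report = check_deletions(
        before=["port-a", "port-b", "port-c", "port-d"],
        after=["port-c"],
        expected_deleted=["port-a", "port-b"],
    )
    print(report)
```

An empty "unexpected" list is the part that matters for the "not something else" concern; a non-empty "missed" list only means the cleanup was incomplete.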
*** dsneddon__ has quit IRC14:25
fultonjykarel quiquell|rover gfidente PTO but will be back tomorrow. best to wait for him regarding issue discussed above.14:26
ykarelfultonj, ack14:26
*** skramaja has quit IRC14:26
*** jtomasek has quit IRC14:27
*** chandankumar has joined #oooq14:28
quiquell|roverfultonj: will tell alphacc14:29
quiquell|roverfultonj: thanks14:30
*** ccamacho has quit IRC14:34
*** jpena is now known as jpena|off14:34
*** fultonj has quit IRC14:36
*** fultonj has joined #oooq14:36
*** chkumar|ruck has quit IRC14:38
chandankumarquiquell|rover, weshay I am logging out, see ya tomorrow :-)14:40
*** jpena|off is now known as jpena14:40
quiquell|roverchandankumar: bye see you tomorrow14:40
*** chandankumar has quit IRC14:41
*** vinaykns has joined #oooq14:42
quiquell|roversshnaidm: Can we do the http://cockpit-ci.tripleo.org redirection ?14:44
*** fultonj has quit IRC14:44
*** fultonj has joined #oooq14:47
*** fultonj has quit IRC14:47
sshnaidmquiquell|rover, argh, forgot.. will make a task about it :)14:47
*** fultonj has joined #oooq14:47
*** dtrainor has quit IRC14:48
*** zul has quit IRC14:50
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades, tripleo-ci-centos-7-undercloud-containers @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq- (2 more messages)14:50
*** dsneddon__ has joined #oooq14:52
rlandyrebooting14:58
*** rlandy has quit IRC14:58
rfolcomarios, do you get a min  to bj ?14:58
rfolcoweshay, do you get a min then?15:00
rfolcook nm brb15:01
*** rfolco is now known as rfolco_lunch15:01
*** jtomasek has joined #oooq15:03
zbr|ssbarnea@oooq early feedback on molecule intro presentation would be appreciated: https://docs.google.com/presentation/d/1qjJsz8zcV88R4LFsxD5syobUeitki8yFX7_jzgCxJ1w/edit?usp=sharing15:04
*** marios_ has joined #oooq15:05
*** marios_ has quit IRC15:05
*** dsneddon__ has quit IRC15:05
*** jtomasek has quit IRC15:10
*** zul has joined #oooq15:11
*** holser_ has joined #oooq15:11
*** rlandy has joined #oooq15:12
*** holser_ has quit IRC15:13
*** rlandy has quit IRC15:13
*** rlandy has joined #oooq15:14
rlandyback now on VPN through phx but response is slow15:15
rlandysshnaidm: quiquell|rover: linters are failing on finding /var/ssh/id_rsa ... https://logs.rdoproject.org/64/18664/11/check/tox-linters/c472e7e/job-output.txt.gz#_2019-02-03_20_46_41_50030415:17
rlandywas not changed in this patch15:18
quiquell|roverrlandy: red herring15:18
quiquell|roverrlandy: look up15:18
quiquell|roverrlandy: the same happened to me15:18
quiquell|roverrlandy: https://logs.rdoproject.org/64/18664/11/check/tox-linters/c472e7e/job-output.txt.gz#_2019-02-03_20_46_41_49533215:19
rlandyyeah15:19
quiquell|roverrlandy: install pre-commit15:19
quiquell|roverpre-commit install --install-hooks15:20
rlandyyeah15:20
quiquell|roverYou will find this stuff before review <- thanks zbr|ssbarnea15:20
sshnaidmrlandy, skip_ansible_lint tag on this task will prevent linting15:20
quiquell|roverrlandy: did you see the zuul-jobs role bindep ?15:21
quiquell|roverrlandy: Don't know if it can be of any use for us15:21
rlandyquiquell|rover: didn't look into it too much15:23
rlandyadding package checking - seeing if that works as is15:23
quiquell|roverrlandy: ack, looks like is the way openstack projects install RPM from CI at zuul15:24
quiquell|roverok closing now15:24
*** quiquell|rover is now known as quique|rover|off15:24
*** rfolco_lunch is now known as rfolco15:25
*** marios_ has joined #oooq15:25
marios_rfolco: in 1-1 now sorry (and irc/vpn issues)15:25
marios_rlandy: vpn issues are a thing indeed15:26
rlandymarios_: ack - reports started on outage15:26
rlandyI could get in through phx15:26
rlandyalthough kind of slow15:26
rlandyweshay: let me know when you want to touch base re: deps install15:27
*** holser_ has joined #oooq15:29
*** dsneddon__ has joined #oooq15:33
*** holser_ has quit IRC15:36
zbr|ssbarneaweshay: i am no longer able to reproduce the krb5 issue with you but I think i know what caused it.... *networking* and this is the perfect use-case why we need to use rule "E405: Remote package tasks should have a retry" in Ansible.15:36
zbr|ssbarneaa "yum update" can fail by its own nature, while repo is updated (many times during the day). a simple retry solves the problem.15:37
zbr|ssbarneaand is damn easy to avoid, we only need to agree on two numbers retries and delay: https://devops.stackexchange.com/questions/5829/how-to-solve-e405-remote-package-tasks-should-have-a-retry-in-ansible/583715:38
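The retry idea above comes down to two numbers, an attempt count and a pause between attempts. In an Ansible task those are the `retries` and `delay` keywords used together with `until`; the same logic can be sketched in plain Python. This is a minimal sketch under stated assumptions: `run_with_retries` is a hypothetical helper, the defaults of 3 attempts and 30 seconds are placeholders rather than agreed values, and the flaky update is simulated.

```python
import time


def run_with_retries(action, retries=3, delay=30, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Hypothetical helper: `retries` is the total number of attempts and
    `delay` the pause in seconds between two attempts.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return action()
        except Exception as error:  # a real tool would catch narrower errors
            last_error = error
            if attempt < retries:
                sleep(delay)  # no pause after the final failed attempt
    raise last_error


if __name__ == "__main__":
    # Simulated flaky package update: fails twice, then succeeds.
    calls = {"count": 0}

    def flaky_update():
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("repo changed while updating")
        return "updated"

    print(run_with_retries(flaky_update, retries=3, delay=0))
```

A bounded retry only hides transient failures; if base and updates stay out of sync for longer than retries times delay, the task still fails, which is the behaviour one would want.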
sshnaidmrlandy, question about https://review.openstack.org/#/c/631067/45/roles/create-zuul-based-reproducer/templates/launcher-playbook.yaml.j215:39
sshnaidmrlandy, when you set up  featureset_override: in job, do you override all  featureset_override from parent job?15:40
*** holser_ has joined #oooq15:41
zbr|ssbarneasshnaidm: i know that you were against this at some point, so now it would be a good time to argue why you don't find it useful. any chance to convince you to embrace E405?15:42
rlandysshnaidm: no - just the one you specify15:42
sshnaidmzbr|ssbarnea, what are we talking about?15:42
sshnaidmrlandy, so it's kind of merging?15:43
rlandysshnaidm: gets passed as extra vars15:43
rlandyso it's after fs settings15:43
sshnaidmrlandy, I mean job configuration, not how fs override is passed to ansible15:44
rlandysshnaidm: hmm - I see what you are asking ...15:45
sshnaidmrlandy, for example in standalone job: https://github.com/openstack-infra/tripleo-ci/blob/master/zuul.d/standalone-jobs.yaml#L40215:45
rlandythat is a good question15:46
rlandysshnaidm: probably yes15:46
rlandysshnaidm: that is a problem for fs override everywhere15:46
*** dsneddon__ has quit IRC15:46
sshnaidmrlandy, so we need to discover it and include it there15:46
sshnaidmrlandy, I believe it should be in some of zuul.* vars..15:46
rlandysshnaidm: we should fix that in fs override itself15:47
sshnaidmrlandy, what do you mean?15:47
rlandysshnaidm: in any job you specify fs override, you would override the parent15:48
rlandynot just in this situation15:48
sshnaidmrlandy, yeah, of course15:48
sshnaidmrlandy, but it's configured only in end jobs afaik15:48
rlandysshnaidm:  true - so far15:49
rlandysshnaidm: nothing stopping anyone configuring jobs lower down the chain with that15:49
sshnaidmrlandy, except the fact it won't work..15:49
rlandysshnaidm: so we can fix it at a reproducer level - but we should fix the fs override mechanism as a whole15:50
sshnaidmrlandy, yeah, I think there is a big task about it..15:50
rlandysshnaidm: for the reproducer, we may be able to do something about it15:51
rlandyif we can pick it up from the job15:51
rlandyand merge the two15:51
sshnaidmrlandy, yeah, I think that's the way15:52
rlandysshnaidm: for this - the real fix is a bigger problem15:52
sshnaidmneed to try to dump everything in zuul dict15:52
rlandysshnaidm: that is a good catch15:53
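The fix discussed above for the reproducer (pick up the override configured on the job and merge it with the one the user specifies) amounts to a dictionary merge in which the more specific value wins. A minimal sketch, assuming both overrides are flat mappings; `merge_overrides` and the keys in the example are hypothetical and not taken from tripleo-ci.

```python
def merge_overrides(parent, child):
    """Return a new dict with the parent's override keys, updated by the child's.

    Either argument may be None or empty. Keys set on the child job replace
    the parent's value; keys only set on the parent are kept.
    """
    merged = dict(parent or {})
    merged.update(child or {})
    return merged


if __name__ == "__main__":
    # Illustrative keys only; real featureset_override content differs per job.
    parent_job = {"run_tempest": False, "example_role_file": "Standalone.yaml"}
    child_job = {"run_tempest": True}
    print(merge_overrides(parent_job, child_job))
```

The merge is shallow on purpose: a nested value on the child replaces the parent's nested value as a whole, which is easier to reason about than a deep merge when only a handful of keys are overridden.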
zbr|ssbarneasshnaidm: this was the issues I was referring to https://bugs.launchpad.net/tripleo/+bug/1814492 -- see my last two comments.15:59
openstackLaunchpad bug 1814492 in tripleo " Error: Package: libkadm5-1.15.1-34.el7.x86_64 (base)" [Critical,Triaged]15:59
weshayrlandy, k.. I have bug triage for about an hour15:59
weshayzbr|ssbarnea, I think it was out of sync mirrors15:59
rlandyweshay:np - we can meet this afternoon16:00
rfolcoweshay, ping bug triage16:00
weshayaye.. I'm joining16:01
rlandysshnaidm: well - we already collect the file - so that will help ...16:01
rlandy- name: Check if featureset-override exists16:01
rlandy  stat:16:01
rlandy    path: "/home/{{ undercloud_user }}/src/git.openstack.org/openstack/tripleo-ci/featureset-override.yaml"16:01
rlandy  register: featureset_override_file16:01
zbr|ssbarneaweshay: cannot blame mirrors, it can happen even without mirrors (far less likely anyway). is timing, and i am ready to bet that a retry(3) delay(30s) would be more than enough to avoid this kind of issue.16:01
sshnaidmzbr|ssbarnea, no way16:03
zbr|ssbarneai was just trying to build a case for us to be more careful to assume the worst and to ass some limited retries in flaky places.16:03
zbr|ssbarneatype /ass/add :D16:03
*** ykarel has quit IRC16:08
*** ykarel has joined #oooq16:08
*** ykarel is now known as ykarel|away16:08
weshayhttps://bugs.launchpad.net/tripleo/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.importance%3Alist=UNKNOWN&field.importance%3Alist=UNDECIDED&field.importance%3Alist=CRITICAL&field.importance%3Alist=HIGH&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.tag=ci&field.tags_combinator=ANY&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on&search=Search16:13
rlandysshnaidm: while I am changing the launcher playbook - what is your opinion on whether we should keep lines 48-57? Those are the role defaults anyway but I left them there as the script overwrites them and a user using just the playbooks should know that can specify those values. agree?16:14
weshayrfolco, panda http://bit.ly/2UGzBrB16:14
sshnaidmrlandy, in which file?16:15
rlandyhttps://review.openstack.org/#/c/631067/45/roles/create-zuul-based-reproducer/templates/launcher-playbook.yaml.j216:15
rlandysshnaidm: ^^16:15
sshnaidmrlandy, how does user specify them?16:17
rlandysshnaidm: you mean outside the script?16:18
rlandyif you are just running playbooks, you would edit the values in the playbook on those lines16:18
sshnaidmrlandy, yeah, or you mean he should edit the file?16:18
rlandycorrect16:18
rlandyexpert user16:18
rlandynot going through the bash script16:19
rlandysomeone like you16:19
rlandyyou want to reproduce a job,16:19
rlandyyou take this file16:19
rlandyedit it to reference your gerrit users and gp16:19
rlandygo16:19
sshnaidmrlandy, I think we can just copy this mechanism from previous reproducer script, ansible-like with -e and -e@16:19
rlandysshnaidm: so you think we should remove those vars?16:20
rlandythey are default anyways16:20
rlandythe bash script uses the -e mechanism16:20
*** dsneddon__ has joined #oooq16:20
rlandysshnaidm: put in fix for featureset_override - let's see if it works16:28
*** marios_ has quit IRC16:29
*** agopi has quit IRC16:37
*** ykarel|away has quit IRC16:45
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades, tripleo-ci-centos-7-undercloud-containers @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq- (2 more messages)16:50
*** ykarel|away has joined #oooq16:52
*** ykarel|away has quit IRC17:07
*** dmellado has quit IRC17:07
*** dmellado has joined #oooq17:08
*** dmellado has quit IRC17:08
*** dmellado has joined #oooq17:09
*** agopi has joined #oooq17:11
sshnaidmweshay, what's about https://review.openstack.org/#/c/633444/ ?  can you un-w it?17:28
weshaysshnaidm, /me looks17:29
weshaysshnaidm, pretty sure.. someone else had a -1 workflow on it.. I rebased it.. and put it back on17:30
weshaybut /me looking now17:30
*** bogdando has quit IRC17:31
*** ykarel|away has joined #oooq17:31
weshaysshnaidm, oh.. was looking to see where it was tested17:31
weshaywes hayutin17:31
weshayFeb 1 1:53 PM17:31
weshay17:31
weshayPatch Set 26: Code-Review-1 Workflow-117:31
weshaysorry, point me at a working log of {{ working_dir }}/libguestfs-test.log 2>&1;17:31
sshnaidmweshay, I pasted in the comments17:32
weshayk17:32
sshnaidmweshay, both patch where it's tested and link to logs17:32
*** dsneddon__ has quit IRC17:32
sshnaidmweshay, Sagi Shnaidman17:33
sshnaidmJan 31 15:18 UTC17:33
sshnaidmPatch Set 26: Code-Review+217:33
sshnaidmworked here: https://review.rdoproject.org/r/#/c/18558/17:33
*** jtomasek has joined #oooq17:46
*** trown is now known as trown|lunch17:46
*** jtomasek has quit IRC17:51
*** kopecmartin is now known as kopecmartin|off17:59
*** chandankumar has joined #oooq17:59
*** derekh has quit IRC18:00
weshayrlandy, ok.. ready when ever18:03
*** jpena is now known as jpena|off18:04
rlandyweshay: in a bit - just spent an hour hacking to fix my vpn - only to remember marios' comment that I had to disable selinux18:04
rlandyreproducer messes with that18:05
rlandywhat a waste of time :(18:05
weshayoh noes18:05
weshaythe reproducer messes w/ the vpn?18:05
weshayor selinux/18:05
rlandyyou have to set selinux to permissive to get on to vpn18:07
rlandymarios reported it18:07
*** dsneddon__ has joined #oooq18:08
rlandyanyways my testbox is back on vpn now - after I blamed IT for not really fixing the VPN issue :(18:08
rlandyback in 518:08
*** chandankumar has quit IRC18:19
rlandyweshay: k - ready to meet when you are18:37
zbr|ssbarneaweshay: rlandy : another example of using molecule to test a role: te-broker - https://review.rdoproject.org/r/#/c/18627/18:37
weshayrlandy, k /me goes to blue18:37
weshayzbr|ssbarnea, ya... that is a good use.. to test those infra roles.. you don't have to convince me man.. just making sure you know the te-broker is dead code we don't use18:38
weshayI think you do18:38
zbr|ssbarneaweshay: te-broker role is deploying the ovb-cleanup-tenant.sh script... which is essential piece of code for us. drop the script and i bet we are in trouble in less than 48h.18:40
zbr|ssbarneai wanted to add testing before the refactoring of the role, to validate that my changes do not break other stuff.18:40
zbr|ssbarneai need to go now, but it would be helpful to know if we do have a syslog server where we can send the logs from our services (and still be able to access them).18:42
zbr|ssbarneawhile logstash is not really a syslog server it seems quite attractive to me to consider it as an option18:44
weshayah.. makes sense18:45
zbr|ssbarneaweshay: what i want is to eliminate the http sharing of logs approach we had with te-broker. sending logs to a centralized place seems more appropriate for multiple reasons.18:46
zbr|ssbarneaso mainly we should only identify how to configure journald to send data to the right server.18:47
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades, tripleo-ci-centos-7-undercloud-containers @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq- (2 more messages)18:50
*** trown|lunch is now known as trown18:57
weshayrlandy, system_u:object_r:container_file_t:s0 /etc/pki/tls/certs/ca-bundle.crt19:00
weshayfor i in `openstack server list --status Error | cut -d " " -f 2 | head -n -1`; do echo $i; openstack server delete $i; sleep 2; done19:21
*** ykarel|away has quit IRC19:27
*** dtrainor has joined #oooq19:35
*** dtrainor_ has joined #oooq19:36
rlandyweshay: openstack server list -f json |  jq -r  '.[] | select(.["Status"] == "ERROR") | .["ID"]'19:37
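The jq filter above selects the IDs of servers whose Status is ERROR from the JSON output of `openstack server list -f json`. The same selection can be sketched in Python when jq is not at hand. This is a minimal sketch: `error_server_ids` is a hypothetical helper, the sample data is made up, and only the "ID" and "Status" keys used by the jq filter are assumed to be present.

```python
import json


def error_server_ids(server_list_json):
    """Return the IDs of servers in ERROR state.

    server_list_json: text in the shape of `openstack server list -f json`,
    i.e. a JSON array of objects with at least "ID" and "Status" keys.
    """
    servers = json.loads(server_list_json)
    return [server["ID"] for server in servers if server.get("Status") == "ERROR"]


if __name__ == "__main__":
    # Made-up sample standing in for the CLI output.
    sample = json.dumps([
        {"ID": "id-1", "Name": "node-a", "Status": "ACTIVE"},
        {"ID": "id-2", "Name": "node-b", "Status": "ERROR"},
        {"ID": "id-3", "Name": "node-c", "Status": "ERROR"},
    ])
    for server_id in error_server_ids(sample):
        print(server_id)
```

Parsing the JSON output avoids depending on the table layout, which is what the `cut` and `head` pipeline relies on.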
*** dtrainor has quit IRC19:39
*** jfrancoa has quit IRC19:42
*** agopi has quit IRC19:51
*** agopi has joined #oooq19:51
*** jaosorior has quit IRC19:58
weshayrlandy,      -n, --non-interactive20:05
weshay                 Avoid prompting the user for input of any kind.  If a password is required for the command to run, sudo will display an error message and exit.20:05
weshayrlandy, https://review.openstack.org/#/c/634329/11/install-deps.sh20:14
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (2 more messages)20:50
rlandyweshay: https://review.openstack.org/#/c/631067 updated20:53
rlandyhttps://review.openstack.org/#/c/631067/46..48/roles/create-zuul-based-reproducer/templates/reproducer-zuul-based-quickstart.sh.j220:54
weshayk21:03
weshaylooks good21:05
rlandylet's see what ci thinks21:09
vinayknsweshay: I have a question...I'm deploying overcloud(including telemetry)...in my overcloud deploy command I don't have disable-telemetry.yaml environment file being passed to that command21:27
vinayknsweshay: but still I couldn't see any of the telemetry services in the overcloud21:28
weshayvinaykns, so that means the tripleo default is to not install it21:28
vinayknsweshay: okay...how could i override it21:29
vinayknsthis is my command http://pastebin.test.redhat.com/70734521:30
weshayvinaykns, here are some examples21:34
weshayhttps://github.com/openstack/tripleo-heat-templates/blob/master/ci/environments/scenario001-multinode-containers.yaml21:34
weshayhttps://github.com/openstack/tripleo-heat-templates/blob/master/ci/environments/scenario001-standalone.yaml21:34
vinayknsso I should pass this to overcloud deploy command as env file..?21:35
weshayvinaykns, you would just have to override parameter_defaults:21:35
weshay  ControllerServices:21:35
vinayknsokay..gotcha.!21:38
vinayknsthx.!21:38
weshayvinaykns, the context on that is .. it's not a good idea for tq to provide an interface to tripleo for most options21:50
weshaybetter to interface directly21:50
vinayknsweshay: could you elaborate more on this..?21:53
weshayjust that it's better if people work directly w/ tripleo and the ways to adjust services etc.. rather than having a tool on top of a tool21:54
weshayjust enough tool21:54
weshayis what we're shooting for21:54
vinayknsyeah..make sense.!22:03
*** trown is now known as trown|outtypewww22:06
*** agopi has quit IRC22:16
weshayrlandy, I did have some updates22:19
weshayupdating in review now22:19
rlandyweshay: ok - which review22:20
weshaycouple more minutes sec22:20
weshayrlandy, https://review.openstack.org/#/c/634329/12/install-deps.sh22:29
weshayhttps://review.rdoproject.org/r/#/c/18665/22:29
weshayrlandy, so.. all you should need in your script now ;)22:29
weshayis22:29
weshaylines 205 - 220 https://review.openstack.org/#/c/634329/12/install-deps.sh22:30
weshayrlandy, let me know if that makes sense22:30
weshaytested it a few times22:30
weshayall the rpm deps are in https://review.rdoproject.org/r/#/c/18665/3/bindep_python2.txt22:33
rlandylooking22:34
rlandyUSER_OVERRIDE_SUDO_CHECK - new22:35
weshayya22:36
weshayrlandy, I think bootstrap_ansible_via_rpm can change right... I think it can now be...22:37
weshayinstall_deps22:37
weshayinstall_bindep22:37
weshayinstall_package_deps_via_bindep22:37
weshayand we're done installing anything22:37
weshayoh no22:38
weshayI'm wrong22:38
rlandyI need to recheck because depends-on changed22:38
rlandyso ...22:38
weshaybecause install_deps.sh is not there yet22:38
weshayleave it as it22:38
weshayis22:38
rlandywhat about USER_OVERRIDE_SUDO_CHECK?22:38
vinayknsweshay: It Could not fetch contents for file:///docker/services/ceilometer-agent-compute.yaml22:39
rlandyneeds to be added to script?22:39
weshayso folks like Sagi will get prompted if something needs to be installed vs.. skipping22:39
vinayknsweshay: but that file is there in the templates directory22:39
*** holser_ has quit IRC22:39
weshayrlandy, I think we should22:40
rlandyweshay: so I'm confused about what changes are needed on my side22:40
weshayrlandy, if you want further context.. I can provide that22:40
weshayrlandy, ok.. let's chat for a hot sec22:40
weshayso we can let this run before we leave22:40
weshayvinaykns, sorry .. need a few.. and would need more details22:40
weshaythat's not enough info to help22:40
rlandyk - joining blue22:40
vinayknsweshay: http://pastebin.test.redhat.com/70735722:41
vinayknsweshay: I'm not sure why it is fetching templates from tmp/tripleoclient-vkg903/tripleo-heat-templates22:42
vinayknsinstead of my custom ones.22:42
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario003 (2 more messages)22:50
*** vinaykns has quit IRC23:06
weshayrlandy, https://review.rdoproject.org/r/#/c/18665/23:11
weshayrlandy, https://review.rdoproject.org/r/gitweb?p=rdo-infra/ansible-role-tripleo-ci-reproducer.git;a=blob_plain;f=bindep_python2.txt;h=1c5a49ffd7e297eadac5fbe3c81d23d600be371d;hb=2d689e7bf8d49bf0f46d2f381665fac7dc68be4723:12
*** rlandy is now known as rlandy|bbl23:22
*** dtrainor_ has quit IRC23:31
*** agopi has joined #oooq23:34
*** panda has quit IRC23:40
*** dtrainor_ has joined #oooq23:40
*** panda has joined #oooq23:41
*** dtrainor_ has quit IRC23:44
*** dtrainor_ has joined #oooq23:48

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!