openstackgerrit Clark Boylan proposed openstack/diskimage-builder: Remove ssh host keys when using simple init
openstackgerrit Derek Higgins proposed openstack-infra/tripleo-ci: Upload the ironic-python-agent images to cache
openstackgerrit Derek Higgins proposed openstack-infra/tripleo-ci: [WIP] Use the cached ironic-python-agent images
openstackgerrit Derek Higgins proposed openstack-infra/tripleo-ci: Fix image caching logic
openstackgerrit Derek Higgins proposed openstack-infra/tripleo-ci: Pre install packages on the instack image
openstackgerrit Swapnil Kulkarni (coolsvap) proposed openstack/tripleo-docs: First documentation for Operational tools
openstackgerrit Swapnil Kulkarni (coolsvap) proposed openstack/tripleo-docs: First documentation for Operational tools
openstackgerrit Swapnil Kulkarni (coolsvap) proposed openstack/tripleo-docs: Docs for containerized compute node
*** Marga_ has joined #tripleo05:37
*** Marga_ has quit IRC05:38
*** Marga_ has joined #tripleo05:38
openstackgerrit greghaynes proposed openstack/diskimage-builder: Remove ssh host keys when using simple init
*** ramishra has quit IRC05:51
lazy_princeHi all.. Need reviews for pls..05:58
openstackgerrit Merged openstack/tripleo-docs: Fix some typos in docs
jaosoriormarios: Hey dude, the update gate seems to only pass in stable/mitaka. Are you aware if this is a known issue? Or, is there a specific issue that's preventing it from passing on master?06:47
hewbroccajaosorior: morning06:47
jaosoriorhewbrocca: Hey dude, how's it going?06:47
hewbroccaWell not so bad all in all06:48
hewbroccaFolks found some interesting issues with CI setup last night06:48
hewbroccajistr was working on it06:48
hewbroccaseems like things were going really slow and we were wondering if the machines were actually swapping06:49
hewbroccathe slowness was causing corosync to lose the cluster on the HA job06:49
jaosoriorI see06:50
jaosoriorwell, today the HA job seems to be passing06:50
jaosoriorbut the upgrades job has been red for a long time06:50
jaosoriorso I was just wondering if there's a specific error, or a set of errors, that are known to cause this06:50
hewbroccanot sure TBH06:50
mariosbookmark added,n,z06:51
marioso/ jaosorior06:51
marioshewbrocca: was that from qe setups with testing upgrades, or do you mean actual ci upstream setup?06:52
* marios wondering if there is a new issue06:52
*** jaosorior has quit IRC07:00
*** openstackgerrit has quit IRC
*** openstackgerrit has joined #tripleo
*** jaosorior has joined #tripleo07:07
*** mbound has joined #tripleo07:52
*** shardy has joined #tripleo07:53
dtantsurmorning folks. so gate is still down, not worth rechecking?07:56
jaosoriordtantsur: the update gate seems to fail every time. However, HA and non-HA are passing07:56
dtantsurcould you please at least land which is from the time when gate was green?07:56
dtantsurjaosorior, but update gate is voting, right?07:56
jaosoriordtantsur: Oh yeah, I had read that code before... dunno why I didn't score it07:58
jaosoriordtantsur: it is voting. So yeah, the gate is not fully green, and I was just giving you the status of it. So if you see a failure on the HA, it might be an actual issue, cause that's passing now08:00
openstackgerrit Merged openstack/python-tripleoclient: Allow 'openstack baremetal configure boot' to guess the root device
dtantsurshardy, morning! are backport exceptions requests still accepted, and how are chances of getting one for ^^^?08:03
dtantsurthe idea of this patch is to simplify live people upgrading from Kilo08:04
*** jpena|off is now known as jpena08:04
dtantsur* life for people08:04
shardydtantsur: if it's likely to break folks on upgrade then I think we agreed upgrade related patches have an exception anyway08:05
shardybut if you think it requires discussion, please drop a mail to the list describing the reason we need it and the risk of backporting, and folks can vote there08:06
dtantsurshardy, for some definition of "break"... please take a look at the commit message08:06
dtantsurwill do, just want to get a quick sanity check08:06
shardydtantsur: in the heat meeting atm but will review after, to me it looks OK as it's just adding some new cli options, not an entirely new feature, so should be low risk08:07
shardydtantsur: also, on upgrade, note that folks will upgrade to the latest client before doing the upgrade anyway08:08
shardyalthough you said kilo->liberty so I guess it'd need to be on the liberty branch?08:08
dtantsurshardy, yep. downstream we worked around it by changing IPA default logic, but upstream is still affected08:10
* dtantsur should really write an email on all this08:10
jaosoriorthrash: Are you around?08:11
shardydtantsur: cool, email to the list sounds good08:11
jaosoriorthrash, shardy: Would sure use some reviews for this: trying to solve this BZ bug 1320950 in rhel-osp-director "os-cloud-config hardcodes SSL ports" [High,On_dev] - Assigned to josorior08:12
*** paramite has joined #tripleo08:13
openstackgerrit Dmitry Tantsur proposed openstack/python-tripleoclient: Allow import command to set deploy image and local boot
*** mcornea has joined #tripleo08:15
openstackgerrit Dmitry Tantsur proposed openstack/python-tripleoclient: Use wait_for_finish from python-ironic-inspector-client
dtantsurrebase party \o/08:17
*** jistr has joined #tripleo08:22
dtantsurshardy, mail sent08:33
openstackgerrit xin wu proposed openstack/os-net-config: Always reset-failed ivs before restart ivs
openstackgerrit Jiri Stransky proposed openstack/tripleo-heat-templates: Increase corosync token timeout
jistrgfidente: morning :)08:52
gfidentejistr I saw the changes08:52
jistrso it looks like trown|outtypewww's recheck on produced a green result indeed08:52
jistri mean for the HA job08:52
jistrand it has passed for non-HA before08:53
jistri wonder if it's worth merging, or we should try another recheck08:53
openstackgerrit Merged openstack-infra/tripleo-ci: Fix timeout on crm_resource --wait
hewbroccaderekh: Hey, did we ever turn *off* swap on the testenvs?08:54
gfidenteit's a safe change in ci08:54
jistrgfidente: cool, thanks08:54
derekhhewbrocca: no, we can't08:54
derekhhewbrocca: we're still overcommitted although not as badly as we were08:54
gfidentejistr so eyes on now :)08:54
hewbroccaderekh: arrgh08:55
hewbroccajistr: there's your slowness, I guess08:55
hewbroccaderekh: hey I had another thought, tell me what you think08:55
derekhhewbrocca: most of the time it isn't used, and I don't think it's causing the slowness08:55
hewbroccawe've ordered the RAM upgrade08:55
jistryeah i'll just post more context for derekh08:55
jistryesterday we discussed the HA job failure rate08:56
hewbroccaWhile we've got the boxes open, worth dropping an SSD in each one?08:56
jistrand it seems that the cause is generally things being slow08:56
hewbroccaassuming it's possible08:56
gfidenteyeah I think the reason for the high load is indeed disks08:56
jistrderekh: mainly corosync log looks bad
jistr"Corosync main process was not scheduled for 6085.1523 ms (threshold is 1320.0000 ms). Consider token timeout increase."08:57
jistrthis probably results in corosync losing cluster members08:57
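A minimal sketch of how the corosync warning quoted above could be pulled out of a log and compared against its threshold. The regular expression and the function name parse_scheduling_warning are assumptions made for illustration; only the log line itself comes from the discussion.

```python
import re

# Assumed pattern for the scheduling warning quoted in the channel.
PATTERN = re.compile(
    r"not scheduled for (?P<delay>[\d.]+) ms "
    r"\(threshold is (?P<threshold>[\d.]+) ms\)"
)


def parse_scheduling_warning(line):
    """Return (delay_ms, threshold_ms), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return float(match.group("delay")), float(match.group("threshold"))


line = (
    "Corosync main process was not scheduled for 6085.1523 ms "
    "(threshold is 1320.0000 ms). Consider token timeout increase."
)
delay_ms, threshold_ms = parse_scheduling_warning(line)

# How many times over the threshold the scheduling pause was.
ratio = delay_ms / threshold_ms
print(f"paused {delay_ms:.0f} ms, {ratio:.1f}x the threshold")
```

For the quoted line the pause is a bit more than four and a half times the threshold, which fits the theory that slow hosts make corosync drop cluster members.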
derekhhewbrocca: it certainly wouldn't do any harm, also each host has 4 disks only one of which is in use, I was wondering about RAID for the new deployment, spreading the load might speed it up08:57
hewbroccaI'm sure it would, but not as much as an SSD :)08:58
derekhjistr: yes, we've had those warnings for months08:58
derekhhewbrocca: yup08:58
hewbroccaderekh: I'll check with the relevant sysadmin and see if they think it could be helpful08:58
openstackgerrit Merged openstack/diskimage-builder: Remove ssh host keys when using simple init
derekhhewbrocca: ack08:58
*** mkovacik has quit IRC09:01
gfidentederekh I think we should enable writeback cache too09:01
gfidentein the guests xml09:02
gfidente<driver name='qemu' cache='writeback' />09:03
gfidenteso the guests don't wait for data to be committed on disk09:04
hewbroccathat's an excellent idea09:04
*** Marga_ has joined #tripleo09:05
derekhgfidente: hewbrocca they're already in unsafe mode09:07
*** shivrao has quit IRC09:08
derekhSo here is my take on the situation, 6 months ago (or thereabouts), our non HA job was looking like we would soon be able to get it below 60 minutes, since then all that has changed on the testenvs is09:12
derekh1. We have increased the CPU and RAM allocated to each undercloud09:12
derekh2. We have increased the RAM allocated to each overcloud09:12
derekh3. We have reduced the number of test envs hosted by each host from 4 to 309:12
derekh4. We have added 6? more nics to each instance on the host09:12
derekh5. there are others but those are the main ones I can remember09:12
derekhAnd the result is that the shortest running job I can point to is now 101 minutes09:12
derekhThe wall time on our jobs is now 50% more than it used to be09:12
derekhWe have added something or changed something or screwed up something else that is now demanding a lot more than it used to (or any combination of these)09:12
hewbroccais it just we've added more other jobs?09:13
*** coolsvap has quit IRC09:13
derekhwe need to find out what and until somebody figures it out we're gonna keep adding more resources until we get to a happy place09:13
hewbroccaderekh: how big an SSD would you want09:13
hewbroccawfoster says there's space in the boxes09:13
derekhhewbrocca: no matter how many jobs we add, each host is only gonna run 3 at a time (it used to be 4)09:13
hewbroccaderekh: *nod*09:14
hewbroccais it that the HA job runs with pacemaker enabled now?09:14
*** ramishra has joined #tripleo09:15
derekhhewbrocca: I don't know the answer to that to be honest09:15
derekhhewbrocca: but that wouldn't explain why the nonha job is now never below 100 minutes even during quiet times09:16
hewbroccaindeed not09:17
jistrwe only had HA with pacemaker. But we added more services, and one more step where overcloud init is slowly moving into. Still, it's likely that there are possibilities for optimization somewhere.09:17
hewbroccathe *nonha* job takes that long?09:17
derekhhewbrocca: yes (the others are now worse)09:17
derekhhewbrocca: 100 minutes is minimum for nonha now, 130 is probably average (based on a glance of the stats for the last few days)09:18
derekhWe can't be blaming that change on HW that hasn't changed...09:19
*** paramite is now known as paramite|afk09:19
jistrcould it be that it's not a code change which did the increase, but deploying more services (aodh, sahara) made us add swap, and that made the jobs run longer?09:19
gfidenteI think we saw a big spike after enabling netiso09:20
derekhjistr: it could, worth investigating09:20
jistrhmm that could be it too, but that's not enabled for non-ha job09:20
jistrgfidente: ^09:20
derekhgfidente: yup, that's certainly a good candidate to look into I think09:21
derekhjistr: the nics aren't used but they are present on the instances, just having them defined on the host could be adding load somewhere09:22
*** chem has quit IRC09:23
shardyHey guys, FYI I'm digging a little into memory usage on the undercloud ^^09:24
shardyit looks like we have a bunch of non-heat memory hogs ;)09:25
shardymysql, ceilometer-agent-notification, erlang, sensu and swift-object-replicator are all using >1G on a freshly started but idle undercloud09:25
jaosoriorwhat the hell O_O09:25
jaosoriorwhat's the backend for ceilometer in the undercloud? Is it mysql?09:26
shardyYes I think it is09:27
shardywe actually don't need ceilometer at all in the undercloud09:27
*** chem has joined #tripleo09:27
shardyso we can probably save a gig by just turning it off09:27
jaosoriorThat makes sense to me09:27
jaosoriorI have no idea how to reduce the load for rabbitmq though09:28
shardyWe could even turn it off for some overclouds too, given that we don't test it09:28
jaosoriornor swift09:28
shardye.g we don't need all jobs to enable every service09:28
lazy_princegreghaynes: can you pls review
hewbroccashardy: huh yeah that seems like an easy win09:32
derekhshardy: those numbers don't seem right, just adding up your top 10 there gets to about 8G, also top on the undercloud on CI runs is showing ceilometer way down the list09:34
jistrwe smoke-test it by starting it, at least the non-ha job would fail if anything fails to start up. But there might be more value on the "let's have stable jobs" side than "let's smoke-test ceilometer" anyway.09:34
*** xinwu has quit IRC09:34
openstackgerrit Merged openstack/os-net-config: Add MASTER=bond SLAVE=yes to linux bond interfaces
* shardy looks at tripleo-ci scripts09:36
*** ramishra has quit IRC09:37
hewbroccawoop woop ^^^09:38
derekhshardy: look in host_info.txt , search for top -n 1 -b -o RES09:40
shardyderekh: Yeah that does give a somewhat different view:09:43
shardyrabbit and mysql are still shown as amongst the worst tho09:44
shardyI'm not sure why the ceilometer-agent is so wrong tho09:44
* shardy juggles different memory stats09:45
jistrlooking at HA
derekhshardy: not sure either, didn't even try to understand the awk magic09:45
*** xinwu has joined #tripleo09:45
derekhshardy: also on the undercloud we have var/log/dstat-csv.log  , using that you can see how much memory is being used over time, and how much swap is being used09:47
shardyderekh: Yeah, I'm pretty sure the ps size metric is misleading, but there's definitely evidence that DB & rabbit are chewing through a fair amount09:48
* shardy runs dstat and kills ceilometer09:49
derekhshardy: yup09:49
hewbroccascwewy wabbit09:51
openstackgerrit Dmitry Ilyin proposed openstack/puppet-pacemaker: Merge with fuel-infra/puppet-pacemaker
*** ramishra has joined #tripleo09:57
*** akrivoka has joined #tripleo10:03
*** tosky has joined #tripleo10:09
*** ramishra has quit IRC10:09
*** bvandenh has quit IRC10:19
*** ramishra has joined #tripleo10:20
*** zoli|wfh is now known as zoli|intw10:24
*** mgould has joined #tripleo10:25
*** bvandenh has joined #tripleo10:33
openstackgerrit Dmitry Ilyin proposed openstack/puppet-pacemaker: Merge with fuel-infra/puppet-pacemaker
*** rhallisey has joined #tripleo11:01
*** dshulyak has joined #tripleo11:04
gfidentebandini the rabbit partition handling maybe we can make it a parameter in puppet/controller.yaml11:10
gfidenteand default to pause_minority so we don't mess with .pp ?11:10
bandinigfidente: we could, but then it would be the user who has to set it, no? If we do it in the .pp we could do it automagically when count(controller_nodes)==211:18
gfidenteyes it should be the user11:19
bandinigfidente: I wonder if it is worth adding it in controller.yaml in the short-term and then move the logic in .pp at a later time11:20
gfidentemy point was to avoid adding logic in the .pp11:21
bandiniok but then, if we ever decide to support two-node clusters, we need to tell the user: if you use two nodes set this variable, etc.11:21
bandiniI think doing it automatically would be preferable?11:21
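A rough sketch of the "do it automatically" option being discussed: pick the RabbitMQ partition handling mode from the controller count instead of asking the user. The function name partition_handling is hypothetical, and the two-node value ("autoheal") is purely an assumption, since the channel only names pause_minority as the default.

```python
def partition_handling(controller_count, two_node_mode="autoheal"):
    """Pick a RabbitMQ cluster_partition_handling value.

    pause_minority is the default proposed in the discussion; the
    two-node value is an assumption and is left configurable.
    """
    if controller_count == 2:
        return two_node_mode
    return "pause_minority"


for count in (1, 2, 3):
    print(count, partition_handling(count))
```

Keeping the two-node value as a parameter leaves room for the user-facing override that gfidente prefers, while the automatic choice covers the common case.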
bandinialso because I assume there will be other needed changes for full two controller node support11:22
bandiniin galera and whatnot11:22
hewbroccaLike, make everything A/P11:22
*** weshay has joined #tripleo11:22
hewbroccaWhat I don't understand is why we're not providing a P/P configuration11:22
bandinilol a full P/P config, would be fun :)11:23
hewbroccaWe'd get way fewer bugs filed...11:23
gfidentehmm we wanted to do that with ipv6 too11:25
gfidentehave an ipv6 hieradata11:25
*** Goneri has quit IRC11:25
bandiniand it did not work?11:27
*** dtantsur is now known as dtantsur|brb11:28
gfidentewe still don't have 'optional inclusion'11:29
shardygfidente: what needs to be optionally included, vs having something set (or not) in the hierarchy?11:31
*** jpena is now known as jpena|lunch11:31
shardyit would actually be possible with recent heat for us to have a parameter which joins things into the hierarchy list btw11:31
gfidenteshanky chainlist?11:31
gfidenteshardy :)11:32
gfidenteresource chains11:32
shardygfidente: I made list_join accept multiple lists11:32
shardywhich means that it'd be possible to join multiple lists, then split them back into a single hierarchy list11:32
shardyI'm not sure if that helps in this case, just mentioning FYI11:33
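The join-then-split idea can be modelled in plain Python. This only emulates the semantics shardy describes and is not Heat code; the hierarchy entries and the split_back helper are hypothetical names chosen for illustration.

```python
def list_join(delimiter, *lists):
    """Emulate joining several lists into one delimited string."""
    items = [item for lst in lists for item in lst]
    return delimiter.join(items)


def split_back(delimiter, value):
    """Split the joined string back into a single flat list."""
    return value.split(delimiter)


# Hypothetical hierarchy entries; the extra list would come from a parameter
# that is set according to the enabled features.
base_hierarchy = ["controller", "common"]
extra_hierarchy = ["ipv6"]

joined = list_join(",", base_hierarchy, extra_hierarchy)
hierarchy = split_back(",", joined)
print(hierarchy)
```

With an empty extra list the result is just the base hierarchy, which is the "optional inclusion" behaviour gfidente was missing.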
mburnedmarios: <-- looks like it passed all tests in the last 2 runs...11:33
*** panda has joined #tripleo11:34
mburnedmarios: anything stopping +A?11:34
mariosmburned: looking11:34
mariosmburned: heh... :/ passed ha this run, previous run passed the other 211:35
gfidenteshardy ah you mean make hierarchy a list_join where additional hieradata is added from parameter?11:35
hewbroccaSHIP IT11:35
mariosmburned: indeed +1 to merging and it is blocking the liberty11:35
mburnedmarios: yep...11:35
mariosgfidente: jistr 14:33 < mburned> marios: <-- looks like it passed all tests in the last 2 runs...11:35
* mburned just triggered recheck on the liberty version11:35
mariosgoing to +A11:35
hewbroccamerge early, merge often11:35
shardygfidente: Yes, exactly11:36
jistrmarios: ack11:36
shardyso e.g all these
gfidenteshardy ah nice, I see11:36
shardycould be appended via some parameter which is set according to the enabled things11:36
jistrmarios, gfidente: tbh i think we can do the same here no?
shardywe may be able to make use of ResourceChain too, in the context of composable services to do something similar11:36
jistror do we expect -upgrades to be passing on liberty now?11:36
mariosjistr: looking11:37
shardyjistr: Hey, wondering if you spotted ?11:38
shardyI was hoping you and stevebaker can take a look11:38
mariosjistr: but we should land mitaka first11:38
shardyit's the first step to fixing update preview11:38
mariosjistr: (yes but..)11:38
mariosjistr: should merge in a sec11:39
jistrshardy: makes sense to me, +2'd11:39
shardyjistr: thanks!11:40
mariosshardy: heat question please bug 1324160 in rhel-osp-director "Overcloud nodes have an empty /etc/resolv.conf post upgrade" [High,New] - Assigned to mandreou11:40
marioscomment 7.. the question is about the validity of '[] for the actions of a structureddeployment11:40
*** ramishra has quit IRC11:41
openstackgerrit Merged openstack/instack-undercloud: Temporarily set +e on systemd-journald restart for +bug/1564471
shardymarios: so, you want to make NetworkDeployment not run, ever, even on CREATE?11:42
mariosshardy: yeah at least for the duration of the upgrade11:43
jistrshardy: the problem there is that we have the actions set to ['CREATE'], but for some reason NetworkDeployment runs on update even though it shouldn't, so we want to try if [] would help11:43
shardymarios: but the default NetworkDeploymentActions means we do nothing on UPDATE?11:43
gfidentewell not on an actual stack update11:44
gfidentethough on upgrade from 7 to 8 yes11:44
shardye.g CREATE should mean it does nothing on upgrade, because the Deployment is already CREATE_COMPLETE11:44
shardyjistr: the reason for that is probably not the Deployment - does it end up UPDATE_COMPLETE, or is it still CREATE_COMPLETE?11:45
shardyOk, so an input value must be changing, either to the config or deployment11:45
jistrah so input value change overrides actions?11:46
jistras in, if input changes, it's always going to be redeployed, regardless of 'actions' value?11:46
shardyPossibly, need to check, sec11:47
shardyif that's the case, it's probably a bug, but a workaround will be to map OS::TripleO::Controller::Net::SoftwareConfig to a SoftwareConfig that does nothing11:47
shardyOr, use an environment file to map NetworkDeployment to OS::Heat::None11:47
shardygive me a few and I'll try a few things11:47
jistrok thank you11:48
mariosthanks shardy11:48
shardyThe other confusing thing here is the o-r-c script, e.g 20-os-net-config probably runs every time there's *any* change to the o-a-c data11:49
shardybut that should just reassert the existing state11:49
gfidenteyeah because .json should be updated11:49
shardyThat's one reason I'm aiming for
shardywhich just runs os-net-config via a script11:49
openstackgerrit Merged openstack/instack-undercloud: Temporarily set +e on systemd-journald restart for +bug/1564471
jaosoriorslagle: Hey dude, if you have time can you give this one a review?
gfidentejaosorior while messing with the 389 and keystone I realized there is no object class which provides the infamous 'enabled' attribute11:57
gfidentejaosorior I resorted to nsAccountLock and _invert11:57
gfidentejaosorior but that's not possible for tenants, which are not account entries11:57
gfidenteso is enabled in some freeipa schema?11:58
gfidentecause openldap doesn't have it either11:58
*** ohamada has quit IRC11:58
gfidenteI wonder if we shouldn't ask keystone to just make it possible to not filter for 'enabled' at all11:58
jaosoriorgfidente: Only thing I could find is this:
*** shadower90 has joined #tripleo12:01
jistrclean corosync log @
gfidentejaosorior yeah it's using emulation12:04
gfidentesee tenant_enabled_emulation and user_enabled_emulation12:04
shardyjistr: so, testing shows that despite the deployment going UPDATE_COMPLETE, the deploy actions are respected12:04
gfidenteso ipa doesn't have 'enabled' either it seems12:04
jistrmy 2 cents is we should merge ^^ and revert if it does trouble12:04
shardye.g the deployment doesn't actually run12:04
shardythat shows why - we process the update, but return before doing anything if the action isn't in the DEPLOY_ACTIONS list12:05
shardyso, I think if you're seeing os-net-config running again, it's the orc script doing it (with the old config) not Heat triggering anything12:05
jaosoriorgfidente: Seems not :/12:05
jistrshardy: thanks for the info, that's very helpful. So it's probably not Heat triggering o-n-c after all. /cc gfidente, marios, mcornea12:06
*** masco has quit IRC12:06
jistri haven't been able to reproduce the issue for me12:06
jistrstill have the same resolv.conf as before upgrade12:07
*** tiswanso has joined #tripleo12:07
shardyjistr, marios: as far as I can see from the bug, the issue is the upgraded version of os-net-config is configuring things differently, when it's supposed to be idempotent when run multiple times with the same config12:07
*** tiswanso has quit IRC12:07
*** Marga_ has quit IRC12:07
gfidenteso maybe we should compare the config.json file12:08
*** tiswanso has joined #tripleo12:08
mariosshardy: right, we were wondering earlier if it was something like the update itself of the os-net-config package which triggers it to reapply the config12:08
gfidentebut I am pretty sure on a stack-update I don't see the resource moving in UPDATE12:08
gfidentewhile during the upgrade yes12:08
mariosso there goes the nice workaround12:08
shardygfidente: It'll only try to update if you change one of the input_values12:09
shardyHmm, actually wait12:09
*** Marga_ has joined #tripleo12:09
mariosmcornea tried the '[] too and didn't work bug 1324160 in rhel-osp-director "Overcloud nodes have an empty /etc/resolv.conf post upgrade" [High,New] - Assigned to mandreou12:10
shardyI don't think NetworkDeploymentActions will help, because Heat isn't doing anything here except returning None12:10
mariosso mcornea confirms it happens at upgrade step3 which is when the controllers are upgraded, so they get new os-net-config12:10
bandinigfidente: remind me the bug you worked on some time ago with ipv6 and no route to host errors?12:11
marioswhich we think is just triggering the re-application, but which now happens slightly differently with inclusion of netmask12:11
jistrok this supports the theory that this isn't heat related at all12:12
mariosyeah :/12:12
shardySo, yes, if input_values change, the Deployment goes UPDATE_COMPLETE, but the config is not reapplied, because we exit due to DEPLOY_ACTIONS12:12
shardybut if you don't change anything, it'll stay CREATE_COMPLETE, as we don't even try to update it12:13
shardyin either case, nothing at all should happen on the node (from Heat's perspective)12:13
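A toy model of the behaviour shardy summarises, assuming a deployment that was created earlier with actions set to ['CREATE']. It is not Heat's implementation; process_update and its return shape are made up for illustration.

```python
def process_update(actions, inputs_changed, action="UPDATE"):
    """Return (resulting_status, config_applied) for an existing deployment."""
    if not inputs_changed:
        # Nothing changed, so no update is even attempted.
        return "CREATE_COMPLETE", False
    if action not in actions:
        # The update is processed, but we return before applying anything.
        return "UPDATE_COMPLETE", False
    return "UPDATE_COMPLETE", True


print(process_update(["CREATE"], inputs_changed=False))
print(process_update(["CREATE"], inputs_changed=True))
print(process_update(["CREATE", "UPDATE"], inputs_changed=True))
```

In the first two cases the config is never reapplied from Heat's side, so a rerun of os-net-config on the node has to come from somewhere else.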
*** jayg|g0n3 is now known as jayg12:13
shardySo it must be one of:12:14
mariosshardy: looks like os-net-config writes the ifcfg files slightly differently and that triggers the network restart which wipes resolv.conf12:14
shardy20-os-net-config runs every time *any* change to the oac data happens12:14
shardymarios: OK, so I guess we either fix os-net-config, or chmod/move 20-os-net-config so it can never run12:15
mariosshardy: this happens just as we have upgraded the controllers, so it gets the new version of os-net-config which writes the files slightly differently than before (now include netmask on the ip addr)12:16
shardyperhaps dprince or dsneddon will have some ideas on if we can just fix that in os-net-config12:16
shardyit seems to fail the idempotency requirement12:16
shardyassuming the config itself hasn't actually changed12:16
gfidenteshardy thanks12:18
pradkshardy, gfidente, the gnocchi patch is passing ha and nonha can we get some +2's and get it in?
*** ramishra has joined #tripleo12:21
*** shadower39 has joined #tripleo12:21
*** xinwu has quit IRC12:22
jistrah so by the action: ['CREATE'] on NetworkDeployment we prevent changes to the configuration coming from o-a-c, but we do not prevent os-net-config from running, so a change in os-net-config itself can still result in network bumps etc.12:22
*** shadower39 has quit IRC12:22
hewbroccajistr: ARRRGH12:22
hewbroccayou're right, that's the problem12:22
shardyjistr: Yes, that's my understanding of it12:22
shardyjistr: until we land
hewbroccaI still think it's narsty that making a perfectly reasonable network change12:22
hewbroccablats over your resolv.conf12:22
shardywhich is my WIP attempt to kill all the os-net-config oac/orc stuff12:23
hewbroccathat seems like the real problem to me12:23
*** zoli|lunch is now known as zoli12:23
mariosjistr:  mcornea shardy: updated bug 1324160 in rhel-osp-director "Overcloud nodes have an empty /etc/resolv.conf post upgrade" [High,New] - Assigned to mandreou12:23
jistrhewbrocca: yea that's another issue probably. Re-running os-net-config should ideally generate pretty much the same resolv.conf as before.12:24
shardymarios: hehe, mid-air-collision, added some notes also12:24
hewbroccaUnless someone makes an explicit change to affect resolv.conf...12:25
hewbroccait should generate EXACTLY the same one12:25
hewbroccaI mean12:25
mariosshardy: ack thanks i was going to ask you but then thought we'd already hassled you enough12:25
hewbroccathat's a fairly large bug12:25
shardymarios: np, I'll push on getting that mega-patch linked into shape so we can move away from the element stuff completely12:25
*** pradk_ has joined #tripleo12:26
jistralso, i think i sunk my pitch for this patch in the previous discussion
* jistr working on pitching skills12:26
hewbroccaALL THIS CAN BE YOURS12:27
hewbroccafor the price of one tiny +212:27
hewbrocca^^^ pitch12:27
mariosjistr: 10000 milliseconds12:27
jistrten thousand12:28
mariosjistr: yeah but12:28
marios    $cluster_setup_extras = { '--token' => hiera('corosync_token_timeout', 1000) }12:28
* shardy was just about to ask the same12:28
jistryea i explained in the commit message, but i'm open to -1s :D12:29
* marios re-reads12:29
jistri wanted the manifest default to be the same as the real corosync default12:29
gfidentebandini ipv6 and no routes?12:29
jistrbut if that feels weird i can sync it with hiera, or remove the manifest default altogether12:29
shardyOk, provided it was intentional +2 :)12:29
bandinigfidente: yea, unless I misremember. In which case ignore me ;)12:30
mariosjistr: i see so you set 10K but if that is unset for some reason, set the default which is 1K12:30
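The lookup-with-default behaviour can be sketched in Python. This is only an analogy for the Puppet line quoted above: the hiera function here is a stand-in built on dict.get, and the hieradata dictionaries are hypothetical.

```python
def hiera(hieradata, key, default):
    """Stand-in for a hiera lookup: use the configured value, else the default."""
    return hieradata.get(key, default)


def cluster_setup_extras(hieradata):
    # Mirrors the quoted manifest line: fall back to 1000 ms if unset.
    return {"--token": hiera(hieradata, "corosync_token_timeout", 1000)}


with_override = cluster_setup_extras({"corosync_token_timeout": 10000})
without_override = cluster_setup_extras({})
print(with_override, without_override)
```

The templates supply 10000, and the manifest default of 1000 only applies when the key is missing.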
mariosjistr: yeah ok, i mean if we just want the default then i'd rather not set anything at all but it works like that too12:31
mariosjistr: i guess it's not likely to change in the module ...12:31
jistrmarios: right, not setting anything would be best functional-wise, but it could make some complex/ugly puppet. We already if/else based on IPv6 in the same spot.12:32
mariosjistr: ack i +2 with comment12:32
jistrmaybe using joins as what we do with cinder backends could be done12:32
jistrhmm cinder backends is array though, not map, but perhaps there's a similar approach12:34
jistrif it's not a blocker i'd be inclined to land it though12:35
jistrcan we do so without -upgrades passing? it has about 1 in 10 (or less) chance of passing, judging by
jistrso recheck might not help12:36
*** Marga_ has quit IRC12:36
trown+1 to ignoring upgrades job until we up the pass rate12:37
trownit is below 10% over the last 3 days conditional on the nonha job passing on the same patch12:37
openstackgerrit Dan Prince proposed openstack-infra/tripleo-ci: Metrics tracking for TripleO deployment tasks
openstackgerrit Dan Prince proposed openstack-infra/tripleo-ci: Add common bash functions to help track metrics.
mariosjistr: gfidente shardy how about a exclude=os-net-config to prevent update for now in yum.conf?12:38
trownit has also passed 0 times in 31 tries in the last 3 days on stable/liberty12:39
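A small sketch of the kind of conditional pass rate trown is quoting: the share of runs where one job passed, counted only among runs where another job passed on the same patch. The sample runs below are made up for illustration and are not real CI data; only the 0 out of 31 figure comes from the channel.

```python
def conditional_pass_rate(runs, job, given):
    """Pass rate of `job` among runs where `given` passed; None if no such runs."""
    eligible = [run for run in runs if run.get(given)]
    if not eligible:
        return None
    passed = sum(1 for run in eligible if run.get(job))
    return passed / len(eligible)


# Made-up sample: 10 runs where nonha passed (upgrades passed once),
# plus 2 runs where nonha failed and which are therefore ignored.
sample_runs = (
    [{"nonha": True, "upgrades": True}]
    + [{"nonha": True, "upgrades": False}] * 9
    + [{"nonha": False, "upgrades": False}] * 2
)
sample_rate = conditional_pass_rate(sample_runs, "upgrades", "nonha")

# The stable/liberty figure from the discussion: 0 passes in 31 tries.
liberty_rate = 0 / 31
print(sample_rate, liberty_rate)
```

Conditioning on the nonha result filters out runs that failed for reasons unrelated to the upgrades job.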
*** julim has joined #tripleo12:39
derekhgoing AFK for a bit, started looking into why the periodic job is failing, looks like it's an error coming back from the IPA agent
derekhwill resume looking into it later12:40
gfidentetrown I was long looking for :)12:40
trownit is pretty sweet12:40
trownbeing able to pipe output to public paste for the win12:41
* derekh sometimes uses fpaste, looks better less dependencies 12:41
trownderekh: as in no deps... just curl12:41
derekhtrown: curl is a dep12:41
trowntechnically you could use any tool that can do HTTP POST12:42
*** ramishra has quit IRC12:42
derekhand technically they're all deps12:42
gfidentederekh is there a cli client for fpaste?12:43
mariosmcornea: do you still have the env you tried the [] on?12:44
*** dprince has joined #tripleo12:44
mcorneamarios: yes; credentials are in bug 1324160 in rhel-osp-director "Overcloud nodes have an empty /etc/resolv.conf post upgrade" [High,New] - Assigned to mandreou12:45
mariosmcornea: thanks just want to confirm the os-net-config update12:45
*** julim has quit IRC12:45
mariosmcornea: looking12:45
gfidentederekh trown but hey it's just a GET or POST with fpaste too12:45
*** tiswanso has quit IRC12:46
hewbroccamarios: we need the stuff in the new os-net-config, is the problem :(12:46
mariosApr 06 11:57:40 Updated: 1:NetworkManager-config-server-1.0.6-29.el7_2.x86_6412:46
derekhgfidente: yup install fpaste  , but I prefer trowns method12:47
jistrthis is what the -upgrade job failed at -- ping validation12:47
jistrTrying to ping fd00:fd00:fd00:3000::12 for local network fd00:fd00:fd00:3000::/64...SUCCESS12:47
jistrdub 06 12:18:36 overcloud-cephstorage-0 os-collect-config[5264]: Trying to ping fd00:fd00:fd00:4000::12 for local network fd00:fd00:fd00:4000::/64...SUCCESS12:47
jistrdub 06 12:18:36 overcloud-cephstorage-0 os-collect-config[5264]: Trying to ping default gateway
jistrdub 06 12:18:36 overcloud-cephstorage-0 os-collect-config[5264]: is not pingable.12:47
*** rbrady has joined #tripleo12:48
marioshewbrocca: yeah and it is indeed updated Apr 06 11:56:20 Updated: os-net-config-0.2.2-1.el7ost.noarch12:48
* marios just sanity checking12:48
*** jpena|lunch is now known as jpena12:49
openstackgerrit Merged openstack/tripleo-heat-templates: Increase corosync token timeout
jistrmarios: hmm trying to think how to update it, but fully prevent it from running on existing deployments. We could just rm 20-net-config, but that feels a bit harsh.12:58
*** akshai has joined #tripleo13:01
jistrhmm looking at failed -- also failed at *AllNodesValidationDeployment (this time controller, not ceph). It's intermittent though, so it could be some kind of a race condition.13:03
openstackgerrit OpenStack Proposal Bot proposed openstack/python-tripleoclient: Updated from global requirements
trownderekh: I added an etherpad to the trunk issues etherpad that describes building a trunk image using tripleo-quickstart, I am building an image now, and will poke at it
*** saneax is now known as saneax_AFK13:06
*** nico_auv has joined #tripleo13:06
*** myoung has joined #tripleo13:10
*** myoung|remote has joined #tripleo13:11
*** myoung|remote has quit IRC13:11
shardyguys what's the status of the upgrades job now?13:17
shardywe landed the use-master-heat patch, but what issues remain to get that passing again?13:17
*** Goneri has joined #tripleo13:18
jistrshardy: intermittent failures trying to ping default gateway13:19
jistri pasted some info above13:19
jistrcould be some kind of race perhaps13:20
jistrsubmitted if it can help us spot anything interesting13:20
shardyjistr: ack, thanks13:21
*** dprince has quit IRC13:22
shardythe upgrades job is the only one we have w/ipv6 so are we blocking everything until we get this resolved?13:22
*** tiswanso has joined #tripleo13:23
shardypradk_: ^^ this is the risk if we land your patch with failing CI, we have no coverage of ipv6 unless the upgrades job passes13:23
openstackgerrit Michele Baldessari proposed openstack/tripleo-heat-templates: Add update path for keystone to be moved under wsgi
*** dustins has joined #tripleo13:24
*** dprince has joined #tripleo13:29
*** bvandenh has joined #tripleo13:34
*** pblaho has joined #tripleo13:41
*** tzumainn has joined #tripleo13:41
jristdprince: !13:43
jristthat was fast13:43
dprincejrist: yeah well it auto-deploys man13:44
*** jaosorior has quit IRC13:44
dprincejrist: I just accepted the pull request ;)13:44
dprincejrist: I did test it first mind you. But it looked fine to me13:44
pradkshardy, from what i can tell the upgrade failure is not related to my patch13:45
dprincejrist: much improved, one minor suggestion would be to make the letters "pop" a bit more13:45
jristdprince: same13:45
jristdprince: hmm13:45
pradkshardy, there was one issue i found and fixed already
jristI thought the drop shadow did that13:45
dprincejrist: they are a bit gray to me... but totally just me perhaps13:45
dprincejrist: seriously, it is a huge step forward. So ++13:46
jristwell it is grey13:46
dprincejrist: let it ride man. It is good13:47
derekhtrown: ack13:47
dprincejrist: print the t-shirts, stickers, hats13:47
jristdprince: yeah we can always iterate :)13:47
jristdprince: I think rbrady is working on stickers :)13:47
hewbroccaWe can make swag happen13:48
rbradyjrist: limited run of vinyl13:48
jristhewbrocca: \o/13:48
slagledo i get a free mp3 download with the vinyl13:48
mariosjrist: link or it didn't happen13:48 marios !13:48
*** pradk_ has quit IRC13:49
marioslol /me grabs his coat13:49
*** jprovazn has quit IRC13:50
*** egafford has joined #tripleo13:51
bandinidprince: I am *totally* feeling the owl13:52
*** ramishra has joined #tripleo13:52
*** ramishra has quit IRC13:56
openstackgerrit Jiri Tomasek proposed openstack/tripleo-ui: Move Environment and Parameters config to single modal
openstackgerrit Pierre Blanc proposed openstack/tripleo-heat-templates: Add network ExtraConfig hook
dprincebandini: thank jrist13:59
jristbandini: glad to hear :)13:59
*** apetrich has quit IRC14:00
bandinijrist: \o/ ;)14:00
*** saneax_AFK is now known as saneax14:02
*** electrofelix has quit IRC14:02
*** jaosorior has joined #tripleo14:12
*** mkovacik has quit IRC14:13
*** snecklifter_ has quit IRC14:15
*** snecklifter_ has joined #tripleo14:15
snecklifterHi, sorry to be a noob but my patch is failing and I'm unsure if it's my problem or a wider issue with CI at the moment?14:21
*** ramishra has joined #tripleo14:22
trownsnecklifter: no need to apologize, the upgrades job is very unstable, and the ha job is only slightly better. Folks are working on it though14:23
sneckliftertrown: thanks, so do you think I need to spend more time on debugging my patch or is it an issue with CI?14:23
snecklifterI have done one re-check and same failure14:24
trownsnecklifter: I would not spend time debugging those jobs right now, I will add myself to your patch and recheck when CI is in a better state14:24
trownsorry for the inconvenience14:24
sneckliftertrown: no problem, stuff happens, thanks for responding14:24
*** ramishra has quit IRC14:27
*** dustins has quit IRC14:30
*** Goneri has quit IRC14:32
*** oshvartz has quit IRC14:33
*** dustins has joined #tripleo14:34
openstackgerrit Giulio Fidente proposed openstack/tripleo-heat-templates: Update .sh references from openstack-keystone to openstack-core
gfidentebandini jistr ^^ massive removal14:49
gfidentebandini given you call the migration script early14:54
*** jaosorior has quit IRC14:54
gfidenteI think we might also remove the if openstack-core thing14:54
gfidenteand just disable it14:54
gfidentesee inline comments14:54
hewbroccashardy, gfidente any more progress on that resolv.conf thing?14:56
*** mbound has quit IRC15:01
*** mbound has joined #tripleo15:01
*** julim has joined #tripleo15:06
shardyhewbrocca: Not from me - it looks like the added subnet mask comes from this commit:15:07
shardyI was hoping for some feedback from dsneddon or dprince15:07
gfidenteshardy yeah so given the config.json does not change15:07
gfidenteand removing the orc script isn't great15:08
gfidenteeither we remove that15:09
*** mbound has quit IRC15:09
gfidenteor add dns_servers for all interfaces in the nic templates15:09
*** ramishra has joined #tripleo15:09
derekhtrown: ImportError: No module named ironic_lib15:09
*** dshulyak has quit IRC15:10
trownderekh: weird... we fixed that
trownderekh: are you using "consistent"15:10
hewbroccagfidente: wait15:11
*** egafford has quit IRC15:11
hewbroccaadd dns_servers for all interfaces...15:11
*** absubram has joined #tripleo15:11
derekhtrown: yup15:11
hewbroccawhy wouldn't we just *do that*?15:11
hewbroccaSurely it is the correct solution15:11
trownderekh: because it is unfortunately a bit old since there is a new dep added for oslo config generator that is not packaged yet15:11
gfidentehewbrocca :(15:11
trownderekh: I think while we wait for RDO to pick up newton we have to use current15:11
hewbroccagfidente: like, I am perfectly fine with os-net-config updating the vlan config15:12
hewbroccathat is not a bug AFAIK15:12
hewbroccaBut it shouldn't hose the nameservers when it does it15:12
trownderekh: kind of hard for RDO to pick up newton before releasing mitaka15:12
derekhtrown: ok, I'm gonna start over with /current and see how far I get15:12
derekhtrown: yup, a lot of balls to juggle15:12
trownderekh: did you disable mistral in your run? I got stuck there15:12
hewbroccaSo if we can prevent that by putting dns_servers everywhere15:12
hewbrocca... fine?15:12
derekhtrown: nope did a run as close to ci as possible, the ironic-lib dep is the first thing I hit15:13
*** dustins_ has joined #tripleo15:13
trownderekh: ok, probably something specific to tripleo-quickstart then15:14
derekhtrown: my main problem was figuring out how to get onto the instance and see the error15:14
derekhtrown: possibly15:14
openstackgerrit Steven Hardy proposed openstack-infra/tripleo-ci: --delorean-build handle oslo.* package builds
gfidentehewbrocca give me a little more to figure why ifcfg is doing the nameservers cleanup15:14
*** pblaho has quit IRC15:14
gfidentehewbrocca I think it's ifcfg which is mistakenly clearing resolv.conf15:14
*** dustins has quit IRC15:16
*** Marga_ has joined #tripleo15:16
hewbroccagfidente: OK, makes sense15:16
*** dtantsur|brb is now known as dtantsur15:19
*** mcornea has quit IRC15:22
gfidentehewbrocca though we didn't want the interfaces to go up/down to not interfere with pcmk15:22
*** leanderthal is now known as leanderthal|afk15:22
openstackgerrit Dimitri Savineau proposed openstack/python-tripleoclient: Generate ceph client key
openstackgerrit Dimitri Savineau proposed openstack/tripleo-heat-templates: Use a different ceph key for admin/client user
hewbroccagfidente: yes, that is true15:24
hewbroccawon't matter on the upgrade though since cluster is down anyway15:25
gfidenteshardy so it looks to me if we call first_v6.ip we can still support multiple ips but we won't update it to /64 ?15:25
hewbroccaand for that matter15:25
hewbroccait wouldn't matter on the update because the cluster is down then too15:25
gfidenteoh right it's in maintenance at that stage15:25
*** mcornea has joined #tripleo15:27
*** mcornea has quit IRC15:27
hewbroccabecause yum update makes a lot of things misbehave15:29
*** dshulyak has joined #tripleo15:29
*** shardy has quit IRC15:31
*** rajinir has joined #tripleo15:32
*** mikelk has quit IRC15:36
gfidenteso I reproduced it15:36
gfidentewe set the DNS in the ifcfg-br interface but if one of the vlan ifcfg interfaces is brought down and up after that, and it doesn't have DNS15:37
gfidentethe resolv.conf will be restored15:37
hewbroccagfidente: what does that mean15:38
openstackgerrit Jiri Stransky proposed openstack-infra/tripleo-ci: Add routing info to host_info.txt
hewbroccapoor jistr is still trying to fix CI15:42
hewbroccajistr: good work...15:42
hewbroccagfidente: so is what you're describing above a solution?15:42
jistrwell this is just adding debug info15:42
gfidentehewbrocca no15:42
derekhtrown: ^[[mNotice: /Stage[main]/Mistral::Db::Sync/Exec[mistral-db-sync]/returns:     raise TypeError('{!r} is not a Python function'.format(func))^[[0m15:42
hewbroccagfidente: excellent15:43
derekhtrown: is that the mistral error you were seeing?15:43
hewbroccagfidente: what is the solution? :)15:43
gfidentehewbrocca I think there is a change we could do in os-net-config to make it not rerun15:43
*** julim has quit IRC15:43
trownderekh: indeed15:44
gfidentepassing dns_servers to all interfaces is an alternative15:44
gfidentebut I think the real issue is in the ifup-post thing15:45
hewbroccagfidente: passing dns_servers to all interfaces has the advantage of being a doc fix (basically)15:46
hewbroccalike we could do that *now*15:46
gfidenteuhm we need to do that in the nic templates we ship15:47
hewbroccabut still, basically a doc fix?15:47
gfidenteI can put up a change for that anyway yes15:47
gfidentemaybe together with the os-net-config change15:47
bnemecWhich is a problem because it means all existing templates in the field are now broken.15:47
hewbroccayay field15:47
hewbroccabnemec: you are right15:47
bnemecRequired changes to the nic-configs are extremely painful.15:47
hewbroccaI mean, that's still a bandaid15:48
hewbroccaWhat I want is for ifconfig to stop blatting over resolv.conf15:48
gfidentehewbrocca yeah that's what I am looking into now15:48
jistrgfidente: just wondering if passing dns_servers is enough. We get a good resolv.conf, but the ifaces still get bumped, which could make pacemaker go wild, no?15:48
hewbroccajistr: cluster's down, isn't it?15:48
bnemecYeah, I don't understand how this whole per-interface DNS is supposed to work.  There's only one resolv.conf on the system.15:48
gfidenteifup script in network-scripts15:49
bnemecI assume it made sense to someone somewhere, but it doesn't to me.15:49
hewbroccabnemec: yeah. It seems just basically flawed15:49
hewbroccadsneddon: probably has a better answer15:49
*** yamahata has joined #tripleo15:49
jistrhewbrocca: good point, it could be. I'm not sure where exactly the 20-net-config run which does this happens, tbh. It's not during yum update, but i think it would be before pcs cluster start gets applied.15:50
jistrso yeah cluster is probably down15:50
gfidenteso the problem is as follows15:51
gfidenteyou start with an empty resolv.conf15:52
gfidenteifcfg-br makes a copy of it and uses the DNS servers we specify15:52
gfidentewhen later we re-run os-net-config it only has to update the vlans15:52
gfidentein bringing those down it restores the empty resolv.conf to what it was15:52
gfidenteand in bringing it up it doesn't write back the changes because the vlan interfaces have no dns_servers15:53
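The sequence gfidente describes can be sketched as a toy model. This is only an illustration under stated assumptions: the `Host` class, its `saved` backup slot, and the `ifup`/`ifdown` methods are hypothetical stand-ins, not the real network-scripts or os-net-config code, and the nameserver address is a documentation placeholder.

```python
class Host:
    """Toy model of one shared resolv.conf with a single backup copy."""

    def __init__(self):
        self.resolv_conf = []  # nameservers currently in resolv.conf
        self.saved = None      # backup taken before resolv.conf is rewritten

    def ifup(self, dns_servers):
        # Assumption: only an interface that carries DNS settings rewrites
        # resolv.conf, and it backs up the previous contents first.
        if dns_servers:
            self.saved = list(self.resolv_conf)
            self.resolv_conf = list(dns_servers)

    def ifdown(self):
        # Assumption: bringing any interface down restores the backup.
        if self.saved is not None:
            self.resolv_conf = self.saved
            self.saved = None


host = Host()                 # start with an empty resolv.conf
host.ifup(["192.0.2.53"])     # bridge comes up with dns_servers set
after_bridge = list(host.resolv_conf)
host.ifdown()                 # os-net-config re-run takes a vlan down...
host.ifup([])                 # ...and up again, but the vlan has no dns_servers
after_vlan_bounce = list(host.resolv_conf)
print(after_bridge, after_vlan_bounce)
```

In this model the second list comes back empty, which matches the empty resolv.conf reported after the upgrade; either giving every interface dns_servers or preventing the restore step would avoid it.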
openstackgerrit Ben Nemec proposed openstack-infra/tripleo-ci: Use split cirros image for ping test
bnemecderekh: slagle: ^ may help our CI load significantly.15:53
bnemecRunning the ping test locally, I realized how long it takes to create and delete a 10 gig volume.15:54
bnemec1 gig should be a lot better.15:54
hewbroccabnemec: +1!15:54
hewbroccagfidente: oh... huh... that sucks, doesn't it15:54
bnemecAlso, the cirros image was basically pingable within seconds of the stack being complete, which is also a huge improvement.15:55
*** dshulyak has quit IRC15:56
hewbroccaOK folks, I really do have to take off15:56
*** electrofelix has quit IRC15:56
*** yamahata has joined #tripleo15:59
hewbroccaGet all the bugs fixed please, would you?16:00
openstackgerrit Giulio Fidente proposed openstack/os-net-config: Do not append netmaks suffix to IPV6 addresses
gfidentebnemec jistr so ^^ restores the behaviour we had before16:01
gfidentebut there could still be other things which trigger a vlan down/up16:01
mburnedgfidente: that's for the resolv.conf issue?16:01
gfidentemburned yeah16:02
bnemecgfidente: Is there just not supposed to be a netmask on the address for ipv6?  I think I missed that part of the discussion.16:02
*** devvesa has quit IRC16:02
gfidentethere can be, it's optional16:02
gfidentewhile for ipv4 it goes in the NETMASK thing16:03
gfidentelet me add some more stuff in the commit msg16:03
*** pblaho has quit IRC16:03
bnemecThat would be good.16:04
openstackgerrit Giulio Fidente proposed openstack/os-net-config: Do not append netmaks suffix to IPV6 addresses
gfidentebnemec done, but it isn't fixing an issue16:05
gfidenteit's just trying to stop os-net-config from rerunning16:06
hewbroccaI would love it if there was some -- *any* -- better way to fix this problem16:07
hewbroccasince this patch doesn't really fix it in any real sense16:07
hewbroccabut, it's better than nothing I suppose16:07
hewbroccanow I really am leaving16:07
gfidenteyeah it's just meant to not break things16:07
gfidenteI can also add dns_servers to all nic templates16:07
gfidentebut looks to me the issue is in the behaviour that the ifup script has16:08
* jistr gtg too16:10
*** jistr has quit IRC16:10
derekhtrown: I seem to be able to reproduce that mistral error quite easily, any idea why they didn't see it?
trownderekh: so it is a mistral issue not a puppet-mistral issue16:16
*** jdob has quit IRC16:16
derekhtrown: yup, looks like it was a problem introduced yesterday
derekhtrown: I'm asking on #openstack-mistral now16:17
trownderekh: sadly I think the same thing that is causing failures to build on master (missing networkx dep) is making cinder-api loop on start-up so validation fails16:17
trownderekh: so I am not sure we will be able fix trunk without RDO adding the new dep16:17
derekhtrown: ok, has anybody started the networkx package? if not I'll take a look once I get past this mistral thing16:19
trownderekh: I know apevec and number80 started on it, not sure how far they got. probably good to sync with them16:20
derekhtrown: ack, thanks16:21
*** Ryjedo has quit IRC16:23
gfidenteI might actually have found a proper fix :)16:27
*** jistr has joined #tripleo16:28
*** trown is now known as trown|lunch16:30
*** ohamada has quit IRC16:31
*** jtomasek has quit IRC16:33
openstackgerrit Giulio Fidente proposed openstack/os-net-config: Use PEERDNS when no dns_servers is provided
*** jistr is now known as jistr|off16:34
*** Marga_ has quit IRC16:34
*** dprince has joined #tripleo16:34
*** Marga_ has joined #tripleo16:37
openstackgerrit Dan Prince proposed openstack/os-net-config: Bump hacking in test-requirements.txt
*** Marga_ has quit IRC16:44
*** MaxPC has quit IRC16:46
*** tosky has quit IRC16:47
derekhtrown|lunch: will check back on ^ later, and look out for a networkx fix16:48
*** coolsvap has joined #tripleo16:48
bnemecgfidente: That looks good.  It will still trigger an interface restart though, right?16:57
openstackgerrit Giulio Fidente proposed openstack/os-net-config: Use PEERDNS when no dns_servers is provided
gfidentebnemec yeah and update the config16:57
gfidente^^ just updated16:57
gfidentewith tests16:57
bnemecgfidente: Okay, it just sounded like that might cause grief for pacemaker.16:58
*** jpena is now known as jpena|off16:58
*** ayoung has quit IRC16:59
*** saneax is now known as saneax_AFK17:00
gfidentebnemec yeah so I kept both submissions up17:01
gfidentebnemec the updated version of PEERDNS only applies when some ipaddress is provided cause dhcp is the default in and in that case we want to keep it to yes, I suppose17:01
*** apetrich has quit IRC17:03
bnemecgfidente: Yeah, it's weird.  The docs say "If using DHCP, then yes is the default. " which seems to imply when not using DHCP no is default, but that doesn't seem to be true.17:03
gfidenteyeah the ifup script just checks if PEERDNS == no17:04
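The PEERDNS idea under discussion could look roughly like the sketch below. It is not the os-net-config patch: `render_ifcfg` and its parameters are hypothetical, the interface names and addresses are placeholders, and the rule it encodes (emit `PEERDNS=no` only for a statically addressed interface with no dns_servers, leaving DHCP interfaces alone) is taken from gfidente's description above.

```python
def render_ifcfg(name, ip_address=None, netmask=None, dns_servers=()):
    """Return ifcfg-style lines for one interface (hypothetical helper)."""
    lines = [f"DEVICE={name}", "ONBOOT=yes"]
    if ip_address:
        lines += ["BOOTPROTO=none", f"IPADDR={ip_address}"]
        if netmask:
            lines.append(f"NETMASK={netmask}")
    else:
        # No static address given: fall back to DHCP and leave PEERDNS alone.
        lines.append("BOOTPROTO=dhcp")
    for index, server in enumerate(dns_servers, start=1):
        lines.append(f"DNS{index}={server}")
    if ip_address and not dns_servers:
        # Static interface without DNS: ask ifup not to touch resolv.conf.
        lines.append("PEERDNS=no")
    return lines


vlan = render_ifcfg("vlan10", ip_address="192.0.2.10", netmask="255.255.255.0")
bridge = render_ifcfg("br-ex", ip_address="192.0.2.2",
                      netmask="255.255.255.0", dns_servers=["192.0.2.53"])
dhcp = render_ifcfg("eth0")
print("\n".join(vlan))
```

The static-address-plus-DNS case (`bridge` here) is the configuration bnemec asks to have covered by a test.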
bnemecI'm pretty sure I looked into this option for another thing a while ago and was equally unimpressed by the documentation that time.17:04
gfidenteshells growing out of control :P17:05
*** ramishra has quit IRC17:05
bnemecgfidente: Is there a test that sets both a static address and DNS like we would in tripleo?  I don't see one, and that's an important configuration for us to keep working.17:08
bnemecI think it should be fine, but it might be good to add coverage of that if we don't have it so we make sure it keeps working going forward.17:08
*** dprince has quit IRC17:08
*** dprince has joined #tripleo17:09
gfidentebnemec there is one yes17:10
gfidentebnemec L6317:11
openstackgerrit greghaynes proposed openstack/diskimage-builder: Fix ssh key cleanup to run in chroot
bnemecgfidente: That isn't setting DNS, is it?17:12
gfidenteah right17:13
*** jdob_lt has joined #tripleo17:13
bnemecI'm looking for something like line 553, but with IPADDR set too.17:13
*** jtomasek has joined #tripleo17:14
gfidenteL573 too17:14
bnemecYeah, but those don't include a static address.17:14
gfidenteok let me try17:15
*** eil397 has joined #tripleo17:18
openstackgerrit Giulio Fidente proposed openstack/os-net-config: Use PEERDNS when no dns_servers is provided
gfidentecause it also doesn't make sense to pass DNS1 when BOOTPROTO=none17:20
gfidenteand anyway the test below is doing the same thing17:20
openstackgerrit Merged openstack/diskimage-builder: Revert "Skip centos functional testing"
*** eil397 has left #tripleo17:23
openstackgerrit Miles Gould proposed openstack/instack-undercloud: Install UEFI version of iPXE ROM in /tftpboot
*** Marga_ has joined #tripleo17:28
*** jdob_lt has quit IRC17:33
*** ayoung has joined #tripleo17:33
mburnedgfidente: nice, peerdns=no17:34
openstackgerrit Ben Nemec proposed openstack/instack-undercloud: Force rebuild of ramdisk as part of overcloud-full
dprincegfidente: is there a bug for ?17:35
gfidentemburned looks like it should do it yes17:35
gfidentedprince no I was working based on a BZ bug 1324160 in rhel-osp-director "Overcloud nodes have an empty /etc/resolv.conf post upgrade" [High,On_dev] - Assigned to gfidente17:36
openstackgerrit Ben Nemec proposed openstack/instack-undercloud: Force rebuild of ramdisk as part of overcloud-full
gfidentedprince it's an intricate problem though17:36
gfidentethe config.json isn't changing17:36
gfidentebut yet we need to update the ifcfg files17:36
gfidenteso firstly I tried to avoid it with
gfidentebut that is meh17:37
*** jdob_lt has joined #tripleo17:37
*** ayoung has quit IRC17:38
*** shivrao has joined #tripleo17:38
*** ayoung has joined #tripleo17:39
dprincegfidente: filing a bug upstream that we can track might still be useful. Perhaps even link to the BZ in the LP bug too17:39
dprincegfidente: otherwise no one will know why you did this17:39
gfidentedprince yeah I don't do that much :(17:40
gfidentewill do17:40
dprincegfidente: thanks17:40
jristwhat's our slogan17:41
jristfor the t-shirts17:41
jrist"it's like wearing pants on pants"17:42
*** tosky_ has joined #tripleo17:42
*** tosky_ has quit IRC17:42
*** hewbrocca is now known as hewbrocca-afk17:43
*** nico_auv has quit IRC17:44
*** david-lyle has quit IRC17:44
openstackgerrit Giulio Fidente proposed openstack/os-net-config: Use PEERDNS when no dns_servers is provided
gfidentejrist pop17:50
gfidenteI'd say l on l so we can make it a LOL17:51
*** ayoung has quit IRC17:52
*** trown|lunch is now known as trown17:53
bnemecStackception - we have to go deeper!17:54
*** rcernin has quit IRC17:56
*** jprovazn has quit IRC18:00
*** jcoufal has quit IRC18:01
openstackgerrit Dan Prince proposed openstack/tripleo-heat-templates: Add ping_retry function
*** shivrao has quit IRC18:08
openstackgerrit Ben Kero proposed openstack/diskimage-builder: Fix add-apt-repository package for precise
*** dtantsur is now known as dtantsur|afk18:10
*** pradk has quit IRC18:14
openstackgerrit Dan Prince proposed openstack/tripleo-heat-templates: Revert "Ping retry"
dprincebnemec: slagle: I'd like to fast track that ^^^18:18
dprincebnemec: slagle: and then follow it up with a correct version
*** qasims has quit IRC18:20
*** shivrao has joined #tripleo18:21
*** yamahata has quit IRC18:21
dprincetrown: you reviewed that one too18:23
*** dustins_ is now known as dustins18:24
bnemecdprince: I'm okay with that.  Out of curiosity, how did it break?18:24
dprincebnemec: the bug describes everything. Quite cryptic actually18:24
slagledprince: in your fixed patch, you only retry 5 times18:24
dprincebnemec: the $ping bash variable got set to ping618:24
slaglei think rook needed the 10 retries18:24
bnemecOh, missed the bug ref. :-)18:25
dprinceslagle: I want to land the revert first18:25
dprinceslagle: happy to fix the new patch however we want18:25
bnemecWell, we weren't sure how many retries.  10 seemed to be a sane amount.18:25
dprince10 retries is totally fine18:25
*** pradk has joined #tripleo18:25
dprinceslagle, bnemec we've lost another day of CI due to this patch so I'd like to land the revert ASAP... (it should be safe to revert)18:26
bnemecdprince: Agreed.18:26
openstackgerrit Ethan Gafford proposed openstack/tripleo-heat-templates: Trove Integration
trown+1 to revert, looking at the bug just to understand what went wrong, but no need to wait for me18:26
slagledprince: sure, wfm18:26
dprincethanks guys18:28
*** ayoung has joined #tripleo18:29
openstackgerrit Zane Bitter proposed openstack/tripleo-heat-templates: Don't have separate protocols/ports for Keystone v3
*** mgould has quit IRC18:32
*** apetrich has joined #tripleo18:34
openstackgerrit Dan Prince proposed openstack/tripleo-heat-templates: Add ping_retry function
openstackgerrit Ethan Gafford proposed openstack/python-tripleoclient: Trove integration
dprincetrown: yeah, it worked by accident for IPv4. But for IPv6 $ping=ping618:36
trownI was having trouble understanding how that patch had anything to do with IPv6 vs IPv418:36
dprincetrown: and that is a total fail18:36
dprincetrown: ping6 fails clearly in all our logs18:36
*** coolsvap has quit IRC18:37
* bnemec wonders again why they decided to push the ipv4 vs ipv6 logic onto users instead of making ping smart enough to DTRT.18:37
trowndprince: did we add IPv6 to upgrades on Mar30?18:37
dprincetrown: it just merged yesterday18:37
trownthat could be the other tricky part about reviewing that patch is seeing the upgrades job pass on Mar30 on the same patch18:37
bnemecThis is the danger of not actually gating.18:38
dprincetrown: it most likely passed CI before the upgrades job was switched to IPv618:38
trowncool, all makes sense to me now18:38
openstackgerrit Dan Prince proposed openstack/tripleo-heat-templates: Add ping_retry function
dprincetrown: yeah, it all makes me sad. But 2 days left this week so let's see what happens now :)18:39
dprincebnemec: and yeah. this is why we want to gate18:39
*** gfidente has quit IRC18:41
*** apetrich has quit IRC18:41
bnemecNo package liberasurecode-devel available.18:44
rooksup slagle?18:46
rookwhat went on with the ping_retry patch?18:48
rookdprince: ^ ?18:48
dprincerook: oh hey Joe18:49
dprincerook: so it caused CI failures for all the IPv6 jobs (upgrades)18:50
rookpreviously the gateway ping was just ping... i didn't know if that was intentional18:50
rookso i left it alone.18:50
dprincerook: $ping got set to 'ping6'18:50
rookbut the gateway ping was previously just 'ping'18:50
dprincerook: correct18:50
rookso, that was intentional18:51
dprincerook: our IPv6 still had a default gateway that was IPv4. So with the new patch this was happening...18:51
dprincerook: ping6
dprincerook: that will always fail18:51
dprincerook: try this one
rookthanks dprince18:54
rooksorry about causing hell for CI18:54
rookwe didn't see this  :/18:54
dprincerook: ah, no worries. We didn't either. Just a bit of a window where the IPv6 job hadn't come online to catch this yet18:54
bnemecrook: It's an edge case that exposed a hole in our CI.18:54
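One way to avoid the mismatch dprince describes is to choose the ping binary from each target's own address family instead of from a single shared variable. The sketch below is an assumption-laden illustration, not the tripleo-heat-templates validation script: `ping_command`, `ping_retry`, and the injected `run` callable are hypothetical names, and the default of 10 retries simply mirrors the number mentioned in the discussion.

```python
import ipaddress


def ping_command(target):
    """Pick the ping binary from the target's own address family."""
    version = ipaddress.ip_address(target).version
    return "ping6" if version == 6 else "ping"


def ping_retry(target, run, retries=10):
    """Call run(command, target) until it succeeds or retries run out.

    `run` is a caller-supplied function returning True on success, so the
    sketch stays testable without touching the network. Returns the number
    of attempts used, or None if every attempt failed.
    """
    command = ping_command(target)
    for attempt in range(1, retries + 1):
        if run(command, target):
            return attempt
    return None


calls = []


def fake_run(command, target):
    # Stand-in for a real ping invocation: succeeds on the third call.
    calls.append((command, target))
    return len(calls) >= 3


# An IPv4 default gateway is pinged with "ping" even on an IPv6 deployment.
attempts = ping_retry("192.0.2.1", fake_run)
print(attempts, calls[0][0], ping_command("fd00:fd00:fd00:3000::12"))
```

Deriving the command per target keeps an IPv4 default gateway reachable on a deployment whose internal networks are IPv6.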
*** jayg is now known as jayg|g0n318:55
rookdprince are you on this call?18:58
rookbnemec: alright... anyway... trown has rode me hard about patches causing problems ;P18:58
dprincerook: no calls for me18:58
rooksent you a private dprince18:58
trownrook: lol, in this case I +2'd it so I have no basis to give you crap :)18:59
rooktrown :D19:01
rookwho uses ipv6 anyway!?!?!? :P19:01
rookplease don't whois me now19:02
trown[Whois] rook is rook!~jtaleric@2606:a000:4ed0:f300:21f:bcff:fe10:12aa (Joe Talerico)19:03
rookonly assholes use v619:03
rookanyhoo.. I apologize for yet again causing headaches...19:04
bnemecOkay, puppet syntax jobs are borked.19:04
* bnemec blames rook19:04
rookbnemec you and my wife.19:04
openstackgerrit Merged openstack/diskimage-builder: Fix ssh key cleanup to run in chroot
bnemecSo, we're hosed until infra gets Ansible to update Jenkins again.  Apparently there's a problem with it right now.19:13
*** jdob_lt has quit IRC19:15
bnemecdprince: FYI^  The ping revert and everything else with a puppet job is blocked indefinitely.19:16
*** ayoung has quit IRC19:16
*** ayoung has joined #tripleo19:16
* bnemec is seriously tempted just to take the rest of the afternoon off19:16
trownthe puppet jobs gate?19:16
trownw the actual f... of all the things we would have gating19:16
bnemectrown: Essentially.  Jenkins won't let anything into the gate with a failing check job.19:17
dprincebnemec: yeah, I'm noticing that it isn't wanting to land. Same package installation failure twice now...19:17
bnemecdprince: Yeah, there's a problem which should be fixed by
bnemecBut with Jenkins not updating it isn't actually applied to the environments yet.19:17
*** MaxPC has joined #tripleo19:24
*** jdob_lt has joined #tripleo19:29
*** jdob_lt has quit IRC19:29
*** jdob_lt has joined #tripleo19:29
*** jdob_lt has quit IRC19:30
*** jdob_lt has joined #tripleo19:30
*** jayg|g0n3 is now known as jayg19:33
*** mkovacik has quit IRC19:34
*** Goneri has quit IRC19:35
*** Goneri has joined #tripleo19:35
*** julim has joined #tripleo19:57
openstackgerrit Dan Radez proposed openstack/tripleo-heat-templates: Enable deployment of Ceph Storage (OSD) on the Compute Nodes
*** mgagne_ is now known as mgagne20:04
jristbnemec: I like your slogan20:04
*** rcernin has joined #tripleo20:09
*** pradk_ has joined #tripleo20:12
*** yamahata has joined #tripleo20:13
*** pradk_ has quit IRC20:13
*** pradk has joined #tripleo20:18
*** jprovazn has joined #tripleo20:21
*** mbound has joined #tripleo20:25
*** mbound has quit IRC20:25
*** mbound has joined #tripleo20:25
*** dustins has quit IRC20:38
slaglecan you not use get_input in a list_join?20:38
slaglejdob_lt: you're a heat person now. can i use get_input in a list_join?20:39
slaglei'm asking b/c it appears to not work :)20:39
jdob_ltin theory, i think it should20:42
slagleoh, i found a bug
openstackLaunchpad bug 1344284 in heat ""Items to join must be strings" - Intrinsic functions resolve before get_input in StructuredConfig resource" [Medium,Triaged]20:42
bnemecI feel like the ci status page may have quit updating.20:43
bnemecDid the fancy new logo overload the server? :-)20:43
slaglebnemec: i think it ran out of red ink20:43
*** trown is now known as trown|outtypewww20:44
*** apetrich has quit IRC20:48
*** qasims has joined #tripleo20:48
openstackgerrit James Slagle proposed openstack/instack-undercloud: Add hieradata override file
openstackgerrit Merged openstack/tripleo-heat-templates: Revert "Ping retry"
*** rcernin has quit IRC21:27
*** egafford has joined #tripleo21:27
*** jprovazn has quit IRC21:28
bnemec\o/ puppet must be working again21:30
*** tiswanso has quit IRC21:31
mburned<sigh> os-net-config ci already failed on nonha...21:39
mburnedat least that one passed on the last run21:39
*** ayoung has quit IRC21:39
* mburned crosses fingers that ha and upgrades pass21:39
dsneddonmburned, Let me know if os-net-config fails in the other CI passes. The latest change looked OK to me, but I don't know that dprince actually tested it.21:41
mburneddsneddon: it's gfidente's patch21:41
mburneddprince's is already landed21:41
dsneddonmburned, Right, I meant this one. Maybe it already passed CI:
mburnedlooks like it passed all but upgrades and they waived that since it's failing consistently21:42
bnemecWe should stop doing that now that the ipv6 ping patch has merged.21:44
bnemecWe suddenly seem to be having network issues.21:45
bnemecI've seen a couple of patches fail on DNS lookup failures for various things.21:45
bnemecI mean, why not?  Everything else has been broken today.21:46
mburnedbnemec: i don't have the power to waive it anyway...21:48
*** morazi has quit IRC21:49
openstackgerrit Ben Nemec proposed openstack-infra/tripleo-ci: TEST: Delete the overcloud when finished
openstackgerrit Ben Nemec proposed openstack-infra/tripleo-ci: Test overcloud deletion with net-iso in periodic job
openstackgerrit Ian Wienand proposed openstack/diskimage-builder: Turn of tracing around du invocations
openstackgerrit Ben Nemec proposed openstack/instack-undercloud: Add ability to auto-generate self-signed certificates
*** Goneri has joined #tripleo22:20
*** pradk has quit IRC22:36
openstackgerrit Ben Nemec proposed openstack/instack-undercloud: Generate most of the pystache context automatically
*** ayoung has joined #tripleo23:09
*** saneax_AFK is now known as saneax23:17
openstackgerrit Merged openstack/diskimage-builder: Turn of tracing around du invocations
