Wednesday, 2016-03-09

openstackgerritMerged openstack-infra/tripleo-ci: set -o pipefail in
openstackgerritDan Sneddon proposed openstack/os-net-config: Fix hierarchy for Linux Bonds and Linux Bridges
*** slagle changes topic to "TripleO | stable/liberty CI failing: | CI status: | Docs:"02:50
openstackgerritRyan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again
openstackgerritNisha Agarwal proposed openstack/diskimage-builder: Add psmisc to the packages for ironic-agent
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Add Rabbit IPv6 only support
*** admin0 has joined #tripleo08:04
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Add missing createUser line to /etc/snmp/snmpd.conf
gfidenteyesterday liberty/ci had an all green job but now it's failing consistently :(08:11
gfidentelooking into that08:11
gfidentemarios, last thing we needed was jenkins gate failing to run the tests08:33
gfidenteunbelievable :)08:33
openstackgerritMerged openstack/tripleo-common: Adds override for the overcloud node user in upgrade-non-controller
mariosgfidente: yeah i am rechecking all the things08:40
mariosgfidente: i also saw this one which is weird
gfidentemarios, but it's something to do with rake08:40
gfidenteit's out of our domain08:41
gfidentenow on liberty/ci failing08:44
gfidenteNeutron::Agents::Ml2::Ovs/Service[neutron-ovs-agent-service]/ensure: change from stopped to running failed: Could not start Service[neutron-ovs-agent-service]: Execution of '/usr/bin/systemctl start neutron-openvswitch-agent' returned 1: Job for neutron-openvswitch-agent.service failed because a timeout was exceeded08:44
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Add dib element to generate logical volumes
openstackgerritSteven Hardy proposed openstack-infra/tripleo-ci: Move into tripleo-ci repo
jaosoriormarios: Hey dude, regarding this review I gave an answer. Thing is, those ports (like the ironic one) are the ports on which the internal service is listening09:08
mariosjaosorior: ok thanks for checking ... i wasn't sure if that was the case like i poked at
jaosoriorand those ports are not set up in the loadbalancer.pp. They're the ones that are set up by the respective puppet manifests09:08
*** dtantsur|afk is now known as dtantsur09:10
mariosjaosorior: ok thanks revoted09:10
jaosoriormarios; Thanks dude!09:10
jaosoriormarios: But probably it would make sense to pass in those ports the internal services are listening on, by some means09:11
jaosoriorin another refactor of that manifest09:11
mariosjaosorior: well if it is useful/requested but yeah this is big enough a change09:11
jaosoriorbut yeah.... this manifest is getting too big09:11
shardy is the path forward for too-big manifests IMO09:14
jaosoriorshardy: Ah! I had seen that CR. Dude, I'm all in :D09:15
shardye.g moving most stuff into puppet-tripleo so it's not all deployed directly via SoftwareDeployments09:15
mariosshardy: nice09:15
jaosoriorI was waiting for the CI result yesterday, and forgot to check it today09:15
jaosoriorit looks promising09:16
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Enable glance-api show_image_direct_url for COW
dtantsurmorning folks! are you aware that the gate is probably broken by the puppet-lint job?09:44
jaosoriordamn, well, apparently the puppet lint gates are broken :/09:46
mariosthanks dtantsur explains (saw this this morning)09:48
jaosoriormarios: got any idea where this gate-instack-undercloud-puppet-lint stuff is?09:51
mariosgfidente: didn't you say it was something to do with rake?09:51
jaosoriorin some repo09:52
marios "NoMethodError: undefined method `last_comment' for #<Rake::Application:0x000000013ab500>"09:52
mariosjaosorior: no sorry i don't09:53
jistrmarios: this is the only script-delivery.yaml thing we don't have in master yet, right? I'll base the channel switching on top of that change.
jistrmarios: and good morning :)09:57
mariosjistr: good morning, double checking09:58
mariosjistr: there is still the swift fixup at as well09:59
jistrmarios: ah right, thanks. It's only touching the swift .sh file so it shouldn't conflict with the channel switching changes to the upgrade initialization YAMLs. So i hope it's conflict-safe to base this on top of the ceph patch. Thx!10:02
mariosjistr: yeah just mentioning it, the other one makes sense if you're basing another change onto it10:03
gfidentejaosorior, did your change just pass lint?10:53
shadowerwhat is undercloud using swift for?10:54
adarazsshadower: somebody probably thought the undercloud will work more swiftly when added. ;)10:55
openstackgerritMerged openstack/tripleo-heat-templates: Update enable-tls.yaml with new endpoints
openstackgerritMoshe Levi proposed openstack/diskimage-builder: Add lshw package to ironic-agent
shadoweradarazs: lol10:56
trowninspection stores data in swift10:57
adarazsmaybe even heat uses it for something /o\10:58
shadowerso, I'm seeing swift-proxy-server taking up tonnes of memory during the deployment10:58
trownglance is swift backed too.... I would bet the memory spike is loading images to glance10:59
shadoweroh is it?10:59
trownwhich could very well be some swift bug, or it might be expected...10:59
shadowertrown: I thought glance just used the images on disk. Because yeah, that would definitely fit what I'm seeing11:00
gfidenteshadower, heat uses it11:00
shadoweryea I realised that too now11:00
gfidenteshadower, and the deployartifacts thing which we are trying to use to install the modules on the nodes 'dynamically'11:00
gfidentebut for the deployartifacts I think we could use any http11:01
shadowergfidente: yep. Although this spike happens when booting the overcloud vms so the glance/image hypothesis fits my observations best so far11:01
gfidenteah this I didn't know though ;)11:02
shadowerI'm installing new centos + undercloud on another machine -- will see if the newer packages help any11:02
shadowergfidente: yeah, just checked the config and glance does indeed seem to use swift11:05
shardyshadower: derekh has seen the same thing, I'm not sure if there is a bug reference, but it seems like a swift bug11:14
*** adarazs is now known as adarazs_lunch11:14
shardy(or, we're configuring swift wrong I guess)11:15
shardyIt does seem wrong, I'd expect swift to chunkify the request and not load the whole thing into ram11:17
*** dshulyak has quit IRC11:18
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Make External Load Balancer templates work with IPv6
openstackgerritMerged openstack/tripleo-heat-templates: Change the default value for NetworkNexusVxlanGlobalConfig
rhalliseyderekh, morning11:25
rhalliseyI still can't figure out the gate :/11:25
rhalliseystill works locally11:25
rhalliseybut it keeps hanging here
derekhrhallisey: looking11:26
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Make External Load Balancer templates work with IPv6
openstackgerritMerged openstack/tripleo-heat-templates: Make External Load Balancer templates work with IPv6
openstackgerritMerged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Neutron settings
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Add dib element to generate logical volumes
openstackgerritMerged openstack/tripleo-heat-templates: Update VNI and TunnelID ranges.
openstackgerritMerged openstack/tripleo-heat-templates: Set swift replicas = min(device_count, replicas)
derekhrhallisey: I'm looking on the compute node for that test you linked, is there normally a journal log on these machines?11:38
openstackgerritMerged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Cinder settings
rhalliseyderekh, yes11:39
rhalliseyderekh, can you watch as the ci runs? O.o11:39
rhalliseythat would help me figure this out because heat does not return an error on this11:40
rhalliseyand the create just hangs...11:40
rhalliseyso confusing..11:40
rhalliseyderekh, look for docker-storage-setup in the journal11:41
rhalliseythat runs right after cloud-init on atomic11:41
derekhrhallisey: Yup, I can if we recheck one, but before we do that, you say there is normally a journal log, do you know if it's persisted to disk?11:43
derekhrhallisey: On our centos nodes it usually is, the end of the ci job tars it up for us to look at11:44
openstackgerritMerged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Heat settings
openstackgerritMerged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Glance settings
derekhrhallisey: It's usually somewhere like this on the centos nodes var/log/journal/f32e0af35637b5dfcbedcb0a1de8dca1/system.journal11:45
rhalliseydoesn't look like it persists..11:46
* shardy notes we've apparently given up on passing CI and code reviews for stable/liberty11:49
gfidenteso yesterday we had an all green job in liberty11:51
gfidentethen something went wrong and we hit this11:51
gfidenteError: Could not start Service[neutron-ovs-agent-service]: Execution of '/usr/bin/systemctl start neutron-openvswitch-agent' returned 1: Job for neutron-openvswitch-agent.service failed because a timeout was exceeded.11:51
gfidenteinterestingly this is happening for netiso and non netiso, only in liberty/ci11:51
jistrjlibosva just came over to chat about this minutes ago. it's been caused by a change in neutron11:52
gfidentebut I still couldn't figure the root cause of it11:52
jistrwe may not be hitting this in master CI because of being pinned maybe11:52
derekhrhallisey: lets run something like this in get_host_info, it should get you the whole log11:53
jistrgfidente: found out with jlibosva, but i'm not sure how to fix it best11:53
derekhjournalctl | gzip - > journal.log.gz11:53
jistrlemme type it :)11:53
derekhrhallisey: mind if I edit your tripleo-ci patch a little?11:53
rhalliseyderekh, ya go ahead11:53
rhalliseyderekh, let me add back in 2 deps11:53
rhalliseyI took out two to see if I could get different logs11:53
derekhrhallisey: ok go for it11:53
openstackgerritRyan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again
rhalliseyderekh, ok go ahead11:55
gfidentejistr, oh thanks11:55
gfidentewe were not seeing this yesterday though11:55
jistrgfidente, shardy: they implemented a change in ovs-agent that it only notifies systemd that it's up when it has connected to rabbit. This has been done apparently to fix recovery of controllers. L3 agent needs to start after OVS agent starts *and is connected to rabbit* it seems. So now they only do systemd notify when OVS connects to rabbit.11:55
gfidenteso we need to depend compute on controller11:56
gfidentedid I get it right?11:56
jistrgfidente, shardy: which is ok for controllers, but given that we deploy controllers and computes in parallel, OVS agent can start much earlier on compute, and hit systemd timeout because rabbit isn't running yet11:56
jistrgfidente: yes, exactly, but...11:57
gfidenteit's not nice, we tried to avoid that11:57
gfidentebut we had to do it for ceph fwiw11:57
jistri think it's valid to question if the neutron should behave like this in the first place11:57
shardyYeah, it will cause a significant increase in deployment time if we have to do that11:57
gfidenteas if we weren't slowing down enough using netiso :)11:58
jistryea. It basically means that controllers + computes cannot be deployed in parallel now, and i think that applies to every deployment, not just TripleO.11:58
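The race jistr describes (the OVS agent on a compute node starting before rabbit is up on the controllers, then hitting the systemd start timeout) can be illustrated with a small readiness check. This is only a sketch under assumptions: the helper name wait_for_port is hypothetical and is not something TripleO or neutron ships, and a successful TCP connect shows only that something is listening on the port, not that the broker is ready for clients.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if the endpoint is still unreachable when `timeout`
    seconds have elapsed. Hypothetical helper, for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves a listener exists; it does
            # not prove the broker can already serve AMQP clients.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A compute-side step could call something like wait_for_port(controller_host, 5672) before starting the agent, where controller_host is a placeholder and 5672 is the default AMQP port. The trade-off is the one raised in the channel: the wait moves into the node instead of being expressed as an ordering between Heat resources.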
*** mannidi has quit IRC11:58
gfidenteshardy, can I reprise this $hit over to liberty to see how it goes?11:59
gfidentejust to see if it passes so at least we have data11:59
shardyWe can probably minimise the increase by still building the nodes in parallel and only adding a depends_on to ComputeNodesPostDeployment I guess12:00
gfidenteyeah that's what the change was supposed to do12:00
shardygfidente: Ah, yeah I see12:01
shardyI tested another patch posted by zaneb which serialized the RG's, and that added like 2mins to my local deployment time (about 20%)12:01
shardyI guess this may be somewhat less12:01
gfidenteshardy, yeah I remember that12:01
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: Allow the continer job to run again
gfidenteyes but the good thing of zaneb's change was it preserved the order on update too if I remember correctly12:02
*** jaosorior has quit IRC12:02
gfidentewhile this patch is not, it's causing update to run on computes first12:02
derekhrhallisey: ^ that should get you the journal log in the compute tarball12:02
shardywell the post-deploy parts will still be serialized12:02
gfidenteso I'll update it just to see if it passes and -112:02
shardyso we just need to ensure all the update stuff happens in -post.yaml12:02
*** jaosorior has joined #tripleo12:03
*** trown is now known as trown|commute12:03
shardythe issue previously was UpdateDeployment was in e.g controller.yaml12:03
rhalliseyderekh, k thanks12:03
shardywe can/should probably move it, given that now we're doing update stuff in the Pre/Post puppet resources instead12:03
slaglejistr: are they fixing the neutron agent issue on their side, or is it on us?12:04
*** aufi has quit IRC12:05
jistrslagle: it's open. No decision on neutron side yet. Personally i'd +2 the patch that gfidente linked above in the meantime. That should be the fastest way to unblock us for stable/liberty merging, and then a revert can be discussed (both in neutron and in tripleo).12:06
slaglejistr: gfidente yep, ok. lets go ahead and backport that to get CI running on it12:08
michchap_'It basically means that controllers + computes cannot be deployed in parallel now, and i think that applies to every deployment'12:08
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Make compute nodes deployment depend on controller
michchap_We need to model the dependencies between components rather than using 'step'12:08
michchap_via a services registry or other mechanism so that any node can query for the state of a given service and make decisions about when to include classes12:09
michchap_Then you can deploy control and compute at the same time, and compute's neutron classes just wait for rabbit to be available in the registry before being included12:10
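The registry idea michchap_ outlines can be sketched in a few lines. Everything here is an assumption made for illustration: ServiceRegistry and deploy_compute are hypothetical names, the registry is an in-memory stand-in for a shared store that nodes would query, and nothing below reflects an existing TripleO interface.

```python
import threading


class ServiceRegistry:
    """In-memory stand-in for a shared service registry (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = {}

    def _event(self, name):
        # Create the per-service event lazily so that waiters and
        # announcers can arrive in either order.
        with self._lock:
            return self._events.setdefault(name, threading.Event())

    def mark_available(self, name):
        """Record that a service is up and wake anything waiting on it."""
        self._event(name).set()

    def wait_for(self, name, timeout=None):
        """Block until the service is available; False on timeout."""
        return self._event(name).wait(timeout)


def deploy_compute(registry, log, timeout=5.0):
    """Start the compute-side agent only after rabbitmq is registered."""
    if not registry.wait_for("rabbitmq", timeout):
        log.append("compute: gave up waiting for rabbitmq")
        return False
    log.append("compute: starting neutron-ovs-agent")
    return True
```

With this shape, controller and compute deployment can start at the same time, and only the step that needs rabbit blocks. The bounded timeout is a response to the weakness mentioned a few lines later in the discussion: a converging system can otherwise sit in a broken state without reporting a failure.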
*** thrash|g0ne is now known as thrash12:10
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Add dib element to generate logical volumes
jistrslagle: ATM i think the new neutron behavior might do more harm than good (parallel deployment of controllers + computes is a good feature imho). Perhaps a better fix would be "let's not require L3 agent to start up after L2 agent, let it be smart and wait for L2 agent to appear before starting to perform actions that require L2 agent running", that might not be easy to achieve though, idk. jlibosva is at lunch now, we can chat more afterwards.12:12
gfidentemichchap_, that'd be a nice to have yes12:13
gfidenteI think we're going to face more of this type of issues with composable services12:13
michchap_gfidente: I had a ...bad? idea of doing it using a custom hiera backend that queries pacemaker12:14
*** chlong has joined #tripleo12:14
gfidenteI am not sure myself really, we probably don't want to rely on pacemaker and even if we were to, it might not have a very granular understanding of things like it happens when you set dependencies in the puppet manifest12:15
michchap_but I heard pacemaker is taking a back seat so I can't really use it as a registry going forward12:15
michchap_so I've done this before using consul12:15
gfidentewe probably also don't want to have anything which is implementation specific, so we can't rely on the status of the puppet resources12:16
michchap_and it was actually a really elegant system/solution. Its main weakness was that the system state converges so you can't tell when it's sitting in a broken state, so failed test runs were often longer than they normally would be12:16
gfidenteso it looks to me we might just make the heat templates more granular and keep the dependencies in there12:16
michchap_right, that makes sense12:17
gfidenteshardy, would like it12:17
michchap_(I'm relatively new to tripleo) would that mean we'd end up with one heat resource per puppet 'profile'12:17
*** dtantsur|brb is now known as dtantsur12:18
gfidentemichchap_, from a two minutes chat, that's not a terrible idea yeah... not a puppet 'profile' but rather a service 'profile' though12:18
gfidenteso it's not implementation specific12:19
michchap_gfidente: yep12:20
*** lucasagomes is now known as lucas-hungry12:21
shadowershardy, derekh: thanks. So yeah, I'm hitting the swift-proxy issue and I may as well spend some time digging into it12:36
dtantsurfolks, while gate is not feeling well, could you review a couple of documentation changes please? and
* shadower looks12:38
openstackgerritMerged openstack/tripleo-heat-templates: Change the CinderISCSIHelper to lioadm
*** admin0 has quit IRC12:44
openstackgerritMerged openstack/tripleo-docs: Update documentation on fetching introspection data
dtantsurshadower, thnx!12:46
openstackgerritMerged openstack/tripleo-heat-templates: Add missing createUser line to /etc/snmp/snmpd.conf
EmilienMgfidente: for the rabbit / mongo patches - I suggest we move forward as they are12:57
EmilienMthey'll require some cleanup but not this week I think12:58
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: initialization command/snippet
openstackgerritMerged openstack/tripleo-docs: Clarify profile matching documentation
*** lazy_prince has quit IRC13:05
openstackgerritBrad P. Crochet proposed openstack/tripleo-common: Example yaml for building images
slaglejistr: gfidente : sounds like they are going to revert the neutron packaging change on liberty13:08
pradkcan i request some reviews on
jaosoriorgfidente: It hadn't. Still was running into the same error. But now I re-checked it.13:10
adarazsare the tripleo gates still busted or does it make sense to recheck?13:18
slagleadarazs: liberty is still down13:18
gfidenteadarazs, do you know if puppet-memcached and memcached were updated for centos?13:23
adarazsgfidente: nope, I don't.13:23
gfidenteok yesterday derekh pinged apevac and number80 about it13:23
gfidentewithout those the ipv6 deployment is still going to fail13:24
openstackgerritMiles Gould proposed openstack/python-tripleoclient: [WIP] Use Ironic API v1.11 to support ENROLL state
*** lucas-hungry is now known as lucasagomes13:28
gfidenteslagle, shardy, jistr hey the depend fixed the issue in ci fwiw13:32
gfidenteI was looking at zuul, results are coming13:32
gfidentehow come we're not seeing this in the master branch though and only in liberty?13:32
gfidentederekh, ^^ ?13:33
shardygfidente: master is pinned via the current-tripleo link?13:33
jistrgfidente: is it because of the delorean pin?13:33
shardyhehe ;)13:33
gfidenteso I am not sure how the pin works then13:33
shardygfidente: master is periodically promoted (until recently this was manual, but I know the plan was for the periodic job to update it)13:34
gfidentecan it be pointed to a specific version of any arbitrary package?13:34
shardygfidente: but for stable/liberty we just use stable/liberty trunk13:34
slaglegfidente: which depends on is this?13:34
gfidenteslagle, I meant the heat depend13:34
trowngfidente: memcached should be updated in delorean deps repo13:35
openstackgerritJaume Devesa proposed openstack/tripleo-docs: Extending the image build information
slaglegfidente: oh ok13:35
slaglegfidente: they are supposedly going to revert the packaging change13:35
slaglegfidente: which i think is better than us reacting13:35
*** links has quit IRC13:35
gfidenteI think so as well13:36
gfidentebut we're completely blocked13:36
gfidenteuntil the revert happens in the packages13:36
gfidenteor can we pin some stable/liberty thing too?13:36
shadowergfidente: do you have the instructions for line 20 written somewhere? The gist in is 404 on me13:36
jaosoriorpuppet lint gate seems to be working :D13:38
jaosoriorany +As for this CR?
openstackgerritDmitry Tantsur proposed openstack/tripleo-docs: Extend the root device selection documentation
dtantsurmgould, could you proof-read this please ^^?13:38
slaglegfidente: the revert is landed13:40
slaglejust waiting on the build13:40
derekhgfidente: we can pin the stable branch with DELOREAN_STABLE_REPO_URL but we never have, if the packaging revert will be fairly quick then we may as well wait, it will take us 2 hours anyways to get it in13:42
shadowergfidente: thanks! And the comments should be run on the undercloud, right?13:44
gfidenteundercloud yes13:45
openstackgerritRyan Hallisey proposed openstack/tripleo-common: Use Fedora 23 atomic in container gate
openstackgerritRyan Hallisey proposed openstack/tripleo-common: Properly setup DNS for the container CI job
trownderekh: gfidente, there is a current-passed-ci link already on the stable/liberty branch13:46
slaglegfidente: the build is done13:47
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Fixup systemctl_swift stop/start  during the controller upgrade
trownit is a bit old because we have had failures to build from source there for a few days13:47
slaglegfidente: i just rechecked
gfidentethanks guys13:47
slaglethe first and last in the series :)13:48
rhalliseyderekh, rechecking.. The image failed to dl.  Should be fixed this time13:48
jaosoriorslagle: You mean for stable/liberty?13:48
dtantsurfolks, do you know if the puppet-lint problem got fixed?13:48
slaglejaosorior: yes13:48
derekhtrown: ya, we should switch to that13:48
jaosoriorslagle: After that is it possible to still merge bug fixes to stable/liberty?13:48
slagledtantsur: i saw it pass this morning13:48
mariosjistr: i'll rebase onto your review in a bit13:48
mariosjistr: though actually, it doesn't touch any common files so should be ok13:49
jaosoriorslagle: alright, makes sense13:49
dtantsurslagle, first problems I saw on my patches were 10am UTC13:49
openstackgerritGiulio Fidente proposed openstack/tripleo-common: Create a test flavor for the pingtest VM
mariosjistr: check on review  (this is only on controllers so ok here)13:55
dtantsuralso folks, I know you'll hate me... but please review
dtantsurI know that it's huge :( but I found no ways to fix it step-by-step: too many wrong things all over the place13:57
jistrmarios: ah right, ok. Storage nodes can be fixed in a separate patch then.13:57
mariosjistr: yeah... also confirmed that when you set ControllerEnableSwiftStorage: false we only have swift-proxy on the controllers13:58
jistrmarios: so it works, cool. I bet i saw a BZ that said otherwise. worksforme :)13:59
jistrmarios: +2'd the swift controller patch. Since the swift fix is just removing the swift-proxy service from a list, we could put it directly to the existing fixup patch. Up to you.
jistr*the swift node fix14:01
mariosjistr: yeah sure if you wana add it there is fine14:01
jistrmarios: ack will do14:02
mariosjistr: i'm looking at your other one now
jistrcool, thx14:02
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Generate fedora-atomic images using dib
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: object storage node upgrade fix
*** rlandy has joined #tripleo14:09
*** saneax is now known as saneax_AFK14:13
shardyCan anyone see what I've done wrong in ?14:15
shardyit's failing to find overcloud.yaml after the move and associated changes, but I can't currently spot why14:15
gfidenteshardy, I saw that message when one of the environment files wasn't available14:17
*** dustins has joined #tripleo14:18
shardygfidente: Ah, that could be it - misleading error if so!14:18
gfidenteshardy, in do you need to copy the tenantvm template too?14:19
gfidenteI was just editing it in
gfidenteso can you incorporate those changes? :)14:19
shardygfidente: I have moved that template already, and it appears to be failing on overcloud create, not the pingtest14:20
jprovazndsneddon: ping14:21
shardywe'll have to repropose those changes to tripleo-ci, assuming we manage to land this soon and I don't end up rebasing14:21
gfidenteshardy, yeah I was asking if you wanted to include those changes14:21
shardyIMO we shouldn't include any changes in the move, just copy from a specific tripleo-common revision14:21
gfidenteso we have them there when it lands :)14:21
gfidenteokay ... booooring14:21
shardygfidente: IMHO we shouldn't do that, sorry14:21
shardyIf they're urgent, land them to tripleo-common and I'll rebase14:22
*** NobodyCa1 is now known as NobodyCam14:23
EmilienMgfidente: is it ok to move forward with ? I'll work on the cleanup later, we might need Puppet functions for this need14:24
derekhtrown: did you say that memcached is updated in delorean deps ? I don't see it14:24
EmilienMdprince: hey, can you please revisit your -1 on ? for a first iteration I think it's ok to have it like this, and we can add a hiera level for rabbit later14:25
gfidentetrown, we need python-memcached too14:27
gfidenteto make it pass14:27
dprinceEmilienM: sure, I can let that pass for now14:28
EmilienMdprince: thanks14:29
openstackgerritMerged openstack/tripleo-heat-templates: Add Rabbit IPv6 only support
adarazsgfidente, derekh: can you help me figure out where's the problem coming from in the ipv6 gate?:
*** mannidi_ has quit IRC14:43
derekhadarazs: could that commit that just merged be relevant ? ^14:43
adarazsderekh: looks like it, thanks!14:44
adarazsdoes "Depends-On:" check out a change (with all the dependent changes) or does it cherry pick?14:47
adarazsso should I include all the changes in the commit message?14:48
adarazsgfidente: ^14:48
gfidentedepends-on is only used by CI14:48
gfidentecheckout will pick all the changes14:48
gfidentecherry-pick will apply the change on your existing tree14:49
adarazsgfidente: yeah, I know, I want to use it for CI on my change. :)14:49
adarazs -- you told me yesterday that I only need that single depends-on14:49
*** jprovazn has quit IRC14:49
adarazsto get all the necessary IPv6 changes14:49
adarazsis that really true?14:49
gfidenteah yes14:50
adarazsbecause should have been included.14:50
gfidenteI see the question now14:50
*** oshvartz has quit IRC14:50
adarazsat least I think it should have14:50
gfidenteI thought the problem was how to checkout the tree of deps14:51
gfidenteso yes depends_on is doing checkout afaik14:51
adarazsso that should have pulled that rabbit change... anyway, I did a recheck, we'll see if it fails the same way, maybe that rabbit change actually doesn't fix this error.14:53
derekhAnybody else want to describe their dev setup here? so newcomers get an idea of what HW setup they may be able to use for tripleo,
*** bnemec has joined #tripleo14:53
*** admin0 has quit IRC14:54
gfidenteadarazs, I'm checking the rabbit config to see if the changes which that submission does were applied14:57
gfidenteadarazs, yeah not in there14:59
adarazsgfidente: okay, so now it should work as it was merged.14:59
gfidentewell it should break on mongo15:00
adarazsgfidente: is it still true that this change should pull all the remaining necessary changes?
gfidentederekh, ^^ do you know if that is the case and if we can control this behaviour?15:00
gfidenteI thought when depends-on was pointing to a change we would git checkout the change (the entire tree the change needs to be applied cleanly)15:01
gfidenteis it doing git cherry-pick instead?15:01
derekhgfidente: it should pull in the entire tree, whats missing ?15:03
gfidenteit apparently didn't for adarazs' change15:04
adarazsgfidente: not sure if that rabbit change was part of the dependency for the ceph patch though. /o\15:04
rhallisey derekh looks like ci is failing everywhere on setup15:04
rhalliseynvm that's just for all my patches15:05
derekhrhallisey: 2016-03-09 14:47:55.933 | /opt/stack/new/tripleo-common/scripts/ line 164: REPO_PREFIX: unbound variable15:07
rhalliseyya rebasing15:07
derekhrhallisey: does one of your patches change that^15:08
*** saneax_AFK is now known as saneax15:08
derekhgfidente: adarazs which patch didn't get pulled in that was supposed to?15:08
openstackgerritRyan Hallisey proposed openstack/tripleo-common: Use Fedora 23 atomic in container gate
openstackgerritRyan Hallisey proposed openstack/tripleo-common: Properly setup DNS for the container CI job
openstackgerritRyan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again
*** pradk has joined #tripleo15:09
adarazsderekh: probably
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Add Rabbit IPv6 only support
derekhadarazs: once that patch merges ZUUL will ignore depends on for it15:13
derekhadarazs: but there is a window of time before the package appears in the trunk repository15:13
derekhif you ran your job during that window then you won't have gotten the package with the change15:14
adarazsderekh: the problem was that I assumed that it was pulled during the last gate run, but we had that rabbit error which seemed to suggest we were missing that rabbit ipv6 change.15:14
openstackgerritBen Nemec proposed openstack/python-tripleoclient: Remove keystone init deprecation message
openstackgerritBen Nemec proposed openstack/python-tripleoclient: Revert "Remove keystone init deprecation message"
adarazsderekh: okay, let's see where this recheck takes us :)15:14
derekhthe window was about 33minutes long15:14
adarazshuh, okay...15:14
gfidentebnemec, so we're not going to go endpoints and stuff from puppet for liberty? :(15:15
pradkjistr, slagle, are we ok with getting this merged into master or are we waiting on anything else?
bnemecgfidente: It doesn't look promising.  It's not even ready for Mitaka yet:
slaglepradk: i think it just needs to be re-reviewed at this point15:16
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Remove forced rabbitmq::file_limit conversion to string
shardyslagle: Hey, quick question about the overcloud-full element - is there any reason we couldn't just setup the element-deps for that so it pulls in all the overcloud pieces?15:20
shardyI've just needed to rebuild an overcloud-full image direct via dib, and it's pretty unwieldy15:21
slagleshardy: overcloud-full? is that something we still use?15:23
slaglei guess so15:24
slagleyea i see15:25
shardyslagle: Yeah, inside tripleoclient we end up creating a pretty long list of elements15:25
slagleit doesn't really do anything15:25
slaglei wonder if it's even needed15:25
slaglebut yes, if we wanted it to encapsulate all the elements we actually need, i suppose we could15:25
shardyMaybe not, will check it out - I was thinking it'd be useful to have it be a meta-element which references the other overcloud-* stuff15:26
slagleindeed, that would be useful15:26
slaglei guess the thinking is with the yaml image building stuff, we'd have a single yaml definition of overcloud-full15:27
shardyYeah that would work too I guess15:28
openstackgerritMerged openstack/puppet-tripleo: Make OpenStack service ports configurable in HAProxy
shardyWe do some other weird stuff, like include the undercloud-package-install element for overcloud images15:28
shardyanyway, thanks for the sanity check, I may look at updating the overcloud-full deps as that would improve my current dib workflow15:29
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: stable/liberty: set default upgrade level to kilo
*** rpothier has joined #tripleo15:33
*** trozet has joined #tripleo15:36
gfidenterecheck time15:38
EmilienMby enabling SSL in Puppet OpenStack CI, our CI jobs take a very long time to run and sometimes timeout. Have you already hit a similar problem in tripleo?15:45
bnemecEmilienM: No, but we don't SSL everything right now, only the public endpoints.15:46
openstackgerritSergey Gotliv proposed openstack/tripleo-heat-templates: Trove Integration
EmilienMbnemec: ok - our CI is deploying without SSL termination, SSL is configured in WSGI services directly15:47
EmilienMI checked cpu_info and we have aes flag15:47
openstackgerritDmitry Tantsur proposed openstack/tripleo-docs: Document benchmarks and add extra data examples
EmilienMI was wondering if the ssl key size could have an impact on handshakes times15:48
EmilienMwe use a 4k size iirc15:48
bnemecEmilienM: dsneddon tells me that it's normal for SSL on internal things to have a big impact on performance.  You might want to talk to him about it.15:48
bnemecHe seems to have some experience with doing that.15:48
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Make OpenStack service ports configurable in HAProxy
openstackgerritSergey Gotliv proposed openstack/python-tripleoclient: Trove integration
trowntripleo-quickstart demo starting shortly:
socialhmm I'm having issues with VMs, I rebooted whole node because the baremetal vms got stuck in shutdown/reboot in nova15:59
socialand ironic/nova still keep reporting old state after reboot even though everything is down15:59
jaosoriortrown: watchin :D16:00
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Ensure access to Redis is password protected
*** aufi has joined #tripleo16:02
mgoulddtantsur, done - sorry I only just noticed your request!16:04
openstackgerritGiulio Fidente proposed openstack/python-tripleoclient: Generate a password for Redis and pass it as deployment parameter
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: WIP: Allow predictable IPs for Controllers on the ctlplane
openstackgerritRyan Hallisey proposed openstack/tripleo-common: Use Fedora 23 atomic in container gate
openstackgerritRyan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: WIP: Allow predictable IPs for Controllers on the ctlplane
openstackgerritRyan Hallisey proposed openstack/tripleo-common: Properly setup DNS for the container CI job
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: Allow the containerized compute node to spawn larger VMs
openstackgerritRyan Hallisey proposed openstack/tripleo-heat-templates: Remove unused Neutron Agents container
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: WIP: Allow predictable IPs for Controllers on the ctlplane
openstackgerritMerged openstack/tripleo-heat-templates: Permits configuration of Cinder enabled_backend via hieradata
*** mbound has quit IRC16:32
gfidenteslagle, it passed!16:34
gfidenteholy green ci16:34
slaglegfidente: yea, i just merged the first 7 patches16:34
gfidenteso we can't blame ci anymore16:35
EmilienMcongrats folks16:35
* rhallisey lives in the red XD16:37
openstackgerritMerged openstack/tripleo-heat-templates: Add IPv6 Support to Isolated Networks
openstackgerritMerged openstack/tripleo-heat-templates: Add IPv6 versions of the Controller NIC configs
openstackgerritMerged openstack/tripleo-heat-templates: Make the Neutron subnet ipv6_{ra,address}_mode configurable
openstackgerritMerged openstack/tripleo-heat-templates: Allow to enable IPv6 on Corosync
openstackgerritMerged openstack/tripleo-heat-templates: Fix rabbit_hosts list for glance-api for IPv6
openstackgerritMerged openstack/tripleo-heat-templates: Set /64 cidr_netmask for pcmk VIPs when IPv6
openstackgerritMerged openstack/tripleo-heat-templates: Fixup the memcached servers string in nova.conf for v6
openstackgerritDmitry Tantsur proposed openstack/tripleo-docs: Extend the root device selection documentation
slagleunfortunately, tripleo ci is again awash in red16:40
*** trown has quit IRC16:41
dprinceslagle: is it just the capacity, memory/CPU issues. Or are you seeing a functional failure?16:41
slagledprince: i'm starting to look through them16:42
dprinceslagle: I'm wondering if we should expedite the resize we talked about on the list...16:42
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: compute: include VIR_MIGRATE_TUNNELLED when doing VM shared storage
bandinijistr: do you have some notes on how the update process (via the new pacemaker_upgrade_1/ scripts) is supposed to work?16:42
slaglethis..., failed with 2016-03-09 14:24:41.811 | Calling <function virsh_start at 0x7ff63efdbc08> with: ['start', 'seed_3']16:42
slagle2016-03-09 14:24:42.074 | error: Failed to start domain seed_316:42
slagle2016-03-09 14:24:42.074 | error: Unable to add port vnet31 to OVS bridge 3brbm_one3: Operation not permitted16:42
slaglewhat the heck is that16:42
dprinceslagle: sounds like the testenv it ran on is hosed, or perhaps wasn't cleaned up properly16:43
derekhslagle: I've seen that before, but not usually until the testenv has a long uptime16:43
derekhare we seeing many of them? maybe it now happens more often as we have a lot more nics plugged into each instance16:44
slagleand on the other job on that patch, introspection timed out16:44
slaglei see no discernible pattern in any of this :)16:44
*** admin0 has quit IRC16:45
*** mikelk has quit IRC16:45
dprinceslagle: but the queue has been packed today16:45
slaglethe next one i'm looking at is "No valid host" during oc node deployment16:46
*** trown has joined #tripleo16:46
dtantsurslagle, I've also seen this "Failed to start domain seed_3" on one of my patches16:46
openstackgerritEmilien Macchi proposed openstack/instack-undercloud: Use pymysql database driver for OpenStack DBs
slagledprince: yea, anecdotally, i just want to say these are performance related16:46
slaglebut i have no facts to that effect16:46
jistrbandini: yes, we're hashing it out, i'll send you the link16:46
dprinceslagle: yeah, I follow. That is why I'm wondering if we re-allocate the testenv's if it would help us16:47
dprincederekh: thoughts?16:47
dprincederekh: when could we take a window of time and re-deploy them?16:48
slagleit's a tough call, if we re-allocate something could go wrong16:48
slagleand we could be completely down for a day or so16:48
*** dshulyak has joined #tripleo16:49
*** yamahata has quit IRC16:50
*** xinwu has quit IRC16:51
*** bvandenh has joined #tripleo16:51
derekhdprince: slagle we can reallocate them in batches, maybe to a small few and see if things improve16:53
*** trown has joined #tripleo16:53
derekhdprince: slagle any ideas if all the extra nics could be putting a higher load on the host than there used to be16:53
* derekh saw a load of 140 this morning16:54
dprincederekh: they certainly could be16:54
slagletestenv12-testenv0-or57ccjfv7w2 env num 3 must be bad16:54
dprincederekh: I was definitely suspicious of it last week16:54
slagleit's failed 2 jobs with that can't add ovs port error16:54
dprincederekh: with low load things seemed to pass just fine though16:54
openstackgerritDan Sneddon proposed openstack/os-net-config: Fix hierarchy for Linux Bonds and Linux Bridges
dprincederekh: repeatedly, over the weekend16:55
derekhslagle: once that happens, you gotta rebuild the host, that env is hosed16:55
*** jaosorior has quit IRC16:55
derekhslagle: that env will continue to fail with the same error until the host is rebooted, we've always had it as a problem, I've just rebuilt them when it happened16:56
derekhslagle: but it usually happen after a few months of uptime16:56
derekhalso we have CPUs overheating, I'm thinking maybe we want to turn off turbo to prevent CPUs getting throttled under heavy load16:57
*** xinwu has joined #tripleo16:57
derekh# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo16:57
derekh bug 924570 in kernel "regression, package temp above normal induced mce" [Unspecified,New] - Assigned to kernel-maint16:57
*** tosky has quit IRC16:57
slagledid we have that before the redeploy for net iso?16:58
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers
*** rwsu has quit IRC16:58
derekhslagle: which one, the CPU heating? yes we noticed it while in brno, but I feel it's worse now (but that's just a gut feeling)16:59
slaglederekh: the no_turbo setting16:59
slaglejust wondering if we were throttling before16:59
derekhslagle: ahh, the setting value should have been the same all along, we never changed it17:00
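For anyone wanting to confirm what the hosts are currently set to, here is a minimal sketch of reading the flag derekh quotes. It assumes the intel_pstate driver is active and exposes the file at the path given in the channel; the helper names parse_no_turbo and turbo_disabled are illustrative and not part of any tripleo tooling.

```python
from pathlib import Path

# Path quoted in the channel; it only exists when the intel_pstate driver is active.
NO_TURBO = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")


def parse_no_turbo(raw: str) -> bool:
    """Return True when the file content says turbo is disabled ("1")."""
    value = raw.strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected no_turbo value: {value!r}")
    return value == "1"


def turbo_disabled(path: Path = NO_TURBO):
    """Return True or False, or None if the file is absent on this host."""
    try:
        return parse_no_turbo(path.read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print("turbo disabled:", turbo_disabled())
```

Reading the file needs no special privileges on a typical host; writing to it, as in the echo command above, needs root.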
*** bvandenh has quit IRC17:01
*** xinwu has quit IRC17:04
*** jistr has quit IRC17:06
openstackgerritMerged openstack/instack-undercloud: Nova should not sync power state of overcloud nodes
*** yamahata has joined #tripleo17:10
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers
* bnemec just rechecked a patch he had already rechecked17:19
EmilienMbnemec: I figured out the SSL thing, it was nothing to do with ssl but the tests in tempest that we run (too many...)17:22
bnemecEmilienM: Cool17:22
slaglei think we're going to redeploy some testenv hosts to 3 envs and more ram17:25
openstackgerritPradeep Kilambi proposed openstack/python-tripleoclient: Add gnocchi password as a deployment param
slaglein an attempt to quell the bloodbath17:25
bnemecI picture our CI rack looking like that now.  Thanks slagle :-P17:27
*** dtantsur is now known as dtantsur|afk17:27
slaglebnemec: how did you pull that still frame off my web cam?17:28
bnemecslagle: Hax!17:28
*** adarazs has quit IRC17:29
*** dshulyak has joined #tripleo17:29
*** mbound has joined #tripleo17:33
*** admin0 has joined #tripleo17:36
*** mbound has quit IRC17:38
*** panda has quit IRC17:40
*** panda has joined #tripleo17:40
derekhHi all, expect jobs to start mysteriously failing (differently to usual), I'm about to start redeploying some TE hosts as per the email last night17:43
*** mbound has joined #tripleo17:49
openstackgerritMerged openstack/os-cloud-config: Fix a typo in usage.rst
*** mbound has quit IRC17:50
openstackgerritMerged openstack/os-cloud-config: Put py34 first in the env order of tox
openstackgerritBen Nemec proposed openstack/instack-undercloud: Remove trailing / on keystone admin endpoint
*** dustins has quit IRC17:55
*** bnemec changes topic to "TripleO | testenvs are being redeployed, expect random CI failures for a while | CI status: | Docs:"17:59
bnemecderekh: Updated the channel topic.  Should we ask people to hold off on rechecks until you're done?17:59
greghaynesbnemec: ianw Hello there - I want to chat about if youall have a min18:00
stevebakerbnemec: hey, if I were to try network isolation in OVB which net environment should I use?18:00
derekhbnemec: wouldn't do any harm to ask,18:00
stevebakerbnemec: popular!18:01
*** bnemec changes topic to "TripleO | testenvs are being redeployed, expect random CI failures for a while. Please wait to recheck until the redeploy is complete | CI status: | Docs:"18:01
bnemecderekh: Done18:01
bnemecstevebaker: Indeed! :-)18:01
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: Kill CI job if it doesn't get a testenv quickly
greghayneshaha, we can chat when things arent explodey18:01
bnemecstevebaker: You'll need to have multiple nics available.  Neutron doesn't allow tenant vms access to vlans.18:02
derekhslagle: dprince bnemec now that we'll have more jenkins nodes than testenvs, something like that ^^ would be a good idea,18:02
bnemecstevebaker: I use the "simple" templates from here:
stevebakerbnemec: yep, I'll be adding multiple nics and assuming I can get past introspection18:02
bnemecI added the public nic back to the baremetal nodes in my local templates.18:02
openstackgerritPradeep Kilambi proposed openstack/tripleo-heat-templates: Deploy Gnocchi as a Ceilometer metrics storage backend
derekhslagle: dprince bnemec gotta run in a minute, here is what I've done so far,
*** shardy has quit IRC18:03
derekhdprince: was gonna take over, I'll be back later to see if there is anything I can help with18:03
stevebakerbnemec: thanks, I'll take a look18:03
bnemec# Handover to other people and run away18:03
* bnemec likes that plan18:03
slaglederekh: thanks :)18:04
bnemecstevebaker: If you want to add a bunch of nics, should work too.18:04
bnemecI need to test that and make sure OVB can actually handle it though.18:05
bnemecI'm a little concerned based on my experience with adding lots of nics to the bmc that we may run into OpenStack bugs doing that.18:05
*** shivrao has joined #tripleo18:07
stevebakerbnemec: I may start with that, since its upstream and I can in theory create as many nics as I need18:07
*** Marga_ has joined #tripleo18:08
*** Marga_ has quit IRC18:09
*** xinwu has joined #tripleo18:10
*** Marga_ has quit IRC18:17
greghaynesianw: bnemec ok, replied on - I think theres multiple groups hoping for that to move forward right now so it would be awesome if I could get some more feedback18:18
*** lucasagomes is now known as lucas-dinner18:18
*** mkovacik has quit IRC18:19
*** Marga_ has joined #tripleo18:22
*** shivrao has quit IRC18:39
*** Marga_ has joined #tripleo18:54
openstackgerritMerged openstack/tripleo-heat-templates: Cisco nexus config template - obsolete parameter (replay count).
bnemecGood news, everyone!18:57
bnemecIt looks like fedorapeople is down, so CI wouldn't be passing anyway (apparently we pull packages from there).18:57
larsksHow do I pass custom config settings to the *undercloud* install?  The 'openstack undercloud install' command doesn't seem to take any parameters...18:58
*** shivrao has joined #tripleo18:59
bnemecOr maybe those two things are unrelated.  My undercloud install is failing to download a package that doesn't even exist on my older undercloud. :-/18:59
bnemeclarsks: It doesn't.  The undercloud intentionally exposes relatively few configuration options to keep it simple.19:00
bnemecWhat do you need to customize?19:00
larsksbnemec:  I want to pass in [ssh] libvirt_uri, because I am targeting libvirtd running unprivileged rather than running as root.19:01
larsksI can obviously just edit ironic.conf and restart the service, but I was hoping for a more graceful option...19:01
trownhmm, this seems worth wiring in to undercloud.conf maybe19:02
larskstrown: I know, right? :)19:02
bnemecI'm not so sure.  I don't want to wire in an option that is only of interest to dev/test people who _aren't_ setting up their environment the way we suggest.19:03
bnemecEverything in undercloud.conf is exposed to the user too.19:03
trownI think hiera could work too, will play with that19:05
trownbut first food19:05
*** trown is now known as trown|lunch19:05
dmsimardbnemec: fedorapeople is probably the temporary monitoring stuff ?19:06
bnemecYeah, I actually am starting to think maybe we need to allow for arbitrary puppet somehow.19:06
*** liverpooler has joined #tripleo19:06
bnemecdmsimard: It's a tempest dep.19:06
bnemecIt's not even installed on my overcloud from maybe a week ago.19:07
jdobgfidente, ping19:07
bnemecMaybe the package just hasn't arrived in the right repos yet.19:07
jdobactually, bnemec ping19:09
jdobwhen you're done with that conversation19:09
bnemecjdob: What's up?19:10
jdobbnemec, these two patches in the client for the passwords (gnocchi and redis), if they land before the THT stuff, it's gonna break everything right? since they are passing in params that don't exist19:10
jdobor am I wrong there19:10
bnemecjdob: Right, but there's no way the template changes could pass CI without the client ones.19:11
jdoband if i'm right, how do we handle landing these so that we don't get in a weird state19:11
jdobthis is true, i can't even run the gnocchi patch without the client19:11
bnemecThere should be a depends-on in the template changes that points at the corresponding client change.19:11
bnemecGerrit won't allow them to merge out of order then.19:11
jdobright, but if the client merges and that THT patch takes another few days, isn't shit broken until the THT stuff lands too?19:12
bnemecAlthough like I said, unless someone completely ignores CI that shouldn't happen.19:12
jdobor does that password param get ignored if it's specified and unused19:12
jdobbasically, i'm wondering if there is a circular dependency here19:12
jdoband how we resolve landing them19:12
bnemecI thought we had done this before, but I could be wrong.  Let me look quickly.19:13
jdobnot sure what happens in gerrit if we have two depends-on pointing to each other, but IIRC, i heard that's bad mojo19:13
bnemecjdob: Okay, yeah, we're fine if the client lands first.  The client passes everything as parameter_defaults:
jdobahhh, right, that was the way around this for other issues19:14
jdobnow  i remember, thanks bnemec19:14
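The behaviour bnemec describes can be illustrated with a toy model. This is a sketch of the idea only, not Heat's implementation: resolve_parameters and the parameter names ExistingParam and NewPassword are hypothetical. It assumes, as discussed above, that values passed under parameters must be declared by the template, while values under parameter_defaults are skipped when the template does not declare them.

```python
def resolve_parameters(template_params, parameters=None, parameter_defaults=None):
    """Toy model of how a stack picks parameter values.

    template_params: dict of name -> default declared by the template.
    parameters: explicit values; undeclared names are treated as an error.
    parameter_defaults: fallback values; undeclared names are silently skipped.
    """
    parameters = parameters or {}
    parameter_defaults = parameter_defaults or {}

    unknown = sorted(set(parameters) - set(template_params))
    if unknown:
        raise ValueError(f"parameters not defined in template: {unknown}")

    resolved = {}
    for name, default in template_params.items():
        if name in parameters:
            resolved[name] = parameters[name]
        elif name in parameter_defaults:
            resolved[name] = parameter_defaults[name]
        else:
            resolved[name] = default
    return resolved


if __name__ == "__main__":
    # Template that has not yet gained the new password parameter.
    old_template = {"ExistingParam": "x"}
    # Client already sends the new value: harmless as a default...
    print(resolve_parameters(old_template, parameter_defaults={"NewPassword": "s3cret"}))
    # ...but an error when sent as an explicit parameter.
    try:
        resolve_parameters(old_template, parameters={"NewPassword": "s3cret"})
    except ValueError as exc:
        print("rejected:", exc)
```

In this model the client change can land first without breaking deployments, which matches the merge order agreed on in the channel.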
bnemecApparently it's only me who can't get to fedorapeople too.  I wonder if they got pissed because I kicked off 60+ test rpm builds at once to catch up on the ones that failed because I broke my local network.19:16
jdobi can ssh into if that helps19:17
jdobor doesn't, I suppose19:17
bnemecjdob: hasn't passed CI yet.19:18
bnemecNor has
jdoboh shit, my bad, I saw the green but didn't pay close enough attention19:18
jdobremoved the +As19:19
*** mkovacik has joined #tripleo19:19
slagledprince: there are some jobs running now on the new testenv30 hosts derekh_afk deployed19:20
*** saneax_AFK is now known as saneax19:21
dprinceslagle: got a link to the jenkins so we can watch it?19:21
slaglei'm just logged in right now19:23
slagleis there a way to tell from that?19:23
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Set default locale to image in ubuntu-minimal
dprinceslagle: maybe, I thought perhaps you just noticed one of the jobs from zuul, or tripleo.org19:24
dprinceslagle: I think you could tell, but you'd have to be on the undercloud, or jenkins perhaps19:25
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers
*** akrivoka has quit IRC19:26
slagledprince: here's one,
dprinceslagle: boom, thanks19:28
dprinceslagle: if this goes well I'll deploy more then right?19:29
dprinceslagle: vs deploying them all now19:31
slagleyea, sounds good19:35
*** sthillma has joined #tripleo19:36
gfidentejdob, it's ignored since the client passes stuff as parameter_defaults not parameters anymore19:39
gfidenteso we can pass my_mother_went_to_milan: true19:40
jdobgfidente, please tell me there's a test case with that exact data in it19:40
jdobthat would be amazing19:40
gfidentebut this wasn't the case before though19:40
gfidenteclient was using parameters: initially19:40
gfidentegoing for the day guys!19:41
*** gfidente is now known as gfidente|afk19:41
*** gfidente|afk has quit IRC19:43
*** jprovazn has quit IRC19:43
*** yamahata has quit IRC19:44
*** dshulyak has quit IRC19:50
slaglei suppose that could be related to the patch being tested, but that seems unlikely19:51
dprinceslagle: the one you linked before looks to be stuck building images19:51
dprincesudo dd of=/var/log/image_build.txt19:51
*** trozet has quit IRC19:52
dprinceslagle: got a link to the one waiting on Ironic callback?19:52
slagleit's one of the running nonha ones19:53
slaglelet me see if i can find it19:53
slagledprince: the nodes are active now, they got rescheduled19:58
slaglethat was odd...could explain some of these long job times19:58
openstackgerritRyan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again
slaglei wonder if BOOTP packets got dropped somewhere19:59
dprinceslagle: perhaps, related to what though?20:02
*** jaosorior has quit IRC20:02
*** jaosorior has joined #tripleo20:03
slagledprince: not sure. but I only see the requests for boot.ipxe in the httpd access log after they initially timed out and were then deployed successfully20:07
slaglewhich means they never requested boot.ipxe the first time20:07
slagleso i'd guess dhcp failed20:07
dprinceslagle: interesting. Well this is probably the first job to use that testenv since it came up (we think?)20:08
slagleoh let me check the ipxe script20:08
slaglehow many nics does it try?20:08
slaglewe have 10 now20:08
dprinceslagle: yes20:08
dprinceslagle: but the first Nic should take priority I think20:09
dprinceslagle: rather, they are ordered20:09
slaglei dont think that always works20:09
slaglethat's why lucas had to add this20:09
slaglebut, i see it loops over all the nics anyway20:10
slagleassuming that's working20:10
slagleor they're all active20:10
dprinceslagle: yes, it would eventually try all of the active NICs20:10
dprinceslagle: I am doing this:
slaglei see some errors in the inspector log as well20:14
*** jcoufal has joined #tripleo20:15
slaglei wonder if they booted into the inspector ramdisk during deployment20:16
dprinceslagle: Hmm, not sure. The inspector logs indicate it is skipping some interfaces which were not PXE booting. However I would have expected one of them to be PXE booting (eth0 for example)20:17
slaglewell i dont see any DHCPREQUEST's in the inspector-dnsmasq log after the oc deployment started, so that's probably not it20:18
*** snecklifter has quit IRC20:26
bnemecIt may not be working, depending on how old the libvirt in the testenvs is.20:27
*** trown|lunch is now known as trown20:27
bnemecThere's a fallback path for older roms that don't support the inc command.20:27
bnemecNo idea whether that would be a problem here, but it's possible.20:28
*** sthillma has quit IRC20:28
bnemecSpecifically this:
bnemecNote || goto old_rom20:29
openstackgerrityolanda.robla proposed openstack/diskimage-builder: Set default locale to image in ubuntu-minimal
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: IPv6: duak-stack support for Keystone
dprinceslagle: the HA job failed with: 2016-03-09 20:45:00.006 | Error finding address for Unable to establish connection to
dprinceslagle: this was during the ping test to the overcloud20:49
dprinceslagle: seen that before?20:49
*** saneax is now known as saneax_AFK20:50
slaglepossibly, it looks a little familiar20:50
slaglewe could check in logstash20:50
dprinceslagle: I'm gonna say that is possibly new or unrelated to what we are chasing in general20:51
slagleyea i dont think it's related20:51
slaglebnemec: does * work within double quotes on logstash?20:54
slaglesee what happens when you start an etherpad of queries?20:55
slaglepeople ask you questions20:55
bnemecslagle: I'm not sure. It automatically does a substring match on the Message field, so leading or trailing *'s would be unnecessary.20:55
bnemecIn theory anyway. :-)20:55
slagleyea, that i see working20:55
slaglebut if i put one in the middle, it doesn't work20:56
*** prometheanfire has joined #tripleo20:58
dprinceslagle: 3 new testenv's online20:58
dprinceslagle: testenv31-....20:58
prometheanfireso, trying to add a element to test in dib, hopefully only for periodic jobs20:58
prometheanfiredo I just need to add a subdir called test-elements/build-succeeds like in debian/fedora/apt-sources/ironic-agent?20:59
bnemecslagle: Might be useful: A query such as "foo bar"~10000000 is an interesting alternative to foo AND bar21:01
slagleoh i see21:03
bnemecslagle: Although I'm not finding that it quite does what I would expect.  I'm just looking at the reference here:
dprince3 more testenv's active21:19
prometheanfirenot sure if that will make it run for all checks though21:22
*** dshulyak has joined #tripleo21:22
*** dshulyak has quit IRC21:23
openstackgerritPradeep Kilambi proposed openstack/tripleo-heat-templates: Deploy Gnocchi as a Ceilometer metrics storage backend
*** trown is now known as trown|outtypewww21:25
dprince3 more testenv's active21:27
*** derekh_afk is now known as derekh21:29
derekhhow goes it?21:30
dprincederekh: I'm deploying testenv34 now21:30
slagledan is deploying more testenvs21:30
slaglederekh: we had some jobs pass testenv30, so decided to redeploy the rest21:31
dprincederekh: so we've got like 12 more testenv's or so...21:31
dprincederekh: sorry, 12 more test environments21:31
slaglesome also unrelated ways21:31
dprinceyeah, still sort of unsure but decided to go all in to get better results21:32
derekhdprince: slagle ok21:32
slaglei just rechecked
slagleand the 3 jobs got the 3 envs from testenv3121:32
slagleso that will be a good test21:32
dprincederekh: I'm also setting times, and swap up on these21:32
derekhdprince: great, was about to check ;-)21:33
dprincederekh: testenv34 just had some failed machines. Retrying it again...21:35
*** panda has quit IRC21:40
*** r-mibu has joined #tripleo21:47
dprinceokay, all 3 testenv34... workers are running now21:48
dprincetestenv35 building now21:48
* dprince wonders off for a bit21:49
dprincewander even21:49
derekhdprince: gonna remove the unused overcloud ports21:49
dprinceoh crap, I forgot about those21:50
dprincederekh: you can tell the unused ones though right?21:50
derekhdprince: they're named, so anything that's not te_testenv3X,21:51
dprincederekh: right21:51
derekhdprince: at this stage all the old testenvs are gone aren't they21:51
dprincederekh: you doing it or me?21:51
*** ccamacho has quit IRC21:51
derekhdprince: yup21:51
dprincederekh: I've got a ~/clean-ports.sh21:51
dprincederekh: be very careful with that guy though21:51
derekhdprince: I'll stick to my normal way ;-)21:52
dprincederekh: script isn't guarded very well. Wrong regex and it'd go wonky21:52
* derekh deleted all the ports on the overcloud once, back when we were setting all this up, it didn't go very well21:53
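The worry about a wrong regex wiping every port suggests a guard along these lines. This is only a sketch: the contents of ~/clean-ports.sh are not shown in the log, so the function ports_to_delete and the exact pattern are assumptions. It assumes ports belonging to current testenvs have names starting with te_testenv3 followed by a digit, per the naming derekh mentions.

```python
import re

# Assumed naming scheme: ports for the current testenvs are named
# "te_testenv3" followed by a digit. Everything else is a cleanup candidate.
KEEP = re.compile(r"^te_testenv3\d")


def ports_to_delete(port_names, keep=KEEP):
    """Return the names that do not match `keep`.

    Refuses to return a list when the pattern matched nothing at all,
    since that usually means the regex is wrong rather than that every
    port is stale.
    """
    names = list(port_names)
    doomed = [n for n in names if not keep.search(n)]
    if names and len(doomed) == len(names):
        raise RuntimeError("keep pattern matched nothing; refusing to delete everything")
    return doomed


if __name__ == "__main__":
    sample = ["te_testenv30_a", "te_testenv12_b", "stale-port"]
    # Dry run: print the candidates instead of deleting them.
    for name in ports_to_delete(sample):
        print("would delete:", name)
```

Printing the candidates first and deleting only after a human has looked at the list keeps a bad pattern from doing damage.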
*** ccamacho has joined #tripleo21:57
*** jdob has quit IRC22:00
*** jayg is now known as jayg|g0n322:03
*** lblanchard has quit IRC22:08
dprincederekh: 51 environments. That should be close to enough right?22:08
slagledo we have any cache for the puppet modules?22:08
derekhdprince: cool, I guess we just wait now22:08
dprinceslagle: I don't think so22:08
derekhslagle: the modules should already be on the jenkins node, and we should be getting them from there, if we're not something is broken22:09
derekhslagle: for the others we're cloning from github mostly22:09
derekhslagle: this is intended to change things to start cloning from the mirror server I've been talking about
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Reuse the source-repositories cache during the image build
slaglederekh: ok. that's probably pointless then ^22:11
derekhslagle: hold on I thought I fixed that recently, standby22:12
bnemecSo testenvs are all done?22:12
*** rpothier has left #tripleo22:14
*** dshulyak has joined #tripleo22:14
slagle17 minute image build22:14
slagleseems like something is going faster22:15
slaglei was looking in the wrong place22:16
derekhbnemec: yup, all done,22:17
bnemecderekh: Roger, thanks22:17
*** bnemec changes topic to "TripleO | testenvs redeployed. recheck away! | CI status: | Docs:"22:18
derekhok, I'm off, will check back in the morning, better tell my green pixels to get ready22:18
*** derekh has quit IRC22:19
bandiniso I am getting "ERROR: Failed to validate: Failed to validate: resources[0]: "str_replace" parameters must be a mapping" while deploying THT from master. How do I troubleshoot this in general? heat-*.log aren't all too helpful:
openstackgerritSteve Baker proposed openstack/tripleo-heat-templates: Add a large packet ping to all nodes check
openstackgerritDerek Higgins proposed openstack-infra/tripleo-ci: Kill CI job if it doesn't get a testenv quickly
slagledprince: do we need dhcp-all-interfaces running on the testenv hosts?22:40
slagleisn't it going to keep more and more failed services?22:41
slaglei thought there was a bug on that about it eventually degrading performance22:41
slagledprince: yea, this one bug 1293712 in rhel-osp-director "/etc/udev/rules.d/99-dhcp-all-interfaces.rules causes a slow and miserable degradation until things fail" [Urgent,Assigned] - Assigned to dprince22:42
slagleand i see lots of failed services for the vnet interfaces on the testenv host22:43
slaglei saw an instance of ironic failing to start one of the vm's on the host, and it had to redeploy it22:43
slagledprince: i'm getting a lot of this in one of the running jobs now,
slaglei gotta step away. but it might be something to consider...turning off dhcp-all-interfaces on the testenvs22:46
slagleseems like that could be related to the vm's having 10 nics now22:46
openstackgerritMatthew Thode proposed openstack/diskimage-builder: Add testing for the Gentoo element
*** dshulyak has quit IRC22:48
openstackgerritBen Nemec proposed openstack/puppet-tripleo: Allow enabling authentication on haproxy.stats
