18:00:40 <amrith> #startmeeting trove
18:00:41 <openstack> Meeting started Wed Jan 18 18:00:40 2017 UTC and is due to finish in 60 minutes.  The chair is amrith. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:42 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:45 <openstack> The meeting name has been set to 'trove'
18:00:50 <peterstac> o/
18:00:51 <amrith> ping peterstac johnma slicknik dougshelley66 pmalik vgnbkr mvandijk trevormc aliadil spilla songjian trevormc apsarshaik
18:00:57 <trevormc> o/
18:00:58 <amrith> hi peter
18:01:00 <amrith> hi trevor
18:01:07 <peterstac> hi all
18:01:12 <aliadil> o/
18:01:16 <amrith> #agenda https://wiki.openstack.org/wiki/Meetings/TroveMeeting
18:01:23 <songjian> o/
18:01:50 <amrith> johnma said she'd be here, let's give her a couple of minutes
18:02:14 <amrith> peterstac, would you please remind vgnbkr that he has a -1 on 337914 that may be addressed now.
18:02:22 <johnma> o/
18:02:54 <peterstac> sure, but he's not at his desk right now ...
18:02:58 <amrith> hi mariam, let's get started.
18:03:04 <amrith> thx peterstac when you get a chance
18:03:13 <amrith> #topic Stability of the gate
18:03:23 <amrith> so, it appears to be on the fritz again
18:03:29 <amrith> thx peterstac for trying to get it going again
18:03:32 <amrith> how goes that battle?
18:03:44 <peterstac> well, there seem to be multiple issues
18:03:55 <amrith> specifically, are there a small set of tests that we can skip for now
18:03:55 <peterstac> not sure how many are our fault either
18:03:57 <amrith> and get around it.
18:04:15 <peterstac> I've got a patch up to restrict resize-vol and resize-inst to just mysql
18:04:32 <peterstac> but then the postgres tests fail for an unrelated issue ...
18:04:35 <amrith> i saw that, but it appears to have failed for some other reason
18:05:27 <peterstac> it's in the gate now, see if it merges this time
18:05:47 <amrith> what of the postgres failure?
18:05:47 <peterstac> that'll help somewhat, but I still believe we have xenial/neutron issues on some of the clouds
18:06:00 <amrith> you mean the timeout for qemu?
18:06:12 <peterstac> I believe it was one where the instance just didn't go active
18:06:36 <peterstac> that's the only case now where we don't get a guest log printed out so it'll be hard to debug
18:06:56 <peterstac> (doesn't just happen for postgresql either, I've seen it on redis too)
18:07:06 <peterstac> (any maybe mysql as well)
18:07:13 <peterstac> s/any/and/
18:07:33 <amrith> ok
18:07:59 <peterstac> so basically I think it'll be more stable soon, but we're not entirely out of the woods yet
18:08:14 <amrith> how confident are you that we can get (for example) a string of changes like i18n merged soon
18:09:03 <peterstac> I'm guessing we can still only expect 50-60% to run without any issues
18:09:16 <amrith> ok, thx peterstac
18:09:26 <amrith> will go look at those instance not going active cases
18:09:38 <amrith> one thing that could cause that is a kernel panic in the guest
18:09:40 <peterstac> sounds good, thx amrith
18:09:42 <amrith> which I've been watching for
18:09:49 <amrith> it is a side effect of the kvm/qemu change
18:10:28 <johnma> I might be out of the loop here amrith but what is this kvm/qemu change we are talking about
18:10:32 <peterstac> hmmm, hard to know if that's causing it - maybe we need to resurrect your 'pipe guest over to conductor' patch ?
18:10:42 <amrith> johnma, a while ago I made a change
18:10:48 <amrith> let me find a link
18:11:02 <amrith> I85364c6530058e964a8eba7fb515d7deadfd5d72
18:11:19 <amrith> #link https://review.openstack.org/#/c/413166/
18:11:24 <amrith> the gist of it is this ...
18:11:40 <peterstac> #link https://review.openstack.org/#/c/412011/
18:11:41 <amrith> we look during a test run and see whether we can force nova to use KVM
18:11:48 <peterstac> ^--- that's the first one
18:12:16 <amrith> peterstac, yours was the test,  https://review.openstack.org/#/c/413166/ was the finalized one which had your suggested changes (like the name of the method)
18:12:20 <amrith> so anyway
18:12:26 <amrith> at run time we see whether we can force KVM
18:12:37 <amrith> now the infra folks warned me that if we do this we could have some number of panics
18:12:55 <amrith> the symptoms would be just as peterstac felt, the guest never makes it far enough
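The run-time check amrith describes (see whether the host can use KVM, otherwise stay on plain qemu) could be sketched roughly as below. This is not the code from review 413166. The helper names `cpu_has_virt_flags`, `can_use_kvm`, and `choose_virt_type` are hypothetical, and the sketch assumes a Linux host where `/dev/kvm` and the `vmx`/`svm` CPU flags indicate hardware virtualization.

```python
import os
from typing import Iterable


def cpu_has_virt_flags(cpuinfo_lines: Iterable[str]) -> bool:
    """Return True if any 'flags' line advertises vmx (Intel) or svm (AMD)."""
    for line in cpuinfo_lines:
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = value.split()
            if "vmx" in flags or "svm" in flags:
                return True
    return False


def can_use_kvm(cpuinfo_path: str = "/proc/cpuinfo",
                kvm_device: str = "/dev/kvm") -> bool:
    """Hypothetical helper: require both the KVM device node and the CPU flags."""
    if not os.path.exists(kvm_device):
        return False
    try:
        with open(cpuinfo_path) as handle:
            return cpu_has_virt_flags(handle)
    except OSError:
        # cpuinfo is missing or unreadable, so fall back to the safe answer.
        return False


def choose_virt_type() -> str:
    """Pick the hypervisor type to configure: "kvm" if usable, else "qemu"."""
    return "kvm" if can_use_kvm() else "qemu"
```

A check like this only says that KVM is available; it cannot predict the guest kernel panics the infra team warned about, so those still have to be diagnosed from the test results.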
18:13:05 <johnma> oh ok
18:13:11 <amrith> but in these cases, typically we have the whole test go away
18:13:30 <peterstac> that seems to happen around 5-10% of the time
18:13:39 <amrith> that ~ instance not launching
18:13:53 <peterstac> right, and then the wheels fall off
18:13:55 <amrith> looking at kernel panics, I see them in about 1 in 1000 VMs
18:14:12 <amrith> so there's some orders of magnitude difference between the two
18:14:15 <amrith> which needs investigating
18:14:37 <amrith> in any event, I'll look into that
18:14:42 <amrith> and get back to you at next meeting
18:14:45 <peterstac> right - it could also be a networking issue, or a path mismatch
18:14:58 <amrith> yes, what was that .1 issue you mentioned?
18:15:05 <peterstac> might be quick to determine if we could get into the guest
18:15:21 <peterstac> ah, we calculate CONTROLLER_IP as the actual ip of the box
18:15:47 <peterstac> we use 'hostname -I' however that lists gateways and other stuff too
18:15:58 <peterstac> so all the .1 addresses were filtered out
18:16:17 <peterstac> some clouds however will give out .1 addresses as a valid ip
18:16:39 <peterstac> in that (rare) case the tests would fail since the controller ip wouldn't be set properly
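The failure mode peterstac describes can be illustrated with a small sketch. It is not the actual test-setup script or the merged fix: the function names and the sample addresses are hypothetical, and the second function shows only one possible alternative (excluding addresses that are explicitly known to be gateways or bridges).

```python
from typing import Iterable, List, Optional


def addresses_from_hostname_output(output: str) -> List[str]:
    """Split the space-separated address list that `hostname -I` prints."""
    return output.split()


def pick_ip_dropping_dot_one(output: str) -> Optional[str]:
    """Failure mode from the discussion: every address ending in .1 is discarded."""
    kept = [ip for ip in addresses_from_hostname_output(output)
            if not ip.endswith(".1")]
    return kept[0] if kept else None


def pick_ip_excluding(output: str, excluded: Iterable[str]) -> Optional[str]:
    """Possible alternative: discard only addresses explicitly listed as excluded."""
    excluded_set = set(excluded)
    kept = [ip for ip in addresses_from_hostname_output(output)
            if ip not in excluded_set]
    return kept[0] if kept else None


# Illustrative output on a cloud that hands out a .1 address to the host itself.
sample = "10.1.2.1 192.168.122.1 "
old_choice = pick_ip_dropping_dot_one(sample)
new_choice = pick_ip_excluding(sample, ["192.168.122.1"])
```

With the blanket `.1` filter the host's own address is thrown away and no controller IP is left, which matches the rare failures described above.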
18:16:53 <clarkb> worth pointing out that ironic and lbaas have been having similar issues with nested virt. It might be worthwhile to talk to them about what they are seeing and possibly debug together
18:16:59 <peterstac> I didn't see it that often, but enough that I wanted to fix it
18:17:01 <amrith> hmm, ok
18:17:12 <amrith> clarkb, I have replied to that email thread
18:17:28 <peterstac> there's also an issue with Redis backups - I've put a bug in for that
18:17:36 <amrith> sounds good peterstac is there a fix for the .1 issue?
18:18:04 <peterstac> yes, it's merged (yesterday or early today) so hopefully that's resolved
18:18:08 <clarkb> yup I saw, but there is a lot more detail in here so far :) in any case better communication around shared issues can only help
18:18:18 <peterstac> #link https://bugs.launchpad.net/trove/+bug/1656432
18:18:18 <openstack> Launchpad bug 1656432 in OpenStack DBaaS (Trove) "Redis backup can fail if auto one already running" [Undecided,New]
18:18:23 <peterstac> ^^^ redis bug
18:18:45 <amrith> ok, thx
18:18:54 <peterstac> clarkb, I'll take a look at the thread and see if anything new is revealed
18:19:53 <amrith> ok, so if we're done with that topic of gate stability, let's move along ...
18:20:12 <amrith> #topic Code Reviews (peterstac)
18:20:18 <amrith> peterstac, you are up
18:20:41 <peterstac> ok, it's just a couple of reviews that have been there for a little while that I'd like to go in soon
18:20:56 <peterstac> (mostly because there are other ones that are waiting on them)
18:21:11 <peterstac> module-instances
18:21:21 <peterstac> #link https://review.openstack.org/#/c/403287/
18:21:35 <peterstac> cluster-restart
18:21:37 <peterstac> #link https://review.openstack.org/#/c/417454/
18:22:01 <peterstac> And redis from compile (less critical, but I think it's ready)
18:22:03 <peterstac> #link https://review.openstack.org/#/c/416361/
18:22:50 <peterstac> oh amrith I talked to your concern on the cluster-restart one
18:22:53 <amrith> ok, this afternoon. getting them to merge, left as an exercise to peterstac (the unbreaker of the gate)
18:22:59 <amrith> let me look
18:23:40 <amrith> thx peterstac, that was my hope, thanks for confirming
18:23:58 <johnma> peterstac: I reviewed all of them this morning. I haven't gotten to testing them though. I tried but I think something about my env is messed up. Tried testing mongo and redis changes and things keep failing. So I am rebuilding my env. Hopefully once that's fixed I can test these real quick
18:24:37 <amrith> thx johnma
18:25:21 <peterstac> thx johnma
18:25:49 <amrith> peterstac, the only concern I have about the redis (and also postgresql) compile from source each time is that it is an awful waste of time
18:25:59 <amrith> but I can't think of a good way to automate it either
18:26:08 <peterstac> sure, but with Redis we may not have a choice
18:26:24 <peterstac> we were using a ppa and it looks like the maintainer is letting it go stale
18:26:25 <amrith> so what I was thinking was to make a repo of just the distros we use for testing
18:26:35 <amrith> and call it something like (trove-db-packages)
18:26:41 <amrith> and import packages from there
18:26:51 <amrith> or have them push artifacts into some place we can get them from
18:26:59 <amrith> and a build job to rebuild the packages, maybe?
18:28:02 <amrith> anyway, we can take that up at a different time
18:28:12 <amrith> anything else on the subject of reviews ...
18:28:28 <trevormc> I uploaded more troveclient changes for osc :)
18:28:32 <trevormc> #link https://review.openstack.org/#/q/status:open+project:openstack/python-troveclient+branch:master+topic:bp/trove-support-in-python-openstackclient
18:28:52 <johnma> do we have any reviews left for troveclient before the deadline this week
18:29:25 <amrith> johnma, good question
18:29:48 <amrith> #link https://review.openstack.org/#/q/project:openstack/python-troveclient+status:open
18:30:45 <amrith> so there are a couple
18:31:09 <trevormc> so if the patches aren't merged by the deadline then they will be going into pike?
18:31:09 <amrith> peterstac, can you help get https://review.openstack.org/#/c/402802/6 to go?
18:31:19 <amrith> trevormc, no
18:31:26 <amrith> let's talk about the deadline later
18:31:31 <amrith> this week is a soft freeze for the client
18:31:32 <trevormc> ok
18:31:37 <amrith> so we can prioritize those for review
18:31:51 <peterstac> amrith, I can look at that
18:32:00 <amrith> client freeze is next week
18:32:15 <amrith> peterstac, if you can get that one to go, there's its client change which looks good
18:32:27 <peterstac> right, I'll test both at the same time
18:32:44 <amrith> I'll +2 it but since i co-authored it, I won't approve
18:33:38 <amrith> so the priority for the next week of reviews is https://review.openstack.org/#/q/project:openstack/python-troveclient+status:open
18:33:46 <amrith> to get the client stuff done by next week
18:33:51 <amrith> when our client will freeze
18:34:49 <amrith> so let's move along
18:34:51 <amrith> #topic Reviews for abandonment
18:34:57 <amrith> I abandoned four reviews this morning
18:35:10 <amrith> they've been sitting around and idle for a while with negative review comments
18:35:24 <amrith> if someone feels strongly about them, please restore and get them ready to go.
18:35:42 <amrith> that's all I wanted to mention
18:35:49 <amrith> anyone want to add something ...
18:36:45 <amrith> #topic Ocata Release Schedule - Update
18:36:54 <amrith> #link https://releases.openstack.org/ocata/schedule.html
18:37:19 <amrith> so, to johnma's earlier point, we have a soft freeze for the trove client and guest requirements this week
18:37:36 <amrith> I see nothing changing that (other than the redis compile change, maybe)
18:37:44 <amrith> we should be good to go with that
18:37:50 <amrith> there's no actual 'deliverable' at this stage
18:37:55 <amrith> the only deliverable comes next week
18:38:02 <amrith> note that next week, we freeze the client.
18:38:09 <amrith> no if's, and's, or but's.
18:38:23 <amrith> trevormc, anything osc related that doesn't merge by jan 27 is not in ocata
18:38:34 <amrith> similarly peterstac the volume type support client side stuff
18:39:51 <amrith> we had one community goal for ocata
18:39:53 <amrith> #link https://review.openstack.org/#/c/396267/
18:39:55 <amrith> we completed it
18:40:11 <amrith> it is also the string soft freeze
18:40:18 <amrith> so the i18n changes should merge if at all possible
18:40:25 <amrith> the hard freeze for the i18n changes would be R-3
18:40:33 <amrith> any questions ...
18:41:05 <amrith> hearing none
18:41:08 <amrith> #topic open discussion
18:41:59 <amrith> anybody ...
18:42:30 <trevormc> so it sounds like there is more pressure on reviews this week than on uploading new patch sets to the troveclient. I can stop uploading patches if that's what we want for this week
18:43:05 <trevormc> meaning I'll have more time for reviews... :)
18:43:06 <amrith> you can upload all you want, they may not get reviewed :) but it'd be great if all could review/test this week
18:43:10 <johnma> I don't think that is what amrith meant trevormc.
18:43:28 <amrith> yes, what johnma says
18:43:35 <johnma> you should go ahead and upload whatever you want to get into this release
18:43:55 <johnma> we should also do our part in helping with reviews :)
18:44:11 <peterstac> for the i18n stuff it's not the reviews that's the problem, it's the gate :)
18:44:34 <amrith> if the gate is stable, the saturday morning merge schedule is best
18:44:48 <peterstac> yeah I do that a lot over weekends :D
18:44:49 <amrith> I queued up the rechecks for 530am on saturday and they fired automatically
18:44:59 <amrith> next time, I'll do them better; i.e. not 18 at a time :)
18:45:07 <amrith> but two or three at a time
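Splitting a queue of rechecks into groups of two or three, instead of firing 18 at once, might look like the sketch below. The `batches` helper and the placeholder change identifiers are hypothetical; how each recheck is actually submitted is left out.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Placeholder identifiers standing in for 18 queued changes.
queued = ["change-%02d" % n for n in range(1, 19)]
groups = list(batches(queued, 3))
```

Each group can then be rechecked and allowed to finish before the next one starts, so a single flaky run does not hold up the whole queue.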
18:45:30 <trevormc> Thanks for doing the housekeeping on those.
18:46:11 <amrith> no worries trevormc; happy to do it, I understand the constraint you have there
18:46:32 <amrith> ok, anything else for open discussion ...
18:46:39 <peterstac> nothing here
18:46:47 <songjian> https://review.openstack.org/#/c/421631/2  Perhaps we need to look at this patch, I reproduce the problem today, do not know whether it is universal
18:47:27 <amrith> i was thinking I'd look at it once the submitter could get basic tests to pass
18:47:40 <peterstac> just a pep8 failure
18:47:42 <amrith> I don't understand either the problem or the fix and the commit message is empty
18:47:45 <peterstac> should be easy to respin
18:48:00 <amrith> so if that isn't fixed, and there's no explanation for the change, it will be harder to review
18:48:03 <peterstac> yeah, it'd be good if the bug info was copied into the commit message
18:48:14 <peterstac> maybe I'll put a note there :)
18:49:00 <amrith> also, I'm not entirely positive of the environment setting that the guy shows in the bug
18:49:07 <amrith> I don't know that it is all legitimate
18:49:23 <amrith> are we sure that the settings aren't junk to begin with?
18:49:49 <amrith> is it 'really a bug'?
18:49:57 <amrith> what's the severity
18:50:08 <amrith> should there be additional tests to make sure this isn't breaking something else
18:50:14 <amrith> after all, we're 1 week from client freeze
18:50:28 <amrith> I'm not going to do back flips to merge this unless we're damn sure that it is right
18:50:31 <amrith> and not going to regress
18:50:53 <amrith> and if we aren't sure, I'd rather not take it, and wait till ocata comes out and respin the client
18:51:53 <peterstac> well, it only affects log-tail so it's fairly low-risk
18:51:59 <amrith> johnma, trevormc, aliadil, anything for open discussion
18:52:31 <songjian> thx,I also think so, stay focused
18:52:34 <trevormc> no I'm good.
18:52:42 <amrith> even so, if it is fairly low risk and limited to log tail, we can take it after the release is cut
18:52:46 <johnma> I am good. Thanks amrith
18:53:00 <amrith> ok, have a good afternoon folks
18:53:04 <amrith> #endmeeting