17:03:58 <sdague> #startmeeting qa
17:03:59 <openstack> Meeting started Thu Jul 11 17:03:58 2013 UTC.  The chair is sdague. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:04:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:04:02 <openstack> The meeting name has been set to 'qa'
17:04:35 <sdague> #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
17:04:46 <sdague> ok, let's get rolling with the agenda
17:04:46 <dwalleck> And me
17:04:56 <sdague> #topic QA Program Business
17:05:25 <sdague> so QA is now an official program, which means we need a PTL and a program description
17:05:52 <sdague> I believe I'm still the only PTL nominee as of this meeting, so is there any objection to closing the nominations?
17:06:01 <mtreinish> sdague: nope, congrats
17:06:20 <dwalleck> nope, sounds like a good idea to me
17:06:21 <mlavalle> sdague: congrats Mr. PTL
17:06:34 <sdague> ok, thanks :)
17:06:34 <afazekas> sdague: congrats
17:06:51 <sdague> ok, topic 2 in QA
17:07:04 <sdague> #info sdague officially now QA PTL
17:07:17 <sdague> ok, next topic that is
17:07:22 <sdague> #topic QA Program Description
17:07:35 <sdague> I think we've got a pretty fruitful email thread going on on this one
17:07:57 <sdague> so I'd mostly like to just highlight that, and ask people to participate in it if they have other ideas
17:08:26 <sdague> I'll try to assimilate it all tomorrow into a mission statement we can take forward to the TC
17:08:35 <sdague> and get another round of feedback on it
17:08:51 <mtreinish> sdague: ok, sounds good to me
17:09:11 <sdague> if you haven't been on the ML thread it's here - http://lists.openstack.org/pipermail/openstack-qa/2013-July/000602.html
17:09:13 <sdague> #link http://lists.openstack.org/pipermail/openstack-qa/2013-July/000602.html
17:09:22 <afazekas> sdague: IMHO tools should be generally usable both for upstream and downstream use
17:09:38 <sdague> afazekas: ok, you want to get back on the thread there?
17:09:49 <sdague> I think that's the right place to drive the discussion
17:09:57 <sdague> as it lets everyone participate
17:10:15 <afazekas> sdague: yes
17:10:21 <sdague> cool, great
17:10:27 <dwalleck> afazekas: Agreed. A tool is just a tool. It should be flexible enough to use anywhere
17:10:55 <sdague> ok, so let's get to Blueprints
17:11:10 <sdague> #topic Blueprints
17:11:35 <sdague> #link https://launchpad.net/tempest/+milestone/havana-2
17:12:09 <mtreinish> sdague: when is the h2 deadline?
17:12:13 <sdague> can people provide #info updates to blueprints they are working on
17:12:18 <sdague> h2 is a week from now
17:12:33 <sdague> #info h2 happens July 18
17:13:16 <sdague> let's at least ping folks on the high and up ones
17:13:23 <sdague> mtreinish: how's testr looking?
17:13:42 <mtreinish> sdague: so I've respun the testrepository patch based on lifeless's review
17:13:42 <sdague> afazekas: how's the leak detection looking?
17:13:45 <afazekas> I have a small POC where I can add new detectors/monitors in 5-8 lines: https://review.openstack.org/#/c/35516/ . Before I start populating it with new detectors, I would like to know whether the basic frame is OK or not
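(Purely as illustration of the "new detector in 5-8 lines" idea -- the real frame is whatever the review above proposes, and every name below is invented:)

    # hypothetical frame: a detector snapshots state before and after a
    # test run and reports anything that leaked in between
    class Detector(object):
        def snapshot(self):
            raise NotImplementedError

        def leaks(self, before, after):
            return sorted(set(after) - set(before))

    # a concrete detector then really is only a few lines
    class VolumeLeakDetector(Detector):
        def __init__(self, volumes_client):
            self.volumes_client = volumes_client

        def snapshot(self):
            resp, volumes = self.volumes_client.list_volumes()
            return [v['id'] for v in volumes]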
17:13:47 <mtreinish> it's pending for review: https://code.launchpad.net/~treinish/testrepository/testrepository-group-regex/+merge/173749
17:14:14 <mtreinish> I also have a tempest patch pending that uses that approach to move run_tests.sh and the non-gating tox jobs over to testr --parallel
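(For context, the group_regex feature under review lets a .testr.conf keep related tests in the same parallel worker. A minimal sketch of how tempest might use it once the testrepository change lands -- the regex and command shown are assumptions, not the merged configuration:)

    [DEFAULT]
    test_command=${PYTHON:-python} -m subunit.run discover -t ./ ./tempest $LISTOPT $IDOPTION
    test_id_option=--load-list $IDFILE
    test_list_option=--list
    # group everything up to the last '.' in the test id, so all tests in
    # a class land in the same worker and share class-level fixtures
    group_regex=([^\.]*\.)*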
17:14:25 <sdague> afazekas: ok, I'll take a look this afternoon on the patch
17:14:36 <mtreinish> I can push it as WIP but it won't work unless you patch testr with those commits
17:14:43 <sdague> #info review requested on afazekas' leak detector patch https://review.openstack.org/#/c/35516/
17:15:13 <sdague> #info testr work awaiting review on testr code change - https://code.launchpad.net/~treinish/testrepository/testrepository-group-regex/+merge/173749
17:15:26 <sdague> mtreinish: yeh the WIP might be nice for folks to just see
17:15:30 <afazekas> mtreinish: can we add a temporary testr fork to tempest until your patch is merged?
17:16:05 <mtreinish> afazekas: I guess we could, but I'm reluctant to do that; if lifeless comes back with more changes it'll just diverge more when I make them
17:16:08 <sdague> afazekas: I think we should avoid that, as we'll be doing dual maintenance until testr gets the features
17:16:14 <mtreinish> sdague: ok, I'll push it out after the meeting
17:16:37 <afazekas> ok
17:16:38 <sdague> adalbas: you have an update on the grenade enablement in the gate?
17:17:15 <sdague> I think that's the last high priority blueprint that's up there that we've got folks around for
17:18:34 <sdague> ok, well let's move on to reviews
17:18:53 <sdague> #topic Critical Reviews
17:19:07 <afazekas> https://review.openstack.org/#/c/33211/
17:19:09 <sdague> ok, critical reviews people want to get more eyes on?
17:19:16 <sdague> #link https://review.openstack.org/#/c/33211/
17:20:01 <sdague> afazekas: ok, I'll relook at that
17:20:05 <sdague> as I'm the only one blocking it
17:20:06 <afazekas> I do not really see why we need this kind of test in tempest
17:20:18 <afazekas> sdague: ok, thx
17:20:56 <afazekas> https://review.openstack.org/#/c/35165/
17:21:19 <sdague> #link https://review.openstack.org/#/c/35165/
17:21:34 <sdague> ok, I'm opening up tabs to go take a look post meeting
17:21:39 <sdague> any other critical reviews?
17:21:54 <afazekas> https://review.openstack.org/#/c/36367/
17:22:19 <dwalleck> Is this supposed to be some sort of state-machine-like verification?
17:22:20 <sdague> #link https://review.openstack.org/#/c/36367/
17:22:22 <afazekas> The real heat scenario tests need a lot of time; how will we be able to handle that?
17:22:39 <afazekas> Probably the same will be true for the ceilometer tests
17:23:14 <sdague> afazekas: so we were actually discussing heat and ceilo in -infra yesterday
17:23:38 <sdague> I think we're going to try to handle ceilo by decorators on existing tests; dhellmann_ liked the idea and was going to propose something
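(The decorator approach was only an idea at this point; a hypothetical shape, with every name below invented for illustration, could be:)

    import functools

    def check_ceilometer_samples(counter_name):
        """Hypothetical decorator: run the wrapped tempest test unchanged,
        then assert that ceilometer recorded samples for the counter."""
        def decorator(test_func):
            @functools.wraps(test_func)
            def wrapper(self, *args, **kwargs):
                result = test_func(self, *args, **kwargs)
                # assumes some telemetry client exists on the test class
                samples = self.telemetry_client.list_samples(counter_name)
                self.assertTrue(samples,
                                "no %s samples recorded" % counter_name)
                return result
            return wrapper
        return decorator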
17:24:02 <afazekas> cool
17:24:27 <sdague> for heat, I think we might want to make a separate heat gate job. But we'll see.
17:24:50 <sdague> I guess a big part of landing those heat tests is whether the testr patch is accepted and we can get that in within a week
17:24:58 <sdague> if so we can just hold the heat patch until it's ready
17:25:10 <sdague> otherwise we should come up with an alternative that lets them be in the gate
17:25:13 <afazekas> ok, BTW: I tried to put together what I know about the heat tests: https://wiki.openstack.org/wiki/Blueprint-add-basic-heat-tests , I would like to confirm it with the heat devs
17:25:23 <sdague> afazekas: great
17:25:35 <sdague> #link https://wiki.openstack.org/wiki/Blueprint-add-basic-heat-tests heat test information
17:25:59 <sdague> ok, any other reviews?
17:26:28 <sdague> #topic Stress Tests in CI
17:26:51 <sdague> was this a dkranz thing? or a jog0-away thing?
17:27:15 <sdague> that's not promising that neither is around; maybe we skip this and see if jog0-away or someone else pops back up in a bit
17:27:28 <afazekas> mkoderer: ^^^
17:27:35 <mtreinish> sdague: dunno, but I setup a periodic for the stress tests last week
17:28:04 <mtreinish> there is a tox job that gets used for it, so if there are new stress tests just add them to the tox job
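(Roughly, that means a tox environment along these lines; the environment name and config path here are approximations, not the exact job definition:)

    [testenv:stress]
    # the periodic job invokes this; new stress tests get picked up by
    # adding them to the JSON config the runner consumes
    commands = python -m tempest.stress.run_stress tempest/stress/etc/stress-tox-job.json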
17:28:08 <sdague> mkoderer: is that your thing?
17:28:14 <mkoderer> no it's not
17:28:18 <sdague> which I also see at the end of the agenda
17:28:34 <sdague> ok, well why don't we take your topic now too, as it seems like it's related
17:28:44 <mkoderer> I know that dkranz wanted to do something with CI and stress tests
17:28:55 <mkoderer> yes sure
17:29:07 <sdague> mkoderer: or should we just take it to the mailing list because dkranz isn't around, and we want to get him in on it as well?
17:29:08 <afazekas> IMHO we should add more periodic stress jobs in the future..
17:29:21 <hemna> so a while back we wrote a cinder stress test to bang on our 3par drivers.  we recently published it on github
17:29:32 <afazekas> As I remember, another stress test was proposed on the mailing list
17:29:51 <sdague> yeh, I really think we've got a proliferation of different but similar approaches
17:29:59 <hemna> we've been running it to find issues w/ our drivers and cinder and nova
17:30:37 <mkoderer> sdague: I spoke to dkranz; we can discuss it here now
17:30:44 <sdague> mkoderer: go for it
17:30:52 <afazekas> IMHO OpenStack has too many race issues nowadays, so we should put more effort into stress testing ..
17:30:57 <mkoderer> so we are planning a stress test for the new backup function in cinder
17:31:08 <mkoderer> so question .. .what is the right tool? ;)
17:31:10 <sdague> hemna: so how similar, or different is it from the stress stuff in tempest today?
17:31:20 <mkoderer> I think the tempest stress tests look quite good
17:31:32 <hemna> I haven't seen the tempest stress stuff yet
17:31:43 <sdague> hemna: can you take a look?
17:31:52 <hemna> sure
17:31:58 <mkoderer> so I already put a new test for review
17:32:01 <sdague> it would be nice if we could figure out a way to have everyone playing in one stress test tree
17:32:03 <hemna> here is our stress tool FWIW : https://github.com/terry7/openstack-stress.git
17:32:11 <mkoderer> https://review.openstack.org/#/c/36652/
17:32:11 <hemna> sdague, +1
17:32:12 <mtreinish> hemna: they're here:  https://github.com/openstack/tempest/tree/master/tempest/stress
17:32:13 <sdague> so we don't duplicate infrastructure
17:32:25 <afazekas> the tempest stress stuff is simpler but configurable. The tool I saw on the ML was more complex but less flexible
17:32:29 <hemna> mtreinish, thanks
17:32:43 <afazekas> I see good value in both tools
17:32:46 <sdague> right, I guess are they resolvable to meet everyone's needs?
17:32:57 <sdague> because maintaining 2+ tools for this is silly
17:33:02 <hemna> yah
17:33:02 <sdague> if we can avoid it
17:33:14 <afazekas> sdague: It wouldn't be an easy job
17:33:29 <hemna> we wrote ours just as a quick and dirty tool to get the job done, and then we found it useful for finding issues
17:33:44 <sdague> hemna, mkoderer: I guess can I ask you two to get together over the next week and figure out if we could merge these efforts in some way?
17:33:45 <afazekas> we would need to increase the complexity
17:33:47 <hemna> I'd be willing to work on a stress tool, as it has great benefit for us as well as the community as a whole
17:34:00 <afazekas> cool
17:34:01 <adalbas> sdague, sorry, just joined now.
17:34:02 <mkoderer> sdague: yes sure!
17:34:25 <sdague> afazekas: well we could also spin the stress directory out of tempest if it would be dragging too much complexity there
17:34:34 <sdague> as a program we don't need to be just one git tree
17:34:58 <sdague> and we'll already be 2 really soon: grenade is going to fall under this program; dtroyer and I just need to work out a groups thing in gerrit
17:35:14 <afazekas> :), it wouldn't be too big a difference
17:35:18 <adalbas> i have an update on grenade job in the gate, sdague
17:35:29 <sdague> adalbas: ok, let's come back around to that in a minute
17:35:53 <sdague> hemna: ok, you up for taking some time this week with mkoderer to figure out if things could merge together?
17:36:08 <sdague> I don't want to volunteer folks for things unless they have the time :)
17:36:15 <hemna> so the tempest test, is there a way to focus the testing on one component?  say I want to stress cinder only
17:36:20 <afazekas> So the tempest stress tool can do similar stress, but in a different way.
17:36:21 <hemna> or does it run against the entire OS suite ?
17:36:56 <hemna> sdague, yah I'll take a look at the tempest stress tool and see if it can do what my tool does
17:37:04 <sdague> hemna: cool, thanks
17:37:05 <mkoderer> hemna: currently there are just 2 tests inside
17:37:11 <afazekas> hemna: the basic concept of the tempest stress tool is that you define worker threads that run a periodic stress job
17:37:30 <afazekas> for cinder it would be great if it read cinder's log file as well
17:37:36 <hemna> I would suppose there would be 'plugins' for each OS component
17:37:45 <sdague> #action hemna and mkoderer to look at the possibility of merging https://github.com/terry7/openstack-stress.git and https://github.com/openstack/tempest/tree/master/tempest/stress into one approach
17:38:08 <hemna> and possibly parameterize the run to say...I want to test these OS components only...
17:38:10 <sdague> we'll put this back on the agenda for next week to check in on what people found
17:38:18 <afazekas> hemna: https://github.com/openstack/tempest/blob/master/tempest/stress/actions/volume_create_delete.py
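(For a feel of the linked action: a stress action is a class whose run() each worker thread calls in a loop, driven by a small JSON config. A simplified sketch, with method names approximate rather than copied from the file:)

    from tempest.common.utils.data_utils import rand_name
    import tempest.stress.stressaction as stressaction

    class VolumeCreateDeleteTest(stressaction.StressAction):
        def run(self):
            # one worker iteration: create a volume, wait until it is
            # available, then delete it again
            name = rand_name("volume")
            volumes_client = self.manager.volumes_client
            resp, volume = volumes_client.create_volume(size=1,
                                                        display_name=name)
            volumes_client.wait_for_volume_status(volume['id'], 'available')
            volumes_client.delete_volume(volume['id'])

(and a driver config in the style of tempest/stress/etc/, mapping the action to a number of worker threads:)

    [{"action": "tempest.stress.actions.volume_create_delete.VolumeCreateDeleteTest",
      "threads": 4}]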
17:38:18 <hemna> sdague, ok
17:38:22 <dwalleck> I'd definitely like to see some more flexibility in stress testing. What I'm interested in is combinations/randomization of load. That's when things get interesting
17:38:22 <sdague> #action revisit stress testing next week
17:38:33 <hemna> I don't see why it can't meet everyone's needs in 1 tool/codebase
17:38:47 <sdague> hemna: we should be able to poke just one component, I think that should be a design point
17:38:57 <hemna> I've found race conditions in my FibreChannel code in Nova due to my tool.  been very helpful
17:39:04 <sdague> cool
17:39:08 <hemna> sdague, yah I agree.
17:39:17 <sdague> ok, great
17:39:29 <sdague> ok, let's move on to the last topic then
17:39:33 <sdague> #topic Full OpenStack Networking gate job
17:39:49 <sdague> anyone around from neutron, or anyone who's working on that?
17:40:17 <sdague> you know I think I'll need to start asking folks to put their names next to agenda items :)
17:40:19 <afazekas> I added this topic to the agenda
17:40:24 <sdague> ok, cool
17:40:29 <sdague> afazekas: the floor is yours
17:40:45 <afazekas> What is missing for a full voting neutron gate job?
17:41:25 <afazekas> As I see it, tens of test cases are failing for ~7 reasons
17:41:38 <sdague> the biggest problem is API issues where neutron causes different error codes when called via nova
17:41:46 <nati_ueno> hi are you talking about gating issues now?
17:41:58 <sdague> nati_ueno: no, though I was about to bring that up
17:42:11 <afazekas> Can we skip those, with a bug reference, until they are fixed?
17:42:13 <sdague> because we're about to be non voting on neutron at all
17:42:16 <nati_ueno> sdague: gotcha
17:42:24 <sdague> afazekas: the problem is no one is working to fix them
17:42:32 <mlavalle> sdague, afazekas: i've been working on fixing some of those issues
17:42:40 <sdague> ok, sorry :)
17:42:47 <mlavalle> in fact, I have a review pending
17:42:49 <afazekas> Do we track / group those issues in any way?
17:42:54 <mlavalle> https://review.openstack.org/#/c/35724/
17:43:04 <mlavalle> that is aimed at one of those issues
17:43:15 <afazekas> I reported one today ..
17:43:33 <sdague> mlavalle: great
17:43:50 <mlavalle> sdague: there is also Jordan Pittier working on this
17:43:54 <sdague> #info please review - https://review.openstack.org/#/c/35724/
17:44:12 <mlavalle> sdague: I will ping him to see what progress he has achieved
17:44:22 <mlavalle> on the issues
17:44:26 <mlavalle> he is working on
17:44:27 <sdague> mlavalle: yes, jordan seemed to get blocked recently though, as he went back down the path of special-casing tempest, which I really don't agree with
17:44:50 <sdague> nova v2 api can't return different values with neutron in the backend if we want to provide a seamless migration path for people
17:44:51 <mlavalle> sdague: I will review the email thread and see how I can help
17:44:58 <sdague> mlavalle: cool, thanks
17:45:48 <afazekas> IMHO we should have an etherpad or wiki page or blueprint or something where we track the full-neutron-gate blocker issues
17:46:15 <mlavalle> afazekas: We already have an etherpad. I will resend it to the ML
17:46:34 <afazekas> mlavalle: cool, thank you
17:46:55 <sdague> mlavalle: great, do we have a blueprint for this?
17:47:13 <mlavalle> sdague: no, I've been working off of the etherpad
17:47:23 <sdague> so can you make a blueprint as well
17:47:27 <sdague> link it to the etherpad
17:47:32 <mlavalle> sdague: sure
17:47:39 <sdague> but we can track it as a high priority item to revisit each week
17:47:39 <afazekas> cool :)
17:47:48 <sdague> great
17:48:08 <mlavalle> sdague: please add it as a todo to this meeting's log
17:48:09 <sdague> so one other thing... we're actually about to turn off voting on neutron entirely
17:48:24 <sdague> #action mlavalle to create a blueprint for full neutron gate tests
17:49:09 <sdague> because there is a flaky neutron bug with at least 124 rechecks on it
17:49:22 <sdague> #link https://bugs.launchpad.net/neutron/+bug/1194026
17:49:25 <uvirtbot> Launchpad bug 1194026 in neutron "check_public_network_connectivity fails with timeout" [Critical,Confirmed]
17:49:34 <sdague> nati_ueno: you were poking at this last night, any updates on it?
17:49:38 <afazekas> sdague: that neutron recheck bug is very annoying; maybe it's enough to skip just that one
17:49:58 <nati_ueno> sorry, no. I'm writing stress code for floating ip
17:50:00 <sdague> afazekas: so the concern was that the neutron team couldn't reproduce it locally
17:50:13 <sdague> which means if we skip it in the gate, it's impossible to debug
17:50:23 <sdague> so instead the approach was to make neutron non voting
17:50:42 <sdague> nati_ueno: ok, no problem, someone said you were working on it earlier
17:50:47 <afazekas> sdague: my problem is that I frequently trigger another issue when I try to reproduce a race issue .. :(
17:50:51 <nati_ueno> Sorry for the inconvenience. Mark and I are OK with non-voting for now
17:51:02 <afazekas> But I will try it
17:51:24 <nati_ueno> Mark, Gary, and I are at least working on this issue as a top priority
17:51:25 <afazekas> sdague: BTW: https://review.openstack.org/#/c/33932/
17:51:53 <afazekas> I usually use tracers, which need to run from the beginning
17:51:59 <sdague> afazekas: right, I think part of the issue is that neutron logging doesn't seem sufficient in the gate, which makes this harder to debug. We might end up needing some enhancements to neutron to make that better, so we can get to problems quickly in the gate
17:52:07 <afazekas> That is the reason why that change would be helpful for me
17:52:44 <sdague> afazekas: ok, let me ponder that
17:53:00 <sdague> the reality is, when we hit these kinds of races in nova, we can usually get really close to the issue with just the debug logs
17:53:01 <afazekas> I have doubts that tracing just a single process is enough this time; maybe I will need to use systemtap ..
17:53:20 <sdague> so I'd hope we could do the same with neutron
17:53:59 <sdague> ok, I think we're good there for now
17:54:07 <sdague> #topic Open Discussion
17:54:17 <sdague> any other items?
17:55:27 <sdague> going once
17:55:35 <sdague> going twice
17:55:47 <sdague> ok, thanks for joining everyone, we'll see you on #openstack-qa
17:55:53 <sdague> #endmeeting