17:00:33 #startmeeting qa
17:00:34 Meeting started Thu Jul 18 17:00:33 2013 UTC. The chair is sdague. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:35 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:37 The meeting name has been set to 'qa'
17:00:46 hey folks, who's around for the meeting?
17:00:49 hi
17:00:50 hi
17:00:51 Hi!
17:01:23 also, if anyone cares, and has google hangouts working, I'm going to run my video camera open for the meeting - https://plus.google.com/hangouts/_/4770be3e1806b15ec7aaf802806072fbfc5c4a64?hl=en
17:01:24 hi
17:01:30 o/
17:01:47 sdague: cool I will try it
17:01:51 ok, lets get agenda up
17:02:29 agenda - https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
17:03:02 ok, topic 1 Havana-2 status check in - https://launchpad.net/tempest/+milestone/havana-2 (sdague .. et al)
17:03:39 so I'm going to have to push a bunch of stuff to H3
17:03:40 so we need to either close those out or move them to H3?
17:03:56 yeh, I wanted to see if anything else made H2 before I do the mass change
17:04:10 updates from folks?
17:04:29 I guess while we are waiting for those, we'll do testr check in
17:04:34 mtreinish: you're up
17:04:46 ok sure
17:04:58 https://review.openstack.org/#/c/35516/ I would like to have more eyes on this change, before adding everything else
17:05:19 so we've got testr running tempest now as a non voting job on every check job in zuul
17:05:21 afazekas: ok, cool, will look after meeting
17:05:37 #note testr non voting runs now in the check queue
17:05:43 results are looking promising, tempest takes about 15.5min in the gate
17:05:54 #info testr non voting runs now in the check queue
17:06:01 nice
17:06:04 very cool
17:06:06 mtreinish: cool
17:06:14 right now we're hitting 4 failures with the jobs, I've outlined them here: https://etherpad.openstack.org/debugging-testr-tempest
17:06:45 once we sort out these issues and the nonvoting jobs are stable enough we'll be able to migrate everything to using testr
17:06:51 that will be awesome
17:07:16 we still have the smoke test issue right, where smoke tests will run more than they should?
17:07:16 which is important for the neutron smoke job
17:07:17 so I'll appreciate any help with the debugging of those issues
17:07:37 any additional volunteers to help track down testr races?
17:07:43 sdague: so I think that can be sorted using subunit-filter
17:07:55 I would like to help in tracing the issues
17:08:02 afazekas: great
17:08:13 nice, mtreinish and mkoderer on the hangout now with me :)
17:08:23 afazekas: cool, that would be great
17:08:39 #action afazekas and mtreinish to work through testr issues to get that to be our primary runner in the gate
17:08:55 mtreinish: you should mute :)
17:08:57 also, I should note that there is an issue with the test count being reported with testr
17:09:02 sorry
17:09:24 mtreinish: cool, is there a bug up with testr on that?
17:09:39 IMHO we should have more stress task and job combinations in order to simplify the everything-racing-with-everything issue
17:09:54 not yet, but soon. I'm collecting some logs to give to lifeless, then I'll report it.
17:10:02 afazekas: I suspect the testr races are actually tempest's fault
17:10:05 and configure jobs for the suspected combinations
17:10:06 not the system
17:10:18 but I agree that the stress tests will shake out good things
17:10:30 ok, how about we talk about stress tests
17:10:38 good segue... mkoderer
17:10:39 sdague: I hope it will be a tempest failure, it is easier to debug :)
17:10:39 ok great
17:10:44 afazekas: :)
17:10:59 so let's have a short look at the open items
17:11:16 I think we can finish nearly everything next week...
17:11:28 awesome
17:11:28 open for me is the jenkins stuff
17:11:38 what is the current state here?
17:11:57 I think there is already something running right?
17:12:01 mkoderer: what jenkins stuff? we've got the periodic job working again after the json fix I pushed yesterday
17:12:03 mkoderer: which jenkins issue?
17:12:16 so we have those items open:
17:12:19 mkoderer: https://jenkins.openstack.org/job/periodic-tempest-devstack-vm-stress/
17:12:32 Create a json test description for a standard test to be run in a periodic jenkins job: INPROGRESS
17:12:43 mkoderer: that's done
17:12:57 mtreinish: can you update the blueprint?
17:13:04 ok cool... so I hope we can finish it next week
17:13:10 they've got a good whiteboard list of tasks in it
17:13:16 afazekas and I will add some new tests
17:13:18 sdague: sure, I'm really bad about dealing with blueprints...
17:13:18 #link https://blueprints.launchpad.net/tempest/+spec/stress-tests
17:13:32 https://jenkins.openstack.org/job/periodic-tempest-devstack-vm-stress/lastBuild/console
17:13:32 mtreinish: well I guess a scolding is in order :)
17:13:55 cool, dkranz on as well
17:14:07 sdague: I often forget to link my commits to my own blueprints....
17:14:10 ok, any other blueprints?
17:14:31 sdague: well, I added one for creating service tags...
17:14:37 but that is pretty self explanatory
17:14:48 I'll move stuff to H3 tomorrow. I'm actually curious what happened to the HP folks that were adding tests, I haven't seen updates on their BPs for a while
17:15:00 mtreinish: great
17:15:24 ok, moving on from blueprints
17:15:34 sdague: we got approval from our internal process so HP will be doing checking in H3 mostly
17:15:34 #topic Mailing list move (voting on moving to -dev) (sdague)
17:15:50 ok, so we discussed this on the mailing list some already
17:16:02 but I wanted to open up for discussion, vote here.
17:16:26 my concern is that by having our own mailing list we end up in a funny 'qa corner', which doesn't include the main dev teams
17:16:39 sdague: I'm fine with moving as long as there is a tag we agree on using for qa related topics
17:16:41 which I'd like to avoid, and be more integrated with the dev conversations
17:16:43 sdague: I think anything that separates qa from dev should be eliminated
17:16:55 sure, my suggestion is going to be a [qa] tag
17:17:04 which would apply to tempest, grenade, or other related things
17:17:18 given qa is the program name
17:17:23 sdague: that's fine, just wanted there to be a convention (makes mail filtering easier :)
17:17:34 +1
17:17:36 are there any dissenting opinions?
17:17:39 +1
17:17:43 +1
17:17:50 +1
17:17:51 ok, wait, let me use the vote thing :)
17:18:04 sdague: smiles all around
17:18:13 +1
17:18:13 #vote move mailing list traffic to openstack-dev list
17:18:36 hmm... maybe I don't know how to make that work :)
17:18:43 well we're +1s all around so far
17:18:45 any -1s?
17:19:09 ok.... we'll call that a done deal then
17:19:18 I'll work with infra to get us sorted tomorrow.
17:19:28 sdague: #startvote
17:19:38 http://ci.openstack.org/meetbot.html
17:20:01 sdague: ^^^ maybe you should read that as a meeting chair
17:20:11 #startvote move mailing list traffic to openstack-dev? Yes, No
17:20:12 Begin voting on: move mailing list traffic to openstack-dev? Valid vote options are Yes, No.
17:20:13 Vote using '#vote OPTION'. Only your last vote counts.
17:20:19 #vote Yes
17:20:26 #vote Yes
17:20:33 #vote Yes
17:20:35 #vote Yes
17:20:36 #vote Yes
17:20:39 #vote Yes
17:20:46 mtreinish: yes, probably... :)
17:20:51 #vote Yes
17:21:11 #vote Yes
17:21:16 ok, last chance to voice...
17:21:17 #vote you guys are just lonely
17:21:17 dansmith: you guys are just lonely is not a valid option. Valid options are Yes, No.
17:21:21 oh, damn.
17:21:27 heh :)
17:21:37 #endvote
17:21:38 Voted on "move mailing list traffic to openstack-dev?" Results are
17:21:39 Yes (8): krtaylor, mlavalle, mtreinish, dkranz, afazekas, sdague, Bageshree, mkoderer
17:21:46 ok
17:22:00 #agreed moving mailing list traffic to openstack-dev with [qa] tag
17:22:08 #action sdague to work with infra to implement
17:22:18 ok, thanks mtreinish for the assist
17:22:27 next topic
17:22:35 #topic White Box Tests
17:22:48 I don't know who listed this on the agenda, can they step forward?
17:23:28 also, reminder, if you have working google hangout, dkranz, mkoderer, mtreinish, and I are at - https://plus.google.com/hangouts/_/4770be3e1806b15ec7aaf802806072fbfc5c4a64?hl=en all muted, but you can see video
17:23:29 sdague: I put it there but I think it was on behalf of afazekas
17:23:38 ok, afazekas you want to speak to it?
17:23:42 So my question about the whitebox test, do we really want direct DB updates by tempest?
17:24:03 afazekas: I'd say no, but I thought that there was more in whitebox than just that
17:24:09 afazekas: that's a good question
17:24:36 mtreinish: mostly state transition matrix
17:24:39 afazekas: if we are sure that we can cover that content back in unit tests, I'd be happy to get it out of there
17:24:58 I think jaypipes brought that in to handle some bugs we weren't having any luck tracking down in unit tests
17:25:06 especially in nova state transitions
17:25:17 but we aren't actually running them right now
17:25:18 sdague: IMHO some parts can be covered even better by stress tests
17:25:19 right?
17:25:26 afazekas: that's true
17:25:36 if we can cover them other places, I'm totally good with removing them
17:25:53 sdague: I think there is a place for whitebox tests but the ones that cannot be covered otherwise are pretty low priority now.
17:26:26 yeh, what I'd say right now is they aren't a focus, if someone wants to clean them up, or to drop them because we've got it covered elsewhere, I'd +2 it
17:26:43 sdague: We could just scrutinize new submissions more closely to make sure they really need to be whitebox
17:26:51 Another related question, do we want to test state transition impossibilities (negative) by tempest (stable states, so expected to be gate friendly)
17:26:55 sdague: If there are any
17:26:56 yeh, I don't think we've had that many new submissions there
17:27:20 afazekas: if we think it will expose a bug, sure
17:27:28 ok
17:27:50 anything else on white box tests?
17:28:05 I saw a skip removal review
17:28:19 well even so, whitebox aren't in full
17:28:30 so a skip removal doesn't do anything
17:28:41 except for mtreinish's all periodic run
17:29:02 sdague: I didn't set that up...
17:29:28 ok, so for the all periodic (not mtreinish's) :)
17:29:53 ok, anything else on white box?
17:30:05 #topic Critical Reviews
17:30:09 Ok so someone secretly not restored the state matrix to the old one
17:30:32 afazekas: ok, is that something we need to fix?
17:30:46 I do not think so
17:30:47 also, critical reviews time, pimp your reviews that you need landed
17:30:56 or that someone should look at
17:31:01 But someone at the nova side should confirm it
17:31:06 sdague: We need to deal with the slow heat issue.
17:31:21 sdague: It is not really waiting for code review
17:31:28 dkranz: well, my current strategy is testr
17:31:32 sdague: But we have no way to run it now.
17:31:48 because the tempest runs dropped a lot with that
17:31:55 sdague: OK, we can push the limits of that
17:32:09 dkranz: agreed, and if we do, we split the heat case off
17:32:12 sdague: sbaker said there were more slow tests coming
17:32:25 sdague: I think he is just waiting for a way to run them.
17:32:32 ok, I guess those guys are all asleep right now, right?
17:32:39 sdague: Yeah.
17:32:57 dkranz: ok, how about we come back to this after running on reviews
17:33:01 sdague: Sure
17:33:14 just so that we don't prevent people from posting reviews they need eyes on, then I'll #topic it
17:33:27 ok, any more reviews?
17:33:33 or any reviews (I
17:33:34 https://review.openstack.org/#/c/33211/
17:33:37 sdague: A lot just need +A
17:34:04 afazekas: did we figure out why that went non votable?
17:34:40 dkranz: I'll make an effort to do some real reviews today. I've been bad about that this week.
17:34:47 dkranz: sure, but if there were specific reviews that are hot that we need eyes on, this is a good time to bring them up
17:34:52 basically a hot reviews time
17:35:15 sdague: I think every review is important for someone
17:35:16 also, remember, even if you aren't qa-core we could use review comments
17:35:16 sdague: I think just afazekas's leak stuff, which was already mentioned
17:35:34 afazekas: true, but not all of them have the same impact on the project
17:35:51 Of course we should try to review as rapidly as we can.
17:35:57 yep
17:36:04 sdague: the core reviewers' main duty is to review, AFAIK
17:36:16 I will need this one https://review.openstack.org/#/c/36820/ for the blueprint
17:36:27 There must be some gerrit bug with https://review.openstack.org/#/c/33211/
17:36:32 but dkranz found something so... I need to fix it first... ;)
17:36:41 ok, cool, got it up in my window
17:37:03 dkranz: could andreaf have made it a draft?
17:37:46 sdague: I don't know. It doesn't look any different except for can't review
17:37:51 #action sdague to take https://review.openstack.org/#/c/33211/ to -infra to figure out why we can't vote on it
17:37:57 because I think it was +2 all around
17:38:07 sdague: Right.
17:38:24 ok, any other reviews?
17:38:26 going once....
17:38:33 going twice....
17:38:39 ok, moving on
17:38:44 #topic Heat tests in gate
17:39:07 sdague: Even if we make a separate job, it could still take too long.
17:39:11 dkranz: so how slow is slow for these tests?
17:39:12 dkranz: any idea from sbaker how long running they think the heat tests are going to be?
17:39:26 mtreinish: well they are going to use real OS images, not cirros
17:39:28 The one there is supposed to be 10min
17:39:47 I think we need to make a separate job and give it a budget.
17:40:01 dkranz: ok, I'm fine with that
17:40:21 you want to take making that job?
17:40:23 dkranz: yeah if 1 test is 10min that's probably all we can do.
17:40:50 sdague: I can do that but won't get to it until Monday because I am out tomorrow.
17:40:55 sure, that's fine
17:41:04 I think, given their length, I'd like to start them on testr
17:41:16 even if that means they start non voting
17:41:25 assuming that's going to help with their overall run
17:41:26 sdague: We can not mark it as gate and then run the heat tests in the other job.
17:41:34 sdague: well it's going to be just the heat jobs right?
17:41:44 sorry heat tests in the heat job.
17:41:50 not everything + heat
17:41:54 mtreinish: Right
17:42:06 mtreinish: right
17:42:18 we're going to need to separate heat from the rest of api
17:42:32 sdague: Why?
17:42:42 well... conceptually
17:42:49 sdague: Sure.
17:42:52 because the main runs are going to be api - heat
17:43:00 can we do that sanely in testr?
17:43:02 dkranz: otherwise they'll get picked up when we tell testr/nose to run tests in tempest.api
17:43:09 without having to list a million subdirs?
17:43:13 sdague: But if heat is not marked gate it won't run in the main job.
17:43:25 and the heat job can just select heat
17:43:33 sdague: I'll prioritize figuring out how to run subunit-filter with our env
17:43:39 I'm not convinced we actually got that tag working in the gate
17:43:46 sdague: the autoscaling (scenario?) is the only slow heat test
17:43:47 actually the service tag bits would solve this
17:43:51 mtreinish: But if it is not gate it won't run
17:44:01 afazekas: ok, so there will be heat tests that don't do that?
17:44:13 dkranz: we run tempest.api tempest.scenario tempest.cli and tempest.thirdparty in the gate
17:44:16 we don't use the tag
17:44:17 so maybe the right thing to do is put the autoscaling tests outside of api
17:44:25 sdague: the existing ones are ok IMHO
17:44:31 afazekas: ok, cool
17:44:34 mtreinish: Oh. So why do we have it if we are not using it?
17:44:49 dkranz: because it got only partially implemented
17:44:50 dkranz: no idea I just know what's being run now
17:45:11 We really need to decide if we are using that or not.
17:45:21 It's silly to have it and not use it.
17:45:24 dkranz: agreed, lets solve the heat thing first
17:45:29 well, it is a pain to individually tag 1063 tests...
17:45:34 well, a lot of it was about dealing with testr
17:45:40 sdague: OK
17:45:43 but who was doing that work
17:45:46 because we had no idea if that was going to work in time
17:46:05 but now that it's close we can probably do a purge of the gate tag once it lands
17:46:11 mtreinish: A bunch of stuff with 'smoke' was submitted. I thought it was all but I guess not.
17:46:13 ok, anyway, the heat autoscale tests
17:46:32 sdague: I'll clear out gate when I start working on the service tags
17:46:44 what if we put that in as a scenario test, use the heatclient for those
17:46:47 Can't we exclude the heat directory with testr?
17:47:04 dkranz: well afazekas just said that a lot of the heat tests are fine in api
17:47:09 and not super long running
17:47:13 just the autoscaling bits
17:47:21 which feels like a scenario test to me anyway
17:47:34 if there are heat api tests in there now they're being run
17:47:40 yes, there are a couple
17:47:55 but I'd like to keep their simple api testing in the main runs
17:48:00 sdague: OK, so let's make it scenario.
17:48:11 as that means it gets tested on other dbs, with neutron, etc.
17:48:18 and only push autoscale separate
17:48:25 cool
17:48:31 We still need to keep it out of the main run.
17:48:37 yep
17:48:57 We could just mark it slow
17:49:05 but we'll figure out a tag to exclude on for that
17:49:08 and skip those in the main run
17:49:27 sure, or something, lets work that out later.
17:49:32 you want to communicate with sbaker about the approach?
17:49:46 sdague: OK
17:50:41 #info have the heat team submit the autoscaling tests as scenario tests to make it easy to give them their own time budget
17:50:56 #action dkranz to work with sbaker to get any other details sorted out
17:51:06 cool
17:51:08 ok
17:51:29 I had one more issue I should have put on the agenda
17:51:30 #topic Open Discussion
17:51:43 ok, any other topics
17:51:43 9 minutes left :)
17:51:53 testing and stability of python clients
17:52:00 Adding test cases with skip attribute
17:52:18 people have started talking about this as if it were true but we don't test it at all
17:52:18 go ahead
17:52:30 dkranz: who talked about it?
17:52:41 That is, using a new client library will work with older versions
17:52:42 afazekas: we shouldn't add a test case unless there is a way to run it in ci
17:52:42 maybe that's just education
17:52:59 dkranz: right, I think there is a missing kind of test run
17:53:09 which is new clients on stable release
17:53:18 sdague: The context was with uncapping versions of client libs
17:53:31 dkranz: right
17:53:36 sdague: It was agreed that uncapping was good because it was "not needed"
17:53:46 mtreinish: I have no idea how many test cases may have been lost somewhere in the history that should now be able to run..
17:53:58 dkranz: well actually it was agreed that it was good, because otherwise we release broken clients
17:54:01 sdague: I sent an email to the list about this a few days ago.
17:54:04 which happened with keystone
17:54:16 a last q: have we started putting together etherpad for tempest/testr or any framework topic to be discussed in summit?
17:54:18 sdague: I agree with the principle, but we don't test it at all
17:54:27 dkranz: not directly
17:54:31 but we do in the interactions
17:54:38 because nova uses keystone client
17:54:46 sdague: Yes, that gives us some coverage
17:54:47 and glance client, cinder, neutron client
17:54:52 under the covers
17:55:07 that was what we got ourselves in trouble with, as we weren't testing their git trees
17:55:09 afazekas: we have the skip tracker tool to see if bugs are fixed but it's not perfect
17:55:09 But it does not test that a client still runs against a previous release that has the same api version
17:55:19 dkranz: correct
17:55:26 afazekas: I'm sure there are skips we can remove in there now, but we don't want to add things with skips
17:55:27 and for that I think we need a new stable bitrot job
17:55:34 sdague: We would need to run old stable branch scenario tests with current clients
17:55:41 right, exactly
17:55:50 anyone want to volunteer to get that spun up?
17:55:58 it should be reasonably simple
17:56:08 sdague: That will be hard going back because the scenario tests are not many
17:56:14 But should get better in time
17:56:24 dkranz: actually, api does a lot of testing indirectly
17:56:34 because of the integration points
17:56:38 sdague: No, api does not use client libs
17:56:45 dkranz: not directly
17:56:53 dkranz: it does between the services
17:57:02 for example nova uses glanceclient to make images calls
17:57:03 but compute/volumes tests use cinderclient for nova => cinder comms
17:57:09 right, exactly
17:57:17 mtreinish: Yes, you're right
17:57:25 so it won't catch everything, for sure
17:57:27 I will create a blueprint for this.
17:57:35 but it will expose totally broken things
17:57:38 and should help
17:57:48 dkranz: great
17:58:09 #action dkranz to create blueprint for doing stable tests with upstream git clients
17:58:16 ok, anything in the last minute?
17:58:19 repeating my q again but have we started putting together etherpad for tempest/testr or any framework topic to be discussed in summit?
17:58:21 I've got a hard stop
17:58:32 Bageshree: Not yet.
17:58:37 Bageshree: not really, it's just what's in the code
17:58:45 thanks
17:59:11 ok, well I've got to run, so I'll close out the meeting, please continue the conversation in #openstack-qa
17:59:20 thanks all for coming
17:59:23 #endmeeting