17:02:04 #startmeeting qa
17:02:05 Meeting started Thu Jun 19 17:02:04 2014 UTC and is due to finish in 60 minutes. The chair is mtreinish. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:02:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:02:09 The meeting name has been set to 'qa'
17:02:11 hi who's here today?
17:02:14 hi
17:02:15 hi
17:02:23 o/
17:02:25 hi
17:02:26 hi
17:02:30 o/
17:02:31 #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting#Proposed_Agenda_for_June_19_2014_.281700_UTC.29
17:02:35 ^^^ Today's agenda
17:03:05 let's get started
17:03:15 #topic Specs Review
17:03:22 we should also put the midcycle in the agenda somewhere
17:03:30 so we have 2 prepopulated on the agenda today
17:03:39 sdague: sure I'll fit it in after bps
17:03:51 #link https://review.openstack.org/94473
17:03:54 mtreinish: I put the first one
17:03:56 dkranz: I'm assuming you added that one
17:04:02 dkranz: ok go ahead
17:04:09 I chatted with boris-42
17:04:26 They are working on this but I encouraged him to update the spec as soon as possible
17:04:34 Because we want to review the spec first
17:04:41 Also, no one else can contribute until he does
17:04:54 He said he would update it soon, by next week in any event
17:05:18 dkranz: ok good. We can always just take over the bp if it takes too long
17:05:26 So I will continue to keep after this because it is important
17:05:29 because this is an important one
17:05:35 It was also raised on the fits ml
17:05:58 I think that's it unless someone else has a comment
17:06:26 dkranz: I didn't even know that was a ML
17:06:45 dkranz: ok
17:06:55 #link https://review.openstack.org/#/c/101232/
17:07:03 yfried: you posted this one
17:07:09 yeah
17:07:22 what can I say that's not in the spec?
17:07:48 it's part of mlavalle's and my larger effort on enhancing scenarios, starting with network
17:08:13 yfried: ok
17:08:21 code duplication in scenario is something that's been bothering me for a long time
17:08:46 yfried: I think this just needs to be reviewed
17:08:50 are there any objections? should I add more details?
17:08:55 when it starts passing jenkins everyone will review it
17:09:05 I don't think anyone has had a chance to really read it yet
17:09:12 mtreinish: ok.
17:09:16 yfried: yeh, I'd run tox locally to make sure it will pass once uploaded
17:09:24 faster than waiting for nodes to cycle back around
17:09:35 sdague: +1
17:09:41 what kind of gate is there for an rst file?
17:09:53 yfried: that it actually compiles
17:09:57 yfried: syntax check
17:10:05 ok are there any other specs to bring up today?
17:10:15 yfried: look in the failed log
17:10:16 yes
17:10:16 anyway, could I use this stage to ask about scenarios in general?
17:10:26 dkranz: I will
17:10:29 andreaf: ok go ahead
17:10:43 #link https://review.openstack.org/#/c/86967/
17:10:49 the non-admin one
17:11:08 is there any further concern with it, did people have a chance to review the latest version?
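[A note on the rst gate check mentioned a few lines up ("that it actually compiles"): a minimal sketch of what such a syntax check amounts to, assuming docutils is installed. The file path is hypothetical, and the real gate runs the spec repo's tox environment rather than a script like this.]

```python
# Minimal sketch of an rst "does it compile" check, roughly what the spec
# repo's gate verifies. Assumes docutils is available; the path is made up.
from docutils.core import publish_doctree
from docutils.utils import SystemMessage

SPEC = "specs/juno/example-spec.rst"  # hypothetical path

try:
    with open(SPEC) as f:
        # halt_level=2 makes warnings fatal, so malformed rst surfaces here
        publish_doctree(f.read(), settings_overrides={"halt_level": 2})
    print("%s parsed cleanly" % SPEC)
except SystemMessage as exc:
    print("rst error in %s: %s" % (SPEC, exc))
```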
17:11:30 andreaf: I am still concerned that it's not doable until the preallocated ids one is done
17:11:39 which I don't think yet exists
17:11:59 sdague: yes that's the other spec I wanted to talk about https://review.openstack.org/#/c/98400/
17:12:46 andreaf: I propose a possible solution there - my aim was to try and make it as transparent and efficient as possible for people configuring tempest
17:13:20 but we could go with the assumption that each process will use up to N accounts and ask people to configure N x number of processes accounts
17:13:34 with some naming convention to split them across processes
17:13:36 sdague: why do you think it is not doable? It is not as good, but should still work with no isolation. You just can't run parallel.
17:13:55 andreaf: so the first one depends on hierarchical multi-tenancy or the pre-allocated ids, both of which don't exist yet
17:14:14 dkranz: I think it builds this other path of things we don't self-test
17:14:36 I basically think we should remove non-isolation from the code, because it's not regularly tested
17:14:59 we run it every night
17:15:01 it doesn't work
17:15:06 mtreinish: right
17:15:10 and no one fixes it
17:15:12 sdague: perhaps, but not until non-admin is working
17:15:16 sdague: mtreinish: it's still needed by some
17:15:34 yfried: not if we do pre-allocated ids
17:15:47 this is why I think this whole effort has to start there
17:15:55 sdague: mriedem and I were looking at it the other day
17:16:05 we kinda got sidetracked with gate things
17:16:05 if the non-isolation code is removed, tests cannot be run in deployments using VLAN Manager.
17:16:21 mtreinish: have you looked at https://review.openstack.org/#/c/98400/?
17:16:22 sdague: I think that is fine. Just saying we should not rip out existing code yet because it is used.
17:16:31 dkranz: yeh, not today
17:16:35 sdague: Just because we choose not to have gate jobs for other configs
17:16:38 dpaterson: what's the specific issue there?
17:16:47 dkranz: I'm trying to use that as priority setting
17:16:51 andreaf: it's next on my spec review list. I have some ideas on doing the threading stuff
17:16:59 sdague: fine
17:17:12 So in testing an HA deployment that requires vlan manager you cannot turn on tenant isolation
17:17:13 1) do pre-allocation of ids 2) do non-admin 3) remove the non-isolation case
17:17:15 sdague: I'm unfamiliar with pre-allocated ids, so I won't press the issue. but just
17:17:22 the non-isolated way is useful with the n-net linuxbridge driver, because tempest does not de-allocate the implicitly assigned tenant networks
17:17:23 sdague: sure
17:17:40 afazekas: that's just a bug right?
17:17:42 sdague: That is the issue.
17:18:00 sdague: the tenant isolation code does not work with many possible network configs. Only those used in the gate now
17:18:10 sdague: on tenant deletion it could be solved by some extra code afaik
17:18:19 sdague: Because it makes too many assumptions about the network
17:18:28 sdague: Of course this could be fixed
17:18:30 dkranz: sure, but then let's solve them
17:18:38 instead of keeping around a job which is always failing :)
17:18:40 ok, in any case tenant pre-allocation should work with all nw configs, or?
17:18:58 sdague: But then we will have the config issue because we will want to have the fixes tested
17:19:12 sdague: can we have another job with nova-network vlan?
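[To make the "N accounts per process" idea above concrete, here is a minimal, entirely hypothetical sketch: accounts follow a naming convention and each test process only touches its own slice, so workers never race over the same tenant. The function names, file name, and format are illustrative only; the spec under review (98400) may settle on something quite different.]

```python
# Hypothetical sketch of partitioning preallocated test accounts across
# parallel test processes by naming convention. Not tempest code.
import itertools

ACCOUNTS_PER_PROCESS = 4   # the "N" from the discussion above
NUM_PROCESSES = 2


def load_accounts():
    """Pretend loader: in reality this would parse whatever file format the
    preallocated-ids spec settles on (e.g. an accounts.yaml)."""
    return [{"username": "tempest-p%d-%d" % (p, s),
             "tenant_name": "tempest-p%d-%d" % (p, s),
             "password": "secret"}
            for p, s in itertools.product(range(NUM_PROCESSES),
                                          range(ACCOUNTS_PER_PROCESS))]


def accounts_for_process(accounts, process_index):
    """Hand each worker only the slice of accounts reserved for it."""
    start = process_index * ACCOUNTS_PER_PROCESS
    return accounts[start:start + ACCOUNTS_PER_PROCESS]


if __name__ == "__main__":
    accounts = load_accounts()
    for worker in range(NUM_PROCESSES):
        print(worker, [a["username"] for a in accounts_for_process(accounts, worker)])
```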
17:19:22 if there are specific requirements for the setup of accounts it would be good to include them in the spec for preallocated accounts https://review.openstack.org/#/c/98400/
17:19:23 dkranz: we can just unit test that the proper calls are being made
17:19:30 and assume that works
17:19:36 we won't have perfect gate coverage
17:19:38 dkranz: possibly, but to get anywhere, this has to start with the pre-allocation of ids
17:19:42 mtreinish: ok, that's a reasonable compromise
17:19:57 sdague: yes
17:20:06 dpaterson: I'd still like to know if you think it's impossible to work with vlan or just that it doesn't today?
17:20:17 sdague: just that it does not today
17:20:21 ok
17:20:33 sdague: also some tempest tests fail and they will have to be fixed too
17:20:46 ok so the takeaway is for everyone to go and review andreaf's spec :)
17:20:49 sdague: because tempest assumes there is a single fixed network shared by all tenants
17:20:50 and then we can go from there
17:21:06 dkranz: right, that we need to fix anyway. So it's goodness.
17:21:29 sdague: just broken today, would be great to run parallel tests with vlan but not possible today.
17:21:47 ok are there any other specs to bring up?
17:21:51 https://review.openstack.org/#/c/91725/
17:22:01 again :)
17:22:06 #link https://review.openstack.org/#/c/91725/
17:22:11 so like mtreinish said, let's focus on andreaf's spec and try to get that approved by next week
17:22:36 dpaterson: ok, I haven't had a chance to review the latest rev yet
17:22:47 but I'll take a look when I can
17:22:51 dpaterson: I will look at it today
17:23:01 k, thanks
17:23:20 ok are there any other specs? If not let's move on
17:24:01 #topic Blueprints
17:24:03 dpaterson: the only other thing I'd like in it is a --dry-run that doesn't do any deletes, just shows what it would delete
17:24:15 so we can use that for audit
17:24:24 otherwise it looks good
17:24:29 #link https://blueprints.launchpad.net/tempest/
17:24:34 sdague: k good idea
17:24:43 ok are there any in-progress bps that we need to discuss
17:24:46 or have a status update on
17:25:38 mlavalle: ?
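[A rough sketch of the --dry-run behaviour sdague asks for above: the flag makes a cleanup tool report what it would delete without calling any delete APIs, which is what makes it usable for audit. The names and structure here are illustrative and not taken from the spec under review (91725).]

```python
# Illustrative sketch only: a --dry-run switch for a leaked-resource cleanup
# tool, as suggested above. The real tool defines its own interface.
import argparse


def find_leaked_servers():
    # Stand-in for querying the compute API for servers left behind by runs.
    return ["tempest-server-1", "tempest-server-2"]


def delete_server(name):
    print("deleting %s" % name)  # would call the compute API here


def main():
    parser = argparse.ArgumentParser(description="clean up leaked test resources")
    parser.add_argument("--dry-run", action="store_true",
                        help="only report what would be deleted")
    args = parser.parse_args()

    for server in find_leaked_servers():
        if args.dry_run:
            print("would delete %s" % server)  # audit mode: no deletes
        else:
            delete_server(server)


if __name__ == "__main__":
    main()
```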
17:26:01 ok then, let's move on
17:26:11 hi
17:26:19 at the next meeting we'll look at the essential and high-prio in-progress items
17:26:47 #topic Midcycle Meetup
17:26:59 #link https://wiki.openstack.org/wiki/Qa_Infra_Meetup_2014
17:27:38 mtreinish: regrets :(
17:27:41 so for those who haven't seen it yet, we're having a midcycle meetup the week of July 14th in Darmstadt
17:27:57 mtreinish: But I am going to the neutron thing the week before
17:28:09 dkranz: ok no worries
17:28:17 I just booked yesterday, looking forward to it
17:28:24 the schedule has basically been finalized at this point
17:28:36 I'm working on the list of topics to discuss and work on
17:28:44 so if people have ideas please share :)
17:29:05 - more debug on the gate
17:29:12 - periodic stress jobs
17:29:16 afazekas: heh, that's a constant
17:29:27 :)
17:29:41 also if you're planning to attend please add yourself to the wiki
17:29:48 * mtreinish looks at mordred
17:30:08 other than that I don't think there is anything else about it that's not on the wiki
17:30:12 mordred is a quantum phenomenon, he'll just appear
17:30:45 sdague: yeah but we're using that as a head count for planning :)
17:30:51 although I guess we can assume that +1
17:30:58 dkranz / afazekas: given all the fedora in gate interest, it might be nice to get more folks that seem to be working on that there
17:31:12 I don't know who else from redhat would be appropriate
17:31:41 sdague: I'll suggest that
17:31:41 psedlak will be there as well.
17:32:08 afazekas: ok can you add them to the wiki?
17:32:30 The test repository / test server / client etc. is quite a large one; it would be interesting to share some ideas about where we want to go on that
17:32:52 andreaf: yeah I'm sure that'll come up :)
17:33:00 well if there isn't anything else about the midcycle let's move on
17:33:11 #topic Grenade
17:33:25 sdague: the floor is yours
17:33:29 sure
17:33:42 it's actually kind of been a grenade week for me
17:34:16 our last big failure cause in the gate is because of slow nodes in the new hp cloud, some timeout increases around screen helped
17:34:46 this was basically services not starting on the new side because screen hadn't finished spawning bash
17:35:01 also, the grenade logs in the gate aren't all in console.html any more
17:35:13 but instead peeled off like in devstack runs
17:35:50 on the things that are currently in progress: EmilienM is trying to finish getting the javelin2 pieces I put in tempest to work in grenade
17:36:11 sdague: well we probably should approve that bp before we approve more patches...
17:36:15 I think there is still an issue with image resources because they weren't in the right place
17:36:20 mtreinish: sure
17:36:48 there will also be more resource support coming on the tempest side for that, driven by EmilienM, he's doing heat and ceilometer resources
17:36:52 and neutron
17:37:06 so I have a todo to fix the spec, I'll do that today
17:37:20 cool, sounds like grenade is getting active
17:37:22 which is a good thing
17:37:39 yeh, been great to get assistance from EmilienM. Also happy if other folks want to jump in
17:37:46 oh, I also have a bashate integration patch :)
17:37:57 heh, so it was renamed
17:38:03 https://review.openstack.org/#/q/status:open+project:openstack-dev/grenade+branch:master+topic:bashate,n,z
17:38:07 it's in process
17:38:11 cool
17:38:32 ok anything else on the grenade front?
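[The javelin2 work mentioned in this topic is about creating resources before a grenade upgrade and verifying they survived it. A very rough, hypothetical sketch of that check pattern follows; none of these names come from the actual javelin code, and the fake client exists only to make the sketch runnable.]

```python
# Hypothetical sketch of the "resources survive an upgrade" pattern behind
# the javelin2 work: create and record resources on the old side, then
# verify them on the new side. All names here are invented.
import json


class FakeComputeClient:
    """Stand-in for a real compute client, just so the sketch runs."""
    def __init__(self):
        self._servers = {}

    def create_server(self, name):
        server_id = "id-%s" % name
        self._servers[server_id] = name
        return server_id

    def server_exists(self, server_id):
        return server_id in self._servers


def create_resources(client, names, state_file="survivors.json"):
    created = {name: client.create_server(name) for name in names}
    with open(state_file, "w") as f:
        json.dump(created, f)  # recorded so the post-upgrade run can find them


def verify_resources(client, state_file="survivors.json"):
    with open(state_file) as f:
        created = json.load(f)
    missing = [n for n, sid in created.items() if not client.server_exists(sid)]
    if missing:
        raise RuntimeError("resources lost across upgrade: %s" % missing)


if __name__ == "__main__":
    client = FakeComputeClient()
    create_resources(client, ["grenade-test-server"])   # before the upgrade
    verify_resources(client)                            # after the upgrade
```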
17:38:38 mostly I'm trying to make grenade less confusing to people, hopefully that will help get more folks engaged
17:38:40 I think that's it
17:39:06 ok, yeah I agree making it less confusing is a good goal :)
17:39:15 ok well let's move on then
17:39:26 #topic Neutron full voting on icehouse and juno (afazekas)
17:39:32 afazekas: this is you
17:40:07 So the neutron single-thread smoke job and the multi-thread full job take almost the same time to run
17:40:45 afazekas: it looks like a 10min delta
17:40:45 I am wondering, do we have any known issue which prevents replacing the smoke neutron job with the full one?
17:41:00 afazekas: what's the current success rate?
17:41:30 mestery or salv-orlando said there are some neutron bugs, but we should be making the switch by j-2
17:42:09 sdague: I'm not sure of the numbers, I'd have to dig them up
17:42:28 sure, I was hoping afazekas had them as he added the agenda item :)
17:42:41 mtreinish: I am sitting next to mestery in a meeting. I can ask him about the bugs
17:42:50 http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOiBcIkxvYWRpbmcgbm9kZSBlbnZpcm9ubWVudCB2YXJpYWJsZXNcIiBBTkQgcHJvamVjdDpcIm9wZW5zdGFjay90ZW1wZXN0XCIgQU5EIGJ1aWxkX25hbWU6XCJjaGVjay10ZW1wZXN0LWRzdm0tbmV1dHJvbi1mdWxsXCIgQU5EIGJ1aWxkX3N0YXR1czpcIkZBSUxVUkVcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTQwMzE5NzUwNzM1NiwibW9kZSI6IiIsImFuYWx5emV
17:42:50 fZmllbGQiOiIifQ==
17:42:51 http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOiBcIkxvYWRpbmcgbm9kZSBlbnZpcm9ubWVudCB2YXJpYWJsZXNcIiBBTkQgcHJvamVjdDpcIm9wZW5zdGFjay90ZW1wZXN0XCIgQU5EIGJ1aWxkX25hbWU6XCJjaGVjay10ZW1wZXN0LWRzdm0tbmV1dHJvblwiIEFORCBidWlsZF9zdGF0dXM6XCJGQUlMVVJFXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MDMxOTc2MTQ4NDQsIm1vZGUiOiIiLCJhbmFseXplX2ZpZWx
17:42:55 kIjoiIn0=
17:43:41 afazekas: big urls...
17:43:45 afazekas: these links don't show anything
17:44:38 Unfortunately I see a difference
17:45:16 54 failures with smoke, 123 with full, out of 405 jobs
17:45:21 afazekas: yeah it's not quite there yet..
17:45:33 mtreinish: mestery doesn't have the details about the bugs in the Neutron job. He agreed to add the issue as an explicit point this coming Monday during the Neutron IRC meeting
17:45:36 so, who's actually working on it?
17:45:53 mlavalle: ok
17:46:02 rosella_s and salvatore
17:46:13 sdague ^^^
17:46:15 I think that's the concern I have, to the point where I'd almost like a moratorium on neutron tests until that's resolved
17:46:33 because I feel like more of the neutron team should be fixing this
17:47:11 as this was really supposed to have been an icehouse goal
17:47:25 sdague: I think that's a fair point
17:47:55 sdague: I will make sure that by Monday we have a detailed list of bugs to be fixed, so we know what the gap is
17:48:02 mlavalle: great
17:48:12 sdague: Fair point indeed, I'll take this up and get back to you.
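[For scale, and assuming both failure counts above are out of the same ~405 runs over the week queried (the log does not give per-job totals), the rates work out to roughly

\[ \frac{54}{405} \approx 13\% \ \text{(smoke)} \qquad \frac{123}{405} \approx 30\% \ \text{(full)} \]

i.e. the full job was failing at more than twice the rate of the smoke job at this point.]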
17:48:24 I think we should hold off on reviewing new neutron tests until we make the switch
17:48:29 either way
17:48:49 sdague, mtreinish: if the team feels that is the best use of my upstream bandwidth, I can take this as my main focus over the next few weeks
17:48:53 mlavalle: If you can tell me the 1-2 most frequent failure signatures, I can also have a look
17:48:54 yeh, especially because the time budget concerns are going to be different
17:49:01 mlavalle: yeh, that would be great
17:49:10 I think this is the most important neutron qa item at the moment
17:49:23 especially because parallel exposes so many things
17:49:42 sdague: +1
17:49:53 I'll send something to the ML about holding off on adding new neutron tests
17:50:03 ok, I will realign my priorities... by Monday I will have a clear picture of where we are on this front
17:50:10 mlavalle: great, thanks
17:50:33 mlavalle: great, yeah hopefully we're almost there
17:50:50 I don't think we are that far, I think it is a matter of this being someone's key priority
17:51:22 ok let's move on, ~10min left
17:51:40 #topic Critical Reviews
17:51:55 so there are 3 from yfried on the list already
17:52:04 #link https://review.openstack.org/#/c/62101/
17:52:09 #link https://review.openstack.org/#/c/92573/
17:52:17 #link https://review.openstack.org/77816
17:52:25 mtreinish: the first is kinda your baby. I'm just a guardian
17:52:36 that 3rd one is blocked per the discussion we just had...
17:52:47 mtreinish: I don't follow
17:53:06 it adds a new neutron test right?
17:53:18 are there any other reviews people want to bring up
17:53:24 mtreinish: we should not kill in-flight patches
17:53:25 mtreinish: I think we could accept tests that are already in flight
17:53:33 patch #2 was previously approved, I think it's fine to take it back in
17:53:54 my main concern now is patch 1
17:54:01 mtreinish: any objection to my fast-approving https://review.openstack.org/#/c/92573/
17:54:05 dkranz: it's just holding off on hitting the +A
17:54:30 sdague: nope +A
17:54:44 dkranz: if they aren't hugely expensive
17:54:47 LGTM
17:54:53 the lbaas tests in flight are long
17:55:03 and I think we need to reevaluate where that's all going
17:55:12 sdague: Agreed, but that is a different issue
17:55:16 dkranz: sure
17:55:21 sdague: but scenarios should be longer than usual
17:55:38 sdague: as they are testing complex operations
17:55:53 https://review.openstack.org/#/c/98065/ this should stabilize the lbaas tests
17:56:14 This whole gate budget thing is a nightmare we need to resolve but there is no quick fix.
17:56:29 yeh, the gate time budget should be on the agenda for germany
17:56:48 sdague: yeah I'll add it there
17:56:51 sdague: dkranz: could we set complex/long scenarios to run less frequently?
17:57:02 yfried: I think we probably need to figure out which lbaas scenario is the right one to gate every commit on, and just run the rest in the periodic job
17:57:24 I don't think we should have 3 that each run over a minute in the main gate
17:57:26 sdague: that's what I'm asking. we shouldn't give up on them though
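[On the idea just above of gating on one cheap lbaas scenario and pushing the rest to a periodic job, and the general "slow tag" that comes up next: a hedged sketch of how a slow attribute on a test lets a job include or exclude it by regex. This builds on the attr tagging testtools provides and tempest already wraps; the class, tests, and selection regex are illustrative, not a finalized scheme.]

```python
# Illustrative only: tagging a long scenario as "slow" so a per-commit gate
# job can skip it while a periodic job runs everything. Tempest wraps the
# testtools attr decorator in much this way; details here are a sketch.
import testtools
from testtools.testcase import WithAttributes, attr


class TestLoadBalancerBasic(WithAttributes, testtools.TestCase):

    @attr('slow')
    def test_load_balancer_round_robin(self):
        # long multi-resource scenario; only the periodic job would run it
        self.assertTrue(True)

    def test_load_balancer_minimal(self):
        # the one cheap variant kept in the per-commit gate
        self.assertTrue(True)


# With WithAttributes the tag shows up in the test id, e.g.
# "...TestLoadBalancerBasic.test_load_balancer_round_robin[slow]",
# so a gate job could exclude it with a regex along the lines of
#   (?!.*\[.*\bslow\b.*\])(^tempest\.scenario)
# while a periodic job simply runs without the exclusion.
```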
17:57:33 yfried: We could run them nightly but historically, nightlies are treated as 10th-class citizens
17:57:48 sdague: we could always just start making a general slow tag and job
17:57:53 not just limit it to heat
17:57:53 If we could crack that issue it would open up many more possibilities
17:57:54 dkranz: yeh, but with nightly being in the experimental queue
17:58:13 I'm not so sure I buy the fact that it doesn't count
17:58:31 sdague: didn't understand your last comment
17:58:32 sdague: so that's a general acceptance of the patches?
17:58:40 sdague: If we added an extra worker to the parallel jobs this kind of thing would not be an issue
17:58:58 afazekas: there is a limit
17:59:02 afazekas: that doesn't actually speed things up that much
17:59:06 it's a diminishing margin
17:59:35 sdague: could I just have your general thoughts about patch #3 before the mtg is over?
17:59:39 mtreinish: a better worker balance would be more helpful
17:59:54 sdague: you were initially against it
18:00:40 afazekas: it'll help but not give us a huge decrease in run time
18:00:51 yfried: only because no one showed test results for it
18:01:11 #link https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:branchless-tempest-docs,n,z
18:01:12 we're at time
18:01:18 sdague: yeah, that's because it's multi-node
18:01:18 yeah I just noticed that
18:01:23 hwclock drift
18:01:29 I need to switch to the ntp daemon
18:01:30 mtreinish: use ntp :)
18:01:36 instead of a cron job
18:01:39 #endmeeting