15:02:30 #startmeeting third-party
15:02:31 Meeting started Mon Apr 6 15:02:30 2015 UTC and is due to finish in 60 minutes. The chair is anteaya. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:33 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:36 The meeting name has been set to 'third_party'
15:02:39 hello
15:02:51 hi
15:02:52 o/
15:02:56 hi
15:02:59 raise your hand if you are here for the third party meeting
15:03:00 o/
15:03:02 hi
15:03:04 hi
15:03:10 so eantyshev you had a question
15:03:16 hi
15:03:43 eantyshev: would you like to share your question?
15:05:10 or maybe later
15:05:24 does anyone else have something they would like to discuss today?
15:05:30 anteaya: the SCP publisher plugin seems not to store logs when one of the build steps fails
15:05:39 okay here we go
15:06:01 eantyshev: alright well we are the most helpful when we have a stacktrace to look at
15:06:28 so if you could find a stacktrace and paste it somewhere, paste.openstack.org is an option
15:06:36 and bring us the url we can all look
15:06:45 thought this is its normal behavior...
15:06:50 and then someone can offer a suggestion of what to do next if they have one
15:07:19 no trace, it just doesn't do anything
15:07:24 well if a build fails we need to know why
15:07:31 so we need logs in that situation
15:07:33 eantyshev, which "build step"
15:07:59 devstack-gate
15:08:29 can you paste the code for that step?
15:08:50 http://paste.openstack.org/
15:10:13 eantyshev, I'd say look at others' setups for SCP, if the testing moves on before the copy is complete it may not capture all the artifacts
15:10:34 eantyshev, I can't help much here, we are using swift
15:10:38 asselin: http://paste.openstack.org/show/198549/
15:11:25 eantyshev: how long is the build running before it fails?
15:11:29 i've seen instances where the test will take longer than the job timeout and that will result in no logs being uploaded. What is the type of build error?
15:12:06 akerr: yes, that is why having a build-timeout 5 minutes longer than the job timeout is important
15:12:26 krtaylor: afraid not in my case, the only way I could get them exported was to abort the build
15:12:40 the build-timeout is 6 hours longer, which is a tad excessive but at least it is longer
15:13:39 akerr: http://paste.openstack.org/show/198550/
15:15:42 I noticed jenkins.openstack.org is at 1.565.3, and puppet deployed 1.580.2 for me
15:16:24 eantyshev: how long does the job run before it fails?
15:16:39 anteaya: 30min
15:16:46 okay
15:17:21 can you run devstack successfully by hand on your environment?
15:18:11 anteaya: yes, it fails in tempest
15:18:16 and just so I get a sense, who else has items they would like to discuss today?
15:18:43 eantyshev: so when you run devstack by hand in your environment, your tempest tests fail, did I understand you correctly?
15:19:00 anteaya: yes
15:20:07 okay well it doesn't have anything to do with your jenkins version then
15:20:53 until you can get devstack and your tempest tests passing in your environment when running them manually, devstack-gate is just going to continue to fail for you
15:21:12 eantyshev: Do you have the "Copy after failure" option checked for the scp publisher plugin on the job?
15:21:24 so focus on running devstack by hand in your set up and learn the logs for the tempest tests
15:21:32 akerr: good thought
15:23:09 does anyone else have anything they would like to discuss today?
15:23:10 akerr: I don't have such an option in job -> configure
15:23:39 eantyshev: do you have a gui for your plugin?
15:23:40 anteaya: persistent disconnects using the jenkins ssh plugin in ec2 if anyone has a good fix for that
15:23:49 beecee: you are up next
15:24:18 eantyshev: do you use jenkins job builder or do you manually define the tests?
15:25:22 akerr: I'm using Hudson SCP publisher plugin https://wiki.jenkins-ci.org/display/JENKINS/SCP+plugin
15:25:48 akerr: jobs are made by JJB
15:26:35 eantyshev: I'm going to have to cut this off now, to make some room for beecee
15:26:56 eantyshev: the best idea is to continue working on your issue and chatting on irc with others
15:27:17 anteaya: Okay
15:27:24 is there someone in the meeting who can take some time this week and help eantyshev with some basic navigation of his/her setup?
15:27:39 :(
15:27:44 that is a disappointment
15:27:54 I didn't mean for them to leave
15:28:03 well moving on then
15:28:08 beecee: you were saying?
15:28:51 anteaya: ok, jenkins 1.605, EC2 plugin, provisions a new instance each time gerrit says so.
15:29:18 just sometimes the slave doesn't respond to jenkins at random points during the devstack-gate build
15:29:28 that isn't helpful
15:29:38 any artifacts you can share?
15:29:40 even tried on an xlarge instance in case it was mem/cpu related
15:30:05 not really sorry, was more hoping for a 'flip this bit you dumbkoff' response ;)
15:30:58 ah
15:31:10 well I don't use the phrase dumbkoff personally
15:31:21 so there is a disappointment for you right off the bat
15:31:29 anyone using the EC2 plugin?
15:31:46 nope :(
15:31:57 sorry to interrupt here, but in case eantyshev decides to read the logs of this meeting here is a link to the scp publisher definition for JJB which shows how to enable copy-after-failure: http://paste.openstack.org/show/198559/
15:32:20 akerr: thanks for that
15:32:34 #link http://paste.openstack.org/show/198559/
15:32:59 so if anyone sees eantyshev again please share that with him/her, and if you don't know how please ask
15:33:08 on that note, here's the link to openstack's custom scp plugin: #link http://tarballs.openstack.org/ci/
15:33:09 beecee: we might not have much for you
15:33:14 anteaya: I'll go back to the google-fu saltmines on this one
15:33:27 beecee: do come back and share if you find a fix
15:33:39 so we can have something in the logs to share with others
15:33:46 asselin: thank you
15:34:17 beecee: sorry we don't have anything else for you today
15:34:27 does anyone have any other item they would like to discuss?
15:35:07 I figured out my soft lockup issue last week
15:35:13 asselin: oh do tell
15:35:20 #link https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1413540
15:35:21 Launchpad bug 1413540 in linux (Ubuntu Trusty) "Trusty soft lockup issues with nested KVM" [High,In progress] - Assigned to Chris J Arges (arges)
15:35:30 well done
15:36:16 just posted my workaround: use qemu on top of kvm. Default is kvm on top of kvm.
15:36:26 ah
15:36:47 asselin: did you get any further on your issue with test_minimum_basic_scenario?
15:37:01 I played around with power settings on the baremetal bios. It seemed to help a bit, but not 100%.
15:37:33 asselin: glad it at least didn't take you backward
15:37:39 rhe00, no, that was late last week. Will do that today. I was focusing on the soft lockup issue
15:38:06 asselin: ok, I didn't make any progress either over the weekend
15:38:31 mtreinish said he'd help out today
15:38:37 who else was hitting that one?
15:38:47 I had thought there was one other person
15:38:59 anteaya: yes, trying to remember his nick
15:39:03 my memory is telling me 3 for some reason
15:39:12 o/
15:39:21 it was in the infra channel right after this meeting last week
15:39:23 kaiser something?
15:39:25 marcusvrn1: was it you
15:39:31 yes!
15:39:45 yay you are here
15:39:54 marcusvrn1: anything to share that might be helpful?
15:40:13 rhe00: kaisers1 was part of the discussion last week too, yes
15:40:18 good memory
15:40:30 no...I didn't make any progress
15:40:37 okay that's fine
15:40:47 I hope you all are in #openstack-qa
15:40:49 I'm continuing on that
15:40:55 Also, for cinder folks: tempest has a false positive! https://bugs.launchpad.net/tempest/+bug/1440227
15:40:57 Launchpad bug 1440227 in tempest "encrypted volume tests don't check if a volume is actually encrypted" [Undecided,New]
15:41:09 so that when asselin and mtreinish talk about it, you can contribute your thoughts too
15:41:20 so if you're passing that test, you might actually be failing....
15:41:26 asselin: wonderful, thanks for finding that
15:41:33 I'll join the #openstack-qa
15:41:56 marcusvrn1: good idea
15:41:58 asselin: darn it! :)
15:42:33 are any of the operators willing to assign themselves that bug?
15:42:37 asselin: I was happy that mine was passing. thanks for the heads up. :)
15:43:00 when you agree to take on a bug, you don't have to write the code to fix it, you just have to find someone to help you with that
15:43:14 and update the bug report and make sure progress is being made
15:43:22 so it is a bug in the tempest test that doesn't check the encrypted flag?
15:43:23 you can get lots of help if you take a bug
15:43:43 rhe00, no doubt. We had one driver passing, and one failing. After investigating, couldn't figure out why the one passing was working. It was a driver bug.
15:44:33 asselin: my driver is passing too
15:44:34 rhe00, right, if the driver incorrectly doesn't set the encrypted flag it will pass the encrypted volume test
15:45:04 ok, but if we fix the bug in tempest, a lot of CIs will start failing. Is it up to the bug fixer to fix all the drivers as well or? just asking what the common practice is?
15:46:04 rhe00, yes, if they aren't setting the encrypted flag.
15:46:25 asselin: ok
15:46:26 not sure what the process is if there are a lot of drivers with that issue. Better to stay ahead
15:47:06 to fix the bug in tempest you fix the tempest bug
15:47:23 it is the driver maintainer's responsibility to fix their driver
15:47:29 which is why it is good you are here
15:47:41 anteaya, +1
15:47:44 anteaya: yes. :)
15:47:45 the point of the test is to test that something is working
15:47:59 anteaya: +1
15:48:06 if it isn't working, we need to know (failing tests) and for folks to fix what is theirs to fix
15:48:18 so if you take on the bug, all you have to do is fix the tempest test
15:48:32 then it will properly fail when it actually is failing now anyway
15:48:40 and folks will know about it and can fix their code
15:48:43 make sense?
15:48:50 fail for the right reason instead of the wrong reason
15:48:55 exactly
15:49:07 so first we have to fail for the right reason
15:49:16 ok, is there still time to do it for Kilo?
15:49:26 anteaya, or pass for the right reasons :)
15:49:30 right
15:49:46 rhe00: the right question is how do we ensure code quality for kilo
15:49:59 so someone take this to cinder core in cinder channel
15:50:07 make sure it gets on the cinder meeting agenda
15:50:27 then cinder will know and prioritize it as they need to for rc's
15:50:29 make sense?
15:50:34 I can add it to the cinder agenda
15:50:39 thank you
15:51:02 and please have a chat in cinder channel so that folks know and can figure out the priority
15:51:22 thanks for bringing this up
15:51:30 anteaya, +1 probably better to do that + add a topic on mailing list
15:51:52 asselin: I agree though what is posted to the ml is thingee's call I would say
15:52:04 as clarity is important, not confusion
15:52:09 anteaya, ok
15:52:12 thanks
15:52:26 when you don't know whether to say something or not?
15:52:32 say something in open source
15:52:49 we would rather know something is broken and address it with confidence
15:53:09 than ship something someone knew was broken but didn't tell anyone else
15:53:20 at the very least release notes are good
15:53:26 make sense?
15:53:43 +1
15:53:48 great
15:53:53 so 6 minutes
15:54:01 anyone have anything else?
15:54:42 we can stare at our screens for another 5 minutes or move on
15:54:56 I'm for moving on
15:55:00 any objections?
15:55:15 okay thanks everyone
15:55:22 I appreciate your participation
15:55:26 see you next week
15:55:36 #endmeeting
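
Note on the tempest false positive raised at 15:40:55: the point made at 15:44:34 is that a driver which never sets the encrypted flag still passes the encrypted volume test. A minimal Python sketch of the missing check might look like the following. This is not the tempest test or a cinder API; the function name, the dictionary layout, and the volume ids are all hypothetical and only illustrate "fail for the right reason".

```python
# Illustrative only: the volume dictionaries and the function name are
# hypothetical stand-ins, not tempest or cinder APIs.

def check_encrypted_volume(volume):
    """Raise AssertionError unless the volume record reports encrypted=True."""
    if volume.get("encrypted") is not True:
        raise AssertionError(
            "volume %s was expected to be encrypted but does not "
            "report encrypted=True" % volume.get("id")
        )


# A driver that sets the flag correctly passes the check.
good_volume = {"id": "vol-1", "encrypted": True}
check_encrypted_volume(good_volume)

# A driver that never sets the flag now fails for the right reason
# instead of passing silently.
bad_volume = {"id": "vol-2", "encrypted": False}
try:
    check_encrypted_volume(bad_volume)
    outcome = "passed"
except AssertionError:
    outcome = "failed"
```

With a check of this shape in place, drivers that were passing only because nothing inspected the flag would start failing, which matches the concern raised at 15:45:04.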