13:31:39 <esberglu_> #startmeeting powervm_ci_meeting
13:31:39 <openstack> Meeting started Thu Dec  8 13:31:39 2016 UTC and is due to finish in 60 minutes.  The chair is esberglu_. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:31:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:31:42 <openstack> The meeting name has been set to 'powervm_ci_meeting'
13:31:49 <esberglu_> Hey guys
13:32:26 * adreznec waves
13:33:40 <thorst_> o/
13:34:36 <esberglu_> #topic status
13:34:51 <esberglu_> So those runs are _slowly_ going through
13:35:13 <adreznec> Yeah
13:35:16 <esberglu_> The runs themselves seem to be fine
13:35:18 <thorst_> how slow?
13:35:20 <adreznec> Seeing some scary times out there on the queue
13:35:52 <esberglu_> thorst_: I can ping you the zuul ip if you want to look
13:36:06 <thorst_> I have it
13:36:08 <thorst_> 10 hours?
13:36:13 <esberglu_> But like 10 hours
13:36:14 <esberglu_> Yeah
13:36:28 <adreznec> Do we know what's bogging things down yet?
13:36:31 <thorst_> are any actually going?
13:36:59 <thorst_> the jenkins has a ton of idle VMs.
13:37:40 <esberglu_> Yeah. There are 3 going through right now, 20 - 30 have gone through in the last 12 hours
13:37:46 <adreznec> Yeah
13:38:00 <adreznec> It doesn't actually look like anything's been running for all that long
13:38:03 <esberglu_> That's about the volume I would expect
13:38:07 <adreznec> but things have been in the queue for a long time
13:38:13 <esberglu_> It's just that they sit around in the queue forever first
13:39:16 <esberglu_> Which means the queue just keeps getting bigger
13:41:03 <adreznec> Ok, so I think we need to nail down exactly what's causing the initial queuing to build up
13:41:28 <adreznec> If it's git issues, then we probably need to invest in mirrors at this point
13:42:57 <thorst_> did we tell Zuul to only let 3 through at a time?
13:43:16 <thorst_> wasn't there some gate in zuul about throughput?
13:43:34 <thorst_> I don't really know how this could be git...what's that train of thought there?
13:44:21 <thorst_> (not that a mirror is a bad idea...)
13:44:38 <esberglu_> No, the only zuul conf change was moving nova from the silent pipeline to the check pipeline
13:44:45 <thorst_> hmm
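A quick way to check whether Zuul itself is capping throughput is to ask its embedded Gearman server how many jobs are queued versus how many workers are registered. This is only a sketch: it assumes Zuul v2's default Gearman port 4730 on the Zuul host and that netcat is available.

    # Gearman's admin "status" command prints one line per job function:
    # FUNCTION <total jobs> <running jobs> <available workers>.
    # Many queued jobs but only ~3 registered workers would point at the
    # node/worker side rather than a Zuul pipeline setting.
    echo status | nc -w 2 localhost 4730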
13:45:13 <adreznec> Well we were seeing those git performance issues yesterday, and one theory was that we were hitting some kind of internal timeouts doing the clones/fetches
13:45:40 <thorst_> ahh, cause zuul does some sort of clone
13:45:51 <thorst_> which I don't understand...I'd have thought that was just in the Jenkins slave VM
13:45:51 <adreznec> Because we could see it attempting to do the same fetch multiple times on different PIDs
13:46:03 <esberglu_> We were seeing these git fetch <change> commands
13:46:10 <esberglu_> That seemed to just be looping
13:46:15 <adreznec> Not sure we have enough data to say that concretely
13:46:21 <adreznec> But it was a theory
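The looping-fetch theory is cheap to check on the Zuul host; a sketch, assuming the mergers run standard git over SSH and procps ps is available:

    # Show every in-flight git fetch / upload-pack with its PID and elapsed
    # time. The same command line reappearing on fresh PIDs would back the
    # "fetch keeps restarting" theory; a single PID stuck for hours would
    # point at a hung transfer instead.
    ps -eo pid,etime,user,args | grep -E '[g]it (fetch|upload-pack)'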
13:46:54 <esberglu_> The only other thing that I thought it might be
13:47:09 <esberglu_> There are these changes in the queue that depend on like 10 other changes
13:47:31 <esberglu_> And some of the changes are having merge issues
13:48:01 <thorst_> why is zuul doing this?  ssh -i /var/lib/zuul/ssh/id_rsa -p 29418 powervmci@review.openstack.org git-upload-pack '/openstack/nova'
13:48:18 <esberglu_> Here's an example of one of those changes https://review.openstack.org/#/c/337789/
13:48:32 <esberglu_> Not sure
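For changes like that one, Gerrit's SSH query API can show how deep the dependency chain goes before Zuul can even build a merge commit to test; a sketch reusing the CI account and key path from the command above:

    # --dependencies adds dependsOn/neededBy to the output, i.e. everything
    # Zuul has to merge first before it can test change 337789.
    ssh -i /var/lib/zuul/ssh/id_rsa -p 29418 powervmci@review.openstack.org \
        gerrit query --dependencies change:337789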
13:49:34 <thorst_> that process has been running for a while
13:51:27 <adreznec> Interesting...
13:51:29 <thorst_> the commit message in zuul about why that runs is "I'll document this later"
13:51:48 <adreznec> That should never really be a particularly long-running command
13:52:27 <thorst_> I suggest we kill that proc
13:52:30 <thorst_> and see if we unwedge.
13:52:54 <esberglu_> Sure
13:53:23 <adreznec> thorst_: How long is a while?
13:53:27 <adreznec> Hours?
13:53:51 <thorst_> says 08:48 in the ps aux output
13:54:15 <thorst_> so under 5 min.
13:54:19 <thorst_> it's done now.
13:54:36 <esberglu_> Yeah I killed it. Another one just popped up in its place
13:55:09 <thorst_> did you kill that second one?
13:55:13 <thorst_> they just seem to be really slow
13:55:14 <esberglu_> No
13:56:13 <adreznec> Right
13:56:35 <thorst_> wonder what git-upload-pack does
13:56:44 <thorst_> needs some investigation, because I don't think a clone would help that...
13:58:52 <thorst_> well...when in doubt, just run by hand.
13:59:05 <thorst_> it returns quite the amount of data.
14:00:22 <adreznec> I think it does discovery/fetching of objects from git during a fetch
14:00:39 <adreznec> Not 100% sure on that
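For the record, git-upload-pack is the server side of fetch/clone: the client's git fetch runs it on the remote, it advertises the refs and then streams the requested objects back as a pack. Timing the same command by hand (key path and repo taken from the process listing above) at least separates a slow Gerrit from a wedged Zuul; with empty stdin it should only send the ref advertisement and exit:

    # openstack/nova on Gerrit carries a very large number of refs, so even
    # the advertisement alone can be a lot of data, which would match what
    # running it by hand showed.
    time ssh -i /var/lib/zuul/ssh/id_rsa -p 29418 powervmci@review.openstack.org \
        git-upload-pack '/openstack/nova' </dev/null | wc -c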
14:02:45 <thorst_> alright...so that's the status: figure out why we're wedged.
14:02:56 <thorst_> (since we're over on time in the meeting)
14:03:09 <adreznec> Yeah
14:03:16 <adreznec> Clearly we need longer than 30 minutes to investigate this
14:03:28 <thorst_> just running the command ourselves may take 30 minutes
14:03:33 <esberglu_> Yeah. Other than that I put a wiki page up for CI
14:03:48 <esberglu_> If you guys want to take a look. Still need to finish a few sections and polish it up
14:04:56 <adreznec> Where did it land?
14:04:58 <adreznec> Novalink wiki?
14:05:34 <esberglu_> Neo dev wiki
14:05:49 <esberglu_> Subpage under PowerVM CI System
14:06:04 <adreznec> Ok
14:06:12 <esberglu_> _WIP_ CI System and Deployment
14:06:16 <thorst_> so that is also for wangqwsh as you train him to be able to redeploy the CI?
14:06:20 <esberglu_> Yep
14:06:38 <thorst_> excellent.  And if we do need a git mirror, that may be a good project for wangqwsh to drive
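If a mirror does turn into a work item, a minimal sketch is a bare mirror clone on a local box that the CI nodes fetch from, refreshed on a timer; the mirror path, upstream URL choice, and the devstack knob are assumptions:

    # One-time setup on the mirror host.
    git clone --mirror https://git.openstack.org/openstack/nova /srv/git/nova.git

    # Periodic refresh, e.g. from cron every few minutes.
    cd /srv/git/nova.git && git remote update --prune

    # CI slaves would then clone from the local box instead of upstream,
    # e.g. by pointing devstack's GIT_BASE at the mirror host.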
14:07:24 <esberglu_> #endmeeting