17:00:15 <tjones> #startmeeting VMwareAPI
17:00:16 <openstack> Meeting started Wed Oct  2 17:00:15 2013 UTC and is due to finish in 60 minutes.  The chair is tjones. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:19 <openstack> The meeting name has been set to 'vmwareapi'
17:00:22 <tjones> hi folks - who's here?
17:00:49 <dims> howdy tjones
17:00:53 <tjones> hi dims
17:00:54 <garyk> hi, i am here
17:00:57 <tjones> hi gary
17:01:22 <garyk> tjones: hi
17:01:37 <vuil> hi
17:01:46 <tjones> ok - i think everyone knows that hartsocks is on paternity leave for a few weeks.  so i'll run the meetings for a bit
17:02:05 <garyk> mazal tov! (aka good luck)
17:02:12 <tjones> lets get started - i ran the bug report this morning - http://paste.openstack.org/show/47843/
17:02:22 <tjones> oops -
17:02:26 <tjones> #topic bugs
17:02:57 <tjones> in terms of high/critical bugs we have
17:03:01 <tjones> #link https://bugs.launchpad.net/bugs/1227825
17:03:04 <uvirtbot> Launchpad bug 1227825 in openstack-vmwareapi-team "datastore selection bug - fills first disk only" [Critical,In progress]
17:03:14 <tjones> which hartsocks was working on.
17:03:24 <garyk> tjones: that has been deferred to I as it is a feature
17:03:26 <smurugesan> Hey All, Sabari here
17:03:52 <tjones> this is the issue where we will only use 1 datastore and then throw exceptions when full.  russellb has put it back into rc potential.  can someone take this over?
17:03:52 <garyk> russellb said that there may be a slight chance of getting review for it but it is doubtful
17:04:01 <smurugesan> I can take it over
17:04:13 <tjones> i think it's very close.  thanks smurugesan
17:04:24 <smurugesan> because i have a disk usage bug that can only be fixed after Shawn's patch.
17:04:31 <smurugesan> let me pull that for record
17:04:39 <tjones> #action smurugesan take over  https://bugs.launchpad.net/bugs/1227825
17:04:41 <uvirtbot> Launchpad bug 1227825 in openstack-vmwareapi-team "datastore selection bug - fills first disk only" [Critical,In progress]
17:04:56 <tjones> then we have 2 high/medium that need revision
17:05:10 <tjones> #link https://bugs.launchpad.net/bugs/1184807
17:05:12 <uvirtbot> Launchpad bug 1184807 in openstack-vmwareapi-team "Snapshot failure with VMwareVCDriver" [High,Fix committed]
17:05:15 <smurugesan> https://bugs.launchpad.net/nova/+bug/1220459 will piggy back on 1227825's fix
17:05:17 <uvirtbot> Launchpad bug 1220459 in nova "VMware Driver reports incorrect disk usage" [High,Confirmed]
17:05:40 <tjones> oops - looks like the script has a bug.  that one is already committed
17:05:56 <garyk> tjones: that has been approved (it may be in the script as it is a grizzly backport)
17:06:21 <tjones> ah ok.  lets look at the other one #link https://bugs.launchpad.net/bugs/1213269
17:06:22 <uvirtbot> Launchpad bug 1213269 in openstack-vmwareapi-team "_check_if_folder_file_exists only checks for metadata file" [High,In progress]
17:06:36 <tjones> hee hee - that one is mine.  i'll get on it today
17:07:09 <tjones> the other 2 need review - in fact there are quite a few needing +1
17:07:37 <smurugesan> I will be doing some reviews today, I will take a look at them.
17:07:50 <tjones> by "quite a few" i actually mean 2 high prio and 6 low.  lets focus on the critical/high
17:07:59 <tjones> any other bugs needing discussion?
17:08:38 <tjones> *listens*
17:08:44 <garyk> tjones: https://bugs.launchpad.net/nova/+bug/1225002
17:08:46 <uvirtbot> Launchpad bug 1225002 in nova "VMware: no VM connectivity when opaque network does not match bridge id" [Medium,In progress]
17:09:35 <garyk> It also needs to be backported to grizzly. it has been around for quite a while now and we need to escalate to core reviewers
17:09:58 <smurugesan> Other bugs, I am working on https://bugs.launchpad.net/nova/+bug/1193980 - should push a patch today. It's a regression over Grizzly.
17:10:00 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:10:01 <tjones> is it in review?
17:10:12 <tjones> i don't see the link in the bug
17:10:32 <garyk> it has been in review since august - give me a sec
17:10:48 <garyk> tjones: https://review.openstack.org/#/c/41977/
17:11:14 <tjones> wow - yes it's very ready for core
17:12:06 <tjones> funny no link in the bug.  ok that one is marked need core review in the report.
17:12:25 <garyk> tjones: https://code.launchpad.net/bugs/1197041
17:12:26 <uvirtbot> Launchpad bug 1197041 in nova "nova compute crashes if you do not have any hosts in your cluster" [Medium,In progress]
17:12:55 <smurugesan> for some reviews, the bug is not getting updated. It happened with me as well.
17:13:12 <tjones> garyk: yes that one is a pain to debug.  i have hit it and other users have reported it with VOVA.
17:13:38 <tjones> smurugesan: that iscsi - i'll add to the list
17:13:47 <tjones> any other bugs?
17:13:48 <garyk> i do not know why russellb removed this from the rc candidates. i'll try and check with him later
17:13:48 <smurugesan> thanks!
17:14:23 <garyk> tjones: https://bugs.launchpad.net/nova/+bug/1228847
17:14:24 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:14:31 <tjones> smurugesan: russellb has added havana-rc-potential to that one
17:14:34 <garyk> this is really problematic
17:14:48 <tjones> yes it is hitting our CI guys
17:14:59 <tjones> do you have root cause on it?
17:15:04 <garyk> if there is an exception in the driver - for example nova fixed ips are all used up - then the actual exception is corrupted
17:15:21 <garyk> this one needs to be moved to high
17:15:43 <tjones> i don't believe i have the ability to do so.  But we can ping russellb
17:16:10 <garyk> i think that anyone who is part of the nova bugs team can (you just need to join the group)
17:16:22 <tjones> oh ok - i'll do that :-D
17:16:41 <garyk> russellb moved it from high to medium (just saw this now). i do not agree with his assessment. it makes troubleshooting practically impossible
17:16:43 <tjones> #action set https://bugs.launchpad.net/nova/+bug/1228847 to high prio
17:16:45 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:17:16 <tjones> #action target https://bugs.launchpad.net/nova/+bug/1193980  for rc
17:17:18 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:17:39 <tjones> any other bug issues besides triage?
17:18:47 <tjones> #topic bug triage
17:18:54 <tjones> ok here's the list : http://goo.gl/pTcDG
17:19:18 <tjones> #link https://bugs.launchpad.net/nova/+bug/1232348
17:19:20 <uvirtbot> Launchpad bug 1232348 in nova "VMware: vmdk converted via qemu-img may not boot as SCSI disk" [High,New]
17:19:41 <vuil> i filed this. Can someone confirm this well-known issue?
17:19:48 <smurugesan> vuil is working on a couple of these bugs.
17:19:50 <smurugesan> oh there he is
17:19:54 <smurugesan> :)
17:20:03 <vuil> sorry out for a couple minutes
17:20:17 <vuil> I am testing a fix. Hopefully out today.
17:20:47 <tjones> ok next is #link https://bugs.launchpad.net/nova/+bug/1194076
17:20:50 <uvirtbot> Launchpad bug 1194076 in nova "current_workload in  nova hypervisor-show not recover after nova suspend/resume" [Medium,Incomplete]
17:21:43 <tjones> garyk: it looks like you were discussing that with the reporter.  do you think it's user error?
17:21:44 <garyk> i was unable to reproduce this
17:21:52 <tjones> ok lets leave that one
17:22:14 <tjones> next is #link https://bugs.launchpad.net/nova/+bug/1226543
17:22:16 <uvirtbot> Launchpad bug 1226543 in nova "VMware: attaching a volume to the VM failed" [Medium,Incomplete]
17:22:40 <tjones> no action on this after garyk
17:22:48 <tjones> commented.
17:23:29 <tjones> last we have the results of our test team doing stress tests #link https://bugs.launchpad.net/nova/+bug/1230047
17:23:31 <uvirtbot> Launchpad bug 1230047 in nova "VMware: spawning large amounts of VMs sometimes causes errors" [Undecided,New]
17:24:24 <tjones> rhsu was looking into the nfs server to see if that was the issue.  did anyone else look at this?
17:24:24 <garyk> tjones: this is a tough one. initially we thought it was https://code.launchpad.net/bugs/1228847
17:24:27 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:25:02 <garyk> after we added the patch that i have for that problem we saw the real exception and it was that the VC was returning an exception that a vmdk was not found
17:25:12 <garyk> when ryan tried on another setup he was unable to reproduce
17:25:20 <garyk> i was also unable to reproduce
17:25:26 <garyk> we need to try and reproduce this
17:25:38 <tjones> hum - the other setup was one in the BLR lab - so that is why he was thinking it was the NFS server somehow
17:26:00 <tjones> ok let me talk with rhsu and see what the next steps are on this.
17:26:16 <tjones> #action talk to the test team about repro on https://bugs.launchpad.net/nova/+bug/1230047
17:26:18 <uvirtbot> Launchpad bug 1230047 in nova "VMware: spawning large amounts of VMs sometimes causes errors" [Undecided,New]
17:26:46 <tjones> any other issues before we go to open discussion?
17:27:16 <garyk> tjones: i think that we need to talk about stable grizzly
17:27:28 <tjones> #topic open discussion
17:27:30 <garyk> tjones: we should also go over documentation
17:27:32 <tjones> ok lets talk about that
17:27:46 <tjones> we have until 10/10 to get backports in, correct?
17:27:56 <garyk> ok. regarding the stable grizzly, the feature freeze is the 10th of the month
17:28:18 <garyk> i think that the release will be a few days later (due to gating problems over the last few days and stable gate is broken)
17:28:38 <garyk> we need to make sure that we have all of our critical and high bugs backported and tested hopefully by then
17:28:58 <garyk> it would be nice if we can make this a formal part of these meetings as the stable branch is very important for all
17:29:16 <tjones> #action add grizzly backports to the meeting agenda
17:29:30 <garyk> tjones: thanks
17:29:52 <tjones> ok we have 5 patches that gary has called out as needing backport
17:29:58 <garyk> regarding documentation i saw https://review.openstack.org/#/c/48859/ (and had a comment). can someone else please take a look
17:29:59 <vuil> tjones: where is the list of bugs you showed yesterday that had grizzly status
17:30:21 <tjones> http://partnerweb.vmware.com/programs/vmdkimage/customer_bugs.html
17:31:07 <tjones> garyk: i have that on my list to review today
17:31:32 <garyk> tjones: thanks!
17:32:24 <vuil> will take a look as well
17:32:30 <garyk> thanks
17:32:41 <tjones> #action bug owners review http://partnerweb.vmware.com/programs/vmdkimage/customer_bugs.html and backport their bugs if they have grizzly-backport-potential tags
17:33:21 <tjones> off the top of my head  - vui - you have 2, sabari you have 2, i have 1
17:33:28 <vuil> was going to ask what the procedure is.
17:33:40 <smurugesan> sure
17:33:45 <vuil> sure.
17:33:50 <tjones> literally off the top of my head :-) so please check.
17:34:13 <garyk> i think that i have done one of vui's (and was planning on doing a few others)
17:34:25 <vuil> I saw that. Thanks.
17:34:30 <tjones> vuil - i'll paste the email from garyk on pastebin
17:34:43 <garyk> tjones: thanks
17:34:59 <tjones> here you go - http://paste.openstack.org/show/47845/
17:35:10 <russellb> guys, please don't target new stuff to rc1
17:35:19 <russellb> it's being released today, just waiting on the last change to go through the gate
17:35:26 <tjones> hey russellb
17:35:51 <vuil> Probably can't assume folks not at this meeting get to the minutes in time to deal with their backports, so best we all take a pass through that list.
17:36:13 <garyk> russellb: so will these bugs be targeted for rc2?
17:36:24 <russellb> only if we determine that they qualify as release blockers
17:37:10 <russellb> none of these really seem to be, but happy to evaluate if you think something qualifies
17:37:16 <garyk> ok, understood, the reason we added them to the rc1 list is that we feel they are release blockers
17:37:34 <garyk> for example if there is an exception in the driver we may be unable to troubleshoot as an invalid stack is logged
17:37:35 <russellb> they're not things that block everyone, are they?
17:37:36 <tjones> for example - one is a grizzly regression
17:37:50 <garyk> they block everyone using the vmware drivers
17:38:05 <russellb> you have an interesting definition of block, then
17:38:16 <garyk> can you please clarify
17:38:45 <garyk> my take is that if the service crashes for some reason or another then that is a blocking issue
17:38:49 <russellb> at this point, block needs to be something that can't be worked around, and affects major functionality
17:38:59 <russellb> would you like to go 1 by 1?
17:39:10 <garyk> sure.
17:39:18 <garyk> i think that there are 2 serious bugs:
17:39:33 <garyk> https://review.openstack.org/#/c/41977/
17:39:50 <russellb> ok on that one, is this a configuration error basically?
17:39:53 <russellb> how do you hit the problem?
17:39:58 <garyk> this may result in a case where there is no network connectivity with the vm
17:40:21 <garyk> the problem happens when the esx host does not have a matching opaque network.
17:40:34 <russellb> so, a setup error?
17:40:41 <garyk> it can happen after a reboot of the host
17:41:26 <garyk> if a new host is added to a cluster then vm's deployed on that host may not have network connectivity
17:41:27 <russellb> what causes it to happen?
17:42:02 <garyk> if the host goes into maintenance mode and then say is rebooted (power outage for example)
17:42:45 <garyk> it is kind of like the ovs having no rules that match traffic
17:43:56 <garyk> my bad is that i did not convey this information on the bug (was away on vacation…)
17:44:05 <garyk> but that is not an excuse
17:44:15 <russellb> to be honest, i still don't understand what you're saying
17:44:50 <russellb> can you write up on the bug in more detail what the problem is, how it occurs, and make a point to demonstrate that it's a bug with no workaround?
17:45:13 <garyk> sure, i'll do that
17:45:27 <tjones> #action gark: update https://review.openstack.org/#/c/41977/ with more detail
17:45:38 <tjones> #undo
17:45:38 <openstack> Removing item from minutes: <ircmeeting.items.Action object at 0x30f3c90>
17:45:46 <garyk> it is like someone reboots a host with libvirt and after the reboot they are unable to run traffic to any vms on that host
17:45:51 <tjones> #action garyk: update https://review.openstack.org/#/c/41977/ with more detail
17:46:14 <russellb> garyk: i get the end result, but not the steps that lead up to putting the host in that situation
17:46:15 <garyk> the second issue is https://code.launchpad.net/bugs/1228847
17:46:17 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:46:21 <garyk> ok
17:46:29 <russellb> i saw you saying that one helps debugging
17:46:36 <russellb> but it's not preventing functionality from working for users
17:46:40 <russellb> so i don't consider that a blocker
17:47:03 <russellb> keep in mind that this far into the RC period, the bar has to be *really* high, or we'll never release
17:47:28 <garyk> ok.
17:47:41 <garyk> the previous bug is really high and this one can be defered
17:47:54 <garyk> deferred (i think my spelling is a mess)
17:48:11 <russellb> k, ping me when you have a more detailed writeup on the bug ready, and hopefully i'll see what you see then
17:48:18 <garyk> ok, thanks
17:49:19 <tjones> ok one more - this is a regression from grizzly
17:49:20 <tjones> https://bugs.launchpad.net/nova/+bug/1193980
17:49:22 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:50:15 <russellb> no patch?
17:50:28 <russellb> and it's tagged grizzly-backport-potential?
17:50:36 <russellb> so does that mean it exists in grizzly as well?
17:50:57 <tjones> we've been having some issues with bugs not showing they are in review.  sabari - can you comment on this one?
17:51:28 <russellb> even if it doesn't happen automatically, you should update it manually :-)
17:51:43 <tjones> um - if it's marked for grizzly i may have mixed it up with another :-D  i'll check on it
17:51:51 <russellb> k
17:52:11 <tjones> anything else for russellb folks?  we are getting close to tiem
17:52:12 <tjones> time
17:52:26 <russellb> looks like the grizzly tag was added when the bug was filed
17:52:32 <garyk> a hug when i start to cry :)
17:52:43 <tjones> :-D
17:52:45 <russellb> so, clarify if it's a regression from grizzly to havana
17:52:51 <tjones> ok will do
17:52:53 <russellb> and update the bug with the review, and set to In Progress, if there's a patch up
17:53:04 <russellb> and ping me after those updates
17:53:19 <tjones> #action follow up on https://bugs.launchpad.net/nova/+bug/1193980
17:53:22 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:53:24 <garyk> russellb: thanks for the time. much appreciated
17:53:36 <russellb> yep, sorry to be tough, just have to protect the release timeline
17:53:55 <tjones> russellb: thanks! no worries - this is the *fun* part
17:54:04 <russellb> yup
17:54:17 <tjones> anything else folks?
17:54:33 <tjones> going once….
17:55:09 <tjones> thanks for attending!  see you next week
17:55:11 <tjones> #endmeeting