15:00:11 <johnthetubaguy> #startmeeting XenAPI
15:00:12 <openstack> Meeting started Wed Sep  4 15:00:11 2013 UTC and is due to finish in 60 minutes.  The chair is johnthetubaguy. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:15 <openstack> The meeting name has been set to 'xenapi'
15:00:23 <johnthetubaguy> hello all
15:00:30 <johnthetubaguy> who is around for today's meeting?
15:01:16 <johnthetubaguy> I have a few things for the open discussion
15:01:25 <BobBall> Excellent
15:01:28 <johnthetubaguy> other people got stuff for later in the meeting?
15:01:38 <BobBall> just the stuff we were talking about with safe_copy_vdi earlier
15:02:12 <johnthetubaguy> OK
15:02:21 <johnthetubaguy> #topic Actions from last meeting
15:02:38 <johnthetubaguy> BobBall: time for updates on VDI.copy workaround
15:02:57 <johnthetubaguy> you have a plan that involves taking snapshots?
15:03:25 <BobBall> So - for Mate and the logs - I realised why we use safe_copy_vdi - the get_cached_vdi function can have two independent calls to it - if one call finds the VDI not there and starts to copy, the second call might see it and then try to copy it at the same time
15:03:43 <BobBall> Can't do two copies of the same VDI at the same time
15:03:50 <BobBall> so we added this hack to do the copy outside of XAPI
15:04:04 <matel> Ah, so it's some kind of locking.
15:04:13 <BobBall> but the right fix is to snapshot the VDI, copy it, then delete the snapshot
15:04:26 <BobBall> the copy of the snapshot creates a full copy - not the differencing disk
15:04:36 <BobBall> and you can therefore do them in parallel
15:04:51 <johnthetubaguy> well, would be nice for VDI.copy to do two copies at once, or block somehow, but snapshot seems like a good solution
15:04:52 <matel> That makes sense.
15:05:01 <BobBall> that leaves the only other possible race (which I assume is dealt with already) being in the glance plugin where it downloads it to a known UUID
15:05:08 <johnthetubaguy> Bob, is there a bug for that?
15:05:12 <BobBall> VDI.copy can't do two at once - we have to mount the VDI and then it's in use
15:05:23 <BobBall> Gimme an action and I'll raise a bug
15:05:25 <johnthetubaguy> glance doesn't download to a known uuid
15:05:38 <matel> it's always re-generating uuids.
15:05:41 <BobBall> it doesn't?  Then how will the caching ever work?
15:06:04 <BobBall> oh hang on
15:06:06 <BobBall> of course - sorry
15:06:12 <BobBall> it matches the VDI based on image_id not the uuid
15:06:36 <BobBall> so what we need to do is once we find the image_id then we can use it - snapshot, copy, delete snapshot, done.
15:06:41 <johnthetubaguy> yep
15:06:51 <johnthetubaguy> it uses the image tag
15:06:59 <johnthetubaguy> uuids are auto-generated on each call
15:07:03 <johnthetubaguy> even across retries
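A minimal sketch of the snapshot-copy-delete flow described above, assuming an authenticated XenAPI session; the other_config['image-id'] lookup mirrors the image-tag matching mentioned here, but the key and the copy_cached_vdi helper name are illustrative, not Nova's exact code:

```python
import XenAPI


def copy_cached_vdi(session, sr_ref, image_id):
    # Find the cached VDI by image tag, not by UUID: VDI UUIDs are
    # regenerated on every download, even across retries.
    for vdi_ref in session.xenapi.SR.get_VDIs(sr_ref):
        other_config = session.xenapi.VDI.get_other_config(vdi_ref)
        if other_config.get('image-id') == image_id:
            break
    else:
        return None  # not cached; caller falls back to a fresh download

    # Snapshot first: copying a snapshot yields a full (non-differencing)
    # copy, and several snapshots of one VDI can be copied in parallel,
    # which removes the race safe_copy_vdi was working around.
    snapshot_ref = session.xenapi.VDI.snapshot(vdi_ref, {})
    try:
        return session.xenapi.VDI.copy(snapshot_ref, sr_ref)
    finally:
        session.xenapi.VDI.destroy(snapshot_ref)
```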
15:07:24 <johnthetubaguy> BobBall: got the bug id? we might have one already
15:07:59 <BobBall> not yet
15:08:03 <BobBall> I haven't created it
15:08:10 <BobBall> I'll do it after the meeting
15:08:21 <johnthetubaguy> https://bugs.launchpad.net/nova/+bug/1215383
15:08:23 <uvirtbot> Launchpad bug 1215383 in nova "XenAPI: Consider removing safe_copy_vdi" [Low,Triaged]
15:08:24 <johnthetubaguy> there we go
15:08:29 <johnthetubaguy> #link https://bugs.launchpad.net/nova/+bug/1215383
15:08:51 <johnthetubaguy> #topic Blueprints
15:08:52 <BobBall> oh right
15:08:56 <BobBall> I'll update that bug report
15:08:59 <johnthetubaguy> it's freeze day
15:09:08 <johnthetubaguy> mate, I see your patches
15:09:15 <johnthetubaguy> I think everything else is in?
15:09:20 <matel> yes
15:09:49 <BobBall> Mine's in
15:09:51 <BobBall> woohoo
15:10:08 <BobBall> There clearly isn't time to make the change to Mate's code before the end of freeze day
15:10:26 <BobBall> Can we get it in and refactor as a bug during H-4?
15:11:21 <johnthetubaguy> maybe, can't decide
15:11:28 <johnthetubaguy> would like another opinion
15:11:32 <BobBall> Could we involve russelb?
15:11:48 <johnthetubaguy> could do, I would rather be more general
15:11:50 <BobBall> russellb even - I assume that russelb doesn't trigger his highlight :D
15:12:02 <russellb> i have russelb highlighted too actually :-)
15:12:16 <BobBall> oh... ok :)
15:12:20 <russellb> whatcha need
15:12:31 <BobBall> So, the scenario is that we've got a change that could do with some refactoring to make it more generic
15:12:48 <BobBall> We don't have time today to do that refactor
15:13:05 <BobBall> so I was wondering what your thoughts were on using a bug to track the refactoring and us committing to fix it during H-4
15:13:10 <BobBall> and getting the fix in today?
15:13:23 <russellb> which patch
15:13:29 <matel> https://review.openstack.org/#/c/40909/
15:13:36 <matel> https://review.openstack.org/#/c/41651/
15:13:48 <russellb> how much time do you need?
15:13:48 <matel> russelb: those are the patches
15:13:58 <BobBall> Only a few days
15:14:10 <russellb> like, could it be merged this week with a refactor?
15:14:21 <matel> sure
15:14:55 <russellb> i'd rather "do it right" in general
15:15:17 <russellb> so if doing it right means you need an extra day or two, and it's limited to the xenapi driver
15:15:22 <russellb> and you guys are comfortable with it going in
15:15:32 <russellb> and johnthetubaguy commits to reviewing it immediately
15:15:47 <johnthetubaguy> I am good with +2 if we commit to refactor next week
15:15:49 <russellb> then i think we can grant an exception for extra time
15:15:55 <russellb> can it be this week?  :-)
15:16:01 <russellb> oh i see what you're saying
15:16:01 <russellb> hm
15:16:07 <johnthetubaguy> well, it could be called a bug
15:16:12 <russellb> yeah
15:16:22 <johnthetubaguy> basically, it's some extension to the glance download/upload
15:16:22 <russellb> would you be upset if this went out in havana without a refactor?
15:16:30 <johnthetubaguy> I would be OK with that I guess
15:16:41 <russellb> just using that as a test for whether it should go in now or not
15:16:45 <johnthetubaguy> just it's extra driver-specific stuff, where it doesn't have to be I guess
15:16:46 <russellb> not saying we should do it
15:16:50 <russellb> ah i see
15:17:02 <russellb> well if you're OK with it as is, i say go for it
15:17:04 <johnthetubaguy> it does targz raw image download/upload
15:17:19 <russellb> and then we'll evaluate the refactor when ready
15:17:23 <russellb> hard to call it a bug though
15:17:24 <johnthetubaguy> cool, OK
15:17:35 <johnthetubaguy> yeah, that's fair
15:17:46 <russellb> but maybe we'll look at it and decide it's worth it
15:17:49 <russellb> hard to say
15:17:54 <russellb> i'm speaking high level, haven't been able to look at code
15:18:07 <russellb> but if you want it in havana, sounds like you should merge what you have as long as you're OK with it as is
15:18:18 <johnthetubaguy> cool
15:18:20 <russellb> if it was just another day, i'd grant an extension
15:18:45 <russellb> but if we're looking at next week ... i'd go with what you have
15:18:49 <russellb> good luck guys :)
15:18:54 <matel> thanks.
15:18:57 <johnthetubaguy> russellb: thanks
15:19:04 <johnthetubaguy> OK, so I am +2 on those now
15:19:10 <johnthetubaguy> we should move on
15:19:17 <BobBall> thanks
15:19:19 <matel> Okay, thanks.
15:19:29 <johnthetubaguy> #topic Docs
15:19:33 <johnthetubaguy> any updates this week
15:19:40 * BobBall looks at matel
15:19:43 <johnthetubaguy> I saw some activity bug wise that could be related?
15:19:44 <BobBall> that was you too :)
15:19:47 <matel> So install guide
15:20:05 <matel> I was thinking that we should move all the xenserver setup stuff to the install guide
15:20:12 <johnthetubaguy> +1
15:20:30 <matel> So that it would be part of both RH, Ubuntu, etc docs
15:20:39 <johnthetubaguy> would be nice to add some specific nova-compute steps too
15:21:20 <johnthetubaguy> would be good to not make it look like we just cut and pasted some ideas onto the front of the doc - they are quite step-by-step structured, from what I remember?
15:22:30 <matel> Sure.
15:23:53 <johnthetubaguy> OK,
15:24:00 <johnthetubaguy> so any more on that?
15:24:07 <johnthetubaguy> or shall we move on to QA?
15:24:35 <matel> Okay.
15:24:47 <BobBall> There were going to be more updates on docs next week - but if Mate has to do the refactoring... ;)
15:25:06 <BobBall> anyway
15:25:10 <BobBall> move on to QA yeah
15:25:20 <johnthetubaguy> #topic QA and Bugs
15:25:39 <johnthetubaguy> hows the gating stuff going?
15:25:53 <matel> Mate will be on holiday for 10 days starting next Monday
15:25:55 <BobBall> https://bugs.launchpad.net/bugs/1218528 is an interesting one
15:25:57 <uvirtbot> Launchpad bug 1218528 in nova "openvswitch-nova in XenServer doesn't work with bonding" [Medium,Confirmed]
15:25:57 <BobBall> oh okay - gating
15:26:13 <BobBall> SS is now automatically commenting with the XS bugs
15:26:19 <johnthetubaguy> lets come back to that one
15:26:21 <BobBall> after we've parsed out the puppet and build failures
15:26:33 <BobBall> so that's on the road to getting -2 privs for SS
15:26:35 <BobBall> which is great
15:26:41 <BobBall> BUT the infra team are too good
15:26:42 <johnthetubaguy> sweet
15:26:52 <johnthetubaguy> yeah, they are very impressive
15:26:52 <BobBall> or, more importantly, a combination of -infra and tempest
15:27:09 <BobBall> now the nodes are running tempest in parallel, test times have almost halved
15:27:20 <johnthetubaguy> tempest would be a much better set of tests to run, because devs can run them
15:27:29 <BobBall> making the plan of SS commenting before tempest completes less useful
15:27:39 <BobBall> It's now much more of a race condition than it was before
15:27:56 <BobBall> so I'm not happy for SS but of course faster gate checks are better
15:28:05 <BobBall> in terms of where that leaves us... well... it's an interesting one
15:28:10 <johnthetubaguy> well SS still tests XenServer, and that is golden
15:28:24 <johnthetubaguy> without that, we will get pulled out of nova
15:28:27 <BobBall> I've also been working with RS cloud to get an XS virtualised - which works now
15:28:35 <BobBall> so I'm now thinking about how to get that into HP cloud
15:28:37 <johnthetubaguy> cool, that's a good thing to have
15:28:50 <BobBall> at which point we could run a subset of tempest (I'd rather keep it to a subset than the full thing)
15:29:02 <johnthetubaguy> does tempest work in the RAX cloud now?
15:29:05 <BobBall> and have that integrated in -infra fully
15:29:12 <johnthetubaguy> KVM + HVM should work right?
15:29:26 <BobBall> can't use RAX because it uses xenstore to pass the IP, which we can't get at because we're running Xen inside the HVM guest and it would intercept the hypercalls
15:29:46 <BobBall> so until RAX uses configdrive we can't do nested tempest there
15:29:55 <johnthetubaguy> that should be very very soon
15:30:10 <BobBall> in terms of HP, they use DHCP, so if I can get an instance there we can use it
15:30:21 <johnthetubaguy> except it doesn't have the IP address in there for some reason, I need to work on that then
15:30:24 <BobBall> getting an instance there is more fun because HP don't support PXE or iso or image upload...
15:30:30 <BobBall> *grin*
15:30:44 <johnthetubaguy> hmm, enjoy
15:30:58 <BobBall> plan is to do something extremely ugly
15:31:00 <BobBall> but quite fun
15:31:20 <BobBall> create centos image - new partition to dump the XS iso on, replace bootloader, kernel, initrd, reboot and pray
15:31:21 <johnthetubaguy> what about testing xenserver-core?
15:31:41 <johnthetubaguy> hmm, well that could work
15:31:46 <BobBall> xenserver-core has a number of things we're working on fixing ATM.  XS is more "certain" than xenserver-core in terms of how confident we are that it'd work
15:32:08 <johnthetubaguy> OK, well keep in touch about the RAX issues, we have a few things in the pipe to fix that soon(ish) at least
15:32:29 <BobBall> well I'd quite like to create that hacky swizzle script
15:32:39 <BobBall> that way we can replace anything with a XS with the right IP etc
15:32:41 <BobBall> would be very cool
15:33:12 <BobBall> Although... the next problem would be that infra expects everything to run in the host that you ssh to... so maybe we would need redirection to the domU and some weird setups
15:33:29 <johnthetubaguy> yeah, that's the bit I think is worth trying to fix
15:33:59 <johnthetubaguy> hmm, so you could use Rescue mode on RAX to hack around the IP issue
15:34:07 <BobBall> I'm hoping it'll be quite simple - but unless we can get the hard bits done I'm not going to look at that
15:34:15 <johnthetubaguy> mount the disk and inject the IP address, for the moment
15:34:40 <BobBall> indeed
15:34:47 <johnthetubaguy> but I think it would be better to fix that issue of integration with the infra stuff, rather than getting XenServer running - that should fix itself in a few days
15:34:55 <johnthetubaguy> but that's just my 2 cents
15:34:58 <BobBall> but if I can get the swizzle working I'd use that to install XS on the RS cloud too
15:35:15 <BobBall> the way it works at the moment (hidden images, PXE boot, custom URL...) is faffy too
15:35:22 <johnthetubaguy> yeah, for sure
15:36:05 <johnthetubaguy> anyways, any bugs you want to mention?
15:36:13 <johnthetubaguy> would be good to set a goal for H-4 too
15:36:21 <BobBall> https://bugs.launchpad.net/bugs/1218528
15:36:22 <uvirtbot> Launchpad bug 1218528 in nova "openvswitch-nova in XenServer doesn't work with bonding" [Medium,Confirmed]
15:36:37 <johnthetubaguy> yeah, I didn't quite get what was going on there, do tell
15:36:45 <johnthetubaguy> some init script running in Dom0??
15:37:13 <BobBall> indeed
15:37:17 <BobBall> to set up the firewall rules
15:37:40 <johnthetubaguy> MAC and IP address stuff?
15:37:48 <johnthetubaguy> anti-spoof?
15:37:48 <matel> https://github.com/openstack/nova/tree/master/plugins/xenserver/networking/etc/init.d
15:37:51 <BobBall> yeah
15:38:28 <johnthetubaguy> hmm, that was written for bridge and nova-network right?
15:38:45 <johnthetubaguy> oh, maybe not
15:38:46 <BobBall> yes
15:38:48 <BobBall> no?
15:38:50 <johnthetubaguy> says OVS
15:39:20 <BobBall> but it is nova-network
15:39:22 <BobBall> I think
15:39:29 <BobBall> yes, of course it is
15:39:30 <johnthetubaguy> sure, nova-network flatDHCP with OVS
15:40:04 <BobBall> oh
15:40:09 <BobBall> I've got another fun one
15:40:13 <BobBall> let's talk about it here!
15:40:15 <BobBall> woohoo
15:40:17 <BobBall> TrustedFilter
15:40:22 <johnthetubaguy> yeah, that stuff is a mystery to me
15:40:28 <BobBall> looks at nova's interpretation of a host
15:40:35 <johnthetubaguy> oh right
15:40:38 <BobBall> each service has a host and a node
15:40:50 <BobBall> with host being set to what the compute reports and node being set to hypervisor_hostname
15:40:56 <BobBall> I think that's plain wrong.
15:41:02 <BobBall> I think it should be the other way round
15:41:15 <BobBall> but changing it probably breaks the world
15:41:39 <johnthetubaguy> not sure, that stuff was added for baremetal
15:42:09 <johnthetubaguy> host is definitely nova-compute though
15:42:17 <johnthetubaguy> but do go on
15:42:22 <johnthetubaguy> you got a link?
15:42:33 <BobBall> just getting it
15:43:07 <BobBall> https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L436
15:43:27 <BobBall> My issue is that "host" should be the hypervisor and "node" should be the compute node
15:43:31 <BobBall> but this gets them the wrong way round
15:43:49 <johnthetubaguy> not so sure, the host has always been nova-compute
15:43:49 <BobBall> There's one of two bugs here
15:43:52 <BobBall> either nova has got it wrong
15:44:03 <BobBall> or XenAPI needs to report the DomU name as hypervisor_hostname
15:44:08 <BobBall> which is completely screwy
15:44:33 <BobBall> the node has always been the nova-compute too :)
15:44:38 <BobBall> except for XenAPI
15:44:45 <johnthetubaguy> yeah, given the current model, I thought both would be the DomU address
15:44:48 <BobBall> so surely we get to choose which way round it is? ;)
15:44:53 <BobBall> it's not
15:45:00 <johnthetubaguy> CONF.host is the name of the nova-compute
15:45:01 <BobBall> service['host'] is the DomU name
15:45:14 <BobBall> 'hypervisor_hostname' is the hostname
15:45:20 <johnthetubaguy> hmm
15:45:29 <BobBall> it's fine for libvirt
15:45:40 <BobBall> but all other hypervisors have probably got it the other way round
15:45:46 <johnthetubaguy> so the service host has to be what the RPC message goes to
15:46:02 <BobBall> ATM yes
15:46:18 <johnthetubaguy> the node, in baremetal, is the thing the VM is (or in our case, runs on)
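For readers following along, a tiny illustration of the host/node split being debated, with made-up hostnames; the dicts are simplified stand-ins for Nova's service and compute-node records, not the real schemas:

```python
# Simplified stand-ins for Nova's records; hostnames are invented.

# service['host'] comes from CONF.host: the nova-compute that RPC
# messages are addressed to -- the DomU in the XenAPI setup.
service = {'host': 'nova-compute-domu'}

# hypervisor_hostname is reported by the virt driver: the machine the
# VMs actually run on -- Dom0 for XenAPI, the target node for baremetal.
compute_node = {'hypervisor_hostname': 'xenserver-dom0'}

# Under libvirt/KVM both names refer to the same machine, so code that
# mixes them up still works there; XenAPI is where the split shows.
```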
15:46:38 <johnthetubaguy> anyway, we are getting distracted
15:46:44 <BobBall> Is this another question that Mr RB would be good to call in on? just to understand which way round it should be?
15:46:47 <johnthetubaguy> what is the issue with the trusted filter?
15:46:51 <BobBall> The problem is the TrustedFilter uses .host
15:46:54 <BobBall> which is the DomU
15:47:00 <johnthetubaguy> right
15:47:02 <BobBall> and .node "feels" wrong
15:47:12 <johnthetubaguy> well node is the DomU always
15:47:16 <johnthetubaguy> I mean
15:47:19 <BobBall> but if .host really is the compute node and .node really is the host for the VMs then TrustedFilter needs changing
15:47:22 <johnthetubaguy> node is what is running nova-compute
15:47:31 <BobBall> not according to that code :D
15:47:37 <BobBall> .node is the machine
15:47:40 <johnthetubaguy> oh man
15:47:40 <BobBall> .host is nova-compute
15:47:44 <johnthetubaguy> I typed it wrong
15:47:54 <johnthetubaguy> yep, host is nova-compute
15:48:09 <BobBall> which is wrong.  Almost as wrong as calling VMs servers...
15:48:10 <johnthetubaguy> node is the specific hypervisor, or baremetal node
15:48:40 <johnthetubaguy> well depends what you are looking at
15:48:40 <BobBall> So TrustedFilter has it the wrong way round
15:48:43 <johnthetubaguy> nope
15:48:54 <BobBall> yeah - TrustedFilter _MUST_ check the host
15:48:59 <johnthetubaguy> trusted filter is meant to confirm the nova-compute code, I thought?
15:49:05 <BobBall> no
15:49:09 <BobBall> the hypervisor that'll run the VMs
15:49:10 <johnthetubaguy> as well as the KVM hypervisor, right?
15:49:32 <johnthetubaguy> well, in KVM land it would apply to the whole host right?
15:49:38 <BobBall> https://github.com/openstack/nova/blob/master/nova/scheduler/filters/trusted_filter.py#L291 is what I mean
15:49:52 <BobBall> sure - but the thing that can register as trusted is the hypervisor only
15:50:00 <BobBall> even in KVM land there is no trustworthiness of the nova code
15:50:01 <BobBall> that's never checked
15:50:16 <BobBall> only the hypervisor (I think in KVM it's just kernel and qemu even)
15:50:28 <johnthetubaguy> OK
15:50:38 <johnthetubaguy> so its what the attestation server is confirming
15:50:56 <BobBall> The attestation service only knows about the hypervisors
15:50:59 <johnthetubaguy> well I think the trusted filter should be checking the other value
15:51:08 <BobBall> it has no knowledge at all about the nova-computes
15:51:10 <johnthetubaguy> that should work for both right?
15:51:19 <BobBall> who knows :)
15:51:25 <BobBall> maybe
15:51:37 <BobBall> if we're asserting that in KVM land the two are always the same value
15:51:42 <johnthetubaguy> yep, I am
15:51:56 <johnthetubaguy> I thought that was true for us too, but it makes sense we use the second value
15:52:18 <johnthetubaguy> for a cluster you have nova-compute, hypervisor-x
15:52:51 <BobBall> yeah
15:53:04 <johnthetubaguy> but bare metal has nova-compute, random-server-for-VM-on-bare-metal-thingy
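A hedged sketch of the fix being suggested here: key the filter off the hypervisor hostname (host_state.nodename) rather than the nova-compute host (host_state.host). The filter shape follows nova's BaseHostFilter interface; reusing ComputeAttestation and its is_trusted signature is an assumption based on the Havana-era trusted_filter module, and the class name is invented:

```python
from nova.scheduler import filters
from nova.scheduler.filters import trusted_filter


class NodeTrustedFilter(filters.BaseHostFilter):
    """Sketch: attest the hypervisor, not the nova-compute host."""

    def __init__(self):
        # Assumed reuse of the existing attestation-server client.
        self.compute_attestation = trusted_filter.ComputeAttestation()

    def host_passes(self, host_state, filter_properties):
        instance_type = filter_properties.get('instance_type', {})
        extra = instance_type.get('extra_specs', {})
        trust = extra.get('trust:trusted_host')
        if not trust:
            return True  # flavor carries no trust requirement
        # The attestation server only knows about hypervisors, so ask
        # about the hypervisor hostname; under libvirt/KVM nodename and
        # host name the same machine, so behaviour is unchanged there.
        return self.compute_attestation.is_trusted(host_state.nodename,
                                                   trust)
```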
15:53:09 <johnthetubaguy> anyways
15:53:16 <johnthetubaguy> running out of time
15:53:19 <johnthetubaguy> any other bugs?
15:53:20 <BobBall> yeah
15:53:24 <BobBall> probably
15:53:29 <BobBall> but lets save some of the fun for next week
15:53:35 <johnthetubaguy> What is our goal for H-4?
15:53:52 <johnthetubaguy> I would like to see all medium bugs squashed, is this a crazy idea?
15:54:02 <BobBall> Probably - depending on what they are
15:54:08 <BobBall> We want to seriously improve the docs
15:54:15 <johnthetubaguy> well, medium as defined by the priority
15:54:27 <johnthetubaguy> Yes, docs would be a good thing too
15:54:34 <BobBall> I meant depending what the specific bugs are
15:54:39 <johnthetubaguy> Are you Citrix guys concentrating on docs then?
15:54:52 <johnthetubaguy> I am talking about all bugs that impact key features
15:55:02 <BobBall> for the next few weeks yes
15:55:06 <BobBall> but we'll see how that goes
15:55:17 <BobBall> I don't think most of the bugs are as bad as the docs ATM
15:55:23 <BobBall> so that is our priority
15:55:30 <johnthetubaguy> OK, that sounds good
15:55:38 <johnthetubaguy> I will concentrate on bugs, if you guys are on Docs
15:55:49 <BobBall> Great
15:55:57 <johnthetubaguy> I will try to rope in some help from some other RAX guys too
15:56:06 <johnthetubaguy> we only have a few weeks for bug fixing
15:56:17 <johnthetubaguy> before we get more cautious towards release
15:56:23 <johnthetubaguy> One more thing...
15:56:33 <BobBall> oh yeah
15:56:46 <johnthetubaguy> testing release candidates, and things
15:56:58 <johnthetubaguy> we should really help out with that, make sure H has good XenServer support
15:57:09 <johnthetubaguy> I mean some testing beyond the CI
15:57:27 <BobBall> I'd much rather get more stuff tested in the CI
15:57:35 <johnthetubaguy> stuff like full tempest (do you run that all the time now?) plus manual testing through the GUI to make sure it all hangs together OK
15:57:39 <BobBall> which is a background thing we've been working on
15:57:51 <BobBall> yes - full tempest is running in the CI and passing
15:58:11 <johnthetubaguy> OK, that's good
15:58:11 <matel> I'm just wondering, could we get an account, and report the results?
15:58:17 <johnthetubaguy> I am only talking about half a day playing with the builds manually at RC?
15:58:21 <matel> Voting with +1 ?
15:58:25 <BobBall> only once we've got the logs auto collected, matel
15:58:28 <johnthetubaguy> matel: it's open to anyone, so go for it
15:58:39 <johnthetubaguy> BobBall: you can just use paste.openstack.org for the logs
15:58:48 <BobBall> I know - but we need the CI to collect them
15:58:52 <BobBall> which it doesn't yet
15:58:57 <matel> It requires a captcha for big files
15:59:14 <johnthetubaguy> hmm, that sucks
15:59:26 <BobBall> besides which - we've got space we can put it - so no need to use paste
15:59:32 <BobBall> we'll see
16:00:41 <BobBall> Can we call time?
16:00:48 <johnthetubaguy> yep
16:00:48 <BobBall> and johnthetubaguy
16:00:50 <BobBall> can I call it?
16:00:52 <johnthetubaguy> #endmeeting