15:00:10 #startmeeting XenAPI
15:00:11 Meeting started Wed Feb 4 15:00:10 2015 UTC and is due to finish in 60 minutes. The chair is BobBall. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:12 Better.
15:00:14 The meeting name has been set to 'xenapi'
15:00:30 Howdy
15:00:42 Ping for johnthetubaguy and yamamoto
15:00:44 hi
15:00:52 hello
15:01:04 Morning / evening guys!
15:01:16 So yamamoto did the right thing and actually added some items to the agenda
15:01:24 so we'll follow the agenda properly today
15:01:36 quick introductions first?
15:01:44 No actions, blueprints or docs that I'm aware of - any dissent on skipping those items?
15:02:02 heads up on blueprints, we are closed until L now
15:02:13 feature freeze is tomorrow for non-priority things
15:02:14 yamamoto: Would you mind doing a quick intro? Not sure that John has seen the emails, so he may not know where your focus is
15:02:32 ok
15:02:43 i'm a developer of ryu and ofagent
15:03:12 and currently working on the ovs agent to make it use the ryu ofproto library
15:03:16 instead of the ovs-ofctl command
15:03:41 any questions?
15:03:49 so the link to XenAPI is that we've currently got the rootwrap to proxy ovs-ofctl commands through to dom0
15:03:53 But we'll get on to that later :)
15:04:13 yamamoto: sounds cool, thanks for the intro
15:04:15 #topic Bugs / QA
15:04:32 The CI hit the swift configuration bug this week so suffered a 4-day outage
15:04:35 but that's all fixed now
15:04:50 Still seeing the lower-than-usual pass rate for the shelving tests
15:04:56 I don't suppose you've heard any more on that johnthetubaguy?
15:04:57 yamamoto: I am a principal engineer working on Rackspace's public cloud, good to meet you :)
15:05:14 Yes... I should have introduced myself too... Sorry!
15:05:16 BobBall: nothing obvious, we don't use the feature though
15:05:39 I'm with Citrix in the XenServer team and I manage our OpenStack efforts
15:05:52 Ah yes, I'd forgotten you don't use shelve
15:05:54 yamamoto: oh, and Rackspace Public Cloud is mostly using XenServer for the servers, hence I am here
15:05:57 which is why it's not a priority for you :)
15:06:10 BobBall: ironically we added it, but that's a long story
15:06:20 Understood.
15:06:25 nice to meet you johnthetubaguy and BobBall!
15:06:40 Anyway - that reminded me of a (side) question johnthetubaguy - is there a way to live migrate a VM between RAX regions?
15:07:07 nope, you can't migrate between cells in the same region, live or not, due to IP constraints
15:07:18 Ah ok
15:07:25 you can snapshot, export, then import though, I think
15:07:45 It's a good reason not to allow it
15:07:47 Anyway - moving on.
15:07:54 libvirt+xen CI is making good progress
15:08:04 You may have seen that today a couple of nova fixes went in
15:08:24 With those fixes, and the latest libvirt (currently compiled from source), all tempest tests pass
15:08:24 Is it still mostly SUSE folks working on that?
15:08:31 No - Citrix is working on it too
15:08:50 cool, do you plan a test framework for that?
15:09:06 I'm in the process of trying to set up a zuul+jenkins+etc for it
15:09:28 then hopefully my efforts will meet in the middle with the xen.org folk who are working on the tests + libvirt support from the ground up
15:09:55 I'm expecting that by the next meeting (a fortnight from today) we'll have the infrastructure set up
15:10:04 probably won't be running the jobs but it'll be running a no-op
15:10:45 cool, glad to see a ray of light there
15:10:47 Aim is to have a voting CI in a couple of months (trying to be realistic here - might arrive before that)
15:11:05 end of kilo sounds like a good goal post, but understood
15:11:08 OK, not sure there is anything else on that unless there are questions?
15:11:18 I am good
15:11:28 #topic Xen-rootwrap
15:11:38 yamamoto: Since you added this to the agenda, do you want to lead?
15:11:42 #link https://wiki.openstack.org/wiki/Meetings/XenAPI
15:11:49 sure
15:11:59 (yamamoto did add some items to the page above - hence the link johnthetubaguy)
15:12:21 there are a couple of neutron blueprints which likely involve xen-rootwrap
15:13:14 i want to ask opinions from you folks familiar with xenapi
15:13:30 So this is https://blueprints.launchpad.net/neutron/+spec/ovs-ofctl-to-python
15:13:36 Still due to hit in K-3?
15:14:03 i'm not sure if it can as it's a low-priority bp
15:14:15 but i'm actively working on it currently
15:14:45 OK - so your current view is that the code will be ready but reviewers may not have bandwidth to give feedback / approve
15:14:55 exactly
15:15:10 except the xenapi part, which i'm still not sure how to deal with
15:15:29 Will there be any remoting capabilities?
15:15:34 right, you say "proxies OpenFlow channel between domains"
15:15:40 what does that mean?
15:15:47 let me confirm my understanding of the current mechanism first
15:15:56 For XenServer the OVS is in domain 0 but Neutron is running in a virtual machine
15:16:02 neutron/plugins/openvswitch/agent/xenapi is the dom0 side
15:16:07 and xen-rootwrap is the domU side
15:16:10 yes
15:16:19 and they communicate via the xenapi "transport"
15:16:27 https://raw.githubusercontent.com/matelakat/shared/xs-q-v1/xenserver-quantum/deployment.png might be useful
15:16:49 Lines from q-agt to the xapiY and xapiX bridges are just using the dom0-rootwrap script to modify the OVS in dom0
15:16:58 Lines around q-domua are actual connections made by XAPI and the OVS
15:17:07 good figure. thank you
15:17:42 That figure is mostly relevant in the devstack case as both neutron and compute are in the same VM
15:18:17 well you need a neutron agent on every compute host right, so that bit stays
15:18:18 my bp is incompatible with the current mechanism because it stops using the ovs-ofctl command and thus xen-rootwrap
15:18:31 If we were to deploy in two VMs the q-agt is needed in the Compute node but q-domua is needed in the Network node.
15:18:39 yeah, your python bindings, do they talk to a C lib?
15:18:54 or directly to the OVS?
15:19:18 the python binding (the ryu ofproto lib) is a pure python library speaking the openflow protocol
15:19:34 over a plain tcp socket (or tls)
15:19:37 So we could (in theory?) open the openflow port on the xenserver host
15:19:42 ah, so it talks over the network interface? that's not so bad
15:19:43 and get your library to talk directly to it?
15:20:18 the plan is to make ovs connect to my controller embedded in the ovs-agent
15:20:32 BobBall: or use an internal host local network?
15:20:51 HIMN is an option, but it caused issues for quite a few things...
15:21:07 HIMN?
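
Editor's note: for readers unfamiliar with ryu, the "controller embedded in ovs-agent" idea above means the agent runs an OpenFlow controller in-process and OVS connects to it over TCP. A minimal ryu application looks roughly like the sketch below; this is the canonical ryu skeleton (a table-miss rule that punts unmatched packets to the controller), not code from the blueprint itself.

```python
# Minimal ryu OpenFlow controller sketch -- illustrative only, not the
# blueprint's code. Run with: ryu-manager this_file.py
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class EmbeddedController(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        # Fires when a switch (e.g. dom0's OVS bridge) connects over TCP.
        datapath = ev.msg.datapath
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser
        # Install a table-miss entry sending unmatched packets to the controller.
        match = parser.OFPMatch()
        actions = [parser.OFPActionOutput(ofproto.OFPP_CONTROLLER,
                                          ofproto.OFPCML_NO_BUFFER)]
        inst = [parser.OFPInstructionActions(ofproto.OFPIT_APPLY_ACTIONS,
                                             actions)]
        datapath.send_msg(parser.OFPFlowMod(datapath=datapath, priority=0,
                                            match=match, instructions=inst))
```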
15:21:10 hmmmm - hang on - do we have a single neutron node for every host?
15:21:21 a single agent for every node
15:21:24 himn = host internal management network. One that allows TCP traffic between VMs and the host
15:21:41 OK - but you can have multiple agents in one neutron VM?
15:21:43 ovs-agent runs on every compute/network node
15:21:50 right
15:21:59 it's 1 agent per OVS instance
15:22:05 yes
15:22:17 So we can't use HIMN as the neutron VM may not be located on the hypervisor
15:22:25 my bp makes the OVS-agent have an embedded openflow controller
15:22:58 and makes ovs connect to the controller via 127.0.0.1
15:23:32 The whole point of the xen-rootwrap script is to proxy the commands from the Neutron node to the XenServer host - if we're using a controller we can easily set up the xenserver host to connect to that over TCP rather than the commands being proxied
15:23:39 so for xen integration, i need a way to make dom0's ovs connect to domU's openflow controller
15:23:48 So I think that this BP makes life a _lot_ easier?
15:23:49 Yes
15:24:08 are dom0 and domU ip-reachable?
15:24:19 right, it's just a case of getting the networking configured correctly, which is a bit tricky in the general case
15:24:25 They will need to be for this to work, yes. Just a restriction on the deployment architecture
15:24:37 Currently they have to be anyway
15:24:41 because the XAPI commands go over IP
15:25:01 ah i didn't know that
15:25:01 right, it's the same interface, I forgot that, it's just a pain to configure
15:25:22 So we need a plugin for XAPI that will set the OVS controller for a given bridge
15:25:43 so can i assume ip reachability?
15:25:47 and we need Neutron to call that plugin when it sets up the agent?
15:25:50 yes yamamoto
15:25:56 it can be done with the current mechanism using ovs-vsctl
15:26:07 right, makes sense
15:26:25 perhaps yamamoto - but I'd much rather the "rootwrap" were more restrictive. We don't need to run much as root, just setting the bridge.
15:26:43 But you're right, first pass, keep the current mechanism and just set the controller
15:26:55 then we can deprecate the rootwrap script and put in something more restrictive
15:27:18 ah, I see where you are going, but you can do that second
15:27:32 can we assume domU -> dom0 connectivity too
15:27:35 yes
15:27:37 can we assume domU -> dom0 connectivity too?
15:27:48 Call it neutron-node -> dom0
15:27:57 it will likely be necessary for terry's bp
15:28:04 it's required for the xenapi <-> nova communication, but BobBall is right, call it neutron-node, so it can be off box
15:28:07 domU is "any domain on this host" and the neutron-node doesn't have to be on the host.
15:28:29 ok
15:29:07 But yes - I think that's an absolute requirement for Neutron
15:29:15 distraction, but do you plan on making nova-compute work better when it's on a different host to the hypervisor (I am thinking about the disk formatting code)?
15:29:27 surely that has to be the case with any hypervisor that neutron is setting up the network for?
15:30:10 BobBall: the usual case is being on the same linux box, with access to unix sockets and localhost right, so it's a slightly different trade off
15:30:13 Long term plan, yes john - but no changes expected very short term
15:30:23 well, not trade off, more definition
15:30:26 But that's running a neutron-node on every host
15:30:31 BobBall: ack
15:30:36 not just an agent for each host
15:30:38 (same here)
15:31:03 OOI does RAX run the neutron node on each hypervisor?
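
Editor's note: the "first pass" agreed above - keep the current XenAPI plugin mechanism and just point dom0's OVS at the agent's controller - could look roughly like the sketch below. It assumes the netwrap dom0 plugin that ships with the Neutron OVS agent's XenAPI support and its run_command interface; the URL, credentials, bridge name and controller address are placeholders, and a single-host pool is assumed.

```python
# Hedged sketch: use XenAPI to run ovs-vsctl in dom0, pointing bridge
# xapi1 at the OpenFlow controller embedded in the Neutron agent.
import json

import XenAPI

session = XenAPI.Session("http://dom0.example.com")  # $XENAPI_CONNECTION_URL
session.xenapi.login_with_password("root", "password")  # placeholder creds
try:
    host = session.xenapi.host.get_all()[0]  # single-host pool assumed
    # The netwrap plugin executes a whitelisted command in dom0; here it
    # tells OVS to connect the bridge to the agent's controller over TCP.
    cmd = json.dumps(["ovs-vsctl", "set-controller", "xapi1",
                      "tcp:192.0.2.10:6633"])
    session.xenapi.host.call_plugin(host, "netwrap", "run_command",
                                    {"cmd": cmd})
finally:
    session.xenapi.session.logout()
```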
15:31:05 oh wait, I see where you are heading, I thought this was the agent that had the controller on it?
15:31:24 oh yes - I think that's what yamamoto said - I was getting confused.
15:31:49 which ip address is appropriate to use for my openflow controller? is it currently known to the ovs-agent?
15:31:50 rackspace uses udev rules in the hypervisor to configure OVS, to avoid needing an agent, currently using xenstore as a backchannel to send the config, working on replacing that with Redis
15:32:07 ok
15:32:24 I'm not certain yamamoto but I think it must be
15:32:41 I don't know how else xen-rootwrap would contact the hypervisor
15:32:45 yamamoto: I think it's a new config, or it's the same config we already use for the rootwrap thing, if that's possible
15:32:45 yes - it must be
15:33:16 is introducing a new config option for the ovs-agent appropriate?
15:33:36 it can be a setting that's specific to your plugin, I think that's quite normal
15:33:45 but the neutron experts might shoot me down
15:33:46 does the rootwrap thing have a neutron-node-side ip address config?
15:34:19 (the xenapi transport is magic to me)
15:34:27 just checking
15:34:43 iniset $Q_RR_CONF_FILE xenapi xenapi_connection_url "$XENAPI_CONNECTION_URL"
15:34:46 yes
15:34:56 xenapi is just an http web service that runs in dom0 that the nova-compute VM talks to
15:35:18 BobBall: it is the dom0 address right?
15:35:23 yes yamamoto
15:35:43 what i need is a neutron node address to listen on for openflow connections
15:36:09 doesn't the agent have to run _on_ the neutron-node itself?
15:36:28 yes
15:36:39 So the neutron-node address is just localhost's ip?
15:37:00 but it might have multiple addresses etc
15:37:07 True
15:37:28 i can introduce a new config but if there's an existing one i want to know
15:37:36 Personally I think that you should have a generic config for the controller IP which just happens to default to 127.0.0.1
15:37:46 Ah - I don't know if there is one *not a Neutron expert*
15:38:12 but you are a xenapi expert
15:38:21 yes :)
15:38:43 So the neutron node runs in a VM - but which IP you want to listen on for the controller in the neutron-node isn't a XenAPI question
15:39:31 which ip to use to connect to dom0 is a xenapi question isn't it?
15:39:50 Connecting to dom0 goes through the above $XENAPI_CONNECTION_URL
15:40:06 but connecting back _from_ dom0 to the neutron-node's OVS controller is a different question
15:40:42 So we're currently saying you use the connection from neutron-node --> dom0 once only, to set up the controller in dom0?
15:40:45 ok then no explicit "src ip" config exists?
15:41:12 yeah, no traffic is going that way right now, as far as I remember
15:41:38 got it thank you
15:41:41 But I think that you'll need that for the generic case too (unless you only care about neutron-node on-hypervisor)
15:42:16 will the ci you mentioned earlier cover neutron?
15:43:14 So... This is an interesting point...
15:43:20 With a very long answer
15:43:39 Currently we have a XenAPI CI that uses nova-network and votes on jobs
15:43:53 We also have an experimental internal XenAPI CI that's trying to run against Neutron jobs
15:44:06 but we're experiencing a non-trivial number of failures at the moment which we're trying to debug
15:44:19 we think there is a race condition somewhere and we've not been able to track it down yet
15:44:32 Until that's sorted it's clearly of very limited use to expose that.
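
Editor's note: since no existing "listen address" option was identified above, the agent would need a new one. A minimal oslo.config sketch follows; the option names here are hypothetical (the blueprint would define the real ones).

```python
# Hypothetical config options for the embedded controller's listen address.
from oslo_config import cfg

ovs_opts = [
    cfg.StrOpt('of_listen_address', default='127.0.0.1',
               help='Address the embedded OpenFlow controller listens on. '
                    'For XenAPI deployments this must be reachable from dom0.'),
    cfg.IntOpt('of_listen_port', default=6633,
               help='TCP port the embedded OpenFlow controller listens on.'),
]

cfg.CONF.register_opts(ovs_opts, group='OVS')
```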
15:44:59 The libvirt+xen CI discussed earlier will hopefully be running on Neutron from the start, but it will not be using the rootwrap method as that is specific to XenAPI
15:45:59 Frustratingly my team has also been hit by some high turnover recently so there will be a period of time to get new team members up to speed
15:46:01 BobBall: I think I can tell you what that race condition might be, but let's leave that till open discussion
15:46:49 xenapi ~= xcp/xenserver ?
15:46:50 As far as I've seen, the efforts to improve the Neutron (OVS) + XenAPI integration are limited to Citrix, and while many people want it, it's something that we have to find the resources to fix
15:46:55 yes yamamoto
15:47:34 BobBall: we hope to get quark upstream soon, at least I hope to get it more upstream, given the new neutron model
15:47:52 fantabulous
15:48:13 what's quark?
15:48:15 Perhaps we can talk about defaulting to quark rather than OVS - but currently we have to focus on something that's upstream
15:48:50 As I understand it, Quark makes use of XenAPI to push the rules down into dom0 rather than OVS with the rootwrap
15:50:15 yep, basically
15:50:28 quark uses OVS, but yes, that's miles down the line
15:50:29 https://github.com/rackerlabs/quark this one?
15:50:49 yamamoto: possibly, it's probably out of date
15:50:52 As far as neutron is concerned OVS isn't involved though, correct?
15:51:07 yamamoto: I take that back, that's quite up to date
15:51:33 BobBall: erm, neutron uses a non-ML2 plugin, put it that way
15:52:24 OK
15:54:00 But for now I think that quark is a different question
15:55:05 Have we covered the xen-rootwrap topic sufficiently yamamoto or are there more questions?
15:55:13 yep, it should be
15:55:14 thank you for having a discussion. it seems easier than i thought.
15:55:21 let's move on
15:55:25 fingers crossed it makes all our lives easier!
15:55:34 looks really promising, good stuff
15:55:41 #topic Open Discussion
15:55:54 so you have some races in the neutron CI
15:56:00 We do indeed
15:56:01 have you seen the code libvirt has to avoid those?
15:56:09 probably not
15:56:19 * johnthetubaguy goes to find the link
15:56:30 * BobBall crosses his fingers
15:57:24 https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L4251
15:57:38 you start the domain paused, then you wait for neutron to get everything wired up
15:57:45 then you start the domain
15:58:04 otherwise, it's very… racy
15:58:10 Ah I see
15:58:38 * BobBall makes a note of that
15:58:42 I'll look into that approach
15:59:00 Is that not racy for nova-network too?
15:59:12 given how the udev scripts work, we kinda sidestep that issue, at least 90% of it
15:59:45 nova-network doesn't do that stuff right, it just creates a VIF on a VLAN or not, there is no async agent doing configuration stuff
16:00:08 *nod*
16:00:18 I've not seen this before...
16:00:20 My head hurts
16:00:23 What does it mean?
16:00:25 launch_flags = events and libvirt.VIR_DOMAIN_START_PAUSED or 0
16:00:34 int1 and intb or 0
16:00:43 surely the or 0 is useless?
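
Editor's note: a rough sketch of the race-avoidance pattern johnthetubaguy describes, using the python libvirt bindings. This is illustrative, not Nova's actual code; the event-waiting step is a stub for what Nova does via its external-events REST API.

```python
import libvirt


def wait_for_vif_plugged_events():
    # Stub: Nova blocks here until Neutron delivers "network-vif-plugged"
    # events through Nova's os-server-external-events REST API.
    pass


conn = libvirt.open("qemu:///system")      # placeholder connection URI
domain_xml = open("instance.xml").read()   # placeholder domain definition
# Create the domain paused so the guest can't run before networking is ready.
domain = conn.createXML(domain_xml, libvirt.VIR_DOMAIN_START_PAUSED)
wait_for_vif_plugged_events()
domain.resume()  # only now does the guest start executing
```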
16:01:20 dunno
16:01:26 not read through that
16:01:26 ah - events is a list not an int
16:01:31 but we are out of time
16:01:31 wow - I wonder what that does
16:01:32 I guess that a "False" gets translated to a 0
16:01:45 We are
16:01:48 BobBall: it basically is waiting for a REST call from neutron to say it's done
16:01:58 I see
16:02:13 yes flip214 - but it seems that events is a list
16:02:14 and-or is a confusing way to write if-else
16:02:55 ah ok - so it's saying "events if VIR_DOMAIN_START_PAUSED is set, otherwise launch_flags will be 0"
16:03:15 *confused*
16:03:16 anyway
16:03:17 johnthetubaguy is right
16:03:19 we're out of time
16:03:24 So thanks all
16:03:28 Next meeting is in a fortnight
16:03:38 #nextmeeting 18th February
16:03:45 is #nextmeeting not a thing?
16:03:54 Anyway - see everyone on the 18th!
16:03:58 thanks
16:03:59 #endmeeting
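
Editor's note: to resolve the confusion at the end of the log, the "and-or" idiom works like this. Because events is a (possibly empty) list, the expression evaluates to VIR_DOMAIN_START_PAUSED when the list is non-empty and to 0 when it is empty, so the "or 0" handles the empty-list case. A small self-contained check, with the flag's value hard-coded for illustration so the snippet runs without libvirt:

```python
# VIR_DOMAIN_START_PAUSED hard-coded here purely for illustration.
VIR_DOMAIN_START_PAUSED = 1

assert ([] and VIR_DOMAIN_START_PAUSED or 0) == 0
assert (["network-vif-plugged"] and VIR_DOMAIN_START_PAUSED or 0) == 1

# The unambiguous modern spelling of the same line:
# launch_flags = VIR_DOMAIN_START_PAUSED if events else 0
```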