14:00:29 <PaulMurray> #startmeeting Nova Live Migration
14:00:30 <openstack> Meeting started Tue Mar 29 14:00:29 2016 UTC and is due to finish in 60 minutes.  The chair is PaulMurray. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:34 <openstack> The meeting name has been set to 'nova_live_migration'
14:00:40 <pkoniszewski> o/
14:00:43 <mdbooth> o/
14:00:46 <davidgiluk> o/
14:01:25 <PaulMurray> hi - I'm surprised to find people here - it's been so quiet today
14:01:45 * davidgiluk was here an hour ago, but then I noticed this is a UTC meeting :-)
14:01:46 <scheuran> o/
14:01:51 <PaulMurray> Agenda: https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
14:02:10 <PaulMurray> let's wait one more minute in case of late arrivals
14:02:15 <jlanoux> o/
14:02:43 <pkoniszewski> yeah, at the very beginning I wanted to say thanks for reminding us about the time change, I'd have failed hard today ;)
14:03:11 <pkoniszewski> PaulMurray: ^
14:03:18 <PaulMurray> #topic Bugs
14:03:32 <PaulMurray> https://bugs.launchpad.net/nova/+bug/1550250
14:03:32 <openstack> Launchpad bug 1550250 in OpenStack Compute (nova) "migrate in-use status volume, the volume's "delete_on_termination" flag lost" [High,In progress] - Assigned to YaoZheng_ZTE (zheng-yao1)
14:03:51 <PaulMurray> markus_z sent out a mail about this one
14:04:25 <PaulMurray> #link mail thread about "delete_on_termination" bug http://lists.openstack.org/pipermail/openstack-dev/2016-March/090684.html
14:04:43 <PaulMurray> Does anyone have an opinion on this bug?
14:05:17 <PaulMurray> it seems volumes lose their delete_on_termination flag on live migration
14:05:25 <pkoniszewski> is this somehow related to live migration?
14:05:35 <pkoniszewski> thought that this is cinder migration, not nova's live migration
14:05:52 <pkoniszewski> 4.run cinder migrate volume
14:06:01 <pkoniszewski> (one of provided steps)
14:06:07 <mdbooth> Yup, that's not nova live migration
14:06:21 <PaulMurray> ah, you're right - I was given the impression it was nova, but it looks like cinder?
14:07:03 <PaulMurray> ok - moving on
14:07:09 <PaulMurray> (embarrassed)
14:07:37 <PaulMurray> are there any bugs anyone wants to bring up?
14:08:12 <PaulMurray> #topic Summit sessions
14:08:48 <PaulMurray> Just a reminder to add anything you want considered to the etherpad
14:09:06 <PaulMurray> #link summit sessions: https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:09:41 <PaulMurray> #topic Update libvirt domain xml interface section on live migration
14:09:54 <PaulMurray> this was added by scheuran
14:09:57 <scheuran> Hi
14:10:01 <PaulMurray> scheuran, do you want to take this
14:10:04 <scheuran> yep
14:10:30 <scheuran> so my goal is to update the interface section of the domain.xml before live migration starts
14:10:59 <scheuran> I have 2 use cases
14:11:03 <mdbooth> scheuran: On the destination I assume?
14:11:17 <scheuran> during pre_live migration
14:11:33 <scheuran> but update the xml with the destination information, right
14:11:49 <scheuran> #1 Live Migration with newly added neutron macvtap agent
14:11:57 <scheuran> #2 live migration cross neutron agents
14:12:14 <scheuran> e.g. migrate from a host with linuxbridge agent to a host with ovs agent
14:12:38 <scheuran> what I need for this is the vif information for the destination host
14:12:58 <scheuran> neutron generates this when the update_binding_host_id call is made in post live migration
14:13:09 <pkoniszewski> is this something that you can take only from the destination? I mean, this vif information
14:13:23 <scheuran> pkoniszewski, from the neutron server
14:13:30 <scheuran> and only from the neutron server
14:13:37 <pkoniszewski> okay
14:13:57 <scheuran> so in pre_live migration I need to call neutron, and ask for the vif information
14:14:15 <scheuran> but this information gets updated in post live migration today, like described before
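The domain XML rewrite scheuran describes can be sketched as follows. This is a minimal illustration, not code from the linked prototype; the bridge names and the single-interface domain are invented for the example, and a real linuxbridge-to-OVS move would also change the interface type and virtualport details.

```python
# Sketch: rewrite the <interface> section of a libvirt domain XML with
# the destination host's vif details before migration. All values here
# are illustrative.
import xml.etree.ElementTree as ET

DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <source bridge='brq-linuxbridge'/>
      <mac address='fa:16:3e:00:00:01'/>
    </interface>
  </devices>
</domain>
"""

def update_interface(domain_xml, new_source_bridge):
    """Point each <interface> source at the destination's bridge."""
    root = ET.fromstring(domain_xml)
    for iface in root.findall('./devices/interface'):
        iface.find('source').set('bridge', new_source_bridge)
    return ET.tostring(root, encoding='unicode')

updated = update_interface(DOMAIN_XML, 'br-int')
print('br-int' in updated)  # True
```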
14:14:38 <pkoniszewski> is there any incompatibility between agents? or can you live migrate from any to any?
14:14:43 <scheuran> so what I want to discuss is, if we can make the update_binding_host_id call in pre_live migration instead of post_live migration
14:15:03 <scheuran> pkoniszewski, today you cannot migrate between agents
14:15:09 <PaulMurray> scheuran, at the moment the port binding update triggers creating networking on dest
14:15:11 <pkoniszewski> i mean from neutron perspective
14:15:16 <scheuran> as on the target, nova always plugs the old vif type
14:15:28 <pkoniszewski> yeah, that's right
14:15:32 <PaulMurray> there are a few neutron-nova bugs that want network setup in pre_live_migrate
14:15:36 <scheuran> pkoniszewski, no, nothing from Neutron
14:15:46 <pkoniszewski> okay
14:16:04 <scheuran> PaulMurray, right, I'm aware of them - but they all do not solve this issue
14:16:22 <scheuran> #link https://review.openstack.org/#/c/297100/
14:16:27 <scheuran> this is the prototype I did
14:16:36 <scheuran> it's working for the good cases
14:17:06 <scheuran> however I'm still looking for a way to roll back the port binding if migration fails...
14:17:24 <pkoniszewski> don't you have old domain XML on source?
14:17:30 <scheuran> yes I have
14:18:00 <scheuran> and I need to update it with the new vif information
14:18:10 <mdbooth> Probably stupid question as I know very little about networking: I assume it's possible to have these different agents on the same segment as presented to the vm?
14:19:04 <scheuran> mdbooth, yes, those agents can serve the same network segment
14:19:20 <scheuran> but not on the same host of course
14:19:51 <scheuran> but you could have a mixed cloud, running linuxbridge and running ovs and everybody could talk to each other..
14:20:03 <PaulMurray> scheuran, does updating the port binding affect the source networking ?
14:20:16 <scheuran> PaulMurray, no
14:20:28 <scheuran> it's just a database operation
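From the API side, the binding update scheuran mentions is a Neutron port update. A sketch with a fake client is below; the request-body shape (`{"port": {"binding:host_id": ...}}`) follows Neutron's port-update API, while the client class, port id, and hostname are invented for illustration.

```python
# Sketch of updating a port's binding:host_id - the "database operation"
# discussed above. FakeNeutronClient stands in for a real client, which
# would PUT /v2.0/ports/{port_id} with the same body.
recorded = {}

class FakeNeutronClient:
    def update_port(self, port_id, body):
        recorded[port_id] = body  # real client: HTTP PUT to neutron-server
        return body

def bind_port_to_dest(client, port_id, dest_host):
    return client.update_port(port_id, {'port': {'binding:host_id': dest_host}})

client = FakeNeutronClient()
bind_port_to_dest(client, 'port-1234', 'dest-host')
print(recorded['port-1234'])
```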
14:20:52 <PaulMurray> I saw you removed the migrate_instance_finish() - what does that do?
14:21:47 <pkoniszewski> scheuran: so, you are asking about rollback, maybe a stupid question, isn't rollback_live_migration in compute manager enough? if source networking is unaffected it should work
14:21:57 <scheuran> PaulMurray, did I? In which file?
14:22:13 <pkoniszewski> in compute manager
14:22:14 <PaulMurray> https://review.openstack.org/#/c/297100/2/nova/compute/manager.py
14:22:21 <PaulMurray> its commented out
14:22:23 <pkoniszewski> https://review.openstack.org/#/c/297100/2/nova/compute/manager.py@5562
14:22:34 <scheuran> PaulMurray, a right
14:22:52 <scheuran> the only purpose of this method is updating the binding_host_id
14:23:02 <scheuran> today
14:23:37 <scheuran> so for production code we could move things around a bit so that this hook is still present
14:24:03 <scheuran> pkoniszewski, is rollback_live_migration executed in every case?
14:24:20 <pkoniszewski> if something after pre_live_migration fails - yes
14:24:36 <pkoniszewski> including pre_live_migration step
14:24:51 <scheuran> pkoniszewski, yeah - the problem is that today it won't fail in pre_live_migration
14:25:02 <scheuran> but during migration operation
14:25:10 <scheuran> when libvirt tries to define the xml on the destination
14:25:12 <pkoniszewski> it's okay, LM monitor will trigger rollback
14:25:20 <scheuran> and the requested devices are not present...
14:25:47 <pkoniszewski> you mean when LM is already finished from hypervisor perspective?
14:26:19 <pkoniszewski> ah ok, got it
14:26:29 <scheuran> pkoniszewski, not sure - when libvirt is executing the migration
14:26:42 <scheuran> pkoniszewski, not sure if nova treats this as already finished...
14:27:10 <pkoniszewski> this is fine, live_migration_operation goes in a separate thread
14:27:20 <pkoniszewski> we will still start monitor
14:27:23 <pkoniszewski> that will call rollback
14:27:36 <scheuran> pkoniszewski, ok, so I'll give it a try!
14:28:05 <scheuran> just to name the alternatives..
14:28:38 <scheuran> it would be a new neutron api, that returns that vif information for the destination - without persisting it in the database...
14:28:56 <scheuran> or allow a port to be bound to 2 nodes (source & target)
14:29:21 <scheuran> but just doing the port binding in pre_live migration seemed to be the easiest way
14:29:55 <pkoniszewski> we use the last approach that you mentioned for volumes
14:30:29 <scheuran> pkoniszewski, so that during migration both hosts own the volume?
14:30:44 <PaulMurray> scheuran, volumes are attached to both hosts
14:30:50 <PaulMurray> during migration
14:30:57 <pkoniszewski> we have connection open to both hosts, even if nova fails during LM, instance will keep operating
14:31:08 <scheuran> pkoniszewski, ok I see
14:31:16 <pkoniszewski> sounds like the most secure approach to me
14:31:24 <scheuran> but in the database the owner is still the source, until migration finished, right?
14:31:41 <pkoniszewski> the owner is an instance which does not change
14:31:41 <PaulMurray> but its slightly different - volumes can be used from both hosts
14:31:50 <PaulMurray> with networking we want only one to get packets
14:31:52 <scheuran> right, cause for neutron its just a database problem
14:32:12 <scheuran> physically I wouldn't do anything differently than today
14:32:32 <scheuran> it's "just" about updating the database record
14:33:00 <PaulMurray> scheuran, do you have a spec for this ?
14:33:08 <scheuran> so during migration, the database says the port is bound to the destination, although it is still active on the source
14:33:17 <scheuran> PaulMurray, not yet
14:33:28 <scheuran> PaulMurray, I first wanted to get a feeling which approach is the best one
14:33:43 <PaulMurray> I understand
14:33:56 <PaulMurray> specs are a good way to get wider opinion as well
14:34:06 <scheuran> PaulMurray, ok
14:34:06 <PaulMurray> so when you think you have an idea
14:34:13 <PaulMurray> its good to write it down
14:34:18 <scheuran> also creating a blueprint?
14:34:22 <PaulMurray> yes
14:34:37 <PaulMurray> blueprints are really only used for tracking
14:34:47 <PaulMurray> but the spec will get reviewed and you get feedback
14:34:53 <scheuran> PaulMurray, ok, then this is my todo until next week..
14:35:30 <scheuran> I'm a Neutron guy - so not very familiar with the nova process..
14:35:40 <scheuran> so spec + bp, perfect
14:35:41 <PaulMurray> no worries, we're friendly
14:35:45 <scheuran> :)
14:36:09 <pkoniszewski> scheuran: also if you have spec, you can discuss it during nova unconference session on summit
14:36:24 <scheuran> pkoniszewski, good point
14:36:40 <scheuran> I already added this topic to the nova-neutron topics (in the neutron etherpad)
14:36:51 <pkoniszewski> which is actually the best way to clarify things that can be implemented different ways
14:37:21 <scheuran> ok. so to summarize - I'll try out the rollback stuff
14:37:29 <scheuran> and come up with a bp + spec until next week
14:37:49 <PaulMurray> good - thanks for bringing this to our attention
14:38:00 <PaulMurray> #topic Open Discussion
14:38:00 <scheuran> yes, thank you guys!
14:38:16 <PaulMurray> does anyone have anything else to bring up?
14:39:20 <PaulMurray> ok - thanks for coming
14:39:24 <luis5tb> Hi!
14:39:39 <davidgiluk> luis5tb: Hi Luis
14:39:43 <luis5tb> I would like to know if someone is taking a look at including post-copy live migration
14:40:08 <PaulMurray> luis5tb, are you interested in that ?
14:40:16 <luis5tb> I've been working on including it (but for JUNO version) and would like to know if that would be of interest
14:40:45 <PaulMurray> what do you mean by "for Juno version" ?
14:40:55 <luis5tb> yep, I want to take a look at the latest migration code and try to adapt it (many things have changed since then)
14:41:23 <luis5tb> I integrated post-copy into nova (OpenStack Juno release) a year ago
14:41:42 <PaulMurray> I think there is interest
14:41:50 <PaulMurray> there is a list here: https://etherpad.openstack.org/p/newton-nova-live-migration
14:42:02 <PaulMurray> it is on the list - you could add yourself
14:42:11 <PaulMurray> or rather anything you want to add as information
14:42:12 <luis5tb> I saw points 6 and 7
14:42:21 <davidgiluk> but if I understand correctly, the next step would be to write a spec?
14:42:46 <PaulMurray> davidgiluk, yes, took the words out of my mouth
14:42:57 <PaulMurray> I was actually wondering if someone is already doing that
14:43:00 <PaulMurray> ?
14:43:08 <PaulMurray> If not then please do
14:43:15 * davidgiluk doesn't know of anyone writing a spec
14:43:26 <luis5tb> ok, just wondering if someone else already took a look into it, or is in the to do list for the future
14:43:46 <luis5tb> ok
14:44:01 <pkoniszewski> so, the libvirt change is not in yet
14:44:08 <davidgiluk> pkoniszewski: Oh yes it is!
14:44:11 <PaulMurray> luis5tb, i think we put it off before (as a group) because we had a lot to do
14:44:13 <luis5tb> yep, it is
14:44:18 <pkoniszewski> oh, i missed it then
14:44:22 <PaulMurray> also there was the 2.6 change
14:44:42 <davidgiluk> pkoniszewski: Got merged last week
14:44:59 <pkoniszewski> good news
14:45:02 <PaulMurray> so may be a good time to move on with it
14:45:07 <pkoniszewski> yeah
14:45:13 <luis5tb> great
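The libvirt change referenced above is the post-copy migration support. How it would surface in a caller can be sketched with the flag composition alone; the constant values mirror libvirt's documented virDomainMigrateFlags, but no libvirt binding is imported here, so this is only illustrative:

```python
# Sketch: requesting post-copy via libvirt migration flags. The values
# below are assumed to match libvirt's virDomainMigrateFlags enum
# (VIR_MIGRATE_POSTCOPY landed in libvirt 1.3.3).
VIR_MIGRATE_LIVE = 1 << 0
VIR_MIGRATE_PEER2PEER = 1 << 1
VIR_MIGRATE_POSTCOPY = 1 << 15

flags = VIR_MIGRATE_LIVE | VIR_MIGRATE_PEER2PEER | VIR_MIGRATE_POSTCOPY
# Migration starts in pre-copy mode; the caller would later switch to
# post-copy (libvirt exposes a start-post-copy call for this).
print(bool(flags & VIR_MIGRATE_POSTCOPY))  # True
```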
14:45:49 <PaulMurray> luis5tb, don't worry about waiting for others - if someone else wanted to do it they can get together with you
14:46:42 <luis5tb> ok, I'll try to write a spec regarding work items 6/7
14:46:48 <PaulMurray> great - that's a big help
14:46:49 <pkoniszewski> yup, i will be interested in helping there
14:46:50 <PaulMurray> thanks
14:47:03 <davidgiluk> pkoniszewski: Great
14:47:16 <luis5tb> great
14:47:38 <PaulMurray> please add it to the list on the etherpad too
14:48:18 <PaulMurray> ok - anything else
14:48:29 * PaulMurray will give a big long pause this time
14:49:32 <PaulMurray> thanks for coming
14:49:36 <PaulMurray> #endmeeting