13:08:44 <ndipanov> #startmeeting sriov
13:08:46 <openstack> Meeting started Tue Mar 15 13:08:44 2016 UTC and is due to finish in 60 minutes.  The chair is ndipanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:08:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:08:49 <openstack> The meeting name has been set to 'sriov'
13:09:01 <ndipanov> ok so I guess we could see where we stand on bugs...
13:09:19 <ndipanov> https://etherpad.openstack.org/p/sriov_meeting_agenda
13:09:50 <lbeliveau> for my cold migration bug (https://review.openstack.org/#/c/242573/17), I had to -1 myself, something got broken
13:10:05 <ndipanov> there were 3 bugs we were hoping to get into mitaka
13:10:20 <ndipanov> lbeliveau, ah yes I saw that
13:10:40 <vladikr> Hi everyone
13:10:53 <ndipanov> hello
13:10:58 <lbeliveau> Hey
13:11:18 <moshele> hi
13:11:25 <ndipanov> vladikr, what's up with this one? https://review.openstack.org/#/c/283198/4
13:12:00 <vladikr> ndipanov, it merged i think
13:12:02 <vladikr> sec
13:12:20 <ndipanov> ah nice was looking at an outdated revision
13:12:30 <vladikr> yea :)
13:12:52 <lbeliveau> nice
13:14:06 <ndipanov> https://review.openstack.org/#/c/216049 merged too!
13:14:44 <lbeliveau> yeap
13:15:10 <lbeliveau> I still need to open a new bug on this issue related to migration
13:15:18 <ndipanov> lbeliveau, how come?
13:16:30 <lbeliveau> I think something got broken, on devstack, I see a PCI device get allocated on the destination node, but it's not saved with the instance, so my stuff in neutronv2 doesn't see the new PCI devices
13:16:43 <lbeliveau> still need to troubleshoot, but was side tracked
13:16:48 <ndipanov> lbeliveau, ah you think you've hit a new bug
13:16:51 <ndipanov> ok I see
13:17:03 <ndipanov> well that's good that we are finding bugs I guess :)
13:17:19 <ndipanov> if there are no more (known) bug fixes we wanna rush in
13:17:25 <ndipanov> we can talk briefly about
13:17:28 <ndipanov> THE FUTURE
13:17:39 <ndipanov> aka newton work we wanna try and get done
13:17:45 <lbeliveau> I was not seeing that before, but I haven't run my patch in devstack for a while since the fixes I was making only needed tox
13:17:59 <moshele> lbeliveau: I can try your patch on my setup
13:18:23 <moshele> lbeliveau: it will take me some time to prepare one
13:18:52 <lbeliveau> moshele: that would be great, but I won't be able to resume this activity for the next couple of days
13:19:30 <yonglihe> lbeliveau: I hope I can help review your code in a few days, is that too late?
13:20:10 <lbeliveau> yonglihe: no, it's not too late, I have to fix a related bug before I can submit this one
13:20:50 <ndipanov> yonglihe, hi - I saw you mentioned some fixes for cold migration on the ML the other day
13:20:58 <yonglihe> lbeliveau: ok, let's do that. I spent lots of time debugging the PCI migration itself, hope I can help
13:20:59 <ndipanov> could that be what's tripping lbeliveau up?
13:21:26 <yonglihe> lbeliveau: my patch is too old, and the logic is ugly
13:22:18 <lbeliveau> not sure why I saw it working before (many times), maybe I was just getting lucky or something changed on master
13:23:15 <ndipanov> when I was looking at the code last which was a few weeks ago
13:23:25 <ndipanov> I think I concluded that it would not work as expected
13:23:26 <moshele> lbeliveau: we tested your patch in QA and it worked, but it was like 2 months ago
13:24:08 <lbeliveau> moshele: ok so that might confirm that something got broken
13:24:38 <lbeliveau> anyhow, it means we don't have a unit test for that :)
13:24:48 <moshele> lbeliveau: the test was to put 2 VMs on 2 different computes and then migrate one of them
13:25:41 <lbeliveau> moshele: sounds right, and it's best to use PCI devices with different PCI addresses to make sure new ones are allocated on the destination node
13:26:31 <lbeliveau> I'll work on it in a couple of days and will get back to you guys with my findings (and questions)
13:28:11 <ndipanov> cool
13:28:26 <ndipanov> anything regarding newton we wanna talk about here
13:28:26 <ndipanov> ?
13:29:05 <moshele> yes I have this spec, Add scheduling with NIC capabilities: https://review.openstack.org/#/c/286073/
13:29:30 <lbeliveau> moshele: I'
13:29:34 <lbeliveau> I'll have a look
13:30:18 <moshele> so we want nova to be aware of NIC capabilities for scheduling
13:30:52 <ndipanov> moshele, I will too
13:30:57 <lbeliveau> we want to propose a BP to make it easier to use SR-IOV, by combining the neutron port creation and the boot in one step
13:32:06 <moshele> lbeliveau: we had something like this before but it was rejected
13:32:09 <ndipanov> moshele, so this has been discussed before but I'll take a look
13:32:11 <ndipanov> and comment
13:32:17 <ndipanov> lbeliveau, that sounds useful
13:32:27 <ndipanov> moshele, how come?
13:32:37 <moshele> lbeliveau: let me look for the spec a sec
13:33:28 <moshele> see this https://review.openstack.org/#/c/138808/
13:35:06 <ndipanov> moshele, I know there are folks who would hate to have that kind of logic live in nova
13:35:09 <lbeliveau> moshele: that looks very similar to what we wanted to propose !  we actually have something like that in our product
13:35:41 <lbeliveau> ndipanov: what is their argument ?
13:36:04 <lbeliveau> it can also avoid race condition issues when two instances are trying to bind to the same SR-IOV port
13:36:22 <ndipanov> well that nova should not do any sort of orchestration like that and that it's up to user scripts/heat etc.
13:36:57 <ndipanov> I could see arguments for both personally
13:37:52 <lbeliveau> ndipanov: I see, but using heat and all doesn't make it easier to use, if usability is a concern
13:38:01 <moshele> when I was at the vancouver summit it seemed that every one of the cores was against it
13:38:02 <ndipanov> lbeliveau, yeah
13:38:27 <ndipanov> moshele, yeah - I don't feel too strongly there are good arguments against it
13:38:33 <lbeliveau> should we try again ?
13:38:33 <ndipanov> though
13:38:42 <lbeliveau> or it is a lost cause ?
13:38:49 <ndipanov> if you have the energy - I would
13:38:55 <ndipanov> neutron is slightly different I'd say
13:39:02 <ndipanov> since in order for SR-IOV to work
13:39:09 <ndipanov> you need to configure nova properly too
13:39:28 <ndipanov> so some automation around that could be useful regardless of the "policy"
13:39:29 <ndipanov> I guess
13:39:38 <vladikr> I had a similar problem with another spec at that time.. the main argument was that the setting is too specific - and they wanted me to use some kind of an abstraction
13:40:22 <vladikr> instead of queues=8, use the term "fast" or anything like that
13:40:39 <vladikr> wasn't very useful in my case
13:42:09 <ndipanov> vladikr, that's not necessarily the same thing though
13:42:16 <lbeliveau> not sure for newton, but we still haven't got a way to boot with SR-IOV with horizon
13:42:45 <moshele> lbeliveau: but this can be fixed in horizon
13:42:53 <ndipanov> lbeliveau, how come - we can't create an SR-IOV port via horizon?
13:43:30 <lbeliveau> ndipanov: I looked last week and I didn't find a way, maybe just an oversight on my part
13:43:55 <moshele> lbeliveau: I know we can create a direct port in horizon, I am not sure if we can select a port for vm boot
13:44:17 <lbeliveau> moshele: will have a closer look next time I have a devstack running
13:44:31 <moshele> lbeliveau: also I think that horizon has a bug on that
13:44:32 <ndipanov> that sounds like a good fix to have for sure
13:46:09 <vladikr> If we still have time to talk about bugs, there is one that I ran into yesterday, https://review.openstack.org/#/c/291847/ - looks like it stops claiming pci devices in claim_test. I thought we were claiming the devices on purpose, to avoid a race? (maybe we are not cleaning, but that's a different story)
13:46:34 <ndipanov> should we raise a bug for the horizon issue?
13:47:03 <lbeliveau> ndipanov: I'll have a look, and if it is indeed a bug I'll take the action of raising one
13:47:04 <ndipanov> vladikr, that sounds like a regression
13:47:09 <ndipanov> lbeliveau, thanks
13:47:17 <ndipanov> if it's a regression we should hold up the RC for that
13:47:26 <ndipanov> so I propose you raise the priority to critical
13:47:31 <ndipanov> and tag the bug just in case
13:48:41 <ndipanov> vladikr, but just from a brief look I'm not sure it needs to do that there...
13:48:53 <ndipanov> will look closer
13:49:33 <mriedem> i've marked https://bugs.launchpad.net/nova/+bug/1549984 as mitaka-rc-potential
13:49:33 <openstack> Launchpad bug 1549984 in OpenStack Compute (nova) "PCI devices claimed on compute node during _claim_test()" [High,In progress] - Assigned to Jay Pipes (jaypipes)
13:49:53 <ndipanov> mriedem, thanks
13:49:56 <jaypipes> mriedem: ty
13:50:10 <ndipanov> lurkers
13:50:20 <mriedem> :)
13:51:40 <yonglihe> mriedem: hi
13:52:11 <yonglihe> good morning, mriedem
13:52:14 <mriedem> hi
13:52:31 <moshele> lbeliveau: horizon bug https://bugs.launchpad.net/horizon/+bug/1402959
13:52:31 <openstack> Launchpad bug 1402959 in OpenStack Dashboard (Horizon) "Support Launching an instance with a port with vnic_type=direct" [Medium,In progress] - Assigned to Itxaka Serrano (itxakaserrano)
13:53:10 <lbeliveau> moshele: thanks, good !
13:53:15 <yonglihe> i might get wrong message, do you ever want sriov-test comments to nova?
13:53:43 <lbeliveau> yonglihe, what do you mean ?
13:53:51 <mriedem> yonglihe: i'm assuming you're asking about 3rd party CI?
13:54:08 <yonglihe> mriedem: you might just want NFV test recheck a patch? yes third-party CI
13:54:37 <mriedem> ndipanov: are you in open discussion? i don't want to derail your meeting
13:54:38 <lbeliveau> yonglihe: intel-pci ?
13:54:56 <ndipanov> mriedem, I think we are probably done?
13:54:57 <ndipanov> folks?
13:55:03 <lbeliveau> I'm good
13:55:05 * ndipanov is hungry
13:55:09 <moshele> I think so
13:55:13 <mriedem> yonglihe: let's talk in -nova
13:56:47 <ndipanov> ok folks thanks for the meeting
13:56:51 <ndipanov> #endmeeting sriov