13:00:44 #startmeeting sriov
13:00:44 Meeting started Tue May 3 13:00:44 2016 UTC and is due to finish in 60 minutes. The chair is moshele. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:48 The meeting name has been set to 'sriov'
13:00:51 hi all
13:01:00 moshele, hi
13:01:08 Howdy.
13:01:50 let's wait for more people to join
13:01:56 o/
13:02:17 hi
13:02:56 ok let's start
13:03:04 here is the agenda https://etherpad.openstack.org/p/sriov_meeting_agenda
13:03:49 I did some cleanups over the last couple of days, let me add them in the etherpad
13:03:52 currently I am still working on cleaning up the pci code with jaypipes and also fixing the resize
13:05:59 ok thanks lbeliveau
13:06:20 If we're doing updates, I've started working on docs for general high-performance (NFV?) features, including SR-IOV and PCI passthrough
13:06:49 sfinucan: do you have a draft/review ready ?
13:06:59 sfinucan: ok do you have patches or is it a wiki?
13:07:21 lbeliveau: not even close yet, but I'll add you to the review when I do
13:07:29 moshele: patches (nova-docs)
13:07:47 sfinucan: can you add them to the etherpad ?
13:08:01 Will do, once I publish them (tomorrow'ish)
13:08:11 cool
13:08:13 perfect, can I help ? I know this area pretty well
13:08:53 and I can help with the SR-IOV PCI passthrough part
13:08:53 lbeliveau: Yes, please. I want to cover the PCI stuff, but I've started with the stuff I know best - NUMA and pinning
13:09:20 lbeliveau: Actually, I think you signed up to help at the summit :)
13:09:21 ok, I can have a stab at pci if you want
13:09:25 Sure thing
13:09:29 yeah I did
13:09:34 ok I'll get started
13:10:21 so that you know, don't want to step on your toes :)
13:11:02 lbeliveau: Sure thing, heh. Would like your insight into where they should go too. I'm making additional changes to the extra-specs docs at the moment, but I'm thinking a dedicated NFV section would be useful
13:11:30 I think so too
13:11:31 #link http://docs.openstack.org/admin-guide/compute-flavors.html
13:11:44 sfinucan: I'll have a look after this meeting, will ping you
13:12:17 Sweet
13:12:21 Sorry for sidetracking, moshele. Back to the agenda :)
13:12:49 it's fine, I will add a Doc section to see we are making progress
13:13:59 so I hope to finish the resize fix by the end of this week
13:14:19 that would be sweet
13:14:20 and then I will need lbeliveau's help with the cold migration
13:14:26 sure thing
13:14:31 ok cool
13:15:07 any news on CI ?
13:15:45 I sent an email to smooney to get insight into his plan to support multi-node testing, no answer yet
13:15:46 currently we are adding more tests to the Mellanox SR-IOV CI
13:16:17 but I don't know what the status is with the Intel NFV CI
13:16:24 moshele, those tests are not public right ? meaning not in public git ?
13:16:43 moshele: I can provide one, if it would help?
13:16:44 public in git, they are in tempest
13:17:11 sfinucan: that would be great
13:17:40 ok, how does it work with tempest in terms of reviewing, do nova cores also review there ?
13:17:40 moshele: Sure
13:18:08 Who is responsible for owning the CI? Since it requires some rather expensive hardware.
13:18:08 sfinucan: are those tests somewhere on github?
13:18:14 So we've been working on migrating everything to the upstream model (nodepool, zuul, etc.)
13:18:15 sfinucan: also can you check if your CI can test pci-passthrough resize
13:18:34 That's now complete, and our CI should be switching over this week
13:18:58 is this just for the NUMA testing?
13:19:46 NUMA, hugepages and some pinning
13:19:50 sfinucan: what is the name of this CI ? Is it run in the gate ?
13:20:04 We run the standard tempest tests with different flavors first
13:20:20 (hugepages enabled, NUMA topology specified and possibly some pinning)
13:20:34 Then we run our custom tempest tests, which can be found here
13:20:42 #link https://github.com/openstack/intel-nfv-ci-tests/
13:21:14 lennyb: ^^ is that what you're looking for?
13:21:17 is there a plan to test multi-node ?
13:21:52 lbeliveau: Is that addressed at me?
13:21:56 * lennyb checking link
13:22:04 sfinucan: yes
13:22:12 for Mellanox CI it is something that we need to investigate
13:22:56 In that case, I don't know tbh. Despite suggestions at the summit, wznoinsk (rather than smooney) is the best person to ask
13:22:58 for cold migration we need at least two nodes (or maybe a chroot env)
13:23:38 I can ask him to join next week, if that would help?
13:24:01 sfinucan: yes
13:24:17 moshele: pci-passthrough is probably something that the Intel PCI CI should/does test
13:24:45 I don't have much information on that at the moment, other than the fact that we (Intel Shannon) should be taking ownership of it shortly
13:24:50 sfinucan: yes, we absolutely need that, cores are resistant to merging some of these patches since it's not CI tested at the moment
13:25:12 Again wznoinsk is the guy to ask RE: specifics
13:25:38 hi all
13:25:51 wznoinsk: hi
13:25:57 wznoinsk: hi
13:26:03 Will tests automatically filter out if required hardware is not present? Or based on a setting in tempest.conf?
13:26:51 moshele we use SRIOV VF in our NFV CI at the moment, but the proper testing of SRIOV (like resize) would be done in the Intel SRIOV CI, https://wiki.openstack.org/wiki/ThirdPartySystems/Intel-SRIOV-CI
13:27:41 efried_: Again, is that directed at me? :)
13:28:35 wznoinsk: Mellanox CI is testing SR-IOV ports and it is working fine, I just want to understand what is the status of the pci-passthrough testing
13:28:45 I see things coming from different angles, could we throw it into an etherpad... a wishlist about SRIOV testing?
13:29:23 moshele I'd have to talk to the guys from the SRIOV CI above on that, they're in PRC hence e-mail is probably best to catch up
13:30:12 wznoinsk: ok also I would like to talk to you after the meeting if that is possible
13:30:23 moshele sure
13:30:31 let's move on
13:31:36 we have some bugs around the PF passthrough which Nikola worked on, do you know if anyone is continuing the work
13:32:49 I guess we can share the work, do we have a list of those ? Some are most likely PCI related and others NUMA
13:33:21 lbeliveau: I put it on the etherpad under PF passthrough
13:33:24 moshele: NUMA and pinning work would be preferred, but I can look into slotting this into my Newton backlog if necessary
13:33:28 I think we should put down a list and add it to the etherpad for people to pick up this work
13:34:41 lbeliveau: I think mriedem or johnthetubaguy has one. I'll put it in the etherpad if I find one
13:35:36 I can take over some of them (not all)
13:35:50 I guess we will split the work :)
13:36:26 sure :)
13:36:26 ^ yes, please :)
13:36:26 let's move to the NUMA and pinning
13:37:01 so I am not an expert on that so sfinucan you can update
13:37:30 lbeliveau: johnthetubaguy doesn't know about any specific doc, but the notes from the summit would be a good start
13:37:32 #link https://etherpad.openstack.org/p/newton-nova-performance-vms
13:37:35 moshele: Sure
13:37:49 thanks
13:38:05 I'll start a list and put it in the etherpad
13:38:06 I want to get this all into the feature classification matrix
13:38:19 so we are all clear about the current docs, testing and implementation state
13:38:29 So around NUMA and pinning, there are a few ongoing things
13:38:29 but that's very much a work in progress at this point
13:39:15 honestly, as a sub group, if you can agree a list of "issues" where we need help, then narrow that down to a list of newton targets, that would be awesome
13:39:23 then send a summary to the ML
13:39:32 at least, that's what I would recommend you focus on
13:40:08 For pinning, there are two bugs with pinning that need to be resolved and backported (they're listed in the agenda)
13:40:51 We also need more work on documenting this feature for people (ongoing, sfinucan) and some additional functional tests (tbd)
13:40:54 johnthetubaguy: so the agenda etherpad covers all the current issues https://etherpad.openstack.org/p/sriov_meeting_agenda
13:41:38 For NUMA, the only big issues I know of are live migration. I _think_ ndipanov resolved most of these, but tbh I don't know what got merged in the end vs. what didn't
13:42:23 sfinucan: do you have the bug at least?
13:42:36 moshele: One moment
13:44:32 So there are two closely related bugs
13:44:36 #link https://bugs.launchpad.net/nova/+bug/1417667
13:44:37 Launchpad bug 1417667 in OpenStack Compute (nova) "migration/evacuation/rebuild/resize of instance with dedicated cpus needs to recalculate cpus on destination" [Medium,In progress] - Assigned to Nikola Đipanov (ndipanov)
13:44:48 #link https://bugs.launchpad.net/nova/+bug/1289064
13:44:49 Launchpad bug 1289064 in OpenStack Compute (nova) "live migration of instance should claim resources on target compute node" [Medium,In progress] - Assigned to Nikola Đipanov (ndipanov)
13:45:38 But tbh, I don't know the status of these now so it might be best to leave this discussion until next week
13:46:11 ...when I can come back with a full report and we can decide who to take over this stuff (if anyone)
13:46:47 sfinucan: ok, by the way the meeting is biweekly, do you want to do it every week?
13:48:40 anyone else? do you want to change it to a weekly meeting?
13:49:04 every week is good with me as well
13:49:23 moshele: Yes, every week would be good
13:49:27 there are so many tasks that need to be tracked that every week would make sense
13:49:50 ok I will update the irc-meeting
13:49:54 ...at least while getting these priorities in order
13:50:05 agree
13:50:30 when PCI (and NUMA) gets more stable I guess it will become natural to do this every 2nd week
13:51:26 agreed
13:52:10 So I will reorganize the https://etherpad.openstack.org/p/sriov_meeting_agenda and send an update on the ML
13:52:29 is that cool
13:53:18 perfect
13:53:43 anything else?
13:54:03 sounds great
13:54:17 Bar getting eyes on those pinning bugs, nothing else from me
13:54:42 good with me
13:55:02 ok cool, that's it, thanks everyone for joining the meeting
13:55:08 see you next week :)
13:55:12 ttfn
13:55:12 #endmeeting