11:01:20 #startmeeting Scientific-SIG
11:01:21 Meeting started Wed Jul 18 11:01:20 2018 UTC and is due to finish in 60 minutes. The chair is martial__. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:01:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:01:24 The meeting name has been set to 'scientific_sig'
11:01:54 Good day, welcome to a short version of the weekly Scientific SIG meeting
11:02:05 o/
11:02:14 gday Martial!
11:02:26 hello daveholland and janders
11:02:57 Both Stig and Blair are otherwise occupied and our agenda is light
11:03:21 Berlin CFP closed - bit of a relief
11:03:49 John G persuaded me to put a proposal in, first time I've submitted one
11:03:50 Hi everyone
11:04:14 Hi priteau
11:04:24 daveholland: cool :)
11:04:32 daveholland: great, congratulations! :) what is your preso about?
11:06:21 it's intended to be a high-level description of our first 12-18 months running a production OpenStack service (content to be finalized/argued about, if the proposal is accepted)
11:07:08 look forward to hearing about it
11:07:11 very cool!
11:07:13 me too
11:07:40 like I mentioned at the top of the hour, today is light
11:07:46 CFP is closed
11:08:22 martial: have you put any proposals in?
11:08:29 just to mention we have a Super Computing 18 Birds of a Feather proposal in the works to discuss OpenStack, Containers and Kubernetes
11:08:41 very cool!
11:09:01 janders: with SC18 at the exact same time as Berlin, sadly I did not, I will be in Dallas
11:09:14 yeah it's a tough one, isn't it...
11:09:27 sadly it is
11:09:38 I'll almost certainly choose Berlin but I will have a hard time bringing more people with me
11:10:34 a lot of our usual HPC operators are at SC18. I have confirmed this with them
11:10:47 some are on our BoF as well
11:11:00 it is likely Blair will be as well
11:11:20 Stig will be in Berlin to run the Scientific SIG
11:11:34 ACK! That's good to know
11:11:50 and that is the entire content of the agenda that I had
11:12:05 hehe. I have a quick NUMA question for AOB
11:12:15 like I said, a short meeting, so
11:12:18 #topic AOB
11:12:24 daveholland: go
11:12:44 Hello
11:12:48 so, we're using NUMA-aware instances for one particular user/project, with extra_specs like hw:cpu_policy='dedicated', hw:cpu_thread_policy='isolate', hw:numa_nodes='2'
11:13:08 it's successful in that they see a performance benefit. Now we're being asked to enable this more widely.
11:13:30 What are people's experiences with mixing NUMA-aware/non-NUMA-aware instances on the same hypervisor?
11:13:39 (should we go "NUMA only"?)
11:13:59 what's your motivation behind using NUMA-aware instances?
11:14:12 consistent, predictable performance?
11:14:12 "make it go faster"
11:14:29 I haven't tried this myself, but my gut feeling is that mixing them will yield unpredictable results
11:14:30 this is a CPU/memory-access-heavy workload
11:14:53 I did something similar before the NUMA-aware days, I had flavors with and without CPU overcommit
11:15:02 these would be tied to different host aggregates
11:15:06 worked well
11:15:22 if I were to implement NUMA I'd probably start with the same
11:15:31 I should clarify, this is in an isolated aggregate, we are looking at 26 or 52 vCPU instances (on a 56 CPU host... 28 physical cores with HT enabled)
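(For reference, a minimal sketch of how a flavor carrying the extra specs quoted above could be defined with the openstack CLI. The flavor name and the vCPU/RAM/disk sizes are placeholders, not the actual flavors discussed in the meeting.)

    # Hypothetical pinned, NUMA-aware flavor; name and sizes are illustrative.
    # The three hw:* properties are the extra specs quoted in the discussion.
    openstack flavor create numa.xlarge --vcpus 26 --ram 196608 --disk 40
    openstack flavor set numa.xlarge \
        --property hw:cpu_policy=dedicated \
        --property hw:cpu_thread_policy=isolate \
        --property hw:numa_nodes=2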
11:15:48 daveholland: mixing the two sounds like a Brave and Exciting move to me
11:16:01 verdurin: that was my initial reaction too
11:16:31 we are considering enabling it for the biggest flavor only (so it would be the only instance on the hypervisor and couldn't trip over anything else)
11:16:46 do you overcommit CPUs on NUMA-aware instances?
11:17:09 (sorry, have to check on the kids)
11:17:11 from your above comment I understand you don't - correct?
11:17:15 janders: not yet. We think it would be a Bad Idea because the CPU pinning + noisy neighbours would make life worse than expected
11:17:50 ok.. I think I'm getting it now
11:18:17 so - the concern is that you'd have enough cores to run both instances without overlapping cores
11:18:38 however, having a mix of NUMA and non-NUMA could somehow cause both VMs to be scheduled onto the same cores?
11:18:44 do I get this right?
11:18:47 janders: yes, I think that sums it up
11:18:59 (I'm using a case with two VMs to make it easier for me to follow)
11:19:47 if we take NUMA out of the picture for a second - if we had two non-NUMA instances on the hypervisor w/o CPU overcommit
11:20:05 is there any chance they would hit the same cores? My guess - no.
11:20:24 I wonder how KVM handles that
11:20:33 From http://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/virt-driver-cpu-thread-pinning.html, I read: If the host does have an SMT architecture (i.e. one or more cores have "thread siblings") then each vCPU will be placed on a different physical core and no vCPUs from other guests will be placed on the same core.
11:20:37 and then the question is - how can NUMA awareness affect that
11:20:43 for a non-overcommit flavor/host/aggregate - I think you are correct (we don't CPU-pin for those flavours as there's no perceived need)
11:20:58 It's not clear if that works only when all instances use hw:cpu_thread_policy=isolate
11:21:25 priteau: thanks, I hadn't seen that spec
11:21:36 (we are on Pike)
11:22:04 what OS and OpenStack "distro"?
11:22:06 daveholland: It's the first hit I got on Google, but I see pretty much the same text is used in the official docs: https://docs.openstack.org/nova/pike/admin/flavors.html
11:22:10 RHOSP12
11:22:20 have you looked at the real-time KVM doco?
11:23:14 I'm not sure if they go down to that level of detail, but I remember the RHATs using these things for their NFV implementations
11:23:27 no (I thought that was more for SR-IOV or NFV, we haven't touched those)
11:23:45 I'll have a quick look out of curiosity
11:24:04 I noticed that during my time at RHAT - and now it's on my TODO list for a bit later in the project
11:25:24 heh, Google tells me there are past summit presentations on RT KVM, I will check them out
11:26:28 daveholland: keep us updated on what you find in a follow-up meeting?
11:26:50 I was hoping that RHAT had a dedicated section on RT KVM in the OSP doco, but unfortunately not
11:26:55 maybe worth a support ticket?
11:27:04 Certainly will. I think we understand most of the machinery (how to pin, what to pin, what thread policy, etc.) - and have had success with a single instance per hypervisor in a separate aggregate - our uncertainty is about mixing this configuration with the vanilla flavors.
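(A rough sketch of the aggregate-based separation janders and daveholland describe, keeping pinned flavors on their own hosts. Aggregate, host and flavor names are placeholders, and it assumes the AggregateInstanceExtraSpecsFilter is enabled in the nova-scheduler filter list.)

    # Hypothetical aggregate for pinned workloads; names are illustrative.
    openstack aggregate create --property pinned=true numa-hosts
    openstack aggregate add host numa-hosts compute-numa-01
    # Steer pinned flavors onto that aggregate and vanilla flavors away from it.
    openstack flavor set numa.xlarge --property aggregate_instance_extra_specs:pinned=true
    openstack flavor set m1.large --property aggregate_instance_extra_specs:pinned=false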
11:27:16 +1 to support ticket, thanks
11:27:30 idea:
11:27:40 say you have 32 pCPUs
11:27:55 spin up 16x 2-vCPU instances, half NUMA-aware, half not
11:28:03 look at the XMLs to see which cores they landed on
11:28:36 daveholland: from a quick look at the code that enforces the isolate policy (nova/virt/hardware.py), I *think* that your non-NUMA instances may be forbidden to execute on the core where a NUMA instance is pinned
11:28:53 using fewer, larger VMs may yield more sensible results, but the chances of running into a scheduling clash are lower
11:29:17 priteau: good stuff!
11:29:52 does the NUMA-aware XML differ much from a vanilla one?
11:30:22 you get to see the CPU mapping AIUI
11:30:27 thanks for the ideas and pointers.
11:30:50 thanks for an interesting question! :)
11:30:57 daveholland: You could enable debug logs and look at what Nova (I assume nova-compute) prints out
11:31:14 It should say something like "Selected cores for pinning: …"
11:32:25 are you using cpu_mode=host-passthrough?
11:32:25 http://git.openstack.org/cgit/openstack/nova/tree/nova/virt/hardware.py?h=stable/pike#n890
11:32:29 OK, I think our best bet is to do some experiments too
11:33:26 we have cpu_mode=host-model (all the hypervisors are identical... currently)
11:34:02 regarding experiments - perhaps it's worth running Linpack with just the NUMA-aware instance running
11:34:09 and then add a non-NUMA-aware one
11:34:16 see if there's much fluctuation
11:35:12 I only tried running Linpack in VMs with CPU passthrough though
11:35:52 I'd think that if there's more overhead in other CPU modes it will be consistent, but it's not like I've tested that..
11:36:02 the future is hazy, but if we agree we don't want migration then host-passthrough is worth a look, yes
11:36:40 I remember needing passthrough for nested virt, too
11:36:48 (OpenStack-on-OpenStack development)
11:37:15 but spot on, all these optimisations can come back to bite us later..
11:43:46 plenty to think about, thanks all
11:44:59 thanks daveholland
11:45:20 please follow up in the channel with an update on this
11:45:25 any other AOB?
11:45:53 Otherwise let me bring this meeting to an end (I must go, unfortunately)
11:46:00 thanks everybody
11:46:03 #endmeeting
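(A sketch of the verification experiment proposed above: after booting a mix of pinned and unpinned instances, inspect the libvirt XML on the hypervisor to see which physical CPUs each guest landed on. The instance name and the nova-compute log path are placeholders and may differ per deployment.)

    # On the hypervisor: list guests and check their CPU placement.
    virsh list --all
    # Pinned guests should show explicit <vcpupin vcpu='N' cpuset='M'/> entries.
    virsh dumpxml instance-0000001a | grep -E 'vcpupin|emulatorpin|numatune'
    # virsh vcpupin with no cpulist argument prints the current vCPU affinity.
    virsh vcpupin instance-0000001a
    # With debug logging enabled, nova-compute should also log the cores it
    # selected for pinning (log path assumed; may differ under RHOSP).
    grep -i "pinning" /var/log/nova/nova-compute.log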