15:10:50 <danpb> #startmeeting libvirt
15:10:51 <openstack> Meeting started Tue Jul 22 15:10:50 2014 UTC and is due to finish in 60 minutes.  The chair is danpb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:10:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:10:54 <openstack> The meeting name has been set to 'libvirt'
15:11:01 <apmelton> o/
15:11:04 <thomasem> o/
15:11:04 <s1rp_> o/
15:11:07 <sew> o/
15:11:15 <dgenin> o/
15:11:38 <danpb> sorry i got side tracked & forgot ...
15:12:29 <thomasem> No worries, it happens. We just had a topic from our team that we'd really value your input on.
15:12:36 <danpb> ok, go for it
15:13:21 <thomasem> So, we're exploring cpu tuning for containers, but, as I'm sure you've seen, /proc/cpuinfo still shows the host's processor info instead of something that better reflects the tuning for the guest, like /proc/meminfo does.
15:13:54 <danpb> ah, so this is a real can of worms you'll wish you'd not raised :-)
15:14:00 <apmelton> haha
15:14:05 <thomasem> Lol, oh my favorite!
15:14:32 <danpb> so for a start containers don't really have any concept of virtualized CPUs
15:14:47 <apmelton> so with cpushares/quota, you still technically have every cpu, but if you've locked down the cpus with cpusets, I believe you would only have the cpus you've been allocated
15:14:53 <apmelton> so you can simulate vcpus with cpusets
15:15:01 <danpb> eg if you tell libvirt  <vcpu>3</vcpu> or <vcpu>8</vcpu> or <vcpu>whatever</vcpu>  it is meaningless
15:15:10 <thomasem> mhmm
15:15:19 <danpb> what containers do give you is the ability to set affinity of the container to the host
15:15:35 <danpb> so you can say  only run this container on host CPUs  n->m
15:15:43 <danpb> which is done with cgroups  cpuset
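A minimal sketch of the mechanism danpb describes here, using the libvirt-python bindings; the domain name, init command, memory size and the 2-3 host-CPU range are all illustrative. For an LXC domain the vcpu count itself has no effect, but the cpuset attribute becomes a cpuset-cgroup affinity mask:

```python
import libvirt  # libvirt-python bindings

# Illustrative LXC domain: the vcpu *count* is effectively meaningless for a
# container, but the cpuset attribute confines it to host CPUs 2-3 via the
# cpuset cgroup controller.
DOMAIN_XML = """
<domain type='lxc'>
  <name>demo-container</name>
  <memory unit='KiB'>524288</memory>
  <vcpu cpuset='2-3'>2</vcpu>
  <os>
    <type>exe</type>
    <init>/bin/sh</init>
  </os>
  <devices>
    <console type='pty'/>
  </devices>
</domain>
"""

conn = libvirt.open('lxc:///')      # connect to the libvirt LXC driver
dom = conn.defineXML(DOMAIN_XML)    # define (but don't start) the container
dom.create()                        # start it; processes inside are now
                                    # restricted to host CPUs 2 and 3
```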
15:16:01 <danpb> the /proc/cpuinfo  file though is really unrelated to CPU affinity masks
15:16:09 <danpb> eg, consider if you ignore containers for a minute
15:16:24 <danpb> and just have a host OS and put apache inside a  cpuset cgroup
15:16:35 <danpb> you then have the exact same scenario wrt /proc/cpuinfo
15:17:13 <danpb> what this all says to me is that applications should basically ignore /proc/cpuinfo as a way to determine how many CPUs they have available
15:17:34 <danpb> they need to look at what they are bound to
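A small illustration of the distinction being drawn here, assuming a Linux host and Python 3: the affinity mask (what the process is actually bound to) is the number an application should consult, not the host-wide CPU count that /proc/cpuinfo reflects.

```python
import os

# What /proc/cpuinfo (and os.cpu_count()) reports: every CPU on the host,
# regardless of any cpuset confinement applied to this process.
host_cpus = os.cpu_count()

# What the process is actually allowed to run on: the affinity mask, which
# does reflect a cpuset cgroup or taskset restriction.
usable_cpus = os.sched_getaffinity(0)   # Linux-only; 0 means "this process"

print(f"host reports {host_cpus} CPUs, "
      f"but this process may only use {len(usable_cpus)}: {sorted(usable_cpus)}")
```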
15:18:08 <thomasem> How would we inform applications of that? Is it common for applications to inspect /proc/cpuinfo for tuning themselves?
15:18:45 <danpb> i don't know to be honest - I've been told some (to remain unnamed) large enterprise software parses stuff in /proc/cpuinfo
15:19:00 <apmelton> heh
15:19:10 <thomasem> hmmm
15:19:19 <danpb> i kind of see this as somewhat of a gap in the Linux ecosystem API
15:19:46 <danpb> nothing is really providing apps a good library API to determine available CPU / RAM
15:20:30 <thomasem> I see where you're coming from
15:20:53 <danpb> i kind of feel the same way about /proc/meminfo - what we hacked up in libvirt is really not container specific - same issue with any app that wants to see "available 'host' memory" which is confined by cgroups memory controller
15:21:15 <sew> so wrt vcpu and flavors for containers, we're considering just setting vcpu to zero for our lxc flavors - does that sound reasonable?
15:21:20 <danpb> so in that sense I (at least partially) regret that we added overriding of /proc/meminfo into libvirt
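A rough sketch of the gap danpb is describing, assuming a cgroup-v1 hierarchy mounted at /sys/fs/cgroup (the common layout at the time; cgroup v2 uses memory.max instead): /proc/meminfo reports the host total, while the memory cgroup holds the limit actually enforced on the process.

```python
def host_mem_total_kib():
    # MemTotal in /proc/meminfo is the host's memory, not ours.
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('MemTotal:'):
                return int(line.split()[1])   # value is reported in kB
    return None

def cgroup_mem_limit_bytes():
    # Find the memory cgroup this process belongs to and read its limit;
    # an effectively unlimited cgroup reports a huge sentinel value.
    with open('/proc/self/cgroup') as f:
        for line in f:
            _, controllers, path = line.strip().split(':', 2)
            if 'memory' in controllers.split(','):
                limit_file = f'/sys/fs/cgroup/memory{path}/memory.limit_in_bytes'
                with open(limit_file) as lf:
                    return int(lf.read())
    return None

print("host MemTotal (kB):", host_mem_total_kib())
print("cgroup memory limit (bytes):", cgroup_mem_limit_bytes())
```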
15:22:05 <danpb> sew: not sure about that actually
15:22:23 <danpb> sew: it depends how things interact with the NUMA/CPU pinning stuff I'm working on
15:22:37 <thomasem> What would you have in place of the /proc/meminfo solution in Libvirt to provide guests a normal way of understanding their capabilities?
15:22:45 <danpb> sew: we're aiming to avoid directly exposing the idea of setting a CPU affinity mask to the user/admin
15:22:53 <apmelton> danpb: working on that in libvirt or nova's libvirt driver?
15:23:00 <danpb> sew: so the flavour would just say  "want exclusive CPU pinning"
15:23:18 <danpb> and then libvirt nova driver would figure out what host  CPUs to pin the guest to
15:23:26 <sew> interesting concept danpb
15:23:31 <danpb> so to be able to do that, we need the vcpus number set to a sensible value
15:23:41 <apmelton> ah, danpb, so we could mimic vcpus for containers with that?
15:23:42 <danpb> just so that we can figure out how many host CPUs to pin the guest to
15:24:10 <danpb> even though when we pass this vcpu value onto libvirt it will be ignored
15:24:30 <danpb> IOW from Nova flavour level, vcpus is still useful even though it isn't useful at libvirt level
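A toy sketch of the placement idea only, not the actual Nova Juno code: the flavour's vcpus count tells the driver how many host CPUs to claim for the guest, even though libvirt LXC will ignore the count itself. The function name and the simple first-fit choice are invented for illustration; the real work also has to account for NUMA topology.

```python
def pick_host_cpus(vcpus, host_cpu_count, already_pinned):
    # Choose `vcpus` host CPUs that no other exclusively-pinned guest holds.
    free = [c for c in range(host_cpu_count) if c not in already_pinned]
    if len(free) < vcpus:
        raise RuntimeError("not enough free host CPUs for exclusive pinning")
    return set(free[:vcpus])

already_pinned = {0, 1, 2, 3}   # claimed by an earlier guest
print(pick_host_cpus(vcpus=2, host_cpu_count=8, already_pinned=already_pinned))
# -> {4, 5}; this set would become the container's cpuset affinity mask
```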
15:24:47 <danpb> apmelton: it is a Juno feature for Nova libvirt driver
15:25:20 <apmelton> ah really, I wasn't aware of that, how do you use it?
15:25:23 <danpb> thomasem: ultimately i think there needs to be some kind of API to more easily query cgroup confinement / resource availability
15:25:51 <danpb> apmelton: big picture is outlined here  https://wiki.openstack.org/wiki/VirtDriverGuestCPUMemoryPlacement
15:25:56 <thomasem> Ah, so a process, whether in a full container guest, or simply under a single cgroup limitation can find its boundaries?
15:26:15 <danpb> thomasem: yeah, pretty much
15:26:26 <thomasem> Hmmm, I wonder how we could pursue that, tbh.
15:26:37 <thomasem> Start a chat on ze mailing list for LXC?
15:26:48 <thomasem> Or perhaps work like that is already underway?
15:26:49 <danpb> with the way systemd is becoming the standard in linux, and the owner of cgroups, it is possible that systemd's DBus APIs might be the way forward
15:27:00 <thomasem> oh okay
15:27:25 <thomasem> interesting
15:27:29 <danpb> but i'm fairly sure there's more that systemd would need to expose in this respect still
15:28:06 <danpb> overall though the current view is that Systemd will be the exclusive owner of all things cgroup related - libvirt and other apps need to talk to systemd to make changes to cgroups config
15:28:15 <sew> systemd does seem like the logical place for all that to happen
15:28:21 <thomasem> gotcha
15:28:36 <thomasem> Something to research and pursue, then.
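A hedged sketch of the direction mentioned above: asking systemd for a unit's resource settings instead of poking cgroupfs directly. systemctl show is just a thin CLI wrapper over the systemd DBus properties; the scope name used here is hypothetical, and which properties are exposed varies by systemd version.

```python
import subprocess

UNIT = "machine-demo.scope"   # hypothetical scope name for a container

# CPUShares and MemoryLimit are standard systemd unit properties of that era.
out = subprocess.check_output(
    ["systemctl", "show", "-p", "CPUShares,MemoryLimit", UNIT],
    universal_newlines=True,
)
print(out)   # prints lines of the form "CPUShares=<value>" / "MemoryLimit=<value>"
```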
15:31:05 <danpb> that all said, if there's a compelling reason for libvirt to fake /proc/cpuinfo  for sake of compatibility we might be able to explore that upstream
15:31:34 <danpb> just that it would really  be a work of pure fiction based solely on the <vcpu> value from the XML that does nothing from a functional POV :-)
15:31:47 <thomasem> Yeah, we'd be lying.
15:31:48 <thomasem> lol
15:32:00 <danpb> for added fun, /proc/cpuinfo is utterly different for each CPU architecture - thanks linux :-(
15:32:02 <apmelton> danpb: wouldn't it be better to base it off cpu pinning?
15:32:05 <thomasem> It's just the question of whether it's better to lie closer to the truth :P
15:32:16 <apmelton> or is that not supported with libvirt-lxc?
15:32:30 <danpb> you can do guest pinning with libvirt lxc
15:33:35 <apmelton> for instance, instead of ignoring the vcpu value in lxc, could libvirt translate that into cpu pins?
15:33:56 <danpb> if the kernel ever introduced a cgroup tunable   "max N number of processes concurrently in running state for scheduler" that would conceptually work for a vcpu value but that's probably not likely to happen
15:34:13 <apmelton> heh
15:34:17 <danpb> apmelton: that would mean the guests were always pinned even when pinning is not requested
15:34:38 <danpb> which is something i'd prefer to avoid since while it works ok as you startup a sequence of guests
15:34:48 <danpb> once you shutdown a few & start some new ones, you end up very unbalanced in placement
15:34:59 <apmelton> yea, that gets complex fast
15:35:15 <danpb> you'd have to have libvirt constantly re-pinning containers to balance things out again
15:36:41 <apmelton> ok, that makes sense
15:39:27 <thomasem> regarding faking /proc/cpuinfo for compatibility, I am not immediately aware of an application use-case that would look for that. Can anyone think of an issue with the guest being able to see the host processor info in general (in a multi-tenant env)?
15:40:00 <danpb> i don't think there's any real information leakage problems
15:41:41 <thomasem> Okay
15:41:44 <dgenin> maybe if you knew that the data you were after was on a particular type of node you could use /proc/cpuinfo to navigate the cloud
15:42:10 <dgenin> just an idea
15:43:41 <danpb> dgenin: the /proc/cpuinfo file is pretty low entropy as far as identifying information is concerned
15:44:05 <danpb> particularly as clouds will involve large pools of identical hardware
15:44:13 <dgenin> true, there are bound to be many nodes with the same cpuinfo
15:44:31 <danpb> there's many other easier ways to identify hosts
15:44:52 <dgenin> what do you have in mind?
15:46:10 <danpb> sysfs exposes host UUIDs :-)
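For example (the path assumes an x86 host exposing DMI data through sysfs; reading it usually requires root, and whether a container sees it at all depends on how sysfs is presented to it):

```python
# Illustrating danpb's point: sysfs carries far more direct identifiers than
# /proc/cpuinfo ever could.
try:
    with open('/sys/class/dmi/id/product_uuid') as f:
        print("host UUID:", f.read().strip())
except (PermissionError, FileNotFoundError) as err:
    print("not readable here:", err)
```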
15:47:08 <dgenin> yeah, but the attacker is not likely to know something so precise
15:47:20 <dgenin> another possibility is that cpuinfo is not sufficient alone, but it could be combined with other identifying information to pigeonhole the node
15:47:47 <danpb> any other agenda items to discuss, or we can call it a wrap
15:48:28 <thomasem> Not from me. I have stuff to think about now. :)
15:48:45 <thomasem> Not like I didn't before, but more now. hehe.
15:49:04 <sew> thx for the background on cpuinfo danpb
15:49:26 <danpb> ok, till next week....
15:49:30 <danpb> #endmeeting