09:00:40 <oneswig> #startmeeting scientific_wg
09:00:42 <openstack> Meeting started Wed Jul  6 09:00:40 2016 UTC and is due to finish in 60 minutes.  The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:43 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:46 <openstack> The meeting name has been set to 'scientific_wg'
09:01:24 <oneswig> #link agenda for today (such as it is) https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_July_6th_2016
09:01:33 <oneswig> Hello ...
09:01:39 <oneswig> #topic roll call
09:01:48 <oneswig> Anyone remember the new time?
09:02:06 <ptrlv> Yes, good morning. Peter here.
09:02:10 <priteau> Good morning oneswig
09:02:33 <oneswig> Greetings!
09:03:20 <oneswig> I had a message from Blair, he's likely to be joining late
09:03:34 <daveh> good morning, this is a much more civilised time for us UK people :)
09:04:04 <oneswig> Great, more the merrier
09:04:19 <james_> Hi james@sanger
09:05:46 <oneswig> OK, let's get going
09:06:35 <oneswig> I just messaged with bauzas and hopefully he'll be joining later to talk Blazar, let's defer discussion on that if we can
09:07:13 <oneswig> #topic Bare metal
09:07:35 <oneswig> Do we have anyone else with an interest here (apart from me)?
09:07:54 <daveh> interested but no experience yet
09:08:03 <apdibbo> same here
09:08:23 <ptrlv> the previous minutes linked to a youtube vid
09:08:24 <daveh> (other than using RH Director to deploy overclouds)
09:08:32 <priteau> A lot of interest for Chameleon obviously
09:08:43 <ptrlv> can you summarise that vid quickly?
09:08:51 <dariov> Hello!
09:08:53 <priteau> and experience as well
09:09:22 <oneswig> ptrlv: can you remind me of the link... was it the chameleon talk?
09:09:35 <ptrlv> yes
09:09:36 <oneswig> dariov: hi, welcome
09:09:48 <ptrlv> http://eavesdrop.openstack.org/meetings/scientific_wg/2016/scientific_wg.2016-06-28-21.01.html
09:10:14 <oneswig> Ah, right, Kate Keahey's talk from the summit
09:11:09 <oneswig> Some of the interesting follow-up I heard from that talk concerned the issues with scheduling access (Blazar) and with managing bare metal serial consoles.
09:11:51 <oneswig> priteau: do you know if there was any progress on serial consoles?  I still don't have them managed through Ironic on our deployment
09:12:57 <priteau> oneswig: I saw a little activity on the Gerrit topics in the past two weeks, but not much. On our side we just shipped new features for Chameleon and will now get back to evaluate serial console
09:13:44 <priteau> I may have more information by the end of July, once we are able to test the patches proposed on Gerrit
09:13:56 <verdurin> Sorry I'm late - Central Line ate my homework
09:14:05 <oneswig> priteau: OK thanks I'll keep tracking that myself, would hope to have some more experience to report in about the same timeframe
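For reference, enabling a bare metal serial console through Ironic generally means pointing the node's driver_info at a free terminal port on the conductor and switching console mode on. A minimal sketch with python-ironicclient follows; the node UUID, port number and the authenticated keystoneauth session ('sess') are placeholders, and this shows the generic ipmitool/shellinabox path rather than anything pxe_drac-specific:

    from ironicclient import client as ironic_client

    # 'sess' is assumed to be an authenticated keystoneauth1 session
    ironic = ironic_client.get_client(1, session=sess)

    # Point the console at a free TCP port on the conductor (port number is illustrative)
    patch = [{'op': 'add', 'path': '/driver_info/ipmi_terminal_port', 'value': 8023}]
    ironic.node.update('NODE_UUID', patch)

    ironic.node.set_console_mode('NODE_UUID', True)   # enable the shellinabox console
    print(ironic.node.get_console('NODE_UUID'))       # console URL once enabled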
09:14:15 <oneswig> verdurin: hi!
09:15:34 <oneswig> We've had some curious problems this week with bare metal and Dell kit that might be worth sharing
09:16:39 <oneswig> Using RH director (i.e. TripleO), the Ironic Python Agent ramdisk was unable to cope with the long start time of the RAID
09:17:06 <oneswig> Also have some projects underway to configure Dell BIOS and RAID ... via Ansible...
09:18:00 <oneswig> Also came across a really excellent tool called cardiff for identifying discrepancies in Ironic introspection data
09:18:03 <verdurin> oneswig: if you can share those, perhaps they can be generalized e.g. to Lenovo too?
09:18:33 <priteau> We would love to learn more about your experience with BIOS config
09:18:42 <oneswig> verdurin: it's using the iDRAC interface, I suspect that's unique to Dell alas?
09:19:04 <apdibbo> could it be done using the redfish interfaces?
09:19:11 <priteau> oneswig: Are you using the PXE DRAC Driver?
09:19:18 <ptrlv> oneswig: still useful given we likely procure Dell stuff
09:19:38 <oneswig> priteau: share and enjoy!  Will keep you updated.  Yes we (sometimes) use pxe_drac but it's not solid
09:20:17 <verdurin> Sadly redfish support is embryonic at best for many vendors
09:20:25 <oneswig> apdibbo: Is that the interface for OCP?  Sounds good to me.  I'm building the modules on Ironic's python-dracclient but they ought to be transferrable
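As a rough illustration of the kind of module being built on python-dracclient, the library exposes iDRAC BIOS settings directly. The iDRAC address, credentials and the 'SysProfile' setting below are placeholders, and the exact return keys may differ between releases:

    from dracclient import client as drac_client

    # Placeholder iDRAC address and credentials
    drac = drac_client.DRACClient('10.0.0.42', 'root', 'calvin')

    for name, setting in drac.list_bios_settings().items():
        print(name, setting.current_value)

    # Stage a BIOS change, then commit it as a config job (the node reboots to apply)
    result = drac.set_bios_settings({'SysProfile': 'PerfOptimized'})
    if result.get('commit_required'):
        drac.commit_pending_bios_changes()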
09:20:46 <priteau> What's the current link to cardiff? I found this with Google, but the GitHub link is a 404: https://www.redhat.com/archives/rdo-list/2015-April/msg00188.html
09:20:59 <oneswig> I think it's part of python-hardware
09:21:36 <apdibbo> oneswig: we are using the redfish interface to collect some info, it was initially from OCP but has been picked up pretty well by Dell and Supermicro
09:21:49 <oneswig> then it's hardware-cardiff
09:22:14 <oneswig> apdibbo: thanks I'll look out for that, is it also possible to modify settings through redfish on those bioses?
09:22:43 <apdibbo> oneswig: I believe so
09:23:00 <oneswig> apdibbo: Interesting, will check
09:23:14 <oneswig> AOB for bare metal?
09:23:20 <priteau> #link https://github.com/redhat-cip/hardware
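cardiff itself lives in that repository. Purely to illustrate the idea (this is not cardiff's actual interface), the sketch below compares the flat (category, item, attribute, value) tuples that hardware-detect / Ironic introspection emit and prints any attribute on which nodes disagree:

    # Not cardiff itself -- just the idea: flag attributes that differ across nodes.
    import json
    import sys
    from collections import defaultdict

    values = defaultdict(dict)          # (category, item, attribute) -> {file: value}
    for path in sys.argv[1:]:           # one introspection JSON dump per node
        with open(path) as f:
            for category, item, attribute, value in json.load(f):
                values[(category, item, attribute)][path] = value

    for key, per_node in sorted(values.items()):
        if len(set(per_node.values())) > 1:
            print('DISCREPANCY', key, per_node)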
09:24:17 <oneswig> OK let's roll on
09:24:24 <oneswig> #topic parallel filesystems
09:25:10 <oneswig> I don't have anything myself here.  We intend to connect our deployment to a new (external) lustre store later today.  The two have not met yet.
09:26:09 <b1airo> evening
09:26:10 <daveh> we have not connected OpenStack and Lustre yet either; concerns over security, i.e. no multi-tenancy in Lustre (same concerns as accessing NFS from instances)
09:26:16 <oneswig> Hi b1airo
09:26:19 <oneswig> #chair b1airo
09:26:20 <openstack> Current chairs: b1airo oneswig
09:27:02 <b1airo> really need to get my irc bouncer setup again - missing the history
09:27:24 <b1airo> speaking of HPFS by the sound of it?
09:27:25 <oneswig> daveh: Right.  Our policy is to put the filesystem on a provider network visible only to the project using it
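A minimal sketch of that policy with python-neutronclient: create the provider network unshared, then grant access to a single project via an RBAC policy. The network name, physical network, VLAN ID and project ID are placeholders, and 'sess' is assumed to be an authenticated keystoneauth session:

    from neutronclient.v2_0 import client as neutron_client

    neutron = neutron_client.Client(session=sess)

    # Provider network carrying the filesystem VLAN
    net = neutron.create_network({'network': {
        'name': 'lustre-net',
        'provider:network_type': 'vlan',
        'provider:physical_network': 'datacentre',
        'provider:segmentation_id': 1234,
        'shared': False,
    }})['network']

    # Grant exactly one project access rather than making the network global
    neutron.create_rbac_policy({'rbac_policy': {
        'object_type': 'network',
        'object_id': net['id'],
        'action': 'access_as_shared',
        'target_tenant': 'PROJECT_ID',
    }})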
09:27:38 <b1airo> oneswig, ours too
09:27:49 <daveh> so... each filesystem is used by a single project? we have shared filesystems which makes life... more interesting
09:27:50 <aloga> hi sorry for being late
09:27:51 <oneswig> b1airo: yes, just covered bare metal and cooling down with hpfs :-)
09:28:09 <aloga> on a colliding meeting
09:28:14 <oneswig> daveh: agreed - not my area alas
09:28:21 <oneswig> aloga: hi, thanks for coming
09:28:25 <aloga> FYI we are using GPFS
09:28:32 <aloga> with some nasty and ugly workarounds
09:28:39 <oneswig> aloga: In OpenStack?  Can you share the config?
09:28:58 <ptrlv> has anyone got their hands dirty with AWS EFS announce recently?
09:29:01 <aloga> oneswig: we're running Sun Grid Engine worker nodes as OpenStack instances
09:29:11 <b1airo> this is in Indigo aloga ?
09:29:15 <aloga> b1airo: nope
09:29:24 <aloga> b1airo: this is internally at our CSIC infrastructure
09:29:28 <dariov> ptrlv, planning to do so in the future
09:29:30 <oneswig> ptrlv: not on my radar unfortunately
09:29:46 <verdurin> experimentally we've tried a similar per-project provider network isolation approach with GPFS
09:29:47 <ptrlv> #link https://aws.amazon.com/blogs/aws/amazon-elastic-file-system-shared-file-storage-for-amazon-ec2/
09:29:53 <aloga> b1airo: we are a scientific datacenter: WLCG, astrophysics, bioinformatics, engineering, etc.
09:30:03 <dariov> ptrlv: hopefully by september I think
09:30:07 <aloga> b1airo: we have two different infrastructures, an HPC one and an HTC one
09:30:19 <aloga> the HTC one has been running on top of OpenStack for a long time
09:30:32 <aloga> with access to our GPFS cluster
09:30:56 <oneswig> aloga: Do the two infrastructures share a common gpfs filesystem?
09:31:20 <aloga> oneswig: no
09:31:31 <aloga> oneswig: the HPC one goes over infiniband
09:31:48 <oneswig> aloga: is there a need to share datasets between them, if so how?
09:32:22 <aloga> oneswig: no, they are separated, as they have different entry policies
09:32:42 <b1airo> verdurin, you said experimentally - did it work out?
09:34:52 <oneswig> b1airo: perhaps he's gone on the central line again? :-)
09:35:13 <oneswig> (perhaps you missed that bit...)
09:35:16 <b1airo> oh, flakely connection?
09:35:31 <b1airo> "flakely"... what?
09:35:35 <b1airo> :-)
09:35:36 <oneswig> homework excuse!
09:36:14 <b1airo> so aloga, is your hpc using openstack too?
09:36:14 * bauzas waves
09:36:20 <oneswig> Shall we move on?
09:36:22 <oneswig> Hi bauzas
09:36:29 <aloga> b1airo: nope
09:36:37 <oneswig> thanks for joining
09:36:55 <oneswig> #topic accounting and scheduling
09:37:01 <aloga> b1airo: although there are plans in sight :)
09:37:31 <oneswig> We have had some interesting discussions recently
09:37:45 <oneswig> and they have often included Blazar for resource reservation management
09:38:22 <oneswig> Blue sky stuff: sometimes about how it might (in theory) be combined with preemptible instances (OPIE) to good effect
09:38:36 <verdurin> b1airo: it works in testing, we're reluctant to roll out more widely yet
09:38:46 <bauzas> FWIW, Blazar would need a certain amount of effort for catching-up with the latest Nova API
09:39:00 <b1airo> aloga, sounds good - perhaps an announcement when the summit comes to you (though i guess you are in madrid?)
09:39:47 <aloga> b1airo: actually 400km north :-P
09:39:52 <aloga> b1airo: Santander, over the sea
09:40:00 <oneswig> priteau: did you have any success with moving your patches upstream for blazar?
09:40:45 <oneswig> bauzas: how far out of date do you think it is?
09:40:45 <priteau> oneswig: I haven't had the time to push any yet, but I am working on this now
09:41:43 <bauzas> oneswig: so, there are 2 backends
09:42:03 <bauzas> oneswig: one is basically using aggregates, and those haven't really changed
09:42:24 <bauzas> oneswig: that's the backend you use when you want to do a full host reservation
09:42:42 <bauzas> oneswig: the problem is with the other backend, used for instance reservation
09:43:13 <bauzas> oneswig: it was using an old compute v2.0 extension mechanism that has been deprecated
09:43:32 <priteau> (in Chameleon we use the first backend)
09:43:36 <oneswig> priteau: is this problem familiar to you and is it part of your changes?
09:43:36 <bauzas> so, tbh, I see the functionality broken
09:44:18 <bauzas> oneswig: one way to solve the problem could be to deprecate that Blazar API resource, and only allow to reserve hosts
09:44:35 <priteau> oneswig: We are not using the instance reservation feature at all, and haven't made any fixes to it
09:44:44 <bauzas> particularly if the Chameleon project isn't really using it
09:45:05 <oneswig> bauzas: so for bare metal hosts we'd use the first backend only, and that's effectively partitioning off an entire nova compute node in the virtualised case?
09:45:22 <priteau> bauzas: Do you know if OPNFV is using or intending to use instance reservation?
09:45:43 <bauzas> oneswig: I'd say reserving a host would make more sense from a baremetal PoV
09:46:01 <bauzas> even if the Ironic driver conceptualizes that differently
09:46:28 <bauzas> priteau: I haven't heard about OPNFV intending to use Blazar yet
09:46:43 <b1airo> i think the instance reservation use-case would be the most popular generally
09:47:04 <b1airo> i.e., for general purpose science clouds
09:47:51 <bauzas> that's a valid concern
09:47:56 <b1airo> there is often a need to provide some fraction of the resource as guaranteed capacity to projects doing time blocked things like training courses and so forth
09:48:09 <bauzas> I'm just expressing the point that I'm pretty sure it won't work with the latest Nova release :)
09:48:18 <oneswig> bauzas: there was some discussion on the constraint that Blazar effectively manages its own partitioned group of nova computes.  I think I'm getting to understand the issue behind that now but is there a way around it so that we might be able to get higher utilisation?
09:48:50 <bauzas> oneswig: well, the problem is that you're in a cloud, right?
09:48:56 <b1airo> at the moment we achieve this in NeCTAR by manually manipulating host aggregates, but that really sucks because it puts an admin in the critical path and means resources are usually held idle for a long time
09:49:03 <bauzas> oneswig: so you can't really control which workloads will run in the future
09:49:37 <b1airo> bauzas, yes understood re. api compatibility, thanks for raising that
09:49:44 <bauzas> oneswig: that's why the host reservation backend used aggregates, to isolate a full group of hosts we could control
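For context, a reservation against that aggregate-backed host backend is just a Blazar lease containing a physical:host reservation. A rough sketch with python-blazarclient follows, where 'blazar' is assumed to be an already-authenticated client and the dates, counts and (empty) property filters are placeholders:

    # Hypothetical lease: reserve two whole hosts for a fixed window
    lease = blazar.lease.create(
        name='reserved-hosts-demo',
        start='2016-07-11 09:00',
        end='2016-07-11 17:00',
        reservations=[{
            'resource_type': 'physical:host',   # the aggregate-backed host backend
            'min': 2,
            'max': 2,
            'hypervisor_properties': '',
            'resource_properties': '',
        }],
        events=[],
    )
    print(lease['reservations'])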
09:50:06 <bauzas> that said, the instance reservation backend was working totally differently
09:50:20 <bauzas> without really isolating groups of hosts
09:50:23 <oneswig> bauzas: right, but it reminds me of other circumstances where a block of resource gets scheduled on demand, somehow we get there...
09:50:59 <bauzas> the instance reservation backend was basically spawning an instance, and then shelving it
09:51:25 <bauzas> and then unshelving it on purpose (when the lease started)
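In other words, the instance-reservation backend leaned on the normal Nova shelve lifecycle. A bare sketch of those calls with python-novaclient, where the image and flavor IDs and the keystoneauth session 'sess' are placeholders:

    from novaclient import client as nova_client

    nova = nova_client.Client('2', session=sess)

    server = nova.servers.create(name='reserved-slot', image='IMAGE_ID', flavor='FLAVOR_ID')
    nova.servers.shelve(server)      # park the instance until the lease starts
    # ... when the lease begins ...
    nova.servers.unshelve(server)    # after a shelve-offload this reschedules, with no capacity guarantee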
09:52:05 <b1airo> oh wow, and that worked?
09:53:09 <oneswig> bauzas: there's been some discussion previously on the possibility of alternatively preempting running instances according to some policy to make way for a resource reservation.  Any hope for that?
09:53:23 <b1airo> seems like there must be some overlap between the scheduler mechanism opie is proposing for preemptible instances and what would be needed for reservations
09:53:43 <dariov> bauzas: but that would lock all the resources even when the instance is shelved, right?
09:53:46 <bauzas> b1airo: well, the problem with shelve offload is that nova frees up the resource usage
09:54:01 <bauzas> b1airo: that's not a problem for Nova, but that's a problem for Blazar
09:54:39 <bauzas> b1airo: that means that when you unshelve the instance, you would then reschedule the instance to another host
09:54:55 <bauzas> with no guarantee that you'll have enough capacity by this time
09:55:30 <bauzas> dariov: resources are locked only when you shelve an instance, not when you shelve offload it
09:56:08 <bauzas> dariov: you could say "easy, let's just shelve the instance without offloading it"
09:56:22 <b1airo> may as well just suspend it then
09:56:44 <bauzas> dariov: but then, the host would need to support the added capacity for the instance that is going to be used only in 1 month
09:57:22 <dariov> bauzas: yep, that was my concern
09:57:25 <bauzas> dariov: suspend doesn't really differ from the active state, except that your VM is paused
09:57:59 <b1airo> i think we can agree shelving might have been a convenient hack, but it's not an adequate solution to this problem
09:58:16 <bauzas> that's why I'm particularly thinking that it's kinda necessary to allocate a certain amount of resources (hosts or whatever else) for Blazar
09:58:36 <aloga> yes, but even if you reserve those resources for Blazar, they are going to be underutilized
09:58:45 <bauzas> because you need to control and guarantee that you'll be able to spawn the instances when the lease will start
09:58:57 <aloga> as there will be periods when the hosts will be empty, right?
09:58:58 <bauzas> aloga: I agree
09:59:23 <oneswig> Unfortunately we are out of time
09:59:23 <bauzas> aloga: well, of course, that depends on your lease management system
09:59:37 <b1airo> bauzas, for us i think if we could backfill those hosts with preemptible workload we'd be happy for a while, but ultimately i think both reservations and preemption are things we'd like to see built into nova scheduler
09:59:45 <oneswig> bauzas: any thoughts on that?
09:59:46 <dariov> bauzas: that’s where preemptible instances come to the rescue
09:59:51 <aloga> b1airo: +1
10:00:34 <aloga> we struggle to get our resources near 100% utilization
10:00:39 <bauzas> b1airo: well, that's an open point that can't be solved by the time we have :)
10:00:48 <b1airo> interesting food for thought, but doing anything with nova scheduler has seemed fraught for a few cycles at least
10:00:57 <oneswig> Sorry all, we must release the channel - perhaps shelve the discussion...
10:01:07 <b1airo> nice one oneswig
10:01:08 <aloga> shelve offload or just shelve? :P
10:01:18 <oneswig> Thanks all!  Until next time
10:01:23 <oneswig> #endmeeting