21:00:29 <martial> #startmeeting Scientific-SIG
21:00:30 <openstack> Meeting started Tue Jan 23 21:00:29 2018 UTC and is due to finish in 60 minutes.  The chair is martial. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:33 <openstack> The meeting name has been set to 'scientific_sig'
21:00:37 <martial> #chair oneswig
21:00:38 <openstack> Current chairs: martial oneswig
21:00:43 <martial> #chair b1airo
21:00:44 <openstack> Current chairs: b1airo martial oneswig
21:00:49 <oneswig> #link Today's agenda https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_23rd_2018
21:01:00 <oneswig> greetings
21:01:17 <b1airo> Hi!
21:01:26 <martial> Hello all
21:01:42 <oneswig> Welcome back b1airo, how's the sea legs?
21:02:42 <b1airo> Well, sadly I didn't end up taking the planned sailing trip, but I did do a bit of exploring around the peninsula and some free diving
21:02:49 <yankcrime> o/
21:03:00 <martial> sounds fun nonetheless
21:03:13 <martial> welcome back
21:03:24 <oneswig> Hi Nick!
21:03:27 <b1airo> Was good
21:03:28 <yankcrime> hey oneswig
21:04:07 <oneswig> b1airo: did M3 stay up without you? :-)
21:04:41 <b1airo> Ha! Mostly. It was down briefly for lustre expansion
21:04:56 <oneswig> BTW any word from Nvidia on firmware that doesn't take down the hypervisor?
21:05:17 <b1airo> Security wise?
21:05:53 <b1airo> No, I think pushing that would require a working exploit
21:06:11 <rbudden> hello
21:06:19 <jmlowe> Hey Bob
21:06:19 <oneswig> Hi Bob
21:06:53 <oneswig> OK, shall we get going?
21:07:27 <oneswig> #topic Vancouver summit
21:07:27 <b1airo> And let's face it, the public cloud peeps doing GPUs have more to worry about right now with Spectre and Meltdown still largely unfixed!
21:07:48 <b1airo> I'd like to talk to the OVH folks at some point though
21:07:57 <oneswig> b1airo: A good time to run bare metal (for a change)
21:08:05 <b1airo> Yep!
21:08:11 <martial> So the CFP for talks for the Summit has been posted
21:08:21 <martial> #link https://www.openstack.org/summit/vancouver-2018/summit-categories
21:08:56 <martial> if you have an idea for a talk please enter the requested information before the deadline (Feb 8?)
21:09:16 <oneswig> martial: are you planning to speak this time round?
21:09:45 <martial> #link https://t.e2ma.net/click/qi5vw/62zxny/27cwli
21:09:54 <b1airo> I think I like the category reworking this time around
21:10:04 <martial> #link https://www.openstack.org/summit/vancouver-2018/call-for-presentations/
21:10:29 <martial> oneswig: as in, apart from our usual sessions?
21:10:51 <oneswig> right, be good to hear more about what you're up to!
21:12:49 <oneswig> I think the topic categories all look good; it's better than having University/Research tucked away under "government" (or whatever it was)
21:13:06 <oneswig> somewhat reminiscent of "beware of the leopard"
21:13:23 <b1airo> What about you oneswig ? I'm not sure if I'll get to Vancouver again yet, would certainly like to if only for the beer selection ;-)
21:13:43 <martial> good point, let me see what can be done
21:14:09 <oneswig> b1airo: there's beer?  That's that decided :-)
21:14:50 <jmlowe> I suppose that's a good question: who thinks they can go?
21:14:57 <jmlowe> +1 myself
21:15:46 <b1airo> Probably, but haven't talked about travel budgets for the year yet
21:15:57 <martial> planning to
21:16:01 <oneswig> I am hoping to, and Belmiro and I might put forward a talk again
21:16:13 <b1airo> If I have to choose it'd be Germany
21:16:14 <jmlowe> I think Bob, and Tim Randles have said yes
21:16:41 <jmlowe> SC'18 is a heartbreaker, conflicting with the Berlin summit
21:17:16 <rbudden> I need to get official approval, but the plan is to attend Vancouver
21:17:23 <rbudden> and +1 to jmlowe about Berlin/SC
21:17:31 <b1airo> We may recycle the Cumulus talk proposal, but that's really more a vendor thing and not particularly OpenStack specific
21:19:09 <oneswig> Seems they don't support bare metal as yet; now that would be an interesting combination
21:20:03 <b1airo> Yeah, we have just kicked off a project to migrate the network in our second DC to Cumulus and I'm trying to throw that into the pot
21:21:07 <oneswig> b1airo: let me know what happens
21:21:57 <oneswig> jmlowe: before I forget, I had a question about the LUG meeting at IU.
21:22:08 <jmlowe> oneswig: sure
21:22:10 <oneswig> Did you attend the DDN talk on Lustre for OpenStack?
21:22:12 <b1airo> What special requirements does Ironic have of the network in this regard, oneswig? I kinda thought if their ML2 driver worked then it should be doable...?
21:22:39 <jmlowe> oneswig: I caught one talk, it was lustre for cinder
21:22:42 <oneswig> b1airo: it's about creating access ports for bare metal nodes, rather than (effectively) trunk ports for hypervisor nodes
21:23:10 <oneswig> jmlowe: anything happen on that since?
21:23:10 <b1airo> I.e. "access" ports?
21:23:56 <oneswig> b1airo: the same.  Untagged ports in a specific vlan, that's all.  Not something that ML2 drivers with physical network awareness have to do otherwise.
21:24:19 <jmlowe> oneswig: I'm not sure, it was at the most rudimentary level possible, simply loopback files on a lustre filesystem if I remember correctly
21:24:53 <b1airo> Sounds like it warrants a lab setup
21:24:54 <oneswig> jmlowe: really just a proof-of-concept then, I guess?
21:25:11 <jmlowe> oneswig: nothing that couldn't be accomplished with a simple path manipulation in stock cinder
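(For context on what was being described above: a minimal sketch of "loopback files on a Lustre filesystem" as a volume backing store might look like the following. The MGS address, filesystem name and paths are hypothetical placeholders, not details from the DDN talk.)

    # Illustration only: fsname, paths and sizes are made up
    mount -t lustre mgs01@tcp:/scratch /mnt/lustre
    mkdir -p /mnt/lustre/cinder-volumes
    truncate -s 10G /mnt/lustre/cinder-volumes/volume-0001      # sparse backing file
    losetup -f --show /mnt/lustre/cinder-volumes/volume-0001    # expose it as a block device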
21:25:33 <smcginnis> oneswig, jmlowe: Someone actually has proposed a lustre Cinder driver: https://review.openstack.org/#/c/395572/
21:25:35 <oneswig> b1airo: we've used networking-generic-switch in this role before, might port to Cumulus
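(A rough sketch of how networking-generic-switch is wired in for this access-port role. The switch name, address and credentials are placeholders, and in particular the Cumulus device_type is an assumption: at the time NGS shipped netmiko-based device types such as netmiko_ovs_linux and netmiko_arista_eos, so Cumulus support would be the porting work mentioned above.)

    # /etc/neutron/plugins/ml2/ml2_conf.ini -- sketch only
    [ml2]
    mechanism_drivers = openvswitch,genericswitch

    [genericswitch:leaf-switch-1]
    device_type = netmiko_cumulus    # hypothetical; existing types include netmiko_ovs_linux
    ip = 192.0.2.10
    username = admin
    key_file = /etc/neutron/ssh_keys/leaf-switch-1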
21:25:46 <smcginnis> They just have not followed through with all the requirements to have it accepted yet.
21:25:55 <jmlowe> oneswig: I'm not sure there was even a concept to prove
21:26:36 <oneswig> Thanks smcginnis, useful to know
21:26:44 <b1airo> oneswig: probably, will look into it - also have an email from you somewhere here I didn't get to while on holiday!
21:26:59 <smcginnis> Might be worth pinging them if there is interest to have that available.
21:29:04 <oneswig> seems like they've not progressed it for ~9 months, but perhaps it can be reinvigorated
21:30:37 <oneswig> OK shall we move on?
21:31:02 <oneswig> #topic PTG wish list
21:31:17 <oneswig> I'm hoping to attend the PTG in Dublin in about a month.
21:31:40 <martial> cool
21:31:49 <oneswig> The OpenStack Foundation invited us to book some time in the cross-project sessions for Scientific SIG activities
21:32:06 <oneswig> Right now, we'd like to gather areas of interest
21:32:58 <oneswig> We have preemptible instances, Ironic deploy steps, cinder multi-attach, some other bare metal use cases.
21:33:08 <oneswig> What do people want to see?
21:33:22 <rbudden> I know Tim and I were interested in Ironic diskless boot
21:34:05 <rbudden> Tim more so than me
21:34:08 <b1airo> +1 to jump-starting the preemption spec
21:34:18 <jmlowe> +1 prempt
21:34:39 <oneswig> These are all topics that are underway and active to varying degrees.
21:34:44 <jmlowe> +1 ironic diskless boot (especially if it could be ceph backed)
21:34:53 <oneswig> It's about redoubling the effort to produce something we really need.
21:35:42 <priteau> jmlowe: I believe you can already boot ironic nodes from Cinder volumes (which could be stored on Ceph)
21:35:44 <oneswig> jmlowe: that's an interesting take on it.  I think all discussion to date has been around iSCSI, not RBD
21:36:22 <oneswig> hi priteau, snap :-)
21:36:38 <jmlowe> let's say you wanted to boot a 20k-node Ironic cluster in under 30 min; Ceph's scale-out is probably your only hope
21:37:22 <b1airo> Presumably the initial netboot image could mount an RBD, but probably means you have to be happy to let the bare-metal node have access to your Ceph nodes
21:37:27 <oneswig> jmlowe: would certainly help.  Got some nice graphs earlier this week showing ceph scaling client load better than gpfs
21:37:33 <priteau> I haven't tried this feature (we're on Ocata), I see the doc mentions iSCSI: https://docs.openstack.org/ironic/pike/admin/boot-from-volume.html
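(Roughly, the iSCSI boot-from-volume flow from the Pike documentation linked above looks like this. The node and volume UUIDs and the IQN are placeholders, and this is a sketch rather than a tested recipe.)

    # Node must use the 'cinder' storage interface
    openstack baremetal node set $NODE_UUID --storage-interface cinder
    # Register the node's iSCSI initiator
    openstack baremetal volume connector create --node $NODE_UUID \
        --type iqn --connector-id iqn.2018-01.org.example:node-0
    # Point the node's boot device at an existing Cinder volume
    openstack baremetal volume target create --node $NODE_UUID \
        --type iscsi --boot-index 0 --volume-id $VOLUME_UUID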
21:38:06 <jmlowe> rbudden might be interested in that as well
21:38:14 <oneswig> priteau: one issue is, in jmlowe's example we have 20k nodes = 20k volumes
21:38:26 <oneswig> thin-provisioning - how far would that get you?
21:38:38 <rbudden> I believe what Tim was looking for was more akin to booting a ramdisk
21:39:04 <rbudden> then RO mount of /
21:39:26 <jmlowe> I'm thinking that, relative to thinly provisioned storage, ramdisks move more data across the wire during boot
21:39:36 <rbudden> so one image (as oneswig mentions) instead of thousands of Cinder volumes
21:40:13 <rbudden> I admit that cinder boot from volume solves my problem (as we only have 4 superdomes without disks)
21:40:21 <rbudden> on my todo list to test ;)
21:40:57 <oneswig> only 4 superdomes and you couldn't afford disks for them? :-)
21:41:25 <priteau> oneswig: good point, I didn't think about that. But could you boot all nodes from the same read-only volume, and then use overlayfs for writes?
21:42:30 <rbudden> oneswig: no place for local drives, was easier/cheaper to just netboot them ;)
21:42:33 <oneswig> priteau: I think there's something in that.  Currently, the Cinder multi-attach work doesn't cover this use case or enable caching of data for a volume that is (and always will be) read-only everywhere, as far as I understand
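(A minimal sketch of the shared read-only root plus overlayfs idea, as it might run from an initramfs. The device path is a placeholder and this assumes the shared volume has already been attached to the node.)

    ROOT_DEV=/dev/disk/by-path/<shared-volume>   # placeholder for the attached volume
    mount -o ro $ROOT_DEV /mnt/lower             # shared, read-only lower layer
    mount -t tmpfs tmpfs /mnt/rw                 # per-node writable layer in RAM
    mkdir -p /mnt/rw/upper /mnt/rw/work
    mount -t overlay overlay \
        -o lowerdir=/mnt/lower,upperdir=/mnt/rw/upper,workdir=/mnt/rw/work /newroot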
21:43:26 <priteau> Tim's use case of booting into a ramdisk is useful anyway; a cloud may not have Cinder, or not a performant / HA one
21:43:49 <oneswig> There was also reference to using kexec to cut boot time, right?
21:44:10 <rbudden> priteau: agreed. I’d love to have all nodes in Ironic instead of 4 stragglers done manually
21:44:43 <rbudden> oneswig: +1, kexec would be great since reboot times on HPC nodes are a killer
21:45:24 <rbudden> half the reason we manage software stacks via Puppet instead of an Ironic reimage is the reimage time, given the multiple reboots necessary
21:45:27 <priteau> oneswig: that I would like for Chameleon, it could cut our (frequent) deployment time by almost half
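(For reference, the kexec shortcut being discussed, loading and jumping straight into a new kernel without going back through firmware and POST, is roughly the following; kernel and initramfs paths are placeholders.)

    # Stage the new kernel and reuse the current kernel command line
    kexec -l /boot/vmlinuz-<version> --initrd=/boot/initramfs-<version>.img --reuse-cmdline
    # Jump into it, skipping the firmware/POST cycle
    systemctl kexec    # or: kexec -e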
21:45:28 <jmlowe> what, you don't want to POST a couple of TB of RAM or something?
21:45:55 <rbudden> well, the SuperDomes, heh… yeah, that's even worse ;)
21:46:06 <oneswig> Any other items for the wish list?
21:46:45 <jmlowe> these releases are getting downright boring; we almost have a fully functional cloud operating system
21:47:05 <oneswig> steady on...
21:48:39 <martial> jmlowe :)
21:48:51 <oneswig> OK, ok, I'll add "fully functional cloud OS" to the list
21:49:33 <oneswig> #topic AOB
21:49:43 <b1airo> Where did ttx's release cycle thread end up? I lost track
21:50:01 <oneswig> Longest thread in history, I believe
21:50:10 <jmlowe> 5 weeks to queens!
21:50:16 <b1airo> Any popular proposal found for board consideration?
21:51:10 <oneswig> didn't follow it to conclusion, alas
21:51:49 <b1airo> I'll go looking for a dumpster fire in my inbox... ;-)
21:53:37 <oneswig> b1airo: might be in among occasional references to something called "meltdown"...
21:54:25 <b1airo> I have to say I'm still amazed at the complete and utter debacle of Meltdown et al
21:56:11 <b1airo> Has anyone else been watching Linus flame Intel devs?
21:56:30 <oneswig> Just saw something along those lines...
21:57:30 <b1airo> Classic tech industry case study: "This is why we can't have nice things"
21:58:08 <b1airo> "(also AI will murder us all - in our sleep if we're lucky)"
21:58:28 <oneswig> on that happy note, we are almost out of time
21:58:31 <martial> +1 on both
21:58:41 <oneswig> to the end of the meeting, not AI-fuelled armageddon...
21:59:01 <b1airo> ;-)
21:59:25 <b1airo> ttfn!
21:59:30 <martial> sure :)
22:00:09 <oneswig> Thanks all, until next time
22:00:15 <oneswig> #endmeeting