21:00:26 #startmeeting scientific-sig
21:00:27 Meeting started Tue Apr 17 21:00:26 2018 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:30 The meeting name has been set to 'scientific_sig'
21:00:40 hello
21:00:46 #link agenda for today - https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_April_17th_2018
21:00:53 greetings trandles
21:01:02 see my mail?
21:01:19 just now, thx
21:01:36 I'll give it a read and provide feedback ASAP (after this meeting of course ;) )
21:01:38 excellent! Michael did a great job last week
21:02:03 I saw you've got some contributions from the Sanger Centre in the UK - are they active users?
21:03:13 Yeah, I think Sanger is (Matthew Vernon perhaps?)
21:03:14 I liked your paper from FOSDEM, very useful and informative.
21:03:32 How was the conference?
21:03:32 Oh, that's the paper from SC17 btw, I didn't realize FOSDEM republished it
21:03:40 sorry I am late
21:03:48 Hey martial, no problem!
21:03:52 #chair martial
21:03:53 Current chairs: martial oneswig
21:03:59 Wish I could have attended FOSDEM but alas...
21:04:00 It's just us and trandles currently
21:04:25 trandles: was Charliecloud presented by Reid at FOSDEM, or nobody?
21:04:27 that's cool
21:04:46 nobody that I know of
21:04:57 I just sent Reid the FOSDEM link
21:05:00 Tim, do you know if Bob and Mike are going to SC18 or OpenStack summit Berlin?
21:05:11 I do not know
21:05:24 I think that's up in the air at the moment
21:05:35 I will reach out to them, and offer to redo the OpenStack BoF
21:06:01 for SC18
21:06:12 I'd be going to Berlin OpenStack instead of SC18
21:06:16 oneswig: It looks like someone named Georg Rath gave a scientific containers overview and provided several papers
21:06:18 https://fosdem.org/2018/schedule/event/containers_scientific/
21:06:31 it is very unfortunate that both are exactly at the same time
21:06:44 I'd much rather go to Berlin but I have other program requirements for attending SC18 :(
21:07:07 trandles: handy link, thanks - I'll compare notes!
21:07:21 martial: is it ok if I send you something small via USPS for you to take to Vancouver on my behalf?
21:07:41 yes of course
21:08:08 I will send you my address
21:08:20 cool, thx
21:09:08 trandles: got some interest over here in the boot-to-ramdisk idea.
21:09:26 the more the merrier
21:09:32 One of the team is looking at ways of updating the RAID controller firmware - it's boot-to-ramdisk or boot-from-volume
21:09:58 I have to admit to having a tab open to the StoryBoard entry and another to the Ironic spec wiki page and I haven't had a chance to actually work on it yet
21:10:37 No problem. I'm on a train tomorrow & Thursday - I could get a first draft up if you're snowed under.
21:11:12 snowed?
21:11:17 is there an easy way to collaborate on the spec? Maybe once it's been submitted?
21:12:00 Absolutely - you commit, I commit, it's tennis...
21:12:03 Since we have few people here, this will have less impact (will post it on the forum/ML too), but Bob asked me to share: Workshop on Container-based systems for Big data, Distributed and Parallel computing (CBDP'2018) https://sites.google.com/view/cbdp18/home Co-located with Euro-Par 2018 http://europar2018.org August 27-28 2018, Torino, Italy
21:12:36 trandles: the main thing is to keep amending the same commit (and making sure we don't tread on each other's toes)
21:12:50 martial: looks interesting
21:13:03 Turin is a lovely setting :-)
21:14:21 Another trip I wish I could take. Lucy and I are heading to Singapore and Thailand for two weeks, departing September 3, so Turin the week before is out.
21:14:51 unless you leave from Italy :)
21:14:57 Some people have all the fun.
21:15:29 So, oneswig, what do you have in the way of HPCAC round-up other than the container write-up?
21:15:29 I think I'll be camping that weekend :-)
21:15:47 Another good year for the conference
21:16:03 Plenty of focus this year on containerisation (hence the writeup)
21:16:19 #link http://www.stackhpc.com/the-state-of-hpc-containers.html
21:16:39 But a couple of interesting talks on storage and AI as well.
21:17:04 pretty cool too
21:17:10 There's a huge focus on applying HPC techniques to AI, and a study published by Baidu in particular seems to attract a lot of one-upmanship
21:18:07 A new form factor for NVMe was described - the "ruler" - up to 32 TB on something that looks like (you guessed it) a ruler
21:18:56 it takes up 4x PCIe lanes. An AMD EPYC has 128 PCIe lanes => up to 1PB of NVMe in 1U (but no PCIe left for high-speed networking...)
21:19:30 A new blog post from oneswig, nice!
21:19:49 Hi everyone
21:19:49 There was another interesting talk on high-density compute for exascale - liquid cooled
21:20:06 Is RDMA to NVMe available in virtual machines? Do any hypervisors support it?
21:20:12 Hi priteau, welcome! Yep that's what I've been working on the last couple of days :-)
21:20:53 Hey Pierre, welcome :)
21:21:02 I will read it tomorrow while drinking my morning coffee ;-)
21:21:05 trandles: depends on what level the NVMe is exposed. I'd do it through an SR-IOV RDMA NIC in the instance. The alternative would be NVMe-over-fabrics into the hypervisor, then p155ed up the wall by QEMU
21:21:30 Sorry I can't stay longer for this meeting, but will read the logs
21:21:44 The issue with the former approach would be exposing a storage network to tenants (potentially)
21:22:21 yeah...but it makes me think you could provide burst buffer-like capabilities to VM-based applications
21:22:54 There's a huge, huge potential being opened up around that.
21:24:05 trandles: what's the news on OpenHPC & Charliecloud? I looked and saw the PR but there's nothing in the package repos yet
21:24:11 it's another barrier being knocked down for HPC-scale cloud-based scientific workloads
21:24:31 So there was a review today for Charliecloud inclusion in OpenHPC
21:24:43 I had been told, erroneously, that it was already approved.
21:25:07 The review surfaced some questions. Reid and Michael participated on the call and I haven't had a chance to debrief them
21:25:37 Ah - so not just yet?
21:26:30 The 20 second summary is that they think it's a go, but the issue hasn't been updated one way or the other in git
21:27:03 The user namespace stuff seems to be a concern, in that it exposes new kernel code paths for probing from unprivileged users.
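(Aside, for readers following the user-namespace thread: Charliecloud's runtime is written in C, but the kernel interface under discussion is easy to illustrate. Below is a minimal, hypothetical Python/ctypes sketch of an unprivileged process creating a user namespace plus a mount namespace and mapping its own UID/GID to root inside it - this is the code path the reviewers were concerned about, and it is not Charliecloud's actual implementation.)

# Illustrative sketch only: roughly the unprivileged user + mount namespace
# setup pattern discussed above. Requires a Linux kernel with unprivileged
# user namespaces enabled.
import ctypes
import os

CLONE_NEWNS = 0x00020000    # mount namespace
CLONE_NEWUSER = 0x10000000  # user namespace

libc = ctypes.CDLL(None, use_errno=True)
uid, gid = os.getuid(), os.getgid()

# Create the new namespaces without any privilege.
if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0:
    raise OSError(ctypes.get_errno(), "unshare failed")

# Map our real UID/GID to root inside the namespace. setgroups must be
# denied before an unprivileged process may write gid_map.
with open("/proc/self/setgroups", "w") as f:
    f.write("deny")
with open("/proc/self/uid_map", "w") as f:
    f.write(f"0 {uid} 1")
with open("/proc/self/gid_map", "w") as f:
    f.write(f"0 {gid} 1")

print("uid in namespace:", os.getuid())  # reports 0, with no real privilege outside

(The uid_map/gid_map writes are what make the "fake root" usable for setting up an image's mount tree, while the kernel still treats the process as the original unprivileged user outside the namespace.)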
What's the view on that - are the low-hanging issues all found and fixed to LANL's satisfaction?
21:27:26 We believe so.
21:27:33 we do mitigate some things
21:28:04 a lot of the recent CVEs around user namespaces haven't been in user namespaces themselves, but user namespaces made it easier to attack another vulnerability
21:28:35 so we disable namespaces not required for Charliecloud (i.e. everything but user and mount)
21:29:25 Makes good sense but perhaps not for the general use case (or OpenHPC)?
21:29:41 the biggest one being network...but network is useless for our unprivileged use case because you still need real root privs to add the required network interface to the namespace to make it usable
21:30:33 you can't disable namespaces on the Cray kernel btw...
21:30:41 Right - I saw that - unprivileged containers work for this use case but not for most others
21:30:52 Cray let you rebuild the CLE kernel??
21:30:59 no, don't have to
21:31:03 they're shipping 4.4
21:31:36 Ah OK - so this isn't about unsetting CONFIG_*_NS then?
21:32:13 we'd have to recompile and do that, unset the NS stuff, or get rid of the unshare system call, patch clone, etc...
21:32:45 namespaces are pretty deeply embedded in the 4.x kernels
21:33:09 Well you wouldn't want a half measure
21:33:58 the RHEL-based and upstream (in 4.something...) kernels include a /proc/ interface for specifying the allowed number of each namespace
21:34:32 Was there something about nested depth as well?
21:35:00 not sure about that one
21:35:10 look in /proc/sys/user/ btw...
21:35:28 trandles: I'm on a Mac
21:35:31 :-)
21:35:52 when you get the chance ;)
21:36:20 whew, finally caught one here after the change to daylight savings
21:36:26 Does OpenHPC already package either Shifter or Singularity?
21:36:31 Hi jmlowe
21:36:38 Singularity is in OpenHPC AFAIK
21:36:44 Hi Mike
21:36:54 Hello!
21:37:14 hey Mike, welcome ;)
21:37:23 There was a question earlier - are you planning on attending either SC2018 or OpenStack Berlin?
21:37:39 (yep that was me)
21:38:33 breaks my heart but I don't see how I can justify Berlin over SC18
21:39:05 want to have a redo of the OpenStack BoF?
21:39:12 (at SC18?)
21:39:16 One last comment on user namespaces. We took our unprivileged approach because we didn't want to wade into the privilege escalation/de-escalation morass. We rely on the kernel for security, which we already do implicitly.
21:39:20 Sure
21:39:40 jmlowe: do you know if Bob is going to SC18 too?
21:39:44 trandles: I'm with you on that.
21:41:07 trandles: Met with Ms Lovett last week at the Docker Federal summit
21:41:26 oh good
21:42:28 yeah, that's another thing that's maybe not widely known yet, but some of us (LANL, NERSC, CSCS) have been talking with Docker since SC17. We want them to take Charliecloud and Shifter and consider them reference implementations of acceptable HPC container runtimes.
21:43:36 she spoke about that, and your help. They have made a lot of progress I am told. I have to follow up with her, I would love her to do a couple of presentations with us
21:43:43 Seems wise to work from a common ecosystem, specialised where necessary
21:44:26 not ready to announce at DockerCon in June (I will be there)
21:44:45 We'd much rather go to a vendor-supported solution in the long term.
21:45:21 I think she would be happy/interested to talk about their progress too
21:45:48 I'm hoping to meet with her in May when I'm out in the Bay Area for other meetings.
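(Aside, referring back to the /proc/sys/user/ pointer above: on RHEL-based and recent upstream kernels, each namespace type has a max_*_namespaces limit there, and setting one to 0 is a runtime way to disable that namespace type without rebuilding the kernel. A small, hypothetical Python sketch to list those limits:)

# Illustrative sketch: list the per-namespace creation limits exposed under
# /proc/sys/user/ on kernels that provide this interface. Exact file names
# vary by kernel version.
import os

SYSCTL_DIR = "/proc/sys/user"

for name in sorted(os.listdir(SYSCTL_DIR)):
    if name.startswith("max_") and name.endswith("_namespaces"):
        with open(os.path.join(SYSCTL_DIR, name)) as f:
            print(f"{name} = {f.read().strip()}")

# e.g. as root, to disable creation of user namespaces system-wide:
#   echo 0 > /proc/sys/user/max_user_namespaces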
21:45:59 she is in DC
21:46:35 Hrm, Christian said he thought she'd be out west when I am. I need to email them. ;)
21:47:33 am sure if you are in the area she can be around
21:49:08 Stig, should I ask her if she wants to present to our group?
21:49:46 after all if the tech is built in :)
21:49:48 Seems like a good idea to me - but perhaps when they have something concrete to discuss?
21:50:17 that's cool :)
21:51:10 A common runtime that meets these requirements (and scales to the simultaneous launch of 10,000 instances) would be excellent.
21:51:48 trandles: on that note, have you looked at CRI-O or similar?
21:52:14 Michael has been looking at CRI-O, but I can't speak to where he's at in that effort.
21:52:35 He mentioned something left-field - bproc was it?
21:52:49 that I don't know
21:53:14 I may have misheard him
21:54:35 Time is creeping up on us, was there more to cover? (checks the agenda...)
21:55:56 I'm not sure there is. I think the action is on martial to submit a BoF for SC2018, correct?
21:56:28 And trandles and I will work on a spec for boot-to-ramdisk
21:56:39 I promise
21:56:53 As do I :-)
21:57:16 yes Sir
21:57:43 excellent :-)
21:57:59 martial: if you send me your mailing address I'd appreciate it :)
21:58:48 Tim, yes am going to do that now :)
21:59:34 done
21:59:37 Anyone else see this? https://insidehpc.com/2018/04/accelerating-ceph-rdma-nvme/
22:00:12 Bizarrely, iWARP based
22:00:25 OK, time is upon us
22:00:27 I had not, thanks for the pointer
22:00:36 going going...
22:00:43 ohhh really interesting
22:00:49 #endmeeting