11:00:14 #startmeeting scientific-sig
11:00:15 Meeting started Wed Aug 28 11:00:14 2019 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:18 The meeting name has been set to 'scientific_sig'
11:00:25 ahoy
11:00:43 #link Agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_August_28th_2019
11:01:14 back to school...
11:01:37 too cool for school!
11:01:50 g'day all
11:02:06 hey janders, how's things?
11:02:33 not too bad... just got the boot media for our GPFS cluster
11:02:42 was overlooked in the bill of materials somehow
11:02:47 build starting next week
11:02:57 Afternoon
11:03:00 sounds good - how big?
11:03:12 hi verdurin, trust you've had a good summer
11:03:23 small but should be punching above its weight
11:03:25 six nodes
11:03:29 12x 8TB NVMe each
11:03:46 2x 100G IB as well?
11:03:47 oneswig: I like to think it's not quite over yet, but we'll see...
11:03:55 50GE+HDR
11:03:58 verdurin: indeed
11:04:16 janders: nice! does the HDR work?
11:04:28 I'm quietly hoping to reach the 100GB/s mark
11:04:41 well... it doesn't... yet
11:04:46 but hopefully soon
11:05:40 if we wanted pure IB we could probably make it work today, but we're waiting for the VPI stuff to work
11:05:46 it's still a few weeks from GA
11:06:17 dual-port HDR?
11:06:23 correct
11:06:40 phewee.
11:06:49 this is actually a nice intro to the HPC-Ceph discussion
11:07:01 We only have one item on the agenda - let's move on to it...
11:07:06 the idea behind this cluster is Ceph-like self-healing with the goodness of RDMA transport
11:07:15 #topic Scientific Ceph SIG
11:07:41 so - speaking of Ceph & HPC - how's RDMA support in Ceph going?
11:07:45 #link Ceph meeting later today http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036705.html
11:07:59 janders: my experience of it wasn't compelling
11:08:10 but it was ~9 months ago
11:08:20 what were the issues?
11:08:27 Yes, saw someone ask you about that on the mailing list oneswig
11:08:37 And I'd love to spend some quality time with it in the absence of distraction...
11:09:03 the issues were various. It worked OK for RoCE but didn't work reliably on other fabrics
11:09:10 I'm trying to squeeze an update on this from Mellanox
11:09:15 at the time I had EDR IB and OPA to compare with
11:09:27 OPA I think was a first
11:09:31 wow
11:09:43 this might be the first time I hear something works with RoCE but not IB
11:09:51 usually it's the other way round, isn't it? :)
11:10:25 On the RoCE side, performance was inconclusive. I had a 25GE network and the performance saturated that. The devices I had were SATA SSDs and networking was not the bottleneck.
11:10:39 So it was hard to show substantial benefit on the hardware available.
11:10:58 It actually underperformed TCP for large messages
11:10:58 riiight
11:12:06 The IB, I forget what the issue was but I think there were stability problems. Lost transactions in the msgr protocol, that kind of thing.
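For context on the RDMA messenger discussed above: enabling it has generally come down to a handful of ceph.conf options on every daemon and client. The sketch below simply renders those options with Python's configparser; the option names (ms_type, ms_async_rdma_device_name, ms_async_rdma_cm, ms_cluster_type) are as documented around the Luminous/Nautilus era, and the device name is an assumption, so verify everything against the Ceph documentation for your release rather than treating this as a recipe.

```python
# A minimal sketch (not a verified recipe) of the ceph.conf settings that
# switch Ceph's async messenger from TCP to RDMA, rendered via configparser.
import configparser

conf = configparser.ConfigParser()
conf["global"] = {
    # Replace the default posix (TCP) transport with RDMA verbs.
    "ms_type": "async+rdma",
    # RDMA device to bind; mlx5_0 is an assumption for a ConnectX-5/6 port.
    "ms_async_rdma_device_name": "mlx5_0",
    # Use the rdmacm connection manager (the path the iWARP work introduced),
    # which some fabrics need; plain RoCE can work without it.
    "ms_async_rdma_cm": "true",
}
# A gentler variant keeps clients on TCP and uses RDMA only on the cluster
# (replication) network: set ms_cluster_type = async+rdma instead of ms_type.

with open("ceph-rdma-fragment.conf", "w") as out:
    conf.write(out)
```

In practice the daemons also need generous memory-locking limits (e.g. LimitMEMLOCK in their systemd units, as far as I recall) or the RDMA messenger fails to register memory, which is easy to mistake for a fabric problem.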
11:12:24 yeah sounds a bit rough around the edges every time I look at it
11:12:37 it would be awesome to have it relatively stable
11:12:50 One neat thing was that Intel had contributed some work for iWARP that appeared to have nothing to do with iWARP per se, but did introduce rdmacm, which was needed for all these weird and wonderful fabric variants
11:13:09 janders: +1 on that, would love to see it work (and be worth the trouble)
11:13:28 Intel... I have some harsh words to say there in the AOB
11:13:49 keyword: VROC
11:13:51 later!
11:14:01 ooh, won't be long to wait, we only have a short agenda for today.
11:14:31 verdurin: how is Ceph performing for your workloads?
11:15:01 No complaints so far, oneswig: though I know there is much heavier work to come
11:15:24 #link Ceph day for Research and Scientific Computing at CERN https://indico.cern.ch/event/765214/
11:15:43 verdurin: sounds intriguing - new projects coming on board?
11:16:12 Yes, and we're also looking at Ceph now for a different storage requirement.
11:16:41 Will you be at the CERN event?
11:16:51 I will.
11:17:03 super, see you there :-)
11:17:05 Curious how it relates to the other Ceph Day that's scheduled for London in October.
11:17:23 I envy you guys... Geneva - sure, 2hrs flight if even that...
11:17:50 Sadly my CERN User Pass has expired, so I have to visit as a civilian, like everyone else.
11:18:00 janders: we don't get to go to LCA, in return
11:18:19 if I were to pick one I don't think it'd be LCA
11:18:29 verdurin: I'm sure they'll remember you...
11:18:30 will be a while till we get the Summit again
11:18:57 This is the other Ceph Day I mentioned: https://ceph.com/cephdays/ceph-day-london-2019/
11:19:27 you guys heading to Shanghai?
11:19:38 Obviously you have to filter out the CloudStack parts.
11:19:44 I'm planning to be there janders
11:19:53 nice!
11:19:58 not sure what I'll do
11:20:17 No plans to go so far.
11:20:21 I wouldn't expect much support given rather tense AUS-CN relations these days
11:20:29 verdurin: last time I went to Ceph London it was in a venue somewhere in Docklands. Good to see they've relocated it.
11:20:34 would be fun though!
11:20:55 do you guys know if it's possible to remote into the PTG?
11:20:58 janders: you're not the first to entangle openstack with geopolitics.
11:21:05 oneswig: Canada Water, yes.
11:21:39 verdurin: ah I remember seeing you there. Wasn't it around the time of your move out west?
11:22:28 oneswig: A few months afterwards.
11:22:46 Canada Water Printworks? 5 minute walk from where I live. But I agree - not a great conference venue.
11:22:55 oneswig: lots of good user talks at the CERN day. Do you know whether upstream people will be there?
11:23:14 verdurin: don't know I'm afraid
11:23:26 Hi HPCJohn, good to see you
11:23:48 It was a small venue that doubled as a daycare centre. Don't remember the name...
11:24:00 If I have the chance I'll join the Scientific Ceph meeting later
11:24:18 Might be interesting to run the RDMA flag up the pole there.
11:24:30 +1
11:24:33 Yes, I'm hoping to do the same.
11:24:40 what time is it? is it back to back with our meeting, or much later?
11:24:57 I think any serious effort on making it zing is deferred until the scylladb-derived messenger is introduced (forget the name)
11:25:05 May I ask what time, British Summer Time?
I think 16:30, as that is European time also
11:25:46 Time was 10:30am US Eastern / 4:30pm EU Central, and I think that means 3:30pm BST
11:26:02 In Canberra, it's the middle of the night :-(
11:26:18 HPCJohn: Yes, in my calendar as 15:30
11:26:24 BST
11:26:37 yeah... makes me realise how lucky I am with the timing of this meetup!
11:26:42 I should move to Perth
11:27:05 janders: they're going to be discussing the timings, so it's quite possible they'll alternate like this one
11:27:16 that would be very cool
11:27:32 OK shall we move on?
11:27:35 #topic AOB
11:27:40 janders: ?
11:27:47 ok... bit of Intel bashing as promised
11:27:53 have you guys played with VROC much?
11:28:07 To my shame I admit to never having heard of it
11:28:17 CPU-assisted hybrid RAID
11:28:33 Ah, I think I've heard of that.
11:28:34 we originally wanted to run our BeeGFS nodes in RAID6
11:28:52 but as RAID6 sucks on NVMes, we ended up with smaller RAID0s
11:29:12 but because of these limitations we thought we'd try VROC on our Cyber kit
11:29:17 so we did
11:29:19 catch?
11:29:23 Supported on RHEL7
11:29:29 ...unless you use Ironic
11:29:35 aha.
11:29:40 Ironic is too cut down to understand it
11:29:50 s/ironic/IPA image
11:30:00 That's not surprising
11:30:10 so we have dual NVMes but so far can only run off a single one, no RAID protection
11:30:21 Can you boot from sw-assisted raid?
11:30:25 it's work in progress, we're thinking of retro-fitting RAID after the fact in an automated way
11:30:27 Looking forward to CEPH later on. Lunch calls!
11:30:39 oneswig: yes, we can
11:30:52 just can't mark the VROC array as root_hint
11:31:02 bit of a bummer really
11:31:20 we're working on the retrofit script in the meantime
11:31:24 Sounds like a DIB element for your IPA image might help with building in the support
11:31:25 we've got an RFE/BZ with RHAT
11:31:48 but all in all I have a sneaking suspicion we'll just leverage the newly added SW-RAID support
11:32:02 as with RAID0/1 VROC doesn't bring much benefit AFAIK
11:32:04 With root hints don't you get a broad range of options, even device size?
11:32:09 RAID5/6 - different story
11:32:18 trouble is - IPA just can't see the array
11:32:25 full RHEL can but not IPA
11:32:39 we had RHAT support look at it, we tweaked the IPA initrd but no luck
11:32:54 sounds like unnecessary complexity if you ask me
11:33:01 It's a fiddly environment to debug, for sure.
11:33:10 so I'd say if you're looking at VROC for RAID0/1 scenarios, probably steer away from it
11:33:17 for sure!
11:33:32 janders: not "Cost-efftice and simple" then?
11:33:35 for RAID5/6, especially if there's an alternative root drive, sure
11:33:36 effective
11:33:42 not at all
11:33:44 not at all
11:33:58 it did cost us way too much time to troubleshoot this already
11:34:04 so I thought I'd share
11:34:12 good idea but so far pretty troublesome
11:34:38 Thanks for suffering on all our behalf
11:35:03 I look forward to the sequel...
11:35:11 hopefully SW-RAID support will fix this once and for all
11:35:20 VROC->disable; done!
11:35:43 on a good news front I played around with NVMe-based iSCSI cinter
11:35:49 just for a small POC
11:35:54 s/cinter/cinder
11:36:04 3/2 GB per sec read/write
11:36:10 70k IOPS on a single client
11:36:14 200k IOPS across four
11:36:39 not a big fan of iSCSI so was quite impressed
11:36:40 have you tried it with iSER?
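On the SW-RAID fallback mentioned above: a rough sketch of what leaning on Ironic's newly added software RAID support could look like, driving the baremetal CLI from Python. The node name is a placeholder, and the target_raid_config layout, the "controller": "software" convention and the /dev/md0 root device hint are assumptions based on my reading of the Train-era Ironic docs, so check them against your release rather than taking them as given.

```python
# Hypothetical sketch: mirror the two NVMe drives with Ironic's mdadm-based
# software RAID and point the root device hint at the md array, sidestepping VROC.
import json
import subprocess

NODE = "cyber-node-01"  # placeholder node name

# "controller": "software" asks IPA to assemble an mdadm array rather than
# relying on a hardware or VROC RAID controller.
target_raid_config = {
    "logical_disks": [
        {"size_gb": "MAX", "raid_level": "1", "controller": "software"},
    ]
}

subprocess.run(
    ["openstack", "baremetal", "node", "set", NODE,
     "--target-raid-config", json.dumps(target_raid_config)],
    check=True,
)

# Have the deploy write the image to the md device instead of a bare NVMe
# namespace, since that was the part VROC + IPA could not do.
subprocess.run(
    ["openstack", "baremetal", "node", "set", NODE,
     "--property", "root_device=" + json.dumps({"name": "/dev/md0"})],
    check=True,
)
```

The array itself still has to be built during a cleaning cycle (the RAID interface's delete_configuration/create_configuration clean steps, as far as I recall), so this covers only the configuration half of the job.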
That works well
11:36:57 it's just a placeholder for the GPFS when it's ready - so I did not
11:37:10 but I was expecting maybe 10% of that
11:37:27 nice surprise
11:37:33 (unlike VROC ;)
11:38:24 janders: do keep us informed about your GPFS work
11:38:31 will do!
11:38:48 Thanks janders!
11:39:08 happy to give a lightning talk about it in Shanghai if it's ready (and if I am there)
11:39:17 (or remote in if that's an option)
11:39:24 I'll pencil you in...
11:39:50 it's EC-GPFS so if it all works as designed it will be a very interesting little system
11:40:06 bit of a Ceph-like design
11:40:11 12x drives a node and self-healing
11:40:23 PLUS all the GPFS features and stable RDMA transport
11:40:52 janders: what's the condition of the OpenStack drivers for it?
11:41:10 not ideal, not too bad
11:41:15 I'm guessing this is bare metal so the infrastructure storage requirement is limited
11:41:26 they claim they discontinued support for it a few months back
11:41:37 but we tested it also a few months back and it worked reasonably well
11:41:58 good to hear.
11:42:11 we'll probably use it across the board (glance & cinder for VMs, native for BM)
11:42:57 Did IBM add Manila?
11:43:08 I don't think so but not 100% sure
11:43:37 the promise is that they cut down development effort for OpenStack to ramp up development effort for k8s
11:43:54 so when we get to that part, there might be more interesting options
11:44:02 not sure where that's at right now though
11:44:23 janders: no restricted-data concerns for this system?
11:44:32 long story
11:44:35 short answer - not yet
11:44:46 It's always a long story...
11:44:53 long answer - we're still working out some of the cybersecurity research workflows
11:44:56 actually...
11:45:05 given we're in AOB and we've got 15 mins to go
11:45:28 do you guys have any experience with malware research related workflows on OpenStack?
11:45:36 as in:
11:45:46 1) building up the environment in an airgapped system
11:46:05 2) getting the malware into the airgapped perimeter
11:46:06 3) storing it
11:46:10 4) injecting it
11:46:15 that sort of stuff
11:46:33 we're getting more and more interest in that and it feels a bit like we're trailblazing... but are we?
11:47:14 janders: we have an air-gapped system, though not for malware research.
11:47:48 how do you balance security with giving users reasonable flexibility to build up their environment?
11:47:59 It's a very interesting use case. Hope your Ironic node cleaning steps are good.
11:48:31 right now we're in between the easy way (just importing CentOS/RHEL point releases) and the more flexible way (regularly updating repo mirrors inside the airgap, including EPEL)
11:48:54 and this is even before we think about the malware bits
11:49:23 right now (and this is only a few-days-old concept) I'm thinking:
11:49:41 1) build the environment in a less restricted environment (repo access, ssh in via floating IP)
11:49:55 janders: we're more like the latter, though in fact our users have very little ability to customise beyond requesting changes for us to implement
11:50:00 2) move the environment to a restricted area (isolated network, no floats, VNC access only)
11:50:33 3) inject malware by attaching volumes (with encrypted malware inside)
11:51:05 yeah the longer we thing about it the more convinced we are we'll end up playing a significant role in setting up user environments...
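To make the 1)-3) workflow just outlined a little more concrete, here is one way the hand-off between the two enclaves could be scripted against the regular OpenStack CLI. The cloud names, server/image/volume names and the idea of carrying a qcow2 across the gap are all placeholders and assumptions, not a description of either site's actual process.

```python
# Speculative sketch of the two-stage hand-off: snapshot the environment built
# in the less restricted cloud, carry the image across the airgap, re-import
# it, and later attach the encrypted sample volume. All names are placeholders.
import subprocess

def osc(cloud, *args):
    """Run the openstack CLI against a named cloud from clouds.yaml."""
    subprocess.run(["openstack", "--os-cloud", cloud, *args], check=True)

# Stage 1 (less restricted environment): snapshot the prepared build VM and
# export the image to a file that can cross the airgap on approved media.
osc("open-enclave", "server", "image", "create",
    "--name", "malware-lab-base", "analysis-build-vm")
# (in practice, wait for the snapshot to reach 'active' before saving it)
osc("open-enclave", "image", "save",
    "--file", "malware-lab-base.qcow2", "malware-lab-base")

# Stage 2 (inside the airgap): re-import the image, to be booted on the
# isolated network with no floating IPs and console/VNC access only.
osc("airgap", "image", "create",
    "--disk-format", "qcow2", "--container-format", "bare",
    "--file", "malware-lab-base.qcow2", "malware-lab-base")

# Stage 3: deliver samples as a separately prepared, encrypted volume attached
# to the running analysis instance.
osc("airgap", "server", "add", "volume", "analysis-vm-01", "samples-encrypted")
```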
:(
11:51:32 s/thing/think
11:51:59 it really feels like something that either hasn't been fully solved - or has been solved by people who tend not to talk about their work...
11:52:27 janders: we did take an approach similar to your 1) and 2) steps, in order to make it more bearable/efficient at the preparation stage
11:52:54 so do you run 1) in non-airgap, take snapshots and import into airgapped for 2)?
11:53:22 exactly, yes
11:53:31 ok! might be a way forward, too
11:53:45 what worries me is maintaining relatively up-to-date CentOS/RHEL/Ubuntu/...
11:53:55 if we go with what we described, that's a non-issue
11:54:25 Yes.
11:54:26 fun times!
11:54:40 You're always up to something interesting...
11:54:51 indeed. Great use case
11:54:59 my life these days is figuring this out, while getting HDR/VROC to work - and writing paperwork to keep funding going! :)
11:55:07 argh
11:55:09 frustrating at times
11:55:15 hopefully we're past the worst
11:55:35 I will see where we end up with all this - and if I am still allowed to talk about it at that stage I'm happy to share the story with you guys
11:56:48 thanks janders, would be a great topic for discussion.
11:56:55 thanks guys!
11:56:59 OK, shall we wrap up?
11:57:03 I think so
11:57:15 thanks for a great chat
11:57:28 please do wave the RDMA flag at the Ceph meeting!
11:57:38 I'll try...
11:57:42 #endmeeting