11:00:06 #startmeeting scientific-sig
11:00:07 g'day! :)
11:00:07 Meeting started Wed Dec 19 11:00:06 2018 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:08 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:10 The meeting name has been set to 'scientific_sig'
11:00:17 janders: you are quick :-)
11:00:22 g'day back
11:00:36 :) how are things?
11:00:37 Hello there
11:00:44 #link Agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_December_19th_2018
11:01:08 Morning.
11:01:15 All good here, thanks. Been a busy couple of weeks. I presented 3 times in two weeks, on 3 different subjects...
11:01:25 Morning verdurin priteau
11:01:32 all well?
11:01:38 that is very busy indeed! :)
11:02:04 All done now, just need to catch up on the Christmas shopping
11:02:12 haha same here :)
11:02:25 we're sorting out a couple different procurement activities if that counts as Christmas shopping
11:02:40 ooh, like what?
11:02:44 looking good, should have some cool things to work on next year
11:02:54 mostly hardware
11:03:08 for the cyber system based on SuperCloud architecture
11:03:17 s/cyber/cybersecurity
11:03:26 Is the SuperCloud scaling up for production now?
11:03:46 we'll be writing prod-ready provisioning code from early jan
11:04:05 we switched focus to other things (like BeeGFS) while in tender mode
11:04:28 when the new equipment lands we should be able to build it pretty quickly
11:04:47 I met some BeeGFS guys at a conference last week and got a couple of lovely LED-bejewelled BeeGFS pin badges from it :-)
11:05:03 excellent! :)
11:05:15 by provisioning code, you're talking tripleo heat templates or what?
11:05:18 I finally had a chance to think a bit about what we want from BeeGFS from the OpenStack angle
11:05:28 yes, among other things
11:05:38 but that's a fair part of it
11:06:22 How is it differing from the pilot system you're currently developing on?
11:06:46 we might try to apply the ephemeral hypervisor philosophy to more OpenStack components, so the tripleo run might be just one stage of building the system up
11:06:59 more emphasis on resiliency
11:07:07 and security
11:07:27 It will be fascinating to see where you take this in 2019.
11:08:05 the prototype focused on proving that the concept is sound and demonstrating the baremetal capability
11:08:26 Seems you're not the only one blurring the baremetal boundaries.
11:08:38 the next one will have all that, but should be able to withstand a fair bit more punishment and hardware failures :)
11:09:14 mgoddard shared this spec out for review on future Ironic-Kubernetes integration: https://review.openstack.org/#/c/625730
11:10:06 I'd guess it's quite a way off yet, but interesting to see where ironic could be in a few years
11:10:23 Is there a term to describe this pattern of running different layers of services mixed up together?
11:10:24 maybe less, depending on how keen RH are
11:10:47 excellent work! :)
11:11:31 Anyway, I guess we should get to the agenda (there's not much on it, but we are starting with AOB)
11:11:48 #topic 2018 retrospective
11:12:08 I realised this week I did nothing to write up the activities from the Berlin summit
11:12:26 Which was unfortunate as there was a good deal going on.
11:13:26 I finally managed a write-up of the presentation I did at Ceph days Berlin, so it must be next...
11:13:44 #link HBP presentation from Ceph days Berlin https://www.stackhpc.com/ceph-on-the-brain-a-year-with-the-human-brain-project.html
11:14:24 The most interesting piece from this was the performance quirks of Intel NVMes using LVM and BlueStore (current Ceph best practice)
11:17:18 From my SIG-centric POV it's been unfortunate that Blair is not so hands-on with OpenStack in his new job - hopefully he'll get back to that...
11:17:19 excellent work guys!
11:17:35 thanks janders :-)
11:17:36 I'm quiet 'cause I'm barely keeping up reading through the links
11:17:46 but the topics are so interesting I can't resist reading right now
11:17:55 bluestore seems groundbreaking
11:18:16 it made a huge difference in our tests
11:19:45 have you guys looked into IOPS on NVMe-ceph much?
11:20:12 I just skimmed through the later part of the article, sorry if I missed it
11:20:45 Not so much on IOPS, I was testing at the RADOS level and looked almost exclusively at aggregate bandwidth.
11:21:40 We've been doing some work recently on processing the latency histograms from multiple client runs of fio and resampling to generate a single latency histogram, which is pretty cool
11:23:30 I was also interested in gathering input on what we should do differently in 2019. Any thoughts on things we should be doing?
11:23:56 I haven't done any work with ceph/bluestore myself but the concept makes perfect sense and your results reflect that
11:24:34 I like the final comment on xfs vs bluestore on the ceph blog:
11:24:35 It was a good result, I was happy to see it!
11:24:43 In the end, we found there was nothing wrong with XFS; it was simply the wrong tool for the job.
11:24:43 :)
11:24:59 ha, that's nice.
11:26:18 I am very happy with the SIG, lots of good discussions and inspiring ideas, not sure what I'd change
11:26:52 OK that's good to know, thanks. Any thoughts from anyone else?
11:26:56 location of the at-the-Summit meeting rooms perhaps :)
11:27:09 (the walk to the Berlin one was a bit of a challenge)
11:27:13 We filled that room!
11:27:42 true! :) scientific stackers are a particularly hardy and stubborn type
11:28:01 We enjoy a hike
11:28:11 #topic Conferences for 2019
11:28:39 I had a few mails this week and it got me thinking we should gather details on conferences that might be of interest.
11:28:54 great idea
11:29:09 do you guys know what the talk proposal deadline is for the Lugano conference?
11:29:17 #link London - UKRI cloud workshop https://www.eventbrite.co.uk/e/ukri-cloud-workshop-tickets-53580893896
11:29:28 verdurin: are you involved in this?
11:29:49 oneswig: Yes, I am. We had a call about it on Monday.
11:30:09 I've been a couple of times and found it to be a very informative day.
11:30:11 Abstract submissions welcome, and accepted until mid-January.
11:30:31 Is there a particular theme this year?
11:31:48 We've suggested themes in the CfP - there's no overall theme beyond that.
11:32:00 ok, thanks verdurin
11:32:07 #link Lugano, Switzerland - HPC Advisory Council http://hpcadvisorycouncil.com/events/2019/swiss-workshop/submissions.php
11:32:23 Very beautiful place to visit
11:32:48 Last year there was a great discussion on HPC container infra, really useful.
11:33:33 There's a sister conference in Perth janders - well worth the trip I'd say
11:34:01 #link Singapore - SRECon Asia/Australia https://www.usenix.org/conference/srecon19asia/call-for-participation
11:34:30 I see Perth is late August
11:34:36 good to know
11:34:45 I went to an SRECon before and found it pretty useful (although mostly in abstract terms, there's little consideration for OpenStack content)
11:34:47 good excuse to escape the end of the winter here in Canberra :)
11:35:25 janders: winter in Canberra... you'll be telling me there's a ski season next :-)
11:35:40 not quite here, but 2.5 hours' drive away - yes!
11:36:13 Any other conferences/workshops people would like to announce?
11:36:41 it would be good to know the submission deadline for Lugano, if you happen to know it
11:36:55 wasn't able to find it on the event website
11:37:05 do you remember what it was like last year?
11:37:41 I can't see a date for that either.
11:38:17 Is it too early to start thinking about running something at ISC?
11:38:39 verdurin: I was thinking about that too, we've not done that before and perhaps we should.
11:38:43 Do you go?
11:39:00 The same - haven't done before, quite likely to this time.
11:39:53 John T is a regular there, I'll see what he thinks.
11:41:08 oneswig: https://photos.google.com/photo/AF1QipPLFErIFTfFqE2GPdq3EtVKgQutLsKIV8vAqB7t that's from June, somewhere over Perisher Valley, New South Wales
11:41:48 That link's not working for me, alas
11:43:31 https://photos.app.goo.gl/y8FyirV1G5XCEDLH7
11:43:52 sorry, google rewrites the URL after clicking, I copy-pasted the re-written one
11:44:23 Snow in Australia, looks good!
11:44:34 good xc skiing up there
11:45:32 Lugano or ISC would likely be good timing to present the work we're planning to do with Bright
11:45:54 Lugano might be better if we make it early enough
11:46:45 Sounds good to me - but much better to present work actually delivered than still in the conceptual phase :-)
11:47:50 #topic AOB
11:48:02 What else is new?
11:48:20 my guys are making good progress with benchmarking BeeGFS
11:48:33 I think they got up to 180GB/s on a quarter of the cluster
11:48:56 That's huge!
11:49:01 Are you OPA limited?
11:49:06 2xEDR
11:49:31 Ah, of course, I forgot.
11:49:45 I think the 24 NVMes can do 26-28GB/s per node but 2xEDR ports cut that to around 24GB/s per node
11:50:32 "only" 24GB/s...
11:51:02 How many server nodes are delivering 180GB/s?
11:51:12 John's hints were invaluable in making decisions on some of the config details - thanks for that
11:51:17 checking the numbers now..
11:51:50 janders: you're welcome - share and enjoy
11:52:42 8 servers
11:52:45 not sure how many clients
11:52:54 and more precisely it's 180GB/s peak and 160GB/s sustained
11:53:36 I think they used to run 8 clients, but they might have doubled or tripled that since, can't see the client number in the test report
11:53:41 That's pretty much where the data point is on the graphs I've seen from Cambridge, and performance scales linearly from there to 24 servers.
11:54:16 wow! that is great news for us
11:54:22 #link "Hero Performance Numbers" - slide 35 https://www.stackhpc.com/resources/2018-11-12-Berlin-HBP-Ceph.pdf
11:54:25 have you tested past 24?
11:54:56 I don't think so. Not sure how many storage servers they have available
11:55:15 we have 32, so not much more
11:55:34 capacity requirement was quite high, hence 32 nodes with 24 NVMes each
11:55:57 That's a lot of NVMe
11:56:18 I never quite liked the NVMe density as this requires a fair bit of blocking, but it does buy the capacity :)
11:56:46 if it wasn't PCIe3 we'd perhaps be looking at 72GB/s per node
11:57:17 janders: ever considered bcache with backing to something bigger and slower?
11:58:08 Looking ahead to 2019, we have sessions on 2FA and iRODS in the pipeline for January. Anything else people would like to see?
11:58:38 not at CSIRO - in my days at RHAT I did look at something similar
11:58:46 If we are looking at iRODS, we might also want to look at Rucio
11:59:26 OK, we are at time - anything more to add?
11:59:28 one last comment on BeeGFS - I finally spent some time thinking about BeeGFS-OpenStack integration ideas - will share a googly doc
11:59:45 janders: the people I met were very enthusiastic about this, please do!
11:59:50 janders: yes, would be interested to see
12:00:03 what I have right now is nothing groundbreaking but it'll be a start
12:00:14 and other than that - Merry Christmas All! :)
12:00:27 Same to you janders - season's greetings everyone
12:00:29 I'll be away 21 Dec - 6 Jan so speak after the break
12:00:38 #endmeeting
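
Post-meeting note on the fio latency-histogram merging mentioned at 11:21:40: a minimal sketch of the idea, not the actual StackHPC tooling. It assumes (hypothetically) that each client exported its latency histogram as CSV rows of "bin_upper_ns,count" with identical bin edges across clients; the filenames below are likewise made up. Histograms over the same bins are additive, so a single cluster-wide histogram is just the per-bin sum, and percentiles fall out of the cumulative counts.

# Sketch: merge per-client fio latency histograms into one cluster-wide
# histogram, then read percentiles off the merged counts.
import csv
import glob
from collections import Counter

def load_histogram(path):
    """Read one client's histogram CSV into {bin_upper_ns: count}."""
    hist = Counter()
    with open(path) as f:
        for bin_upper_ns, count in csv.reader(f):
            hist[int(bin_upper_ns)] += int(count)
    return hist

def merge_histograms(paths):
    """Histograms over identical bins are additive: sum counts per bin."""
    merged = Counter()
    for path in paths:
        merged.update(load_histogram(path))
    return merged

def percentile(hist, pct):
    """Approximate a percentile by walking bins in order until the
    cumulative count crosses pct% of the total; return the bin's upper edge."""
    total = sum(hist.values())
    cumulative = 0
    for bin_upper_ns in sorted(hist):
        cumulative += hist[bin_upper_ns]
        if cumulative >= total * pct / 100.0:
            return bin_upper_ns
    return None

merged = merge_histograms(glob.glob("client*_clat_hist.csv"))  # hypothetical filenames
for p in (50, 95, 99, 99.9):
    print(f"p{p}: {percentile(merged, p)} ns (upper bin edge)")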
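
Post-meeting note on the BeeGFS bandwidth arithmetic from 11:48-11:56: a back-of-envelope check, hand-waving protocol overheads. One EDR InfiniBand port carries 100 Gbit/s, so 2x EDR is 200 Gbit/s = 25 GB/s raw per node, consistent with the ~24 GB/s effective figure quoted; since the 24 NVMes per node are good for 26-28 GB/s, the network is the bottleneck, and 8 network-bound servers predict roughly 8 x 24 = 192 GB/s, within about 7% of the measured 180 GB/s peak.

# Back-of-envelope check of the figures quoted above; the ~24 GB/s
# effective per-node number comes from the discussion, the rest is arithmetic.
EDR_GBITS = 100                    # one EDR InfiniBand port, Gbit/s
raw_link_gbs = 2 * EDR_GBITS / 8   # 2 x EDR = 200 Gbit/s = 25 GB/s raw
effective_gbs = 24                 # ~24 GB/s/node after overheads (as quoted)
drive_gbs = 26                     # 24 NVMes quoted at 26-28 GB/s per node
servers = 8                        # a quarter of the 32-node cluster

per_node = min(effective_gbs, drive_gbs)   # network-bound: links < drives
print(f"raw 2xEDR per node: {raw_link_gbs:.0f} GB/s")
print(f"predicted aggregate: ~{servers * per_node} GB/s "
      f"(measured: 180 GB/s peak, 160 GB/s sustained)")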