11:01:11 #startmeeting scientific-sig
11:01:12 Meeting started Wed Aug 29 11:01:11 2018 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:01:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:01:15 The meeting name has been set to 'scientific_sig'
11:01:22 just in the nick of time... good day
11:01:41 #link agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_August_29th_2018
11:01:56 priteau: did you make it?
11:02:03 I
11:02:06 I'm here
11:02:16 bravo :-)
11:02:33 How's your Blazar PTG preparation going?
11:02:47 Saw the etherpad, thanks for the link
11:02:53 evening
11:03:02 b1air: g'day!
11:03:07 #chair b1air
11:03:08 Current chairs: b1air oneswig
11:03:19 We have a good agenda already, still need to set up a discussion with placement API folks
11:03:22 How's NZ going?
11:03:27 just intending to pop in more-or-less briefly, getting a bit late down here
11:03:38 hello team (a little distracted because ... kids getting ready for school)
11:03:42 priteau: good to see the interest from that end, really encouraging
11:03:49 Morning martial!
11:03:52 #chair martial
11:03:53 Current chairs: b1air martial oneswig
11:04:08 i'm chalking up the air miles, put it that way
11:04:16 Around NZ or further afield?
11:04:25 hi. I'm here from the Sanger Institute (my colleague Dave who is often around is stuck trying to get registered with NickServ)
11:04:39 Hi Republic, welcome!
11:04:53 locally - been up to Auckland every week so far
11:05:01 Republic: good to have you on board
11:05:08 evening janders
11:05:39 I visited the Airbus factory on vacation where they make turboprops for Air NZ internal flights.
11:05:59 Hi oneswig
11:06:08 interesting vacation activity oneswig
11:06:09 Very cool :)
11:06:11 Guess we ought to get going. How about PTG first martial?
11:06:32 #topic PTG prep
11:07:05 sure thing
11:07:27 martial: it's about 2 weeks away now, anything new on the PTG?
11:08:30 not much yet, the main link is at
11:08:32 #link https://etherpad.openstack.org/p/ops-meetup-ptg-denver-2018
11:08:51 I was thinking, to follow up on discussion in Dublin, I ought to get together a write-up of our experience with Cinder boot-from-volume
11:09:21 limited success so far, but I think it worked in a basic case for networking and cinder configuration.
11:09:42 and the Scientific SIG content is at
11:09:46 #link https://etherpad.openstack.org/p/scientific-sig-denverptg2018
11:10:05 I'd chip in some thoughts on ironic resiliency enhancements
11:10:07 One big thing that went by recently was the Ironic gang proposing work on boot from (ceph) volume. Which we'd really love
11:10:18 I'll add my bit too
11:11:01 I've had some networking issues, probably not worth mentioning in detail here - long story short, when one node gets stuck in "deploying", ironic dies till the conductor is restarted
11:11:29 There's a BZ for that, I'll try to look it up
11:11:50 wow, I have never seen this bug
11:12:16 Is it happening on a specific release?
11:12:35 Sounds nasty janders, not seen that ourselves and plenty of things can go wrong ...
11:12:43 There are two, actually
11:12:54 Queens
11:13:03 1) the thing just described
11:13:33 2) when nova-compute gets started before keystone is responsive, it'll go deleting placement records
11:14:02 These are private BZs, let me chat to RHAT to see how best to go about this
11:14:39 Ah, perhaps something will appear on StoryBoard if they get to commit a patch
11:15:34 We have ~500 Ironic nodes on a Queens deployment. The infrastructure's second-hand so things do get flaky at times, but I'm not sure we've hit these specific issues.
11:16:50 o/ (freenode registration email vs our spam filter, bah)
11:17:09 hi daveholland, well done for making it through
11:17:12 janders: what is deleted in placement exactly? On Ocata we've seen issues where new compute node entries are created which don't match existing resource providers in placement
11:19:16 daveholland: I had a question for you about networking VRFs, perhaps in AOB...
11:19:21 don't let me forget
11:19:32 priteau: records for ironic nodes disappear. It manifests itself when instances don't get scheduled to ironic nodes anymore.
11:19:34 feel free to add it to the etherpad, who knows, maybe somebody has some idea :)
11:19:49 oneswig: sure (networking not my strong point but I can pass questions on)
11:20:50 daveholland: simple question was, what consumes 1 VRF on the TOR switch? Is 1 VRF required for every VXLAN VNI?
11:21:57 oneswig: do you have any experience running VXLAN over IB?
11:22:04 oneswig: I don't believe so but I will check
11:22:49 janders: no, sorry. Isn't pkeys your best bet there for VTN segregation?
11:22:58 Saw a great talk on it at the summit :-)
11:23:30 oneswig: :) pkeys are cool for bare metal and SR-IOV VMs
11:24:15 I'm trying to see if I can squeeze in some vanilla virtualised Ethernet NICs on my SuperCloud instead of building a separate one
11:24:33 OK, we've got a little off topic... was there any more on the PTG? For those of us not going, anything we'd like to see pursued from the OpenStack devs who will be attending?
11:24:57 oneswig: Jono says that you only consume VRFs for hardware routing for tenant networks (which we're not currently doing) (to a max of 2047 VRFs on these switches... but you can scale out)
11:24:59 janders: a question of multiple physnets?
11:25:14 From my side I'll chat to RHAT on how best to chase up these resiliency issues. Already sent them a note.
11:25:28 thanks janders, would be very helpful
11:25:52 thanks daveholland - and Jono - so this is specifically for routing between VXLAN overlay networks?
11:26:05 oneswig: Is it possible to specify an SDN controller on a per-physnet basis?
11:26:15 yes, but only when you go to hardware routing or distributed routing (DVR)
11:26:45 janders: I believe so, we use a mix of Mellanox IB and networking-generic-switch on multiple deployments
11:27:28 Great! Thanks heaps oneswig
11:27:34 Any final wish-list items to add for PTG discussion?
11:27:36 I'll chase up mlnx
11:27:50 Security
11:27:58 janders: I'll connect you with our expert on doing this
11:28:05 Locking BIOS version/config
11:28:21 Is this of interest to you guys?
11:28:34 janders: sounds like trusted boot / TPM?
11:28:48 Certainly of interest
11:29:00 oneswig: thank you. I'll be in and out but will follow up when back in the office.
11:29:14 I'll put it on the SIG-PTG agenda janders
11:29:40 oneswig: thank you again.
11:29:50 One more thing:
11:29:59 IPA connectivity
11:30:20 The RHAT reference architecture uses a dedicated flat provisioning network
11:30:29 janders: go ahead and articulate that on https://etherpad.openstack.org/p/scientific-sig-denverptg2018
11:30:35 Ok!
11:31:29 #topic Up-coming CFPs
11:31:51 A few came by recently, some of them were even in this broad time zone :-)
11:32:47 Anyone going to SC, there's an Intel speakership programme on the side
11:32:53 #link Intel speakerships at SC https://easychair.org/cfp/IntelSpeakershipsatSC18
11:33:04 we will probably send a Lucky Victim
11:33:17 Specifically they include HPC/cloud and OpenHPC in the list of topics
11:33:39 daveholland: I just get the theme tune to Dallas as an earworm every time I hear about it
11:33:50 oneswig: IPA connectivity question added to the PTG etherpad.
11:33:56 great
11:34:33 For those based in the UK, there was a CFP for Computing Insight UK 2018 in Manchester in December
11:34:42 #link CIUK 2018 CFP https://www.scd.stfc.ac.uk/Pages/CIUK2018_Presentations.aspx
11:35:08 I've been the last couple of years and there's usually a good turnout of HPC/cloud content
11:35:33 thanks, hadn't seen that one, I'll pass it on
11:36:03 this year may be particularly so. Andrew McNab from Manchester Uni is speaking on OpenStack and scientific computing at the Manchester OpenStack meetup next Wednesday
11:36:25 #link MCR OpenStack Meetup https://www.meetup.com/Manchester-OpenStack-Meetup/events/253630128/
11:36:47 He's going to be talking about a new scientific OpenStack research federation
11:37:18 Hopefully he'll be more informative than its current home page - https://www.iris.ac.uk/
11:37:56 er! yes
11:38:13 That's pretty much all I had on CFPs - anyone else seen things going on to announce?
11:39:36 #topic AOB
11:39:46 So what's new?
11:40:03 The schedule for the Berlin Summit is online now, if people want to start selecting interesting topics. Not all of it is available though, for example project updates are not yet on it.
11:40:27 Thanks priteau. Will you be doing a Blazar project update?
11:41:21 Yes, there will be one!
11:41:29 priteau: great!
11:41:35 Already looking forward to it :-)
11:42:49 If I am not mistaken, I think the Forum agenda is not organized yet either
11:43:10 I don't think there has been anything on SIG events either.
11:43:29 martial: b1air: you seen anything?
11:43:53 the Forum session proposal period opens on September 12
11:44:00 no, bit early for that i think
11:44:14 hey ildikov, thank you!
11:44:14 full timeline is here: https://wiki.openstack.org/wiki/Forum
11:44:15 ah, thx ildikov
11:44:17 Thanks ildikov!
11:44:22 np :)
11:45:11 i did hear some rumours about Summit - PTG reconvergence in 2019 though...
11:45:37 yes, that is indeed the plan
11:46:07 And it's in Denver again
11:46:14 oneswig: not heard anything so far
11:46:21 co-located but not overlapping, so we will keep the Forum during the Summit
11:46:32 priteau: can't have everything ;-)
11:47:18 Someone in OpenStack must like a good Denver omelette :-)
11:48:23 lol, I never associated Denver with omelette :)
11:48:44 I've been revisiting the state of play with various efforts for Ceph+RDMA. Just saw today's contender has hard-coded IPs in the C++...
11:48:57 So perhaps that one's not quite ready yet
11:49:45 🤭
11:49:51 OK, anything more to add for today?
11:50:06 Wow, emojis in IRC. I had no idea
11:50:19 Anyone out there using eth_ipoib?
11:50:44 janders: doesn't that require a downgraded version of OFED, or is it reinstated now?
11:50:59 I've been running into problems and getting them fixed
11:51:44 oneswig: not anymore, works on latest now, we requested it to be re-added
11:51:52 is that ethernet over ib, or perhaps that's what oneswig was thinking...?
11:52:01 b1air: yes
11:52:14 That's great. We avoided the need by using config drive for IPAM on our IB network
11:52:29 There are some *fun* MTU issues on older cards
11:52:40 Sounds like DHCP may become possible again (once you've fixed the last bugs :-)
11:52:57 DHCP *just works* today
11:53:22 Carrying data over the network in bulk can be a bit iffy but that's almost fixed too
11:54:36 Are you closer to production scale now?
11:56:09 Yes, making steady progress
11:56:25 The solution is getting quite solid now
11:56:48 I hope for a solid POC with users on by the end of the year
11:56:58 And continuous delivery from there
11:57:13 janders: sounds good. Are you doing a containerised TripleO deployment?
11:57:32 Most ppl would probably call that prod but I like to play things safe
11:57:58 oneswig: working towards it. First release will be non-TripleO, custom provisioning
11:58:23 Will redeploy between POC and TripleO/prod
11:58:24 But still RH supported?
11:58:31 oneswig: yes
11:58:40 I didn't realise that was possible
11:59:06 That's the short answer. It's a long story but yes we're covered
11:59:18 Good to hear it :-)
11:59:26 OK, final comments anyone?
11:59:33 We'll work together with RHAT to make things more mainstream
11:59:49 oneswig: I'm good. Great chatting, thanks everyone.
12:00:01 Good discussion, thanks all
12:00:04 #endmeeting
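
[Editor's note] The resiliency issues raised in the meeting (nova-compute starting before keystone is responsive and deleting placement records; ironic node records vanishing from placement so instances stop scheduling) lend themselves to simple operator-side checks. The sketch below is illustrative only, not the fix discussed in the private BZs: both function names are hypothetical, and in practice the UUID lists for the second helper would come from something like `openstack baremetal node list` and `openstack resource provider list` against a live cloud.

```python
import time
import urllib.request
import urllib.error


def wait_for_endpoint(url, retries=30, delay=2.0, opener=urllib.request.urlopen):
    """Poll an HTTP endpoint (e.g. keystone) until it answers.

    Returns True once the endpoint responds, False after exhausting retries.
    Gating nova-compute startup on a check like this is one way to avoid the
    race where it starts before keystone and wrongly cleans up placement
    records. The `opener` parameter exists so the poll logic can be tested
    without a real keystone.
    """
    for _ in range(retries):
        try:
            opener(url, timeout=5)
            return True
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    return False


def missing_providers(ironic_node_uuids, placement_provider_uuids):
    """Return ironic node UUIDs with no matching placement resource provider.

    A non-empty result would explain instances no longer being scheduled to
    those nodes, and is cheap to check from a periodic cron job.
    """
    return sorted(set(ironic_node_uuids) - set(placement_provider_uuids))
```

A monitoring job could alert whenever `missing_providers()` returns a non-empty list, catching the symptom early even while the underlying bug remains open.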