11:04:27 #startmeeting scientific-sig
11:04:28 Meeting started Wed Jul 3 11:04:27 2019 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:04:29 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:04:31 The meeting name has been set to 'scientific_sig'
11:04:33 #chair b1airo
11:04:33 Current chairs: b1airo oneswig
11:04:43 janders: used the 2600TP white box servers?
11:04:44 sorry, laptop just froze at a very inopportune moment
11:04:59 oneswig: yes, those ones
11:05:11 do you have any experience with automating BIOS configuration?
11:05:12 Would prefer not to, the BMC and BIOS firmware was not good.
11:05:28 oops, then we might be a little screwed
11:05:37 janders: I don't think it is possible. I recall we had to upgrade NIC firmware on them and it was painful at the time.
11:05:46 (this was 2 years ago, things may have improved).
11:05:47 they have partial Redfish support but it seems very read-only, especially if done via Ansible
11:06:04 there are some new tools my guys are testing
11:06:05 sounds like things have evolved a bit then.
11:06:18 were they cheap enough to warrant it...?
11:06:37 but I'm in discussion with their devs, they seem to be keen to make things better
11:07:08 alright... it's good to know we're not alone
11:07:20 these seem a pretty cool piece of kit other than this "little" detail
11:07:41 I will try to work something out with Intel and am happy to report back if anyone is interested
11:08:00 always like a discussion on things like that...
11:08:27 belmoreira: I have John and Mark here, are you available to discuss next week?
11:09:21 should be fine. Send me your availability (day, hour) and I will check with the other guys
11:09:39 will do, thanks (and apologies for the thrash)
11:10:08 oneswig: no worries
11:10:35 How's the new control plane rollout going?
11:10:53 how's your control-plane k8s-ing going belmoreira?
11:11:05 jinx
11:11:12 :-)
11:11:19 :)
11:11:42 we have production requests going to the k8s cluster with glance
11:12:03 next step is nova-api (maybe next week)
11:12:11 nice!
11:12:17 and then have a full cell control plane
11:12:43 the main motivation is to gain ops experience and see if it makes sense to move the control plane to k8s
11:13:02 belmoreira: when complete, will you run a "small/internal" k8s to run the OpenStack control plane, which will then run "large/user-facing" k8s on top?
11:14:40 janders: not clear. What I would like is to have the k8s cluster deployed with Magnum in the same infrastructure. Like the inception that we have today for the control plane VMs
11:15:41 inception... is that OpenStack running itself, essentially?
11:15:52 if I've understood belmoreira's current cell-level control-plane architecture correctly then each control-plane k8s cluster will itself be running across some subset of his production cloud! it's turtles, or dogfood - hopefully those things aren't equivalent - all the way down/round
11:16:57 b1airo: that's it
11:17:54 memory can't be too rusty yet then, despite being part of a "fleshy von Neumann machine" (referencing a conversation earlier this evening)
11:18:06 anyone played with Singularity 3.3 yet?
11:18:16 fakeroot builds and such?
11:18:48 (it's still in rc for the moment I think, so won't be surprised if not)
11:21:38 belmoreira: got any tips for ceph mds performance? We've got a deployment with CephFS as a cluster filesystem and it's apparently pegged in directory operations
11:23:58 yeesh, is telling them not to do that an option, oneswig?
11:24:44 belmoreira: back on the self-hosted control plane, what have you looked at in terms of disaster recovery scenarios? How to fix a broken control plane from within?
11:25:05 b1airo: It was my idea ...
11:25:37 I have to head off
11:26:20 b1airo: can you take the reins? Got to head out now
11:26:34 oh, well in that case it's a fine idea :-P. I guess that means it's not a prod deployment that you're trying to fix though?
11:26:40 sure oneswig
11:26:47 see you janders belmoreira b1airo, nice to briefly catch up
11:26:59 b1airo: no, a bit of scientific experimentation
11:27:03 cheers all
11:27:06 oneswig: thank you, till next time
11:27:07 o/
11:27:55 sorry, was away for a few moments
11:28:13 I'm assuming oneswig must have already looked at directory fragmentation/sharding across multiple MDSs, but perhaps worth mentioning anyway
11:28:51 oneswig: on "ceph mds performance", I don't have great tips. Maybe Dan can help
11:30:38 oneswig: on "disaster recovery scenarios", in case of catastrophe... we always keep a few physical nodes to bootstrap the cloud. But that would be very unlikely
11:31:08 also, vul at unimelb might have some tips - they are running an HTC/HPC system against CephFS, so have probably seen some of these things
11:32:33 ok, I think we can probably call it quits for now. Late here anyway
11:32:46 agreed
11:32:49 thanks guys!
11:32:52 till next time
11:33:15 thanks
11:34:02 cheers!
11:34:06 #endmeeting
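
Follow-up note on the Redfish/BIOS discussion above: a minimal, read-only sketch of what "partial Redfish support" can still give you. This is not anything shown in the meeting; the BMC address and credentials are hypothetical, and whether a given 2600TP BMC exposes the standard Bios resource (or accepts writes to Bios/Settings) is exactly what was in question. The resource paths used are the generic DMTF Redfish ones.

#!/usr/bin/env python3
"""Read-only walk of a Redfish BMC to dump current BIOS attributes (sketch)."""
import requests
import urllib3

BMC_HOST = "https://bmc.example.org"   # hypothetical BMC address
AUTH = ("admin", "password")           # hypothetical credentials
VERIFY_TLS = False                     # many BMCs ship self-signed certs
urllib3.disable_warnings()             # silence the resulting TLS warnings


def get(path):
    """GET a Redfish resource and return its decoded JSON body."""
    r = requests.get(BMC_HOST + path, auth=AUTH, verify=VERIFY_TLS)
    r.raise_for_status()
    return r.json()


# Walk the standard service root: Systems collection -> each System -> Bios.
systems = get("/redfish/v1/Systems")
for member in systems.get("Members", []):
    system = get(member["@odata.id"])
    bios_ref = system.get("Bios", {}).get("@odata.id")
    if not bios_ref:
        continue  # this BMC firmware does not expose a Bios resource
    bios = get(bios_ref)
    # Current settings only; changing them would mean PATCHing the
    # Bios/Settings pending object, if the firmware implements that at all.
    for name, value in sorted(bios.get("Attributes", {}).items()):
        print(f"{name} = {value}")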