15:00:15 #startmeeting ironic
15:00:15 Meeting started Mon Mar 25 15:00:15 2024 UTC and is due to finish in 60 minutes. The chair is rpittau. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:15 The meeting name has been set to 'ironic'
15:00:22 o/
15:00:26 o/
15:00:31 o/
15:00:53 hello ironicers! welcome to our weekly meeting!
15:01:02 The meeting agenda can be found here:
15:01:02 #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_next_meeting
15:01:28 #topic Announcements / Reminders
15:01:50 #info Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio:
15:01:56 #link https://tinyurl.com/ironic-weekly-prio-dash
15:02:20 #info Project Teams Gathering (PTG) will be held from Monday, April 8 to Friday, April 12 2024
15:02:50 I'm going to leave the topics page open for another couple of days before finalizing the schedule
15:03:00 #link https://etherpad.opendev.org/p/ironic-ptg-april-2024
15:03:04 #link https://ptg.opendev.org/ptg.html
15:03:08 rpittau: If you want any help with that, feel free to reach out. I'll be around this week but gone next.
15:03:21 * iurygregory has one item to add to announcements
15:03:30 thanks JayF! definitely will need help :)
15:03:33 o/
15:03:41 iurygregory: go ahead
15:04:05 #info The CFP for the #OpenInfraSummit Asia is now open! Submissions for the CFP can be submitted in both English and Korean. The CFP closes May 29, 2024 at 11:59 PM KST!
15:04:13 #link https://openinfrafoundation.formstack.com/forms/openinfra_asia_summit_2024
15:04:27 awesome, thanks for the reminder
15:04:28 we also have some CFP open for the OpenInfra Days in Europe if I recall
15:04:41 Yes, they are also open
15:04:57 #link https://openinfra.dev/days
15:04:58 or at least, several of them I believe are open
15:05:18 yup
15:05:34 we're particularly interested in one of them
15:05:34 #info Ironic Meetup/BareMetal SIG June 5, OpenInfra Days June 6 @ CERN. Signup at https://indico.cern.ch/event/1378171/ and https://indico.cern.ch/event/1376907/
15:05:56 I will be there, hope to see a lot of you :)
15:06:29 anything else to add to announcements / reminders ?
15:06:55 I'll actually add one more thing now :D
15:07:30 I'd like to thank JayF for his work as PTL for 3 cycles!
15:07:43 ++
15:07:48 ++
15:08:24 Thanks :) PTL of Ironic is the easiest leadership gig in OpenStack :D
15:08:33 lol
15:08:59 well I should be relieved since I'm officially the PTL now :P
15:09:09 rpittau: congrats!
15:09:15 thanks!
15:09:16 congrats rpittau \o/
15:09:44 Congratulations :D
15:09:50 thank you all! :)
15:09:53 Congratulations
15:10:24 ok, moving forward
15:10:33 #topic Caracal Release schedule
15:10:41 we're at R-1 week
15:11:10 the deadline for the final release is March 28th, so this week
15:11:50 I think we want to make sure to do one more release this week with the scciclient bump and hopefully won't have the release team yelling at us :D
15:11:55 cid proposed openstack/ironic master: Special treatment of .json is now disabled for nodes with .json extension. https://review.opendev.org/c/openstack/ironic/+/913467
15:12:36 #topic Review Ironic CI Status
15:12:47 I was out last week, is the metalsmith job still kaput?
15:13:09 dunno, I had a family emergency last week which largely consumed me
15:13:29 I also didn't look too much at CI upstream last week .-.
15:13:53 Nobody fixed the metalsmith job.
15:13:55 AFAICT
15:14:24 ok, I think the issue was due to the most recent tinyipa, but didn't also have the time to check
15:14:57 can anyone double-check this week? otherwise I'll see if I can make some time
15:15:27 it is definitely the new tinyipa
15:15:56 It tosses an error. I think we have two base questions, what to do with tinyipa and what to also do with metalsmith
15:16:26 the issue is just on the legacy job, UEFI seems to work just fine
15:16:50 ++
15:16:59 so maybe the question becomes, are we okay with dropping the legacy job
15:17:28 I will note the number of cases I'm seeing on the actual support ironic side of things where people are trying to intentionally use bios boot are almost zero these days
15:17:40 we've disabled the job in IPA for the time being
15:17:40 is that the only legacy job we have ?
15:17:51 So continuing to worry about legacy bios booting as much as we do might not be sustainable given the consumption pipeline
15:18:23 I'm fairly sure we have some others, but I don't think we need *all of them*
15:18:31 that is a valid argument
15:18:40 the base issue is though, we've seen tinyipa fail on other legacy bios jobs as well which are not metalsmith
15:19:09 ok, didn't see that either
15:20:20 So maybe we can drop the metalsmith bios job and look at the rest of the failing bios jobs?
15:20:32 I say we give it a try for a fix, but if it's not immediately clear what's wrong, we remove the metalsmith legacy job
15:21:33 What are we actually testing re: bios vs legacy boot?
15:21:38 Is there a way to get there without a full scenario job?
15:22:24 UEFI based boot versus BIOS based boot
15:22:51 in this case, things fail when the ramdisk is tinyipa and it hits the bios path with partitioning
15:23:11 parted throws a "you found a libc bug!" error
15:23:24 What I'm asking is, in terms of *what Ironic does differently*
15:23:35 sets the modes differently to the BMC
15:23:40 different options get used with IPMI
15:23:41 Are we testing anything other than the ability for e.g. sushy-tools to translate "boot uefi"
15:23:46 the ramdisk will partition things differently as well
15:24:02 TheJulia: I will note I know cases of BIOS booted machines in production with modern-ish ironic
15:24:09 but we have extensive unit testing around that *as well*
15:24:37 JayF: but are new machines being deployed in 1+ years?
15:24:43 *with* tinyipa?!
15:24:45 TheJulia: re your comment above: we literally have legacy boot in the Metal3 quick start guide because it was written against ilo4
15:24:47 the issue is tinyipa
15:24:59 TheJulia: I am making no comments re: how much we should test, just noting a data point to go with the one you asserted earlier :)
15:25:05 TheJulia: it's coming late because of latency :D
15:26:01 I think we're much more likely to end up with better ironic slimming the coverage so we spend more time making Ironic better and less time fiddling with tinyipa and zuul
15:26:20 Look, the tl;dr is we're at a union of the venn diagram of "resources" "tinyipa limitations" and "complexity", I'm not saying we drop legacy boot entirely, I'm saying we work the fundamental problems and reduce our exposure
15:26:49 The failure, we know is libc not playing nicely with parted in tinyipa
15:27:15 we know, because of differing clouds and ipa images which get used, that this doesn't impact dib based images
15:27:58 That does mean removing the coverage for our parted code completely
15:28:06 parted+grub I guess
15:28:20 it's the only legacy job in ipa
15:28:25 parted+grub with legacy boot?
15:28:34 grub implies legacy boot, yes
15:28:34 That seems wacky to me, you can't even secure boot on legacy/bios?
15:28:45 that is only if we entirely remove legacy boot support, and I'm not advocating that purely, just thinking the reality is we cannot keep metalsmith stuffs around forever
15:29:10 Metalsmith is also deprecated as of last ptg, which should impact our choices as to how coverage falls
15:29:11 yeah, I'm just highlighting the code path left with zero testing
15:29:30 Metalsmith is not deprecating, nor are we discussing coverage for it
15:29:46 and my point ends up being, do we "really" need to test that moving forward given the amount of hardware out there that simply doesn't even have bios support in the classical sense anymore
15:30:15 I'm sensing a larger discussion at PTG for this :)
15:30:18 we changed our default... ?2? or ?3? years ago
15:30:22 I don't mind us stopping supporting that eventually, but we need to be careful with expectations and messaging
15:30:24 rpittau: oh very yes :)
15:30:28 dtantsur: https://github.com/openstack/metalsmith/commit/e4fd02fa30164de00bc5a354af954b503f42c89b deprecated was too strong of a word, and I now also remember we often use it here for "going away eventually" while others use it for "already gone"
15:30:30 dtantsur: ++++++
15:31:03 JayF: yep, hence we used a softer wording. Metalsmith can live in maintenance mode for a long time, not that it's going to rot (unlike its CI job)
15:31:13 so
15:31:22 let's see if we can find the time to at least understand the root cause of the failure, maybe someone will be able to reproduce the issue locally
15:31:22 then we can discuss at PTG for the next steps
15:31:22 wdyt?
15:31:25 what is our fastest/easiest fix, I think it is to pull tinyipa usage out
15:31:33 We could start with "We recommend users of partition images to only use UEFI mode because of lack of testing"
15:31:33 or start to dial it out of existence
15:31:51 That is also an easy path :)
15:32:00 I wonder if the metalsmith jobs could use IPA images, by the way
15:32:08 DIB images ?
15:32:10 I suggest we table some of this for PTG? Discussion around replacing tinyipa with $something is already scheduled for it
15:32:12 yeah
15:32:14 They probably have more RAM because they use "normal" OS
15:32:21 And I think that technical outcome will impact this discussion
15:32:43 wondering the same, but I think they're just too big
15:32:43 anyway, we need more time as this expands rapidly. so PTG it is :)
15:32:43 e.g. we wouldn't even be having this chat if tinyipa itself wasn't bugged, I suspect
15:32:53 yeah
15:33:28 I'll add the topic after the meeting, unless it's already there :D
15:33:59 moving on!
15:34:09 #topic Bug Deputy
15:34:22 JayF: anything to report?
15:34:58 That I'm extremely bad at prioritizing this kind of work in light of a week of firefighting :(
15:35:08 :D
15:35:17 I apologize, nothing to report, nothing meaningful done here, my time has been eaten 15 minutes at a time
15:35:24 no worries!
15:35:32 Best bug deputy thing I did was linking the gigabyte server bug to the person on the list lol
15:35:52 I was wondering if we had any interesting/priority bug
15:36:18 JayF: what day was that last week? I saw mention of it, but I don't remember the what anymore
15:36:32 TheJulia: I'll just link you the post directly, easier that way
15:36:47 TheJulia: https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/VILS5HFLO4FKG4G7ZVDB7HN3YO7HGJRE/
15:38:52 we also need a volunteer for the bug deputy for the week, anyone available?
15:39:48 Gah, no arne to see if he replied to Arne directly
15:39:53 Eh, I might be able to
15:40:00 TheJulia: thanks!
15:40:18 #info bug deputy for the week: TheJulia
15:40:39 we don't have any RFE to review so I'll jump to
15:40:39 #topic Open Discussion
15:41:12 next monday is bank holiday in Europe and I guess North/South America too ?
15:41:25 I guess we can cancel the meeting
15:41:42 I've alluded to this various places, I'm going to be on PTO, and completely out of pocket next week with notifications off. If the world somehow pivots on getting in touch with me, many of you have my personal cell phone -- use it.
15:42:00 JayF: enjoy! :)
15:42:19 anything else for Open Discussion ?
15:42:33 I think I put it in as a bug, but I have a RFE.
15:42:58 dking: sure, a link ?
15:43:16 If anybody would be interested in looking it over: https://launchpad.net/bugs/2057668
15:43:36 I put in a commit for review: https://review.opendev.org/c/openstack/ironic-python-agent/+/913209
15:44:00 I thought it might be interesting to bring up as Jay will be out next week.
15:45:07 dking: the idea seems reasonable to me
15:45:15 thanks dking, it looks ok to me
15:45:19 That seems like a good feature to me, for folks doing customization, but it does overlap pretty heavily with the skip hinting we do
15:45:25 and actually would help folks with the super complex support cases, reminiscent of arne :)
15:45:41 I'd suggest considering, as part of this implementation, adding a new hardware manager to examples/ that covers the use case not handled by existing code
15:46:13 The skip hinting is a neat feature. Unfortunately, I needed something with a little bit more control. I think that I'm often that guy.
15:46:34 JayF: that was my thought as well, plus at a minimum a release note is needed :)
15:46:35 Yeah, I am +1 to the rfe just suggesting you ensure the path is paved with examples for the next "that guy" :D
15:46:45 dking: Also, add a release note :)
15:47:40 TheJulia: Okay. I might need some pointers.
15:47:56 dking: can you please fix the indentation in https://review.opendev.org/c/openstack/ironic-python-agent/+/913208 also ?
15:48:35 I can add a bit to an example hardware manager. I could probably put it on one of the existing ones.
15:48:48 dking: on the text, or use of the reno (pip install reno && reno new format?
15:48:56 I specifically request it's a separate one please dking :)
15:49:04 rpittau: sure. I did a squash merge and forgot about that.
15:49:10 dking: thanks!
15:49:24 JayF: Sure.
15:49:44 dking: fyi, https://docs.openstack.org/reno/2.1.1/usage.html
15:50:35 going to close the meeting, we can keep discussing afterwards if needed
15:50:38 thanks everyone! o/
15:50:42 In a related issue, it may also be nice to go back and review the current hooks available for hardware managers. I have a separate bug in because we have some methods in HardwareManager which almost seem to imply that they could be overridden, but are never called with dispatch_to_managers.
15:50:47 #endmeeting