Monday, 2022-11-07

opendevreviewIan Wienand proposed zuul/zuul-jobs master: test-registry-post: collect k8s logs  https://review.opendev.org/c/zuul/zuul-jobs/+/86378101:45
ianwhttps://zuul.opendev.org/t/openstack/build/f9a34ce5f5b04f89b269620681f598bc/console failed with ansible-playbook returning '4' ... but there appears to be no error in the actual ansible run02:02
ianwthe next run passed -> https://zuul.opendev.org/t/openstack/build/8063fd9d6ee4453d8f7a5dabc943528a02:48
*** yadnesh|away is now known as yadnesh04:20
Clark[m]ianw: I think rc 4 means network issues to the inventory nodes. May have just been a blip?04:31
opendevreviewIan Wienand proposed zuul/zuul-jobs master: enable-kuburnetes: 22.04 updates  https://review.opendev.org/c/zuul/zuul-jobs/+/86381005:22
ianwClark[m]: yeah, i hope so.  Today I slowly merged everything up to the "refer to things via prod_bastion" change and afaics everything is still happy05:23
*** marios is now known as marios|ruck05:59
*** yadnesh is now known as yadnesh|afk07:30
*** yadnesh|afk is now known as yadnesh08:02
*** jpena|off is now known as jpena08:42
*** marios|ruck is now known as marios|ruck|call09:00
*** marios|ruck|call is now known as marios|ruck09:21
*** soniya29 is now known as soniya29|afk09:57
*** dviroel|out is now known as dviroel10:16
*** diablo_rojo_phone is now known as Guest70210:39
*** pojadhav- is now known as pojadhav11:04
*** yadnesh is now known as yadnesh|afk12:32
dtantsurhey, do I get it right that I cannot use required-projects to clone from github?12:42
fungidtantsur: you can, zuul just needs to know about the repo first12:46
dtantsurah, great, do you have any references? I'm looking at the user guide, but probably a wrong place..12:49
fungidtantsur: you'll find a bunch of them we've added in its config here: https://opendev.org/openstack/project-config/src/branch/master/zuul/main.yaml#L1415-L146412:49
fungiif there are some you need, propose the addition with a new change12:49
dtantsurfungi: oh, so I cannot do it without modifying the openstack-wide configuration?12:50
fungidtantsur: yes, any repository (even those we host in our gerrit) needs to be listed in zuul's configuration if we want it included12:51
dtantsurack, thank you! Then I'll prototype with a manual git clone first.12:51
fungithe main benefits you'll get from inclusion in the config is caching on the executors (so fewer jobs failing due to a network error cloning from github), and the ability to use depends-on to github pull requests12:52
dtantsuryep. I just don't want to change the global configuration until we prove that the thing even works.12:53
*** yadnesh|afk is now known as yadnesh|away13:09
*** dasm|off is now known as dasm14:11
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add another role for Zookeeper installation  https://review.opendev.org/c/openstack/project-config/+/86315814:22
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add os_skyline repo to CI  https://review.opendev.org/c/openstack/project-config/+/86316714:23
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add repository for Skyline installation by OpenStack-Ansible  https://review.opendev.org/c/openstack/project-config/+/86316514:23
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Add os_skyline repo to CI  https://review.opendev.org/c/openstack/project-config/+/86316714:23
*** dviroel is now known as dviroel|lunch15:08
fricklerclarkb: since there seems to be no more progress in the storyboard mailing thread, I've added the topic to the meeting agenda, maybe johnsom gtema melwitt are interested in joining (tomorrow 19UTC). or maybe have a dedicated talk that is more EU friendly (for me and gtema and possibly sean)?15:33
gtemayes, a better slot is appreciated15:34
johnsomI am west coast US, so something compatible would be appreciated.16:05
*** sfinucan is now known as stephenfin16:10
JayFI felt a little weird participating in that thread given Ironic already made the decision in PTG to go from storyboard->LP 16:12
JayFso I think we're sorta past the decision phase for that project16:12
JayFI wonder if others were in the same boat; there's not much of a call to action on that thread if you've already decided to move (back) to LP16:12
frickleriiuc clarkb was hoping there would be more interest in keeping storyboard alive, which I don't see happening. for sdk I'd think we're mostly at the same point16:14
fricklerso for me there are two things to discuss: a) is anyone interested in building tooling to help with moving from sb to lp? otherwise only manual moving of issues would be possible16:15
clarkbyes, the call to action was "tell us if you would like to see this better supported" so that we can find a way to do that somehow. But no one has done that16:15
fricklerb) (more an opendev topic) how long do we want to keep running sb if users are fleeing from it16:15
fungialso please avoid dramatic terms like "fleeing" as that's the sort of attitude which has led me to completely avoid reading any of that ml thread so far16:17
fricklerwell that matches my personal feelings about it, sorry if that sounds offensive16:17
funginot offensive, just overtly negative, and i'd rather spend my time dealing with more positive community interactions16:19
fungii only have so much bandwidth for negativity in my day, and try not to reach my quota if it can be helped16:20
clarkbfwiw in my email I hinted that after collecting feedback from anyone trying to use storyboard long term, and on how we can support that, we could follow up with supporting alternatives16:21
clarkbit does sound like the vast majority of people are looking at lp.16:22
clarkbmy high level goal was to get the discussions out of individual project comms and into a broader forum so that we could avoid duplication of work and ensure we weren't stepping on each others toes16:22
clarkbmaybe now is a good time to followup on discussion for those looking to move? and then we can worry about a call later if that is necessary16:23
fricklerat least for openstack projects LP makes most sense, there is some integration and there are a lot of projects already/still using it16:23
fricklerso maybe the next question would be: which projects like storyboard and would enjoy continuing to use it?16:23
clarkbwell that was the initial question I posed16:24
clarkbI was hoping to find help/support/aid for storyboard before anything else as that has the potential to vastly improve the storyboard situation16:25
johnsomI think we have felt the pain of not having all projects on the same platform, so that is the motivator for LP over some other tool.16:25
clarkbyes, I think one of the major pieces of feedback for openstack as a whole is that openstack should probably do its best to stay on a single platform16:26
clarkbsince much of the feedback to that thread indicated this was problematic. It does make me wonder if people aren't aware of LP's external bug tracker linking, but I don't think that would solve all the problems16:27
johnsomI also think we have a "build" vs "buy" question here. Is there enough advantage to get support to spin up a zuul like effort or should we just use something that already exists.16:27
fungialso the counterargument we've heard often is that people who are employed by canonical's competitors are often disinclined to report or work on bugs for openstack projects because that means they have to sign up for an "ubuntu" account (and this is apparently an emotional topic for some)16:28
fungiputting our own central authentication system together addresses that concern for the services we're hosting, but would not solve it for launchpad16:29
johnsomYeah, I think that was one of the main goals of storyboard, to have the foundation logins work.16:33
*** marios|ruck is now known as marios|out16:33
clarkbjohnsom: that's definitely something to account for and something that years ago people definitely felt was worthwhile. Part of bringing this up is acknowledging things have shifted (hence individual project planning) and needing to reevaluate. But I was still trying to answer the question of whether or not there was even interest first16:33
clarkbI'm good with accepting no one has expressed interest yet so it is unlikely to occur in the future and taking the discussion from there which is what next. I do think it may be a bit early for a conference call though and we can continue to discuss on the mailing list?16:34
*** dviroel|lunch is now known as dviroel16:35
johnsomYeah, I don't have a need for a call personally.16:35
clarkbit is worth noting that openstack's needs/demands/requirements largely drove the creation and development of storyboard. If openstack isn't pushing for that anymore and isn't providing maintainers/operators and is looking at moving projects back to lp then I think it is important that we reevaluate opendev's caretaker position too16:35
johnsomI think what I shared was the top of mind perspective many on the Octavia team have.16:36
fricklerI don't like mails so I wanted to move to IRC for a bit, but feel free to send mails if you prefer. my main focus was to get things moving again, which it seems I've succeeded in16:36
clarkbotherwise openstack will continue to complain at us for something that openstack orphaned and handed over.16:36
clarkbfrickler: I think it is important to keep this stuff on the mailing list as much as possible so that as many people as possible remain informed and don't feel decisions were made quickly on a conference call (this is the whole reason I pushed the discussion to the mailing list as it felt like openstack was essentially saying we don't want this thing anymore and not telling everyone16:37
johnsomlol, well, I am not a fan of us-or-them, I still feel we are one community.16:37
clarkbthat was necessary to involve)16:37
clarkbjohnsom: I agree we are one community, but in this particular case it really feels like people have actively excluded us16:37
clarkband I'm reacting to that16:37
clarkbwhich is why I'm pushing hard to keep discussion on the mailing list. I'll try to write a followup email today16:38
johnsomFrankly, many of us were not aware of that email list.16:38
fricklerwell afaict there is no "openstack", it is individual projects discussing things16:38
clarkbjohnsom: I spent a year telling people to subscribe and no one did and it is listed on opendev.org.16:38
fungiless us-vs-them and more now-vs-then... the people who were insistent this solution was worthwhile are no longer around or have changed their minds16:38
clarkbfrickler: thats fair. But I think that is a problem too16:38
johnsomYeah, we knew about announce, but this one was missed somehow.16:38
clarkbI think it is fine to have the discussions but they shouldn't be hidden away16:39
johnsomAgreed16:39
fricklerI don't agree they are hidden, most discussions are documented well in the respective projects PTG notes16:39
johnsomI was commenting on conference calls where details don't always get captured well.16:40
clarkbfrickler: I was unable to attend any of the discussion and I spent a fair bit of time pre ptg looking over etherpads to make a rough schedule so that I wouldn't miss stuff like this. It is possible that this comes down to a deficiency in how things get scheduled at the ptg. But even then I think discussions like this should go to a mailing list and be more async16:40
fungithe concern raised was that different openstack subprojects having the same basic discussions in parallel may end up making different choices if they weren't made aware of those other discussions, and that could lead to even greater divergence16:41
fricklerthat's why I was suggesting IRC meeting, not conference call like zoom or other16:41
johnsomThe topic wasn't planned in our session, but was added after another project raised the topic.16:42
fricklerfungi: I tried to cross connect those discussion where I was aware of them16:42
johnsomfrickler Thank you for the heads up about the email.16:42
clarkbanyway I think the thread has shown that there is value in openstack maintaining consistency for bug tracking as one of the issues people have is the tool split. Given that I think it is even more important we try to centralize the discussion as much as possible and involve as many as possible16:42
clarkbto me that means keeping things on the mailing list as long as possible is important16:42
clarkbit keeps a public record of discussion and allows people to jump in asynchronously with their input16:43
fricklerwhen discussing a common strategy for openstack, using the openstack-discuss would likely be more appropriate though16:43
clarkbagreed16:44
fungibut initiating the discussion on an opendev mailing list at least gives other non-openstack users of the service more of a chance to see it and weigh in16:45
fungii agree, though, that openstack-specific decisions about it are best handled on openstack-discuss16:46
clarkbI'll work on a followup to the thread to try and summarize ^ which is basically "it's been a bit with no additional feedback. We've not seen anyone indicate a desire to keep using storyboard and help maintain it. Storyboard was built for openstack needs/requirements and it seems like those have changed over time. What next for opendev and for openstack?" and we can split that16:47
clarkbdiscussion off into openstack discuss for openstack specifics16:47
fricklerI also haven't seen any non-openstack responses, which is why I think explicitly addressing those with a direct question "would you like to continue using storyboard?" would also be helpful 16:48
clarkbok. I'll incorporate that into my email.16:48
fricklermaybe also either do a single crosspost or send pointers to other project lists like zuul16:49
clarkbwell that's the sort of thing I've been trying to avoid. We spent a year doing that and telling people to subscribe to these lists if they wanted to be involved. We also attempted to get project liaisons.16:51
clarkbI think it is great for openstack to have an openstack specific discussion with an end result that can be fed back to the original discussion but it gets incredibly confusing when people start posting concurrent discussion back and forth with missing emails and so on16:51
clarkbBasically at some point I have to stop trying to cross post to the world.16:52
clarkbI feel it is incredibly unfair to me in particular to be asked to do that in perpetuity. I did it for about a year with clear warnings I would not continue to do so16:53
clarkbit creates a significant amount of busywork on my part16:53
frickleranother option might be to consider whether it is worth to send a mention of this discussion to service-announce, which seems to have a much larger audience. otherwise some people might only read the announcement of us telling that we can no longer run storyboard, which (trying to be as positive as possible) seems to be the only feasible result if nothing else happens16:57
clarkbYes, that may be worthwhile. A single email pointing people to the thread keeps -announce low volume but does potentially make more people aware16:57
clarkbI can do that16:58
clarkbI can use that as an opportunity to remind people of the three lists we maintain and what their purposes are16:58
fricklerI'll also point tc-members to what we just discussed, maybe one of them will have interest in taking over organizing the openstack side of the discussion17:03
slittleAnyone else having gerrit issues this morning?17:27
slittleError submitting review TypeError: NetworkError when attempting to fetch resource17:28
clarkbI've only done a couple reviews, but did not have problems. NetworkErrors can be one of many things. Might help to try and narrow it down (ipv4 vs ipv6, is one more reliable than the other? etc)17:29
clarkbfrickler: ok followup sent. Now to send the pointer email17:30
slittleipv417:32
fricklerslittle: do you have a complete traceback?17:34
frickler(assuming that is part of git-review output)17:35
slittledns and ping to review.opendev.org are ok.  review02.opendev.org is answering17:36
*** jpena is now known as jpena|off17:37
slittlemy signin expired while writing my comment on https://review.opendev.org/c/starlingx/kernel/+/863603.17:38
clarkbslittle: that may be related to network connectivity issues too17:38
clarkbthe gerrit UI sometimes interprets network errors as authentication issues. If you refresh (and the network problem isn't persistent) it comes back17:39
fungithe string "fetch resource" doesn't appear in the git-review source, so that exception is likely getting raised by one of its dependencies17:39
slittleyep, that did it17:40
fungiactually i can't find "Error submitting" in the git-review source either17:42
clarkbfrickler: and now pointer email has been sent17:49
*** dviroel is now known as dviroel|afk19:40
opendevreviewIan Wienand proposed zuul/zuul-jobs master: [wip] enable-kuburnetes: debugging 22.04  https://review.opendev.org/c/zuul/zuul-jobs/+/86381020:21
opendevreviewJay Faulkner proposed openstack/project-config master: Allow Ironic cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393120:42
opendevreviewMichael Johnson proposed openstack/project-config master: Allow Designate cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393220:50
opendevreviewMichael Johnson proposed openstack/project-config master: Allow Octavia cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393420:55
opendevreviewJay Faulkner proposed openstack/project-config master: Allow Ironic cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393121:15
opendevreviewMichael Johnson proposed openstack/project-config master: Allow Designate cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393221:18
opendevreviewMichael Johnson proposed openstack/project-config master: Allow Designate cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393221:20
opendevreviewGhanshyam proposed opendev/irc-meetings master: Update TC weekly meeting Day & time  https://review.opendev.org/c/opendev/irc-meetings/+/86393921:21
clarkbinfra-root heads up I'm working on testing the server rescue paths for regular disk and bfv nodes in vexxhost ca-ymq-121:23
clarkbI think if that all works out I may just need to push a docs update to document the process for us?21:23
opendevreviewMichael Johnson proposed openstack/project-config master: Allow Octavia cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393421:24
opendevreviewMichael Johnson proposed openstack/project-config master: Allow Designate cores to toggle WIP state  https://review.opendev.org/c/openstack/project-config/+/86393221:24
clarkbinfra-root this is really interesting. I've rescued a normal disk instance and it booted into my test nodes root / using the rescue instance's kernel21:29
clarkbI did not expect that21:29
clarkbI did use our test images as the base image for the test node though. Maybe our boot by label is winning?21:29
opendevreviewIan Wienand proposed zuul/zuul-jobs master: [wip] enable-kubernetes: check pod is actually running  https://review.opendev.org/c/zuul/zuul-jobs/+/86381021:29
opendevreviewIan Wienand proposed zuul/zuul-jobs master: ensure-kubernetes: move testing into common path  https://review.opendev.org/c/zuul/zuul-jobs/+/86394021:29
clarkb(I did that because it made the root disk content distinct enough to know if the rescue was working but there are other ways to check that)21:30
ianwso basically that doesn't help us if the root disk init is borked right?  21:31
clarkbright I think this is undesirable behavior21:31
clarkbbut it may be that grub is finding our cloudimg labeled disk on the test nodes and it is winning?21:32
clarkbI'm going to boot a new test instance based on the cloud provided image and see if the behavior changes21:32
ianwthat could be the case, if the rescue instance boots with the same kernel args as a regular instance i guess.  if both disks are attached, i guess it would find the LABEL=cloudimg...whatever we call it21:33
ianwbut that wouldn't happen with control plane nodes, though?  21:33
clarkbyes both appear to be present21:34
clarkband it is finding the rescue image's kernel21:34
clarkbbut that kernel is running with / mounted from what we are trying to rescue21:34
clarkband now I've unrescued and it booted its normal kernel again21:35
clarkb(I'm glad I used different ubuntu versions to catch that)21:35
clarkbusing the regular cloud images produces the same result21:41
clarkbthe node reboots using the rescue image kernel but / is from the node being rescued21:41
clarkbit's weird that it can find the rescue image kernel but then somehow selects /dev/vdb1 to mount as / instead of /dev/vda121:42
clarkbmelwitt: mnaser__ ^ that behavior is really surprising to me. Do y'all know if that is expected from the nova or cloud side of things?21:43
clarkbI'm going to test bfv next to see if it does the same thing21:44
mnaser__uh thats weird21:45
mnaser__from nova's pov, it will boot the vm with the rescue image as vda and original as vdb21:45
mnaser__the interaction once it boots isn't controlled, but i suspect the issue is in the label=cloudimg21:45
clarkbmnaser__: yup that is what I see in our testing. The weird thing is / is mounted from vdb not vda21:46
clarkbmnaser__: ya it looks like both the cloud (vexxhost) provided image and our test node images built with dib set cloudimg-rootfs21:47
clarkband maybe we're seeing behavior where the initramfs is having to pick one of the two?21:48
clarkbI wonder if nova needs to take an extra step here to force a device somehow21:48
melwittwhat version of nova is this?21:48
clarkbmaybe by attaching the vdb volume post boot21:49
clarkbs/volume/device/ (it may not be a volume)21:49
melwittasking bc I wondered if https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/virt-rescue-stable-disk-devices.html could be related21:49
clarkbthat definitely seems related21:50
clarkbmelwitt: is that spec essentially proposing that clouds have rescue specific images that can have properties that help ensure the correct behavior?21:51
melwittclarkb: I'm still trying to parse the spec as well ... unfortunately don't know much about rescue. this spec was implemented in ussuri21:53
clarkbmelwitt: "The rescue root device will be added as the last device in the configuration, but will be marked as bootable for the BIOS, so it takes priority over the existing root device. This relies on KVM/QEMU supporting the “bootindex” parameter, which all supported versions do." that bit doesn't appear to happen here as we have /dev/vdb1 mounted. That implies to me the rescue21:54
clarkbimage was not added last (but I guess linux may not guarantee the ordering to be stable?)21:54
clarkbI think what is happening here is both the rescue image and the "prod" image have label set to cloudimg-rootfs on their respective / then the kernel boot lines for grub say to mount / using that label and it's finding one or the other21:55
clarkba fix for that behavior outside of nova may be to use a rescue specific image that mounts something other than label=cloudimg-rootfs21:55
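[editor's sketch] The label collision discussed above can be illustrated roughly as follows. This is only a sketch under stated assumptions: the (device, label) pair representation, the function name find_ambiguous_labels, and the sample device layout (including the /dev/vda15 entry) are hypothetical and not taken from any real tool's output. It flags filesystem labels carried by more than one device, which is the situation in which mounting / by label=cloudimg-rootfs cannot identify a unique disk.

```python
from collections import defaultdict


def find_ambiguous_labels(filesystems):
    """Return {label: [devices]} for labels that appear on more than one device.

    `filesystems` is an iterable of (device, label) pairs. That shape is an
    assumption made for illustration, not the output format of a real tool.
    """
    by_label = defaultdict(list)
    for device, label in filesystems:
        if label:  # ignore filesystems without a label
            by_label[label].append(device)
    # Keep only labels that more than one device claims.
    return {label: devs for label, devs in by_label.items() if len(devs) > 1}


# Hypothetical layout during rescue: rescue image on vda, original disk on vdb.
rescue_layout = [
    ("/dev/vda1", "cloudimg-rootfs"),
    ("/dev/vdb1", "cloudimg-rootfs"),
    ("/dev/vda15", "UEFI"),
]
ambiguous = find_ambiguous_labels(rescue_layout)
print(ambiguous)
```

If a rescue image used a different root label, as suggested above, the result for that layout would be empty and a by-label mount would have only one candidate.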
clarkbok I've also now confirmed that you cannot rescue bfv nodes22:02
clarkbboth of these things seem like problems? I guess with a bfv node you can shut it down, detach the root disk, boot a new instance and then attach the volume for the other node?22:03
opendevreviewMerged opendev/system-config master: Reference bastion through prod_bastion group  https://review.opendev.org/c/opendev/system-config/+/86284522:03
clarkbmelwitt: ^ do you have any idea if that sort of process is what nova expects people to do for boot from volume nodes?22:03
melwittclarkb: is this cloud older than ussuri? bc both of those things appeared to have been addressed in ussuri https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/virt-bfv-instance-rescue.html22:04
melwittok, for bfv the microversion must be passed to get the new behavior https://docs.openstack.org/nova/latest/user/rescue.html22:05
clarkbmelwitt: I don't know what version the cloud is. This is vexxhost's montreal public cloud region22:12
clarkbok it doesn't look like openstackclient exposes that to me. The only options are the image to rescue with and the password. It probably could pass in a microversion new enough to do that if the cloud supports it otherwise return an error?22:13
clarkboh in this case the error comes from the cloud. So maybe what the client should do is try the newer microversion if available otherwise fall back to default microversion and pass any errors through?22:14
melwittyou passed the microversion like this openstack --os-compute-api-version 2.87 server rescue SERVER22:14
melwittor are you using the old novaclient22:15
clarkbno this should be latest openstackclient. I think the issue is I'm doing openstack server rescue --help and that doesn't say anything about microversions.22:15
clarkbas an end user if --help on the command I want to run doesn't give me the help I need I think that is a flaw :)22:15
melwittopenstackclient historically has defaulted to the lowest microversion, so unfortunately has to be specified like that currently22:16
clarkbbut also I shouldn't have to specify a microversion if a newer one is necessary to complete the request I've made22:16
clarkbthe tool should just know that and do it for me22:16
melwittclarkb: you are right, that is a flaw if the help doesn't have that info22:16
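[editor's sketch] The fallback clarkb suggests (use the newer microversion when the cloud offers it, otherwise stay on the default) could look roughly like the following. This is not how openstackclient behaves today, per the discussion above; the function names and the "2.1" default argument are illustrative assumptions. Microversions are compared as (major, minor) integer pairs, since comparing them as decimals would order 2.9 after 2.87.

```python
def parse_microversion(text):
    """Turn a string like '2.87' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def pick_microversion(server_min, server_max, wanted, default="2.1"):
    """Return `wanted` if the server's range covers it, else `default`.

    Hypothetical helper: it only selects a version string, it makes no
    API calls and does not validate that `default` is itself supported.
    """
    low = parse_microversion(server_min)
    high = parse_microversion(server_max)
    if low <= parse_microversion(wanted) <= high:
        return wanted
    return default


# A server advertising 2.1 through 2.88 can serve the 2.87 request.
print(pick_microversion("2.1", "2.88", "2.87"))
# A server whose maximum is below 2.87 falls back to the default.
print(pick_microversion("2.1", "2.60", "2.87"))
```

With a fallback like this, any error from the cloud at the default microversion would still be passed through to the user, as proposed above.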
clarkbdoes anyone know if you can do discovery through the openstack client? Looks like catalog show and list give me the default microversion info (2.1)22:19
melwittI don't think it's possible ... but I checked the doc and found you can at least see what range is available per service with 'openstack versions show'22:26
opendevreviewIan Wienand proposed zuul/zuul-jobs master: [wip] enable-kubernetes: check pod is actually running  https://review.opendev.org/c/zuul/zuul-jobs/+/86381022:26
clarkbI discovered too that you can fetch the root of the nova api endpoint from the catalog (just drop the version specific stuff) and it gives you similar info22:27
*** dasm is now known as dasm|off22:27
clarkbI used version 2.88 since that is the newest supported22:29
clarkband no error now22:29
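[editor's sketch] Reading the supported microversion range out of the version document served at the root of the compute endpoint might look like this. The sample document is hand-written for illustration and is an assumption about the response shape (a "versions" list whose entries carry "status", "min_version" and "version" fields); it is not captured from the cloud discussed here, and supported_range is a hypothetical helper name.

```python
def supported_range(version_document):
    """Return (min_version, max_version) from the CURRENT API entry.

    Raises ValueError if no entry is marked CURRENT.
    """
    for entry in version_document.get("versions", []):
        if entry.get("status") == "CURRENT":
            return entry["min_version"], entry["version"]
    raise ValueError("no CURRENT API version advertised")


# Illustrative document only; field names are assumed, values are examples.
sample_document = {
    "versions": [
        {"id": "v2.0", "status": "SUPPORTED", "min_version": "", "version": ""},
        {"id": "v2.1", "status": "CURRENT", "min_version": "2.1", "version": "2.88"},
    ]
}
version_range = supported_range(sample_document)
print(version_range)
```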
clarkbhowever that then failed with an error. server show on the instance shows "cannot be rescued: Driver Error: Cannot access storage file" and an info with paths and uids and stuff22:30
clarkbunrescue also fails because the instance is in an error state22:31
clarkbmnaser__: ^ should I try deleting the instance (60d798a1-f7f5-4c65-8711-7654040ca180) or is that something you might be interested in looking at and I should leave it be?22:32
mnaser__clarkb: if you can send me a paste of the error and you can wipe it after22:32
clarkbthe instance deleted, but the volume did not. I can't remember if --boot-from-volume XY implies deleting the volume on instance deletion or not so this may be expected.22:38
*** cloudnull6 is now known as cloudnull22:38
* clarkb manually deletes it22:38
clarkbI believe all the resources I created to test this stuff have been cleaned up now. And we learned some good info22:41
opendevreviewIan Wienand proposed zuul/zuul-jobs master: [wip] enable-kubernetes: check pod is actually running  https://review.opendev.org/c/zuul/zuul-jobs/+/86381023:19
clarkblooks like the snapd removal change landed at some point23:38
clarkbI guess no news is good news on that one :)23:38
opendevreviewIan Wienand proposed zuul/zuul-jobs master: [wip] enable-kubernetes: check pod is actually running  https://review.opendev.org/c/zuul/zuul-jobs/+/86381023:42
clarkbI'm just about ready to send out the meeting agenda. Anything else to add to it?23:43
opendevreviewIan Wienand proposed zuul/zuul-jobs master: [wip] enable-kubernetes: check pod is actually running  https://review.opendev.org/c/zuul/zuul-jobs/+/86381023:52
