Thursday, 2021-09-16

fungiclarkb: the reason we turned off autoreload is that it basically dropped all pending tasks in the queue at reload00:00
funginot sure if latest gerrit still has that behavior, but it was resulting in lots of lost replication tasks and stale repo mirrors00:01
Clark[m]I think I saw the docs say it does something similar. That would explain it. I knew there was a good reason, just didn't remember the specifics00:08
fungiwell, at the time we were making much more frequent changes to the replication config. now we hardly change it at all so it might be okay? but ultimately there's still some risk02:13
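For context, the reload behaviour discussed above is governed by the replication plugin's gerrit.autoReload setting in replication.config; a sketch of checking and flipping it, assuming a typical Gerrit site layout (the /var/gerrit path is a placeholder, not necessarily this deployment):

    # show the current value; unset or false means config changes need a plugin reload/restart to take effect
    git config -f /var/gerrit/etc/replication.config gerrit.autoReload
    # turn automatic reloads back on (at the risk of dropping queued replication tasks, per the discussion above)
    git config -f /var/gerrit/etc/replication.config gerrit.autoReload true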
*** ykarel_ is now known as ykarel03:57
*** ysandeep|away is now known as ysandeep05:09
*** bhagyashris|off is now known as bhagyashris05:34
*** jpena|off is now known as jpena07:28
newopenstackNeed to set up openstack with 6 servers and want to use MAAS and Juju07:46
newopenstackplease advise.07:46
newopenstackand then want to grow the infrastructure to more compute nodes07:46
newopenstackalso want to use some storage center from dell07:46
newopenstackplease share some guidelines.07:47
newopenstackcan anyone help .. please07:47
*** ykarel__ is now known as ykarel08:03
*** ykarel is now known as ykarel|lunch08:20
*** ykarel|lunch is now known as ykarel09:27
*** odyssey4me is now known as Guest6510:07
opendevreviewMichal Nasiadka proposed opendev/bindep master: Add Rocky Linux support  https://review.opendev.org/c/opendev/bindep/+/80936210:11
*** ysandeep is now known as ysandeep|brb10:52
*** dviroel|out is now known as dviroel11:20
*** jpena is now known as jpena|lunch11:21
*** ysandeep|brb is now known as ysandeep11:52
*** ykarel is now known as ykarel|afk11:54
*** jpena|lunch is now known as jpena12:21
funginewopenstack: sorry, this is the channel where we coordinate the services which make up the opendev collaboratory. you're probably looking for the #openstack channel or more likely the openstack-discuss@lists.openstack.org mailing list12:55
funginewopenstack: though since you mentioned maas and juju (software made by canonical, not really part of openstack itself) you might want to be looking closer at https://ubuntu.com/openstack12:56
fungihope that helps!12:56
opendevreviewMerged openstack/project-config master: Add openstack-loadbalancer charm and interfaces  https://review.opendev.org/c/openstack/project-config/+/80783813:08
*** ykarel|afk is now known as ykarel13:20
*** slaweq__ is now known as slaweq13:23
*** frenzy_friday is now known as anbanerj|ruck13:35
*** odyssey4me is now known as Guest7413:42
*** ysandeep is now known as ysandeep|dinner14:26
opendevreviewdaniel.pawlik proposed opendev/puppet-log_processor master: Add capability with python3; add log request cert verify  https://review.opendev.org/c/opendev/puppet-log_processor/+/80942414:55
*** ykarel is now known as ykarel|away15:01
*** marios is now known as marios|out15:33
clarkbWe currently have no leaked replication tasks15:41
*** ysandeep|dinner is now known as ysandeep|out15:41
clarkbI've just confirmed the inmotion boots continue to fail. Will try and dig into that after some breakfast15:44
clarkbI've got tails running against the three different servers' nova api error logs. If that doesn't record anything interesting in the next bit I'll dig in further. I expect this should give me a clue in the next few minutes though16:16
clarkbThe api was very quiet. Looking at other things I find messages like "Instance f98ce366-90b1-43ba-8513-bf2ea559c931 has allocations against this compute host but is not found in the database." in the nova compute log16:28
clarkbI suspect that may be the underlying cause? we're leaking instances that don't exist but count against quota?16:28
*** jpena is now known as jpena|off16:28
clarkbhrm no, quotas as reported by openstackclient look fine16:30
clarkb"Allocations" seems to be what placement does16:31
fungimight be a question for #openstack-nova16:32
clarkbnova.exception_Remote.NoValidHost_Remote: No valid host was found. <- is what the conductor says16:32
clarkbso ya I think what is happening is placement is unable to place, possibly because it has leaked allocations.16:33
clarkbhttps://docs.openstack.org/nova/latest/admin/troubleshooting/orphaned-allocations.html is the indicated solution from the nova channel16:39
*** ysandeep|out is now known as ysandeep16:39
clarkbthank you melwitt!16:39
clarkbI'll have to digest that and dig around and see if I can fix things.16:39
melwittclarkb: lmk if you run into any issues or have questions and I will help16:40
clarkbwill do16:40
fungiyeah, this particular provider is unique in that they give us an automatically deployed turn-key/cookie-cutter openstack environment, but it's mostly us on the hook if it falls over16:42
clarkbany idea what provides the openstack resource provider commands to osc? seems my installs don't have that16:43
melwittclarkb: osc-placement is the osc plugin you need16:44
melwittyou just install it and then it works16:44
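A minimal sketch of pulling the plugin into the same environment as osc (the venv path here is a placeholder):

    # osc-placement adds the "openstack resource provider ..." command tree to openstackclient
    /path/to/venv/bin/pip install osc-placement
    /path/to/venv/bin/openstack resource provider list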
clarkbthanks16:44
clarkband now I've hit policy problems. I think I need to escalate my privs. I expect the next bit will just be me stumbling around to find the correct incantations :)16:45
melwittclarkb: placement api is defaulted to admin-only16:46
clarkbmelwitt: I've found the env to administrate the environment and can run the resource provider commands. When I run openstack server list --all-projects only one VM shows up (our mirror). In the doc you shared it showed performing actions for specific VMs but I don't seem to have that here. In this case would I just run the heal command first?16:54
clarkband i guess make note of the allocation for the single VM that is present first16:54
* melwitt looks16:55
clarkbthere also doesn't appear to be a way to list all resource allocations.16:57
melwittclarkb: ok yeah sorry, heal_allocations is when you still have the server and want to "heal" it. but it might still work if you pass the uuid of the server from the error message16:58
melwittif not, we'll want to do allocation deletes directly16:58
clarkbmelwitt: got it. Do you know if there is a way to list the allocations? I can show the allocations for the uuids in the logs and they show up but I can't seem to do a listing of all of them16:58
clarkbbut worst case I can parse the log and generate a list to operate on. That should be doable16:59
melwittlisting allocations can be done per resource provider by 'openstack resource provider show <compute node uuid> --allocations'16:59
clarkbaha thanks!17:00
melwittcompute node uuid == resource provider uuid17:00
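Putting those two hints together, a sketch of walking the providers and dumping their allocations (the uuid is a placeholder):

    # each compute node shows up as a resource provider
    openstack resource provider list
    # show every allocation currently held against one provider
    openstack resource provider show <compute-node-uuid> --allocations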
clarkbI think I have what I need then. I can list all the allocations, remove the allocation(s) for the mirror VM from that list, then iterate over the rest deleting the allocations and healing them17:01
melwittyeah you just want to remove allocations for any servers that no longer exist17:02
melwitti.e. "not in the database"17:02
melwittand the "consumer" uuids in placement map to the server uuids in nova17:03
melwittmost of the time consumer == nova server/instance17:04
melwittI say "most of the time" because other services/entities can consume resources in placement as well17:05
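One way to sanity-check a consumer before touching it, assuming admin credentials are loaded (the uuid is a placeholder):

    # the allocations placement holds for a single consumer
    openstack resource provider allocation show <consumer-uuid>
    # if the consumer is an already-deleted nova server, this should fail with a not-found error
    openstack server show <consumer-uuid>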
clarkbmakes sense. in this case I only see allocations that seem to map to nova17:06
clarkbtheir attributes have server-y things like memory and disk and cpus17:06
melwittah yeah17:11
melwittyou are right, those are nova17:11
clarkbmelwitt: do I need to run the heal command at all if these instances don't exist? I should be able to simply delete the allocations then I am done? Or are there other side effects of the heal that I want?17:16
melwittclarkb: no I think heal is when the instance is still around but has some extra allocations from env "irregularities" during migrations etc. you are good to just delete for these servers that were deleted in the past17:16
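So the manual cleanup sketched here boils down to the following; the uuid and the list file are placeholders, and this should only be run for servers that genuinely no longer exist:

    # drop all of placement's allocations for one orphaned consumer
    openstack resource provider allocation delete <consumer-uuid>
    # or iterate over a list of orphaned uuids harvested from the compute logs
    while read uuid; do openstack resource provider allocation delete "$uuid"; done < orphaned-consumers.txt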
clarkbmelwitt: thanks for confirming17:17
melwittclarkb: ok so sorry but I got the tools mixed up 😓 this is the one I should have told you https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement-audit for this case where you want to delete ones that no longer exist17:18
clarkbmelwitt: oh thanks17:18
melwitt'nova-manage placement audit --verbose' will iterate over all resource providers and look for orphaned allocations and if you pass --delete it will delete them for you17:19
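In practice that would look something like the following; the nova_api container name is a guess at the deployment's layout, not something confirmed in the log:

    # dry run first: report orphaned allocations without changing anything
    docker exec nova_api nova-manage placement audit --verbose
    # once the report looks right, let it clean up
    docker exec nova_api nova-manage placement audit --verbose --delete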
clarkbI'll try that before I manually delete everything on my list. Though I have to figure out where the nova-manage command is. I think it must be in one of the containers. Does nova-manage talk to the apis like osc and need those credentials or is it more behind the scenes?17:19
clarkblooks like it reads configs directly in the install somewhere17:20
melwittyeah was just looking through, it does call the placement api as well but you don't need your own creds for it17:22
clarkbalright it cleaned up 65 allocations and the mirror still shows up with its allocations17:22
clarkbnow we wait and see if nodepool can launch successfully17:23
melwittok cool17:23
clarkbmelwitt: doing it the more difficult way was good because I feel like I learned a bit more :)17:24
clarkbbut then having easy mode at the end was nice17:24
melwitt:)17:25
clarkb[node_request: 300-0015441935] [node: 0026535559] Node is ready17:25
clarkbI think it is happy now17:26
melwittphew!17:26
fungiawesome17:30
clarkbhttps://grafana.opendev.org/d/4sdNjeXGk/nodepool-inmotion?orgId=117:40
*** ysandeep is now known as ysandeep|out18:27
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Explicit tox_extra_args in zuul-jobs-test-tox  https://review.opendev.org/c/zuul/zuul-jobs/+/80945619:01
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Add tox_config_file rolevar to tox  https://review.opendev.org/c/zuul/zuul-jobs/+/80661319:17
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Support verbose showconfig in tox siblings  https://review.opendev.org/c/zuul/zuul-jobs/+/80662119:17
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Include tox_extra_args in tox siblings tasks  https://review.opendev.org/c/zuul/zuul-jobs/+/80661219:17
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Explicit tox_extra_args in zuul-jobs-test-tox  https://review.opendev.org/c/zuul/zuul-jobs/+/80945619:17
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Pin protobuf<3.18 for Python<3.6  https://review.opendev.org/c/zuul/zuul-jobs/+/80946019:17
fungiinfra-root: bad news, ticket from rackspace says they're planning a block storage maintenance for 2021-10-04 impacting afs01.dfw.opendev.org/main0419:43
fungii suppose we should attach a new volume, add it as a pv in the main vg on the server, and then pvmove the extents off main04 and delete the volume19:44
fungii'll try to get that going today or tomorrow, it should be hitless for us19:45
fungiat least we have a few weeks warning19:46
fungiunfortunately, cinder operations in rackspace are a pain because of the need to use the cinder v1 api which osc no longer supports19:47
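The volume shuffle described above is standard LVM; a sketch with hypothetical device names (the new cinder volume as /dev/xvde, main04's device as /dev/xvdc, and the vg named main as mentioned earlier):

    pvcreate /dev/xvde        # initialize the newly attached volume as a physical volume
    vgextend main /dev/xvde   # add it to the existing volume group
    pvmove /dev/xvdc          # migrate all extents off main04's device onto the remaining PVs
    vgreduce main /dev/xvdc   # remove the now-empty PV from the volume group
    pvremove /dev/xvdc        # wipe the PV label before detaching and deleting the cinder volume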
*** odyssey4me is now known as Guest9320:05
Clark[m]fungi: I think the osc in the venv in my home dir on bridge works with rax cinder, you just have to override the API version on the command line to v120:36
opendevreviewSlawek Kaplonski proposed opendev/irc-meetings master: Update Neutron meetings chairs  https://review.opendev.org/c/opendev/irc-meetings/+/80947820:48
fungiClark[m]: i'll give that a try, but i also have cinderclient set up on bridge i can use to do the cinder api bits20:49
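For reference, the version override being discussed looks roughly like this; whether osc still accepts v1 depends on the release installed in that venv, and the cinderclient form is the fallback fungi mentions:

    # via openstackclient, if the installed release still supports volume API v1
    openstack --os-volume-api-version 1 volume list
    # via python-cinderclient directly
    OS_VOLUME_API_VERSION=1 cinder list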
*** dviroel is now known as dviroel|out21:00
opendevreviewClark Boylan proposed opendev/system-config master: Run daily backups of nodepool zk image data  https://review.opendev.org/c/opendev/system-config/+/80948321:13
clarkbinfra-root ^ that isn't critical to back up but nodepool has grown the ability to do those data dumps so I figure we may as well take advantage of it21:13
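Assuming the proposed job wraps nodepool's image-data export subcommand, the underlying operation would be roughly the following (paths are placeholders and the subcommand name should be checked against the nodepool release in use):

    # dump the ZooKeeper image records to a file that can be re-imported later
    nodepool -c /etc/nodepool/nodepool.yaml export-image-data /var/backups/nodepool-image-data.json
    # and the corresponding restore path
    nodepool -c /etc/nodepool/nodepool.yaml import-image-data /var/backups/nodepool-image-data.json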
fungii just did `curl -XPURGE https://pypi.org/simple/reno` (and a second time with a trailing / just in case) based on the discussion in #openstack-swift about job failures which look like more stale reno indices being served near montreal22:11
*** odyssey4me is now known as Guest10122:56
fungiclarkb: i think ianw was able to work out how to extract the cached indices from the fs at one point, but i don't recall how he located samples23:29
ianwfungi: ISTR it being an inelegant but ultimately fruitful application of "grep"23:31
ianw2020-09-16 :  "pypi stale index issues ...  end up finding details by walking mirror caches" is what i have in my notes23:32
fungisounds about right23:32
fungiwow, and today's the anniversary! coincidence?23:33
clarkbfwiw I did a find /var/cache/apache2/proxy -type f -name \*.header -exec grep reno {} \;23:36
ianwhttps://meetings.opendev.org/irclogs/%23opendev/%23opendev.2020-09-15.log.html#t2020-09-15T20:22:5623:36
clarkbthen looked at all the files. It seems that pip explicitly asks for uncached data and that the only version of the file we cached was up to date23:37
clarkbfor reno's index specifically on the iweb mirror23:37
ianwfungi: haha yes, i guess that's from my timestamp, so happened on the 15th UTC23:37
