Saturday, 2020-04-11

clarkb	fungi: ^ fyi	00:01
*** mlavalle has quit IRC		00:01
openstackgerrit	Clark Boylan proposed opendev/system-config master: Revert "Set env vars pointing to correct file locations" https://review.opendev.org/719124	00:07
clarkb	I'm not 100% sure we need to do that yet, but I haven't found anything that would indicate the new paths work with non container gerrit	00:08
clarkb	mordred: fungi ^ I'll leave that up and if someone else thinks it is also necessary we can land it	00:11
fungi	exporting unused environment variables shouldn't break anything	00:17
fungi	the jeepyb side of that hasn't landed yet anyway, right?	00:17
clarkb	only the git dir is new	00:18
clarkb	I believe the other 4 vars are existing	00:18
clarkb	and will point to invalid paths on non container gerrit	00:18
fungi	is the current running non-container gerrit deployed by that playbook also, or still by puppet?	00:23
clarkb	puppet is gone aiui	00:24
clarkb	that playbook affects the current running gerrit I think as the host isn't in the emergency file	00:24
clarkb	but it's friday and I could be wrong	00:25
fungi	no, i think you're right	01:03
fungi	mordred: ^	01:03
fungi	i'll go ahead and approve the revert for now	01:04
*** Eighth_Doctor has joined #opendev		05:35
Eighth_Doctor	👋	05:35
*** moppy has quit IRC		07:16
*** moppy has joined #opendev		07:31
*** DSpider has joined #opendev		07:41
*** tosky has joined #opendev		08:29
zbr	are there any active/in-progress plans to upgrade opendev gerrit?	09:13
*** sgw has quit IRC		11:52
*** ChanServ has quit IRC		12:55
*** ChanServ has joined #opendev		13:03
*** tepper.freenode.net sets mode: +o ChanServ		13:03
*** ChanServ has quit IRC		13:08
*** ChanServ has joined #opendev		13:10
*** tepper.freenode.net sets mode: +o ChanServ		13:10
mordred	zbr: yup	13:14
fungi	all of the recent gerrit maintenances have been mainly in service of getting us to a point where we can easily upgrade	13:18
mordred	clarkb: good catch	13:19
mordred	so we should land the re-revert and do another restart around the same time	13:20
mordred	fungi, clarkb: the revert didn't land - might be the same amount of waiting/time to just land https://review.opendev.org/#/c/719052/ and do a quick restart	13:26
mordred	zbr: yeah - what fungi said. once we're done with this current maint (finishing switching deployment from puppet to ansible/docker) we'll be working on the upgrade plan	13:28
fungi	mordred: ahh, i'll take a quick look	13:30
fungi	oh, yeah, i'm already +2 on that one	13:31
fungi	if we want to give it another quick try i'm up for that	13:31
fungi	i've gone ahead and approved it	13:32
mordred	fungi: cool. I think on a holiday saturday morning it should be fairly low impact - and shouldn't be _worse_ than waiting on the revert to land	13:37
openstackgerrit	Monty Taylor proposed opendev/system-config master: Install ep_headings module https://review.opendev.org/719123	13:39
mordred	fungi: there's the ansible for the hack from yesterday btw ^^	13:40
openstackgerrit	Monty Taylor proposed opendev/system-config master: Run cloud_launcher from zuul https://review.opendev.org/718798	14:00
openstackgerrit	Monty Taylor proposed opendev/system-config master: Stop removing cloud-launcher cron https://review.opendev.org/718799	14:03
corvus	mordred: +2 on the npm thing, but honestly, i think building the image is going to be the better long-term solution -- mostly because if we're running npm on the host, suddenly we care about what version of node/npm is on the host, which is the main thing we want to avoid with all the container stuff.	14:16
openstackgerrit	Merged opendev/jeepyb master: Fix issues from rolling out containers https://review.opendev.org/719052	14:25
mordred	corvus: yeah - I think you're probably right	14:25
mordred	I'll work on some patches to do that a little later	14:25
corvus	mordred: you had some earlier right?	14:26
corvus	did we merge those and revert, or did we just revise them before merging?	14:26
mordred	corvus: we revised before merging	14:27
mordred	but I can go cherry-pick the changes out	14:27
mordred	corvus, fungi : the jeepyb promote job for 2.13 has succeeded, so we should have new gerrit images, and I think the new scripts are already in place on gerrit - should we try another restart?	14:28
mordred	I confirm - the new version of the scripts have been applied	14:29
corvus	mordred: wow so fast :)	14:30
mordred	:)	14:30
corvus	mordred: sure, let me blink the sleep out of my eyes and let's go for it	14:30
mordred	kk. I'm in the root screen on review	14:30
corvus	i have joined	14:31
mordred	status notice Restarting gerrit to fix an issue from yesterday's mataintenance	14:31
mordred	yeah?	14:31
mordred	wow. except that's horrible spelling	14:31
mordred	status notice Restarting gerrit to fix an issue from yesterday's maintenance	14:31
corvus	lgtm	14:31
mordred	#status notice Restarting gerrit to fix an issue from yesterday's maintenance	14:31
openstackstatus	mordred: sending notice	14:31
-openstackstatus- NOTICE: Restarting gerrit to fix an issue from yesterday's maintenance		14:32
mordred	wow, openstackstatus is taking its time	14:34
openstackstatus	mordred: finished sending notice	14:35
mordred	corvus: ok. shall we?	14:36
corvus	mordred: ++	14:36
corvus	there's like a constant stream of hangups from stackalytics-bot-2 in the error log...	14:36
mordred	corvus: "neat"	14:36
fungi	okay, back	14:37
mordred	I suppose I could have pulled before stopping :)	14:37
mordred	fungi: we're in root screen on gerrit	14:37
corvus	live and learn	14:37
corvus	mordred: the screen has stopped updating for me	14:38
corvus	it's on extracting 208407758d73:	14:38
fungi	yep, joining	14:38
corvus	mordred: but it looks like gerrit is running	14:38
corvus	what's going on?	14:38
mordred	corvus: weird. yeah- it seems fine?	14:38
corvus	mordred: did it finish and did you restart it?	14:39
mordred	yes	14:39
fungi	i saw control return to a shell prompt	14:39
mordred	I'm now tailing logs	14:39
mordred	[2020-04-11 14:38:54,813] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.13.12-11-g1707fec ready	14:39
corvus	my screen caught up	14:39
mordred	let me push up a patch to trigger some scripts	14:40
corvus	[2020-04-11 14:40:25,990] [HookQueue-1] INFO com.googlesource.gerrit.plugins.hooks.HookTask : hook[patchset-created] output: FileNotFoundError: [Errno 2] No such file or directory: '/home/gerrit2/review_site/etc/gerrit.config'	14:40
mordred	oh - somebody did	14:40
mordred	really?	14:41
mordred	why is patchset-created not updated/	14:41
mordred	?	14:41
mordred	I'm going to manually fix that real quick to make sure it fixes the issue	14:42
mordred	it's bind-mounted in so it should fix without restart	14:42
openstackgerrit	Monty Taylor proposed opendev/system-config master: WIP Update install-ansible away from /opt/system-config https://review.opendev.org/719186	14:43
mordred	did that patchset created trigger the error?	14:43
corvus	[2020-04-11 14:43:17,782] [HookQueue-1] INFO com.googlesource.gerrit.plugins.hooks.HookTask : hook[patchset-created] output: TypeError: cannot use a string pattern on a bytes-like object	14:43
corvus	http://paste.openstack.org/show/791957/	14:43
openstackgerrit	Monty Taylor proposed opendev/system-config master: Actually install patchset-created hook https://review.opendev.org/719187	14:44
fungi	another missed python3 fix i suppose	14:44
mordred	corvus: STELLAR	14:44
mordred	well - there's the hook fix	14:44
fungi	d'oh	14:44
mordred	ah - it's because subprocess.Popen	14:46
fungi	i guess we need to .decode('utf-8') the fd from it?	14:47
openstackgerrit	Monty Taylor proposed opendev/jeepyb master: Decode utf-8 from subprocess.Popen https://review.opendev.org/719188	14:49
mordred	corvus, fungi: ^^	14:49
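The failure discussed above can be reproduced in isolation. The following is only a minimal sketch, not the jeepyb code: under Python 3, `subprocess.Popen` pipes yield `bytes`, so applying a `str` regex to the raw output raises the same `TypeError` seen in the hook log, and decoding as UTF-8 first avoids it. The child command, the `Change-Id` pattern, and the helper name `run_and_capture` are illustrative assumptions.

```python
import re
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return its raw stdout (bytes under Python 3)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out


# Hypothetical child process standing in for whatever the hook runs.
raw = run_and_capture([sys.executable, "-c", "print('Change-Id: I0123abcd')"])

# A str pattern against bytes output raises TypeError on Python 3.
error_message = None
try:
    re.search(r"Change-Id: (\w+)", raw)
except TypeError as exc:
    error_message = str(exc)

# Decoding the bytes first lets the same str pattern match.
text = raw.decode("utf-8")
match = re.search(r"Change-Id: (\w+)", text)
change_id = match.group(1)
```

Decoding once, right where the output is read, keeps the rest of the code working on `str` only.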
mordred	I could exec into the container and apply that same fix live to double check it (and keep things going until that patch lands)	14:50
corvus	mordred: sgtm to keep the loop going	14:51
* mordred is a little worried that the slow version of whack-a-mole here might take an age		14:51
mordred	yeah	14:51
fungi	well, it's not overly-broken, code review is working, some hooks aren't successfully running, and another restart or several for new images ought to address it, right?	14:53
mordred	k. done	14:53
corvus	fungi: yeah, but we might be able to get that down to just one restart with all the fixes at once	14:53
mordred	yeah - but I _think_ we're close enough that we might be able to get by with only one more restart	14:53
mordred	yeah	14:53
mordred	and then be actually done with this mess	14:53
mordred	and remind ourselves to never write a completely untested program like jeepyb ever again	14:54
fungi	well, yes, hopefully only one restart. granted each time we've restarted so far we thought we had all the fixes in ;)	14:54
* mordred looks forward to reworking these hooks as zuul jobs		14:54
mordred	fungi: indeed :)	14:55
corvus	i pushed a patchset	14:56
corvus	i watched the gerrit queue, there are no more patchset-created hook entries	14:57
corvus	so i think that means success	14:57
mordred	\o/	14:58
mordred	I've got an update on the jeepyb patch - pep8 gods are unhappy	14:58
openstackgerrit	Monty Taylor proposed opendev/jeepyb master: Decode utf-8 from subprocess.Popen https://review.opendev.org/719188	14:58
mordred	corvus, fungi : ^^	14:58
corvus	i will prepare breakfast while those land	15:02
mordred	corvus: one of the promote jobs failed on the previous jeepyb patch (not important, it was 2.15)	15:03
mordred	corvus: https://zuul.opendev.org/t/openstack/build/800a7224cf0143158e86ede8a9a35bdd/log/job-output.txt#89	15:03
mordred	corvus: we might want to put in some retries	15:03
mordred	corvus: although ... that's a little weird ... why does it say tag=change_719052_2.13 - that's the 2.15 job	15:04
mordred	all the vars seem to match in the jobs fwiw	15:05
openstackgerrit	Monty Taylor proposed opendev/system-config master: Update install-ansible away from /opt/system-config https://review.opendev.org/719186	15:16
openstackgerrit	Davlet Panech proposed openstack/project-config master: Add kernel to StarlingX https://review.opendev.org/718772	15:16
mordred	corvus: ^^ step one in "run ansible from zuul checkout" - I believe that's an ok and self-standing change	15:16
corvus	mordred: that was a 'list tags' call	15:21
corvus	mordred: it looks kind of like a dockerhub internal consistency error	15:22
corvus	mordred: that might explain why an unrelated tag was mentioned	15:22
mordred	corvus: ah - nod	15:25
Eighth_Doctor	hey folks!	15:27
Eighth_Doctor	nice to see that this channel isn't dead :D	15:27
corvus	mordred: left an idea on that change	15:29
openstackgerrit	Monty Taylor proposed opendev/system-config master: Update install-ansible away from /opt/system-config https://review.opendev.org/719186	15:33
openstackgerrit	Monty Taylor proposed opendev/system-config master: Run playbooks out of zuul checkout https://review.opendev.org/719190	15:33
mordred	corvus: cool - there's the followup to finish it	15:33
mordred	corvus: yes - I think that's a good idea	15:33
mordred	is it possible to pass explicit vars to template: ?	15:34
corvus	mordred: i think you can do that for any task?	15:34
mordred	corvus: so you can just add a vars: block to it?	15:35
fungi	Eighth_Doctor: why would it be dead? ;)	15:35
Eighth_Doctor	well, when I joined last night, I was the only person here :)	15:36
Eighth_Doctor	and I've been in other openstack channels that looked empty before...	15:36
fungi	ahh, i think a lot of us were drained by a long friday after a long week	15:36
Eighth_Doctor	was something particularly bad happening?	15:43
fungi	nah, just getting lots done!	15:43
fungi	also weekends tend to be quieter	15:48
Eighth_Doctor	so there was something I was curious about	15:52
mordred	yeah - we had some maintenances to do and wanted to take advantage of the slow holiday friday as a good time to do that	15:52
mordred	them	15:53
Eighth_Doctor	why did opendev select gitea over other options?	15:53
mordred	it was visually nice - and it allowed us to completely disable the features we don't use (like pull requests)	15:53
mordred	we used cgit before - but our users weren't super thrilled with it as a code browser and so more consistently fell back to mirrors on github	15:54
Eighth_Doctor	was pagure ever considered?	15:54
corvus	Eighth_Doctor: there's some documentation about that decision: https://docs.opendev.org/opendev/infra-specs/latest/specs/opendev-gerrit.html	15:54
corvus	also https://review.opendev.org/#/c/623033/	15:56
mordred	Eighth_Doctor: I believe I did look at it - iirc one of the issues was inability to fully disable things like the pull request interface. I feel like there was another reason as well but I sadly don't remember what it was	15:56
Eighth_Doctor	mordred: so we added the ability to disable damn near everything instance-wide last year	15:57
corvus	was it related to search?	15:57
corvus	gitea does have code searching (though we aren't able to use it yet, we still plan to enable it)	15:57
fungi	Eighth_Doctor: however, we were making this decision in 218	15:57
fungi	2018	15:57
Eighth_Doctor	ah	15:57
Eighth_Doctor	pagure 5.0 was released at the end of 2018, and our zuul integration was completed in mid 2019	15:58
Eighth_Doctor	so that explains it... poor timing	15:58
mordred	yeah - might have just been timing	15:58
mordred	yeah	15:58
fungi	well, we also didn't need zuul integration for our use case	15:58
fungi	we definitely didn't want to replace our choice of code review system	15:59
fungi	just needed a source browser	15:59
Eighth_Doctor	fungi: having zuul status report back into commits is nice though :)	15:59
fungi	why?	15:59
fungi	i mean, if you're proposing changes to that system then yes, but we're not	15:59
Eighth_Doctor	because it makes it very easy for people to reference back and forth between tested commits and such	15:59
Eighth_Doctor	it's something I personally find handy, even if you're not using PRs	15:59
fungi	we're just using it as a read-only code browsing frontend, not to do change review	16:00
fungi	we do pretty much all our testing pre-merge	16:00
Eighth_Doctor	right, that's a zuul feature	16:00
fungi	so still not clear what we'd be reporting from zuul into the code browsing system	16:01
Eighth_Doctor	well, whatever the merged commit was, it would have a status link back to zuul that people can click to see the test results	16:01
fungi	zuul in our case is being triggered by activity in the code review system	16:01
Eighth_Doctor	presumably also would have a link to gerrit, so you can see the reviews	16:01
fungi	so zuul doesn't/wouldn't know about the code browser	16:01
corvus	that's a good point, i wonder if we can link change-id footers in gitea back to gerrit	16:02
Eighth_Doctor	it'd also be trivial to customize the template so that instead of showing a PR tab or issues tab, it'd give a link to gerrit for the project	16:02
Eighth_Doctor	or storyboard for issues	16:02
fungi	and yeah, we're working on getting the gerrit links displaying. they're in git notes, we just need to turn on displaying git notes in gitea now that it (i think?) added capability to display arbitrary notes refs	16:02
Eighth_Doctor	Fedora's pagure instance does this to replace issues with a link to rhbz	16:02
corvus	we did do that for gitea -- the "Proposed changes" tab links to gerrit	16:02
Eighth_Doctor	https://src.fedoraproject.org/rpms/pagure	16:02
fungi	we do have gitea configured to link to gerrit and storyboard or launchpad already	16:02
Eighth_Doctor	oh nice, I guess I missed that piece	16:03
Eighth_Doctor	I mainly looked at the zuul projects, since that's my main opendev interest atm :)	16:03
Eighth_Doctor	but yeah, I see you already did that	16:03
mordred	another nice thing about pagure - it's in python :)	16:03
Eighth_Doctor	yeah :D	16:03
Eighth_Doctor	also, another thing about pagure, docs are stored as a git repo :)	16:04
fungi	yep, at https://opendev.org/zuul/zuul issues links off to https://storyboard.openstack.org/#!/project/zuul/zuul and proposed changes links to https://review.opendev.org/#/q/status:open+project:zuul/zuul	16:04
Eighth_Doctor	(technically, same goes for issues and PR metadata, but you don't care about those)	16:04
fungi	what docs are you talking about?	16:04
Eighth_Doctor	project documentation (e.g. gh-pages, readthedocs, etc. stuff)	16:05
fungi	ahh, well we already develop our documentation in git repos through code review anyway	16:05
Eighth_Doctor	ah okay	16:05
fungi	and use zuul to render/publish them	16:05
openstackgerrit	Merged opendev/system-config master: Actually install patchset-created hook https://review.opendev.org/719187	16:05
Eighth_Doctor	well, then I guess the only thing left I have is pagure scales?	16:05
Eighth_Doctor	it handles ~30K repos with ~10K concurrent users accessing performantly from one server (src.fedoraproject.org)	16:06
Eighth_Doctor	and has means for scaling beyond that	16:06
fungi	we're running 8 gitea servers behind a load balancer right now, but better clustering (especially for the code search functionality) would be nice, yes	16:06
Eighth_Doctor	holy crap, 8?!	16:07
Eighth_Doctor	I knew gitea wasn't great for scaling, but that's awful	16:07
fungi	partly so that we can handle bursts of cloning activity better	16:07
Eighth_Doctor	sure, makes sense	16:07
fungi	they're usually under-utilized	16:07
mordred	since you say that - I'm curious if pagure would be better at browsing the nova repo	16:07
fungi	also gives us the ability to take some of them offline without impacting performance	16:08
fungi	for upgrades et cetera	16:08
Eighth_Doctor	is openstack/nova usually the problem child?	16:08
mordred	well - it's the best example of a problem child	16:08
fungi	yeah, that repo is large, has ~10 years of history, et cetera	16:09
mordred	it is a large repo and gitea has had some issues with doing the right things caching its refs in the past	16:09
Eighth_Doctor	well, let's see if I can even download it! :P	16:09
Eighth_Doctor	we've hosted mirrors of the linux kernel reasonably well on pagure.io (which has less resources than src.fedoraproject.org) and I think I have a copy of mongodb pre sspl there	16:10
mordred	Eighth_Doctor: does pagure handle operating in a cluster decently? like - if we wanted to run 8 pagures in a k8s but treat them as a single server?	16:10
Eighth_Doctor	mordred: this is of comparable size: https://pagure.io/mongodb-agplv3	16:10
Eighth_Doctor	mordred: I personally do not know because I don't run pagure that way, but I know of users who are running it in OpenShift or Kubernetes and scaling the backend workers accordingly to handle the load well	16:11
Eighth_Doctor	so far, I haven't heard any complaints	16:11
Eighth_Doctor	there's a WIP helm chart PR for pagure, but neither I nor the other developers have experience with k8s enough to be able to do anything meaningful with it	16:12
mordred	nod. I mean - the k8s part isn't as important as the being able to scale it horizontally part	16:12
fungi	master branch of nova is nearly 60k commits at this point, looks like	16:12
fungi	Eighth_Doctor: what's the typical server size for pagure, do you think? part of why we're running 8 backends for gitea is that they're each small virtual machines with like 8gb ram	16:13
fungi	but we're also not nearly the repository count of fedora, only a little over 2k repositories at the moment	16:14
Eighth_Doctor	I don't have the exact details, but I think the existing src.fedoraproject.org server is basically a VM with 4GB of RAM	16:14
fungi	neat	16:14
Eighth_Doctor	it might be 8GB of RAM now, but I know it's not a huge machine	16:14
corvus	here's utilization of the individual gitea backends: http://cacti.openstack.org/cacti/graph_view.php	16:15
corvus	click the 'gitea farm' on the left	16:15
Eighth_Doctor	that's not too bad	16:16
Eighth_Doctor	storage I can ignore, since those are synced	16:16
corvus	looks like a median load average might be about 0.25, peaking at 2	16:16
Eighth_Doctor	I'm pretty sure the utilization levels are similar on src.fp.o	16:16
mordred	if I'm reading https://docs.pagure.org/pagure/overview.html#pagure-workers right - in general there is expected to be one copy of the git repos on disk and pushing to those would be via a gitolite instance. then the pagure web interface is going to read from that filesystem copy via async worker tasks	16:17
fungi	our typical activity levels would probably be handled with only a couple backends, but with some frequency people point high-volume ci systems at our git refs and start cloning hundreds of copies of repositories at the same time	16:17
Eighth_Doctor	yep	16:17
mordred	so if the filesystem were shared amongst workers, the read traffic looks like it would be pretty scalable	16:17
corvus	i bet we could halve the cluster (to 4 8gb vms) with no significant impact to performance. more than that we'd probably have peak memory usage issues.	16:17
Eighth_Doctor	this is essentially the characteristic for fedora	16:18
Eighth_Doctor	we also have things like koschei, zuul, etc. constantly checking out and interacting with pagure API	16:18
mordred	via scale out - but writes might still have a spof?	16:18
Eighth_Doctor	and it's doing very well with just one server	16:18
Eighth_Doctor	the only bottleneck is if you need to scale storage... but if you're operating in k8s, this is abstracted for you	16:19
mordred	oh - we're not :)	16:19
mordred	but - that's been a thing we've been looking at doing if we could get to a clustered solution for the git browsing	16:19
Eighth_Doctor	... then I'm confused about k8s?	16:19
mordred	right now we replicate to all 8 machines independently	16:19
Eighth_Doctor	oh... ouch	16:20
mordred	we'd LIKE to have a single clustered system that we replicate to once	16:20
mordred	but so far that's problematic	16:20
Eighth_Doctor	that means you're inducing state sync load	16:20
Eighth_Doctor	I've usually seen this solved with either shared nfs or gluster	16:20
mordred	with cgit it was just impossible. with gitea there are some indexes that made single-machine assumptions that are in process of being fixed	16:20
Eighth_Doctor	that's not to say other solutions aren't valid, but those are the two I usually see	16:21
mordred	yeah- that was/is the gitea design - run the gitea cluster on top of a cephfs	16:21
Eighth_Doctor	there is an option for sharding git storage in pagure	16:21
mordred	but there were 2 things it was doing that were storing index files in the filesystem which needed to be abstracted out into plugin interfaces so they could store in a service	16:21
Eighth_Doctor	but we don't use it in fedora right now and it needs some love	16:22
Eighth_Doctor	https://github.com/repoSpanner/repoSpanner	16:22
Eighth_Doctor	this does work with pagure, but the issue is that the sync penalty is too high in some cases	16:22
mordred	that would be a cost in push right?	16:23
Eighth_Doctor	there was some in-progress work to improve performance, but interest died off on completing it	16:23
Eighth_Doctor	yes	16:23
mordred	I'd LOVE to be able to scale without needing to run a shared filesystem	16:23
Eighth_Doctor	repoSpanner was designed to avoid the shared filesystem requirement	16:23
clarkb	yes memory is the major thing. You need about a gig of memory for each git operation on several of our repos	16:23
Eighth_Doctor	because we don't use one in Fedora	16:23
clarkb	as long as git is used regardless of frontend I don't expect that changes dramatically	16:24
clarkb	then you add N operations and suddenly you need quite a bit of memory	16:24
openstackgerrit	Merged opendev/jeepyb master: Decode utf-8 from subprocess.Popen https://review.opendev.org/719188	16:24
clarkb	also note the split git repos aren't an issue as long as this is a read only frontend	16:24
clarkb	it's going to be eventually consistent regardless due to how gerrit replication works	16:25
mordred	infra-root: ok the jeepyb change landed - I'm about ready to try another restart	16:25
Eighth_Doctor	so perhaps pagure + repospanner would work in your specific scenario	16:25
clarkb	(so overcomplicating that to sync isnt worth much imo, using a fa that syncs for us is nice and simple	16:25
clarkb	*using a fs	16:25
mordred	clarkb: yah - but ... it's possible running repoSpanner might be easier than running ceph	16:26
Eighth_Doctor	mordred: _that_ I can say is true :)	16:26
mordred	(if we got a ceph magically from someone already running one, using a ceph would be easier)	16:26
Eighth_Doctor	isn't that how that always works? :)	16:27
corvus	mordred: i have to run; i can check back in in a few hours, but i support you restarting if you're comfortable	16:27
mordred	I'm comfortable	16:27
fungi	thanks corvus	16:27
fungi	and yeah, i'm around again	16:28
mordred	corvus, fungi, clarkb: images have been pushed, ansible changes have applied	16:28
mordred	I'm going to try another restart	16:28
mordred	clarkb: we're in root screen on review if you wanna watch	16:28
clarkb	I'm half around. Drinking tea and eating cornbread	16:28
mordred	although the root screen itself isn't super exciting	16:28
Eighth_Doctor	I don't know if you guys use ansible or something else for config management, but you can see Fedora's ansible role for pagure here: https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/pagure	16:29
Eighth_Doctor	(the ansible repo hasn't yet moved to pagure.io)	16:30
mordred	clarkb, fungi : gerrit looks like it's back u	16:31
mordred	ip	16:31
mordred	UP	16:31
Eighth_Doctor	CentOS also runs an instance and has an Ansible role: https://github.com/CentOS/ansible-role-pagure	16:31
Eighth_Doctor	pagure uses MySQL on CentOS and PostgreSQL on Fedora ;)	16:32
fungi	based on our present direction for evolution of our deployment methodology, we'd presumably consume docker images for the service components and then deploy those images with ansible	16:32
mordred	yeah	16:32
Eighth_Doctor	that's fine too :)	16:33
fungi	either consume upstream-provided docker images, or (re)build our own with our ci system and then consume those	16:33
Eighth_Doctor	we have Dockerfiles for pagure that we use primarily for dev and CI, but we don't currently publish any containers for prod	16:34
Eighth_Doctor	so the latter would probably be the way for you to go	16:34
mordred	yeah - that's what we do for gitea too - their docker images aren't structured for what we'd want in prod - are more focused on the AIO "I want to run it quickly on my laptop" use case	16:34
mordred	which is an important use case	16:35
mordred	but not what we're doing :)	16:35
fungi	yep, that's how we're deploying gerrit in production, as of, well, today i suppose (if we don't have to roll back again)	16:35
mordred	fungi: I'm going to roll this forward today if it kills me	16:35
fungi	how about let's just not stick to deployment models which leave dead sysadmins in their wake	16:35
mordred	ok fair	16:36
Eighth_Doctor	pagure is packaged for Fedora, RHEL/CentOS via EPEL, Mageia, and openSUSE by me	16:36
Eighth_Doctor	so if you want to play with it in a VM or a container, it's pretty easy to do ;)	16:36
clarkb	"Here lies Mordred. A java program eventually got the best of him"	16:36
Eighth_Doctor	RIP	16:36
mordred	HOOKS HAVE RUN WITH NO TRACEBACKS	16:38
mordred	I declare victory	16:38
clarkb	mordred: fungi if there is a list of things to review I can help with that but probably not get into it beyond that	16:38
* Eighth_Doctor sees mordred fall over in a heap		16:39
fungi	clarkb: i think we've got them all in now? can probably abandon the revert of the hooks update which didn't merge	16:39
clarkb	cool will do that	16:39
fungi	we can resurrect it if we decide we do have to roll back for reasons we can't correct immediately	16:40
mordred	fungi: while I've got you - would you mind reviewing https://review.opendev.org/#/c/719088/ ?	16:42
mordred	fungi: if you're ok with that - I'll delete the old ones and land it	16:42
Eighth_Doctor	fungi, mordred: if you were interested in the k8s based approach: https://pagure.io/pagure/pull-request/4483	16:44
fungi	mordred: yep, cool will do	16:45
Eighth_Doctor	and since we've been talking about performance, here's the info I gave the FSF for helping them set up a performant system for their forge based on pagure: https://lists.pagure.io/archives/list/pagure-devel@lists.pagure.io/message/SZ7GJ5P65Q76FRZIDNYFP3HI4RD4H6LT/	16:47
clarkb	oh that's the other performance related issue we do have. We have to use source ip based load balancing due to unshared git repos	16:50
clarkb	because a fetch executing across different repos of the same logical entity can fail	16:50
clarkb	(it depends on how objects are packed iirc)	16:51
clarkb	and that has issues when large companies funnel through a single NAT IP	16:52
Eighth_Doctor	yup	16:52
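The trade-off described above can be sketched in a few lines. This is only an illustration of source-IP affinity in general, not the load balancer configuration actually in use: hashing the client address keeps every request of one fetch on the same backend, but all clients behind a single NAT address also hash to that one backend. The backend names and the function `pick_backend` are hypothetical; the count of 8 comes from the discussion.

```python
import hashlib

# Hypothetical names for the 8 backends mentioned in the discussion.
BACKENDS = ["gitea%02d" % n for n in range(1, 9)]


def pick_backend(client_ip, backends):
    """Map a client IP to one backend, deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Every request from the same address lands on the same backend, so a
# multi-request fetch never straddles two differently packed repo copies.
first = pick_backend("203.0.113.7", BACKENDS)
second = pick_backend("203.0.113.7", BACKENDS)

# 100 CI workers behind one NAT address all look like the same client,
# so they pile onto a single backend instead of spreading over all 8.
nat_targets = {pick_backend("198.51.100.1", BACKENDS) for _ in range(100)}
```

With a shared or synchronized storage layer, any backend could answer any request, and the balancer would be free to spread clients from one NAT address across the farm.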
Eighth_Doctor	that might be where repoSpanner helps here	16:55
openstackgerrit	Monty Taylor proposed opendev/system-config master: Write out db config for root user https://review.opendev.org/719192	16:56
Eighth_Doctor	assuming you want to have multiple storage replicas	16:56
Eighth_Doctor	clarkb: I'm not sure, given your usage model, that repoSpanner would be necessary, but it would avoid the load balancing problem	16:59
Eighth_Doctor	you could run one frontend app with some number of workers, and then have a repoSpanner cluster that handles the git storage	16:59
openstackgerrit	Merged opendev/system-config master: Install ep_headings module https://review.opendev.org/719123	16:59
clarkb	Eighth_Doctor: ya any shared repo content or synced content would fix that I think	16:59
fungi	as long as the shared backend guaranteed all frontends were serving exact same copies of the content at the same times	17:02
mordred	clarkb, fungi: corvus suggested earlier that we should build etherpad image instead of doing that ep_headings hack above and i agree	17:05
mordred	I'll resurrect the child-image-building code in a bit	17:05
fungi	mordred: i take it cron running track-upstream outside the container is fine?	17:08
mordred	fungi: it actually runs it in a container :)	17:09
fungi	huh... looking closer	17:10
mordred	fungi: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gerrit/templates/track-upstream.j2	17:10
mordred	fungi: that gets installed into /usr/local/bin	17:10
fungi	oh! so it does ;)	17:10
fungi	also... has gerritbot config update gotten solved yet? i want to say we're still not seeing infra-manual changes in here since the namespace move	17:11
mordred	it has not - I've got the first change up	17:12
mordred	https://review.opendev.org/#/c/715635/	17:13
openstackgerrit	Merged opendev/system-config master: Add review and etherpad to backup group https://review.opendev.org/719036	17:13
openstackgerrit	Merged opendev/system-config master: Run ansible on the backup server https://review.opendev.org/719076	17:13
fungi	oh, cool	17:16
fungi	right, we were containering it, i forgot	17:17
fungi	+2	17:17
mordred	fungi: so many containers	17:18
fungi	okay, christine's got a lengthy list of things i need to repair around the house, but i'll be in and out to keep tabs on gerrit in case we run into any more unforeseen problems	17:18
mordred	fungi: cool. I think we're good though - it seems like we've finally finished this phase!	17:29
openstackgerrit	Merged opendev/system-config master: Add root cron jobs to gerrit https://review.opendev.org/719088	17:46
fungi	here's hoping!	17:48
mordred	fungi, clarkb, corvus: the backup playbook is not working	18:27
mordred	my brain can't quite process it at the moment	18:27
mordred	but we should fix it :)	18:28
fungi	i'll see if i can figure it out in a bit	18:37
fungi	once this leftover curry is gone ;)	18:37
fungi	mordred: when you say "the backup playbook is not working" you mean the periodic pipeline job running the playbooks/service-backup.yaml playbook?	19:40
mordred	fungi: I mean the playbook itself - if you look in /var/log/ansible/service-backup.yaml.log on bridge	19:44
mordred	fungi: looking at the ansible it looks like there's maybe a mismatch in variable name - but I'm not 100% sure and I'm not 100% sure of the intent	19:44
mordred	so the job is running the playbook fine - but the playbook itself is bombing out :)	19:46
openstackgerrit	Sean McGinnis proposed openstack/project-config master: Make job template update best effort https://review.opendev.org/719308	19:47
mordred	fungi: oh! I think I might know what the issue is	19:50
mordred	fungi: etherpad01.opendev.org is in the disabled list	19:50
mordred	so it's not being run in the backup role - so it's not setting the bup_user variable	19:51
fungi	d'oh	19:51
mordred	BUT - we do with_inventory_hostnames in backup-server	19:51
mordred	on 'backup'	19:51
mordred	which does not subtract hosts in the disabled group	19:51
mordred	ooh - it supports exclusion patterns	19:52
openstackgerrit	Monty Taylor proposed opendev/system-config master: Exclude disabled group from backup-server loop https://review.opendev.org/719309	19:54
mordred	fungi, corvus : ^^	19:54
mordred	this is an issue that will only arise if we have a server we backup disabled at a time when we have backup-server enabled	19:54
mordred	like now	19:54
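The effect of an exclusion pattern such as `backup:!disabled` can be modelled with plain set arithmetic. The sketch below is a simplified stand-in for Ansible's host pattern matching, not its implementation: it handles only group names and `!` exclusions, and the function `resolve_pattern` and the host `review.example.org` are hypothetical (`etherpad01.opendev.org` is the disabled host from the discussion).

```python
def resolve_pattern(pattern, groups):
    """Resolve a ':'-separated pattern of group names and '!' exclusions.

    Simplified model: union all included groups, then subtract every
    excluded group. Intersections, wildcards and regexes are not handled.
    """
    included, excluded = set(), set()
    for part in pattern.split(":"):
        if part.startswith("!"):
            excluded |= set(groups.get(part[1:], ()))
        else:
            included |= set(groups.get(part, ()))
    return sorted(included - excluded)


# Illustrative inventory: one backed-up host is also in the disabled group.
groups = {
    "backup": ["review.example.org", "etherpad01.opendev.org"],
    "disabled": ["etherpad01.opendev.org"],
}

all_backup_hosts = resolve_pattern("backup", groups)
enabled_backup_hosts = resolve_pattern("backup:!disabled", groups)
```

Looping over the second list skips the disabled host, so nothing downstream looks up a variable that was never set for it.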
mordred	also - I think we can unemergency etherpad	19:55
mordred	but why don't I leave it in emergency so we can check that backup runs correctly in this scenario	19:55
fungi	good call	20:07
fungi	mordred: so... we don't want to backup servers which have config management disabled?	20:09
mordred	fungi: well - we do - but we probably don't want to set up new backup info on them if they're disabled	20:31
mordred	(or we can't, since we won't have run the corresponding stuff on the server themselves - so there's potentially no user to connect to yet - which would be true in the case of etherpad)	20:32
mordred	backups _themselves_ are via cron - but attempting to set up new backups while disabled == sad panda	20:32
fungi	okay, so the bup_users set would only be used for initial configuration, not to decide which to run the backups for when already set up, got it	20:33
fungi	the job name system-config-run-backup was mildly misleading	20:34
fungi	now realizing it's infra-prod-service-backup i meant to be looking at	20:35
fungi	and yeah, now i see in playbooks/roles/backup/tasks/main.yaml we're still configuring a cronjob, not triggering backups directly	20:36
fungi	makes sense, thanks	20:36
*** tosky has quit IRC		23:24
*** DSpider has quit IRC		23:48
