Tuesday, 2020-10-20

<openstackgerrit> Ian Wienand proposed opendev/system-config master: Make haproxy role more generic  https://review.opendev.org/677903 [00:36]
<openstackgerrit> Ian Wienand proposed opendev/system-config master: borg-backup: use unique mark in .ssh/config  https://review.opendev.org/758879 [00:43]
<kevinz> ianw: just FYI, I've been notified by the Colo facilities that Linaro-US is under PDU maintenance, the servers will be restarted. The affected time will be narrowed to about 2 hours [00:45]
<ianw> kevinz: ok, thanks [00:47]
<ianw> kevinz: we've still had a few instances of the servers in the control plane shutting down unexpectedly [00:47]
<kevinz> let me check [00:51]
<ianw> maybe if you can scroll back through the logs for nb03 and see if anything jumps out?  if you give me timestamps i might be able to correlate [00:52]
<kevinz> ianw: yes I saw the log, looks like all the instances have been scheduled to 1 host, and all instances fail to launch [01:04]
<kevinz> I will change the scheduler rules to see if it gets better [01:04]
<kevinz> several nodes got no instances, while several nodes are running a lot of instances [01:05]
<clarkb> looks like openstackid02 and 03 still arent in cacti for some reason. Its way late for me to debug that further (previous issue was the LE stuff iirc) [01:50]
*** mlavalle has quit IRC [01:50]
<clarkb> ianw: ^ if that is something you have time for that would be great otherwise I'll try to look tomorrow if fires are quiet :) [01:50]
*** mlavalle has joined #opendev [01:51]
<ianw> clarkb: ok, i don't think ansible got re-enabled yet? [01:57]
<clarkb> I removed the DISABLE-ANSIBLE file a while back [01:57]
<ianw> ahh, ok, will poke then [01:57]
<clarkb> at least I thought I did, worth double checking [01:58]
<ianw> yeah it's gone [02:00]
<ianw> there's a bunch of stuff stuck on logstash-worker12.openstack.org to start [02:01]
<ianw> looks like our standard stuck for undetermined reasons; rebooting it [02:03]
<clarkb> fungi ianw this look good ? #status notice We are investigating an issue with our hosted Gerrit services. We will provide an update as soon as we can. If you want to follow the latest, feel free to join #opendev [03:22]
<clarkb> #status notice We are investigating an issue with our hosted Gerrit services. We will provide an update as soon as we can. If you want to follow the latest, feel free to join #opendev [03:23]
<openstackstatus> clarkb: sending notice [03:23]
-openstackstatus- NOTICE: We are investigating an issue with our hosted Gerrit services. We will provide an update as soon as we can. If you want to follow the latest, feel free to join #opendev [03:23]
<openstackstatus> clarkb: finished sending notice [03:26]
*** lamt has joined #opendev [03:32]
*** aprice has joined #opendev [03:38]
*** fressi has joined #opendev [04:21]
<clarkb> #status notice We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able. [04:28]
<openstackstatus> clarkb: sending notice [04:28]
-openstackstatus- NOTICE: We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able. [04:28]
<openstackstatus> clarkb: finished sending notice [04:31]
*** amotoki has quit IRC [04:35]
*** amotoki has joined #opendev [04:37]
*** lpetrut has joined #opendev [04:51]
*** lpetrut has quit IRC [04:51]
*** fressi has quit IRC [05:35]
*** brinzhang has joined #opendev [05:49]
*** mkalcok has joined #opendev [06:02]
*** gnuoy has joined #opendev [06:03]
*** whoami-rajat__ has joined #opendev [06:08]
<whoami-rajat__> hi #opendev team, is there an estimate as to when review.opendev.org will be active again? [06:08]
*** ralonsoh has joined #opendev [06:12]
*** eolivare has joined #opendev [06:26]
*** user_19173783170 has joined #opendev [06:27]
*** sshnaidm|afk is now known as sshnaidm [06:44]
*** chandankumar has joined #opendev [06:47]
*** qchris has quit IRC [07:02]
*** rpittau|afk is now known as rpittau [07:04]
<yoctozepto> morning infra! how bad is it? [07:04]
*** qchris has joined #opendev [07:06]
*** sboyron has joined #opendev [07:08]
<ttx> yoctozepto: just got up, but last I heard they were comparing backups to see if anything was compromised [07:10]
*** andrewbonney has joined #opendev [07:10]
*** slaweq has joined #opendev [07:12]
<ianw> that's more or less it.  there's no catastrophe but we want to make sure things are good before turning anything back on [07:12]
<yoctozepto> ttx, ianw: thanks! understood [07:13]
<Tengu> does gerrit support cryptographic signatures for the commits? That might help prevent issues.. [07:22]
*** tosky has joined #opendev [07:43]
*** tobias-urdin has joined #opendev [07:45]
*** mgoddard has joined #opendev [07:50]
*** yoshito-ito has joined #opendev [07:51]
*** sean-k-mooney has joined #opendev [07:51]
*** ricolin has joined #opendev [07:52]
*** jlibosva has joined #opendev [07:56]
*** fressi has joined #opendev [08:00]
<whoami-rajat__> can anyone provide an ETA for the same? [08:11]
*** lajoskatona has joined #opendev [08:21]
*** mhu has joined #opendev [08:23]
<AJaeger> whoami-rajat__: Not yet, please be patient. [08:24]
*** hashar has joined #opendev [08:24]
*** yoshito-ito has quit IRC [08:28]
*** insected has joined #opendev [08:32]
-openstackstatus- NOTICE: We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able. [08:34]
*** ChanServ changes topic to "We identified a possible vulnerability in Gerrit and are investigating the potential impact on our services. Out of an abundance of caution we have taken our OpenDev hosted Gerrit system offline. We will update with more information once we are able." [08:34]
<mhu> Morning, is the gerrit vulnerability documented somewhere? We're running a few gerrits too and we would like to audit our instances if the problem is severe [08:37]
<ttx> mhu: it's unclear that the vulnerability is in gerrit at this point [08:40]
<ttx> (or what the vulnerability would be) [08:40]
*** ysandeep|away is now known as ysandeep [08:45]
*** priteau has joined #opendev [08:45]
*** insected has quit IRC [08:51]
*** jcapitao has joined #opendev [09:11]
*** ysandeep is now known as ysandeep|lunch [09:12]
*** Eighth_Doctor has quit IRC [09:24]
*** mordred has quit IRC [09:24]
*** ysandeep|lunch is now known as ysandeep [09:34]
*** mordred has joined #opendev [09:34]
*** sshnaidm is now known as sshnaidm|afk [09:48]
*** Eighth_Doctor has joined #opendev [09:58]
*** user_19173783170 has quit IRC [09:59]
*** fressi has quit IRC [10:07]
*** fressi has joined #opendev [10:08]
*** elod is now known as elod_afk [10:20]
*** DSpider has joined #opendev [10:57]
<AJaeger> #status alert Update on gerrit downtime: After investigation, we believe the incident is related to a compromised Gerrit user account rather than a vulnerability in Gerrit software. We are continuing to review activity to verify the integrity of git data and expect to have an additional update with possible service restoration in approximately 2 hours. [11:02]
<openstackstatus> AJaeger: sending alert [11:02]
-openstackstatus- NOTICE: Update on gerrit downtime: After investigation, we believe the incident is related to a compromised Gerrit user account rather than a vulnerability in Gerrit software. We are continuing to review activity to verify the integrity of git data and expect to have an additional update with possible service restoration in approximately 2 hours. [11:03]
*** ChanServ changes topic to "Update on gerrit downtime: After investigation, we believe the incident is related to a compromised Gerrit user account rather than a vulnerability in Gerrit software. We are continuing to review activity to verify the integrity of git data and expect to have an additional update with possible service restoration in approximately 2 hours." [11:03]
<openstackstatus> AJaeger: finished sending alert [11:08]
<mhu> Thanks for the update AJaeger - looking forward to a full report if possible, this was intriguing to say the least [11:11]
*** insected has joined #opendev [11:13]
*** jcapitao is now known as jcapitao_lunch [11:15]
*** eolivare has quit IRC [11:17]
*** gnuoy has quit IRC [11:23]
*** gnuoy has joined #opendev [11:24]
*** lyarwood has joined #opendev [11:34]
<mnaser> Do we have any ETA? [12:00]
<AJaeger> mnaser: see recent alert "approx. 2 hours", so about another hour from now. But that is a promise for an update... [12:04]
<mnaser> Oh, right. I missed that. Thanks AJaeger [12:05]
*** insected has quit IRC [12:12]
*** elod_afk is now known as elod [12:13]
*** weshay|ruck has joined #opendev [12:14]
*** sshnaidm|afk is now known as sshnaidm [12:20]
*** fnordahl has joined #opendev [12:23]
*** eolivare has joined #opendev [12:26]
*** jcapitao_lunch is now known as jcapitao [12:28]
<fungi> does status alert when there's already an alert in progress just update the message? [13:27]
<clarkb> I'm not sure [13:27]
<fungi> i'm also worried about leaving this in alert for any extended period because if there's a netsplit we'll end up having to manually fix >100 channel topics [13:28]
<fungi> that's why we were initially using status notice [13:28]
<AJaeger> fungi: AFAIK it does [13:29]
<AJaeger> fungi: hope I didn't create chaos already ;( [13:29]
<fungi> okay, i'll cross my fingers [13:30]
<fungi> #status alert We've confirmed that known compromised identities have been reset or had their accounts disabled, and we are auditing other service accounts for signs of compromise before we prepare to restore Gerrit to working order. We will update again in roughly 2 hours. [13:30]
<openstackstatus> fungi: sending alert [13:30]
-openstackstatus- NOTICE: We've confirmed that known compromised identities have been reset or had their accounts disabled, and we are auditing other service accounts for signs of compromise before we prepare to restore Gerrit to working order. We will update again in roughly 2 hours. [13:31]
*** ChanServ changes topic to "We've confirmed that known compromised identities have been reset or had their accounts disabled, and we are auditing other service accounts for signs of compromise before we prepare to restore Gerrit to working order. We will update again in roughly 2 hours." [13:31]
<openstackstatus> fungi: finished sending alert [13:37]
*** mattd01 has joined #opendev [13:47]
*** sshnaidm is now known as sshnaidm|afk [13:52]
<sean-k-mooney> so currently we use ubuntu one for gerrit login [14:11]
<sean-k-mooney> someone downstream made a comment about two factor auth [14:11]
<sean-k-mooney> which that does not support [14:11]
<clarkb> it actually does but its convoluted [14:11]
*** sshnaidm|afk is now known as sshnaidm [14:12]
<sean-k-mooney> im wondering how feasible it would be to swap to the openinfraid system and maybe enable two factor [14:12]
<sean-k-mooney> oh really i dont see it in the ui [14:12]
<clarkb> ya its not in the ui you have to join an lp group or something [14:12]
<clarkb> if you google it you'll get the page with details [14:12]
*** clayg has joined #opendev [14:12]
<clarkb> we also have a spec on improving the authentication system [14:12]
<sean-k-mooney> you join launchpad.net/~sso-2f-testers [14:13]
<sean-k-mooney> someday we will be able to log into everything using ssh key pairs i hope [14:14]
*** slittle1 has quit IRC [14:45]
*** slittle1 has joined #opendev [14:46]
*** hjensas has joined #opendev [15:20]
<mnaser> sean-k-mooney: yeah i use 2fa for ubuntu one [15:23]
<mnaser> its tricky to set up but it works ok for me [15:23]
<sean-k-mooney> via that faq [15:24]
<mnaser> i dont even think you need to be in the sso-2f-testers group (anymore) [15:24]
<sean-k-mooney> e.g. adding yourself to the group [15:24]
<sean-k-mooney> oh ok [15:24]
<mnaser> or maybe you do [15:24]
<mnaser> it looks like i am in it i guess [15:24]
<sean-k-mooney> ya dont remove yourself in case it breaks [15:25]
<sean-k-mooney> i was just wondering if we could swap in the openstack/openinfra openid/oauth2 provider as an alternative login with gerrit [15:26]
<sean-k-mooney> assuming that could support 2 factor [15:26]
<sean-k-mooney> i know gerrit can be configured to use multiple login providers but we dont [15:27]
<mnaser> there's a spec up to work on this sean-k-mooney -- i'd link but i cant :) [15:27]
<sean-k-mooney> oh ok cool [15:27]
<sean-k-mooney> no worries i can find it myself some other time [15:27]
<sean-k-mooney> when things are less on fire [15:28]
<clarkb> #status alert Auditing is progressing but not particularly quickly. We'll keep updating every 2 hours or so. [15:38]
<openstackstatus> clarkb: sending alert [15:38]
-openstackstatus- NOTICE: Auditing is progressing but not particularly quickly. We'll keep updating every 2 hours or so. [15:38]
*** ChanServ changes topic to "Auditing is progressing but not particularly quickly. We'll keep updating every 2 hours or so." [15:39]
<openstackstatus> clarkb: finished sending alert [15:44]
*** fressi has quit IRC [15:50]
*** jlibosva has quit IRC [15:58]
<clayg> thanks for the updates!  keep up the good work.  We appreciate y'all! [16:00]
*** mlavalle has quit IRC [16:00]
*** hamalq has joined #opendev [16:00]
*** hamalq has quit IRC [16:03]
*** hamalq has joined #opendev [16:04]
*** hamalq has quit IRC [16:05]
*** hamalq has joined #opendev [16:05]
*** mlavalle has joined #opendev [16:06]
*** jcapitao has quit IRC [16:16]
*** slittle1 has quit IRC [16:21]
*** lajoskatona has quit IRC [16:32]
*** rpittau is now known as rpittau|afk [16:38]
*** mattd01 has quit IRC [16:52]
*** hashar has quit IRC [17:06]
*** eolivare has quit IRC [17:11]
*** insected has joined #opendev [17:39]
*** insected has quit IRC [17:40]
*** mattd01 has joined #opendev [17:44]
<clarkb> frickler: fungi corvus how about #status ok Not actually ok, but resetting topics in order to reduce IRC spam. [17:52]
<clarkb> then #status notice please refer to https://review.opendev.org/maintenance.html or #opendev for further updates. We're trying to cut back on IRC spam. [17:52]
<fungi> s/resetting/restoring/ maybe [17:53]
<corvus> clarkb: why not one more update to the alert with the link then leave that in place? [17:53]
<corvus> (ie, just stop spamming, but leave the alert until it's really back up?) [17:53]
<clarkb> corvus: there is concern that a netsplit will leave us having to manually reset 100 or so channel topics when we are done [17:53]
<clarkb> fungi: frickler ^ I believe that was coming from you [17:53]
<corvus> if that happens, we can semi-automate that since they're logged [17:54]
<clarkb> ok in that case #status alert We'll stop sending alerts every couple hours. Instead please refer to https://review.opendev.org/maintenance.html or #opendev for the latest. [17:55]
<clarkb> how does that look? [17:56]
<corvus> i'd do: status alert "Gerrit is offline due to a security compromise.  Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates." [17:56]
<clarkb> ++ thats more informative [17:56]
<fungi> i raised that concern, but mostly out of already being overwhelmed with things we need to fix and not wanting to add everyone's irc channel topics to the growing pile [17:56]
<clarkb> #status alert Gerrit is offline due to a security compromise.  Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates. [17:58]
<openstackstatus> clarkb: sending alert [17:58]
-openstackstatus- NOTICE: Gerrit is offline due to a security compromise. Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates. [17:59]
*** ChanServ changes topic to "Gerrit is offline due to a security compromise. Please refer to https://review.opendev.org/maintenance.html or #opendev for the latest updates." [17:59]
<openstackstatus> clarkb: finished sending alert [18:04]
*** Tengu has quit IRC [18:06]
*** Tengu has joined #opendev [18:08]
*** qizhangapp has joined #opendev [18:09]
*** andrewbonney has quit IRC [18:09]
*** Tengu has quit IRC [18:21]
*** hashar has joined #opendev [18:24]
*** qizhangapp has quit IRC [18:27]
*** Tengu has joined #opendev [18:27]
<dmsimard> hug ops <3 [18:29]
*** portdirect has joined #opendev [18:31]
<clayg> hugs ops <3 [18:35]
<clayg> the plan on the maintenance.html is extensive - it sounds reasonable - BUT an audit of every change in every repo across distributed teams going back to 10/6 will take time - we need to download and review WIP ASAP (or temporarily move work elsewhere) [18:37]
<clayg> can you provide any more estimates - i'm sure folks who have been at this will eventually need a break! [18:38]
<clarkb> we've got work split up right now so we're hoping that we can make good progress but I'm wary of putting an eta on it [18:40]
<clarkb> since we don't know what we'll find yet [18:40]
<mhu> good luck folks, sounds like going through 2 weeks' worth of activity on opendev's gerrit won't be fun [18:43]
<avass> clarkb: good luck! I hope everything goes well [18:44]
<yoctozepto> hey infra; keeping fingers crossed for you [18:47]
<yoctozepto> meanwhile having a question [18:47]
<yoctozepto> is it possible to fetch proposed change contents if I know its id but gerrit is down? [18:48]
<clarkb> yes, they are in gitea [18:48]
<yoctozepto> as in two digits? [18:48]
<yoctozepto> thanks, I'll try [18:48]
<clarkb> ya they shard by the last two digits in the change number [18:48]
<clarkb> then follow it with the full change number [18:49]
<yoctozepto> and can also include patchset number? [18:49]
<clarkb> oh ya the patchset comes next in the path iirc [18:49]
<clarkb> so xy/abcxy/1 [18:49]
<yoctozepto> trying then [18:49]
<yoctozepto> yup, it forces me to guess the patchset number [18:51]
<yoctozepto> but works overall so better than nothing ;D [18:54]
<yoctozepto> thanks clarkb [18:54]
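
The ref layout clarkb describes above (shard by the last two digits of the change number, then the full change number, then the patchset) can be sketched in a few lines of Python. This is only an illustration of Gerrit's `refs/changes/XX/CHANGE/PATCHSET` naming, not something used in the conversation; the function name `change_ref` is hypothetical, and the change number in the example is borrowed from the review URL earlier in this log.

```python
def change_ref(change_number: int, patchset: int) -> str:
    """Build the Git ref that holds one patchset of a Gerrit change.

    Hypothetical helper for illustration only; it mirrors the
    "xy/abcxy/1" shape described in the conversation.
    """
    if change_number < 1 or patchset < 1:
        raise ValueError("change and patchset numbers start at 1")
    # Shard directory: last two digits of the change number, zero-padded.
    shard = f"{change_number % 100:02d}"
    return f"refs/changes/{shard}/{change_number}/{patchset}"


# Change 758879, patchset 1 -> refs/changes/79/758879/1
print(change_ref(758879, 1))
```

A ref built this way can be passed to `git fetch <repo-url> <ref>` followed by `git checkout FETCH_HEAD`, where `<repo-url>` is a placeholder for the mirror's repository URL. Assuming the mirror exposes these refs as described, `git ls-remote <repo-url> 'refs/changes/79/758879/*'` should list the existing patchsets, which avoids guessing the patchset number.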
<yoctozepto> I wonder what kind of suspicious activity it was that you discovered [18:55]
*** tbogue has joined #opendev [19:04]
*** priteau has quit IRC [19:16]
*** mattd01 has quit IRC [19:21]
*** tosky has quit IRC [19:54]
*** xavpaice has joined #opendev [20:16]
*** timburke has joined #opendev [20:23]
*** sboyron has quit IRC [21:04]
*** slaweq has quit IRC [21:12]
*** slaweq has joined #opendev [21:17]
*** slaweq has quit IRC [21:26]
*** hashar has quit IRC [21:45]
*** mlavalle has quit IRC [21:45]
*** mlavalle has joined #opendev [21:47]
*** ralonsoh has quit IRC [22:12]
*** qchris has quit IRC [22:41]
*** guilhermesp has joined #opendev [22:52]
*** qchris has joined #opendev [22:54]
*** rchurch has joined #opendev [22:58]
*** mlavalle has quit IRC [22:59]
*** mattmceuen has joined #opendev [23:08]
*** DjeufackZane has joined #opendev [23:10]
*** DjeufackZane has quit IRC [23:18]
*** whoami-rajat__ has quit IRC [23:56]