Wednesday, 2020-12-09

*** tosky has quit IRC00:00
*** slaweq has quit IRC00:05
*** slaweq has joined #openstack-meeting00:07
*** martial has quit IRC00:18
*** yasufum has quit IRC00:29
*** yasufum has joined #openstack-meeting00:29
*** macz_ has quit IRC00:39
*** benj_ has quit IRC00:54
*** zigo has quit IRC00:54
*** benj_ has joined #openstack-meeting00:54
*** mlavalle has quit IRC01:04
*** manpreet has quit IRC01:09
*** amotoki has quit IRC02:01
*** amotoki has joined #openstack-meeting02:02
*** lifeless has quit IRC02:25
*** lifeless has joined #openstack-meeting02:27
*** macz_ has joined #openstack-meeting02:31
*** rfolco has joined #openstack-meeting02:34
*** macz_ has quit IRC02:36
*** jamesdenton has quit IRC03:17
*** jamesdenton has joined #openstack-meeting03:18
*** psachin has joined #openstack-meeting03:20
*** rfolco has quit IRC03:24
*** armax has quit IRC03:38
*** manpreet has joined #openstack-meeting03:41
*** ricolin has quit IRC04:21
*** vishalmanchanda has joined #openstack-meeting04:55
*** arne_wiebalck has quit IRC05:27
*** yonglihe has quit IRC05:27
*** yonglihe has joined #openstack-meeting05:28
*** zeestrat has quit IRC05:28
*** arne_wiebalck has joined #openstack-meeting05:29
*** zeestrat has joined #openstack-meeting05:30
*** evrardjp has quit IRC05:33
*** evrardjp has joined #openstack-meeting05:33
*** zbr has quit IRC06:06
*** manpreet has quit IRC06:10
*** yasufum has quit IRC06:12
*** gyee has quit IRC06:28
*** yasufum has joined #openstack-meeting06:38
*** mvalsecc has joined #openstack-meeting06:50
*** lpetrut has joined #openstack-meeting07:12
*** psachin has quit IRC07:17
*** ralonsoh has joined #openstack-meeting07:19
*** dklyle has quit IRC07:34
*** lifeless has quit IRC07:54
*** lifeless has joined #openstack-meeting07:56
*** tosky has joined #openstack-meeting08:02
*** rpittau|afk is now known as rpittau08:14
*** mvalsecc has quit IRC08:33
*** ociuhandu has joined #openstack-meeting08:42
*** geguileo has joined #openstack-meeting08:51
*** ociuhandu has quit IRC08:52
*** rcernin has quit IRC08:56
*** rcernin has joined #openstack-meeting08:56
*** ociuhandu has joined #openstack-meeting08:59
*** rfolco has joined #openstack-meeting09:00
*** rcernin has quit IRC09:23
*** rcernin has joined #openstack-meeting09:32
*** rcernin has quit IRC09:48
*** rcernin has joined #openstack-meeting09:49
*** macz_ has joined #openstack-meeting10:02
*** macz_ has quit IRC10:07
*** ssbarnea has joined #openstack-meeting10:18
*** yasufum has quit IRC10:18
*** zbr has joined #openstack-meeting10:24
*** rcernin has quit IRC10:31
*** ssbarnea has quit IRC10:36
*** ociuhandu has quit IRC10:47
*** ociuhandu has joined #openstack-meeting10:51
*** rcernin has joined #openstack-meeting10:53
*** zbr has quit IRC11:25
*** zbr has joined #openstack-meeting11:28
*** ociuhandu has quit IRC11:35
*** e0ne has joined #openstack-meeting11:37
*** zbr has quit IRC11:46
*** zbr has joined #openstack-meeting11:49
*** ociuhandu has joined #openstack-meeting11:49
*** ociuhandu_ has joined #openstack-meeting11:58
*** ociuhandu has quit IRC12:02
*** rcernin has quit IRC12:04
*** ociuhandu_ has quit IRC12:51
*** ociuhandu has joined #openstack-meeting12:51
*** ociuhandu has quit IRC13:06
*** ociuhandu has joined #openstack-meeting13:08
*** ociuhandu has quit IRC13:16
*** ociuhandu has joined #openstack-meeting13:24
*** zbr has quit IRC13:24
*** zbr has joined #openstack-meeting13:27
*** zbr has quit IRC13:45
*** zbr has joined #openstack-meeting13:47
*** ricolin has joined #openstack-meeting13:58
*** rfolco is now known as rfolco|brb13:59
*** ociuhandu has joined #openstack-meeting14:00
*** zbr has quit IRC14:03
*** zbr has joined #openstack-meeting14:04
*** ociuhandu has quit IRC14:07
*** ociuhandu has joined #openstack-meeting14:11
*** TrevorV has joined #openstack-meeting14:17
*** johanssone has quit IRC14:23
*** bbowen has quit IRC14:24
*** bbowen has joined #openstack-meeting14:24
*** johanssone has joined #openstack-meeting14:26
*** ociuhandu has quit IRC14:36
*** ociuhandu has joined #openstack-meeting14:43
*** ociuhandu has quit IRC15:11
*** ociuhandu has joined #openstack-meeting15:15
*** TrevorV has quit IRC15:18
*** ociuhandu has quit IRC15:20
*** ociuhandu has joined #openstack-meeting15:25
*** ralonsoh has quit IRC15:27
*** ralonsoh has joined #openstack-meeting15:27
*** zbr has quit IRC15:29
*** ociuhandu has quit IRC15:30
*** zbr has joined #openstack-meeting15:31
*** ralonsoh has quit IRC15:34
*** ociuhandu has joined #openstack-meeting15:35
*** dklyle has joined #openstack-meeting15:37
*** ralonsoh has joined #openstack-meeting15:38
*** ociuhandu has quit IRC15:39
*** ociuhandu has joined #openstack-meeting15:43
*** ralonsoh_ has joined #openstack-meeting15:45
*** ralonsoh has quit IRC15:46
*** ociuhandu has quit IRC15:47
*** zbr has quit IRC15:48
*** dsariel has joined #openstack-meeting15:49
*** zbr has joined #openstack-meeting15:51
*** macz_ has joined #openstack-meeting15:52
*** armax has joined #openstack-meeting16:00
*** rajinir has joined #openstack-meeting16:04
*** mlavalle has joined #openstack-meeting16:13
*** ociuhandu has joined #openstack-meeting16:29
*** rfolco|brb is now known as rfolco16:46
*** lpetrut has quit IRC16:46
*** rpittau is now known as rpittau|afk17:12
*** e0ne has quit IRC17:12
*** TrevorV has joined #openstack-meeting17:19
*** ociuhandu_ has joined #openstack-meeting17:24
*** ociuhandu has quit IRC17:27
*** zbr has quit IRC17:27
*** ociuhandu_ has quit IRC17:28
*** zbr has joined #openstack-meeting17:30
*** ociuhandu has joined #openstack-meeting17:34
*** e0ne has joined #openstack-meeting17:34
*** ociuhandu has quit IRC17:39
*** zbr has quit IRC17:47
*** zbr has joined #openstack-meeting17:50
*** zbr has quit IRC17:56
*** gyee has joined #openstack-meeting17:57
*** zbr has joined #openstack-meeting17:58
*** yasufum has joined #openstack-meeting18:01
*** baojg has quit IRC18:37
*** e0ne has quit IRC18:51
*** armstrong has joined #openstack-meeting18:55
*** yasufum has quit IRC19:05
*** ralonsoh_ has quit IRC20:28
*** ociuhandu has joined #openstack-meeting20:42
*** ociuhandu has quit IRC20:46
*** ociuhandu has joined #openstack-meeting20:48
*** ociuhandu has quit IRC20:52
*** timburke has quit IRC20:53
*** vishalmanchanda has quit IRC20:54
*** zaitcev has joined #openstack-meeting20:56
*** ociuhandu has joined #openstack-meeting20:59
*** acoles has joined #openstack-meeting21:00
*** tdasilva has joined #openstack-meeting21:00
acoleswho's here for the swift meeting?21:03
mattoliverauo/21:03
kota_o/21:03
rledisezo/21:03
mattoliverau#startmeeting swift21:03
openstackMeeting started Wed Dec  9 21:03:42 2020 UTC and is due to finish in 60 minutes.  The chair is mattoliverau. Information about MeetBot at http://wiki.debian.org/MeetBot.21:03
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.21:03
*** openstack changes topic to " (Meeting topic: swift)"21:03
acoles#startmeeting21:03
openstackThe meeting name has been set to 'swift'21:03
openstackacoles: Error: Can't start another meeting, one is in progress.  Use #endmeeting first.21:03
mattoliverauI beat ya21:03
rledisez:D21:03
acolesoh thanks mattoliverau for starting the meeting21:03
*** ociuhandu has quit IRC21:04
mattoliverauI'll start chairing i guess until tim comes online21:04
acolesdoes that mean you will chair as well hahah!21:04
acolesso, apologies form timburke, he has unexpected childcare duties21:04
mattoliverau#topic Audit watchers21:04
*** openstack changes topic to "Audit watchers (Meeting topic: swift)"21:04
acolesfrom*21:04
acolesthanks mattoliverau21:04
mattoliverauacoles: cool thanks for letting us know.21:05
acolesBTW the agenda is here https://wiki.openstack.org/wiki/Meetings/Swift21:05
mattoliverauoh thanks21:05
*** armstrong has quit IRC21:05
zaitcevBut it's old21:05
mattoliverau#link  https://wiki.openstack.org/wiki/Meetings/Swift21:05
mattoliverauAny updates on audit watchers?21:05
mattoliverauI know I reviewed it again last night21:06
mattoliverauand it's looking really great21:06
acolesI know the final agenda topic is intended for today at least21:06
mattoliverauI think we need to get some documentation in place, but that is a follow up patch I feel.21:06
zaitcevI am going to write it.21:07
zaitcevBy "it" I mean the doc for watchers.21:08
mattoliverauCool, thanks zaitcev, I'll review it and land it when you're done.21:08
*** rfolco has quit IRC21:08
mattoliverauThe PR in question is: https://review.opendev.org/c/openstack/swift/+/70665321:09
mattoliverauI was tempted to put a +A on it, but knew Tim said we planned to review it.21:09
mattoliverau*he21:09
mattoliverauIf no more questions on audit watchers shall we move on?21:10
zaitcevMove on.21:10
acolesgreat work guys, thanks21:10
zaitcevIt's dsariel's debut, BTW21:10
*** ociuhandu has joined #openstack-meeting21:10
mattoliverau\o/21:10
zaitcevSo I wasn't touching it on purpose, to let him get the lumps :-)21:11
dsarielwith zaitcev's great help21:11
mattoliverau#topic s3api, +segments container, and ACLs21:11
*** openstack changes topic to "s3api, +segments container, and ACLs (Meeting topic: swift)"21:11
mattoliverauI know this is an old agenda, so do we have any update on this?21:11
*** slaweq has quit IRC21:12
mattoliverau#link https://review.opendev.org/76310621:12
zaitcevI looked at it and it seemed fine21:12
zaitcevBut I didn't +221:12
zaitcevOh21:12
zaitcevI know. Clay sniped me on it.21:13
mattoliveraulooks like it's +Aed21:13
mattoliverautho not merged21:13
mattoliverauSo I guess it just needs handholding through the gate.21:13
zaitcevI think we can move on from that particular thing onto s3api in general if anyone knows what's up with it.21:13
*** slaweq has joined #openstack-meeting21:13
acolesmove on - IIRC last week it was just to nudge it for a +A21:14
acolessince then it has been in recheck-land21:14
mattoliveraukk21:14
zaitcevYea. I knew it had no chance, so didn't recheck until Tim's "retry" patch.21:14
mattoliverau we can come back to s3api if anyone has anything at the end21:14
mattoliverau#topic what still has to be done in order to enable automatic sharding21:15
*** openstack changes topic to "what still has to be done in order to enable automatic sharding (Meeting topic: swift)"21:15
acolesthat's a great question21:15
mattoliverauit is :)21:15
zaitcevbefore that, mattoliverau, are you working on that off the tip we have right now?21:15
mattoliverauthanks dsariel for adding it :)21:15
dsarielI will be happy to help with this. Anything I can do?21:16
zaitcevDuring PTG someone (Tim or Clay) mentioned that nVidia has some patches in production that are necessary for the current sharding.21:16
zaitcevSo I was wondering where the development is occurring.21:16
acolesthere's a few patches on gerrit around sharding, some of which we have shipped21:16
zaitcevOh21:17
zaitcevI thought they weren't in gerrit.21:17
mattoliverauSo I obviously can't speak for nvidia.21:17
zaitcevOK, I can find them. Well, David can find them heh.21:17
mattoliveraubut I believe they're not using auto sharding.21:17
zaitcevyes yes, just the existing sharding21:17
mattoliverauthey have some smarts in their controller that identify things that need sharding and the sharding management tool is used.21:17
acolesbut first, at a high level, my personal view is that we need to (a) put in place all we think we need to recover from split-brain autosharding and (b) convince ourselves that we have done the best we can to avoid split-brain auto-sharding21:18
mattoliverauFor auto sharding, which is where I want to get to, I believe there are some upstream WIP patches I have.21:18
mattoliverauwhat acoles said ^21:18
acoleswe have a proprietary approach to avoiding split-brain sharding, and we do not enable autosharding21:18
acoleswe use swift-manage-shard-ranges21:19
zaitcevGot it.21:19
mattoliverauTurns out the main problem with the current auto-sharding approach is there are ways this split brain can occur.21:19
acolesoh, and one final piece, we need to have autoshrinking sorted too21:20
mattoliverau+121:20
mattoliverauI have one POC/WIP patch that improves the leader election, but after playing with it, it minimises these edge cases without removing them.21:20
mattoliverauSo moved on to what acoles mentioned. If we have a way to recover from split brains and gaps then that needs to come first.21:21
zaitcevGuys21:22
zaitcevOur quorum is 1/2 or greater, right?21:22
mattoliverauthen we might find we're happy to have the simple "sam" leader election approach we have now.. or decide to improve leader election.21:23
mattoliverauwe have 2 quorums21:23
zaitcevCan it be used productively, so that the minority always agrees with majority (which presumably has a leader selected)?21:23
dsarielcan I ask a noob question: what is split-brain sharding?21:23
mattoliveraudsariel: when more than 1 node thinks it is the leader and each makes a different set of shard ranges21:24
zaitcevdsariel: it's a network partition, so now you have 20 nodes doing one thing and 15 nodes doing other thing.21:24
dsarielgot it. thanks21:24
mattoliverauwe have a ceil[replica/2] and a majority quorum ( replica / 2 + 1 )21:24
zaitcevreplica // 2 or21:25
mattoliverauso yeah we'd use a majority quorum for making leader election decisions if we went and asked.21:25
zaitcevpy3 world is harsh21:25
mattoliveraulol21:25
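The two quorum sizes mentioned above can be sketched as follows. This is only an illustration with hypothetical function names, not code taken from Swift. It also shows why the `//` remark matters: in Python 3, `/` is true division, so `replica / 2 + 1` yields a float.

```python
def half_quorum(replicas):
    """ceil(replicas / 2), computed with integer arithmetic."""
    return (replicas + 1) // 2


def majority_quorum(replicas):
    """Strictly more than half of the replicas."""
    return replicas // 2 + 1


for n in (2, 3, 4, 5):
    print(n, half_quorum(n), majority_quorum(n))

# Python 3: true division returns a float, floor division an int.
print(3 / 2 + 1, 3 // 2 + 1)
```

For odd replica counts the two values coincide; they differ only for even counts, where half of the replicas is not a majority.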
acolesyes, so in the auto-sharding mode the node that thinks it is node index 0 in ring picks shard ranges and replicates them to other nodes. the problem is if another node also thinks it is 0, because it is likely to pick a different set of shards21:25
mattoliverauYes, the WIP PR I have adds some majority quorum on who actually is index 0 and what's the version of the ring, to get rid of old nodes who may not agree because they have an old ring21:26
zaitcevOh, I see.21:27
mattoliveraubut that's a lot of extra requests. it does minimise the split brain edge case window, but doesn't completely eradicate it.21:27
zaitcevEven if the network is split, the administrator is not split, the human maintains the rings, and that is the source of truth even if not used directly by sharding.21:28
acolesso mattoliverau 's work is towards my step (b) above - reduce the chance of a mistake in choosing the leader21:29
mattoliverauyup21:29
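A minimal sketch of the majority check described above, assuming each replica reports its view as a (ring_version, index_0_node) pair. The function and parameter names are hypothetical and this is not the WIP patch's implementation. As noted in the discussion, a check like this narrows the split-brain window but does not close it.

```python
from collections import Counter


def may_act_as_leader(my_id, my_view, peer_views, replicas):
    """Return True only if this node believes it is index 0 and a
    majority of replicas (itself included) share exactly that view.

    my_view and each entry of peer_views are (ring_version, index_0_node)
    tuples; peer_views holds the answers from the other replicas.
    """
    if my_view[1] != my_id:
        return False
    votes = Counter(peer_views)
    votes[my_view] += 1  # count our own view
    return votes[my_view] >= replicas // 2 + 1


# Node "a" has ring version 7; one peer agrees, one has an old ring.
print(may_act_as_leader("a", (7, "a"), [(7, "a"), (6, "b")], replicas=3))
# Node "b" has the old ring and is outvoted.
print(may_act_as_leader("b", (6, "b"), [(7, "a"), (7, "a")], replicas=3))
```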
acolesI've been working on recovering from mistakes21:29
mattoliveraubut turns out step (a)21:29
mattoliverauis what we need to solve.21:29
acolesso https://review.opendev.org/c/openstack/swift/+/765624 is a WIP, and i think mattoliverau may also have some ideas21:30
acoles^^ building on some discussion at the PTG21:30
mattoliverauwe apparently gerrit is not loading for me atm...21:31
mattoliverau*well21:31
zaitcevsame here21:31
acolesI'm deliberately keeping it simple at first - the cases we have seen are 'simple' duplicate paths that you would expect if two nodes had acted as leaders but with different local sets of objects, so choosing slightly different shard ranges21:31
mattoliverauAnyway step (a) is what we want to solve first, when we do, leader election edge cases become less of an issue.21:31
acoleshmmm, I just pushed it to gerrit but now also not loading for me21:32
acolesanyway, that patch adds a 'repair' command to swift-manage-shard-ranges that will find all paths, choose one and shrink all others into it21:32
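The "find all paths, choose one, shrink the others" idea can be sketched as below. This is a toy model and not the code in the patch: shard ranges are plain (lower, upper) tuples, the empty string stands for an unbounded end, and simply keeping the first path found is an arbitrary assumption made for illustration.

```python
def find_paths(ranges, start=""):
    """Return every chain of contiguous (lower, upper) ranges that begins
    at `start` and ends at the unbounded upper marker ""."""
    paths = []
    for lower, upper in ranges:
        if lower != start:
            continue
        if upper == "":
            paths.append([(lower, upper)])
        elif upper > lower:  # skip malformed ranges to guarantee progress
            for tail in find_paths(ranges, upper):
                paths.append([(lower, upper)] + tail)
    return paths


# Two would-be leaders split the same namespace at different points.
leader_a = [("", "m"), ("m", "")]
leader_b = [("", "k"), ("k", "")]
all_ranges = leader_a + leader_b

paths = find_paths(all_ranges)
chosen = paths[0]  # hypothetical choice rule: keep the first path
to_shrink = [r for r in all_ranges if r not in chosen]
print(chosen)
print(to_shrink)
```

Every range left in `to_shrink` belongs to a duplicate path and would be shrunk into the chosen one.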
acolesmattoliverau: did you have some graph visualisation stuff? IIRC you did some work before I returned to swift-land? it would be cool to see that too21:34
mattoliverauAnd I've been playing with a RangeScanner that can rebuild and/or choose best paths. Its latest addition is testing out a new gap filler approach that uses the weighting algorithm to choose the best path (acoles' spider suggestion).21:34
acolescool. so mattoliverau checkout https://review.opendev.org/c/openstack/swift/+/765624 - we may have some overlaps :)21:35
mattoliverauacoles: yeah the patch adds some graphviz to the manage-shard-ranges show command that turns shard ranges into a graph.21:35
mattoliveraunice :)21:35
acolesbut I have dodged gap-filling for now21:35
mattoliverauI definitely will!21:36
acolesso one answer to dsariel's question - it would be great to have review of the patches we have in progress :) and review might include getting a container sharded with split-brain and checking out the new repair command etc21:36
mattoliverauand so should dsariel :)21:37
mattoliverau+121:37
acolesdsariel: the probe test in the patch may be a good starting point to understand the problem space21:37
mattoliverauthe code isn't finished but reviewing and testing would be a huge help21:37
acoles(the probe test uses ridiculously small numbers of objects vs real life)21:38
*** e0ne has joined #openstack-meeting21:38
acolesI also have https://review.opendev.org/c/openstack/swift/+/765623/4 which adds a 'compact' command to swift-manage-shard-ranges, it's a precursor to the other because it uses similar functionality (i.e. shrinking unwanted shards)21:39
acolessorry, I feel like this is shameless promotion of my patches, don't mean it to be21:39
acolesit's quite likely that reviewing those and mattoliverau's patches will generate further work to help move things along21:40
mattoliveraulol, not that's for all the work21:40
mattoliverau*thanks21:41
mattoliverauapparently I cant type this morning21:41
acolesdsariel: is that helpful?21:41
mattoliverauseeing as I can't access gerrit atm, I think this is my rangescanner (plus graphviz) POC/WIP: https://review.opendev.org/#/c/749614/21:42
dsarielthanks, the probe tests were the place I started to look at. I'll take a look at the patches. Guess I will have many questions. Apologies in advance for that.21:42
acolesplease ask questions21:42
mattoliverauyou probably will, and thanks fine, great even. you'll see things a fresh which I think will be great!21:42
mattoliverau*that's fine21:43
dsarielAdding more objects to probe tests will increase the time they take. Is it possible to run them in a separate job?21:43
acolesBTW those patches I linked are on a chain that starts with a fix to shard audit that we found we needed in order to shrink some overlaps21:43
mattoliverauman, I need to read before I press <enter> :P21:43
acolesso dig down through the patch dependency21:43
acolesyou can run an individual probetest with something like 'nosetests ./test/probe/test_sharder.py:TestManagedContainerSharding.test_manage_shard_ranges_repair_shard'21:44
mattoliverauAnything else on this topic? seems dsariel has a bunch of code to read and test :)21:45
acolesthe small object count isn't necessarily a problem, I was just explaining that the tests aren't run at real world scale :)21:45
dsariel:-)21:45
acolesone other thing21:45
acolesI rediscovered this tool 'python swift/cli/shard-info.py'21:46
acolesit dumps all the root and shard container state after a probe test. its use is really limited to probe test analysis, it could definitely be improved, but it is a lot better than nothing21:47
acolesdsariel: reach out to us in #openstack-swift with any questions21:48
mattoliverau+10021:48
dsarielawesome, thanks! will try it21:48
mattoliveraulet's move on to open floor then.21:48
mattoliverau#topic open floor21:48
*** openstack changes topic to "open floor (Meeting topic: swift)"21:48
*** timburke has joined #openstack-meeting21:49
mattoliverauis there anything else anyone wants to bring up and discuss?21:49
dsarielthanks a lot for the directions21:49
zaitcevYes21:49
zaitcevnot on the topic of sharding though21:49
* timburke sneaks in finally21:49
acolesphew timburke will rescue us21:49
mattoliveraulol, hey timburke :)21:49
zaitcevtimburke: 11 minutes left, come on21:50
zaitcevkid okay?21:50
timburkeyup, just got overdue for his nap21:50
zaitcevso, I was looking at the failure of Romain's patch on py27 and so far I was unsuccessful.21:51
timburkeoh yeah -- the queuing patch, i think, is that right?21:52
zaitcevI pulled all the remotely relevant patches from eventlet 2.29 into the 2.25 that's locked in tox, but no dice.21:52
zaitcevI think I'll need to find just where the exceptions get stuck.21:53
zaitcevMost of the time it's ChunkWriteTimeout, although not always.21:53
zaitcevI'm going to dump every ChunkWriteTimeout as it's instantiated and trawl through them.21:54
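One way to dump an exception at the point where it is instantiated is to wrap its `__init__` and record the current stack. The sketch below is only an assumption about how that could be done. It uses a stand-in `ChunkWriteTimeout` class so that it runs on its own, and `trace_instantiation` is a hypothetical helper.

```python
import traceback


class ChunkWriteTimeout(Exception):
    """Stand-in for the real class, so that the sketch is self-contained."""


created = []  # one formatted stack per instantiation


def trace_instantiation(cls):
    """Wrap cls.__init__ so every new instance records where it was made."""
    original_init = cls.__init__

    def traced_init(self, *args, **kwargs):
        created.append("".join(traceback.format_stack(limit=5)))
        original_init(self, *args, **kwargs)

    cls.__init__ = traced_init
    return cls


trace_instantiation(ChunkWriteTimeout)


def slow_write():
    # The exception is created here; it does not need to be raised
    # for the instantiation to be recorded.
    return ChunkWriteTimeout("timed out")


exc = slow_write()
print(created[0])
```

Because the stack is captured at creation time rather than when the exception is raised, this also catches timeouts that are created for one test and fire during a later one.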
zaitcevI'm not asking for help so far, but it looks grim21:54
timburkefwiw i suspect the ChunkWriteTimeout may be from an old watchdog for an already-passed test21:54
zaitcevYeah, something like that.21:55
timburkeit reminds me in some ways of the trouble i've seen in prod where a ChunkWriteTimeout pops and logs a path *but it has the wrong txn id*21:55
timburkei've grown worried about eventlet's (green)threadlocal behavior...21:56
zaitcevBut it works fine on py3, right?21:56
timburke...i guess? seems to be better, anyway21:57
mattoliverausounds.. tedious, thanks zaitcev for going down this particular rabbit hole.21:57
mattoliverauwe have 3 minutes before we reach time. Anything else or shall we move any discussions into #openstack-swift ?21:58
zaitcevI'm all set.21:58
timburkethere are some py3 patches and some s3api patches i'd appreciate eyes on, but i can drop those in -swift21:59
mattoliveraukk thanks timburke21:59
mattoliverautimburke: maybe you could update priority reviews if you get the chance :)21:59
mattoliverauI'll call it21:59
timburkethank *you* mattoliverau! sorry i hadn't gotten to updating the agenda21:59
timburkethat's a great idea!22:00
acolesthanks mattoliverau for jumping in to chair, great job!22:00
mattoliverauThanks for all your hard work and thanks for working on swift!22:00
mattoliverau#endmeeting22:00
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"22:00
openstackMeeting ended Wed Dec  9 22:00:15 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)22:00
openstackMinutes:        http://eavesdrop.openstack.org/meetings/swift/2020/swift.2020-12-09-21.03.html22:00
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/swift/2020/swift.2020-12-09-21.03.txt22:00
openstackLog:            http://eavesdrop.openstack.org/meetings/swift/2020/swift.2020-12-09-21.03.log.html22:00
mattoliveraumy pleasure22:00
*** zaitcev has left #openstack-meeting22:00
*** acoles has left #openstack-meeting22:00
*** rcernin has joined #openstack-meeting22:02
*** rcernin has quit IRC22:04
*** rcernin has joined #openstack-meeting22:05
*** ociuhandu has quit IRC22:09
*** raildo has quit IRC22:13
*** e0ne has quit IRC22:14
*** ociuhandu has joined #openstack-meeting22:15
*** timburke has quit IRC22:16
*** ociuhandu has quit IRC22:19
*** ociuhandu has joined #openstack-meeting22:20
*** tdasilva has quit IRC22:27
*** tdasilva has joined #openstack-meeting22:28
*** timburke has joined #openstack-meeting22:33
*** ociuhandu has quit IRC22:34
*** ociuhandu has joined #openstack-meeting22:41
*** baojg has joined #openstack-meeting22:43
*** dsariel has quit IRC22:51
*** baojg has quit IRC23:09
*** timburke has quit IRC23:09
*** baojg has joined #openstack-meeting23:09
*** TrevorV has quit IRC23:22
*** slaweq has quit IRC23:23
*** zbr has quit IRC23:46
*** zbr has joined #openstack-meeting23:48
