Friday, 2019-10-11

00:21 *** diablo_rojo has quit IRC
00:23 <openstackgerrit> Tim Burke proposed openstack/python-swiftclient master: Actually run tempauth tests in swiftclient dsvm jobs  https://review.opendev.org/687773
00:23 <openstackgerrit> Tim Burke proposed openstack/python-swiftclient master: v1auth: support endpoint_data_for() api  https://review.opendev.org/687774
00:40 *** gyee has quit IRC
00:55 <openstackgerrit> Tim Burke proposed openstack/swift master: Fix misleading error msg if swift.conf unreadable  https://review.opendev.org/581280
01:13 <openstackgerrit> Tim Burke proposed openstack/python-swiftclient master: v1auth: support endpoint_data_for() api  https://review.opendev.org/687774
02:59 *** diablo_rojo has joined #openstack-swift
03:31 *** lbragstad_ has joined #openstack-swift
03:31 *** lbragstad has quit IRC
04:25 *** lbragstad has joined #openstack-swift
04:26 *** diablo_rojo has quit IRC
04:28 *** lbragstad_ has quit IRC
04:33 *** lbragstad_ has joined #openstack-swift
04:34 *** lbragstad has quit IRC
04:40 *** lbragstad has joined #openstack-swift
04:41 *** lbragstad_ has quit IRC
04:55 *** pcaruana has joined #openstack-swift
05:01 *** tkajinam has quit IRC
05:02 *** tkajinam has joined #openstack-swift
05:23 *** tkajinam has quit IRC
05:23 *** tkajinam has joined #openstack-swift
05:42 <viks___> Hi, on one of my swift cluster storage nodes, a disk had errors and so drive-audit tried to unmount it. But the unmount hung, the load average shot up, and the node became unresponsive. After restarting that storage node and replacing the drive, everything went back to normal. But how do i handle such a case without restarting? Why did the unmount hang? Does anyone have any idea/solution?
05:49 *** evrardjp_ has joined #openstack-swift
05:53 *** ktsuyuzaki has joined #openstack-swift
05:53 *** ChanServ sets mode: +v ktsuyuzaki
05:54 *** timss- has joined #openstack-swift
05:55 *** evrardjp has quit IRC
05:55 *** kota_ has quit IRC
05:55 *** timss has quit IRC
05:56 *** irclogbot_2 has quit IRC
05:58 *** irclogbot_0 has joined #openstack-swift
06:00 *** lbragstad_ has joined #openstack-swift
06:01 *** lbragstad has quit IRC
06:10 *** early has quit IRC
06:11 *** rdejoux has quit IRC
06:11 *** early has joined #openstack-swift
06:31 *** baojg has quit IRC
07:03 *** rcernin has quit IRC
07:03 *** tesseract has joined #openstack-swift
07:04 *** ccamacho has joined #openstack-swift
07:09 *** rdejoux has joined #openstack-swift
07:53 *** rpittau|afk is now known as rpittau
07:58 *** tkajinam has quit IRC
08:07 <openstackgerrit> Merged openstack/swift master: Fix misleading error msg if swift.conf unreadable  https://review.opendev.org/581280
08:07 *** mikecmpbll has joined #openstack-swift
08:33 <alecuyer> viks___: If the disk was failing and unresponsive, that would explain both the high load average and the inability to unmount it (you may have seen processes stuck in "D" state?). When that happens on linux, I don't know of a solution other than a reboot
08:39 <viks___> alecuyer: Thanks... I did not check for "D" state, which i was not aware of... Next time i'll check for it.
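
A quick way to spot the "D" (uninterruptible sleep) state alecuyer mentions, next time a drive misbehaves, is a sketch like the following; the exact fields shown are just one reasonable choice:

    # list processes stuck in uninterruptible sleep (usually blocked on I/O)
    ps axo pid,stat,wchan:32,comm | awk '$2 ~ /D/'
    # and check the kernel log for the failing drive
    dmesg -T | grep -i 'i/o error'
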
08:49 *** takamatsu has joined #openstack-swift
09:09 *** e0ne has joined #openstack-swift
09:56 *** mvkr has quit IRC
10:08 *** rcernin has joined #openstack-swift
10:09 *** mvkr has joined #openstack-swift
10:15 *** rpittau is now known as rpittau|bbl
10:21 *** rcernin has quit IRC
11:12 *** rdejoux has quit IRC
11:17 *** mikecmpbll has quit IRC
11:26 *** mikecmpbll has joined #openstack-swift
11:40 *** rdejoux has joined #openstack-swift
11:48 <zigo> http://paste.openstack.org/show/783003/ <--- Big Badaboum ...
12:13 <openstackgerrit> Thomas Goirand proposed openstack/swift master: Fix on-disk encryption under Python 3  https://review.opendev.org/688113
12:33 *** rpittau|bbl is now known as rpittau
12:46 <openstackgerrit> Thomas Goirand proposed openstack/swift master: Fix on-disk encryption under Python 3  https://review.opendev.org/688113
13:16 *** BjoernT has joined #openstack-swift
13:18 *** lbragstad_ is now known as lbragstad
13:21 *** BjoernT_ has joined #openstack-swift
13:24 *** BjoernT has quit IRC
13:26 *** lbragstad has quit IRC
14:25 *** BjoernT_ is now known as BjoernT
14:40 *** diablo_rojo has joined #openstack-swift
14:43 *** rdejoux has quit IRC
14:53 *** e0ne has quit IRC
15:03 *** FlorianFa has quit IRC
15:44 *** rpittau is now known as rpittau|afk
16:04 *** mikecmpbll has quit IRC
16:23 *** e0ne has joined #openstack-swift
16:26 *** e0ne has quit IRC
16:33 <openstackgerrit> Thomas Goirand proposed openstack/swift master: Fix on-disk encryption under Python 3  https://review.opendev.org/688113
16:39 *** BjoernT_ has joined #openstack-swift
16:41 *** BjoernT has quit IRC
16:49 *** paladox has quit IRC
16:53 *** gyee has joined #openstack-swift
16:57 *** paladox has joined #openstack-swift
17:00 *** BjoernT_ has quit IRC
17:28 <timburke> zigo, on the py3 encryption bug -- which keymaster are you using? the simple one, kmip, barbican?
17:59 *** e0ne has joined #openstack-swift
18:25 *** e0ne has quit IRC
18:35 *** e0ne has joined #openstack-swift
18:53 *** e0ne has quit IRC
18:55 *** pcaruana has quit IRC
20:04 *** tomha has joined #openstack-swift
20:05 *** tomha has quit IRC
20:23 <zigo> timburke: Hi there!
20:23 <zigo> timburke: Barbican.
20:23 <zigo> timburke: What do you think of the patch?
20:24 <timburke> o/
20:24 <timburke> oh good, that explains why i hadn't repro'd with the simple or pykmip ones ;-)
20:24 <zigo> timburke: I didn't know there was another way to run encryption! :)
20:25 <zigo> timburke: How do you do it the other way, with kmip? Is it documented somewhere?
20:25 <timburke> seems pretty straightforward -- i might make it a little more localized to the barbican keymaster, and try to sort out whether i should be proposing a patch to castellan and/or barbicanclient
20:26 <timburke> ...sort of? there's another middleware: https://github.com/openstack/swift/blob/2.23.0/etc/proxy-server.conf-sample#L1138-L1155
20:27 <timburke> it works similarly to the barbican one; there's a preference for putting the config in a separate file (so you can have separate permissions), and that'll look like https://github.com/openstack/swift/blob/2.23.0/etc/keymaster.conf-sample#L78-L96
20:29 <timburke> there's some actual docs at https://docs.openstack.org/swift/latest/overview_encryption.html#encryption-root-secret-in-a-kmip-service
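
Pulling the two linked samples together, a kmip_keymaster setup looks roughly like the sketch below; the hostnames, paths and key id are placeholders, and the linked keymaster.conf-sample remains the authoritative reference for the option names:

    # proxy-server.conf: the keymaster sits in the pipeline just before encryption
    [filter:kmip_keymaster]
    use = egg:swift#kmip_keymaster
    keymaster_config_path = /etc/swift/keymaster.conf

    # /etc/swift/keymaster.conf (kept separate so it can have tighter permissions)
    [kmip_keymaster]
    host = kmip.example.com
    port = 5696
    certfile = /etc/swift/kmip_client.crt
    keyfile = /etc/swift/kmip_client.key
    ca_certs = /etc/swift/kmip_server.crt
    username = swift
    password = changeme
    key_id = 1234-5678-example
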
20:31 <timburke> anyway, i'll get back to getting barbican running locally so i can repro and test :-)
20:31 *** BjoernT has joined #openstack-swift
20:33 <timburke> i should really also try to figure out how to use barbican in our dsvm jobs...
20:37 <zigo> That'd be helpful indeed.
20:38 <zigo> I'm about to push to production a cluster which may grow quickly, and I hope to be able to upgrade swift to Train before it's really in use.
20:38 <zigo> It's currently running Stein, so Py2; the Train release of Swift in Debian is Py3 only.
20:38 <zigo> Which is why I discovered this.
20:39 <timburke> right -- i'll be real interested in knowing how it goes :-)
20:39 <zigo> timburke: Well, so far, everything works well in my virtualized PoC ...
20:39 <zigo> (i.e. 16 virtual machines consisting of 3 controllers, 3 proxies, and 10 swiftstores)
20:40 <timburke> was it an all-new deployment, or was there data from py2-swift?
20:40 <zigo> All new.
20:41 <zigo> timburke: Do you fear there would be issues when I upgrade from py2 to py3?
20:41 <zigo> I haven't tested that just yet ...
20:41 <zigo> I'll probably do that when I come back from the Debian cloud sprint in Boston next week.
20:41 <zigo> Basically, for me, it should just be apt-get dist-upgrade and re-run puppet ...
20:42 <timburke> i've done everything i can to ensure that the py2-py3 upgrade will be smooth -- but i also have to admit that i haven't tested it as thoroughly as i would have liked
20:43 <zigo> We do have a few thousand objects in that cluster already (basically, our internal tests...).
20:43 <timburke> thankfully zaitcev_ has been testing it too -- he spotted https://bugs.launchpad.net/swift/+bug/1837805 between 2.22.0 and 2.23.0, for example
20:43 <openstack> Launchpad bug 1837805 in OpenStack Object Storage (swift) "py3: account metadata handling is busted" [High,Fix released]
20:43 *** BjoernT_ has joined #openstack-swift
20:43 <zigo> We're building a dropbox-like drive using swift as the back-end. :)
20:44 <timburke> love it!
20:44 <zigo> I'm also building a 3rd cluster from scratch, because the 1st one is already too big ...
20:44 <zigo> We've got nearly 50 storage nodes, each with 12x 12TB spinning disks.
20:44 <zigo> Rebalances are becoming painful, even though we have 2x 10Gbit/s networking on each node.
20:45 <zigo> So, building a 2nd cluster... :)
20:45 <zaitcev_> My only objection is the lack of consistency. The code and comments must match. If you change the invariants, change the comments too.
20:45 <timburke> 7PB raw... not too shabby
20:45 *** BjoernT has quit IRC
20:46 <zigo> The funny bit is that the controllers (running keystone) are doing almost nothing ...
20:46 <zigo> So for the next cluster, we'll be using old re-purposed hardware. :)
20:47 <timburke> zaitcev_, zigo: i think if we do the type-coercion in kms_keymaster as we receive the secret, we wouldn't need to touch the comments
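
The coercion timburke is suggesting would presumably be a small normalization step where the secret comes back from the KMS, something like this illustrative helper (a sketch of the idea, not the actual patch):

    def coerce_secret(secret):
        """Normalize a root secret to bytes as received from the KMS backend."""
        if not isinstance(secret, bytes):
            # assumption: under py3 the barbican path can hand back text,
            # while the downstream crypto code expects raw bytes
            secret = secret.encode('utf-8')
        return secret
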
20:47 <zigo> I hope I'm not bothering you too much with my use case and that it's entertaining.
20:48 <timburke> absolutely! i love hearing about how people are using swift :-)
20:48 <zigo> :)
20:48 <timburke> how are rebalances getting painful? what are the symptoms, and what's triggering the rebalances? how much is changing at once?
20:51 *** zaitcev_ is now known as zaitcev
20:51 <zigo> timburke: The 1st cluster keeps getting full; when it reaches 15%, we add new nodes (one per swift zone), and doing that just creates a storm of network traffic.
20:52 <zigo> As our deployment has 1 region in one DC and 2 regions in another, the traffic between DCs is costly.
20:52 <zigo> We'll soon upgrade that link; once it gets to 100 Gbit/s, it will be less of a problem.
20:52 <zaitcev> But you're familiar with the method where you start with a small weight and ramp it up gradually as the replicators digest it?
20:53 <zigo> Currently, at 20 Gbit/s shared with other services from my company, I have to carefully tweak the object-replicator and rsync.
20:54 <zigo> zaitcev: Well, I do, but I also don't want to babysit the rebalance for too long.
20:54 <zigo> I usually push the weight up to 100% in like 6 or 7 weight increases.
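
For reference, the ramp zaitcev describes is usually driven with swift-ring-builder, bumping the weight of the new devices in steps and rebalancing between them; the device id and weights below are only examples:

    # new drives added with a deliberately low weight, e.g. 1000 of a target 6000
    swift-ring-builder object.builder set_weight d100 2000
    swift-ring-builder object.builder rebalance
    # push the new ring out, wait for replication to settle, then repeat
    # with 3000, 4000, ... up to the final 6000
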
20:54 <timburke> i wonder if handoffs_first might be useful... try to prioritize the intra-DC movement
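
handoffs_first is an object-replicator option; turning it on temporarily while digesting a big ring change would look roughly like this in object-server.conf (meant as a short-lived setting, not a permanent one):

    [object-replicator]
    # assumption: only enabled during the rebalance, then reverted
    handoffs_first = True
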
20:55 <timburke> is it mostly triple replica?
20:55 <zigo> Well, what I'd like is to have the LEAST possible traffic between regions.
20:55 <zigo> Yeah, 3 replicas.
20:56 <zigo> Which is why we have 3 regions.
20:56 <zigo> We want one replica in each...
20:56 <zigo> Each region is divided into 2 swift zones.
20:56 <zigo> Zones are physically in different racks.
20:57 <zigo> Oh, one more piece of advice I'd love to have ...
20:57 <zigo> How many swift-proxy workers should I set up per core?
20:57 <zigo> Is it one per core? Or more?
20:57 <zigo> One per thread, I mean...
20:58 <zigo> That's currently what we more or less do...
21:00 <timburke> https://github.com/openstack/swift/blob/2.23.0/etc/proxy-server.conf-sample#L26-L30 makes it seem like 1 per core would be about right -- but i must admit, i haven't really played with that kind of tuning much. rledisez or clayg might have some insights
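
The setting being discussed is workers in the [DEFAULT] section of proxy-server.conf; the snippet below is illustrative:

    [DEFAULT]
    # 'auto' forks one worker per CPU core (the behaviour rledisez describes
    # below); an explicit count can be given instead, e.g. workers = 16
    workers = auto
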
21:00 <zaitcev> I'd be amazed if a geo-replicated cluster was bottlenecked on the proxies.
21:02 <zaitcev> Especially since you don't have quorum in any 1 DC
21:04 <zigo> zaitcev: My thinking is just about getting the best performance out of the proxies, as we do have HUGE traffic ...
21:04 <zigo> It's a backup solution, so most clients are doing thousands of HEAD requests to check if objects are saved.
21:05 <zigo> It doesn't look like we're having any performance issues though! :)
21:05 <rledisez> we actually leave it on auto, as it matches the number of cores; it seems a good fit. i'm currently more interested in the way traffic is distributed across the workers
21:07 <rledisez> at some point we were also pinning the workers to the cores. we saw a small gain, but nothing notable; I think it disappeared during one of our upgrades and nobody bothered to bring it back
21:13 <zigo> One more thing: is it ok to upgrade a cluster directly from Rocky to Train?
21:15 <rledisez> zigo: yes, we just did it a few weeks ago (from 2.18 to 2.22). nothing special to note. the recommendation is to upgrade the object-servers first, then account/container, and finally the proxies
21:16 <rledisez> always the proxies last, so that if a new feature is presented through the API, all the account/container/object servers are up-to-date and able to implement it
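
On a Debian deployment like zigo's, that ordering boils down to something like the sketch below, run tier by tier; the package-manager step and swift-init service names are given as examples:

    # 1) object-server nodes first
    apt-get dist-upgrade && swift-init object reload
    # 2) then the account/container nodes
    apt-get dist-upgrade && swift-init account reload && swift-init container reload
    # 3) proxies last, once all the storage tiers are upgraded
    apt-get dist-upgrade && swift-init proxy reload
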
21:17 <zigo> Thanks for the tip.
21:18 *** tesseract has quit IRC
21:42 *** BjoernT_ has quit IRC
21:43 *** BjoernT has joined #openstack-swift
22:03 *** diablo_rojo has quit IRC
22:13 *** MooingLe1ur is now known as MooingLemur
22:29 *** diablo_rojo has joined #openstack-swift
22:42 *** rcernin has joined #openstack-swift
23:01 *** BjoernT has quit IRC
23:01 *** BjoernT has joined #openstack-swift
23:15 *** BjoernT has quit IRC
23:16 *** BjoernT has joined #openstack-swift
23:17 *** BjoernT has quit IRC
23:18 *** BjoernT has joined #openstack-swift
23:19 *** BjoernT has quit IRC
23:21 *** BjoernT has joined #openstack-swift
23:36 *** BjoernT has quit IRC
