Tuesday, 2018-08-28

*** gyee has quit IRC00:02
*** tetsuro has joined #openstack-nova00:08
*** slaweq has joined #openstack-nova00:11
*** brinzhang has joined #openstack-nova00:13
*** slaweq has quit IRC00:16
*** macza has quit IRC00:17
*** macza has joined #openstack-nova00:17
*** macza has quit IRC00:20
*** itlinux has joined #openstack-nova00:59
*** gbarros has joined #openstack-nova01:11
*** imacdonn has quit IRC01:18
*** imacdonn has joined #openstack-nova01:18
*** Dinesh_Bhor has joined #openstack-nova01:23
openstackgerritJiaJunsu proposed openstack/nova master: Remove args(os=False) in monkey_patch  https://review.openstack.org/56899901:32
*** hongbin has joined #openstack-nova01:39
openstackgerritYikun Jiang (Kero) proposed openstack/nova master: Make monkey patch work in uWSGI mode  https://review.openstack.org/59228501:51
*** prometheanfire has joined #openstack-nova01:56
prometheanfireI should be able to `curl http://169.254.169.254/latest/meta-data/instance-id` even on config-drive instances right?01:56
*** slaweq has joined #openstack-nova02:11
*** slaweq has quit IRC02:16
*** moshele has joined #openstack-nova02:17
*** lei-zh has joined #openstack-nova02:24
*** moshele has quit IRC02:32
*** lei-zh has quit IRC02:33
*** lei-zh has joined #openstack-nova02:33
donghmHi folks, how can I check this number https://github.com/openstack/nova/blob/master/nova/cmd/status.py#L290 via db query or api?02:38
donghmI'm using nova in master02:38
donghmwhen I run command: nova-status upgrade check02:38
donghmit returns: There are no compute resource providers in the Placement service but there are 1 compute nodes in the deployment.02:38
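The comparison behind donghm's warning can be sketched in a few lines. This is a hypothetical stand-in for the logic around status.py#L290, with illustrative names, not nova's actual code: count the compute resource providers Placement knows about and compare against the non-deleted compute_nodes rows.

```python
# Hypothetical sketch of the nova-status check: compare the number of
# compute resource providers in Placement against the number of
# non-deleted compute_nodes rows. Names are illustrative only.
def check_resource_providers(num_rps, num_compute_nodes):
    """Return a warning string, or None if the check passes."""
    if num_compute_nodes > 0 and num_rps == 0:
        return ('There are no compute resource providers in the '
                'Placement service but there are %d compute nodes in '
                'the deployment.' % num_compute_nodes)
    return None

# donghm's situation: 1 compute node, 0 providers -> the warning fires.
print(check_resource_providers(0, 1))
```

To answer the actual question: the provider count can be checked against the Placement REST API (`GET /resource_providers`), and the compute node count comes from the cell database (roughly `SELECT COUNT(*) FROM compute_nodes WHERE deleted = 0`).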
*** dpawlik has joined #openstack-nova02:48
*** dpawlik has quit IRC02:52
*** psachin has joined #openstack-nova02:55
*** sapd1 has quit IRC03:16
*** gbarros has quit IRC03:34
*** lei-zh has quit IRC03:39
*** lei-zh has joined #openstack-nova03:39
*** hongbin has quit IRC03:43
*** Dinesh_Bhor has quit IRC03:46
openstackgerritTakashi NATSUME proposed openstack/nova master: Transform libvirt.error notification  https://review.openstack.org/48485103:58
openstackgerritTakashi NATSUME proposed openstack/nova master: Adds view builders for keypairs controller  https://review.openstack.org/34728903:58
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (3)  https://review.openstack.org/57410403:58
*** takashin has joined #openstack-nova03:59
*** lei-zh has quit IRC03:59
*** lei-zh1 has joined #openstack-nova03:59
*** tbachman has quit IRC04:10
*** tbachman has joined #openstack-nova04:10
*** lei-zh1 has quit IRC04:11
*** lei-zh1 has joined #openstack-nova04:11
*** slaweq has joined #openstack-nova04:11
*** ratailor has joined #openstack-nova04:14
*** slaweq has quit IRC04:15
*** lei-zh1 has quit IRC04:16
*** Bhujay has joined #openstack-nova04:18
*** Bhujay has quit IRC04:19
*** tbachman has quit IRC04:20
*** Dinesh_Bhor has joined #openstack-nova04:36
*** crazik has joined #openstack-nova04:40
crazikhello04:41
crazikIs it safe to cleanup cell mapping table?04:41
crazikI have almost 10k entries there, looks like it was never purged when instances were deleted...04:42
crazik(I want to delete entries where the instance_id is no longer in the nova instances table)04:43
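What crazik describes sounds like the API database's instance_mappings table (keyed by instance UUID). The cleanup boils down to finding mapping rows whose instance no longer exists in any cell. A purely illustrative sketch of that filtering, using a hypothetical helper rather than any nova tooling:

```python
# Illustrative sketch only (hypothetical helper, not a nova command):
# a mapping row is stale when its instance UUID no longer appears in
# any cell's instances table.
def stale_mappings(mapping_uuids, live_instance_uuids):
    """Return mapping UUIDs with no surviving instance."""
    live = set(live_instance_uuids)
    return [u for u in mapping_uuids if u not in live]

mappings = ['uuid-a', 'uuid-b', 'uuid-c']
instances = ['uuid-b']  # only uuid-b still exists in a cell
print(stale_mappings(mappings, instances))  # ['uuid-a', 'uuid-c']
```

Doing this with a raw DELETE against the API database is risky: take a backup first, and remember the "still exists" check has to run across every cell database, not just one.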
*** ratailor_ has joined #openstack-nova04:47
*** ratailor has quit IRC04:50
openstackgerritMerged openstack/nova master: Make instance_list perform per-cell batching  https://review.openstack.org/59313104:55
*** udesale has joined #openstack-nova04:56
*** udesale has quit IRC04:57
*** udesale has joined #openstack-nova04:58
*** udesale has quit IRC04:59
*** macza has joined #openstack-nova05:00
*** ratailor__ has joined #openstack-nova05:08
*** ratailor_ has quit IRC05:11
*** slaweq has joined #openstack-nova05:11
*** slaweq has quit IRC05:16
*** udesale has joined #openstack-nova05:22
*** tetsuro has quit IRC05:24
*** janki has joined #openstack-nova05:26
*** slaweq has joined #openstack-nova05:28
*** bnemec has quit IRC05:30
*** udesale has quit IRC05:31
*** bnemec has joined #openstack-nova05:31
*** slaweq has quit IRC05:33
*** ratailor has joined #openstack-nova05:36
*** ratailor__ has quit IRC05:37
*** ratailor_ has joined #openstack-nova05:38
*** ratailor__ has joined #openstack-nova05:40
*** ratailor has quit IRC05:40
*** ratailor_ has quit IRC05:43
*** lei-zh1 has joined #openstack-nova05:43
*** hongda has joined #openstack-nova05:43
*** icey has quit IRC06:05
*** icey has joined #openstack-nova06:06
*** udesale has joined #openstack-nova06:06
*** icey has quit IRC06:08
*** ratailor__ has quit IRC06:08
*** tetsuro has joined #openstack-nova06:08
*** tetsuro has quit IRC06:09
*** slaweq has joined #openstack-nova06:11
*** macza has quit IRC06:12
*** ratailor has joined #openstack-nova06:12
*** moshele has joined #openstack-nova06:14
*** slaweq has quit IRC06:16
*** links has joined #openstack-nova06:16
*** icey has joined #openstack-nova06:21
*** icey has quit IRC06:21
*** rha has joined #openstack-nova06:32
*** icey has joined #openstack-nova06:33
*** jchhatbar has joined #openstack-nova06:35
*** janki has quit IRC06:38
*** Dinesh_Bhor has quit IRC06:39
*** pcaruana has joined #openstack-nova06:39
*** udesale has quit IRC06:48
*** adrianc has joined #openstack-nova06:54
*** udesale has joined #openstack-nova06:55
*** tetsuro has joined #openstack-nova06:57
*** alexchadin has joined #openstack-nova07:00
*** Dinesh_Bhor has joined #openstack-nova07:01
*** sahid has joined #openstack-nova07:02
*** rcernin has quit IRC07:02
*** sahid has quit IRC07:02
*** slaweq has joined #openstack-nova07:02
*** tetsuro has quit IRC07:07
*** tssurya has joined #openstack-nova07:08
*** dims has quit IRC07:08
*** dims has joined #openstack-nova07:10
*** vivsoni has quit IRC07:11
*** vivsoni has joined #openstack-nova07:16
*** sahid has joined #openstack-nova07:18
*** sapd1 has joined #openstack-nova07:19
*** sahid has quit IRC07:21
*** sahid has joined #openstack-nova07:21
hongdadansmith: Excuse me. I am working on this patch: "https://review.openstack.org/#/c/579093/" Do you have any questions about this patch now?07:26
*** dpawlik has joined #openstack-nova07:29
*** sapd1 has quit IRC07:32
*** sapd1 has joined #openstack-nova07:33
*** hongda has quit IRC07:33
*** andymccr has joined #openstack-nova07:35
moshelesahid: hi07:37
moshelesahid: I didn't understand your comment, do you agree with my change? what are the alternatives here?07:37
sahidhello moshele,07:38
moshelesahid: I am talking about this commit https://review.openstack.org/#/c/595592/07:38
sahidi don't really agree, so i wanted to know your thinking07:39
sahidbecause i think if we keep your change like that we will have the same issue at some point for another vif07:39
*** cdent has joined #openstack-nova07:39
sahidOR at least what about adding a comment to indicate the limitation of your fix?07:40
moshelesahid: ok, but I am not sure how to do it otherwise because the network_model.VIF_MODEL_VIRTIO is set anyway by this code https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L137-L14207:41
moshelesahid: and I didn't find any other way to solve this07:41
moshelesahid: I can add that in the future non-virtio nics should be skipped as well07:42
moshelesahid: I mean adding a comment ^07:43
sahidyes... but i have a last question07:44
sahidit seems to me model will always be VIF_MODEL_VIRTIO because CONF.libvirt.use_virtio_for_bridges is defaulted to True07:45
sahidso your comment is not really clear, i mean i don't understand why my initial suggestion is not working07:46
moshelebecause the default for CONF.libvirt.use_virtio_for_bridges is True, so if you use a direct port it will fail07:47
sahidoh i see :) that is the problem actually :) even for VIF DIRECT the model is equal to VIRTIO07:47
moshelesahid: and because I don't understand this option "CONF.libvirt.use_virtio_for_bridges" I didn't want to change the logic there07:47
sahidyes i can understand you want to limit the change07:48
sahidi'm going to comment on the review07:48
sahidand see if everyone agrees07:48
moshelesahid: we updated the macvtap ci to run with the rx/tx queues option07:48
*** jpena|off is now known as jpena07:49
moshelesahid: and the macvtap ci is passing but it only configures the rx_queue_size07:49
moshelesahid: http://13.74.249.42/92/595592/2/check-nova/Nova-MACVTAP-ML2-Sriov/ea9063b/logs/n-cpu.service.log.gz look for  rx_queue_size  and not the tx_queue_size07:50
moshelesahid: it seems that the tx_queue_size is only configured with vhostuser, is that correct behaviour?07:50
sahidyes07:52
sahidfor vhostuser you should also be able to configure RX07:52
moshelesahid: ok cool07:53
moshelesahid: so my fix will make rx_queue_size work on macvtap07:54
moshelesahid: I will update my commit soon. thanks for the help :)07:55
sahidmoshele: thanks for your work on it :)07:56
openstackgerritTushar Patil proposed openstack/nova-specs master: Bi-directional enforcement of traits  https://review.openstack.org/59347508:05
*** macza has joined #openstack-nova08:11
*** macza has quit IRC08:16
*** tetsuro has joined #openstack-nova08:19
*** vivsoni has quit IRC08:21
openstackgerritAlex Xu proposed openstack/nova-specs master: Resource retrieving: add change-before filter  https://review.openstack.org/59197608:21
*** vivsoni has joined #openstack-nova08:22
*** claudiub has joined #openstack-nova08:22
*** ttsiouts has joined #openstack-nova08:29
*** takashin has left #openstack-nova08:32
*** jchhatbar is now known as janki08:37
*** Dinesh_Bhor has quit IRC08:38
*** adrianc has quit IRC08:50
moshelesahid: another way is that I will change the code to skip setting the model to virtio in https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L137-L142 for direct passthrough ports, and then I can do your check of the virtio model. I think this will be cleaner08:56
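The guard moshele's patch is adding can be illustrated with a small sketch. This is hedged stand-in code, not nova's vif.py: VIF_MODEL_VIRTIO and the dict-shaped config below are simplified placeholders for nova.network.model and the libvirt designer config objects.

```python
# Hedged sketch of the fix under review: rx/tx queue sizes are a
# virtio-only feature, so skip them entirely for non-virtio models
# (e.g. direct/passthrough ports). Simplified stand-in for nova code.
VIF_MODEL_VIRTIO = 'virtio'

def set_queue_sizes(conf, model, rx_queue_size=None, tx_queue_size=None):
    """Apply queue sizes to a (dict-shaped) interface config."""
    if model != VIF_MODEL_VIRTIO:
        return conf  # non-virtio nic: leave the config untouched
    if rx_queue_size:
        conf['rx_queue_size'] = rx_queue_size
    if tx_queue_size:
        conf['tx_queue_size'] = tx_queue_size
    return conf

print(set_queue_sizes({}, 'virtio', rx_queue_size=1024))
print(set_queue_sizes({}, 'e1000', rx_queue_size=1024))  # unchanged
```

The design point discussed above is exactly this early return: without it, use_virtio_for_bridges defaulting to True makes even direct ports look like virtio, and the queue-size elements then break them.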
*** hoangcx has joined #openstack-nova08:56
*** Dinesh_Bhor has joined #openstack-nova08:59
*** rcernin has joined #openstack-nova08:59
*** priteau has joined #openstack-nova09:01
*** sambetts|afk has quit IRC09:03
*** ttsiouts has quit IRC09:05
*** sambetts_ has joined #openstack-nova09:06
*** holser_ has joined #openstack-nova09:07
*** ratailor_ has joined #openstack-nova09:08
*** ratailor has quit IRC09:10
*** takamatsu has joined #openstack-nova09:15
*** hoonetorg has quit IRC09:20
openstackgerritStephen Finucane proposed openstack/nova master: conf: Use new-style choice values  https://review.openstack.org/53092409:21
openstackgerritMoshe Levi proposed openstack/nova master: libvirt: skip setting rx/tx queue sizes for not virtio interfaces  https://review.openstack.org/59559209:24
*** ttsiouts has joined #openstack-nova09:30
*** ratailor__ has joined #openstack-nova09:31
*** alexchadin has quit IRC09:31
*** lei-zh1 has quit IRC09:33
*** ratailor_ has quit IRC09:34
*** dtantsur|afk is now known as dtantsur09:44
*** erlon has joined #openstack-nova09:44
*** cdent has quit IRC09:45
*** rcernin has quit IRC09:47
*** ratailor__ has quit IRC09:50
sahidmoshele: yes, sounds good. can you comment so other contributors involved in the patch can understand why you decided to do that :)09:55
*** ttsiouts has quit IRC10:01
*** dpawlik has quit IRC10:01
*** moshele has quit IRC10:02
*** tetsuro has quit IRC10:02
*** vivsoni has quit IRC10:03
*** alexchadin has joined #openstack-nova10:04
*** vivsoni has joined #openstack-nova10:04
*** dpawlik has joined #openstack-nova10:04
*** cdent has joined #openstack-nova10:20
*** nicolasbock has joined #openstack-nova10:20
*** vivsoni has quit IRC10:20
*** Dinesh_Bhor has quit IRC10:23
*** Dinesh_Bhor has joined #openstack-nova10:27
*** ratailor has joined #openstack-nova10:28
*** ccamacho|brb has quit IRC10:29
*** vivsoni has joined #openstack-nova10:30
*** stephenfin has quit IRC10:35
*** stephenfin has joined #openstack-nova10:36
TahvokHey guys! I'm trying to understand why nova conductor is receiving a lot of messages.. I mean I get from 30 to 120 messages a second. The environment is not small, we have around 160 compute hosts, but it's not very active. We have a new instance coming up/deleted every hour or so10:46
TahvokApart from it, we have our rabbit service working with 100% up to 400% cpu all the time, and along with it, we see the nova-conductor service processes taking 10% cpu each10:47
*** ratailor has quit IRC10:50
*** ccamacho has joined #openstack-nova10:50
*** sapd1 has quit IRC10:51
*** ttsiouts has joined #openstack-nova10:52
sean-k-mooneyTahvok: i could be wrong but i think all database access from the compute nodes is relayed via the conductor, so the periodic jobs that update compute node resources and health will be a portion of those messages10:53
Tahvoksean-k-mooney: thanks.. I10:53
TahvokI've just tried something else: default_log_levels = oslo,messaging=DEBUG10:54
TahvokAnd I see lots of messages like this: 2018-08-28 05:54:26.497 23601 DEBUG oslo.messaging._drivers.impl_rabbit [-] Timed out waiting for RPC response: Timeout while waiting on RPC response - topic: "<unknown>", RPC method: "<unknown>" info: "<unknown>" _raise_timeout /openstack/venvs/nova-15.1.25/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py:105210:54
TahvokWhat are these unknown calls? Also, rabbit seems to be working fine, as we don't have any timeout issues when creating new instances10:55
sean-k-mooneyhmm, that's not very descriptive. we probably should have better logging10:55
TahvokBtw, I'm on Ocata if that matters10:55
*** alexchadin has quit IRC10:56
*** alexchadin has joined #openstack-nova10:56
*** alexchadin has quit IRC10:56
sean-k-mooneystephenfin: any RPC people around that you can think of? i would ping dansmith but he should be sleeping for another few hours10:57
*** alexchadin has joined #openstack-nova10:57
TahvokWhat I'm trying to fix is basically this periodic (every second or 2) spikes: http://paste.openstack.org/show/728925/10:57
*** alexchadin has quit IRC10:57
*** alexchadin has joined #openstack-nova10:58
TahvokEach infra host has 32 CPU threads, I think it should be enough to handle 160 compute hosts10:58
*** alexchadin has quit IRC10:58
sean-k-mooneyTahvok: so every few seconds the conductor is taking 100% cpu across all cores?10:58
*** alexchadin has joined #openstack-nova10:58
TahvokI might be wrong though, and we might need to increase our resources; that's why I'm consulting with you10:59
*** alexchadin has quit IRC10:59
sean-k-mooneyTahvok: do you have 1 controller or several10:59
sean-k-mooneyTahvok: also no you should be fine10:59
*** adrianc has joined #openstack-nova11:00
Tahvoksean-k-mooney: not all cores, as there are only 18 nova-conductor processes running, at least according to this: http://paste.openstack.org/show/728926/11:00
Tahvoksean-k-mooney: 2 controllers, and we plan to add another one by the end of this week11:00
sean-k-mooney1 controller should easily be able to handle 160 nodes that are more or less idle in terms of vm lifecycle events11:00
sean-k-mooneyTahvok: my guess is the spikes are caused by the periodic jobs. if all the clocks are synced, all 160 nodes will submit their updates around the same time11:02
sean-k-mooneywe probably should be introducing some spread in when they run. that said, without any logs/errors that is just a guess11:03
TahvokWe have other network issues that network team is handling right now (ksoftirqd is taking lots of cpu (80%+)), but we are trying to fix the rabbit issue. That's what we see on the active router controller: http://paste.openstack.org/show/728928/11:03
Tahvoksean-k-mooney: don't they submit the sync every minute? It doesn't explain ~100 messages every second11:04
TahvokAnd we don't have ceilometer, so rabbit handles only the basic openstack services11:05
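Tahvok's back-of-envelope estimate checks out. As a quick worked calculation (the per-minute period is an assumption about how the periodics are configured, and computes run a handful of periodic tasks, not just one):

```python
# Back-of-envelope check of Tahvok's estimate: one periodic report
# per compute per minute yields only a few messages per second,
# well below the ~100/s observed. Values assumed, not measured.
computes = 160
period_s = 60.0
rate = computes / period_s
print(round(rate, 1))  # messages/second from one periodic task
```

Even multiplying by a few periodic tasks per compute (resource updates, service heartbeats), that stays an order of magnitude below 100/s, which is why the extra traffic needs explaining.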
*** sahid has quit IRC11:09
sean-k-mooneythey do. but that does not seem that high to me for a cloud of your size.11:09
*** Dinesh_Bhor has quit IRC11:09
Tahvoksean-k-mooney: so 100+ messages a second is normal for this size?11:13
sean-k-mooneyTahvok: i unfortunately don't have that data to hand.11:18
*** Dinesh_Bhor has joined #openstack-nova11:21
*** erlon has quit IRC11:22
*** Dinesh_Bhor has quit IRC11:24
sean-k-mooneyTahvok: we did some scale testing back 2 years ago https://review.openstack.org/#/c/352101/ that i'm quickly checking.11:24
*** hongda has joined #openstack-nova11:25
sean-k-mooneyTahvok: that was done with a cloud with 3 controllers and about 230 compute nodes. if i remember correctly the total cpu usage on the controllers was in the low tens of percent for the majority of the testing11:25
*** jpena is now known as jpena|lunch11:28
*** slagle has joined #openstack-nova11:30
*** vivsoni has quit IRC11:33
*** alexchadin has joined #openstack-nova11:35
*** alexchadin has quit IRC11:41
*** sahid has joined #openstack-nova11:44
*** szaher has joined #openstack-nova11:46
*** giblet_off is now known as gibi11:49
*** vivsoni has joined #openstack-nova11:50
*** alexchadin has joined #openstack-nova11:51
stephenfinsean-k-mooney: bauzas would be my other suggestion but he's still on vacation12:02
*** donghm has quit IRC12:03
sean-k-mooneystephenfin: ya he came to mind. Tahvok i would suggest asking again in an hour or so. the US-based cores that work on the conductor will be online then and can perhaps give a better answer12:07
Tahvoksean-k-mooney: ok, thanks!12:08
*** maciejjozefczyk has quit IRC12:18
*** sambetts_ is now known as sambetts12:26
*** jpena|lunch is now known as jpena12:28
*** mriedem has joined #openstack-nova12:35
*** diliprenkila has joined #openstack-nova12:42
openstackgerritKonstantinos Samaras-Tsakiris proposed openstack/os-traits master: Add CUDA versions 8 and 9  https://review.openstack.org/59711112:42
*** cdent has quit IRC12:46
openstackgerritMerged openstack/nova master: Deprecate Core/Ram/DiskFilter  https://review.openstack.org/59650212:47
*** kosamara has quit IRC12:50
*** adrianc has quit IRC12:52
*** _hemna has quit IRC12:57
*** _pewp_ has joined #openstack-nova12:58
*** eharney has quit IRC12:58
*** _hemna has joined #openstack-nova12:58
*** gbarros has joined #openstack-nova13:03
*** adrianc has joined #openstack-nova13:04
*** brinzhang has quit IRC13:04
*** kosamara has joined #openstack-nova13:05
*** tbachman has joined #openstack-nova13:07
gibimriedem: hi! Do you have topics for the today's notification subteam meeting?13:12
mriedemnope13:13
*** jchhatbar has joined #openstack-nova13:14
gibimriedem: cool, then I will cancel13:15
mriedemwfm13:15
*** janki has quit IRC13:17
gibimriedem: we got a report that it's not just flavor.disabled that is missing in some old embedded flavors but flavor.is_public as well. https://bugs.launchpad.net/nova/+bug/1739325 I will look at it a bit later but I guess the solution will be similar to the one for flavor.disabled13:17
openstackLaunchpad bug 1739325 in OpenStack Compute (nova) ocata "Server operations fail to complete with versioned notifications if payload contains unset non-nullable fields" [Medium,In progress] - Assigned to Matt Riedemann (mriedem)13:17
*** jchhatbar has quit IRC13:19
mriedemgibi: i suppose,13:19
mriedemdifference between disabled and is_public is disabled isn't in any API but is_public is13:19
gibiso I have to check that what happens on the API if such old flavor is present and fix possible failures there as well13:20
*** janki has joined #openstack-nova13:20
mriedemwell,13:20
mriedemthis isn't a failure on flavor resources, or shouldn't be, it's embedded flavors in the instance which were originally migrated from the instance system_metadata13:21
*** Phinitris has joined #openstack-nova13:21
*** Phinitris has quit IRC13:21
mriedemthis is the problem http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/flavors.py#n5213:21
mriedemis_public isn't in there, so it wasn't stored in the embedded instance.flavor13:22
mriedemso we'll have to default to is_public=True13:22
mriedemif it's not in the embedded flavor13:22
mriedemi left a comment on the bug13:24
gibimriedem: OK, I have to look at the API code showing the embedded flavor to see how is_public is handled there13:24
mriedemi'll report a new bug for is_public since this other one is already closed13:24
mriedemwe don't show that field from the instance.flavor13:25
mriedemhttp://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/views/servers.py#n34913:25
gibimriedem: then the API is not broken. cool. You are way faster to find these things in the code than me.13:26
gibimriedem: I agree that it needs a separate bug13:27
gibimriedem: thanks for reporting it13:27
mriedemhttps://bugs.launchpad.net/nova/+bug/178942313:27
openstackLaunchpad bug 1789423 in OpenStack Compute (nova) "Server operations fail to complete with versioned notifications if payload contains unset is_public field" [Undecided,New]13:27
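The shape of the fix mriedem describes can be sketched briefly. This is a simplified stand-in, not nova's actual object code (the real handling lives in the versioned-object loading paths): old embedded flavors were built from a system_metadata key whitelist that never included is_public, so the loader defaults it instead of failing when building notification payloads.

```python
# Simplified stand-in for the fix discussed for bug 1789423 (and the
# earlier bug 1739325 for disabled): give never-stored fields sane
# defaults when loading an old embedded instance flavor.
def load_embedded_flavor(primitive):
    flavor = dict(primitive)
    # is_public was never in the system_metadata whitelist; flavors
    # are public by default, so default to True (per mriedem above).
    flavor.setdefault('is_public', True)
    # disabled had the same problem; it defaults to False.
    flavor.setdefault('disabled', False)
    return flavor

old = {'flavorid': '42', 'name': 'm1.small', 'vcpus': 1}
print(load_embedded_flavor(old)['is_public'])  # True
```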
mriedemlyarwood: speaking of which, can you hit this? https://review.openstack.org/#/c/580525/13:28
*** cdent has joined #openstack-nova13:29
*** efried is now known as efried_doc13:29
lyarwoodmriedem: yup looking13:32
*** stephenfin has quit IRC13:33
*** rtjure has quit IRC13:34
gibijaypipes: thanks for the mail about the consumer gen handling in nova. I think efried_doc's comments and yours together will help me redo the patch series (now I have to find the time to do it).13:34
*** stephenfin has joined #openstack-nova13:34
*** purplerbot has joined #openstack-nova13:34
*** tbachman has quit IRC13:37
*** rtjure has joined #openstack-nova13:37
*** tbachman has joined #openstack-nova13:37
*** psachin has quit IRC13:39
jaypipesgibi: np13:42
*** liuyulong has joined #openstack-nova13:42
*** sapd1 has joined #openstack-nova13:47
*** psachin has joined #openstack-nova13:49
*** eharney has joined #openstack-nova13:50
*** mlavalle has joined #openstack-nova13:52
*** awaugama has joined #openstack-nova13:57
*** psachin has quit IRC13:58
* gibi reads the cross cell migration mail and wonders what happened to the hatred of the shelve_offload operation14:00
*** hongbin has joined #openstack-nova14:00
*** tbachman has quit IRC14:05
kosamaraHi efried, are you around?14:07
openstackgerritJay Pipes proposed openstack/os-traits master: Add CUDA versions 8 and 9  https://review.openstack.org/59711114:09
jaypipesstephenfin: can you +2/W https://review.openstack.org/#/c/597111/ please?14:10
stephenfinjaypipes: Sure, done14:11
jaypipesstephenfin: danke14:11
jaypipeskosamara: ^^14:11
jaypipeskosamara: thx mate14:11
*** itlinux has quit IRC14:12
*** kosamara has quit IRC14:15
*** ttsiouts has quit IRC14:16
*** hamzy has quit IRC14:19
*** kosamara has joined #openstack-nova14:20
kosamarajaypipes: thanks!14:20
*** hamzy has joined #openstack-nova14:20
*** dpawlik has quit IRC14:21
kosamaraefried: I'd like to take up your offer :) From the 4 major things to address on the spec, could you focus on 2,3,4 so that I can focus on 1 for libvirt?14:23
kosamaraefried: I'll also see if I can carry over content from your OOT spec for powervm there. And I think alex_xu's questions are part of the "cyborg intersection".14:24
*** links has quit IRC14:27
*** prometheanfire has left #openstack-nova14:29
*** pcaruana has quit IRC14:29
*** erlon has joined #openstack-nova14:29
*** pcaruana has joined #openstack-nova14:30
*** gbarros has quit IRC14:33
*** Bhujay has joined #openstack-nova14:33
*** Bhujay has quit IRC14:34
*** Bhujay has joined #openstack-nova14:35
*** ttsiouts has joined #openstack-nova14:36
*** eharney_ has joined #openstack-nova14:37
*** eharney has quit IRC14:37
*** r-daneel has joined #openstack-nova14:39
*** efried_doc is now known as efried14:41
*** tbachman has joined #openstack-nova14:42
mriedemtommylikehu: you had a question about volumes?14:43
tommylikehuyeah, I have a question regarding delete operations. When deleting an instance and its related volumes, there could be a period of time during which those volumes' status is 'available', right? I mean right after the volume is detached14:44
mriedemtommylikehu: i believe so yes14:44
mriedemb/c you can't delete an in-use volume14:44
mriedemi think you can force delete an in-use volume though....14:45
*** markvoelker has joined #openstack-nova14:45
mriedembut nova doesn't force delete and it's an admin-only API, and historically nova just relies on the user context token to do the detach/delete of the volume, we don't use configured admin credentials for that14:46
*** tbachman has quit IRC14:46
tommylikehuwe got a bug report from our customers saying that that period could be dangerous since other operations are not prohibited14:46
TahvokHey guys! Asked here before, but sean-k-mooney told me to wait 'till nova conductor cores show up online. We see some spikes of cpu usage of nova-conductor processes (every second or two): http://paste.openstack.org/show/728925/.14:47
openstackgerritMerged openstack/os-traits master: Add CUDA versions 8 and 9  https://review.openstack.org/59711114:47
*** alexchadin has quit IRC14:48
efriedkosamara: Hi, I'm here now. Ack, I'll write some words for 2,3,4.14:48
efriedkosamara: Would you like me to upload new patch sets to the spec, or dump the content somewhere for you to pull in?14:48
*** markvoelker has quit IRC14:49
tommylikehumriedem: :)14:49
TahvokWe have a big environment, with 160 compute hosts, and 2 controller (adding a third by the end of this week). I've enabled debugging on nova-conductor service, and saw that it's receiving around 30~120 messages per second. So I'm trying to investigate what could be throwing so much messages each second.. Our cloud is not very active, we have around 1 instance coming up/deleted every hour or so14:49
*** alexchadin has joined #openstack-nova14:49
kosamaraefried: either works for me, but I think many patch sets can tangle up the discussion. Of course, this is my first spec, so you know.14:50
efriedkosamara: Meh, patch sets are no big deal; often they can help provide history/context of the evolution of a thing. As long as comments aren't lost along the way, it's fine.14:51
mriedemtommylikehu: "since other operations are not prohibitted" ?14:51
mriedemtommylikehu: meaning, the customer thinks something could reserve the volume in that split second while we're deleting the instance and then fail to delete the volume?14:51
efriedkosamara: Just need to avoid stepping on each other. So we should just both check in with each other before posting a new patch set to make sure the other doesn't have local edits pending.14:51
mriedemand thus leave it orphaned14:51
tommylikehumriedem:  yeah14:52
mriedemtommylikehu: if something else reserves the volume in the interim, then clearly it wanted it yeah?14:52
kosamaraefried: I've also got pending changes on the "done" things. I'll post them tomorrow.14:52
efriedkosamara: Okay. How about I just compose content and post it to a pastebin for you to pull in?14:53
kosamaracool14:53
*** tbachman has joined #openstack-nova14:53
mriedemtommylikehu: unless you're aware of some other way to atomically delete a volume, this is just a known issue. the only thing i can see nova doing is using force_delete with cinder admin creds (if nova is configured for those) to delete the volume while it's attached to the server14:53
tommylikehumriedem:  can we do something to protect this process?14:54
tommylikehumriedem:  something like what we do when attaching volume14:54
mriedemtommylikehu: is this an actual issue someone ran into or they are just doing some kind of audit?14:54
mriedemand what client software is waiting a millisecond to attach a volume while we're deleting it from another server?14:55
mriedemthis seems extremely low priority14:55
tommylikehumriedem:  personally I think it's the second case14:55
mriedemok; never tell them about ports then :)14:55
mriedembecause you can attach/detach those to servers out of band all you like14:56
*** nicolasbock has quit IRC14:56
mriedemPUT /v2.0/ports/{port_id} with a new device_id - now it's my port yay!14:56
mriedemtommylikehu: so like i said above, nova could use the force_delete API if we're configured with cinder admin creds, but i'd consider it very low priority14:57
tommylikehumriedem:  oooook, thanks:)14:58
*** Bhujay has quit IRC14:58
*** nicolasbock has joined #openstack-nova14:59
*** pcaruana has quit IRC15:00
mriedemtommylikehu: i guess start by reporting a bug to nova15:00
*** tbachman has quit IRC15:00
mriedemso it's on the books15:00
*** hongbin has quit IRC15:00
*** Altabay has joined #openstack-nova15:00
*** gbarros has joined #openstack-nova15:01
dansmithmriedem: tommylikehu the concern is that a volume nova is going to delete becomes available for a second before being deleted?15:01
tommylikehudansmith:  yes15:02
mriedemtrump could get his grubby hands all of ma volumes15:02
mriedem*all over15:02
dansmithand what, something could attach those and block the delete?15:02
mriedemyeah i guess15:03
dansmithbut the only entity that could do that already owns the thing15:03
mriedemnote: that doesn't block the server delete,15:03
dansmithso, who cares?15:03
mriedemand we already fail to delete the volume if it has snapshots15:03
mriedemso yeah, this is like the lowest of priorities15:03
dansmithor you could say it's working as designed and thus not a bug, which would be my preference15:04
tommylikehulol15:04
*** ttsiouts has quit IRC15:04
mriedemthat works for me15:04
mriedemas i said above, "(9:52:21 AM) mriedem: tommylikehu: if something else reserves the volume in the interim, then clearly it wanted it yeah?"15:04
dansmiththis would be like complaining that something that is able to hardlink the image file on disk before nova deletes it can still read the data15:05
dansmithand the response to that is "yup. that's how that works"15:05
dansmithI mean, just MHO of course15:05
*** jroll has quit IRC15:05
*** jroll has joined #openstack-nova15:06
mriedemwe have insanely bigger fish to fry so yes15:06
mriedemlike the port thing i already mentioned15:06
*** itlinux has joined #openstack-nova15:07
dansmithhow much bigger is insanely bigger?15:07
dansmithlike bigger^2?15:07
*** alex_xu has quit IRC15:10
mriedemhyperbole sized bigger15:10
*** dpawlik has joined #openstack-nova15:12
*** dpawlik has quit IRC15:17
dansmithtssurya: mriedem: don't we want this to be under the big stack of down-cell patches so we can merge and backport it? https://review.openstack.org/#/c/592428/215:19
dansmithoh sorry I guess it is15:20
dansmithnevermind15:20
openstackgerritMerged openstack/nova stable/ocata: Default embedded instance.flavor.disabled attribute  https://review.openstack.org/58052515:24
*** gyee has joined #openstack-nova15:27
dansmithmelwitt: if you want to hit this last patch in my series, it'll make the down cell stuff soon able to be based on master: https://review.openstack.org/#/c/594577/1115:27
*** sapd1 has quit IRC15:31
*** janki has quit IRC15:32
*** david-lyle has quit IRC15:33
sean-k-mooneydansmith: Tahvok was asking about periodic spikes in nova-conductor cpu usage in a cloud of ~160 compute nodes earlier. beyond the periodic jobs i was not sure what would be likely to cause the conductor to be processing ~100 rpc messages a second. any thoughts?15:33
dansmithsean-k-mooney: sounds like a support (not dev) question15:33
*** dklyle has joined #openstack-nova15:34
dansmithsean-k-mooney: but yeah, conductor pretty much just answers to nova-compute on an idle cloud, so it'd be periodics from compute nodes15:34
sean-k-mooneydansmith: perhaps, but do we expect 18 nova-condoctor worker threads to spike to 100% usage every 1-2 seconds http://paste.openstack.org/show/728925/15:34
dansmithwe should totally call it condoctor15:35
*** tbachman has joined #openstack-nova15:35
sean-k-mooney:)15:35
Tahvoklol15:37
dansmithI assume that was a rhetorical question, but 160 computes configured to run some periodics every minute could certainly generate a fair bit of traffic15:37
sean-k-mooneyi was wondering if we should consider introducing intentional jitter in the periodic jobs to maybe spread out when the jobs run on each node15:37
dansmiththat is already done15:37
Tahvokdansmith: but 100 messages a second? Don't they sync every minute or so? That should generate around 2.5 messages a second.. Not 100..15:37
openstackgerritMerged openstack/nova master: Make monkey patch work in uWSGI mode  https://review.openstack.org/59228515:38
dansmithTahvok: they sync as often as you have them configured for.. without knowing what the messages are, who is sending them, etc, it's hard to say what the problem is15:38
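The exchange above (160 computes syncing once a minute is only ~2.7 messages/second on average, yet the load arrives in spikes) is exactly what periodic-task jitter addresses. A minimal sketch of the idea, using hypothetical names rather than nova's actual periodic_task machinery:

```python
import random

def jittered_schedule(spacing, jitter_frac=0.1, rng=random.random):
    """Yield successive run intervals for a periodic task.

    The first interval is a random fraction of `spacing`, so a fleet of
    nodes started together does not fire in lockstep; later intervals
    wobble by +/- jitter_frac around `spacing` so nodes do not
    re-synchronize over time.
    """
    # random initial offset spreads nodes out across one full period
    yield rng() * spacing
    while True:
        # small per-run wobble, e.g. 60s becomes 54s..66s at 10% jitter
        yield spacing * (1.0 + jitter_frac * (2.0 * rng() - 1.0))

# example: intervals for a 60-second periodic
gen = jittered_schedule(60.0)
first = next(gen)
later = [next(gen) for _ in range(3)]
```

With the initial offset alone, 160 nodes with a 60s spacing would spread their reports over the whole minute instead of bursting together.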
diliprenkilaHi all, I keep getting these errors on my compute nodes "ERROR oslo_service.service [req-0c500027-fc8f-4a24-b1c1-9714b3f248e6 - - - - -] Error starting thread.: AttributeError: '_TransactionContextManager' object has no attribute 'async_'"15:40
Tahvokdansmith: I have tried to debug the rpc calls.. I've tried setting default_log_levels = oslo,messaging=DEBUG, and got lots of messages like this:15:40
TahvokAnd I see lots of messages like this: 2018-08-28 05:54:26.497 23601 DEBUG oslo.messaging._drivers.impl_rabbit [-] Timed out waiting for RPC response: Timeout while waiting on RPC response - topic: "<unknown>", RPC method: "<unknown>" info: "<unknown>" _raise_timeout15:40
*** tbachman_ has joined #openstack-nova15:40
TahvokNo idea what this unknowns are...15:40
dansmithTahvok: I have no idea what would cause that kind of debug with unknown calls and topics15:40
*** jhesketh has quit IRC15:41
dansmithTahvok: I would strongly suspect something is majorly broken with your setup15:41
Tahvokdansmith: almost all the messages were followed with this unknown message15:41
*** jhesketh has joined #openstack-nova15:41
sean-k-mooneyTahvok: didn't you say the cloud was functional and able to boot vms?15:41
Tahvoksean-k-mooney: yep, everything works15:41
*** tbachman has quit IRC15:41
*** tbachman_ is now known as tbachman15:41
dansmithtimeouts generally come from overwhelmed services, rabbit or conductor or db, but I've never seen unknown timeouts like that15:42
TahvokThe reason we got to conductor, is because we are trying to investigate a high cpu usage from rabbit service15:42
*** jistr is now known as jistr|call15:42
dansmithwhich makes me wonder if there's some garbage on the bus, or some messages being echoed because of bad HA or something weird like that15:42
diliprenkila<diliprenkila> Hi all, I keep getting these errors on my compute nodes "ERROR oslo_service.service [req-0c500027-fc8f-4a24-b1c1-9714b3f248e6 - - - - -] Error starting thread.: AttributeError: '_TransactionContextManager' object has no attribute 'async_'" full log is at https://etherpad.openstack.org/p/i2kJvQ4s4o15:42
sean-k-mooneyTahvok: well rabbit is usually the first thing to melt as you scale out15:42
*** jistr|call is now known as jistr15:43
Tahvoksean-k-mooney: I know that, but it's working completely fine apart from the high cpu usage (around 100%~400% on 32 core machine) .. Everything is green15:43
dansmithTahvok: conductor does not send messages except as replies, so if you think some service is generating load on rabbit, it'd be something else15:43
*** sapd1 has joined #openstack-nova15:43
dansmithbased on what you've said, I would not suspect a nova bug, but a misconfiguration or something else acting up15:44
Tahvokdansmith: so it's trying to reply to some unknown messages? Is it only computes it's talking to?15:44
stephenfindiliprenkila: You've got a package version mismatch15:44
*** dtantsur is now known as dtantsur|afk15:44
stephenfindiliprenkila: Also, for usage questions like that, please use #openstack15:44
dansmithTahvok: no I don't think garbage will cause it to try to reply to things with "unknown" like that15:44
diliprenkila<stephenfin> How did u find that ?15:45
dansmithTahvok: the only nova-related wrinkle is that if you have something hammering rabbit causing some timeouts between compute and conductor, the retries on compute may exacerbate the problem, causing more load on conductor as the backlog grows15:45
dansmithTahvok: if you think there's a nova bug you should file a bug with complete logs (not just single lines like you have provided here) and someone can look, but like I said, I suspect something non-nova as the root cause15:45
stephenfindiliprenkila: https://github.com/openstack/oslo.db/commit/df6bf3401266f42271627c1e408f87c71a06cef715:47
Tahvokdansmith: I have sorted 500 messages from the log, and all replies had different id's, which made me think it's conductor itself doing some stuff, and not simply retrying to answer to same timedout calls15:49
sean-k-mooneyTahvok: one thing you could try is deploying a separate rabbitmq instance for nova. that would help you isolate the issue. that said i know you may not want to do that on a running cloud15:50
mriedemTahvok: what versions of oslo.db and oslo.messaging are you using?15:50
mriedem"AttributeError: '_TransactionContextManager' object has no attribute 'async_'"" suggests you're using an old oslo.db15:50
sean-k-mooneymriedem: Tahvok mentioned it was an ocata cloud this morning15:50
mriedemthat doesn't tell me what i'd need to know15:51
mriedemyou can be using min or max versions of oslo.db from ocata15:51
Tahvokmriedem: sec, looking15:51
mriedemor something completely different15:51
stephenfinmriedem: diliprenkila had the same issue, unless you're mixing them up15:51
dansmithmriedem: are you confusing Tahvok and diliprenkila ?15:51
stephenfindansmith: Yeah :)15:51
mriedemyeah sorry15:52
stephenfinmriedem: I think we've an issue there though. The patch I linked was released in 4.40.0, but we pin on a lower version https://github.com/openstack/oslo.db/commit/df6bf3401266f42271627c1e408f87c71a06cef715:52
stephenfinI assume we should be handling the older version or bumping our minimum15:52
TahvokAll versions of nova-conductor container: http://paste.openstack.org/show/728972/15:52
stephenfin...and I also assume we don't see this in tests because we don't run a lower-constraints functional test15:52
sean-k-mooneystephenfin: well is it on master or stable? we can't bump stable minimums15:53
stephenfinsean-k-mooney: Hmm, lemme check15:53
stephenfinsean-k-mooney: Master (git branch --contains 2d532963fa2e013e16cc403f2674a4488c4170ab)15:54
Tahvoksean-k-mooney: I think deploying a separate cloud would not replicate the problem (I simply don't have 160 compute hosts sitting around)15:55
sean-k-mooneystephenfin: i meant the issue you think bumping the min version would fix15:55
stephenfindiliprenkila: Looks like that's a bug. Want to open one with that log and I'll fix that quickly?15:55
stephenfinsean-k-mooney: Yeah, that's what I'm referring to15:55
sean-k-mooneyTahvok: i was not suggesting deploying a separate cloud. just 1 more rabbitmq node and pointing your existing nodes to use it.15:56
mriedemstephenfin: sorry, is diliprenkila on master?15:56
diliprenkila<mriedem> diliprenkila oslo.db==4.40.0 ,oslo.messaging==8.1.0 on the compute nodes and oslo.db==4.38.0, oslo.messaging==6.4.1 on the nova controller node15:56
sean-k-mooneydiliprenkila: so you are running an older controller than compute node?15:57
openstackgerritJay Pipes proposed openstack/os-traits master: clean up CUDA traits  https://review.openstack.org/59717015:57
diliprenkila<mriedem> I have already opened a bug , https://bugs.launchpad.net/nova/+bug/178883315:57
openstackLaunchpad bug 1788833 in OpenStack Compute (nova) "Error during ComputeManager.update_available_resource: AttributeError: '_TransactionContextManager' object has no attribute 'async_" [Undecided,New]15:57
Tahvoksean-k-mooney: yeah, doing that on production env is a bit risky, at least we would need lots of time to prepare such change.. I was hoping there is some better way to check what conductor was replying to15:57
jaypipesdansmith, kosamara: pls review https://review.openstack.org/59717015:58
stephenfinsean-k-mooney: This is the issue https://github.com/openstack/nova/commit/2d532963fa2e013e16cc403f2674a4488c4170ab#diff-8fec546e4c39f78d233f8e21dadaa3ff15:58
sean-k-mooneyTahvok: am, you could maybe dump the contents of the message queues from rabbitmq but other than that i'm not sure15:58
stephenfinsean-k-mooney: We want that change but only when oslo.db>=4.40.015:59
diliprenkila<sean-k-mooney> Yes i am using old ones, but i did install all nova packages from ubuntu cloud archive rocky15:59
dansmithTahvok: you probably need to sniff the bus or something like that15:59
*** udesale has quit IRC15:59
dansmithTahvok: or talk to oslo.messaging people about how that "unknown" thing can even happen15:59
TahvokI was thinking maybe simply stopping all conductor services, and then looking which queue was accumulating the most?15:59
stephenfindiliprenkila: You're using master nova on the broken node though, I imagine15:59
stephenfinBecause the code that's broken isn't on stable/rocky, from what I can see16:00
diliprenkila<stephenfin> Yes i am using master16:00
sean-k-mooneystephenfin: changing from async to async_16:00
*** liuyulong is now known as liuyulong_zzz16:00
efriedkosamara: Are you still around?16:00
dansmithTahvok: the only time conductor initiates a message (not a reply) is during a build or resize type operation, that I can think of16:01
sean-k-mooneystephenfin: it looks like that is in RC3. we can't bump minimums for rocky rc3 at this point16:01
stephenfinsean-k-mooney: sec. We don't need to16:01
dansmithTahvok: sniffing the bus, you may find that one node is going crazy, spewing messages or something16:01
stephenfinsean-k-mooney: But I don't think it is. 'git branch --contains 2d532963fa2e013e16cc403f2674a4488c4170ab' doesn't show it16:02
Tahvokdansmith: we actually have some compute nodes with errors like: [instance: 631c2697-1cdf-4d97-8ae9-006cc5ed6e35] Instance not resizing, skipping migration.16:02
TahvokBut this happen every sync operation (once a minute)16:02
TahvokWhich is a separate issue we are trying to investigate..16:02
dansmithTahvok: I mean during the active part of a migration, in response to an api call16:03
Tahvokdansmith: then it's not the case.. As nothing is migrating right now16:03
dansmithTahvok: right16:03
sean-k-mooneystephenfin: if it was backported the commit id would have changed but the change id would still be in the git log. i'll check16:03
*** alexchadin has quit IRC16:03
*** macza has joined #openstack-nova16:03
*** eharney_ is now known as eharney16:04
sean-k-mooneystephenfin: ya its in rc316:04
sean-k-mooneystephenfin: actually one second16:05
mriedemstephenfin: did i summarize this correctly? https://bugs.launchpad.net/nova/+bug/1788833/comments/116:05
openstackLaunchpad bug 1788833 in OpenStack Compute (nova) "Error during ComputeManager.update_available_resource: AttributeError: '_TransactionContextManager' object has no attribute 'async_" [Undecided,New]16:05
stephenfinmriedem: Yup16:05
TahvokJust realized I didn't paste the messages I'm actually seeing that conductor is handling: http://paste.openstack.org/show/nqYc4ghBYpLAFvinYYAp/16:06
openstackgerritStephen Finucane proposed openstack/nova master: Don't use '_TransactionContextManager._async'  https://review.openstack.org/59717316:06
sean-k-mooneystephenfin: ya its in rc316:06
stephenfinmriedem, sean-k-mooney: And there's the fix (tl;dr: this can has some kicking left in it)16:06
Tahvokdansmith: all of them are actually replies as you can see16:07
stephenfinsean-k-mooney: How'd you figure that out?16:07
* stephenfin doesn't like not being able to discover this stuff from the CLI16:07
dansmithTahvok: that doesn't really tell us anything16:08
sean-k-mooneygit fetch --tags && git checkout 18.0.0.0rc316:08
sean-k-mooneythen git log and search for change id16:08
Tahvokdansmith: doesn't a reply message indicate it's a sync from compute nodes?16:08
*** liuyulong_zzz has quit IRC16:08
stephenfinsean-k-mooney: Ah, 'git branch -a --contains 2d532963fa2e013e16cc403f2674a4488c4170ab'16:08
stephenfinThe '-a' is important. I don't have stable/rocky locally yet16:09
sean-k-mooneystephenfin: also if you look at https://github.com/openstack/nova/commit/2d532963fa2e013e16cc403f2674a4488c4170ab#diff-8fec546e4c39f78d233f8e21dadaa3ff it shows what branches have it16:09
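The commit-containment commands being traded here can be exercised in a throwaway repository; this sketch only demonstrates the `git tag --contains` behaviour they are relying on, it does not touch nova's history:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
# two empty commits: tag v1 contains the first but not the second
git -c user.email=t@t -c user.name=t commit -q --allow-empty -m first
c1=$(git rev-parse HEAD)
git tag v1
git -c user.email=t@t -c user.name=t commit -q --allow-empty -m second
c2=$(git rev-parse HEAD)
git tag --contains "$c1"     # lists v1
git tag --contains "$c2"     # lists nothing: no tag points at or after c2
```

`git branch -a --contains <sha>` answers the same question for branches, including remote-tracking ones you have not checked out locally, which is why the `-a` mattered above.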
dansmithTahvok: I don't think those lines are telling you that they're replies, they're telling you what queue will be used for the reply16:09
sean-k-mooneystephenfin: mriedem in anycase https://review.openstack.org/#/c/597173/1 will need to get applied to stable/rocky16:11
*** jpena is now known as jpena|off16:11
mriedemyes i know16:11
mriedemi left a comment in there16:11
mriedemwe can't do an rc4 so i guess this is just going to be a known broken issue for anyone not using oslo.db 4.40 which is at least in upper-constraints for stable/rocky16:11
mriedemand i'd think/hope most deployments should be using what's in upper-constraints for dependent libraries since those are the versions we test against16:12
*** dave-mccowan has joined #openstack-nova16:13
*** swamireddy has joined #openstack-nova16:13
melwitt.16:13
sean-k-mooneymriedem: yeah that's unfortunate but at least as you said upper constraints allows 4.416:14
openstackgerritStephen Finucane proposed openstack/nova master: Don't use '_TransactionContextManager._async'  https://review.openstack.org/59717316:14
sean-k-mooney*4.4016:14
efriedstephenfin: Getting late for you, you want me to propose the fup?16:16
melwittdansmith: cool, will look16:17
*** Bhujay has joined #openstack-nova16:17
*** Altabay has quit IRC16:17
openstackgerritStephen Finucane proposed openstack/nova master: Revert "Don't use '_TransactionContextManager._async'"  https://review.openstack.org/59717416:18
stephenfinefried: Cheers but it was pretty easy :)16:18
efriedstephenfin: Cool. Can you put Related-Bug in the commit?16:18
openstackgerritStephen Finucane proposed openstack/nova master: Revert "Don't use '_TransactionContextManager._async'"  https://review.openstack.org/59717416:19
stephenfinefried: done16:19
efriedThanks. +216:19
*** sahid has quit IRC16:21
*** Bhujay has quit IRC16:22
*** Bhujay has joined #openstack-nova16:30
*** macza_ has joined #openstack-nova16:30
openstackgerritMerged openstack/nova master: Make scheduler.utils.setup_instance_group query all cells  https://review.openstack.org/54025816:32
*** macza has quit IRC16:34
*** diliprenkila has quit IRC16:35
openstackgerritStephen Finucane proposed openstack/nova master: tests: Further simplification of test_numa_servers  https://review.openstack.org/59683216:39
openstackgerritStephen Finucane proposed openstack/nova master: tests: Validate huge pages  https://review.openstack.org/39965316:39
*** sambetts is now known as sambetts|afk16:41
*** Bhujay has quit IRC16:43
*** Bhujay has joined #openstack-nova16:44
openstackgerritMerged openstack/nova master: Record cell success/failure/timeout in CrossCellLister  https://review.openstack.org/59426516:46
*** hongda has quit IRC17:06
sean-k-mooneyoh by the way i finally got around to rebasing https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1759420 if people have time to review17:09
TahvokIs it safe to reduce the number of conductor workers17:17
Tahvok?17:17
TahvokI'm only afraid if I would lose some data..17:17
sean-k-mooneyTahvok: i believe if you want to reduce the number of workers you will need to restart nova-conductor.17:21
Tahvoksean-k-mooney: yeah, that's not the problem..17:22
TahvokJust done it, I wanted a more organized log: http://paste.openstack.org/show/Iz2EukH6oxZnmdK3OKDX/17:22
sean-k-mooneyif the conductor has dequeued messages but not acked them then rabbit will keep them in the queues17:22
sean-k-mooneyother than that you should not lose any data17:22
*** Bhujay has quit IRC17:24
*** adrianc has quit IRC17:30
dansmithgdi, I can't get a running devstack to save my life17:35
mriedemsaml?17:38
mriedemdansmith: i've heard other people reporting devstack failing b/c not being able to install pysaml17:39
dansmithno, I think the problem is n-api isn't getting setup in the wsgi container,17:40
dansmithso the n-api log is all kinds of missing python stuff17:40
sean-k-mooneymriedem: ya had issues on centos 7 pysaml2 would not install from pip17:40
dansmithhmm, maybe this is a py3 thing?17:40
sean-k-mooneydansmith: if you are hitting the pysaml2 issue i have locally capped it to 4.5.017:42
sean-k-mooney4.6.0 was complaining about directories not existing17:42
* dansmith deletes more stuff and tries again17:43
mriedemsean-k-mooney: in -dev they said you needed newer pip/setuptools17:44
sean-k-mooneymriedem: devstack on centos7 keeps downgrading my pip version17:44
sean-k-mooneyit seems to be pinning me to pip 9.0.317:45
sean-k-mooneymriedem: i think devstack is preferring system python which is rather old on centos 717:48
sean-k-mooneythat said i thought i pulled down pip directly from pypi so i don't know why i'm stuck on version 917:49
*** markvoelker has joined #openstack-nova17:51
sean-k-mooneyoh there is a tools/cap-pip.txt with pip!=8,<1017:51
sean-k-mooneymriedem: did they say what version you need?17:52
*** r-daneel_ has joined #openstack-nova17:53
*** r-daneel has quit IRC17:53
*** r-daneel_ is now known as r-daneel17:53
dansmithsean-k-mooney: okay I deleted enough stuff to get to the saml thing17:54
dansmithso what's the workaround?17:54
sean-k-mooneydansmith: mine is to modify the upper-constraints file to cap it to 4.5.017:54
dansmithum modify where?17:55
sean-k-mooneythe other workaround would appear to be use a newer pip/setuptools but devstack caps pip to pip!=8,<10 in tools/cap-pip.txt17:55
dansmithopt/requirements?17:55
dansmithwill that stick?17:55
sean-k-mooneydansmith: yes /opt/requirements15:55
sean-k-mooneyit will stick if you don't have RECLONE=True set17:55
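The workaround described amounts to a one-line pin in the constraints file devstack installs from (the /opt/requirements path is the one mentioned in this conversation; adjust for your layout):

```
# upper-constraints.txt in the cloned requirements repo (/opt/requirements here)
pysaml2===4.5.0
```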
dansmithack, thanks, trying17:57
dansmithokay I think I'm past that point now17:59
sean-k-mooneywe probably should figure out if we can raise the pip/setuptools version in devstack if we really do need a newer version. that or blacklist pysaml2===4.6.017:59
openstackgerritMerged openstack/nova master: Optimize global marker re-lookup in multi_cell_list  https://review.openstack.org/59457718:23
*** sapd1 has quit IRC18:26
sean-k-mooneydansmith: by the way, we were talking about deleting nova-net yesterday. just realised that we are still blocking dropping rootwrap until nova-net is gone https://review.openstack.org/#/c/554438/18:27
*** georgem1 has joined #openstack-nova18:27
dansmithsean-k-mooney: aye18:27
sean-k-mooneyis this on the ptg techdebt section? ill add it if not18:27
*** r-daneel_ has joined #openstack-nova18:31
sean-k-mooneythat's my only addition to the etherpad this time round: suggesting we drop cellsv1 so we can drop nova-net so we can drop oslo-rootwrap. totally trivial and not controversial at all. at least i hope cern will agree with that else i will be sad18:32
*** r-daneel has quit IRC18:32
*** r-daneel_ is now known as r-daneel18:32
*** markvoelker has quit IRC18:32
*** jpena|off has quit IRC18:33
*** georgem1 has left #openstack-nova18:33
*** markvoelker has joined #openstack-nova18:33
*** jpena|off has joined #openstack-nova18:33
*** dave-mccowan has quit IRC18:36
*** markvoelker has quit IRC18:37
*** dave-mccowan has joined #openstack-nova18:42
openstackgerritEric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees  https://review.openstack.org/57623619:05
*** markvoelker has joined #openstack-nova19:19
*** dpawlik has joined #openstack-nova19:21
*** hoonetorg has joined #openstack-nova19:22
*** hoonetorg has quit IRC19:24
*** dpawlik has quit IRC19:26
*** dave-mccowan has quit IRC19:26
smcginnisAnyone aware of any known issues that would be causing this - http://logs.openstack.org/93/596493/1/check/cinder-tempest-dsvm-lvm-lio-barbican/d716acf/logs/screen-n-cpu.txt.gz#_Aug_27_14_31_35_67899119:28
efriedmriedem, dansmith: Didn't y'all see this the other day ^ ?19:30
mriedemno it was an instance.save() when i saw it19:31
*** owalsh_ has joined #openstack-nova19:31
smcginnisOdd that it seems to be somewhere that it kills the whole service.19:32
mriedemin this case, compute is asking the cell conductor for the min nova-compute service version in that cell,19:32
mriedemand timing out waiting for a reply19:32
mriedemsmcginnis: it's on startup of the service - init the compute rpc api19:32
mriedemi wonder if nova-compute is starting before nova-cond_cell1?19:32
mriedemn-cond-cell1 i mean19:33
smcginnisWas just checking timestamps there.19:33
openstackgerritEric Fried proposed openstack/nova master: Do test_reshape with an actual startup  https://review.openstack.org/59721819:33
mriedemn-cond-cell1 appears to be up around at least Aug 27 14:30:40.18597319:33
mriedemn-cpu fails at Aug 27 14:31:35.67899119:34
*** owalsh has quit IRC19:34
smcginnisYeah, about a second ahead of the failure on another one I'm looking at too.19:34
*** maciejjozefczyk has joined #openstack-nova19:34
mriedemso n-cpu and n-cond-cell1 looks like they are starting at roughly the same time,19:34
mriedemn-cpu starts querying n-cond-cell1 before it's fully up maybe19:34
mriedemthen times out19:34
smcginnisDoes that need to happen in __init__? Seems like maybe somewhere after where it could retry once or twice might be good.19:35
*** owalsh_ has quit IRC19:35
*** owalsh has joined #openstack-nova19:36
mriedemit probably doesn't have to, no19:36
mriedemwe could do it in post_start_hook or something19:36
mriedemnot sure if it would help19:36
mriedemself.compute_rpcapi is pretty much only used for inter-compute comms anyway19:37
mriedemit's not used on startup19:37
*** dpawlik has joined #openstack-nova19:38
mriedem2018-08-27 14:30:31.159 | + functions-common:_run_under_systemd:1482 :   sudo systemctl start devstack@n-cond-cell1.service19:38
mriedem2018-08-27 14:30:32.414 | + functions-common:_run_under_systemd:1482 :   sudo systemctl start devstack@n-cpu.service19:38
mriedemi think using upgrade_levels=auto is relatively new in devstack, and i think i added it...19:39
smcginnisI'm fine with blaming you.19:39
mriedemhttps://github.com/openstack-dev/devstack/commit/21221d1ad1462cdcaed4d052c3324ae384b407d419:40
mriedembeen there since may so not new19:40
mriedemhttp://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22_determine_version_cap%5C%22%20AND%20message%3A%5C%22MessagingTimeout%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d19:40
mriedembut we might not have multi-line indexing properly in this job19:40
mriedemhere we go http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22_determine_version_cap%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d19:41
mriedemsmcginnis: can you report a bug for this so we can at least start tracking it in e-r?19:41
melwittinternally, people have asked for retry behavior there, if nova-compute starts up before nova-conductor is available19:41
melwittI looked at it a bit but didn't see where/how we could add it19:41
*** dpawlik has quit IRC19:42
smcginnisOK, I'll file a bug so at least it's tracked.19:42
mriedemi'd probably move it to post_start_hook,19:42
mriedemadd a 3-time retry on it,19:42
mriedemfail if we still can't get the thing19:42
mriedemthreading it out doesn't help since then we can't kill the service19:42
*** awaugama has quit IRC19:43
mriedemalthough i think you can just SIGHUP and get nova-compute back on track after n-cond-cell1 is running19:43
mriedemi know a guy in portland that knows how all of this shit works19:43
melwitt:)19:43
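mriedem's three-tries-then-fail plan for the startup version-cap lookup could look roughly like the loop below. `TimeoutError` stands in for oslo.messaging's MessagingTimeout, and every name here is hypothetical rather than nova's actual code:

```python
import time

def lookup_with_retry(lookup, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `lookup()` up to `attempts` times, backing off between tries.

    Re-raises the last timeout if every attempt fails, so the service
    still dies loudly instead of hanging forever during init.
    """
    for attempt in range(1, attempts + 1):
        try:
            return lookup()
        except TimeoutError:
            if attempt == attempts:
                raise
            # exponential backoff between attempts: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Run from a post-start hook rather than `__init__`, a loop like this tolerates the conductor coming up a few seconds late without leaving nova-compute wedged.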
smcginnisLovely - launchpad timeout submitting bug.19:48
openstackgerritEric Fried proposed openstack/nova master: reshaper gabbit: Nix comments re doubled max_unit  https://review.openstack.org/59722019:49
smcginnisDone - https://bugs.launchpad.net/nova/+bug/178948419:49
openstackLaunchpad bug 1789484 in OpenStack Compute (nova) "n-cpu fails init on timeout calling n-cond-cell1" [Undecided,New]19:49
*** dpawlik has joined #openstack-nova19:52
*** ccamacho has quit IRC19:54
smcginnismriedem: This looks like it started on the 18th - http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22_determine_version_cap%5C%22%20AND%20tags:%5C%22screen-n-cpu.txt%5C%22&from=7d19:55
*** dpawlik has quit IRC19:56
*** vivsoni has quit IRC20:03
*** vivsoni has joined #openstack-nova20:03
mriedemefried: comments and nits inline https://review.openstack.org/#/c/576236/20:06
efriedmriedem: ack, thx20:06
mriedemsmcginnis: we only save 10 days of logs20:06
smcginnisWell poop20:06
*** owalsh_ has joined #openstack-nova20:14
*** erlon has quit IRC20:15
*** tssurya has quit IRC20:16
*** owalsh has quit IRC20:17
*** owalsh- has joined #openstack-nova20:17
*** owalsh_ has quit IRC20:18
*** tbachman has quit IRC20:20
*** tbachman has joined #openstack-nova20:22
*** tbachman has quit IRC20:25
*** adrianc has joined #openstack-nova20:26
*** adrianc has quit IRC20:30
*** itlinux has quit IRC20:31
*** owalsh- is now known as owalsh20:37
*** slaweq has quit IRC20:40
*** slaweq has joined #openstack-nova20:40
*** maciejjozefczyk has quit IRC20:42
*** priteau has quit IRC20:43
openstackgerritEric Fried proposed openstack/nova master: Fail heal_allocations if placement is borked  https://review.openstack.org/59723720:55
efriedmriedem: ^20:55
mriedemmmm, bork20:59
mriedemefried: i need a retry decorator, do you remember any pros of retrying.retry over oslo's RetryDecorator?21:04
efriedmriedem: yeah, lemme context switch...21:05
*** markvoelker has quit IRC21:05
efriedmriedem: I think the main advantage of retrying is: no threading.21:05
efriedBut RetryDecorator has a simpler/better-documented interface21:06
mriedemok that's about what i remembered21:10
mriedemduring the RT._update thing21:10
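For comparison, the decorator form being weighed for RT._update looks roughly like this thread-free sketch in the spirit of `retrying.retry`; the real libraries offer much richer stop/wait policies, and `retry_on` is an invented name:

```python
import functools
import time

def retry_on(exc_types, tries=3, delay=0.0, sleep=time.sleep):
    """Decorator: re-invoke the wrapped function when it raises one of
    `exc_types`, up to `tries` total attempts, sleeping `delay` seconds
    between them; the final failure propagates to the caller."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return fn(*args, **kwargs)
                except exc_types:
                    if attempt == tries:
                        raise
                    if delay:
                        sleep(delay)
        return wrapper
    return decorator
```

Something like `@retry_on((SomeTransientError,), tries=3)` then retries a flaky call twice before giving up, without spawning the extra thread that makes oslo's RetryDecorator heavier.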
*** bnemec has quit IRC21:12
*** bnemec has joined #openstack-nova21:13
melwittmriedem: I've been using a script for counting blueprint stuff for future charting, if you wouldn't mind reviewing https://review.openstack.org/592628 (second patch in series)21:15
*** eharney has quit IRC21:16
*** holser_ has quit IRC21:24
mriedem+221:28
*** N3l1x has joined #openstack-nova21:28
efriedmriedem: Responded to your questions on https://review.openstack.org/#/c/576236/21:28
*** owalsh has quit IRC21:29
efriedmriedem: Let me know if you want anything acted on immediately, or if fups are okay.21:29
mriedemmmm f ups21:29
mriedemyeah was just opening that21:29
mriedemthere is also a strange small human in my house now since school is out so i'll be potentially distracted21:29
efriedmriedem: No huge hurry, I think I'm probably going to need to knock off for the evening soon.21:30
melwittmriedem: thanks. the counting blueprint script is the second patch stacked on top of that one, though. pls21:31
mriedemoh right...21:31
*** owalsh has joined #openstack-nova21:34
*** dpawlik has joined #openstack-nova21:35
*** claudiub has quit IRC21:35
*** hoonetorg has joined #openstack-nova21:36
mriedemefried: +2 on that21:40
mriedemhttps://review.openstack.org/#/c/576236/ that is21:40
efriedmriedem: Cool, thx21:40
mriedemjaypipes was +2 so i suppose i could +W21:40
efriedyahsure21:40
mriedemjaypipes: do you want to +W https://review.openstack.org/#/c/576236/ ?21:40
mriedemor happy to have me proxy21:40
efriedthat would put the whole pile in the gate.21:40
jaypipesmriedem: done21:41
efriedwoot21:42
openstackgerritEric Fried proposed openstack/nova master: Name arguments to _get_provider_ids_matching  https://review.openstack.org/59729121:45
*** rcernin has joined #openstack-nova21:46
*** takashin has joined #openstack-nova21:51
*** mchlumsky has quit IRC21:58
*** cdent has quit IRC22:16
*** dpawlik has quit IRC22:18
openstackgerritEric Fried proposed openstack/nova master: Other host allocs may appear in gafpt during evac  https://review.openstack.org/59730122:24
openstackgerritEric Fried proposed openstack/nova master: Mention (unused) RP generation in POST /allocs/{c}  https://review.openstack.org/59730422:37
openstackgerritmelanie witt proposed openstack/nova-specs master: Propose configurable maximum number of volumes to attach  https://review.openstack.org/59730622:54
*** macza_ has quit IRC23:00
*** sambetts|afk has quit IRC23:01
*** sambetts_ has joined #openstack-nova23:05
*** r-daneel has quit IRC23:10
*** holser_ has joined #openstack-nova23:13
*** holser_ has quit IRC23:20
*** tbachman has joined #openstack-nova23:32
*** erlon has joined #openstack-nova23:34
openstackgerritMerged openstack/nova master: Don't use '_TransactionContextManager._async'  https://review.openstack.org/59717323:34
openstackgerritMerged openstack/nova master: Revert "Don't use '_TransactionContextManager._async'"  https://review.openstack.org/59717423:34
*** owalsh_ has joined #openstack-nova23:36
*** gbarros has quit IRC23:37
*** owalsh has quit IRC23:37
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4)  https://review.openstack.org/57410623:40
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5)  https://review.openstack.org/57411023:40
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (6)  https://review.openstack.org/57411323:40
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (7)  https://review.openstack.org/57497423:41
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (8)  https://review.openstack.org/57531123:41
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (9)  https://review.openstack.org/57558123:41
*** mlavalle has quit IRC23:43
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (10)  https://review.openstack.org/57601723:43
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (11)  https://review.openstack.org/57601823:43
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (12)  https://review.openstack.org/57601923:44
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13)  https://review.openstack.org/57602023:44
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14)  https://review.openstack.org/57602723:45
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15)  https://review.openstack.org/57603123:45
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16)  https://review.openstack.org/57629923:45
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17)  https://review.openstack.org/57634423:46
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18)  https://review.openstack.org/57667323:46
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19)  https://review.openstack.org/57667623:46
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20)  https://review.openstack.org/57668923:47
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21)  https://review.openstack.org/57670923:47
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22)  https://review.openstack.org/57671223:48
*** stakeda has joined #openstack-nova23:48
*** itlinux has joined #openstack-nova23:53
*** threestrands has joined #openstack-nova23:54
*** brinzhang has joined #openstack-nova23:57

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!