Tuesday, 2017-09-26

mriedemsuperconductor foils me again00:04
mriedemnova-manage db archive_deleted_rows in devstack uses nova.conf by default, which is cell0 in devstack00:04
mriedemwe could make nova-manage db archive_deleted_rows use the api database to get the cells...00:08
openstackLaunchpad bug 1719487 in OpenStack Compute (nova) "nova-manage db archive_deleted_rows is not multi-cell aware" [Medium,Triaged]00:12
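The idea mriedem floats above (look up the cells from the API database, then archive in each one) could be sketched roughly as follows. This is only an illustration: `CellMapping`, `archive_all_cells`, and the `archive_one` callback are hypothetical names, not nova's actual interfaces, and treating `max_rows` as a total budget across cells is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class CellMapping:
    """Hypothetical stand-in for a cell record read from the API database."""
    name: str
    database_connection: str


def archive_all_cells(
    cells: Iterable[CellMapping],
    archive_one: Callable[[str, int], int],
    max_rows: int,
) -> Dict[str, int]:
    """Archive up to max_rows soft-deleted rows in total, cell by cell.

    archive_one(connection, limit) is assumed to archive at most `limit`
    rows in the database behind `connection` and return how many it moved.
    """
    archived: Dict[str, int] = {}
    remaining = max_rows
    for cell in cells:
        if remaining <= 0:
            break
        count = archive_one(cell.database_connection, remaining)
        archived[cell.name] = count
        remaining -= count
    return archived
```

With a single nova.conf pointing at cell0, only the first iteration of this loop would ever run, which is the behaviour the bug describes.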
openstackgerritMatt Riedemann proposed openstack/nova master: For map_instances argument destination is not defined  https://review.openstack.org/50223600:24
openstackgerritMatt Riedemann proposed openstack/nova master: Fix --max-count handling for nova-manage cell_v2 map_instances  https://review.openstack.org/50223600:26
*** ijw has quit IRC01:02
*** Apoorva_ has quit IRC01:02
*** ijw has joined #openstack-nova01:02
*** acormier has joined #openstack-nova01:08
*** yangyapeng has joined #openstack-nova01:12
openstackgerritjichenjc proposed openstack/nova master: propagate OSError to MigrationPreCheckError  https://review.openstack.org/46977901:20
openstackgerritjichenjc proposed openstack/nova master: check query param for server groups function  https://review.openstack.org/50034701:31
*** hongbin has joined #openstack-nova01:36
*** yangyapeng has quit IRC01:40
*** yangyapeng has joined #openstack-nova01:40
openstackgerritjichenjc proposed openstack/nova master: fix race condition of instance host  https://review.openstack.org/49445801:44
yufei@alex_xu, could you please take a look at this patch when you are free? It is a small patch which removes the re-auth logic for service users when nova calls ironic.  https://review.openstack.org/#/c/502382/01:57
openstackgerritwanghao proposed openstack/nova master: Set min_disk in the image meta based on the root volume's size  https://review.openstack.org/40739702:25
*** yingjun has joined #openstack-nova02:39
*** mingyu has joined #openstack-nova02:39
*** Dinesh_Bhor has quit IRC02:57
*** jichen has joined #openstack-nova02:57
*** Dinesh_Bhor has joined #openstack-nova02:57
*** yangyapeng has joined #openstack-nova03:01
*** mikal has joined #openstack-nova03:02
*** yangyape_ has joined #openstack-nova03:03
openstackgerritSteven Webster proposed openstack/nova master: Update nova network info when doing rebuild for evacuate operation  https://review.openstack.org/38285303:05
openstackgerritSteven Webster proposed openstack/nova master: Race condition between audit and migrate/resize revert  https://review.openstack.org/40099503:06
*** yangyapeng has quit IRC03:07
gmannalex_xu: do you remember why limit and marker are single_param in this - https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/schemas/hypervisors.py#L2203:26
*** pooja_jadhav has joined #openstack-nova03:27
*** bhagyashris has joined #openstack-nova03:27
*** Dinesh_Bhor has joined #openstack-nova03:28
*** neha_alhat has joined #openstack-nova03:28
gmannmriedem: alex_xu: because it was added in a microversion? Because we agreed to keep accepting multiple values for 'limit' and the other pagination query params and let the controller code fetch one (which is the second one by default due to dict)03:29
*** acormier has joined #openstack-nova03:39
gmannmriedem: alex_xu similar for additionalProperties, it is False. i think those are correct as those query are introduced with microversion. I will keep 2.33 schema also same way to avoid any inconsistency between 2.33 and 2.5303:39
*** hongbin has quit IRC03:40
*** vladikr has quit IRC03:45
*** neha_alhat has joined #openstack-nova03:45
*** vladikr has joined #openstack-nova03:45
*** Dinesh_Bhor has joined #openstack-nova03:46
*** bhagyashris has joined #openstack-nova03:47
*** yufei has quit IRC03:47
*** pooja_jadhav has joined #openstack-nova03:49
gmannmriedem: alex_xu this one for 2.33-   https://review.openstack.org/50734404:33
alex_xugmann: we should keep backward-compatible in 2.33?05:02
*** Apoorva has quit IRC05:02
*** Apoorva has joined #openstack-nova05:05
*** Apoorva has quit IRC05:05
*** sree has joined #openstack-nova05:06
*** Apoorva has joined #openstack-nova05:08
*** sree has quit IRC05:09
*** yangyape_ has quit IRC05:09
*** sree has joined #openstack-nova05:09
*** yangyapeng has joined #openstack-nova05:09
gmannalex_xu: but there was no query param before 2.33 so making those as single value and no additional property should be fine?05:15
gmannalex_xu: the only issue was that we did not restrict or document that while doing 2.3305:15
gmannalex_xu: which can break people if someone is using it with multiple values and additional params on >=2.33 (<2.53, as 2.53 restricts those with the schema)05:16
*** Apoorva has quit IRC05:18
gmannalex_xu: i think i agree to keep those same as it is behaving currently.05:19
alex_xugmann: yea, keep the same behaving currently05:23
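The difference gmann and alex_xu are weighing can be illustrated with a small stand-alone sketch. The two schemas and `validate_query` below are hypothetical and only loosely modelled on JSON-schema keywords; they are not nova's validation code. The lenient schema mimics the behaviour being kept for compatibility (repeated and unknown parameters accepted, last value wins), while the strict one mimics a single-value schema with additionalProperties set to False.

```python
from urllib.parse import parse_qs

# Hypothetical schemas for illustration only.
LENIENT = {
    "properties": {"limit": {}, "marker": {}},
    "additionalProperties": True,
}
STRICT = {
    "properties": {"limit": {"maxItems": 1}, "marker": {"maxItems": 1}},
    "additionalProperties": False,
}


def validate_query(query_string, schema):
    """Validate a raw query string and collapse each parameter to one value."""
    params = parse_qs(query_string, keep_blank_values=True)
    known = schema["properties"]
    if not schema["additionalProperties"]:
        unknown = sorted(set(params) - set(known))
        if unknown:
            raise ValueError("unexpected query parameters: %s" % ", ".join(unknown))
    for name, values in params.items():
        max_items = known.get(name, {}).get("maxItems")
        if max_items is not None and len(values) > max_items:
            raise ValueError("%s was given %d times" % (name, len(values)))
    # When a parameter repeats, the last occurrence is the one that is kept.
    return {name: values[-1] for name, values in params.items()}
```

A request such as `limit=1&limit=2&foo=bar` passes the lenient schema and yields limit=2, but is rejected by the strict one, which is why tightening an already-released microversion could break existing callers.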
*** armax has joined #openstack-nova05:59
*** felipemonteiro has quit IRC06:00
wlfightupcan care me?06:02
*** thorst has quit IRC06:03
*** chyka has quit IRC06:24
*** jaosorior has joined #openstack-nova06:35
*** moshele has joined #openstack-nova06:36
*** Oku_OS is now known as Oku_OS-away06:40
*** crushil_ has quit IRC06:45
openstackgerritMerged openstack/nova master: Live Migration sequence diagram  https://review.openstack.org/50637006:45
*** belmoreira has quit IRC07:19
openstackgerritAlex Xu proposed openstack/nova-specs master: Add trait support in the allocation candidates API  https://review.openstack.org/49771307:19
*** ratailor is now known as ratailor|Lunch07:43
* bauzas has a lovely talk to prepare for telling about Pike and the PTG07:44
*** belmoreira has joined #openstack-nova07:44
bauzas3 days, 20 slides07:45
bauzasI mean, 3 days left, 20 slides to write07:45
bauzasI can do it :)07:45
gmannbauzas: all the best :)07:45
*** sree has quit IRC07:46
gibiI feel your pain. I created a one slider for a local meetup.07:48
openstackgerritjichenjc proposed openstack/nova-specs master: Adds spec for instance live resize  https://review.openstack.org/14121907:50
openstackgerritTakashi NATSUME proposed openstack/nova master: Fix test_get_volume_config method  https://review.openstack.org/48946707:50
gibibauzas: I decided to do it a bit blindly and only created a single slide. I will talk about what I will remember. Bit more stress during the talk lot less stress during the week before the talk :)07:59
*** belmoreira has quit IRC07:59
*** alexchadin has joined #openstack-nova08:00
*** sree has joined #openstack-nova08:02
openstackgerritjichenjc proposed openstack/nova-specs master: Adds spec for instance live resize  https://review.openstack.org/14121908:02
*** zhurong has joined #openstack-nova08:20
*** OctopusZhang has joined #openstack-nova08:20
*** yufei has quit IRC08:23
*** OctopusZhang is now known as yufei08:23
*** yamamoto has quit IRC08:47
openstackgerritGhanshyam Mann proposed openstack/nova master: Implement query param schema for GET hypervisor(2.33)  https://review.openstack.org/50734408:50
*** udesale has joined #openstack-nova08:51
openstackgerritTakashi NATSUME proposed openstack/nova-specs master: Abort Cold Migration  https://review.openstack.org/33473209:07
openstackgerritTakashi NATSUME proposed openstack/python-novaclient master: Microversion 2.54 - Enable cold migration with target host  https://review.openstack.org/40670709:08
*** mvk has joined #openstack-nova09:28
*** yangyapeng has quit IRC09:30
*** wlfightup has quit IRC09:31
*** wlfightup has joined #openstack-nova09:33
*** yangyapeng has quit IRC09:34
*** yangyapeng has joined #openstack-nova09:35
*** sree has joined #openstack-nova10:11
*** wlfightup has joined #openstack-nova10:12
openstackgerritYikun Jiang proposed openstack/nova master: Update Instance action's updated_at when action event updated.  https://review.openstack.org/50747310:34
openstackgerritYikun Jiang proposed openstack/nova master: Update Instance action's updated_at when action event updated.  https://review.openstack.org/50747310:36
openstackgerritGhanshyam Mann proposed openstack/nova master: Implement query param schema for simple_tenant_usage  https://review.openstack.org/50748010:54
openstackgerritRodolfo Alonso Hernandez proposed openstack/nova master: Read Neutron port 'binding_profile' during boot  https://review.openstack.org/50748110:54
openstackgerritBalazs Gibizer proposed openstack/nova master: cover migration cases with functional tests  https://review.openstack.org/49386511:02
*** Tom_ has quit IRC11:06
openstackgerritZhenyu Zheng proposed openstack/nova master: nova-manage db archive_deleted_rows is not multi-cell aware  https://review.openstack.org/50748611:14
mdboothgibi: Before I rebase it, any chance you could promote your +1 here to a +2: https://review.openstack.org/#/c/479802/8 ?11:33
mdboothThen I'll rebase in a bit after it merges11:33
*** zhurong has quit IRC11:35
openstackgerritMatthew Booth proposed openstack/nova master: Ensure errors_out_migration errors out migration  https://review.openstack.org/47980211:37
openstackgerritMatthew Booth proposed openstack/nova master: Use Migration object in ComputeManagerMigrationTestCase  https://review.openstack.org/50212611:37
openstackgerritMatthew Booth proposed openstack/nova master: Automatically revert resize which fails on destination  https://review.openstack.org/46252111:37
mdboothActually meh it was a clean rebase anyway. Should keep the existing +2.11:37
*** yangyapeng has joined #openstack-nova11:38
*** zhurong has joined #openstack-nova11:39
*** markvoelker has joined #openstack-nova11:39
*** acormier has joined #openstack-nova11:43
*** acormier has quit IRC11:47
openstackgerritAlex Xu proposed openstack/nova-specs master: Add trait support in the allocation candidates API  https://review.openstack.org/49771312:13
bauzasgibi: sorry was outside, but back now, +W'd12:14
*** edmondsw has joined #openstack-nova12:14
ratailorefried, not yet. could you help ?12:17
efriedratailor Your observation is correct, and by design.  As for the reasoning behind it... that's complicated.12:18
efriedratailor But if you want a specific microversion in osc, you can ask for it with an env var, CLI opt, or (I think) conf var.12:18
bauzasefried: if the microversion is not asking for a new attribute :p12:19
efriedYeah, there's that :)12:19
manasmbauzas: this is because the instance_group is set to None in the request_spec.12:20
manasmIs there a known issue around that?12:21
bauzasmanasm: do you have an open bug for that?12:21
ratailorefried, and backward compatible.12:21
mdboothIt breaks some tempest tests, though :/12:39
mdboothProbably means I didn't think of something.12:40
mdboothIt does look obvious, though.12:40
stephenfinmdbooth: Reviewed both. Tidy job12:40
* mdbooth hasn't investigated the tempest failures yet, though.12:40
kashyapmdbooth: Wonder if you want to link to the virDomainMigrateFlags page: https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainMigrateFlags12:40
*** artom has joined #openstack-nova12:51
mdboothkashyap: Yep, in the block migration tests :)12:52
kashyapmdbooth: I'm looking at the log - http://logs.openstack.org/02/507202/2/check/gate-tempest-dsvm-py35-ubuntu-xenial/8485b63/console.html12:52
kashyapI'll note on the review if I learn something new from the log12:53
openstackgerritYikun Jiang proposed openstack/nova master: Update Instance action's updated_at when action event updated.  https://review.openstack.org/50747312:53
openstackgerritYikun Jiang proposed openstack/nova master: Update Instance action's updated_at when action event updated.  https://review.openstack.org/50747312:57
*** erlon has joined #openstack-nova12:57
*** esberglu has joined #openstack-nova12:57
*** alexchadin has quit IRC13:01
openstackgerritEric Fried proposed openstack/nova master: Get auth from context for glance endpoint  https://review.openstack.org/49005713:02
openstackgerritSean Dague proposed openstack/nova master: Break out BasicTestCase  https://review.openstack.org/50725313:06
openstackgerritSean Dague proposed openstack/nova master: Don't use mock.patch.stopall  https://review.openstack.org/50752713:06
openstackgerritSean Dague proposed openstack/nova master: Remove REQUIRES_LOCKING as nothing needs process locking in the tests  https://review.openstack.org/50752813:06
openstackgerritSean Dague proposed openstack/nova master: WIP: demonstrate no use of external locking  https://review.openstack.org/50752913:06
*** pooja_jadhav has joined #openstack-nova13:06
*** yamamoto has quit IRC13:06
*** bhagyashris has joined #openstack-nova13:07
*** neha_alhat has joined #openstack-nova13:07
*** lucasxu has joined #openstack-nova13:07
openstackgerritEric Fried proposed openstack/nova master: Don't fix protocol-less glance api_servers anymore  https://review.openstack.org/50531713:08
*** Dinesh_Bhor has joined #openstack-nova13:08
*** ratailor has quit IRC13:10
*** yingjun has joined #openstack-nova13:11
*** mriedem has joined #openstack-nova13:11
openstackgerritYikun Jiang proposed openstack/nova master: Update Instance action's updated_at when action event updated.  https://review.openstack.org/50747313:12
sdaguegibi: so, interesting fact from this morning, I'm pretty convinced we don't need any of the REQUIRES_LOCKING code13:16
*** gbarros has joined #openstack-nova13:16
sdaguegibi: also, if you are able to take a look at the qemu 2.10 support patch, that would be cool - https://review.openstack.org/#/c/505673/13:20
sdagueor bauzas13:20
*** shaner has quit IRC13:21
*** baoli has joined #openstack-nova13:23
*** smatzek has quit IRC13:27
*** udesale has joined #openstack-nova13:27
mdboothstephenfin:  think I'm going to leave that etree.tostring() patch always producing unicode, but I'm going to add a 2/3 hack next to the migrateToURI3 which handles the difference, because IMHO the libvirt bindings should accept a unicode string there, and it's just saner.13:38
kashyapMatt, hmm, the assertThat() mismatch seems to be related to ID:13:39
sdaguemriedem: I was diving through thinking more about https://review.openstack.org/#/c/507239/ last night, why do you think that external locking is required there? Because it should blow up if not provided but is needed13:51
mriedemsdague: i thought that also locked those tests to run serially13:52
mriedemmaybe i should be using https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/fixture/lockutils.py#L2213:52
mriedem^ is actually what i started using13:52
sdaguewith no shared state13:54
eantyshevmikal: Hello! Regarding your review https://review.openstack.org/#/c/49232513:54
*** trinaths has left #openstack-nova13:54
sdagueThe reason we had REQUIRES_LOCKING at all was because oslo required a directory name or it exploded13:54
sdaguebut we feed those all temp directories anyway, they never cross lock between workers13:55
*** belmoreira has joined #openstack-nova13:55
eantyshevit fails on parallels virt_type, and I'd like to update it for you, don't you mind?13:56
sdaguebut I actually don't think that default behavior holds any more, and we can probably fully delete that variable anyway, as it definitely confuses people as to what it does13:56
sdaguemriedem: https://review.openstack.org/#/c/507253/ - I was experimenting this morning13:56
*** acormier has joined #openstack-nova13:57
*** eharney has joined #openstack-nova13:58
*** armax has quit IRC13:58
mriedemavolkov: you might like to take a crack at this https://bugs.launchpad.net/nova/+bug/171946013:59
openstackLaunchpad bug 1719460 in OpenStack Compute (nova) "(perf) Unnecessarily joining instance.services when listing instances regardless of microversion" [Medium,Triaged]13:59
mriedemshould be pretty simple13:59
*** awaugama has joined #openstack-nova14:01
*** shaner has joined #openstack-nova14:01
*** smatzek has joined #openstack-nova14:01
mriedemgoing to need to do that flavor thing because otherwise it's a 60 second rpc timeout per call to select_destinations14:04
*** belmoreira has joined #openstack-nova14:04
*** rmart04 has quit IRC14:04
gibimriedem, sdague: this also means that the problems we see with the rpc tests in bug 1685333 are not because of the lack of locking14:06
openstackbug 1685333 in OpenStack Compute (nova) "Fatal Python error: Cannot recover from stack overflow. - in py35 unit test job" [High,Confirmed] https://launchpad.net/bugs/168533314:06
sdaguemriedem: is there a reset on rpc variables that is needed that's not happening?14:06
mriedemsdague: the TestRPC class does a reset per test method14:07
*** yamamoto has joined #openstack-nova14:07
*** armax has joined #openstack-nova14:07
sdaguegibi: I don't see how it could be. It might be a deadlock14:07
mriedemdansmith: also came across this last night https://bugs.launchpad.net/nova/+bug/171948714:07
openstackLaunchpad bug 1719487 in OpenStack Compute (nova) "nova-manage db archive_deleted_rows is not multi-cell aware" [Wishlist,Triaged] - Assigned to Zhenyu Zheng (zhengzhenyu)14:07
sdaguethe biggest issue though is it doesn't have the timeout bits in place, so it's hard to see what's going on14:07
sdagueI think if we trigger the timeout we get a stack trace14:08
gibisdague: that would be nice14:08
dansmithmriedem: meh14:08
mriedemit's wishlist, sure14:08
mriedemgibi: you can be brave locally to start :)14:11
sdaguegibi: yeh, well we should at least get the test_rpc under timeout control, regardless of the rest of it14:11
*** Tom__ has quit IRC14:13
gibimriedem: I can definitely do that14:13
avolkovmriedem: ack14:13
openstackgerritMatt Riedemann proposed openstack/nova master: Make TestRPC inherit from the base nova TestCase  https://review.openstack.org/50723914:13
sdaguegibi: in the fixtures?14:15
gibisdague: here https://github.com/openstack/nova/blob/62c4535a85f7d37f1c9da1e8a747f25ec63dc785/nova/tests/unit/api/openstack/test_requestlog.py#L3814:16
sdagueah, cool, good catch14:16
mriedemi thought ^ was intentional14:16
mriedemfor the placement split or something14:16
gibisdague: I think fixtures are OK to derive from testtools.TestCase as we use fixtures like mixins14:17
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: Fix --max-count handling for nova-manage cell_v2 map_instances  https://review.openstack.org/50755214:17
sdaguegibi: yeh, some of the more advanced ones should see the timeout14:17
*** baoli has quit IRC14:17
sdaguebut I think that's follow on14:17
sdaguemriedem: it's a good question14:17
sdagueI do agree that we should get that under timeout control14:18
*** rmart04 has quit IRC14:20
manasmbauzas: here is the exception I saw with the resize -2017-09-22 07:49:16.377 13573 ERROR nova.api.openstack.extensions   File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line 567, in setup_instance_group14:20
manasm2017-09-22 07:49:16.377 13573 ERROR nova.api.openstack.extensions14:21
jaypipesmriedem, dansmith, gibi, sdague, bauzas: any of you noticed weird glitches in the new Gerrit web UI where the screen blinks and flashes when you open up long in-page comments?14:21
*** eantyshev has left #openstack-nova14:21
gibiat least not yet14:21
jaypipesefried: around? want to chat about "trait inheritance"...14:22
efriedjaypipes I thought you'd never ask :*14:23
efriedDid you see my long-winded comment with example based on (or at least attributed to) your response to my response etc. etc.?14:24
jaypipesefried: yeah, so it's absolutely correct that whatever is constructing the provider tree will need to attach traits at the appropriate provider level14:24
efriedBecause the code is gonna hafta do some work to percolate 'em around, if that's supported.14:25
jaypipesefried: no, there's no percolating around...14:25
jaypipessdague: are you talking about the gerrit thing or the nested providers thing? :)14:27
jaypipessdague: mostly seen it happen on specs with long (>8 replies) inline comment "threads"14:27
jaypipessdague: next time it happens I'll ping you a link14:27
efriedBut not reproducible, cause I pop up to the review and back down and do the same thing and it doesn't happen the second time.14:28
sdagueif it's gone slow, or your connection is weird, it might take a while for them to pile in and render14:28
jaypipessdague: nah, it's more like a loop in the UI that happens.14:28
jaypipessdague: ya, what efried said :)14:28
*** zhouyaguo has quit IRC14:29
sdaguenote, we also inject a lot of our own custom client side js to do the CI rollup, so it's entirely possible that is related to the issue14:30
*** lucasxu has quit IRC14:30
efriedjaypipes Okay, so in the example in the comment I linked above: does that work as stated?14:32
*** tetsuro has joined #openstack-nova14:33
jaypipesefried: this is excellent:14:34
jaypipes"With nested resource providers, traits defined on a parent RP are assumed to belong to all its child (descendant) RPs. However, traits defined on a child RP do not apply to the parent (ancestor) RPs. There is no implied sharing of traits within aggregates."14:34
jaypipesefried: even more explicit would be pointing out that aggregates don't actually have *any* traits associated to themselves at all (there's no aggregate_metadata table like there is in Nova)14:34
jaypipesefried: only resource providers have traits associated with them.14:34
efriedRightright, point being that RP1 doesn't inherit any traits from RP2 just because they're in the same aggregate.14:35
jaypipesefried: correct. it's worth spelling that out. aggregates are only grouping mechanisms, nothing more.14:35
efriedjaypipes Okay, cool.  So traits are inherited in NRPs, downwards but not upwards.  And the example below that sentence would work as described.  I guess the implementation details aren't important, but I'm a bit curious how it would work if you didn't actually internally copy the traits from the parent to its children.14:36
*** coreywright has quit IRC14:37
jaypipesefried: don't worry about the implementation details of the queries at this point.14:37
efriedjaypipes Roger that.  So okay, it sounds like we're in agreement.  Thanks for the talk.14:38
jaypipesefried: just typing up in the review... gimme a few14:39
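On efried's question of how downward inheritance could work without copying traits onto the children: one possibility, sketched here purely as an illustration and not as placement's implementation, is to compute the effective set at lookup time by walking up the parent chain. The `Provider` class and `effective_traits` function are hypothetical, and the trait names are just example values. Aggregates are carried on the model but deliberately ignored, since they are only a grouping mechanism.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Provider:
    """Hypothetical in-memory model of a resource provider."""
    name: str
    traits: Set[str] = field(default_factory=set)
    parent: Optional["Provider"] = None
    aggregates: Set[str] = field(default_factory=set)


def effective_traits(provider: Provider) -> Set[str]:
    """Return the provider's own traits plus those of all its ancestors.

    Traits flow downwards only: nothing is collected from children, and
    aggregate membership contributes nothing.
    """
    result: Set[str] = set()
    node: Optional[Provider] = provider
    while node is not None:
        result |= node.traits
        node = node.parent
    return result


compute = Provider("compute1", {"HW_CPU_X86_AVX2"}, aggregates={"agg1"})
nic = Provider("compute1_pf0", {"CUSTOM_FAST_NIC"}, parent=compute)
shared_disk = Provider("shared_storage", {"STORAGE_DISK_SSD"}, aggregates={"agg1"})
```

Here `nic` sees both its own trait and the compute node's, `compute` sees only its own, and `shared_disk` shares nothing with `compute` despite being in the same aggregate.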
openstackgerritBalazs Gibizer proposed openstack/nova master: Moving more utils to ServerResourceAllocationTestBase  https://review.openstack.org/49953914:46
openstackgerritBalazs Gibizer proposed openstack/nova master: factor out compute service start in ServerMovingTest  https://review.openstack.org/50303714:46
openstackgerritBalazs Gibizer proposed openstack/nova master: Test resource allocation during soft delete  https://review.openstack.org/49515914:46
*** moshele has joined #openstack-nova14:46
openstackgerritRodolfo Alonso Hernandez proposed openstack/nova master: Change 'InstancePCIRequest' spec field  https://review.openstack.org/44925714:52
openstackgerritRodolfo Alonso Hernandez proposed openstack/nova master: Read Neutron port 'binding_profile' during boot  https://review.openstack.org/50748114:52
*** ralonsoh_ is now known as ralonsoh14:52
*** lbragstad has joined #openstack-nova14:54
gibiif somebody want some easy patches to review then I'd like to suggest a test improvement series starts https://review.openstack.org/#/c/499539 and mriedem already +2 on it.14:54
bauzasjaypipes: I did notice14:55
bauzaswhen I say FTW, it's sarcastic14:56
bauzasso yeah, I'm hit too14:56
bauzasI suspected a French regulation cause, but looks like it's not :p14:57
gibimriedem: I think the image part is covered by https://review.openstack.org/#/c/505673/5/nova/tests/unit/virt/libvirt/test_utils.py@18114:59
bauzasit's just a helper module15:02
bauzasso there is no way to pass a flag other than adding a global var, which sucks in my mind but sucks less than the other possibilities you envisaged15:02
*** moshele has quit IRC15:05
*** slaweq_ has joined #openstack-nova15:10
mriedemweird, seeing this in the logs when creating 500 servers using the fake driver15:10
mriedemSep 26 15:09:38 devstack nova-compute[30351]: DEBUG nova.compute.resource_tracker [None req-f0c0e899-18e6-47f9-b13b-34829646d07e demo demo] Instance 62604071-77c2-46e1-9f57-d7192edc3f82 has been deleted (perhaps locally). Deleting allocations that remained for this instance against this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1}}. {{(pid=30351) _remove_deleted_instances_allocations /opt/stack/15:10
mriedemSep 26 15:09:38 devstack nova-compute[30351]: INFO nova.scheduler.client.report [None req-f0c0e899-18e6-47f9-b13b-34829646d07e demo demo] Deleted allocation for instance 62604071-77c2-46e1-9f57-d7192edc3f8215:10
*** eharney has joined #openstack-nova15:11
mriedemdoesn't really matter for what i'm testing, but it's odd15:11
mriedemdtroyer told us to make the command names "openstack resource provider <action>"15:15
mriedemto be consistent with everything else in osc15:15
mriedembeing new to osc, i'm going to follow his recommendations15:15
*** rnoriega_ is now known as rnoriega15:16
stephenfinmriedem: There isn't one. Removed now15:16
mriedemand yeah, it's a bit large, but it lays the base crud ops for RPs15:16
mriedemplus the common framework stuff15:16
*** felipemonteiro_ has joined #openstack-nova15:16
*** andreas_s has quit IRC15:18
*** brault has quit IRC15:19
*** felipemonteiro has quit IRC15:20
*** brault has joined #openstack-nova15:20
*** chyka has joined #openstack-nova15:21
*** felipemonteiro_ has quit IRC15:21
cdentjaypipes: one of the other things I did almost immediately after you asked was limiting aggregate checks: https://review.openstack.org/#/c/489633/15:21
*** tonyb has quit IRC15:22
sdaguegibi: it's tested in images.py15:23
sdaguemriedem / gibi: https://review.openstack.org/#/c/505673/5/nova/tests/unit/virt/libvirt/test_utils.py15:23
jaypipescdent: cool, will check shortly.15:23
mriedemsdague: i think he meant there is no unit test for the driver code setting the variable in https://review.openstack.org/#/c/505673/5/nova/tests/unit/virt/libvirt/test_driver.py15:23
gibisdague: that test sets the images.QEMU_VERSION  directly, but you have some code in the driver.py that sets images.QEMU_VERSION15:23
*** tonyb has joined #openstack-nova15:24
mriedemlike test_next_min_qemu_version_ok15:24
openstackgerritChris Dent proposed openstack/nova master: [placement] gabbi tests for shared custom resource class  https://review.openstack.org/48520915:24
mriedemgibi: you can -1 and i can add the test later15:25
sdaguemriedem: he already did15:25
mriedemdansmith: you said os reboot and restack would cleanup devstack?15:27
dansmithmriedem: reboot and stack15:27
*** gbarros has joined #openstack-nova15:28
*** lyan has quit IRC15:32
openstackgerritEvgeny Antyshev proposed openstack/nova master: Add ploop procedures to privsep.libvirt  https://review.openstack.org/50756915:32
*** sree has joined #openstack-nova15:32
*** psachin has quit IRC15:32
*** lyan has joined #openstack-nova15:32
*** gbarros has quit IRC15:34
*** sree has quit IRC15:37
*** gbarros has quit IRC15:37
sdaguegibi / mriedem unit test added15:40
gibisdague: looking15:41
dansmithsdague: he's currently working on being fast and merge things on my instance list patch15:47
*** tssurya has quit IRC15:53
*** sbezverk has quit IRC16:00
*** jistr is now known as jistr|mtg16:01
gibiI've just realized that I have to migrate the notification burndown chart from openshift 2 to 3 until end of September. This will be joyful16:02
mdboothstephenfin: Are the changes to https://review.openstack.org/#/c/507488/ evil?16:02
mdboothI still think it's saner to have unicode everywhere, and convert to something else at the point of use16:03
dansmithcdent: are you working your way up that migration uuid series? if so, I'll hold off pushing that fix you just identified until you have a chance to nit out on anything else16:04
stephenfinmdbooth: No, I'd probably do the same thing16:04
dansmithmeaning, you can identify more nit-ish things and I can fix them, vs. just pushing for nits later16:04
* stephenfin awaits 2020 and the death of Python 2.7 eagerly16:04
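mdbooth's approach (keep text as unicode everywhere and convert only where a native API needs something else) might look roughly like the sketch below. It uses the standard library's ElementTree on Python 3 as a stand-in for the XML library used in the patch, and `to_native_str` is a hypothetical helper name for the "2/3 hack"; the Python 2 branch is shown for illustration and is not exercised here.

```python
import sys
import xml.etree.ElementTree as ET


def to_text(element):
    """Serialize an element to text; encoding="unicode" yields str, not bytes."""
    return ET.tostring(element, encoding="unicode")


def to_native_str(text):
    """Convert text to the interpreter's native str type at the point of use.

    On Python 2 the native str is a byte string, so the text is encoded;
    on Python 3 the native str is already text and is returned unchanged.
    """
    if sys.version_info[0] == 2:
        return text.encode("utf-8")
    return text


domain = ET.Element("domain")
ET.SubElement(domain, "name").text = u"vm-\u00e9"
xml_text = to_text(domain)
native = to_native_str(xml_text)
```

Everything upstream of the binding call handles `xml_text`, and only the final call site deals with the version-specific type.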
cdentdansmith: yup (on phone at the moment though)16:04
dansmithcdent: okay, np16:05
openstackgerritMerged openstack/nova master: Remove SCREEN_LOGDIR from devstack install setting  https://review.openstack.org/50742516:07
*** udesale has quit IRC16:08
*** Apoorva has joined #openstack-nova16:09
*** mvk has quit IRC16:11
*** moshele has joined #openstack-nova16:15
*** armax has joined #openstack-nova16:15
*** yassine has quit IRC16:16
bauzasedleafe: jaypipes: cdent: I thought we would be talking of how we would lead reschedules in https://review.openstack.org/#/c/498830/16:20
*** markmc has joined #openstack-nova16:21
bauzasedleafe: jaypipes: cdent: I see a couple of comments in that spec review, but have you settled down on discussing reschedules as being out of scope for that spec?16:21
dansmithcdent: ah I see you have +1s on most of the rest of the set anyway, so I'll just push16:21
cdentdansmith: i’m in the midst of the last one now16:22
dansmithcdent: ah okay then I'll wait16:22
cdentso give me a couple of minutes (phone call was shorter than expected)16:22
dansmithyeah not trying to rush you16:22
openstackgerritEric Berglund proposed openstack/nova master: PowerVM Driver: config drive  https://review.openstack.org/40940416:23
dansmithI just want to push that review button and get that squirt of dopamine you know :P16:23
edleafebauzas: reschedules will be a different spec16:24
bauzasedleafe: so the spec is literally just for mentioning which object the scheduler will return to conductor ?16:24
rybridgesHey guys. I am using the ocata release and am wondering if there is any way to print the user data associated with an instance that i own from the cli with the openstack client16:25
bauzasedleafe: looks uber too much16:25
bauzasI mean, super heavy16:25
edleafebauzas: since this will be sent over RPC, we needed agreement on it so that we don't find ourselves changing it later16:26
bauzasI'd be up concentrating our minds on how we plan to pass that object16:26
bauzasedleafe: we did a couple of RPC changes that didn't require a spec fortunately16:26
bauzasbut I leave the mic to mriedem16:26
edleafebauzas: the idea is to get it close to correct before we make the change16:26
dansmithbauzas: specs are cheap16:26
dansmithif edleafe wants separate specs, I don't think there's a problem16:27
edleafebauzas: and given the amount of discussion on the Selection object spec, I'd say it was a good thing to do16:27
*** yufei has joined #openstack-nova16:27
dansmithwe should focus on getting the work done and not the process16:27
bauzasdansmith: well, I'd rather then look at code, but okay :)16:27
bauzasyeah that16:27
bauzasedleafe: the thing is, if you want a spec, fine with me, but then be precise about the scope16:28
*** yufei has left #openstack-nova16:28
bauzassince it was a work item, I was expecting more16:28
dansmithmriedem: so I was just looking at this for evac and live migration.. this method _moves_ allocations to the destination, not copies AFAICT: https://github.com/openstack/nova/blob/master/nova/scheduler/utils.py#L222-L22416:29
dansmithmriedem: is that right?16:29
bauzasif the spec isn't targeting to mention how reschedules would be done, fair enough but just make sure you clearly scope that16:29
cdentdansmith: done16:29
dansmithcdent: yes the last one is less done16:30
* cdent nods16:30
mriedemdansmith: copies16:32
*** kristian__ has quit IRC16:32
dansmithmriedem: oh does claim_resources() do the doubling thing?16:33
mriedemit takes the allocations for the instance on the source node, and makes those same allocations for the instance on the dest node16:33
dansmithwhich will erase the source allocation16:33
mriedemit's basically what the scheduler would do,16:33
dansmithbecause... only one consumer16:33
mriedemoh it calls claim_resources,16:33
mriedemso yeah it doubles16:33
mriedemthis is the thing where force=True16:34
mriedemso we don't call the scheduler to double the allocs16:34
mriedemand i said i wanted to move back into the scheduler, but we'd need a skip_filters flag in select_destinations16:34
dansmithokay I didn't think claim_resources was the doubling one, but maybe so, I'll dig a bit16:34
*** jdillaman has joined #openstack-nova16:34
mriedemclaim_resources calls the double stuff method16:34
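The "doubling" discussed above can be sketched roughly as follows. Because the instance is the only consumer, writing a new set of allocations replaces the old one, so a payload that should hold resources on both nodes has to list both providers. This is a minimal illustration under that assumption, not nova's actual claim_resources code; the function name, the provider names and the dict layout are hypothetical.

```python
def double_up_allocations(current, dest_rp_uuid):
    """Build one allocation payload covering both source and destination."""
    # Total up what the consumer already holds on its source provider(s).
    totals = {}
    for alloc in current.values():
        for resource_class, amount in alloc["resources"].items():
            totals[resource_class] = totals.get(resource_class, 0) + amount

    # Copy the existing entries so the caller's dict is not mutated.
    doubled = {rp: {"resources": dict(alloc["resources"])}
               for rp, alloc in current.items()}
    # Add the same amounts against the destination provider.
    doubled[dest_rp_uuid] = {"resources": totals}
    return doubled


# Hypothetical provider names, for illustration only.
source_only = {"source-rp": {"resources": {"VCPU": 2, "MEMORY_MB": 2048}}}
both = double_up_allocations(source_only, "dest-rp")
```

Sending only the destination entry for the same consumer would drop the source allocation, which is the "erase" concern raised above.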
dansmithcdent: can you look at my comment on the DRY thing and see if you buy what I'm sellin' ?16:35
cdentdansmith: I will buy that with an entire whole dollar, if you comment the plan16:38
dansmithcdent: you saw the "when we have an atomic operation we should remove this" right?16:40
*** jangutter has quit IRC16:40
cdentyes, but (unless I missed it) there’s no “this dupe with that other thing but we don’t care cuz”16:42
dansmithI will add more words16:42
cdentI’ll still buy it for a dollar even if you don’t16:42
*** dtantsur is now known as dtantsur|afk16:42
openstackgerritDan Smith proposed openstack/nova master: Make allocation cleanup honor new by-migration rules  https://review.openstack.org/49894816:45
openstackgerritDan Smith proposed openstack/nova master: Pre-create migration object  https://review.openstack.org/49895016:45
openstackgerritDan Smith proposed openstack/nova master: Revert allocations by migration uuid  https://review.openstack.org/49894916:45
openstackgerritDan Smith proposed openstack/nova master: Refactor resource tracker to account for migration allocations  https://review.openstack.org/50641916:45
openstackgerritDan Smith proposed openstack/nova master: Make migration uuid hold allocations for migrating instances  https://review.openstack.org/50642016:45
*** yufei has joined #openstack-nova16:50
*** yufei has quit IRC16:51
openstackgerritSean Dague proposed openstack/nova master: Move ploop commands to privsep.  https://review.openstack.org/49232516:51
*** claudiub has quit IRC16:52
openstackgerritSean Dague proposed openstack/nova master: Move ploop commands to privsep.  https://review.openstack.org/49232516:54
*** derekh has quit IRC16:54
*** trinaths has joined #openstack-nova16:56
*** trinaths has left #openstack-nova16:56
*** trinaths1 has joined #openstack-nova16:57
*** trinaths1 has left #openstack-nova16:57
mriedemnotifications meeting in openstack-meeting-4 in 2 minutes16:58
*** cdent has quit IRC16:58
gibi... and now it is started17:00
*** baoli has quit IRC17:01
*** slaweq_ has quit IRC17:02
*** Swami has joined #openstack-nova17:23
openstackgerritMoshe Levi proposed openstack/nova master: Don't overwrite binding-profile  https://review.openstack.org/50561317:27
*** gbarros has quit IRC17:38
dansmithmriedem: and why do you have locally-deleted instances for this test?17:39
melwittfor local deletes, allocations aren't cleaned up till the compute host heals it17:39
dansmithright, what melwitt said17:39
mriedemmysql> select count(id) from consumers;17:39
mriedem| count(id) |17:39
mriedem|      2002 |17:39
mriedem1 row in set (0.01 sec)17:39
*** cdent has joined #openstack-nova17:39
mriedemmelwitt: there is no compute for these17:40
mriedemthey failed during scheduling17:40
mriedemalthough yeah why would placement have allocations for these...17:40
mriedemstack@devstack:~$ nova list | grep -c ERROR17:40
melwittoh, hm17:40
mriedemso i've got 1000 instances in ERROR state, and 2002 consumers in the api db17:40
*** artom has quit IRC17:41
melwittallocations are written at claim time?17:41
mriedemfrom the scheduler yeah17:41
*** kristian__ has joined #openstack-nova17:41
melwittso that would explain the ones you do have. but I guess your point is why are there more allocation consumers than non error instances17:42
*** kristia__ has joined #openstack-nova17:42
mriedemthat's because i've deleted 1000 over time17:42
mriedemi was hitting messaging timeouts between conductor and the scheduler earlier today, so had 500 in error which i needed to be active, so deleted all of those, restarted conductor and scheduler, and was able to create a single instance17:43
mriedemso tried with 500 more again17:43
mriedemand hit novalidhost on all of those17:43
openstackgerritMerged openstack/nova master: cleanup test-requirements  https://review.openstack.org/50706317:44
*** kristian__ has quit IRC17:45
*** gabor_antal has quit IRC17:46
*** jmlowe has quit IRC17:46
*** lbragstad has quit IRC17:47
*** gabor_antal has joined #openstack-nova17:48
*** ijw has quit IRC17:55
openstackgerritDan Smith proposed openstack/nova master: Make live migration hold resources with a migration allocation  https://review.openstack.org/50763817:55
dansmithjaypipes: cdent: ^ quick stab at the live migrate version of this17:55
dansmithit's probably rough at this point, but worth a look I think17:56
*** vvargaszte has joined #openstack-nova17:58
*** ijw has joined #openstack-nova17:58
*** ijw has quit IRC17:58
*** ijw has joined #openstack-nova17:58
*** baoli has quit IRC18:00
*** baoli has joined #openstack-nova18:01
*** Apoorva_ has joined #openstack-nova18:02
*** vvargaszte has quit IRC18:04
*** Apoorva has quit IRC18:06
*** kristian__ has joined #openstack-nova18:08
*** Apoorva_ has quit IRC18:08
*** kristia__ has joined #openstack-nova18:08
*** Apoorva has joined #openstack-nova18:08
*** moshele has quit IRC18:09
*** kristi___ has joined #openstack-nova18:11
*** lucasxu has quit IRC18:11
*** dave-mcc_ is now known as dave-mccowan18:12
*** kristian__ has quit IRC18:12
*** kristia__ has quit IRC18:14
*** moshele has joined #openstack-nova18:17
*** jmlowe has joined #openstack-nova18:18
*** moshele has quit IRC18:20
*** gbarros has joined #openstack-nova18:21
*** xyang1 has joined #openstack-nova18:24
*** slaweq_ has joined #openstack-nova18:27
cdentdansmith: haven’t had a chance to give it a proper look, but saw a weird when skimming the live migrate thing18:32
dansmithit's returning True-ish which is what I wanted for the functional tests18:33
dansmithso.. working as designed? :)18:33
*** Apoorva has joined #openstack-nova18:33
openstackgerritMerged openstack/nova master: Set the Pike release version for scheduler RPC  https://review.openstack.org/50724518:34
cdentgo python!18:34
mriedemwtf, so i can't create multiple instances, i get novalidhost, but i can create one at a time18:38
melwittare you using multi-create?18:38
mriedemwasn't a problem yesterday18:38
mriedembut i had a bit of a cleaner env yesterday18:38
*** gabor_antal has quit IRC18:39
*** gabor_antal has joined #openstack-nova18:39
melwittmulti-create will reject you if any one of min_count can't be accommodated. so one at a time would work if you're in that situation, if some/most of them fit18:39
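The all-or-nothing behaviour melwitt describes can be pictured with a toy model. This is only a sketch: the function, the exception class and the "free slots per host" structure are hypothetical stand-ins, not the scheduler's real data model.

```python
class NoValidHost(Exception):
    """Raised when a requested instance cannot be placed."""


def place_all_or_nothing(free_slots, count):
    """Place `count` instances, failing the whole request if any one fails."""
    remaining = dict(free_slots)  # work on a copy of the capacity map
    placements = []
    for _ in range(count):
        # Pick the first host that still has room.
        host = next((h for h, free in remaining.items() if free > 0), None)
        if host is None:
            # One unplaceable instance rejects the entire multi-create.
            raise NoValidHost("only %d of %d fit" % (len(placements), count))
        remaining[host] -= 1
        placements.append(host)
    return placements
```

With two free slots, a request for three fails as a whole, while three separate single requests would see the first two succeed.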
mriedemi can just restack this env, but it makes me worry that we aren't properly cleaning up allocations somewhere18:41
dansmithso that's why18:46
mriedemtook 2+ hours to delete 1000 ACTIVE instances18:47
dansmithI do, but I didn't remember all your details18:48
mriedemyeah i was basically trying to get back to clean state before starting today18:48
mriedemso was archiving the db's last night18:48
mriedembtw, bauzas pointed this out before, but we log this way too many times18:49
mriedemSep 26 18:44:37 devstack nova-compute[30351]: DEBUG nova.compute.resource_tracker [None req-992d494e-d328-4204-bcfe-80d926cf0a65 demo demo] We're on a Pike compute host in a deployment with all Pike compute hosts. Skipping auto-correction of allocations. {{(pid=30351) _update_usage_from_instance /opt/stack/nova/nova/compute/resource_tracker.py:1071}}18:49
dansmithmriedem: unrelated, see this: http://status.openstack.org/openstack-health/#/test/nova.tests.functional.test_servers.ServersTestV219.test_description_errors?duration=P3M18:52
*** slaweq_ has quit IRC18:52
dansmithmriedem: I think this test is occasionally taking up to 240s locally when it should be about 8s18:52
*** lucasxu has joined #openstack-nova18:53
dansmithand I think it's because it creates a server that it never cleans up and then abruptly exits where we take down conductor before the compute service finishes waiting on a call or something18:53
dansmithso I have a patch to just make it clean up the server and I _think_ it's working18:53
mriedemthe one weird spike in august is, weird18:53
openstackLaunchpad bug 1719714 in OpenStack Compute (nova) "Excessive logging of "We're on a Pike compute host in a deployment with all Pike compute hosts."" [Medium,Confirmed]18:53
dansmithmriedem: it would have just been ordering reasons18:54
dansmithmriedem: note the rising tail at present too18:54
*** mnestratov has quit IRC18:57
mriedemdansmith: jaypipes: bauzas: https://review.openstack.org/#/c/498947/619:28
mriedemthat test_servers thing is wrong19:28
*** ijw has joined #openstack-nova19:28
*** vvargaszte has quit IRC19:28
mriedemthere are 2 tests for failures during evacuate on the dest19:29
mriedem1. test_evacuate_claim_on_dest_fails - that is testing when the claim fails with ComputeResourcesUnavailable19:29
*** sree has quit IRC19:30
mriedemso please explain how i'm wrong that they are now made redundant in that change19:30
jaypipesmriedem: that test raising TestingException was not useful. Because TestingException isn't what is ever raised by any code.19:30
mriedemit's simulating the virt driver raising the error during rebuild19:30
mriedemAFTER the successful claim19:31
mriedemit could be ProcessExecutionError19:31
mriedemfrom driver.spawn()19:31
mriedemif you like19:31
*** slaweq_ has joined #openstack-nova19:31
mriedemthese 2 tests are testing very specific failures19:31
dansmithmriedem: right but we don't run the claim teardown code in that case19:31
mriedemdansmith: correct, which is why we run the allocation cleanup manually19:32
jaypipesmriedem: if the point of the test (as is in that docstring) is to ensure allocations are cleaned up after a failed rebuild, then the test should raise the exception that would be raised *after* a claim has been made for the new resources.19:32
jaypipesmriedem: Matt, I'm trying to be civil.19:33
*** kristi___ has quit IRC19:35
jaypipesmriedem: but you *would* hit the TestingException "crazy shit"19:41
mriedemso being told i don't understand the test pisses me off19:44
mriedemraising ComputeResourcesUnavailable19:49
*** tbachman has quit IRC19:50
dansmithwhy is this here? https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L281219:50
mriedemsee https://review.openstack.org/#/c/499878/19:51
mriedemplus the comment above it19:51
*** awaugama has quit IRC19:52
mriedemso i don't know what else is going on in this patch, i didn't get that far, i saw the commit message and change to the test and wanted to bring that up since it's approved19:54
*** lucasxu has quit IRC19:57
mriedemthat's the assertion in the ML thread and commit to remove it anyway19:59
mriedemmelwitt: see _to_legacy_group_info ?20:09
mriedemmelwitt: https://review.openstack.org/#/c/469037/20:13
mriedemdansmith: what's the word20:21
melwittbird bird bird bababir bird's the word20:22
*** dave-mccowan has joined #openstack-nova20:23
*** liverpooler has quit IRC20:23
melwittmriedem: https://bugs.launchpad.net/nova/+bug/171973020:30
openstackLaunchpad bug 1719730 in OpenStack Compute (nova) "Reschedule after the late affinity check fails with "'NoneType' object is not iterable"" [Undecided,New]20:30
*** jmlowe has joined #openstack-nova20:30
* efried thought "Grease" is the word20:31
*** ltomasbo has quit IRC20:31
*** jpena|off has quit IRC20:32
efriedmriedem sdague https://review.openstack.org/#/c/488137/ should be ready again20:32
*** cleong has quit IRC20:33
*** penick has quit IRC20:39
*** crushil has quit IRC20:42
*** ijw has quit IRC20:43
*** sahid has quit IRC21:03
*** smatzek has joined #openstack-nova21:04
mriedemdansmith: ^ thus begins another round of these21:07
*** baoli_ has quit IRC21:07
*** tbachman has joined #openstack-nova21:28
pinompute instance lifecycle?21:36
pinoI'm just experimenting, but I'm wondering if this would be best built as part of Nova itself, or separately hook into the lifecycle.21:37
*** mnestratov has quit IRC21:38
jaypipespino: definitely not part of Nova itself, no. apart from writing files to a config drive, Nova doesn't mess with the VM.21:38
openstackgerritMatt Riedemann proposed openstack/nova master: Remove dest node allocations during live migration rollback  https://review.openstack.org/50768721:39
pinojaypipes: ok, fair enough... but shouldn't support for ssh certificates be modelled similar to keypair support?21:39
jaypipespino: honestly, I'm not sure what the diff is between a key pair, with the private part of the pair downloaded to the user and the public part laid down on the VM config drive, and the SSH certificates thing you're describing.21:41
pinojaypipes: I probably gave too much detail. I'm just looking for some hints about how I can hook into the startup workflow and block it until I've configured the VMs SSH the way I want it.21:43
*** r-daneel has quit IRC21:43
*** armax has quit IRC21:43
jaypipespino: I think cloud-init is more what you are looking for?21:44
*** claudiub has joined #openstack-nova21:44
mriedempino: https://docs.openstack.org/nova/latest/user/vendordata.html21:45
mriedemsetup an external rest service that provides metadata to the guest when it's created21:45
mriedemexample https://github.com/openstack/novajoin21:46
*** edmondsw has quit IRC21:46
penickpino: I use SSH CA in my environment, maybe I can help?21:47
pinomriedem: I saw that but wasn't sure it was the right approach. I'll take a closer look, thanks for the example.21:47
penickI think I see what you're trying to do, and I think what you're probably going to want is to build a small webservice to create and sign SSH certificates, then tie that in with the nova vendordata stuff to get injected into the instance on boot21:48
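A small webservice of the kind penick suggests might look roughly like the sketch below. It assumes nova's dynamic vendordata feature POSTs a JSON body that includes "instance-id" and "hostname" keys and passes the JSON response on to the guest; check the vendordata documentation linked above before relying on those names. sign_host_certificate is a hypothetical placeholder, not a real SSH CA call.

```python
import json
from http.server import BaseHTTPRequestHandler


def sign_host_certificate(instance_id, hostname):
    # Hypothetical placeholder: a real service would call its SSH CA here.
    return "PLACEHOLDER-CERT for %s (%s)" % (hostname, instance_id)


def build_vendordata(request_body):
    """Turn the JSON that nova POSTs into the JSON handed to the guest."""
    req = json.loads(request_body)
    return {
        "ssh_host_certificate": sign_host_certificate(
            req["instance-id"], req["hostname"]),
    }


class VendordataHandler(BaseHTTPRequestHandler):
    """Answers POSTs from nova with a JSON document for the instance."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.dumps(build_vendordata(self.rfile.read(length)))
        body = payload.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The handler could be served with http.server.HTTPServer for experiments; a real deployment would also need authentication between nova and the service.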
mriedemit's a wild penick21:48
mriedemi wonder what the keyword is here21:48
penickI identify as feral21:48
pinojaypipes: I'm looking at cloud-init too... but I want my setup script to run without the user's help (they shouldn't have to do any setup).21:48
*** edmondsw_ has joined #openstack-nova21:49
pinoOk, so I have a few topics/approaches to study. Thanks!21:50
*** ijw has joined #openstack-nova21:50
penicknp :)21:51
*** yassine has joined #openstack-nova21:58
openstackgerritEric Fried proposed openstack/nova master: Use ksa adapter for keystone conf & requests  https://review.openstack.org/50769322:10
*** esberglu has joined #openstack-nova22:12
*** sdague has quit IRC22:13
*** burt has quit IRC22:15
*** jaypipes has quit IRC22:16
*** esberglu has quit IRC22:16
rybridgesHey guys, I have a question. I am trying to inject some default user data into every instance while it is provisioning at this location -> https://github.com/openstack/nova/blob/stable/ocata/nova/compute/api.py#L1011  (I am adding an internal patch for this)  In my patch, I create the user data, merge it with any existing user data on the instance, and then try to write the new user data to the22:26
rybridgesdatabase. I am stuck getting it to write into the database. instance.save(22:26
rybridgesbut that does not seem to actually be saving the user data in the instance for some reason22:27
rybridgesit calls build_req.save() in the update_instance() method22:27
*** lyan has quit IRC22:29
openstackgerritMatt Riedemann proposed openstack/nova master: Remove dest node allocations during live migration rollback  https://review.openstack.org/50768722:32
melwittrybridges: doing it that way is a bad idea IMHO. if you're looking to have default data injected into every instance, you should look into the vendordata stuff that was linked earlier22:32
rybridgesto just save the instance and get the user data written to the db22:38
mriedemyou know what's going to complicate your deployment?22:39
mriedemconstantly rebasing your fork, and when we change the internals that it depends on22:39
*** lbragstad has quit IRC22:40
rybridgeswe plan on adding a vendordata service eventually22:40
rybridgesi just want to get a poc working right now22:42
mriedemdansmith: i joked about this, but someone made it a reality http://forumtopics.openstack.org/cfp/details/622:42
openstackgerritDan Smith proposed openstack/nova master: Make allocation cleanup honor new by-migration rules  https://review.openstack.org/49894822:48
openstackgerritDan Smith proposed openstack/nova master: Pre-create migration object  https://review.openstack.org/49895022:48
openstackgerritDan Smith proposed openstack/nova master: Revert allocations by migration uuid  https://review.openstack.org/49894922:48
openstackgerritDan Smith proposed openstack/nova master: Refactor resource tracker to account for migration allocations  https://review.openstack.org/50641922:48
mriedemi think those have wrapped by now22:51
melwittokay. was just curious22:52
mriedemnow i've got 100 ACTIVE instances, with 100 consumers in the api db and 300 allocations,22:52
mriedemSep 26 22:28:37 devstack nova-scheduler[2951]: WARNING nova.scheduler.client.report [None req-af92d5f2-4c99-4231-966e-939e1da04239 demo admin] Unable to submit allocation for instance 5f9f4f7d-8a2f-4fb8-b30a-024ed2e8e49d (409 {"errors": [{"status": 409, "request_id": "req-cda80554-6083-45b0-87bf-9e9c9924213f", "detail": "There was a conflict when trying to complete your request.\n\n Inventory changed while attempting to alloc22:53
mriedem Another thread concurrently updated the data. Please retry your update  ", "title": "Conflict"}]})22:53
melwittum, so is that a new scheduling race condition that has to be resolved with reschedules? to replace the old claim race?22:55
melwittyeah, but I thought after "claims in the scheduler" we don't have concurrent request race problems that get kicked out to be retried22:56
dansmithbecause you're scheduling so many things to one compute, and it only retries a certain number of times22:56
mriedembut why is that logged 3 times?22:57
mriedembecause ^ we retry 3 times22:57
melwittso this is like the claim race except worse in that the things should have succeeded but can't. maybe it won't happen in real life because by the time that retry limit would be hit, the compute host would already be rejecting claims22:57
mriedemonly thing that could be changing inventory is the RT?22:58
dansmithso we should be only making one call to scheduler I guess22:58
*** chyka has quit IRC23:01
mriedemfilters i suppose yeah23:01
dansmithbut yeah, people really hate this behavior that we tell them they can boot 500 things and later fail 20% of them because we suck23:01
dansmiththis is much better23:01
dansmithso I guess my thinking about why these are changing was because of concurrent requests, but if you're just doing one boot, I'm not sure23:02
melwittyeah, that's what I was trying to think in real life the retries would have a much better chance of succeeding bc not so many concentrated on one host, right23:02
dansmithmaybe the compute is tickling the inventory in some way, such that during 500 of them it gets changed23:02
melwittbecause the claim would detect the host full and then it would move on to another host23:03
dansmithmelwitt: yeah, I'm pretty confident it's related to the having of one compute and 500 instances23:03
melwittthe thing I'm wondering is if the host list order is consistent, could this happen in real life because they'll all try the first host in the list first. but as things are claimed, that host will fill up and then no longer be considered23:04
mriedemif we're not shuffling the hosts a bit23:04
melwittso won't get bombarded that badly?23:04
dansmithmriedem: I want to know what the instance uuid is for each of those lines23:05
*** yamamoto has joined #openstack-nova23:05
dansmithmriedem: like maybe it retried once for a few instances, then three times for one and bailed the whole process23:06
dansmithbecause any one of them gets to three retries and it should fail and stop the num_instances loop I think23:06
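The behaviour dansmith is guessing at can be sketched as a bounded retry inside a per-instance loop. The limit of three attempts comes from the discussion above; everything else (function names, exception classes, the put_allocation callable) is hypothetical and not the scheduler report client's actual code.

```python
class Conflict(Exception):
    """Stands in for a 409 response from the allocation call."""


class NoValidHost(Exception):
    """Raised when an instance could not be claimed within the retry limit."""


def claim_with_retries(put_allocation, instance_uuid, max_attempts=3):
    """Try the allocation up to max_attempts times; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            put_allocation(instance_uuid)
            return attempt
        except Conflict:
            continue  # someone else updated the provider; try again
    raise NoValidHost(instance_uuid)


def claim_batch(put_allocation, instance_uuids, max_attempts=3):
    """Claim each instance in turn; the first exhausted retry stops the loop."""
    claimed = []
    for instance_uuid in instance_uuids:
        claim_with_retries(put_allocation, instance_uuid, max_attempts)
        claimed.append(instance_uuid)
    return claimed
```

Under this model a handful of instances can each log one or two conflicts and still succeed, while a single instance that hits the limit fails the whole batch.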
mriedemwe don't have the instance uuid in that message23:07
dansmith...that's what I'm asking for yeah23:07
mriedemok, adding23:07
mriedemthat explains the 6 log messages23:07
*** thorst has quit IRC23:08
dansmithwhile you're re-testing, I'd like to suggest that we not hold up the instance list stuff on this if it takes too much longer23:08
dansmithwe can revert it if there's a real performance regression pretty easily, and we're delaying soak-time for any non-perf-related bugs we might be able to resolve just from our own infra workload23:09
mriedemwell, i wasted most of my day on this 500 novalidhost thing,23:09
mriedemnow i know i can create in 100 chunks23:09
dansmithand we know we're not regressing the performance of the infra jobs that have run against this23:09
dansmithI'm actually not sure why we'd be hitting concurrent updates during an allocation event,23:25
dansmithgiven that we're not passing rp_generation23:25
*** sree has joined #openstack-nova23:26
dansmitheither the allocation fits or doesn't23:26
melwitthm, yeah23:27
dansmithit must be that placement doesn't hide transaction commits from us23:27
dansmithbased on the commit that added it23:27
melwittwhat does that mean? placement not hiding transaction commits23:29
*** sree has quit IRC23:30
melwittah, right. that's what I was thinking of23:31
dansmithplacement should really retry those for us server-side I would think, for an allocation type request23:31
melwittyeah, I was thinking the same23:32
mriedemi'm not sure why inventory would change23:32
mriedemthe update_available_resource periodic will post inventory, but only if it changes23:33
dansmithso any two allocations can conflict23:33
melwittI'm +1 on the idea of retrying server-side23:33
melwittas far as how to tune how many retries23:35
melwittto allow23:35
dansmiththis is also a highly synthetic scenario with a "virt driver" that doesn't get looked at much.. it could be doing something to re-stab inventory for no reason or something23:35
mriedemfor each instance, we go through the filters23:36
*** acormier has joined #openstack-nova23:36
mriedemand doesn't the scheduler have some kind of tracking on the HostState objects themselves for chosen hosts?23:36
edleafemriedem: that error message is poorly worded. It should be something like "available inventory has changed"23:36
melwittoh, the "inventory changed" yeah, I don't know anything about that. so that means it wasn't an allocation writing conflict?23:36
dansmithmelwitt: it's all related23:36
dansmithedleafe: that's not really accurate, AFAICT, since allocations will cause the generation to increase23:37
dansmithedleafe: and thus it could be nothing changed with inventory to cause that23:37
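dansmith's point, that a conflict can occur without the inventory itself changing, can be shown with a toy generation check. This is an in-memory sketch of optimistic concurrency in general; the class and method names are hypothetical and it is not placement's implementation.

```python
class ConcurrentUpdate(Exception):
    """The provider changed since the caller last read it."""


class Provider:
    def __init__(self, total):
        self.total = total        # inventory: never modified below
        self.used = 0
        self.generation = 0

    def allocate(self, amount, seen_generation):
        # Reject writers whose view of the provider is stale.
        if seen_generation != self.generation:
            raise ConcurrentUpdate()
        if self.used + amount > self.total:
            raise ValueError("does not fit")
        self.used += amount
        self.generation += 1      # every successful allocation bumps it


rp = Provider(total=10)
gen_a = rp.generation             # writer A reads generation 0
gen_b = rp.generation             # writer B reads generation 0
rp.allocate(1, gen_a)             # A succeeds and bumps the generation
try:
    rp.allocate(1, gen_b)         # B is stale even though total is unchanged
    conflicted = False
except ConcurrentUpdate:
    conflicted = True
```

Here the second writer's allocation would have fit, which is why retrying is reasonable and why an "inventory changed" message reads as misleading.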
dansmithedleafe: okay, I wouldn't word it that way for clarity, but okay :)23:38
edleafeI wouldn't word it that way either, but I was guessing the author's intent23:38
edleafe*cough* cdent *cough*23:39
dansmithmriedem: that's the info that we use to generate it, yes23:39
melwittyeah, if it can happen without inventory (total possible capacity) changing, then that error message is confusing to me23:39
edleafemelwitt: yeah, it's worded very poorly23:40
dansmithmriedem: I'd look to see if the compute is hitting placement /inventory ever after the first go, and maybe check the nothing-changed short-circuit to make sure we're never going through it23:40
melwittI dunno, I think I know just enough for it to be confusing. for an end user, it might not be confusing23:40
mriedemthe operator will see it23:41
dansmithPUT.*invent should be all you need23:48
*** penick has joined #openstack-nova23:48
mriedemdoesn't work23:49
mriedemah, well,23:51
mriedemPUT.*alloc works23:51
mriedemso it probably just wrapped23:51
mriedemand it's not updating inventory, as it shouldn't23:51
mriedemgot my 1000 instances now, so will do the test stuff once i'm done with dinner23:52
dansmithso I'd also check to make sure compute isn't doing the ocata healing during boot or something like that23:53
