21:00:01 #startmeeting nova_cells
21:00:02 Meeting started Wed Jan 24 21:00:01 2018 UTC and is due to finish in 60 minutes. The chair is dansmith. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:05 The meeting name has been set to 'nova_cells'
21:00:12 o/
21:00:55 * dansmith pokes belmoreira melwitt mriedem
21:01:03 o/
21:01:09 o/
21:01:19 o/
21:01:22 sorry I thought maybe we were talking in #openstack-nova instead of meeting
21:01:25 #topic bugs
21:01:39 I think we already said there are no new major bugs to discuss, right?
21:02:01 right
21:02:12 #topic open reviews
21:02:12 We just ran into something... not sure if it's a bug
21:02:16 anything here?
21:03:01 nope,
21:03:01 belmoreira: okay let's just pile all your stuff into open discussion, which I'm assuming we can just roll to
21:03:05 just rechecking the alternate host patch
21:03:06 unless anyone has anything here
21:03:12 cool
21:03:21 #topic open discussion
21:03:36 belmoreira: okay, first off, what's the issue you just hit?
21:04:35 currently our instance_mappings in the nova_api DB doesn't have any cell mapping (it's null)... when migrating to cellsV2 we will need to do the correct mapping. However, the table is already populated and it fails to update the correct cell (change null to the right one)
21:04:56 we are in Newton and nova_api DB is required
21:05:14 are we missing something?
21:05:43 you're using newton nova-manage to do the instance mappings and it's not filling that field, is that right?
21:06:04 with --cell_uuid ?
21:06:35 We were testing this week with master and it didn't update the cell_uuid
21:06:44 tssurya_?
21:06:54 yes, we tested this with master
21:07:03 and we had provided the --cell_uuid
21:07:35 that seems really odd, I don't know why that would be
21:07:39 probably because of the marker?
21:07:44 saying you've already mapped everything
21:08:05 yea that or the duplicate entry check ?
21:08:09 does map_instances use a marker like that?
21:08:11 * dansmith looks
21:08:14 it does
21:08:27 ah, then just nuke that and try again I guess
21:08:37 cell-uuid is required for map_instances, so i'm not sure how the instance mappings would have a null cell mapping entry
21:08:57 unless it was a bug in the CLI in newton?
21:09:21 in Newton we don't have any cellV2 defined
21:09:25 idk
21:09:26 do you see an instance mapping with INSTANCE_MIGRATION_MARKER as the project_id?
21:09:46 however nova_api DB is required and new instances have an entry in instance_mappings
21:09:48 i don't see how you could run map_instances without a cell_uuid
21:09:51 to a valid cell mapping
21:10:26 okay so to test the scenario, we ran it on master with a --cell_uuid option
21:10:28 certainly not based on the master code, but maybe it was different in newton
21:10:39 we manually set the value in the DB as NULL
21:10:55 to recreate the scenario
21:11:09 mriedem we will run "map_instances" only when upgrading to Pike
21:11:19 you need it in ocata too
21:11:27 even though it's all one cell
21:11:48 ok i guess you'll have to find a valid recreate for this,
21:11:53 dansmith, true
21:11:53 that doesn't involve manually messing with the db
21:11:58 b/c i don't know how this would happen
21:12:12 are you using simple_cell_setup at all?
21:12:20 I hope not :)
21:12:27 that's toootally not for situations like this :P
21:12:42 to be clear, map_instances isn't going to fill in a NULL field for you. it will only create new instance_mappings records. right?
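
For context, the create-only, marker-based behaviour described just above can be sketched roughly as follows. This is a minimal, self-contained illustration in Python, not nova's actual nova-manage code; the data shapes and names are made up for readability.

    # Illustrative sketch only, not nova's real code: map_instances creates *new*
    # instance_mappings rows and records a marker row so it can resume in chunks,
    # but it never updates an existing row whose cell mapping is NULL.
    MARKER_PROJECT_ID = 'INSTANCE_MIGRATION_MARKER'  # sentinel project_id of the marker row


    def map_instances(cell_uuid, cell_instances, api_mappings, max_batch=50):
        """Map unmapped instances from one cell into a dict standing in for
        the nova_api.instance_mappings table (keyed by instance uuid)."""
        marker_row = api_mappings.get(MARKER_PROJECT_ID)
        marker = marker_row['instance_uuid'] if marker_row else None
        started = marker is None
        processed = 0
        last_uuid = marker
        for inst in cell_instances:
            if not started:
                # skip everything up to and including the recorded marker
                started = inst['uuid'] == marker
                continue
            if inst['uuid'] not in api_mappings:
                # only brand-new rows are created; an existing row with a NULL
                # cell mapping is left untouched, which is the behaviour hit here
                api_mappings[inst['uuid']] = {'instance_uuid': inst['uuid'],
                                              'project_id': inst['project_id'],
                                              'cell_uuid': cell_uuid}
            last_uuid = inst['uuid']
            processed += 1
            if processed >= max_batch:
                break
        if last_uuid is not None:
            # remember where to resume; the "--reset" idea discussed later in the
            # meeting amounts to deleting this marker row and starting over
            api_mappings[MARKER_PROJECT_ID] = {'instance_uuid': last_uuid,
                                               'project_id': MARKER_PROJECT_ID,
                                               'cell_uuid': None}
        return processed
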
21:13:02 but it is the same thing. Then we need to map the instances again to the real cells when upgrading to Pike and cellsV2
21:13:14 melwitt : yes
21:13:16 in ocata everything will be mapped to only one
21:13:23 so the question is, how did you get any instance_mappings records with a NULL field in the first place
21:13:42 i think i understand what belmoreira is saying now,
21:13:46 ah yeah I guess that's right
21:13:49 in ocata, they have like 50 child cells,
21:13:52 melwitt newton without any cellV2 defined
21:13:58 but they really only have 1 global cellsv2 cell right?
21:14:07 so if you map instances in ocata, they go to the global single cellsv2 cell,
21:14:09 but in pike,
21:14:16 you want to map the instances in the separate child cells,
21:14:24 but they are already mapped to the global thing
21:14:25 mriedem yes
21:14:26 correct?
21:14:27 ok
21:14:28 well,
21:14:30 oh, I understand now too
21:14:35 yeah, but at that point you can just nuke all the mappings and recreate them right?
21:14:45 it wasn't NULL but it was all to one global cell
21:15:22 dansmith: yeah, that or provide some --overwrite option to the CLI
21:15:29 to create or update
21:15:46 dansmith we were thinking on that as well. Just asking if there was a different way, or if the nova-manage instance mapping could just update the cell if it already sees the instance
21:15:48 dansmith : we already thought of nuking and recreating
21:15:53 yeah I actually thought the code would create or get-and-then-save, but it doesn't
21:16:44 updating the cell of an instance is kindof a dangerous thing if it's wrong, which might be why this doesn't do that now, but I can't think of any real reason to not do that
21:16:44 the fake marker in the instance_mappings table will also cause problems
21:16:58 if you ended up with two cells pointing at the same db, for example, you could re-map and mess stuff up
21:17:02 oh right
21:17:24 melwitt all instance_mappings created in Ocata have the cell_uuid of the global but the old entries from newton are null
21:18:06 so we could allow delete_cell --force to remove instance mappings too,
21:18:20 and then when they go to delete their global cell, it will nuke the instance mappings too
21:18:27 i think that's probably safer
21:18:38 belmoreira: oh, odd. there must be a bug in newton then. we need to investigate that
21:18:40 we have --force right now for hosts but not instance mappings
21:19:09 note that the marker mapping has a null cell mapping field
21:19:12 dansmith : yes mappings go only after archive
21:19:22 so delete_cell --force wouldn't remove the marker either...
21:19:39 yeah the marker is going to be a problem regardless
21:20:09 we probably need a map_instances --reset
21:20:17 which will avoid the marker
21:20:24 don't know what the marker is. Will check tomorrow
21:20:34 map_instances can run in chunks
21:20:36 until complete,
21:20:41 so there is a marker for knowing where to start next
21:20:57 mriedem thanks
21:21:03 so, belmoreira tssurya_, can one of you file bugs for these two new flags to the cli?
21:21:16 dansmith : sure
21:21:32 okay so is that all we need to discuss on this map_instances thing?
21:22:03 From us I think so
21:22:19 okay, so what else did you have? the cell-is-down thing?
21:22:21 just to be sure the new flags you mean are reset and force right ?
21:22:32 1. reset marker on map_instances
21:22:33 tssurya_: for map_instances and delete_cell respectively, yeah
21:22:39 2. remove instance mappings for a deleted cell
21:22:41 dansmith : got it
21:22:59 mriedem thanks
21:23:00 tssurya_: we already have --force on delete_cell but it's only active for hosts, so it needs to do instance_mappings too
21:23:20 when moving to Pike with cellsV2 we need to have a solution for a cell going down.
21:23:21 I'm quite sure we previously said something about never wanting to let you nuke a cell with instances in it, but alas, whatever :)
21:23:36 dansmith : yea I remember that patch from takashi
21:24:15 belmoreira: well the solution probably depends on what you need for functionality out of the API
21:24:42 and it depends on what the API people want in the way of communicating a partial result
21:25:04 understand
21:25:06 yeah because i think all we have in the api db is the instance uuid, cell and host
21:25:27 pretty much yeah
21:25:46 for us the uuid will be enough
21:25:57 so I expect we'll just need to return {uuid: $uuid, state: UNKNOWN} or something for each instance
21:26:05 but I can't hide instances from the users
21:26:16 but then the api layer will need to return 234: only part of your stuff was found, yo.
21:26:28 is looking into the request_specs a crazy idea?
21:26:33 yes
21:26:36 there is a bunch of stuff we'll need to return UNKNOWN for
21:26:37 :)
21:26:39 :)
21:27:21 crazy yes, but it could work... just when a cell is down
21:28:05 i guess i forgot about request specs
21:28:11 I don't know how much we could get out of reqspec that would be accurate, so I'd have to see,
21:28:22 but reqspec requires de-json'ing every instance
21:28:31 which would be a lot of CPU overhead to incur when a cell is down
21:28:47 and we couldn't get all the info out of it that we need I think
21:28:53 not so bad for GET on a specific instance,
21:28:58 but for listing instances, yes definitely
21:29:05 and I'm also not sure it'll always be accurate.. I can't remember if we update reqspec after a resize, for example
21:29:24 the accuracy of reqspecs is definitely dubious
21:29:26 since resize happens in the cell late, I think we don't
21:29:34 * mriedem refers to all of the TODOs in the conductor code
21:29:42 yeah, reqspecs are a big mess
21:30:32 looks like we do update the reqspec for the new flavor during resize,
21:30:34 in conductor
21:30:43 and for rebuild if you're changing the image, in the api
21:31:01 and if the resize fails? I guess we have a chance to fix it when you revert or confirm?
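
A rough sketch of the "partial result" idea being discussed: results from reachable cells come back as usual, while instances mapped to an unreachable cell are padded out to a minimal record built from what the API DB has (uuid, cell and host). The names and data shapes below are illustrative stand-ins, not the eventual implementation.

    # Illustrative sketch of the partial-listing idea above; not nova code.
    DOWN_CELL = object()  # stands in for a scatter-gather "cell did not respond" sentinel


    def list_servers(cell_results, mappings_for_down_cell):
        """Merge per-cell results, faking minimal records for unreachable cells.

        cell_results maps cell_uuid -> list of full server dicts, or DOWN_CELL
        if that cell did not respond. mappings_for_down_cell maps cell_uuid ->
        list of (instance_uuid, host) pairs taken from the API DB, which is all
        that is available when the cell database cannot be reached.
        """
        servers = []
        partial = False
        for cell_uuid, result in cell_results.items():
            if result is DOWN_CELL:
                partial = True
                for instance_uuid, host in mappings_for_down_cell[cell_uuid]:
                    # uuid, cell and host are all the API DB can provide, so
                    # everything else has to be reported as unknown
                    servers.append({'id': instance_uuid,
                                    'status': 'UNKNOWN',
                                    'host': host})
            else:
                servers.extend(result)
        # the API layer still needs some agreed way to signal "partial result"
        return servers, partial
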
21:31:27 yeah i don't see anything about fixing the reqspec if the resize fails, or you revert
21:31:59 anyway, this is about listing and I think listing and de-jsoning all that stuff for a whole cell is a bit of a problem
21:32:19 so one option I guess,
21:32:40 is to just fake out anything we don't have and go forward with that
21:32:48 knowing it's a bit of a lie, but we could do that fairly quickly
21:33:03 yeah i think we're going to have to start with the quick and dirty options so we have *something* if a cell is down
21:33:05 I dunno what to say about things like the flavor
21:33:26 right, well, I thought that's what tssurya_ was going to do
21:33:27 so we had something to argue over :)
21:33:33 I'm happy with "not available"
21:33:47 same for VM state, and all the others
21:33:51 and then if we did try to use the requestspec for a single GET /servers/id, we then have to assess how much of the response could be wrong, and if we're going to spend a bunch of time updating code to make sure the reqspec is correct because it's now used in the API
21:33:56 so we currently have a state of "unknown" for if the host is down I think
21:34:03 but we still have things like flavor at that point
21:34:25 mriedem: well, if we lie in the list, then returning different stuff on GET is kinda messy
21:34:44 dansmith : yes that's what we have started with, we have faked stuff, but yes stuck at fields like flavor
21:34:44 if we return a shell only in the list somehow, then the GET having details seems a little more legit to me
21:34:57 dansmith: i don't mean lie in the list,
21:35:00 tssurya_: ack, makes sense
21:35:16 list would just return uuid and UNKNOWN for everything else,
21:35:31 well, that's not going to work with our schema I'm sure
21:35:39 with the uuid, you could at least get some information for a specific instance using the reqspec, but that could be all wrong too
21:35:42 we can't return UNKNOWN for a flavor that should be a dict
21:35:53 right, we'd have to use something else that means unknown for non-strings
21:36:04 None or {} or [] or -1 or whatever
21:36:13 or we fill out a flavor object with all UNKNOWN properties,
21:36:22 but we have no sentinel for integers
21:36:30 -1 typically means unlimited :)
21:36:36 which is terrible,
21:36:47 and which will conflict for flavor things that actually use -1 and 0 and stuff
21:37:04 -MAX_INT :)
21:37:05 but that's fine, we can start with that and see how it goes
21:37:15 they can at least use that locally while we argue
21:37:22 cross reference with the flavors table?
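
To make the sentinel problem concrete, one possible shape for a faked-out flavor is sketched below. This is purely illustrative (the field names are a guess at the embedded-flavor keys, not a settled design), and it shows why the integer fields are the awkward part.

    # Illustrative placeholder for the flavor of a server in a down cell.
    # String fields can carry "UNKNOWN", but the integer fields have no safe
    # in-band sentinel: -1 already reads as "unlimited" in places and 0 is a
    # legitimate value, so None is used here and the response schema would
    # have to tolerate it.
    UNKNOWN_FLAVOR = {
        'original_name': 'UNKNOWN',
        'vcpus': None,
        'ram': None,
        'disk': None,
        'ephemeral': None,
        'swap': None,
    }
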
21:37:26 melwitt: can't
21:37:49 we don't know which flavor to look up in there, unless we decode the reqspec, which as pointed out, may be stale
21:38:00 (and it's a lie since the flavor might have been redefined)
21:38:17 we could pull the first flavor and just say everything in the cell is flavors[0] :P
21:38:36 the reqspec has a serialized version of the flavor from when the instance was created,
21:38:37 hehe
21:38:39 like the instance itself
21:38:53 ah yeah so if we're going to decode that we could just use that lie instead of making up a new one :)
21:39:13 tssurya_: so I think you just need to post something, anything, document the things you know are controversial, and then let's go from there
21:39:21 that was the point of the straw man
21:39:35 sure I have been playing around with the above stuff
21:39:36 we can point api people to it and they can say "I hate this so much" and then we can ask for alternatives :)
21:39:47 on how to go about with the various fields
21:40:06 great, we will continue with this approach
21:40:07 dansmith: are you saying post up a rough POC patch?
21:40:15 yeah
21:40:19 melwitt: yes,
21:40:27 that's the thing I'm saying we need to discuss (a strawman)
21:40:32 "post up" isn't that gang slang?
21:40:44 okay. so yeah, tssurya_ if you can post up a very rough draft POC patch to show a way we can do it, we can discuss it there more concretely
21:40:45 cool then, will open a WIP
21:41:09 gang slang? lol.
21:41:12 https://www.urbandictionary.com/define.php?term=post%20up
21:41:23 "propose"
21:41:32 time to watch the wire again
21:41:39 another thing we got stuck at was this - https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/extended_volumes.py#L50
21:42:04 even if we manage to fill fake stuff and come out,
21:42:25 again the connection failure stuff comes up
21:42:25 that needs to be converted to use scatter/gather and then it can take a similar approach
21:42:40 dansmith : ok
21:42:44 mriedem: lol, nice. I had never heard of it before
21:42:55 tssurya_: what's the issue? that we don't know which volumes are attached to which servers b/c we can't get BDMs?
21:43:19 yea
21:43:23 we could use cinder...
21:43:46 but i don't know if there is a way to list volumes by instance_uuid in cinder as a filter, there might be
21:43:50 basically that code again tries to query the DB which is down of course
21:43:53 that's N calls to cinder for N instances right?
21:44:04 if it's a list, i'm just saying for a show
21:44:10 if we do anything
21:44:26 that is only for list
21:44:28 again, start small and just ignore the down cell and return []
21:44:47 agreed
21:44:48 dansmith: i don't think so...
21:44:52 +1
21:44:58 https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/extended_volumes.py#L40
21:45:12 mriedem : ok :d
21:45:21 mriedem: doesn't use that helper
21:45:28 ah
21:45:32 only detail does, and it takes a list of servers and iterates over them
21:46:17 the show needs to be graceful about not failing the whole show there though because it tries and fails to do the list
21:46:28 the context is targeted there, so it needs to catch that and not explode
21:46:34 else the returning of fake things isn't going to work
21:48:06 tssurya_, belmoreira: okay what else?
21:48:16 about quotas...
21:48:29 What we are thinking is if we can't use the request_specs to calculate the quota of a project (challenging!!!) we will block the creation of new instances if the project has instances in cells that are down.
21:48:35 makes sense?
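
A rough sketch of the quota workaround as proposed here: if, during the quota check, a cell does not respond and a workaround option is enabled, refuse new builds for any project that has instance mappings pointing at the unreachable cell. Everything below is an illustration with made-up names (the option, helpers and data shapes are hypothetical), not nova code.

    # Illustrative sketch of the proposed workaround; names are hypothetical.
    DOWN_CELL = object()  # stands in for a scatter-gather "cell did not respond" sentinel


    class DownCellRefusal(Exception):
        """Raised when a build is refused because a relevant cell is down."""


    def count_instances_for_quota(project_id, per_cell_counts, projects_by_cell,
                                  block_on_down_cell=True):
        """Best-effort instance count for the quota check when cells may be down.

        per_cell_counts maps cell_uuid -> instance count for the project, or
        DOWN_CELL if that cell did not respond. projects_by_cell maps
        cell_uuid -> set of project_ids that have instance_mappings there,
        which the API DB can answer even while the cell is unreachable.
        """
        total = 0
        for cell_uuid, count in per_cell_counts.items():
            if count is DOWN_CELL:
                if block_on_down_cell and project_id in projects_by_cell[cell_uuid]:
                    # the project has instances we cannot count, so refuse the
                    # build rather than risk overshooting its quota
                    raise DownCellRefusal('cell %s is unreachable; refusing new '
                                          'builds for project %s'
                                          % (cell_uuid, project_id))
                # otherwise best effort: skip the down cell
                continue
            total += count
        return total
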
21:48:59 that seems like a totally legit workaround to me
21:49:24 it needs to be configurable, in [workarounds]
21:49:25 as not everyone will want that of course
21:50:07 that seems fine as a start to me also
21:50:16 yeah, seems like that could be a good way to best-effort quota while cells are down
21:50:27 although,
21:50:36 when you go to create a server, how do you know any cells are down?
21:50:40 we have to know if cells are down when..
21:50:41 yeah that
21:50:43 since we don't have any status for cell mappings?
21:50:56 but
21:51:09 the scatter-gather will return sentinels for no response etc at least
21:51:11 the scheduler has to get the host lists from all the cells to start, right?
21:51:24 although not with placement anymore I guess.. not reliably overlapping the down cell
21:51:26 (during the quota check)
21:51:30 unless you're forcing a specific host, yeah
21:51:39 melwitt: ah right because we do the quota check
21:51:45 sweet
21:51:47 but the quota will be checked
21:52:04 yep, totes
21:52:18 so if the quota check scatter gather returns sentinels, fail?
21:52:29 or, check to see if that project has instances in a downed cell?
21:52:29 ...and this flag is asserted
21:52:52 mriedem: yeah, if sentinels, then go on to check on instances in the down cell
21:53:01 it's almost like we need a service group API heartbeat for cells...
21:53:03 which we can do by checking mappings
21:53:15 because our other servicegroup heartbeat thing works so well?
21:53:20 totally
21:53:23 wouldn't it be if sentinels then fall back to request_specs if workaround?
21:53:25 if it were using zk
21:53:28 either way, we'd have to have one thing in the cell dedicated to pumping that back up, which is a problem
21:53:42 melwitt: no, not using reqspecs at all
21:54:08 melwitt: if we do the quota check and find downed sentinels, then we quickly count the number of instances owned by the project in those downed cells
21:54:11 oh. what do you mean by "check on instances in down cell"
21:54:13 if that number is nonzero, then we refuse the build
21:54:14 dansmith: i think i'm just proposing some kind of periodic at the top (scheduler?) that tries to target the cells every minute or something
21:54:18 oh, I see
21:54:44 for cores and ram you'd need request_spec
21:55:08 mriedem: we have no single thing that can do that, we can have multiple schedulers now, and network partitions may make that not so straightforward
21:55:14 this reminds me that we said in sydney we'd add a type to the allocations / consumers table in placement...and didn't
21:55:17 er, nevermind. I think I'm not following the "nonzero, refuse the build". need to think more
21:55:31 melwitt: we wouldn't care about cores and ram, because they have instances in the downed cell so we just punt
21:55:42 got it
21:55:46 mriedem: we said we would and then got into a big argument
21:55:59 mriedem: and apparently there was a lot of disagreement on it
21:56:04 and jay wasn't there
21:56:14 * mriedem adds it back to the ptg etherpad
21:56:21 yeah, we need the type to be able to use placement for the quota stuff
21:56:26 anyway
21:56:36 tssurya_: belmoreira: I say propose something for that, yeah
21:56:46 dansmith : yep
21:56:55 yes, will do
21:56:59 sweet
21:57:06 is there anything else? we're about out of time
21:57:17 I guess that's all for now ...
21:57:42 okay, anything else can go back to the main channel
21:57:45 last call....
21:57:46 thanks a lot
21:57:51 ...for alcohol
21:58:03 thanks!
21:58:05 * dansmith said that for mriedem's benefit
21:58:10 #endmeeting