22:01:24 #startmeeting nova_cells
22:01:25 Meeting started Wed Nov 26 22:01:24 2014 UTC and is due to finish in 60 minutes. The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:01:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:01:28 The meeting name has been set to 'nova_cells'
22:01:32 https://wiki.openstack.org/wiki/Meetings/NovaCellsv2#Next_Meeting
22:01:35 :-)
22:01:55 I've carried over the agenda from last week, but we can just touch on everything quickly
22:02:10 #topic manifesto
22:02:11 #help
22:02:39 I know dansmith has been working on the manifesto, but it isn't up for general feedback yet
22:02:51 I'm cool with dropping it here if you are
22:03:03 dansmith: sure
22:03:08 https://etherpad.openstack.org/p/kilo-nova-cells-manifesto
22:03:51 if people have general feedback, please add it to a section at the bottom or something
22:03:53 I know leifz is also interested in this and has sent some ideas to me which might be worth incorporating
22:04:09 cool
22:04:33 #link https://etherpad.openstack.org/p/kilo-nova-cells-manifesto
22:04:39 \o
22:04:45 dansmith: I can share directly, I just wanted to sanity check before... it's more big picture stuff.. I'll find a spot to add probably Friday.
22:05:05 cool
22:05:28 good stuff by the way dansmith :-)
22:05:28 I will say, for those that are looking at it,
22:05:44 we specifically didn't put in the solutions to all the problems we have, or even document all the problems
22:06:02 we know there are lots of challenges to solve, but this was supposed to just be documentation of the idea/goal
22:06:07 right, I actually pushed back against that a little
22:06:25 we can create another page for grievances or something and for solution brainstorming
22:06:26 It's less solutions and why people should care IMHO
22:06:30 I didn't want this to need editing after we had implemented a couple of specs
22:06:38 dansmith: which kind of deliverable do you expect ?
22:06:43 dansmith: but should we sort out almost all technical queries before going ahead?
22:06:59 vineetmenon_: those should be sorted out in specs I think
22:07:05 yar
22:07:16 bauzas: not sure what you mean
22:07:23 alaski: fair enough
22:07:37 agree that specs are for problem/solution details
22:07:48 dansmith: I mean, how long can I review this manifesto ?
22:08:06 bauzas: as long as you want? :)
22:08:22 bauzas: as soon as alaski stops this meeting, I'm going away for four days, so .. :)
22:08:33 dansmith: ok, so how the other people will know about the manifesto ? :)
22:08:44 dansmith: so I can keep you from that indefinitely? :)
22:08:57 bauzas: it will eventually get proposed to devref in Nova
22:09:02 alaski: ack
22:09:02 dansmith: I'm reading it... very good stuff
22:09:11 anyway,
22:09:15 let's not read it together live,
22:09:22 we can chat again next week
22:09:23 agreed
22:09:27 alaski: no, that's lower bound :)
22:09:40 :)
22:09:51 we'll come back to this next week
22:09:58 #agreed
22:09:59 operators should be actively informed as well
22:10:02 #action anyone interested please read and comment on the manifesto
22:10:38 belmoreira: good point. We can circulate this on the operators list for feedback as well
22:11:14 but I'd like for that to happen once we've done a first pass on it
22:11:26 yeah
22:11:54 alaski: circulate it next week, then in the mailing list
22:12:01 eh, EU people don't have Thanksgiving, so you can go on vacation while we review the manifesto ;)
22:12:26 bauzas: :)
22:12:51 yeah, you poor folks :(
22:13:04 although a whole holiday dedicated to eating is very american :)
22:13:09 vineetmenon_: let's see how it looks when we check back next week
22:13:17 :)
22:13:45 #topic specs
22:14:12 unfortunately there's not a great link to provide here, though I can link them all separately
22:14:21 +1
22:14:26 but there are 3 specs now I think
22:14:42 is there a Gerrit topic ?
22:14:43 https://review.openstack.org/135424 https://review.openstack.org/135644 https://review.openstack.org/136490
22:14:58 bauzas: there isn't since they're associated with different bps
22:15:06 alaski: yeah.. 1 about migration, 2 about cells. All of then from you
22:15:06 alaski: oh right
22:15:31 %s/then/them
22:15:54 the topic of handling deleted rows has come up and started a ML discussion
22:15:55 at least all of them are having dependencies, so that's easy provided you have the new Gerrit UI
22:16:10 bauzas: yeah, that's helpful
22:16:23 alaski: sounds like you have consensus on the soft delete ?
22:16:42 bauzas: I think there's consensus on not doing soft delete like we currently do it
22:16:45 alaski: well, not exactly consensus, but strong majority
22:16:50 but no clear plan forward
22:16:53 alaski: and it has gathered quite a bit of momentum, the email discussion
22:17:37 I don't want to block on that so I may try to defer that to a later spec
22:17:53 alaski: +1
22:18:06 alaski: +1
22:18:29 but please review the specs, and if you see gaps please bring it up in a meeting or propose something to cover it
22:18:48 alaski: I left a comment btw. on the URIs
22:19:00 I think there's at least one more spec needed for populating data in the cells mapping
22:19:05 alaski: maybe this meeting is not the good time for discussing it
22:19:33 bauzas: well, you're keeping dan from vacation... :)
22:19:53 alaski: agreed, and I'm keeping myself away from bed
22:20:01 alaski: we still don't have a clear picture how top and child dbs will look like
22:20:27 "populating data in the cells mapping" specs requires new schema, right
22:20:29 alaski: having that will be important for the "delete" discussion as well
22:20:41 belmoreira: why?
22:21:03 vineetmenon_: there's a schema proposed, but nothing proposed on how to get data into it
22:21:15 dansmith: I'm interested to know where instance information should live. top or child?
22:21:29 belmoreira: definitely in the children, that's the whole point
22:21:37 +1
22:21:51 see line 14 of the manifesto :)
22:22:00 dansmith: does that mean top level will not have anything except the instance information as detailed in the spec?
22:22:02 dansmith: But if delete goes to top, how you remove old data from child?
22:22:12 belmoreira: I responded to your email on that. I think we want to make sure that deleting at the api level doesn't mean the same as deleting in the cell
22:22:27 vineetmenon_: no, the top level will have the mapping, and also things like flavors, keypairs, and other bits of information that are common to the whole deployment
22:22:42 belmoreira: I think deletes are different if you consider a cell or an instance
22:22:57 belmoreira: a delete would delete in the child, and doesn't need to touch the mapping. that can be deleted independently
22:23:01 belmoreira: I dunno what you mean about "old data", but any change (delete, update) that goes to the top will modify the thing in the child database, because that's the only place it lives
22:23:19 unlike current cells, we don't need to delete it in both places and/or sync that it has been deleted
22:24:27 dansmith: maybe is that what I'm missing... understand how to delete in map table and child DB without sync
22:24:35 there's a big change here, which is that the mapping is almost wholly independent of the instance
22:24:45 so they don't need to be synced
22:24:55 We do talk about caching the information at global level, but that is standard cleaning for a caching implementation.
22:25:01 alaski: ok
22:25:02 belmoreira: persistence is local to the object
22:25:32 child cell dbs are the authority for instance information.
22:26:06 leifz: right
22:26:17 ..and the command to alter the database is direct not via message queue
22:26:17 So a query for a deleted instance would match in the top DB, look up the mapping, and a query to the child would say "doesn't exist", right?
22:26:20 belmoreira: does that make sense about deletes?
22:26:33 i mean n-api to child db
22:26:34 alaski: yes, thanks
22:26:37 would the top table do anything other than pass that back to the caller?
22:26:44 vineetmenon: correct
22:27:08 tonyb: the top table is just a mapping of what is where, you still have to talk to the database that has the thing you're looking for, once you determine where it is
22:27:38 dansmith: right. I get that part.
22:27:58 oh wait the top DB tells the client "it's over there"
22:28:04 no
22:28:04 and the client talks to the child DB?
22:28:06 bi
22:28:17 alaski: all other service will use the usual message queue, except n-api-cell, which will directly get hold of DB... is this right
22:28:19 no, it tells the api where to find it, on behalf of the client
22:28:37 tonyb: right now, it's implied that the instance is in "the" database, because there is only one (as far as the api is concerned)
22:28:37 dansmith: okay that's what I thought at first.
22:28:53 dansmith: okay. that's also as I thought.
22:28:56 vineetmenon_: anything that uses a queue now will continue to use a queue
22:29:01 this would add a switch in front of that to not just imply the database location
22:29:10 I think we all need to read the manifesto carefully :)
22:29:14 yes :)
22:29:19 I guess I'll read the specs and if they're not at all clear poke at that level
22:29:24 rather than keep everyone here
22:29:28 bauzas: haha.. yes
22:29:29 I think a diagram might help too
22:29:41 alaski: yeah, good point, I suck at drawing, so I volunteer you :)
22:30:01 dansmith: heh. I'm fine with that
22:30:21 #action alaski to diagram cellsv2 flow and approximate table split
22:30:37 which leads to...
22:30:43 #topic analysis of tables
22:30:57 I don't believe there's anything to report here
22:31:14 no, because I suck
22:31:21 and because flavors have been kicking my ass all week
22:31:23 well, because flavors suck
22:31:42 s/flavors/data migration, eh ?
22:31:45 alaski: on that note, i have generated an ER diagram for nova db.. I can share it on ML if anyone needs...
22:32:01 bauzas: no, just flavors
22:32:05 flavors...
22:32:06 bauzas: the migration is easy
22:32:12 dansmith: speak for you :)
22:32:15 flavors, I want to set on fire
22:32:20 I think getting this sorted out will help a lot with understanding of where this is headed
22:32:28 at least getting some things sorted out
22:32:30 agreed
22:32:42 so maybe we just start with some easy ones and get something in place
22:32:57 alaski: the patches by gary are all merged... shouldn't that solve flavor?
22:32:59 YAE ? yet another etherpad ?
22:33:02 flavors is not the worst in my opinion...
22:33:03 vineetmenon_: yeah, share that on the ML
22:33:08 how about aggregates?
22:33:26 vineetmenon_: it solves some of it, although there are two more by him that need merging
22:33:39 bauzas: yeah, probably another etherpad
22:33:43 belmoreira: I don't think that aggregates are considered as an easy one :)
22:34:22 belmoreira: I'm personally convinced that aggregates are local to cells
22:34:35 I'm going to take an action to get something together with just easy ones. like mapping tables in the api and flavors, and instances in the cells
22:34:37 belmoreira: but some other people can argue on this
22:35:02 #action alaski start an etherpad on easy table splits
22:35:17 belmoreira: cells are a certain level of segregation, aggregates are another level of segregation IMHO
22:35:21 that will pair nicely with the diagram I need to do
22:35:23 bauzas: I think we will need a meeting only for that. But I agree with you
22:35:48 bauzas, alaski, belmoreira: let's start with easy ones.. and subsequently resolve others...
22:35:59 belmoreira: well, the problem is that contextual information (aka. metadata) is attached to a placement thing, and here comes our problems...
22:36:19 yeah, we can devote a much larger meeting time to that discussion
22:36:30 vineetmenon_: ack, I was diverting
22:36:31 after we get some easier ones nailed down
22:37:00 #topic testing
22:37:15 There's this now https://etherpad.openstack.org/p/nova-cells-testing
22:37:23 which needs some real work from me
22:37:50 https://review.openstack.org/#/c/135285/ gets cells failures down from ~150 to ~95
22:38:20 which is a possible short term solution which can be superseded by dansmith's flavor work
22:38:40 I have a tempest fix up for some service test failures
22:38:52 and I've determined that fixed-ip tests should be excluded
22:39:14 I've been looking at floating-ip tests to see what's up there but haven't made much progress yet
22:39:42 ouch, are you planning to fix functional test coverage for Cells V1 ?
22:39:53 I missed the rationale of that work :)
22:39:57 http://logs.openstack.org/85/135285/5/experimental/check-tempest-dsvm-cells/8b23f8d shows some other tests that need work though
22:40:20 bauzas: to check what all things work under current implementation
22:40:26 bauzas: fix, or exclude tests that we don't expect to work
22:40:35 right, we need to make sure cellsv1 doesn't rot
22:40:52 alaski: oh, for regression checking ?
22:40:56 exactly
22:40:59 alaski: got it
22:41:56 there's been some progress, but there are still some test classes I'm not sure of yet
22:42:15 I'd like to get that laid out on the etherpad so others can help look if they're inclined
22:42:32 alaski: I can do my bit, if guided...
22:42:53 #action alaski record in progress test work, and remaining failing tests in etherpad
22:43:10 vineetmenon_: awesome. I'll ping you once I get the etherpad in better shape
22:43:12 I'm a newbie, so...
22:43:19 * tonyb is also very happy to help with whatever can be delegated
22:43:26 alaski: sure
22:43:28 tonyb: noted, thanks
22:43:41 ooh, nice
22:43:50 tonyb: be careful what you volunteer for :)
22:44:18 I'd offer, but alaski already delegates enough to me :-)
22:44:23 * bauzas should do call for volunteers in other meetings...
22:44:24 dansmith: I'm green but feel free to pass work my way
22:44:33 dansmith: Gotta learn somehow
22:44:39 essellent
22:44:51 dansmith: lisp
22:44:54 dansmith: after your long weekend ;P
22:44:57 leifz: :)
22:45:20 I want to help too, but new to cells. this is my first cells meeting
22:45:28 I had another topic on the agenda about cells scheduling requirements but there hasn't been any progress there yet
22:45:44 melwitt: well it's only the 2nd cellsv2 meeting ;P
22:45:48 melwitt: cool
22:45:59 well have everyone as a cells expert in no time
22:46:03 I think there was a plan to come up with some low-hanging fruit on the cellsv1 test effort
22:46:04 we'll
22:46:06 (and the first at a "reasonable" time)
22:46:16 :)
22:46:46 dansmith: yes. I haven't quite gotten there yet, but I'm going to get it into that etherpad
22:46:52 https://etherpad.openstack.org/p/nova-cells-testing
22:46:52 tonyb: Americans can say that. :)
22:47:17 vineetmenon_: Meh I'm in Australia
22:47:49 alaski: put an action for me to help find the categories... since i already did this run through once I should help.
22:48:10 #action leifz help categorize test failures
22:48:24 all in the U.S. have a good holiday.. all everywhere else have a good rest of the week, I'm out.
22:48:29 also, tonyb answered my question from last week about this later meeting time
22:48:41 leifz: ok. happy holiday
22:48:57 #topic open discussion
22:49:49 this meeting time has drawn a few different folks than last week, so it may be worth having
22:50:04 alaski: please keep it if you can.
22:50:37 alaski: I'm keen to help and won't be at the alternate time
22:50:39 I don't know if belmoreira can regularly attend this meeting, I took this one opportunistically
22:51:01 tonyb: okay. good to know. I will keep running it if it's useful
22:51:02 bauzas: I will try
22:52:02 anything else people would like to discuss here?
22:52:04 alaski: you could move it an hour or two earlier if that helps (but only during the US winter)
22:52:59 tonyb: It might help the US east coast, but I don't think that helps Europe that much
22:53:09 alaski: not really :)
22:53:35 I mean, there was no good TV shows tonight... :)
22:53:42 heh
22:53:59 bauzas: :)
22:54:00 then, I assume we're done ? :)
22:54:09 seems like it
22:54:22 #endmeeting
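
The lookup flow discussed under the specs topic (a top-level mapping of what is where, child cell databases as the only authority for instance data, deletes that touch only the child) can be sketched roughly as below. This is an illustrative toy using in-memory dicts, not Nova code: the names `CellDatabase`, `Api`, `instance_mapping`, and `purge_mapping` are hypothetical and do not come from the specs or the manifesto.

```python
class InstanceNotFound(Exception):
    """Raised when the mapping or the owning cell has no such instance."""


class CellDatabase:
    """Hypothetical stand-in for one child cell's database."""

    def __init__(self, name):
        self.name = name
        self.instances = {}  # instance uuid -> instance record

    def get(self, uuid):
        if uuid not in self.instances:
            raise InstanceNotFound(uuid)
        return self.instances[uuid]

    def delete(self, uuid):
        if uuid not in self.instances:
            raise InstanceNotFound(uuid)
        del self.instances[uuid]


class Api:
    """Hypothetical API layer: holds only the mapping and deployment-wide data."""

    def __init__(self, cells):
        self.cells = {cell.name: cell for cell in cells}
        self.instance_mapping = {}  # instance uuid -> cell name
        self.flavors = {}           # example of data common to the whole deployment

    def create_instance(self, uuid, cell_name, record):
        # The record lives only in the child cell; the top level stores where it is.
        self.cells[cell_name].instances[uuid] = record
        self.instance_mapping[uuid] = cell_name

    def _cell_for(self, uuid):
        # The "switch in front": resolve the database instead of implying it.
        try:
            return self.cells[self.instance_mapping[uuid]]
        except KeyError:
            raise InstanceNotFound(uuid) from None

    def get_instance(self, uuid):
        return self._cell_for(uuid).get(uuid)

    def delete_instance(self, uuid):
        # Deleting touches only the child cell; the mapping is left alone.
        self._cell_for(uuid).delete(uuid)

    def purge_mapping(self, uuid):
        # The mapping can be cleaned up independently, with no sync step.
        self.instance_mapping.pop(uuid, None)
```

In this sketch, looking up an instance after `delete_instance` still matches in the mapping, and the child cell is the one that reports it as missing, which is the behaviour described in the exchange around 22:26. Whether the real implementation keeps or purges the mapping row on delete was left open in the meeting.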