00:01:32 <thinrichs> #startmeeting CongressTeamMeeting
00:01:32 <openstack> Meeting started Thu Jun 23 00:01:32 2016 UTC and is due to finish in 60 minutes.  The chair is thinrichs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:01:33 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:01:35 <openstack> The meeting name has been set to 'congressteammeeting'
00:01:48 <ekcs> hi
00:01:58 <aimeeu> Hi again
00:02:08 <ramineni_> hi
00:02:09 <thinrichs> ekcs, aimeeu: hi!
00:02:12 <thinrichs> ramineni_: hi
00:02:18 <thinrichs> masahito is I think out of town still
00:02:46 <thinrichs> Agenda for the week…
00:02:48 <thinrichs> 1. Gate
00:02:58 <thinrichs> 2. Low-hanging bugs for newcomers
00:03:15 <thinrichs> 3. Status updates
00:03:18 <thinrichs> Anything else?
00:04:27 <thinrichs> #topic Gate
00:04:37 <thinrichs> ramineni_: how is the gate looking?
00:05:22 <ramineni_> thinrichs: right now it looks green; a series of patches merged in tempest had been causing our gate to fail
00:05:55 <ramineni_> thinrichs: fixed the same in both master and stable/mitaka
00:06:32 <thinrichs> ramineni_: Nice!
00:06:51 <thinrichs> That's an easy agenda item then
00:07:30 <thinrichs> Moving on, unless there's anything to discuss about the HA tests, which are the ones that have been broken of late.
00:08:43 <thinrichs> #topic Low-hanging bugs
00:08:51 <ramineni_> thinrichs: no, HA tests were failing because we are using endpoints_client and service client, which got changed in tempest
00:09:07 <ramineni_> thinrichs: otherwise they are fine
00:09:54 <thinrichs> ramineni_: makes sense
00:10:26 <thinrichs> aimeeu is a new contributor (if you remember from last week) and is looking for a couple of bugs to start with
00:10:56 <aimeeu> I did come across another one that was abandoned a year ago: https://bugs.launchpad.net/congress/+bug/1415199
00:10:56 <openstack> Launchpad bug 1415199 in congress "Refactor test_neutron_driver" [Medium,Confirmed] - Assigned to Cleber Rosa (cleber-gnu)
00:11:12 <thinrichs> We used to have bugs in launchpad marked with 'low-hanging-fruit'.
00:11:26 <aimeeu> Only 2 marked that way
00:11:35 <thinrichs> Do we have bugs aimeeu might be able to work on that we haven't registered in launchpad?
00:12:12 <ekcs> Here is a simple one assigned to me but I haven’t done it. #link https://bugs.launchpad.net/congress/+bug/1501579
00:12:12 <openstack> Launchpad bug 1501579 in congress "Calling plexxi_driver.execute(...) throws exception" [Medium,New] - Assigned to Eric K (ekcs)
00:12:28 <thinrichs> aimeeu: The tests are in a state of flux because we basically rewrote them all for a new architecture we're almost finished implementing.
00:13:06 <thinrichs> So we should probably abandon the bug you mentioned above.
00:13:11 <aimeeu> OK
00:13:34 <thinrichs> The new tests could require that kind of work too, but we would need to look.
00:14:01 <thinrichs> ekcs: thanks.
00:14:18 <thinrichs> aimeeu: you could work on the one ekcs is pointing out.  Seems like there's just a method to add.
00:14:51 <aimeeu> Got it - just reassigned it to myself
00:15:11 <thinrichs> Does anyone else have bugs they're not actively working on?  Especially if they're good starters, let's unassign ourselves.
00:16:22 <thinrichs> Can we all take an action item to look for 1-2 low-hanging bugs?  Then aimeeu can have a few to choose from, and we make it easier for others to jump in and help out.
00:16:44 <ramineni_> thinrichs: ok
00:16:56 <aimeeu> thinrichs: thanks!
00:17:13 <ekcs> got it.
00:17:31 <thinrichs> #action Everyone will try to file a couple low-hanging bugs for starting points for newcomers
00:17:52 <thinrichs> #topic status updates
00:18:15 <thinrichs> ekcs: how is the HA discussion going?
00:19:52 <ekcs> I haven’t done much this past week cuz I got pulled into something urgent at work. But I think we’re all in fairly good agreement on the spec. Andrew from redhat also gave some comments. I’ll touch things up in the next couple days (a few clarifications requested and fill in a few TODO sections) then I think it’s ready to merge.
00:20:47 <thinrichs> ekcs: That was my reading too—that we're done with design and need to move on to implementation
00:21:11 <ekcs> the other thing I’m working on is looking for threading issues with the switch to new-arch. A lot of things that used to be safe because there was no blocking are now suspect because rpc.cast/call blocks and yields.
00:21:46 <thinrichs> ekcs: Totally important
00:21:48 <ekcs> and i’m done with the urgent project so I should be able to get more done in the coming week.
00:23:18 <thinrichs> I could see a number of bugs coming out of that analysis, and they could easily be super-hard to write tests for.
00:23:43 <thinrichs> ramineni_: want to discuss your status?
00:24:00 <ramineni_> thinrichs: sure
00:24:35 <ramineni_> last week worked on supporting keystone v3 and use of sessions for all the datasources and also made the default auth_url as v3 for creating datasources in devstack plugin
00:25:05 <ramineni_> still some datasources are throwing errors, i have to look at it
00:25:14 <thinrichs> Do all the services support v3?
00:25:22 <ramineni_> yes
00:25:27 <thinrichs> Do we?
00:26:09 <ramineni_> yes, creating the client with v3 shouldn't throw any error, I'll check again
00:27:08 <thinrichs> Just wanted to check
00:27:13 <ramineni_> thats it from my side
00:27:40 <thinrichs> Ok.  I'll go.
00:28:04 <thinrichs> I've been underwater for weeks now.
00:28:14 <thinrichs> I've been struggling just to keep up with reviews.
00:28:37 <thinrichs> But I'm hoping to have more time again.
00:29:24 <thinrichs> I was planning on testing out a multi-process deployment
00:29:42 <thinrichs> and maybe add docs around how to do that
00:31:05 <thinrichs> Or are there other things we think are more pressing before we begin the HA work?
00:31:32 <ekcs> that sounds right.
00:32:31 <thinrichs> Okay.  That's my plan then.
00:32:38 <thinrichs> #topic open discussion
00:32:50 <ramineni_> thinrichs: https://review.openstack.org/#/c/329772/
00:32:50 <patchbot> ramineni_: patch 329772 - congress - Fix listing of datasources
00:32:50 <thinrichs> Anything else we should discuss today?
00:33:21 <ramineni_> thinrichs: I'm thinking, should we delete the datasource when we fail to load its driver
00:33:23 <ramineni_> ?
00:34:02 <thinrichs> So actually remove it from the database?
00:34:09 <ramineni_> thinrichs: yes
00:35:25 <thinrichs> (ekcs: here's the problem ramineni_ is working on.  User creates datasource with driver D.  Then removes D from etc/congress/congress.conf and restarts congress.  Today there's a fatal error b/c Congress can't reinstantiate the datasource b/c D is unavailable.)
00:35:42 <thinrichs> The question is what do we do in that case?
00:36:01 <ekcs> thanks. just read up on it.
00:36:02 <thinrichs> I'm looking to see what happens if there's a policy referencing that datasource…
00:36:20 <thinrichs> Does Congress let you delete such a datasource or does it block it?  I think it blocks it.
00:37:41 <ramineni_> thinrichs: it doesnt create policy
00:37:58 <ramineni_> thinrichs: it fails here https://github.com/openstack/congress/blob/master/congress/harness.py#L394
00:38:12 <ramineni_> but we silently log the exception and continue
00:38:30 <ramineni_> so, after that if we list the datasources, it throws internal server error
00:38:59 <thinrichs> What's the behavior we think is right…
00:39:16 <thinrichs> there's a datasource in the DB but the driver for that datasource is not available
00:39:36 <thinrichs> I typically think that deleting something out of the DB is dangerous when the user didn't ask us to do it.
00:40:18 <ekcs> Yea I’m with thinrichs on that. if someone dropped in a new config by mistake, that shouldn’t delete DS and especially not policies.
00:40:53 <thinrichs> So let's imagine we leave the DS in the DB.
00:40:54 <ramineni_> ekcs: policies are not created for that datasource
00:41:17 <thinrichs> What problems does that cause?
00:41:32 <ekcs> We could prompt the user to ok the deletion. options: 1. delete the DS and continue 2. fail and you fix the config before relaunching.
00:42:16 <thinrichs> What if it's a script that's starting up Congress?
00:42:36 <thinrichs> Interactive stuff seems scary
00:42:56 <ramineni_> so, we should mark as disabled
00:43:01 <ramineni_> and leave it in DB
00:43:01 <ekcs> anyway this should be a rare situation and i think the easiest thing is to document and let the user resolve it by fixing config or fixing DB. problem is it’s hard to fix DB without congress. so maybe we have a separate congress switch that says clean the DB based on config.
00:43:02 <ramineni_> ?
00:43:41 <ekcs> i feel like disabling adds too much complexity for something that doesn’t happen much anyway. cuz every other part of the system may need to account for that case.
00:44:06 <ramineni_> ekcs: ya, i agree
00:45:05 <thinrichs> +1 to it being a rare situation
00:45:14 <ekcs> my vote would be for: congress just fails with good error message. in doc direct user to either add the config back or launch congress with a new switch (with warning) to launch and delete DS.
00:45:28 <ramineni_> ekcs: you experienced this issue without changing config?
00:45:41 <thinrichs> And we probably shouldn't be doing anything substantial right now
00:46:03 <ekcs> ramineni_: no. did I suggest that?
00:46:08 <ramineni_> ekcs: you have raised the bug right
00:46:19 <ekcs> ramineni_: no. This is first time i heard of it.
00:46:40 <ekcs> ok wait.
00:46:54 <ramineni_> https://bugs.launchpad.net/congress/+bug/1564152
00:46:54 <openstack> Launchpad bug 1564152 in congress "When 1 driver fails to load, all drivers fail to list (Error 500)" [Medium,In progress] - Assigned to Anusha (anusha-iiitm)
00:47:12 <ekcs> I did raise the bug.
00:47:18 <ramineni_> may be you have experienced in totally different scenario?
00:48:42 <ekcs> but yea different scenario. like if there is a bug in a driver and that driver has a failure, then listing all drivers fails and doesn’t give any info.
00:49:33 <ekcs> imagine someone added a new third-party driver. and lists drivers. ideally he should see the third-party driver failing and others working in the list. user gets no info from list or from horizon.
00:50:01 <ekcs> this is helpful especially if someone is developing their custom driver.
00:50:30 <thinrichs> ekcs: +1 to your suggestion about missing driver
00:50:46 <ramineni_> ekcs: but if error, it doesnt let you create datasource right
00:52:20 <thinrichs> ekcs: fixing that bug of yours will take some work
00:53:22 <ekcs> hmm I didn’t document it super clearly. I forget what error I introduced. but it’s something like this. list_drivers() calls each individually configured driver for info(). If some of those info() methods fail, the whole list_drivers() fails. Ideally list_drivers() still gets the info on other drivers.
00:54:00 <thinrichs> ekcs: is the obvious fix (adding a try/except around the info()) a good first approx?
00:54:07 <thinrichs> That could be something for aimeeu
00:54:16 <ekcs> that way when you look on horizon for instance, the page will list drivers and show ‘last error’ on each of the drivers. right now the page just doesn’t load. or shows 500.
00:54:24 <ekcs> yea that’s basically what I had in mind.
00:54:29 <aimeeu> I thought of the try/catch - I can take a look if you like
00:54:55 <ekcs> but I should document better how to reproduce.
00:55:30 <thinrichs> ekcs: maybe in addition to a few more lines about how to reproduce, you could add the low-hanging tag so aimeeu can easily find it
00:55:38 <ekcs> ok
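
[Editor's sketch] The try/except idea discussed above for bug 1564152 could look roughly like the following. This is a minimal illustration, not Congress's actual code: the `drivers` mapping, the `last_error` field, and the two demo driver classes are hypothetical names chosen for the example. The only assumption taken from the discussion is that each driver exposes an info() method and that one failing info() should not break the whole listing.

```python
import logging

LOG = logging.getLogger(__name__)


def list_drivers(drivers):
    """Collect info() from every driver, isolating per-driver failures.

    `drivers` is a hypothetical mapping of driver name -> driver object.
    A driver whose info() raises is still listed, with the error text
    recorded under the (hypothetical) 'last_error' key.
    """
    results = []
    for name, driver in drivers.items():
        try:
            entry = dict(driver.info())
            entry.setdefault("id", name)
            entry["last_error"] = None
        except Exception as exc:
            # Log the traceback, but keep listing the remaining drivers.
            LOG.exception("driver %s failed to report info", name)
            entry = {"id": name, "last_error": str(exc)}
        results.append(entry)
    return results


class GoodDriver:
    """Demo driver whose info() succeeds."""

    def info(self):
        return {"id": "good", "description": "works"}


class BrokenDriver:
    """Demo driver whose info() raises, as a buggy custom driver might."""

    def info(self):
        raise RuntimeError("boom")


listing = list_drivers({"good": GoodDriver(), "broken": BrokenDriver()})
```

With this shape a caller such as a dashboard page can still render every driver and show the recorded error next to the failing one, instead of returning a 500 for the whole list.
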
00:55:50 <thinrichs> 5 minutes left.  Anything else for today?
00:56:06 <ekcs> ramineni_: still a good thought on what to do when driver removed from config tho.
00:56:40 <ekcs> ramineni_: i think the delete drivers patch is a good thing to have (triggered by special switch)
00:57:14 <ramineni_> ekcs: hmm
00:57:24 <ekcs> nothing else from me.
00:57:35 <ramineni_> ekcs: or fail it, as they have messed up with config ?
00:57:52 <ekcs> yea fail should be default behavior
00:58:04 <thinrichs> ramineni_: so I think the proposal is: (i) congress fails to start if it has DS without a driver and (ii) add a switch to congress server that causes it to delete all DSs without drivers
00:58:22 <thinrichs> ekcs: is that right?
00:58:29 <ekcs> yea
00:58:40 <ekcs> but again maybe not even that important to add (ii) right now
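
[Editor's sketch] The two-part proposal above, (i) fail at startup when a stored datasource has no driver and (ii) an opt-in switch that deletes such datasources, can be sketched as a single check. Everything here is an assumption made for illustration: the function name, the `delete_orphans` flag, the exception class, and the dict-based datasource records are hypothetical and do not reflect Congress's real startup code or any existing command-line option.

```python
class MissingDriverError(Exception):
    """Raised when a stored datasource names a driver that is not configured."""


def reconcile_datasources(datasources, available_drivers, delete_orphans=False):
    """Split stored datasources into (kept, orphaned).

    datasources: list of dicts with 'name' and 'driver' keys (hypothetical shape).
    available_drivers: iterable of driver names found in the config.
    delete_orphans: stands in for the proposed opt-in switch. When False
        (the default), any orphaned datasource aborts startup with a clear
        message and nothing is removed.
    """
    available = set(available_drivers)
    kept = [ds for ds in datasources if ds["driver"] in available]
    orphans = [ds for ds in datasources if ds["driver"] not in available]
    if orphans and not delete_orphans:
        names = ", ".join(ds["name"] for ds in orphans)
        raise MissingDriverError(
            "datasources without a configured driver: %s; restore the driver "
            "in the config or restart with the delete switch" % names)
    # With the switch set, the caller would remove `orphans` from the DB.
    return kept, orphans
```

Failing by default keeps a mistaken config change from silently removing data, which matches the concern raised earlier in the meeting; deletion only happens when the operator explicitly asks for it.
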
01:00:25 <thinrichs> Out of time for today.  We can continue on #congress if need be.
01:00:30 <thinrichs> #endmeeting