17:01:20 #startmeeting CongressTeamMeeting
17:01:20 Meeting started Tue Mar 3 17:01:20 2015 UTC and is due to finish in 60 minutes. The chair is thinrichs1. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:01:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:01:24 The meeting name has been set to 'congressteammeeting'
17:01:42 Who do we have this week?
17:02:43 jwy: I see you're here. How's the Horizon UI going?
17:03:32 hi, good, i pushed what's there so far: https://review.openstack.org/#/c/160722/
17:03:51 for policy creation and deletion
17:04:02 few more things to do for that
17:04:14 Hi
17:04:38 hi
17:05:00 also talked with Yali more about policy abstraction
17:05:09 she is working on a func spec for that
17:06:20 i also made some updates to horizon and the docs since the datasources are now retrieved by id instead of name
17:06:44 waiting for review on those. i think the ci might still be broken?
17:07:07 Not sure about the CI. We've been having trouble with our cloud of late.
17:07:26 Yea, hopefully that will be under control soon
17:07:33 It's great that you're making progress!
17:07:38 somehow the python unit tests in our repo broke
17:07:56 I haven't tracked down how this is possible yet (I wanna ping the guys in the infra channel about it)
17:08:09 but i have a fix that should merge in a sec that makes the tests pass again.
17:08:10 Did the Makefile change get merged yet?
17:08:24 https://review.openstack.org/#/c/160680/
17:08:48 thinrichs1: not yet; that doesn't unblock the tests.
17:09:35 hrm, weird, i see the tests did pass on your patch
17:09:35 The tests *should* be failing without that patch.
17:09:46 We shouldn't be able to parse anything.
17:09:55 thinrichs1: the tests passed on my patch that isn't rebased on yours, though.
17:10:11 Locally is different b/c you still have the output of the Makefile sitting around.
17:10:26 thinrichs1: I don't mean locally.
17:10:39 If you click on that link, jenkins +1'ed it.
17:11:04 Maybe there's a Python-path thing happening somehow.
17:11:05 Not sure.
17:11:20 also i'm not sure how we could break the unit tests without jenkins stopping it from merging
17:11:23 But Jenkins also let in the deletion of the config file.
17:11:25 since they would have had to pass at one point.
17:12:00 in that case i'm sure the unit tests were passing then (not sure which config file you're talking about).
17:12:05 The missing test_datasource_driver_config.py.
17:12:09 ah yea!
17:12:23 That one i want to dig into and figure out how it could occur
17:12:32 sorry, which patch are we talking about that passed
17:12:46 jwy: https://review.openstack.org/#/c/160680/
17:13:06 i see failure for http://logs2.aaronorosen.com/80/160680/1/check/dsvm-tempest-full-congress-pg-nodepool/917a568
17:13:20 jwy: yea, the cloud is flaky right now
17:13:30 the gateways are overloaded and connections time out :(
17:13:48 which part are you saying passed?
17:14:09 the python unit tests
17:14:26 they failed in the gate pipeline but they passed in the check pipeline
17:14:35 ok
17:14:38 This one passed too but probably shouldn't have.
17:14:38 https://review.openstack.org/#/c/158489/
17:15:12 thinrichs1: anyways, let's do a little investigating later on and try to nail down what happened.
17:15:36 arosen1: Sounds good.
17:16:25 Back to status updates.
17:16:29 jwy: thanks for the update.
17:16:36 arosen1: want to give a status update?
17:17:33 thinrichs1: sure
17:17:57 so I haven't done much on congress the last week or so.
Most of my time was sucked up trying to help debug our cloud
17:18:22 that's it from me for now...
17:19:49 arosen1: thanks.
17:19:57 sarob couldn't attend but sent his status via email.
17:20:15 He cleaned up https://launchpad.net/congress/kilo
17:20:30 He tagged https://github.com/stackforge/congress/tree/2015.1.0b2
17:20:47 He is still working out getting the tarred file out to http://tarballs.openstack.org/ as part of the process
17:21:25 This seems like a good step toward getting us into the OS release cadence.
17:22:06 I believe kilo3 is mid-March, so a couple of weeks away.
17:22:56 I think we'll delay code freeze to, say, 3-4 weeks before the summit, since we don't have a ton of stabilization to do.
17:23:45 That should give us enough time to do some testing, round out the features/bugs that we want available, and still have time to work on some specs for the summit.
17:24:00 How does that sound?
17:24:23 sounds good to me!
17:24:59 same here
17:25:00 sure
17:25:16 So that'll be the plan moving forward then.
17:25:24 alexsyip: want to give a status update?
17:25:47 I’ve been working on high availability for the congress server + datasource drivers.
17:26:22 For a first cut, we’ll run two completely replicated congress servers, where each replica will fetch data from the data sources
17:26:31 and clients can make api calls to either replica
17:27:06 Writes (for things like rules and datasource config) will go to the database, and the congress server will pull new changes on a periodic basis.
17:28:09 nice, sounds good to me alexsyip
17:28:27 Sounds like the right first cut to me.
17:28:36 Excited about HA!
17:28:37 Currently, I’m setting up a tempest test to run in this configuration.
17:30:00 that’s all
17:30:10 That reminds me—we should fix up our logging so we don't fill up the disk (and crash).
17:30:28 thinrichs1: where did you see this problem?
17:30:31 in devstack?
17:30:38 alexsyip: do you have a blueprint for HA?
17:30:39 I think the logging that jwy pointed out was in horizon
17:30:57 I’m working on an HA
17:30:59 blueprint.
17:31:06 arosen1: if we're going for HA, and we never empty out the logs, we'll eventually fill up the disk.
17:31:16 And crash, thereby making HA harder.
17:31:17 Actually, I think I already made the blueprint, but not the spec
17:31:28 thinrichs1: I don't think that's really related to HA per se
17:31:34 are you talking about syslog?
17:31:42 like output from congress-server logs?
17:31:50 Just the other day I saw ceilometer fill up the disk with its log.
17:32:00 arosen1: yes
17:32:22 thinrichs1: the ceilometer in the cloud or in devstack?
17:32:30 Here’s the blueprint: https://blueprints.launchpad.net/congress/+spec/query-high-availability
17:32:37 I think logrotate is probably not configured in that case, thinrichs1
17:32:59 arosen1: so you're saying it's easy to fix the logging so it doesn't fill up the disk?
17:33:06 yes
17:33:12 logrotate does it for you
17:33:29 it tar.gz's your logs and deletes them eventually if it's running out of disk space
17:33:29 If we don't have that turned on, let's turn it on by default now.
17:33:51 it shouldn't ever run out of disk space in devstack though if it's duping the output to screen.
17:34:11 on a proper install logrotate would handle this.
17:34:23 sorry, battery is about to die on the train :(
17:34:58 Whatever we need to do to make sure we don't fill up the disk (whether we're running as part of devstack or standalone), let's add a bug/blueprint to make sure that happens.
17:35:16 If it's a deployment option, let's make sure it's documented and on by default.
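
A note on the log-rotation thread above: logrotate is the usual answer on a real install, as arosen1 says, but the same goal can also be illustrated at the application level with Python's standard library. The sketch below is purely illustrative and is not what devstack or oslo logging actually configures; the log path, size cap, and backup count are assumptions.

    import logging
    import logging.handlers

    # Assumed path; a real deployment would take this from config.
    LOG_FILE = '/var/log/congress/congress.log'

    def configure_rotating_log():
        """Cap on-disk log growth so a long-running server can't fill the disk."""
        handler = logging.handlers.RotatingFileHandler(
            LOG_FILE,
            maxBytes=100 * 1024 * 1024,  # roll over after ~100 MB
            backupCount=5)               # keep at most 5 rotated files
        handler.setFormatter(logging.Formatter(
            '%(asctime)s %(levelname)s %(name)s: %(message)s'))
        logging.getLogger().addHandler(handler)

Either approach (logrotate or a size-capped handler) addresses thinrichs1's concern that an unattended server eventually fills the disk and crashes.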
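
Stepping back to alexsyip's HA design earlier in the meeting (writes go to the shared database; each replica pulls new changes periodically), a minimal sketch of that pull loop follows. This is not Congress's actual implementation: the poll interval, the change log, and the helpers fetch_changes_since, apply_rule, and update_datasource are hypothetical names used only to show the shape of the idea.

    import time

    POLL_INTERVAL = 30  # seconds between pulls; an assumed tuning knob

    def replica_pull_loop(db, engine):
        """Periodically apply rule/config writes from the shared DB to this replica."""
        last_seen = 0  # highest change id this replica has applied
        while True:
            # Pick up changes committed via any replica's API since our last pull.
            for change in db.fetch_changes_since(last_seen):
                if change.kind == 'rule':
                    engine.apply_rule(change.payload)
                elif change.kind == 'datasource_config':
                    engine.update_datasource(change.payload)
                last_seen = max(last_seen, change.id)
            time.sleep(POLL_INTERVAL)

Because every replica converges on the same database contents, clients can send API calls to either one, which is the property alexsyip describes.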
17:37:44 I added a blueprint for this and made it a dependency on query-high-availability.
17:37:55 I guess I'll give a quick status update.
17:38:25 Now that datasources are spun up/down at runtime, we need to be more careful with how we deal with column references in policy rules.
17:38:43 Remember that a column reference is where we identify the value for a column using its name, not its position.
17:39:16 Example: p(id=x, name=y) asks for the column named 'id' to be variable x and the column named 'name' to be variable y.
17:39:41 Previously we were compiling these into the usual datalog version *at read-time*.
17:40:07 To do that we needed to know the schema for all the tables *at read-time*.
17:40:42 If our schema for table p has columns ('name', 'id', 'status'), then we would compile the example above into...
17:40:47 p(y, x, z)
17:41:14 But since datasources are spun up/down at runtime, we no longer know the schema at read-time, so we can't do this compilation any longer.
17:41:28 So I'm adding support for column references into the heart of the evaluation algorithms.
17:41:59 We'll still be able to do schema-consistency checking (making sure people don't reference non-existent columns), but we'll want to do it whenever a schema change occurs, e.g. when we spin up a new datasource.
17:42:08 Hope that made some sense.
17:42:46 It should be transparent to the user.
17:43:15 That's it for me.
17:43:19 is this related to the issue with a policy table like nova:servers being empty now?
17:43:33 jwy: Yes, that's where I noticed the problem.
17:43:56 Say we have a rule like p(x) :- q(id=x) in the database.
17:44:25 Sorry.. different rule.
17:44:30 p(x) :- nova:q(id=x)
17:44:48 If we start up/restart Congress and try to load that rule, we won't know the schema for q because Nova hasn't necessarily been spun up yet.
17:45:02 So we won't load it; we'll throw an error.
17:45:20 ah
17:45:36 jwy: the original problem was that we weren't even trying to load the rules into the policy engine.
17:45:46 So we didn't get an error. We just didn't load anything.
17:45:56 Then once I fixed it to actually load the rules, I saw the errors.
17:46:20 glad you found those!
17:46:28 jwy: thanks for pointing those out.
17:46:48 I think it's probably only a weird case where this would actually be problematic.
17:46:56 But it's just not functional as it stands.
17:47:17 And if someone deletes a datasource named 'nova' and creates a new one also called 'nova', we weren't doing the right thing.
17:47:33 This should fix all that.
17:47:46 Okay. Time to open it up for discussion.
17:48:07 Oops.. first: is anyone else here who wants to give a status update?
17:49:03 Okay—open discussion it is.
17:49:06 #topic open discussion
17:51:36 it's daylight savings this weekend, so the local time for this meeting will change for some folks
17:51:42 starting next week
17:52:00 jwy: thanks for the reminder!
17:53:06 Thanks for the meeting all! See you next week (an hour later b/c of daylight savings for some of us, I believe).
17:53:09 #endmeeting
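
A footnote on thinrichs1's column-reference explanation: the read-time compilation step he describes, turning p(id=x, name=y) into p(y, x, z) given the schema ('name', 'id', 'status'), can be sketched as below. This is illustrative only, not the Congress compiler; the function name and the fresh-variable naming scheme are made up for the example.

    def compile_column_refs(table, named_args, schema):
        """Rewrite named column references into positional Datalog arguments.

        table      -- table name, e.g. 'p'
        named_args -- column name -> variable, e.g. {'id': 'x', 'name': 'y'}
        schema     -- ordered column names, e.g. ('name', 'id', 'status')
        """
        unknown = set(named_args) - set(schema)
        if unknown:
            # Schema-consistency check: reject references to non-existent columns.
            raise ValueError("unknown columns for %s: %s" % (table, sorted(unknown)))
        args = []
        for i, col in enumerate(schema):
            # Referenced columns keep their variable; the rest get fresh ones.
            args.append(named_args.get(col, 'col_%d' % i))
        return '%s(%s)' % (table, ', '.join(args))

    # compile_column_refs('p', {'id': 'x', 'name': 'y'}, ('name', 'id', 'status'))
    # -> 'p(y, x, col_2)', the analogue of p(y, x, z) in the example above

The point of the change discussed in the meeting is that this substitution can no longer happen once at read-time; the evaluator has to be able to redo it whenever a datasource, and therefore a schema, appears or disappears.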