14:02:11 #startmeeting neutron_upgrades
14:02:12 Meeting started Thu Jan 25 14:02:11 2018 UTC and is due to finish in 60 minutes. The chair is ihrachys_. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:16 The meeting name has been set to 'neutron_upgrades'
14:02:21 hi!
14:02:23 o/
14:02:27 Hi everyone
14:03:14 before we dive in, some background on the current state of affairs in relation to the release
14:03:31 as you probably know, a new major release (queens) is getting closer
14:03:42 \o/
14:04:10 https://releases.openstack.org/queens/schedule.html
14:04:31 this week is feature freeze
14:05:01 we'll have an RC1 release (and hence stable/queens) several weeks from now
14:05:31 until that moment, we should be cautious about landing patches that may affect stability and don't fix clear bugs
14:05:41 which most OVO patches are
14:06:07 we can of course continue reshaping patches in review to get them ready for when master is open
14:06:32 understood.
14:06:52 one sad development, with the final release coming close, is that we were forced to revert the port binding patch that took a while to shape
14:07:06 because of several issues that merging it revealed
14:07:23 one is that it did not account for mixed old/new engine facade usage
14:07:31 another is that it busted postgresql installations
14:07:57 yeah, speaking of that. can i ask a (stupid) question?
14:08:06 we will need to go back to the whiteboard, so to speak, repropose the patch, fix those issues, and land it some time in Rocky
14:08:14 lujinluo, yes!
14:08:24 i was trying to reproduce the postgresql issues
14:08:40 so i built a devstack with postgresql as the backend
14:08:56 but then i have no idea how i can reproduce the errors..
14:09:25 could you point me to the conf/settings of those specific tempest tests?
14:09:42 the periodic job that fails executes the full tempest suite, I believe
14:09:57 but let me check
14:10:20 sure, thanks
14:11:43 so this is what failed there: http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/legacy-periodic-tempest-dsvm-neutron-pg-full/71f9bd8/logs/testr_results.html.gz
14:12:21 got it! will start from there
14:12:29 I would imagine running those tests in a loop could get you to the failure
14:12:53 specifically check those that raise ServerFault
14:12:54 nice suggestion. will follow it ;)
14:13:00 since that's a sign of a neutron-server internal error
14:13:09 ok
14:13:31 btw note that the revert patch still hasn't landed: https://review.openstack.org/536913
14:14:05 ok
14:14:20 in other news, the gate is quite unstable lately, with multiple issues lingering
14:14:34 we are trying to fix them one by one but we are not there yet
14:14:51 job timeouts are one; functional tests are also unstable (ovsdb commands timing out)
14:15:10 so please be patient if you get those errors over and over, that's sadly expected
14:15:29 Okay :(
14:16:05 now, back to usual business
14:16:06 hello, sorry for being late
14:16:21 slaweq, hi!
14:16:22 hey slaweq !
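
(An aside on the reproduction discussion above: the "run those tests in a loop" suggestion could look roughly like the sketch below. It is a minimal illustration only, assuming a devstack node with the postgresql backend and tempest checked out at /opt/stack/tempest; the test regex is a placeholder and would need to be narrowed to the tests from the linked failure page that raise ServerFault.)

    # Minimal sketch: rerun a subset of tempest until the failure reproduces.
    # Assumptions: tempest is available at /opt/stack/tempest on the devstack
    # node; TEST_REGEX below is a placeholder, not taken from the failure page.
    import subprocess

    TEST_REGEX = r'tempest\.api\.network'  # narrow to the tests raising ServerFault

    for attempt in range(1, 51):
        result = subprocess.run(
            ['tempest', 'run', '--regex', TEST_REGEX],
            cwd='/opt/stack/tempest',
        )
        if result.returncode != 0:
            # A ServerFault from tempest means neutron-server hit an internal
            # error; the traceback will be in the neutron-server log.
            print('reproduced a failure on attempt %d' % attempt)
            break
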
14:16:24 https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/adopt-oslo-versioned-objects-for-db
14:16:41 https://review.openstack.org/#/c/521797/ "Use Router OVO in external_net_db"
14:17:17 so this patch is fine now but it lacks test coverage for the module / method it touches
14:17:29 so we can't really safely land it
14:18:39 hungpv_, do you think you could work on adding some unit tests for the method before merging this OVO patch?
14:18:56 I thought you said it's called by update_rbac_policy and delete_rbac_policy ?
14:19:17 I just said that I checked there is no mixed old/new engine facade issue there
14:20:18 Oh, if that's the case
14:20:29 remember in the past, in one of the older patchsets, there were some mistakes that completely broke the code and the gate failed to catch them? that's what I am concerned about: the existing coverage won't catch anything if there are still some bugs there.
14:20:52 I'll be working more on this issue
14:21:02 so adding some unit tests would help us to sleep well :)
14:21:09 ok thanks hungpv_ !
14:21:32 next: "https://review.openstack.org/507772 Use Network OVO in db_base_plugin"
14:21:46 I've left a comment about the current blocking point
14:22:16 We're still working on it
14:22:20 the current list of failures is: http://logs.openstack.org/72/507772/29/check/openstack-tox-py27/8b97a84/testr_results.html.gz
14:23:23 for the foreign key failure, I see it's just in the object test class itself
14:24:14 Yes, but we haven't figured out how to fix it
14:24:16 tuan_vu: I didn't have time to check it yet
14:24:25 but I will try to help
14:24:37 Hi slaweq, thank you in advance
14:24:43 I think it's one of those cases that just needs self.update_obj_fields in setUp() for whatever field is violated
14:24:49 We really appreciate your help
14:25:43 slaweq, tuan_vu, ping me if you need me to look at the foreign key issue
14:25:53 Thank you, ihar, we'll try your suggestion
14:26:03 another, more important failure there is the test_*_queries_constant failures
14:26:41 Yes, I'm just wondering if we can change the number of queries?
14:27:04 Because in OVO, the way it works is a little bit different
14:27:12 which signals that somewhere in the patch, we fetch some models from the database in an inefficient way (probably fetching a relationship for networks using a separate query)
14:27:33 tuan_vu, we can't, we should find a way to avoid it
14:27:46 tuan_vu, do you know which relationship triggers the additional queries?
14:28:15 We need to reload the "shared" attribute
14:28:31 When updating a network
14:29:40 why is that? can't we calculate the new value from what is already passed by the user / fetched from the db?
14:30:04 is it because "shared" is a synthetic field now?
14:30:23 Hi Luo, yes
14:31:17 besides, test_network_list_queries_constant also fails, and that one doesn't update a network
14:31:28 i think i hit this issue while working on the Port integration. when we have synthetic fields, the # of queries does not stay constant
14:31:44 it lists networks, and probably we trigger a set of queries per network matched
14:32:14 yes, because get_objects() will try to fetch all their synthetic fields too
14:33:06 Yes, thank you Luo, your experience does help a lot
14:33:40 but i do not recall finding a suitable solution for that :(
14:33:42 right. so one of the relationships is probably of a type that doesn't fetch data when the parent network model is fetched. that would be a problem, no?
14:34:01 I mean, the type of the sqlalchemy model attribute for the relationship
14:34:49 ok, let's try from another direction. how did it work before the patch?
14:35:14 Hmm
14:35:42 The main problem is the "shared" attribute
14:36:09 Before the patch, we used rbac for "shared"
14:36:48 right. can't we calculate it for OVO the same way? we have .db_obj.rbac_entries already, right?
14:37:21 if we then make the rbac metaclass use this info directly instead of refetching it (I guess that's what happens?) then we would avoid the problem?
14:38:38 Hmm, let me think more about your suggestion, is that ok? I'll send you an email if that's needed
14:39:01 Or contact you via IRC
14:39:13 yeah sure, we don't need to dig into the code right away, but the main point is that we can't disable / change this unit test.
14:39:13 that should work, by changing how the # of queries is calculated. besides, "shared" is not a synthetic field, please check which attribute is the one that causes this issue
14:39:56 Thank you, Ihar and Luo
14:39:57 moving on
14:40:02 https://review.openstack.org/#/c/537320/ "Use Port OVO in neutron/db/external_net_db.py"
14:40:33 i did a recheck for the timeouts.. but i guess zuul hung somewhere...?
14:40:37 lujinluo, I was going to look at some port patches you have, but then we stumbled on the port binding revert. I was thinking, are we blocked by that one now?
14:40:59 lujinluo, yeah, zuul is sick these days. I rechecked.
14:41:30 kind of.. it's better to wait for port binding to land first, as those two patches conflict a lot
14:42:15 lujinluo, right. speaking of which... are you going to Dublin? I think mlavalle was planning a session on the ovo/enginefacade issue.
14:42:25 it would be easier to get them in serially, not in parallel
14:42:45 my employer has not decided yet, but i guess probably..
14:42:54 i would go
14:43:22 great. btw I'm not going. but having someone from this group there would help. I may try to join remotely if there is such an option
14:43:44 oh, it is sad that you are not going :(
14:44:49 yeah. I've kinda lagged on trips lately for family reasons.
14:45:06 well, family always goes first
14:45:20 anyway, moving on to the next one, which is
14:45:25 https://review.openstack.org/#/c/537325/ "Use Meter Label OVO in neutron/db/metering/metering_db.py"
14:45:32 this looks green :)
14:46:30 i used label.db_obj (which we should not do normally), but since we cannot use router as a synthetic field, i think this is fine. what do you think? ihrachys_
14:46:41 so there, you pass db models in one case and objects in another case
14:47:11 hmm right
14:47:38 i need to avoid that
14:48:04 question is, how
14:48:44 we could of course add a routers field to labels, but that's a bit silly from the API perspective (a label is not a container for routers)
14:49:08 the first solution that came to my mind was _load_object(), but we should avoid that too
14:49:11 yeah
14:49:58 at least for get_sync_data_for_rule, you could avoid fetching labels at all and instead have a method that returns the needed routers.
14:50:01 how about using get_objects()?
14:50:08 since you don't use anything but the id from the label, and it's already known
14:50:40 we could have this method embedded into Router.get_objects, so that if a metering_label_id filter is passed, it does the right thing
14:51:56 that sounds good. will choose that approach
14:52:11 ah wait, not really. you rely on shared there
14:53:01 yes, but instead i can pass the router_ids i fetch from label.db_obj.routers, then use get_objects()
14:54:05 true, but wouldn't it trigger another fetch?
14:54:26 but wait, we already trigger it for the shared case...
14:54:35 yes
14:54:59 maybe that's actually a mistake that slipped in
14:55:46 it was doing get_collection_query before OVO though
14:55:53 it got in here: https://review.openstack.org/#/c/529551/5/neutron/db/metering/metering_db.py
14:55:54 so I guess no one bothered about it being optimal
14:57:01 anyway, let's dig more in gerrit because we only have 5 min left now!
14:57:29 but actually, for that other case with all routers fetched, it's different
14:57:40 because then label.db_obj.routers won't give the same result
14:57:47 ok, let's follow up there
14:58:18 slaweq, I am sorry; you joined today. did you have something?
14:58:47 ihrachys_: no
14:58:54 I just wanted to join :)
14:58:59 ok, just checking :) thanks for joining :)
14:59:20 slaweq: we are happy that you are here!
14:59:29 I will try to be here every week
14:59:34 if I don't forget :)
14:59:42 we don't really have much time left. if you have questions about some patches that were not discussed, please ping me on irc or email after this meeting, and I will try to help.
14:59:57 Awesome, Slawek
15:00:01 slaweq, great. your help would be of great value.
15:00:10 ok, we are officially out of time
15:00:16 ciao
15:00:18 #endmeeting
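
(For reference, the self.update_obj_fields-in-setUp() suggestion from the Network OVO discussion might look roughly like the sketch below. This is a minimal illustration only: the test class, the violated field name ('network_id'), and the _create_test_network_id() helper are assumptions patterned after neutron's object test base, and should be checked against other object test cases before use.)

    # Rough sketch of fixing a foreign key violation in an OVO object DB test
    # by pointing the violated field at a row that actually exists.
    # All names marked as placeholders below are assumptions for illustration.
    from neutron.tests.unit.objects import test_base as obj_test_base
    from neutron.tests.unit import testlib_api


    class ExampleObjectDbTestCase(obj_test_base.BaseDbObjectTestCase,
                                  testlib_api.SqlTestCase):

        _test_class = None  # placeholder: the OVO class under test goes here

        def setUp(self):
            super(ExampleObjectDbTestCase, self).setUp()
            # Make the generated test objects reference an existing network so
            # the foreign key constraint is satisfied; 'network_id' and the
            # helper below are placeholders for whatever field is violated.
            self.update_obj_fields(
                {'network_id': lambda: self._create_test_network_id()})
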