00:01:42 #startmeeting qa
00:01:43 Meeting started Thu Sep 12 00:01:42 2019 UTC and is due to finish in 60 minutes. The chair is gmann. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:01:44 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:01:47 The meeting name has been set to 'qa'
00:02:02 who all here today?
00:02:05 hi
00:02:11 Hello
00:03:23 this is our first office hour in the new time.
00:03:27 #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting#Agenda_for_next_Office_hours
00:03:31 agenda ^^
00:03:53 #topic Announcement and Action Item
00:04:52 nothing to announce here except that feature freeze is coming this week (tomorrow) and we need to keep monitoring the gate and requirement freeze things.
00:04:53 AT&T is now using Tempest as a deploy gate on all of our deployments
00:05:13 bigdogstl: perfect, that is good news
00:05:13 in downstream CI. thank you to all for the hard work.
00:05:20 bigdogstl: cool
00:05:35 we are running over 1000 tests every deployment with no leaks
00:05:51 we deploy at least 5 times a day
00:06:05 that is why you are seeing high activity from new people
00:06:07 nice. how do you use Tempest: master or a release-tagged version?
00:06:30 master against Ocata with some code patches
00:06:39 100% containerized
00:06:41 +1 thanks for adding new interns.
00:07:04 it's a very good use case
00:07:05 code patches for internal stuff or upstream bugs?
00:07:08 they have gone back but I have full-time engineers working upstream now
00:07:25 mismatch of Ocata and master
00:07:36 ok
00:07:47 we blacklist tests that do not work in house
00:07:50 not many
00:08:03 i see.
00:08:20 let's move on
00:08:23 we are trying to get Patrole working but we are seeing too many leaks for it to be an official gate yet
00:08:29 sure
00:08:47 ok.
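[Editor's note] The blacklist workflow bigdogstl describes (running Tempest master against an older Ocata cloud while skipping known-bad tests) might look like the sketch below. The file name, the two test-ID regexes, and the concurrency value are illustrative assumptions, not AT&T's actual configuration; `tempest run --blacklist-file` was the flag name in this era (later renamed `--exclude-list`).

```shell
# Hypothetical skip list: one regex per line matching test IDs known to
# fail against the older (Ocata) cloud. Lines starting with # are comments.
cat > blacklist.txt <<'EOF'
# live migration not supported in this environment
tempest\.api\.compute\.admin\.test_live_migration
# IPv6 scenario flaky against Ocata neutron
tempest\.scenario\.test_network_v6
EOF

# Run Tempest (from master) and skip everything the blacklist matches.
# Guarded so the sketch is a no-op on machines without Tempest installed.
if command -v tempest >/dev/null; then
  tempest run --blacklist-file blacklist.txt --concurrency 4
fi
```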
let's discuss that in detail under a later topic
00:08:53 #topic Train Priority Items progress
00:09:05 #link https://etherpad.openstack.org/p/qa-train-priority
00:09:33 we will quickly go through the items which have updates to share
00:09:38 How to make tempest-full stable - gmann
00:10:17 only the glance patch is left; i have replied to the comment and pinged the glance PTL to check.
00:10:32 OpenStack-health improvement - masayukig
00:10:37 masayukig: anything to share on this ^^
00:11:11 heh, nope actually..
00:11:11 or bigdogstl
00:11:29 ok
00:11:31 Planning for Patrole Stable release - gmann
00:11:48 no progress at this time
00:11:51 bigdogstl has a list of patches for resource leak things
00:12:11 https://review.opendev.org/#/q/owner:rick.bartra%2540att.com+status:open+project:+openstack/patrole
00:12:27 bigdogstl: one question on patrole use: how many roles do you run?
00:12:40 10 at this time
00:12:52 admin and member, or others also?
00:13:22 yes
00:13:39 we have 3 admins
00:14:06 full, support, read only
00:14:32 and more for types of roles
00:14:35 ok
00:14:49 we containerize the test suite and
00:15:23 run them in parallel
00:15:34 ok
00:15:41 in test labs and greenfield prod, and serial in brownfield prod
00:16:13 once we have all of the leaks done we will start addressing missing tests in our deployments
00:17:04 again we are using master against Ocata
00:17:06 bigdogstl: ok, I will get those reviews done this week before my flight back to the CT TZ
00:17:19 but we are moving to Stein soon
00:17:28 perfect, that was my question :)
00:17:30 thank you
00:17:36 one thing i was thinking is to add a patrole job in the Tempest gate, as many of the service clients are used by patrole only and Tempest changes can break the patrole gate. one case was the volume schema change.
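[Editor's note] The change gmann is proposing (running Patrole jobs on Tempest changes so that service-client edits cannot silently break Patrole) would be a small Zuul config addition. A sketch of what the `.zuul.yaml` excerpt in openstack/tempest might look like is below; the job names `patrole-admin` and `patrole-member` are assumed from the Patrole repo's own gate and should be verified before merging, and the `irrelevant-files` patterns are illustrative.

```yaml
# Hypothetical excerpt from openstack/tempest's .zuul.yaml:
# run Patrole's admin- and member-role jobs in Tempest's check queue.
- project:
    check:
      jobs:
        - patrole-admin:
            irrelevant-files: &patrole-irrelevant-files
              - ^doc/.*$
              - ^releasenotes/.*$
        - patrole-member:
            irrelevant-files: *patrole-irrelevant-files
```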
00:17:58 that will be nice
00:18:00 maybe as admin and member roles, which should cover most of the cases
00:18:26 we are finding other plugins doing or wanting to do patrole, like neutron and taas
00:18:34 #action gmann to add patrole job to Tempest gate
00:18:38 I agree
00:18:45 +1
00:19:07 bigdogstl: yeah, we need to discuss how we extend patrole for other services
00:19:24 it can be a good discussion for the coming PTG.
00:19:29 +1
00:19:35 bigdogstl: are you planning to come to Shanghai?
00:19:53 I will not be there in person but will try to be on the etherpad
00:20:23 I wish, but the cost is too high for everyone
00:20:25 ok. i will make sure to get you and your team into the discussion.
00:20:36 thank you
00:20:36 next item:
00:20:38 Keystone system-scope testing - gmann
00:20:56 no progress on this. I will take this up after feature freeze and the IPv6 goal things.
00:21:12 Document the QA process or todo things for releases, stable branch cut etc - gmann
00:21:18 ditto
00:21:27 new whitebox plugin for tempest - artom
00:21:56 I have looked at it but not much activity
00:22:00 we talked about it a few days back and he will finish the few pending code things first and then push the spec
00:22:07 yeah
00:22:17 Tempest Plugin sanity check among cross tempest plugins - masayukig
00:22:21 masayukig: ^^
00:22:30 we are still holding our expected values downstream
00:22:46 but it is growing and i will talk to them to see if it is a fix later
00:23:05 The sanity check job is already voting now. And I think it's stable.
00:23:13 I think we need a gate process for plugins
00:23:39 so this is not a gate issue in the future
00:23:53 the x/ranger team worked on this
00:24:09 we hope they get back to the commit comments soon
00:24:22 I think we need a vision here
00:24:33 masayukig: nice, thanks.
00:24:46 right now we have the cookiecutter sample but not sure if we can use the tempest shell script to drive it
00:25:27 you mean to create a tempest plugin?
00:25:56 let me get the ps
00:26:18 ok
00:26:30 no, a tox.ini in the plugin as a gate
00:26:54 here is the doc change https://review.opendev.org/#/c/674121/
00:27:21 please review and add comments
00:27:37 Felipe was trying to get it as a tool over docs
00:27:51 bigdogstl: i did not notice that. that is a good initiative.
00:27:55 may need input for direction on that
00:28:00 i will check that
00:28:05 sure
00:28:27 bigdogstl: can you add that link under the sanity job item in the etherpad
00:28:36 Data plane tester -
00:28:36 will do
00:28:43 bigdogstl ^^
00:28:45 gmann, btw, does Tempest have the same spec freeze/feature freeze thing as Nova?
00:28:51 Or can I propose the spec anytime?
00:29:57 artom: no, anytime is ok.
00:30:11 gmann, aha, good to know
00:30:26 My NUMA live migration thing will land in time for Nova FF
00:30:32 So I have time to think about other things :)
00:30:40 we are just using shaker right now and are starting small changes to move to harbinger
00:30:55 artom: thanks, that is the same way i plan things for Tempest.
00:31:11 bigdogstl: ok. ping us once the spec is ready
00:31:15 we are also doing long-lived tests with a signal to stop, for resiliency
00:31:41 run east-west and north-south traffic while playing with the control plane node
00:31:59 hope to get the spec started soon
00:32:06 ok, that is really good to add upstream. thx
00:32:12 let's move next
00:32:15 #topic OpenStack Events Updates and Planning
00:32:25 Next: Shanghai Summit & PTG
00:32:46 we need to prepare the forum and PTG planning soon.
00:33:27 masayukig: if ok for you, can you start the etherpad for forum brainstorming ideas and advertise it to the ML
00:33:39 and we can use the same for PTG topics also.
00:33:52 forum brainstorming+PTG
00:34:07 gmann: ok, I'll do
00:34:15 thx
00:34:28 #action masayukig to plan the forum brainstorming+PTG etherpad
00:34:58 #topic Sub Teams highlights
00:35:04 Tempest
00:35:40 we have the stestr version bump up in the req freeze, which needs a Tempest workaround due to a testtools issue
00:35:50 #link https://review.opendev.org/#/c/681340/4
00:36:09 I have the workaround up and need to get this in by today or tomorrow. this is high priority for us
00:36:36 bigdogstl: ^^ this is one to watch in case anything breaks in AT&T testing
00:37:25 Patrole
00:37:46 for patrole, we need to get those resource leak patches in and add the job to the tempest gate
00:38:19 anything else on patrole? as I remember i mentioned a few things to discuss in a later topic
00:38:19 Once we have the downstream gate running we will be able to spotlight more upstream work
00:38:20 gmann: sure, I'll put my eyes on that patch.
00:38:26 thanks
00:38:31 thanks
00:38:52 one case needs handling, for which i am waiting for the gate result
00:39:52 Tim Burke gave a comment about the workaround
00:41:17 i see. i tried monkey patching that but it did not work. But i can check Tim's suggestion, which seems much better
00:41:24 let me test that locally and then on the gate
00:41:40 ++
00:42:11 it looks better if it works
00:42:19 yeah.
00:42:32 any other subteam to discuss anything
00:42:59 #topic Bug Triage
00:43:02 #link https://etherpad.openstack.org/p/openstack_qa_tempest_2019_bug_review
00:43:26 What is the best doc for the bug lifecycle
00:43:57 I am not sure how to close items like
00:44:11 #link https://wiki.openstack.org/wiki/BugTriage
00:44:19 bigdogstl: this is what we follow ^^
00:44:35 i think the etherpad you have created is a good way to start
00:44:41 Clark Boylan proposed openstack/devstack master: Fix worlddump log collection https://review.opendev.org/680340
00:44:56 ok I will try to catch up on the process
00:45:05 only thing we need to care about: even for old bugs, we should triage them if they still exist and are not fixed
00:45:17 like line 41
00:45:23 how do we close it
00:45:37 https://bugs.launchpad.net/tempest/+bug/1596458 - preprovision creds leaking. ISSUE/NOTES: The stack trace is pointing to tempest.common.preprov_creds.PreProvisionedCredentialProvider, and this class has been moved. We also use pre-provisioned creds and we do not have any leak at this time. PROPOSAL: close this bug
00:45:38 Launchpad bug 1596458 in tempest "Preprovisioned cred provider may leak credentials" [High,Confirmed]
00:46:21 ok, i think this should be confirmed by unit test only
00:47:28 because this is the case of an error during get creds
00:47:33 what about a tool to see if the oslo concurrency directory has hash locks left over
00:48:07 we have a piece of code that checks that and we fix them as we see them
00:48:16 most are plugin tests
00:48:58 but from tempest we use preprovision and we check the lock directory and set a yellow flag if it happens
00:49:30 most of the time it is negative tests that do not have bulletproof tear down
00:49:49 ok, that kind of stuff we can run as part of the account job at the end
00:49:51 in other words, they do not use the tempest framework
00:50:00 +1
00:50:30 bigdogstl: can you push that code up somewhere to see and then we can add it in the account job?
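[Editor's note] The lock-directory check bigdogstl describes (flag leftover oslo.concurrency external lock files after a run, while ignoring Tempest's own fixture locks) could be sketched as below. This is a reconstruction, not AT&T's actual tool; the lock-directory path and the ignored-prefix name are assumptions.

```python
# Sketch of a post-run "yellow flag" check for leaked external locks.
# oslo.concurrency external locks are plain files in the configured
# lock_path directory; pre-provisioned credential locks should be
# released by a clean run, so leftovers hint at a tear-down leak.
import os


def leftover_locks(lock_dir, ignore_prefixes=("tempest-fixture",)):
    """Return lock files left in lock_dir, minus known fixture locks."""
    if not os.path.isdir(lock_dir):
        return []
    return sorted(
        name for name in os.listdir(lock_dir)
        if not name.startswith(ignore_prefixes)
    )


if __name__ == "__main__":
    # Path is an assumption; point this at your deployment's lock_path.
    leaks = leftover_locks("/opt/stack/data/tempest/locks")
    if leaks:
        print("YELLOW FLAG: possible leaked locks:", leaks)
```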
00:50:43 but we know of no master cred leaks at this time, so I would like to close this since the bug was filed against the non-lib version
00:51:03 give me an action on that
00:51:08 bigdogstl: but the code is not changed from the old to the new lib version
00:51:22 we had to add logic to not list tempest fixture locks
00:51:43 #action bigdogstl to bring up the pre-provisioned cred resource leak code
00:51:43 ok
00:51:57 but the tests have changed over the years and i do not know of any issue
00:52:18 if we close it and there are issues we will get a new bug
00:52:35 yeah, the bug seems to be based on static code analysis, not an actual failure.
00:52:47 let me check it again and then we can take action
00:52:49 some are so old the data is not very good and the person that reported it is not responding
00:53:08 ok
00:53:24 yeah, if the author does not provide the required info, which is a must to proceed, we can mark it incomplete
00:53:36 if we have a well-defined set of bugs I will try to get other engineers to address them
00:53:43 ok
00:53:53 bigdogstl: we have the latest new bugs also for them
00:53:53 does that close it over time
00:54:20 we used to maintain it in single digits but now it is 45
00:54:22 #link https://bugs.launchpad.net/tempest/+bugs?search=Search&field.status=New
00:55:00 those are good to target; keep triaging them and add them in the same etherpad, which we further discuss during office hour
00:55:11 those are the latest bugs reported
00:55:13 I was trying to affect the 160 number before the storyboard move
00:55:35 ok
00:55:40 yeah, we should keep doing those also
00:56:04 I will get some time to add the comments in the etherpad but not in this or next week
00:56:14 ok
00:56:15 but let's check one by one and add comments there
00:56:21 4 min left
00:56:35 anything on bug triage?
00:56:52 not here
00:57:06 ok, thanks
00:57:10 #topic Critical Reviews
00:57:24 any other critical reviews than we already discussed?
00:57:45 https://review.opendev.org/#/c/668823/ This should be an easy one
00:58:19 masayukig: so badges work fine?
00:58:31 yeah, it works now
00:58:51 perfect, i will check. thanks masayukig for getting this goal done.
00:59:01 #topic Open Discussion
00:59:15 no worries :)
00:59:44 This was my first office hour. I will be better prepared for the next one, now that I understand the format. Thanks
01:00:08 we have 2 items in open discussion: 1. the swift/ceph bug, which i will comment on 2. the devstack speedup thing, which we are discussing/discussed over the patch and ML.
01:00:32 bigdogstl: thanks a lot for showing up and working on many items.
01:00:40 ++
01:01:00 bigdogstl: we have a very simple format here: first 30 min for status things and 30 min for technical/bug discussions
01:01:02 I wanted to point out that it feels like the openstack gate needs an air traffic controller. We've got lots of resets happening across many projects and the queues are long
01:01:02 You are welcome, and thank you to you two and the other reviewers
01:01:44 some sort of coordination around getting bug fixes ready and enqueued/promoted would likely be helpful if someone is able to dig into that
01:02:19 clarkb: oh. do you have links or bugs?
01:02:28 gmann: that's the problem, we need someone to do that work
01:02:43 I've been unable to dig into why openstack jobs are failing as I've been debugging infra issues
01:03:04 which gates?
01:03:14 the openstack integrated gate queue
01:03:29 keystone + nova + cinder + neutron + glance (and probably others)
01:03:58 grenade and openstack-tox-py27 seem to be failing frequently
01:04:29 ok
01:04:55 basically we need someone to identify issues, coordinate getting them fixed, then point out when they are ready so that we can promote them and fix them quicker
01:05:15 seems like an upper-constraints issue on py27
01:05:17 mriedem was looking at failures today and filed some e-r bugs
01:05:18 ok
01:05:33 openstack-tox-py27 should be easy to create bugs and gather information about
01:06:05 gmann: cmurphy points out it is related to configparser changes maybe?
over in #openstack-infra
01:06:16 Sorry for the silly question, I am new and trying to see when myself or my team mates can help out
01:06:19 hi
01:07:21 yeah, i am afraid that was not caught during the version bump in the req gate
01:07:41 bigdogstl: I think the first thing we need is someone to spend some time identifying the common errors we are seeing so that we can file bugs and get the appropriate teams working to fix them. Then as fixes happen, let the infra team know so that we can promote those changes ahead of changes that are likely to fail due to the bugs that these fixes fix :)
01:08:01 +1
01:08:02 bigdogstl: it's about the end of my day but I'll be around starting at about 1500 UTC tomorrow if people want a quick walkthrough
01:08:44 sure.
01:08:54 I will try to dig into those today if I get time
01:09:15 let's close the office hour as time is over.
01:09:18 #endmeeting