16:00:45 #startmeeting defcore
16:00:45 Meeting started Wed Jan 13 16:00:45 2016 UTC and is due to finish in 60 minutes. The chair is eglute. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:46 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:48 The meeting name has been set to 'defcore'
16:01:19 Hello Everyone! This week's agenda, please review and add as needed! #link https://etherpad.openstack.org/p/DefCoreRing.8
16:01:38 o/
16:01:39 raise your hand if you are here for the DefCore meeting
16:01:44 #chair hogepodge
16:01:44 o/
16:01:44 Current chairs: eglute hogepodge
16:01:54 eglute: here!
16:02:03 o/
16:02:11 * eglute waves to everyone
16:02:42 Rob will be joining us a little later
16:03:23 Under the agenda, I have entered a few items that have been resolved since the last meeting, in case you are curious. Let me know if you have questions about any of them
16:04:20 #topic Full data set from running tests submitted to the Foundation
16:05:00 Right now, only partial test results are submitted to refstack,
16:05:16 which does not provide a full picture
16:05:35 The way DefCore is structured, we allow vendors to submit only passing results without any run data associated with them
16:06:04 It's just a list of passed tests. I've had concerns for a while that it's really easy to cheat the testing.
16:06:34 also, it does not show the whole picture
16:07:02 We had a vendor submit test results that looked suspicious. They offered an explanation that I'm satisfied with, but it may benefit our trademark protection efforts to require that more data be sent privately.
16:07:05 and i think we want to have full data, that was the idea originally anyway
16:07:28 hogepodge: I inquired about that (cheating) a couple of meetings ago. Rob seemed to think that the "Stay Off the Grass" sign and the fact that it's easy to see who has cooked the books is enough to keep organizations from doing so...
16:07:29 hogepodge: The subunit results really only include failures. Is that what you're referring to?
16:07:56 dwalleck: subunit includes all sorts of information about passes and failures
16:08:46 * markvoelker arrives late after another call ran long
16:08:55 eglute: Is the concern with sending the full set of data that it may contain tenant-specific information (not anonymized)?
16:08:56 dwalleck: it's harder (but not impossible) to create a fraudulent subunit file. It's trivially easy to produce a refstack json file. That's by design (not a knock against refstack at all)
16:09:36 leecalcote I think so. however, anyone testing could create new tenants/users/projects just for defcore testing if that is the issue
16:09:41 background info on the DefCore decision to send pass-only tests: https://github.com/openstack/refstack/blob/master/specs/prior/implemented/simplify-uploads-by-only-sending-pass-results.rst
16:10:04 thank you catherineD
16:10:53 Sending full data that isn't publicly posted seems like a reasonable idea, but if we're really concerned about fraudulent results it probably doesn't go far enough. Solving that is tricky though.
16:10:58 Major reasons: 1) privacy 2) cannot differentiate a failed vs. a skipped test
16:11:47 hogepodge, I think you mentioned the Foundation was looking at adding additional language to the license contracts asking vendors to certify that their results and/or the test code haven't been tampered with?
16:11:51 also the size of the data ... since we encourage testing of the entire API test suite, not just the DefCore tests
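[Note: to make the "trivially easy to produce" point above concrete, a pass-only upload is just a list of test names with no run evidence attached. The snippet below is an illustrative, simplified payload rather than the exact RefStack schema; see the spec linked at 16:09:41 for the real format.]

    import json

    # Simplified, hypothetical pass-only payload; field names are
    # illustrative rather than the actual RefStack upload schema.
    fabricated = {
        "cpid": "example-cloud-id",
        "duration_seconds": 1042,
        "results": [
            {"name": "tempest.api.compute.servers.test_list_servers"},
            # Any test name can be appended without ever running it,
            # which is why a bare pass list carries no evidence of a run.
        ],
    }
    print(json.dumps(fabricated, indent=2))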
16:12:15 hogepodge: I couldn't remember, still trying to pull up an actual file. Given that subunit is a standard protocol, wouldn't that still make it easy to doctor results?
16:12:27 we can do that
16:12:35 Perhaps, to maintain privacy while guaranteeing the validity of results (undoctored), some sort of hashing, signing, or encrypting of results (w/o the full data set) could be considered?
16:12:55 leecalcote: I was thinking about that last night too...I'm not sure it'll work.
16:13:00 dwalleck: yes, but it would require more expertise
16:13:04 I don't have a good answer.
16:13:22 markvoelker: because the testing tool itself could be doctored?
16:13:24 E.g. even if we have refstack sign the results before submitting them, it's pretty easy to doctor the refstack/tempest code to give you a window to "fix" results before signing
16:13:31 leecalcote: right
16:13:35 Living in the OpenStack testing world, I've heard way too many reports of vendors faking data to pass CI.
16:14:01 you could make the run that counts remotely
16:14:03 It's a weakness of our testing framework that it's hard to test, and it makes independent testing more difficult.
16:14:06 dwalleck: The issue is that in the failure cases the subunit may include credential info
16:14:14 would require importing your public key into the user's machine
16:14:16 Really the only way to solve it is with independent testing/auditing, but that's a whole other can of worms.
16:14:32 markvoelker: I've had to do multiple hashings of files to keep a chain of evidence for testing of medical devices
16:14:44 The one time I helped a vendor run tests remotely was not a fun time. It's not a sustainable practice.
16:14:53 fair enough
16:15:02 All that said though, I think maybe there's a reasonable middle ground here...
16:15:36 I think if someone really wants to cheat, they will find a way. Adding language to the license agreement plus full tests submitted privately sounds like a good solution for now
16:15:49 Some basic precautions (like submitting full results), some deterrents (expanded legal language), and some better guidance on how to run tests and what's acceptable will probably head off most problems
16:15:56 eglute: ++
16:15:59 eglute: +2
16:16:16 Those that remain are pretty likely to be found out via other channels....heck, we already found one oddity and we aren't even running that product (that I know of). =)
16:16:30 i agree with markvoelker
16:17:16 i think the foundation and hogepodge will need to work on the legal language
16:17:19 Another middle ground would be to add language to force submittal of full test results upon request (in suspicious cases).
16:17:22 o/
16:17:42 leecalcote that is also a good idea
16:17:47 Would asking vendors to provide a consistent set of results rather than just a single passing run make any difference?
16:17:59 how do people feel about private results for all runs?
16:18:02 So maybe the thing to do here is to explore the potential pitfalls of submitting full results and figure out if there are reasons not to do it? E.g. if that would include "sensitive" data, for example.
16:18:38 And also what the feasibility of storing it is...
16:18:43 we've heard that people are unwilling to submit private cloud data - this could allow those companies to submit too
16:18:53 how large is the full data set?
16:19:13 markvoelker: I think other OpenStack projects have found ways to sanitize their logs. I don't think doing that with Tempest would be very difficult, especially if we're only talking about credentials
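[Note: a minimal sketch of the signing idea discussed above, using the Python `cryptography` library. The key and file names are assumptions for illustration; and, as noted at 16:13:24, a doctored tempest/refstack client can still "fix" results before this step runs, so signing raises the bar rather than closing the hole.]

    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    # Load the signing key (an unencrypted PEM file is assumed here
    # purely to keep the sketch short).
    with open("signer_key.pem", "rb") as f:
        private_key = serialization.load_pem_private_key(f.read(), password=None)

    # Sign the raw bytes of the results file.
    with open("results.subunit", "rb") as f:
        data = f.read()

    signature = private_key.sign(
        data,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )

    # A detached signature travels alongside the results file; it only
    # proves the file was not altered after this point, not that the
    # run that produced it was honest.
    with open("results.subunit.sig", "wb") as f:
        f.write(signature)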
16:19:14 dwalleck: I think there's something to that suggestion, given that a one-time validation upon initial deployment doesn't guarantee the same test results a year later. How often are passing test results required to maintain certification?
16:19:43 leecalcote, re-testing is another topic we have been discussing :)
16:19:43 dwalleck: agreed, I don't think it'll be a big deal. But we should make sure of that, is all. =)
16:19:49 dwalleck, what about error data in logs for failed tests?
16:19:56 there is a patch on re-testing that is waiting for an update
16:20:56 eglute: I think the size of the last full result I had was all of 175kb
16:20:56 My suggestion would be to send results to refstack for public review (as required by our open process), and require subunit files to be sent to the Foundation (where they will not be made public at all)
16:20:56 dwalleck that seems very manageable
16:20:56 leecalcote: https://review.openstack.org/#/c/232128/ < recurring testing patch
16:20:56 I have seen files that reach a couple hundred MB, all depending on the number of failed tests
16:21:13 i like hogepodge's suggestion
16:21:20 eglute, markvoelker: ah, got it.
16:21:23 zehicle: I'm trying to think of a case where system error data would be spilled out. If a project is returning sensitive stack traces, that sounds like a project bug
16:21:43 catherineD: I'm hoping I only get files with a good set of passes. :-D
16:21:51 dwalleck, agreed, but if it happens then that's a potential breach
16:21:59 hard to protect against infrequent events
16:22:08 But I have lots of failing Tempest results these days :-) I'll double check to remind myself of what gets exposed
16:22:20 hogepodge: What's the feasibility of the Foundation having a separate system to store this sort of thing? E.g. would it make more sense to send it to the refstack server but just not have it accessible, in order to minimize the maintenance?
16:23:03 markvoelker: I would love that. It's more catherineD and her team's decision to handle that.
16:23:29 markvoelker: ++ that way vendors may be more willing to share potentially private data
16:23:50 hogepodge, markvoelker it does not have to be a full refstack. you just need a sensitive drop box that would scrub and forward
16:23:51 sounds like the data storage is a separate discussion, but besides that we all agree that we should ask for the full data set privately?
16:24:10 I think I'd feel better if the data was encrypted with a key that only the foundation can read
16:24:15 markvoelker, +1
16:24:23 rather than sent around in an email
16:24:39 sounds like the data storage is a separate discussion, but besides that we all agree that we should ask for the full data set privately?
16:24:51 Ok, so it sounds like we generally agree that we should look into the feasibility of sending the complete data and work on a design for doing so. Maybe the thing to do here is record a couple of AIs for folks to work on analyzing some of the issues?
16:24:56 What motivates vendors to send in tests with failed results anyway? Could the tool make it clear whether the results have fallen below the bar and eliminate having to store large, failed test results?
16:25:29 #action hogepodge catherineD will discuss storage/submission of full test results
16:25:32 #chair zehicle
16:25:34 Current chairs: eglute hogepodge zehicle
16:25:45 speaking from experience, the data is really boring. When the program was launched I got a lot of subunit files. Most everybody sends minimized passing results.
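[Note: a minimal hybrid-encryption sketch of the "key that only the foundation can read" suggestion at 16:24:10, again using the Python `cryptography` library. A fresh symmetric key encrypts the (potentially large) results file and only that small key is wrapped with the Foundation's RSA public key; the file names and the existence of a published Foundation key are assumptions for illustration.]

    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    # Hypothetical published public key for the Foundation.
    with open("foundation_pub.pem", "rb") as f:
        foundation_key = serialization.load_pem_public_key(f.read())

    # Encrypt the bulky results with a one-off symmetric key...
    sym_key = Fernet.generate_key()
    with open("results.subunit", "rb") as f:
        ciphertext = Fernet(sym_key).encrypt(f.read())

    # ...and wrap only the small symmetric key with RSA-OAEP, so that
    # only the holder of the Foundation's private key can recover it.
    wrapped_key = foundation_key.encrypt(
        sym_key,
        padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None),
    )

    with open("results.enc", "wb") as f:
        f.write(ciphertext)
    with open("results.key.enc", "wb") as f:
        f.write(wrapped_key)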
16:25:54 I have a failure log, with more than half of the tests failed, that's 450kb
16:26:28 I would like to move to the next topic, since we agree that we want the full data set. Let's work out the details separately. Everyone ok with this?
16:26:37 hogepodge, passing DefCore only or all their passing results?
16:26:50 dwalleck: is that a full API test run?
16:27:01 when I run tests I like to see what was skipped and what was ignored and what fails. It helps to diagnose configuration issues.
16:27:15 eglute: yes
16:27:22 thanks.
16:27:25 catherineD: Just the tests in the DefCore spec, if that's what you mean
16:27:25 #topic RefStack requirement doc for DefCore review
16:27:35 eglute: I think we agree that we want the full data set when asked by the Foundation
16:27:36 #link https://docs.google.com/document/d/1s_dAIuluztlCC6AZ-WO4_FR2CLje1QyS6kbMxlFHaMk/edit
16:28:14 Somewhat related, we should remove the test in question from the guideline. QA is going to get rid of it anyway.
16:28:19 dwalleck: as said earlier, we encourage people to run the full API tests, not just the DefCore tests
16:28:22 thanks catherineD for sending this for review. I read through it, and it looks ok. would you enable comments on the document?
16:28:23 before the guideline is approved
16:28:49 #action hogepodge to remove the test in question from the guideline
16:29:26 catherineD would you give a quick overview of the things you would most like feedback on?
16:29:42 regarding the refstack document?
16:29:43 eglute: will do :-D
16:29:50 * eglute thanks hogepodge
16:31:25 eglute: yes ... maybe I should list out the major areas for review in the next meeting ..
16:31:57 thank you catherineD, that would be helpful. also, if you enable commenting, it would probably be easier to provide feedback
16:32:01 in the meeting etherpad, please ;-)
16:32:41 sure thx
16:32:42 #action everyone review RefStack requirements document https://docs.google.com/document/d/1s_dAIuluztlCC6AZ-WO4_FR2CLje1QyS6kbMxlFHaMk/edit#heading=h.8fxpb1onf6vr
16:33:24 we will go over it next meeting as well :)
16:33:28 thank you catherineD
16:33:34 #topic midcycle
16:34:00 thanks everyone who voted. March 8-9 in Austin, TX received the most votes
16:34:40 zehicle brought up a good point that SXSW could interfere with travel, but this is several days before any SXSW events start (i think!)
16:34:49 so hopefully it would not be an issue
16:35:12 it will start getting crazy on the 11th
16:35:24 does anyone have any other concerns about the date/location?
16:35:29 there are some early events, but not the big stuff
16:35:52 We are working on the location, and once that is finalized, we will let everyone know
16:36:22 eglute: we should also get this listed on the Sprints wiki page.
16:36:33 ++
16:36:38 #link https://wiki.openstack.org/wiki/Sprints
16:36:43 markvoelker good idea, would you do that?
16:36:50 Sure
16:36:57 #action markvoelker list midcycle on sprints wiki page
16:37:30 i also created an etherpad for topics, please start adding/updating: #link https://etherpad.openstack.org/p/DefCoreSpring2016MidCycle
16:38:11 #topic Problem encountered with changed tests
16:38:28 o/
16:38:36 rockyg created etherpad https://etherpad.openstack.org/p/novav2extensionstestchanges
16:38:46 and there was some mailing list discussion
16:39:13 rockyg has some questions here http://lists.openstack.org/pipermail/defcore-committee/2016-January/000990.html
16:39:33 1. In the discussions, there are multiple times where Chris mentions that a vendor could just use an older version of the test. This raises the questions:
16:39:33 - How would you do this with Refstack?
16:40:01 we could leave this for mailing list discussion, or discuss it here
16:40:47 there are a lot of issues here
16:41:08 I don't particularly care or check which version of tempest is used. Should defcore have an opinion on that?
16:41:49 community standards and tests change. how obligated is defcore to honor that?
16:41:50 this lines up w/ the idea of having a dedicated set of test items for DefCore
16:41:56 yah. is it ok to use old versions of test(s)? This could have some major issues
16:42:14 I would like to have a blessed version that is known to work
16:42:25 rather than work with the latest and greatest and maybe broken
16:42:27 originally, we assumed you'd use the version matching the version you'd test
16:42:32 +1 gema
16:42:34 but then we undid the version stuff
16:42:35 originally, Refstack talked of having a SHA for the test set
16:42:35 that a company passed in 2015.05 and not 2015.07 or 2016.01 represents evolving standards, and I don't necessarily think that's a problem
16:42:38 Well, this case at least demonstrates the possibility that two products certifying under the same Guideline might not be considered interoperable, so it's probably worth discussing
16:43:07 a sha for the tempest version is easy metadata to add, and could help with troubleshooting
16:43:16 My fundamental question is why do we have 2 Guidelines (2015.05 and 2015.07) for Juno?
16:43:27 we're not testing Juno
16:43:36 my personal feeling is "always use latest", because latest likely has fixed more bugs
16:43:37 we have multiple guidelines that include Juno
16:43:56 hogepodge: our deployed clouds are not latest
16:44:08 hogepodge, when testing older stuff, new may introduce issues too
16:44:58 So, we know later versions of tests fixed *some* problems with the tests. That's why we can flag them (one reason)
16:45:43 I'm not so willing to flag tests that reflect the evolving standards of the community. That's just one voice in the committee though
16:46:30 hogepodge: So what if the community changes its mind about something midway after some vendors have gotten a license agreement?
16:46:36 For rockyg's case, since the certification is for Juno .. they can use 2015.04 or 05 or 07 ?
16:47:05 Yeah, and no. We flag existing tests that don't work (have bugs) but that we eventually want in, not to take out tests we want (but maybe, if changed, to fix if appropriate -- this isn't really appropriate)
16:47:10 catherineD, yes
16:47:38 rockyg: in that case you should use 2015.05 ...
16:47:44 catherineD, only the 2 latest sets
16:47:51 Um, not quite. We can use 05 and 07 right now, but will only have 07 and 16.01 once 16.01 is out
16:47:58 catherineD: they can use 05 or 07, but the problem here is really what version of Tempest to use, not what version of the Guideline
16:48:05 markvoelker: this highlights that our goals for the tests (interoperability) aren't reflected in the goals of the tests we pick (qa).
16:48:21 eglute, right, the 2 latest approved.
16:48:31 Great phrasing, hogepodge!
16:48:36 markvoelker: I wish we could start a test suite from scratch that builds off a list of APIs and behaviors we want from an interoperable cloud
16:48:47 hogepodge: ++
16:48:50 hogepodge, I've been suggesting that
16:48:52 hogepodge: +1
16:48:57 hogepodge: Hmm...not so sure of that actually. The change in this case was made to foster interoperability. It's just that the timelines differed for DefCore vs Nova.
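[Note: a sketch of the "easy metadata" suggestion at 16:43:07 - recording the exact Tempest commit alongside the results so a run can be reproduced and compared later. The payload fields are illustrative, and the sketch assumes it is pointed at a Tempest git checkout.]

    import json
    import subprocess

    def tempest_sha(repo_path):
        """Return the git commit SHA of the Tempest checkout used for the run."""
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], cwd=repo_path, text=True
        ).strip()

    # Illustrative run metadata: attaching the SHA means "which Tempest
    # was this?" never has to be reconstructed after the fact.
    results = {
        "guideline": "2015.07",                    # hypothetical metadata
        "tempest_sha": tempest_sha("/opt/tempest"),
        "results": [],                             # test outcomes would go here
    }
    print(json.dumps(results, indent=2))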
16:49:03 is a topic at the midcycle
16:49:31 the problem with that, unless it's used for gating, is that we'd then have multiple, possibly conflicting test sets
16:49:36 everyone - that's its own topic. we should discuss it as a dedicated item
16:49:58 rockyg: you'd have to use it for gating, or else what comes out of the pipeline may not be compliant
16:50:11 markvoelker: rockyg: refstack-client allows installing any tempest version, sha, or tag ..
16:50:14 gema, exactly
16:50:50 catherineD, is it documented? I don't think we got that far, yet
16:50:55 catherineD: Right, that's why I asked on the ML if they'd tried running with a version of Tempest that predates the additionalProperties change
16:50:56 yes
16:51:21 +1 on what zehicle suggested. let's move the "own tests" topic to the midcycle
16:52:06 rockyg: https://github.com/openstack/refstack-client < "-c option allows to specify SHA of commit or branch in Tempest repository which will be installed."
16:52:29 eglute, based on the discussion here, we need to budget a lot of time for it
16:52:32 markvoelker: rockyg: they can if they do not insist on being compliant with the latest Guideline (2015.07)
16:52:35 markvoelker: thx
16:52:40 * eglute agrees with zehicle
16:53:03 it may be worth getting an up/down decision before the midcycle and then focusing on implementation if people want it
16:53:06 Ah, cool. But how about compliant, but with a set SHA?
16:53:54 any other topics before we run out of time?
16:54:12 catherineD: I think Rocky's original mail said they were shooting for 2015.05 since that's what the private cloud version of the product used, so they should be ok
16:54:36 yes, there are a couple of other topics, but we can move them to next week if needed.
16:54:45 No, they want 07, but 07 isn't tied to a sha like 05 is
16:55:00 markvoelker: technically we should be OK... in reality, a cloud that passes the DefCore tests may not be interoperable!!!
16:55:15 05 is tied to a tempest label, 4
16:55:36 catherineD: =) Agreed, the journey isn't over yet.
16:55:50 rockyg: Ping me after on #openstack-defcore, I'm not sure that's true
16:56:02 k
16:56:29 ok, let's move this to the defcore irc/ML later
16:56:38 #topic adjusting scoring weights
16:56:49 #link https://review.openstack.org/#/c/226980/
16:57:06 please review. I agree with markvoelker but would like to hear other voices on the subject
16:57:26 this could potentially affect what capabilities end up in 2016.07 as advisory
16:57:50 * markvoelker is getting in the habit of writing very verbose commit messages, apparently
16:58:01 seems like a reasonable tweak
16:58:07 it would need board approval
16:58:27 should we update the process to have a max change for the weights per cycle?
16:58:33 zehicle would it, since it is not in the process doc? only the .json?
16:58:37 nah - never mind.
16:58:51 I believe the process requires us to get Board input
16:58:59 zehicle agree
16:59:04 * zehicle actually, I'm sure of it
16:59:07 ++
16:59:16 zehicle: I would definitely want Board input even if not required. =) Hence, it's in 2016.next
16:59:38 we might need to add something about the weights to the process doc as well
16:59:45 and we are out of time. so please review!!
16:59:46 it is required - I remember when we put it in
17:00:02 also please review https://review.openstack.org/#/c/253138/
17:00:04 * zehicle feels like the archivist
17:00:06 thanks everyone
17:00:08 #endmeeting