15:01:00 #startmeeting third-party
15:01:01 Meeting started Wed Feb 18 15:01:00 2015 UTC and is due to finish in 60 minutes. The chair is krtaylor. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:02 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:04 The meeting name has been set to 'third_party'
15:01:11 Hi everyone
15:01:17 hey
15:01:17 moin moin
15:01:18 hi
15:01:29 Hello
15:01:34 time for another Third Party CI WG meeting
15:02:01 hi
15:02:06 hi
15:02:38 looks like we have a good group today
15:02:39 o/
15:03:01 here is the agenda:
15:03:05 #link https://wiki.openstack.org/wiki/Meetings/ThirdParty#2.2F18.2F15_1500_UTC
15:03:51 #topic Announcements
15:04:19 hi
15:04:32 I'll start off by reminding everyone of gerrit being upgraded March 21st
15:04:41 hi asselin
15:05:02 what is the "impact" of the upgrade to users?
15:05:32 ja_, for those with firewalls blocking the port, it means firewall updates
15:05:39 ja_, really only if your CI system needs some kind of firewall egress configuration
15:05:48 ok thx
15:05:51 asselin beat me to it
15:06:20 ok, any other quick announcements before we move on?
15:07:09 #topic Third-party CI documentation
15:07:59 ok, so we still have some work to do here, especially in running-your-own
15:08:41 and I am thinking that it is slowed due to the need to walk through it
15:09:26 we have some patches, which reminds me
15:09:57 rfolco, can you change your topic to 'third-party-ci-documentation' on your patch
15:10:20 that is for everyone - so we can track all with one query
15:10:32 #link https://review.openstack.org/#/q/topic:third-party-ci-documentation,n,z
15:10:37 kragniz, summary line ?
15:10:46 krtaylor, ^
15:10:59 (sorry)
15:11:25 rfolco, you should have a little writing pad next to Topic in the upper left of your patch review
15:11:35 you can edit the topic in gerrit
15:12:11 topic is master, is that one ?
15:12:53 rfolco, just below Branch
15:13:12 done https://review.openstack.org/#/c/155864/
15:13:16 thanks
15:13:45 lennyb, since you are getting started, any comments on the running-your-own doc would be helpful as well
15:14:22 ja_, your continued input is appreciated too
15:14:51 ok, onward
15:15:03 #topic Spec for in-tree 3rd party ci
15:15:14 krtaylor, thanks I will try to document
15:15:26 asselin, you have a new rev on the spec
15:15:28 so I updated the spec a bit yesterday
15:15:39 #link https://review.openstack.org/#/c/139745/
15:15:56 yes, it is now a 'priority effort' for openstack-infra
15:16:02 I haven't had a chance to review it yet, will today
15:16:17 yea!
15:16:34 yes, very excited about that! :)
15:17:18 asselin, it will enable a lot of goodness
15:17:44 asselin, is the refactor an override on infra puppet, a fork or what ? could you please clarify ?
15:18:23 rfolco, the refactor is to allow the puppet scripts to be more easily reused
15:19:04 rfolco, there are lots of sections in system-config that are needed, but not easily reusable
15:19:51 I read the spec and I had the impression it was a fork from infra scripts
15:20:15 that's today's solution
15:20:41 rfolco, could you comment on the specific sections? I will try to clarify
15:21:12 asselin, will do thx
15:21:37 yes, and just like the puppet module split-out, a great opportunity for third-party WG to get involved and help out
15:21:38 #link https://review.openstack.org/#/c/137471/
15:21:49 ^^ this is a related spec that will help a lot as well
15:23:20 any other questions on in-tree ?
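A note on the firewall question raised during the announcements: third-party CI systems consume Gerrit's event stream over SSH, normally on port 29418, so the upgrade only matters if your egress rules pin that host and port. Below is a minimal connectivity check, a sketch assuming the usual review.openstack.org:29418 endpoint; adjust host and port for your own environment.

    #!/usr/bin/env python3
    # Rough egress check for the Gerrit SSH/event-stream port.
    # Host and port are assumptions: review.openstack.org on 29418 is the
    # usual endpoint a third-party CI listens to; adjust for your setup.
    import socket
    import sys

    GERRIT_HOST = "review.openstack.org"
    GERRIT_PORT = 29418

    try:
        # If this times out or is refused, the CI host's firewall egress
        # rules (or the Gerrit side) need attention before the upgrade.
        sock = socket.create_connection((GERRIT_HOST, GERRIT_PORT), timeout=10)
    except OSError as exc:
        print("cannot reach %s:%d: %s" % (GERRIT_HOST, GERRIT_PORT, exc))
        sys.exit(1)
    else:
        print("egress to %s:%d looks OK" % (GERRIT_HOST, GERRIT_PORT))
        sock.close()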
15:23:34 an action for everyone, please go read the spec
15:24:04 thanks for the overview asselin
15:24:29 oops asselin_
15:24:41 np :)
15:24:49 ok, next
15:24:52 #topic Repo for third party tools
15:25:09 like last week, I am socializing the idea of creating a repo
15:25:41 a place for ci teams to share their scripts and other goodies that make their job easier
15:26:19 +1 I like the idea
15:26:21 if the consensus is that it is a good idea, I'll see about getting that setup
15:26:51 i like this idea, you thinking a openstack repo or just a public github kind of thing?
15:26:57 there are several tools available, but unless you know about someone's github account, you don't know they exist
15:27:18 actually, a stackforge repo
15:27:29 gotcha
15:27:46 we'd have to discuss the organization of it, etc
15:28:19 * krtaylor wonders if we'd need a spec to propose the idea formally
15:28:54 that might be a good idea, or at least a wiki page or something to capture what we decide for organization
15:29:08 well, I haven't found anyone that hated the idea...yet
15:29:40 A wiki or etherpad might be good to start.
15:29:52 I know we have tools internally that we could share, scripts, dashboards, etc
15:29:59 * krtaylor agrees
15:30:38 #action krtaylor to set up wiki for third-party CI WG repo, and/or etherpad
15:31:46 ok, goodness, thanks everyone for the input
15:31:54 lets move on
15:31:59 #topic Restart monitoring dashboard effort
15:32:05 sweston, ping?
15:32:48 there is an effort to have a public monitoring dashboard, basically a new/nicer/more featured radar
15:33:20 some good comments here:
15:33:26 #link https://review.openstack.org/#/c/135170/
15:34:01 it would replace the need to change status on ThirdPartySystems, at least eventually
15:35:01 sweston has been swamped with work from his dayjob, but everything is available, just needs input, reviews, ideas to converge
15:36:00 ok, we can come back to that if time permits, but I want to get to the next bit
15:36:47 #topic Highlighting Third-Party CI Service
15:37:15 continuing on the success of rfolco's discussion of PowerKVM CI
15:37:37 this week we have Pure Storage CI
15:38:01 patrickeast, can you share a brief intro on your system
15:38:06 yep
15:38:16 maybe some problems you had and how you solved them
15:38:35 so, first off i made some stuff to share
15:38:41 http://ec2-54-67-119-204.us-west-1.compute.amazonaws.com/ci_stuff.svg
15:38:49 https://github.com/patrick-east/os-ext-testing-data
15:38:58 a poorly drawn diagram of our setup
15:39:04 and what we use for our data repo
15:39:23 #link http://ec2-54-67-119-204.us-west-1.compute.amazonaws.com/ci_stuff.svg
15:39:31 #link https://github.com/patrick-east/os-ext-testing-data
15:39:49 as of a month (almost 2) ago i have switched our system over to using asselin’s https://github.com/rasselin/os-ext-testing scripts
15:40:12 nice
15:40:16 prior to that we had started with the instructions on jay pipes blog post and cobbled together a system without nodepool
15:40:28 we ran into all kinds of issues with re-using static slaves though
15:40:38 nice pic
15:40:56 yep, without nodepool due to setup requirements?
15:41:12 we went that way originally just due to lack of knowing any better
15:41:35 ah, ok
15:42:00 patrickeast, what's "RDO"
15:42:13 RedHat Repo?
15:42:17 https://openstack.redhat.com/Main_Page
15:42:31 its like their open source version of the redhat openstack stuff
15:42:41 similar to centos vs rhel
15:43:13 it made it very very easy to get setup with openstack
15:44:12 so, as you may have noticed on the diagram we have the nice high speed data connections that are currently not used… thats on my list of todo’s
15:44:20 we are testing our cinder driver
15:44:22 patrickeast, so I take it you are only testing cinder patches
15:44:27 correct
15:44:37 right now we are only listening for openstack/cinder changes on master
15:44:56 and run the volume api tempest tests
15:45:31 we are planning to add a FC cinder driver for our array in L-1
15:45:48 so i’ll be adding support for that into the system early in L
15:46:02 what was a really tricky part that you had to work through?
15:46:37 * asselin_ has fc ci scripts to share in some future repo tbd
15:47:08 probably the hardest part was figuring out how to properly configure everything… all told there are like 50 config files involved between the openstack provider and ci system
15:47:33 this is where that documentation push is really going to shine
15:48:09 patrickeast, does rdo help out with the openstack provider configs? or are those the ci config to point to the provider?
15:48:48 it does get everything setup and working, but we’ve had to go back through and customize things a bit
15:49:00 like where nova stores instances, and glance keeps images
15:49:05 due to partitioning on the system
15:49:17 and we had to delete all of its automatic network setup and do our own
15:49:36 patrickeast, we had a similar situation, but as we worked through everything, we found ways to use upstream as-is and have less delta
15:50:49 yea my goal is to try and reduce that when we add in the FC testing
15:51:10 right now its only a single initiator we test with
15:51:29 i’ve got 2 more even bigger ones on the rack next to it waiting to be hooked up with the array
15:51:48 for those ones i’m hoping to improve upon the current setup a bit
15:52:00 patrickeast, have you automated any other parts of the system for your testing?
15:52:41 nothing significant, we’ve added in some scripts to cleanup the array once it is done testing
15:53:07 created any monitoring framework?
15:53:20 actually yea, a little one
15:53:29 https://github.com/patrick-east/os-ext-testing-data/tree/master/tools/server_monitor
15:53:31 so
15:53:38 * asselin_ looking
15:53:44 we ran into a few times where the system would start failing for X reason
15:53:48 * krtaylor looks too
15:53:48 either disk out of space
15:53:56 or the job was unregistered
15:54:01 or the array went down
15:54:20 so i made a little script that sits on the master and sends email alerts whenever something like that happens
15:54:30 * patrickeast doesn’t know how to use nagios
15:54:43 hehheh
15:54:53 we are looking at using nagios also
15:55:18 hm, something else to share with the community...
15:55:44 my company's IT and internal ci teams use nagios quite a lot, so i’m hoping to get their help one day to make this integrated with all their dashboards and stuff
15:55:58 what does -infra use?
15:56:00 but for now its nice to get an email instead of checking in and seeing it failed the last 50 builds
15:56:34 asselin_, eyeballs :)
15:56:56 I configured zuul to send me e-mails on job status. I check the results periodically.
15:57:03 also a good way to fill up your mailbox
15:57:20 +1000, yeah we did that, then turned it off
15:57:31 but looking for something better.
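For reference, a minimal sketch of the kind of check-and-alert script patrickeast describes. This is not the server_monitor tool linked above; the SMTP host, addresses, and disk threshold are placeholders, and a real version would add the job-registration and array-health checks mentioned in the discussion.

    #!/usr/bin/env python3
    # Minimal sketch of a check-and-alert script; NOT the server_monitor
    # tool from the os-ext-testing-data repo. The SMTP host, addresses,
    # and disk threshold below are placeholders.
    import shutil
    import smtplib
    from email.mime.text import MIMEText

    DISK_PATH = "/"                   # partition the CI master fills up
    MIN_FREE_BYTES = 10 * 1024 ** 3   # warn when less than ~10 GiB is free
    SMTP_HOST = "localhost"
    MAIL_FROM = "ci-monitor@example.com"
    MAIL_TO = "ci-team@example.com"

    def check_disk():
        """Return a warning string if free space is below the threshold."""
        usage = shutil.disk_usage(DISK_PATH)
        if usage.free < MIN_FREE_BYTES:
            return "only %d MiB free on %s" % (usage.free // 1024 ** 2, DISK_PATH)
        return None

    def send_alert(body):
        """Mail the warning instead of waiting for a wall of red builds."""
        msg = MIMEText(body)
        msg["Subject"] = "[third-party CI] health alert"
        msg["From"] = MAIL_FROM
        msg["To"] = MAIL_TO
        with smtplib.SMTP(SMTP_HOST) as smtp:
            smtp.sendmail(MAIL_FROM, [MAIL_TO], msg.as_string())

    if __name__ == "__main__":
        problem = check_disk()
        if problem:
            send_alert(problem)

Run from cron on the CI master, this gives the "email instead of checking in" behaviour described above; the alert condition can be dialed back, for example only alerting after several consecutive failures, to avoid filling a mailbox as noted later in the meeting.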
thanks patrickeast :)
15:57:50 excellent, thanks for sharing that
15:57:53 oh, also, not sure if anyone else is interested but https://github.com/patrick-east/os-ext-testing-data/blob/master/tools/clean_purity.py is the cleaning script, it parses the cinder logs for any volumes/hosts/whatever from that particular test run and wipes out any left over
15:58:12 we have a relatively small max volume limit on our arrays so it becomes an issue if we aren’t aggressive about it
15:58:15 #link https://github.com/patrick-east/os-ext-testing-data/tree/master/tools/server_monitor
15:58:37 #link https://github.com/patrick-east/os-ext-testing-data/blob/master/tools/clean_purity.py
15:58:40 you can also query Jenkins with the json api to see last jobs status
15:59:23 yep, that's all the server_monitor script does, although i dialed it back to just alerting when things hit the fan (health score of 0)
15:59:36 since i was tired of emails for actual failures on bad patches
15:59:57 well, we are close to time
16:00:06 yeah, we are now getting emails only after N failed jobs
16:00:23 thank you patrickeast for sharing this info about your system
16:00:26 np
16:00:36 big thanks!
16:00:36 let me know if you guys have more questions
16:00:41 thanks everyone, great meeting!
16:00:44 patrickeast: thank you, very good
16:00:56 patrickeast, a few. will ask offline
16:01:01 #endmeeting
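Following up on the suggestion near the end of the meeting to query Jenkins with the JSON API: a minimal sketch using the standard /job/<name>/lastBuild/api/json endpoint. The Jenkins URL and job name are placeholders; if the master requires authentication, add a user and API token to the request.

    #!/usr/bin/env python3
    # Sketch of the "query Jenkins with the JSON API" idea from the end of
    # the meeting. JENKINS_URL and JOB_NAME are placeholders; add a user
    # and API token if your master requires authentication.
    import json
    import urllib.request

    JENKINS_URL = "http://localhost:8080"   # your Jenkins master
    JOB_NAME = "dsvm-tempest-my-driver"     # hypothetical job name

    url = "%s/job/%s/lastBuild/api/json" % (JENKINS_URL, JOB_NAME)
    with urllib.request.urlopen(url, timeout=30) as resp:
        build = json.loads(resp.read().decode("utf-8"))

    # "result" is SUCCESS, FAILURE, or ABORTED, or None while still running.
    print("build #%s: %s" % (build.get("number"), build.get("result")))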