15:00:07 #startmeeting XenAPI
15:00:08 Meeting started Wed Nov 20 15:00:07 2013 UTC and is due to finish in 60 minutes. The chair is johnthetubaguy1. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:11 The meeting name has been set to 'xenapi'
15:00:22 hello all
15:00:27 Hello John
15:00:28 who is around for today's meeting?
15:00:37 hi John - now I know why I haven't found you on IRC :-)
15:01:11 oh the name, yeah, VPN dropped on me
15:01:24 #topic Blueprints
15:01:32 so are people happy about Icehouse-1
15:01:40 do we have any blueprints that need reviewing?
15:01:57 This is going to be quick from our side - we haven't registered any bps
15:01:58 I mean the blueprints, not the code right now
15:02:10 Happy with I-1
15:02:24 it would be good to get things lined up for I-2 soon
15:02:38 Agreed
15:02:40 we can look at getting stuff we really want promoted to medium, if I can get more sponsors
15:02:44 Can you remind us when I-2 is?
15:02:50 but that will need to be sooner rather than later
15:03:00 erm, good question, let me check the date...
15:04:26 OK, so I can't find that damn page
15:04:30 anyways, its soon
15:04:33 hehe
15:04:38 isn't it always :)
15:04:42 I have a few blueprints up for I-1
15:04:49 just finishing the stuff off
15:04:50 https://wiki.openstack.org/wiki/Icehouse_Release_Schedule
15:04:53 but reviews welcome
15:04:57 ah, thank you!
15:05:10 the slashes I added in that URL did not help me
15:05:32 so, W/B 12th December
15:05:51 it's again a very short milestone
15:06:01 If my stuff doesn't make I-1 I will try to make them medium for I-2
15:06:02 I suspect most of our efforts will be focused on tempest tests in that period
15:06:16 I was about to say it's longer due to the holiday, but anyways
15:06:23 tempest is a good thing to worry about
15:06:34 indeed
15:06:38 but not blueprintable :)
15:06:38 (let's come back to that in a moment)
15:07:06 well we could I guess… but ignore that
15:07:45 Of course we could - but I was judging from my interpretation of Russell's "what is a blueprint"
15:08:15 well it doesn't touch nova code, it would probably live in the qa team or something
15:08:16 https://blueprints.launchpad.net/nova/+spec/xenapi-resize-ephemeral-disks
15:08:25 https://blueprints.launchpad.net/nova/+spec/xenapi-vcpu-pin-set
15:08:31 https://blueprints.launchpad.net/nova/+spec/xenapi-vif-hotplug
15:08:37 all have some patches up now
15:08:45 the last two need some work from me still
15:09:02 the last one is more of a sketch than real code at this stage, but progress
15:09:10 so the first one is ready for review?
15:09:16 yes
15:09:22 so is the second one
15:09:28 third one, not so much
15:09:41 but it's still worth a peek to see what you think
15:10:15 I just pushed this one out to I-2
15:10:16 https://blueprints.launchpad.net/nova/+spec/xenapi-driver-refactor
15:10:19 nice progress, John!
15:10:29 ok
15:10:33 I think belliott was interested in the above, but not sure yet
15:10:43 anyways, let's crack on
15:10:56 #topic QA and bugs
15:11:21 our bug list is getting very silly now
15:11:21 https://bugs.launchpad.net/nova/+bugs?field.tag=xenserver
15:11:24 58 long
15:11:31 OK, some are fixed, but still
15:11:46 First one isn't really ours
15:11:50 just I found it
15:11:52 and it affects us
15:11:55 well, we could fix it
15:12:12 VMware just added some stuff, and we asked them to start namespacing VMware specific stuff
15:12:36 we could help come up with a baseline that can be tested, but I am too busy to help right now
15:12:53 anyways, just wanted to raise that, we need some effort on getting that list of bugs under control
15:13:05 #link https://bugs.launchpad.net/nova/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.tag=xenserver+&field.tags_combinato
15:13:07 it might just mean me being more critical about the priorities of some
15:13:12 42 not fixed
15:13:20 right, it's still too high
15:13:35 or looks high compared to other bug groupings
15:13:43 but maybe we are better at tagging
15:13:45 anyways
15:13:50 needs some work
15:14:07 maybe beginning of Icehouse-2 we do a XenAPI bug day?
15:14:29 #action johnthetubaguy1 to organise a XenAPI bug squash day
15:14:34 Sure
15:14:44 cool, so QA?
15:14:54 how's getting the tempest tests running going?
15:15:04 I really want to see how I can help make this happen
15:15:22 even if the test is always failing, I would love to see something on that zuul list
15:15:32 Okay, that's a separate issue
15:15:37 well firstly an account where we can run things in the RAX cloud for Mate would be very good
15:15:52 sure, sign up for a free developer account
15:15:57 That's why I wanted to contact you John.
15:16:04 Ah, OK, will do that.
15:16:10 matel: http://developer.rackspace.com/devtrial/
15:16:38 I hope it doesn't ask for my cc details.
15:16:40 the 6-month thing is the bit I'm just hitting up against with mine
15:16:43 it will Mate
15:16:44 if you need more to make this happen, then… get euan to sign up
15:17:04 Can we get a proper account somehow?
15:17:12 it is a proper account
15:17:20 maybe I am missing something?
15:17:33 So: I won't give any cc details here.
15:17:51 You can't sign up to RS cloud without CC details
15:17:52 So if that's a dependency, I need to find someone with a cc.
15:17:57 pretty sure of that
15:18:09 We need to find a company card in this case.
15:18:12 yeah, you will need a credit card, it's an anti-faud thing
15:18:15 fraud
15:18:29 well I will leave that with you
15:18:37 Okay, that will delay things I guess, but we'll find it out.
15:18:37 well perhaps - if my acc can be extended to have more than 6 months of free credit - we can use that. John? Any chance?
15:18:40 or should I talk to Ant?
15:19:00 well you can ask, let me make an email thread
15:19:27 #action help sort out an RS cloud account for matel
15:19:44 Thanks, in the meanwhile I will sync up with his work.
15:19:54 so where are we now?
15:19:58 what is left to do?
15:20:20 what is the preferred route?
15:20:41 At the moment I am working on a document, and my recommendation is to use the RS cloud and the infrastructure
15:21:06 But I need to sync up, and understand the steps required.
15:21:16 nested virt.
15:21:27 In short: no decision made yet.
15:21:41 I was just communicating my personal view.
15:22:05 Spoke with dan
15:22:27 And we think that TripleO is not there yet to provide us the thing that is required by nova.
15:22:30 well, we could run xenserver-core inside a rackspace performance flavor VM, with nova-compute running in dom0, running tempest tests, kicked off by devstack
15:22:53 yes, TripleO is happening, but I don't see it reporting too much yet
15:23:27 how far are we from the xenserver-core thing doing what we want
15:23:35 I agree - the thing you mentioned would be one way. That includes running devstack in dom0 and fixing xenserver-core.
15:23:40 and fixing full tempest
15:23:48 noting that we only need a subset of tempest to work, for the initial phase
15:24:06 Yes, sure, a start with smoke I think will be fine.
15:24:13 And we're already passing them
15:24:20 at the summit, it was accepted that some of tempest would be good, but just smoke was probably too little
15:24:31 so let's just turn off the bits of full that are failing
15:24:36 Sure.
15:24:50 OK, so I think we should just go for that
15:24:54 it's a good starting point
15:25:07 So my plan is to discover if the nested virt is working at all.
15:25:10 when we have that, I will try to get the Rackspace test suite running up there too
15:25:17 I thought bob got that working?
15:25:28 * johnthetubaguy1 looks at BobBall
15:25:36 Yes, Bob got it working, but it involves several manual steps afaik.
15:25:40 for XenServer yes
15:25:50 OK, so not tried xenserver-core?
15:25:58 and it was highly manual because RS cloud depends on xenstore for IP address
15:26:08 xenserver-core doesn't work yet in that scenario
15:26:13 don't know if it can be made to work
15:26:14 why is that?
15:26:17 If it's manual, it's not working.
15:26:25 we need the kernel to disable the Xen bus on boot
15:26:33 but the config option to disable it didn't work when I tried it
15:26:46 hmm, odd
15:26:50 not certain that the syntax / positioning was right - but it's far from clear that it'll work
15:26:57 without that config option, the kernel upgrades to PV
15:27:09 ah, I see
15:27:13 or - more to the point - even when running in HVM mode it'll detect it's running under xen and unplug the non-PV devices
15:27:20 If the nested virt does not work, we are dead.
15:27:29 at which point we are left with no networking or disk, because the PV devices can't come up with no xenstore to the outside
15:27:38 indeed, but doesn't sound like we have tried too hard just yet
15:27:47 ah
15:27:52 so we don't have xenstore?
15:28:04 I thought you said that was working?
15:28:04 It might work if we can figure out how to set it
15:28:10 I really don't want to recompile the kernel :(
15:28:15 sure
15:28:17 xenstore can never work with nested xen
15:28:27 right, that's where I was confused before
15:28:36 I think XenServer needs to learn how to become a good guest!
15:28:39 I've certainly never said it was working :P
15:28:49 I thought you said xenstore was working, I guess that's because it was in PV?
15:28:57 XenServer can be a guest - that has been working - but it can't dynamically get an IP
15:29:00 no
15:29:08 can't boot Xen as PV
15:29:15 if it's PV then it's only booting the kernel
15:29:16 sure, I agree with that
15:29:17 and not Xen
15:29:28 so it doesn't apply
15:29:37 I mean it's physically not possible :)
15:29:54 sure, I think we are talking a little at cross purposes
15:30:02 so if we do XenServer as a guest VM
15:30:11 we still hit the needing-an-IP-address issue?
15:30:36 We have two options:
15:30:41 OK...
15:31:00 1) Base image is XenServer (or xenserver-core). Would need config drive or another way to get an IP address automatically.
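[The boot-time switch discussed above is most likely the kernel's `xen_emul_unplug` parameter, which controls whether a guest kernel unplugs the emulated (non-PV) devices when it detects it is running under Xen. A minimal sketch of setting it, assuming a GRUB-based CentOS guest; the exact parameter value and grub path are assumptions based on the kernel's parameter documentation, not something confirmed in this meeting:]

```shell
# Sketch only: "xen_emul_unplug=never" tells the kernel not to unplug the
# emulated NIC/disk even though it sees Xen underneath, so an HVM guest
# with no usable xenstore keeps working devices. Kernel version string is
# a placeholder.
# /boot/grub/grub.conf (CentOS 6 style) - append to the kernel line:
kernel /vmlinuz-2.6.32.el6.x86_64 ro root=/dev/mapper/vg-root xen_emul_unplug=never
```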
15:31:15 I should be able to sort out the config drive thing
15:31:21 2) Base image is CentOS and we install xenserver-core on top, using the static IP address CentOS got from xenstore before upgrading
15:32:03 oh I see, that's how that worked
15:32:07 I like (2)
15:32:25 but then it will take too long with that reboot
15:32:26 It would be nice, but it is far less certain
15:32:27 Is it an HVM centos?
15:32:28 and less stable
15:32:48 yeah, there is an option (3)
15:33:10 I think these things could be summarized into a proper table to avoid future misunderstandings.
15:33:20 Yes - so option 2 would need us to prepare the image and convert it to HVM before installing xenserver-core
15:33:30 Was that the sound of a volunteer Mate?
15:33:46 yeah, I need to understand the zuul image creation system
15:33:56 There isn't really one
15:34:00 it uses the images from the cloud
15:34:06 runs a prepare script on them to add them to the node pool
15:34:09 then it pulls from the node pool
15:34:14 I thought they re-created an image every evening, then got some of them started "hot" and ready for code, then used them when required
15:34:15 we can do anything we like in that prepare script
15:34:32 ah, OK
15:34:35 gotcha
15:34:46 so that makes (2) possible in the prepare script
15:34:49 but yeah, it can't take too long
15:34:50 indeed
15:34:56 And how much time do we have in the prepare script?
15:35:00 or - at least - that's my understanding
15:35:04 not sure it's limited mate
15:35:14 well, they start them up and add to the pool
15:35:24 then the start time is when they get pulled out of the pool right?
15:35:43 we can skip exercises, and just do tempest (obviously) which saves some time
15:35:44 if we've got a good argument for extending any timeouts I'm sure the check jobs will accommodate
15:35:48 Okay, so as a first step, why don't we have a zuul job, with an empty prepare script?
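[Option (2) above - CentOS base image, freeze the xenstore-assigned IP as static config, then layer xenserver-core on top inside the node-pool prepare script - could be sketched roughly as below. This is a hypothetical illustration, not the script the team used: the interface name, sysconfig paths and the assumption that `yum install xenserver-core` is available from a configured repo are all guesses for illustration.]

```shell
# Hypothetical nodepool-style prepare script for option (2).
set -eux

# Capture the IP the PV CentOS guest got (via xenstore/DHCP) and pin it as
# static config, so networking survives once the PV path is gone.
IPADDR=$(ip -4 addr show dev eth0 | awk '/inet /{sub(/\/.*/,"",$2); print $2}')
cat > /etc/sysconfig/network-scripts/ifcfg-eth0 <<EOF
DEVICE=eth0
BOOTPROTO=static
IPADDR=${IPADDR}
ONBOOT=yes
EOF

# Install xenserver-core on top (assumes a repo providing it is configured).
yum install -y xenserver-core
# A reboot into the Xen hypervisor would follow; nodepool snapshots the node
# after this script, so that cost lands on pool preparation, not on each job.
```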
15:35:50 unless it's the actual run time
15:36:03 First step should be to prove the difficult bit :)
15:36:11 That's not good
15:36:27 yeah, first bit is get xenserver-core running in a VM
15:36:29 actual run time might need more discussions :)
15:36:37 I think, as the end result is a zuul job, you can demonstrate the results, if you have the frontend.
15:36:43 don't assume it's going to be xenserver-core - I think that's much riskier
15:37:09 OK, but it's more changeable
15:37:12 I am worried about the way we approach this.
15:37:20 which is a bad thing for the OpenStack team
15:37:22 me too
15:37:37 I want to see the empty zuul jobs running.
15:37:38 Can we just get tempest passing inside a VM?
15:37:48 Why mate?
15:37:52 What do you mean by an empty job?
15:37:57 Because that's how you deliver the things.
15:38:19 yeah, zuul can be considered a known thing, I would love to prove the hard thing
15:38:25 I want to see things triggered by zuul, and modify things to give sensible results.
15:38:29 but we could do this in parallel
15:38:38 It's useless unless it's setting up a XenServer in a way that we're happy has a chance of passing things
15:38:43 people do lots of crazy modifications to zuul already
15:38:58 +1 we need a VM with tempest passing
15:39:13 I would do the zuul first, at least that's my way of delivering things.
15:39:13 I care much less about what version of XenAPI is running
15:39:35 we are not at the delivery phase, we are at the "could it work" phase
15:39:40 The key thing here is to deliver something _VISIBLE_
15:39:41 if it takes 2 hours to do full tempest
15:39:50 we can drop it on the floor
15:39:57 perhaps I'm misunderstanding what you want the zuul to do Mate
15:39:58 and add tempest into smokestack
15:40:01 would it have anything to do with XenAPI?
15:40:18 initially*
15:40:31 First runs would do nothing, but then we would be able to develop the system by amending the scripts.
15:40:48 ok - yes, in which case I think that's the wrong approach
15:40:53 So that we have feedback.
15:40:58 because that's the bit we're more certain about how it would work
15:41:07 yeah, I think I see what you mean matel, but we need to prove it's worth trying first right, else that's wasted effort?
15:41:49 anyways, we can do this in parallel if you want
15:41:51 What I am worried about is the progress of this task. I want to have visibility, and not rely on people saying "it works"
15:42:17 For me zuul is unknown, nested virt is unknown
15:42:23 everything is unknown.
15:42:31 We don't know that we'll have a solution to nested virt
15:42:33 And in these situations, you want to see something.
15:42:43 but the solution to the Zuul end is definitely feasible
15:43:04 +1 don't worry about zuul
15:43:05 an empty job in Zuul makes things known, absolutely, but it doesn't give any visibility - because it wouldn't be doing anything useful
15:43:43 so here is my view...
15:43:50 Maybe you misunderstood my intentions, but we could speak about it later, maybe what I want is not possible in this framework.
15:44:01 1) we need tempest to run under an hour
15:44:07 define tempest
15:44:14 define machines
15:44:22 Are we talking about the perf flavour?
15:44:31 tempest full, but with a few tests missing if we want
15:44:41 2) I don't care where that runs, if it works
15:44:47 3) I don't care who runs it
15:44:52 Okay, it's taking 2 hours on phy machines, but they might be slower than perf.
15:45:02 4) the easiest runner is zuul, but right now it only talks to the cloud
15:45:14 5) we need to prove it's fast enough in the cloud, i.e. nested virt
15:45:37 6) we need to prove we can automate the creation of that, I am happy to add a few little hooks to make that possible
15:45:40 Is the 2 hours running through tox?
15:45:46 7) we can wire that into zuul
15:46:02 2 hours is really really bad
15:46:28 matel? is the 2 hours using tox?
15:46:41 serial tempest afaik
15:46:45 ok
15:46:49 good
15:46:50 s/tempest/nose/g
15:46:55 well, that's probably the issue
15:47:11 so step 1, maybe, can we try tox on the physical machines?
15:47:20 Using parallel execution = loads of failures.
15:47:22 nah - let's go straight to the cloud
15:47:36 well, we can always just do smoke I guess
15:47:54 I need to improve build times anyways, so I can help optimize that
15:48:02 indeed. Parallel is great if we can get it working, but if not, we need it not on a local machine
15:48:22 Ran 175 tests in 521.146s
15:48:30 ouch
15:49:09 I don't quite understand why we need to run these expensive tests to test the driver...
15:49:21 but that's another question.
15:49:41 because it's what runs fast everywhere else, they are functional tests that prove the whole thing works right?
15:49:54 Let's figure out what's possible to do first before we focus too much on the speed of the full suite.
15:50:01 we have some faster ones that run in well under 15 mins I think, but that's a different thing
15:50:07 Because we don't have proper tests for the drivers right?
15:50:13 We can look for speedups or reduce the tests we run at another time
15:50:27 nope, we do have OK tests for those, but we still NEED tests with the whole stack
15:50:36 but anyways
15:50:40 But that's the whole OS, not nova.
15:50:44 let's focus on getting smoke tests running
15:51:13 Anyhow, I have seen that the OpenStack community is keen on burning CPU time, it makes everyone happy and proud... :-)
15:51:17 matel: before, when we didn't include the other bits it all fell apart, it's not like we run swift mind
15:51:49 Okay, sorry for that.
15:51:53 Be constructive.
15:51:53 matel: but if you have ideas on speeding that up, certainly suggest them
15:52:00 anyways
15:52:05 so the goal...
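[The serial-versus-parallel distinction above, the likely cause of the 2-hour run, can be sketched as shell commands. The tox environment names are assumptions matching the tempest tree of that era; check tox.ini in the actual checkout:]

```shell
# Sketch: the slow path discussed above is tempest run serially under nose;
# tox drives testrepository with one worker per CPU instead.
cd /opt/stack/tempest

# Serial run (roughly the "serial tempest ... s/tempest/nose/" case):
nosetests tempest

# Parallel runs via tox; env names are assumptions, see tox.ini:
tox -esmoke   # smoke subset only
tox -eall     # full suite, in parallel
```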
15:52:20 tempest (smoke to start with) running in a cloud VM
15:52:40 Hopefully
15:52:41 2) automate the creation of said VM in a script
15:53:03 yeah, we have to just try it, and stop talking really
15:53:08 who wants to do that?
15:53:35 Mate does
15:53:38 I don't have access to the RS cloud
15:53:48 but bob does...
15:53:55 bob != mate
15:53:57 We can fix that short term
15:54:00 ok
15:54:03 I don't have a medium term solution
15:54:07 OK, so let me come around tomorrow, and we can work on this?
15:54:08 that's what John is looking into
15:54:13 I can play with that, Bob can tell me what he has done so far.
15:54:28 not worth it yet john
15:54:34 So, matel/BobBall do you want to try XenServer?
15:54:44 I can go with xenserver-core?
15:54:52 Can't without config drive
15:55:01 OK, so I should work on config drive?
15:55:10 John: I don't care what it is, let's do a test drive, measure, and act
15:55:19 If you can push a way for an HVM guest with no xenstore to get an IP
15:55:19 yes
15:55:23 that would be _SUPER_ helpful
15:55:23 #action johnthetubaguy1 to make sure config drive has useful ip addresses
15:55:32 in RS cloud
15:55:34 that's the key big
15:55:36 bit*
15:55:38 yeah, I missed that bit
15:55:39 lol
15:55:50 it's useless if it just works in nova somewhere and might get to RS cloud in 2015
15:55:51 I have the patch merged for the OS case
15:55:53 :)
15:56:05 yeah, it's probably 2014 now, but I can try
15:56:08 also, it's the HVM guest with no xenstore thing too
15:56:24 though config drive, yes
15:56:29 i.e. if config drive mounts as drive number 5 and requires PV tools to mount it - it's useless
15:56:40 it's device 4, I made sure of that
15:56:47 that's good
15:56:54 it's not an accident
15:58:11 we're out of time
15:58:12 Okay, so the message is that Mate/Bob will look at the nested VM, report issues to John, and John will help with those issues.
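[A minimal sketch of the config-drive path being proposed: the HVM guest mounts the drive and reads its network details from the metadata, with no xenstore or PV tools involved. The `config-2` label is nova's convention; the device node and metadata layout here are assumptions, as both depend on the nova version in the cloud:]

```shell
# Sketch, with assumed device/label names: mount the config drive on an
# HVM guest (no PV tools, no xenstore) and read the metadata from it.
mkdir -p /mnt/config
mount -o ro -L config-2 /mnt/config   # or e.g. /dev/xvde if labels are unavailable
cat /mnt/config/openstack/latest/meta_data.json
```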
15:58:19 https://github.com/openstack/nova/commit/b1445e7e84b720ac232541ef866fbe7a59faeaf8
15:58:33 it's #3
15:58:45 sounds good
15:58:58 who wants to take the meeting next week?
15:59:00 I am on holiday
15:59:10 We'll just skip it I guess.
15:59:11 or shall we skip it?
15:59:11 Well we'll skip it then
15:59:15 cool
15:59:24 see you in two weeks
15:59:26 thanks all
15:59:29 let me finish it?
15:59:30 Have a good holida
15:59:33 Have a good holiday
15:59:35 No
15:59:36 BobBall: sure
15:59:38 #endmeeting
15:59:40 mate: thank you :)
15:59:43 Hah
15:59:44 poo - didn't work
15:59:45 wait for it...
15:59:47 hah
15:59:49 poor Bob.
15:59:50 wasn't that > 60 minutes?
15:59:51 a message comes up
15:59:52 Crying.
15:59:53 nope
16:00:09 Have you tried to turn it off and on?
16:00:22 #endmeeting