Monday, 2014-01-27

*** UtahDave has joined #tripleo00:00
*** UtahDave has quit IRC00:06
*** UtahDave has joined #tripleo00:07
*** matsuhashi has joined #tripleo00:09
*** cd-undercloud has joined #tripleo00:13
cd-undercloud************** overcloud complete status=0 ************00:13
*** cd-undercloud has quit IRC00:13
lifelessSpamapS: btw -
*** UtahDave has quit IRC00:27
SpamapSlifeless: right I thought maybe that was something people were already looking at00:27
SpamapSlifeless: oh and huzzah ;)00:27
lifelessSpamapS: can you review,n,z please?00:29
lifelesssimple stuff00:29
lifelessSpamapS: and will take CI a big step forwards00:30
*** matsuhashi has quit IRC00:33
openstackgerritA change was merged to openstack-infra/tripleo-ci: Install dependencies from
*** ccrouch has joined #tripleo00:43
openstackgerritA change was merged to openstack-infra/tripleo-ci: Switch TRIPLEO_ROOT
ccrouchquestion about "state" on the overcloud nodes00:44
ccrouchthere is "precious state" which iiuc goes in /mnt/state00:45
ccrouchthere will also eventually be a readonly / filesystem00:45
ccrouchbut is there also "non precious state" e.g. pids and such, which would get written to /var/run ?00:46
*** matsuhashi has joined #tripleo00:48
lifelessccrouch: sure, e.g. the neutron dnsmasq conf files are written to /var/run which is a tmpfs on Ubuntu00:51
ccrouchok great. I wasn't sure how far the readonly / was going to extend00:52
lifelesswho knows :P00:54
ccrouchi guess if it's tmpfs mounted it doesn't really matter00:54
ccrouchi was thinking of any other parts of the filesystem actually mapped to persistent disk00:55
lifelessI think it would be surprising00:55
lifeless /mnt is enough of a delta from regular ubuntu/fedora etc00:55
lifelesshaving some bits sticky and some not  - harder to reason about00:55
ccrouchoh i agree, if we can get anything that we care about being persistent over to /mnt/state that would be optimal00:56
ccrouchand everything else that needs to get written going to tmpfs00:57
ccrouchthen we should be in good shape00:57
*** nosnos has joined #tripleo00:58
*** cd-undercloud has joined #tripleo01:01
cd-undercloud************** overcloud complete status=0 ************01:01
*** cd-undercloud has quit IRC01:01
pleia2lifeless: I've been Ubuntu and OpenStack eventing all weekend while you've been working a lot on our CI :) could use a state of the union tomorrow morning to see where I should jump back in01:01
ccrouchlifeless: your statement about /var/run and tmpfs holds for Fedora too:
*** UtahDave has joined #tripleo01:05
lifelesspleia2: ok01:08
lifelessccrouch: cool01:08
*** jcooley_ has joined #tripleo01:12
lifelesspleia2: next thing I think would be to work on the tripleo prepare scripts so they work on fedora01:12
pleia2lifeless: perfect, I started on friday to run a few of them manually but didn't get very far, things broke down pretty quick01:14
pleia2lifeless: do we want to make the current scripts distro-agnostic-ish or run different ones for fedora?01:14
pleia2(I've been assuming the former)01:15
lifelesspleia2: I would assume the former01:15
lifelesslike devstack-gate01:15
SpamapSlifeless: in the same vein as what ccrouch was discussing earlier.. I think a push for readonly / soon will help us catch things that are writing state to "not /mnt" earlier and thus will help us avoid "oops we lost X" scenarios01:44
lifelessor find things we don't want to keep01:44
lifelessI'm thinking we don't want to keep the ovs config01:45
lifelessas I think that's what's messing up reboots01:45
SpamapSlifeless: interesting01:47
*** cd-undercloud has joined #tripleo01:47
cd-undercloud************** overcloud complete status=0 ************01:47
*** cd-undercloud has quit IRC01:47
SpamapSwow this is cool01:48
SpamapSwe could actually calculate our downtime now :)01:48
SpamapSor I should say uptime01:48
lifeless50% or something01:48
SpamapSI think we're down for 17 minutes out of every 45 - 5001:48
SpamapScould be more like 2001:49
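The downtime estimate above is simple arithmetic. A minimal sketch, assuming a fixed deploy cycle and using only the rough figures quoted in the channel (17 to 20 minutes down out of every 45 to 50); the function name `uptime_fraction` is hypothetical:

```python
def uptime_fraction(down_minutes: float, cycle_minutes: float) -> float:
    """Return the fraction of one deploy cycle during which the cloud is up."""
    if cycle_minutes <= 0 or not 0 <= down_minutes <= cycle_minutes:
        raise ValueError("need 0 <= down_minutes <= cycle_minutes and cycle_minutes > 0")
    return 1 - down_minutes / cycle_minutes


# The four corner cases of the ranges mentioned in the conversation.
for down, cycle in [(17, 45), (17, 50), (20, 45), (20, 50)]:
    print(f"down {down} min of every {cycle}: {uptime_fraction(down, cycle):.0%} up")
```

This treats every cycle as identical, which is an assumption; real numbers would come from the timestamps of the `overcloud complete` announcements.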
*** nosnos_ has joined #tripleo01:53
*** nosnos has quit IRC01:54
*** matsuhashi has quit IRC02:26
*** matsuhashi has joined #tripleo02:32
*** morazi has joined #tripleo02:32
*** cd-undercloud has joined #tripleo02:35
cd-undercloud************** overcloud complete status=0 ************02:35
*** cd-undercloud has quit IRC02:35
lifelessStevenK: btw man apt-mirror makes my eyes bleed02:39
lifelessStevenK: what does the 'clean' option do?02:40
openstackgerritA change was merged to openstack/diskimage-builder: Fix ramdisk element for openSUSE
greghayneshuh, it appears my overcloud compute node kernel panicked when booting during resize2fs02:47
SpamapSgreghaynes: I've seen that too!02:48
SpamapSgreghaynes: it is possible it is due to the virsh stuff we do, not a real resize bug02:49
* greghaynes screenshots and rebuilds02:51
lifelessSpamapS: not really02:51
lifelessSpamapS: its a real bug02:51
SpamapSlifeless: ugh02:51
lifelessSpamapS: hey, can you do a review for me please?02:51
lifelessbroke tests02:52
SpamapSlifeless: that was the one I was thinking might cause the resize bug02:52
lifelessSpamapS: how so?02:53
SpamapSlifeless: oh... IIRC sbader pointed out one that was specifically only with "really big resize values" .. which we have.02:53
SpamapSlifeless: oh I was thinking maybe we were writing to our images somehow02:53
*** sdake has joined #tripleo02:53
SpamapSbut that is just wishful thinking :)02:53
lifelessit is ;)02:53
lifelessdan has seen it on bare metal02:54
StevenKlifeless: Yes, the man page is *horrible*.02:54
StevenKlifeless: clean tells apt-mirror to delete files not referenced by the mirror, and can be told to miss directories by use of skip-clean02:55
lifelessSpamapS: that review is needed for ci test runs02:55
openstackgerritA change was merged to openstack/tripleo-incubator: Destroy seed domains before copying new files.
lifelessSpamapS: woo02:55
greghaynesSo when heat gets wedged in 'CREATE_IN_PROGRESS' due to something like that kernel panic, how do you force it to give up hope of success?02:55
SpamapSgreghaynes: :( delete it02:56
SpamapSgreghaynes: note that the recently landed abandon/adopt feature might also work.02:56
clarkbstack deletes were a very common thing for me during the tripleo sprint02:56
lifelessgreghaynes: nova stop $instanceid02:56
lifelessgreghaynes: nova start $instanceid02:56
SpamapSlifeless: I tried that when I hit it. No dice.02:57
lifelessgreghaynes: heat is waiting for the waitcondition to fire, and the resize happens after we've deployed02:57
lifelessgreghaynes: if that trips the panic again, stop it, then take a copy of the VM disk file so we can reproduce02:57
SpamapStrue would be quite useful02:58
lifelessSpamapS: testing
greghaynespanics again!02:59
* greghaynes copies02:59
* SpamapS preps happy dance02:59
SpamapSwe test stuff now?02:59
lifelessSpamapS: not kept up on email :)03:00
lifelessSpamapS: if you have time, reviewing stevenk's tie patch for debmirror would be useful03:00
SpamapSlifeless: no I saw that we have some kind of test somewhere of something03:01
lifelessSpamapS: toci is online, non-voting03:01
greghaynesThe .qcow2 is sufficient to grab, right? We don't do any mangling inside of glance that we might suspect?03:01
SpamapSlifeless: yeah I've seen things pop up03:01
lifelessgreghaynes: sufficient yes03:01
SpamapSreasonable expectation that the kernel is the same03:02
lifelessoh right03:02
lifelessgreghaynes: yeah, grab the kernel and ramdisk from glance03:02
lifelessyou can get their image id out of nova show $id03:03
lifelessoh god the logging burns03:06
lifeless2014-01-27 03:05:57.926 | uses an insecure transport scheme (http). Consider using https if has it available03:06
lifeless2014-01-27 03:05:58.056 | Requirement already up-to-date: python-ironicclient in /opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages03:07
lifelessspam after spam after spam03:07
SpamapSlovely spaaaaaaam03:08
SpamapSglorious spaaaaaaaam03:08
SpamapSt-minus 52 minutes to sunday night happy hour sushi and sake bombs03:09
SpamapSI shall toast to rebuild03:09
SpamapSStevenK: note that your needs a little tweaking (posted comments inline)03:10
lifeless2014-01-27 01:58:52.418 | Calling <function virsh_start at 0x7f8aba439c80> with: ['start', 'seed_1']03:10
lifeless2014-01-27 01:58:52.421 | error: Domain is already active03:10
SpamapSlifeless: so I feel like we need a new name for "rolling updates"03:14
SpamapScoordinated updates is rattling around in my brain at the moment.03:14
lifelessits the distinguishing thing for me03:15
lifelesscoordinated would work too03:15
lifelesshmm, so we got03:16
lifeless2014-01-27 01:38:59.115 | pulling/updating tripleo-incubator03:16
lifeless2014-01-27 01:38:59.116 | Already up-to-date.03:16
greghaynesdoes rolling updates essentially mean restart nodes in certain order so app-level HA prevents downtime?03:17
lifelessplus allow the nodes to offload stuff before restarting03:17
*** matsuhashi has quit IRC03:18
SpamapSgreghaynes: and also trigger rollback if error rates climb03:20
SpamapSbut that is more canary than rolling/coordinated/graceful03:20
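The behaviour described above (restart nodes in order, let each node offload work first, roll back if error rates climb) can be sketched as a small loop. This is only an illustration under stated assumptions: `coordinated_update` and every callback it takes are hypothetical names, not part of Heat, Nova, or any TripleO tool, and the 5% threshold is an arbitrary placeholder.

```python
from typing import Callable, Iterable, List


def coordinated_update(
    nodes: Iterable[str],
    drain: Callable[[str], None],      # hypothetical: let the node offload its work
    restart: Callable[[str], None],    # hypothetical: apply the update and restart
    rollback: Callable[[str], None],   # hypothetical: revert one node
    error_rate: Callable[[], float],   # hypothetical: current service error rate
    max_error_rate: float = 0.05,      # placeholder threshold, not from the source
) -> bool:
    """Update nodes one at a time; undo everything if errors climb too high."""
    updated: List[str] = []
    for node in nodes:
        drain(node)
        restart(node)
        updated.append(node)
        if error_rate() > max_error_rate:
            # Canary-style abort: revert the nodes touched so far, newest first.
            for done in reversed(updated):
                rollback(done)
            return False
    return True
```

Updating one node at a time is the simplest ordering; a real implementation would presumably batch nodes and lean on application-level HA, as the conversation suggests.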
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Show what revision we're on in pull-tools.
greghaynesoo, that'd be a fun one to figure out03:20
lifelessSpamapS: care to land ^ - can't tell what is going on with the CI failure - dunno if the code is bad or we weren't running the code desired03:20
*** cd-undercloud has joined #tripleo03:21
cd-undercloud************** overcloud complete status=0 ************03:21
*** cd-undercloud has quit IRC03:21
SpamapSlifeless: any reason you're not doing git log -n 1 ?03:22
lifelessdidn't know about it03:22
clarkbgit log -1 :P03:22
SpamapS6 or 1/2 dozen03:23
lifelessuhm oh because its unneeded03:23
SpamapSlifeless: use that. Nice to get the whole commit :)03:23
lifelessSpamapS: no thanks03:23
SpamapSbut.. dates..03:23
lifelessSpamapS: there are two cases03:23
StevenKgit rev-parse03:24
lifelessa) it's not being altered, so we'll see what commit from trunk it ran, and git show will show us03:24
StevenKRather than log | head -n 103:24
lifelessStevenK: shows nothing03:24
lifelessSpamapS: b) it is being altered but is the top and a fastforward, in which case the ref will match that in gerrit03:25
lifelessSpamapS: c) it is the zuul ref, in which case its inaccessible to mere humans03:25
lifelessSpamapS: but also will just have a one-line 'merge x' message AIUI03:25
*** panda has quit IRC03:26
lifelessStevenK: if you're looking to tune it03:27
lifelessgit rev-list HEAD --max-count=103:27
StevenKrev-parse HEAD03:28
lifelesswhat about03:29
lifelessgit log -1 --pretty=oneline03:29
lifelessb948d4a23f6dda93bf8d7b5893d78c9eabb11bcd Merge "Fix ramdisk element for openSUSE"03:29
lifelessSpamapS: ^?03:29
SpamapSlifeless: sorry I really don't know what you're saying. log with just commit vs. log with the commit/author/date ... whats the resistance to the latter?03:29
lifelessSpamapS: verbosity in a tool that looks at lots of repositories03:30
SpamapSah short is good03:30
lifelessSpamapS: I want enough to diagnose03:30
lifelessSpamapS: not enough to drown03:30
lifelessoneline has the commit message first line and the hash03:31
SpamapSlifeless: I'm about to shut down. You going to change it or just want the head -n 1?03:31
SpamapSI get the brevity desire for sure03:31
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Show what revision we're on in pull-tools.
lifelessSpamapS: ^03:32
greghaynesor to shed - log --pretty=format:'%h : %s'03:32
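For a tool that reports on many repositories, the `git log -1 --pretty=oneline` form discussed above gives one short line per repository. A rough sketch of collecting and splitting that line; the helper names `head_oneline` and `parse_oneline` are hypothetical and this is not the actual pull-tools change:

```python
import subprocess
from typing import Tuple


def head_oneline(repo_dir: str) -> str:
    """Run `git log -1 --pretty=oneline` in repo_dir and return its single line."""
    return subprocess.check_output(
        ["git", "-C", repo_dir, "log", "-1", "--pretty=oneline"], text=True
    ).strip()


def parse_oneline(line: str) -> Tuple[str, str]:
    """Split a oneline log entry into (full commit hash, subject line)."""
    sha, _, subject = line.strip().partition(" ")
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"not a oneline log entry: {line!r}")
    return sha, subject


# The line pasted in the channel splits into the hash and the merge subject.
sha, subject = parse_oneline(
    'b948d4a23f6dda93bf8d7b5893d78c9eabb11bcd Merge "Fix ramdisk element for openSUSE"'
)
print(sha[:7], subject)
```

The 40-character check assumes SHA-1 object names, which matches the hash shown in the log.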
lifelessof course, we probably won't be running that version of the tool to get diagnostics if it's not actually running trunk03:32
lifelessgnar :)03:32
lifelesswe log the ssh private key03:33
lifelessperhaps not ideal03:33
SpamapSok, date night time03:35
openstackgerritA change was merged to openstack/tripleo-incubator: Show what revision we're on in pull-tools.
*** panda has joined #tripleo03:41
openstackgerritlifeless proposed a change to openstack-infra/tripleo-ci: Be verbose in
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x.
lifelessand omg we had a success03:56
lifeless-> seed deployed03:56
*** UtahDave has quit IRC03:59
lifelessok, so I think the issue is that running pull-tools shouldn't be done here04:01
lifelessas we have devstack-gate wrapping things up for us04:01
lifelesszuul wise04:01
clarkboh yeah, d-g should sort that out for you04:04
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Don't try to pull-tools when there is a zuul ref.
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x.
lifelessclarkb: d-g will reset things back to master when there is no zuul change for it, right?04:06
lifelessclarkb: (node reuse concerns)04:06
clarkblifeless: yes it will reset to $ZUUL_BRANCH04:07
clarkband if there is no $ZUUL_BRANCH in the project it will fall back to master04:07
lifelessclarkb: doesn't seem to be doing quite that04:08
*** cd-undercloud has joined #tripleo04:08
cd-undercloud************** overcloud complete status=0 ************04:08
*** cd-undercloud has quit IRC04:08
lifelessclarkb: /opt/stack/new still has the trunk revision of this code04:09
lifelesswhich is stacked on04:10
lifelesswhich has this patch
lifelessbut you can see the log:04:11
lifelessmmm, possibly I missed something04:11
* lifeless looks harder04:11
lifelessoh right, we have some stuff to get the tools04:12
clarkblifeless: you want to look in the setup workspace script to see all of the git updating04:12
*** werebutt has joined #tripleo04:12
*** werebutt has left #tripleo04:12
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x.
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Don't try to git pull when there is a zuul ref.
*** ramishra has joined #tripleo04:28
greghayneshrm, qemu-nbd in overcloud-novacompute seems to be spinning on a lockfile04:34
lifelessgreghaynes: I thought I disabled that04:36
greghaynesits /var/lock/qemu-nbd-nbd004:36
*** coolsvap has joined #tripleo04:38
*** ramishra has quit IRC04:39
*** matsuhashi has joined #tripleo04:40
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Source devtest_variables in tripleo-cd.
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Disable libvirt file injection.
lifelessgreghaynes: pull that into your /opt/stack/os-apply/config/templates/etc/nova/nova.conf and do an os-collect-config --force --one04:41
lifelessthere is a dedicated channel on cable here at the moment04:43
*** jcooley_ has quit IRC04:50
greghayneslifeless: tyty04:53
*** cd-undercloud has joined #tripleo04:55
cd-undercloud************** overcloud complete status=0 ************04:55
*** cd-undercloud has quit IRC04:55
*** akuznetsov has joined #tripleo04:59
*** ramishra has joined #tripleo05:05
*** jcooley_ has joined #tripleo05:11
*** jcooley_ has quit IRC05:19
*** jcooley_ has joined #tripleo05:20
*** jcooley_ has quit IRC05:24
*** noslzzp has joined #tripleo05:46
*** noslzzp has quit IRC05:49
*** ramishra has quit IRC05:51
*** ramishra has joined #tripleo05:57
*** jcooley_ has joined #tripleo06:01
*** rpodolyaka1 has joined #tripleo06:01
*** rpodolyaka1 has left #tripleo06:10
*** rpodolyaka1 has joined #tripleo06:10
*** jcooley_ has quit IRC06:14
*** AaronGr is now known as AaronGr_Zzz06:29
lifelessfuck yeah, SUCCESS from check06:32
*** boris-42 has quit IRC06:34
* lifeless toasts status=006:35
openstackgerritlifeless proposed a change to openstack/tripleo-incubator: Stop sourcing
*** ramishra has quit IRC07:00
*** e0ne has joined #tripleo07:09
*** morazi has quit IRC07:16
*** jcoufal has joined #tripleo07:18
*** lsmola_ has joined #tripleo07:31
*** mrunge has joined #tripleo07:34
*** coolsvap has quit IRC07:39
*** jprovazn has quit IRC07:52
*** jprovazn has joined #tripleo07:53
*** pblaho has joined #tripleo08:04
*** jtomasek has joined #tripleo08:07
*** e0ne has quit IRC08:12
*** markmc has joined #tripleo08:18
*** coolsvap has joined #tripleo08:21
*** d0ugal has joined #tripleo08:24
openstackgerritRalf Haferkamp proposed a change to openstack/diskimage-builder: Include /lib64 into the deploy ramdisk on openSUSE
openstackgerritRalf Haferkamp proposed a change to openstack/diskimage-builder: Add bash as a dependency to the deploy ramdisk
*** boris-42 has joined #tripleo08:36
*** matsuhashi has quit IRC08:36
*** matsuhashi has joined #tripleo08:37
*** ifarkas has joined #tripleo08:40
*** jistr has joined #tripleo08:48
openstackgerritA change was merged to openstack/diskimage-builder: Mount root filesystem readonly during boot
rlandyrpodolyaka1: hello - ... status update on devtest with tuskar09:06
jprovaznlifeless: ping09:11
*** e0ne has joined #tripleo09:15
openstackgerritDirk Mueller proposed a change to openstack/diskimage-builder: Add a service mapping for openSUSE
*** derekh has joined #tripleo09:18
rpodolyaka1rlandy: hey!09:18
lifelessjprovazn: pong09:20
lifelessrpodolyaka1: oh hai09:20
lifelessrpodolyaka1: all the baremetal rebuild stuff landed... and then on the weekend I found new issues :)09:20
lifelessrpodolyaka1: I'm going to ask StevenK if he thinks he can polish the rough fixes up, just thought you should know09:21
rpodolyaka1lifeless: morning!09:21
jprovaznlifeless: hi, thanks for looking at the rabbitmq patch, what do you mean by the last comment here: ? (This looks like something we should push upstream - a hosts.d/ thing.)09:21
rpodolyaka1lifeless: new issues in preserve ephemeral nova series? or rebuild story overall?09:22
rlandyrpodolyaka1: so ... I got as far as registering the baremetal nodes with the undercloud - then the errors started09:22
lifelessjprovazn: well wouldn't it be nice if rather than editing hosts09:23
lifelessjprovazn: we could drop a file in /etc/hosts.d/somethingorother09:24
jprovaznlifeless: ah, ok, will check this option09:25
lifelessjprovazn: you don't need to block on it09:25
jprovazneven better09:25
lifelessjprovazn: but I see it as part of our job to see things that are hard and make them easier09:25
lifelessjprovazn: e.g.: implement for use a hosts.d -> hosts idempotent script09:25
lifelessjprovazn: then we can use hosts.d09:25
lifelessjprovazn: and separately we can send a patch to libc or whatever to make it an intrinsic feature09:26
jprovaznlifeless: I see, thanks09:26
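The "hosts.d -> hosts idempotent script" idea above could look roughly like the following. This is a sketch, not the script that was eventually written: the function names, the separate base file, and the `.tmp` suffix are all assumptions made for illustration.

```python
import os
from pathlib import Path


def render_hosts(base_text: str, fragments_dir) -> str:
    """Build the full hosts content from a base file plus sorted fragments."""
    parts = [base_text.rstrip("\n")]
    for frag in sorted(Path(fragments_dir).iterdir()):
        if frag.is_file():
            parts.append(f"# from {frag.name}")
            parts.append(frag.read_text().rstrip("\n"))
    return "\n".join(parts) + "\n"


def update_hosts(base_file, fragments_dir, hosts_file) -> bool:
    """Regenerate hosts_file; return True only if its content changed."""
    new_text = render_hosts(Path(base_file).read_text(), fragments_dir)
    target = Path(hosts_file)
    if target.exists() and target.read_text() == new_text:
        return False  # already up to date, so a rerun is a no-op
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(new_text)
    os.replace(tmp, target)  # swap in the new file in one step
    return True
```

Rendering the whole file from its inputs on every run, instead of appending to it, is what makes repeated runs safe: the same fragments always produce the same output.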
rpodolyaka1lifeless: oh, nice catch09:29
lifelessrpodolyaka1: SpamapS figured it out, I just wrote the code09:29
lifelessrpodolyaka1: but yeah, it was a bit WTF09:29
rpodolyaka1lifeless: I bet :)09:29
rpodolyaka1rlandy: please elaborate on errors :)09:30
lifelessDATA. moah DATA09:31
rlandyrpodolyaka1: yes (just waiting for the conversation above to conclude)09:39
openstackgerritDougal Matthews proposed a change to openstack/python-tuskarclient: Remove concepts that no longer exist in the API
rpodolyaka1rlandy: so what errors do you see?09:40
lifelessrlandy: be bold :)09:40
rlandypressure, pressure09:40
rlandyrpodolyaka1:  the - I think they are unrelated (2002, "Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)") None None (HTTP 500)09:41
lifelessrlandy: where do you see that ?09:41
rlandymore ... ERROR: HTTPConnectionPool(host='', port=8774): Max retries exceeded with url: /v2/2db5418a351a4c58bf7cb09dc29c3218/os-baremetal-nodes (Caused by <class 'socket.error'>: [Errno 111] Connection refused09:41
jamezpolleylifeless: is the meeting happening at the usual time this week? doesn't seem to have been updated since mid-december09:41
rlandythe errors are returned directly from the setup-baremetal command09:42
lifelessjamezpolley: yes09:42
lifelessjamezpolley: its terrible for you09:42
lifelessjamezpolley: we should perhaps start doing alternating times09:42
rpodolyaka1rlandy: is mysqld up and running? nova-api?09:42
*** athomas has joined #tripleo09:43
rlandyrpodolyaka1:  yes ... ssh'ed into the undercloud09:43
rlandymysql is up09:43
rlandyas is nova-api09:43
rlandynova baremetal-node-list is interesting09:44
jamezpolleylifeless: it's a mere 6am. I'm usually up around then anyway.09:44
lifelessjamezpolley: oh, ok cool.09:44
lifelessjamezpolley: winter may be worse?09:44
rlandyrpodolyaka1:  nova baremetal-node-list errors the first time and returns an output when rerun09:45
rpodolyaka1rlandy: hmm, maybe you ran it the first time when os-refresh-config was still executing?09:45
lifelessrlandy: did you check heat stack-list ?09:45
rlandyrpodolyaka1:  it's repeatable09:46
rpodolyaka1rlandy: interesting. can you show the error message it gives you?09:46
rlandyheat stack-list returns the undercloud when the seed is source'd and nothing when undercloudrc is source'd09:46
rpodolyaka1yeah, that's correct. as you haven't created overcloud stack yet09:46
rpodolyaka1but undercloud is in CREATE_COMPLETE state, right?09:48
rlandycorrect - that part makes sense because I can't heat stack-create at this point09:48
jamezpolleylifeless: 5am isn't great, no. We probably don't want to make it too much later though - 1900UTC is already 8pm London by then09:48
rlandyundercloud is functional09:48
lifelessjamezpolley: what other meetings are doing is one week time A one week time B one week time A09:48
lifelessjamezpolley: so one nice for west-coast-us-through-asian one nice for europe through nz09:49
lifelesswe have some indians here via the indian public cloud thing09:49
jamezpolleylifeless: fortunately there's a long time before this becomes a problem. I'm fine with alternating times too09:49
lifelessor you could man up and do 5am :P09:49
rlandyrpodolyaka1:  error from nova baremetal-node-list09:50
rlandyERROR: An unexpected error prevented the server from fulfilling your request. (OperationalError) (2002, "Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)") None None (HTTP 500)09:50
lifelessrlandy: this is your undercloud or your seed ?09:50
rlandylifeless: undercloud09:50
lifelessrlandy: is there anything in /var/log/mysql/error.log ?09:51
rlandyyes - getting/pasting09:51
jamezpolley5am is doable for an irc meeting. It'd be a stretch if I had to turn on a camera09:51
lifelessjamezpolley: oh god no, we don't need *that*09:51
lifeless might be interesting to play with09:57
rlandylifeless: I got this in /mnt/state/var/log/mysql/error.log ... (seemed close enough - /var/log/mysql/error.log doesn't exist F19)
lifelessoh right, yes :)09:57
lifelessso something shut it down09:58
lifeless140127  9:54:02 [Note] /usr/libexec/mysqld: Normal shutdown09:58
rlandyyes - I did - I restarted it09:59
rlandyseemed like the thing to do when your services aren't being helpful :)10:00
rlandyI take that back - there are more shutdowns in the log than one10:01
openstackgerritDougal Matthews proposed a change to openstack/python-tuskarclient: Add bindings for overcloud API entrypoints
*** martyntaylor has joined #tripleo10:04
lifelessrlandy: so, I'd tail that log. then try to reproduce the problem10:06
rlandylifeless: ok10:07
rpodolyaka1rlandy: lifeless: hmm, if I'm not missing something, there is another strange thing here. Why are we trying to connect to MySQL via a unix domain socket at all? AFAIK, we should be using TCP sockets according to a connection string we pass to SQLAlchemy (e.g. mysql://unset:unset@localhost/nova_bm instead of mysql:///nova_bm?unix_socket=/var/lib/mysql/mysql.sock)10:08
lifelessthats true too10:08
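The two connection strings quoted above differ in whether they name a host or a `unix_socket` query parameter. A small sketch that reads that difference straight from the URL with the standard library; it does not use SQLAlchemy itself, and `describe_transport` is a hypothetical helper:

```python
from urllib.parse import parse_qs, urlsplit


def describe_transport(url: str) -> str:
    """Report which transport a database URL asks for, judging by the URL alone."""
    parts = urlsplit(url)
    socket_path = parse_qs(parts.query).get("unix_socket", [None])[0]
    if socket_path:
        return f"unix socket {socket_path}"
    if parts.hostname:
        return f"tcp host {parts.hostname}"
    return "driver default (no host given)"


for url in (
    "mysql://unset:unset@localhost/nova_bm",
    "mysql:///nova_bm?unix_socket=/var/lib/mysql/mysql.sock",
):
    print(describe_transport(url))
```

This only inspects the string. MySQL client libraries commonly treat the host name `localhost` as a request for the local socket file, so a URL that looks like TCP may still end up on the socket; using `127.0.0.1` is the usual way to force TCP.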
*** e0ne_ has joined #tripleo10:13
openstackgerritDougal Matthews proposed a change to openstack/python-tuskarclient: Add bindings for Resource Category API entrypoints
openstackgerritA change was merged to openstack/diskimage-builder: Add a service mapping for openSUSE
openstackgerritA change was merged to openstack/diskimage-builder: Use /usr/bin/env, not /bin/env
*** akrivoka has joined #tripleo10:16
*** athomas has quit IRC10:16
*** e0ne has quit IRC10:16
*** max_lobur_afk is now known as max_lobur10:18
rlandylifeless: rpodolyaka1: here is the command output and tail of  mysql/error.log  ...
*** frankbutt has joined #tripleo10:21
*** frankbutt has left #tripleo10:21
*** athomas has joined #tripleo10:23
lifelessmarkmc: hey, so RH folk @ a midcycle tripleo meetup - doable? [see my mail to -dev]10:23
rpodolyaka1rlandy: hmm, what's in the nova-api.log and os-collect-config.log (I'm wondering, if it's os-collect-config who restarts mysqld)?10:26
openstackgerritA change was merged to openstack/diskimage-builder: Include /lib64 into the deploy ramdisk on openSUSE
markmclifeless, probably a bit short notice to get as many people as last time10:26
markmclifeless, what dates are you thinking of?10:26
openstackgerritA change was merged to openstack/diskimage-builder: Add bash as a dependency to the deploy ramdisk
openstackgerritA change was merged to openstack/tuskar-ui: Adds basic deployment log API and tab
lifelessmarkmc: well thats the thing, if folk are like 'we need 4 weeks warning but we can come' then I'd say - 3rd marchish perhaps - 5 weeks from now, which will let us get a couple more to come too10:30
lifelessmarkmc: HP travel policy doesn't like < 4 weeks booking lead time anyhow10:31
lifelessmarkmc: I have a separate trip at the end of march to florida, so early march then  home then that trip is doable, but mid-march would suck10:31
markmclifeless, ok, I'll ask around10:32
markmclifeless, funnily enough, I'll be in SF on Mar 4th for a board meeting10:32
lifelessmarkmc: so, we get one :)10:33
markmclifeless, it's not me you want so much though :)10:34
lifelessmarkmc: I'd really like a cross section of CI/tuskar/overall-plumbing folk, if we can get it10:34
markmclifeless, yeah10:34
lifelessmarkmc: we have made some huge progress recently10:34
markmclifeless, totally, it's very exciting10:34
lifelesse.g. see - the jenkins check there that ran against the test broker10:35
rlandyrpodolyaka1: I see os-apply-config.log in /var/log (would os-collect-config.log be somewhere else?). Tailing messages gives me:
markmclifeless, very very cool10:38
lifelessmarkmc: yeah, soon as the RH region is online we can make it voting10:38
lifelessmarkmc: ok, so let me know as soon as possible re: meetup; for now, gnight10:41
markmclifeless, yep, will do - thanks10:41
lifelessderekh: I have pushed a bunch of things to infra config/ devstack-gate and tripleo all related to CI10:41
lifelessderekh: you might like to review/ approve /etc as possible10:41
derekhlifeless: cool, will do10:41
lifelessgnight all10:41
rpodolyaka1rlandy: ok, so setup-baremetal fails because it's trying to insert a new entry having the mac-address value, that's already saved in the table (though it definitely should return a meaningful error message...)10:42
rpodolyaka1night, lifeless10:42
rpodolyaka1rlandy: but why mac-address value is 11:22:33:44:55:66...10:42
rpodolyaka1rlandy: anyway, you might try to delete all existing values using baremetal-node-delete/baremetal-interface-remove commands of novaclient10:44
rlandyrpodolyaka1: ok - I see the duplicate error10:44
rpodolyaka1rlandy: and then run setup-baremetal again10:45
rlandyrpodolyaka1: ok - trying that10:45
*** coolsvap_away has joined #tripleo11:05
*** coolsvap has quit IRC11:06
*** lucasagomes has joined #tripleo11:18
openstackgerritA change was merged to openstack/tripleo-incubator: Don't try to git pull when there is a zuul ref.
openstackgerritA change was merged to openstack/tripleo-incubator: Make the devtest scripts for toci run with -x.
openstackgerritA change was merged to openstack/tripleo-image-elements: Remove deprecated option.
*** coolsvap_away is now known as coolsvap11:36
*** e0ne has joined #tripleo11:39
*** matsuhashi has quit IRC11:40
*** e0ne_ has quit IRC11:42
*** matsuhashi has joined #tripleo11:45
*** matsuhashi has quit IRC12:12
*** matsuhashi has joined #tripleo12:13
*** noslzzp has joined #tripleo12:16
*** e0ne_ has joined #tripleo12:16
*** e0ne has quit IRC12:20
openstackgerritA change was merged to openstack/diskimage-builder: Fix tftp mapping on openSUSE
*** coolsvap has quit IRC12:36
openstackgerritLadislav Smola proposed a change to openstack/tuskar-ui: Adding nova baremetal API
*** rpodolyaka1 has left #tripleo12:39
*** CaptTofu has joined #tripleo12:43
openstackgerritDerek Higgins proposed a change to openstack/tripleo-image-elements: Configure glance to use the internal swift endpoint
*** max_lobur is now known as max_lobur_afk13:07
*** CaptTofu has quit IRC13:11
*** CaptTofu has joined #tripleo13:11
*** bcrochet has quit IRC13:12
*** mrunge has quit IRC13:13
*** bcrochet has joined #tripleo13:14
*** CaptTofu has quit IRC13:16
*** morazi has joined #tripleo13:16
*** vkozhukalov has joined #tripleo13:20
*** weshay has joined #tripleo13:29
*** lblanchard has joined #tripleo13:41
ProfFalkenhey all, just to let you know that for those who are interested, devtest will stand up on Ubuntu Precise (12.04LTS) without any amendments required to the scripts13:56
ProfFalkenbackports does need to be enabled, and you need to get the disk layout correct but other than that it works fine13:57
ProfFalkenis there anywhere I can help document this for those who want to experiment but are forced by company policy to use Precise?13:57
*** jayg|g0n3 is now known as jayg14:01
*** dprince has joined #tripleo14:05
openstackgerritA change was merged to openstack/python-tuskarclient: Remove concepts that no longer exist in the API
*** matty_dubs|gone is now known as matty_dubs14:08
akrivokaProfFalken: I have created a wiki page for devtest installation instructions, feel free to add it there if you want -
openstackgerritA change was merged to openstack/tripleo-image-elements: Make nova and nova-kvm elements more compatible.
*** d0ugal has quit IRC14:19
*** d0ugal has joined #tripleo14:22
*** boris-42 has quit IRC14:23
openstackgerritLadislav Smola proposed a change to openstack/tuskar-ui: Adding nova baremetal API
*** julim has joined #tripleo14:31
*** julim has quit IRC14:32
*** julim has joined #tripleo14:35
*** max_lobur_afk is now known as max_lobur14:45
*** CaptTofu has joined #tripleo14:45
openstackgerritRyan Brady proposed a change to openstack/tripleo-heat-templates: Multiple cinder nodes in overcloud
ProfFalkenakrivoka: ok, thanks14:54
*** jdob has joined #tripleo14:56
*** hewbrocca has joined #tripleo14:59
*** morazi_ has joined #tripleo15:02
*** morazi has quit IRC15:03
rlandyakrivoka: hi ... thanks for posting. I'm going to try those instructions. My attempt to include tuskar within devtest was not so successful15:03
openstackgerritJiri Tomasek proposed a change to openstack/tuskar-ui: Updated Nodes Overview page
*** e0ne has joined #tripleo15:11
*** e0ne_ has quit IRC15:11
*** rwsu has joined #tripleo15:16
*** cody-somerville has joined #tripleo15:17
*** noslzzp has quit IRC15:23
akrivokarlandy: great15:25
akrivokarlandy: let me know if I can help15:25
*** boris-42 has joined #tripleo15:26
*** e0ne has quit IRC15:31
*** e0ne has joined #tripleo15:31
*** lucasagomes is now known as lucas-hungry15:35
*** morazi_ is now known as morazi15:38
*** coolsvap has joined #tripleo15:40
*** noslzzp has joined #tripleo15:41
*** vkozhukalov has quit IRC15:50
*** sdake has quit IRC15:52
*** jprovazn has quit IRC15:56
*** ftcjeff has joined #tripleo15:56
*** jistr has quit IRC15:58
*** pblaho has quit IRC16:00
*** AaronGr_Zzz is now known as AaronGr16:11
*** nosnos_ has quit IRC16:14
*** sdake has joined #tripleo16:15
*** matsuhashi has quit IRC16:15
*** julim has quit IRC16:16
*** julim has joined #tripleo16:19
*** e0ne_ has joined #tripleo16:24
*** e0ne has quit IRC16:26
*** noslzzp has quit IRC16:32
*** jistr has joined #tripleo16:34
*** lucas-hungry is now known as lucasagomes16:37
*** noslzzp has joined #tripleo16:39
openstackgerritA change was merged to openstack/tripleo-heat-templates: Allow setting a single NTP Server
*** rlandy is now known as rlandy|bbl16:40
*** UtahDave has joined #tripleo16:42
openstackgerritA change was merged to openstack/tripleo-incubator: Fix openSUSE detection
openstackgerritMarios Andreou proposed a change to openstack/tuskar: WIP: Using from tuskar to generate overcloud.yaml
mariosrbrady: ping - hey, made a comment on your - some progress with the computes (template validating)... but issues pulling in block-storage.yaml still chasing16:52
*** lsmola_ has quit IRC16:53
*** e0ne_ has quit IRC16:54
rbradymarios: thanks16:57
*** markmc has quit IRC16:59
*** morazi has quit IRC17:00
rbradyderekh: ping17:07
openstackgerritA change was merged to openstack/tripleo-image-elements: Enable xinetd service
derekhrbrady: ack, on a call at the moment, but ask away, will answer when I can17:08
openstackgerritA change was merged to openstack/tripleo-image-elements: Configure glance to use the internal swift endpoint
rbradyderekh: I'm just looking for the change you submitted last week for python six module17:09
rbradyderekh: thank you sir17:10
derekhrbrady: np17:10
rbradyderekh: just hit this in ucl17:10
*** morazi has joined #tripleo17:13
*** bauzas has joined #tripleo17:14
dkehnlifeless: ping me today about the NTP, I think the wording that worries me is subnet17:19
SpamapSdkehn: care to elaborate?17:21
SpamapSdkehn: the desire is to be able to set a dhcp option for a subnet really.. NTP is just this particular use case17:21
dkehnSpamapS: using the above ^^^^ context17:22
dkehnSpamapS: just want to make sure I understand if its a configuration issue or a code change issue17:22
*** matty_dubs is now known as matty_dubs|lunch17:23
dkehnSpamapS: currently you should be able to put any option in there and it should work, in theory17:23
SpamapSdkehn: via the API?17:23
dkehnSpamapS: true the only use case we've really tested it with is boot option17:23
SpamapSdkehn: and I'm not sure I see the context you're referring to17:24
dkehnSpamapS: 2014-01-25 23:33:20[ lifeless] dkehn: ^ what do you think, add a feature to set the ntp servers for a subnet in neutron ?17:24
dkehnSpamapS: just want to make sure I understand it17:25
SpamapSdkehn: so subnets are the things that dnsmasq serves by way of dhcp agent right?17:30
dkehnSpamapS: yes17:30
dkehnSpamapS: actually  in the dhcp/network-id./dhcp/ by network, then ports,17:31
dkehnSpamapS: this is why I want to make sure I understand the use case17:32
*** jcooley_ has joined #tripleo17:34
SpamapSdkehn: We want to set ntp-server on DHCP, and it may very well need to be different based on what your subnet is.17:35
derekhlifeless: could gate-tripleo-deploy be using VMs on the ci-overcloud more than once?17:36
dkehnSpamapS: so currently the extra-dhcp-opts, work on a per port basis17:36
derekhno sign of this line
derekhsuggesting that the CACHE_PATH already exists...17:36
*** CaptTofu has quit IRC17:37
*** CaptTofu has joined #tripleo17:37
*** hewbrocca has quit IRC17:38
*** max_lobur is now known as max_lobur_afk17:39
*** marun has joined #tripleo17:39
*** vkozhukalov has joined #tripleo17:40
*** marun has quit IRC17:41
*** CaptTofu has quit IRC17:42
*** CaptTofu has joined #tripleo17:43
SpamapSdkehn: right we don't want per-port. We want a blanket "everybody on this subnet gets ntp-server=x.x.x.x"17:46
*** derekh has quit IRC17:55
* Ng breaks for dinner&bedtimes17:58
*** akuznetsov has quit IRC18:00
*** hewbrocca has joined #tripleo18:02
*** akuznetsov has joined #tripleo18:05
*** julim has quit IRC18:05
*** bauzas has quit IRC18:08
wendarlifeless: How up-to-date is the README in tripleo-incubator? It makes a lot of references to Grizzly.18:13
*** matty_dubs|lunch is now known as matty_dubs18:14
SpamapSwendar: I haven't read it in a while..18:15
*** martyntaylor has quit IRC18:16
* SpamapS reading.. looking pretty good actually18:16
SpamapS - File injection is required due to the PXE boot configuration conflicting18:19
SpamapS   with Nova-network/Neutron DHCP (work is in progress to resolve this)18:19
SpamapSwendar: ^^ thats not true18:19
wendarSpamapS: okay, cool. so the details have been updated, it's just missing some mention of whether it's also "quite usable on Havana" and not just Grizzly18:19
* SpamapS starts edits18:19
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update README - file injection is not required
*** jcooley_ has quit IRC18:22
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Heat is quite usable in more than Grizzly
SpamapSwendar: thanks for pointing this out. New eyes always find bugs. :)18:23
*** jcooley_ has joined #tripleo18:23
wendarSpamapS: happy to start being useful :)18:23
greghaynesSpamapS: does that depend on merging?18:24
*** akuznetsov has quit IRC18:25
*** boris-42 has quit IRC18:26
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update os-*-config section of README
SpamapSgreghaynes: no. That is just a thing that is still turned on, but we don't actually use anymore. :)18:27
greghaynesI was pointed to pull that in to fix qemu-nbd spinning on a lockfile18:28
greghaynesalthough verifying that works atm18:28
SpamapSgreghaynes: yeah, its generally the suck and horrible, but we don't _NEED_ injection anymore.18:28
*** boris-42 has joined #tripleo18:29
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update README as we do support updates now
*** jistr has quit IRC18:32
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Update tripleo-heat-templates: mention merge tool
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Add in missing definite article in README
*** CaptTofu has quit IRC18:37
*** CaptTofu has joined #tripleo18:37
SpamapSwendar: ^^ ok if you checkout that last review.. you get an up to date README :)18:38
wendarSpamapS: awesome, thanks!18:39
*** rlandy|bbl is now known as rlandy18:42
*** CaptTofu has quit IRC18:42
*** jprovazn has joined #tripleo18:44
*** CaptTofu has joined #tripleo18:45
SpamapSwendar: and really not that much was out of place. Thanks again for looking. :)18:45
openstackgerritAna Krivokapic proposed a change to openstack/tuskar-ui: Run PEP8 check by default when tests are run
*** marun has joined #tripleo18:56
wendarSpamapS: Cool, yeah, that's what I'd expect.19:05
lifelesso/ everybody19:07
jog0lifeless: o/19:07
jog0just the man I wanted to see19:07
lifelessruh roh19:08
jog0was about to inquire about the next NVP19:08
jog0two things, what nova patches are outstanding?19:08
*** vkozhukalov has quit IRC19:08
jog0and second, wanted to ask about why make kernel upgrades part of next19:08
*** noslzzp has quit IRC19:08
lifelessjog0: has a link to the review required to make things work at all19:08
jog0lifeless: now that gate is flowing I have novareview time19:08
lifelessjog0: and in the same series there is another patch fixing a restart failure mode19:08
lifelessjog0: though neither pass jenkins yet, just unit test tweaks needed19:09
jog0lifeless: ahh19:09
lifelessjog0: I'm going to ask StevenK if he'd like to polish them up19:10
*** noslzzp has joined #tripleo19:10
lifelessjog0: since I'm a bit of a bottleneck atm19:10
jog0lifeless: cool, I will follow those two and when jenkins is happy I will review 'em19:10
lifelessjog0: you could eyeball the code changes now19:10
lifelessjog0: because we're running them ;)19:10
lifelessjog0: functional jenkins is happy19:11
jog0well functional tests don't cover bare metal19:11
lifelessI know :)19:11
lifelesscd-undercloud is exercising it hourly19:12
lifelesswendar: a little stale ;)19:12
lifelessdkehn: ping about ntp19:12
dkehnlifeless: pong19:13
lifelessSpamapS: is tripleo-cd disabled ?19:13
lifelessdkehn: so you said subnet worries you w.r.t. ntp ?19:13
jog0lifeless: I like patches like those: clear and small19:14
lifelessjog0: its how I roll19:14
jog0gave both a +019:14
SpamapSlifeless: not by me. Looking.19:14
lifelessjog0: awesome, thanks19:14
jog0lifeless: so for the next MVP, why is kernel based upgrades included19:14
dkehnlifeless: doesn't worry me, just need a clarification. currently we wrt opt on a per port basis, for a subnet, would require code change,19:14
jog0vs doing non kernel based upgrades in this MVP and kernel based in next19:15
lifelessjog0: ok so; I'd flip it around, why shouldn't we include them ?19:15
lifelessjog0: right now we upgrade the kernel on cd-overcloud every couple of days19:15
jog0lifeless: because the extra complexity of adding HA in everywhere and live migration support19:15
SpamapSlifeless: seems to be stuck in git log -1 --pretty=oneline19:15
SpamapSlifeless: DOOHHHH19:15
SpamapSlifeless: upstart uses pty's for logging19:15
lifelessSpamapS: dodah?19:16
SpamapSlifeless: pager is running19:16
jog0lifeless: kernel upgrades are definitely important, in fact russellb is working on that in the gate right now19:16
lifelessf* git and the horse it rode in on19:16
jog0but I think doing non-kernel based upgrades is hard19:16
lifelessjog0: huh?19:16
lifelessjog0: I don't understand what russellb is working on19:16
jog0bumping the kernel version that we gate on19:17
lifelessoh, totally unrelated19:18
lifelessthis is *deployment*19:18
jog0lifeless: yes, I only brought it up because it shows how important it is to have the right kernel19:18
lifelessjog0: so, if we do non-HA that implies a non-reboot code path19:18
jog0lifeless: yes19:18
SpamapSlifeless: you fixing, or shall I?19:19
lifelessthe non-reboot code path *still* needs rolling heat support and no-downtime neutron migrations, because we have to restart services19:19
jog0lifeless: right, and rolling nova upgrades19:19
lifelessso that implies some form of HA19:19
jog0lifeless: true19:19
lifelessso we just got back to having HA19:19
jog0I should have said non-reboot code path19:19
lifelesstherefore it's a paradox to say non-HA and VMs are not interrupted19:19
lifelessok, so the non-reboot codepath19:20
jog0yeah, I used the wrong words19:20
lifelesswe either have to put extra code in to guarantee the kernel doesn't change19:20
lifelessthat is something like19:20
lifelesson the first build cache the kernel files in glance19:20
lifelesson every build after that *uninstall* the kernel the OS vendor image has and reinstall the files we want from glance19:20
jog0so I think we want logic to check if the kernel changed or not anyway19:21
lifelesswe have to accept that the next time the node is rebooted (e.g. power failure, whatever) the kernel it boots from (held by nova-bm) and the kernel in its image (which is where extra modules are held) may differ19:21
jog0because if we make every upgrade require live-migration thats a lot of extra load19:21
lifelessand it will then fail to come up in entirely hilarious ways19:21
lifelesswe need to update the kernel boot files held by nova-bm, and know that after the first such deploy we can't load modules anymore19:22
lifelessjog0: so hang on, we're not talking optimisations yet19:22
jog0fair enough19:22
lifelessjog0: we're comparing two different features we want both of19:22
lifelessjog0: the question is do we do A(reboot required but users don't have to care) or B(when a reboot isn't /required/ don't do one) first19:23
lifelessjog0: and I'm trying to show that B actually has a bunch of things impacting on it that don't make it this slam-dunk simpler problem19:23
jog0lifeless: what about doing B but not handling the kernel issues19:24
lifelessI think they are about equally hard to implement well, but A solves a lot more cases for us19:24
lifelessjog0: then we'd introduce something unsafe19:24
jog0just assume kernel never changes19:24
jog0yes we would19:24
openstackgerritClint "SpamapS" Byrum proposed a change to openstack/tripleo-incubator: Stop git using a pager in upstart console logs
jog0I think that this MVP is hard without adding in kernel changes19:24
openstackgerritA change was merged to openstack/tripleo-incubator: Stop git using a pager in upstart console logs
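The hang described above (upstart logs through a pty, so git thinks it has a terminal and starts a pager) can be avoided by telling git never to page. A minimal Python sketch, assuming the job shells out to git; `git_no_pager` and `last_commit_oneline` are hypothetical helper names, and this is not necessarily how the merged change does it:

```python
import os
import subprocess


def git_no_pager(args, env=None):
    """Build argv and environment for a git call that never starts a pager.

    Hypothetical helper: `--no-pager` and GIT_PAGER are real git features,
    the function itself is illustrative only.
    """
    new_env = dict(os.environ if env is None else env)
    # Belt and braces: even if --no-pager were dropped, `cat` never blocks.
    new_env["GIT_PAGER"] = "cat"
    return ["git", "--no-pager"] + list(args), new_env


def last_commit_oneline(repo_dir):
    """Return the newest commit as one line, like `git log -1 --pretty=oneline`."""
    argv, env = git_no_pager(["log", "-1", "--pretty=oneline"])
    return subprocess.check_output(argv, cwd=repo_dir, env=env).decode().strip()
```

Either measure alone should be enough; setting both just keeps the call safe if one of them is later removed.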
lifelessjog0: like i said, kernels change quite regularly for cd-overcloud19:26
jog0and that we will quickly find doing no-downtime neutron migrations and rolling upgrades of OpenStack  Services etc will be hard19:26
lifelessjog0: making the assumption that the kernel never changes would lead to a dead overcloud pretty quickly19:26
lifelessjog0: but we have to do those migrations *in both A and B*19:26
dkehnlifeless: currently the dhcp_option are being stored in the port attributes, would need support for dhcp_option in the subnet attributes if this is going to be on a subnet basis19:26
lifelessdkehn: additional support - yes.19:27
SpamapSlifeless: we could pin the kernel.19:27
jog0lifeless: I just think that the proposed MVP is pretty big in scope19:27
lifelessSpamapS: until the version we pinned drops out of the archive19:27
SpamapS(just playing devil's advocate, I"m on board with focusing on A btw)19:27
dkehnlifeless: additional support?19:28
lifelessjog0: ok, I accept that; so for lean startup here, what we should focus on is how much we'll learn from A and B19:28
lifelessdkehn: storing subnet wide options would be an additional thing in neutron19:28
dkehnlifeless: true it would be19:28
dkehnlifeless: just want to make sure that's what you're asking?19:29
lifelessjog0: what do we expect to learn from A(rolling graceful image deploys) vs B(rolling graceful rsync deploys)19:29
lifelessdkehn: it doesn't make a lot of sense to me to do per-server ntp settings19:29
lifelessdkehn: I mean we could, today, as a workaround19:29
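A rough sketch of the difference under discussion: subnet-wide options act as defaults, with the existing per-port options layered on top. Plain dicts stand in for neutron's data model here; the function name, keys, and addresses are all hypothetical and none of this is neutron API.

```python
def effective_dhcp_opts(subnet_opts, port_opts):
    """Merge subnet-wide DHCP options with per-port overrides.

    Both arguments are plain {option_name: value} dicts (hypothetical
    stand-ins, not neutron objects). A port-level value wins.
    """
    merged = dict(subnet_opts)
    merged.update(port_opts)
    return merged


# Subnet-wide default: every port on the subnet gets the same NTP server.
subnet = {"ntp-server": "192.0.2.10"}

# The per-port workaround would repeat the option on each port instead.
ports = {
    "port-a": {},
    "port-b": {"ntp-server": "192.0.2.20"},  # explicit per-port override
}

resolved = {name: effective_dhcp_opts(subnet, opts) for name, opts in ports.items()}
```

With a subnet-level default, a port with no options of its own still ends up with an NTP server, which is the "blanket" behaviour asked for earlier in the log.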
jog0lifeless: much of the same things19:29
jog0A and B both need no downtime to VMs etc which involve rolling upgrades etc19:30
lifelessjog0: ok, so now lets look at implementation: the delta from do A to do B is:19:30
jog0so I am not saying do B19:30
lifeless - 2 compute nodes19:30
jog0I am saying break down A into two phases19:31
lifeless + either kernel failures or some associate19:31
lifeless... go on, am listening19:31
*** hewbrocca has quit IRC19:31
jog0do A but pin kernel version  in phase 1. phase 2 don't pin kernel19:31
lifelessjog0: that is more work, not less19:32
jog0how so?19:32
lifelessjog0: we need an entirely different deployment method, which we have prototyped19:32
lifelessbut that needs to be polished, integrated into heat in an appropriate way19:32
jog0what is the different deployment method?19:32
jog0the rsync?19:32
jog0ohh wait now I get it19:33
*** coolsvap has quit IRC19:33
jog0upgrading a box with a reboot is easier than without19:33
jog0for images19:33
jog0thats what I was overlooking19:33
lifelessit's an optimisation19:33
lifelessjog0: have you watched my disk-image-builder video from LCA?19:34
*** jprovazn has quit IRC19:34
lifelessjog0: I think you might get some insight from it19:34
jog0lifeless: link?19:34
*** martyntaylor has joined #tripleo19:34
jog0lifeless: thanks19:35
jog0watching now19:35
*** epim has joined #tripleo19:35
jog0I'll come back after watching it19:35
SpamapSI watched it, and it definitely crystalizes the things we've been driving toward.19:36
SpamapSlifeless: so regarding easier/harder.. I'm not sure I agree that the rsync is harder, it is just less known.19:36
SpamapSperhaps that is the definition of harder19:36
lifelessSpamapS: we need patches in nova for it19:37
lifelessSpamapS: unless we kamikaze on kernel changes within it19:37
*** martyntaylor has quit IRC19:38
lifelessSpamapS: to update the boot files for AMI/ARI/AKI image setups (which is an all-hypervisors thing)19:38
jog0SpamapS:  ... there are known knowns; there are things we know that we know.19:38
jog0There are known unknowns; that is to say, there are things that we now know we don't know.19:38
jog0But there are also unknown unknowns – there are things we do not know we don't know.19:38
jog0~ Donald Rumsfeld19:39
jog0rsync is more in the unknown unknown realm19:41
greghaynesSeems like doing the new image/reboot first you also better learn which cases *require* non reboot, which makes sense given you want the general case to be reimage/reboot as it is much easier to test.19:41
SpamapSlifeless: wait, I thought B was "no kernel upgrade" ??19:41
greghaynesOr at least thats what I gleaned from the LCA talk... curious how on target it is19:42
lifelessSpamapS: B was no reboot19:42
lifelessSpamapS: but the kernel changing in the image will impose kernel changing complexity on us19:42
*** hewbrocca has joined #tripleo19:42
lifelessI have to run, acct appointment - reachable on phone19:42
dkehnlifeless: got it19:43
SpamapSlifeless: ACK19:43
SpamapSgreghaynes: the most common update case is updating less than 50 files on the filesystem and none of those being the kernel, I think.19:44
SpamapSgreghaynes: the complexity comes in identifying that you're in such a situation, and optimizing for it.19:45
greghaynesI just mean the goal being actually doing CI/CD testing for the images, its a difference of testing the actual image vs hoping two deltas end up in the same state19:46
SpamapSgreghaynes: right we want to assert that there is no delta every time.19:46
greghaynesSo its nice to make that the general case (hence doing it first)19:47
rlandydprince: hi - can you tell me how I could enable iptables during image building?19:47
SpamapSOne thing I think we can do is test both .. reboot and no reboot.. and then if we can assert that the no-reboot one is running the things we expect.. we can take the no-reboot path from there, onward.19:47
SpamapSlike we could publish with the tested images, the tested upgrade paths19:48
*** openstackgerrit has quit IRC20:06
*** openstackgerrit has joined #tripleo20:06
*** julim has joined #tripleo20:06
*** jcooley_ has quit IRC20:22
*** jcooley_ has joined #tripleo20:22
openstackgerritAna Krivokapic proposed a change to openstack/tuskar-ui: WIP: Add node detail view
*** akrivoka has quit IRC20:41
*** lucasagomes has quit IRC20:46
*** athomas has quit IRC21:01
*** jcoufal has quit IRC21:03
*** rlandy has quit IRC21:05
*** e0ne has joined #tripleo21:07
lifelessSpamapS: interesting idea21:16
lifelessSpamapS: I like that better than a human saying 'oh yeah, this is a non-reboot case', or finding out at deploy time that actually a reboot REALLY WAS NEEDED21:16
lifelessdprince: hey so - see the discussion above w/joe - does that cover your list questions as well ?21:16
lifelesspleia2: you wanted a quick hangout?21:17
dprincelifeless: perhaps. I'm not quite sure it is as easy as we think it is though.21:18
dprincelifeless: I think I understand the general idea though for sure.21:19
lifelessdprince: I'm sure it's not /easy/ I was more seeing if you were satisfied with the rationale for <set of things> as the next step vs <other set of things>21:22
lifelessdprince: also - we're now passing check tests on all tripleo repos (and tripleo-ci when infra merge my patch)21:24
dprincelifeless: yeah21:24
jog0lifeless: so just watched the video and while very interesting it didn't really answer anything about why do kernel in this MVP. but you answered that separately21:24
lifelessdprince: they aren't voting yet - we need the RH region up for that per infra policy21:24
lifelessjog0: ack; it was more that it pulled everything together21:25
jog0lifeless: it sure did21:25
jog0very interesting talk21:25
lifelessdprince: where are we at on getting access ?21:25
dprincelifeless: :(. So... while I certainly value the infra policy it would seem that any gating at all is valuable at this time.21:25
dprincelifeless: why not break the rules21:26
lifelessdprince: because if ci-overcloud wedges, without a fallback, infra have to have a firedrill to let us land anything at all21:26
dprincelifeless: We are waiting for them to essentially unplug the cables and do final security tests to verify the rack has been properly disconnected from any internal networks.21:26
dprincelifeless: once that happens it'll go public21:26
lifelessbug 127280321:27
lifelessbug 127296921:27
lifelessbug 127134421:27
lifelessare all affecting the ci-overcloud21:27
jog0lifeless: btw for the cards for the next MVP a few quick questions21:27
lifelessso its not an academic question21:27
jog0when you have a moment21:27
lifelessjog0: shoot21:27
jog0so starting with the easy:21:28
jog0why do we need HA API?21:28
jog0or rather do we need HA API for all APIs?21:28
lifelessjog0: APIs that I know of21:28
lifelessmetadata API21:28
lifelessif thats not HA, then a rebooting VM will fail to come up21:28
lifelessI don't know what the metadata API depends on but I wouldn't be surprised if it calls out to neutron for networking details21:29
lifelessso we need HA for the neutron API21:29
jog0lifeless:  both of those make sense, what about nova-compute-api21:29
lifelessaccessing the neutron API requires keystone21:29
lifelessso keystone API21:29
lifelessmetadata API is nova-api21:29
lifelessnova-compute doesn't have an API21:30
jog0metadata is one of several nova apis21:30
lifelessjog0: they're all in the same process21:30
lifelessjog0: so if we get one HA'd we get them all21:30
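The reasoning above is a transitive closure over service dependencies: start from the APIs a VM talks to and follow what each one needs. A small Python sketch; the service names are illustrative labels, and the metadata-to-neutron edge is only the guess voiced in the log, not a verified dependency.

```python
def needs_ha(roots, depends_on):
    """Return every service reachable from `roots` through `depends_on`."""
    seen = set()
    stack = list(roots)
    while stack:
        svc = stack.pop()
        if svc in seen:
            continue
        seen.add(svc)
        stack.extend(depends_on.get(svc, ()))
    return seen


# Edges as stated in the conversation (hypothetical labels).
DEPENDS_ON = {
    "nova-metadata": ["nova-api", "neutron-api"],  # neutron edge is a guess
    "neutron-api": ["keystone-api"],
}

ha_set = needs_ha(["nova-metadata"], DEPENDS_ON)
```

Adding another VM-facing API as a root, or another edge, only ever grows the set, which is why "HA for VMs" tends to pull in most of the control plane.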
dprincelifeless: re 1272969 ( what if we just munge the config files in init-neutron-ovs so that dhcp is disabled?21:30
jog0lifeless: err osapi_compute21:30
jog0ahh we run them as an all in one21:30
jog0not everyone does that21:30
lifelessjog0: its the default21:31
* dprince hates coupling... but that would seem to work21:31
lifelessjog0: if the default is wrong, change it :)21:31
jog0lifeless: true, but default isn't to run neutron either21:31
jog0anyway makes sense21:31
lifelessjog0: it was meant to be :)21:31
lifelessjog0: in fact, installer docs steer people at neutron rather strongly, so I'd argue it is21:31
jog0I only ask because I assume we are just focusing on uptime for VMs and not APIs. although uptime for VMs means no downtime for many APIs as pointed out21:32
lifelessjog0: totally get that21:32
lifelessjog0: so basically the only VM consumed HTTP APIs I know of are nova metadata and heat21:32
lifelessthey are ones that if they go away VM's *per se* may glitch21:32
jog0lifeless: cool we are in agreement21:33
lifelessjog0: maybe cinder?21:33
jog0so second question21:33
lifelessjog0: nova metadata -> cinder that is21:33
lifelessand or21:33
jog0hmm I don't think so, but we will find out soon enough21:33
lifelesscinder being needed when live migrating a block storage using VM21:33
lifelessyou need to detach and reattach the volume21:33
lifelessdprince: maybe; I think this is a whiteboard problem21:34
jog0lifeless: so that reminds me actually, for live migration are we going to look at distributed file systems?21:34
lifelessdprince: would you like to do a voice call later today and try and nut out the design interactions of ovs, state, dhcp-all-interfaces ?21:34
lifelessjog0: I don't think they help do they - you still need to block-migrate the ephemeral volume21:34
jog0lifeless: unless you use distributed file system for ephemeral21:35
jog0which some people do21:35
lifelessjog0: ugh :)21:35
jog0not saying we should for the record21:35
lifelessjog0: yeah, I know21:35
*** dprince has quit IRC21:35
jog0anyway that is a detail that we can sort out later21:35
jog0what I was going to say was:21:35
lifelessso my opinionated w/out data opinion is that users that ask for ephemeral are asking for local disk21:35
lifelessusers that want network disk should use cinder21:36
jog0I think the rolling-upgrade card is pretty big21:36
lifelessand we should make cinder be backed by cluster//sheepdog//ceph21:36
jog0lifeless: I agree with that opinion21:36
jog0anyway I haven't played with live migration enough to have an informed opinion about the possible issues21:37
lifelessjog0: depending on who you ask it's all terrible or just fine21:38
jog0so for rolling-upgrade21:38
lifelessjog0: might be relevant21:38
openstackgerritA change was merged to openstack/os-collect-config: Updated from global requirements
jog0thats a pretty big item21:43
jog0do we have a etherpad for it?21:43
lifelessrolling upgrade ?21:43
pleia2lifeless: I think I'm ok, had some paperworky things to do this morning, but post lunch I'm now digging into the devstack scripts w/ fedora21:43
lifelesspleia2: ack21:44
lifelessoh crap I still have expenses to do :(21:44
jog0lifeless: yeah, etherpad for details on rolling upgrade item21:44
jog0or other doc21:44
lifelesslooking in heat blueprints21:46
SpamapSHeat vms will just not get updated metadata21:48
SpamapSpgup'd and forgot21:48
SpamapSignore me21:48
SpamapS^^ rolling updates21:49
SpamapSand some completely out of date total fiction
lifelessso I think there are two things21:50
SpamapSprobably need to go through that wiki spec and rewrite it to reflect what we actually know now21:50
lifelessthere's canary controls21:50
jog0SpamapS: lol21:50
lifelessand theres graceful N-at-a-time sequencing21:50
SpamapSlifeless: right, the canary thing just makes N a calculation.21:51
lifelessSpamapS: *and* possibly rollsback on OMG moments21:51
lifelessI'll start an etherpad21:51
lifelessbecause we have more than just heat scope21:52
SpamapSWell with N-at-a-time do we roll back on one fail?21:52
lifelessSpamapS: I was thinking rollback was an orthogonal thing21:52
lifelessSpamapS: for first iteration21:52
*** jcooley_ has quit IRC21:53
SpamapSIt is. Heat can either stop and whine, or rollback, on any failure.21:53
*** jcooley_ has joined #tripleo21:53
lifelessSpamapS: so at high scale we might want to add a third option of ignore failures21:54
SpamapSI had always thought with a more convergence-focused Heat we could then argue for a third mode, which is to whine, but keep going in cases where that is allowed.21:54
lifelesshaha yes21:54
jog0lifeless: so there is the heat aspect to make rolling upgrades possible and then there is the how to actually do them per service21:55
jog0what order works, any gotchas etc21:55
*** cadenzajon has joined #tripleo21:56
*** cadenzajon has left #tripleo21:56
*** cadenzajon has joined #tripleo21:56
greghayneslifeless: looks like the inject_partition=-2 hasnt fixed the qemu-nbd race cond for me21:58
openstackgerritJames Slagle proposed a change to openstack/diskimage-builder: Add ability to use local cloud image
lifelessjog0: right21:58
lifelessgreghaynes: it should have stopped qemu-nbd being used at all :(21:58
greghaynesI can assume if the change is shown in /opt/stack/os-config-applier/templates/etc/nova/nova.conf on the node then the change is being used, yes?21:58
lifelessgreghaynes: possibly there is another place that qemu-nbd is being triggered from ?21:58
lifelessgreghaynes: once you do a os-collect-config --force --one - it should show up in /etc/nova/nova.conf21:59
greghaynesah, yep its shown in there :)21:59
* greghaynes investigates21:59
lifelesshas nova-compute been restarted?22:00
lifelessjog0: SpamapS: should be updated perhaps22:00
lifelessstevebaker: ^22:00
lifelessSpamapS: also speaking of convergence -
*** CaptTofu has quit IRC22:06
*** CaptTofu_ has joined #tripleo22:11
jog0lifeless: perhaps the rolling upgrade card should be two: 1 for heat support and one for what order to upgrade in22:11
lifelessjog0: you're thinking of the nuisance conductor thing ?22:12
*** lblanchard has quit IRC22:12
jog0lifeless: yup22:12
jog0and of doing db migrations22:12
jog0in general22:12
lifelessyeah, PITA stuff22:12
jog0lifeless: which is why I think that card is big22:13
jog0although once we have a good system to actually test ordering of upgrades out, everything becomes much clearer22:13
lifelessI don't think it is really22:13
lifelessif we have a dependency on the control plane in heat it will upgrade that first22:14
jog0so in nova we have done a lot of work on making RPC work across services with different versions22:15
jog0so new control plane can talk to old non control plane nodes22:15
jog0but I don't know the state of that for !nova22:15
lifelessI think we need a for this one22:16
lifelesspersonally I think the compat stuff would be about 1000 times more obvious if each service was it's own code base22:17
jog0lifeless: I think that is irrelevant to MVP1 right?22:17
jog0(I agree with you though)22:17
*** e0ne has quit IRC22:18
jog0err MVP422:18
cadenzajonI'm setting up a tripleo dev/test environment to get started with it and just ran across it's pretty outdated, is there any use in it? or a new project that replaces it, beyond
*** rollerj has quit IRC22:19
lifelesswendar: got settled in now?22:20
wendarlifeless: got all the set up out of the way22:20
cadenzajonlifeless: thanks. is there a "best fit" OS for hosting my dev/test tripleo VMs? Ubuntu, Redhat, etc?22:21
wendarlifeless: From here, it's mostly about getting familiar with codebases and into a daily habit.22:21
SpamapScadenzajon: many of us are on Ubuntu, some are on Fedora22:25
SpamapScadenzajon: both of those need fairly recent versions.. Ubuntu 12.04 won't cut it.22:25
cadenzajonspamaps: good to know... should I pull 13.10 and go to the latest release?22:28
*** jayg is now known as jayg|g0n322:29
SpamapScadenzajon: 13.10 is what I'm on22:30
SpamapScadenzajon: I usually get the dev release around mid-dev-cycle (in fact upgrading my personal laptop to trusty right now)22:30
lifelessjog0: ok so I think I see whats going on with communications22:31
lifelessjog0: 'rolling upgrade' means a totally different thing in nova land to heat land22:31
jog0lifeless: what does it mean in heat land?22:31
lifelessjog0: in nova land it refers to sequencing conductor -> db migrate -> other services22:31
lifelessjog0: in heat land it refers to doing only part of a scaling group at once22:32
lifelessjog0: the current behaviour is like this - say you have a 10 server scaling group and you do a stack-update that needs to change the servers (e.g. new image)22:32
lifelessit will spin up 10 new servers, then delete the 10 old ones22:32
jog0lifeless: thats part of what it means in nova land.  in nova land it means not needing to upgrade all nova-computes at the same time. but to do that we specify an upgrade order that we will test22:33
lifelessjog0: if it's set to rebuild, it rebuilds all 10 at once22:33
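For contrast with the all-at-once behaviour just described, the "N-at-a-time" idea from earlier in the log amounts to splitting the group into batches and updating one batch before touching the next. A minimal Python sketch under that assumption; `rolling_batches` is a hypothetical name and this is not Heat code.

```python
def rolling_batches(servers, batch_size):
    """Split `servers` into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]


# The 10-server scaling group from the example above, updated 3 at a time.
group = ["server-%d" % n for n in range(10)]
batches = rolling_batches(group, 3)
sizes = [len(batch) for batch in batches]
```

A canary policy in this picture would only change how `batch_size` is chosen (for example a first batch of one), not the sequencing itself.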
SpamapSjog0: Oh btw, regarding "dunno about other projects" ... Heat, for instance, requires you to stop heat-engine, db_sync, then start the new engine.22:33
SpamapSjog0: but heat-api and heat-engine are, in theory, able to be upgraded not in that sequence.22:34
SpamapSs/not in that sequence/independent of one another/22:34
jog0SpamapS: ack22:34
SpamapSjog0: anyway, heat should switch to nova's object API and then the DB will work well too. :-P22:34
jog0lifeless: so to make sure I have this right: heat land rolling upgrade means 'being able to run a stack-update not all at once'22:35
jog0to use some awkwardly phrased English22:35
lifelessnot quite22:38
SpamapSnote that there are two words being interleaved that mean different things22:38
SpamapSupgrade != update22:38
SpamapSupdate in this context is heat's term for changing a running stack22:39
SpamapSupgrade is specifically an update of software to a new version (I think)22:39
lifelessSpamapS: review please
SpamapSlifeless: that will print out the private key into the log...22:40
SpamapSHave not been following things closely.. but perhaps thats a bad idea?22:41
lifelessSpamapS: the private key that is randomly created everytime we boot a testenv, and which has limited privs to just copy the seed, enumerate vms and start and stop vms22:41
*** epim has quit IRC22:41
*** CaptTofu_ has quit IRC22:41
lifelessSpamapS: its also echoed to the log by the ci-client22:41
SpamapSjust checking before I +2 :)22:42
*** noslzzp has quit IRC22:42
lifeless2014-01-27 03:01:28.810 | 2014-01-27 03:01:15,055 - testenv-client - INFO - Received job : {"remote-operations":"1", "host-ip":"", "seed-ip":"", "node-macs":"52:54:00:f9:00:30 52:54:00:96:3c:28 52:54:00:30:29:10", "ssh-key":"LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQ22:42
*** jdob has quit IRC22:42
openstackgerritA change was merged to openstack-infra/tripleo-ci: Be verbose in
*** jtomasek has quit IRC22:42
jog0lifeless: I agree with the 4 steps in
jog0so we want to maintain API access while doing control plane upgrade right? at least for most APIs22:43
lifelessStevenK: morning :)22:43
jog0because as SpamapS said, it sounds like heat can't do a db upgrade live22:44
jog0not sure how well nova does it today either22:44
lifelesswe don't need to magically get it all right on day one22:44
lifelessbeing able to file bugs about where things are not suitable for deployment is a good thing22:45
jog0lifeless: sounds good, I am just trying to better understand what our desired goal is22:46
lifelessALL THE THINGS22:46
jog0for MVP4 that is22:46
lifelessjog0: right22:46
*** morazi has quit IRC22:46
lifelessso in my head the goal is to go from 'downtime of APIS and all VMS stopped during deployment'22:46
lifelessto 'VM workload keep working but you might not be able to start new things // stop old things during deployment'22:47
*** hewbrocca has quit IRC22:48
jog0lifeless: I like it, although in my mind VM workload keep working (mostly) is a much easier target. mostly here would mean transient issues that resolve themselves in some window of time22:49
jog0but yeah that goal means we can turn off APIs for short amounts of time while upgrading control plane22:49
jog0(I think)22:50
dkehnlifeless: have some time for a conversation?22:50
lifelessdkehn: sure22:50
dkehnlifeless: ok, which method, e.g. skype, gtalk, etc.22:51
*** e0ne has joined #tripleo23:05
dkehndevananda: thx23:07
*** e0ne has quit IRC23:10
*** matty_dubs is now known as matty_dubs|gone23:13
*** sdague has quit IRC23:27
*** sdague has joined #tripleo23:28
*** jpeeler has quit IRC23:29
*** jpeeler has joined #tripleo23:30
*** ftcjeff has quit IRC23:36
*** rbrady1 has joined #tripleo23:48
*** clarkb has quit IRC23:48
*** rbrady has quit IRC23:48
*** clarkb has joined #tripleo23:49
lifelessAn optional feature of IPv6, the jumbo payload option, allows the exchange of packets with payloads of up to one byte less than 4 GiB23:51
lifelessI would like to see that23:51
greghaynesim sure most routers will just do fine with that :p23:53
greghaynesrunning heat stack-update on overcloud results in compute node erroring in nova-compute with trying to connect to mysql via local socket (which is actually running on the other node)23:55
greghaynesknown bug?23:55
lifelessgreghaynes: I believe that ronelle was seeing that yesterday23:55
lifelessgreghaynes: check that the mysql url is correct in /etc/nova/nova.conf23:55
greghaynesIt is23:56
lifelessfile a bug :)23:56
greghaynesWhat project do you think?23:56
lifelesstripleo to start with23:56
greghaynesAlso got some info on what's happening with my qemu-nbd: it happens when nova is booting the demo image from the overcloud; best guess is that it's because we load in just the qcow2, not the kernel/initrd images. Sounds sane / should we still not be doing qemu-nbd for that case?23:58
