Wednesday, 2016-05-04

*** rfolco has joined #openstack-third-party-ci [01:11]
*** apoorvad has quit IRC [01:56]
*** rfolco has quit IRC [03:26]
*** openstackgerrit has quit IRC [06:17]
*** openstackgerrit has joined #openstack-third-party-ci [06:18]
*** rfolco has joined #openstack-third-party-ci [12:09]
*** apoorvad has joined #openstack-third-party-ci [15:53]
*** asselin has joined #openstack-third-party-ci [15:56]
*** asselin__ has quit IRC [15:58]
*** asselin has quit IRC [16:23]
cbader: mmedvede, are you there? [19:21]
mmedvede: hi cbader [19:21]
cbader: hi, I have an issue with my jenkins and nodepool not talking. I have restarted zuul, zuul-merger, jenkins and nodepool, and I can't get them to respond to each other. Any clue where to look? [19:22]
mmedvede: cbader: first thing to check is if you have a firewall, e.g. iptables [19:23]
mmedvede: iptables -L [19:23]
mmedvede: try disabling it [19:23]
mmedvede: if you have any rules there [19:23]
cbader: policy ACCEPT for input, forward and output [19:24]
cbader: no targets listed [19:24]
cbader: this just stopped working last night. [19:24]
cbader: was working fine yesterday [19:25]
mmedvede: cbader: so nodepool boots a VM, but fails to register it with jenkins? [19:25]
cbader: mmedvede, yes, all services are on the same vm [19:25]
cbader: mmedvede, zuul, jenkins, nodepool on same vm [19:26]
mmedvede: ok, that makes it easier [19:26]
mmedvede: jenkins/nodepool use zmq to talk [19:26]
cbader: mmedvede, I even stopped them all and rebooted the vm [19:26]
mmedvede: there is a tool in nodepool repo, nodepool/tools/zmq-stream.py [19:26]
cbader: mmedvede, yes I know it is configured for the jobs [19:27]
cbader: mmedvede, is that under system-config? [19:27]
mmedvede: cbader: I would guess you should find it under /opt/nodepool/tools/zmq-stream.py [19:28]
mmedvede: on your all-in-one VM [19:28]
cbader: mmedvede: found it [19:28]
cbader: mmedvede: did python zmq-stream.py, returned ready. [19:29]
mmedvede: I am checking if the 8888 port is correct [19:30]
cbader: mmedvede, is there another way to check if it is up? I didn't find it as a process [19:30]
cbader: mmedvede, my global is set to 8888 [19:30]
*** apoorvad has quit IRC [19:31]
cbader: mmedvede: so zmq-stream.py uses 8888 and it responded ready, so that seems to work. But if the script works, does that mean the port is not being used by jenkins at the time? [19:34]
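An aside on the 19:30 question about checking whether the port is up without finding a process: one option is a plain TCP connect test. The sketch below is illustrative only and not part of nodepool. The helper name `port_is_open` is hypothetical, and `127.0.0.1` is assumed because all services run on one VM. It only shows that something accepts connections on the port, not that the listener is the Jenkins ZMQ publisher. It may still be a useful cross-check, because a ZeroMQ `connect` call returns without waiting for the peer, so a subscriber script that starts cleanly is not by itself proof that anything is listening (assuming zmq-stream.py only connects and subscribes, which is not verified here).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the setup discussed here, `port_is_open("127.0.0.1", 8888)` would be the call to try while Jenkins is running, and again after stopping it to confirm the result changes.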
mmedvede: cbader: which process were you looking for? [19:34]
mmedvede: cbader: is your jenkins installed on the same VM? [19:34]
cbader: mmedvede: yes, all-in-one [19:34]
cbader: mmedvede, my nodepool list only shows a status of ready even when they are being used by jenkins, so that is why I was wondering if there was some change made upstream. I didn't see anything. [19:36]
mmedvede: cbader: I've seen this before [19:37]
mmedvede: cbader: I normally fix it by restarting nodepool and jenkins [19:38]
cbader: mmedvede: I have tried that twice, seems to come back [19:38]
mmedvede: cbader: ordering could matter too [19:38]
mmedvede: cbader: also make sure you do not have left-over processes. So instead of simply restarting, stop the process, make sure it has actually stopped, and then start it [19:39]
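The "make sure it has actually stopped" step can be checked mechanically. The following is a minimal sketch, not a tool from nodepool or zuul: the function names and the `SERVICE_NAMES` substrings are assumptions chosen for illustration, and matching on command-line substrings is deliberately crude.

```python
import subprocess

# Assumed substrings to look for in command lines; adjust for the real setup.
SERVICE_NAMES = ("nodepool", "zuul", "jenkins")


def find_leftovers(ps_lines, names=SERVICE_NAMES):
    """Return (pid, command) pairs whose command line mentions a service name."""
    leftovers = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        pid, command = parts
        if any(name in command for name in names):
            leftovers.append((int(pid), command))
    return leftovers


def list_processes():
    """Return the lines printed by `ps -eo pid,args` (Linux procps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

After stopping the services, `find_leftovers(list_processes())` should come back empty before anything is started again. Any script whose own command line contains one of the names would also match, so the result needs a quick look rather than blind trust.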
cbader: mmedvede: I wonder if I need to remove the vms from the providers before starting nodepool and jenkins [19:39]
cbader: mmedvede: ok will do, thank you for your time. [19:40]
mmedvede: once it works (VMs get marked as being used), you'll need to manually clean up VMs that were never marked as used [19:40]
cbader: mmedvede: great thanks, will do. have a nice day. so where are you located? [19:41]
mmedvede: cbader: yws. let me know if it did not help [19:41]
mmedvede: cbader: I am in Texas [19:41]
cbader: mmedvede: oh so two hours ahead. I am in California [19:42]
mmedvede: it is still day here :) [19:42]
cbader: mmedvede: my day starts at 5:00 so almost over if I can fix this, else stay till done. [19:43]
cbader: mmedvede: do you work in office or from home? I am in office 5 days a week [19:44]
mmedvede: cbader: frequently from home, most of my team is remote [19:45]
mmedvede: cbader: which CI are you running? [19:46]
mmedvede: found it, HP Storage CI [19:47]
cbader: mmedvede: I run openstackci to report cinder, manila, for 3par, then two others inside the company to test on. [19:47]
*** openstackgerrit has quit IRC [19:48]
cbader: mmedvede: yup sorry, I have been having an internal network error with pypi and apt-get which is causing all my errors. [19:48]
*** openstackgerrit has joined #openstack-third-party-ci [19:48]
mmedvede: I've got pypi-mirror working, also using aptcacher to mirror apt repos [19:49]
mmedvede: but it is not HA [19:49]
mmedvede: saves tons of traffic to outside net [19:50]
cbader: mmedvede: for some reason I get can't connect. the blades and the mirror are on the same blade enclosure so they don't even go to the wire. [19:51]
*** apoorvad has joined #openstack-third-party-ci [20:15]
cbader: mmedvede: so got it back up. shut down all services (zuul, nodepool, jenkins), cleared all running vms, rebooted providers, cleared node entries in mysql db, then restarted jenkins, nodepool, zuul in order, and zmq is working now. thanks for your help. [20:20]
mmedvede: cbader: awesome [20:22]
mmedvede: you're welcome [20:23]
cbader: mmedvede: get to leave on time for a change. worked all last weekend trying to figure out the libffi.h issue with my nodes. [20:27]
mmedvede: sorry to hear that. cbader, next time it happens, it would be faster to figure out [20:28]
mmedvede: dependency problems always lurking around [20:28]
cbader: mmedvede: only problem job is going away in Oct 31 [20:29]
mmedvede: why? [20:29]
cbader: mmedvede, well it is not fixed. my jenkins shows slaves offline, so lost communication with nodepool. crud. [20:50]
mmedvede: cbader: I forgot to ask, are you pinning your nodepool/zuul versions? That would help to avoid random failures [20:52]
mmedvede: it does make sense to pin nodepool, and only update it manually from time to time [20:52]
cbader: mmedvede, yes I am [20:52]
cbader: mmedvede; zuul_revision: 7fca9c1cc625dec94b1c06a6a65216cd1a041e85 and nodepool_revision: c8c680912384f041dd1a824e9970c98dc74c7ef0 [20:54]
cbader: mmedvede: it has been pretty stable. but yesterday night went in the toilet. [20:55]
mmedvede: cbader: also, slaves offline on jenkins does not necessarily mean connection lost to nodepool [20:58]
mmedvede: if you deleted all VMs manually, you should expect to see them as offline once jenkins starts back up [20:59]
cbader: mmedvede: this is after it recreated them new after the restart. they were reporting normal then went off-line [20:59]
mmedvede: cbader: ok, just sanity checking [21:00]
cbader: mmedvede: so this is different. the ssl connection can't be made. connection closed, that is different [21:00]
mmedvede: I am out of ideas, other than going deeper into logs and debugging lower level [21:00]
cbader: mmedvede: need to see if ssh key got messed up somehow, thanks. might just leave and work on it at home. [21:02]
