Sunday, 2018-03-04

JohnnyOSAHi all.  I have a problem on 2 of 3 infra nodes in an HA setup with the gnocchi container where a "OSError: [Errno 13] Permission denied" is found in the gnocchi-apache-error.log.00:41
JohnnyOSA(http://paste.openstack.org/show/690633/)  Those gnocchi containers appear to not be reporting metrics due to this problem.00:41
JohnnyOSAThis error started after I manually ran a gnocchi-upgrade cmd due to a prior "Unable to detect the number of storage sack" error on those same gnocchi containers, which fixed that problem but led to this one.00:41
JohnnyOSAAnyone know how to troubleshoot or fix this?00:41
JohnnyOSAOne of the infra nodes never had these problems and is operating fine.00:41
JohnnyOSA(be back in 30m)00:42
larsksJohnnyOSA: I'll bet you ran gnocchi-upgrade as root instead of as the gnocchi user.00:42
larsksCheck the permissions on the gnocchi log files.00:42
JohnnyOSAHi larsks, thanks for the fast feedback!  Checking...00:43
JohnnyOSAlarsks: I do recall running the command as root.  In the /var/log/gnocchi dir, 2 of the 3 log files (access and apache-error) are root:root, the other is gnocchi:gnocchi.00:44
JohnnyOSAlarsks: those ownership perms appear to be the same as the working infra node which I didn't run the gnocchi-upgrade command manually on.00:46
*** dims has quit IRC00:47
JohnnyOSAlarsks: so probably there are files somewhere else that had the perms/ownership botched up then...  Have to step away; will start hunting a little later on what/how to fix.00:48
*** fabian has joined #openstack-telemetry00:48
larsksGood luck! That was the only idea I had off the00:59
larsksTop of my head00:59
*** fabian has quit IRC01:06
*** rwsu has joined #openstack-telemetry01:11
JohnnyOSAlarsks: Cool -- that gives me a place to start!  If I can't figure it out, I'll post again in another day or so with a few more details on where I get to in case anyone else has thoughts.01:29
*** masber has joined #openstack-telemetry01:40
*** germs has quit IRC01:42
*** germs has joined #openstack-telemetry01:42
*** germs has quit IRC01:42
*** germs has joined #openstack-telemetry01:42
*** germs has quit IRC01:44
*** germs has joined #openstack-telemetry01:45
*** germs has quit IRC02:11
*** germs_ has joined #openstack-telemetry02:11
*** fabian has joined #openstack-telemetry02:31
*** fabian is now known as chenyb402:31
JohnnyOSAlarsks: Ok, got that fixed up.  The /var/lib/gnocchi/ dir had to have ownership recursively set back to gnocchi:gnocchi, then apache2 service restarted.03:15
larsksJohnnyOSA: yay!03:15
JohnnyOSA:)03:15
larsksAlways run maintenance commands as the service user.03:15
JohnnyOSAOk, good to know.  My newbiness is shining through ;)03:16
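
The fix described above (ownership under /var/lib/gnocchi/ set back to gnocchi:gnocchi) can be checked before and after with a short script. This is a minimal sketch, not something posted in the channel: `find_misowned` is a hypothetical helper name, and it only reports mismatches, leaving the actual recursive ownership change to the operator.

```python
import os


def find_misowned(root, expected_uid):
    """Return every path at or under root whose owner uid differs from expected_uid."""
    bad = []
    # Check the top-level directory itself, since os.walk only yields its children.
    if os.lstat(root).st_uid != expected_uid:
        bad.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are reported by their own owner, not the target's.
            if os.lstat(path).st_uid != expected_uid:
                bad.append(path)
    return bad


# Example use (assumes a local "gnocchi" user exists on the host):
#   import pwd
#   uid = pwd.getpwnam("gnocchi").pw_uid
#   for path in find_misowned("/var/lib/gnocchi", uid):
#       print(path)
```

An empty result after the ownership change suggests nothing under the directory is still owned by root.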
JohnnyOSAOnto my next puzzle: I'm finding that when I query a metric multiple times (each call going to a different gnocchi container via HAproxy), I'm getting different result lists returned.  I would expect, prior to reading further, that each call should return the same list regardless of which gnocchi container responds to the call.03:17
JohnnyOSAbut, could be that I'm just not understanding the architecture.03:18
JohnnyOSALooks like this may have to do with ceilometer, which is logging messages of the type: "Skip pollster hardware.<metric_type>, no resources found this cycle"04:17
larsksJohnnyOSA: if you're still around: which storage backend are you using for gnocchi?04:38
larsksYou can only run multiple gnocchi instances if you're using a shared storage backend (swift, ceph, etc). If you're using the file backend, you can only run a single instance.04:39
JohnnyOSAlarsks: that helps, thanks.  The deploy was done for PoC with OpenStack ansible using their standard HA deploy.04:47
JohnnyOSAlarsks: in the gnocchi.conf file, the indexer refers to the mysql galera cluster, and for the file storage section,04:47
larsksJohnnyOSA: if there are in fact multiple gnocchi instances running, then it sounds as if the openstack-ansible folks don't really understand the telemetry services.04:48
JohnnyOSAthere is a coordination_url set to the mysql galera cluster as well.  There are some local storage paths set, but that looks like it might be for temporary storage only.  I may need to dig deeper.04:48
larsksIf your deploy already includes swift, the easiest solution is probably just to point gnocchi at it.04:48
larsks(or you can use shared storage -- like nfs -- with the file storage backend)04:49
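
The rule larsks gives (multiple gnocchi instances need a shared storage backend, or the file backend on shared storage) can be turned into a quick check of gnocchi.conf. This is a rough sketch under stated assumptions: it reads the `driver` option from the `[storage]` section, the set of drivers treated as shared is an illustrative assumption, and the function names are hypothetical. Whether a file-backend path sits on shared storage such as NFS cannot be read from the config, so the caller has to say so.

```python
import configparser

# Assumption: drivers treated here as shared across instances.
SHARED_DRIVERS = {"ceph", "swift", "s3", "redis"}


def storage_driver(conf_text):
    """Return the [storage] driver value from gnocchi.conf text, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("storage", "driver", fallback=None)


def safe_for_multiple_instances(conf_text, file_path_is_shared=False):
    """Apply the rule of thumb: shared backend, or file backend on shared storage."""
    driver = storage_driver(conf_text)
    if driver is None:
        # Driver not set explicitly; this sketch does not guess the default.
        return False
    if driver == "file":
        return file_path_is_shared
    return driver in SHARED_DRIVERS


# Example use:
#   with open("/etc/gnocchi/gnocchi.conf") as fh:
#       print(safe_for_multiple_instances(fh.read()))
```

Running this against the config in each gnocchi container would show whether any instance is still on a local file backend.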
JohnnyOSAok.  Ceph is installed.  No swift.  I'll look into switching.  I think a ceilometer bug may also be preventing metrics that should be regularly polled from getting to gnocchi as well.  I'm guessing I'll be spending another few to several hours working at understanding what's going on.04:53
*** chenyb4 has quit IRC05:05
JohnnyOSAMore likely that I botched something in the OSA install than that they have a bug in the playbook/roles.  I see ceph tags and checks in the gnocchi playbook.05:06
*** germs_ has quit IRC05:07
*** germs has joined #openstack-telemetry05:07
*** germs has quit IRC05:07
*** germs has joined #openstack-telemetry05:07
larsksJohnnyOSA: I have to turn in for the evening, but I'm generally online during the week (us business hours). I've just spent the past few weeks digging into openstack telemetry, so feel free to ping me with questions.05:08
*** germs_ has joined #openstack-telemetry05:08
JohnnyOSAThank you!  Have a great night!05:09
*** germs has quit IRC05:12
*** swamireddy has quit IRC05:42
*** germs_ has quit IRC06:39
*** germs has joined #openstack-telemetry06:39
*** germs has quit IRC06:39
*** germs has joined #openstack-telemetry06:39
*** germs has quit IRC06:44
*** nijaba has quit IRC07:03
*** nijaba has joined #openstack-telemetry07:06
*** dims has joined #openstack-telemetry07:10
*** masuberu has joined #openstack-telemetry07:58
*** masuberu has quit IRC07:58
*** masber has quit IRC08:01
*** germs has joined #openstack-telemetry08:40
*** germs has quit IRC08:40
*** germs has joined #openstack-telemetry08:40
*** germs has quit IRC08:44
*** swamireddy has joined #openstack-telemetry09:05
*** Tom-Tom_ has quit IRC10:23
*** germs has joined #openstack-telemetry10:40
*** germs has quit IRC10:40
*** germs has joined #openstack-telemetry10:40
*** germs has quit IRC10:45
*** Tom-Tom has joined #openstack-telemetry11:05
*** gongysh has joined #openstack-telemetry12:15
*** gongysh has quit IRC12:17
*** gongysh has joined #openstack-telemetry12:20
*** fabian has joined #openstack-telemetry12:23
*** fabian is now known as chenyb412:23
*** gongysh has quit IRC12:35
*** germs has joined #openstack-telemetry12:42
*** germs has quit IRC12:42
*** germs has joined #openstack-telemetry12:42
*** germs has quit IRC12:46
*** germs has joined #openstack-telemetry13:22
*** germs has quit IRC13:22
*** germs has joined #openstack-telemetry13:22
*** chenyb4 has quit IRC13:26
*** AlexeyAbashkin has joined #openstack-telemetry13:38
*** AlexeyAbashkin has quit IRC13:55
*** nijaba has quit IRC14:02
*** nijaba has joined #openstack-telemetry14:10
*** AlexeyAbashkin has joined #openstack-telemetry14:53
*** jmlowe has joined #openstack-telemetry14:56
*** AlexeyAbashkin has quit IRC14:57
*** gongysh has joined #openstack-telemetry15:04
*** germs has quit IRC15:21
*** pcaruana has quit IRC15:32
*** pcaruana has joined #openstack-telemetry15:35
*** gongysh has quit IRC15:36
*** AlexeyAbashkin has joined #openstack-telemetry15:55
*** AlexeyAbashkin has quit IRC15:59
*** AlexeyAbashkin has joined #openstack-telemetry16:53
*** AlexeyAbashkin has quit IRC16:58
*** jmlowe has quit IRC17:44
JohnnyOSAHi all...  Digging into why ceilometer polling metrics for nova instances, which should be coming in every 5 minutes, aren't.  I see info messages in the ceilometer logs of: "Skip pollster hardware.cpu.util (or other metric), no resources found this cycle".  Anyone familiar with this?18:11
JohnnyOSAAlso -- regarding my comments from several hours ago about trying to get gnocchi centralized into Ceph storage in an OSA deployment: I needed to add a few extra params to the user_variables.yml file in the OSA deploy to get it to install with that option (see OSA channel for anyone interested).18:13
*** AlexeyAbashkin has joined #openstack-telemetry18:52
*** AlexeyAbashkin has quit IRC18:57
larsksJohnnyOSA: The hardware.* metrics aren't nova instance metrics.  Those are host metrics that are collected via IPMI or SNMP. There's a good chance those metrics aren't going to be configured out of the box. Metrics available for Nova instances are these: https://docs.openstack.org/ceilometer/latest/admin/telemetry-measurements.html#openstack-compute19:10
JohnnyOSALarsks: thanks!  Yup, you are right -- looking at snmpd config now.19:11
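
To see which skipped pollsters are host-level `hardware.*` meters rather than Nova instance meters, the "Skip pollster" messages can be pulled out of the ceilometer log and sorted by prefix. This is only a sketch: the regular expression is built from the message text quoted above, the rest of each log line's layout is not assumed, and the function names are hypothetical.

```python
import re

# Pattern taken from the message quoted in the channel; anything before it on the line is ignored.
SKIP_RE = re.compile(r"Skip pollster (?P<meter>[^\s,]+), no resources found this cycle")


def skipped_meters(log_lines):
    """Return the meter names from every 'Skip pollster' message, in order."""
    meters = []
    for line in log_lines:
        match = SKIP_RE.search(line)
        if match:
            meters.append(match.group("meter"))
    return meters


def is_host_meter(meter):
    """True for hardware.* meters, which describe hosts (IPMI/SNMP), not instances."""
    return meter.startswith("hardware.")


# Example use (the log path is an assumption and varies by deployment):
#   with open("/var/log/ceilometer/ceilometer-polling.log") as fh:
#       for meter in sorted(set(skipped_meters(fh))):
#           print(meter, "host" if is_host_meter(meter) else "other")
```

If every skipped meter turns out to be a `hardware.*` one, the skips point at host SNMP/IPMI configuration rather than at instance polling.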
*** AlexeyAbashkin has joined #openstack-telemetry19:53
*** AlexeyAbashkin has quit IRC19:57
*** jmlowe has joined #openstack-telemetry20:46
*** germs has joined #openstack-telemetry21:06
*** germs has quit IRC21:06
*** germs has joined #openstack-telemetry21:06
JohnnyOSAAnyone know an easy way to change the archive_policy for gnocchi metrics that already exist?  I'd like to change from the default of low to medium.  Changed in the ceilometer.conf, but looks like that will only affect newly created metrics.21:08
*** germs has quit IRC21:10
*** germs has joined #openstack-telemetry21:11
*** germs has quit IRC21:16
*** threestrands has joined #openstack-telemetry21:17
*** germs has joined #openstack-telemetry21:42
*** germs has quit IRC21:42
*** germs has joined #openstack-telemetry21:42
*** pcaruana has quit IRC22:17
*** rcernin has joined #openstack-telemetry22:28
*** AlexeyAbashkin has joined #openstack-telemetry22:52
*** AlexeyAbashkin has quit IRC22:57
*** germs has quit IRC23:12
*** germs has joined #openstack-telemetry23:13
*** germs has quit IRC23:13
*** germs has joined #openstack-telemetry23:13
*** germs has quit IRC23:18
*** vint_bra has joined #openstack-telemetry23:25
*** vint_bra has quit IRC23:25
*** AlexeyAbashkin has joined #openstack-telemetry23:53
*** AlexeyAbashkin has quit IRC23:57
