Wednesday, 2017-03-08

openstackgerritgordon chung proposed openstack/gnocchi master: WIP bucketise incoming  https://review.openstack.org/44138900:05
*** gordc has quit IRC00:12
*** catintheroof has quit IRC00:31
*** iceyao has joined #openstack-telemetry00:38
*** zhurong has joined #openstack-telemetry01:15
*** gongysh has joined #openstack-telemetry01:49
*** thorst has quit IRC01:58
*** lhx__ has joined #openstack-telemetry01:58
*** vint_bra has joined #openstack-telemetry02:10
*** vint_bra has left #openstack-telemetry02:12
*** thorst has joined #openstack-telemetry02:21
*** thorst has quit IRC02:23
*** g3ek has quit IRC02:54
*** g3ek has joined #openstack-telemetry03:01
*** chlong_ has joined #openstack-telemetry03:15
*** chlong has quit IRC03:15
*** thorst has joined #openstack-telemetry03:24
*** thorst has quit IRC03:28
*** oomichi has quit IRC03:30
*** oomichi has joined #openstack-telemetry03:33
*** rbak has joined #openstack-telemetry03:48
*** g3ek has quit IRC03:59
*** thorst has joined #openstack-telemetry04:00
*** zhurong has quit IRC04:01
*** thorst has quit IRC04:03
*** g3ek has joined #openstack-telemetry04:05
*** rbak has quit IRC04:18
*** links has joined #openstack-telemetry04:41
*** iceyao has quit IRC04:53
*** thorst has joined #openstack-telemetry05:04
*** dhellman_ has joined #openstack-telemetry05:17
*** iceyao has joined #openstack-telemetry05:18
*** thorst has quit IRC05:19
*** iceyao has quit IRC05:23
*** dhellman_ has quit IRC05:24
*** gongysh has quit IRC05:31
*** gongysh has joined #openstack-telemetry05:32
*** adriant has quit IRC05:39
*** nadya has joined #openstack-telemetry05:39
*** Andrew_jedi has joined #openstack-telemetry05:51
*** Jack_Iv has joined #openstack-telemetry05:59
*** yprokule has joined #openstack-telemetry06:00
*** nadya has quit IRC06:12
openstackgerritOpenStack Proposal Bot proposed openstack/panko master: Imported Translations from Zanata  https://review.openstack.org/44229306:14
*** thorst has joined #openstack-telemetry06:16
*** thorst has quit IRC06:20
*** lhx__ has quit IRC06:21
*** lhx__ has joined #openstack-telemetry06:22
*** nadya has joined #openstack-telemetry06:25
*** g3ek has quit IRC06:32
*** g3ek has joined #openstack-telemetry06:41
*** pcaruana has joined #openstack-telemetry06:43
*** Jack_Iv has quit IRC06:46
*** Jack_Iv has joined #openstack-telemetry06:48
*** nadya has quit IRC06:50
*** Andrew_jedi has quit IRC06:53
*** Jack_Iv has quit IRC06:56
*** Gautam has joined #openstack-telemetry07:00
*** Andrew_jedi has joined #openstack-telemetry07:11
*** thorst has joined #openstack-telemetry07:17
*** iceyao has joined #openstack-telemetry07:19
*** thorst has quit IRC07:21
*** nadya has joined #openstack-telemetry07:23
*** iceyao has quit IRC07:24
*** nadya has quit IRC07:33
*** rcernin has joined #openstack-telemetry07:40
*** tesseract has joined #openstack-telemetry07:43
*** lhx__ has quit IRC07:53
*** lhx__ has joined #openstack-telemetry07:53
*** dschultz has quit IRC07:54
*** donghao has joined #openstack-telemetry08:05
*** donghao has quit IRC08:11
*** thorst has joined #openstack-telemetry08:18
*** chlong_ has quit IRC08:18
*** thorst has quit IRC08:22
*** Andrew_jedi has left #openstack-telemetry08:23
*** Gautam has quit IRC08:23
*** sanchitmalhotra has quit IRC08:27
*** sanchitmalhotra has joined #openstack-telemetry08:28
*** Jack_I has joined #openstack-telemetry08:29
*** amoralej|off is now known as amoralej08:31
*** shardy has joined #openstack-telemetry08:43
*** sudipto has joined #openstack-telemetry08:50
*** sudipto_ has joined #openstack-telemetry08:50
*** dschultz has joined #openstack-telemetry08:55
*** Gautam has joined #openstack-telemetry08:55
*** dschultz has quit IRC08:59
*** openstackgerrit has quit IRC09:03
*** thorst has joined #openstack-telemetry09:18
*** thorst has quit IRC09:23
*** thorst has joined #openstack-telemetry09:40
*** thorst has quit IRC09:44
*** flwang1 has joined #openstack-telemetry10:06
flwang1jd_: ping10:08
flwang1how can i get the sample id when i'm using 'sample-list'? thanks10:08
*** lhx__ has quit IRC10:20
*** lhx_ has joined #openstack-telemetry10:20
*** openstackgerrit has joined #openstack-telemetry10:40
openstackgerritliusheng proposed openstack/python-pankoclient master: Modify the doc descriptions of pankoclient  https://review.openstack.org/44184810:40
*** thorst has joined #openstack-telemetry10:40
*** thorst has quit IRC10:45
openstackgerritOpenStack Proposal Bot proposed openstack/aodh master: Imported Translations from Zanata  https://review.openstack.org/44187610:46
*** Jack_Iv has joined #openstack-telemetry10:51
*** Jack_Iv has quit IRC10:53
*** Jack_Iv has joined #openstack-telemetry10:53
*** donghao has joined #openstack-telemetry10:54
*** Jack_Iv has quit IRC10:55
*** Jack_Iv has joined #openstack-telemetry10:55
*** donghao has quit IRC11:00
*** Gautam has quit IRC11:03
*** Gautam has joined #openstack-telemetry11:04
*** zhurong has joined #openstack-telemetry11:07
*** Gautam has quit IRC11:08
*** masber has quit IRC11:09
*** masber has joined #openstack-telemetry11:10
*** gongysh has quit IRC11:16
*** catintheroof has joined #openstack-telemetry11:20
*** cdent has joined #openstack-telemetry11:21
*** iceyao has joined #openstack-telemetry11:22
*** iceyao has quit IRC11:27
*** g3ek has quit IRC11:31
*** Gautam has joined #openstack-telemetry11:38
*** g3ek has joined #openstack-telemetry11:40
*** thorst has joined #openstack-telemetry11:41
*** Gautam has quit IRC11:42
*** thorst has quit IRC11:46
*** dschultz has joined #openstack-telemetry11:57
*** vint_bra has joined #openstack-telemetry11:58
*** vint_bra has quit IRC11:59
*** dschultz has quit IRC12:02
flwang1jd_: sileht: ping12:05
flwang1we're running into a problem of instance status12:05
flwang1after shelve/unshelve, the instance is always showing active in the samples12:06
flwang1did you see this before?12:06
*** Jack_Iv has quit IRC12:17
*** gkadam has quit IRC12:25
*** catinthe_ has joined #openstack-telemetry12:29
*** catintheroof has quit IRC12:29
*** david-lyle has quit IRC12:30
*** catintheroof has joined #openstack-telemetry12:30
*** catinthe_ has quit IRC12:31
*** thorst has joined #openstack-telemetry12:33
*** david-lyle has joined #openstack-telemetry12:33
*** shardy has quit IRC12:34
*** zhurong has quit IRC12:35
*** gkadam has joined #openstack-telemetry12:36
*** Jack_Iv has joined #openstack-telemetry12:36
*** gkadam has quit IRC12:43
*** lhx_ has quit IRC12:54
*** iceyao has joined #openstack-telemetry12:57
*** dschultz has joined #openstack-telemetry13:01
*** gongysh has joined #openstack-telemetry13:06
*** gordc has joined #openstack-telemetry13:16
*** dschultz has quit IRC13:17
gordcjd_: added some items to https://etherpad.openstack.org/p/gnocchi-incoming-bucket-scheduler to get details on non-metricd bucket handling13:24
gordcdo you need me to add my proposal?13:24
*** amoralej is now known as amoralej|lunch13:24
*** links has quit IRC13:25
*** Gautam has joined #openstack-telemetry13:28
silehtgordc, you haven't tested the latest kombu on windows, seriously ? ;)13:33
openstackgerritMerged openstack/gnocchi master: cleanup unused var  https://review.openstack.org/44286413:34
jd_gordc: I'll reply inline, I think I understood your current code/proposal though so no need to write one, but if you find flaws in my mind let me know :)13:35
gordcsileht: i know. i should stop being so lazy :P13:44
gordcjd_: ack13:44
*** lhx_ has joined #openstack-telemetry13:46
*** Jack_Iv has quit IRC13:50
gordcjd_: added notes.13:50
*** sileht has quit IRC13:51
*** donghao has joined #openstack-telemetry13:54
*** sileht has joined #openstack-telemetry13:55
*** efoley has joined #openstack-telemetry13:57
*** sileht has quit IRC13:59
*** zhurong has joined #openstack-telemetry14:01
*** sileht has joined #openstack-telemetry14:04
*** sileht has quit IRC14:04
jd_gordc: me too :)14:07
*** sileht has joined #openstack-telemetry14:08
*** sileht has quit IRC14:08
*** sileht has joined #openstack-telemetry14:10
*** sileht has quit IRC14:11
*** nadya has joined #openstack-telemetry14:12
openstackgerritMerged openstack/aodh master: Switch to use stable data_utils  https://review.openstack.org/44278314:12
*** shardy has joined #openstack-telemetry14:13
*** pradk has quit IRC14:17
*** dave-mccowan has joined #openstack-telemetry14:20
*** sileht has joined #openstack-telemetry14:26
*** amoralej|lunch is now known as amoralej14:26
gordcjd_: reply added.14:26
gordci think i'll just push the indexing stuff to another patch for now.14:26
*** nadya has quit IRC14:27
gordcbe easier to review anyways14:27
*** iceyao has quit IRC14:32
jd_why do a PTG when you can spend hours debating design with gordc on an Etherpad :p14:34
gordcjd_: this is very efficient. it's like communicating with mars. type message, send, wait 15 minutes, read, repeat :)14:36
jd_haha14:37
jd_TBH i am not sure it's worse than me trying to explain this in oral English real time with a time constraint of 40 minutes :p14:37
gordctrue. i cannot focus 40mins anyways.14:37
*** g3ek has quit IRC14:37
* gordc is not listening at design summit14:38
*** gongysh has quit IRC14:42
*** nadya has joined #openstack-telemetry14:42
*** g3ek has joined #openstack-telemetry14:47
*** sileht has quit IRC14:47
*** chlong_ has joined #openstack-telemetry14:48
*** rbak has joined #openstack-telemetry14:53
*** sileht has joined #openstack-telemetry14:54
*** sileht has quit IRC14:54
*** efoley has quit IRC14:55
*** jahsis has joined #openstack-telemetry15:01
jahsisHello all, I installed newton gnocchi on ubuntu 16.04, but when I try to run gnocchi-api, I receive error:15:02
jahsisgnocchi-api: error: unrecognized arguments: --config-file=/etc/gnocchi/gnocchi.conf --log-file=/var/log/gnocchi/gnocchi-api.log15:02
jahsisanyone had same issue and know how to solve it?15:02
*** fguillot has joined #openstack-telemetry15:05
*** zhurong has quit IRC15:05
*** sileht has joined #openstack-telemetry15:13
*** pradk has joined #openstack-telemetry15:13
gordcjahsis: http://gnocchi.xyz/running.html#running-as-a-wsgi-application15:13
jahsisgordc, thanks, but I think that the ubuntu package shouldn't have a gnocchi-api service. Looks like an issue in the ubuntu package.15:17
gordcah, i see15:19
jd_gordc: I replied again but I think I'm starting to see where we mainly diverge15:20
*** Gautam has quit IRC15:20
jd_gordc: it seems you think that computing a hash is a slow operation15:20
jd_whereas it should be considered almost as a noop, especially compared to database CRUD15:20
gordcjd_: partly yes, but mainly storing the index so we can change number of buckets whenever we want.15:21
jd_I replied to that also15:21
gordcwell again. the db CRUD is done regardless15:21
jd_this is a myth you built I think15:21
jd_there is no need to change the number of buckets15:21
jd_consistent hashing is exactly what people built to avoid all the problems you want to implement :p15:22
jd_e.g. rebalancing15:22
jd_also I hate encoding, trying to fix the tooz bug you found :(15:23
gordcjd_: this is fair. it's one thing to say we don't allow changing buckets (then i'm ok with no indexing reqs)15:24
gordcbut if we allow them to change buckets, then you run into issue how we guarantee all measures are processed15:25
jd_definitely15:26
jd_I think I wrote something at some point saying that it's likely to be doable and easy if every bucket is empty, that could be a first step if we want to implement that one day, but with consistent hashing, changing the number of buckets is to be considered a maintenance operation IMHO15:27
gordci think i mention this in gerrit, but the indexing part is basically just for dynamic bucket changes... and since we have it, it allows us to not compute hash constantly (and add zero additional db calls)15:28
jd_but I think it's totally fine to say that it's not supported for now and provide a good guidance on the default number of buckets you need15:28
gordcjd_: if we don't support changing buckets, i don't think we need hashring either15:29
jd_gordc: ?15:29
gordcit is much simpler to take metric.id mod buckets to find where to send it15:30
gordcwe will never need to rebalance15:30
jd_this is what I wrote gordc15:30
gordc(because we can't)15:30
gordcright.15:30
jd_the hashring it used for mapping metricd-workers -> buckets15:31
jd_(via tooz partitioner)15:31
jd_s/it/is/15:31
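A minimal sketch of the modulo placement gordc describes above, assuming metric IDs are UUIDs and the bucket count is fixed for the life of the deployment. The function name and the bucket count of 8 are illustrative and not taken from the Gnocchi code.

```python
import uuid

NUM_BUCKETS = 8  # illustrative value; assumed never to change after deployment


def bucket_for_metric(metric_id, num_buckets=NUM_BUCKETS):
    """Return the bucket index for a metric UUID using a plain modulo."""
    return metric_id.int % num_buckets


metric_id = uuid.UUID(int=42)
print(bucket_for_metric(metric_id))      # 42 % 8
print(bucket_for_metric(metric_id, 16))  # 42 % 16: a different bucket
```

Because the result depends directly on the bucket count, changing that count can move a metric to a different bucket, which is why this approach only holds up if the count never changes.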
gordcbut there's sorting into buckets step (from api pov) and assigning buckets to metricd pov15:31
gordci don't think it's useful for metricd-workers -> buckets either... i don't see how it's more useful than just dividing them equally?15:32
gordcie if i have 3 metricd, at start, a -> [1,2], b -> [3,4], c->[5,6]15:34
jd_how do you do this mapping then?15:35
jd_how does a know it has to pick 1 and 2 ?15:35
gordcif b leaves, it's very simple to just have a -> 3, c -> 4,15:35
gordcyou know how many buckets there are.15:35
gordcyou know how many metricd there are15:35
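The equal split gordc outlines can be sketched as follows, assuming every metricd sees the same sorted member list and the same bucket count. The function and worker names are hypothetical and only mirror the a/b/c example from the discussion.

```python
def assign_buckets(workers, num_buckets):
    """Split buckets 1..num_buckets into contiguous, near-equal runs per worker."""
    workers = sorted(workers)
    base, extra = divmod(num_buckets, len(workers))
    assignment = {}
    start = 1
    for i, worker in enumerate(workers):
        # The first `extra` workers take one more bucket when the split is uneven.
        size = base + (1 if i < extra else 0)
        assignment[worker] = list(range(start, start + size))
        start += size
    return assignment


print(assign_buckets(["a", "b", "c"], 6))  # {'a': [1, 2], 'b': [3, 4], 'c': [5, 6]}
print(assign_buckets(["a", "c"], 6))       # {'a': [1, 2, 3], 'c': [4, 5, 6]}
```

Each worker only needs the member list and the bucket count to compute its own share, so no extra mapping has to be stored anywhere.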
*** links has joined #openstack-telemetry15:36
gordcjd_:  https://review.openstack.org/#/c/441389/4/gnocchi/cli.py if you look at line 17015:37
gordchmm. i think there might be some concurrency issue in the code. ignore that :)15:40
jd_yeah that works too, though the rebalancing is going to be much heavier15:42
*** andreaf has joined #openstack-telemetry15:42
jd_every metricd is going to lose some of its sacks and gain new sacks to take care of15:42
*** yprokule has quit IRC15:42
jd_whereas the hashring would limit that IIUC15:43
jd_(not saying it's particularly a huge problem in metricd's context though)15:43
andreafsileht, jd_, gordc: do you have any idea about what may be wrong with the grenade job on https://review.openstack.org/#/c/439414 ? Do you see that in other patches?15:43
gordcsure, but it doesn't have to make any new connections or anything? it's basically, interval1-> look at folder1, folder2. interval2 -> look at folder3, folder4?15:44
jd_andreaf: I think gordc fixed that but we're blocked by requirements15:44
jd_andreaf: IIUC15:44
jd_gordc: I didn't get your interval thing15:44
andreafjd_: heh ok, good to know, I will stop rechecking15:44
andreafjd_: thanks15:44
gordcandreaf: (i think) i fixed it :)15:45
andreafgordc: ok another recheck coming then15:45
gordcandreaf: https://review.openstack.org/#/c/442913/15:45
gordcwell it's not merged15:45
catintheroofjd_: hi! do you have an example on how the coordinator url should be configured on gnocchi if i want to use FILE coordinator ?15:45
gordcwe're blocked by some other pbr patch15:45
jd_catintheroof: file:///some/directory15:45
gordcjd_: so metricd has processing_interval, basically every 60s, it will run through its buckets and dump all the metrics_with_measures15:46
catintheroofjd_: thx !15:46
gordcif my buckets change, the next interval, i just dump different buckets.15:46
andreafgordc: doh, right15:46
andreafgordc: I was too quick, sorry15:46
gordci think the rebalancing is important if each of those buckets required a separate connection, or if we had some localised stuff15:47
gordcthis is my understanding15:47
jd_gordc: if a scheduler rebalances too much (imagine an extreme case where 50% of it changes) then it will add measures in its Queue that are already in another Queue, putting a lot of workers in contention because they will try to do the same thing15:48
jd_gordc: I'm not saying it's going to happen every 5 minutes lol but that might happen15:49
jd_gordc: the hashring has the good idea to solve that IIRC for free so15:49
jd_if it's free… :P15:49
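A rough way to see the difference jd_ is pointing at: count how many buckets change owner when one worker leaves, once with a contiguous equal split and once with a hash ring. This is a toy ring built on hashlib and bisect, not the tooz partitioner; the helper names, the 64 virtual nodes and the 12 buckets are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Stable integer hash of a string key."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def contiguous_owner_map(workers, num_buckets):
    """Bucket -> worker, using contiguous near-equal runs."""
    workers = sorted(workers)
    base, extra = divmod(num_buckets, len(workers))
    owners, bucket = {}, 0
    for i, worker in enumerate(workers):
        for _ in range(base + (1 if i < extra else 0)):
            owners[bucket] = worker
            bucket += 1
    return owners


def ring_owner_map(workers, num_buckets, vnodes=64):
    """Bucket -> worker, using a simple consistent-hash ring with virtual nodes."""
    ring = sorted((_hash("%s-%d" % (w, v)), w) for w in workers for v in range(vnodes))
    points = [point for point, _ in ring]
    owners = {}
    for bucket in range(num_buckets):
        # First ring point clockwise from the bucket's hash, wrapping around.
        idx = bisect.bisect(points, _hash("bucket-%d" % bucket)) % len(ring)
        owners[bucket] = ring[idx][1]
    return owners


def moved(before, after):
    """Buckets whose owner differs between two owner maps."""
    return sorted(b for b in before if before[b] != after[b])


for name, owner_map in (("contiguous", contiguous_owner_map), ("ring", ring_owner_map)):
    before = owner_map(["a", "b", "c"], 12)
    after = owner_map(["b", "c"], 12)
    print(name, "buckets moved when a leaves:", len(moved(before, after)))
```

With the contiguous split, removing the first worker shifts the boundaries of the remaining ones, so some buckets move between workers that are still alive. With the ring, only the buckets that belonged to the departed worker get a new owner.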
gordcjd_: ah, yes, in that case.15:50
gordc*shrugs* i'll add a note. we can't change buckets anyways15:50
gordc:P15:50
jd_it also provides replicas, which might be useful at some point? I wonder15:51
gordcmaybe, if you change the model of single scheduler/metricd15:52
gordcbut that just gives you the contentions i would think.15:52
*** sudipto has quit IRC15:52
jd_right15:52
*** sudipto_ has quit IRC15:52
jd_my main point is that I really don't want to involve the indexer in any of this as tooz ought to be enough15:53
jd_I'd be willing to remove as much as possible access to the indexer even in what we have currently (as you mentioned the AP details)15:54
gordclol. i think you're going to end up with a lot of duplicate data in indexer/storage... and really really long names to parse15:55
*** jahsis has quit IRC15:56
jd_long names to parse?15:56
gordcwell you might need to store all that archive info in the object names15:58
gordci don't know ... just random musings15:58
*** rcernin has quit IRC15:58
silehtceph doesn't support long names16:04
silehtif ext4 is used 256 chars max I think16:04
*** donghao has quit IRC16:06
gordcsileht: i'll let you folks figure it out :)16:10
gordcjd_: so i have one minor concern, where are we specifying total buckets? in conf?16:11
gordcthis might get really buggy, if all the conf files don't have same value...16:12
gordci don't know how big an issue this is.16:13
silehtgordc, we can share the conf value through the coordinator capability and raise a warning in the log if misconfigured16:14
jd_conf file is ok gordc , they have to be identical everywhere16:15
jd_gnocchi does not work if you specify different databases either, it's no different… :)16:15
silehthaha :)16:16
gordcjd_: have you tried? you have no proof it doesn't work :P16:16
jd_gordc: true.16:17
jd_it'll work with quantum computing, but not yet16:17
gordcdid we figure out why integration gate still fails randomly? the console shows no alarm, instance, stack16:20
gordchttp://logs.openstack.org/95/442395/1/gate/gate-telemetry-dsvm-integration-gnocchi-ubuntu-xenial/90640b3/console.html#_2017-03-08_02_48_38_88892416:20
gordcsome reason gnocchi/panko return stuff fine.16:20
*** Kevin_Zheng has quit IRC16:23
*** nadya has quit IRC16:26
jd_gordc: no idea :(16:30
*** dschultz has joined #openstack-telemetry16:40
*** nicodemus_ has joined #openstack-telemetry16:40
nicodemus_hello16:41
nicodemus_I'm doing some testing with gnocchi stable/3.1 with S3 backend, and gnocchi-metricd is showing a repeating error: http://paste.openstack.org/show/601960/16:43
nicodemus_I'm not quite sure if it's regarding S3, or perhaps the coordination... has anyone seen this error?16:44
gordci don't recall seeing that, but it's not related to s3... the queue is between scheduler worker giving groups of metric_ids to processing workers16:48
*** links has quit IRC16:55
openstackgerritMerged openstack/gnocchi master: simplify swift report  https://review.openstack.org/44195616:57
nicodemus_gordc, so the coordination looks like a more plausible culprit17:01
*** cdent has quit IRC17:01
*** nadya has joined #openstack-telemetry17:03
*** lhx_ has quit IRC17:16
jd_nicodemus_: what's your coordination?17:18
*** tesseract has quit IRC17:18
jd_I never saw that error either17:18
*** rbak has quit IRC17:20
*** magicboiz has joined #openstack-telemetry17:20
nicodemus_jd_, it's Redis, but since it's on AWS I'm not 100% sure it's deployed properly. I'm double-checking whether using file for coordination changes anything17:20
magicboizHi, has anyone seen gnocchi/ceilometer error "Failed to connect to db, purpose metering retry later: 'NoneType' object has no attribute 'find'" before?? http://paste.openstack.org/show/601901/17:21
*** rbak has joined #openstack-telemetry17:21
jd_magicboiz: lol yes it happens if you don't set any database url for ceilometer api17:21
gordcjd_: :/ so i'm trying it out right now... if we change buckets, we don't lose measures. we just have stale measures that are visible but can never be processed/removed17:23
*** vint_bra has joined #openstack-telemetry17:24
gordcwhich is worse.17:24
magicboizjd_:: in my config I have set something like:17:24
magicboiz[dispatcher_gnocchi]17:24
magicboizfilter_service_activity = False17:24
magicboizurl = <<http://x.x.x.x:gnocchi_api_port>>17:24
jd_gordc: I'm lacking context here :)17:24
magicboizjd_: you mean that?17:24
jd_magicboiz: so that means ceilometer agent will send data to gnocchi at this address if asked to do so17:24
jd_magicboiz: no I don't17:24
jd_magicboiz: I know documentation is sparse but did you take a look at it first? it can help understanding how things work17:25
magicboizjd_: yes I did, believe me. And also blogs, etc. ;)17:25
gordcmagicboiz: ceilometer-api is not relevant if you have gnocchi.17:26
magicboizjd_: actually, I'm trying to debug why kolla (official openstack project) deploys ceilometer with gnocchi as backend and it fails....17:26
magicboizgordc: why?17:26
jd_you mean ceilometer (official openstack project) I imagine17:26
jd_:p17:26
jd_gordc: why???17:27
magicboizjd_: no, I mean kolla: https://docs.openstack.org/developer/kolla/17:27
gordcjd_: magicboiz: um because it's not?lol17:27
gordcgnocchi is an alternative to ceilometer+mongodb not an alternative to mongodb17:28
jd_magicboiz: do you mean kolla or kolla (official openstack project)?17:28
jd_ok jk17:28
magicboizgordc: is it not possible to setup gnocchi as backend for ceilometer, instead of using mongodb, while keeping ceilometer running?17:29
gordcjd_: scheduler puts metric_ids to process on queue, not where it is. we compute that later.17:30
gordcmagicboiz: ceilometer is many parts. not just an api, agents still run17:31
gordcthey just write to gnocchi.17:31
jd_gordc: right that's probably not enough, you'd need to push bucket+metric+files17:31
magicboizaccording to https://docs.openstack.org/developer/ceilometer/architecture.html#storing-accessing-the-data, it is....17:32
gordchuh?17:32
gordcif you notice, there is no ceilometer-api listed anywhere17:33
jd_I second that huh17:33
magicboizgordc: yes, I have ceilometer-agent-compute, collector, etc etc. But my last goal is to eliminate mongo from my deployment, while keeping ceilometer running...17:33
gordcor if there is. it's wrong.17:33
jd_IIUC magicboiz wants ceilometer-api that uses gnocchi as a backend17:33
jd_but since that's not possible he is going to keep asking until we say how to do that :p17:33
gordcmagicboiz: https://docs.openstack.org/developer/ceilometer/architecture.html#high-level-architecture the pipeline pushes straight to gnocchi.17:34
magicboizgordc: ok, so why error "Failed to connect to db, purpose metering retry later: 'NoneType' object has no attribute 'find'"?? http://paste.openstack.org/show/601901/17:35
*** nadya has quit IRC17:39
gordcjd_: i could push bucket+metric. but sigh... more changes :P17:40
jd_gordc: that can be a first change before anything? that'll help the further optimization idea anyway17:40
gordcjd_: possibly. it just bothers me that they are connected but we're disconnecting them and having to deal with it in many places17:42
*** shardy has quit IRC17:45
gordcjd_: so, bucket+metric works when processing. but it's broken on delete.17:57
gordcwe actually can't figure out how to delete if bucket changes.17:58
gordc(and unprocessed stuff remains)17:58
jd_why would bucket change?17:59
jd_gordc: ^18:00
gordcjd_: well it's a conf option and people are people18:01
jd_gordc: the number of bucket  you mean?18:01
gordcright18:01
jd_but it does not change, it's an init parameter and they are created on upgrade18:01
jd_you don't even have to put it in a conf I guess18:01
jd_gnocchi-upgrade --bucket 102418:02
jd_and voila18:02
gordcbasically, if it ever gets changed, there's a good chance there's going to be stuff that we can never figure out how to delete but will be visible18:02
jd_but … it does not change18:02
jd_we already decided that18:02
gordcthat's what i was planning :) but how would metricd know how many bucket there are?18:02
jd_also it does not work if the ceph pool is deleted you know18:02
jd_gordc: ls?18:03
gordcit has to count buckets on start?18:03
jd_yeah18:03
jd_if it's slow we can store it in the storage driver18:03
jd_whatever18:03
gordcthis seems very fragile.18:05
jd_how so?18:05
gordcwell this is all dependent on user not ever changing bucket18:06
jd_facepalm18:06
jd_but it can't change bucket18:06
jd_everything depends on user not deleting the ceph pool or shutting the sql server too18:07
jd_lol18:07
jd_gnocchi-upgrade --bucket=32 then you create 32 containers in swift and you write 32 in an object so you don't have to list the buckets and done18:07
gordcbut that's catastrophic at least18:07
jd_next time you call upgrade it knows how many buckets there are18:07
jd_hahaha18:07
jd_so it's very fragile, but not enough, I get it18:08
jd_:P18:08
gordcthis will look like it's still working but there's just crap that you can't access but can see18:08
gordcexactly :)18:08
jd_so IF the user connects to swift and manipulates containers it will fail indeed18:08
jd_same if it types random sql statements18:08
jd_:D18:08
gordcwell that's not through our api18:08
gordci'm just saying, my solution protects against more stupidity18:09
gordc:P18:09
jd_how so? if the user connects to the database and changes things, how does it work?18:11
jd_like the sack of a metric18:11
gordcthey wouldn't be connecting to db themselves?18:12
gordcthe idea was no matter what, index knows where it is writing to. whatever process that changes it, knows it has to cleanup the previous place after change.18:13
gordci'll push no-indexer-change patch soon. we can see if we need more idiot-proofing18:15
jd_gordc: so… they would not connect to db but they would connect to swift to create new bucket or change a file?18:15
jd_cmon :p18:16
jd_(change a file in swift I mean)18:16
gordchuh? why would they connect to swift?18:16
jd_to change the buckets18:17
jd_since that's the only way to do it18:17
jd_[19:07:44]  <jd_>gnocchi-upgrade --bucket=32 then you create 32 containers in swift and you write 32 in an object so you don't have to list the buckets and done18:17
gordcyou have an upgrade-agent compute new bucket location. if it changes, update bucket in indexer, process any old stuff in previous bucket, next...18:17
jd_if you do that then it's _impossible_ to change the bucket without doing things manually in the storage backend18:17
gordcwhys that? it's just not one step.18:18
jd_to create buckets?18:18
jd_what's not?18:19
gordcgnocchi-upgrade already creates buckets?18:19
gordcwhy would you need to do it manually?18:19
jd_no need18:19
jd_it's you inventing users that mess with buckets18:19
jd_so i'm trying to demonstrate what would be required to do so18:20
jd_mess with the buckets18:20
gordcthey run gnocchi-upgrade --bucket 32 and then gnocchi-upgrade --bucket 6418:20
jd_"don't mess with the buckets boyzz 𝅘𝅥𝅮"18:20
gordcthat doesn't seem hard for user18:20
jd_gordc: ERROR18:20
jd_there are already 32 buckets18:20
jd_move on boyz18:21
jd_that's _easy_ no?18:21
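The behaviour jd_ proposes (record the bucket count once at upgrade time, then refuse any different value) could look roughly like this. It is only a sketch: a plain dict stands in for the storage backend, and the function, exception and key names are hypothetical rather than anything in gnocchi-upgrade.

```python
class BucketCountMismatch(Exception):
    """Raised when an upgrade asks for a bucket count other than the stored one."""


def upgrade(storage, requested_buckets):
    """Create the buckets on first run; afterwards only accept the same count.

    `storage` is a dict standing in for Swift, Ceph or a filesystem.
    """
    existing = storage.get("bucket_count")
    if existing is None:
        # First run: create the buckets and remember how many there are,
        # so later runs do not need to list them.
        for i in range(requested_buckets):
            storage["bucket-%d" % i] = []
        storage["bucket_count"] = requested_buckets
        return requested_buckets
    if existing != requested_buckets:
        raise BucketCountMismatch("there are already %d buckets" % existing)
    return existing


storage = {}
upgrade(storage, 32)      # creates 32 buckets and records the count
try:
    upgrade(storage, 64)  # rejected: the count is already fixed at 32
except BucketCountMismatch as exc:
    print("ERROR:", exc)
```

Storing the count next to the buckets means metricd and the API can read it from storage instead of relying on every conf file agreeing.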
jd_"don't mess with the buckets boyzz 𝅘𝅥𝅮"18:21
gordclol what if they underestimated target size?18:21
gordcstart over?18:21
jd_gordc: RTFM?18:21
jd_yep18:21
jd_as I said 3 or 4 times today, it's exactly what Swift did for a few years :)18:22
gordcthen why don't we just set it ourselves to a really big number?18:22
gordcand make it constant18:22
jd_then it's not impossible to implement number of bucket change but it's not the first feature I'd do18:22
jd_gordc: I think it's a real option18:22
jd_having a default of 2^12 for example18:22
jd_which should be large enough for most people18:22
gordcdid i upload that patch? i had it as a constant.18:22
openstackgerritJulien Danjou proposed openstack/ceilometer master: agent: start coordinator at run() and never stops  https://review.openstack.org/44326718:25
openstackgerritJulien Danjou proposed openstack/ceilometer master: agent: only create partition coordinator if backend url provided  https://review.openstack.org/44326818:25
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: create coordinator at init time  https://review.openstack.org/44326918:25
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: stop checking for _coordinator to be None  https://review.openstack.org/44327018:25
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: remove group_id check  https://review.openstack.org/44327118:25
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: fix leave_group() async call  https://review.openstack.org/44327218:25
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: make group_id to never be None  https://review.openstack.org/44327318:25
openstackgerritgordon chung proposed openstack/gnocchi master: WIP bucketise incoming  https://review.openstack.org/44138918:29
jd_ 6 files changed, 71 insertions(+), 210 deletions(-)18:29
jd_the amount of useless code one can write18:29
jd_oneS18:29
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: remove started check  https://review.openstack.org/44327718:31
flwang1gordc:18:33
flwang1(01:05:39) flwang1: we're running into a problem of instance status18:33
flwang1(01:06:06) flwang1: after shelve/unshelve, the instance is always showing active in the samples18:33
flwang1(01:06:11) flwang1: did you see this before?18:33
flwang1jd_: ^18:33
flwang1thanks in advance18:33
gordcflwang1: nope... but i also don't manage a (real) cloud18:35
flwang1gordc: ok, IIRC, instance metrics can be collected by notification and pollster, right?18:36
gordcyes18:36
flwang1so will the notification impact the result collected by pollster?18:36
gordcno18:36
*** cdent has joined #openstack-telemetry18:36
gordcor i don't know what you mean18:36
gordcthey both do their own thing. if you're asking if they generate same meters, then maybe, they will18:37
*** chlong_ has quit IRC18:37
flwang1and how does the pollster get the metadata of the instance?18:37
flwang1gordc: yep, i'm asking if they may generate same samples18:39
gordci imagine so. you can check admin-guide. there's a chart of meter's source18:39
flwang1gordc: ok, i see. the problem is very weird. i know a bit of ceilometer i think, but currently the problem is totally beyond my knowledge18:41
flwang1because the status collected by pollster is not correct18:41
flwang1ok, i will go through the code and bug you guys later, thanks a lot18:41
*** flwang1 has quit IRC18:41
gordckk, bbl. going to get lunch18:41
*** rcernin has joined #openstack-telemetry18:42
*** chlong_ has joined #openstack-telemetry18:53
*** cdent has quit IRC19:03
nicodemus_Does gnocchi stable/3.1 require a specific gnocchi dispatcher version? I'm using ceilometer mitaka, and for each POST gnocchi returns 404 but then the dispatcher doesn't create the resource...19:07
*** cdent has joined #openstack-telemetry19:08
*** rwsu has quit IRC19:09
*** pcaruana has quit IRC19:13
openstackgerritJulien Danjou proposed openstack/ceilometer master: coordination: remove started check  https://review.openstack.org/44327719:15
jd_nicodemus_: it should not, check your gnocchiclient version too19:16
nicodemus_jd_, is there a minimum version of gnocchiclient needed for stable/3.1?19:17
jd_nicodemus_: hum the latest one would be recommended19:17
jd_there was also some change with how ID are encoded19:17
nicodemus_jd_, I'll give it a try with the last client then. Thanks!19:17
jd_cool19:18
*** Jack_I has quit IRC19:18
nicodemus_jd_, with gnocchiclient==3.1.1 I have the same behavior... the strange thing is that the dispatcher doesn't give a clear error log, it simply says "Not found (HTTP 404)" after each POST to the gnocchi api19:22
*** g3ek has quit IRC19:32
*** flwang1 has joined #openstack-telemetry19:39
*** amoralej is now known as amoralej|off19:39
*** cdent has quit IRC19:40
*** g3ek has joined #openstack-telemetry19:41
flwangjd_: can you remind me how ceilometer gets the nova instance list by polling? Thanks19:51
*** Jack_I has joined #openstack-telemetry19:51
flwangjd_: gordc: in other words, at this line https://github.com/openstack/ceilometer/blob/kilo-eol/ceilometer/compute/pollsters/instance.py#L25 where is the 'resources' coming from?19:52
nicodemus_I'm seeing that gnocchi api stable/3.0 used to send a text/plain reply while stable/3.1 answers with content-type application/json, is that correct?19:56
gordcflwang: https://github.com/openstack/ceilometer/blob/kilo-eol/ceilometer/agent/base.py#L12319:58
flwanggordc: cool, i have to admit i have forgotten most of the ceilometer code :)19:59
flwangbtw, the instance metric is collected by central agent or compute agent?19:59
gordccentral. (i remember it came up last time, but just a reminder: it's not there anymore)20:00
flwangyep, i know. we're using kilo20:01
gordckk20:01
gordcit should be central. but i'm half guessing. 50/5020:01
*** narasimha_SV has joined #openstack-telemetry20:03
nicodemus_Apparently the dispatcher is not creating the resources because the exception it looks for before creating one carries a "resource not found" message, not a plain "Not found" message. My question is, under what circumstances could the gnocchi API reply with a "Not found" instead of a "Resource not found"?20:04
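A minimal Python sketch of the behaviour described in the message above: a handler that creates the missing resource only when the 404 text mentions the resource, so a plain "Not found" is merely logged. Every name here (`NotFound`, `should_create_resource`, `handle_post_failure`, and the message strings) is hypothetical and is not taken from the ceilometer dispatcher or gnocchiclient code.

```python
class NotFound(Exception):
    """Hypothetical stand-in for a client-side HTTP 404 error."""


def should_create_resource(exc):
    # Assumption: the handler decides based on the error text,
    # as the message above suggests.
    return "resource" in str(exc).lower()


def handle_post_failure(exc, created):
    """Create the missing resource only for a resource-specific 404."""
    if should_create_resource(exc):
        # Placeholder for the real "create resource, then retry" step.
        created.append("resource")
        return "created"
    # A generic 404 falls through and is only logged.
    return "logged"
```

With this kind of matching, a change in the wording of the server's 404 body is enough to stop resources from being created, which would be consistent with the symptom reported here.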
flwanggordc: could you please let me know how the agent manager gets the instance list by discovery, https://github.com/openstack/ceilometer/blob/kilo-eol/ceilometer/agent/base.py#L13720:15
flwangsorry, i don't have much time to understand all the code, it's an urgent issue20:15
*** chlong_ has quit IRC20:16
gordcflwang: https://github.com/openstack/ceilometer/blob/eb970605d7a7263007f36136bf5ae052cf44984a/ceilometer/compute/discovery.py#L12020:25
flwanggordc: so for kilo, is it here https://github.com/openstack/ceilometer/blob/kilo-eol/ceilometer/compute/discovery.py#L40 ?20:27
*** chlong_ has joined #openstack-telemetry20:28
flwangseems it's collected by compute agent20:28
nicodemus_gnocchi's newton dispatcher doesn't seem to work with the latest gnocchi client :(20:28
flwangcan anybody confirm that?20:28
*** sergio_ has joined #openstack-telemetry20:28
*** sergio_ is now known as Guest8853420:29
gordcnicodemus_: maybe open a bug? i recall there being some changes in webob which changed some stuff20:30
nicodemus_newton's dispatcher tries to call 'encode_resource_id' from gnocchiclient's utils.py that is present on client 2.7 but not on 3.1.120:31
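One way to check for this kind of mismatch before starting the dispatcher is to test whether the installed client still exposes the helper. The sketch below is only an illustration: `module_has` is a hypothetical helper, and the module path `gnocchiclient.utils` is assumed from the message above rather than verified against any particular release.

```python
import importlib


def module_has(module_name, attr_name):
    """Return True if the module imports and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package (or submodule) is not installed at all.
        return False
    return hasattr(module, attr_name)


# Assumption: the helper lives in gnocchiclient.utils, as the message says.
if not module_has("gnocchiclient.utils", "encode_resource_id"):
    print("encode_resource_id is not available in the installed gnocchiclient")
```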
nicodemus_I'm confusing myself with so many versions20:32
gordcocata dispatcher is designed to work against gnocchiclient 3.120:33
gordcand gnocchi 3.120:33
gordcso if you want gnocchi 3.1, try just taking the code from ocata20:33
nicodemus_gordc, oooh now that makes sense20:34
gordcand if you can contribute that to docs, that's even better.20:34
nicodemus_I'd need to recall the whole process to contribute, but I think I can do that :)20:35
*** rcernin has quit IRC20:36
gordcnicodemus_: no pressure :)20:36
flwanggordc: based on this https://github.com/openstack/ceilometer/blob/ffdb2977e36e99528b70540a0c83de04fb13ffd6/setup.cfg#L55 seems the instance metric is collected by compute agent20:37
nicodemus_Let me ask you just one more question before changing for ocata (not related): if I have a resource that has an 'ended_at' date, what would happen if the dispatcher tries to POST new measures for that resource?20:37
flwanggordc: pls skip my last msg20:38
gordcflwang: yes, seems so https://github.com/openstack/ceilometer/blob/stable/mitaka/ceilometer/compute/pollsters/instance.py20:38
nicodemus_flwang, I believe instance is collected by compute agent. In my case, I use agent-central to poll through SNMP in order to get measures from the hypervisors20:38
nicodemus_and agent-compute for the instances on each compute node20:39
flwanggordc: then i think we're running into a weird bug20:39
gordcnicodemus_: i imagine you still can. i don't think it changes status of metrics20:39
flwangif the instance metric is collected by compute agent20:39
nicodemus_gordc, kk. Thanks!20:39
flwangafter the instance is shelved, it won't belong to any host technically20:39
flwangdid i miss anything?20:39
flwangwith that context, can the compute agent still get the instance which has been shelved?20:40
gordcflwang: maybe? i don't know what happens when shelved20:40
gordcno20:40
gordcit only queries whatever instances nova tells us is on the host20:41
gordc(a little different) in ocata20:41
gordcstill have no idea if can see 'shelved'20:41
flwangok, so now, can we confirm the 'instance' metric is collected by the compute agent instead of central agent?20:42
flwangbecause i think it will impact the final result20:42
gordcyes, compute.20:42
gordci mean, it won't matter since in kilo they would definitely be dependent on what nova tells us is on host20:43
*** adriant has joined #openstack-telemetry20:43
*** Guest88534 has quit IRC20:46
openstackgerritgordon chung proposed openstack/gnocchi master: push incoming into different sacks  https://review.openstack.org/44138920:49
flwanggordc: yep, but if the instance is collected by central agent, then it doesn't make sense to use 'host' as the parameter21:02
jd_nicodemus_: ended_at is just an information field, you can post metric anyway21:11
nicodemus_got it. Thanks jd_ !21:12
*** narasimha_SV has quit IRC21:13
flwanggordc: still around?21:18
flwangjd_: gordc: i saw there is a discovery_cache  https://github.com/openstack/ceilometer/blob/kilo-eol/ceilometer/agent/base.py#L13521:19
flwangso if i could see the instance last time, and this time when polling the instance is gone, will the cache be refreshed?21:19
*** thorst has quit IRC21:30
*** thorst has joined #openstack-telemetry21:31
*** thorst has quit IRC21:35
nicodemus_I guess I know the answer, but.. is there any way of forcing gnocchi's URL in ceilometer ocata in the config file? I'm doing some testing and want to use an alternate gnocchi without having to change the endpoint in keystone21:36
gordcnicodemus_: um... i imagine there's some endpoint_override param but i'm not entirely sure.21:49
*** thorst has joined #openstack-telemetry21:50
gordcstepping out. sorry.21:50
*** gordc has quit IRC21:50
*** fguillot has quit IRC21:50
openstackgerritgordon chung proposed openstack/gnocchi master: push incoming into different sacks  https://review.openstack.org/44138921:50
*** nicodemus_ has quit IRC21:53
*** dave-mccowan has quit IRC22:13
*** yassine has quit IRC22:13
*** yassine has joined #openstack-telemetry22:21
*** chlong_ has quit IRC22:24
*** vint_bra has quit IRC22:28
*** rwsu has joined #openstack-telemetry22:31
*** catintheroof has quit IRC22:44
flwangjd_: ping22:53
flwangany telemetry core around?22:54
*** thorst has quit IRC22:54
*** thorst has joined #openstack-telemetry22:55
flwangpls tell me I'm wrong, for this line https://github.com/openstack/ceilometer/blob/kilo-eol/ceilometer/agent/base.py#L13522:55
flwangdoes that mean the resource list will be cached22:55
flwangin other words, even if the resource has been deleted, ceilometer will continually insert samples into db?22:56
*** thorst has quit IRC22:59
*** iceyao has joined #openstack-telemetry23:04
*** Jack_I has quit IRC23:06
*** iceyao has quit IRC23:08
*** yassine has quit IRC23:16
*** thorst has joined #openstack-telemetry23:20
*** thorst has quit IRC23:24
*** catintheroof has joined #openstack-telemetry23:28
*** g3ek has quit IRC23:30
*** joadavis_ has joined #openstack-telemetry23:39
*** g3ek has joined #openstack-telemetry23:39
*** joadavis has joined #openstack-telemetry23:40
*** joadavis_ has quit IRC23:41
*** pradk has quit IRC23:55
*** david-lyle has quit IRC23:56
