Tuesday, 2015-11-17

00:01 <harlowja> klindgren ya, the question is how to get there from where we are now
00:01 <harlowja> that is the mega-change that i'm unsure how to get to :-P
00:07 *** arnoldje has joined #openstack-performance
01:05 *** arnoldje has quit IRC
01:16 *** arnoldje has joined #openstack-performance
01:23 *** paco20151113 has joined #openstack-performance
01:28 *** arnoldje has quit IRC
01:37 *** markvoelker has joined #openstack-performance
01:59 *** mriedem_away is now known as mriedem
02:03 *** mwagner has joined #openstack-performance
02:07 *** mriedem has quit IRC
02:43 <kun_huang> ha
02:44 <kun_huang> harlowja: "i think we (yahoo) really want to switch osprofiler to not ceilometer" -- what's yahoo's alternative to ceilometer?
02:45 <harlowja> something internal that uses hbase or tsdb or something
02:45 <harlowja> as u can imagine metrics gathering and such has existed many years before ceilometer
02:45 <harlowja> many many years
02:46 <harlowja> so it's ummm, sorta hard to say, use this ceilometer thing, when that other stuff has existed for years and is being actively worked on
02:47 <harlowja> and the fact that ceilometer has (had?) scale issues doesn't make that sell any easier..
02:48 <harlowja> i think kun_huang http://www.slideshare.net/HBaseCon/ecosystem-session-6/6 references what exists (or what is being built or something)
02:49 <harlowja> probably other slides somewhere too
02:50 <harlowja> but ya, that's the gist of the reasoning...
03:06 *** bapalm has quit IRC
03:06 *** harshs has quit IRC
03:16 *** harshs has joined #openstack-performance
03:33 *** arnoldje has joined #openstack-performance
03:46 *** harshs has quit IRC
04:03 <kun_huang> harlowja: I'm sorry, I didn't receive notification of your message...
04:04 <kun_huang> harlowja: thank you first, does your team use redis to store some perf data?
04:05 <boris-42> harlowja: so we will do mongodb and influxdb and elasticsearch
04:05 <boris-42> harlowja: for osprofiler
04:05 <boris-42> harlowja: so it will work well
04:06 *** dims has quit IRC
04:09 <kun_huang> boris-42: DinaBelova said there is a performance team in mirantis
04:13 <kun_huang> are you in that team? I'm curious about how this team works with other teams in mirantis
04:14 <kun_huang> we (huawei) are running a public cloud, but we don't have such a performance team to solve issues
04:14 <kun_huang> they solve every performance issue by deploying more nodes
04:46 *** harshs has joined #openstack-performance
04:56 *** swann has joined #openstack-performance
05:00 *** serverascode has quit IRC
05:00 *** swann_ has quit IRC
05:00 *** mgagne has quit IRC
05:00 *** mgagne has joined #openstack-performance
05:03 *** serverascode has joined #openstack-performance
05:06 <klindgren> when you can print servers....
05:07 <kun_huang> klindgren: ?
05:10 <klindgren> huawei is, among many other things, a server vendor... no?  We at least tested some SKUs from you guys.  So when you can literally make servers, just adding more boxes is probably at some point the easiest solution.
05:12 <kun_huang> klindgren: huawei has many production lines: switches, servers, cloud computing, storage like EMC's, an OS forked from SUSE, cellphones.....
05:12 <kun_huang> klindgren: I'm in cloud computing...
05:17 *** rpodolyaka1 has joined #openstack-performance
05:22 *** rpodolyaka1 has quit IRC
05:42 *** rpodolyaka1 has joined #openstack-performance
05:54 *** aswadr has joined #openstack-performance
06:11 *** harshs has quit IRC
06:55 *** aswadr has quit IRC
06:56 *** rpodolyaka1 has quit IRC
06:57 <boris-42> kun_huang: so actually we have 2 teams related to performance/scalability
06:57 <boris-42> kun_huang: one is a QA team, the other one is RnD
06:58 <boris-42> kun_huang: so we are testing stuff on a daily basis and trying to figure out the issues; after that we are involving component teams that are fixing the issues we found in our product/upstream
07:02 <kun_huang> you have TWO teams only, that is the point. There are many QA teams and RnD teams in huawei
07:03 <kun_huang> with geographic/political competition also
07:03 *** harlowja_at_home has joined #openstack-performance
07:03 <kun_huang> and it's not strange for me to get no performance topics from the public cloud side
07:04 <kun_huang> the QA team I will meet these days looks like "open guys" and could learn something
07:07 *** rpodolyaka1 has joined #openstack-performance
07:07 *** itsuugo has joined #openstack-performance
07:08 *** arnoldje has quit IRC
07:17 *** harlowja_at_home has quit IRC
07:23 *** rpodolyaka1 has quit IRC
07:25 *** rpodolyaka1 has joined #openstack-performance
08:06 *** itsuugo has quit IRC
08:07 *** itsuugo has joined #openstack-performance
08:12 *** aojea has joined #openstack-performance
08:14 *** itsuugo has quit IRC
08:47 *** rpodolyaka1 has quit IRC
09:03 *** xek has joined #openstack-performance
09:23 *** rpodolyaka1 has joined #openstack-performance
09:43 *** rpodolyaka1 has quit IRC
09:45 *** rpodolyaka1 has joined #openstack-performance
10:05 *** markvoelker has quit IRC
10:12 *** paco20151113 has quit IRC
10:49 *** aojea has quit IRC
10:50 *** rpodolyaka1 has quit IRC
11:06 *** itsuugo has joined #openstack-performance
11:06 *** markvoelker has joined #openstack-performance
11:08 *** dims has joined #openstack-performance
11:11 *** markvoelker has quit IRC
11:21 *** redixin has joined #openstack-performance
11:23 *** rmart04 has joined #openstack-performance
11:57 *** rpodolyaka1 has joined #openstack-performance
11:59 *** rpodolyaka1 has quit IRC
12:09 *** rpodolyaka1 has joined #openstack-performance
12:19 *** itsuugo has quit IRC
12:19 *** itsuugo has joined #openstack-performance
12:37 *** markvoelker has joined #openstack-performance
12:42 *** markvoelker has quit IRC
12:53 *** itsuugo has quit IRC
13:27 *** markvoelker has joined #openstack-performance
13:34 *** regXboi has joined #openstack-performance
13:35 *** rpodolyaka1 has quit IRC
13:52 *** msemenov has joined #openstack-performance
13:54 *** itsuugo has joined #openstack-performance
13:56 *** rpodolyaka1 has joined #openstack-performance
14:09 *** mdorman has joined #openstack-performance
14:20 *** itsuugo has quit IRC
14:20 *** itsuugo has joined #openstack-performance
14:32 *** mriedem has joined #openstack-performance
14:48 *** rvasilets___ has joined #openstack-performance
14:49 *** ozamiatin has joined #openstack-performance
14:51 *** rohanion has joined #openstack-performance
14:55 *** mriedem has quit IRC
14:57 <DinaBelova> harlowja - did you have a chance to wake up? :)
14:57 *** manand has joined #openstack-performance
14:59 <DinaBelova> probably not :)
15:00 <DinaBelova> #startmeeting Performance Team
15:00 <openstack> Meeting started Tue Nov 17 15:00:02 2015 UTC and is due to finish in 60 minutes.  The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00 <openstack> The meeting name has been set to 'performance_team'
15:00 <DinaBelova> hello folks!
15:00 <rvasilets___> o/
15:00 <rohanion> Hi!
15:00 <ozamiatin> o/
15:00 <kun_huang> good evening :)
15:00 <kun_huang> o/
15:00 <DinaBelova> kun_huang - good evening sir
15:00 <DinaBelova> so today's agenda
15:00 <boris-42> hi
15:00 <DinaBelova> #link https://wiki.openstack.org/wiki/Meetings/Performance#Agenda_for_next_meeting
15:01 <DinaBelova> there was a complaint last time that there was not enough time to fill it
15:01 <DinaBelova> although this time it doesn't look very big either :)
15:01 <DinaBelova> so let's start with action items
15:01 <DinaBelova> #topic Action Items
15:01 <DinaBelova> last time we had two action items
15:02 <DinaBelova> #1 was about filling the etherpad https://etherpad.openstack.org/p/rally_scenarios_list with information about Rally scenarios used
15:02 <DinaBelova> in your companies :)
15:02 <DinaBelova> well, it looks like nothing has changed since the previous meeting
15:02 <DinaBelova> :(
15:02 *** AugieMena has joined #openstack-performance
15:02 <DinaBelova> I really hoped augiemena3, Kristian_, patrykw_ would fill it
15:03 <DinaBelova> although I do not see them here today
15:03 <kun_huang> I know Kevin had shared a topic about rally and neutron's control plane benchmarking
15:03 *** ikhudoshyn has joined #openstack-performance
15:03 <DinaBelova> kun_huang - oh, that's cool
15:03 <DinaBelova> do you have a link to that info?
15:03 *** mriedem has joined #openstack-performance
15:04 <kun_huang> a topic in tokyo, wait a minute
15:04 <AugieMena> Dina - my bad, should have filled in with some info
15:04 * mriedem joins late
15:04 * dims waves hi
15:04 <kun_huang> #link https://www.youtube.com/watch?v=a0qlsH1hoKs
15:04 <DinaBelova> #action everyone (who uses Rally for OpenStack testing inside your companies) fill etherpad https://etherpad.openstack.org/p/rally_scenarios_list with used scenarios
15:04 <DinaBelova> AugieMena - :)
15:04 <DinaBelova> please spend some time filling this etherpad
15:05 <DinaBelova> if we want to create a standard it'll be useful to collect some preliminary info
15:05 <DinaBelova> mriedem, dims o/
15:05 <DinaBelova> @kun_huang thank you sir
15:05 <DinaBelova> lemme take a quick look
15:05 <DinaBelova> ah, that's a video
15:05 <DinaBelova> so after the meeting :)
15:05 <DinaBelova> #action DinaBelova go through https://www.youtube.com/watch?v=a0qlsH1hoKs
15:05 <kun_huang> no problem
15:06 <DinaBelova> ok, cool
15:06 <DinaBelova> so one more action item was on Kristian_
15:06 *** llu-laptop has joined #openstack-performance
15:06 *** amaretskiy has joined #openstack-performance
15:06 <DinaBelova> he promised to collect the information about Rally blanks inside ATT
15:06 <DinaBelova> it looks like he was not able to join us today
15:07 <DinaBelova> #action DinaBelova ping Kristian_ about internal ATT Rally feedback gathering
15:07 <DinaBelova> so it looks like we went through the action items :)
15:07 <DinaBelova> just one more time - please fill https://etherpad.openstack.org/p/rally_scenarios_list
15:08 <DinaBelova> that will be super useful for future recommendations / methodologies creation
15:08 *** pkoniszewski has joined #openstack-performance
15:08 *** dansmith has joined #openstack-performance
15:08 <DinaBelova> I guess we may go to the next topic
15:08 <DinaBelova> #topic Nova-conductor performance issues
15:08 <DinaBelova> #link https://etherpad.openstack.org/p/remote-conductor-performance
15:08 *** alaski has joined #openstack-performance
15:08 <boris-42> DinaBelova: can we return to the previous topic?
15:09 <DinaBelova> boris-42 heh :)
15:09 <DinaBelova> I dunno how to make that easy using the bot controls
15:09 <dansmith> #undo
15:09 <DinaBelova> thanks!
15:09 <DinaBelova> #undo
15:09 <openstack> Removing item from minutes: <ircmeeting.items.Link object at 0xaf28d90>
15:09 <DinaBelova> boris-42 - feel free
15:09 <boris-42> dansmith: nice
15:10 <boris-42> DinaBelova: so we (Rally team) recently started working on a certification task
15:10 *** bauzas has joined #openstack-performance
15:10 <boris-42> #link https://github.com/openstack/rally/tree/master/certification/openstack
15:10 *** claudiub has joined #openstack-performance
15:10 <DinaBelova> boris-42 - sadly I do not have much info about this initiative
15:10 <DinaBelova> lemme take a quick look
15:10 <boris-42> DinaBelova: which is a much better way to share your experience
15:10 * bauzas waves
15:11 <boris-42> DinaBelova: than just using etherpads
15:11 <DinaBelova> boris-42 - that may be cool
15:11 <DinaBelova> so it's some kind of task for cloud validation
15:11 <boris-42> DinaBelova: so basically it's a single task that accepts a few arguments about the cloud and should generate proper load and test everything that you specified
15:11 <DinaBelova> boris-42 - a-ha, cool
15:12 <boris-42> DinaBelova: so basically it's an executable etherpad
15:12 <DinaBelova> ok, so that may be very useful for this purpose
15:12 <boris-42> DinaBelova: that you are trying to collect
15:12 <DinaBelova> thank you sir
15:12 <DinaBelova> we may definitely use it
15:12 <mriedem> so rally as defcore?
15:12 *** atuvenie_ has joined #openstack-performance
15:12 <kun_huang> boris-42: has the mirantis team used this feature?
15:12 *** abalutoiu has joined #openstack-performance
15:12 <boris-42> mriedem: so nope
15:12 <regXboi> mriedem: I'm trying to wrap my head around that :)
15:13 *** lpetrut has joined #openstack-performance
15:13 <DinaBelova> #info we may use https://github.com/openstack/rally/tree/master/certification/openstack to collect information about Rally scenarios used in various companies
15:13 <boris-42> mriedem: it's a pain in the neck to use rally to validate OpenStack
15:13 <DinaBelova> kun_huang - I had not heard about this, frankly speaking
15:13 <boris-42> mriedem: because you need to create such a task and it usually takes 2-3 weeks
15:13 <DinaBelova> kun_huang - but as boris-42 said this initiative is fairly new
15:14 <boris-42> mriedem: so we decided to create it once and avoid duplication of effort
15:14 <DinaBelova> boris-42 - very useful, thank you sir
15:14 <boris-42> mriedem: our goal is not to say whether it is openstack or not*
15:14 <boris-42> kun_huang: so we just recently made it
15:14 <boris-42> kun_huang: I know about only 1 usage and there were a bunch of issues that I am going to address soon
15:15 <mriedem> ok, maybe the readme there needs more detail
15:15 <DinaBelova> ok, very cool. thanks boris-42! anything else to mention here?
15:15 <boris-42> mriedem: what would you like to see there
15:15 <boris-42> mriedem: ?
15:15 <mriedem> what it is and what it's used for
15:15 <dims> boris-42 why is it called "certification" then? :)
15:15 <mriedem> note that i'm not a rally user
15:15 <mriedem> right, 'certification' makes me think defcore
15:15 <DinaBelova> :)
15:15 <dims> y
15:16 <AugieMena> would someone provide a one-liner on what the purpose of it is?
15:16 <kun_huang> I would like to say that is some kind of task template
15:16 <kun_huang> tasks template
15:16 <DinaBelova> AugieMena - a single task to check the whole OpenStack cloud. And you may fill it with all the scenarios you like
15:16 <DinaBelova> kun_huang - is that accurate?
15:16 <boris-42> AugieMena: just that it will put proper load and SLA on your cloud
15:17 <rvasilets___> I guess to run one big task against the cloud and to see measures of different resources
15:17 <boris-42> dims: nope, not scenarios
15:17 <boris-42> DinaBelova: nope, not scenarios
15:17 <kun_huang> DinaBelova: my understanding
15:17 <boris-42> dims: sorry
15:17 <DinaBelova> boris-42 :)
15:17 <AugieMena> so how will it help make it easier to gather info about what Rally scenarios various companies are using?
15:18 <boris-42> it's a single task that contains a bunch of subtasks that will test the specified services with proper load (based on size & quality of the cloud) and a proper SLA
15:18 <DinaBelova> AugieMena - boris-42 just proposed to create these lists in the form of these "certification" tasks to be able to run them
15:18 <AugieMena> OK, I see
15:19 <DinaBelova> ack!
15:19 <boris-42> AugieMena: a separate scenario doesn't mean anything
15:19 <boris-42> AugieMena: without its arguments, context, runner....
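A Rally task entry is the natural container for exactly those pieces. As a rough illustration of the shape boris-42 is describing (the scenario, flavor and image names below are generic placeholders, not values taken from this discussion):

    {
        "NovaServers.boot_and_delete_server": [
            {
                "args": {
                    "flavor": {"name": "m1.tiny"},
                    "image": {"name": "cirros"}
                },
                "runner": {"type": "constant", "times": 10, "concurrency": 2},
                "context": {"users": {"tenants": 2, "users_per_tenant": 1}},
                "sla": {"failure_rate": {"max": 0}}
            }
        ]
    }

A "certification" task, as described above, is essentially a large collection of such entries whose arguments, runner settings and SLAs are derived from a few parameters describing the size of the cloud being tested.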
15:19 <DinaBelova> boris-42 - moving forward? :)
15:19 <boris-42> DinaBelova: there is still one question
15:20 <DinaBelova> boris-42 - go ahead :)
15:20 *** claudiub has quit IRC
15:20 <boris-42> dims: so certification is picked because it's like "Rally certification of your cloud"
15:20 <kun_huang> boris-42: DinaBelova pls make a note to describe rally's certification work, blogs or slides... I will help to understand
15:20 <boris-42> dims: it certifies the scalability & performance of everything..
15:20 * DinaBelova guesses boris-42 meant Dina
15:21 <AugieMena> boris-42 - ok, understand the need to provide specifics about arguments used in the scenarios
15:21 <DinaBelova> #idea describe rally's certification work, blogs or slides - kun_huang can help with it
15:21 <dims> boris-42 : i understand, some link to the official certification activities would help evangelize this better. you will get this question asked again and again :)
15:22 *** kashyap has joined #openstack-performance
15:22 <boris-42> dims: )
15:22 <DinaBelova> dims - yep, documentation is everything here :)
15:22 <boris-42> dims: honestly we can rename this directory to anything
15:22 <boris-42> dims: but personally I don't like the word validation, because validation is what Tempest is doing
15:22 <boris-42> =)
15:23 <rvasilets___> )
15:23 <rvasilets___> or not doing)
15:23 *** arnoldje has joined #openstack-performance
15:23 <DinaBelova> rvasilets___ :)
15:23 <DinaBelova> ok, anything else here?
15:24 <DinaBelova> ok, moving forward
15:24 <DinaBelova> #topic Nova-conductor performance issues
15:24 <DinaBelova> ok, so some historical info
15:24 *** andreykurilin__ has joined #openstack-performance
15:25 <DinaBelova> during the Tokyo summit several operators, including GoDaddy (ping klindgren), mentioned issues observed around nova-conductor
15:25 <DinaBelova> #link https://etherpad.openstack.org/p/remote-conductor-performance
15:25 <DinaBelova> Rackspace mentioned it as well
15:25 * klindgren waves
15:25 <DinaBelova> so it was decided it'll be a cool idea to investigate this issue
15:26 <DinaBelova> currently all known info is collected in the etherpad ^^
15:26 <DinaBelova> SpamapS has started the investigation of the issue on a local lab
15:26 <DinaBelova> afaik he had to switch to something else yesterday, so not sure if anything new has happened
15:27 <mriedem> i'd be interested to know if moving to oslo.db >= 1.12 helps anything
15:27 <rpodolyaka1> why would it?
15:27 <dansmith> also, is everyone still using mysqldb-python in these tests?
15:27 <mriedem> dansmith: right
15:27 <mriedem> b/c oslo.db < 1.12
15:27 <dansmith> mriedem: is that a yes, or agreement with the question?
15:27 <mriedem> rpodolyaka1: oslo.db 1.12 switched to pymysql
15:27 <rpodolyaka1> oslo.db >= 1.12 does not mean they use pymysql
15:27 <mriedem> dansmith: that's agreement
15:28 <mriedem> and yes
15:28 <rpodolyaka1> it's only used in oslo.db tests
15:28 <rpodolyaka1> it's up to the operator to specify the connection string
15:28 <dansmith> right
15:28 <mriedem> ooo
15:28 <rpodolyaka1> you may use mysql-python as well
15:28 <mriedem> have we deprecated mysql-python?
15:28 <DinaBelova> and afaik Rackspace fixed this (or something looking like this) issue by moving back to MySQL-Python
15:29 <rpodolyaka1> mriedem: I think we actually run the unit tests for it in oslo.db
15:29 <dansmith> DinaBelova: I think you're conflating two things there
15:29 <mriedem> rax has an out of tree change (that's also a DNM in nova) for direct sql for some db APIs
15:29 <alaski> DinaBelova: rackspace went back to an out of tree db api
15:29 <mriedem> this is what rax has https://review.openstack.org/#/c/243822/
15:29 <DinaBelova> dansmith - probably, I just remember a conversation at the Tokyo summit about an issue like that
15:29 <alaski> essentially dropping sqlalchemy for some calls
15:29 <DinaBelova> alaski - a-ha, thank you sir
15:30 <DinaBelova> thanks dansmith, mriedem
15:30 <mriedem> it'd also be good to know what the conductor/compute ratios are
15:30 <DinaBelova> klindgren ^^
15:31 <mriedem> there is some info in the etherpad
15:31 <rpodolyaka1> mriedem: e.g. https://review.openstack.org/#/c/246198/ , there is a separate gate job for mysql-python
15:31 <mriedem> rpodolyaka1: so why isn't that deprecated? we want people to move to pymysql don't we?
15:31 <alaski> that being said, we are using mysqldb
15:31 <DinaBelova> mriedem - yeah, conductor service with 20 workers per server (2 servers, 16 cores per server), 250 HVs in the cell
15:32 <klindgren> Do you want to see if oslo.db >= 1.12 works better?  Or if pymysql works better
15:32 <mriedem> klindgren: pymysql
15:32 <klindgren> right now 20 computes * 3 servers
15:32 <rpodolyaka1> mriedem: we let them decide which one they want to use
15:32 <klindgren> 2 servers are 16 core boxes, one is an 8 core box
15:32 <mriedem> but that requires at least oslo.db >= 1.12 if i'm understanding the change history correctly
15:32 <dansmith> klindgren: so 2.5 conductor boxes for 20 computes?
15:32 <klindgren> 20 conductors*
15:32 <mriedem> rpodolyaka1: yeah but mysql-python is not python 3 compliant and has known issues with eventlet right?
15:33 <klindgren> for 250 computes
15:33 *** harlowja_at_home has joined #openstack-performance
15:33 <dansmith> klindgren: that's waaaay low
15:33 <rpodolyaka1> mriedem: right, but as rax experience shows, pymysql does not shine on busy clouds :(
15:33 <dansmith> rpodolyaka1: I don't think that's what their experience shows
15:33 <rpodolyaka1> anyway, are we sure that's a bottleneck?
15:33 <mriedem> rpodolyaka1: i think those are unrelated
15:33 <harlowja_at_home> \o
15:33 <DinaBelova> rpodolyaka1 - not yet, sir. Investigation in progress, we're just collecting ideas of where to look
15:34 <alaski> rpodolyaka1: rax hasn't tried pymysql yet.  it's on our backlog to test but we don't have any data on it
15:34 <DinaBelova> harlowja_at_home - morning sir!
15:34 <mriedem> rpodolyaka1: rax uses mysqldb b/c their direct-to-mysql change uses mysql-python
15:34 <mriedem> https://review.openstack.org/#/c/243822/
15:34 <klindgren> rpodolyaka1, I am getting "Model server went away" errors randomly from nova-computes
15:34 <harlowja_at_home> DinaBelova, hi! :)
15:35 <rpodolyaka1> alaski: mriedem: ah, I must have confused them with someone else then. I was pretty sure someone blamed pymysql for causing the load on nova-conductors, and that mysql-python was a solution
15:35 <DinaBelova> SpamapS wanted to check if switching to some other JSON lib will help, and I'm going to work on this issue as well (probably starting tomorrow)
15:35 <dansmith> rpodolyaka1: I'm pretty sure not
15:35 <klindgren> dansmith, what would you recommend as the ratio of servers dedicated to nova-conductor vs nova-compute?
15:35 <rpodolyaka1> ok
15:35 <mriedem> DinaBelova: unless you're on python 2.6, i don't know that the json change in oslo.serialization will make a difference
15:35 <alaski> rpodolyaka1: we blame sqlalchemy right now :)  but are hopeful that pymysql will be better
15:36 <rpodolyaka1> haha
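Background on the mysql-python vs PyMySQL point above: with oslo.db the driver is selected by the SQLAlchemy connection URL in each service's configuration, so switching drivers is a config change rather than a code change. A minimal sketch (host, credentials and database name are placeholders):

    [database]
    # C-based MySQL-python (MySQLdb) driver
    connection = mysql://nova:secret@db-host/nova
    # pure-Python PyMySQL driver, which oslo.db >= 1.12 uses for its own tests
    #connection = mysql+pymysql://nova:secret@db-host/nova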
15:36 <dansmith> klindgren: it all depends on your environment and your load.. but I just want to clarify.. above you seemed to confuse a few things
15:36 <dims> alaski lol
15:36 <dansmith> klindgren: 250 computes and how many physical conductor machines running how many workers?
15:36 <DinaBelova> mriedem - well, SpamapS is experimenting here, I'll probably start with some meaningful profiling
15:36 <rpodolyaka1> ++
15:36 <DinaBelova> if I will be able to reproduce it
15:36 <klindgren> 3 physical boxes; one server has 8 cores, the others have 16
15:36 <klindgren> running 20 workers each
15:36 <dansmith> klindgren: so three total boxes for 250 computes, right?
15:37 <klindgren> yep
15:37 <dansmith> klindgren: right, so that's insanely low, IMHO
15:37 <mriedem> plus, you have $workers > ncpu on those conductor boxes,
15:37 <dansmith> klindgren: and the answer is: keep increasing conductor boxes until the load is manageable :)
15:37 <klindgren> that's a pretty shit answer
15:37 <dansmith> mriedem: well, with mysqldb you have to have that
15:37 <klindgren> imho
15:37 <DinaBelova> klindgren :D
15:37 <kun_huang> hah
15:38 <dansmith> klindgren: so run some conductors on every compute if you want
15:38 <mriedem> dansmith: although local conductor is now deprecated
15:38 <dansmith> klindgren: the load is all the same, conductor just concentrates it on a much smaller number of boxes if you choose it to be small
15:38 <DinaBelova> dansmith - heh, afair conductors were created to avoid local conductoring?
15:38 <dansmith> mriedem: sure, but they can still run conductor on compute if they don't want upgrades to work
15:38 <klindgren> fyi this environment has always been remote conductor
15:38 <klindgren> and load only started being an issue
15:38 <rpodolyaka1> klindgren: can you run nova-conductor under cProfile on one of the nodes? We haven't seen anything like that on our 200-compute-node deployments
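For anyone following up on the cProfile suggestion later, one low-tech way to do it is to stop the regular service and run a single conductor worker under the profiler for a while; a sketch, assuming standard paths and workers set to 1 so the profile is not split across forked processes:

    # run one nova-conductor worker under cProfile, writing stats to a file
    python -m cProfile -o /tmp/nova-conductor.prof \
        $(which nova-conductor) --config-file /etc/nova/nova.conf

    # afterwards, print the 30 most expensive calls by cumulative time
    python -c "import pstats; pstats.Stats('/tmp/nova-conductor.prof').sort_stats('cumulative').print_stats(30)"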
15:38 <klindgren> when we went to kilo
15:38 <dansmith> klindgren: are you still on kilo?
15:39 <klindgren> "still"
15:39 <klindgren> liberty *just* came out
15:39 <dansmith> klindgren: that's an important detail, maybe you're just experiencing the load of the flavor migrations
15:39 <mriedem> hmmm, flavor migrations in kilo maybe?
15:39 <dansmith> klindgren: that's a hugely important data point :)
15:39 <klindgren> we ran all the flavor migration commands after the upgrade
15:40 <dansmith> klindgren: right, but there is still overhead
15:40 <klindgren> btw all of this is in the etherpad
15:40 <dansmith> klindgren: and it turned out to be higher than we expected even after the migrations were done
15:40 <alaski> even after migrations we still saw overhead as well
15:40 <DinaBelova> dansmith, mriedem - yep, these details are in the etherpad as well :)
15:40 <dansmith> klindgren: but it's gone in liberty because the migration is complete
15:40 <dansmith> DinaBelova: I've read the etherpad and didn't get the impression this was just a kilo thing
15:40 <DinaBelova> dansmith, ok
15:41 <mriedem> DinaBelova: klindgren: i don't see anything about flavor migrations in the etherpad
15:41 <DinaBelova> mriedem - I meant the kilo-based cloud
15:41 <dansmith> DinaBelova: I see that they say they started getting alarms after kilo, but the rest of the text makes it sound like this has always been a problem and just now tipped over the edge
15:42 <mriedem> yeah, i just added the notes on the flavor migrations
15:42 <dansmith> klindgren: so I think you should add some more capacity for conductors until you move to liberty, at which time you'll probably be able to drop it back down
15:42 <DinaBelova> mriedem thanks!
15:42 <mriedem> fyi on the flavor migrations for the kilo upgrade https://wiki.openstack.org/wiki/ReleaseNotes/Kilo#Upgrade_Notes_2
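For reference, the flavor-migration step from those release notes is the nova-manage call klindgren mentions having run; it moves per-instance flavor data out of instance system_metadata into the newer representation, which is why it touches every instance and why Kilo carries extra conductor overhead until it is finished:

    # Kilo flavor data migration (see the linked release notes for exact options)
    nova-manage db migrate_flavor_data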
15:42 <dansmith> klindgren: going forward, we have some better machinery to help us avoid the continued overhead once everything is upgraded
15:43 *** itsuugo has quit IRC
15:43 <DinaBelova> ok, so any other points for investigators to look at (except flavor migrations and JSON libs)? // not mentioning some profiling to find the real bottleneck //
15:43 <dansmith> klindgren: and also, the flavor migration was about the largest migration we could have done, so it almost can't be worse in the future
15:43 <dansmith> DinaBelova: I don't think there is a bottleneck to find, it sounds like
15:43 <mriedem> DinaBelova: i'm always curious about rogue periodic tasks in the compute nodes hitting the db too often and pulling too many instances
15:43 <dansmith> DinaBelova: I think this is likely due to flavor migrations we were doing in kilo and nothing more
15:43 <dansmith> DinaBelova: conductor-specific bottlenecks I mean
15:43 <mriedem> but rogue periodic tasks pulling too much data could also mean you need to purge your db
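On the "purge your db" point: in practice that usually means regularly archiving soft-deleted rows so queries and periodic tasks stop wading through them, roughly along these lines (flag spelling varies between releases, and the batch size is an arbitrary example):

    # move soft-deleted rows into the shadow tables, in batches
    nova-manage db archive_deleted_rows --max_rows 10000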
15:44 <alaski> dansmith: not conductor specific bottlenecks, but there are db bottlenecks which conductor amplifies
15:44 <DinaBelova> dansmith - that may be a very probable answer; I just want to reproduce the same situation klindgren is seeing, confirm that it's about flavor migrations, and check that everything is ok on liberty
15:44 <dansmith> alaski: yes, totes
15:45 <DinaBelova> that is also an answer
15:45 <DinaBelova> not to mention something interesting may be found in what alaski has mentioned
15:45 <DinaBelova> ok, cool.
15:45 <dansmith> I shouldn't have said "no bottleneck to find"; I meant that I think the kilo-centric bit that is the immediate problem is flavor migrations
15:46 <DinaBelova> dansmith, yep, gotcha
15:46 <dansmith> I'm also amazed that they _were_ fine with 2.5 conductor boxes for 250 computes
15:46 <klindgren> is it possible to turn off flavor migrations under kilo to see if things get better?
15:47 <dansmith> klindgren: not really, no
15:47 <mriedem> not configurable, it happens in the code
15:47 <DinaBelova> klindgren, suffer :)
15:47 <dansmith> klindgren: we can have a back alley chat about some hacking you can do if you want
15:47 <klindgren> dansmith, can you provide what in your mind is an acceptable conductor -> compute ratio?
15:47 <dansmith> klindgren: and if I may say, the next time you hit some spike when you roll to a release, please come to the nova channel and raise it :)
15:48 <DinaBelova> dansmith - I think if klindgren is ok with trying some code hacking, this session will be very useful
15:48 <dansmith> klindgren: as I said, there is no magic number.. 1% is much lower than I would have expected would be reasonable for anyone, but you're proving it's doable, which also points to there being no magic number :)
15:49 <DinaBelova> #idea check if the issue GoDaddy is facing is related to the flavor migrations or just to a too-low conductor/compute ratio
15:49 <DinaBelova> klindgren - are you interested in the hacking session dansmith has proposed?
15:49 <dansmith> I think it's also worth pointing out,
15:50 <dansmith> since my answer was "shit" about having enough boxes to handle the load,
15:50 <bauzas> maybe running a GMR ?
15:50 <dansmith> that conductor separate from computes is mostly an upgrade story
15:50 <harlowja_at_home> just out of curiosity since there isn't a magic number, has any bunch of companies shared their conductor ratios with the world, so we can derive a 'suggested' number from those shared values...?
15:50 <klindgren> technically 2 -> 250 was working as well.  Adding another physical box didn't actually fix anything, it just resulted in burning cpu on that server as well.
15:50 <dansmith> if you don't care about that, you can run a few conductor workers on every compute and distribute the load everywhere
15:51 <DinaBelova> dansmith thanks for the note
15:51 <klindgren> I mean if local conductor is deprecated - and remote conductor is an upgrade story - people are going to need to know a conductor to compute ratio that is "safe"
15:51 <DinaBelova> harlowja_at_home - did not hear about that :(
15:51 <dansmith> klindgren: 100% is safe
15:51 <klindgren> otherwise people are going to be blowing up their cloud
15:51 <harlowja_at_home> :-/
15:51 <DinaBelova> klindgren probably we need to write an email to the operators email list
15:52 <dansmith> klindgren: let me ask you a question.. how many api nodes should everyone run?
15:52 <DinaBelova> and try to find out what ratio other folks have
15:52 <klindgren> then un-deprecate local-conductor, because obviously remote-conductor is not well planned out*
15:52 <harlowja_at_home> DinaBelova, i'd like that
15:53 <DinaBelova> #action DinaBelova klindgren compose an email to the operators list and find out what conductors/computes ratio is used
15:53 <mriedem> can you do rolling upgrades with cells though? i thought not.
15:53 <DinaBelova> dansmith - well, I guess there is no right answer here :)
15:53 <dansmith> DinaBelova: right, that's what I'm trying to get at.. if I never create/destroy nodes, I can use one api worker for 250 computes :)
15:54 <dansmith> s/nodes/instances/
15:54 <DinaBelova> dansmith :D
15:54 <klindgren> it's almost always been possible in the past to run n-1 in cells
15:54 <manand> while we are on the subject of ratios, is this something we should look at across other components, such as network node to compute ratio etc.?
15:54 <dansmith> klindgren: just so you know, we think that's crazy :)
15:54 <alaski> klindgren: that has been by chance though.  there's no code to ensure it works
15:54 <DinaBelova> manand - yep, great note
15:54 <dansmith> whether or not it works :)
15:55 <DinaBelova> ok, folks, we've spent much time on this item
15:55 <mriedem> reminds me of the rpc compat bug in the cells code i saw last week...
15:55 <dansmith> yeah
15:55 <DinaBelova> it looks like we'll return to it after the meeting
15:55 <DinaBelova> so let's move forward, as we're running out of time
15:55 <DinaBelova> #topic OSProfiler weekly update
15:56 <DinaBelova> ok, so last time we agreed that if we want to use osprofiler for tracing/profiling needs we need to #1 fix it and #2 make it better
15:56 <DinaBelova> harlowja_at_home has created an etherpad
15:56 <DinaBelova> #link https://etherpad.openstack.org/p/perf-zoom-zoom
15:56 <harlowja_at_home> i put some code up for an idea of a different notifier that just uses files!! :-P
15:56 <harlowja_at_home> more zoom zoom
15:56 <harlowja_at_home> lol
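For readers who have not opened the etherpad: the idea of a file-based notifier is to append each trace point to a local file instead of pushing it through ceilometer, and merge the files afterwards. A minimal, generic sketch of such a callback (this is not harlowja's actual patch, and the hook osprofiler would use to register it is not shown):

    import json
    import threading

    _LOCK = threading.Lock()

    def file_notify(payload, path="/var/log/osprofiler-trace.jsonl"):
        """Append one osprofiler trace point (a dict) as a line of JSON.

        Cheap and local: no message bus or ceilometer round-trip, at the
        cost of having to collect and merge the per-host files later.
        """
        line = json.dumps(payload, default=str)
        with _LOCK:
            with open(path, "a") as f:
                f.write(line + "\n")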
15:57 <DinaBelova> harlowja_at_home - yep, saw it
15:57 <DinaBelova> and I left a comment - lemme create a change regarding https://github.com/openstack/osprofiler/blob/master/doc/specs/in-progress/multi_backend_support.rst first
15:57 <DinaBelova> not to have two drivers for backward compatibility
15:58 <DinaBelova> so in short - I was able to make osprofiler work ok with ceilometer events
15:58 <harlowja_at_home> cool
15:58 <DinaBelova> it's limited now and some ceilometer work needs to be done
15:58 <DinaBelova> one of the Ceilo devs will work on it
15:58 <DinaBelova> and I've moved to the https://github.com/openstack/osprofiler/blob/master/doc/specs/in-progress/multi_backend_support.rst task
15:58 *** atuvenie_ has quit IRC
15:58 <DinaBelova> harlowja_at_home - I'll ping you once I push the change to gerrit
15:58 <harlowja_at_home> kk
15:58 <harlowja_at_home> thx
15:58 <DinaBelova> so you'll be able to rebase your code
15:58 <DinaBelova> np
15:59 <harlowja_at_home> sounds good to me
15:59 <DinaBelova> boris-42 - did you have a chance to update the osprofiler -> oslo spec?
15:59 <DinaBelova> for mitaka?
15:59 <DinaBelova> a-ha, I see, not yet
15:59 <DinaBelova> #action boris-42 update osprofiler spec to fit Mitaka cycle
16:00 <DinaBelova> ok, so we ran out of time
16:00 <DinaBelova> any last questions to mention?
16:00 <DinaBelova> thank you guys!
16:00 <harlowja_at_home> boris-42, where are u!
16:00 <harlowja_at_home> come in boris!
16:00 <harlowja_at_home> lol
16:00 <DinaBelova> :D
16:00 <DinaBelova> #endmeeting
16:00 <harlowja_at_home> :)
16:00 <openstack> Meeting ended Tue Nov 17 16:00:40 2015 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
16:00 <openstack> Minutes:        http://eavesdrop.openstack.org/meetings/performance_team/2015/performance_team.2015-11-17-15.00.html
16:00 <openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/performance_team/2015/performance_team.2015-11-17-15.00.txt
16:00 <openstack> Log:            http://eavesdrop.openstack.org/meetings/performance_team/2015/performance_team.2015-11-17-15.00.log.html
16:02 *** rohanion has quit IRC
16:06 *** mriedem has quit IRC
16:08 *** llu-laptop has quit IRC
16:11 *** mriedem has joined #openstack-performance
16:18 *** mwagner has quit IRC
16:23 *** harlowja_at_home has quit IRC
16:23 *** ozamiatin has quit IRC
16:24 *** rpodolyaka1 has quit IRC
16:30 *** rpodolyaka1 has joined #openstack-performance
16:32 *** dansmith has left #openstack-performance
16:36 *** pkoniszewski has quit IRC
16:45 *** rmart04 has quit IRC
16:47 <DinaBelova> swann - are you around?
16:58 *** harshs has joined #openstack-performance
16:59 *** mriedem is now known as mriedem_meeting
17:03 *** mwagner has joined #openstack-performance
17:37 *** AugieMena has quit IRC
17:37 *** amaretskiy has quit IRC
17:39 *** mriedem_meeting is now known as mriedem
17:55 *** rpodolyaka1 has quit IRC
17:57 *** markvoelker_ has joined #openstack-performance
17:58 *** lpetrut1 has joined #openstack-performance
17:58 *** lpetrut has quit IRC
17:58 *** lpetrut1 is now known as lpetrut
17:59 *** rvasilets___ has quit IRC
18:00 *** markvoelker has quit IRC
18:00 *** xek has quit IRC
18:02 *** lpetrut has quit IRC
18:09 *** abalutoiu has quit IRC
18:22 *** harshs has quit IRC
18:37 *** dims has quit IRC
18:39 *** lpetrut has joined #openstack-performance
18:40 *** dims has joined #openstack-performance
18:42 *** mriedem has quit IRC
18:44 *** mriedem has joined #openstack-performance
18:46 *** andreykurilin__ has quit IRC
18:48 *** itsuugo has joined #openstack-performance
18:48 *** boris-42 has quit IRC
18:52 *** lpetrut has quit IRC
18:53 *** itsuugo has quit IRC
18:53 *** lpetrut has joined #openstack-performance
18:58 *** itsuugo has joined #openstack-performance
19:02 *** harshs has joined #openstack-performance
19:03 *** itsuugo has quit IRC
19:10 *** manand has quit IRC
19:44 *** ozamiatin has joined #openstack-performance
19:46 *** itsuugo has joined #openstack-performance
20:07 *** itsuugo has quit IRC
20:11 *** regXboi has quit IRC
20:13 *** harshs has quit IRC
20:34 *** itsuugo has joined #openstack-performance
20:46 *** ozamiatin has quit IRC
20:50 *** rmart04 has joined #openstack-performance
20:52 *** rmart04 has left #openstack-performance
21:00 *** dims_ has joined #openstack-performance
21:02 *** dims has quit IRC
21:07 *** itsuugo has quit IRC
21:12 *** lpetrut has quit IRC
21:17 *** itsuugo has joined #openstack-performance
21:21 *** itsuugo has quit IRC
21:21 *** itsuugo has joined #openstack-performance
21:23 *** rpodolyaka1 has joined #openstack-performance
21:32 *** rpodolyaka1 has quit IRC
21:44 *** rpodolyaka1 has joined #openstack-performance
21:51 *** dims_ has quit IRC
21:52 *** rpodolyaka1 has quit IRC
21:57 *** dims has joined #openstack-performance
22:06 *** rpodolyaka1 has joined #openstack-performance
22:15 *** rpodolyaka1 has quit IRC
22:16 *** harshs has joined #openstack-performance
22:53 *** mriedem has quit IRC
22:55 *** mwagner has quit IRC
22:59 *** itsuugo has quit IRC
23:00 *** dims has quit IRC
23:01 *** dims has joined #openstack-performance
23:04 *** dims_ has joined #openstack-performance
23:07 *** dims has quit IRC
23:16 *** arnoldje has quit IRC
23:31 *** mwagner has joined #openstack-performance
23:41 *** redixin has quit IRC
