17:08:21 #startmeeting self-healing
17:08:22 Meeting started Wed Jun 5 17:08:21 2019 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:08:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:08:25 The meeting name has been set to 'self_healing'
17:09:13 So this morning witek mentioned some ongoing discussions around billing, and the idea that instrumenting service code in order to provide metrics might work better than black-box monitoring for that
17:09:27 which ties in with https://storyboard.openstack.org/#!/story/2005632
17:09:45 #topic exporting metrics from services
17:10:03 BTW we seem to have a duplicate story in storyboard for this I think?
17:10:15 https://storyboard.openstack.org/#!/story/2005640
17:10:37 seem to remember some weirdness with StoryBoard when we were submitting stories recently
17:11:10 oh weird. yea I may have created a duplicate because of the weirdness.
17:11:22 I guess we should delete one?
17:11:33 yeah, https://storyboard.openstack.org/#!/story/2005632 has one fewer task
17:11:50 ok I’ll delete that one.
17:11:54 thanks
17:12:37 not much more to say on that right now except link to this morning's minutes
17:12:45 #link http://eavesdrop.openstack.org/meetings/self_healing/2019/self_healing.2019-06-05-09.05.html this morning's minutes
17:13:06 #topic heat + octavia + aodh
17:13:12 great. yea I read up on the morning meeting. sounds like there isn’t great support just yet, but great thing that witek is working on it.
17:13:22 so this popped up on the mailing list:
17:13:37 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006582.html demo of app auto-healing via heat+octavia+aodh
17:13:52 Didn't get a response though
17:14:22 We can either keep chasing or try to document at least a skeleton for it ourselves
17:14:25 #action aspiers to create a story for documenting that use case
17:15:57 got it.
yea first step maybe simply to link to that video in a skeletal doc. I can take a stab at that.
17:16:02 I'll finish that after the meeting
17:16:10 I mean, finish creating the story
17:16:18 That would be awesome if you could kick it off
17:16:24 yup.
17:16:26 We can totally merge a skeleton and flesh it out later
17:16:39 Main thing is promoting the discoverability / awareness
17:16:48 If people are aware and they need more details, they'll probably ask for them
17:17:06 #topic automated testing
17:17:08 sounds good
17:17:12 This old chestnut :)
17:17:34 So we *may* have an intern doing a masters thesis on this topic
17:17:44 in which case we could expect to see some progress
17:17:47 but nothing guaranteed yet
17:17:55 fingers crossed!
17:18:27 oh very nice! I also see that ricolin started some basic tempest setup.
17:18:50 Yup. IIRC it's still marked WIP so not sure if he needs any help with that
17:19:16 aspiers, ekcs, yes, it's working already but I'm more working on how to make the test scenario test more stable
17:19:26 ricolin: cool!
17:19:43 #link https://storyboard.openstack.org/#!/story/2005830 New story for documenting Heat+Octavia+Aodh
17:19:58 ricolin: Let us know if you need any help
17:20:08 awesomeness
17:20:16 I think that was all I had for now
17:20:25 #topic AOB
17:20:26 the self-healing scenario is very unstable in https://review.opendev.org/656070 try to figure out why
17:20:34 ah OK
17:20:38 anything else?
17:20:52 * aspiers takes a look at that review
17:21:45 ricolin: are these similar to tests already being run on heat repos?
17:21:45 heat_tempest_plugin.common.exceptions.TimeoutException: Request timed out
17:21:59 Details: Stack SelfHealingTest-243821469/c9e222f4-e0f0-4cbf-ba58-dea30d2d6a08 failed to reach UPDATE_COMPLETE status within the required time (1200 s).
17:22:15 #topic heat self-healing tests
17:22:34 knowing what’s new existing heat tests may help us diagnose.
17:22:41 true
17:23:24 the time out is when the healing process didn't start in any reason
17:23:41 OK
17:24:03 that's beyond my familiarity right now
17:24:31 Heat should play better role during entire process and help to make sure all component works well
17:24:49 and reduce the unstable cases
17:25:02 do you know why it didn't start?
17:25:46 I think I got some idea
17:26:04 but since next week is part of my wedding ceremony, I won't be that available before 6/15
17:26:15 Ah! No problem, enjoy! :-D
17:26:30 oh wow congrats!
17:27:02 and the rest part happen in 11/17 so it's going to be a very long years for me!lol
17:27:08 ekcs, aspiers thx!
17:27:14 haha
17:27:35 alright
17:27:44 anything else anyone want to discuss?
17:27:57 aspiers, in short, I think that test case fail because Heat didn't make sure the Mistral workflow is up and running stable before we assume next step
17:28:08 ahah, I see
17:28:24 I will look into that and hope I can bring some good knews
17:28:31 knews/news
17:28:31 perfect
17:28:44 great!
17:28:51 Once that test is stable, the rest gate job setting will be easy
17:29:10 since all required patch is already there
17:29:25 nice
17:29:46 I guess we need a short doc explaining it too
17:31:22 not a discussion topic per se, but I’ve been wavering in my personal priority between identifying and supporting new use cases vs documenting existing use cases. I think I settled on documenting existing as higher priority at this stage of the sig.
17:31:45 personally I think either is fine
17:31:55 Whatever you are more excited about ;)
17:32:06 = )
17:32:34 Any small contributions are a lot better than nothing :)
17:32:54 yup
17:33:18 We're all busy with other stuff, so IMO there's no problem at all with being selective and time-boxing SIG work
17:34:00 Alright, sounds like we're done for today?
17:34:17 yup
17:34:28 cool
17:34:34 thanks, and catch you soon!
17:35:00 yup later guys! have a great week!
17:35:28 o/
17:35:30 #endmeeting