Thursday, 2018-11-29

*** dklyle has joined #kata-dev 01:32
<kata-irc-bot> <mnaser> @salvador.fuentes http://jenkins.katacontainers.io/job/kata-containers-tests-fedora-28-master/1/console :slightly_smiling_face: 01:34
<kata-irc-bot> <mnaser> clarkb: i'm slowly rolling out this updated kernel across all of sjc1, so i think it would be a good time to pick up zuul work again 01:35
<kata-irc-bot> <mnaser> @salvador.fuentes looks like job failed but i dunno why.. but it works. 01:59
<kata-irc-bot> <salvador.fuentes> thanks @mnaser, I'll check the error 03:30
*** dklyle has quit IRC 03:58
<kata-irc-bot> <eric.ernst> @fupan @bergwolf - we have any arch docs for containerd-v2-shim yet? 04:06
<kata-irc-bot> <eric.ernst> whether in github, gdoc, or preferably .md? 04:06
<kata-irc-bot> <eric.ernst> @xu ^^ 04:07
<kata-irc-bot> <fupan> Currently we have a doc about how to deploy and run containerd-shim-kata-v2 with containerd and cri 04:17
<kata-irc-bot> <fupan> https://gist.github.com/gnawux/d06c34b845aa3350799cbeaeb3c1270e 04:17
<kata-irc-bot> <eric.ernst> thanks @fupan 04:28
<kata-irc-bot> <eric.ernst> what are gaps between what's on hyperhq and what's on kata-containers/runtime now? 04:29
<kata-irc-bot> <eric.ernst> should be at parity with the PR (finally!!!) merging? 04:30
<kata-irc-bot> <eric.ernst> I'd like to get this up tomorrow, though perhaps using runtimeClass and 1.12 04:30
<kata-irc-bot> <eric.ernst> btw, nice job on the prep for the kata, v2 shim, gvisor talk @fupan 04:31
<kata-irc-bot> <fupan> By now there is almost 04:36
<kata-irc-bot> <fupan> By now there isn’t any gap between hyperhq and kata/runtime, and what’s merging into kata/runtime is also the latest updates in hyperhq. 04:39
<kata-irc-bot> <fupan> By now there is only one PR related to shimv2 that hasn’t been merged. https://github.com/kata-containers/runtime/pull/940 04:40
*** eernst has joined #kata-dev 04:45
*** eernst has quit IRC 04:55
<kata-irc-bot> <eric.ernst> ok, thanks @fupan 05:39
*** jodh has joined #kata-dev 06:51
*** zerocoolback has joined #kata-dev 07:10
*** zerocoolback has quit IRC 07:12
*** shrasool has quit IRC 07:18
*** sameo has joined #kata-dev 08:13
*** dims has quit IRC 08:32
*** dims has joined #kata-dev 08:33
*** gwhaley has joined #kata-dev 08:59
*** davidgiluk has joined #kata-dev 09:06
<kata-irc-bot> <graham.whaley> @mnaser @salvador.fuentes back on the boot speed with atomic - your 3.7s is on the high end of boot times that we see - but then, I normally measure the time 'into the workload', as your time also has the container shutdown time (which is debatable if anybody is really interested in ;) ). @mnaser, you could run up the metrics report (https://github.com/kata-containers/tests/tree/master/metrics/report) to compare the two - if you are only interested in boot times and maybe footprint, use the '-t -d' options on the grabdata script. You end up with a PDF report showing you a comparison, and you get some more details of breakdown. 09:24
<kata-irc-bot> <graham.whaley> btw, great news on the debug and kernel update everybody - woot 09:25
<davidgiluk> graham.whaley: I guess something like container-shutdown-time probably shows up indirectly in something like a container density measurement 09:43
*** shrasool has joined #kata-dev 09:44
<gwhaley> davidgiluk: I guess if we were doing some sort of rolling dynamic density test, yes it could have an effect (if you launch faster than you die for instance). Right now we don't have a test like that :-) our density tests are static (run n containers, take measures, do math...) 09:45
<gwhaley> when most folks talk about 'boot speed' though, what they really care about is time to get to the workload. So, when somebody measures 'time docker run busybox true', they are also measuring the quit time, so they skew the numbers a little. 09:46
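
Note: the distinction gwhaley draws above, time to reach the workload versus total run time including shutdown, can be illustrated with a small sketch. The function name and the timestamps below are hypothetical and are not measurements from this log; how the three timestamps are collected (for example by having the workload print a timestamp when it starts) is left as an assumption.

```python
def split_boot_time(start, workload_start, exit_time):
    """Split one container run into time-to-workload and shutdown time.

    All arguments are timestamps in seconds on the same clock.
    """
    if not start <= workload_start <= exit_time:
        raise ValueError("expected start <= workload_start <= exit_time")
    return {
        "to_workload": workload_start - start,   # what most people mean by 'boot speed'
        "shutdown": exit_time - workload_start,  # also counted by 'time docker run ...'
        "total": exit_time - start,              # what 'time docker run ...' reports
    }


# Hypothetical timestamps, for illustration only.
breakdown = split_boot_time(start=100.0, workload_start=102.5, exit_time=103.5)
print(breakdown)
```

Comparing "to_workload" rather than "total" between two setups avoids folding the quit time into the boot figure.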
<davidgiluk> yep 09:46
<gwhaley> the report tool tries to generate a nice little graph with a breakdown for us - nicely somebody posted a snippet on a PR last night, so we have a handy example: https://github.com/kata-containers/runtime/pull/768#issuecomment-442613869 09:47
<davidgiluk> that's pretty 09:51
<davidgiluk> I wonder if it's worth running 'systemd-analyze blame' as the task to see if it says where the time is going in the systemd world 09:53
<gwhaley> I have a feeling @devimc has done that before in the past. we've also used bootchart to have a look - but, I don't think we've done that and had an optimisation cycle for a while 09:53
<gwhaley> also, for fun, you'd want to run the systemd analyse inside the VM but not inside the container (in that 'little space' that is the mini-OS sat in the VM around the container)... that is a fun space to try and debug ;-) We have a doc on the repos somewhere describing how to gain yourself a console to that space.. 09:54
<davidgiluk> yeah at one point I did have that debug shell working 10:02
<davidgiluk> but then I rebuilt the rootfs/initrd and lost the magic change 10:03
*** shrasool has quit IRC 10:24
*** lpetrut has joined #kata-dev 10:38
*** gwhaley has quit IRC 12:09
*** gwhaley has joined #kata-dev 13:12
*** shrasool has joined #kata-dev 13:34
*** shrasool has quit IRC 13:38
<kata-irc-bot> <mnaser> @salvador.fuentes i see a job that seems to have passed under fedora 28? :slightly_smiling_face: 13:38
<kata-irc-bot> <salvador.fuentes> @mnaser, yes it passed :slightly_smiling_face: :tada: 13:40
<kata-irc-bot> <salvador.fuentes> thanks 13:40
*** shrasool has joined #kata-dev 13:50
<kata-irc-bot> <mnaser> would anyone have some free cycles to throw questions at in order to try and get magnum integrated with kata? 14:27
<kata-irc-bot> <mnaser> i've made ok progress in terms of getting it installed, but some of the containers (pause) launch, but the others are not launching 14:27
<kata-irc-bot> <mnaser> maybe just even a pointer to where i can get logs to whats happening 14:28
<kata-irc-bot> <mnaser> they just sit in "Created" status in `docker ps` 14:28
*** kailun has quit IRC 14:29
<gwhaley> @mnaser: if you enable debug in the kata toml config file, then you get reams of stuff in the journal, which you can then extract and post on an Issue. 14:31
<gwhaley> https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#enable-full-debug 14:32
<kata-irc-bot> <mnaser> gwhaley: lovely, thank you 14:32
<gwhaley> btw, watch out for any journal rate limits you have - let me find the note for that as well (otherwise we lose debug ;-)) 14:33
<gwhaley> oh, it is the second block on that link above ;-) 14:33
<kata-irc-bot> <mnaser> `level=error msg="Create container failed with error: oci runtime error: Error bridging virtual endpoint"` 14:34
<kata-irc-bot> <mnaser> i probably missed that before 14:34
<kata-irc-bot> <mnaser> https://github.com/kata-containers/runtime/blob/master/virtcontainers/veth_endpoint.go#L88-L97 14:35
<kata-irc-bot> <mnaser> seems like the veth attach is somehow not happening right 14:35
<gwhaley> probably want @amshinde @archana on that - maybe open an Issue on github 14:36
<kata-irc-bot> <mnaser> https://github.com/kata-containers/runtime/blob/23e75f0f03c7357cec6c77f904e610ad37d1d179/virtcontainers/network.go#L504-L534 14:37
*** fuentess has joined #kata-dev 14:37
<kata-irc-bot> <mnaser> let me see what type of interface its trying 14:37
<kata-irc-bot> <mnaser> internetworking_model="macvtap" 14:39
<kata-irc-bot> <mnaser> i guess that's the default 14:39
<kata-irc-bot> <mnaser> ```Nov 29 14:40:04 my-cluster-pojfxyiqeafi-minion-0.vexxhost.local kata-runtime[28165]: time="2018-11-29T14:40:04.702259779Z" level=error msg="Error bridging virtual endpoint" arch=amd64 command=create container=85a88894c36f6a686088b28ccb2de815dc719c93643b6736b94b9889aabe720a error="Could not create TAP interface: LinkAdd() failed for macvtap name tap2_kata: file exists" name=kata-runtime pid=28165 source=virtcontainers subsystem=network
Nov 29 14:40:04 my-cluster-pojfxyiqeafi-minion-0.vexxhost.local kata-runtime[28165]: time="2018-11-29T14:40:04.70235696Z" level=error msg="Could not create TAP interface: LinkAdd() failed for macvtap name tap2_kata: file exists" arch=amd64 command=create container=85a88894c36f6a686088b28ccb2de815dc719c93643b6736b94b9889aabe720a name=kata-runtime pid=28165 source=runtime ``` 14:42
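
Note: the journal entries above use a key=value layout with double-quoted values, so the interesting fields can be pulled out before posting them on an Issue. The following is a minimal sketch, assuming the values only use double quotes and backslash escapes; `parse_logfmt` is a hypothetical helper, not part of Kata, and the sample string is a shortened copy of the first entry above.

```python
import shlex


def parse_logfmt(line):
    """Parse a key=value log line into a dict.

    shlex.split handles the double-quoted values; tokens without '='
    (for example a syslog prefix) are skipped.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Shortened copy of the first journal entry quoted above.
sample = (
    'time="2018-11-29T14:40:04.702259779Z" level=error '
    'msg="Error bridging virtual endpoint" arch=amd64 command=create '
    'error="Could not create TAP interface: LinkAdd() failed for '
    'macvtap name tap2_kata: file exists" '
    'name=kata-runtime pid=28165 source=virtcontainers subsystem=network'
)

fields = parse_logfmt(sample)
print(fields["level"], "|", fields["subsystem"], "|", fields["error"])
```

A line containing an unbalanced quote or a bare apostrophe would make `shlex.split` raise `ValueError`, so real journal output may need extra handling.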
<kata-irc-bot> <mnaser> i wonder if its not properly cleaning up macvtap devices 14:42
<kata-irc-bot> <mnaser> i wonder if it failed to launch at some point and didn't clean up properly 14:44
<gwhaley> @mnaser: definitely a possibility. If you open an issue, please detail what you ran - and, as nobody else probably has atomic set up, maybe some ideas about how many containers of what sort it was launching and if in parallel etc. 14:46
<kata-irc-bot> <mnaser> gwhaley: i'm just going to try and delete all CNIs and containers first, just to see if there is a root cause *before* this 14:46
<kata-irc-bot> <mnaser> aka something happened which resulted in things not cleaning up properly in the first place 14:46
<kata-irc-bot> <mnaser> ``` time="2018-11-29T14:48:23.065660185Z" level=error msg="Error bridging virtual endpoint" arch=amd64 command=create container=04701f1e4e6c8326d109cd18eb448a2eb94bdc7e246c932e415c51a720a91f4b error="Could not get veth interface: tap0_kata: Incorrect link type macvtap, expecting veth" name=kata-runtime pid=31008 source=virtcontainers subsystem=network``` 14:49
<kata-irc-bot> <mnaser> and then later 14:50
<kata-irc-bot> <mnaser> ```time="2018-11-29T14:48:40.386621979Z" level=error msg="Could not create TAP interface: LinkAdd() failed for macvtap name tap0_kata: file exists" arch=amd64 command=create container=0f1f8fd314c61d0b81be833a4114681f9eaadeb8c7151c2c4a3f05bcd1757611 name=kata-runtime pid=31838 source=runtime``` 14:50
<kata-irc-bot> <mnaser> it almost looks like its a race condition where multiple containers go up at the same time and the tap indexes get mucked up 14:51
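
Note: the kind of race mnaser suspects here can be sketched as a check-then-create problem. Everything below is a toy model under stated assumptions: `FakeHost`, its "first free index" naming scheme and `allocate` are hypothetical and are not Kata's actual allocation code; only the error text mirrors the log above.

```python
import threading


class FakeHost:
    """Stand-in for a host network namespace; not Kata or netlink code."""

    def __init__(self):
        self.links = set()

    def next_free_name(self):
        # Assumed naming scheme: first unused tap<N>_kata.
        idx = 0
        while f"tap{idx}_kata" in self.links:
            idx += 1
        return f"tap{idx}_kata"

    def link_add(self, name):
        if name in self.links:
            raise FileExistsError(
                f"LinkAdd() failed for macvtap name {name}: file exists")
        self.links.add(name)


# Racy interleaving: both creates pick a name before either adds its link.
host = FakeHost()
name_a = host.next_free_name()
name_b = host.next_free_name()
host.link_add(name_a)
try:
    host.link_add(name_b)
    collided = False
except FileExistsError as err:
    collided = True
    print(err)

# One possible fix: make "pick a name" and "create the link" a single step.
_alloc_lock = threading.Lock()


def allocate(target):
    with _alloc_lock:
        name = target.next_free_name()
        target.link_add(name)
        return name
```

With `allocate`, two creates on a fresh `FakeHost` get distinct names. Whether the real failure is this race or leftover devices from an earlier failed launch is not settled in the log.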
*** lpetrut has quit IRC 14:58
<gwhaley> mnaser - that would have been one of my guesses - hence the note about noting parallelism... ;-)  yeah, we need @amshinde on the case I think 14:58
<kata-irc-bot> <mnaser> prelim https://github.com/kata-containers/runtime/issues/953 15:00
<kata-irc-bot> <graham.whaley> @archana.m.shinde ^^ 15:02
<kata-irc-bot> <mnaser> woo 15:07
<kata-irc-bot> <mnaser> i think i got a reproducer 15:07
<kata-irc-bot> <mnaser> added a reproducer to the bug, it'd be nice if someone can confirm on their side too 15:14
<kata-irc-bot> <mnaser> not sure where to file this bug, but 1.4 doesnt seem to be in https://build.opensuse.org/project/show/home:katacontainers:release 15:18
*** sameo has quit IRC 15:30
*** shrasool has quit IRC 15:36
*** sameo has joined #kata-dev 15:50
*** dklyle has joined #kata-dev 16:01
<kata-irc-bot> <salvador.fuentes> @jose.carlos.venegas.m ^ 16:11
<kata-irc-bot> <jose.carlos.venegas.m> @mnaser: @salvador.fuentes: that is true, let me update it there 16:13
<kata-irc-bot> <mnaser> ok so i'm not seeing the netmon process go up 16:31
<kata-irc-bot> <mnaser> which can explain why things werent working 16:31
<gwhaley> mnaser: not sure if the netmon only landed in 1.4, and you are on 1.3 I think - but, others may know for sure. 16:32
<kata-irc-bot> <mnaser> netmon seems disabled by default 16:32
<kata-irc-bot> <mnaser> it is in 1.3 16:32
<kata-irc-bot> <mnaser> ```[netmon]
# If enabled, the network monitoring process gets started when the
# sandbox is created. This allows for the detection of some additional
# network being added to the existing network namespace, after the
# sandbox has been created.
# (default: disabled)
#enable_netmon = true``` 16:33
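
Note: in the snippet above `enable_netmon` is commented out, which is why netmon stays disabled. A quick way to check a configuration file for this is sketched below. `netmon_enabled` is a hypothetical helper that only does simple line scanning of the `[netmon]` section; it is not a full TOML parser and not a Kata tool.

```python
def netmon_enabled(config_text):
    """Return True only if [netmon] has an uncommented 'enable_netmon = true'."""
    in_netmon = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Track which section we are in.
            in_netmon = line == "[netmon]"
            continue
        if not in_netmon or not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "enable_netmon":
            # Drop any trailing comment before comparing.
            return value.split("#", 1)[0].strip() == "true"
    return False


shipped = """[netmon]
# (default: disabled)
#enable_netmon = true
"""
edited = shipped.replace("#enable_netmon", "enable_netmon")

print(netmon_enabled(shipped), netmon_enabled(edited))
```

Removing the leading `#` from the `enable_netmon` line is the only difference between the two sample strings.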
*** sameo has quit IRC 16:35
*** fiddletwix has joined #kata-dev 16:43
*** eernst has joined #kata-dev 17:03
*** david-lyle has joined #kata-dev 17:04
*** dklyle has quit IRC 17:05
<kata-irc-bot> <eric.ernst> Extra motivation for passing (semi)static files over VSOCK and eliminating requirement for 9p: https://nabla-containers.github.io/2018/11/28/fs/ 17:15
<davidgiluk> hm, now would we have hit the same thing in our world 17:24
*** sameo has joined #kata-dev 17:25
*** sameo has quit IRC 17:32
*** sameo has joined #kata-dev 17:40
*** jodh has quit IRC 18:01
*** david-lyle has quit IRC 18:15
*** gwhaley has quit IRC 18:18
*** eernst has quit IRC 18:40
*** eernst has joined #kata-dev 18:42
*** dklyle has joined #kata-dev 18:45
*** fiddletwix has quit IRC 18:58
*** lpetrut has joined #kata-dev 18:58
<kata-irc-bot> <mike> is there a best practice on mitigating things like this on the host? strict seccomp? or would that break kata? pretty new to using it 19:01
*** eernst has quit IRC 19:07
*** dklyle has quit IRC 19:07
*** fuentess has quit IRC 19:08
*** eernst has joined #kata-dev 19:13
*** eernst has quit IRC 19:18
*** eernst has joined #kata-dev 19:19
*** eernst has quit IRC 19:21
*** shrasool has joined #kata-dev 19:26
*** sameo has quit IRC 19:41
*** shrasool has quit IRC 19:48
*** shrasool has joined #kata-dev 19:49
*** shrasool has quit IRC 19:51
*** fiddletwix has joined #kata-dev 20:04
*** shrasool has joined #kata-dev 20:18
*** eernst has joined #kata-dev 20:19
*** shrasool has quit IRC 20:19
*** davidgiluk has quit IRC 20:20
*** eernst has quit IRC 20:21
<kata-irc-bot> <eric.ernst> I think moving to block device is one major step 20:21
<kata-irc-bot> <eric.ernst> I need to look more re: other actual mitigations 20:22
*** eernst has joined #kata-dev 20:25
*** eernst has quit IRC 20:29
*** fuentess has joined #kata-dev 20:30
*** dklyle has joined #kata-dev 21:10
*** dklyle has quit IRC 21:39
*** fuentess has quit IRC 22:12
*** dklyle has joined #kata-dev 22:19
*** dklyle has quit IRC 22:24
*** dklyle has joined #kata-dev 22:32
*** dklyle has quit IRC 22:44
*** lpetrut has quit IRC 23:07
*** eernst has joined #kata-dev 23:17
*** eernst has quit IRC 23:22
*** dhellmann_ has joined #kata-dev 23:26
*** eernst has joined #kata-dev 23:26
*** dhellmann has quit IRC 23:26
*** eernst has quit IRC 23:27
*** dhellmann_ is now known as dhellmann 23:30
