Wednesday, 2017-11-08

osh-chatbot<v1k0d3n> hey man…00:00
osh-chatbot<v1k0d3n> so with osh i want to increase the pg for osd’s…i know with ceph balancing these numbers can be a bit tricky, and this is an area i’m really sort of blind with.00:00
SamYaplev1k0d3n: pgs are per pool00:01
SamYapledont worry so much about power of 2 or other nonsense at such small scale00:01
SamYapleit doesnt *really* matter til you hit like 4096 pgs in total00:01
SamYapleyour aim is for ~100pgs per osd00:01
SamYapleso if you have 10 osds, and 3 pools with three replicas then you have about 330 pgs to split between all those pools00:03
SamYapleyoull want to weight it based on data distribution00:03
SamYapleso if cinder is holding 80% of your data, youll want to give it 80% of that 330 pg number00:03
SamYapleimportant thing is to not overshoot, you can always increase, you cannot decrease00:04
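SamYaple's rule of thumb above can be sketched in a few lines of shell arithmetic (the OSD count, replica count, and 80% cinder share are the example numbers from this conversation, not recommendations):

```shell
#!/bin/sh
# ~100 PGs per OSD, divided by the replica count, gives the total PG
# budget to split across all pools; weight each pool by its data share.
OSDS=10
REPLICAS=3
TOTAL_PGS=$(( OSDS * 100 / REPLICAS ))            # 1000 PG replicas / 3 = 333 PGs
CINDER_SHARE=80                                   # cinder holds ~80% of the data
CINDER_PGS=$(( TOTAL_PGS * CINDER_SHARE / 100 ))  # cinder's slice of the budget
echo "total=${TOTAL_PGS} cinder=${CINDER_PGS}"
```

Per the advice above, round down rather than up when assigning these per pool, since `pg_num` can be increased later but never decreased.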
osh-chatbot<v1k0d3n> so i’m getting roughly 1200+ per OSD x 5.00:10
osh-chatbot<v1k0d3n> 1200pg00:11
SamYapleyou have 5 osds in total?00:15
SamYaplecan you get me the pgs counts per pool and the replicas per pool?00:15
osh-chatbot<v1k0d3n> one sec…let me get home and i can send that.00:19
osh-chatbot<v1k0d3n> are you still in SC or did you go to sydney? i can’t remember…00:19
osh-chatbot<v1k0d3n> i think you told me though…my memory is just that bad lately.00:20
SamYapleim in Cali for other work00:20
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645000:37
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645000:40
*** gouthamr has quit IRC00:48
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645001:27
*** gouthamr has joined #openstack-helm01:50
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645002:00
*** MarkBaker has joined #openstack-helm02:05
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645002:08
*** MarkBaker has quit IRC02:17
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645002:18
*** yamamoto has joined #openstack-helm02:20
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645002:24
*** gouthamr has quit IRC02:40
*** gouthamr has joined #openstack-helm02:45
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645002:49
*** ianw has joined #openstack-helm03:34
*** mihdih has joined #openstack-helm03:34
*** gouthamr has quit IRC03:34
*** jklare has joined #openstack-helm03:35
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645003:35
jklarehi03:35
jklarethanks for hosting the workshop :)03:35
openstackgerritMerged openstack/openstack-helm master: Add option to set external policy to local for openstack services  https://review.openstack.org/51717103:39
*** gouthamr has joined #openstack-helm03:39
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645003:39
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645003:51
jayahnsrwilkers: ping03:56
srwilkerso/03:57
jayahnjust wanted to ask what monitoring tool we launch at gate? https://review.openstack.org/#/c/512365/39/tools/gate/launch-osh/basic.sh03:57
jayahnwill it be prometheus? then will we give any opportunity for other tools (like collectd) to run at gate for checking?03:57
jayahnhttps://review.openstack.org/#/c/518360/3/tools/gate/launch-osh/basic.sh03:58
jayahnjust wondering while i am searching through all the PS. :)03:59
*** gouthamr has quit IRC03:59
srwilkersCollectd's being explored as a method for gathering some metrics that we don't have exporters for. It’s been tossed around as an idea for gathering some openstack metrics03:59
jayahnah. okay04:00
srwilkersBut Prometheus will definitely be the go-to04:00
jayahnI really need the wheels going again to upload potential alert list spec. :(04:00
jayahnclearing my side of queue.04:01
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645004:01
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Migrating v3 gate  https://review.openstack.org/51645004:07
*** mihdih has quit IRC04:10
*** yamamoto has quit IRC04:38
*** yamamoto has joined #openstack-helm04:41
*** coolsvap has joined #openstack-helm04:54
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Testing zuul v3 proj dependencies  https://review.openstack.org/51645005:43
openstackgerritTin Lam proposed openstack/openstack-helm master: [DNM] Testing zuul v3 proj dependencies  https://review.openstack.org/51645005:55
*** dims has quit IRC06:05
*** julim has quit IRC06:07
*** dims has joined #openstack-helm06:12
*** julim has joined #openstack-helm06:14
openstackgerritVu Cong Tuan proposed openstack/openstack-helm master: Do not use “-y” for package install  https://review.openstack.org/51847507:34
*** mihdih has joined #openstack-helm07:49
*** huxinhui_ has joined #openstack-helm07:56
*** mihdih has quit IRC09:00
*** hogepodge has quit IRC09:43
*** bryan_att has quit IRC09:43
*** zla has quit IRC09:43
*** zioproto has quit IRC09:43
*** jayahn has quit IRC09:43
*** petevg has quit IRC09:43
*** sebastian-w has quit IRC09:43
*** powerds0111 has quit IRC09:43
*** srwilkers has quit IRC09:43
*** serverascode has quit IRC09:43
openstackgerritHyunsun Moon proposed openstack/openstack-helm master: Neutron: Correct section name for linuxbridge bridge_mappings config  https://review.openstack.org/51850309:45
*** hogepodge has joined #openstack-helm09:49
*** bryan_att has joined #openstack-helm09:49
*** zla has joined #openstack-helm09:49
*** zioproto has joined #openstack-helm09:49
*** jayahn has joined #openstack-helm09:49
*** petevg has joined #openstack-helm09:49
*** sebastian-w has joined #openstack-helm09:49
*** powerds0111 has joined #openstack-helm09:49
*** srwilkers has joined #openstack-helm09:49
*** serverascode has joined #openstack-helm09:49
*** serverascode has quit IRC09:50
*** serverascode has joined #openstack-helm09:51
*** yamamoto has quit IRC11:03
openstackgerritMateusz Blaszkowski proposed openstack/openstack-helm master: ElastAlert chart  https://review.openstack.org/51662911:12
openstackgerritArtur Korzeniewski proposed openstack/openstack-helm master: RBAC authorization support  https://review.openstack.org/46463011:31
*** yamamoto has joined #openstack-helm11:37
osh-chatbot<akorzeni> Hi @portdirect, https://review.openstack.org/#/c/516688/10/helm-toolkit/templates/snippets/_kubernetes_entrypoint_rbac.tpl@30 I have spotted the issue of resources created in pre-install phase not to be deleted when chart is removed using helm11:37
osh-chatbot<akorzeni> @portdirect this implies errors when user wants to reinstall chart... the resources like cluster-role-binding already exists and the installation fails11:38
*** julim has quit IRC11:40
*** julim has joined #openstack-helm11:41
*** dayou has quit IRC13:07
*** coolsvap has quit IRC13:16
osh-chatbot<kperr> @v1k0d3n Your workaround didn't fix rbd lock issue in my case :(13:52
osh-chatbot<kperr> But from the logs it seems that the command `rbd lock list kubernetes-dynamic-pvc-99b0d666-c46e-11e7-b290-e6134dfe9e0b --pool rbd --id admin -m ceph-mon.ceph.svc.cluster.local:6789 --key=AQBn2AJaAAAAABAAXCsd9RStDl6zzjAyKeDESg==` fails with error `server name not found: ceph-mon.ceph.svc.cluster.local (Name or service not known) unable to parse addrs in 'ceph-mon.ceph.svc.cluster.local:6789' rbd: couldn't connect to the cluster!`13:52
openstackgerritArtur Korzeniewski proposed openstack/openstack-helm master: RBAC authorization support  https://review.openstack.org/46463013:52
osh-chatbot<kperr> Can you/anyone point me to the code that issues this command? I would like to replace the hostname by IP and see if that fixes the problem.13:53
*** yamamoto has quit IRC13:55
*** yamamoto has joined #openstack-helm13:56
*** yamamoto has quit IRC14:01
openstackgerritSteve Wilkerson proposed openstack/openstack-helm-infra master: WIP - Move prometheus to osh-infra  https://review.openstack.org/51764514:06
openstackgerritArtur Korzeniewski proposed openstack/openstack-helm master: Add jobs and daemonsets namespace support  https://review.openstack.org/51151514:06
*** yamamoto has joined #openstack-helm14:26
openstackgerritMateusz Blaszkowski proposed openstack/openstack-helm master: Nagios: passive check for DB error notifications coming from ElastAlert  https://review.openstack.org/51854314:29
*** yamamoto has quit IRC14:31
osh-chatbot<raymaika> @kperr can you confirm that your hosts can resolve the address `ceph-mon.ceph`? The lock issue is often just related to nodes not correctly resolving the mon addresses.14:38
osh-chatbot<kperr> @raymaika No my hosts cant resolve ceph-mon.ceph14:43
osh-chatbot<kperr> Only the pods running in my cluster can resolve it14:44
osh-chatbot<kperr> How do I fix this :(14:44
osh-chatbot<v1k0d3n> hey @kperr14:45
osh-chatbot<raymaika> you'll need to add the following to `/etc/resolv.conf` ```
nameserver 10.96.0.10
search svc.cluster.local ceph.svc.cluster.local openstack.svc.cluster.local svc.cluster.local
```14:45
osh-chatbot<v1k0d3n> do you have ^^ yup…that in your /etc/resolv.conf file.14:46
osh-chatbot<raymaika> (assuming you're using default svc subnet, 10.96.0.10 should be your DNS service IP)14:46
osh-chatbot<v1k0d3n> so basically you want to point your hosts to the dns server kube-dns (or coredns) is using.14:46
osh-chatbot<v1k0d3n> the 10.96.0.10 is if you follow the instructions on the wiki/readthedoc.io to a “t”14:47
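Putting the advice above together, the host's /etc/resolv.conf would end up looking something like this (a sketch assuming the default kubeadm service CIDR, so 10.96.0.10 is the kube-dns ClusterIP; the kube-dns nameserver entry must come before any other nameserver lines):

```
nameserver 10.96.0.10
search ceph.svc.cluster.local openstack.svc.cluster.local svc.cluster.local cluster.local
options ndots:5 timeout:1 attempts:1
```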
osh-chatbot<v1k0d3n> File uploaded https://kubernetes.slack.com/files/U0A333J23/F7X0MT99S/-.txt / https://slack-files.com/T09NY5SBT-F7X0MT99S-20a58d4d3514:51
osh-chatbot<v1k0d3n> L614:51
osh-chatbot<v1k0d3n> but i’m not exactly sure that’s the issue with locking.14:56
osh-chatbot<v1k0d3n> so again, i clear out the locking issues on hosts with the removal of `/var/lib/docker*` (which removes docker and dockershim) and `/var/lib/kubelet`. this has worked for me every single time, without fail (so far), on both VM and bare metal. you’re saying that you’ve done this, and it’s still locking @kperr?14:57
openstackgerritArtur Korzeniewski proposed openstack/openstack-helm master: RBAC authorization support  https://review.openstack.org/46463014:58
osh-chatbot<v1k0d3n> i’ve noticed that once locking occurs, sometimes a reboot or restart of the kubelet process will clear locking. the other method mentioned above is obviously really disruptive to other docker workloads you have, so i believe just performing a wipe/clean of kubelet will work.14:59
srwilkersyep.  typically disabling the kubelet, removing what i need to, then restarting works15:03
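The wipe described above could be scripted roughly like this (a destructive sketch, hence the CONFIRM guard; it assumes systemd-managed docker and kubelet services and the default state directories):

```shell
#!/bin/sh
# DESTRUCTIVE: wipes all Docker and kubelet state on this node,
# which also removes any other Docker workloads on the host.
# Gated behind CONFIRM=yes so it cannot run by accident.
if [ "${CONFIRM:-no}" = "yes" ]; then
    systemctl stop kubelet
    systemctl stop docker
    rm -rf /var/lib/docker* /var/lib/kubelet
    systemctl start docker
    systemctl start kubelet
else
    MSG="refusing to wipe; re-run with CONFIRM=yes"
    echo "$MSG"
fi
```

As noted above, try a kubelet restart or a reboot first; the full wipe is the heavy hammer.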
osh-chatbot<raymaika> I've seen the event saying an rbd lock was present as a false negative a lot of times if my hosts aren't resolving ceph endpoints, though I can't say I've run into it all that recently since I haven't tried without setting up resolv.conf lately.15:03
*** lukepatrick has joined #openstack-helm15:28
*** yamamoto has joined #openstack-helm15:54
*** yamamoto has quit IRC16:01
*** yamamoto has joined #openstack-helm16:03
*** yamamoto has quit IRC16:16
*** yamamoto has joined #openstack-helm16:26
*** yamamoto has quit IRC16:31
*** openstackgerrit has quit IRC16:48
*** lukepatrick has quit IRC16:48
*** lukepatrick has joined #openstack-helm16:55
*** yamamoto has joined #openstack-helm17:35
osh-chatbot<v1k0d3n> so question to the overall group…anyone using ceph…have you experienced a case where, after deployment of ceph+openstack-helm (so bound pvc’s), the host hangs on a typical `sudo reboot`? :slightly_smiling_face:17:42
osh-chatbot<v1k0d3n> i started digging into this a bit, and it may be an opportunity for improvement of the chart.17:43
*** jdandrea has quit IRC17:43
*** yamamoto has quit IRC17:43
osh-chatbot<v1k0d3n> but you will never hear me claim to be an expert at all on ceph.17:43
*** jdandrea has joined #openstack-helm17:43
osh-chatbot<v1k0d3n> @srwilkers have you run into this, where you have to `hard reset` the nodes in your environment? you do a lot of dev work, esp with prometheus.17:44
srwilkersyeah, from time to time ive run into issues where it's made life easier to hard reset17:48
*** openstackgerrit has joined #openstack-helm17:56
openstackgerritSteve Wilkerson proposed openstack/openstack-helm-infra master: Add entry for serviceaccount in flannel's values.yaml  https://review.openstack.org/51859417:56
openstackgerritVlad Naboichenko proposed openstack/openstack-helm-addons master: Add permissions for influxdb user  https://review.openstack.org/51859818:31
openstackgerritLuke Philips proposed openstack/openstack-helm master: WIP - Add labels to all components of apps  https://review.openstack.org/51841218:38
SamYaplev1k0d3n: did you get an answer to your ceph issues?18:45
osh-chatbot<v1k0d3n> hey man! yeah, i need to give you answers. rebuilding the cluster (the same exact way) and then i can share. totally on me for delaying.18:45
osh-chatbot<v1k0d3n> really appreciate you reaching back out though. shouldn’t be too long.18:46
*** hachi_ has joined #openstack-helm18:50
srwilkershey SamYaple o//19:07
SamYaplesrwilkers: o719:40
osh-chatbot<kperr> @v1k0d3n yes I had followed the steps you had mentioned but that didn't solve the problem in my case :(19:43
osh-chatbot<kperr> I used the bash script mentioned in https://github.com/ceph/ceph-container/issues/642 to capture the logs when the rbd command is called by the kubelet, and the logs show `rbd lock list kubernetes-dynamic-pvc-efb012a9-c4bb-11e7-9c7c-4a5a3ad11b05 --pool rbd --id admin -m ceph-mon.ceph.svc.cluster.local:6789 --key=AQBtWwNaAAAAABAAdzKd1Qrngp8wdZ2248jShA== server name not found: ceph-mon.ceph.svc.cluster.local (Name or service not known) unable to parse addrs in 'ceph-mon.ceph.svc.cluster.local:6789' rbd: couldn't connect to the cluster!`19:43
osh-chatbot<kperr> So I was wondering if there is some way to replace `ceph-mon.ceph.svc.cluster.local` by ip addresses because that at least seems to work on the command line19:44
osh-chatbot<v1k0d3n> so that typically happens when you can’t resolve from the node, inside the cluster (where the endpoints live).19:44
osh-chatbot<kperr> I have already added the mentioned hostnames in /etc/resolv.conf19:44
osh-chatbot<v1k0d3n> for your cluster…can you send me the output of `kubectl get svc -o wide -n kube-system`?19:45
osh-chatbot<kperr> ```
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE       SELECTOR
calico-etcd     ClusterIP   10.96.232.136   <none>        6666/TCP        1h        k8s-app=calico-etcd
kube-dns        ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP   1h        k8s-app=kube-dns
tiller-deploy   ClusterIP   10.96.174.171   <none>        44134/TCP       1h        app=helm,name=tiller
```19:45
osh-chatbot<kperr> resolv.conf file:19:46
osh-chatbot<kperr> ```
nameserver 10.96.0.10 # Kubernetes DNS Server
search svc.cluster.local ceph.svc.cluster.local openstack.svc.cluster.local cluster.local
options ndots:5 timeout:1 attempts:1
```19:46
SamYaplev1k0d3n: yea so the pg thing with ceph is scary and poorly explained for multiple pool bits. but the good news is at small scale it almost doesnt matter19:46
osh-chatbot<v1k0d3n> and now can you show me what’s in your ceph namespace @kperr? since you’re using slack…paste in using the `+` symbol and paste as a code snippet…just in case the formatting gets a little wonky.19:48
osh-chatbot<kperr> File uploaded https://kubernetes.slack.com/files/U1UPJN457/F7WL0MTED/-.txt / https://slack-files.com/T09NY5SBT-F7WL0MTED-bca9eb067c19:50
osh-chatbot<v1k0d3n> ok, so @kperr, now within your cluster, can you do the following?19:50
osh-chatbot<v1k0d3n> `kubectl apply -f https://gist.githubusercontent.com/v1k0d3n/3d1e7e81f976f4e432633961d9915c35/raw/3032a55e5fdfb702ef7a24400f99fad9fa56b5e3/dns-test.yaml`19:50
osh-chatbot<v1k0d3n> i’ll give a bit more in a sec…someone in my cube…19:51
osh-chatbot<kperr> sure, I appreciate you looking into this :slightly_smiling_face:19:51
osh-chatbot<v1k0d3n> of course. no problem at all.19:52
osh-chatbot<v1k0d3n> ok sorry…i created one for ceph :slightly_smiling_face:20:02
osh-chatbot<v1k0d3n> so use this20:02
osh-chatbot<v1k0d3n> `kubectl apply -f https://gist.githubusercontent.com/v1k0d3n/3d1e7e81f976f4e432633961d9915c35/raw/e090be316b656757141cbc0df9098364efc70650/dns-test.yaml`20:02
osh-chatbot<v1k0d3n> then once that is up…do…20:02
osh-chatbot<v1k0d3n> `kubectl exec busybox -n ceph -- nslookup ceph-mon`20:02
osh-chatbot<v1k0d3n> and see if you get something like this…20:02
osh-chatbot<v1k0d3n> File uploaded https://kubernetes.slack.com/files/U0A333J23/F7X3ULZB5/-.txt / https://slack-files.com/T09NY5SBT-F7X3ULZB5-8752a1b7ed20:03
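For reference, a minimal DNS-test pod along the lines of the one applied above would look like this (a generic sketch, not necessarily the exact contents of the linked gist; busybox is pinned to 1.28 because later busybox images are known to have a flaky nslookup):

```
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: ceph
spec:
  containers:
  - name: busybox
    image: busybox:1.28
    command:
    - sleep
    - "3600"
```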
osh-chatbot<kperr> File uploaded https://kubernetes.slack.com/files/U1UPJN457/F7WGA43NC/-.txt / https://slack-files.com/T09NY5SBT-F7WGA43NC-ea70bdd51b20:07
osh-chatbot<kperr> `kube-etcd-0000.kube-etcd.kube-system.svc.cluster.local` portion of your code seems missing in my case :O20:07
osh-chatbot<kperr> Is `192.168.3.21` your node ip on which ceph-mon is running?20:08
*** lukepatrick has quit IRC20:09
osh-chatbot<v1k0d3n> It is.20:09
osh-chatbot<v1k0d3n> That’s the bare metal host ip of eth020:09
osh-chatbot<kperr> I see. I am using VMs20:10
osh-chatbot<kperr> Have you ever tested this in a VM setup?20:11
osh-chatbot<kperr> I am using latest k8s version so wondering if it has something to do with it :(20:11
osh-chatbot<v1k0d3n> kube-etcd-000 is an etcd operator.20:11
osh-chatbot<kperr> I see; I don't have an etcd operator20:13
osh-chatbot<kperr> Any other hints that I could try?20:13
osh-chatbot<v1k0d3n> You’re using kubeadm.20:14
osh-chatbot<v1k0d3n> ?20:14
osh-chatbot<kperr> kubeadm version: &version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:38:10Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}20:15
*** lukepatrick has joined #openstack-helm20:16
osh-chatbot<v1k0d3n> Ok cool. Ubuntu?20:18
osh-chatbot<kperr> 16.0620:19
osh-chatbot<kperr> 16.0420:19
osh-chatbot<v1k0d3n> If so, can you show us what’s in /var/run/resolv.conf and see if there’s something in it that’s different from what’s in /etc/resolv.conf20:19
osh-chatbot<kperr> File uploaded https://kubernetes.slack.com/files/U1UPJN457/F7Y8UFEJ3/-.txt / https://slack-files.com/T09NY5SBT-F7Y8UFEJ3-17bbab783e20:21
osh-chatbot<v1k0d3n> There it is...the kube-dns ip needs to be at the top.20:23
osh-chatbot<v1k0d3n> Bingo.20:23
osh-chatbot<kperr> I see :) Thanks a lot :)20:26
osh-chatbot<v1k0d3n> If you can get that resolved, then let me know if that fixes your issue, cool? Always nice to know how we can help people over time. The more we see, the more we can help.20:27
osh-chatbot<kperr> Yes the pod is running now :)20:28
osh-chatbot<v1k0d3n> Sweet! Glad we could help man. Enjoy.20:30
*** lukepatrick has quit IRC20:31
openstackgerritSteve Wilkerson proposed openstack/openstack-helm-infra master: WIP/DNM - Explore backing nfs with volume or hostpath  https://review.openstack.org/51861620:48
*** lukepatrick has joined #openstack-helm21:00
*** hachi_ has quit IRC21:14
*** schwicht has joined #openstack-helm21:54
openstackgerritSteve Wilkerson proposed openstack/openstack-helm-infra master: WIP - Move prometheus to osh-infra  https://review.openstack.org/51764521:54
openstackgerritSteve Wilkerson proposed openstack/openstack-helm-infra master: WIP - Move prometheus to osh-infra  https://review.openstack.org/51764522:32
openstackgerritSteve Wilkerson proposed openstack/openstack-helm-infra master: WIP/DNM - Explore backing nfs with volume or hostpath  https://review.openstack.org/51861622:42
*** schwicht has quit IRC22:59
*** lukepatrick has quit IRC23:28
portdirectHey, in case anyone is interested here's the repo that we used yesterday in Sydney to deploy OSH: https://github.com/portdirect/sydney-workshop23:41
portdirectWas great to see the new gate playbooks from OSH-infra deploy 150 times flawlessly23:43
*** julim has quit IRC23:50
ianwportdirect: thanks for running it.  now i'm back in my home office, it feels weird I haven't heard anyone utter the word "kubernetes" all day :)23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!