Tuesday, 2017-11-14

*** nicovs_be has joined #ara00:08
*** nicovs_be has quit IRC00:12
ara-slack<pilotmattk> @dmsimard We need the DB, running Maria. Using this in a *very* large, international shop. 200,000+ endpoints :grinning:. We had to build our own ansible control/cluster to slice inventory across nodes. Currently doing some load testing before opening the gates in Q1 next year02:07
ara-slack<pilotmattk> Without ARA, I can hit ~5K nodes in 3-6 minutes. With ARA it's more like 20 mins, and I can only push about 32 commits per minute. DB is fairly close (same LAN). Currently troubleshooting both sides, the DB and the app.02:08
ara-slack<dmsimard> @pilotmattk I'm sure you could be surprised by the performance of sqlite, even at a large scale. If there's even 5ms of latency each way (10ms round trip) to a MySQL database and you're recording 20k task results (4 tasks on 5k hosts?), that's already over 3 minutes worth of latency over the course of a playbook run02:10
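The napkin math above can be written out as a small sketch. It assumes one database round trip per task result and that the writes happen one after another; the function name is made up for illustration and is not part of ARA.

```python
def added_latency_seconds(hosts, tasks_per_host, round_trip_ms):
    """Total time spent waiting on the database, assuming one
    round trip per task result and fully serialized writes."""
    results = hosts * tasks_per_host
    return results * round_trip_ms / 1000.0

# Numbers from the message: 5k hosts, 4 tasks each, 10 ms round trip.
total = added_latency_seconds(5000, 4, 10)
print(f"{total:.0f} s (~{total / 60:.1f} min)")  # 200 s (~3.3 min)
```

Any concurrency between writers would lower this figure, and the extra queries for hosts, files and plays would raise it.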
ara-slack<dmsimard> There's certainly an overhead in running a callback that records data, especially that amount of data. I'm not particularly surprised by your numbers. I'd love to improve them, though.02:11
ara-slack<dmsimard> Out of curiosity, we could try benchmarking the current state of ARA 1.0 with your setup -- I'd love to test with that use case and find improvement opportunities. Not tonight, though :)02:12
ara-slack<pilotmattk> Yea, I'm not at all surprised about callbacks adding time. 2-3X just seemed high. We're running playbooks all over the place inside docker containers (multiple clouds). Having SQLite files all over would be hard to track down. I wonder if there is a way to batch up results and send bulk commits? Maybe send things through a redis buffer as forks spawn/die. I'd be happy to give 1.0 a go, just replying as there is time :slightly_smiling_face: Will check back tomorrow.02:16
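The "batch up results and send bulk commits" idea could look roughly like the sketch below. This is not how the ARA callback works; `BatchedRecorder`, the `results` table and the batch size are all hypothetical, and an in-memory SQLite database stands in for the remote MariaDB server.

```python
import sqlite3

class BatchedRecorder:
    """Hypothetical buffer: collects rows and commits them in bulk."""

    def __init__(self, conn, batch_size=500):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []
        self.commits = 0
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results (host TEXT, task TEXT, status TEXT)"
        )

    def record(self, host, task, status):
        # Buffer the row; only touch the database once the batch is full.
        self.pending.append((host, task, status))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        with self.conn:  # one transaction (one commit) per batch
            self.conn.executemany(
                "INSERT INTO results VALUES (?, ?, ?)", self.pending
            )
        self.pending.clear()
        self.commits += 1

conn = sqlite3.connect(":memory:")
recorder = BatchedRecorder(conn, batch_size=500)
for i in range(1200):
    recorder.record(f"host{i}", "ping", "ok")
recorder.flush()  # write the final partial batch
```

With 1200 results and a batch size of 500 this issues three commits instead of 1200. The trade-off is that results still in the buffer are lost if the process dies before a flush.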
ara-slack<dmsimard> @pilotmattk Another thing is that ara 1.0 will ship the notion of input drivers. In <1.0, the callback does pure SQL queries; in >=1.0, the callback is refactored to use an API instead (either "internal (offline)" or HTTP REST). However, this callback will be folded back as a "driver". The driver implementation will make it easier to add other means of inserting data into ARA. I see you mention redis but we already have other message queues in mind like mqtt, rabbitmq, etc. This way, the data could be written to a low-latency bus and asynchronously processed to make it available in the interface.02:20
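A minimal sketch of the "write to a bus, process asynchronously" pattern, assuming nothing about the eventual ARA driver API: a standard-library `queue.Queue` stands in for the external bus (MQTT, RabbitMQ, ...) and a plain list stands in for the database.

```python
import queue
import threading

bus = queue.Queue()   # stand-in for an external message bus
stored = []           # stand-in for the database behind the interface
STOP = object()       # sentinel telling the consumer to exit

def consumer():
    """Drain the bus and persist each result, off the playbook's hot path."""
    while True:
        item = bus.get()
        if item is STOP:
            break
        stored.append(item)

worker = threading.Thread(target=consumer)
worker.start()

# The producer (the callback, in this analogy) only pays for a local enqueue.
for i in range(100):
    bus.put({"host": f"host{i}", "status": "ok"})

bus.put(STOP)
worker.join()
```

The playbook run no longer waits on database latency, at the cost of results showing up in the interface slightly later.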
ara-slack<dmsimard> Definitely happy to spend some time narrowing down how we can improve this :slightly_smiling_face:02:40
*** bcoca has quit IRC03:48
*** jparrill has joined #ara06:59
*** nicovs_be has joined #ara07:45
*** jclaret has joined #ara07:57
*** jcl has joined #ara07:57
*** twouters_ is now known as twouters08:23
*** twouters has joined #ara08:23
*** jcl has quit IRC09:59
*** jclaret has quit IRC10:00
*** jcl has joined #ara10:00
*** sshnaidm is now known as sshnaidm|afk10:06
*** jclaret has joined #ara10:07
*** sshnaidm|afk is now known as sshnaidm11:34
*** nicovs_b_ has joined #ara12:02
*** nicovs_be has quit IRC12:04
*** bcoca has joined #ara13:35
*** jclaret has quit IRC14:01
*** jcl has quit IRC14:01
*** dmsimard|off is now known as dmsimard14:05
*** jclaret has joined #ara14:11
*** jcl has joined #ara14:11
*** bcoca has quit IRC14:18
*** bcoca has joined #ara14:19
*** bcoca has quit IRC14:19
*** bcoca has joined #ara14:19
ara-slack<pilotmattk> I have a few ideas to try. Just to double-check that latency estimate vs 3 minutes: if I understand callbacks correctly, each fork runs the callback, with the parent running a final callback at the end (summary). Is this correct? There *should* be a high degree of concurrency at 500 forks (depends on db threads).14:34
ara-slack<pilotmattk> I see the 1.0 branch out on github (your repo and openstack), how safe are the callbacks? Does the data model match 0.14.5?14:36
*** tbielawa has joined #ara14:41
ara-slack<dmsimard> @pilotmattk the database model is very different and there's no upgrade path, it breaks backwards compatibility.15:04
ara-slack<dmsimard> The callback is safe (it is integration tested), I haven't yet fully tested the API with MySQL however.15:05
*** jcl has quit IRC15:06
ara-slack<dmsimard> I'm not sure about the impact of forks, I'm not familiar with the low-level implementation in Ansible15:06
*** jcl has joined #ara15:06
ara-slack<dmsimard> For yesterday's example, it was just napkin math, though. There are a few more queries involved than that: recording hosts, files, plays. The bulk of your time would be spent recording task results in your use case, however15:08
ara-slack<pilotmattk> OK, thank you. First order of business is to locate the DB in the same rack (or floor) as the worker nodes. Then thinking about ways to batch up / federate mysql (without the overhead of FederatedX). Might try writing to SQLite, then dump and load to backhaul the data. Need some sort of local write cache / tmpfs. RabbitMQ is perfect going forward, seems 50/50 which in-mem store a python project will pick15:21
ara-slack<dmsimard> @pilotmattk I don't know what your use case is, but we had scalability issues in OpenStack because we were generating static reports for every CI job. The static reports aren't large but they are made of a lot of small files. Anyway, we came up with a WSGI middleware (http://ara.readthedocs.io/en/latest/advanced.html) to load arbitrary sqlite databases, which suits the use case of "ephemeral" CI reports well15:25
ara-slack<pilotmattk> I think I remember that blog post...  something about static reports in jenkins.   To date we do not generate static reports...   We could likely store the sqlite file as a blob inside our Deployment Orchestrator db (postgres)..   That has some potential.  Call postgres HTTP api to retrieve the report.15:50
ara-slack<pilotmattk> I'd have to think about how to collate the reports... Eventually, there is a desire to have some visibility into exactly what is being deployed where and how often.15:51
*** jparrill has quit IRC15:54
*** nicovs_b_ has quit IRC16:08
ara-slack<dmsimard> @pilotmattk yeah, the feature is not so much to create static reports, but rather to dynamically load an arbitrary sqlite database16:31
ara-slack<dmsimard> So instead of generating a static report and storing the report files, you store the sqlite database(s) instead16:31
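To illustrate the idea of keeping one sqlite database per run and loading whichever one is requested, here is a rough sketch. It is not ARA's WSGI middleware: `save_run`, `load_run`, the directory layout, the file name and the `results` table are all assumptions made for the example.

```python
import os
import sqlite3
import tempfile

DB_NAME = "ansible.sqlite"  # assumed file name, one database per run

def save_run(root, run_id, rows):
    """Store a run's results in its own sqlite file under root/run_id/."""
    path = os.path.join(root, run_id, DB_NAME)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    conn = sqlite3.connect(path)
    with conn:
        conn.execute("CREATE TABLE results (host TEXT, status TEXT)")
        conn.executemany("INSERT INTO results VALUES (?, ?)", rows)
    conn.close()
    return path

def load_run(root, run_id):
    """Open the database for one run on demand and read its results."""
    conn = sqlite3.connect(os.path.join(root, run_id, DB_NAME))
    try:
        return conn.execute(
            "SELECT host, status FROM results ORDER BY host"
        ).fetchall()
    finally:
        conn.close()

root = tempfile.mkdtemp()
save_run(root, "job-1", [("web1", "ok"), ("db1", "failed")])
save_run(root, "job-2", [("web2", "ok")])
report = load_run(root, "job-1")
```

Each run stays a single file that can be archived or shipped elsewhere, and a report is only rendered when someone actually asks for that run.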
ara-slack<dmsimard> I recognize it's a niche use case though :slightly_smiling_face:16:32
*** nicovs_be has joined #ara16:52
*** nicovs_be has quit IRC16:56
*** dougbtv__ has joined #ara17:45
*** dougbtv_ has quit IRC17:48
*** cliles has joined #ara17:53
*** tbielawa has quit IRC17:56
*** dougbtv__ has quit IRC18:12
*** tbielawa has joined #ara18:33
*** dougbtv__ has joined #ara18:36
*** tbielawa is now known as tbielawa|caff19:00
*** jclaret has quit IRC19:21
*** jcl has quit IRC19:21
*** tbielawa|caff is now known as tbielawa19:28
*** resmo has joined #ara20:22
*** resmo has quit IRC20:24
*** tbielawa has quit IRC21:07
*** nicovs_be has joined #ara23:09
*** nicovs_be has quit IRC23:13
