<LnL>
do you think it would be ok to query hydra directly for that or probably not?
<gchristensen>
I'd rather not
<gchristensen>
this would be a good use case for that database export, or for being able to subscribe to build notifications
<LnL>
yeah figured it's not a great idea, especially since it should check multiple historical builds
<LnL>
do you have an idea on how to expose / access that?
<gchristensen>
the events?
<gchristensen>
or the database dump
<LnL>
I think the database would probably be better for this
<gchristensen>
so I actually have the data already
<gchristensen>
it gets backed up to my machine every 5 minutes
<LnL>
the safest first step I'm thinking of is checking if any build in the last x time has succeeded
<LnL>
which is kind of hard to do with events
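A minimal sketch of the check LnL describes, assuming Hydra's postgres schema from memory (a builds table where finished = 1 plus buildstatus = 0 marks a successful build, and stoptime is a unix epoch); the project/jobset/job names are hypothetical and column names may differ per Hydra version:

```sh
# Has any build of this job succeeded in the last 30 days?
# Schema assumptions: builds(project, jobset, job, finished,
# buildstatus, stoptime) with buildstatus = 0 meaning success.
psql hydra -tA -c "
  SELECT count(*) > 0
  FROM builds
  WHERE project = 'nixpkgs'
    AND jobset  = 'trunk'
    AND job     = 'hello.x86_64-linux'
    AND finished = 1
    AND buildstatus = 0
    AND stoptime > extract(epoch FROM now() - interval '30 days');"
```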
<gchristensen>
yeah
<gchristensen>
but you could have a timeseries database
<LnL>
notifications of breakages are a different thing
<gchristensen>
the thing I want to do next is load the database and do a .sql dump on a daily or weekly basis. this would validate the backup was good. secondary effect is letting other people access the .sql dump for queries like this :)
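A sketch of that load-and-dump cycle, assuming the backup is a postgres data directory; if pg_dump completes end to end, the backup restored cleanly. Paths and the bucket name are hypothetical:

```sh
# Start a scratch postgres on the backed-up data directory, take a
# plain-SQL dump, stop it, and publish the dump.
pg_ctl -D /backups/hydra/pgdata -w start
pg_dump hydra > "/srv/dumps/hydra-$(date +%F).sql"
pg_ctl -D /backups/hydra/pgdata -w stop
aws s3 cp "/srv/dumps/hydra-$(date +%F).sql" s3://example-hydra-dumps/
```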
<gchristensen>
(by events I mean like a rabbitmq or 0mq or whatever mechanism of publishing parsable event data)
<LnL>
right
<gchristensen>
I suppose if I'm loading and dumping the sql, it would not be a far throw to be able to have a list of queries executed and publish those query results too
<LnL>
ah, so you're thinking more of publishing events periodically going over all packages?
<gchristensen>
sorry, 2 different ideas :)
<gchristensen>
the build event data is just purely: here is a firehose of events, do whatever you want, anybody can subscribe. good luck and godspeed.
<gchristensen>
the second idea is if we do it as a batch operation at the same time as validating the database backup
<gchristensen>
since I'm loading the database and doing a pg_dump anyway, might as well execute some list of queries at the same time and publish their results in the same place the pg_dump data goes
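One way that batch step could look, reusing the scratch database from the dump sketch above; the reports directory and bucket are hypothetical:

```sh
# Run every canned query and publish its CSV output next to the
# pg_dump output. psql --csv needs postgres 12+; on older versions
# use -A -F, instead.
for q in /etc/hydra-reports/*.sql; do
  name=$(basename "$q" .sql)
  psql hydra --csv -f "$q" > "/srv/dumps/$name.csv"
  aws s3 cp "/srv/dumps/$name.csv" "s3://example-hydra-dumps/reports/$name.csv"
done
```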
<LnL>
makes sense
<LnL>
so what you'd want is a small daemon that just runs a bunch of queries over jobs, etc. and publishes events for those?
<gchristensen>
ah, by publishing I mostly mean just, like, `aws s3 cp` to a bucket :) not really an event exactly
<LnL>
hmm, a bit confused now
<gchristensen>
sorry :/
<gchristensen>
overloaded words
<LnL>
making food first, then I'll make a diagram of what I was thinking
<gchristensen>
cool
<srk>
have you seen the fedmsg infrastructure for Fedora?
* srk
likes the idea of sql dumps, a front-facing / staging server for testing queries and so on would be even better
<gchristensen>
I've heard of fedmsg
<gchristensen>
but I don't remember how it works
<gchristensen>
we'd need to be able to send a fairly high number of events
<srk>
not sure either, it's some messaging system, maybe even AMQP
<srk>
hmm, zmq is like building blocks for queues and co, it handles a bunch of low-level stuff for you but it's not a fully-fledged message queue by itself
<gchristensen>
this is beautiful LnL
<gchristensen>
how did you make it?
<LnL>
:p
<srk>
Postgres Mirror <3
<LnL>
but does it make any sense?
<LnL>
omnigraffle
<gchristensen>
nice, I love omnigraffle. I have a macos VM just for omnigraffle
<gchristensen>
LnL: I think this makes sense, but let me suggest a few edits
<srk>
so it's AMQP now..
<gchristensen>
Hydra sends me a ZFS filesystem diff every 5min, so on my system I'd take the current state of the filesystem, start postgresql, and make a dump from that
<gchristensen>
the Selector would then operate on the same postgres server which the dump is made from
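A sketch of that receive-side flow, assuming the replicated data lands in a ZFS dataset; dataset names and mountpoints are hypothetical:

```sh
# Clone the newest received snapshot and run postgres off the clone;
# both the pg_dump and the Selector's queries hit this instance.
snap=$(zfs list -t snapshot -o name -s creation tank/hydra-db | tail -n1)
zfs clone "$snap" tank/hydra-scratch
pg_ctl -D /tank/hydra-scratch/pgdata -w start
# ... pg_dump + Selector queries run here ...
pg_ctl -D /tank/hydra-scratch/pgdata -w stop
zfs destroy tank/hydra-scratch
```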
<srk>
ZFS diff replication! mad
* srk
was wondering how you could do that every 5 min
<srk>
is there a backup hydra? :D
<gchristensen>
nah, hehe
<LnL>
right, the details of that don't really matter for the rest of the picture
<gchristensen>
but yeah it uses snapshots for backups
<gchristensen>
the arrow from Selector to build status is a set of queries, right?
<LnL>
yeah
<LnL>
for this probably first listing failed builds on trunk and then a query for each of those
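A rough shape for that two-step query, with the same schema assumptions as the earlier sketch (the jobset name is hypothetical):

```sh
# Step 1: jobs with a failed build on trunk in the last week.
# Step 2: pull each job's recent history to classify the breakage.
# Note: a real version should quote/escape $job properly.
psql hydra -tA -c "
  SELECT DISTINCT job FROM builds
  WHERE jobset = 'trunk' AND finished = 1 AND buildstatus <> 0
    AND stoptime > extract(epoch FROM now() - interval '7 days');" |
while read -r job; do
  psql hydra -tA -c "
    SELECT id, buildstatus, stoptime FROM builds
    WHERE jobset = 'trunk' AND job = '$job' AND finished = 1
    ORDER BY stoptime DESC LIMIT 10;"
done
```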
<gchristensen>
yeah
<LnL>
which either results in an event or not
<LnL>
or always send an event including the delta, whatever makes more sense
<gchristensen>
yeah, so then the output of that could be a stream of "broken-forever" or "broken-recently" messages
<gchristensen>
or a bulk blob of JSON containing that "report"
<gchristensen>
which are you thinking?
<gchristensen>
oh that is what you just said too haha
<LnL>
probably an event for each, I bet the queries could be a bit heavy
<LnL>
long term you might want to make it remember some of the stuff it did so it doesn't start with 0ad every time if it didn't complete a cycle, etc.
<srk>
btw I have a post-receive hook implemented for watching nixpkgs commits, that could be used as a source instead of the webhook. it's a standalone thing for now which passes events to a server, which sends them to clients over websocket to a web frontend
<gchristensen>
LnL: yeah, that sounds like a future thing we can deal with if we have to :P
<gchristensen>
srk: github has post-receive hooks beyond their webhooks?
<srk>
gchristensen: no, it works by checking out a mirror copy of the repo, fetching periodically, then pushing to a repo which has post-receive hooks
<srk>
cause github doesn't make it easy :)
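A minimal sketch of that mirror-and-push arrangement; URLs and paths are hypothetical:

```sh
# One-time: keep a bare mirror of nixpkgs.
git clone --mirror https://github.com/NixOS/nixpkgs.git /srv/nixpkgs.git
# Periodically (cron/timer): refresh the mirror, then push into a
# local bare repo whose post-receive hook emits the events.
git -C /srv/nixpkgs.git fetch --prune
git -C /srv/nixpkgs.git push --mirror /srv/nixpkgs-hooked.git
```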
<gchristensen>
ah
<gchristensen>
we have the webhook setup on github's end
<srk>
sure, but what if I wanted to receive a stream? :)
<gchristensen>
yeah, so the webhook goes right into rabbitmq :)
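And on the consuming side, a subscriber could be as small as this, assuming the amqp-tools CLI from rabbitmq-c; the URL, exchange, and routing key are hypothetical:

```sh
# amqp-consume binds a fresh queue to the exchange and runs the given
# command ("cat" here) once per message body.
amqp-consume --url=amqp://guest:guest@localhost \
  --exchange=github-events --routing-key='nixpkgs.push' cat
```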