<gchristensen>
no queued evals means there is a problem before the ofborg-evaluator worker itself
<cole-h>
Hm. Like what?
<gchristensen>
cole-h: in this case, ofborg-evaluation-filter.service has a crash
<cole-h>
Ah, I had looked at it, but since I saw "Hello, world!" and the following lines of logging, I figured it had recovered
<gchristensen>
interesting
<gchristensen>
it'd be good to monitor the size of the queue that it reads from
<cole-h>
Oh yeah, look at all those juicy logs
<cole-h>
Do you mean a dashboard in grafana, or is this something that should be done in code?
<gchristensen>
probably both needed
<gchristensen>
not sure exactly
<cole-h>
Being unfamiliar with AMQP, I wonder how one would retrieve queue size
<gchristensen>
how does the ofborg evals graph work? I'd start there :)
<cole-h>
It hooks into a mysterious metric called "ofborg_queue_evaluator_in_progress"
<cole-h>
(and _waiting)
<cole-h>
To ofborg/infra I go
<gchristensen>
indeed!
<cole-h>
Oh no it's PHP
<gchristensen>
a little bit of PHP is good for the soul
<gchristensen>
in a sort of homeopathic kind of way
<cole-h>
lol
<cole-h>
I'm thinking it would be interesting to look at `$queue['messages']`, but it also seems like it would fail in exactly the same way because that is used to determine the in_progress stuff
<gchristensen>
lol this is overly fancy php too, the worst
<cole-h>
Where is the json that $queues and $connections read generated?
<gchristensen>
I mean in a slightly poisonous way, but in such low quantity to not hurt you
<samueldr>
describing homeopathic as having anything in any context is dangerous! it's poison for your mind!
<samueldr>
if there are detectable traces of it in it, it's not homeopathy anymore
<gchristensen>
oh :)
<samueldr>
(and it happens, and it's dangerous!)
<cole-h>
`send_pend` from connections looks interesting... "Send queue size." according to rabbitmqctl(8). There's also the various stuff in `message_stats` from queues
<cole-h>
The main thing I'm interested in (for comparison's sake) is what connections.json and queues.json look like when the eval filter is down like it was before
<cole-h>
So I can determine which things are just red herrings and which are not
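
One way to approach the queue-size question above is to look for queues that have messages piling up while nothing is consuming them. The following is only a rough sketch: it assumes queues.json is shaped like the queue objects returned by RabbitMQ's management API (fields `name`, `messages_ready`, `consumers`), which the log does not confirm. The queue names and numbers in the sample are made up for illustration, and `stalled_queues` is a hypothetical helper, not something in ofborg/infra.

```python
import json

# Hypothetical sample, shaped like RabbitMQ management API queue objects.
# Queue names and counts are invented for illustration only.
SAMPLE_QUEUES_JSON = """
[
  {"name": "eval-filter-input", "messages": 42, "messages_ready": 42,
   "messages_unacknowledged": 0, "consumers": 0},
  {"name": "evaluator-jobs", "messages": 3, "messages_ready": 1,
   "messages_unacknowledged": 2, "consumers": 4}
]
"""


def stalled_queues(queues, backlog_threshold=1):
    """Return names of queues with a backlog and no consumers attached.

    A queue counts as stalled when at least `backlog_threshold` messages
    are ready for delivery and zero consumers are connected.
    """
    stalled = []
    for queue in queues:
        ready = queue.get("messages_ready", 0)
        consumers = queue.get("consumers", 0)
        if ready >= backlog_threshold and consumers == 0:
            stalled.append(queue["name"])
    return stalled


if __name__ == "__main__":
    queues = json.loads(SAMPLE_QUEUES_JSON)
    for name in stalled_queues(queues):
        print(f"queue {name} has a backlog but no consumers")
```

Checking `consumers` alongside the backlog is the point of the sketch: a crashed worker shows up as a queue with ready messages and nobody reading, even when in-progress counts look normal.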
<LnL>
I wish I could push this second patch to carnix tho, the nest doesn't seem to be very happy
<gchristensen>
:/
orivej has joined #nixos-borg
<LnL>
hmm, the build queue dropped to zero with the deploy
<gchristensen>
still weird?
<LnL>
stuff is building so probably a coincidence
<LnL>
was also darwin which nothing on that side should be consuming
<gchristensen>
yeah
<LnL>
idea: publish a master eval and hello build or something after a deploy
<LnL>
that way something kicks off immediately instead of waiting for somebody to make a pr
<gchristensen>
that sounds like a great idea :)
andi- has quit [Ping timeout: 246 seconds]
cole-h has joined #nixos-borg
<cole-h>
Anybody have thoughts on this comment? https://github.com/NixOS/nixpkgs/pull/85951#issuecomment-619358840 I can't say for sure, but I'm leaning towards "no," because wouldn't the coreutils bump have seen the same issue? Or is binutils larger than coreutils?
<LnL>
that's the build thing I linked yesterday
<cole-h>
Oh right
<LnL>
but I don't think it's "failing", but rather only builders handle timeout messages since the evaluators are not really supposed to be building stuff
<LnL>
publishing a build request for the libs would be better but that might be a bit tricky to do, not sure
<LnL>
cole-h: actually that returns jobs to be scheduled so shouldn't be all that hard
<cole-h>
GitHub are you kidding me
<cole-h>
"thread 'main' panicked at 'Failed to add labels ["ofborg-internal-error"] to issue #86014: Error(Http(Io(Os { code: 110, kind: TimedOut, message: "Connection timed out" })),"
<infinisil>
gchristensen: For https://github.com/NixOS/nixpkgs/pull/85951, I think I have a pretty good solution: I have a nixpkgs patch to allow overriding the pkgs used for the lib tests, so they can be run with `nix-build lib/tests/release.nix --arg pkgs 'import <nixpkgs> {}'`, meaning it doesn't depend on the commit's nixpkgs version anymore for stdenv and such
<infinisil>
The only requirement is that ofborg needs a nixpkgs in NIX_PATH. How does that sound?
<gchristensen>
it doesn't have one, on purpose -- to prevent nixpkgs from importing nixpkgs
<gchristensen>
that is tricky :/
<gchristensen>
probably should add the idea of "mandatory builds"
<infinisil>
Hmm
<infinisil>
I mean it actually doesn't matter whether it's <nixpkgs> or something else
<infinisil>
Could even be `--arg pkgs 'import (fetchTarball "channel:nixpkgs-unstable") {}'`
<infinisil>
Or `--arg pkgs 'import <nixpkgs-impure-borg> {}'`
<infinisil>
Or `--arg pkgs 'import /path/to/nixpkgs {}'`
<infinisil>
I think for the lib tests something like this would work great, because these don't test pkgs itself, they only need it to support the testing
<infinisil>
I'll look into supporting something like that
<cole-h>
GitHub seriously, please stop. Getting more internal errors due to timeouts... gchristensen do you know if timeout limits are configured in ofborg, or someplace else (i.e. is it something that can be dealt with)?
<tilpner>
cole-h: Can you determine what the current timeout is? (90s?)
<cole-h>
The only configurable timeouts I spy are the one for builds and the one for the rabbitmq connection, while I believe this timeout is borg <-> GitHub, not borg <-> queue(s)
<{^_^}>
[ofborg] @Infinisil opened pull request #472 → Pass build nixpkgs to mass-rebuilder binary → https://git.io/Jft8t
<infinisil>
Done the thing :) ^
<tilpner>
cole-h: No, I meant "can you calculate the timeout that's being used from the logs, or otherwise observe it?"
<tilpner>
Some libraries do allow for setting a timeout
<tilpner>
But it might as well be a server-side timeout, in which case there's nothing you can do
<tilpner>
If it's 90s, there's a chance it's client-side
<cole-h>
Some of these aren't timeouts, which leads me to believe it's actually GH again... borg merges the commit, and then 18 seconds later, it errors at "Failed to get issue: end of stream before headers finished"
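
tilpner's suggestion of observing the timeout from the logs can be sketched roughly as follows: collect how long each failed request ran before erroring, and see whether the durations cluster around one value. Tight clustering (for example around 90s) would hint at a fixed client-side timeout, while scattered durations like the 18-second failure mentioned above point elsewhere. The function name, the tolerance, and the sample durations are all assumptions for illustration; nothing here comes from ofborg itself.

```python
from statistics import pstdev


def looks_like_fixed_timeout(durations_s, tolerance_s=2.0):
    """Guess whether failures share one fixed timeout.

    Returns True when there are at least three samples and their
    population standard deviation is within `tolerance_s` seconds,
    i.e. the failures all took roughly the same time.
    """
    if len(durations_s) < 3:
        return False  # too few samples to say anything
    return pstdev(durations_s) <= tolerance_s


if __name__ == "__main__":
    # Hypothetical durations (seconds between request start and error).
    clustered = [90.1, 89.8, 90.3]
    scattered = [18.0, 90.2, 131.5]
    print(looks_like_fixed_timeout(clustered))
    print(looks_like_fixed_timeout(scattered))
```

This only separates "same duration every time" from "all over the place"; it cannot by itself tell a client-side timeout from a server-side one with the same fixed limit.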