gchristensen changed the topic of #nixos-borg to: https://www.patreon.com/ofborg https://monitoring.nix.ci/dashboard/db/ofborg?refresh=10s&orgId=1&from=now-1h&to=now "I get to skip reviewing the PHP code and just wait until it is rewritten in something sane, like POSIX shell." || https://logs.nix.samueldr.com/nixos-borg
orivej has joined #nixos-borg
orivej has quit [Ping timeout: 256 seconds]
orivej has joined #nixos-borg
<cole-h> Oof, all the evaluators and builders died?
<cole-h> If none of the packet machines are back up by 11PM PDT (~30m), I'll redeploy to see if that helps...
<cole-h> I'm impatient and anxious, so I'm moving it up 15m and doing it now.
<cole-h> ...that doesn't seem good. Multi-minute "waiting for agent" times. buildkite r u OK? (according to https://buildkitestatus.com it's fine...)
<cole-h> Something's wrong with buildkite, methinks.
<cole-h> packet-nix-builder has been waiting for an agent for almost 6 hours, packet-spot-buildkite-agent for almost 14 hours, and there's even a r13y job that has been waiting for almost 24 hours.
<cole-h> gchristensen: It's up to you, now. I don't think there's anything else I can do from here.
cole-h has quit [Quit: Goodbye]
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #nixos-borg
<gchristensen> uh oh
<LnL> it's the deploy host that's down I think
<gchristensen> I don't understand what is happening here
<gchristensen> fetching my fork of nixops-packet which has that rev as the tip
<gchristensen> I think there was a regression in nix but I hacked it.
orivej has quit [Ping timeout: 265 seconds]
cole-h has joined #nixos-borg
<cole-h> OK, what's up? This whole thing has me real confused.
<gchristensen> no clue
<cole-h> tbh I was tempted to email buildkite last night seeing if it's not just us... But the fact their status page has nothing up makes me think maybe it is.
<gchristensen> it was on my end and now it is on maybe packet's end
<cole-h> What was the problem on your end?
<gchristensen> my deploy host rebooted and needed some secrets uploaded
<cole-h> oh lol
<cole-h> Is that where all those failures on the other jobs came from, as well?
<gchristensen> ...maybe?
<gchristensen> I'm not sure; it isn't working yet
<cole-h> packet :-(
<cole-h> Could the issues be related to the new ipxe? Didn't you say you were uploading that some time ago?
<gchristensen> no
<cole-h> Harrumph. Packet claims to be fine, yet our machines still aren't up.
<cole-h> idk what to do. If I can help in any way, let me know.
orivej has joined #nixos-borg
orivej has quit [Ping timeout: 256 seconds]
orivej_ has joined #nixos-borg
<gchristensen> I'm about to have some time to look closer
<gchristensen> a bunch of things had to be done this morning
<cole-h> <3 gchristensen
<{^_^}> gchristensen's karma got increased to 328
orivej_ has quit [Ping timeout: 246 seconds]
<gchristensen> (1 "this morning" later)
<cole-h> :P
<cole-h> gchristensen: I think the packet.net link is completely broken. It times out for me with or without https, but the previous IP works fine.
<cole-h> (Tested just by `curl`ing the ipxe)
<cole-h> :O I see stuff happening!
<cole-h> :OOOO
<gchristensen> yayyy
<cole-h> What did you do, you magician?
<gchristensen> I changed https to http
<gchristensen> lol
<cole-h> But how did that work when I can't curl the http link? :o
<cole-h> "[RESOLVED] StalledEvaluator ..." 🙏🙏🙏
<cole-h> Except 1 and 3 both failed to install bootloaders
<cole-h> I was too hasty in my 🙏
<gchristensen> hmm
<gchristensen> so it didn't work .........
<cole-h> It got farther, at least...
<gchristensen> oneuhnsaoheunoatehu spot-eval-2 is causing problems
<cole-h> wat how
<cole-h> Even though it got the furthest?
<cole-h> Interesting how it's eval-2 that always goes first...
<cole-h> Or at least appears to always be going first
<gchristensen> eval-1 and eval-3 are pretty much ready to go, but nixops won't set them up until eval-2 is ready :x
<cole-h> >:(
<cole-h> eval-2 WHY
<cole-h> omg
* cole-h is cautiously optimistic
<gchristensen> yaaaaay
<cole-h> ✨ gchristensen
<{^_^}> gchristensen's karma got increased to 329
<cole-h> ✨ gchristensen
<{^_^}> gchristensen's karma got increased to 330
<cole-h> ✨ gchristensen
<{^_^}> gchristensen's karma got increased to 331
<cole-h> Now I can do the thing again...
<cole-h> "[RESOLVED] StalledEvaluator ..." 🙏🙏🙏
<gchristensen> w000t
<gchristensen> :)
<cole-h> Any indication as to what caused this? Extremely weird situation...