<lovesegfault>
Are these kernel builds not big-parallel or something?
<gchristensen>
big-parallel just doesn't mean as much as it should
<samueldr>
it can still be given to an underpowered or overcrowded machine, right?
<lovesegfault>
gchristensen: But how does the chromium build that takes like 7 days and a goat sacrifice not time out?
<lovesegfault>
When a mundane kernel build goes kaboom-boom
<samueldr>
luck, and it doesn't actually take 7 days
<gchristensen>
it has a custom timeout
<lovesegfault>
samueldr: I know, I'm just being hyperbolic :P
<samueldr>
making sure :)
<samueldr>
lovesegfault: your link was an aarch64 build
<lovesegfault>
Also: I thought timeout was based on no output for $time, not just "been building for $time"
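The distinction raised here matters because Nix has two separate limits: `timeout` (total wall-clock time for a build) and `max-silent-time` (time without any output). The log does not say which one the kernel build hit. The following is a minimal sketch of how the two limits differ, assuming a simplified model where a build is just a list of output timestamps; the function and parameter names are hypothetical and this is not Nix or Hydra code.

```python
from typing import List, Optional


def first_limit_hit(output_times: List[float], finished_at: float,
                    timeout: float, max_silent: float) -> Optional[str]:
    """Return which limit would kill the build first, or None if it finishes.

    output_times: seconds since build start at which the build printed output
                  (assumed to all be earlier than finished_at).
    finished_at:  when the build would finish if nothing killed it.
    timeout:      wall-clock limit for the whole build.
    max_silent:   limit on the time between two consecutive outputs.
    """
    last_output = 0.0
    for t in sorted(output_times) + [finished_at]:
        silent_deadline = last_output + max_silent
        # The build is killed if either deadline passes before the next event.
        if min(silent_deadline, timeout) < t:
            return "max-silent" if silent_deadline < timeout else "timeout"
        last_output = t
    return None
```

In this model, a build that keeps printing output can still be killed by the wall-clock limit, which is consistent with a slow but chatty kernel build timing out on a crowded machine.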
<samueldr>
imagine it's a 64 core machine, with 32 different jobs running at the same time
<samueldr>
that makes it quite crowded, and I/O may have trouble completing in a timely manner
<samueldr>
(and two cores for the build!)
<lovesegfault>
TAGS NOW
<samueldr>
meanwhile, that same 32-job machine is fine building 32 trivial things at the same time, maybe even 32 not-so-trivial ones!
<samueldr>
yeah :)
<lovesegfault>
Could we, as a stop-gap, set a custom timeout for kernel builds? They consistently time out on aarch64
<samueldr>
I think gchristensen has a better fix that's a stop-gap fix, though I'm not sure I understand/know it
<gchristensen>
there is no need for a stop-gap
<gchristensen>
the latest kernel will not always build, or be built, by hydra
<qyliss>
The other question is, should we add linuxPackages_latest to the tested set?
<gchristensen>
if it is important, we can make it mandatory
<gchristensen>
I would say no, imo
<samueldr>
I think lovesegfault's is tangentially related
<gchristensen>
but not sure :)
<samueldr>
like, aarch64's latest kernel might be in tested
<qyliss>
Three complaints (that I saw) in one day feels like a lot
<samueldr>
and I would say it is important for linuxPackages_latest
<lovesegfault>
Look, I just want green check marks oh hydra so I can go to sleep happy :P
<qyliss>
So I'm leaning towards yes
<lovesegfault>
Yesterday I went to sleep angry
<samueldr>
it's often the only way to use recent (like 6 months and newer) laptops
<qyliss>
Especially three complaints from stable users
<gchristensen>
lovesegfault: you'll have to become a bit zen about it
<lovesegfault>
gchristensen: I call my approach OCD Zen
<gchristensen>
I'm not joking, you'll burn out
* lovesegfault
pays attention
<gchristensen>
to clarify, I'm not opposed to it being mandatory for passing
<gchristensen>
samueldr: the better solution I have is to make 10% of the hydra builders have jobs=1,cores=24, and big-parallel a feature
<gchristensen>
cores=$(nproc) that is
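The proposal above can be sketched as a small scheduling model: a few builders run one job at a time with all cores and advertise `big-parallel`, and a job that requires that feature may only go to a builder advertising it. This is a hedged illustration only; the `Builder` and `Job` types, the builder names, and the core counts are hypothetical, and the model ignores load, mandatory features, and everything else a real scheduler considers.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Builder:
    name: str
    max_jobs: int                       # concurrent builds on this machine
    cores: int                          # cores handed to each build
    features: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Job:
    name: str
    required_features: FrozenSet[str] = frozenset()


def eligible(job: Job, builders: List[Builder]) -> List[str]:
    """Names of builders that advertise every feature the job requires."""
    return [b.name for b in builders if job.required_features <= b.features]


# Hypothetical fleet: one dedicated big builder, one crowded shared builder.
builders = [
    Builder("big", max_jobs=1, cores=24, features=frozenset({"big-parallel"})),
    Builder("shared", max_jobs=32, cores=2),
]
kernel = Job("linux", frozenset({"big-parallel"}))
print(eligible(kernel, builders))
```

With this split, a kernel build marked `big-parallel` can no longer land on the crowded 32-job machine, while jobs with no required features remain eligible everywhere.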
<qyliss>
It's a shame we don't have better insight into failed Hydra jobs
<gchristensen>
what kind of insight?
<qyliss>
Especially on stable, it should be news when things just start failing.
<samueldr>
gchristensen: right, that's what I thought you were thinking, but didn't know how to articulate it
<gchristensen>
aye :)
<gchristensen>
qyliss: can you give some examples of what you'd like to know?
<qyliss>
I think that, if some package that previously built stops building on stable, there should be some sort of alert about it, on IRC or something.
<gchristensen>
maybe an email to the maintainer?
<qyliss>
That would be good too, but I think a broadcast would also be nice
<qyliss>
It should be a rare enough event (I hope?), and then we could have caught this.
<gchristensen>
heh
<qyliss>
Emails to maintainers would be good in general. From what I've heard that used to happen?
<qyliss>
I guess maybe we don't want to broadcast all those failures
bhipple has joined #nixos-dev
<gchristensen>
maintainer emails were turned off after a new release jobset was created with "email maintainers" enabled and then it sent an email to a ton of people, one for each newly failing build -- which is all of them, quite a lot, on its first eval
<gchristensen>
I would like them to be turned back on
phreedom_ has quit [Remote host closed the connection]
lovesegfault has quit [Ping timeout: 260 seconds]
lovesegfault has joined #nixos-dev
kenjis1 has joined #nixos-dev
orivej has joined #nixos-dev
<das_j>
hmm, is there a way to extend the lib my configuration.nix gets passed? I can extend pkgs.lib with an overlay, but is there also a way for the lib parameter?
<{^_^}>
#77683 (by jtojnar, 1 hour ago, open): doc: Make callout marks & prompts unselectable
qyliss has quit [Quit: bye]
qyliss has joined #nixos-dev
qyliss has quit [Remote host closed the connection]
qyliss has joined #nixos-dev
kenjis1 has quit [Remote host closed the connection]
kenjis1 has joined #nixos-dev
kenjis1 has quit [Remote host closed the connection]
qyliss has quit [Quit: bye]
kenjis1 has joined #nixos-dev
qyliss has joined #nixos-dev
drakonis1 has joined #nixos-dev
<gchristensen>
jtojnar: oh dear :)
<gchristensen>
embarrassing!
janneke has joined #nixos-dev
evils has joined #nixos-dev
eyJhb has quit [Quit: Clever message]
<gchristensen>
OMG y'all, I have seen it, the BEST error message I have EVER seen from Nix.
<drakonis1>
show the goods
<gchristensen>
this error message is specific (a list is being assigned to the option config.programs.ssh.knownHosts), tells me where to go for more information, tells me where the violation is (packet-spot-buildkite-agent/network.nix), SHOWS me the code which is wrong, *AND* tells me the solution (https://gist.github.com/grahamc/a5e79e44087442b4afedbb46248bcf57). Extremely amazing, kudos to rnhmjoj++,
<{^_^}>
rnhmjoj's karma got increased to 4
<gchristensen>
infinisil++, and worldofpeace++ wow I am just completely over the moon with delight
<{^_^}>
infinisil's karma got increased to 182, worldofpeace's karma got increased to 55
<rnhmjoj>
gchristensen: thank you. it was infinisil to show me how we could implement such a detailed error. we definitely need more of these
<gchristensen>
+1 if every new error message was this good, we would be doing Great Work.
eyJhb has joined #nixos-dev
Jackneill has quit [Remote host closed the connection]
<qyliss>
Yeah those error messages are amazing.
<qyliss>
infinisil++
<{^_^}>
infinisil's karma got increased to 183
<infinisil>
I didn't do very much there, rnhmjoj++ more so :)
<{^_^}>
rnhmjoj's karma got increased to 5
<qyliss>
rnhmjoj++ :)
<{^_^}>
rnhmjoj's karma got increased to 6
<gchristensen>
infinisil: your care and attention to detail :)
<gchristensen>
your thinking it was valuable to make it happen
<infinisil>
:D
<infinisil>
I've been wondering whether the types.str error could be improved
<gchristensen>
infinisil: thank you so much
<infinisil>
It's not possible to do the same thing as for loaOf
lovesegfault has quit [Ping timeout: 248 seconds]
bgamari has joined #nixos-dev
ixxie has joined #nixos-dev
<Ox4A6F>
Nice, thanks rnhmjoj++ and infinisil++
<{^_^}>
infinisil's karma got increased to 184, rnhmjoj's karma got increased to 7
* infinisil
concluded that it's not possible to have more contextful error messages for types.string
<infinisil>
Without major changes at least
ris has joined #nixos-dev
ixxie has quit [Ping timeout: 268 seconds]
ajs124 has quit [Quit: Gateway shutdown]
lovesegfault has joined #nixos-dev
aszlig has quit [Quit: Kerneling down for reboot NOW.]
aszlig has joined #nixos-dev
_ris has joined #nixos-dev
ris has quit [Ping timeout: 258 seconds]
_ris has quit [Read error: Connection reset by peer]
_ris has joined #nixos-dev
<infinisil>
Hm, for managing secrets on a remote machine, nixops uses send-keys to send the keys when necessary
<infinisil>
But I'm thinking it might make more sense for the remote machine to have a recv-keys mechanic
<infinisil>
Which would ask the deploy host to send the keys
<infinisil>
But then this would also work with other things, like getting keys from <whatever secret store you can connect to>, not only the deploy host
<infinisil>
However, the problem is that the deploy host might not even have a publicly reachable IP from the remote host
<infinisil>
Not sure about what to do..
<gchristensen>
support both? :)
<infinisil>
gchristensen: The cool thing about recv-keys would be that when the remote machine boots up it wouldn't have to wait for the deploy host to send-keys
<infinisil>
Supporting both is kind of meh
<infinisil>
But maybe
<gchristensen>
many of the use cases of `nixops send-keys` cannot be replaced with a `recv-keys`, so I think both would need to exist
<infinisil>
What are those use-cases?
<gchristensen>
the deploy host might not even have a publicly reachable IP from the remote host
<gchristensen>
there is also the question about bootstrapping trust, which send-keys implicitly gets by having a pre-cached copy of the SSH host key
<infinisil>
Hm true
<gchristensen>
and also recv-keys adds a new step of validating and restricting which keys the remote is allowed to ask for, whereas on the sending side there is no need for such a restriction
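The extra validation step described above could look roughly like the following. This is a minimal sketch under assumed names (`keys_to_send`, the host and key names, and the allowlist layout are all hypothetical), not part of nixops: the serving side holds a per-host allowlist and refuses any request that asks for more than it grants.

```python
from typing import Dict, Set


def keys_to_send(requester: str, requested: Set[str],
                 allowed: Dict[str, Set[str]]) -> Set[str]:
    """Return the keys to send, or raise if the request exceeds the allowlist.

    allowed maps a host name to the key names that host may ask for.
    Unknown hosts are treated as being allowed nothing.
    """
    permitted = allowed.get(requester, set())
    denied = requested - permitted
    if denied:
        # Refuse the whole request rather than silently sending a subset.
        raise PermissionError(
            f"{requester} may not request: {', '.join(sorted(denied))}"
        )
    return requested


# Hypothetical allowlist kept on the side that answers recv-keys requests.
allowed = {"web1": {"tls-cert"}, "db1": {"db-password", "backup-key"}}
print(keys_to_send("db1", {"db-password"}, allowed))
```

With send-keys this check is implicit, because the deploy host decides what to push; with recv-keys it has to be written, maintained, and trusted, which is the added burden being pointed out.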
<infinisil>
I'm thinking of a recv-keys as just the remote host sending "please do a send-keys"
<gchristensen>
also it is nice to be able to send-keys to a limited subset of hosts to do a phased key rollover, validating it worked correctly and then continuing the rollout
<gchristensen>
and actually wherever receive-keys is querying, it probably shouldn't be on a publicly reachable IP at all :P
<infinisil>
Oh yeah that's a good point
<gchristensen>
I'm 100% on board with a receive-keys btw
<gchristensen>
my point is mostly that send-keys isn't going anywhere :P
<infinisil>
Yeah
<gchristensen>
(even in a deployment where you use Vault for everything to query secrets from, I would still likely use send-keys to the machines running Vault to upload a secret which Vault needed to decrypt the key store)
<infinisil>
Vault is a server that stores secrets that can be retrieved by authorized clients?
<gchristensen>
yeah
<gchristensen>
btw if you're not familiar with Vault, and you're interested in thinking about this more / working on this more, please take a look -- it is really, really good, and if anything seems weird to you -- ask because very few design choices in Vault make me go ?
<infinisil>
I'll probably do that :)
<gchristensen>
great!
kenjis1 has quit [Remote host closed the connection]
<infinisil>
gchristensen: Oh, I think recv-keys is the main thing to do, because send-keys can be implemented on top of it by just doing a port forward over ssh from the remote to the deploy machine, then calling recv-keys
<gchristensen>
yeah ... not keen on that
<infinisil>
Why not?
<gchristensen>
"the question about bootstrapping trust, which send-keys implicitly gets by having a pre-cached copy of the SSH host key" "recv-keys adds a new step of validating and restricting which keys the remote is allowed to ask for, whereas on the sending side there is no need for such a restriction" and "that is really complicated when scp works great"
<infinisil>
The ssh host key is still necessary because of the port forward
<gchristensen>
right but that is not providing authentication for the thing which will do the sending
<infinisil>
And I think restricting secrets shouldn't be a problem to implement, the sending host can still decide on that
<gchristensen>
it is just forwarding a port, and now the request to that port has to validate
<infinisil>
Hmm to secure a port such that only X can make requests to it..
<gchristensen>
right, but it is always much easier to restrict what is sent when you are doing an explicit series of sends, vs. defending against malicious requests
<gchristensen>
and then every one of these questions and statements seriously bolsters my third point
<gchristensen>
besides, I don't want my remote servers being able to SSH to my deploy host :P
<infinisil>
recv-keys just makes more sense in general imo. A host saying "Oops, I don't have secret X, let's see who's supposed to have given me that, I'll ask them to send it"
<gchristensen>
sure, recv-keys makes sense for some use cases
<gchristensen>
I'm not sure why you're so keen on recv-keys supplanting send-keys
<infinisil>
If a single abstraction works for both then that saves work :)
<gchristensen>
a single abstraction forced to work against the grain is a dangerous abstraction
<infinisil>
Yeah
<infinisil>
Will need some more thinking, maybe it doesn't work
<infinisil>
Though I guess there's not much to be gained
<infinisil>
As you said, scp works great
<infinisil>
And is simple
<gchristensen>
simple is good
<infinisil>
Though in my current code I'm first collecting all secrets into a binary stream, sending that over ssh, then decoding it back on the other end
<gchristensen>
that sounds like scp so far
<gchristensen>
sounds a bit spooky, though, iirc there are certain syscalls to mark bits of memory as containing sensitive data
<gchristensen>
maybe not, I'm not an expert at this
<infinisil>
The idea is that to deploy machine X, all you need is the system profile and that binary stream (which could be saved to a file)
<infinisil>
And I heard that nixops gets slow with many secrets because it uses scp or so
<gchristensen>
I would have to see more specifics to know more, but the ability to store and replay a stream of secrets gives me prickly skin :)
* gchristensen
should stop being a debbie downer
<infinisil>
Might change that though, but I thought it was a nice abstraction, collecting secrets on the deploy host (from different filepaths and such), sending that over, then distributing the secrets to where they're supposed to go on the target host, all encoded in the binary stream
<gchristensen>
I don't feel qualified to comment on that code
<infinisil>
I'm using a really neat trick to figure out what secrets a machine needs
<infinisil>
Assign secret files with `secrets.foo.file = ./actual/secret/path`, which then gets turned into a /nix/store path containing a symlink to `/run/keys/foo`. So if you reference `secrets.foo.file` somewhere, you're actually using the /nix/store path
<gchristensen>
nice
<infinisil>
Then I'm basically intersecting the closure of the system with all known secrets. Those are the ones the system depends on
<infinisil>
The sha512 of secrets is used to force a /nix/store path rebuild if the secret changes, which has the nice side-effect of automatically restarting services
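The two ideas just described (a per-secret marker whose name depends on the secret's sha512, and intersecting the system closure with the known markers) can be modelled in a few lines. This is a simplified sketch with hypothetical names; the marker string below is only a stand-in and is not how Nix computes real store paths, and it is not infinisil's actual code.

```python
import hashlib
from typing import Dict, Set


def secret_marker(name: str, content: bytes) -> str:
    """Stand-in for the store path that represents a secret.

    The name embeds a sha512 of the content, so a changed secret yields a
    different marker. This is NOT a real /nix/store path computation.
    """
    digest = hashlib.sha512(content).hexdigest()
    return f"{digest[:32]}-secret-{name}"


def needed_secrets(system_closure: Set[str], markers: Dict[str, str]) -> Set[str]:
    """Names of the secrets whose marker appears in the system closure."""
    return {name for name, path in markers.items() if path in system_closure}


# Hypothetical example: two known secrets, only one referenced by the system.
markers = {
    "foo": secret_marker("foo", b"hunter2"),
    "bar": secret_marker("bar", b"s3cret"),
}
closure = {"some-system-path", "some-service-unit", markers["foo"]}
print(needed_secrets(closure, markers))
```

Because the marker changes with the content, anything that references it gets a new path as well, which is the mechanism behind the automatic service restarts mentioned above.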
<gchristensen>
how does it get "turned into" a path of a symlink?
<evils>
i expect too much from that piece of magic
<evils>
so, regardless of willingness to explain it to me, is there a plurality of people that have a fairly complete overview of the documentation? (or am i just overlooking some existing documentation?)
<infinisil>
Hm well I guess there's docs for nixos options, the nixos manual which includes all nixos options in a listing, and the nixpkgs manual.
<infinisil>
And I think all of those are generated for html and manpages
<evils>
so there's a bunch of .xml files, and the modules and nix expressions with .meta.description, what i'm completely blanking on is how most of that gets into the web manuals, or configuration.nix.5
<evils>
nixpkgs/doc/Makefile seems to do some things with the .xml's and some .md's, i'm guessing that makes nixos.org/nixpkgs/manual
<qyliss>
whoa TIL configuration.nix.5
<evils>
yea, you try `man nix` and give up until someone tells you about it :P
<infinisil>
I'd say follow the nix expressions in the doc directories
<infinisil>
That's what I'd do to figure this out
<rycee>
evils: A while ago I extracted the documentation generation framework into a separate project nmd (https://gitlab.com/rycee/nmd) and also added a few minor improvements here and there. It may or may not be easier to follow than the code in nixos :-)