<andi->
That depends on how much we decrease jobs, i.e. whether the overall "core" overbooking stays the same or not
<gchristensen>
another option is setting one to big-parallel and the other to not
<gchristensen>
big-parallel running fewer jobs with higher cores
<andi->
that means just big-parallel jobs and nothing else?
<gchristensen>
we have two options there. if we set big-parallel in the "supported features" list, the machine will build big-parallel jobs and non-big-parallel jobs. if we set big-parallel in the "mandatory features" list, the machine will only build big-parallel jobs
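The difference between the two lists can be sketched with the NixOS `nix.buildMachines` option. This is only an illustration: the host names and job counts are hypothetical, and the actual Hydra builders may well be declared through a different mechanism.

```nix
{
  nix.buildMachines = [
    # "supported": accepts big-parallel jobs *and* ordinary jobs.
    {
      hostName = "builder-a.example.org"; # hypothetical host
      system = "x86_64-linux";
      maxJobs = 9; # hypothetical value
      supportedFeatures = [ "big-parallel" ];
    }
    # "mandatory": only accepts jobs that request big-parallel.
    {
      hostName = "builder-b.example.org"; # hypothetical host
      system = "x86_64-linux";
      maxJobs = 1; # hypothetical value
      mandatoryFeatures = [ "big-parallel" ];
    }
  ];
}
```

A derivation asks for such a machine by setting `requiredSystemFeatures = [ "big-parallel" ];`.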
<andi->
ok, so the mandatory case is where I fear we might "waste" idle time... But since we cannot enforce that we prioritize big-parallel vs "regular" jobs it might be worth a shot.
<srhb>
gchristensen: I forget what the verdict was on timeout increasing. There was some default being enforced on (some but not all?) builders despite setting meta.timeout and friends
<srhb>
Fixing that is the least wasteful option (slow builds are fine as long as they eventually complete...)
<srhb>
That said, we might not want to optimize for that right now given how utterly fragile our jobsets are these days...
<srhb>
In which case more cores less jobs is the safer bet
<andi->
srhb: https://github.com/NixOS/hydra/issues/591 is the issue for that. See my comment with a potential fix.. Just haven't managed to reproduce it with reasonable timeouts locally :/
<{^_^}>
hydra#591 (by cleverca22, 8 weeks ago, open): meta.timeout does not always work
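For context, `meta.timeout` and its sibling `meta.maxSilent` are set per package. A minimal sketch, using `hello` from nixpkgs purely as a stand-in and made-up values:

```nix
with import <nixpkgs> { };

hello.overrideAttrs (old: {
  meta = old.meta // {
    # Both values are in seconds and are illustrative only.
    timeout = 10 * 60 * 60; # overall build timeout (10 hours)
    maxSilent = 2 * 60 * 60; # give up after 2 hours without output
  };
})
```

Whether every builder actually honors these values is the subject of hydra#591.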
<srhb>
andi-: Right, thanks!
<srhb>
gchristensen: I don't think there's a reason to set big-parallel in mandatory AND increase cores/decrease jobs
<gchristensen>
right. so let's just start with cores/jobs
<srhb>
So right now my vote is more cores/less jobs -- at least until timeout is a safer bet and we've stabilized somewhat
<srhb>
Right :)
<gchristensen>
how does 12cores/9jobs sound?
<srhb>
Out of how many cores?
<gchristensen>
96
aminechikhaoui has quit [Ping timeout: 264 seconds]
<srhb>
So that's a target load of 1.1 per core?
<gchristensen>
yeah I guess so
<srhb>
Seems OK.
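The 1.1 figure follows from the numbers quoted above, assuming every job really uses all the cores it is offered (a worst case, not a measurement):

```nix
let
  coresPerJob = 12; # proposed build-cores
  jobs = 9; # proposed build-jobs
  totalCores = 96; # cores on the machine
in
# Multiply by 1.0 to force floating-point division.
(coresPerJob * jobs * 1.0) / totalCores
```

12 * 9 = 108 requested cores on 96 physical ones, i.e. 1.125 per core.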
<srhb>
How much memory does that thing have anyway?
<gchristensen>
enough that I've never had to check
<srhb>
OK :)
<srhb>
It does peak occasionally, according to those graphs
<srhb>
But not regularly, so I guess as long as the cores*jobs product is less than it is today, we're fine.
<sphalerite>
I think it was 128 or 256GB
<gchristensen>
125
<gchristensen>
ok, build-jobs changed from 48 to 9
<srhb>
gchristensen: Will be very interesting to see! :)
aminechikhaoui has joined #nixos-dev
<gchristensen>
indeed, we'll have to watch the queue carefully
<srhb>
Yeah.
<srhb>
gchristensen: The recent deployment marker on those graphs is this change, right?
<andi->
if you hover over the triangle you should see a message
<gchristensen>
yeah, the pink ones are the actual change. the blue one is a manually created one showing what I did.
<gchristensen>
that was me saying "is the wifi up?" and then pressing RET~. to kill the session, but having it come back at the last moment to send my message. :)
<andi->
not a fan of mosh?
<gchristensen>
I had a few frustrating moments of dealing with firewall shenanigans and mosh and then went back to ssh :)
<andi->
fair enough
<gchristensen>
and then I get to a bad wifi connection where it is high latency and wish I had fought that fight sooner.
<Synthetica>
I did `sudo hydra-create-user synthetica --full-name "Patrick Hilhorst" --email-address "patrick@hilhorst.be" --password hahayeahwouldntyoulikethat--role admin --help`, but I can't log in through the web interface, what am I doing wrong?
<andi->
you are missing a space between the pw and --role? ;)
<gchristensen>
and also probably don't want to pass --help
<Synthetica>
Oh, copy-pasted wrong command, did it without help before (and with the space, that's just for irc :P)
alp has joined #nixos-dev
<gchristensen>
globin: where do you make the directory?
<Synthetica>
Could it be a permissions thing, that I shouldn't use sudo?
<srhb>
Synthetica: sudo -u hydra
<srhb>
(Usually)
<srhb>
But if you got no error (which is puzzling) I guess it sounds right
<Synthetica>
Hmm, with sudo -u hydra it gives an error
<srhb>
You're not using the NixOS module?
<Synthetica>
Yeah, I am
<srhb>
... odd
<Synthetica>
With sudo -u hydra I get an error
<Synthetica>
Maybe because I also did hydra-init-db as root?
<srhb>
Yes, that sounds wrong as well.
<srhb>
The module should take care of that. Nuke the db and start over.
<globin>
gchristensen: think it's the runtime directory in the module or similar, will check in ~1 hour when I'm back at my laptop
<arianvp>
I want to write a regression test that involves changing the nixos config and asserting that the right systemd units are restarted
<gchristensen>
globin: please do :) let me know and I'll switch over
<arianvp>
Is there an easy way to do that with the nixos testing infra?
<arianvp>
Can you change the config of a VM inside the VM itself?
<gchristensen>
arianvp: sure, look at the installer.nix test
<andi->
you are really evolving towards a cyborg :P
<gchristensen>
I don't actually do that, but it is something I've dreamed of :P
<srhb>
Everything but chromium is green on the latest trunk-combined tested eval, too... So close.
<andi->
I would like to know if working on more granular permissions for hydra is something that we would want.. I opened that one PR that allows restarting of a single job with the "restart-jobs" role but no feedback yet.. I'd also be up to introduce more / different roles if we think that is a good thing..
<andi->
srhb: yeah, let's hope the scheduler puts it on a node that doesn't honor 10h timeouts :D
<srhb>
indeed.
<srhb>
I know it builds. I have it locally. So 70.x is not broken, just slow as molasses.
<andi->
e.g. packet-epyc-1 seems to just work fine
<srhb>
I've suggested this before, it could help a lot. :)
<srhb>
It needn't even be a big builder, just a very low-jobs one.
<srhb>
I think that was essentially the original intention behind big-parallel, but we've misused it.
<andi->
could we somehow slice a build machine into being "two" builders, just with different settings and one having more CPU preference than the other?
<andi->
(technically we can, not sure if it makes much sense)
<srhb>
andi-: vms?
<gchristensen>
I'm not sure we have? only ~5 jobs have big-parallel, srhb
<srhb>
gchristensen: It's more about which builders get it, and what maxJobs they have :)
<srhb>
The jobs don't matter much.
<andi->
srhb: without that overhead of VMs, I am thinking like a preemptive scheduling that prefers the "bountiful-cores" builder
<gchristensen>
I think the reason that has happened is due to being resource constrained and feeling not great about depriving the rest of the builds with those jobs.
<srhb>
Yep, I understand the reasoning.
<srhb>
I too like being environmentally and economically friendly. :-P
<gchristensen>
I see no reason why one machine can't be in the hydra machines file twice, once with big-parallel and one job and many cores, and once without and few cores
<srhb>
andi-: Your restart thing got merged. :o
<andi->
srhb: \o/ must check what's up with my github notifications.. I get mails but they don't show up in the UI
<aanderse>
i'm poking around hydra and trying to find out if a package is broken under master... i haven't spent much time looking at hydra, so i'm kinda lost atm (can't seem to find)
<andi->
gchristensen: things like rustc that just eat what they can are not optimal for such setups without further restricting things
<aanderse>
the package name is speed_dreams
<gchristensen>
andi-: I think nix-daemon restricts CPUs?
<gchristensen>
does it not?
<srhb>
aanderse: If it's in one of the sets that actually receives evaluations, the easiest way to go is nixpkgs->trunk->search latest eval
<andi->
gchristensen: I think not on my machines..
<andi->
might be able to..
<Synthetica>
srhb: gchristensen: Fixed it, the solution was passing -i to sudo
<andi->
aanderse: last build around 2013, was it removed?
<srhb>
No, it's still there.
<srhb>
Anyway, this should probably be in #nixos :-)
<aanderse>
andi-: ah that makes sense... the build is failing on my machine
<andi->
ahh, hydraPlatforms = [];
<andi->
it just isn't being scheduled on hydra
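For reference, this is roughly what such a declaration looks like. The sketch uses `hello` as a stand-in rather than the real speed_dreams expression, where the attribute would simply sit in the package's own `meta` block:

```nix
with import <nixpkgs> { };

hello.overrideAttrs (old: {
  meta = old.meta // {
    # An empty list means Hydra is asked to build the package on no
    # platform at all, so neither successes nor failures show up there.
    hydraPlatforms = [ ];
  };
})
```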
<aanderse>
i'm doing a version bump on a package and noticed speed_dreams depends on it, so thought it would be a good thing to test that it still builds
<srhb>
(fwiw #nixos is also about package development while this channel is more about Nix infrastructure and wide-ranging changes to nixpkgs/NixOS)
<Synthetica>
Can't we package a binary, and have a chromium-from-source for people that really want it?
<niksnut>
I guess people can use google-chrome instead?
<fpletz>
globin tried that a few weeks ago but just enabling it didn't work
<andi->
Or can we maybe build fewer bundled libraries? It seems to redistribute a lot of stuff.. I know that's not how they want it to be done but firefox also bundles a bit and we still use system libs
<gchristensen>
fpletz: with jumbo builds turned on I was able to build chromium in 20 minutes on the epyc machine
<Synthetica>
Oh damn
<samueldr>
gchristensen: did you have a number for non-jumbo?
<globin>
gchristensen: yes, but that broke linking for us.. :/
<globin>
gchristensen: did that succeed for you?
<fpletz>
gchristensen: wow, did you need to add changes to the chromium expression?
<gchristensen>
globin: the build finished at least, but I didn't try it.
<gchristensen>
it seems overall our builds are straining on overburdened build nodes.
<gchristensen>
notably, big-parallel build nodes.
<gchristensen>
builds*
<gchristensen>
niksnut: does Nix restrict the cores a build job can access based on build-cores?
<aminechikhaoui>
I think build-cores is mainly passed to make -j
<niksnut>
no
<gchristensen>
ah
<gchristensen>
maybe the thing to do is dedicate a node to big-parallel builds, set cores=0 and max-jobs=1 for a few days and see what happens. I feel we don't have enough data about how reliability and timing change based on how loaded machines are. am I wrong?
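On a NixOS builder, that experiment could look roughly like the following. It assumes the current `nix.settings` option names (older releases spelled these `nix.buildCores` and `nix.maxJobs`) and is a sketch, not the configuration of any actual Hydra node:

```nix
{
  nix.settings = {
    # 0 tells Nix to offer all available cores to each build
    # (exposed to the builder as NIX_BUILD_CORES).
    cores = 0;
    # Run only one build at a time on this machine.
    max-jobs = 1;
  };
}
```

As niksnut and aminechikhaoui point out, the cores value is only advisory: it typically ends up as `make -j` and does not stop a build from using more CPUs.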
<gchristensen>
(yes yes environment, but 18hrs of chromium maybe could be just 2hrs of chromium and be more efficient)
<Profpatsch>
niksnut: aszlig mentioned that they put in half of Chrome-OS by now.
<Profpatsch>
Less of a browser, more of an operating system …
<Profpatsch>
Though I still prefer it over FF tbh
<globin>
gchristensen: is it possible for ofborg to run tests in nixpkgs that aren't included in release.nix? that might be an option for gitlab, which probably uses too much memory for us to want it run on hydra.. trying with 4GB after OOM with 2GB
<gchristensen>
right now, no
<gchristensen>
though I wonder if the builders are capable of such large builds, since I don't manage them all I don't know
<shlevy>
I've caught the NixCon cold so obviously I'm sitting here thinking about a new configuration scheme: just like object capabilities replace imperative global namespaces and ACLs with combined imperative designation and authority, we could have "declarative capabilities" to replace the module system's declarative global namespace (and lack of ACLs) with combined declarative designation and authority
<shlevy>
A component that *creates* a capability expresses it through a function argument, whereas a component that *uses* a capability expresses it through setting attributes. These can be composed in arbitrary ways, e.g. if services A, B, and C each need a database, service A and B can be "passed" a capability to one postgres instance and C can be passed a capability to another, without having to rewrite A, B, C, or the postgres functionality
<shlevy>
And it's completely in the control of the user which components have the right to modify which aspects of the overall configuration
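The composition example (A and B sharing one postgres instance, C getting another) can be illustrated in plain Nix by handing values to functions. This is not shlevy's proposed mechanism, which is not spelled out in the discussion; every name and path below is hypothetical:

```nix
let
  # "Maker" for a database handle; holding the result is what
  # grants access to that particular instance.
  mkPostgres = name: {
    inherit name;
    socket = "/run/postgresql-${name}"; # hypothetical path
  };

  # A service only ever sees the database it was handed.
  mkService = name: db: {
    inherit name;
    databaseSocket = db.socket;
  };

  dbOne = mkPostgres "one";
  dbTwo = mkPostgres "two";
in
{
  a = mkService "a" dbOne;
  b = mkService "b" dbOne;
  c = mkService "c" dbTwo;
}
```

Neither `mkService` nor `mkPostgres` has to change when the wiring does; only the final attribute set decides who gets which instance.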
<niksnut>
this could already be accomplished using NixOS submodules btw
<simpson>
shlevy: Yessssss.
<shlevy>
simpson: That's what you get for sharing a bunch of interesting ocap links just before I have nothing to do but read :P
<simpson>
shlevy, niksnut: FWIW, in ocap languages this is a fundamentally-common pattern, to have a *maker* function which produces a parameterized object.
<simpson>
In Monte, modules have an outermost layer of exported values, which are "DeepFrozen" (transitively immutable), and those are usually makers for objects which will be live and mutable at runtime. This gives us our weird statically-linked dynamically-typed ability.
<Profpatsch>
Isn’t that just Ocaml modules?
<simpson>
Their module system is a big inspiration, yeah. Racket's too.
drakonis1 has joined #nixos-dev
Mic92 has quit [Quit: WeeChat 2.2]
Mic92 has joined #nixos-dev
__Sander__ has quit [Quit: Konversation terminated!]
Lisanna has joined #nixos-dev
phreedom has quit [Ping timeout: 256 seconds]
Taneb is now known as GHOSTLY_SPOOK
GHOSTLY_SPOOK is now known as Taneb
<shlevy>
simpson: Implementing unforgeable references on top of a language with no encapsulation is fun :D
orivej has quit [Ping timeout: 268 seconds]
<simpson>
No kidding.
drakonis1 has quit [Quit: WeeChat 2.2]
sir_guy_carleton has joined #nixos-dev
orivej has joined #nixos-dev
drakonis_ has joined #nixos-dev
drakonis has quit [Ping timeout: 268 seconds]
phreedom has joined #nixos-dev
drakonis has joined #nixos-dev
drakonis_ has quit [Ping timeout: 264 seconds]
drakonis has quit [Read error: Connection reset by peer]