<FRidh>
I wonder if there are any objections to the pace at which we iterate staging nowadays. It can at times go pretty fast; e.g. right now there are two new stdenv rebuilds in two days, which means quite a lot of bandwidth is needed when following unstable
<Taneb>
What's with the eval failure on the nixos:release-20.03 jobset?
<infinisil>
FRidh: I once brought up the idea of having multiple levels of staging, which I quite like. So one level for e.g. stdenv rebuilds, only getting a couple of commits, being merged into a second level that gets core library updates, which are more common, etc.
<infinisil>
I think that could save quite a bit of resources
<Taneb>
In the haskell-updates jobset haskellPackages.generic-deriving failed for weird reasons that don't happen to me locally, could it be restarted? It causes a lot of knock-on failures
<Taneb>
A few things seem to have been failing on machine a63b04eb for the same reason
<pie_[bnc]>
infinisil: something I shoot myself in the foot with a lot is modules that don't complain if you pass an invalid path
<infinisil>
pie_[bnc]: Elaborate?
<pie_[bnc]>
just did it with services.tomcat.webapps for example
<pie_[bnc]>
I did services.tomcat.webapps = [ ./share/whatever ] instead of ./result/share/whatever
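The mix-up described above can be sketched as a NixOS config fragment (paths are the hypothetical ones from the message; ./result is the symlink a previous nix-build would leave behind):

```nix
{ config, pkgs, ... }:
{
  services.tomcat.enable = true;

  # Intended: the webapp taken from a previously built ./result symlink.
  services.tomcat.webapps = [ ./result/share/whatever ];

  # The typo'd variant below is accepted just as silently; per the
  # complaint above, nothing validates the path up front:
  # services.tomcat.webapps = [ ./share/whatever ];
}
```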
<pie_[bnc]>
well, I'm still rebuilding, so hopefully it will work this time and isn't some other user error
<infinisil>
Hm I see, this might be fixable
<pie_[bnc]>
proceeded to shoot myself in the foot a second time by typoing it
<pie_[bnc]>
infinisil: maybe sometimes you want to be able to pass paths that seem invalid at the time (this seems to be generating a shell script)... but most of the time, probably not
<pie_[bnc]>
I didn't think about this very hard
<pie_[bnc]>
OK, maybe I'm just doing this wrong; taking it to #nixos
<Profpatsch>
FRidh: Huh, are we merging staging that often?
<Profpatsch>
I thought the time between mass rebuilds has gone up since we introduced staging?
<Profpatsch>
At least I’m switching to current master a lot and I never have mass rebuilds anymore.
<Profpatsch>
Usually I don’t have to build anything.
<Profpatsch>
(I really don’t care about re-fetching binaries when following master tbh)
<Profpatsch>
FRidh: If you really want to contribute to the solution, help get the CAS into nix :)
<Profpatsch>
Fast iteration is *good*, because it means people aren’t afraid to touch low-level parts of the system, which means we won’t accrue as much technical debt.
<Profpatsch>
Slowing that down because some people have to fetch more (prebuilt!) binaries because they want to follow master/unstable would be extremely unhealthy.
<worldofpeace>
Taneb: I asked gchristensen about that
<FRidh>
Profpatsch: typically it's once a week or so that staging-next is merged
<FRidh>
those killed jobs are indeed annoying
<gchristensen>
very :(
<gchristensen>
I'm a bit out of my depth on that problem. I'll have to ping eelco
<FRidh>
have there been any changes to hydra in the last week or so?
<FRidh>
aside from the database change
<gchristensen>
not sure
<gchristensen>
is master failing to eval too?
<FRidh>
no, failing eval is only with release-20.03
<gchristensen>
INTERESTING.
<FRidh>
but jobs are dying with killed 9 everywhere
<FRidh>
new index for 20.03?
<Taneb>
It's weird that 20.03 isn't working but 20.03-small, for example, is
<gchristensen>
it is surprising to me that the tested job evaluates for release-combined but not release-small
<gchristensen>
that is the first place to look: bisect between the last known good and the currently broken, and find where the evaluation started failing
<gchristensen>
yorick: where did you see that text, btw?
<clever>
gchristensen: likely the build overview page
<tokudan[m]>
so... 19.09-small seems to be stuck on the error: value is a string while an integer was expected, at /tmp/build-112561727/nixpkgs/source/nixos/release.nix:15:59. which is the nixpkgs.revCount part of this line: versionSuffix = (if stableBranch then "." else "beta") + "${toString (nixpkgs.revCount - 192668)}.${nixpkgs.shortRev}";
<tokudan[m]>
which is a negative number
<gchristensen>
yikes, did those values change recently?
<gchristensen>
did something get backported which shouldn't have?
<tokudan[m]>
the last time the revCount changed was ~5 months ago, according to the blame game
<tokudan[m]>
which should probably be ok
<gchristensen>
does it evaluate locally?
<tokudan[m]>
that was my local reproduction of the error
<tokudan>
so... I'm able to build the nixos-19.09-small channel with a minor change in the build script: instead of revCount = \"$revCount\"; I used revCount = $revCount;
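The quoting fix above amounts to the difference between handing Nix a string and an integer. A minimal sketch with made-up numbers (not the real revCount):

```nix
# Illustrative values only; the real revCount comes from the build script.
let
  fixed  = { revCount = 192670; };    # unquoted in the script: an integer
  broken = { revCount = "192670"; };  # quoted in the script: a string
in
  # Works: integer subtraction, then toString.
  "${toString (fixed.revCount - 192668)}"
  # Substituting broken.revCount instead would fail with:
  #   error: value is a string while an integer was expected
```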
<tokudan>
another strange error I noticed though is machine# [ 137.275557] systemd[1]: container@foo.service: start operation timed out. Terminating.
<tokudan>
got no idea what the actual problem is
<yorick>
why would revcount be lower than 192668?
<gchristensen>
maybe there is a bug in hydra?
<niksnut>
hm, there have been some changes to fetchGit etc., but I don't think that's used here
<gchristensen>
yeah, this is hydra's fetch git impl
<niksnut>
btw, it's astonishing how much slower nixos eval has gotten, 'nix-instantiate nixos/release-combined.nix -A nixos.tests.misc.x86_64-linux --dry-run' took 1.7s in 18.09, but 5.5s on master
<gchristensen>
that really is astonishing
<multun>
:'(
<clever>
compare the function call counts from the profiling json?
<niksnut>
maybe we can have a moratorium on adding modules to module-list.nix
<niksnut>
so all new modules have to be enabled via 'imports = [ ... ]'
<gchristensen>
we'll need to be careful to not kill our baby
<clever>
niksnut: isn't there infinite recursion if imports depends on config?
<niksnut>
yes, but how does it depend on config?
<clever>
if you're using an if statement to exclude things
<clever>
not sure if a giant tree of static imports would be any better than the flat module-list.nix
<niksnut>
you shouldn't use an if statement, you just import the modules you need
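The suggested style would look roughly like this in a configuration (the module path is illustrative; the real layout may differ):

```nix
{ config, pkgs, ... }:
{
  # Pull in only the modules this machine actually uses, instead of
  # relying on every module being pre-listed in module-list.nix.
  imports = [
    <nixpkgs/nixos/modules/services/web-servers/tomcat.nix>  # hypothetical path
  ];

  services.tomcat.enable = true;
}
```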
<niksnut>
looks like the worst slowdown was between 19.03 and 19.09 (2.0s -> 5.4s)
<niksnut>
no sorry, 19.03 just fails to evaluate after 2.0s ;-)
<gchristensen>
let me guess: strongswan
<niksnut>
it was 18.09 -> 19.03
<niksnut>
there also appears to be some IFD going on
<niksnut>
error: cannot import '/nix/store/gccjbjnhw5fr2z4cmbkhjlz5y7xkjcrp-nixpkgs', since path '/nix/store/gccjbjnhw5fr2z4cmbkhjlz5y7xkjcrp-nixpkgs' is not valid, at /home/eelco/Dev/nixpkgs/nixos/release.nix:23:14
<niksnut>
copying all of nixpkgs to the store probably isn't helping either
<gchristensen>
IFD? hrm. is ofborg not doing its job? :P
<niksnut>
no, this is caused by nixpkgs ? { outPath = cleanSource ./..; revCount = 130979; shortRev = "gfedcba"; } in release.nix
<niksnut>
which doesn't happen on hydra / ofborg
<gchristensen>
ahh
<LnL>
niksnut: is there an easy way to invoke the hydra evaluator (or something similar) from the cli?
<thoughtpolice>
Regardless of mechanism (Flakes, etc) requiring module imports would be a very good idea from a readability/usability perspective, IMO
<gchristensen>
hydra-evaluate-jobs I think, LnL
<gchristensen>
erm hmm maybe not that, there is something you can just point at an expression I think?
<niksnut>
hydra-eval-jobs
<niksnut>
for example: hydra-eval-jobs '<nixpkgs/nixos/release-small.nix>' -I nixpkgs=/home/eelco/Dev/nixpkgs
<LnL>
does that also write stuff to the db?
<gchristensen>
I think it just does stdout and the hydra evaluator parses and inserts
<clever>
LnL: yeah, it will eval every attr, and write all .drv files to /nix/store/
<clever>
but not insert anything into the hydra db
<LnL>
ah perfect, thanks!
<clever>
hydra-eval-jobsets (the perl script) parses the json, and fills the postgresql
<LnL>
right
<clever>
,profile
<{^_^}>
clever: Did you mean profiling?
<{^_^}>
Use NIX_COUNT_CALLS=1 and/or NIX_SHOW_STATS=1 to profile Nix evaluation
<clever>
hydra-eval-jobs will obey these, but the restarting mechanism will overwrite the profile
<clever>
you need to patch Nix to append rather than overwrite
<niksnut>
heap usage was bad with and without that
<samueldr>
ah, that's not part of master
<samueldr>
is the nixos hydra running against master or the flake branch?
<gchristensen>
flake
<samueldr>
I guess that's something to know for someone that looks into the current evaluation issue
<gchristensen>
+1
<samueldr>
LnL: ^
<gchristensen>
samueldr, LnL: note the footer of hydra says Hydra 0.1.20200211.53896ff (using nix-2.4pre20200207_d2032ed). which is correct https://github.com/nixos/hydra/commit/53896ff
<LnL>
ah right
<LnL>
hmm: restarting hydra-eval-jobs after job '...' because heap size is at ... bytes
<samueldr>
AFAIUI this won't happen with the flake branch
<samueldr>
since this code was removed
<niksnut>
right
<LnL>
ah, was going to say how come we see problems if this happens
<samueldr>
this might not solve the problems, depending on what they are, since we were hitting the GC's maximums anyway