<FRidh>
I wonder if there are any objections to the pace at which we iterate staging nowadays. It can at times go pretty fast; e.g. right now there are two new stdenv rebuilds in two days, which means quite a lot of bandwidth is needed when following unstable
<Taneb>
What's with the eval failure on the nixos:release-20.03 jobset?
<infinisil>
FRidh: I once brought up the idea of having multiple levels of staging, which I quite like. So one level for e.g. stdenv rebuilds, only getting a couple of commits, being merged into a second level that gets core library updates, which are more common, etc.
<infinisil>
I think that could save quite a bit of resources
<Taneb>
In the haskell-updates jobset haskellPackages.generic-deriving failed for weird reasons that don't happen to me locally, could it be restarted? It causes a lot of knock-on failures
<Taneb>
A few things seem to have been failing on machine a63b04eb for the same reason
<pie_[bnc]>
infinisil: something I shoot myself in the foot with a lot is modules that don't complain if you pass an invalid path
<infinisil>
pie_[bnc]: Elaborate?
<pie_[bnc]>
just did it with services.tomcat.webapps for example
<pie_[bnc]>
I did services.tomcat.webapps = [ ./share/whatever ] instead of ./result/share/whatever
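The mix-up described above can be sketched as a NixOS config fragment (paths are the hypothetical ones from the message; ./result is the symlink a previous nix-build would leave behind):

```nix
{ config, pkgs, ... }:
{
  services.tomcat.enable = true;

  # Intended: the webapp taken from a previously built ./result symlink.
  services.tomcat.webapps = [ ./result/share/whatever ];

  # The typo'd variant below is accepted just as silently; per the
  # complaint above, nothing validates the path up front:
  # services.tomcat.webapps = [ ./share/whatever ];
}
```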
<pie_[bnc]>
well, I'm still rebuilding, so hopefully it will work this time and isn't some other user error
<infinisil>
Hm I see, this might be fixable
<pie_[bnc]>
proceeded to shoot myself in the foot a second time by typoing it
<pie_[bnc]>
infinisil: maybe sometimes you want to be able to pass paths that seem invalid at the time (this seems to be generating a shell script)... but most of the time, probably not
<pie_[bnc]>
I didn't think about this very hard
<pie_[bnc]>
OK, maybe I'm just doing this wrong; taking it to #nixos
<Profpatsch>
FRidh: Huh, are we merging staging that often?
<Profpatsch>
I thought the time between mass rebuilds has gone up since we introduced staging?
<Profpatsch>
At least I’m switching to current master a lot and I never have mass rebuilds anymore.
<Profpatsch>
Usually I don’t have to build anything.
<Profpatsch>
(I really don’t care about re-fetching binaries when following master tbh)
<Profpatsch>
FRidh: If you really want to contribute to the solution, help get the CAS into nix :)
<Profpatsch>
Fast iteration is *good*, because it means people aren’t afraid to touch low-level parts of the system, which means we won’t accrue as much technical debt.
<Profpatsch>
Slowing that down because some people have to fetch more (prebuilt!) binaries because they want to follow master/unstable would be extremely unhealthy.
<worldofpeace>
Taneb: I asked gchristensen about that
<FRidh>
Profpatsch: typically it's once a week or so that staging-next is merged
<FRidh>
those killed jobs are indeed annoying
<gchristensen>
very :(
<gchristensen>
I'm a bit out of my depth on that problem. I'll have to ping eelco
<FRidh>
have there been any changes to hydra in the last week or so?
<FRidh>
aside from the database change
<gchristensen>
not sure
<gchristensen>
is master failing to eval too?
<FRidh>
no, failing eval is only with release-20.03
<gchristensen>
INTERESTING.
<FRidh>
but jobs are dying with killed 9 everywhere
<FRidh>
new index for 20.03?
<Taneb>
It's weird that 20.03 isn't working but 20.03-small, for example, is
<gchristensen>
it is surprising to me that the tested job evaluates for release-combined but not release-small
<gchristensen>
that is the first place to look: bisect between the last known good and the currently broken, and find where the evaluation started failing
<gchristensen>
yorick: where did you see that text, btw?
<clever>
gchristensen: likely the build overview page
<tokudan[m]>
so... 19.09-small seems to be stuck on the error: value is a string while an integer was expected, at /tmp/build-112561727/nixpkgs/source/nixos/release.nix:15:59. which is the nixpkgs.revCount part of this line: versionSuffix = (if stableBranch then "." else "beta") + "${toString (nixpkgs.revCount - 192668)}.${nixpkgs.shortRev}";
<tokudan[m]>
which is a negative number
<gchristensen>
yikes, did those values change recently?
<gchristensen>
did something get backported which shouldn't have?
<tokudan[m]>
the last time the revCount changed was ~5 months ago, according to the blame game
<tokudan[m]>
which should probably be ok
<gchristensen>
does it evaluate locally?
<tokudan[m]>
that was my local reproduction of the error
<tokudan>
so... I'm able to build the nixos-19.09-small channel with a minor change in the build script: instead of revCount = \"$revCount\"; I used revCount = $revCount;
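The quoting fix above amounts to the difference between handing Nix a string and an integer. A minimal sketch with made-up numbers (not the real revCount):

```nix
# Illustrative values only; the real revCount comes from the build script.
let
  fixed  = { revCount = 192670; };    # unquoted in the script: an integer
  broken = { revCount = "192670"; };  # quoted in the script: a string
in
  # Works: integer subtraction, then toString.
  "${toString (fixed.revCount - 192668)}"
  # Substituting broken.revCount instead would fail with:
  #   error: value is a string while an integer was expected
```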
<tokudan>
another strange error I noticed though is machine# [ 137.275557] systemd[1]: container@foo.service: start operation timed out. Terminating.
<tokudan>
got no idea what the actual problem is
<yorick>
why would revcount be lower than 192668?
<gchristensen>
maybe there is a bug in hydra?
<niksnut>
hm, there have been some changes to fetchGit etc., but I don't think that's used here
<gchristensen>
yeah, this is hydra's fetch git impl
<niksnut>
btw, it's astonishing how much slower nixos eval has gotten, 'nix-instantiate nixos/release-combined.nix -A nixos.tests.misc.x86_64-linux --dry-run' took 1.7s in 18.09, but 5.5s on master
<gchristensen>
that really is astonishing
<multun>
:'(
<clever>
compare the function call counts from the profiling json?
<niksnut>
maybe we can have a moratorium on adding modules to module-list.nix
<niksnut>
so all new modules have to be enabled via 'imports = [ ... ]'
<gchristensen>
we'll need to be careful to not kill our baby
<clever>
niksnut: isn't there infinite recursion if imports depends on config?
<niksnut>
yes, but how does it depend on config?
<clever>
if you're using an if statement to exclude things
<clever>
not sure if a giant tree of static imports would be any better than the flat module-list.nix
<niksnut>
you shouldn't use an if statement, you just import the modules you need
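The suggested style would look roughly like this in a configuration (the module path is illustrative; the real layout may differ):

```nix
{ config, pkgs, ... }:
{
  # Pull in only the modules this machine actually uses, instead of
  # relying on every module being pre-listed in module-list.nix.
  imports = [
    <nixpkgs/nixos/modules/services/web-servers/tomcat.nix>  # hypothetical path
  ];

  services.tomcat.enable = true;
}
```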
<niksnut>
looks like the worst slowdown was between 19.03 and 19.09 (2.0s -> 5.4s)
<niksnut>
no sorry, 19.03 just fails to evaluate after 2.0s ;-)
<gchristensen>
let me guess: strongswan
<niksnut>
it was 18.09 -> 19.03
<niksnut>
there also appears to be some IFD going on
<niksnut>
error: cannot import '/nix/store/gccjbjnhw5fr2z4cmbkhjlz5y7xkjcrp-nixpkgs', since path '/nix/store/gccjbjnhw5fr2z4cmbkhjlz5y7xkjcrp-nixpkgs' is not valid, at /home/eelco/Dev/nixpkgs/nixos/release.nix:23:14
<niksnut>
copying all of nixpkgs to the store probably isn't helping either
<gchristensen>
IFD? hrm. is ofborg not doing its job? :P
<niksnut>
no, this is caused by nixpkgs ? { outPath = cleanSource ./..; revCount = 130979; shortRev = "gfedcba"; } in release.nix
<niksnut>
which doesn't happen on hydra / ofborg
<gchristensen>
ahh
<LnL>
niksnut: is there an easy way to invoke the hydra evaluator (or something similar) from the cli?
<thoughtpolice>
Regardless of mechanism (Flakes, etc) requiring module imports would be a very good idea from a readability/usability perspective, IMO
<gchristensen>
hydra-evaluate-jobs I think, LnL
<gchristensen>
erm hmm maybe not that, there is something you can just point at an expression I think?
<niksnut>
hydra-eval-jobs
<niksnut>
for example: hydra-eval-jobs '<nixpkgs/nixos/release-small.nix>' -I nixpkgs=/home/eelco/Dev/nixpkgs
<LnL>
does that also write stuff to the db?
<gchristensen>
I think it just does stdout and the hydra evaluator parses and inserts
<clever>
LnL: yeah, it will eval every attr, and write all .drv files to /nix/store/
<clever>
but not insert anything into the hydra db
<LnL>
ah perfect, thanks!
<clever>
hydra-eval-jobsets (the perl script) parses the json, and fills the postgresql
<LnL>
right
<clever>
,profile
<{^_^}>
clever: Did you mean profiling?
<{^_^}>
Use NIX_COUNT_CALLS=1 and/or NIX_SHOW_STATS=1 to profile Nix evaluation
<clever>
hydra-eval-jobs will obey these, but the restarting mechanism will overwrite the profile
<clever>
you need to patch Nix to append rather than overwrite
<niksnut>
heap usage was bad with and without that
<samueldr>
ah, that's not part of master
<samueldr>
is the nixos hydra running against master or the flake branch?
<gchristensen>
flake
<samueldr>
I guess that's something to know for someone that looks into the current evaluation issue
<gchristensen>
+1
<samueldr>
LnL: ^
<gchristensen>
samueldr, LnL: note the footer of hydra says Hydra 0.1.20200211.53896ff (using nix-2.4pre20200207_d2032ed). which is correct https://github.com/nixos/hydra/commit/53896ff
<LnL>
ah right
<LnL>
hmm: restarting hydra-eval-jobs after job '...' because heap size is at ... bytes
<samueldr>
AFAIUI this won't happen with the flake branch
<samueldr>
since this code was removed
<niksnut>
right
<LnL>
ah, was going to say how come we see problems if this happens
<samueldr>
this might not solve the problems, depending on what they are, since we were hitting the GC's maximums anyway