#nixos-dev on 2018-10-29

2018-08-16 20:49 gchristensen changed the topic of #nixos-dev to: NixOS Development (#nixos for questions) | https://hydra.nixos.org/jobset/nixos/trunk-combined https://channels.nix.gsc.io/graph.html | 18.09 release managers: vcunat and samueldr | https://logs.nix.samueldr.com/nixos-dev

00:01 pie_ has quit [Ping timeout: 252 seconds]

01:02 <ekleog> (triage) close https://github.com/NixOS/nixpkgs/issues/36542

01:02 <{^_^}> #36542 (by vitiral, 33 weeks ago, open): question(newbie): how do I set an environment variable in nix-shell for rust packages?

02:13 lassulus_ has joined #nixos-dev

02:16 lassulus has quit [Ping timeout: 264 seconds]

02:16 lassulus_ is now known as lassulus

02:42 copumpkin has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

02:43 orivej has quit [Ping timeout: 246 seconds]

02:54 Nadrieril has quit [Remote host closed the connection]

03:05 Jackneill has quit [Ping timeout: 240 seconds]

03:18 Jackneill has joined #nixos-dev

03:33 drakonis has quit [Quit: WeeChat 2.2]

03:52 Mic92 has quit [Quit: Connection closed for inactivity]

04:47 sir_guy_carleton has quit [Quit: WeeChat 2.2]

04:51 Lisanna has joined #nixos-dev

06:21 <schmittlauch[m]> Now the nixos.tests.installer.simple.x86_64-linux part of the latest hydra build has failed.

06:21 <schmittlauch[m]> Meanwhile Firefox didn't build locally, maybe due to insufficient RAM or tmp disk space

07:00 orivej has joined #nixos-dev

07:14 emily has quit [Ping timeout: 252 seconds]

07:22 orivej has quit [Ping timeout: 246 seconds]

07:45 <andi-> schmittlauch[m]: it requires more ram with 63

07:46 <andi-> Or well tmpfs

08:10 teto has joined #nixos-dev

08:19 Mic92 has joined #nixos-dev

08:29 <srhb> Just decompressed the most recent failure logs on trunk-combined on packet-epyc-1, looks like they're all "timed out waiting for the VM to connect." This has gotten much worse recently.

08:30 <andi-> I guess that has to do with the changes in how heavy we load them?

08:30 <andi-> (just a wild shot)

08:30 <srhb> I thought gchristensen recently halved the load, but maybe not?

08:31 <gchristensen> I did

08:31 <srhb> OK, good to know.

08:31 <srhb> Doesn't look like any measurable improvement occurred.

08:31 <srhb> Maybe we should increase the connect timeout after all then

08:34 <srhb> Actually, this is strange: https://hydra.nixos.org/build/83246413 -- this shows with a duration of 5m4s, has the timed out error, but the time limit should be 10 minutes, should it not?

08:34 <srhb> nixpkgs/nixos/lib/test-driver/Machine.pm:252

08:35 <srhb> Guh, no, I can't math

08:35 <srhb> :-)

08:35 <srhb> 5 minutes indeed.

08:40 <gchristensen> :)

08:43 <srhb> Another thing to note: These errors seem to be far more prevalent on all of the installer.* tests.

08:44 <gchristensen> any correlation between whether or not it uses the bootloader?

08:44 <gchristensen> or do they all use the bootloader?

08:45 <srhb> I think they all do, though not completely sure.

08:45 <srhb> But this seems like a red flag.

08:45 <gchristensen> then that is interesting, i think the bootloader tests are "heavier" somehow

08:46 __Sander__ has joined #nixos-dev

08:46 <srhb> Is there a way to get more history on the constituents tab? I think a visual comparison might in fact reveal when this started being really bad.

08:47 <ekleog> fwiw when I tried to run the installer test on my machine it OOM'd, while other tests usually pass without a hitch… don't know if that's related

08:49 * andi- runs the ZFS installer test locally..

08:56 <srhb> I think it's likely there really is a bug here. Looking at the logs, the root shell ought to be connected at the point in time the error is happening. I have to turn down the timeout to like 5 seconds to reproduce it locally

08:56 <srhb> Did this happen at approximately the same time as the new VM backdoor got added?

08:57 <srhb> I think it might have.

08:57 <srhb> I wonder if there's a race

08:58 <gchristensen> ooh

08:59 <srhb> Dezgeg: ^ Do you know? :)

09:03 <andi-> But isn't the backdoor already >2 months old? How long do you *think* we are having those issues?

09:04 <srhb> Ah, I thought it was only about a month. Regardless, it's just a hunch, no solid data yet.

09:04 <andi-> I just reverted it on my local 18.09 branch... the installer test was running fine before that but took about 10min

09:05 <srhb> andi-: The failure actually occurs very quickly after startup when it occurs

09:05 <srhb> To reproduce locally you'll need to reduce it drastically, probably

09:05 <andi-> ok, I'll do that after I had some food & coffee :)

09:06 <andi-> I fear it might just be load related and hard to simulate :/

09:07 <srhb> That was my first hunch, but it's actually a fairly new error. Something is likely to have changed.

09:09 <srhb> andi-: If you notice the timestamps from the startup, the system is done doing anything after approximately 30 seconds. Then, four minutes pass and still no connection.

09:09 <srhb> (on my system, it's done booting after approx 8 seconds, so the load seems to be less than a factor of 10 in difference, and certainly not a difference of a factor of 100)

09:10 <srhb> Assuming the clock is at least somewhat accurate..
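
As a rough sanity check of the estimate srhb gives here, a minimal sketch using only the figures quoted in the log (boot activity ends after about 30 seconds on the builder and about 8 seconds locally, with a 5-minute connect timeout). The variable names are illustrative, and it assumes the connection would normally arrive around the time booting finishes.

```python
# Figures quoted in the discussion; variable names are illustrative only.
builder_boot_s = 30          # boot activity ends after ~30 s on the Hydra builder
local_boot_s = 8             # the same boot takes ~8 s on srhb's machine
connect_timeout_s = 5 * 60   # connect timeout discussed above (5 minutes)

# Observed slowdown of the builder relative to the local machine.
observed_slowdown = builder_boot_s / local_boot_s

# Slowdown that load alone would have to cause for an ~8 s boot to
# run into the timeout (assumption: connection arrives right after boot).
required_slowdown = connect_timeout_s / local_boot_s

print(f"observed slowdown: {observed_slowdown:.2f}x")
print(f"slowdown needed to hit the timeout: {required_slowdown:.1f}x")
```

The observed factor stays well below 10 while the required one is several times larger, which is the gap srhb is pointing at.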

09:10 <andi-> with booting you mean the dmesg spam? The VM construction? Stage2 and the start of the script execution?

09:10 <srhb> Yeah

09:13 <srhb> It's not impossible that it's load related still. But the load reduction doesn't look like it had much impact..

09:14 <andi-> Would be nice to see the load of the machine at that time / during that period

09:14 <andi-> would certainly allow for ruling it out

09:14 <LnL> gchristensen has been working on getting metrics from the builders

09:15 <andi-> so I have heard :) What can we do to help with that ? :)

09:16 <gchristensen> which builder/period?

09:16 <srhb> packet-epyc-1 2018-10-29 03:54:13 for instance

09:17 <srhb> (Does the Hydra UI report UTC or my TZ?)

09:23 <LnL> logs and timezones, that never ends well...

09:23 <srhb> Indeed...

09:23 <gchristensen> srhb, andi-, https://snapshot.raintank.io/dashboard/snapshot/E740OUL966n5lMjc4Mm82evpTPkNQhfe

09:24 <srhb> gchristensen: Thanks!

09:24 <srhb> This certainly does not look like a lad issue...

09:24 <srhb> load*

09:24 <gchristensen> I talked to niksnut about public metrics, it isn't a problem at all, I'm working on getting us a machine to host the collector

09:26 <LnL> what's that io spike, is that in seconds?

09:26 <gchristensen> yeah I think iowait is seconds

09:26 <srhb> Not load?

09:27 <LnL> gchristensen: btw, if there's anything I can do to help with the setup let me know

09:27 <gchristensen> thank you =)

09:27 <gchristensen> I'll reach out

09:27 <srhb> Regardless, it doesn't seem to correlate.

09:28 <gchristensen> node_cpu_seconds_total{instance=~"^($instance):.*",mode="iowait"} -- yeah, iowait in seconds
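
node_cpu_seconds_total is a cumulative counter measured in seconds, so a graph of it is normally read as a rate of change between scrapes. A minimal sketch of that calculation, with made-up sample values and a hypothetical helper name; this is not the dashboard's actual query, and it ignores counter resets.

```python
def iowait_fraction(prev_total_s, curr_total_s, interval_s):
    """Fraction of one CPU's time spent in iowait between two scrapes.

    prev_total_s and curr_total_s are successive samples of a cumulative
    iowait counter (in seconds); interval_s is the time between them.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (curr_total_s - prev_total_s) / interval_s


# Made-up samples: 3 s of iowait accumulated over a 15 s scrape interval.
fraction = iowait_fraction(120.0, 123.0, 15.0)
print(f"{fraction:.0%} of the interval spent waiting on IO")
```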

09:29 <srhb> What are people comfortable with trying next? We can increase the timeout to a ridiculous amount and see if that changes anything.

09:29 * gchristensen nominates srhb as in charge of this project

09:30 <srhb> I don't think I'm a good nominee, I don't have the access necessary to do the relevant testing of the machines, restarting of jobs, etc.

09:30 <gchristensen> delegate? :)

09:30 <srhb> OK, I am actively doing that insofar as spamming here counts as delegation. xD

09:30 <gchristensen> I'm happy to be remote hands as needed... but can't really think about it clearly

09:31 <srhb> Understood

09:36 <andi-> srhb sounds like a good candidate :)

09:43 orivej has joined #nixos-dev

10:12 pie_ has joined #nixos-dev

10:14 orivej has quit [Ping timeout: 264 seconds]

10:22 Nadrieril has joined #nixos-dev

10:53 Mic92 has quit []

10:54 Mic92 has joined #nixos-dev

10:56 <gchristensen> we should be getting some hw for metrics shortly!

10:58 <srhb> gchristensen: Neat!

11:02 <globin> gchristensen++

11:02 <{^_^}> gchristensen's karma got increased to 42

11:02 <gchristensen> ok nobody give me any more karma ever, it's just the right number

11:03 <globin> just have to find enough people to give you negative karma

11:03 <globin> gchristensen: if you need support for the metrics stuff feel free to just ping me

11:04 <gchristensen> ok, first support question: what should this machine be named

11:06 <globin> gathernix

11:09 <gchristensen> https://github.com/NixOS/nixos-org-configurations/blob/master/delft/network.nix here are some current names, no names are based on "nnix" so far

11:11 orivej has joined #nixos-dev

11:17 aanderse has quit [Ping timeout: 260 seconds]

11:17 genesis has quit [Ping timeout: 260 seconds]

11:17 fadenb has quit [Ping timeout: 260 seconds]

11:27 Mic92 has quit [Quit: WeeChat 2.2]

11:28 <andi-> graphix?

11:28 Mic92 has joined #nixos-dev

11:29 Mic92 has quit [Client Quit]

11:30 genesis has joined #nixos-dev

11:30 fadenb has joined #nixos-dev

11:32 <srhb> statistix. Clearly. :P

11:32 orivej has quit [Ping timeout: 264 seconds]

11:32 Mic92 has joined #nixos-dev

11:40 <andi-> srhb: is that your special operation (build timeouts) hat speaking? :P

12:03 sir_guy_carleton has joined #nixos-dev

12:03 <niksnut> gchristensen: just pick a random name from http://wiki.southpark.cc.com/wiki/List_of_Characters :-)

12:25 simpson has quit [Ping timeout: 252 seconds]

12:29 <sphalerite> srhb: I like it! Or maybe vitalstatistix

12:29 <sphalerite> (I think that's also the name of the village chief from asterix?)

12:32 simpson has joined #nixos-dev

13:06 <Taneb> sphalerite: at least in English, yeah

13:17 copumpkin has joined #nixos-dev

13:20 <shlevy> niksnut: around?

13:38 <niksnut> dyes

13:38 <niksnut> -d

13:40 <shlevy> niksnut: Re your comment on RST, does anything viable besides docbook *not* have that issue? If not I'll just switch to docbook :)

13:42 <niksnut> yes, the one I'm working on ;-)

13:42 <niksnut> but it's semantically close to docbook

13:42 <shlevy> Is it available for use?

13:42 <niksnut> so that may be the best choice (and also what NixOS option descriptions use)

13:42 <niksnut> not yet

13:42 <shlevy> OK

13:46 * gchristensen 's humble opinion is this doesn't matter right now, and won't necessarily be setting long-term policy

13:46 <Mic92> Is there a plan to use RST anywhere?

13:47 <shlevy> gchristensen: Agreed, but I want to have comprehensive docs for what I'm working on and I don't want to rewrite it. If we're probably going to end up on docbook, then I want to use it now

13:47 <gchristensen> (mboes tells me he finds RST not composable either?)

13:47 <gchristensen> shlevy: start with docbook, you can down-convert the semantics later :)

13:47 <shlevy> :)

13:49 <Mic92> is there a language server for docbook?

13:51 <Mic92> or maybe something that validates xml?

13:52 <gchristensen> language server, not sure -- interesting

13:52 <gchristensen> there are plenty of editor integrations available for XML

13:52 <gchristensen> my emacs validates as I type

13:53 <shlevy> gchristensen: Does it validate just as XML or does it actually check the schema?

13:54 <gchristensen> checks the schema

13:54 <shlevy> Nice. Was that easy to set up?

13:54 <gchristensen> it was

13:55 <gchristensen> https://github.com/grahamc/nixos-config/blob/master/packages/emacs/default.el#L50-L51 https://github.com/grahamc/nixos-config/blob/master/packages/emacs/default.nix#L51-L59

13:58 <Mic92> gchristensen: apparently https://github.com/w0rp/ale does xml schema checks with xmllint, however docbook is not detected as xml. I might suggest this to upstream.

13:58 <Mic92> or is there a better tool for docbook?

13:58 <gchristensen> hmmm you probably need to just tell xmllint about docbook

13:59 <gchristensen> oh, Ale doesn't detect it? that is weird

13:59 <Mic92> vim sets the filetype to docbk

13:59 <Mic92> I think this was also a plugin

13:59 <gchristensen> ah!

14:00 <Mic92> I probably can ask https://github.com/jhradilek/vim-docbk to set the linter correctly

14:01 <Mic92> something like https://github.com/LnL7/vim-nix/blob/master/compiler/nix-build.vim

14:02 <LnL> that's separate from ale AFAIK

14:02 <LnL> those options are for :make

14:03 Synthetica has joined #nixos-dev

14:03 <Mic92> I suppose I can add this also in third-party plugins: https://github.com/w0rp/ale/blob/master/ale_linters/xml/xmllint.vim#L59

14:04 <Synthetica> Did someone say "linter"?

14:04 <LnL> the thing I added for vim-dispatch, so if I run a build it jumps to the hash mismatch error directly

14:05 <Mic92> Synthetica: for docbook

14:06 <LnL> there is a plugin for docbook, but I'm not sure what it does other than extra syntax highlighting

14:07 <gchristensen> also I have a nix expression for the Oxygen editor .... but it requires a fairly expensive license

14:18 <Mic92> Is flake.nix the same as require.nix?

14:19 <shlevy> Mic92: Yeah, we settled on calling "packages" "flakes" for now :)

14:19 <shlevy> (snowflake...)

14:19 <Synthetica> Was that decided @ nixcon2018?

14:19 <shlevy> Yeah, I mean it's not a final decision of course

14:20 <Synthetica> I like it

14:20 <shlevy> Once it's all ready and proven we'll have RFCs

14:21 <Mic92> shlevy: I made you the maintainer of the project in the github repository description. This is what we have for all repositories on nix-community: https://github.com/nix-community/flake.nix

14:21 <shlevy> speaking of, globin do you have any thoughts on how to best handle large RFCs with multiple somewhat-independent decisions?

14:21 <shlevy> Mic92: Ah, thanks!

14:32 <globin> shlevy: in the sense of needing shepherds from different parts or trying to reach a decision? I don't see a problem selecting different shepherds from different parts of nix/nixpkgs development and in case of not being able to reach a decision I'd propose those get rejected and split up into multiple less controversial RFCs

14:33 <shlevy> globin: I guess I was thinking more in terms of whether we should try for granular RFCs or not to start

14:33 <andi-> Is there a way to see the reason for "Aborted" of these jobs? https://hydra.nixos.org/build/83226757#tabs-buildsteps Or is aborted always someone manually aborting them?

14:35 <shlevy> niksnut: By the way, it turns out we really do need a name for modules in the docs. Otherwise there's no way to describe the part of the flake metadata that describes them. Currently I'm leaning toward "cell" (as in the "unit cell" of a crystal), or member as we were throwing around on Saturday

14:36 <gchristensen> start with whatever for now, search-replace later?

14:36 <shlevy> Yep

14:36 <shlevy> Just not "modules" :P

14:36 <gchristensen> I recommend the name 808a214e-4fc7-4abb-a1ae-045017a801c2

14:36 <gchristensen> (nearly) guaranteed to not have false-positive matches later :)

14:37 <globin> shlevy: I'd prefer RFCs which are made to be as granular and uncontroversial as possible to make it easier to reach a decision. If that is not possible then expand them as far out as necessary

14:37 <shlevy> gchristensen is so anti-bikeshedding :P

14:38 <gchristensen> I think so too, to be gentle on the process as we develop it

14:38 <shlevy> +1

14:38 <shlevy> OK

14:38 <gchristensen> we don't want to blow the spark out :)

14:38 <niksnut> shlevy: "member"

14:39 <shlevy> No one lets me have my whimsy :( OK :)

14:41 orivej has joined #nixos-dev

15:16 dywedir[m] has quit [Ping timeout: 250 seconds]

16:08 <andi-> Mic92, fpletz: I rebased our systemd v239 on top of the upstream stable v239 that contains a bunch of fixes. I cannot open a PR for a rebase.. How are we going to handle this? :) Also we shouldn't remove older versions (branches) since nixpkgs is referring to them...

16:10 <Mic92> andi-: just open this against nixos-v239

16:10 <Mic92> This is a branch we have in NixOS/systemd

16:11 <andi-> I know, but it is not mergable since I had to rebase our patches..

16:11 <Mic92> andi-: I think you and I should have more rights on this repository.

16:11 emily has joined #nixos-dev

16:12 <shlevy> andi-: You can "merge in" v239 and simply take all the changes from one side of the merge to trick git

16:12 <Mic92> andi-: nixos-v239rev1 maybe?

16:12 <Mic92> or we can make tags.

16:13 <andi-> https://github.com/NixOS/systemd/pull/24 is the changeset

16:13 <{^_^}> systemd#24 (by andir, 14 seconds ago, open): Nixos v239

16:13 <andi-> shlevy: mhm, how would that look like on the CLI/github?

16:14 <gchristensen> maybe revert the conflicting changes above yours, and then apply your rebased patches

16:14 <shlevy> I think you can do git merge --strategy=ours the-branch

16:15 <shlevy> Yep, that should do it

16:15 <andi-> ok, let me try that

16:15 <shlevy> "the resulting tree of the merge is always that of the current branch head"

16:15 <shlevy> But the commit history is forward of both branches
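
What shlevy describes for `git merge --strategy=ours the-branch` can be pictured with a toy commit model: the merge commit keeps the current branch's tree unchanged but records both heads as parents. This is a sketch of the idea only, not how git is implemented; the class, the helper names and the tree labels are made up, and it assumes the rebased branch is the one checked out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    tree: str            # stand-in for the snapshot of files
    parents: tuple = ()  # parent commits


def merge_ours(ours, theirs):
    """Toy model of `git merge --strategy=ours theirs`."""
    # The tree is taken unchanged from the current branch head,
    # while both heads are recorded as parents of the merge commit.
    return Commit(tree=ours.tree, parents=(ours, theirs))


def is_ancestor(ancestor, commit):
    """True if `ancestor` is reachable from `commit` via parent links."""
    if commit == ancestor:
        return True
    return any(is_ancestor(ancestor, p) for p in commit.parents)


base = Commit(tree="v239")
old_branch = Commit(tree="v239 + old patches", parents=(base,))
rebased = Commit(tree="v239-stable + rebased patches", parents=(base,))

# With the rebased branch checked out, merge the old branch using "ours".
merge = merge_ours(rebased, old_branch)
print(merge.tree)
print(is_ancestor(old_branch, merge))
```

Because the old branch head becomes an ancestor of the merge commit, the result can be proposed against that branch as an ordinary forward change, while the files are exactly those of the rebased branch.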

16:16 <gchristensen> wow cool

16:17 <andi-> looks like that would work: https://github.com/NixOS/systemd/compare/nixos-v239...andir:nixos-v239-test

16:21 <Mic92> Can I have {^_^} in #nixos-nur ?

16:22 <gchristensen> sure

16:45 __Sander__ has quit [Quit: Konversation terminated!]

16:55 <fpletz> Mic92: andi-: please ask domenkozar or niksnut for access to that repo, I don't have admin rights

16:56 <andi-> fpletz: for now it is just about reviewing ;)

16:56 <fpletz> but yeah, we should rethink that rebase strategy

16:57 <fpletz> ok

17:10 <Mic92> andi-: somehow I would like to have both versions, the rebased one to see what patches we have and the merged version to see what changes have been added.

17:21 drakonis_ has quit [Ping timeout: 244 seconds]

17:27 drakonis_ has joined #nixos-dev

17:35 FRidh has joined #nixos-dev

17:39 drakonis has joined #nixos-dev

18:05 <andi-> Mic92: can push that later.. At birthday dinner right now.

18:05 drakonis has quit [Quit: WeeChat 2.2]

18:10 drakonis_ has quit [Ping timeout: 240 seconds]

18:10 drakonis has joined #nixos-dev

18:45 drakonis has quit [Ping timeout: 272 seconds]

18:51 Dezgeg has quit [Ping timeout: 252 seconds]

19:03 drakonis has joined #nixos-dev

19:12 Dezgeg has joined #nixos-dev

19:16 orivej has quit [Ping timeout: 252 seconds]

19:43 <domenkozar> zimbatm++

19:43 <{^_^}> zimbatm's karma got increased to 4

19:43 <domenkozar> well it was an ace

19:43 <domenkozar> so

19:43 <domenkozar> zimbatm++

19:43 <{^_^}> zimbatm's karma got increased to 5

20:07 drakonis1 has joined #nixos-dev

20:11 tmplt has joined #nixos-dev

20:13 drakonis_ has joined #nixos-dev

20:16 drakonis has quit [Ping timeout: 252 seconds]

20:44 <niksnut> shlevy: I've done a stream of consciousness braindump on what the flake mechanism could look like: https://gist.github.com/edolstra/40da6e3a4d4ee8fd019395365e0772e7

20:44 <niksnut> a minimal version (with no global dependency resolution) and a less minimal version

20:44 <niksnut> also some usage scenarios (it's probably important to collect as many of these as possible)

20:47 orivej has joined #nixos-dev

20:47 <shlevy> Thanks! Looking now.

20:52 <Synthetica> niksnut: Sounds good!

20:55 <shlevy> Hmm... Outside of the specifics of tooling I'm not sure I see how this is better than the design I had in mind. Also for my use case the documentation aspects are crucial, the number one blocker to broader adoption I've seen at work is discovering and understanding how different components work

20:56 <shlevy> I still think putting any kind of source info in the main flake metadata, even just the git URL, is a mistake

20:56 <shlevy> I really think we want to separate out "what do I depend on" from "where do I get it from"
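
One way to picture the separation shlevy argues for, as a purely hypothetical sketch: the metadata only names what is needed, and a separately maintained map says where each name comes from. None of the field names, flake names or URLs below come from the gist or from any actual flake format.

```python
# Hypothetical flake metadata: it only names what is needed.
flake_metadata = {"name": "my-flake", "requires": ["nixpkgs", "some-lib"]}

# Hypothetical, separately maintained mapping from names to locations.
sources = {
    "nixpkgs": "https://example.org/nixpkgs.git",
    "some-lib": "https://example.org/some-lib.git",
}


def resolve(metadata, source_map):
    """Pair each required name with a location from the separate map."""
    missing = [n for n in metadata["requires"] if n not in source_map]
    if missing:
        raise KeyError(f"no source known for: {', '.join(missing)}")
    return {n: source_map[n] for n in metadata["requires"]}


resolved = resolve(flake_metadata, sources)
print(resolved)
```

Swapping in a different `sources` map changes where things are fetched from without touching the metadata itself.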

20:58 <shlevy> I like the hydra jobset idea

20:59 <shlevy> Want to think more on the types of flake members...

21:03 xeji has joined #nixos-dev

21:10 <domenkozar> xeji: shlevy: nice meeting you in person :)

21:10 jtojnar has quit [Quit: jtojnar]

21:10 <shlevy> You too!

21:26 <xeji> yeah great to meet you guys!

21:30 <LnL> I didn't expect to meet so many new people, it was nice

21:37 FRidh has quit [Quit: Konversation terminated!]

21:45 <Ericson2314> domenkozar (IRC): I don't have twitter but luigy showed me your shoutout. Thanks! 😊

21:54 zarel has joined #nixos-dev

22:01 <domenkozar> Ericson2314: missed you this year :)

22:03 zarel has quit [Quit: Leaving]

22:10 <Ericson2314> domenkozar (IRC): Yeah I procrastinated bad again :(

22:11 <Ericson2314> next year for sure, already kicking myself

22:49 drakonis has joined #nixos-dev

22:53 drakonis_ has quit [Ping timeout: 252 seconds]

23:01 <andi-> Those chromium builds being aborted :/ I start to hate that piece of software more every day.

23:56 drakonis_ has joined #nixos-dev

23:57 drakonis2 has joined #nixos-dev

23:59 drakonis has quit [Ping timeout: 252 seconds]

23:59 <andi-> so it isn't just chromium having those restart issues: https://hydra.nixos.org/build/83240579#tabs-buildsteps Is it lack of RAM / scratch space for the builds? Looking at those new fancy graphs makes me believe that the packet-t2-* machines cancel builds every few minutes to hours :/ CPU load magically drops down to zeroish