gchristensen changed the topic of #nixos-dev to: NixOS Development (#nixos for questions) | https://hydra.nixos.org/jobset/nixos/trunk-combined https://channels.nix.gsc.io/graph.html | 18.09 release managers: vcunat and samueldr | https://logs.nix.samueldr.com/nixos-dev
pie_ has quit [Ping timeout: 252 seconds]
<{^_^}> #36542 (by vitiral, 33 weeks ago, open): question(newbie): how do I set an environment variable in nix-shell for rust packages?
lassulus_ has joined #nixos-dev
lassulus has quit [Ping timeout: 264 seconds]
lassulus_ is now known as lassulus
copumpkin has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
orivej has quit [Ping timeout: 246 seconds]
Nadrieril has quit [Remote host closed the connection]
Jackneill has quit [Ping timeout: 240 seconds]
Jackneill has joined #nixos-dev
drakonis has quit [Quit: WeeChat 2.2]
Mic92 has quit [Quit: Connection closed for inactivity]
sir_guy_carleton has quit [Quit: WeeChat 2.2]
Lisanna has joined #nixos-dev
<schmittlauch[m]> Now the nixos.tests.installer.simple.x86_64-linux part of the latest hydra build has failed.
<schmittlauch[m]> Meanwhile Firefox didn't build locally, maybe due to insufficient RAM or tmp disk space
orivej has joined #nixos-dev
emily has quit [Ping timeout: 252 seconds]
orivej has quit [Ping timeout: 246 seconds]
<andi-> schmittlauch[m]: it requires more ram with 63
<andi-> Or well tmpfs
teto has joined #nixos-dev
Mic92 has joined #nixos-dev
<srhb> Just decompressed the most recent failure logs on trunk-combined on packet-epyc-1, looks like they're all "timed out waiting for the VM to connect." This has gotten much worse recently.
<andi-> I guess that has to do with the changes in how heavy we load them?
<andi-> (just a wild shot)
<srhb> I thought gchristensen recently halved the load, but maybe not?
<gchristensen> I did
<srhb> OK, good to know.
<srhb> Doesn't look like any measurable improvement occurred.
<srhb> Maybe we should increase the connect timeout after all then
<srhb> Actually, this is strange: https://hydra.nixos.org/build/83246413 -- this shows with a duration of 5m4s, has the timed out error, but the time limit should be 10 minutes, should it not?
<srhb> nixpkgs/nixos/lib/test-driver/Machine.pm:252
<srhb> Guh, no, I can't math
<srhb> :-)
<srhb> 5 minutes indeed.
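The 5m4s duration reported for that build is what a 5-minute connect timeout plus a few seconds of setup would produce. A minimal sketch of the arithmetic, assuming the timeout is the 300 seconds settled on above; `parse_duration` is a hypothetical helper and the `5m4s` text format is assumed from the message, not taken from Hydra's code:

```python
import re

def parse_duration(text: str) -> int:
    """Convert a duration such as '5m4s' or '1h2m3s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

# Assumption: the connect timeout is 5 minutes, as concluded in the discussion.
CONNECT_TIMEOUT_S = 5 * 60

build_duration_s = parse_duration("5m4s")
# Time left over once the timeout itself is subtracted.
overhead_s = build_duration_s - CONNECT_TIMEOUT_S
print(build_duration_s, overhead_s)
```

Under that assumption only a few seconds remain for everything other than waiting on the connection.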
<gchristensen> :)
<srhb> Another thing to note: These errors seem to be far more prevalent on all of the installer.* tests.
<gchristensen> any correlation between whether or not it uses the bootloader?
<gchristensen> or do they all use the bootloader?
<srhb> I think they all do, though not completely sure.
<srhb> But this seems like a red flag.
<gchristensen> then that is interesting, i think the bootloader tests are "heavier" somehow
__Sander__ has joined #nixos-dev
<srhb> Is there a way to get more history on the constituents tab? I think a visual comparison might in fact reveal when this started being really bad.
<ekleog> fwiw when I tried to run the installer test on my machine it OOM'd, while other tests usually pass without a hitch… don't know if that's related
* andi- runs the ZFS installer test locally..
<srhb> I think it's likely there really is a bug here. Looking at the logs, the root shell ought to be connected at the point in time the error is happening. I have to turn down the timeout to like 5 seconds to reproduce it locally
<srhb> Did this happen at approximately the same time as the new VM backdoor got added?
<srhb> I think it might have.
<srhb> I wonder if there's a race
<gchristensen> ooh
<srhb> Dezgeg: ^ Do you know? :)
<andi-> But isn't the backdoor already >2 months old? How long do you *think* we are having those issues?
<srhb> Ah, I thought it was only about a month. Regardless, it's just a hunch, no solid data yet.
<andi-> I just reverted it on my local 18.09 branch... the installer test was running fine before that but took about 10min
<srhb> andi-: The failure actually occurs very quickly after startup when it occurs
<srhb> To reproduce locally you'll need to reduce it drastically, probably
<andi-> ok, I'll do that after I had some food & coffee :)
<andi-> I fear it might just be load related and hard to simulate :/
<srhb> That was my first hunch, but it's actually a fairly new error. Something is likely to have changed.
<srhb> andi-: If you notice the timestamps from the startup, the system is done doing anything after approximately 30 seconds. Then, four minutes pass and still no connection.
<srhb> (on my system, it's done booting after approx 8 seconds, so the load seems to be less than a factor of 10 in difference, and certainly not a difference of a factor of 100)
<srhb> Assuming the clock is at least somewhat accurate..
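The load argument can be put in numbers. This is only a rough sketch using the approximate figures quoted in the messages above; the 300-second timeout is assumed from the earlier discussion, and the variable names are illustrative:

```python
# Approximate figures quoted in the discussion.
builder_boot_s = 30      # on the Hydra builder the system is idle after ~30 s
local_boot_s = 8         # locally the VM is done booting after ~8 s
connect_timeout_s = 300  # assumed 5-minute connect timeout

# Observed slowdown of the builder relative to the local machine.
slowdown = builder_boot_s / local_boot_s

# Time the builder VM sits idle before the timeout fires.
idle_before_timeout_s = connect_timeout_s - builder_boot_s

# Slowdown that would be needed for boot alone to use up the whole timeout.
required_slowdown = connect_timeout_s / local_boot_s

print(slowdown, idle_before_timeout_s, required_slowdown)
```

The observed slowdown is well below what would be needed for load alone to exhaust the timeout, which is the point being made about load not being the obvious culprit.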
<andi-> with booting you mean the dmesg spam? The VM construction? Stage2 and the start of the script execution?
<srhb> Yeah
<srhb> It's not impossible that it's load related still. But the load reduction doesn't look like it had much impact..
<andi-> Would be nice to see the load of the machine at that time / during that period
<andi-> would certainly allow for ruling it out
<LnL> gchristensen has been working on getting metrics from the builders
<andi-> so I have heard :) What can we do to help with that ? :)
<gchristensen> which builder/period?
<srhb> packet-epyc-1 2018-10-29 03:54:13 for instance
<srhb> (Does the Hydra UI report UTC or my TZ?)
<LnL> logs and timezones, that never ends well...
<srhb> Indeed...
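Whether the timestamp is UTC or local time changes which point on the metrics graph to look at. A small sketch of the two readings, assuming a hypothetical local offset of UTC+1 (the viewer's actual timezone is not stated in the log):

```python
from datetime import datetime, timedelta, timezone

stamp = "2018-10-29 03:54:13"
naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")

# Reading 1: the timestamp is already UTC.
as_utc = naive.replace(tzinfo=timezone.utc)

# Reading 2: the timestamp is local time at UTC+1 (hypothetical offset),
# converted to UTC for comparison.
local_tz = timezone(timedelta(hours=1))
as_local_in_utc = naive.replace(tzinfo=local_tz).astimezone(timezone.utc)

print(as_utc.isoformat())
print(as_local_in_utc.isoformat())
print(as_utc - as_local_in_utc)
```

The two readings differ by the full offset, so a short spike on a graph could easily be attributed to the wrong build.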
<srhb> gchristensen: Thanks!
<srhb> This certainly does not look like a lad issue...
<srhb> load*
<gchristensen> I talked to niksnut about public metrics, it won't be a problem at all, I'm working on getting us a machine to host the collector
<LnL> what's that io spike, is that in seconds?
<gchristensen> yeah I think iowait is seconds
<srhb> Not load?
<LnL> gchristensen: btw, if there's anything I can do to help with the setup let me know
<gchristensen> thank you =)
<gchristensen> I'll reach out
<srhb> Regardless, it doesn't seem to correlate.
<gchristensen> node_cpu_seconds_total{instance=~"^($instance):.*",mode="iowait"} -- yeah, iowait in seconds
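Since `node_cpu_seconds_total` is a cumulative counter of seconds, the value that matters for a graph is how fast it grows. A minimal sketch of turning two counter samples into an iowait fraction; the sample values are made up, `iowait_fraction` is a hypothetical helper, and counter resets are deliberately not handled:

```python
def iowait_fraction(t0: float, v0: float, t1: float, v1: float) -> float:
    """Average fraction of wall time one CPU spent in iowait between two samples.

    t0, t1 are sample timestamps in seconds; v0, v1 are the counter values
    (cumulative seconds spent in iowait) at those times.
    """
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        raise ValueError("counter went backwards (reset); not handled in this sketch")
    return (v1 - v0) / (t1 - t0)

# Made-up samples taken 60 s apart: the counter grew by 15 s.
fraction = iowait_fraction(0.0, 120.0, 60.0, 135.0)
print(fraction)
```

A growth of 15 counter-seconds over a 60-second window means that CPU spent a quarter of the window waiting on I/O.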
<srhb> What are people comfortable with trying next? We can increase the timeout to a ridiculous amount and see if that changes anything.
* gchristensen nominates srhb as in charge of this project
<srhb> I don't think I'm a good nominee, I don't have the access necessary to do the relevant testing of the machines, restarting of jobs, etc.
<gchristensen> delegate? :)
<srhb> OK, I am actively doing that insofar as spamming here counts as delegation. xD
<gchristensen> I'm happy to be remote hands as needed... but can't really think about it clearly
<srhb> Understood
<andi-> srhb sounds like a good candidate :)
orivej has joined #nixos-dev
pie_ has joined #nixos-dev
orivej has quit [Ping timeout: 264 seconds]
Nadrieril has joined #nixos-dev
Mic92 has quit []
Mic92 has joined #nixos-dev
<gchristensen> we should be getting some hw for metrics shortly!
<srhb> gchristensen: Neat!
<globin> gchristensen++
<{^_^}> gchristensen's karma got increased to 42
<gchristensen> ok nobody give me any more karma ever, it's just the right number
<globin> just have to find enough people to give you negative karma
<globin> gchristensen: if you need support for the metrics stuff feel free to just ping me
<gchristensen> ok, first support question: what should this machine be named
<globin> gathernix
<gchristensen> https://github.com/NixOS/nixos-org-configurations/blob/master/delft/network.nix here are some current names, no names are based on "nnix" so far
orivej has joined #nixos-dev
aanderse has quit [Ping timeout: 260 seconds]
genesis has quit [Ping timeout: 260 seconds]
fadenb has quit [Ping timeout: 260 seconds]
Mic92 has quit [Quit: WeeChat 2.2]
<andi-> graphix?
Mic92 has joined #nixos-dev
Mic92 has quit [Client Quit]
genesis has joined #nixos-dev
fadenb has joined #nixos-dev
<srhb> statistix. Clearly. :P
orivej has quit [Ping timeout: 264 seconds]
Mic92 has joined #nixos-dev
<andi-> srhb: is that your special operation (build timeouts) hat speaking? :P
sir_guy_carleton has joined #nixos-dev
<niksnut> gchristensen: just pick a random name from http://wiki.southpark.cc.com/wiki/List_of_Characters :-)
simpson has quit [Ping timeout: 252 seconds]
<sphalerite> srhb: I like it! Or maybe vitalstatistix
<sphalerite> (I think that's also the name of the village chief from Asterix?)
simpson has joined #nixos-dev
<Taneb> sphalerite: at least in English, yeah
copumpkin has joined #nixos-dev
<shlevy> niksnut: around?
<niksnut> dyes
<niksnut> -d
<shlevy> niksnut: Re your comment on RST, does anything viable besides docbook *not* have that issue? If not I'll just switch to docbook :)
<niksnut> yes, the one I'm working on ;-)
<niksnut> but it's semantically close to docbook
<shlevy> Is it available for use?
<niksnut> so that may be the best choice (and also what NixOS option descriptions use)
<niksnut> not yet
<shlevy> OK
* gchristensen 's humble opinion is this doesn't matter right now, and won't necessarily be setting long-term policy
<Mic92> Is there a plan to use RST anywhere?
<shlevy> gchristensen: Agreed, but I want to have comprehensive docs for what I'm working on and I don't want to rewrite it. If we're probably going to end up on docbook, then I want to use it now
<gchristensen> (mboes tells me he finds RST not composable either?)
<gchristensen> shlevy: start with docbook, you can down-convert the semantics later :)
<shlevy> :)
<Mic92> is there a language server for docbook?
<Mic92> or maybe something that validates xml?
<gchristensen> language server, not sure -- interesting
<gchristensen> there are plenty of editor integrations available for XML
<gchristensen> my emacs validates as I type
<shlevy> gchristensen: Does it validate just as XML or does it actually check the schema?
<gchristensen> checks the schema
<shlevy> Nice. Was that easy to set up?
<gchristensen> it was
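The distinction shlevy draws between validating "just as XML" and checking the schema can be illustrated with the standard library, which only covers the first half. A minimal sketch; `is_well_formed` is a hypothetical helper, and actual DocBook schema validation would need an external tool or library that is not shown here:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML. This does NOT check any schema."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

ns = 'xmlns="http://docbook.org/ns/docbook"'

# Parses fine and looks like plausible DocBook.
good = f"<section {ns}><title>Example</title></section>"

# Mismatched tags: rejected by any XML parser.
broken = f"<section {ns}><title>Example</section>"

# Parses fine, but uses a made-up element, so a DocBook schema check
# would be expected to reject it even though this function accepts it.
well_formed_only = f"<section {ns}><bogus/></section>"

for doc in (good, broken, well_formed_only):
    print(is_well_formed(doc))
```

The third document is the case where an editor that checks the schema gives feedback that a plain XML parser cannot.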
<Mic92> gchristensen: apparently https://github.com/w0rp/ale does xml schema checks with xmllint, however docbook is not detected as xml. I might suggest this to upstream.
<Mic92> or is there a better tool for docbook?
<gchristensen> hmmm you probably need to just tell xmllint about docbook
<gchristensen> oh, Ale doesn't detect it? that is weird
<Mic92> vim sets the filetype to docbk
<Mic92> I think this was also a plugin
<gchristensen> ah!
<Mic92> I probably can ask https://github.com/jhradilek/vim-docbk to set the linter correctly
<LnL> that's separate from ale AFAIK
<LnL> those options are for :make
Synthetica has joined #nixos-dev
<Mic92> I suppose I can add this also in third-party plugins: https://github.com/w0rp/ale/blob/master/ale_linters/xml/xmllint.vim#L59
<Synthetica> Did someone say "linter"?
<LnL> the thing I added for vim-dispatch, so if I run a build it jumps to the hash mismatch error directly
<Mic92> Synthetica: for docbook
<LnL> there is a plugin for docbook, but I'm not sure what it does other than extra syntax highlighting
<gchristensen> also I have a nix expression for the Oxygen editor .... but it requires a fairly expensive license
<Mic92> Is flake.nix the same as require.nix?
<shlevy> Mic92: Yeah, we settled on calling "packages" "flakes" for now :)
<shlevy> (snowflake...)
<Synthetica> Was that decided @ nixcon2018?
<shlevy> Yeah, I mean it's not a final decision of course
<Synthetica> I like it
<shlevy> Once it's all ready and proven we'll have RFCs
<Mic92> shlevy: I made you the maintainer of the project in the github repository description. This is what we have for all repositories on nix-community: https://github.com/nix-community/flake.nix
<shlevy> speaking of, globin do you have any thoughts on how to best handle large RFCs with multiple somewhat-independent decisions?
<shlevy> Mic92: Ah, thanks!
<globin> shlevy: in the sense of needing shepherds from different parts or trying to reach a decision? I don't see a problem selecting different shepherds from different parts of nix/nixpkgs development and in case of not being able to reach a decision I'd propose those get rejected and split up into multiple less controversial RFCs
<shlevy> globin: I guess I was thinking more in terms of whether we should try for granular RFCs or not to start
<andi-> Is there a way to see the reason for "Aborted" of these jobs? https://hydra.nixos.org/build/83226757#tabs-buildsteps Or is aborted always someone manually aborting them?
<shlevy> niksnut: By the way, it turns out we really do need a name for modules in the docs. Otherwise there's no way to describe the part of the flake metadata that describes them. Currently I'm leaning toward "cell" (as in the "unit cell" of a crystal), or member as we were throwing around on Saturday
<gchristensen> start with whatever for now, search-replace later?
<shlevy> Yep
<shlevy> Just not "modules" :P
<gchristensen> I recommend the name 808a214e-4fc7-4abb-a1ae-045017a801c2
<gchristensen> (nearly) guaranteed to not have false-positive matches later :)
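The placeholder idea is simple enough to sketch: a random UUID practically never collides with real prose, so a later search-and-replace touches only the intended spots. The sentence and the final term below are illustrative, not taken from any actual documentation:

```python
import uuid

# A fresh random placeholder; it will not appear anywhere else in the text.
placeholder = str(uuid.uuid4())

draft = f"Each flake lists its {placeholder}s in the metadata."

# Later, once a name has been agreed on, replace it everywhere in one pass.
final = draft.replace(placeholder, "member")
print(final)
```

A common word such as "module" would not be safe to replace this way, which is the false-positive problem the UUID avoids.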
<globin> shlevy: I'd prefer RFCs which are made to be as granular and uncontroversial as possible to make it easier to reach a decision. If that is not possible then expand them as far out as necessary
<shlevy> gchristensen is so anti-bikeshedding :P
<gchristensen> I think so too, to be gentle on the process as we develop it
<shlevy> +1
<shlevy> OK
<gchristensen> we don't want to blow the spark out :)
<niksnut> shlevy: "member"
<shlevy> No one lets me have my whimsy :( OK :)
orivej has joined #nixos-dev
dywedir[m] has quit [Ping timeout: 250 seconds]
<andi-> Mic92, fpletz: I rebased our systemd v239 on top of the upstream stable v239 that contains a bunch of fixes. I cannot open a PR for a rebase.. How are we going to handle this? :) Also we shouldn't remove older versions (branches) since nixpkgs is referring to them...
<Mic92> andi-: just open this against nixos-v239
<Mic92> This is a branch we have in NixOS/systemd
<andi-> I know, but it is not mergeable since I had to rebase our patches..
<Mic92> andi-: I think you and I should have more rights on this repository.
emily has joined #nixos-dev
<shlevy> andi-: You can "merge in" v239 and simply take all the changes from one side of the merge to trick git
<Mic92> andi-: nixos-v239rev1 maybe?
<Mic92> or we can make tags.
<{^_^}> systemd#24 (by andir, 14 seconds ago, open): Nixos v239
<andi-> shlevy: mhm, what would that look like on the CLI/github?
<gchristensen> maybe revert the conflicting changes above yours, and then apply your rebased patches
<shlevy> I think you can do git merge --strategy=ours the-branch
<shlevy> Yep, that should do it
<andi-> ok, let me try that
<shlevy> "the resulting tree of the merge is always that of the current branch head"
<shlevy> But the commit history is forward of both branches
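The property shlevy quotes can be shown with a toy model of a commit graph. This is not git itself, only a sketch of the semantics described: the merge commit keeps the current branch's tree while recording both branches as parents. The commit names and the `Commit`, `merge_ours`, and `ancestors` helpers are all hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Commit:
    name: str
    tree: str            # stand-in for the full file contents at this commit
    parents: tuple = ()  # parent Commit objects

def merge_ours(current: Commit, other: Commit, name: str) -> Commit:
    """Model of an 'ours' merge: keep the current tree, record both parents."""
    return Commit(name=name, tree=current.tree, parents=(current, other))

def ancestors(commit: Commit) -> set:
    """Names of all commits reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        for parent in stack.pop().parents:
            if parent.name not in seen:
                seen.add(parent.name)
                stack.append(parent)
    return seen

base = Commit("upstream-v239", "upstream")
old_branch = Commit("old-patches", "upstream+patches", (base,))
stable = Commit("upstream-stable-v239", "stable", (base,))
rebased = Commit("rebased-patches", "stable+patches", (stable,))

# Run from the rebased branch, merging in the old branch with "ours".
merged = merge_ours(rebased, old_branch, "merge")
print(merged.tree, sorted(ancestors(merged)))
```

Because the old branch head is an ancestor of the merge commit, the old branch can move forward to it without a force-push, while the files are exactly those of the rebased branch.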
<gchristensen> wow cool
<Mic92> Can I have {^_^} in #nixos-nur ?
<gchristensen> sure
__Sander__ has quit [Quit: Konversation terminated!]
<fpletz> Mic92: andi-: please ask domenkozar or niksnut for access to that repo, I don't have admin rights
<andi-> fpletz: for now it is just about reviewing ;)
<fpletz> but yeah, we should rethink that rebase strategy
<fpletz> ok
<Mic92> andi-: somehow I would like to have both versions, the rebased one to see what patches we have and the merged version to see what changes have been added.
drakonis_ has quit [Ping timeout: 244 seconds]
drakonis_ has joined #nixos-dev
FRidh has joined #nixos-dev
drakonis has joined #nixos-dev
<andi-> Mic92: can push that later.. At birthday dinner right now.
drakonis has quit [Quit: WeeChat 2.2]
drakonis_ has quit [Ping timeout: 240 seconds]
drakonis has joined #nixos-dev
drakonis has quit [Ping timeout: 272 seconds]
Dezgeg has quit [Ping timeout: 252 seconds]
drakonis has joined #nixos-dev
Dezgeg has joined #nixos-dev
orivej has quit [Ping timeout: 252 seconds]
<domenkozar> zimbatm++
<{^_^}> zimbatm's karma got increased to 4
<domenkozar> well it was an ace
<domenkozar> so
<domenkozar> zimbatm++
<{^_^}> zimbatm's karma got increased to 5
drakonis1 has joined #nixos-dev
tmplt has joined #nixos-dev
drakonis_ has joined #nixos-dev
drakonis has quit [Ping timeout: 252 seconds]
<niksnut> shlevy: I've done a stream of consciousness braindump on what the flake mechanism could look like: https://gist.github.com/edolstra/40da6e3a4d4ee8fd019395365e0772e7
<niksnut> a minimal version (with no global dependency resolution) and a less minimal version
<niksnut> also some usage scenarios (it's probably important to collect as many of these as possible)
orivej has joined #nixos-dev
<shlevy> Thanks! Looking now.
<Synthetica> niksnut: Sounds good!
<shlevy> Hmm... Outside of the specifics of tooling I'm not sure I see how this is better than the design I had in mind. Also for my use case the documentation aspects are crucial, the number one blocker to broader adoption I've seen at work is discovering and understanding how different components work
<shlevy> I still think putting any kind of source info in the main flake metadata, even just the git URL, is a mistake
<shlevy> I really think we want to separate out "what do I depend on" from "where do I get it from"
<shlevy> I like the hydra jobset idea
<shlevy> Want to think more on the types of flake members...
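The separation shlevy argues for, "what do I depend on" versus "where do I get it from", can be sketched as two independent data structures joined only at resolution time. Everything here is hypothetical: the field names, the dependency names, and the URLs are invented for illustration and do not reflect any actual flake format:

```python
# "What do I depend on": names only, no source information.
flake_metadata = {
    "name": "example-flake",
    "depends_on": ["nixpkgs", "some-library"],
}

# "Where do I get it from": kept separately, so it can be swapped
# (mirror, fork, local checkout) without touching the metadata above.
source_map = {
    "nixpkgs": "https://example.invalid/nixpkgs.git",
    "some-library": "https://example.invalid/some-library.git",
}

def resolve(metadata: dict, sources: dict) -> dict:
    """Pair each declared dependency with a source, failing on any gap."""
    missing = [dep for dep in metadata["depends_on"] if dep not in sources]
    if missing:
        raise KeyError(f"no source configured for: {', '.join(missing)}")
    return {dep: sources[dep] for dep in metadata["depends_on"]}

resolved = resolve(flake_metadata, source_map)
print(resolved)
```

In this arrangement, pointing a dependency at a fork means editing only the source map, which is one way to read the objection to putting even a git URL into the main metadata.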
xeji has joined #nixos-dev
<domenkozar> xeji: shlevy: nice meeting you in person :)
jtojnar has quit [Quit: jtojnar]
<shlevy> You too!
<xeji> yeah great to meet you guys!
<LnL> I didn't expect to meet so many new people, it was nice
FRidh has quit [Quit: Konversation terminated!]
<Ericson2314> domenkozar (IRC): I don't have twitter but luigy showed me your shoutout. Thanks! 😊
zarel has joined #nixos-dev
<domenkozar> Ericson2314: missed you this year :)
zarel has quit [Quit: Leaving]
<Ericson2314> domenkozar (IRC): Yeah I procrastinated bad again :(
<Ericson2314> next year for sure, already kicking myself
drakonis has joined #nixos-dev
drakonis_ has quit [Ping timeout: 252 seconds]
<andi-> Those chromium builds being aborted :/ I start to hate that piece of software more every day.
drakonis_ has joined #nixos-dev
drakonis2 has joined #nixos-dev
drakonis has quit [Ping timeout: 252 seconds]
<andi-> so it isn't just chromium having those restart issues: https://hydra.nixos.org/build/83240579#tabs-buildsteps Is it lack of RAM / scratch space for the builds? Looking at those new fancy graphs makes me believe that the packet-t2-* machines cancel builds every few minutes to hours :/ CPU load magically drops down to zeroish