gchristensen changed the topic of #nixos-dev to: NixOS Development (#nixos for questions) | https://hydra.nixos.org/jobset/nixos/trunk-combined https://channels.nix.gsc.io/graph.html https://r13y.com | 18.09 release managers: vcunat,samueldr | 19.03 RMs: samueldr,sphalerite | https://logs.nix.samueldr.com/nixos-dev
drakonis_ has quit [Ping timeout: 258 seconds]
drakonis_ has joined #nixos-dev
<samueldr> !! I think I just hit a usual hydra fluke in a local nixos test run
<gchristensen> oh?
<samueldr> still doing tests
<samueldr> but it seemingly was reproduced the second time so I'm thinking it wasn't it
<samueldr> though I do have this in my dmesg
<samueldr> [86826.269647] L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.
<samueldr> though just once
<samueldr> and it's a recent log going from the timestamp
<samueldr> oh, not from today though :/
copumpkin has joined #nixos-dev
copumpkin has quit [Client Quit]
ajs124 has left #nixos-dev [#nixos-dev]
<samueldr> initial testing seems to show that `nix-build ./nixos/tests/installer.nix -A simple && nix-build ./nixos/tests/installer.nix -A simpleUefiGrub` will pass, but `nix-build ./nixos/tests/installer.nix -A simpleUefiGrub -A simple` fails
<samueldr> are we betting it's likely tests run in parallel more often on epyc?
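[editor's sketch] The two invocation styles samueldr compares can be written out side by side. This is a minimal sketch that only builds the argument lists and does not run anything; the helper names `sequential_commands` and `combined_command` are made up for illustration, and the test file path and attribute names are the ones quoted in the log.

```python
import shlex

# Path quoted in the log; adjust if your nixpkgs checkout lives elsewhere.
TEST_FILE = "./nixos/tests/installer.nix"


def sequential_commands(attrs):
    """One nix-build invocation per attribute, meant to be chained with &&."""
    return [["nix-build", TEST_FILE, "-A", attr] for attr in attrs]


def combined_command(attrs):
    """A single nix-build invocation that requests every attribute at once."""
    cmd = ["nix-build", TEST_FILE]
    for attr in attrs:
        cmd += ["-A", attr]
    return cmd


if __name__ == "__main__":
    # Sequential form: reported as passing in the log.
    print(" && ".join(shlex.join(c) for c in sequential_commands(["simple", "simpleUefiGrub"])))
    # Combined form: reported as failing on one machine in the log.
    print(shlex.join(combined_command(["simpleUefiGrub", "simple"])))
```

Whether the combined form really runs both tests at the same time presumably depends on how many build jobs Nix is allowed to run on that host (for example its `max-jobs` setting), which is not stated in the log.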
justanotheruser has joined #nixos-dev
Taneb has quit [Quit: I seem to have stopped.]
<samueldr> same nixpkgs revision on the hosts; same nixpkgs revisions on which I nix-build; can reliably reproduce on my workstation, cannot on the laptop ¯\_(ツ)_/¯
drakonis has quit [Quit: WeeChat 2.3]
<samueldr> if anyone has anything to try and debug or investigate, I'm in
<samueldr> the differences between the two machines are RAM (64 vs. 8 GB) and CPU (E5-1660 vs. i5-4210U [6*2 vs. 2*2 cores*threads]); otherwise both were really free, as the laptop wasn't in use during the tests, only a terminal being started
drakonis has joined #nixos-dev
drakonis_ has quit [Ping timeout: 264 seconds]
init_6 has joined #nixos-dev
Taneb has joined #nixos-dev
drakonis has quit [Ping timeout: 246 seconds]
drakonis has joined #nixos-dev
<worldofpeace> Yay :) https://github.com/NixOS/nixpkgs/pull/59072#issuecomment-480744410 . Who doesn't enjoy praise when it's given.
<etu> worldofpeace++
<{^_^}> worldofpeace's karma got increased to 15
ajs124 has joined #nixos-dev
ajs124 has left #nixos-dev [#nixos-dev]
ajs124 has joined #nixos-dev
init_6 has quit []
<gchristensen> worldofpeace++
<{^_^}> worldofpeace's karma got increased to 16
<gchristensen> samueldr: pretty sure epyc-1 does run multiple tests at once, indeed!
init_6 has joined #nixos-dev
init_6 has quit [Remote host closed the connection]
orivej has quit [Ping timeout: 250 seconds]
<samueldr> I'm more annoyed at how it reliably reproduces on one machine, but not the other :/
ajs124 has left #nixos-dev [#nixos-dev]
ajs124 has joined #nixos-dev
<andi-> Have you considered restricting the amount of CPU flags to something which is available on all machines? Maybe that will already show if it depends on something specific of that one chip?
<samueldr> andi-: how can I restrict the amount of cpu flags?
<samueldr> in my local testing, at least, it's not done through hydra, and not done through a builder option, just plain native nix-build
<andi-> the qemu -cpu parameter probably
<andi-> must check what presets there are that make sense for a test..
<samueldr> need to test with a real repro now that I know of one; but from the little research I did in the past it seems that with -kvm, it won't stop the cpu from working with these features, it just masks the flag
<andi-> yes it doesn't "protect" the CPU from executing them.
<samueldr> and it wasn't obvious if instructions existed for those flags whether they'd work or not with qemu non-kvm
<andi-> It is just a wild guess since it seems to be specific to the epyc?
<andi-> we are already using kvm64 which is a common set of features AFAIK
<samueldr> yeah, though my workstation is a 2012 vintage cpu; my laptop a 2014 vintage one :/
<andi-> so probably not it
<samueldr> epyc is more recent, though it's entirely possible that an older xeon has features that a newer i* doesn't have
<samueldr> workstation has ["smx", "dca", "x2apic"]; laptop has ["sdbg", "fma", "movbe", "f16c", "rdrand", "abm", "cpuid_fault", "invpcid_single", "ept_ad", "fsgsbase", "tsc_adjust", "bmi1", "avx2", "smep", "bmi2", "erms", "invpcid"]
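[editor's sketch] A comparison like the one above (flags present on one machine but not the other) could be produced along these lines. It is a sketch under assumptions: it expects the text of `/proc/cpuinfo` from each host as input, reads only the first `flags` line, and the function names are hypothetical. How samueldr actually generated the lists is not stated in the log.

```python
def parse_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def flag_diff(a_text, b_text):
    """Return (flags only in a, flags only in b), each sorted alphabetically."""
    a, b = parse_flags(a_text), parse_flags(b_text)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Tiny made-up samples, not the real flag lists of either machine.
    workstation = "model name\t: example A\nflags\t\t: fpu smx dca avx\n"
    laptop = "model name\t: example B\nflags\t\t: fpu avx avx2 bmi1\n"
    only_workstation, only_laptop = flag_diff(workstation, laptop)
    print("workstation only:", only_workstation)
    print("laptop only:", only_laptop)
```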
<andi-> what irritates me is that both tests run right after each other should work and just that one test does not..
<andi-> does it matter how much of a pause there is between the tests? (This sounds like total madness)
<samueldr> uh, that when run in parallel they fail?
<samueldr> I don't know other than && if there are issues
<samueldr> but I guess that adding additional time between runs won't make it fail
<samueldr> wondering though if adding a sleep to one so it starts a bit crooked would change things up
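[editor's sketch] One way to try the "start one a bit crooked" idea is to launch the commands as separate processes with a delay between starts. This is only a sketch of that approach, assuming the delay is applied outside the tests rather than inside them; `run_staggered` is a hypothetical helper, and the example at the bottom uses harmless stand-in commands instead of real `nix-build` runs.

```python
import subprocess
import sys
import time


def run_staggered(commands, delay_seconds):
    """Start each command delay_seconds after the previous one, then wait for all.

    Returns the exit codes in the same order as the commands.
    """
    procs = []
    for index, cmd in enumerate(commands):
        if index > 0:
            time.sleep(delay_seconds)  # offset the start of every later command
        procs.append(subprocess.Popen(cmd))
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Stand-ins that exit immediately; swap in the nix-build argument lists to experiment.
    stand_in = [sys.executable, "-c", "pass"]
    print(run_staggered([stand_in, stand_in], delay_seconds=0.5))
```

Trying a few different delays would show whether the failure depends on both VMs starting at nearly the same moment or simply on them overlapping at all.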
<andi-> which revision did you test this on?
<samueldr> acbdaa569f4ee387386ebe1b9e60b9f95b4ab21b
<samueldr> current tip of nixos-unstable
<andi-> ok
<andi-> will see if I can reproduce that somewhere here..
<samueldr> you might want to introduce something like builtins.currentTime into the tests to make sure you start new tests once one passes :)
<andi-> --check? :)
<samueldr> && will fail :)
<samueldr> (if you wanted to reproduce the nix-build in succession)
<andi-> I am starting with the failure case first
<samueldr> just started it on an older vintage xeon
<andi-> seeing that it fails to find the disk did you check how we provide the scratch disks to the VMs?
<samueldr> not really, I asked because I'm not sure about the best ways to debug
<andi-> the case that fails for you is now copying channels to the target VMs.. Looks better than your gist :/
aszlig has joined #nixos-dev
<samueldr> yeah, seems to follow through on that older vintage xeon too
<samueldr> right now out of a sample of three of my machines, it fails (and reliably!) on only one, in the given conditions
<andi-> But I recall having seen that error on one of my machines at least once.. I restarted the test and it was fine..
<andi-> given that we are basically just having a difference in how we run it: What's the nix daemon's version on those machines?
<andi-> I am on "nix (Nix) 2.2pre6526_9f99d624"
<samueldr> 2.2
<samueldr> /nix/store/kjjbqc6q8brqz87jil6w5hrym3di75k7-nix-2.2/bin/nix
<samueldr> also verified that /run/booted-system points to the same revision
<samueldr> (not sure a nix upgrade restarts the daemon?)
<andi-> it should
<samueldr> nixos-version -> 19.03.git.23fd139 (Koi)
<samueldr> anyway, even if it didn't that's the nix that should be in use
<andi-> Only on my work laptop that still runs 18.09 now..
<samueldr> the older xeon might not be on that revision, I'll update it just in case
jtojnar has joined #nixos-dev
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 250 seconds]
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 250 seconds]
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 258 seconds]
orivej has joined #nixos-dev
<samueldr> the older xeon powers through the build :/
<gchristensen> (as you know, I'm happy to do remote hands stuff :))
<samueldr> nix-build ./nixos/tests/installer.nix -A simpleUefiGrub -A simple # here when both are built simultaneously fails on one out of three machines
<samueldr> in my test cases, it is always done on an almost unused machine, and the machine on which it fails has more cores available than the others, more memory also :/
<gchristensen> samueldr, sphalerite: can we be certain to release this week?
<gchristensen> like, if not, okay -- but we really need more communication about what is going on
<gchristensen> it is a lot of volunteer work to organize a release -- it is okay if things are busy, or harder than expected, or not going as planned. the most important thing is to communicate the status
<gchristensen> it also doesn't need to be perfect
<averell> as long as you don't pull a MS-win10 18.09 update everything is cool :)
noonien has joined #nixos-dev
<gchristensen> :)
<samueldr> (I'm waiting for more details for the release, I thought it was good to go by today)
drakonis_ has joined #nixos-dev
drakonis has quit [Ping timeout: 244 seconds]
drakonis_ has quit [Ping timeout: 240 seconds]
<averell> which channel and package is this on? it might be possible to override the url or just update.
<averell> ergh, ww
<niksnut> btw I upgraded all my machines to 19.03 last week, very smooth upgrade :-)
<gchristensen> nice!
<rycee> Anybody around who has experience with the Emacs package generation? I'm looking for help updating them in Nixpkgs :-)
{`-`} has joined #nixos-dev
carter has quit [Read error: Connection reset by peer]
chrisaw has quit [Read error: Connection reset by peer]
chrisaw has joined #nixos-dev
vdemeester has joined #nixos-dev
carter has joined #nixos-dev
cocreature has quit [Ping timeout: 268 seconds]
cocreature has joined #nixos-dev
orivej has quit [Ping timeout: 240 seconds]
hl has quit [Read error: Connection reset by peer]
<ryantm> rycee: Last time I tried to use emacs2nix I got an error https://github.com/ttuegel/emacs2nix/issues/48
<{^_^}> ttuegel/emacs2nix#48 (by ryantm, 14 weeks ago, open): Melpa and Melpa stable updating fails with exit code 255
hl has joined #nixos-dev
<rycee> ryantm: Thanks! I'll give it a try as well but I suspect I'll get the same issue.
<sphalerite> gchristensen: samueldr: yeah, all the big bits are ready I think.
<gchristensen> yay :D
<sphalerite> Do we actually need a new nix release for 19.03?
<sphalerite> (since "Release Nix (currently only Eelco Dolstra can do that)." is in the manual)
<gchristensen> I think that would have happened >1mo ago if it were to happen
<gchristensen> so, skip for now :)
<sphalerite> Could someone with appropriate hydra access please " Change stableBranch to true and wait for channel to update. "? :)
<gchristensen> yep!
<gchristensen> it'd be really cool to have this be a PR
<gchristensen> updated!
<sphalerite> thanks!
<sphalerite> I guess I'll do the https://nixos.org/nixos/manual/index.html#at-final-release-time stuff tomorrow if I have appropriate internet
<gchristensen> yay =)
<gchristensen> I think the docs should clarify that that tag goes on the first version to be released with stableBranch set to true
<gchristensen> not just whatever the branch is at the time
<globin> sphalerite: what gchristensen is saying +1
<sphalerite> ack!
<rycee> ryantm: Well, I managed to run the generation but of course I then noticed https://github.com/NixOS/nixpkgs/pull/59153 :-)
<{^_^}> #59153 (by rasendubi, 12 hours ago, open): Emacs packages update: 2019-04-08
<samueldr> nixos:release-19.03 Too many heap sections @ ~21:00 UTC -> https://gist.github.com/samueldr/55884121098e2a568d00bce9092e0c0f
<samueldr> nixos:trunk-combined Too many heap sections @ ~16:25 UTC -> https://gist.github.com/samueldr/bf0bd9b8d0235db15c6e1f3612f74e3b
<samueldr> both requeued
<ryantm> rycee: Good to know that it's possible to update the emacs packages! I should try again.
<matthewbauer> ryantm: have you ever thought about getting nixpkgs-update to run emacs updates? it's kind of a different task, but since nixpkgs-update is already running it would be a pretty straightforward transition
<ryantm> matthewbauer: I was investigating doing that the last time I ran into that issue I referenced.