<gchristensen>
gustavderdrache: mind taking a look?
<samueldr>
btw, saying again *workaround*, as there could be issues with other backtick usage, but I believe it's unlikely nix-instantiate would be there if other commands are missing
<gchristensen>
aye
<samueldr>
ah, only nix hash-file is a backtick, so I think it's fine
<samueldr>
it's already checking the output, so it would fail with an empty string comparison
<samueldr>
gchristensen: I believe we have an approval
<gchristensen>
:partyparrot:
<gchristensen>
bad news
<samueldr>
is the code wrong?
<gchristensen>
Jan 03 03:29:11 bastion xdqqyi1si2qinqkczzssjlmbd81j6q6x-unit-script-update-nixos-19.03-start[20608]: error while executing nix-instantiate (32512).
<gchristensen>
Jan 03 03:29:11 bastion systemd[1]: update-nixos-19.03.service: Succeeded.
<samueldr>
>> 32512 = 0x7F00
<samueldr>
So it didn't die from a signal, a core dump wasn't produced, and it exited with code 127 (0x7F).
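The decoding above can be sketched in shell arithmetic. This is only an illustration of the 16-bit layout Perl uses for `$?` (low 7 bits for the signal, bit 7 for the core-dump flag, high byte for the exit code); the function name `decode_status` is made up for this example and is not part of the script under discussion.

```shell
# Decode a 16-bit wait status as laid out in Perl's $?.
# decode_status is a hypothetical helper name used only for illustration.
decode_status() {
    status=$1
    signal=$(( status & 127 ))       # low 7 bits: terminating signal, 0 if none
    core=$(( (status >> 7) & 1 ))    # bit 7: set when a core dump was produced
    code=$(( status >> 8 ))          # high byte: the exit code
    echo "signal=$signal core=$core exit=$code"
}

decode_status 32512   # the value from the journal line above
```

For 32512 this reports no signal, no core dump, and exit code 127, which matches the reading given in the channel.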
<samueldr>
what's weird is how I tested that `commandthatdoesnotexist` returned $? == -1
<samueldr>
so 127/not found may come from elsewhere
<gchristensen>
what if you do `NIX_PATH= commandthatdoesnotexist with some args`
<samueldr>
ooooof yeah, the environment variable thing changes the behaviour
<samueldr>
(still, the code is right, just changes where the error happens here)
<gustavderdrache>
$ perl -E '`foo= bar`; say $? >> 8'
<gustavderdrache>
127
<gustavderdrache>
sh: 1: bar: not found
<gustavderdrache>
it's not right - non-signal exits are in $? >> 8
<gustavderdrache>
monitoring is still failing to pick up the errors, so i'm going to politely disagree :)
<samueldr>
it at least stops, and errors out
<gustavderdrache>
yeah that's fair
<gchristensen>
that is an improvement
<samueldr>
ah, would exit truncate to the 8 lowermost bits?
<samueldr>
(thus exit 0)
<gustavderdrache>
IIRC the exit status is only 0x00-0xff
<samueldr>
yeah, that's it
<gchristensen>
nice catch!
<gustavderdrache>
and perl probably packs a bit more information into $? than you'd get from WIFEXITED and friends
<gchristensen>
I was just stepping through every stage from .service -> ExecStart -> wrapper -> wrapper -> ... -> .pl looking for somewhere we might accidentally swallow the exit
<gchristensen>
I didn't find one so I'm glad you have a theory
<gustavderdrache>
$ echo $((32512 & 255))
<gustavderdrache>
0
* gchristensen
prepares to smash the merge button
<gustavderdrache>
$ perl -E '`sh -c "exit 127"`; say $?'
<gustavderdrache>
32512
<gustavderdrache>
i'm pretty confident that's the issue
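A small sketch of why the unit reported success, assuming the script handed the raw `$?` value straight to its own exit: the parent only ever sees the low 8 bits. The helper name `seen_by_parent` is hypothetical.

```shell
# Only the low 8 bits of an exit value reach the parent process.
# seen_by_parent is a made-up name; it just models that truncation.
seen_by_parent() {
    echo $(( $1 & 255 ))
}

raw=32512                        # Perl's $? after a command exits with 127
seen_by_parent "$raw"            # raw status passed to exit: looks like success
seen_by_parent $(( raw >> 8 ))   # shifted first: the real exit code survives
```

The first call yields 0 and the second 127, so shifting (or simply exiting with any fixed non-zero value) keeps the failure visible to systemd.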
<samueldr>
though we're losing the information from those bits
<samueldr>
not sure if we should worry
<gustavderdrache>
we should - writeup incoming
<samueldr>
thanks
<gustavderdrache>
i think at this point we should just simplify the two conditionals - if all the monitoring needs is a non-zero exit, just report the info we have and ignore being fancy about what goes into the script's own exit
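One possible shape for that simplification, written in shell rather than the Perl of the real script and not taken from the actual patch: report whatever status is available and always fail with a fixed non-zero code. The name `run_or_fail` is an assumption for illustration; the message format mirrors the journal line quoted earlier.

```shell
# Hypothetical helper: run a command, report its status on stderr,
# and return a fixed non-zero code instead of forwarding the raw status.
run_or_fail() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "error while executing $1 ($status)" >&2
        return 1
    fi
}

run_or_fail true && echo "ok"
run_or_fail sh -c 'exit 127' || echo "caller saw a failure"
```

The design choice is that monitoring only needs "zero or not zero", so the detailed status goes to the log and the exit code stays small enough that it cannot be truncated to 0.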
<samueldr>
sounds alright
<gustavderdrache>
qx'sh -c "kill -31 $$"' - perl says that $? after this is 159
<gustavderdrache>
(31 for the signal number and 128 to indicate it dumped core)
<gustavderdrache>
gchristensen: how do i invoke the shiny new merge image macro?
<gchristensen>
infinisil: sorry, it is time to revert your PR and try again -- tarball is failing now
<FRidh>
gchristensen: i recall you writing sometime about tty and nix 2.2 and 2.3. Could you point me to anything related to that? Encountered a weird impurity, maybe it's related https://github.com/NixOS/nixpkgs/issues/76879
<LnL>
dies after texlive-york-thesis-3.6 for me and I definitely had memory available this time
<infinisil>
So nix-instantiate nixos/release-combined.nix -A nixpkgs.tarball ?
<gchristensen>
nix-build ./nixos/release-combined.nix -A nixpkgs.tarball
<gchristensen>
it fails at build time, not instantiation time
<infinisil>
Ah
<LnL>
yeah, the tarball build evaluates nixpkgs at build time
<FRidh>
I keep getting attribute '__propagatedImpureHostDeps' missing, at /tmp/foo/nixpkgs/pkgs/os-specific/darwin/apple-sdk/default.nix:184:36
<gchristensen>
Bisecting: 0 revisions left to test after this (roughly 1 step)
<LnL>
FRidh: where?
<FRidh>
e.g. when building the tarball with nix-shell
<FRidh>
also when evaluating the nix-env command
<FRidh>
myself
<infinisil>
Hm ` nix-env -f . --argstr system i686-linux -qa --drv-path --system-filter \*` works for me on master
<FRidh>
nope not here
<infinisil>
I'm at commit 0fb7ae83ade88abd3af3f6969796909499b2bc2a
<FRidh>
same
<infinisil>
Huh
<FRidh>
with /nix/store/f0n51a2dbz4pxlqfn02b970pac1gymv6-nix-2.3.1/bin/nix
<LnL>
FRidh: fine here, config.nix / overlays?
<infinisil>
/nix/store/994h5zvp7vcyf60m89r2vygf27rwbw9v-nix-2.3.1/bin/nix for me
<FRidh>
LnL: nope
<infinisil>
FRidh: And graham's revert fixes it?
<gchristensen>
guh my bisect was wrongly done
<FRidh>
infinisil: no it does not fix it for me
<infinisil>
FRidh: Even with the same exact nix version I can't reproduce this nix-env failure
<FRidh>
still have the error: attribute '__propagatedImpureHostDeps' missing, at /home/freddy/code/other/nixpkgs/pkgs/os-specific/darwin/apple-sdk/default.nix:184:36
<gchristensen>
even with nix-build, infinisil & FRidh?
<FRidh>
that's funny
<LnL>
yeah interesting, eval doesn't stack overflow outside of the build
<infinisil>
nix-build'ing the tarball I'm getting the same error as hydra
<gchristensen>
you can bisect between your config and what is happening inside the build :P
<LnL>
including when using a separate store
<infinisil>
Some nix.conf setting perhaps
<FRidh>
so thus far I am the only one with error: attribute '__propagatedImpureHostDeps' yet you do get the stack overflow (inside a build only) ?
<infinisil>
Yea
<infinisil>
Something screwy is going on!
<FRidh>
LnL: the issue I mentioned, it's in the AppKit override. That evaluates fine for you?
<LnL>
hold up..., let me check something
<LnL>
FRidh: this does, but those are darwin only packages nix-instantiate -A darwin.apple_sdk.frameworks --argstr system x86_64-linux --arg config '{allowUnsupportedSystem = true;}'
<FRidh>
LnL: ahh right, my bad. I did not expect this bit of the config, allowUnsupportedSystem, to play a role here, but it will then evaluate those as well
<infinisil>
gchristensen: Can we close the revert PR? It doesn't seem like that's what's causing it
<gchristensen>
sure
<infinisil>
Ohh I can finally reproduce it outside the nix-build
<infinisil>
It's the nix version from that nixpkgs
<infinisil>
`nix-build -A nix` first
<infinisil>
Then `result/bin/nix-env ...`
<infinisil>
That fails with a stack overflow
<infinisil>
Shortest command to reproduce: `$(nix-build -A nix)/bin/nix-env -f . -qa --drv-path`
<FRidh>
and messes up your shell
<infinisil>
Doesn't mess it up for me
<infinisil>
But there's something to bisect with
<gchristensen>
why not bisect with nix-build'ing -A tarball?
<LnL>
hmm in that case, was boehmgc bumped recently?
<FRidh>
it's the staging merge
<FRidh>
we had gcc9, bison, ...
<infinisil>
gchristensen: Ah yeah sure
<LnL>
yeah, sounds like we found a bug somewhere
<infinisil>
nix-env directly is a bit faster though, and I'm glad it's reproducible
<gchristensen>
same
<infinisil>
(like outside the build)
<gchristensen>
but didn't staging run tarball too?
<FRidh>
it does
<FRidh>
part of nixpkgs
<gchristensen>
so pro tip, after you bisect you should `git bisect reset` so that next time you bisect you don't start with that state
<infinisil>
Hehe
<FRidh>
so it was failing on gcc-9 branch already at 77b6c3cd06a679140fb5a44f81f904497007f333
<LnL>
same with merge/rebase, I had a bad time with a staging merge because of that once
<infinisil>
Also pro tip: You can do `git bisect reset HEAD` to not have it put you somewhere else
<gchristensen>
(that is why my first bisect's results were bad, I have 3 steps remaining)
<gchristensen>
I'm worried it is 8f729c0070ec3f78edadeaebcbd110257fe4577e
<FRidh>
yes
<LnL>
nix.override { stdenv = clangStdenv; }
<gchristensen>
oops, filled up my /nix/store with tarballs
<infinisil>
#justnixthings
<LnL>
the clang one works so we could override it to use an older compiler
<gchristensen>
okay I'm patching current master to use gcc8Stdenv for nix
<gchristensen>
^ this didn't work, I think because the expression uses callPackage internally. anyone else able to make a nicer patch than sed -e s/stdenv/gcc8Stdenv/ ?
<FRidh>
remove stdenv parameter from common?
<gchristensen>
it is unlikely that lots of software is broken by gcc9 right?
<LnL>
I think that does mean my previous snippet wouldn't work anymore tho :/
<FRidh>
but it did
<FRidh>
because of stdenv.lib.optionalAttrs stdenv.cc.isClang I think
<{^_^}>
#44196 (by nbp, 1 year ago, open): Add pkgs.overrideWithScope
<FRidh>
is there anything against using clangStdenv for now?
<FRidh>
aside from closure size
<LnL>
I'll get there, depsBuildBuild was annoying
<nbp>
clang does not work properly, as far as I recall llvm-config is not capable of looking back, we would have to re-compile it for whatever extended version of llvm we want.
<LnL>
not sure what you mean
<LnL>
but something like nix = mapStdenv gcc8Stdenv nix; would be much nicer than this mess https://git.io/JepT3
<gchristensen>
andi-: thinking more about what you said, about an alert if a tested jobset goes red
<gchristensen>
andi-: I'm thinking it might be too much? but maybe if it stayed red for a whole day?
<gchristensen>
not sure. what do you think?
<andi->
gchristensen: for release branches I believe they should just throw an alert when they fail. A failing build there should be fixed ASAP. On trunk I agree that it might be too much.
<gchristensen>
that sounds right to me
<andi->
Maybe we go with your idea of "it is broken >=24h" first and then implement the other thing?
<gchristensen>
it is pretty easy to do both
<andi->
ok
<gchristensen>
if hydra_job_failed{jobset=~"release-.*"} == 1 for 5min+, and hydra_job_failed == 1 for 24h+