<gchristensen>
on that note, any hot tips on why it is failing with /nix/store/k8p54jg8ipvnfz435mayf5bnqhw4qqap-bash-4.4-p23/bin/bash: ./config.status: No such file or directory ?
davidtwco has quit [Read error: Connection reset by peer]
thoughtpolice has quit [Ping timeout: 268 seconds]
dmj` has quit [Ping timeout: 268 seconds]
davidtwco has joined #nixos-dev
thoughtpolice has joined #nixos-dev
dmj` has joined #nixos-dev
supersandro2000 has quit [Disconnected by services]
<nh2[m]>
What's up with Hydra? 20.09 is building multiple large packages (Thunderbird, Libreoffice) on my laptop instead of fetching them from cache; according to https://hydra.nixos.org/jobset/nixos/release-20.09 since 3 days ago
<nh2[m]>
According to https://hydra.nixos.org/build/129301019 it's "Error --- hydra-queue-runner cannot connect to ‘root@a63b04eb.packethost.net’"
<gchristensen>
sounds like it needs to be restarted
<nh2[m]>
gchristensen: Can I do that myself?
<gchristensen>
maybe, if not you could ask in #nixos-infra
<nh2[m]>
gchristensen: I'm a bit at a loss btw where we track infra issues, sometimes there are notifications on top of https://status.nixos.org/, sometimes Hydra issues are pinned in the nixpkgs issue tracker, but I feel like it's pretty ad-hoc
<gchristensen>
may be a good discussion for -infra :). I can't be "the guy" these days.
<nh2[m]>
gchristensen: ok, thanks!
cole-h has joined #nixos-dev
ris has quit [Ping timeout: 240 seconds]
jonringer has joined #nixos-dev
Baughn has quit [Quit: ZNC 1.6.2+deb1 - http://znc.in]
Baughn has joined #nixos-dev
justanotheruser has quit [Ping timeout: 240 seconds]
<aristid>
idle thought: i wonder if hydra's "reproduce locally" could use a pre-existing nixpkgs checkout as a remote to speed up the download
zowoq[m] has quit [Quit: Idle for 30+ days]
FRidh has joined #nixos-dev
alp has joined #nixos-dev
spookyscarysphal is now known as sphalerite
Jackneill has quit [Read error: Connection reset by peer]
Jackneill has joined #nixos-dev
<FRidh>
can't sign in on hydra using google anymore it seems. Anyone else have that problem?
FRidh has quit [Remote host closed the connection]
<jtojnar>
FRidh: works for me now, but I had to disable the Firefox tracking protections in the past
alp has quit [Ping timeout: 264 seconds]
FRidh has joined #nixos-dev
luc65r has joined #nixos-dev
vcunat has joined #nixos-dev
vcunat has quit [Client Quit]
__monty__ has joined #nixos-dev
ris has joined #nixos-dev
nschoe has joined #nixos-dev
nschoe has quit [Ping timeout: 264 seconds]
alp has joined #nixos-dev
FRidh has quit [Remote host closed the connection]
FRidh has joined #nixos-dev
nschoe has joined #nixos-dev
alp has quit [Ping timeout: 268 seconds]
<gchristensen>
I'm trying to fix the big-parallel hydra problem. I can't build new machines because when I try to copy-closure to a builder, I get: path '/nix/store/01n3wxxw29wj2pkjqimmmjzv7pihzmd7-which-2.21.tar.gz.drv' is not a valid store path
<gchristensen>
on the remote. any suggestions on dealing with this?
orivej has quit [Ping timeout: 272 seconds]
<MichaelRaskin>
Hmmm, I guess strace-ing and posting here what command prints that error would increase your chances of getting useful ideas.
<gchristensen>
I'm not able to invest a ton of time in this today, but if anyone can help we can get aarch64 and the channels going sooner. I'm running + nix-copy-closure --use-substitutes --to root@4cf2cdd7.packethost.net /nix/store/l6zwpfnvbha57zhkqxh8inj1shpzym7h-netboot.drv ... Nix on the receiving side says: path '/nix/store/01n3wxxw29wj2pkjqimmmjzv7pihzmd7-which-2.21.tar.gz.drv' is not a valid store path
<gchristensen>
on the receiving side the .drv does exist in the store, and if I try to build it: path '/nix/store/01n3wxxw29wj2pkjqimmmjzv7pihzmd7-which-2.21.tar.gz.drv' does not exist and cannot be created
<gchristensen>
I notice /nix/store basically has no .drv's. has nix gotten rid of drv's and broken my remote building workflow?
<FRidh>
gc-keep-derivations = false ?
<LnL->
depends on how you evaluate I think
<MichaelRaskin>
I think Nix now doesn't always create drv's but you can use it with instantiate
<MichaelRaskin>
I think my workflows ended up using instantiate deep inside so I do not suffer from it as much…
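A quick way to put a number on "basically has no .drv's" is to count the derivation files at the top level of the store. This is only a minimal sketch: `count_drvs` is a hypothetical helper name, and it assumes the usual layout where `.drv` files sit directly under `/nix/store`.

```shell
#!/usr/bin/env bash
# Count .drv files directly under a store directory.
# count_drvs is a hypothetical helper, not part of Nix.
# The store directory defaults to /nix/store (an assumption about the layout).
count_drvs() {
  local store_dir="${1:-/nix/store}"
  # -maxdepth 1: only top-level store entries; -type f: .drv entries are plain files.
  # tr strips the padding some wc implementations add.
  find "$store_dir" -maxdepth 1 -type f -name '*.drv' 2>/dev/null | wc -l | tr -d ' '
}
```

Running `count_drvs` on both the sending and the receiving machine would show whether the two stores really differ in how many derivations they keep.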
<gchristensen>
skipping that flag doesn't seem to change
<gchristensen>
seems weird it is sending a second file before the error
<LnL->
yeah, looks like it's sent but not getting registered on the remote side
AlwaysLivid has joined #nixos-dev
<gchristensen>
gotta go for about 15min. the remote side is nix-3.0pre20201020_e0ca98c, the sending side is nix-2.3.7, but trying nix-3.0pre20201020_e0ca98c on the sending side doesn't make a difference. I don't have a lot of time today to give this a ton of thought, so any and all help is deeply appreciated
<LnL->
are these about the same version of nix? unstable -> stable could go through something that's not really tested
<LnL->
ah :)
Jackneill has joined #nixos-dev
<LnL->
well I just tried to copy a drv and nothing happens
<LnL->
the first drv file exists but it's not registered
<gchristensen>
I find it really frustrating to spend the little time I have to do things this weekend on figuring out how Nix has broken the features I depend on, instead of doing those things
<LnL->
yeah, the matrix is actually pretty crazy if you consider local/daemon, ssh and version permutations
<gchristensen>
yeah... but also version permutations don't matter here, and the other issue I debugged yesterday the feature was straight up deleted without realizing it
<LnL->
not the problem here but it is relevant for these kinds of issues in general
<gchristensen>
right
<gchristensen>
and yet the current testing strategy missed it, so we could be catching a lot more problems early without trying to address the massive matrix
<LnL->
like the cas signatures from a few releases back
<LnL->
gchristensen: there are some integration style tests in the nix repo but I don't know when those run
orivej has joined #nixos-dev
<gchristensen>
I've gotta run for a while longer, it'd be great if someone found where this regression happened
<MichaelRaskin>
(I am not sure if it is just the question of testing strategy; I have never got a feeling that some use-cases — like, anything but either fully local builds + cache, or maybe independently installed NixOS copies used as builders are actually considered things that matter)
<zimbatm>
Mic92: you are now both admins of the repo
<LnL->
nah the tests do check a variety of things
jonringer has joined #nixos-dev
FRidh has quit [Remote host closed the connection]
FRidh has joined #nixos-dev
alp has quit [Ping timeout: 260 seconds]
cole-h has joined #nixos-dev
evils has quit [Read error: Connection reset by peer]
evils has joined #nixos-dev
cransom has quit [Quit: WeeChat 2.7.1]
cransom has joined #nixos-dev
<Mic92>
zimbatm: thanks
rajivr has quit [Quit: Connection closed for inactivity]
justanotheruser has joined #nixos-dev
alp has joined #nixos-dev
alp has quit [Ping timeout: 260 seconds]
<das_j>
oof has anyone experienced performance issues when mass-rebuilding with `nix build`?
<das_j>
currently rebuilding 40000 packages and it is downloading deps right now, but between each download, it does nothing for a few seconds
<das_j>
(on darwin in single user mode)
<cole-h>
Maybe try seeing what it's actually doing by using `-v` until you see the desired amount of details?
<cole-h>
Might help show you that it either is doing something (just not visibly), or if it actually is stalled for a bit.
<supersandro2000>
das_j: Are you building my PR?
<das_j>
yes
<das_j>
but at the current rate, this might take 1-2 days
<das_j>
also, why would it build xf86-* packages and xmonad and so on? I doubt they work on darwin
alp has joined #nixos-dev
cole-h has quit [Ping timeout: 264 seconds]
<gchristensen>
I think I'm going insane. Why does hydra succeed in building Nix (current master) but I get "no configure script, doing nothing", and Hydra runs a configure script?
<MichaelRaskin>
Is it the same commit id?
<gchristensen>
yeah 035d0adfd8a4a20dd404cb5586cfd5414ac28b77 in both cases
<MichaelRaskin>
Also, are you using Nixpkgs package overriding src or Nix default.nix?
<gchristensen>
this is the same problem causing me to be unable to update the pinned nixUnstable
<das_j>
`-I nixpkgs-channels=/` maybe?
<das_j>
umm nixpkgs-overlays of course
<gchristensen>
I don't have any overlays
<das_j>
oof
<MichaelRaskin>
Show the exact expression you are building maybe?
<MichaelRaskin>
Like, is autoreconfHook there?
<gchristensen>
nix-build . -A defaultPackage.x86_64-linux on a clean checkout of nix at master
<MichaelRaskin>
Using nixStable as building Nix?
<gchristensen>
yeah
<MichaelRaskin>
Hmm, WorksForMe™
<MichaelRaskin>
Care to paste a session shot to pastebin? git status; nix-build command; output till no configure script ?
<MichaelRaskin>
[«works» = «reaches compilation of C++ code to .o files»]
<MichaelRaskin>
Ah, should have asked for git branch -v
<MichaelRaskin>
Ah OK git log is fine
<MichaelRaskin>
What the absurdity
<MichaelRaskin>
For the record, my Autoreconf output is exactly the same
<MichaelRaskin>
But configure gets produced
<MichaelRaskin>
The derivation file name is also exactly the same
<MichaelRaskin>
gchristensen: I assume you are not eager to give me SSH to look at that?
<gchristensen>
sudo zpool scrub tank && nix-store --verify --check-contents
<MichaelRaskin>
Well, good point…
<gchristensen>
MichaelRaskin: let's do this first, then we could tmate.io
<MichaelRaskin>
Wasn't it discussed this weekend that it is down?
<gchristensen>
seems to be up again :)
<MichaelRaskin>
Not that it matters much of course, I surely have a jumphost VPS if necessary
<gchristensen>
I should plug in while I read my entire disk twice
<MichaelRaskin>
For what it's worth, drv file has SHA256 61359ed41a0e63ed6c6d7559c07306129a7cdae9a55f4d155af86511c08c107b for me
<gchristensen>
same
<gchristensen>
"scan: scrub repaired 0B in 0 days 00:10:30 with 0 errors on Sun Nov 1 15:47:08 2020" unclear on how far along nix-daemon's check is
<gchristensen>
it is only using 30% of one core, though, so it'll probably be a while
abathur has quit [Quit: abathur]
<MichaelRaskin>
Is it SSD? Why should it take longer than scrub, 30% CPU sounds like IO-bound
<gchristensen>
nvme
<gchristensen>
the scrub is all in the kernel, daemon's does: lstat, openat, read, close, per file
<gchristensen>
it goes alphabetically through the nix store and is at /nix/store/7s so far
<MichaelRaskin>
Hmm ouch
<MichaelRaskin>
I wonder if find /nix/store would make things better or worse
<MichaelRaskin>
(by making sure the metadata is already there)
<MichaelRaskin>
(but I have some weird habits from using HDD and 32GiB RAM)
<gchristensen>
I should have garbage collected
<MichaelRaskin>
I wonder if launching it in parallel would make things better or worse
<gchristensen>
heh
<MichaelRaskin>
How far along is it now?
<MichaelRaskin>
After journalctl nothing can surprise me anymore
<gchristensen>
it made it worse
<MichaelRaskin>
I mean, it will make it worse short-term
<MichaelRaskin>
That's not a surprise
<gchristensen>
the verify spewed zillions of errors b/c of missing paths
<MichaelRaskin>
Argh
luc65r has quit [Quit: WeeChat 2.9]
<MichaelRaskin>
Indeed
<MichaelRaskin>
How much RAM do you have?
supersandro2000 has quit [Ping timeout: 240 seconds]
<gchristensen>
16g
<MichaelRaskin>
Do the already-verified paths fit in disk cache?
<gchristensen>
...maybe.
<gchristensen>
once this GC ends I'll re-verify
<gchristensen>
okay it moved on to deleting /nix/store/trash. checking again.
supersandro2000 has joined #nixos-dev
<gchristensen>
the verify is to the "l"s now, and trash is still being deleted
supersandro2000 has quit [Ping timeout: 260 seconds]
<gchristensen>
MichaelRaskin: no issue ... nix-store --verify --check-contents 0.01s user 0.02s system 0% cpu 28:43.39 total
<MichaelRaskin>
hmm
<MichaelRaskin>
would be so much easier if it found something!
<gchristensen>
yup
<gchristensen>
note: deleting the garbage is still going, and it didn't even need to read everything
<gchristensen>
92159 store paths deleted, 295296.16 MiB freed
<MichaelRaskin>
It surely doesn't care about content of the garbage!
<MichaelRaskin>
Oooh
<MichaelRaskin>
BTW, in your configuration are temporary build directories in the store or in /tmp ?
<gchristensen>
ehhhh.... is that a configurable thing?
<MichaelRaskin>
I think it at least varies across installations
<gchristensen>
good news / bad news: it still fails to build like it did before. I think mine builds in /tmp
<gchristensen>
MichaelRaskin: want to do tmate? :)
<MichaelRaskin>
Hmmm, I wonder if NIX_REMOTE= TMPDIR=/tmp/something TMP=/tmp/something would change the build location
<MichaelRaskin>
Yeah, I guess we could try that
<MichaelRaskin>
By now I see one thing that is likely the same for me and Hydra but not you
<gchristensen>
read only session: ssh ro-rSbXQr2HQNJwxXRBe2sVLJYwx@nyc1.tmate.io
<MichaelRaskin>
Connected
<gchristensen>
see PM
<MichaelRaskin>
My /tmp is ext4, and I guess the same is Hydra
<gchristensen>
it built on ike, which I think is indeed ext4 -- though many of the builders are also zfs
<MichaelRaskin>
Can you cheaply trigger a build on Hydra ZFS builder?
<MichaelRaskin>
Mostly to just bury that rabbit hole
<gchristensen>
actually, I have another ZFS rooted machine where tmp is on zfs where it succeeds
<MichaelRaskin>
Aha
<MichaelRaskin>
Hm, and nothing too new and fancy and untrustable
<MichaelRaskin>
Isn't it cute. configure is there
<gchristensen>
huh
<MichaelRaskin>
Wait, is it That /bin/sh I advised you to write to check who calls it?
<gchristensen>
it is, but /bin/sh inside the build sandbox works fine and isn't the same
<MichaelRaskin>
Also, the wrapped /bin/sh also works fine
<MichaelRaskin>
Do you remember if NIX_DEBUG is a permitted impure variable?
<gchristensen>
I don't think it is by default
<gchristensen>
whew
<MichaelRaskin>
So now I need to read nixos-20.09-small to understand what setup.sh script is in use?
<gchristensen>
heh
<MichaelRaskin>
Apparently it is different from git master
<MichaelRaskin>
But I am _pretty_ sure that -x ./configure test failed
<MichaelRaskin>
Ah right there are hooks
<MichaelRaskin>
The selected 2 lines are probably «if [[ -z "$configureScript" && -x ./configure ]]; then»
<MichaelRaskin>
Aaand the next line doesn't get executed
<MichaelRaskin>
So it looks like test -x ./configure fails
<MichaelRaskin>
And that's a kiloton later than the actual Autoreconf invocation that allegedly succeeds
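The behaviour described here can be reproduced outside a Nix build with a small stand-in. This is a simplified sketch and not the real setup.sh: only the `if [[ -z "$configureScript" && -x ./configure ]]` condition and the "no configure script, doing nothing" message come from the discussion; the function name `sketch_configure_phase` and everything else are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Simplified stand-in for the configure check quoted in the log.
# NOT the real nixpkgs setup.sh; sketch_configure_phase is a hypothetical name.
sketch_configure_phase() {
  local dir="$1"
  local configureScript="${2:-}"

  # Same shape as the quoted condition: only pick up ./configure when no
  # script was given explicitly AND the file passes the -x test.
  if [[ -z "$configureScript" && -x "$dir/configure" ]]; then
    configureScript="$dir/configure"
  fi

  if [[ -z "$configureScript" ]]; then
    # If -x fails, the phase is skipped quietly even though the file exists.
    echo "no configure script, doing nothing"
    return 0
  fi

  echo "would run $configureScript"
}
```

The point of the sketch is that a failing `-x` test and a missing file take the same branch, so the message alone does not say which of the two happened.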
<danderson>
what's the convention in NixOS for software that releases a backwards-incompatible 2.0? Create new packages and modules for it?
<danderson>
case in point, I want to use the release candidate of InfluxDB 2.0. There's a manual upgrade procedure from 1.0, but it's a significantly different beast than 1.x
<MichaelRaskin>
For DBs — I guess so…
<MichaelRaskin>
gchristensen: OOOK. So we have a file that is clearly a+x and test -x fails on it
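To separate "the permission bits are wrong" from "the bits are fine but execution is refused", one could compare the mode bits with what `-x` reports. A minimal sketch, assuming a Linux system where `-x` also reflects restrictions beyond the mode bits; `explain_exec` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Compare the file's mode bits with what the shell's -x test reports.
# explain_exec is a hypothetical helper for illustration.
explain_exec() {
  local f="$1"
  if [[ -x "$f" ]]; then
    echo "executable"
  elif [[ -n "$(find "$f" -prune -perm -111 2>/dev/null)" ]]; then
    # All three execute bits are set (a+x), yet -x still says no:
    # something other than the mode bits is refusing execution.
    echo "a+x set but -x fails (check mount options)"
  else
    echo "no execute bit"
  fi
}
```

On the file from this session, the second branch is the interesting one: it matches "clearly a+x and test -x fails on it".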
<gchristensen>
:D
<MichaelRaskin>
At that point I am inclined to ask about the exact ZFS mount options!
<MichaelRaskin>
Although I have no idea about them
<samueldr>
what's the status code of the failure?
supersandro2000 has joined #nixos-dev
<MichaelRaskin>
It's test -x
<MichaelRaskin>
Of course 1
<samueldr>
of course, you sure? :)
<samueldr>
(but likely yes)
<gchristensen>
you know ...
<MichaelRaskin>
What about have printed it, twice?
<samueldr>
I was thinking if things are real spooky it could have been 127+
<MichaelRaskin>
OK, so now you have that nice file outside nix build and it still behaves the same
<gchristensen>
so, / *is* mounted noexec, but it has been for ages
<gchristensen>
is it possible I have gone *this long* without crashing in to that?
<MichaelRaskin>
Well, maybe you never needed to run full local builds that need to write executables to build dir?
<MichaelRaskin>
Wait, that means no configure
<gchristensen>
huh.... I guess that is possible!
<MichaelRaskin>
But buildEnv is fine
<gchristensen>
right, anything that moved to $out before it was run would be fine
<gchristensen>
well shoot, thanks MichaelRaskin
<gchristensen>
x_x
<gchristensen>
at least it isn't a specific stream of cosmic rays ......
<MichaelRaskin>
I have a feeling you _can_ force build sandbox to be inside store
<gchristensen>
well I'll be.
<MichaelRaskin>
Yeah, I recommend considering this checkout busted until clean/reset/whatever
* samueldr
wonders about nix doctor check for sandbox on noexec
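A check along those lines could look roughly like the following. This is a sketch only and makes no claim about what nix doctor or nix-info actually do: `mount_is_noexec` is a hypothetical helper, and it assumes a mounts table in the `/proc/mounts` format (device, mount point, filesystem type, options, ...).

```shell
#!/usr/bin/env bash
# mount_is_noexec PATH [MOUNTS_FILE]
# Succeeds (exit 0) if the mount covering PATH carries the noexec option.
# Hypothetical helper; MOUNTS_FILE defaults to /proc/mounts (Linux).
mount_is_noexec() {
  local path="$1" mounts="${2:-/proc/mounts}"
  local dev mnt fstype opts rest
  local best="" best_opts=""

  while read -r dev mnt fstype opts rest; do
    # A mount covers the path if it is /, the path itself, or a parent directory.
    if [[ "$mnt" == "/" || "$path" == "$mnt" || "$path" == "$mnt"/* ]]; then
      # Keep the longest mount point; on ties the later entry wins.
      if (( ${#mnt} >= ${#best} )); then
        best="$mnt"
        best_opts="$opts"
      fi
    fi
  done < "$mounts"

  # Wrap in commas so "noexec" only matches as a whole option.
  [[ ",$best_opts," == *",noexec,"* ]]
}
```

Called with the directory where builds actually run, for example `mount_is_noexec "${TMPDIR:-/tmp}" && echo "build dir is on a noexec mount"`, it would have flagged this setup before any configure script silently went missing.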
<MichaelRaskin>
Oooh
<MichaelRaskin>
And nix-info, actually
<MichaelRaskin>
remaster abbreviation for stuff is cool
<MichaelRaskin>
Fortunately I do not need it enough, I do not do much git work of that type