worldofpeace changed the topic of #nixos-dev to: NixOS Development (#nixos for questions) | NixOS 20.09 Nightingale ✨ | 20.09 RMs: worldofpeace, jonringer
teto has quit [Quit: WeeChat 2.9]
<gchristensen> on that note, any hot tips on why it is failing with /nix/store/k8p54jg8ipvnfz435mayf5bnqhw4qqap-bash-4.4-p23/bin/bash: ./config.status: No such file or directory ?
davidtwco has quit [Read error: Connection reset by peer]
thoughtpolice has quit [Ping timeout: 268 seconds]
dmj` has quit [Ping timeout: 268 seconds]
davidtwco has joined #nixos-dev
thoughtpolice has joined #nixos-dev
dmj` has joined #nixos-dev
supersandro2000 has quit [Disconnected by services]
supersandro2000 has joined #nixos-dev
tolt has quit [Quit: tolt]
supersandro2000 has quit [Quit: The Lounge -]
supersandro2000 has joined #nixos-dev
<gchristensen> the amusement of hardware.firmware
<samueldr> tofu
<samueldr> hard, firm, soft
<gchristensen> :brownies:
nh2[m] has joined #nixos-dev
tolt____ has quit [Ping timeout: 264 seconds]
<nh2[m]> What's up with Hydra? 20.09 is building multiple large packages (Thunderbird, Libreoffice) on my laptop instead of fetching them from cache; according to since 3 days ago
<nh2[m]> According to it's "Error --- hydra-queue-runner cannot connect to ‘’"
<gchristensen> sounds like it needs to be restarted
<nh2[m]> gchristensen: Can I do that myself?
<gchristensen> maybe, if not you could ask in #nixos-infra
<nh2[m]> gchristensen: I'm a bit at a loss btw where we track infra issues, sometimes there are notifications on top of, sometimes Hydra issues are pinned in the nixpkgs issue tracker, but I feel like it's pretty ad-hoc
<gchristensen> may be a good discussion for -infra :). I can't be "the guy" these days.
<nh2[m]> gchristensen: ok, thanks!
cole-h has joined #nixos-dev
ris has quit [Ping timeout: 240 seconds]
jonringer has joined #nixos-dev
Baughn has quit [Quit: ZNC 1.6.2+deb1 -]
Baughn has joined #nixos-dev
justanotheruser has quit [Ping timeout: 240 seconds]
xwvvvvwx has quit [Ping timeout: 264 seconds]
xwvvvvwx has joined #nixos-dev
rajivr has joined #nixos-dev
Emantor has quit [Quit: ZNC -]
Emantor has joined #nixos-dev
LnL- has joined #nixos-dev
LnL- has quit [Changing host]
LnL- has joined #nixos-dev
LnL has quit [Ping timeout: 260 seconds]
cole-h has quit [Ping timeout: 264 seconds]
jonringer has quit [Ping timeout: 264 seconds]
<aristid> idle thought: i wonder if hydra's "reproduce locally" could use a pre-existing nixpkgs checkout as a remote to speed up the download
zowoq[m] has quit [Quit: Idle for 30+ days]
FRidh has joined #nixos-dev
alp has joined #nixos-dev
spookyscarysphal is now known as sphalerite
Jackneill has quit [Read error: Connection reset by peer]
Jackneill has joined #nixos-dev
<FRidh> can't sign in on hydra using google anymore it seems. Anyone else have that problem?
FRidh has quit [Remote host closed the connection]
<jtojnar> FRidh: works for me now, but I had to disable the Firefox tracking protections in the past
alp has quit [Ping timeout: 264 seconds]
FRidh has joined #nixos-dev
luc65r has joined #nixos-dev
vcunat has joined #nixos-dev
vcunat has quit [Client Quit]
__monty__ has joined #nixos-dev
ris has joined #nixos-dev
nschoe has joined #nixos-dev
nschoe has quit [Ping timeout: 264 seconds]
alp has joined #nixos-dev
FRidh has quit [Remote host closed the connection]
FRidh has joined #nixos-dev
nschoe has joined #nixos-dev
alp has quit [Ping timeout: 268 seconds]
<gchristensen> I'm trying to fix the big-parallel hydra problem. I can't build new machines because when I try to copy-closure to a builder, I get: path '/nix/store/01n3wxxw29wj2pkjqimmmjzv7pihzmd7-which-2.21.tar.gz.drv' is not a valid store path
<gchristensen> on the remote. any suggestions on dealing with this?
orivej has quit [Ping timeout: 272 seconds]
<MichaelRaskin> Hmmm, I guess strace-ing and posting here what command prints that error would increase your chances of getting useful ideas.
<gchristensen> I'm not able to invest a ton of time in this today, but if anyone can help we can get aarch64 and the channels going sooner. I'm running + nix-copy-closure --use-substitutes --to /nix/store/l6zwpfnvbha57zhkqxh8inj1shpzym7h-netboot.drv ... Nix on the receiving side says: path '/nix/store/01n3wxxw29wj2pkjqimmmjzv7pihzmd7-which-2.21.tar.gz.drv' is not a valid store path
<gchristensen> on the receiving side the .drv does exist in the store, and if I try to build it: path '/nix/store/01n3wxxw29wj2pkjqimmmjzv7pihzmd7-which-2.21.tar.gz.drv' does not exist and cannot be created
<gchristensen> I notice /nix/store basically has no .drv's. has nix gotten rid of drv's and broken my remote building workflow?
<FRidh> gc-keep-derivations = false ?
<LnL-> depends on how you evaluate I think
<MichaelRaskin> I think Nix now doesn't always create drv's but you can use it with instantiate
<MichaelRaskin> I think my workflows ended up using instantiate deep inside so I do not suffer from it as much…
<gchristensen> well, so this failure is on sending the very first drv:
Jackneill has quit [Ping timeout: 272 seconds]
<MichaelRaskin> Does drv exist in store on either side?
<gchristensen> the sending side has the drv
<MichaelRaskin> Last time I did cross-machine moves, direct nix-store --export helped me
<gchristensen> I mean, sure
<MichaelRaskin> (I also wanted to avoid direct machine-to-machine SSH, which made --export preferable, though)
<gchristensen> this is how I've built these machines' config for over a year
<MichaelRaskin> Heh
<LnL-> what does --use-substitutes do when copying paths?
<gchristensen> the receiving side fetches output paths from the cache as often as possible
<LnL-> maybe that results in the output being substituted but not the drv itself copied?
<gchristensen> skipping that flag doesn't seem to change anything
<gchristensen> seems weird it is sending a second file before the error
<LnL-> yeah, looks like it's sent but not getting registered on the remote side
AlwaysLivid has joined #nixos-dev
<gchristensen> gotta go for about 15min. the remote side is nix-3.0pre20201020_e0ca98c, the sending side is nix-2.3.7, but trying nix-3.0pre20201020_e0ca98c on the sending side doesn't make a difference. I don't have a lot of time today to give this a ton of thought, so any and all help is deeply appreciated
<LnL-> are these about the same version of nix? unstable -> stable could go through something that's not really tested
<LnL-> ah :)
Jackneill has joined #nixos-dev
<LnL-> well I just tried to copy a drv and nothing happens
<LnL-> the first drv file exists but it's not registered
nschoe has quit [Quit: - Chat comfortably. Anywhere.]
tv has quit [Read error: Connection reset by peer]
tv has joined #nixos-dev
<gchristensen> LnL-: hrm :/
<gchristensen> I guess the thing to do is change the builders to use nix stable and manually roll back a builder so we can deploy that
<gchristensen> any other options?
alp has joined #nixos-dev
<LnL-> not sure, maybe ssh-ng?
<gchristensen> I think I might open a bug about Nix's overall testing strategy
<{^_^}> nix#4210 (by grahamc, 59 seconds ago, open): Nix can't receive derivations
<gchristensen> I find it really frustrating to spend the little time I have to do things this weekend on figuring out how Nix has broken the features I depend on, instead of doing those things
<LnL-> yeah, the matrix is actually pretty crazy if you consider local/daemon, ssh and version permutations
<gchristensen> yeah... but also version permutations don't matter here, and the other issue I debugged yesterday the feature was straight up deleted without realizing it
<LnL-> not the problem here but it is relevant for these kinds of issues in general
<gchristensen> right
<gchristensen> and yet the current testing strategy missed it, so we could be catching a lot more problems early without trying to address the massive matrix
<LnL-> like the cas signatures from a few releases back
<Mic92> zimbatm: gchristensen domenkozar[m] can one of you add psub to this team and give him enough rights so he can add secrets to
<LnL-> gchristensen: there are some integration style tests in the nix repo but I don't know when those run
orivej has joined #nixos-dev
<gchristensen> I've gotta run for a while longer, it'd be great if someone found where this regression happened
<MichaelRaskin> (I am not sure if it is just the question of testing strategy; I have never got a feeling that some use-cases — like, anything but either fully local builds + cache, or maybe independently installed NixOS copies used as builders are actually considered things that matter)
<zimbatm> Mic92: you are now both admins of the repo
<LnL-> nah the tests do check a variety of things
jonringer has joined #nixos-dev
FRidh has quit [Remote host closed the connection]
FRidh has joined #nixos-dev
alp has quit [Ping timeout: 260 seconds]
cole-h has joined #nixos-dev
evils has quit [Read error: Connection reset by peer]
evils has joined #nixos-dev
cransom has quit [Quit: WeeChat 2.7.1]
cransom has joined #nixos-dev
<Mic92> zimbatm: thanks
rajivr has quit [Quit: Connection closed for inactivity]
justanotheruser has joined #nixos-dev
alp has joined #nixos-dev
alp has quit [Ping timeout: 260 seconds]
<das_j> oof has anyone experienced performance issues when mass-rebuilding with `nix build`?
<das_j> currently rebuilding 40000 packages and it is downloading deps right now, but between each download, it does nothing for a few seconds
<das_j> (on darwin in single user mode)
<cole-h> Maybe try seeing what it's actually doing by using `-v` until you see the desired amount of details?
<cole-h> Might help show you that it either is doing something (just not visibly), or if it actually is stalled for a bit.
<supersandro2000> das_j: Are you building my PR?
<das_j> yes
<das_j> but at the current rate, this might take 1-2 days
<das_j> also, why would it build xf86-* packages and xmonad and so on? I doubt they work on darwin
alp has joined #nixos-dev
cole-h has quit [Ping timeout: 264 seconds]
<gchristensen> I think I'm going insane. Why does hydra succeed in building Nix (current master) but I get "no configure script, doing nothing", and Hydra runs a configure script?
<MichaelRaskin> Is it the same commit id?
<gchristensen> yeah 035d0adfd8a4a20dd404cb5586cfd5414ac28b77 in both cases
<MichaelRaskin> Also, are you using Nixpkgs package overriding src or Nix default.nix?
<gchristensen> this is the same problem causing me to be unable to update the pinned nixUnstable
<das_j> `-I nixpkgs-channels=/` maybe?
<das_j> umm nixpkgs-overlays of course
<gchristensen> I don't have any overlays
<das_j> oof
<MichaelRaskin> Show the exact expression you are building maybe?
<MichaelRaskin> Like, is autoreconfHook there?
<gchristensen> nix-build . -A defaultPackage.x86_64-linux on a clean checkout of nix at master
<MichaelRaskin> Using nixStable as building Nix?
<gchristensen> yeah
<MichaelRaskin> Hmm, WorksForMe™
<MichaelRaskin> Care to paste a session shot to pastebin? git status; nix-build command; output till no configure script ?
<MichaelRaskin> [«works» = «reaches compilation of C++ code to .o files»]
<MichaelRaskin> Ah, should have asked for git branch -v
<MichaelRaskin> Ah OK git log is fine
<MichaelRaskin> What the absurdity
<MichaelRaskin> For the record, my Autoreconf output is exactly the same
<MichaelRaskin> But configure gets produced
<MichaelRaskin> The derivation file name is also exactly the same
<MichaelRaskin> gchristensen: I assume you are not eager to give me SSH to look at that?
<gchristensen> sudo zpool scrub tank && nix-store --verify --check-contents
<MichaelRaskin> Well, good point…
<gchristensen> MichaelRaskin: let's do this first, then we could
<MichaelRaskin> Wasn't it discussed this weekend that it is down?
<gchristensen> seems to be up again :)
<MichaelRaskin> Not that it matters much of course, I surely have a jumphost VPS if necessary
<gchristensen> I should plug in while I read my entire disk twice
<MichaelRaskin> For what it's worth, drv file has SHA256 61359ed41a0e63ed6c6d7559c07306129a7cdae9a55f4d155af86511c08c107b for me
<gchristensen> same
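Comparing a digest of the .drv file on both machines, as done above, is a quick way to rule out a differing build input. A minimal Python sketch of that comparison follows; the helper name `sha256_of_file` is hypothetical, and it is only an assumption that the digest quoted above was a plain SHA-256 over the file contents (as `sha256sum` would print) rather than some other encoding.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large store paths do not have
    to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running it against the same store path on each machine and comparing the two strings is enough; if they match, the derivation itself is identical and the difference has to be in the build environment.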
<gchristensen> "scan: scrub repaired 0B in 0 days 00:10:30 with 0 errors on Sun Nov 1 15:47:08 2020" unclear on how far along nix-daemon's check is
<gchristensen> it is only using 30% of one core, though, so it'll probably be a while
abathur has quit [Quit: abathur]
<MichaelRaskin> Is it SSD? Why should it take longer than scrub, 30% CPU sounds like IO-bound
<gchristensen> nvme
<gchristensen> the scrub is all in the kernel, daemon's does: lstat, openat, read, close, per file
<gchristensen> it goes alphabetically through the nix store and is at /nix/store/7s so far
<MichaelRaskin> Hmm ouch
<MichaelRaskin> I wonder if find /nix/store would make things better or worse
<MichaelRaskin> (by making sure the metadata is already there)
<MichaelRaskin> (but I have some weird habits from using HDD and 32GiB RAM)
<gchristensen> I should have garbage collected
<MichaelRaskin> I wonder if launching it in parallel would make things better or worse
<gchristensen> heh
<MichaelRaskin> How far along is it now?
<MichaelRaskin> After journalctl nothing can surprise me anymore
<gchristensen> it made it worse
<MichaelRaskin> I mean, it will make it worse short-term
<MichaelRaskin> That's not a surprise
<gchristensen> the verify spewed zillions of errors b/c of missing paths
<MichaelRaskin> Argh
luc65r has quit [Quit: WeeChat 2.9]
<MichaelRaskin> Indeed
<MichaelRaskin> How much RAM do you have?
supersandro2000 has quit [Ping timeout: 240 seconds]
<gchristensen> 16g
<MichaelRaskin> Do the already-verified paths fit in disk cache?
<gchristensen> ...maybe.
<gchristensen> once this GC ends I'll re-verify
<gchristensen> okay it moved on to deleting /nix/store/trash. checking again.
supersandro2000 has joined #nixos-dev
<gchristensen> the verify is to the "l"s now, and trash is still being deleted
supersandro2000 has quit [Ping timeout: 260 seconds]
<gchristensen> MichaelRaskin: no issue ... nix-store --verify --check-contents 0.01s user 0.02s system 0% cpu 28:43.39 total
<MichaelRaskin> hmm
<MichaelRaskin> would be so much easier if it found something!
<gchristensen> yup
<gchristensen> note: deleting the garbage is still going, and it didn't even need to read everything
<gchristensen> 92159 store paths deleted, 295296.16 MiB freed
<MichaelRaskin> It surely doesn't care about content of the garbage!
<MichaelRaskin> Oooh
<MichaelRaskin> BTW, in your configuration are temporary build directories in the store or in /tmp ?
<gchristensen> ehhhh.... is that a configurable thing?
<MichaelRaskin> I think it at least varies across installations
<gchristensen> good news / bad news: it still fails to build like it did before. I think mine builds in /tmp
<gchristensen> MichaelRaskin: want to do tmate? :)
<MichaelRaskin> Hmmm, I wonder if NIX_REMOTE= TMPDIR=/tmp/something TMP=/tmp/something would change the build location
<MichaelRaskin> Yeah, I guess we could try that
<MichaelRaskin> By now I see one thing that is likely the same for me and Hydra but not you
<gchristensen> read only session: ssh
<MichaelRaskin> Connected
<gchristensen> see PM
<MichaelRaskin> My /tmp is ext4, and I guess the same is Hydra
<gchristensen> it built on ike, which I think is indeed ext4 -- though many of the builders are also zfs
<MichaelRaskin> Can you cheaply trigger a build on Hydra ZFS builder?
<MichaelRaskin> Mostly to just bury that rabbit hole
<gchristensen> actually, I have another ZFS rooted machine where tmp is on zfs where it succeeds
<MichaelRaskin> Aha
<MichaelRaskin> Hm, and nothing too new and fancy and untrustable
<MichaelRaskin> Isn't it cute. configure is there
<gchristensen> huh
<MichaelRaskin> Wait, is it That /bin/sh I advised you to write to check who calls it?
<gchristensen> it is, but /bin/sh inside the build sandbox works fine and isn't the same
<MichaelRaskin> Also, the wrapped /bin/sh also works fine
<MichaelRaskin> Do you remember if NIX_DEBUG is a permitted impure variable?
<gchristensen> I don't think it is by default
<gchristensen> whew
<MichaelRaskin> So now I need to read nixos-20.09-small to understand what script is in use?
<gchristensen> heh
<MichaelRaskin> Apparently it is different from git master
<MichaelRaskin> But I am _pretty_ sure that -x ./configure test failed
<MichaelRaskin> Ah right there are hooks
<MichaelRaskin> The selected 2 lines are probably «if [[ -z "$configureScript" && -x ./configure ]]; then»
<MichaelRaskin> Aaand the next line doesn't get executed
<MichaelRaskin> So it looks like test -x ./configure fails
<MichaelRaskin> And that's a kiloton later than the actual Autoreconf invocation that allegedly succeeds
<danderson> what's the convention in NixOS for software that releases a backwards-incompatible 2.0? Create new packages and modules for it?
<danderson> case in point, I want to use the release candidate of InfluxDB 2.0. There's a manual upgrade procedure from 1.0, but it's a significantly different beast than 1.x
<MichaelRaskin> For DBs — I guess so…
<MichaelRaskin> gchristensen: OOOK. So we have a file that is clearly a+x and test -x fails on it
<gchristensen> :D
<MichaelRaskin> At this point I am inclined to ask about the exact ZFS mount options!
<MichaelRaskin> Although I have no idea about them
<samueldr> what's the status code of the failure?
supersandro2000 has joined #nixos-dev
<MichaelRaskin> It's test -x
<MichaelRaskin> Of course 1
<samueldr> of course, you sure? :)
<samueldr> (but likely yes)
<gchristensen> you know ...
<MichaelRaskin> What about having printed it, twice?
<samueldr> I was thinking if things are real spooky it could have been 127+
<MichaelRaskin> OK, so now you have that nice file outside nix build and it still behaves the same
<gchristensen> so, / *is* mounted noexec, but it has been for ages
<gchristensen> is it possible I have gone *this long* without crashing into that?
<MichaelRaskin> Well, maybe you never needed to run full local builds that need to write executables to build dir?
<MichaelRaskin> Wait, that means no configure
<gchristensen> huh.... I guess that is possible!
<MichaelRaskin> But buildEnv is fine
<gchristensen> right, anything that moved to $out before it was run would be fine
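The symptom diagnosed above (a file that is clearly a+x, yet `test -x` fails on it) can be reproduced outside of a Nix build. The Python sketch below is a rough illustration, not what stdenv does: the function name `describe_executability` is hypothetical, and it relies on Linux behaviour where an execute-permission check on a regular file is refused when the filesystem is mounted noexec, even if the mode bits allow it.

```python
import os
import stat


def describe_executability(path):
    """Compare a file's permission bits with the kernel's access check.

    On a noexec mount the two can disagree: the mode bits say the file
    is executable, while the access check (which is roughly what a
    shell's `test -x` relies on) says it is not.
    """
    mode = os.stat(path).st_mode
    any_x_bit = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return {
        # True if any user/group/other execute bit is set (what `ls -l` shows).
        "mode_has_x": bool(mode & any_x_bit),
        # True only if the kernel would let this process execute the file.
        "access_x": os.access(path, os.X_OK),
    }
```

A result of `mode_has_x` true with `access_x` false on a freshly generated `./configure` points at the mount options of the build directory rather than at the file itself.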
<gchristensen> well shoot, thanks MichaelRaskin
<gchristensen> x_x
<gchristensen> at least it isn't a specific stream of cosmic rays ......
<MichaelRaskin> I have a feeling you _can_ force build sandbox to be inside store
<gchristensen> well I'll be.
<MichaelRaskin> Yeah, I recommend considering this checkout busted until clean/reset/whatever
* samueldr wonders about nix doctor check for sandbox on noexec
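A check of the kind samueldr is wondering about could look roughly like the following Python sketch. It is purely illustrative and not how `nix doctor` or `nix-info` work: the function names `mount_options_for` and `is_noexec` are hypothetical, the input is assumed to be text in the `/proc/mounts` format, and mount points containing escaped spaces are not handled.

```python
import os


def mount_options_for(path, mounts_text):
    """Return the option list of the mount that contains `path`.

    `mounts_text` is expected to use the /proc/mounts layout:
    device, mount point, filesystem type, options, dump, pass.
    The longest matching mount point wins; for identical mount points
    the later line wins, since it shadows the earlier mount.
    """
    path = os.path.abspath(path)
    best_point, best_opts = None, []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        point, opts = fields[1], fields[3].split(",")
        # Match the mount point itself or anything below it, so that
        # "/tmp" does not accidentally match "/tmpfoo".
        inside = path == point or path.startswith(point.rstrip("/") + "/")
        if inside and (best_point is None or len(point) >= len(best_point)):
            best_point, best_opts = point, opts
    return best_opts


def is_noexec(path, mounts_text):
    """True if the filesystem holding `path` is mounted with noexec."""
    return "noexec" in mount_options_for(path, mounts_text)
```

On a Linux machine it could be called as `is_noexec("/tmp", open("/proc/mounts").read())`, passing whichever directory the builds actually run in.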
<MichaelRaskin> Oooh
<MichaelRaskin> And nix-info, actually
<MichaelRaskin> remaster abbreviation for stuff is cool
<MichaelRaskin> Fortunately I do not need it enough, I do not do much git work of that type
<gchristensen> thank you so much, MichaelRaskin!
<MichaelRaskin> You are welcome
<MichaelRaskin> Separately wondering whether this is worth a check in
<MichaelRaskin> (upstream creating a non-executable configure file is also possible)
__monty__ has quit [Quit: leaving]
FRidh has quit [Ping timeout: 246 seconds]
supersandro2000 has quit [Ping timeout: 240 seconds]
supersandro2000 has joined #nixos-dev
MichaelRaskin has quit [Quit: MichaelRaskin]