<orivej>
ekleog: you may try e.g. "sysdig evt.type=write" (with "programs.sysdig.enable = true;") to see what's going on.
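A minimal sketch of orivej's suggestion, assuming a standard NixOS configuration.nix (sysdig ships a kernel module, as ekleog notes below):

    # configuration.nix
    programs.sysdig.enable = true;  # builds and loads the sysdig kernel module

    # then, as root, watch every write() on the system:
    sysdig evt.type=write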
<ekleog>
ok, so actually it looks like journald is the source of all this spam (with 80% certainty): I just ran strace -p $(pidof systemd-journald), and a *lot* of information strolls by, much more than 33KB in 10s -- after thinking about it, since those 2KB were computed by du -hs /var/log/journald, journald's log rotation must have limited the increase in folder size... (also, sysdig requiring a kernel module doesn't make me want to try it without much further investigation, which won't happen any time soon)
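For reference, journald can report its own disk usage, and rotation is bounded via journald.conf; a hedged sketch (the 100M cap is illustrative, and on NixOS it would go through services.journald.extraConfig):

    journalctl --disk-usage        # total size of active + archived journals

    # /etc/systemd/journald.conf -- cap the journal so rotation kicks in
    [Journal]
    SystemMaxUse=100M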
<ekleog>
hmm, are you actually sure about nix not storing substituted dependencies? I've got as many .drvs for firefox-unwrapped as I have firefox-unwrapped outputs in my nix store, and I think I'd have noticed if I had actually rebuilt firefox every single time
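Instantiation writes .drv files into the store even when the outputs themselves are substituted from a binary cache, which would explain seeing both; a quick way to inspect this (paths illustrative):

    ls -d /nix/store/*firefox-unwrapped*    # outputs and .drv files side by side
    # map an output back to the .drv it came from:
    nix-store -q --deriver /nix/store/<hash>-firefox-unwrapped-57.0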
<vcunat>
apparently some closure size increased in staging, and that broke one installer test (the system no longer fits) https://hydra.nixos.org/build/64709761
taktoa has joined #nixos-dev
ma27 has joined #nixos-dev
ma27 has quit [(Client Quit)]
FRidh has joined #nixos-dev
<vcunat>
well, pushed 474c1ce79 for that, at least
phreedom has quit [(Quit: No Ping reply in 180 seconds.)]
phreedom has joined #nixos-dev
zraexy has quit [(Ping timeout: 255 seconds)]
zraexy has joined #nixos-dev
<orivej>
vcunat: looking at this closure size graph it seems almost impossible for the closure size spike to be caused by anything other than fwupd being enabled by default:
<vcunat>
it has a nice closure: two pythons, two perls, two gtk+...
<orivej>
if the nixos installer test is the most obvious way to notice such things, and it is good to notice them early, maybe you should revert 474c1ce79?
<vcunat>
(meaning fwupd)
<orivej>
:)
<vcunat>
I'm considering that
<vcunat>
I think I would prefer to separate such tests into different jobs
<vcunat>
when you look at the red/green/... icons on Hydra, it's only confusing
<vcunat>
that installation with SW RAID breaks
<vcunat>
when it's "only" a closure size increase
<vcunat>
ping niksnut for that, as he's sensitive to closure blowups
<vcunat>
It might be just a single job that measures closures of various systems and packages (individually) and compares them to some hardcoded thresholds to either succeed or print some informative warning.
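A sketch of the measurement such a job could make, using plain nix-store (the 2 GiB threshold is an illustrative placeholder):

    # total closure size in bytes of a built system or package at ./result
    size=$(nix-store -qR ./result | xargs nix-store -q --size | awk '{s+=$1} END {print s}')
    threshold=$((2 * 1024 * 1024 * 1024))   # hypothetical limit
    [ "$size" -le "$threshold" ] || echo "warning: closure grew to $size bytes"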
<vcunat>
And add some changes to make it easy to diff two closures in terms of sizes, e.g. a name-sorted list from nix-store -qR. (And then you can e.g. use `nix why-depends` if it's something new.)
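A sketch of that diff, assuming two system closures at ./result-old and ./result-new:

    # name-sorted closure listings, hashes stripped so names line up
    nix-store -qR ./result-old | sed 's|^/nix/store/[a-z0-9]*-||' | sort > old.txt
    nix-store -qR ./result-new | sed 's|^/nix/store/[a-z0-9]*-||' | sort > new.txt
    diff old.txt new.txt

    # and for anything new (nix >= 1.12/2.0):
    nix why-depends ./result-new /nix/store/<hash>-fwupd-<version>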
taktoa has quit [(Remote host closed the connection)]
_ts_ has joined #nixos-dev
ris has joined #nixos-dev
<orivej>
are there any Hydra jobsets for "staging-17.09", or should I commit a mass rebuild straight to "release-17.09"?
<FRidh>
orivej: there is no jobset for staging-17.09. That one was primarily used before the release. You can indeed push directly to release-17.09.
<vcunat>
+1
<vcunat>
:-) the four-headed merge
<vcunat>
Thanks for keeping the first-parent line on master.
<gchristensen>
can IFD be turned off in nix 1.11?
<vcunat>
gchristensen: what is IFD?
<gchristensen>
ack, Import From Derivation, but what I meant was building during evaluation
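For the record, later Nix releases (2.x) grew a switch for exactly this; a hedged sketch, not available in 1.11:

    # nix.conf: forbid building during evaluation
    allow-import-from-derivation = false

    # or per invocation:
    nix-instantiate --option allow-import-from-derivation false '<nixpkgs>' -A hello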
<vcunat>
FRidh, orivej: so we leave the mass rebuild on master?
<vcunat>
Estimating rebuild amount by counting changed Hydra jobs.
<vcunat>
12337 x86_64-darwin
<vcunat>
18846 x86_64-linux
<vcunat>
I think it would be more practical to revert it for now, for anyone developing against master.
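One hedged way to estimate such a rebuild locally is to diff output paths between two nixpkgs commits:

    git checkout master^ && nix-env -f . -qaP --out-path > before.txt
    git checkout master  && nix-env -f . -qaP --out-path > after.txt
    comm -13 <(sort before.txt) <(sort after.txt) | wc -l   # changed output paths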
<FRidh>
vcunat: isn't that mostly desktop environments, long builds, and the long tail of "insignificant" packages?
<vcunat>
It's almost everything.
<orivej>
but some of this stuff has already been built in the staging jobset. could you start a nixpkgs/trunk evaluation to see what is left and how much is broken?
<FRidh>
some of it has been built in the python-unstable branch, which was based on staging
<vcunat>
ah, right, bad diff
<aszlig>
i have the revert commits ready, should i push?
<adisbladis>
vcunat: What ARMs would you like to have accumulated?
<aszlig>
vcunat: sure
<aszlig>
but i still think this is a race condition
<vcunat>
I don't have any aarch64, so borg would be a nice way to test stuff in there as well.
<vcunat>
aszlig: what kind of race?
<aszlig>
vcunat: the terminal not getting the sendKeys()
<aszlig>
so i think we should make that more robust instead
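A sketch of that kind of robustness with the (then Perl-based) NixOS test driver: wait for the terminal to actually be ready instead of typing after a fixed delay. Method names are from the test driver; the prompt is illustrative:

    # block until VT 1 shows a login prompt, then type
    $machine->waitUntilTTYMatches(1, "login: ");
    $machine->sendKeys("root\n");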
<vcunat>
aszlig: so something other than not getting enough CPU for 60s
<vcunat>
(and thus not getting them)
<vcunat>
I wonder if there's a performance penalty when you virtualize multiple machines at once on a single CPU...
<vcunat>
(or perhaps lack of some other resources)
<aszlig>
vcunat: ah, sorry... that timeout is for reader.ready
<aszlig>
vcunat: so that's not the kind of race condition i was having in mind
<aszlig>
vcunat: so does it always happen with the xterm tests or the VT ones too?
<gchristensen>
vcunat: I'm working on getting a Type 2A for people in the community to help with the aarch64 effort; we can run a borg there
<vcunat>
aszlig: I don't know
<vcunat>
gchristensen: a separate one?
<vcunat>
wouldn't it be better shared
<vcunat>
if it's for the PRs, it would be best if the binaries from testing were then used for binary cache contents directly...
<aszlig>
vcunat: okay, i've got an idea, one second...
<gchristensen>
yes, a separate one
<gchristensen>
I don't want to mix that infrastructure, and ideally we'd hand out borg access more liberally than commit access. we don't want to be too liberal in letting things access the hydra build machines
<gchristensen>
since they're a fairly critical component of the chain of trust of the nix cache
<aszlig>
vcunat: don't merge #32020 yet, please
<vcunat>
OK. I wasn't going to.
<vcunat>
gchristensen: but 2A seems really overkill just to test PRs
<gchristensen>
it is also for people in the community to help with the aarch64 effort
<vcunat>
even so, 96 cores... and 20 Gbit network
<vcunat>
We don't need to worry until we get it, I guess.
<gchristensen>
why worry? they want us to have the tools we need to get support
<vcunat>
But I would expect there's some relatively secure way to "split" it into two machines.
<vcunat>
That would be ideal, as parts unused by community would get utilized by Hydra directly.
<vcunat>
You don't need to hand out root access, most likely, too.
<gchristensen>
I'd rather not go down this complicated route, it doesn't seem like a prudent use of time
<vcunat>
Right, probably not.
<vcunat>
I mean, considering relative priorities.
<gchristensen>
right
<gchristensen>
I'm confused about that failure to build, I wish I had more log output.
<gchristensen>
maybe something went over the 30min build timeout
<aszlig>
that way we should get a screenshot whenever a timeout occurs
<aszlig>
and the test build will always succeed with an output path
<aszlig>
vcunat: so in order to debug this i'd suggest making a hydra jobset and i'll try to regularly push random whitespace changes to the keymap test to that branch so we have multiple evaluations
<vcunat>
aszlig: I don't want to complicate this with that mass rebuild
<vcunat>
I can point it to your branch, but you'd better pick the commit atop 2f1a818d00f957f3102c0b412864c63b6e3e7447
<gchristensen>
I would 100% do that if this machine was easy to replace :P
<gchristensen>
(read: netboot)
<aszlig>
hm?
<aszlig>
what exactly?
<gchristensen>
libeatmydata
<Dezgeg>
well the nix store isn't power-fail safe anyway so almost no harm x)
<Dezgeg>
though with that even nix-store --repair won't work
<aszlig>
gchristensen: why? the builds are going to be transferred off the build machine anyway, so no harm
<gchristensen>
I need to be able to safely update the machine
<aszlig>
hm...
<aszlig>
run a second nix-daemon?
<aszlig>
along with a separate store from /nix/store
<aszlig>
nix 1.12 should make that quite easy
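A sketch of the 1.12/2.0-era store selection aszlig refers to: point a build at an alternate store root so /nix/store itself stays untouched (path illustrative):

    nix build --store 'local?root=/mnt/scratch' -f '<nixpkgs>' hello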
<aszlig>
but even when not using nix 1.12, you can still use containers
<aszlig>
... which of course shouldn't share the store... hmmm...
<gchristensen>
ehh, once we get a bit better aarch64 support it'll be easy to make this netboot, I think; then we can just go libeatmydata
<aszlig>
or that :-)
<clever>
Dezgeg: i think that as long as the order of data writes, and flushes to db.sqlite, are preserved, it is power-fail safe
<clever>
but the defaults for ext4 don't preserve that
<clever>
i've seen files truncated after an improper shutdown
<clever>
zfs seems like it will obey those rules much much better
<clever>
i've also noticed that /nix/var/nix/binary-cache-v3.sqlite has sync mode disabled for speed, which can lead to corruption on an improper shutdown
<clever>
nix can safely regenerate it if deleted, but it doesn't actually handle the corruption
<clever>
so i had to spend a few hours debugging that, and then delete the db
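Given that the cache regenerates on demand, the recovery clever describes boils down to (path from the message above):

    # the substituter cache is disposable; nix rebuilds it when needed
    sudo rm /nix/var/nix/binary-cache-v3.sqlite*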
<Dezgeg>
well, it's nix that's not calling fsync() on store paths
<clever>
yeah, but i would still expect zfs to preserve, in order, non-synced writes that happen before a synced db.sqlite write
<Dezgeg>
why?
<clever>
that's just the design feel i get from zfs, but i see your point: the fsync() would lag more because of the other data...
<Dezgeg>
that would be a massive performance killer
<clever>
but what about when you close() a handle? doesn't that imply a sync?
<Dezgeg>
no
<clever>
ah, the man page agrees with you
<clever>
2017-11-25 12:36:31 < DeHackEd> the file written normally will be in an unkonwn state, but ZFS does guarantee that the sequence of filesystem syscalls will have been performed in order
<clever>
from #zfs
<clever>
and a normal sync() would ensure everything goes to disk
<clever>
so in theory, if nix did a sync() against the /nix/store filesystem, all store paths would persist, and it would be safe to mark them as good in db.sqlite
<Dezgeg>
well yes, that would work on any filesystem
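coreutils can issue that filesystem-wide flush from the shell; a sketch of clever's idea:

    # syncfs(2) on the filesystem containing /nix/store: flush its pending
    # writes before registering the paths as valid in db.sqlite
    sync --file-system /nix/store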
<simpson>
Hm, with fetchgit, is there a way to pass the --recursive flag to git? Looking at packaging libfirm using these instructions: https://pp.ipd.kit.edu/firm/Download.html
<clever>
simpson: i think you want fetchSubmodules = true;
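As a Nix snippet, what clever suggests would look roughly like this (url, rev, and hash are illustrative placeholders):

    src = fetchgit {
      url = "https://pp.ipd.kit.edu/git/libfirm.git";  # illustrative
      rev = "...";
      sha256 = "...";
      fetchSubmodules = true;  # equivalent of git clone --recursive
    };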
<gchristensen>
"Parallel mksquashfs: Using 96 processors"
zraexy has quit [(Ping timeout: 260 seconds)]
<ma27>
gchristensen: do we actually receive sponsorship for these machines (because we're such an awesome distro :p) or is this completely paid for by the NixOS Foundation?
<gchristensen>
Packet.net provides one Type 2 and one Type 2A for Hydra, and soon one more Type 2A to give the community an aarch64 system to help improve our aarch64 support