<clever>
it happened exactly once with the echoes around the mke2fs
<clever>
Infinisil: each time, i narrowed in closer on it, until i added the strace, then it never happened again
<clever>
Infinisil: 2 or 3 times
<clever>
aristid: my first thought is builtins.map and builtins.elem, test each to see if it exists
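a rough sketch of that kind of feature test (the filter-based fallback is purely illustrative, not something from this conversation):

  let
    # probe for the builtin with the `?` operator, fall back to a hand-rolled version if missing
    elem =
      if builtins ? elem
      then builtins.elem
      else (x: xs: builtins.length (builtins.filter (y: y == x) xs) != 0);
  in elem 2 [ 1 2 3 ]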
<clever>
gchristensen: just normal browsing activity
<clever>
gchristensen: whats weird, is that i wasnt even taxing the system that hard when i reproduced it the first few times
<clever>
gchristensen: test took forever to start with 105MB/sec going to the drives, but it can still pass
<clever>
plus zfs snapshots saving every byte they make, lol
<clever>
ah, 1gig per worker, with 20 workers
<clever>
qmm: looks like it should just work, what is the error?
<clever>
gchristensen: i wonder, when does stress delete its temp files, lol
<clever>
amd/root 20G 8.2G 12G 42% /
<clever>
error: writing to file: No space left on device
<clever>
but if i make -j3, on an 8 core machine, over half is going to waste
<clever>
so even with 16gig in my machine, i can only handle ~3 jobs in parallel
<clever>
and also, some ghc compiles need 5gig of ram
<clever>
yeah
<clever>
so it takes 12 hours to run
<clever>
and make tries to make it share with 3 gcc's
<clever>
gchristensen: for example, i want to make -j4 on my rpi, to use all 4 cores, but there is a single step in the libc locales that needs all the ram
<clever>
gchristensen: this reminds me of a common complaint ive had with things like -j, they arent aware of the load types
<clever>
twice the bandwidth going thru the sata controllers
<clever>
i'm on an SSD mirror, so its shoving that much data into both drives
<clever>
cant even close a tab in the browser, heh
<clever>
98MB/sec
<clever>
gchristensen: system is noticeably laggy with 63MB/sec going to the drives
<clever>
qmm: and what is your nix file?
<clever>
pie__: dont know enough about how python searches for deps
<clever>
gchristensen: load average: 66.30, 29.04, 12.06 and the test can still pass
<clever>
i havent been able to reproduce it even once since adding strace to the problem
<clever>
i'm guessing the system load was just right to be able to reproduce the problem
<clever>
it wasnt losing cpu time to another process
<clever>
pbogdan_: but when i reproduced the test on my end, the mke2fs was hung, and the vm had gone idle
<clever>
maybe only retry if the build failed within a certain timeframe
<clever>
but it would also increase the cpu usage in the cluster
<clever>
grahamc: having some automatic retry in hydra would help with these bugs
<clever>
but this failure is so early in the boot, that i expect it to hit every single test in nixos
<clever>
dont know
<clever>
Infinisil: i think adding some retry to hydra would help
<clever>
and nix will just blindly build every test
<clever>
Infinisil: then try to build that txt file
<clever>
Infinisil: i think you can run listToAttrs over (import ./nixos/release.nix {}).tests and then use map to create a txt file referring to every test's $out
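a hedged sketch of that idea, assuming a flat attrset of test derivations (the real release.nix keys them per system as well):

  with import <nixpkgs> {};
  let
    tests = (import ./nixos/release.nix {}).tests;
  in
  # interpolating every test's store path forces nix to build each one
  # before the txt file itself can be built
  writeText "all-tests.txt"
    (lib.concatMapStringsSep "\n" toString (lib.attrValues tests))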
<clever>
heisenbugs always are
<clever>
rng
<clever>
Infinisil: yeah, thats why i was adding echoes to this region
<clever>
the native and cross compilers will have different hashes
<clever>
deltasquared: but you're still basing everything around the hash of the compiler, not what it can produce
<clever>
yep
<clever>
deltasquared: you declare upfront what the hash of $out will be, and then $out's path wont depend on the inputs
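a minimal sketch of a fixed-output derivation; the hash shown is meant to be the sha256 of the literal "foo", but treat it as a placeholder:

  derivation {
    name = "pinned-output";
    system = builtins.currentSystem;
    builder = "/bin/sh";
    args = [ "-c" "echo -n foo > $out" ];
    # because the output hash is declared upfront, $out's path no longer
    # depends on the builder or its inputs
    outputHashMode = "flat";
    outputHashAlgo = "sha256";
    outputHash = "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae";
  }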
<clever>
deltasquared: fixed-output derivations are the only way right now to prevent rebuilds
<clever>
so you have to apply that rewrite at unpack time
<clever>
Infinisil: part of the problem, is that rewriting the build to reference its own hash, changes its hash
<clever>
so if the native and cross-compiler produce the same output, things may share the result
<clever>
the hash of the temporary $out
<clever>
so you create a new $out, whose path is based on the temporary $out
<clever>
after building $out, you hash the entire thing, and then rewrite the references to itself (and its runtime deps) within every binary
<clever>
but i have seen another plan of a possible solution
<clever>
so every value that can potentially impact the build, will also impact its output path
<clever>
and every attribute on that set, becomes an env variable when building the derivation
<clever>
and its also used to compute the value of $out (which is in that .drv)
<clever>
that hash is then used to create the /nix/store/<hash>-<name>.drv path
<clever>
deltasquared: all attributes will then be forced down to a string, and the entire set is hashed
<clever>
deltasquared: at the core of nix, every derivation must be made by calling builtins.derivation, and passing it a set containing system, builder, args, and name
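a minimal sketch of such a call (the attribute values are only illustrative):

  builtins.derivation {
    name = "example";
    system = "x86_64-linux";
    builder = "/bin/sh";
    args = [ "-c" "echo foo > $out" ];
    # every attribute here is coerced to a string, hashed into the .drv and
    # $out paths, and exported as an environment variable during the build
    someInput = "changing this value changes the resulting store path";
  }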
<clever>
and which gcc the glibc was built from
<clever>
the version of glibc and bash will also impact that build
<clever>
so you can have an arm, 32bit x86, and 64bit x86 build of a simple "echo foo > $out/bar.txt"
<clever>
even something as simple as write this string to a file, depends on which platform you run it on
<clever>
the purity in nix doesnt allow it
<clever>
nope
<clever>
and the storepath of the compiler, depends on the options it was built with, the platform it runs on, and the storepath of every one of its build inputs
<clever>
its based on the storepath of the compiler
<clever>
you want to rebuild things when the compiler changes
<clever>
and a good reason for that, is that some versions of the compiler may be glitched
<clever>
the cause, is that which compiler you use impacts the hash
<clever>
and if you only ever cross-compiled, that means building the entire gcc bootstrap
<clever>
not the hello that was cross-compiled
<clever>
so when you do nix-env -iA nixos.hello, it wants the hello that was natively compiled
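a hedged illustration of the two builds ending up at different store paths (the pkgsCross attribute is just one example of a cross platform):

  let pkgs = import <nixpkgs> {}; in {
    # built with the native stdenv/gcc
    native = pkgs.hello;
    # built with a cross gcc, whose different store path changes hello's hash
    cross = pkgs.pkgsCross.aarch64-multiplatform.hello;
  }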
<clever>
half of the problem with cross-compiling in nixpkgs, is that it impacts the hash
<clever>
but i never got around to doing a proper nixos install on it, so its still just raspbian with nix on the side
<clever>
but the v6's were just too slow, so i retired them in favor of a faster rpi
<clever>
i had nixos running on 2 armv6 rpi's, before the aarch64 stuff was being compiled by hydra
<clever>
took a while to figure out why
<clever>
it claimed the en_us mapping didnt exist, yet it clearly did
<clever>
because of that, the nixos build on my rpi couldnt even generate the keymap files for the initrd
<clever>
but it can potentially break things
<clever>
there is a special glibc thing you can compile with, that will silently switch over to getdents64() on a 32bit os
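presumably this is the _FILE_OFFSET_BITS=64 switch; a hedged sketch of enabling it for a single package (the package name is a placeholder):

  # NIX_CFLAGS_COMPILE is picked up by the nixpkgs cc wrapper
  somePackage.overrideAttrs (old: {
    NIX_CFLAGS_COMPILE =
      (old.NIX_CFLAGS_COMPILE or "") + " -D_FILE_OFFSET_BITS=64";
  })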
<clever>
and 80% of software silently ignores it, treating the directory as empty
<clever>
getdents() will return EOVERFLOW, because the 64bit inode doesnt fit in the 32bit struct
<clever>
if your nfs server uses 64bit inodes, then a 32bit client will fail in the weirdest ways
<clever>
i also discovered a rather nasty nfs bug on 32bit clients
<clever>
and the current disk image routines use virtio-9p anyways, to copy things in
<clever>
slow, but better than making a 2gig disk image for every vm
<clever>
which allows sharing host directories directly to the guest
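a hedged sketch of wiring that up for a NixOS qemu vm (the host path and mount tag are made up):

  { ... }: {
    # expose a host directory to the guest as a 9p share
    virtualisation.qemu.options = [
      "-virtfs local,path=/home/user/shared,mount_tag=hostshare,security_model=none"
    ];
    # mount it inside the guest over the virtio transport
    fileSystems."/mnt/host" = {
      device = "hostshare";
      fsType = "9p";
      options = [ "trans=virtio" ];
    };
  }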
<clever>
9p (plan 9's protocol) over the virtio interface
<clever>
leading to files just not existing
<clever>
i have previously had a problematic interaction between the host zfs and qemu, where the guest /nix/store just randomly swapped entire directories
<clever>
the real question though: is the race in mke2fs, the guest linux, qemu, or the host linux?
<clever>
since adding the strace, the problem has not happened
<clever>
pbogdan_: the mke2fs is hanging on boot, triggering the 5 minute timeout
<clever>
pbogdan_: the problem i reproduced on my end, isnt just a simple timeout