2017-07-24

<clever> yeah
<clever> lua also allows 0 to be a key, but the iterator starts at 1
<clever> lua even allows objects to be keys
<clever> both php and lua allow strings and ints as keys in an array
<clever> sphalerite[m]: https://hydra.angeldsis.com/build/59300 if you want to keep an eye on it
<clever> i'll restart that hydra job
<clever> the new hydra hasnt built it yet, and i deleted the jobset from the old hydra
<clever> dang, had
<clever> i have a tar for arm, that that script could accept
<clever> every supported platform
<clever> sphalerite[m]: you know how it has a list of tars for every platform?
<clever> sphalerite[m]: you know the source of nixos.org/nix/install ?
<clever> i have a tar of that
<clever> ahh, arm?
<clever> what arch?
<clever> ah yeah, /usr/, i see
<clever> ah
<clever> sphalerite[m]: weird, nix-store --verify --check-contents ?
<clever> if you confirm that the perl is fixed, restart all failing jobs in http://hydra.nixos.org/eval/1378129
<clever> but other jobs that failed due to perl wont be restarted
<clever> and repeat anything that has failed
<clever> it will just build the entire closure of that job
<clever> the logs on hydra dont show signs of that
<clever> copumpkin: what if you restart the job on hydra?
<clever> ALT-SYSRQ-K
<clever> this lets you hard kill everything in the tty, returning control to the login prompt
<clever> the linux version of ctrl+alt+del to login
<clever> sphalerite[m]: you know about SAK?
<clever> sphalerite[m]: :O
<clever> copumpkin: yeah, just a single nixbld1 is fine
<clever> sphalerite[m]: what about 'ctrl+c' then 'reset' ?
<clever> sphalerite[m]: weird
<clever> copumpkin: user namespacing works by mapping a real uid on the host to a fake uid in the guest
<clever> copumpkin: it still needs a nixbldX user on the host, to map the usernamespace to
<clever> sphalerite[m]: you ran cat on a binary file, and it changes the font settings
<clever> sphalerite[m]: run "reset" in that terminal
<clever> user namespaces just make it more deterministic, so the build doesnt know which build user its on
<clever> any time nix has root, it will want to do that, even if sandboxes are off
<clever> copumpkin: you need to create a nixbld1 user and put it into the nixbld group
<clever> it needs root to setup the sandbox and drop privs
<clever> the build must run as root (or nix-daemon as root) for sandboxes to work
<clever> is this on a nixos machine?
<clever> and is used to build the bison and binutils that become the final stdenv
<clever> i think that perl is built against the bootstrap binutils
<clever> copumpkin: and checking --tree, i can confirm the 2nd perl that is failing, is under stdenv-linux-boot -> binutils -> bison -> perl
<clever> copumpkin: yeah, i can confirm 2 perls in the closure of at, and the hydra failed on the one that isnt in the perl attr
<clever> copumpkin: the path for 'at' in the nixpkgs rev does match up, checking its --tree
<clever> copumpkin: i'm guessing its one of the perl's in the bootstrap of stdenv
<clever> copumpkin: yep, let me check that hydra url
<clever> perl.devdoc
<clever> copumpkin: and -A perl.decdov ?
<clever> copumpkin: what about: nix-build -A perl --arg config '{}'
<clever> bbl
<clever> et4te: refer to 'man configuration.nix'
<clever> et4te: programs.ssh.knownHosts
<clever> et4te: there is also a nixos option to pre-populate it system wide
<clever> et4te: sudo -u hydra -i, then ssh into the target host to populate ~/.ssh/known_hosts
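
A sketch of the NixOS option mentioned above; the host name and key are placeholders, not values from this conversation:

    { ... }: {
      programs.ssh.knownHosts = {
        builder = {
          hostNames = [ "builder.example.org" ];
          publicKey = "ssh-ed25519 AAAA...";   # the target host's public key
        };
      };
    }
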
<clever> aristid: oops, the site was hacked within 5 minutes of launch, the hacker got all of our funds!
<clever> aristid: the problem i was focusing on was in stage-1, it didnt even get as far as mounting the rootfs
<clever> aristid: ah, that sounds like a different problem , after stage-2
<clever> aristid: where is it hanging for you?
<clever> domenkozar: i was able to reproduce it 2 or 3 times when my system was basically idle, so high disk load isnt a hard requirement
<clever> domenkozar: i tracked it down to mke2fs hanging in stage-1, but it refused to reproduce once i straced it
<clever> was able to get about 0.2 via cpu mining before i cooked a server
<clever> but i used to mine btc many years ago
<clever> not currently
<clever> no idea about that error
<clever> ah
<clever> johnramsden: what error did you have when you tried?

2017-07-23

<clever> so if you use nix's -j4 (or hydra builds 4 jobs per slave), it will only do 2 tests max
<clever> i do see how marking a derivation as being worth 2 jobs would help lessen the load at the nix level
<clever> there is also the tricky problem of balancing nix -j4 vs make -j4
<clever> and then hydra will waste its power when not doing testing
<clever> but you would need to heavily reduce the concurrent jobs allowed on that machine
<clever> there is already a nixos-test feature that a slave must have, for hydra to even push qemu jobs to it
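
A hedged sketch of declaring such a slave on the NixOS side; the host name, key path, and job count are placeholders:

    { ... }: {
      nix.buildMachines = [{
        hostName = "builder.example.org";
        system = "x86_64-linux";
        sshUser = "root";
        sshKey = "/etc/nix/buildkey";
        maxJobs = 2;                                # keep concurrency low if the tests are heavy
        supportedFeatures = [ "kvm" "nixos-test" ];
      }];
    }
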
<clever> yeah
<clever> yeah, no make involved in qemu tests
<clever> which would require patching every single makefile, rather than just flagging one derivation as being fat
<clever> gchristensen: but i have also seen some packages that have the problem at the make level
<clever> gchristensen: that works at the nix level
<clever> qmm: i'm guessing the problem is with ghc, not nix
<clever> i have done far crazier things, i upgraded a laptop from gentoo to 32bit nixos, to 64bit nixos, without using an iso, lol
<clever> chrishill: and if you do that after booting a 64bit kernel from the iso, it will rebuild in 64bit mode
<clever> chrishill: yeah, nixos-install is just a script to run nixos-rebuild under a chroot
<clever> qmm: try changing it to "module Main where"
<clever> and it will upgrade it to 64bit
<clever> just boot a 64bit install cd, mount the existing filesystems to /mnt, and re-run nixos-install
<clever> reinstalling is pretty easy
<clever> chrishill: you're on nixos, right?
<clever> chrishill: then your cpu is capable of 64bit mode, but your install is only 32bit
<clever> chrishill: is "lm" in the flags field?
<clever> chrishill: what is the flags field under /proc/cpuinfo ?
<clever> all of the search commands in nix-env hide broken packages
<clever> chrishill: spotify only works on 64bit nixos, you're on 32bit
<clever> 5 assert stdenv.system == "x86_64-linux";
<clever> chrishill: and what is the output of "nix-build '<nixpkgs>' -A spotify -v 2>&1 | grep config.nix"
<clever> chrishill: what happens if you do nix-env -iA nixos.spotify
<clever> qmm: can you gist the contents of T.hs?
<clever> qmm: does each main file declare the module to be Main ?
<clever> qmm: can you gist all of the output?
<clever> chrishill: does ~/.config/nixpkgs/config.nix exist?
<clever> pie__: not sure then
<clever> bennofs: it would also help to throw disorderfs into nix-daemon based sandboxes, to find more bugs
<clever> pie__: are you using the python36 version of python?
<clever> mke2fs started, and never finished, so the stage-1 boot hung
<clever> there is a log on a previous eval that wasnt restarted: http://hydra.nixos.org/build/56734579/nixlog/15
<clever> had a few hangs waiting for "f"
<clever> ive mostly been ignoring the other failures
<clever> but hydra deletes logs when restarting
<clever> i also saw it on the hydra logs for one job
<clever> yeah
<clever> it happened exactly once with the echo's around the mke2fs
<clever> Infinisil: each time, i narrowed in closer on it, until i added the strace, then it never happened again
<clever> Infinisil: 2 or 3 times
<clever> aristid: my first thought is builtins.map and builtins.elem, test each to see if it exists
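
A minimal sketch of that kind of feature test (evaluate with nix-instantiate --eval): builtins is an ordinary attribute set, so `?` reports whether a given builtin exists on the running nix version:

    let
      hasElem = builtins ? elem;
    in
      if hasElem
      then builtins.elem 2 [ 1 2 3 ]        # true on nix versions that provide builtins.elem
      else throw "builtins.elem not available"
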
<clever> gchristensen: just normal browsing activity
<clever> gchristensen: whats weird, is that i wasnt even taxing the system that hard when i reproduced it the first few times
<clever> gchristensen: test took forever to start with 105mb/sec going to the drives, but it can still pass
<clever> plus zfs snapshots saving every byte they make, lol
<clever> ah, 1gig per worker, with 20 workers
<clever> qmm: looks like it should just work, what is the error?
<clever> gchristensen: i wonder, when does stress delete its temp files, lol
<clever> amd/root 20G 8.2G 12G 42% /
<clever> error: writing to file: No space left on device
<clever> but if i make -j3, on an 8 core machine, over half is going to waste
<clever> so even with 16gig in my machine, i can only handle ~3 jobs in parallel
<clever> and also, some ghc compiles need 5gig of ram
<clever> yeah
<clever> so it takes 12 hours to run
<clever> and make tries to make it share with 3 gcc's
<clever> gchristensen: for example, i want to make -j4 on my rpi, to use all 4 cores, but there is a single step in the libc locales, that needs all the ram
<clever> gchristensen: this reminds me of a common complaint ive had with things like -j, they arent aware of the load types
<clever> twice the bandwidth going thru the sata controllers
<clever> i'm on an SSD mirror, so its shoving that much data into both drives
<clever> cant even close a tab in the browser, heh
<clever> 98mb/sec
<clever> gchristensen: system is noticeably laggy with 63mb/sec going to the drives
<clever> qmm: and what is your nix file?
<clever> pie__: dont know enough about how python searches for deps
<clever> gchristensen: load average: 66.30, 29.04, 12.06 and the test can still pass
<clever> gchristensen: `TZ='America/Moncton' date`
<clever> gchristensen: trying to recreate the loadavg you saw with `stress --cpu 70`
<clever> gchristensen: 5:36 pm
<clever> that reminds me, i havent had supper, lol
<clever> qmm: note the 2 main-is entries in https://github.com/snoyberg/yaml/blob/master/yaml.cabal#L96-L116
<clever> pie__: this will create a shell that depends on the built result of slimit
<clever> pie__: nix-shell -p '(import ./shell.nix)'
<clever> qmm: i prefer gist, because you can update things easily
<clever> pie__: so the source is in $src and the deps are available
<clever> pie__: that puts you into a shell suitable for building slimit
<clever> qmm: do you have a cabal file?
<clever> gchristensen: wut?, on which machine? lol
<clever> pie__: try just fixing the sha256 again?
<clever> *doh*
<clever> pie__: what was the contents of the storepath i pointed to?
<clever> gchristensen: but the failures i saw are happening so early in the boot, that every test passes thru that code
<clever> but ive only been able to reproduce it ~4 times, and it refuses to reproduce now that i have an strace in there
<clever> gchristensen: yeah, it would be better to just fix these problems
<clever> pbogdan_: what does "less /nix/store/x3jjsvd6i4vzyhs9ylssg7gkyig031qi-slimit-cd76bde" say?
<clever> gchristensen: the tests pass 99.9% of the time on this end
<clever> and because stage2 and systemd never come up, the perl test-driver says it failed to connect to the vm
<clever> gchristensen: i believe this is run by every single nixos test and `nixos-rebuild build-vm`
<clever> gchristensen: the mke2fs within the initrd sometimes hangs
<clever> i havent been able to reproduce it even once since adding strace to the problem
<clever> i'm guessing the system load was just right to be able to reproduce the problem
<clever> it wasnt losing cpu time to another process
<clever> pbogdan_: but when i reproduced the test on my end, the mke2fs was hung, and the vm had gone idle
<clever> maybe only retry if the build failed within a certain timeframe
<clever> but it would also increase the cpu usage in the cluster
<clever> grahamc: having some automatic retry in hydra would help with these bugs
<clever> but this failure is so early in the boot, that i expect it to hit every single test in nixos
<clever> dont know
<clever> Infinisil: i think adding some retry to hydra would help
<clever> and nix will just blindly build every test
<clever> Infinisil: then try to build that txt file
<clever> Infinisil: i think you can run listToAttrs over (import ./nixos/release.nix {}).tests and then use map to create a txt file referring to every test's $out
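
One possible reading of that idea, as a sketch; it assumes each entry in release.nix's tests set exposes an x86_64-linux job, which may not match the real attribute layout:

    with import <nixpkgs> {};
    let
      tests = (import ./nixos/release.nix {}).tests;
      paths = map (t: toString t.x86_64-linux) (builtins.attrValues tests);
    in writeText "all-tests.txt" (builtins.concatStringsSep "\n" paths)
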
<clever> heisenbugs always are
<clever> rng
<clever> Infinisil: yeah, thats why i was adding echo's to this region
<clever> Infinisil: http://imgur.com/a/wucKM
<clever> by default, it needs X11, but that can easily be solved
<clever> Infinisil: this should generate a bash script that runs nixos in a vm, with that kernel param
<clever> Infinisil: { ... }: { boot.kernelParams = [ "boot.debug1devices" ]; }
<clever> Infinisil: nix-build '<nixpkgs/nixos>' -A vm -I nixos-config=./example.nix
<clever> Infinisil: or boot a qemu vm
<clever> add boot.debug1devices to the kernel params when booting
<clever> Infinisil: easy
<clever> potentially doubling or tripling the size of your initrd
<clever> and due to magic in the extraUtils package, it will pull in any dynamic libraries it needs
<clever> Infinisil: that just adds the binary to the initrd
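
The option under discussion isn't named in the log; a sketch assuming it is boot.initrd.extraUtilsCommands, whose copy_bin_and_libs helper copies a binary plus the libraries it links against into the initrd:

    { pkgs, ... }: {
      boot.initrd.extraUtilsCommands = ''
        copy_bin_and_libs ${pkgs.strace}/bin/strace
      '';
    }
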
<clever> Infinisil: symlinks
<clever> lrwxrwxrwx 1 root root 6 Dec 31 1969 /nix/store/hvc388xb0llkqxi8ff3h9zympdqn1s9b-e2fsprogs-1.43.4-bin/bin/mkfs.ext4 -> mke2fs
<clever> Infinisil: i can see it finishing just fine on my test runs
<clever> machine# done
<clever> machine# +++ exited with 0 +++
<clever> Infinisil: i suspect there are also some buffering problems dropping fragments of the logs
<clever> Infinisil: mine prints strace output every time, but has yet to hang
<clever> Infinisil: the gist i have says it has hung at least once
<clever> nothing used them, so it will probably omit both ends
<clever> deltasquared: i suspect it will do the same even if i store the pointers in an array and free them later on
<clever> deltasquared: and this is if i uncomment the printf: https://gist.github.com/cleverca22/8cc522eff59ff6fb124bab15991951d6#file-gistfile4-txt
<clever> and it knows bzero only affects the ram its pointed at, and it can omit it if the ram is never read
<clever> and it can omit that, if the ram isnt used
<clever> it knows malloc's only side-effect is allocating ram
<clever> this is why i said the compiler is getting too smart
<clever> deltasquared: i dont even have a return statement in it!
<clever> the entire body got optimized out
<clever> with default settings on nixos (just nix-shell -p gcc and gcc main.c -o main) it runs instantly
<clever> deltasquared: gcc in both cases, say todays gcc vs a gcc from 10 years ago
<clever> deltasquared: if i pass that thru an old enough compiler, it will cripple any machine, but on a modern one, it runs instantly
<clever> deltasquared: how would you deal with this program running faster on a modern compiler? https://gist.github.com/cleverca22/8cc522eff59ff6fb124bab15991951d6
<clever> it had a timeout while waiting for "f"
<clever> Infinisil: doesnt appear to have hung
<clever> Infinisil: pastebin!
<clever> the native and cross compilers will have different hashes
<clever> deltasquared: but you're still basing everything around the hash of the compiler, not what it can produce
<clever> yep
<clever> deltasquared: you declare upfront what the hash of $out will be, and then $out's path wont depend on the inputs
<clever> deltasquared: fixed-output derivations are the only way right now to prevent rebuilds
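
A minimal fixed-output derivation sketch: because outputHash is declared up front, $out's store path no longer depends on the inputs (the hash here is a placeholder; nix reports the real one on the first build):

    derivation {
      name = "fixed-example";
      system = builtins.currentSystem;
      builder = "/bin/sh";
      args = [ "-c" "echo foo > $out" ];
      outputHashMode = "flat";
      outputHashAlgo = "sha256";
      outputHash = "0000000000000000000000000000000000000000000000000000000000000000";
    }
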
<clever> so you have to apply that rewrite at unpack time
<clever> Infinisil: part of the problem, is that rewriting the build to reference its own hash, changes its hash
<clever> so if the native and cross-compiler produce the same output, things may share the result
<clever> the hash of the temporary $out
<clever> so you create a new $out, whose path is based on the temporary $out
<clever> after building $out, you hash the entire thing, and then rewrite the references to itself (and its runtime deps) within every binary
<clever> but i have seen another plan for a possible solution
<clever> so every value that can potentially impact the build, will also impact its output path
<clever> and every attribute on that set, becomes an env variable when building the derivation
<clever> and its also used to compute the value of $out (which is in that .drv)
<clever> that hash is then used to create the /nix/store/<hash>-<name>.drv path
<clever> deltasquared: all attributes will then be forced down to a string, and the entire set is hashed
<clever> deltasquared: at the core of nix, every derivation must be made by calling builtins.derivation, and passing it a set containing system, builder, args, and name
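
A minimal sketch of that primitive: every attribute in the set is hashed into the .drv path, and each one is exported to the builder as an environment variable:

    derivation {
      name = "example";
      system = builtins.currentSystem;
      builder = "/bin/sh";
      args = [ "-c" "echo $extraSetting > $out" ];
      extraSetting = "foo";     # any extra attribute becomes $extraSetting in the build env
    }
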
<clever> and which gcc the glibc was built from
<clever> the version of glibc and bash will also impact that build
<clever> so you can have an arm, 32bit x86, and 64bit x86 build of a simple "echo foo > $out/bar.txt"
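
A sketch of that: the same trivial build instantiated for three platforms (names illustrative) evaluates to three distinct store paths, even before anything is built:

    let
      pkgsFor = system: import <nixpkgs> { inherit system; };
      mk = system: (pkgsFor system).runCommand "foo" { }
        "mkdir -p $out; echo foo > $out/bar.txt";
    in map mk [ "x86_64-linux" "i686-linux" "armv7l-linux" ]
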
<clever> even something as simple as write this string to a file, depends on which platform you run it on
<clever> the purity in nix doesnt allow it
<clever> nope
<clever> and the storepath of the compiler, depends on the options it was built with, the platform it runs on, and the storepath of every one of its build inputs
<clever> its based on the storepath of the compiler
<clever> you want to rebuild things when the compiler changes
<clever> and a good reason for that, is that some versions of the compiler may be glitched
<clever> the cause, is that which compiler you use impacts the hash
<clever> and if you only ever cross-compiled, that means building the entire gcc bootstrap
<clever> not the hello that was cross-compiled
<clever> so when you do nix-env -iA nixos.hello, it wants the hello that was natively compiled
<clever> half of the problem with cross-compiling in nixpkgs, is that it impacts the hash
<clever> but i never got around to doing a proper nixos install on it, so its still just raspbian with nix on the side
<clever> but the v6's were just too slow, so i retired them in favor of a faster rpi
<clever> i had nixos running on 2 armv6 rpi's, before the aarch64 stuff was being compiled by hydra
<clever> took a while to figure out why
<clever> it claimed the en_us mapping didnt exist, yet it clearly did
<clever> because of that, the nixos build on my rpi couldnt even generate the keymap files for the initrd
<clever> but it can potentially break things
<clever> there is a special glibc thing you can compile with, that will silently switch over to getdents64() on a 32bit os
<clever> and 80% of software silently ignores it, treating the directory as empty
<clever> getdents() will return EOVERFLOW, because the 64bit inode doesnt fit in the 32bit struct
<clever> if your nfs server uses 64bit inodes, then a 32bit client will fail in the weirdest ways
<clever> i also discovered a rather nasty nfs bug on 32bit clients
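
The "special glibc thing" above is probably compiling with -D_FILE_OFFSET_BITS=64, which makes 32-bit readdir()/stat() use the 64-bit kernel interfaces; a hedged overlay sketch forcing it onto one (hypothetical) package:

    self: super: {
      somePackage = super.somePackage.overrideAttrs (old: {
        NIX_CFLAGS_COMPILE = toString (old.NIX_CFLAGS_COMPILE or "") + " -D_FILE_OFFSET_BITS=64";
      });
    }
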
<clever> and the current disk image routines use virtio-9p anyways, to copy things in
<clever> slow, but better than making a 2gig disk image for every vm
<clever> which allows sharing host directories directly to the guest
<clever> 9p (the plan 9 protocol) over the virtio interface
<clever> leading to files just not existing
<clever> i have previously had a problematic interaction between the host zfs and qemu, where the guest /nix/store just randomly swapped entire directories
<clever> the real question though, is the race in mke2fs?, the guest linux?, qemu?, the host linux?
<clever> since adding the strace, the problem has not happened
<clever> pbogdan_: the mke2fs is hanging on boot, triggering the 5 minute timeout
<clever> pbogdan_: the problem i reproduced on my end, isnt just a simple timeout
<clever> pbogdan_: and the test still passes!
<clever> heh, nix-build hasnt even started yet, and i can hear the cpu fans ramping up
<clever> pbogdan_: yeah, i did notice 3 of the failures were on lucifer, but i was able to recreate the problems on my local machine 3 or 4 times
<clever> ah, everything except the ssh to irc is laggy as hell, lol
<clever> though top is unusually slow to refresh
<clever> pbogdan_: thats a lot of cpu, but i dont even notice it in the gui, lol
<clever> %Cpu(s): 50.2 us, 38.4 sy, 0.0 ni, 0.0 id, 10.7 wa, 0.0 hi, 0.6 si, 0.0 st
<clever> deltasquared: which causes failures if the vm extensions are in use by vbox
<clever> deltasquared: this forcibly adds -enable-kvm if /dev/kvm exists
<clever> pkgs/applications/virtualization/qemu/default.nix: makeWrapper "$p" $out/bin/qemu-kvm --add-flags "\$([ -e /dev/kvm ] && echo -enable-kvm)"
<clever> if you unload the kvm driver, qemu will sanely fall back (with obvious performance losses)
<clever> so only one can be using it at a time
<clever> and both kvm and vbox share it, via a single global mutex
<clever> i think virtualbox uses its own vm acceleration
<clever> sub connectmachine# [ 105.936521] xsession[858]: IceWM: MappingNotify
<clever> but, if virtualbox is running a guest, /dev/kvm fails at runtime, and then EVERY test fails
<clever> one strange thing i have noticed, the nixos tests will sanely use qemu without kvm, if /dev/kvm doesnt exist
<clever> kvm can emulate both 32 and 64bit easily
<clever> there are some things like jit that try to speed it up, but the end result is still slower
<clever> so qemu basically turns into a giant switch-case to turn every opcode into an effect on a struct of registers
<clever> without kvm, it has to emulate the cpu directly, the same as if it was an arm
<clever> with kvm, it can make use of the host MMU in a secure manner, and just run the guest directly on the cpu
<clever> that makes qemu based vm's much faster