2017-07-23

<clever> Infinisil: lets see what happens if i lack /dev/kvm!
<clever> Infinisil: also, crippling the host with memory usage doesnt seem to make the problem more likely
<clever> and it will clone them
<clever> but the kernel/cpu is too dumb to know when you write zeros to the zero page
<clever> and the kernel probably gives it a CoW clone of that zero page
<clever> mmap(NULL, 1073745920, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5f88ed7000
<clever> for a malloc this big, its probably going to mmap
<clever> sub connectmachine# [ 13.695150] xsession[867]: IceWM: using /root/.icewm for private configuration files
<clever> so gcc just omitted everything
<clever> gcc knew that bzero only affects p, and nothing used p
<clever> and only if this is in as well
<clever> bzero(p, 1024*1024*1024);
<clever> this makes the program cripple my machine (as it was supposed to)
<clever> printf("%p\n", p);
<clever> this is instant
<clever> void *p = malloc(1024*1024*1024);
<clever> the instant i started to print the pointer with printf, it actually got slow
<clever> thats only for freshly allocated memory
<clever> zeros out a range of memory
<clever> the bzero was to "use" the ram
<clever> Infinisil: it ran in 0.00 seconds, because even with the bzero, gcc knew i never referenced the data, and optimized it out
<clever> Infinisil: gcc is getting too smart, i wrote a program that malloc's 10gig of ram, and runs bzero over it all
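A rough Python sketch of the mmap point in the lines above, assuming Linux: an anonymous private mapping costs almost nothing until its pages are written, and writing zeros still faults every page in. The 64 MiB size, the `touch_pages` name, and the use of `ru_maxrss` as a memory gauge are illustrative choices, not anything from the log. It does not reproduce the gcc dead-code elimination, which needs a compiled language.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE
SIZE = 64 * 1024 * 1024  # deliberately far smaller than the 1 GiB in the log


def peak_rss():
    # Peak resident set size of this process (KiB on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def touch_pages(size=SIZE):
    """Map anonymous memory, then write one zero byte into every page."""
    before = peak_rss()
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after_map = peak_rss()  # mapping alone should barely move this
    for offset in range(0, size, PAGE):
        buf[offset] = 0  # writing a zero still forces a private page
    after_touch = peak_rss()
    all_zero = buf[:PAGE] == bytes(PAGE)  # fresh anonymous memory reads as zeros
    buf.close()
    return before, after_map, after_touch, all_zero


if __name__ == "__main__":
    print(touch_pages())
```

On a typical Linux host the third number should sit well above the second, which is the "this makes the program cripple my machine" effect at a safer scale.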
<clever> nh2: yeah, i dont know why thats working either
<clever> sub connectmachine# [ 19.309685] xsession[849]: IceWM: MappingNotify
<clever> Infinisil: and the instant i say that, i get a new failure
<clever> Infinisil: i have yet to get it to fail
<clever> bhipple[m]: some of my php libraries dump every function argument in the backtrace, including the raw query, and the mysql password
<clever> bhipple[m]: that password will wind up in at least a .drv file in /nix/store/
<clever> bhipple[m]: there is a builtins.readFile, but that happens at eval time, so its not impure
<clever> bhipple[m]: yeah
<clever> bhipple[m]: i mostly do a passwords.nix containing { foo = "bar"; ... } and then near the top of a module, let passwords = import ./passwords.nix; in ...
<clever> and now that i think of it, hydra can also offer up its own nix store in full.....
<clever> there are some things like hydra where the code just doesnt allow reading the secrets from another location
<clever> nh2: i can also confirm rebase is in the hackage-packages.nix for vector-builder, under testHaskellDepends
<clever> so every failed build just sits there using space, until you reboot or rm -rf
<clever> mpickering: nix never cleans those up
<clever> mpickering: oh, were you building with -K a lot?
<clever> so it tends to be rather large
<clever> nh2: i dont think there are any existing tools that will trim the tree at leaves that are in the binary cache
<clever> mpickering: du /run/user/0 -h
<clever> mpickering: what is your free space like?
<clever> but some things like haskell want to modify the derivation to make it suitable for shell use
<clever> nh2: i also recently rediscovered, you can run nix-shell directly on a .drv file
<clever> nh2: nix-instantiate and nix-store -q --tree
<clever> does it give any other output when failing?
<clever> oh, on closer inspection, you already have that nix, lol
<clever> try this one
<clever> nix-store -r /nix/store/866pyym5zgvx78kr55pwas16xp1m0sjp-nix-1.11.11
<clever> one sec
<clever> ah, and i gave a bad path there
<clever> mpickering: what did you do to the system?, does /etc/nix/nix.conf exist?
<clever> mpickering: what about nix-store -r /nix/store/kw31dip97zpfjzz24q2j9bzsgcav6mgc-nix-1.11.1
<clever> Infinisil: oh, and due to the previously mentioned zfs trouble, nix-store --delete takes several minutes
<clever> error: getting status of ‘/home/clever/nixpkgs/warning:’: No such file or directory
<clever> + nix-store --delete warning: you did not specify '‘--add-root’;' the result might be removed by the garbage collector /nix/store/c229x42kja0bkdj3mipvz12bhpcc66ym-vm-test-run-keymap-neo
<clever> just need to trigger the bug again
<clever> Infinisil: yep, already doing that
<clever> mpickering: my usual response to that is to throw strace at it
<clever> Infinisil: yep, found the cause
<clever> mpickering: anything in dmesg?
<clever> 399 mke2fs -t ext4 ${cfg.bootDevice}
<clever> Infinisil: i suspect its hanging here, adding more prints
<clever> nixos/modules/virtualisation/qemu-vm.nix
<clever> this is the last thing a hung run has produced
<clever> machine# mke2fs 1.43.4 (31-Jan-2017)
<clever> Infinisil: this is the output from the middle of a passing run
<clever> machine# mke2fs 1.43.4 (31-Jan-2017)
<clever> machine# Creating filesystem with 131072 4k blocks and 32768 inodes
<clever> Infinisil: i think ive only had 2 failures in the last half hour
<clever> havent kept count
<clever> 407acb80fc0890e62597e170cc00d9e873e37e34
<clever> which would imply that the nixos guest isnt responding on the serial console
<clever> Infinisil: after running it about a dozen times, it has hung, but the order of the prints i added says the accept blocks, and its not a race
<clever> blocking
<clever> accepted shell
<clever> line 240 then uses the socket accept made
<clever> line 179 will accept a connection from the child (does it block, or happen in the background??)
<clever> line 236 starts the qemu process (which will connect back to a listening unix socket)
<clever> Infinisil: oh, i wonder if this is a race condition...
<clever> Infinisil: so this is what actually boots the vm: https://github.com/NixOS/nixpkgs/blob/master/nixos/lib/test-driver/Machine.pm#L236
<clever> so the perl code just starts it on-demand, the first time something is done to it
<clever> Infinisil: aha, the keymap test doesnt start the vm
<clever> Infinisil: patched my Machine.pm with more print statements, re-running
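The listen, start, accept ordering described above can be sketched in Python. This is a stand-in, not the Perl in Machine.pm: a thread plays the part of qemu connecting back, and the `boot_and_wait` name, the `ready` line, and the 5 second timeout (the log mentions 300) are all hypothetical. Because the listening socket exists before the "guest" starts, the connection just queues until `accept()` picks it up, which is consistent with the conclusion that the hang is a blocked accept and not a race.

```python
import os
import socket
import tempfile
import threading


def boot_and_wait(timeout=5.0):
    """Listen first, start the 'guest', then block in accept()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shell")
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(1)
        listener.settimeout(timeout)  # give up instead of hanging forever

        def guest():
            # Stand-in for the VM connecting back over its serial port.
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(b"ready\n")

        worker = threading.Thread(target=guest)
        worker.start()
        try:
            conn, _ = listener.accept()  # blocks until the guest connects
            with conn:
                conn.settimeout(timeout)
                line = conn.makefile("rb").readline()
        finally:
            worker.join()
            listener.close()
        return line


if __name__ == "__main__":
    print(boot_and_wait())
```

If the guest never connects, `accept()` raises a timeout error, which is the same shape as the "timed out waiting for the VM to connect" failure.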
<clever> though timestamps in the logs will vary, so it will technically fail
<clever> Infinisil: i think this one makes it do the build 10 times, and complain if the outputs dont match exactly
<clever> [clever@amd-nixos:~/nixpkgs]$ nix-build nixos/release.nix -A tests.keymap.neo.x86_64-linux --option build-repeat 10 --check
<clever> Infinisil: and this nixos module provides a root shell on that dedicated serial port
<clever> [clever@amd-nixos:~/nixpkgs]$ nix-build nixos/release.nix -A tests.keymap.neo.x86_64-linux
<clever> so there is a dedicated serial port on the guest, linked to the shell unix socket
<clever> i forgot to run it under time
<clever> "-device virtio-serial -device virtconsole,chardev=shell " .
<clever> "-monitor unix:./monitor -chardev socket,id=shell,path=./shell " .
<clever> and it gets that socket by accepting a connection on shellS
<clever> it waits 300 seconds for a line to appear on the socket
<clever> second run passed, test script finished in 18.04s
<clever> Infinisil: i ran the test twice on my machine, first one failed, timed out waiting for the VM to connect
<clever> the tests that have passed cant be restarted
<clever> gchristensen: are you able to restart the keymap tests on http://hydra.nixos.org/job/nixos/trunk-combined/tested#tabs-constituents ?
<clever> Infinisil: my first guess is either timeouts due to a slow host, or iceWM eating keystrokes
<clever> machine# [ 31.289923] xsession[856]: IceWM: MappingNotify
<clever> machine: sending monitor command: sendkey f
<clever> machine# [ 30.265567] layer1[1012]: Waiting for 'f' to be typed
<clever> machine# [ 26.269645] layer3[955]: SUCCESS: Got back '}' as expected.
<clever> strange for only some of them to fail like that
<clever> Infinisil: ah, do you know which test is failing? https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/keymap.nix#L89
<clever> yeah, but this test involves programmatically sending keystrokes to the guest
<clever> then the vm acceleration should be working
<clever> Infinisil: and you have r/w to it?
<clever> Infinisil: does /dev/kvm exist?
<clever> then just edit the nix files, add debug, and re-run
<clever> Infinisil: that would launch the entire test, and print the result
<clever> Infinisil: nix-build '<nixpkgs/nixos/release.nix>' -A tests.keymap.x86_64-linux
<clever> Infinisil: so its likely, that the testreader isnt getting the keys that perl sent to the vm
<clever> then the host blocks on the guest creating files
<clever> and the testreader in the guest reads them
<clever> the perl on the host sends keys to the guest
<clever> so it has to use that to block on things inside the script
<clever> Infinisil: aha, line 69 runs a script in the guest, in the background
<clever> wait for a file, cat it, then delete it?
<clever> i cant even tell what that bash is doing
<clever> Jul 23 11:42:08 nas nix-gc-start[7180]: deleting ‘/nix/store/trash’
<clever> Jul 23 11:42:08 nas nix-gc-start[7180]: deleted or invalidated more than 16581001216 bytes; stopping
<clever> that is the true mystery of zfs
<clever> so why does 200mb of on-platter cache help more?
<clever> it could have bumped that up by 200mb, and gotten bigger benefits
<clever> Infinisil: it already has 1.7gig of in-ram cache
<clever> Infinisil: i think that perl function runs the script under the context of the guest vm
<clever> its nice that its faster, but why is it faster?
<clever> Infinisil: it could have easily used another 300mb of ram and gotten better benefits...
<clever> Infinisil: it somehow feels much snappier after adding an L2 cache, even though its only using 200mb so far
<clever> in the past, i also had a bug in my gpu drivers that reduced it to about 1 line per frame, with vsync
<clever> after about 15 minutes, the ls of /nix/store finished, 236,608 files
<clever> yeah
<clever> mostly rwait
<clever> i'm seeing 40 reads/sec and writes/sec to each of the 3 drives
<clever> yeah
<clever> 80 to 85% %util
<clever> Infinisil: iostat says its almost entirely IO bound
<clever> 4gig of ram
<clever> model name : AMD A6-5400K APU with Radeon(tm) HD Graphics
<clever> a single GC cycle also ran overnight, and was still going when i woke up in the morning
<clever> a garbage collection is also running
<clever> over 10 minutes now
<clever> the nas is still doing an ls of store
<clever> Infinisil: my desktop can list /nix/store/.links/ in 1m 48sec, and it contains 1,045,532 entries
<clever> Infinisil: amd/nix on /nix/store type zfs (ro,noatime,xattr,noacl)
<clever> sssd.out 24,080 x /nix/store/v4fd35aal24a7incamr9xr9hqw0gx4vh-sssd-1.14.2/lib/libsss_nss_idmap.so.0.2.0
<clever> [clever@amd-nixos:~]$ ./apps/nix-index/result/bin/nix-locate libsss_nss_idmap.so.0
<clever> exactly what i'm wondering
<clever> real 0m27.629s
<clever> 85535
<clever> drwxrwxr-t 22684 root nixbld 84K Jul 23 09:50 /nix/store/
<clever> the desktop is currently an SSD mirror with an NVME log, no L2 right now
<clever> Infinisil: but my desktop with an SSD mirror on top of an NVME L2 can be even slower
<clever> Infinisil: raidz1 across 3 magnetic drives
<clever> its been running for 2 minutes now
<clever> [root@nas:~]# ls -lh /nix/store/ | wc -l
<clever> Infinisil: i think i have 4x the number of subdirs, and the size is also ~4x
<clever> Infinisil: for size reference: drwxrwxr-t 46029 root nixbld 232K Jul 23 11:14 /nix/store/
<clever> Infinisil: reading the directory listing for /nix/store/ and /nix/store/.links/
<clever> dash: and i think i just found a concurrency issue in the GC process
<clever> Jul 23 11:10:30 nas nix-gc-start[6246]: error: getting status of ‘/nix/store/.tmp-link-2816-1576729590’: No such file or directory
<clever> Jul 23 11:10:30 nas nix-gc-start[6246]: deleting ‘/nix/store/.tmp-link-2816-1576729590’
<clever> sometimes, zfs is just really slow, for no obvious reason
<clever> getdents(15, /* 416 entries */, 32768) = 32752 <3.048378>
<clever> deltasquared: and vde_switch has some modes, where it can inject latency, bandwidth limits, and even packet corruption into the links
<clever> deltasquared: nixos tests use vde_switch to link all of the qemu instances together
<clever> joepie91: yes, php sucks :P
<clever> deltasquared: the php csv read function has no way to turn that escape feature off, making it impossible to read the file as-is
<clever> deltasquared: it escaped the end of field quote, causing my csv parser to eat half of the next field and go entirely out of sync
<clever> i recently discovered that a "CSV" export from MS access, includes fields like "1234\"
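A small sketch of the backslash problem, using Python's `csv` module as a stand-in for the PHP reader (the PHP function itself is not shown here, and the `parse` helper and the sample row are made up for illustration). With no escape character the field `"1234\"` is read as-is; once backslash is treated as an escape, the closing quote is swallowed and the parser runs into the next field.

```python
import csv

# The raw text of one exported row:  "1234\","next"
line = '"1234\\","next"'


def parse(raw, **dialect):
    """Parse a single CSV line with the given dialect options."""
    return next(csv.reader([raw], **dialect))


# Default dialect: backslash is an ordinary character, so the row has two fields.
as_is = parse(line)

# Backslash as escape character: \" no longer ends the field, and the
# two fields collapse into one.
escaped = parse(line, escapechar="\\")

if __name__ == "__main__":
    print(as_is)
    print(escaped)
```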
<clever> hodapp: is this in the right region?
<clever> also, the error looks unrelated
<clever> /tmp/nix-build-opencv-3.3.0-rc.drv-0/opencv-3.3.0-rc-src/modules/stitching/include/opencv2/stitching/detail/matchers.hpp:52:42: fatal error: opencv2/xfeatures2d/cuda.hpp: No such file or directory
<clever> then see what happens
<clever> hodapp: id patch this to just not call download_boost_descriptors
<clever> /tmp/nix-build-opencv-3.3.0-rc.drv-0/opencv_contrib/xfeatures2d/CMakeLists.txt:8 (download_boost_descriptors)
<clever> deltasquared: yeah
<clever> hodapp: and download_boostdesc.cmake appears to have the expected hash of each file it wants
<clever> set(hash_BGM "0ea90e7a8f3f7876d450e4149c97c74f")
<clever> hodapp: i think this is involved in checking the existing copies it got: https://github.com/opencv/opencv/blob/master/cmake/OpenCVDownload.cmake#L64-L85
<clever> Infinisil: some things like haskell and some event libraries will randomly switch threads for performance reasons
<clever> Infinisil: which means you must always call the functions in the same thread on the OS side
<clever> Infinisil: things can also be thread-local to the main cpu
<clever> every branch appears to contain different files
<clever> what files does this repo provide?
<clever> hodapp, gchristensen: oh dear god, https://github.com/opencv/opencv_3rdparty
<clever> /tmp/nix-build-opencv-3.3.0-rc.drv-0/opencv_contrib/xfeatures2d/cmake/download_boostdesc.cmake:22 (ocv_download)
<clever> and cmake also gives a stack trace, nice
<clever> hodapp: the error is deep inside a generic ocv_download function
<clever> CMake Warning at /tmp/nix-build-opencv-3.3.0-rc.drv-0/opencv-3.3.0-rc-src/cmake/OpenCVDownload.cmake:188 (message):
<clever> yeah
<clever> and then it copies them into the right spot so cmake wont try to dl
<clever> yeah, an extra fetchFromGitHub is run to pre-download the files
<clever> fetchFromGitHub is a different derivation, so it can have the network on, and nix will enforce it always giving the same result (via the sha256)
<clever> during the entire derivation, the network is off
<clever> nix will spawn a new container for every build, with no network access, and limited access to the /nix/store/
<clever> yeah
<clever> nix disables all network access during the build
<clever> it shouldnt be capable of doing any network now
<clever> configure has begun
<clever> hodapp: fetching deps
<clever> hodapp: build started...
<clever> so you can just pad it with ctrl+o's until you get the right answer
<clever> i got bored once and read the source of an !8ball script on irc, and it was a basic crc over the entire question, including non-printable characters
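The padding trick can be sketched like this. The script's actual checksum and answer list are unknown, so `zlib.crc32`, the `ANSWERS` list, and the `pad_until` helper are assumptions for illustration only. Ctrl+O is byte 0x0F, which IRC clients treat as "reset formatting" and do not display.

```python
import zlib

ANSWERS = ["yes", "no", "maybe", "ask again later"]  # hypothetical answer list
CTRL_O = b"\x0f"  # invisible in most IRC clients


def answer(question: bytes) -> str:
    """Pick an answer from a checksum over every byte of the question."""
    return ANSWERS[zlib.crc32(question) % len(ANSWERS)]


def pad_until(question: bytes, wanted: str, max_pad: int = 1000) -> bytes:
    """Append ctrl+O bytes until the checksum lands on the wanted answer."""
    for count in range(max_pad + 1):
        candidate = question + CTRL_O * count
        if answer(candidate) == wanted:
            return candidate
    raise ValueError("no padding length up to max_pad gave the wanted answer")


if __name__ == "__main__":
    padded = pad_until(b"will it build?", "yes")
    print(len(padded) - len(b"will it build?"), "padding bytes ->", answer(padded))
```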
<clever> deltasquared: the color is probably just based on a hash of the nick
<clever> deltasquared: only thing that comes to mind for purple is willy wonka, lol
<clever> i cant see the colors on this end
<clever> deltasquared: who? heh
<clever> qknight: via config like normal
<clever> but if clang doesnt exist, it will be renamed, and take the place of clang
<clever> if clang already exists, the clangdir will be put INSIDE the existing clang, with its original name
<clever> cp -r ${clangdir} $sourceRoot/third_party/llvm/llvm/tools/clang
<clever> mpickering: i always cp -vi out of habit, because i never get around to creating the alias
<clever> if you cd into the llvm dir it checks out, what does "git rev-parse HEAD" say?
<clever> and they havent noticed the mistake
<clever> my only guess is that the build script is out of date with the upstream cmake file
<clever> yeah, it looks right
<clever> mpickering: are you sure this is the right rev? https://gist.github.com/mpickering/5a5f122db8a4bec194d2dc4970135740#file-shell-nix-L6
<clever> ah, i see line 30 of the nix file now
<clever> ah
<clever> the cmake files arent present on github?
<clever> weird
<clever> mpickering: so thats going to try to patch the source it hasnt unpacked
<clever> mpickering: without an unpackPhase, postUnpack and src are broken
<clever> oh god, lol
<clever> and the more words you have, the worse it gets
<clever> so even with zero use of interpolation, it still does 2 concat operations
<clever> but its not doing any pre-processing at parse time
<clever> joepie91: what about the speed difference between "foo bar baz" and 'foo bar baz' ? lol
<clever> we need an Either type!
<clever> ahh
<clever> joepie91: yeah
<clever> zarel: i'm guessing you just manually run cryptsetup open
<clever> joepie91: returns null upon error?!
<clever> returns false?
<clever> decode_json with a catch block i'm guessing
<clever> but id have to give it a test
<clever> joepie91: " <?php" or "?>\n" in a library!
<clever> deltasquared: so its impossible to automate any ingame action
<clever> deltasquared: but if any mod based code is in the stack, that permission is lost
<clever> deltasquared: things initiated by the game have a special priv, that lets them call special functions (that trigger ingame actions)
<clever> deltasquared: world of warcraft also uses a slightly modified lua engine for all of its mods
<clever> deltasquared: lua can often do that
<clever> zarel: https://nixos.org/nixos/options.html#boot.initrd.network
<clever> and patchelf is expecting everything to be named
<clever> i'm guessing it has an un-named section, that is referenced only by its section index
<clever> which would imply that findSection was called with an empty string
<clever> that makes more sense
<clever> ah, version 0.3 has error("cannot find section " + sectionName);
<clever> there should be single quotes in the error, according to the source
<clever> error("cannot find section '" + sectionName + "'");
<clever> ben: what if you add --debug to patchelf?
<clever> ben: the source isnt capable of producing only that output
<clever> ben: does the error say which section it cant find?
<clever> ben: what does the file command say about the binary?
<clever> and it will add a bash script around it to modify env variables for you
<clever> you can also add the wrapProgram i gave previously to this
<clever> does it build?
<clever> then run nix-build on that file
<clever> ''
<clever> patchShebangs $out/bin/
<clever> cp -vi ${./input.python} $out/bin/output
<clever> mkdir -pv $out/bin/
<clever> with import <nixpkgs> {}; runCommand "script" { buildInputs = [ python ]; } ''
<clever> try putting this into a default.nix
<clever> if you gist your current script, i can modify it
<clever> nix should automatically patch it to /nix/store/<hash>-python/bin/python
<clever> and then put python into the buildInputs
<clever> for example, make a $out/bin/foo that starts with #!/usr/bin/env python
<clever> then write a nix expression that will generate a suitable script
<clever> which prevents that problem entirely
<clever> nix-shell ensures it will only be available in that shell, so nothing else can use it
<clever> so its going to be more likely to ignore the incompatible libz.so
<clever> that version wont look in /lib by default
<clever> nix-env will persist, nix-shell wont
<clever> either nix-env or nix-shell
<clever> 'nix-env -iA nixpkgs.python' would place it at ~/.nix-profile/bin/python
<clever> that will spawn a shell that has the nix version of python in $PATH
<clever> nix-shell -p python
<clever> which defaults to looking in /lib
<clever> ah, that one is using the ld.so from your host
<clever> "which python", is it using the nix build of python?
<clever> but LD_LIBRARY_PATH has a higher priority, so it can break things
<clever> and normally have the search path pre-set
<clever> libraries in /nix/store should only ever be loading other libraries in /nix/store/