<gchristensen>
ok so the node had failed to provision and was stuck in this provisioning state, basically waiting to be flagged and culled by their system
<gchristensen>
from there you can't really do anything: the UI doesn't let you do anything, and their backend admin system doesn't even let them try to reboot the machine
<gchristensen>
(or me for that matter)
<gchristensen>
due to a logic bug, though, their API _does_ let you issue a Rescue mode boot
<gchristensen>
which forces a reboot
<gchristensen>
it rebooted, I got a broken GRUB screen, and then I imagined the GRUB menu in my mind's eye, arrow-down'd, enter, and arrow-down a million times to the very first system profile #1 which came with my personal SSH key
<samueldr>
[19:57:55] <samueldr> time to find a zero-day exploit
<samueldr>
it wasn't to be taken literally
<gchristensen>
haha
<gchristensen>
so anyway here I am rescuing this server
<joepie91>
hahahaha
<joepie91>
gchristensen: good thing GRUB doesn't wrap? :P
<samueldr>
gchristensen: I would ask them to *implement* your misfeature :D
<gchristensen>
good thing :)
<joepie91>
I now wonder if this is why...
<gchristensen>
well here I am wondering if I should report the bug or not :)
<samueldr>
I would, to make sure it's kept as a feature
<emily>
what machine is this?
<gchristensen>
the hydra.nixos.org aarch64 builder
<emily>
ah
<emily>
I was wondering which "their" in "their backend system" but let me guess, scaleway?
<gchristensen>
Packet.net
<emily>
oh ok
<gchristensen>
NixOS has a few aarch64 machines from Packet
<emily>
I just pattern-matched arm server + broken -> scaleway
<emily>
^^;
<joepie91>
hahaha
<gchristensen>
haha
<samueldr>
:eyes: u-boot can boot EFI, u-boot did boot my janky grub efi program on my raspberry pi... *scheming intensifies*
<gchristensen>
oeuhounthoenuthoeunth I rebooted, it didn't come back up properly, and now even my rescue mode trick is broken.
* samueldr
wonders how this is said
<gchristensen>
about how you'd say asdfasdfasdfasdfasdafsdaf
<samueldr>
oh right, qwerty biased here
lopsided98 has quit [Quit: Disconnected]
lopsided98 has joined #nixos-chat
clever has quit [Ping timeout: 252 seconds]
clever has joined #nixos-chat
Myrl-saki has joined #nixos-chat
sir_guy_carleton has quit [Quit: WeeChat 2.0]
<sphalerite>
joepie91: if it did wrap you could just use the up key once?
pie___ has joined #nixos-chat
pie__ has quit [Ping timeout: 260 seconds]
pie___ has quit [Read error: Connection reset by peer]
pie___ has joined #nixos-chat
<joepie91>
sphalerite: assuming it wraps both ways :P
<joepie91>
I forgot what it was, but I ran into a tool recently that only wrapped from end to start
<joepie91>
not the other way around
<sphalerite>
aaaaah why would you do that
jasongrossman has quit [Ping timeout: 260 seconds]
jasongrossman has joined #nixos-chat
<gchristensen>
in the following jq incantation: `jq MAGIC filea fileb` what is the appropriate magic to put the jsoney contents of fileb under the attribute named "config", overwriting filea's "config"?
<gchristensen>
my memory was trying to take me down the path of .[0] and .[1] for args
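One possible answer, offered as a sketch and not as something confirmed in the channel: slurp both files so they arrive as a two-element array, then add an object built from the second onto the first, for example `jq -s '.[0] + {config: .[1]}' filea fileb`. The same merge semantics are spelled out below in Python; the function name `put_under_config` and the sample documents are hypothetical and chosen only for illustration.

```python
import json


def put_under_config(a, b):
    """Return a copy of `a` with `b` stored under "config".

    Any existing "config" value in `a` is replaced, not merged.
    """
    merged = dict(a)       # shallow copy so the input stays untouched
    merged["config"] = b   # overwrite (or create) the "config" attribute
    return merged


# Stand-ins for the parsed contents of filea and fileb (made-up data).
filea = json.loads('{"name": "image", "config": {"Cmd": ["old"]}}')
fileb = json.loads('{"Cmd": ["new"]}')

print(json.dumps(put_under_config(filea, fileb), sort_keys=True))
```

Note that this replaces the whole "config" value; a deep merge of the two config objects would need a different operation.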
<joepie91>
sphalerite: that's a question I'm regularly left asking about software...
jasongrossman has quit [Ping timeout: 252 seconds]
<gchristensen>
PageRank is patented and I can't use it without a license, right?
<gchristensen>
pretty sure that is my reading of the wiki page, but I'm seeing a lot of stuff using it.
<simpson>
Yeah, but you don't want the patented version anyway; what you want is the underlying matrix operation, which is just multiplication.
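As a rough illustration of the "just multiplication" point, here is a generic textbook power iteration over a link graph. It is a sketch only: it says nothing about what the patent does or does not cover, the function name `power_iteration` is made up, and the damping factor of 0.85 is a conventional value assumed here, not something stated in the chat.

```python
def power_iteration(links, damping=0.85, rounds=50):
    """Rank pages by repeated matrix-vector multiplication.

    links[i] is the list of page indices that page i links to.
    Returns one score per page; the scores sum to 1.
    """
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(rounds):
        # Every page starts each round with the "random jump" share.
        new = [(1.0 - damping) / n] * n
        for src, outs in enumerate(links):
            if outs:
                # Split this page's damped rank among the pages it links to.
                share = damping * rank[src] / len(outs)
                for dst in outs:
                    new[dst] += share
            else:
                # Dangling page with no links: spread its rank evenly.
                share = damping * rank[src] / n
                for dst in range(n):
                    new[dst] += share
        rank = new
    return rank


# Tiny made-up graph: 0 -> 1 and 2, 1 -> 2, 2 -> 0.
print(power_iteration([[1, 2], [2], [0]]))
```

Each round is one multiplication of the rank vector by the (implicit, sparse) link matrix, so the whole thing is a loop around a matrix-vector product.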
<gchristensen>
would you mind taking a look at something I've hacked together?
<simpson>
Sure?
<gchristensen>
I have a nice improvement to dockerTools.buildImage but it seems I can't just swap it in, but have to provide a new, buildImageDifferently function
<sphalerite>
gchristensen: very good name for it, I approve ;)
<manveru>
also because i'm never sure how to reference them better than putting them into an env var in the base images so it doesn't clutter the root dir
<gchristensen>
everything after the 0a80233... commit is in flux and being squashed / rebased / amended / etc. to death, so don't worry too much if you see crappy comments or whatever.
<samueldr>
buildLasagnaDocker
<samueldr>
or buildLasagnaImage maybe
<samueldr>
:)
<gchristensen>
=)
<samueldr>
don't go hating on mondays though, that would be copyright infringement
<gchristensen>
I found a sort of annoying bug with buildImage in the process, it doesn't consider the closure of the config, so this doesn't work: contents = []; config.Cmd = [ "${pkgs.gitFull}/bin/git" ];
<gchristensen>
"Any Linux OS that supports ext3/4 file systems and has cloudinit 0.7.7, cloudbase-init, coreos-cloudinit, ignition, or bsd-cloudinit installed should work with the import tool." so they muck with your image I guess?
<samueldr>
doesn't cloudinit use a floppy drive?
<gchristensen>
it can use a read-only cdrom
<gchristensen>
but also will attempt to contact a specific IP
<samueldr>
I wonder if you don't use cloudinit, if it'll still work, but fail their misc tasks like hostname
<samueldr>
still interesting to see that one of the popular VPS providers may be more usable with nixos (without using kexec tricks)
<gchristensen>
+1
lassulus_ has joined #nixos-chat
lassulus has quit [Ping timeout: 244 seconds]
lassulus_ is now known as lassulus
<sphalerite>
random idea: "netboot" nixops backend
<gchristensen>
yes please
<sphalerite>
I think we need a better netboot process first :p
<gchristensen>
better??
<gchristensen>
how could it be better?
<sphalerite>
not including the entire nix store in the initramfs that needs to be downloaded at boot time
<gchristensen>
why would that make it better?
<sphalerite>
the images are big and slow to build
<sphalerite>
squashfsing isn't fast
<sphalerite>
maybe a single minimal netboot image that gets served up to all the builders, then nixops will SSH into the machines as it does with any other backend
<gchristensen>
so put less in it :)
<gchristensen>
Nix's guarantee that what you use is definitely inside it makes the netboot process a smooth dream compared to nearly every other netboot tool ever
<sphalerite>
but what if you want to run something that *is* big?
<sphalerite>
Oh yeah and the way it needs to all be kept in RAM
<gchristensen>
you've got to solve that one way or the other
<gchristensen>
Nix rightfully punts that decision to you
<sphalerite>
Would be nice to have an option (other than swap) for not having the whole store in RAM
<sphalerite>
I'm not saying our netboot is terrible, but the netboot installer image is big enough that eelco doesn't want to merge the change that would make netboot.xyz support possible
<gchristensen>
how big is it, and can we make it smaller?
<sphalerite>
he's suggested `nix mount-store`, which isn't in a release yet
<sphalerite>
and is in its current incarnation also far too inefficient (no caching whatsoever)
<gchristensen>
what would store-mounting do?
<sphalerite>
FUSE magic to pretend that cache.nixos.org is actually on your disk :p
<gchristensen>
for the netboot image just put less stuff in the image
<gchristensen>
they're guaranteed to have network
<sphalerite>
how are they supposed to get at the stuff they do need? Download it in a systemd service or something?
<sphalerite>
have the image contain only a systemd unit that runs nix-store -r /nix/store/abcxyz-real-system && /nix/store/abcxyz-real-system/bin/switch-to-configuration switch?
<gchristensen>
well, that is the question -- how are they going to get the stuff they need in any case at all?
MichaelRaskin has joined #nixos-chat
<gchristensen>
Nix makes an explicit promise that the stuff you reference will be in there, and anything beyond that is left as an exercise to the reader
<gchristensen>
if you can't fit it, it isn't up to Nix to solve that but up to your bootstrapping process or whatever
<MichaelRaskin>
gchristensen: nice conversation with {^_^} you had on #nixos-unregistered
<gchristensen>
:)
<gchristensen>
from his feedback I don't see him not wanting to merge it
<gchristensen>
and not unhappy about the 440mb initrd, just curious about it
<MichaelRaskin>
440 MiB initramfs? With compression? That's larger than mine «to hell with it I pack glibc» one
<gchristensen>
it's a netboot ramdisk with the entire system in the initrd
<sphalerite>
just like the installer CD image
<gchristensen>
"It's unable to find the root filesystem, for reasons similar to boot failures caused by incorrect USB stick creation." can we fix that somehow?
<MichaelRaskin>
Well, OK, depends on the definition of entire system, my initramfs is also close to a «complete system», although doesn't come with lots of tools
<gchristensen>
weird hardware is so frustrating sometimes.
<MichaelRaskin>
Does #nix-core have more bots than core team members?
<sphalerite>
systemd alone is 189MB (uncompressed, but still) :|
<samueldr>
I can't say you're wrong, but there's only two bots :)
<infinisil>
sphalerite: Oh damn, that's a lot..
<sphalerite>
wait what, nfs-utils depends on gcc!?
<sphalerite>
so the nixos installer image contains gcc...
<MichaelRaskin>
tilpner: I think some rooms on matrix.org are unreliable even without bridging
<tilpner>
How so?
<tilpner>
I was told that rooms are not bound to any homeserver
<tilpner>
If matrix.org goes down, the rooms will continue to exist and people can still talk to each other
<MichaelRaskin>
Well, people already in the room and people knowing non-matrix.org alias
<tilpner>
Yes, that's good enough IMO
<MichaelRaskin>
The catch: a large majority of people in the room are logged in to matrix.org anyway
<tilpner>
That is hopefully just a starting problem
<MichaelRaskin>
A bit more complicated, I am afraid
<sphalerite>
yeah no, the fact that the main client (riot) defaults to matrix.org rather than forcing you to choose means that almost all matrix users are on matrix.org
<MichaelRaskin>
Not only that
<MichaelRaskin>
Also, scaling is not yet solved well enough, so many currently open servers have a plan that starts with «if matrix.org closes new registration we close new registration as soon as we find out»
<infinisil>
Whaaat, why that
<MichaelRaskin>
Because a large homeserver eats a lot of everything (RAM included, and RAM is often the bottleneck on not-too-costly VPS)
Ralith has joined #nixos-chat
<MichaelRaskin>
And matrix.org is so much the largest one, that many other servers do not expect to survive an influx of users
<infinisil>
Why does RAM usage increase the more users registered with your domain?
<MichaelRaskin>
(and the protocol is still under-specified and sometimes mis-specified, so synapse is the only server that is expected to federate with matrix.org)
<MichaelRaskin>
Well, if they do something…
<infinisil>
Ah, each server is responsible for handling its users communications?
<infinisil>
(I don't really know matrix well)
<MichaelRaskin>
Each server is responsible for keeping track of all history in all the rooms where its users are present
<Ralith>
tilpner: added latency to matrix users is a fairly minor price to pay at the moment, at least
<MichaelRaskin>
(at least since the moment they joined, and possible earlier if enabled in the room settings)
<Ralith>
also memory use is not really sensitive to number of users, moreso number of rooms
<infinisil>
MichaelRaskin: And saving history uses RAM?
<MichaelRaskin>
Then again, join Matrix-HQ, see RAM use hit 2GiB
<tilpner>
Ralith - It is bearable, but not something I'm willing to give up weechat for
<Ralith>
saving history uses (embarrassingly large amounts of) disk
<MichaelRaskin>
I am almost sure that the implementation of something about the history model is not done optimally — in algorithmic sense, not «Python is not good enough»
<tilpner>
(And I would love to replace this weechat+ZNC setup, it just shouldn't be a downgrade)
<Ralith>
accessing/manipulating room state uses RAM, and size of room state is proportional to number of joined users
worldofpeace has joined #nixos-chat
<infinisil>
I really don't get why a chat server would use a lot of RAM. Processing a couple text messages doesn't seem like that should use a lot :/
<MichaelRaskin>
But their model is rich enough that simple things are no longer simple to store, and the mainline implementation of finding out the current state of the room eats RAM
<infinisil>
Oh right, it's not only text
<Ralith>
MichaelRaskin: apparently python 2 is sufficiently bad that updating to python 3 gives a dramatic performance improvement all on its own
<Ralith>
also things got massively better for me when I tuned the cache size so it wasn't hitting disk so much
<MichaelRaskin>
The problem is, of course, that even if the room is actually text-only, you still have something that is more complicated than a, say, XMPP stream dump
<MichaelRaskin>
(which is already complicated, but usually survivably so)
<Ralith>
infinisil: I think the real source of complexity is the fact that there is no central authority for room state/history
<Ralith>
so it has to reconcile and authenticate everything
<MichaelRaskin>
If only authenticate…
<infinisil>
Well darn, I hope they can fix these problems, and I hope it's not embedded in how the protocol works
<MichaelRaskin>
It has an interesting and expensive to run and apparently not-yet-optimised model of reconciling
<Ralith>
it's "only" 1GB or so of memory use for me these days, with lots of big IRC rooms
<MichaelRaskin>
No problem if it is embedded in how the protocol works, actually. Nobody trusts the spec and everybody expects breaking changes anyway
<infinisil>
I really like how Pijul did it, to state the goal of having low algorithmic complexity
<Ralith>
supposedly the big go rewrite they're working on will have asymptotic improvements in disk storage (and presumably bandwidth) requirements
<infinisil>
MichaelRaskin: Ah well then :D
<MichaelRaskin>
The sad part of this Go rewrite is that there are some deviations from the Spec that are the same in the Python implementation and the Go implementation but not yet described in the Spec…
<Ralith>
working multidevice and shared history and gitter+irc bridging are all pretty delightful and make up for the jank, imo
<MichaelRaskin>
Frankly, I wouldn't really care about RAM if there was a way for a server to recover the state of the connected rooms after being offline.
<Ralith>
huh?
<MichaelRaskin>
Then I would set up some tunneling and run the actual server instance for large rooms on my laptop — normally I don't exhaust all the RAM, and if I do something that makes me care about all the RAM use, I probably don't want to chat in #nixos/#matrix right now
<Ralith>
it recovers from downtime just fine, but startup is slow at the best of times
<MichaelRaskin>
I think there is something about the recovery only starting after the first incoming message, and possibly there is also exponential back-off
<sphalerite>
This server from 2004 takes a long time to boot.
<sphalerite>
and by "boot" I mean for the BIOS to hand over to GRUB
<Ralith>
POST, then?
<sphalerite>
nah I don't think it's the self-test so much as initialising various devices
<sphalerite>
sequentially of course because this is 2004 :D
<sphalerite>
gchristensen: said server fails to netboot because it doesn't have enough RAM :(
<gchristensen>
:(
__monty__ has joined #nixos-chat
<elvishjerricco>
Somewhat crazy idea: Add a store argument to `builtins.derivation`, allowing nix expressions to specify what store to build in. An expression could specify a chroot store to get a private store readable only by the current user. This way one expression could build both your system config and a secret files config.
lopsided98 has quit [Quit: Disconnected]
lopsided98 has joined #nixos-chat
worldofpeace has quit [Remote host closed the connection]
<gchristensen>
what is the general guidance for a healthy load average?
<gchristensen>
like 1.5xcores?
<joepie91>
gchristensen: amount of cores; higher than that means you're backed up
<joepie91>
(afaik)
<samueldr>
guidance is it went out of the window when storage became faster AFAICT
<andi->
IIRC the way you used to do that was a few more builds than you have jobs since disks were a lot slower.. that probably doesn't apply anymore?
<andi->
so I'd stick to number of cores + N (N< cores/2)..
<gchristensen>
you sure it isn't a "higher is better" thing? b/c I'd be doing great
<andi->
that's the result of buildCores == maxJobs and thus multiplying being >= cores^2
<andi->
missing a ? at the end there
<gchristensen>
:)
<gchristensen>
close, 2*cores*cores
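The arithmetic behind this exchange can be sketched as follows. This is only an illustration: the function names are hypothetical, the 8-core machine is made up, and which of the two settings is doubled to reach 2*cores*cores is an assumption, since the chat does not say.

```python
def worst_case_threads(max_jobs, build_cores):
    """Upper bound on busy threads: every job uses all of its build cores."""
    return max_jobs * build_cores


def rule_of_thumb_ceiling(cores):
    """andi-'s suggestion: cores + N, with N strictly less than cores / 2."""
    largest_n = (cores - 1) // 2  # largest integer N with N < cores / 2
    return cores + largest_n


cores = 8  # made-up machine size

# buildCores == maxJobs == cores gives cores^2.
print(worst_case_threads(cores, cores))

# Assumption: one setting is twice the core count, giving 2*cores*cores.
print(worst_case_threads(2 * cores, cores))

# Compare with the "number of cores + N" guidance.
print(rule_of_thumb_ceiling(cores))
```

The gap between the worst-case figure and the rule-of-thumb ceiling is what makes a fixed jobs-times-cores setting prone to overload.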
<andi->
I am not always so sure the amount of load we have on hydra builds is great... it seems to make those VM tests very flaky.. a more intelligent "adaptive" jobs per build scheduling would be nice but probably hard to get right. Maybe chromium would still finish within a day! :D
<manveru>
won't use them anyway at those prices... but it's nice
lopsided98 has quit [Quit: Disconnected]
lopsided98 has joined #nixos-chat
<andi->
and they expect cloud-init support.. I am starting to have hard feelings about that issue where people don't want our minimal image to have that... We should at least publish an image with support for that... Maybe I should comment there and actually search for it again :/
<manveru>
:)
<manveru>
i think it was because cloud-init has a huge closure?
<manveru>
like... not fitting in an iso kinda big
<andi->
It was 60MiB more and 700 MiB closure.. That's kind of bad.. If we decide to reimplement it what feature set? Completely different? Etc..
<manveru>
or we just add another iso?
<andi->
On the other hand I think minimal + cloud init at around 400MiB would still be good enough..
__monty__ has quit [Quit: leaving]
<manveru>
i'd be fine with a custom minimal iso that doesn't have usb, dos/ntfs/fat stuff, manual, w3m :)
<manveru>
i mean, all you need in there is ... nix and being able to boot, everything else is waste
<manveru>
though i think now nix needs git/curl/gzip/zip etc and doesn't include them in its closure
<manveru>
i'm having fun this week setting up my own CI and build cluster
<andi->
Well you want an opensshd for the rare cases or the use cases where that might be desirable. You do want a proper udev & USB support for loading data from the cloud-init too however it is provided in that instance..
<andi->
You should ship tools to mount, format, repair common file systems.. :/
<manveru>
:D
<manveru>
well, if you can't get those from the network, sure
<manveru>
someone should make a genetic algorithm that sets up the minimum viable iso for each provider...
<andi->
Isn't that what we have with the specific nix files per provider? I also have that for the OpenNebula hosts at the office but their networking is funky and needs some fine tuning from my side before that's useful for others.