<andi->
Nice April Fools' question on the mailing list m(
taktoa has quit [Remote host closed the connection]
zybell has joined #nixos-chat
jtojnar has quit [Read error: Connection reset by peer]
jtojnar has joined #nixos-chat
<zimbatm>
the FHS troll?
<zimbatm>
related to FHS, maybe we should add an FHS check in the fixup phase if they have an "installable" flag. It would avoid unnecessary conflicts when building user profiles.
zybell has quit [Ping timeout: 240 seconds]
zybell_ has joined #nixos-chat
<infinisil>
(Dropping in from #nixos-borg to continue discussing version control systems with MichaelRaskin)
<infinisil>
MichaelRaskin: Is git messier than I think?
tilpner has joined #nixos-chat
<samueldr>
s/git/*/
<MichaelRaskin>
Well, git has that DAG of commits, which it writes to disk in an unsafe way, and everything on top is a mess
<infinisil>
Yeah I think it's better how monotone and apparently also Fossil use databases instead of files
<zybell_>
unsafe? what do you call unsafe
<MichaelRaskin>
Well, you can use files safely, just use explicit write barriers
<zybell_>
?
<infinisil>
While files are easy to handle on Linux, databases are a better abstraction for most cases
<MichaelRaskin>
zybell_: git writes files in such a way that can cause corruption if power is cut.
<MichaelRaskin>
For example, on ext4 filesystem with default settings
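The corruption MichaelRaskin describes comes from renaming a new file into place before its data has reached the disk; with ext4's default delayed allocation, a power cut can then leave an empty or torn file. A minimal sketch of the safe ordering (the helper name is hypothetical, not git's actual code path):

```python
import os

def atomic_write(path, data):
    """Write data so that after a crash the file is either the old
    version or the new one, never a torn mix: the classic
    write-temp / fsync / rename / fsync-dir pattern."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # barrier: data is durable before the rename
    os.rename(tmp, path)      # rename is atomic on POSIX filesystems
    dirfd = os.open(os.path.dirname(path) or ".", os.O_DIRECTORY)
    try:
        os.fsync(dirfd)       # make the rename itself durable
    finally:
        os.close(dirfd)
```

Skipping either fsync reintroduces exactly the ordering problem being discussed: the rename can hit the disk before the file's contents do.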
<infinisil>
imo. And files are easier to handle, but that's probably only because it's the only thing we know and is well established
<zybell_>
If files are immutable and checked by hash, you can check that a file is not corrupted.
<simpson>
infinisil: To a point, sure. The problem is that everything still has to write to The Filesystem in the end, and that makes some stuff unhappy. For example, I figured out how to corrupt a Fossil DB reliably by putting it in a 'magic' auto-synced network folder. It'd be great if that sort of thing *did* reliably work.
<MichaelRaskin>
zybell_: git doesn't write files in the way you have just described
<infinisil>
simpson: How did that work?
<MichaelRaskin>
simpson: well, does this syncing setup guarantee anything about fsync?
<simpson>
infinisil, MichaelRaskin: SQLite was surprised that the network folder had been written to by another SQLite on the other side of the network.
<zybell_>
That's on read; a corrupted file is as if it wasn't there, and is eventually gc'd.
<simpson>
The really fun part is that sometimes when it happened, the write came from *inside the localhost* whooooo whooooo~
<simpson>
I never root-caused it, I just stopped putting Fossil DBs in magic folders.
<infinisil>
Maybe the filesystem itself should be a database!
<MichaelRaskin>
Well, there are write barriers and there is fsync
<infinisil>
Files, aka strings of bytes, aren't flexible at all
<infinisil>
as far as i know
<MichaelRaskin>
Of course, network filesystems often fail to provide fsync guarantees (never mind locking guarantees), and FS-using programs often fail to use anything to communicate ordering requirements
<simpson>
MichaelRaskin: There's also Fossil's workdirs and extra dotfiles and several piles of TH1 written to less-than-SQLite's standards. I learned what I feel is the appropriate lesson, and I also reported it as a bug on the magic-folder feature, since that's the likeliest culprit.
<MichaelRaskin>
Fossil tries to be a bit too fancy with its accounting, in my opinion
<MichaelRaskin>
I prefer Monotone, where fewer things are outside of the main database
<MichaelRaskin>
(but it is a completely non-moving project)
<simpson>
infinisil: It's funny that you say that. Most of my side-business stuff is written to treat Tahoe-LAFS as a database and to do all of its work using patterns which avoid or tolerate write collisions.
<simpson>
The biggest reason was that containers don't have filesystems, so abstracting the FS away made containerizing easier.
<simpson>
I mean, containers have filesystems. But they don't have, y'know, filesystems.
<infinisil>
Umm yes yes
<zybell_>
If I write the expected hash into the filename, then I can write the content in any order; only when the content reaches the expected (and predicted) bit pattern does the file magically appear in the eyes of a reader that checks the hash.
<infinisil>
simpson: (I really don't get what you're saying, I don't have any experience with Tahoe-LAFS and only little with containers)
<simpson>
Actually, I wanna retell zooko's story of the ancient toasters. Suppose that one day we discover a lost civilization's ancient toasters. They are magical and wonderful and make the best toast ever, to the point where even if you're not utilitarian, you really want these toasters integrated into society somehow. There's only one tiny problem...
<MichaelRaskin>
zybell_: then you discover that this is bad for performance
<zybell_>
So I can be lazy with write checks, because of this magic atomicity.
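zybell_'s scheme is essentially content addressing, as in git's object store: the filename commits to the content, so verification happens on read, and a torn write just looks like a missing object. A rough sketch (function names are hypothetical):

```python
import hashlib
import os

def store_object(dirpath, data):
    """Content-addressed store: the filename *is* the SHA-256 of the
    content, so a file only 'exists' once its bytes match its name."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(dirpath, digest)
    with open(path, "wb") as f:
        f.write(data)  # bytes may land in any order; readers verify below
    return digest

def load_object(dirpath, digest):
    """Return the object only if its content hashes to its name; a
    partial or corrupted write is treated as if the file were absent."""
    path = os.path.join(dirpath, digest)
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    return data if hashlib.sha256(data).hexdigest() == digest else None
```

This is the "magic atomicity": no fsync ordering is needed for correctness, only for durability, because a reader can never observe a half-written object as valid.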
<simpson>
...they blow up sometimes. Not always! But sometimes.
* infinisil
listens closely
<simpson>
zooko argues that if they only blow up once in a billion times that we try to make toast, then they might become popular in society. One-in-a-quadrillion is even worse; they'd be household objects.
<simpson>
But if they're one-in-a-hundred, then society will realize the danger pretty quickly and build special toast-containment domes for safely producing toast.
<zybell_>
Huh? Why is this bad for performance? Without fsync?
<simpson>
zooko was arguing that we should, if we can't *prove* that write collisions are impossible, make our systems have write collisions pretty often, and design everything to be fault-tolerant around that.
<MichaelRaskin>
zybell_: because FS provides much better throughput for reading and writing large contiguous blocks
<simpson>
zybell_: And to add to MichaelRaskin's awesome phrasing, this is because the hardware itself usually reads and writes in large contiguous blocks.
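The contiguous-blocks point can be sketched as two layouts for the same data: one file per object (which allocators tend to scatter, turning reads into seeks) versus everything packed into one file, as git's packfiles do. Helper names are hypothetical, and the throughput difference only shows up on a cold cache and real hardware, not in this small equality check:

```python
import os

def write_small(dirpath, chunks):
    """One file per object: the filesystem is likely to scatter these,
    leaving slack for future growth, so reading them all back means
    many seeks."""
    for i, c in enumerate(chunks):
        with open(os.path.join(dirpath, f"{i:05d}"), "wb") as f:
            f.write(c)

def write_packed(path, chunks):
    """All objects concatenated into one file: a single large write is
    likely to land as one contiguous extent and read back sequentially."""
    with open(path, "wb") as f:
        for c in chunks:
            f.write(c)
```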
<infinisil>
simpson: Pretty smart move by zooko
<simpson>
infinisil: Yeah, he's a great philosopher. Everybody gives him shit these days for starting an altcoin though. I dunno. I think the parable stands on its own; I think I actually heard it thirdhand through warner.
<MichaelRaskin>
Oh, what hardware does is another sad story.
<zybell_>
The hash must be checked anyway because of cryptographic guarantees. And large contiguous blocks are included in the phrase 'any order'.
<MichaelRaskin>
A different sad story for each generation of storage technology
<MichaelRaskin>
zybell_: if you have a single large file, you are likely to get a contiguous read on the hardware level. If you have a lot of small files, they are likely to be scattered on purpose to provide space for their future growth,
<MichaelRaskin>
because filesystems do not currently provide a way to commit to having a file that can never grow beyond some size
<zybell_>
I didn't say and didn't mean a random order. And you get the single large file when the objects are packed. Sensibly, this is done only when you have a backup in the form of unpacked objects, because the atomicity doesn't work with packed objects.
<MichaelRaskin>
Random order comes from the FS allocation strategies
<infinisil>
While it's really early in its development, the idea behind it is really nice
zybell_ has joined #nixos-chat
<zybell_>
Don't know if this came through
<zybell_>
You can fallocate() the size before writing, or even ftruncate() the file to the needed size (ftruncate() works upwards too).
<zybell_>
If the FS honors such requests by modifying the allocation strategy, it may be used more often if properly documented.
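Python's os module exposes both calls zybell_ mentions. A minimal sketch of reserving a file's final size up front, so the filesystem can pick one contiguous extent instead of leaving slack for future growth (os.posix_fallocate is POSIX/Linux and is not available on every platform):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "blob.bin")
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
try:
    size = 1 << 20                    # intended final size: 1 MiB
    os.posix_fallocate(fd, 0, size)   # reserve the blocks now
    # os.ftruncate(fd, size) would also set the size (it does work
    # upwards), but sparsely, without actually reserving blocks
    os.write(fd, b"header")           # then fill the file in any order
finally:
    os.close(fd)
```

Whether this actually yields a contiguous extent is up to the filesystem's allocator, as MichaelRaskin notes next; the call is a hint about the final size, not a placement guarantee.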
<zybell_>
Sent again
<infinisil>
(It did not come through indeed)
<MichaelRaskin>
zybell_: well, there are cases when applications ftruncate, then grow the file on the next use. So FS doesn't pack such allocations too densely.
<MichaelRaskin>
Also, each file gets an integer number of allocation units.