<dhess> I've also had issues with the 11.1 and 11.2 updates on 3 different machines. I suspect the /Nix volume
<dhess> and possibly MDM. Tomorrow I should be able to confirm whether the /Nix volume is related.
<dhess> no reinstall necessary, though; the Macs boot into recovery, I run Disk Utility, First Aid on the Macintosh HD volume, and reboot.
<dhess> then the installer proceeds and finishes normally.
supersandro2000 has quit [Disconnected by services]
supersandro2000 has joined #nix-darwin
<abathur> mine's been flaky all around, though most of my experience with this was back in beta
<abathur> the time machine interface is weird, some clean installs fail, some updates fail, etc.
fiddlerwoaroof has joined #nix-darwin
<thefloweringash> every machine I know of running big sur and nix (an x86_64 mbp, a dtk, an m1 mac mini, an m1 mba) has some failure during updates
<thefloweringash> during the beta the progress bar would just hang and would be fine after a reset. release versions seem to boot to recovery, from which you can just reboot back to macos
<thefloweringash> no MDM on any of these
<abathur> yeah, I'm still waiting on big sur
<abathur> I have a decent bootstrap process, but I'd like to get the installer PR in first, and then test my setup to fix any breaks against big sur
<abathur> and then I should be able to plan on a fresh install and recovering productivity within an hour or two
<abathur> I've been tempted to get an M1, though tbh I'm a bit miffed that I'm somewhat sure I'm starting to see the keyboard problems even on this (early) 2020 mba
<abathur> I still need to take the time to airdust it out and make sure it's not just suffering more dust from being virtually stationary/open for most of a pandemic
philr has quit [Ping timeout: 246 seconds]
stephank7 has joined #nix-darwin
stephank7 is now known as stephank
copumpkin has joined #nix-darwin
LnL has quit [Ping timeout: 260 seconds]
stephank has quit [Quit: Ping timeout (120 seconds)]
stephank has joined #nix-darwin
LnL has joined #nix-darwin
cptrbn has quit [Quit: Textual IRC Client: www.textualapp.com]
andi- has joined #nix-darwin
philr has joined #nix-darwin
supersandro2000 has quit [Quit: The Lounge - https://thelounge.chat]
supersandro2000 has joined #nix-darwin
jhuizy has joined #nix-darwin
copumpkin has quit [Ping timeout: 240 seconds]
<dhess> OK, it's not MDM-related. But so far, 4 of 4 machines with a /Nix volume have booted into firmware Recovery Mode after updating from 11.1 to 11.2.
__monty__ has joined #nix-darwin
__monty__ has quit [Ping timeout: 256 seconds]
<stephank> Ah, good to know. I also have a /src volume, so not sure if it’s _any_ volume in fstab, or just something specific to nix or nix-darwin.
<stephank> Fwiw, I also could not reinstall over the existing macOS install. I had to wipe the system volume group, but could then reinstall with my other volumes intact. So mere presence of extra volumes is not an issue. (No surprise.)
<stephank> I can’t imagine a bad launchd entry would cause this, but maybe having /etc/zshrc or /etc/shells on another volume is an issue. Guessing here.
<dhess> stephank: I *also* have a src volume on 2 of the 4 machines, but not on all. One of those 2 machines is MDM, the other is not. So the Nix volume is, so far, the common factor.
<dhess> stephank: however, see my earlier comments from yesterday -- I do not actually need to reinstall in any of these situations. It suffices to launch Disk Utility from Recovery Mode, run First Aid on Macintosh HD, and then reboot. At that point, the installer picks up where it left off and finishes just fine.
<dhess> (However, if I do not run First Aid, and simply try to reboot, I end up back in Recovery Mode.)
<LnL> euh, what's all this?
<abathur> I suppose it would be nice if we could figure out what actually causes it on the off chance it's something we can just not-do :]
<andi-> This has added significantly to my disliking as a first-time macos user :D I did blame apple for most of the failure until I recalled the notes on encrypted volumes and booting with them..
eraserhd2 has joined #nix-darwin
eraserhd has quit [Ping timeout: 272 seconds]
<abathur> yeah
<andi-> slightly off-topic but is anyone else seeing iterm2 hangs now after the 11.2 update?
<andi-> It drives me nuts to have the terminal (my main application..) freeze on me
<abathur> and I suppose it'll be common for new users who don't have experience with a Nix volume working just fine on 10.15 to blame us, here
<andi-> Is there some way to figure out *why* the upgrades are getting stuck?
<andi-> The Nix volume is essentially like any other encrypted volume, no?
<andi-> So that would be a bug with MacOS and encrypted volumes
<abathur> if/once we can nail it down, maybe helpful to throw as many OBVIOUSLY RELATED reports through feedback assistant as possible to make sure it zoomies up their radar
<abathur> I'm not sure encryption is necessary
<andi-> so just the mountpoint?
copumpkin has joined #nix-darwin
<abathur> dhess: are all of yours encrypted? I feel like I've seen this on my test system whether I was testing FV or not
<abathur> unfortunately, I wasn't keeping meticulous notes while testing because it was beta, and I didn't know others were seeing it
copumpkin has quit [Client Quit]
<abathur> so I filed feedbacks, but didn't bother keeping an obsessive log
<abathur> and re: seeing what fails; there may be an update log somewhere? I
copumpkin has joined #nix-darwin
<andi-> Is there an early system boot log that I can still look at now? I've seen a bunch of things in that console app but nothing looked like that would help
<stephank> The installer does write a `<data volume>/.install-failure.log`, which I set aside. But skimming through it was unhelpful.
<abathur> I've never gone looking/debugging
<abathur> <3 stephank
<{^_^}> stephank's karma got increased to 1
<dhess> abathur: I have FileVault enabled on all of these systems, but only on one of them is the Nix volume encrypted with its own separate key. On the others, I rely on the T2 hardware to encrypt/lock the volume with the key used for FileVault on the main volume group.
<dhess> abathur: in other words, FileVault is on and all of these are T2 machines; but regardless of whether the Nix volume is separately encrypted or not, I see this issue.
<stephank> I wonder now if First Aid does more than just filesystem-level stuff. Otherwise seems weird that First Aid fixes / the installer botches the filesystem somehow specifically when Nix is present.
<dhess> LnL: it seems that many people (all?) with a Nix volume experience System Upgrade failures since 11.0 (i.e., 11.0 -> 11.1, and 11.1 -> 11.2).
<dhess> In all cases, the system reboots into the firmware's Recovery Mode at some point during the upgrade.
<dhess> In my case, at least, this is recoverable by running Disk Utility from Recovery Mode and doing First Aid on Macintosh HD (or whatever you've named your system volume). In other cases, people have reported needing to reinstall.
<abathur> dhess: ok, I think that roughly squares with my perception of it; I guess with a minimal reliable repro I could try with no fv on my spare system to confirm; it's currently on 11.2 and filevaulted but iirc rolled back to a pre-nix snapshot and manually deleted the extra volume
<dhess> FWIW in my case, Recovery Mode did tell me that I needed to reinstall; but that was not actually required.
<dhess> abathur: cool
<abathur> anyone know if there's a simple way to trigger it without the update process, or do I gotta reinstall 11.1 and then update? :)
<dhess> I have only ever seen this during System Update.
<dhess> Do you have VMware Fusion?
<abathur> I feel like I saw it some during recovery reinstalls in beta, but was too naive to keep notes
<dhess> Huh, I just realized that Apple never asked for their Apple Silicon development machines back.
<dhess> DTK or whatever it was called.
<stephank> I did read somewhere that it was a contract for a year? (I don't have one, so don't know, not sure that's NDA.)
<andi-> dhess: didn't you have to buy them?
<dhess> andi-: it was clear that it was not something you got to keep.
<dhess> that much I remember.
<dhess> it did cost $$ to get one, if that's what you mean. But it was a license, not a purchase.
<abathur> iirc you had to put up $500 and yeah, not get to keep it
<andi-> ouch
<dhess> I'm hoping they'll give you a Mac mini in exchange when you return it.
<dhess> I believe they did that with the Intel developer kits.
<andi-> I wonder why they want it back. Are those more open / less restrictive? Or do they simply not want some non-standard hardware out in the wild? (I wouldn't want to support one-off snowflakes for years)
<dhess> Probably the latter.
<abathur> this probably won't make sense w/o examples, but snooping around in /system/volumes/update/ logs makes me want to advocate for using <prefix>-<word> for the positive version of any serious event that gets logged when the negative version is usually just non-<word>
<abathur> so, the inverse of non-fatal should be like, yes-fatal, or totally-fatal...
<abathur> so that the important cases are as easy to find as the unimportant ones :/
<LnL> dhess: any ideas if this is nix installer or nix-darwin specific? the latter kind of sounds more plausible to me
<abathur> I am a less-reliable report, but since I'm just doing these to test the installer update, I've only been installing Nix and not nix-darwin
<dhess> LnL: in my case, all of the machines use nix-darwin. I guess it could be something related to launchd scripts.
<dhess> or changes made to /etc maybe?
<dhess> Those have always made me nervous :)
<LnL> yeah /etc is the thing I'd say could perhaps cause problems
<LnL> other stuff might break after the update, but I don't see how it could impact the upgrade itself
philr has quit [Ping timeout: 256 seconds]
<dhess> Strangely enough, since 11.x, my Nix/nix-darwin install doesn't seem to break after upgrades like it used to.
<dhess> I still run a re-activation script out of caution, but I've noticed that my nix profile is in the path, at least, after an upgrade.
<andi-> I am pretty sure it is just Nix as I haven't even touched nix-darwin yet
<LnL> that's just bizarre, are upgrades broken with synthetic.conf or something
<andi-> My /etc/synthetic.conf just contains `nix`
<LnL> yeah that's all it does
<LnL> but it's the only special os thing we use
<andi-> How would I start a MacOS VM on 11.0/11.1 to test the update in there?
<andi-> now that I do not need weird MacOS on Linux hacks...
<LnL> I'm not aware of something nicer than the qemu dance
<dhess> There is VMware Fusion and Parallels, of course. But is it even possible to get the 11.0 or 11.1 installers, anymore?
<LnL> hmm doesn't look like it, there are multiple versions of 10.15.7 and earlier but just one for 11.x
<abathur> trying to re-use my beta installer usb to avoid having to make something new, forgot what a disaster the recovery on it is
<abathur> had to boot into regular recovery to erase the volume group because the one on the USB hangs during that process :p
<abathur> not certain this'll work, but hopefully it'll get me back to ~11.0
<dhess> I looked around for 11.0 or 11.1 on my Apple Developer account; no dice.
<abathur> yeah
<abathur> meh, just hit an install error; giving it another try but if this doesn't work I may have to drop back to catalina and see if updating from there triggers
<abathur> unless it's maybe possible to extract something smart from an unofficial source like GH's 11.0 CI
<abathur> 2nd try did get me back on 11.0 beta 20A5354i
<abathur> anyone happen to know -- if I take a local snapshot with `tmutil localsnapshot`, then upgrade to 11.2, whether I'll be able to roll back cleanly to pre-update?
<abathur> answer: no; after updating to 11.2 and rolling back, the system was still at 11.2
<abathur> :/
philr has joined #nix-darwin
__monty__ has joined #nix-darwin
__monty__ has quit [Client Quit]
<abathur> wonder if I can use asr for this
<abathur> there is a mention in https://eclecticlight.co/2020/09/14/will-big-sur-support-the-cloning-of-system-volumes/ about creating a bootable clone in a separate container, but I don't know if the time to reinstall in a new container would actually give me a leg up over just brute forcing this
<abathur> but looking at ~1h reinstalls to get back to 11.0, and it was likewise at least an hour to upgrade from scratch (a big slice of that is the download; if I can find a decent rollback procedure that'll include the SSV I could download that update before taking the snapshot; otherwise I guess I can just download it on another system and transfer)