<colemickens>
been using that for months with no issue. rebuilds, no issue. today, built some of my config on the aarch64 box and copied it back and activated it and now mongo weirdness.
<samueldr>
hm, couldn't say
<colemickens>
(despite how that sounds, I still assume I'm making a mistake)
<samueldr>
(sorry if you were hopeful in a more in-depth answer)
<colemickens>
next step is I guess disabling it, clean store, rebuild from pi only? just to eliminate that possibility, though I agree with your earlier analysis...
<samueldr>
(I mostly wanted to get the question out in the open)
<colemickens>
no, it's okay, I understood that. I was just putting it out there :)
<colemickens>
oh, it's definitely not the build, I should've expected.
<colemickens>
the build I'm pulling is a cached nixpkgs version so it's coming from ... probably one of this box's siblings.
<colemickens>
I've had a week of weird/interesting-in-a-bad-way issues. :|
<samueldr>
maybe its sibling outputted cortex-a96 instructions though
<samueldr>
it *is* an issue
<colemickens>
I don't think my machine ever built it though, I'm sure my machine or it substituted anyway too.
<colemickens>
oh!
<colemickens>
I'm pulling in another nixpkgs and not inherit'ing nixpkgs.
<Irenes[m]>
this is our first attempt to build any nixos-mobile code
<Irenes[m]>
we found the directions okay but currently it's complaining about not being able to find lib.lazyAttrsOf, which looks like a nixos version mismatch to us
<samueldr>
that's exactly it :)
<Irenes[m]>
our host system is nixos 19.09 (stable), is there a different version we should be building against?
<samueldr>
yes, give me ~2-3 minutes until my machine stabilizes again
<Irenes[m]>
hah, thanks, no worries
<samueldr>
just built a huge image and it's been swapping
<Irenes[m]>
yeah for sure
<samueldr>
Irenes[m]: are you familiar with NIX_PATH?
<samueldr>
(this'll tell me at what abstraction level to start)
<samueldr>
the most practical way to build against another nixpkgs revision, in my opinion, is to do a git clone / checkout of the desired revision somewhere, and set NIX_PATH accordingly
<Irenes[m]>
yes, absolutely
<Irenes[m]>
we know how to set it to a git checkout of the upstream repo
<Irenes[m]>
so this needs a particular revision, not one of the named channels?
<Irenes[m]>
we normally do it with -I actually because we think that's a bit cleaner than using environment variables, but as we understand it, both should work here
<samueldr>
well, nixos-unstable should work, but you never know, regressions do happen :)
<samueldr>
it tells which nixpkgs revision it built for, either in changes on the summary page, or in inputs
<samueldr>
it could happen that the nixpkgs revision didn't change so it would be missing from the "Changes" section
<samueldr>
and yes, using -I should work too
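(A minimal sketch of the two approaches mentioned above, pointing a build at a local nixpkgs checkout either with `-I` or through `NIX_PATH`. The checkout path and the helper names are hypothetical, and the functions only assemble the command line and environment; they do not run anything.)

```python
import os
from typing import Dict, List, Optional, Tuple


def build_with_flag(checkout: str, args: List[str]) -> List[str]:
    """Command line that passes the checkout via nix-build's -I option."""
    return ["nix-build", "-I", f"nixpkgs={checkout}", *args]


def build_with_env(
    checkout: str,
    args: List[str],
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Command line plus an environment where NIX_PATH points at the checkout.

    Note: this overwrites NIX_PATH entirely, so any other entries the caller
    had in NIX_PATH are dropped (an assumption made to keep the sketch short).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NIX_PATH"] = f"nixpkgs={checkout}"
    return ["nix-build", *args], env


if __name__ == "__main__":
    checkout = "/path/to/nixpkgs"  # hypothetical git checkout of the wanted revision
    print(build_with_flag(checkout, []))
    cmd, env = build_with_env(checkout, [], base_env={})
    print(cmd, env["NIX_PATH"])
```

(Either result could be handed to `subprocess.run`, e.g. `subprocess.run(cmd, env=env, check=True)`, on a machine that has Nix installed.)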
<Irenes[m]>
makes sense, thanks!
<samueldr>
do tell if you face other issues
<Irenes[m]>
absolutely, will do
<Irenes[m]>
it's gonna take a while to do the steps, especially since we'd also like to get our build machine updated before we try
<samueldr>
I'll preface quickly that the system built without using an example will look like it is hanging at the "Mobile NixOS" splash
<samueldr>
this is because the default configuration builds no special configuration for the running system
<Irenes[m]>
haha, good to know
<Irenes[m]>
because there's no desktop environment installed by default?
<samueldr>
pretty much
<Irenes[m]>
makes sense
<samueldr>
and it's not exactly something that can be changed, since (still to be documented/figured out) that configuration has to work in such a way that you can import it into your running system's configuration
<samueldr>
so it can't really do anything that may conflict with a user's configuration
<samueldr>
and another annoying truth is that without native compilation, examples/demo doesn't build, but that can be side-stepped by building only the system image (rootfs) once and re-using it for now
<Irenes[m]>
no worries. we know how to configure this stuff :)
<Irenes[m]>
our host machine is aarch64 so we should be able to make examples/demo work hopefully
<samueldr>
I actually just found a neat usage pattern so a user can re-use a pre-built system.img in a complete build, will document it shortly
<samueldr>
ah, then you're pretty much set
<Irenes[m]>
yeah
<samueldr>
I *think* firefox may need to be built, if it hasn't been built since last time another contributor looked
<ashkitten>
i still don't understand what changed that made its compile times spike above 10 hours on hydra
<Irenes[m]>
we've been moving a lot slower on it than we'd hoped to, but we're working on tooling surrounding the A64's signed boot feature. this is our first foray into bootloader development and it's been hard to focus with the pandemic and everything.
<Irenes[m]>
so that's why we have this build setup heh
<ashkitten>
but yes, it looks like firefox needs to be built
<samueldr>
ashkitten: bad luck scheduling on high contention machines
<ashkitten>
fair enough
<samueldr>
I'm not perfectly sure though
<ashkitten>
maybe the timeout should be increased for that build then?
<ashkitten>
looks like it's normally ~8 hours
<ashkitten>
i don't know how that would be done with hydra though
<samueldr>
maybe, but my assumption is that if it's contention, the builds are not simply just a bit longer, but like really longer
<ashkitten>
hm maybe
<samueldr>
so at that point it kinda becomes wasteful to keep going on the resource-starved machine, if my assumption holds
<Irenes[m]>
so this sounds like a capacity-planning issue. where do I donate ;)
<samueldr>
(this copy operation is killing me, I'm about to test the "final" image on the eMMC of the pinephone)
<Irenes[m]>
although presumably it's worth checking that assumption first
<samueldr>
Irenes[m]: if you have time, and the ability, there is something that's been on hold to review scheduling in hydra to hopefully make this a non-issue
<Irenes[m]>
interesting, do tell
<samueldr>
I'm not the one who wrote up the idea, and I don't have a link to it, but the main idea is to send all jobs to a machine that does trivial builds. any build longer than e.g. 1 minute is killed and re-scheduled on a machine for less trivial builds, where there is less parallelisation and more time. the same still applies there: if e.g. after 15 minutes the build is not over, it is re-scheduled forward, until it gets a big builder with all the resources
<samueldr>
this is a simplified explanation
<ashkitten>
interesting
<Irenes[m]>
that's really cool
<samueldr>
the idea being that you don't need to "tag" big builds so they're scheduled to run on better hardware, you don't keep state of previous requirements
<Irenes[m]>
yeah. definitely better not to rely on manual tagging.
<samueldr>
and this allows builds that get better to sink back to lower levels, and builds that become harder to complete to float up
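(A minimal sketch of the escalating-timeout idea described above, not Hydra's actual scheduler. The tier names are hypothetical, the 1 and 15 minute limits are the illustrative figures from the explanation, and it assumes for simplicity that a build takes the same time on every tier.)

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    time_limit_min: Optional[float]  # None means the tier never kills a build


# Hypothetical tiers, ordered from most parallel/least patient to the big builder.
TIERS: List[Tier] = [
    Tier("trivial", 1),
    Tier("medium", 15),
    Tier("big", None),
]


def schedule(build_minutes: float, tiers: List[Tier] = TIERS) -> Tuple[str, float]:
    """Return (tier where the build finished, total machine minutes spent).

    Every build starts on the first tier. If it exceeds that tier's limit it
    is killed and restarted from scratch on the next tier, so the minutes
    spent on killed attempts are counted as well.
    """
    spent = 0.0
    for tier in tiers:
        if tier.time_limit_min is None or build_minutes <= tier.time_limit_min:
            return tier.name, spent + build_minutes
        spent += tier.time_limit_min  # killed at the limit, escalate
    raise RuntimeError("no tier could finish the build")


if __name__ == "__main__":
    for minutes in (0.5, 10, 480):
        print(minutes, schedule(minutes))
```

(No build is tagged in advance and no history is kept: a build that gets faster simply finishes on an earlier tier next time, and one that gets slower floats up, at the cost of the time wasted on the killed attempts.)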
<Irenes[m]>
as an ex-Googler I will say that I have seen lots of situations where, if the build tool has some firm limit, very often downstream developers don't really know how to adjust that and you wind up with code that is carefully sized to just barely nudge that limit without going over. and it then becomes really hard for newcomers to make changes, since the way things break is totally opaque.
<samueldr>
sounds about right
<Irenes[m]>
so I hope there's thought being put into making sure that it's clear to people what's happened, when things go over the 15-minute limit
<Irenes[m]>
it sounds like there's a kind of hidden limit right now anyway, and this is just picking an arbitrary threshold to make it visible
<samueldr>
the example limits I explained are to illustrate, it would be an abstraction at the build farm that should be left invisible elsewhere
<Irenes[m]>
makes sense. good, as long as there's documentation somewhere :)
<samueldr>
the limit currently seen is the 10-hour build limit for any build on hydra, and only hydra
<samueldr>
nix itself doesn't have that limitation
<Irenes[m]>
I haven't admined a Hydra instance before, although I was meaning to look into that anyway, so I don't know how well-suited I am to help. but I'd love to look into it.
<ashkitten>
is there a way to tell hydra to retry the build, and maybe it'll complete this time?
<samueldr>
ashkitten: yes there is
<ashkitten>
even if we can't fix the root issue rn
<Irenes[m]>
any idea where the discussion of this has been happening?
<samueldr>
ashkitten: just restarted it
<ashkitten>
thank you
<samueldr>
Irenes[m]: I don't know if there is a discussion happening
<Irenes[m]>
hm. any idea who to talk to about it? :)
<Irenes[m]>
or is it you
<samueldr>
that's another issue with build failures in hydra, they can be resolved by something else at some point but that doesn't end up showing in other previously failed builds :)
<samueldr>
Irenes[m]: g//christensen was the one that proposed this at some point in the past, #nixos-dev may be a place to get in touch about that
<ashkitten>
oh
<ashkitten>
well i guess it's built now :p
<samueldr>
though, in that specific case I think that it wasn't built as part of a job, but as a discrete step of another job, so it just ended up being cached because of transitivity
<ashkitten>
did that randomly happen to finish just now?
<samueldr>
though if you go to the "Build steps" tab it's clearly been built there :)
<samueldr>
meanwhile that long cp finished :)
<Irenes[m]>
samueldr: thanks! I'll follow up in #nixos-dev then
<samueldr>
(I hate SD cards)
<Irenes[m]>
yeah it's kind of shocking to me how slow they are sometimes
<Irenes[m]>
hope your eMMC test works :)
<samueldr>
gadget mode doesn't work on my pinephone, so I can't use jumpdrive :(
<Irenes[m]>
ah, yeah
<Irenes[m]>
I've thought about trying to make FEL boot work, with a full stage-2 wrapped into it somehow - presumably a second ramdisk image that the stage-1 would have to mount
<Irenes[m]>
doesn't seem worth it for the slight convenience though
<Irenes[m]>
since only devs would really care
<Irenes[m]>
(FEL being the Allwinner SoC's USB boot protocol)
<samueldr>
hmm, forgot about FEL, should trigger it to see if it shows up, but I figure it won't if it's a hardware issue
<Irenes[m]>
I haven't tried to use FEL on the pinephone yet, so no idea
<Irenes[m]>
it would be good to know whether it's even an option
<samueldr>
I would have to wait until someone confirms it's supposed to work first
<Irenes[m]>
I'd kind of prefer if it isn't, tbh, for security reasons :)
<Irenes[m]>
but I suspect it is