<gchristensen>
Equinix Metal asked me yesterday to review our account for any machines we're not really using. There were probably a half dozen machines or so that I expected their spot market algorithm would have taken back by now, so I released those, plus a couple of ARM machines that I think we could get back without trouble in the future. Just an FYI
<lukegb>
👍
<gchristensen>
they've also asked if we could make our infra more dynamic, placing and revoking spot bids based on demand
<jonringer>
gchristensen: before the 21.11 release, we should probably come up with a solution to sunsetting binary caches
<gchristensen>
oh?
<jonringer>
sunsetting packages that are in EOL nixpkgs
<jonringer>
like really old unstable, and EOL releases
<gchristensen>
oh, like pruning the cache?
<gchristensen>
or packages' nix expressions in the nixpkgs.git repository
<jonringer>
yea, IIRC, we have retained everything since the cache was started
<jonringer>
the s3 bucket
<gchristensen>
ah
<jonringer>
"really old unstable" being like 18months
<gchristensen>
not sure we need to wait (or schedule it) for 21.11, unless there is something about 21.11 you're thinking about?
<jonringer>
nothing in particular, but coming up with a reasonable solution for doing this in the present and future I assume would take some time
<samueldr>
what is the goal?
<gchristensen>
gotcha
<gchristensen>
yeah, it is going to be hard
<jonringer>
The goal is just to be better stewards of resources, and if we wanted, we could dedicate that storage in other ways
<jonringer>
I was thinking of something like, we retain the past two releases (3 during the one month transition period), and X months of unstable
<gchristensen>
so retaining old releases wouldn't even be a problem exactly
<gchristensen>
we could keep the newest channel version of every release and prune everything between and keep a lot of history, but lose a lot of other builds we probably don't need
<jonringer>
how to organize this, I have no idea.
<gchristensen>
my bet is we'll need a lot of ram
<jonringer>
Can always download more :)
<gchristensen>
create a map of all the store paths in the cache and their dependencies, register GC roots, and then delete the oldest unrooted paths
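[Editor's note: the pruning approach gchristensen outlines here is essentially mark-and-sweep garbage collection. A minimal sketch, using illustrative toy data rather than real narinfo metadata, and with `reachable`/`prune_candidates` being hypothetical helper names:]

```python
# Sketch of the pruning idea above: map each cached store path to its
# references, mark everything reachable from the registered GC roots,
# then sweep the unrooted paths, oldest first. Toy data only; a real
# run would build the reference map from the cache's .narinfo files.

def reachable(roots, references):
    """Mark phase: all paths transitively referenced by any root."""
    seen = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references.get(path, ()))
    return seen

def prune_candidates(references, roots, ages):
    """Sweep phase: unrooted paths, oldest (largest age) first."""
    live = reachable(roots, references)
    dead = [p for p in references if p not in live]
    return sorted(dead, key=lambda p: ages[p], reverse=True)

# 'app' is rooted and depends on 'glibc'; 'old-build' is unrooted.
refs = {"app": ["glibc"], "glibc": [], "old-build": ["glibc"]}
ages = {"app": 10, "glibc": 500, "old-build": 900}
print(prune_candidates(refs, roots=["app"], ages=ages))  # ['old-build']
```

[The RAM concern above comes from holding that full path-to-references map in memory at the scale of the whole cache.]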
<samueldr>
old channels still being cached has been shown as an example of... not sure what, but an example that NixOS is... healthy? useful? not a toy?
<samueldr>
also been used in some examples to find older binaries quickly if you need them
<gchristensen>
yeah, but also it is useful for archaeology :)
<samueldr>
e.g. you need an old firefox
<gchristensen>
or postgres
<samueldr>
meh, no one ever needs an old postgres
<samueldr>
;)
<jonringer>
yea, it's nice as a "wow, look at these old things". But people shouldn't be encouraged to use those, as they likely have CVEs and other issues
<samueldr>
they're not encouraged to use those, though, are they?
<samueldr>
without the cache it still would work (hopefully) by building, but building is expensive
<samueldr>
what I aim to say is other than "cleaning up the old stuff" I see no advantage in cleaning up the old stuff
<jonringer>
Well, I've seen some blog posts about dumpster diving through old releases
<gchristensen>
one advantage is cost
<jonringer>
anyway, I don't think having an "infinitely increasing cache" is sustainable on principle
<gchristensen>
if the foundation had to pay for the cache tomorrow, we'd be okay for a few months but we'd need to prune
<samueldr>
yeah, that's part of "cleaning up the old stuff" for me
<gchristensen>
yea
<gchristensen>
I wouldn't want to delete all the old history
<gchristensen>
but there are a lot of artifacts we could easily lose and nobody would ever know
<samueldr>
yeah
<gchristensen>
but back to the main point
<gchristensen>
it would be really good to be ready and able to clean up the cache in a controlled way, so we don't get into a scenario where we feel like we have to "just do it now"
<samueldr>
yes, planning, being ready for that is good
<gchristensen>
a related topic is our rate of growth appears to be increasing
<gchristensen>
not unexpected: aarch64 builds for both linux and darwin, plus more packages
<lukegb>
I asked the question before but if we end up with "bad bits" in the binary cache it would be nice to have a written-down process for obliterating them
<gchristensen>
yeah
<lukegb>
(bad bits here being things like software that has a do-not-distribute-ever license that ends up being built and cached due to... whatever reason)
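[Editor's note: an obliteration process like the one lukegb asks for ultimately has to delete two objects per store path from the cache bucket: the `.narinfo` metadata file and the compressed NAR it points at. A minimal sketch, assuming the cache.nixos.org narinfo layout (`<storehash>.narinfo` with a `URL:` field naming the NAR key); CDN invalidation and re-signing are out of scope, and the narinfo text below is illustrative:]

```python
# Given a store path's hash and its .narinfo contents, compute the
# bucket keys that would need deleting to obliterate the path: the
# narinfo itself plus the NAR object its URL field points at.

def keys_to_delete(store_hash, narinfo_text):
    nar_url = None
    for line in narinfo_text.splitlines():
        if line.startswith("URL:"):
            nar_url = line.split(":", 1)[1].strip()
    if nar_url is None:
        raise ValueError("narinfo has no URL field")
    return [f"{store_hash}.narinfo", nar_url]

example = """StorePath: /nix/store/abc123-bad-package
URL: nar/0c4fFAKEHASH.nar.xz
Compression: xz
"""
print(keys_to_delete("abc123", example))
# ['abc123.narinfo', 'nar/0c4fFAKEHASH.nar.xz']
```

[A written-down process would wrap a check like this in review and dry-run steps, which is where the poka-yoke idea below comes in.]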
<gchristensen>
yup
<gchristensen>
ideally a process with poka yoke :P
<gchristensen>
("oops scaled S3 down by a factor of ten via typo" errors and whatnot ...)
<jonringer>
lol
<jonringer>
anyway, that's part of the reason why I brought up the 21.11 release: if we did want to have "buckets" that can be phased out altogether with related packages, then that would be the time to implement such a conversion
<MichaelRaskin>
Given … the internet it would surely be nice if FODs mostly survived
<jonringer>
as with other jobsets, not a lot of the staging-next packages are super relevant after it's been merged to master
<MichaelRaskin>
(But indeed, too many binary builds retained)
<samueldr>
would be nice to have a "project" to scan all revisions of Nixpkgs for FODs URLs and pipe them back into archive.org or something like that
<gchristensen>
MichaelRaskin: agreed, I think those can be largely identified from the narinfo luckily
<jonringer>
nix show-derivation differentiates between eval drvs and FOD drvs
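[Editor's note: the distinction jonringer mentions is visible in the JSON that `nix show-derivation` emits: a fixed-output derivation's output declares an expected hash, while an ordinary derivation's output does not. A minimal sketch over hand-written dicts shaped like that JSON (trimmed and illustrative, not real output):]

```python
# Classify a derivation as fixed-output (FOD) by checking whether any
# of its outputs declares an expected hash, as FOD outputs do in
# `nix show-derivation`'s JSON on recent Nix versions.

def is_fixed_output(drv):
    """True if any output carries hashAlgo/hash fields (i.e. a FOD)."""
    return any("hash" in out and "hashAlgo" in out
               for out in drv.get("outputs", {}).values())

fod = {"outputs": {"out": {"path": "/nix/store/example-src",
                           "hashAlgo": "sha256", "hash": "deadbeef"}}}
plain = {"outputs": {"out": {"path": "/nix/store/example-hello"}}}
print(is_fixed_output(fod), is_fixed_output(plain))  # True False
```

[A scan like samueldr's proposed "project" could use this test to pick out FOD sources worth forwarding to archive.org or Software Heritage.]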
<gchristensen>
samueldr: I think they're going to software heritage these days
<samueldr>
something like that I guess :)
<samueldr>
and why not both? can't have our eggs in one basket!
<gchristensen>
:)
<MichaelRaskin>
About binaries — after one year, first + last + highest-success-rate-of-each-month for each Hydra jobset sounds like a lot, but also much less than now
<MichaelRaskin>
But a ton of Hydra DB crunching…
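[Editor's note: MichaelRaskin's retention heuristic can be sketched as a selection over evaluation records. The tuples below are illustrative, not the real Hydra DB schema, and `select_retained` is a hypothetical helper:]

```python
# Per jobset and month, keep the first evaluation, the last, and the
# one with the best build success rate; everything else in the bucket
# is a pruning candidate.
from collections import defaultdict

def select_retained(evals):
    """evals: iterable of (jobset, month, day, success_rate) tuples."""
    buckets = defaultdict(list)
    for e in evals:
        buckets[(e[0], e[1])].append(e)
    keep = set()
    for bucket in buckets.values():
        keep.add(min(bucket, key=lambda e: e[2]))  # first of the month
        keep.add(max(bucket, key=lambda e: e[2]))  # last of the month
        keep.add(max(bucket, key=lambda e: e[3]))  # best success rate
    return keep

evals = [("trunk", "2020-01", 1, 0.90),
         ("trunk", "2020-01", 10, 0.99),
         ("trunk", "2020-01", 20, 0.80),
         ("trunk", "2020-01", 30, 0.95)]
retained = select_retained(evals)
print(sorted(retained))  # days 1, 10, and 30 kept; day 20 pruned
```

[The expensive part, as noted above, is producing those records from the Hydra database in the first place, and the oldest logs are missing from it entirely.]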
<samueldr>
yeah, multiple "stories", use cases, for the question about collecting the cache
<lukegb>
Hydra DB crunching isn't too bad since it can be done offline
<gchristensen>
one caveat is hydra's db doesn't have *all* the logs
<MichaelRaskin>
Oops
<gchristensen>
most, though :)
<gchristensen>
a good 80% solution
<MichaelRaskin>
Are the missing bits the oldest?
<gchristensen>
yea
<samueldr>
would be nice to gather some info, e.g. unrooted paths
<gchristensen>
I can provide a manifest of everything in the cache
<roberth>
I'm experiencing a slow https://cache.nixos.org, both from my home in NL and a server in DE
<roberth>
hmm that was a bit euro-centric; Netherlands and Germany
<roberth>
78.63 MiB download is taking minutes
<roberth>
it's still "going", still 55 MiB after restarting `nix-build`
<roberth>
this is on an idle server in a data center