gchristensen changed the topic of #nixos-dev to: NixOS Development (#nixos for questions) | https://hydra.nixos.org/jobset/nixos/trunk-combined https://channels.nix.gsc.io/graph.html | 18.03 release managers: fpletz and vcunat | https://logs.nix.samueldr.com/nixos-dev
Sonarpulse has quit [Ping timeout: 245 seconds]
phreedom has quit [Remote host closed the connection]
phreedom has joined #nixos-dev
layus has quit [Quit: ZNC 1.6.5 - http://znc.in]
layus has joined #nixos-dev
lassulus_ has joined #nixos-dev
lassulus has quit [Ping timeout: 240 seconds]
lassulus_ is now known as lassulus
vdemeester` has quit [Ping timeout: 260 seconds]
sorear has quit [Ping timeout: 256 seconds]
gleber_ has quit [Ping timeout: 260 seconds]
cbarrett has quit [Ping timeout: 268 seconds]
elvishjerricco has quit [Ping timeout: 256 seconds]
cstrahan_ has quit [Ping timeout: 240 seconds]
terrorjack has quit [Ping timeout: 276 seconds]
mbrock_ has quit [Ping timeout: 276 seconds]
pauldub has quit [Ping timeout: 256 seconds]
taktoa[c] has quit [Ping timeout: 256 seconds]
ocharles_ has quit [Ping timeout: 256 seconds]
angerman has quit [Ping timeout: 256 seconds]
thoughtpolice has quit [Ping timeout: 276 seconds]
zimbatm has quit [Ping timeout: 256 seconds]
ghuntley has quit [Ping timeout: 276 seconds]
manveru has quit [Ping timeout: 276 seconds]
pauldub has joined #nixos-dev
ghuntley has joined #nixos-dev
sorear has joined #nixos-dev
mbrock_ has joined #nixos-dev
ocharles_ has joined #nixos-dev
elvishjerricco has joined #nixos-dev
thoughtpolice has joined #nixos-dev
cbarrett has joined #nixos-dev
gleber_ has joined #nixos-dev
cstrahan_ has joined #nixos-dev
manveru has joined #nixos-dev
angerman has joined #nixos-dev
zimbatm has joined #nixos-dev
taktoa[c] has joined #nixos-dev
elvishjerricco has quit [Ping timeout: 256 seconds]
gleber_ has quit [Ping timeout: 256 seconds]
ocharles_ has quit [Ping timeout: 256 seconds]
cbarrett has quit [Ping timeout: 276 seconds]
pauldub has quit [Ping timeout: 256 seconds]
angerman has quit [Ping timeout: 265 seconds]
sorear has quit [Ping timeout: 256 seconds]
ghuntley has quit [Ping timeout: 256 seconds]
taktoa[c] has quit [Ping timeout: 265 seconds]
zimbatm has quit [Ping timeout: 276 seconds]
cstrahan_ has quit [Ping timeout: 256 seconds]
thoughtpolice has quit [Ping timeout: 256 seconds]
mbrock_ has quit [Ping timeout: 256 seconds]
manveru has quit [Ping timeout: 265 seconds]
terrorjack has joined #nixos-dev
pie_ has quit [Read error: Connection reset by peer]
pie_ has joined #nixos-dev
thoughtpolice has joined #nixos-dev
mbrock_ has joined #nixos-dev
MichaelRaskin has quit [Quit: MichaelRaskin]
zimbatm has joined #nixos-dev
manveru has joined #nixos-dev
ghuntley has joined #nixos-dev
vdemeester` has joined #nixos-dev
taktoa[c] has joined #nixos-dev
ocharles_ has joined #nixos-dev
angerman has joined #nixos-dev
ocharles_ has quit [Max SendQ exceeded]
sorear has joined #nixos-dev
cstrahan_ has joined #nixos-dev
cbarrett has joined #nixos-dev
ocharles_ has joined #nixos-dev
elvishjerricco has joined #nixos-dev
pauldub has joined #nixos-dev
cstrahan_ has quit [Max SendQ exceeded]
gleber_ has joined #nixos-dev
cstrahan_ has joined #nixos-dev
vdemeester` has quit [Changing host]
vdemeester` has joined #nixos-dev
jtojnar has quit [Remote host closed the connection]
<peti> It feels like "master" is in a really bad state these days. LibreOffice has been fixed and immediately broken again at least 3 times in the last month or so. I haven't been able to compile it for well over a month. Channel updates for nixos-unstable also come around only very rarely. We had to wait 2+ weeks on the last update, and that one is now 9 days old again, too, with no end in sight as plenty of Hydra
<peti> tests are still failing.
Synthetica has joined #nixos-dev
* peti wonders whether there are changes we can make to improve that situation.
jtojnar has joined #nixos-dev
<steveeJ> peti: does every PR trigger a rebuild of *all* derivations which depend on the changed code?
<peti> steveeJ: No. That's too expensive, I suppose.
<steveeJ> peti: how expensive is it to git-bisect all the time? :D
<steveeJ> peti: something in between would be to define a list of derivations which are required to be in a working state
<steveeJ> of course such a list is highly subjective but it'll be worth the effort
<steveeJ> the effort of negotiating such a list I mean
<peti> We have such a list: https://hydra.nixos.org/job/nixos/trunk-combined/tested#tabs-constituents. But it's not verified before people push their updates.
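For context, the gating job peti links to is an aggregate derivation whose constituents are other Hydra jobs; a rough sketch of the Nix side, assuming the usual releaseTools.aggregate helper (the real definition lives in nixos/release-combined.nix, and the constituent list below is an illustrative placeholder, not the actual set):

    { nixpkgs ? ./. }:
    let
      pkgs = import nixpkgs { };
    in {
      # Aggregate job: it only succeeds if every constituent job succeeds,
      # which is what the channel-advancement script checks for.
      tested = pkgs.releaseTools.aggregate {
        name = "nixos-tested";
        constituents = [
          # placeholders -- the real list references the channel tarball,
          # installer images and a large set of NixOS VM tests
        ];
        meta.description = "Gate for advancing the nixos-unstable channel";
      };
    }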
__Sander__ has joined #nixos-dev
orivej_ has joined #nixos-dev
orivej has quit [Ping timeout: 268 seconds]
<steveeJ> peti: I see, then requiring these to build and test successfully before every merge to master would do it
<srhb> steveeJ: That's not a realistic solution though.
<srhb> Not without building the actual PRs on hydra itself...
<Mic92> Sometimes I had builds that ran on my machine, but broke on hydra.
<steveeJ> srhb: is building them on hydra not an option?
<Mic92> Hydra can build pull requests as far as I know, but this is not enabled for nixpkgs for some reason.
<Mic92> We could probably increase the quality of some core packages by allowing fewer other packages, which add up to a lot of time on hydra.
<steveeJ> I don't know too much about the available infrastructure, but I've seen people trigger a bot to run tests. To prevent too much load on hydra or the bot that would run the above-mentioned tests, it could be triggered only on PR approval
<Mic92> The bots usually only test the package itself, not all of its dependencies.
<steveeJ> Mic92: which is why master is broken so often which peti would like to change :D
<Mic92> mind writing perl?
<steveeJ> very much so..
<Mic92> me neither
<steveeJ> dang language barriers
<Mic92> also, bots time out after an hour for good reason: https://github.com/NixOS/nixpkgs/pull/42288#issuecomment-398597400
<Mic92> I always build all dependencies on my machine for an update.
<steveeJ> Mic92: a per-job timeout is somewhat primitive. the timeout could be dynamic and measure pkg build times
<Mic92> But for mass-rebuilds this is not feasible
<Mic92> steveeJ: https://github.com/NixOS/ofborg feel free
<steveeJ> Mic92: btw, why did you mention perl?
<Mic92> steveeJ: hydra is written in it, ofborg is mostly rust
<steveeJ> let's rewrite hydra in Rust then :D
<Synthetica> Rewrite all the things in rust!
<steveeJ> is there a relation between ofborg and hydra?
<Mic92> steveeJ: I think there was a budget to rewrite hydra and this one: https://github.com/hercules-ci/hercules
<Mic92> ofborg was meant to replace the travis infrastructure that we previously had for testing pull requests
<Synthetica> Budget from whom?
<Mic92> I don't remember
jtojnar has quit [Ping timeout: 268 seconds]
vcunat has joined #nixos-dev
orivej_ has quit [Ping timeout: 240 seconds]
vcunat has quit [Ping timeout: 256 seconds]
orivej has joined #nixos-dev
jtojnar has joined #nixos-dev
vcunat has joined #nixos-dev
jtojnar has quit [Ping timeout: 260 seconds]
orivej has quit [Ping timeout: 245 seconds]
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 264 seconds]
taktoa has joined #nixos-dev
orivej has joined #nixos-dev
Sonarpulse has joined #nixos-dev
<gchristensen> peti: I have an idea, let's turn off r-ryantm for a while.
<gchristensen> reduce the churn and let people who are focusing on build problems actually get something done
<peti> Yes, that might help indeed. We could also batch a few dozen r-ryantm updates in a separate branch and merge them to master only *after* Hydra says they don't cause any harm. That is what I've been doing via 'haskell-updates' for a long time.
<gchristensen> yeah, ok I'll reach out to ryan
<Sonarpulse> 1 step closer to build all PRs first :D
<LnL> I like that idea, those are all 'trivial' updates
<LnL> so we can branch off the last trunk eval into package-updates and put a bunch of stuff in there, then compare to see if there are problems
<Sonarpulse> LnL: in general I'd like to separate pkg updates from refactors
<Sonarpulse> I mentioned this about staging
<Sonarpulse> but I guess it applies to more things too
* peti can set up a Hydra jobset for ryantm. The name "r-ryantm-updates" comes to mind. :-) We can point it to his personal fork, too, in case he wants to do some crazy commit re-ordering and history editing, like I routinely do.
<LnL> ack
<peti> Generally speaking, we should consider the role of our master branch, though. We've had only 1 channel update in the last 8 weeks or so. I don't think that's a good idea. People who follow "nixos-unstable" assume that they'll get security updates quicker than anyone else, but in fact it might be *weeks* until a security-related git commit actually reaches the channel.
<gchristensen> I have told everyone who will listen for the past year that nixos-unstable is slower to receive security updates than stable
<LnL> sure, that will always be the case
<peti> That's the reality, no doubt.
<peti> But it doesn't have to be.
<peti> We can create a protected "nixos-unstable" branch (or whatever name) and let Hydra build & publish that. Then we have a second jobset that builds "master". When "master" passes all Hydra tests, then we merge (automatically) to the release branch. Everything is already built, so it could go out, basically, immediately.
orivej has quit [Ping timeout: 256 seconds]
<peti> When an important security update comes up, however, then we commit it directly into the "nixos-unstable" branch and by-pass master.
<gchristensen> interesting!
<gchristensen> though, we then get the same problem with -- staging -- poor coordination. we may require more than just this, but maybe a branch-for-the-week or something
<peti> This gives us 3 kinds of releases: "nixos-x.y" is quick and stable, "nixos-unstable" is quick and unstable, and "master" is the wild west.
<gchristensen> before this, we should probably try and figure out a better coordination scheme for staging
<peti> gchristensen: Well, the acceptance criteria for "nixos-unstable" are well defined. You need "https://hydra.nixos.org/job/nixos/trunk-combined/tested#tabs-constituents" to succeed, then it can be merged. So there's no manual decision making involved.
<peti> gchristensen: The problem with staging is a different one. The problem there is that you have to merge to master just to run the tests! We don't run them for staging.
<gchristensen> let's fix it!
<ben> Is it prohibitive to block merges to master that don't pass tests?
<gchristensen> with master -> whatever -> release-nixos-unstable, the jobs will pile up on master just like they are already
<LnL> in practice that might run into merge conflicts, but it's probably fine most of the time
<gchristensen> changing faster than we can properly test everything
<vcunat> ATM it would probably be prohibitive to run the tested job on every commit
<peti> The problem remains that "master" can move so fast that more tests break every day than we can fix.
<gchristensen> yes
<peti> I am looking at you, nixos.tests.gnome3.
<ben> If it's not automatically the responsibility of the person proposing to merge something to master to ensure that the change doesn't break master, it seems hard to ensure that anyone cares enough to fix it again
<vcunat> that's the main problem with staging as well
<gchristensen> yeah
<vcunat> (but there it's made worse by the fact that builds need more time)
<gchristensen> ben: it is very very hard to ask someone to perform possibly hundreds of hours of builds locally before pressing merge on a change that by all means should be perfectly fine
<ben> I'm gonna get my drive-by PR merged, it fixes my use case but breaks the rest of the world, then I'm gonna walk away and never look at the issue tracker again
<gchristensen> actually, probably thousands of hours of CPU time
<ben> I'm definitely used to smaller integration test cycles, yeah
<vcunat> When thinking of staging workflows, if we have *one* that works nice for both of them, it will be good, for simplicity.
<peti> openSUSE handles this issue as follows. They have many separate staging branches, and all of them are built and tested by their build server. Now, contributors submit patches and the "review team" distributes those patches into separate staging areas for testing. Only after the tests of some staging-X branch succeed, the changes get into "master".
<peti> Now, sometimes changes are submitted separately but they belong together logically. In that case, they get merged into the same staging branch so that they are tested (and merged) together.
<vcunat> peti: and that's run on openbuildservice?
<gchristensen> ben: ultimately we need to have tooling and process on the receiving side to handle it
<peti> On other occasions, staging branches are split in two, because a submit request contains multiple modifications where only some of them cause trouble but others don't -- so you test them separately to speed things up.
<ben> The rust project tags PRs as "reviewed, could be merged" and automation runs tests on the merge commit and pushes to master once it passes. The author does not have to expend their own CPU time but they'll have to keep working on the PR until the automation finds a merge commit against the ever-moving master that is acceptable. afaik there is a manual process to batch changes that are unlikely to cause an
<ben> issue into a single run of the integration tests, but PRs still remain unmerged until the merge commit is confirmed to work.
<peti> vcunat: Yes. But we can do the exact same thing with Hydra. It's more about a logical structure to the branches and the build service.
<gchristensen> peti: this sounds like a good avenue to explore
<vcunat> we would need some (half-)automatic way of creating Hydra jobsets, e.g. based on branch naming
<ben> The rust thing sounds vaguely isomorphic to manually maintained staging branches...
<vcunat> Though we could start by cloning some "prototype" jobset.
<gchristensen> peti: are all changes staged like this?
<LnL> vcunat: automating hydra jobsets isn't a problem
<LnL> it just doesn't scale for every pr
<peti> vcunat: We could have build jobs called "staging-01", "staging-02", and so on that build corresponding branches. The rest is just a matter of appropriate cherry picking and merging in git.
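Hydra's declarative jobsets could create those per-branch jobsets automatically instead of someone adding them by hand; a hedged sketch, assuming the standard declarative-jobset mechanism (the field names follow that spec format, while the branch list, check interval and shares are made-up values):

    { nixpkgs, declInput }:
    let
      pkgs = import nixpkgs { };
      # One jobset per staging branch, all building the same release expression.
      mkStagingJobset = branch: {
        enabled = 1;
        hidden = false;
        description = "builds the ${branch} branch";
        nixexprinput = "nixpkgs";
        nixexprpath = "nixos/release-combined.nix";
        checkinterval = 3600;
        schedulingshares = 50;
        enableemail = false;
        emailoverride = "";
        keepnr = 3;
        inputs.nixpkgs = {
          type = "git";
          value = "https://github.com/NixOS/nixpkgs.git ${branch}";
          emailresponsible = false;
        };
      };
      branches = [ "staging-01" "staging-02" ];
    in {
      # The ".jobsets" job builds a JSON spec that Hydra reads to (re)create jobsets.
      jobsets = pkgs.writeText "spec.json" (builtins.toJSON
        (builtins.listToAttrs (map (b: { name = b; value = mkStagingJobset b; }) branches)));
    }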
<vcunat> peti: I assume suse use a similar process even for the enterprise versions?
<gchristensen> peti: and policies around *who* can do *what* and *when
<vcunat> branches named by people might also work
<vcunat> (the person responsible for that branch)
<ben> Without backpressure against merges to master based on the brokenness of master I don't see how you can hope to keep up with the rate of changes without burning out the people who feel obliged to actually return things to a working state
<gchristensen> how is this different from putting every PR in to a jobset?
<gchristensen> ben +1 totally agreed
<ben> (and subsequent contributors branching off a likely-broken master rather than a known-good state is probably also not ideal)
<peti> vcunat, gchristensen: In the commercial distribution, every change is verified *manually* by the QA team. This puts a tight upper limit on how many things can change in any given week. :-) There is a strict procedure, though, on what kind of changes can go in and which ones cannot. Generally speaking, it's a different setting because the code base is not supposed to change a lot. That's more comparable to
<peti> our release branches than to "master".
<gchristensen> ofborg has helped a lot with that, I think, and has exposed more issues :P
<ben> sorry for the drive-by opinions, I'll shush now :)
<gchristensen> ben: yeah. its good stuff, you're not wrong
<gchristensen> like, our current process did pretty well when we were merging ~700prs/mo
<gchristensen> now that we're consistently over 1k it is no longer serving us well
<peti> gchristensen: Yes, every single change is staged like that in openSUSE.
<vcunat> Right, with stable branches the situation is easier. I think our workflow for those is OK for now.
* peti thinks that merging semi-automatic updates only after some kind of testing would already go a long way to improve matters.
<gchristensen> I agree
<gchristensen> something Ryan may even be able to add to his existing testing
<vcunat> Well we might better move his testing ideas to us and apply them to all PRs (and perhaps all changes).
<gchristensen> of course
<vcunat> Ryan could be just a non-testing bot "filing PRs".
<gchristensen> but we can start with one place and then do both
<vcunat> +1
<gchristensen> a much cheaper experiment to say to Ryan, make sure X test passes before you open a PR... if the bleeding slows down, we move that to ofborg or policy or whatever
<peti> Also, our life would be *much easier* if Hydra had more resources available. Right now, the feedback loop is too slow. When I commit a fix for issue A to master, then it takes easily 2-3 days until Hydra knows whether that fix actually worked. By then, chances are that issue B has popped up meanwhile. This would not happen if we had a cycle time of, say, 4 hours.
<gchristensen> yes
* peti is sure that funding could be secured to improve the build farm.
<gchristensen> peti: maybe we should chat in private? :)
<vcunat> I added two boxes to Hydra this week :-)
<peti> I sent you messages on Wire anyway because of that other issue. Let's take it there.
<peti> gchristensen: ^^^
<gchristensen> oh cool
<gchristensen> I closed Wire. *goes to open it again*
<vcunat> But overall power on Hydra certainly could improve things significantly.
<ikwildrpepper> if you think we are in need of extra build machines, we can also add some hetzner machines (using foundation money)
<vcunat> And converting some VM tests to containers instead.
* peti recently stumbled across https://cachix.org/. This is also a nice idea to reduce compile load. It's a shame that it does not quite address the issue of trust. That needs to be figured out on top.
<ikwildrpepper> (or increase the number of spot instances)
<vcunat> ikwildrpepper: either is relatively expensive over long term, isn't it?
<gchristensen> I'd rather find funding instead of spending foundation money
<ikwildrpepper> gchristensen: well, that's basically what the money of the foundation is meant for
<gchristensen> right
<ikwildrpepper> and we have at least a significant buffer atm
* peti thinks it would be great if we could run the "nixos-unstable" tests more quickly. This is our central means of quality assurance, but in the last couple of days feedback there has been slowish. I suppose converting (some) tests from VM to containers could go a long way to mitigate that issue, too.
<gchristensen> +1
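For a sense of what such a conversion would save: each of those tests currently boots one or more QEMU virtual machines. A minimal sketch in the style used under nixos/tests at the time (Perl test driver; this is not an actual test from the tree, just an illustration of the shape):

    import ./make-test.nix ({ pkgs, ... }: {
      name = "hello-smoke-test";
      machine = { pkgs, ... }: {
        environment.systemPackages = [ pkgs.hello ];
      };
      # The driver boots a full VM before this script even starts,
      # which is where most of the wall-clock time goes.
      testScript = ''
        $machine->waitForUnit("multi-user.target");
        $machine->succeed("hello");
      '';
    })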
jtojnar has joined #nixos-dev
<gchristensen> niksnut: what is the load like on the hydra master?
<niksnut> 14:01:27 up 2 days, 2:17, 2 users, load average: 4.49, 3.68, 3.60
<gchristensen> what does top report for `wa` / io wait?
<niksnut> 0.0
<gchristensen> are those surprise eyes or no wait?
<gchristensen> ^.^
<ikwildrpepper> don't think niksnut does emojis
<gchristensen> I know :D
<niksnut> I do twitch emotes though :p
<niksnut> but yeah hydra doesn't scale
<niksnut> maybe we can replace it by AwsStore
{^_^} has joined #nixos-dev
<niksnut> so you would do nix-build --store aws://... release.nix
<gchristensen> O.o
<niksnut> which would fire off some AWS Batch job to build the missing derivations
<gchristensen> my god
<niksnut> so Amazon would spin up / shut down the necessary build machines
<ikwildrpepper> niksnut: yeah, but you'd need some central process to prevent derivations from being built multiple times, right?
<ikwildrpepper> any idea how that could work nicely?
<niksnut> some locking via dynamodb probably
<ikwildrpepper> also, it'll be hard to make it work with requiredSystemFeatures
<gchristensen> what exactly about hydra doesn't scale?
<niksnut> also, optimistic concurrency, since we don't really care if two machines build the same derivation occasionally
<gchristensen> b/c I've ideas that don't involve committing to AWS like that
<niksnut> also, we'd create 1 aws batch job per derivation
<ikwildrpepper> niksnut: but in case of stdenv change, wouldn't that likely cause gcc etc to be built N times (where N is the maximum concurrency we use)
<niksnut> yeah, it would be limited to x86-linux
<niksnut> ikwildrpepper: no, each job would correspond to one derivation
<ikwildrpepper> one job per derivation would work
<ikwildrpepper> but you would only need to post a job once its dependencies have been built
<ikwildrpepper> so you still need a separate process to coordinate ?
<niksnut> aws batch jobs can have dependencies
<niksnut> unfortunately, last time I looked at it, you could only have 20 dependencies per job
<ikwildrpepper> niksnut: not sure if aws can handle such big graphs :)
<niksnut> anyway, it doesn't have to be using aws batch
<niksnut> you could also have a process that pulls jobs from dynamodb or something
<gchristensen> can we go back to "<gchristensen> what exactly about hydra doesn't scale?" ? :)
<niksnut> that could also support non-aws machines
<niksnut> gchristensen: it has a single central server
<gchristensen> agreed. now, let's not throw the baby out with the bathwater here. hydra does a lot of stuff that isn't just replaced by a new fancy backend store
ciil has quit [Quit: Lost terminal]
ciil has joined #nixos-dev
<gchristensen> the interface, while not especially humane as it is now, is really important. all the accoutrement that makes it a usable service -- restarting jobs, viewing logs, etc.
<gchristensen> I think a major reason efforts to replace hydra fall over is thinking too grandly about initial plans, and not thinking more incrementally
<gchristensen> "Come to my talk, on 28th of June in London, nix-build -j 296: ofborg, and what even is Hydra?" https://www.meetup.com/NixOS-London/events/251792988/
* gchristensen says, to the author of Hydra
taktoa has quit [Remote host closed the connection]
<vcunat> For starters I expect it might help to upload from builders to S3 directly, instead of through the central machine. That's slightly harder on permissions/security, though.
<vcunat> (I'm partially guessing at the major bottleneck. Eelco will probably know more/better.)
<vcunat> I would hope that with that change we could scale to a multiple.
<gchristensen> that would require exposing the signing key to the builders, which is understandably a thing to consider given that would mean you and I would have to have it. however, practically speaking, by virtue of being able to build things and control the builder, it doesn't mean a whole lot.
<vcunat> The builder controls what gets signed.
<gchristensen> right
<vcunat> Each builder (or location) could have its own key, too.
<vcunat> (theoretically scalable to a PKI-like scheme)
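A hedged sketch of what per-builder keys plus direct uploads could look like on a single builder, written as NixOS configuration (this assumes a Nix version with post-build-hook support; the bucket, region and key paths are invented):

    { config, pkgs, ... }:
    {
      nix.extraOptions = ''
        # Each builder signs with its own key...
        secret-key-files = /etc/nix/builder-1.secret
        # ...and pushes its outputs straight to the cache after every build.
        post-build-hook = /etc/nix/upload-to-cache.sh
      '';

      environment.etc."nix/upload-to-cache.sh" = {
        mode = "0555";
        text = ''
          #!/bin/sh
          set -eu
          # OUT_PATHS is provided by Nix to post-build hooks.
          exec nix copy --to 's3://example-nix-cache?region=eu-west-1' $OUT_PATHS
        '';
      };
    }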
<gchristensen> I have experimentally used rabbitmq as a bus to distribute one-message-per-drv to builders, which each built and then uploaded to the cache
<gchristensen> that experiment used a central process to control when jobs were sent to rabbitmq based on when its dependencies were satisfied, but this is not a fundamental requirement: builders could have sent a message to a "completed" queue, and all builders could watch that queue and keep track of which dependencies it is still waiting for
<vcunat> In general I would prefer to avoid significant vendor lock-in (like Amazon, if practicable).
<gchristensen> me too
<vcunat> And hosted/cloud builders probably won't be money-efficient over the long term. I got the 4-cores for roughly half a year's price of hetzner's "cloud" 4-cores...
<vcunat> (on the other hand a comparably performant CDN will be harder to build on our own)
<gchristensen> we'll want to be able to easily work in hardware provided by generous companies, too
<vcunat> Yes. I actually don't know if trusted computation is feasible unless you "trust the physical location", but it's similar to trusting hetzner.
<gchristensen> well yes you can't trust the computer if you can't trust the location
<gchristensen> it is why DCs have policies and certifications asserting they have and follow policies
<gchristensen> I wouldn't trust WidgetCorp to provide hw, but I would trust DatacenterAndServerManagementCorp since it is their business
<vcunat> I can imagine a company physically donating an older server that they replace by a new one. My employer (a non-profit) got some HW this way, I think.
<ikwildrpepper> one problem with buying hardware is maintaining it, and it needs a datacenter
<gchristensen> and >1 person who can drive to it and replace the hard drive or whahtever
<ikwildrpepper> yeah
<ikwildrpepper> basically, hardware is painful
<gchristensen> I hardly like going down to my basement to fix my hw :P
<ikwildrpepper> hehe
<vcunat> yes, that is a problem
<vcunat> but for builders we don't really need >99% uptime, etc.
<vcunat> quantity/price seems to matter more than the usual datacenter tradeoff
<copumpkin> congrats niksnut :)
<Profpatsch> niksnut: Congratulations! Meanwhile, /me fails at the programming challenge.
ciil has quit [Quit: Lost terminal]
ciil has joined #nixos-dev
<gchristensen> Profpatsch: :( :(
<Profpatsch> gchristensen: Let’s see, maybe you can bribe Mathieu. :P
<Profpatsch> tbh I’m implementing Dijkstra with mutable vectors and am not sure whether that was intended.
<gchristensen> not sure we should talk about it publicly tbh
Sonarpulse has quit [Ping timeout: 245 seconds]
<Profpatsch> hehe, I didn’t want to go into it, yes.
<__Sander__> hehe
<Profpatsch> But there is a light at the end of the tunnel (and it’s not a train)
<__Sander__> insider information in the hiring process :D
<__Sander__> anyway
* __Sander__ started a new job two months ago
<Profpatsch> Oh, nice, where do you work?
<__Sander__> http://mendix.com
* ikwildrpepper also has a new job, since october :|
<ikwildrpepper> sorry, trolling again
<__Sander__> hehe
Synthetica has quit [Quit: Connection closed for inactivity]
<niksnut> peti: I got an approval request for the docker registry to access our github repo, what is that for?
<vcunat> I'd expect it to say what permissions they require.
<vcunat> I hope I'm not missing anything important, around those congratulations. (no idea what they're about)
<ikwildrpepper> gchristensen: I thought the congratulations were because niksnut is finally active on twittag?
<ikwildrpepper> -g+h
<vcunat> Ah, thanks :-)
<vcunat> A single post probably doesn't yet count as "active".
<peti> niksnut: hub.docker.io can automatically re-build this nixos/nix docker image every time the repo changes on github. To do that, docker.io needs read access to the repo, though.
<gchristensen> peti: afai remember GitHub will not create a read-only access integration, but a minimum of a read-write access
<niksnut> hm, wouldn't it be better to update the image as part of the release script?
<vcunat> It seems better to rebuild the image on every commit, so problems are located sooner.
<peti> niksnut: That would work, too, no doubt. I don't think that it would be better though. At least I don't see any concrete advantages.
<vcunat> (that's from CI point of view - not for actual publicly used images)
<niksnut> peti: presumably we want the image to provide the latest released version, not master
<vcunat> I'd think that doing both is best.
<niksnut> I don't really see a reason for providing master
<peti> niksnut: The latest release version has a tag that people can use if they want to. The latest build from master is tagged "latest".
<gchristensen> we could _build_ the image in Hydra automatically
<gchristensen> as part of regular builds
<niksnut> we're not really eating our own dogfood if we're not building the image in Nix
<LnL> my image doesn't depend on the docker daemon except for the last part that initialises the db
<LnL> and that could use runAsRoot from dockerTools
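A minimal sketch of what building that image with dockerTools rather than "docker build ." could look like (the package set and tag are illustrative, and the real nixos/nix image would also need the store-db initialisation LnL mentions, e.g. via runAsRoot):

    { pkgs ? import <nixpkgs> { } }:

    pkgs.dockerTools.buildImage {
      name = "nixos/nix";
      tag = "latest";
      # Everything listed here ends up in the image's filesystem layer.
      contents = [ pkgs.nix pkgs.bashInteractive pkgs.coreutils pkgs.cacert ];
      config = {
        Cmd = [ "${pkgs.bashInteractive}/bin/bash" ];
        Env = [ "NIX_SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt" ];
      };
    }

The result is a tarball that "docker load" accepts, so Hydra could publish it as an ordinary build product.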
Sonarpulse has joined #nixos-dev
<peti> niksnut: We are re-inventing the wheel for no good reason if we don't use that service. hub.docker.io exists precisely so that you have a central registry for images. What is the point of building the image in some other way? The end result is going to be exactly the same.
<gchristensen> because we're a build system ^.^
<peti> The docker image that comes out of calling "docker build ." is no shinier just because Nix executed that command.
<gchristensen> Nix doesn't execute it
<gchristensen> Nix makes the tarballs `docker build .` would have made, and it is a major feature that a lot of users like a lot
orivej has joined #nixos-dev
__Sander__ has quit [Quit: Konversation terminated!]
<peti> gchristensen: "docker build ." doesn't make tarballs.
<gchristensen> it does, you just don't see it :) each layer is a tarball and inside is a metadata file, and a filesystem tarball
<LnL> we're talking about the docker image layers here, not the nix tarball
<LnL> those can be loaded into the docker daemon with docker load
* peti thinks that *not* re-using this existing service is just plain stupid. We can set up an automated build on hub.docker.io in 30 seconds and provide reliable, reusable docker images with Nix inside to everyone.
orivej has quit [Ping timeout: 265 seconds]
<simpson> peti: FWIW I think it's remarkably stupid that docker.io can't *read* a *public* repo without bothering people.
<peti> simpson: I suppose they don't want to busy poll.
<gchristensen> the issue is the commit hooks to be notified of builds
<gchristensen> which is dumb, because they could just provide the URL to post to
<peti> gchristensen: They do. You can also configure a manual web-hook POST trigger, if you want.
<gchristensen> nice!
<gchristensen> that seems much better
<vcunat> I certainly wouldn't be afraid of the github integration for docker-hub
matthewbauer has joined #nixos-dev
matthewbauer has quit [Write error: Connection reset by peer]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Ping timeout: 240 seconds]
<peti> niksnut: What do you gain by stopping me from setting up this service? It's not like having A prevents you somehow from doing B anyway. If you want a Nix-built docker image too, then just have it! Where is the downside to using the automation now that already exists?
matthewbauer has joined #nixos-dev
<gchristensen> with the "Integration" style, does it require write access to repositories? I believe it does
<gchristensen> on that scenario, I'm -1
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
<vcunat> Such issues might be worked around by creating a mirror repo, but that's probably cumbersome.
matthewbauer has quit [Ping timeout: 256 seconds]
<domenkozar> afaik it only needs read access
<domenkozar> which seems fine to me
matthewbauer has joined #nixos-dev
<gchristensen> it depends on how you connected your github account to docker hub, and the nixos org isn't able to ensure you selected one or the other
<gchristensen> and actually, I guess since it is asking Eelco for permission, it implies the more liberal permission grant
<gchristensen> indeed, yes, that is what is happening
matthewbauer has quit [Ping timeout: 240 seconds]
matthewbauer has joined #nixos-dev
<domenkozar> yeah apps need to opt-in for github apps now
matthewbauer has quit [Read error: Connection reset by peer]
<domenkozar> that takes 30% for the fact that apps can do fine grained permissions
matthewbauer has joined #nixos-dev
<domenkozar> niksnut: does hydra use abstract store for pushing new builds? I'm asking if the http binary cache in Nix 2.0 would work with Hydra
<niksnut> domenkozar: yes, should work
<domenkozar> :O
<domenkozar> that means hydra could support cachix soon
<domenkozar> m'kay
<domenkozar> niksnut: thanks!
<vcunat> I guess you don't mean Hydra.nixos.org
<domenkozar> hydra as software
<vcunat> (cache.nixos.org seems enough for that)
<vcunat> +1
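If that abstract-store support means Hydra simply takes a Nix store URI for its destination store, the NixOS side would be a small configuration change; a hedged sketch (option names per the hydra NixOS module; the URI and key path are invented, and whether this works with an arbitrary external cache is exactly the question asked above):

    { ... }:
    {
      services.hydra = {
        enable = true;
        hydraURL = "https://hydra.example.org";
        notificationSender = "hydra@example.org";
        # Point Hydra's destination store at an external binary cache
        # instead of the local /nix/store.
        extraConfig = ''
          store_uri = s3://example-nix-cache?secret-key=/etc/nix/cache-key.secret
        '';
      };
    }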
matthewbauer has quit [Ping timeout: 268 seconds]
MichaelRaskin has joined #nixos-dev
matthewbauer has joined #nixos-dev
<LnL> domenkozar: I've been wondering if it would be useful for ofborg, enabling sharing builds across machines
<domenkozar> if we got cache.nixos.org and ofborg on cachix, they'd reuse nar uploads :-)
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
<vcunat> good for ofborg
<vcunat> at least in future when there are really multiple slaves per platform
<domenkozar> yeah :)
<LnL> vcunat: hmm?
<vcunat> LnL: right now it reports 32 slaves for aarch64 (single machine AFAIK), 2 slaves for darwin and one for x86
<LnL> aarch is the only platform with a single physical builder
<LnL> even the evaluator runs on multiple machines now IIRC
<vcunat> this isn't really relevant to evaluators
<vcunat> (caching of builds)
<LnL> sure, my point is that the other platforms are distributed
<vcunat> not much ATM apparently
<LnL> it's usually 3 linux 2 darwin
<LnL> but yes, a cache only becomes important when we start building more stuff
matthewbauer has quit [Read error: Connection reset by peer]
<MichaelRaskin> 3 Linux meaning 2 physical boxes, though. Even 4 Linux might be just 3+1
<LnL> oh?
<vcunat> one aarch64 machine is counted as 32 ATM
matthewbauer has joined #nixos-dev
<vcunat> (as an example)
<MichaelRaskin> Yes. When my builder is up (not right now), I usually run 2 or 3 builders.
<LnL> why? it doesn't start builds with -j1
<vcunat> the point is to run multiple separate PRs
<vcunat> (i.e. separate nix invocations)
<MichaelRaskin> Yes, so that a single slow one doesn't block a ton of small ones
<MichaelRaskin> Also, tell me more about configure -j8
<LnL> sure for the aarch machine it makes sense, but unless you have a crazy linux box it won't really do much
<MichaelRaskin> Parallel configure does make sense with 16GiB of RAM and most requests being small
<LnL> I'm talking about --option max-jobs not cores here
<MichaelRaskin> ofborg doesn't do parallel requests
<MichaelRaskin> A lot of requests build just one path, or a linear sequence of paths
<LnL> not so sure about that, but maybe
<LnL> anyway, I kind of expected there to be a bit more builders by now
<vcunat> they don't seem to be overloaded, at least ATM
<vcunat> (so now I was adding slaves to hydra.nixos.org instead)
<MichaelRaskin> I guess we have an equilibrium: even a single amd64 box from vcunat would keep up with the builds at ~80% duty cycle (as it does on Thursdays), and 3 builders on two machines eat everything quickly; but doing expensive tests like LibreOffice all the time is not a good idea anyway.
matthewbauer has quit [Read error: Connection reset by peer]
<MichaelRaskin> So there is no immediate payoff from adding more builders, and it doesn't seem a good idea to add more builds, and there we go
<LnL> why not? if there was more capacity
matthewbauer has joined #nixos-dev
<MichaelRaskin> You need to change both at once. Coordination. Coordination in a Nix* project.
<MichaelRaskin> Doable, might be nice, but requires a planned and tracked effort.
<LnL> if we get more people to contribute their idle desktop we can get more capacity first, then make some changes to start using it
<LnL> instead of the current stalemate
<MichaelRaskin> Well, using idle desktop resources without impacting other uses of the same computer is more complicated
<MichaelRaskin> I guess we can assume that ofborg jobs have approximately the same trust level in the sense of non-maliciousness as Nixpkgs master commits, otherwise there is this question of fixed-output risks
<vcunat> ofborg assumes almost no trust
<vcunat> It's strictly separated from what goes to cache.nixos.org.
<vcunat> (in the relevant direction)
<domenkozar> it assumes git trust
<domenkozar> afaik github doesn't provide history for force-pushes
<vcunat> you can see a short history of force-pushes
<LnL> that's not entirely true, only maintainers and a trusted set of users can trigger it
<domenkozar> well yes, it trusts maintainers :)
matthewbauer has quit [Read error: Connection reset by peer]
<vcunat> I meant: we don't need to trust the builders.
<MichaelRaskin> Well, extra-known-users too
<MichaelRaskin> That is true
<LnL> ah yeah
<vcunat> The builders still need to trust us (and git) a bit.
matthewbauer has joined #nixos-dev
<LnL> as for force pushes, I think that's handled
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
obadz- has joined #nixos-dev
obadz has quit [Ping timeout: 256 seconds]
obadz- is now known as obadz
vcunat has quit [Quit: Leaving.]
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Read error: Connection reset by peer]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Ping timeout: 256 seconds]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Ping timeout: 256 seconds]
orivej has joined #nixos-dev
matthewbauer has joined #nixos-dev
matthewbauer has quit [Remote host closed the connection]
matthewbauer has joined #nixos-dev
matthewbauer has quit [Ping timeout: 260 seconds]
matthewbauer has joined #nixos-dev
orivej has quit [Ping timeout: 245 seconds]
<gchristensen> ofborg builds the approved commit, not just the current version of the PR
<gchristensen> MichaelRaskin: I think your view of the project's equilibrium is a bit too pessimistic
matthewbauer has quit [Ping timeout: 240 seconds]
<gchristensen> I haven't been working on expanding its build capacity and use because of my medical issues, not apathy
<gchristensen> afaik we could probably start pushing builds from nixos-org-maintained builders to the nixos cache, but I am not excited to make a cache from other builders
<MichaelRaskin> gchristensen: I didn't mean that the equilibrium is about apathy by any specific person
<gchristensen> then you will be glad to know there is indeed a planned and tracked effort
<MichaelRaskin> I meant that it is an equilibrium w.r.t. random people (not) actively asking you for permission to run a builder
<gchristensen> right, got it
<MichaelRaskin> I didn't really mean that planned effort definitely doesn't exist, it is just a level of effort that participates in prioritisation and can lose the competition for priority for some time.
<gchristensen> that is of course true
<Sonarpulse> gchristensen: know if your talk will be recorded?
<Sonarpulse> i asked before but then lost irc
<Sonarpulse> sorry
<gchristensen> no idea :) zimbatm?
matthewbauer has joined #nixos-dev
ciil has quit [Ping timeout: 256 seconds]
ciil has joined #nixos-dev
matthewbauer has quit [Ping timeout: 264 seconds]
<zimbatm> Sonarpulse: not this time, our regular venue wasn't available
<Sonarpulse> zimbatm: ah man!
<Sonarpulse> oh well
<Sonarpulse> fingers crossed for a bootleg :D
ciil has quit [Quit: Lost terminal]
<zimbatm> No choice, you've got to come to London :)
<gchristensen> +1
ciil has joined #nixos-dev
orivej has joined #nixos-dev
<jtojnar> Sonarpulse: 👍 on the meson patch, Jussi will be convinced
<Sonarpulse> jtojnar: as in you are optimistic?
<Sonarpulse> jtojnar: i was wondering whether you had any opinions on how to approach this / do the convincing
<jtojnar> Sonarpulse: not sure, meson development seems to be very vision-oriented, but nirbheek also supports it
<Sonarpulse> yeah I don't know the community at all
<Sonarpulse> but I was looking for some cmake-like thing to spread the cross Gospel, haha, and I had heard very good things about meson
<jtojnar> Sonarpulse: personally, I am not very familiar with the cross-compilation requirements so I probably will not be able to contribute to the advocacy
<Sonarpulse> jtojnar: well, the issue with that specific thing is less about cross than about trying to do as much as possible at "eval" time
<Sonarpulse> more purity
<Sonarpulse> the follow up stuff for cross is just collapsing duplicate code paths internally
<Sonarpulse> and hopefully will be less controversial
<jtojnar> generally, it seems to me like Jussi knows what he is doing, though meson is still pretty young and some rough edges are showing around the less standard workflows (my beef is mostly with splitting packages)
<Sonarpulse> jtojnar: yeah in this case I feel like forcing faith in autodetection on the end user is a bit much
<Sonarpulse> also in nixpkgs we have everything in localSystem and crossSystem
<Sonarpulse> so we just want to force that
<Sonarpulse> if reality doesn't match the spec, reality is wrong, not the spec
<Sonarpulse> it's sort of hard to convey this without just being like "meson's great but I don't want to trust it"