gchristensen changed the topic of #nixos-dev to: NixOS Development (#nixos for questions) | https://hydra.nixos.org/jobset/nixos/trunk-combined https://channels.nix.gsc.io/graph.html | 18.09 release managers: vcunat and samueldr | https://logs.nix.samueldr.com/nixos-dev
eadwu has joined #nixos-dev
pie_ has joined #nixos-dev
<ekleog> is it only me or do installer tests always blink for some reason? just checked the 18.09 dashboard, and…
* ekleog wonders whether there's an actual reason for all these failures or whether it's just transient failures all the time
<ekleog> (reason: matrix people don't stop yelling that the latest upgrade is critical and to upgrade now, so I'm hoping the bump can make it through asap)
<ekleog> if it's just transient failures all the time, then maybe something is wrong with the installer tests in particular
<ekleog> (and given the few I've looked at are all “Waiting for the VM to connect”, I'm assuming it's transient failures? but given it's installer tests I'm not sure at all)
<samueldr> ekleog: there *seems* to be a transient failure going on in the tests
<samueldr> it's (probably) not caused by the increase in time the tests had between 17.09 and 18.03
<samueldr> (that increase was caused by a new test to verify the store contents in some tests)
<samueldr> other than that, not much is known :/
<samueldr> the tests *could* be hanging at a spot, but there's a lack of data to know whether it's because the limit is too conservative or not
* samueldr must open that PR
<ekleog> the installer tests appear to be particularly heavily struck by this issue, though they are not the only ones
<ekleog> (I'm assuming it's not just that they are rebuilt more often, without particular data to back this assumption other than gut feeling)
* ekleog wonders what's particular in them
<samueldr> AFAIUI all tests that involve the installer in some way are rebuilt for any git revision change
<samueldr> since the nixpkgs revision changes (I might be wrong)
<samueldr> I know any change to the repo while testing my "add more data to tests" changes caused the installer media to be rebuilt, every time
<gchristensen> gross. :|
<samueldr> the image rebuild every time? annoying, not gross, the repo *does* change every time
<samueldr> there's probably some way to pass *another* nixpkgs than the one being built though
<samueldr> than the one being used to build the test*
<gchristensen> right, fair
<ekleog> hmmm that kind of makes sense, we do need to pass the whole nixpkgs to the installer tests… ideally it'd be possible to extract only the portion of the whole nixpkgs that will actually be used by the installed system, but that'd be *hard*… I guess this issue is intractable then
<ekleog> *but* somehow nixos.tests.installer.simpleProvided.x86_64-linux looks too green for an installer test… or maybe this one doesn't include the whole nixpkgs image?
* samueldr needs to check those assumptions
<samueldr> possibly the "really booting" images are having more issues?
<samueldr> I'm finishing something (eating + some show) then doing the PR and taking a look
<samueldr> the thing is those test failures *could* be caused by a host being overloaded; if so, it's annoying to test any fixes
<ekleog> yeah, another thing I noticed is that all the installer test failures appear to be on packet-epyc-1
<samueldr> maybe we should scrape X evals and see where it fails and where it passes?
<ekleog> oh found one VM failing to connect not on packet-epyc-1, but it's not an installer test: https://hydra.nixos.org/build/86991955
<samueldr> also, scraping the logs to see which failures are "failing to connect" and which aren't
<samueldr> the "failing to connect" is (without evidence) what's assumed to be the main issue
<ekleog> yeah, randomly clicking on ~10 failures on the hydra dashboard showed only failures to connect
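[Editor's note] The scraping idea discussed above (see which builds fail, on which machine, and whether the failure is the VM connect timeout) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not anything the participants wrote: it assumes the build logs have already been downloaded by some other means, the function names `classify` and `tally` are hypothetical, and the sample data is made up. Only the error string and the machine name `packet-epyc-1` come from the conversation.

```python
from collections import Counter

# Error text quoted in the channel; everything else falls into a catch-all.
VM_CONNECT_TIMEOUT = "timed out waiting for the VM to connect"


def classify(log_text):
    """Return a coarse failure category for one build log."""
    if VM_CONNECT_TIMEOUT in log_text:
        return "vm-connect-timeout"
    return "other"


def tally(records):
    """Count failures per (machine, category).

    `records` is an iterable of (machine, log_text) pairs. How they are
    fetched from Hydra is deliberately left out of this sketch.
    """
    counts = Counter()
    for machine, log_text in records:
        counts[(machine, classify(log_text))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data, not real Hydra results.
    sample = [
        ("packet-epyc-1", "error: timed out waiting for the VM to connect"),
        ("packet-epyc-1", "error: timed out waiting for the VM to connect"),
        ("hypothetical-builder-2", "error: unit failed to start"),
    ]
    for (machine, category), n in sorted(tally(sample).items()):
        print(f"{machine}\t{category}\t{n}")
```

Grouping by machine and category together would show whether the connect timeouts cluster on one builder, which is the question raised about packet-epyc-1.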
<samueldr> ah, maybe we could revert the revert to the increase in that timeout and see if it helps that issue
<samueldr> (or re-increase)
<samueldr> if nobody beats me to it, and no one else objects I'll do it later
alp has left #nixos-dev ["Leaving"]
<ekleog> hmm I'll let you decide about that, time to sleep here :)
* samueldr should stop pausing the show or else I'll never finish
alp has joined #nixos-dev
<ekleog> haffun!
pie_ has quit [Ping timeout: 258 seconds]
worldofpeace has joined #nixos-dev
drakonis has joined #nixos-dev
init_6 has joined #nixos-dev
jtojnar has quit [Ping timeout: 246 seconds]
jtojnar has joined #nixos-dev
drakonis has quit [Read error: Connection reset by peer]
lassulus_ has joined #nixos-dev
lassulus has quit [Ping timeout: 246 seconds]
lassulus_ is now known as lassulus
eadwu has quit [Ping timeout: 250 seconds]
orivej has joined #nixos-dev
init_6 has quit [Ping timeout: 246 seconds]
init_6 has joined #nixos-dev
orivej has quit [Ping timeout: 268 seconds]
worldofpeace has quit [Quit: worldofpeace]
init_6 has quit [Ping timeout: 245 seconds]
init_6 has joined #nixos-dev
init_6 has quit [Ping timeout: 258 seconds]
jtojnar has quit [Ping timeout: 244 seconds]
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 250 seconds]
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 245 seconds]
orivej has joined #nixos-dev
orivej has quit [Ping timeout: 250 seconds]
orivej has joined #nixos-dev
<ekleog> … is it me or is the acme module on release-18.09 broken?
<ekleog> oops wrong chan
init_6 has joined #nixos-dev
kgz has quit [Quit: WeeChat 2.2-dev]
pie_ has joined #nixos-dev
jtojnar has joined #nixos-dev
eadwu has joined #nixos-dev
Lingjian has joined #nixos-dev
eadwu has quit [Ping timeout: 250 seconds]
kragniz has joined #nixos-dev
kragniz is now known as kgz
orivej has quit [Ping timeout: 245 seconds]
init_6 has quit []
<samueldr> https://hydra.nixos.org/build/87181343#tabs-constituents just checked, most of the failures here (last eval) are error: timed out waiting for the VM to connect
orivej has joined #nixos-dev
<samueldr> #53827 might be cool to have, to better understand the "macro" profile of our tests, though I'd like it if someone with perl experience could confirm everything is fine :)
<{^_^}> https://github.com/NixOS/nixpkgs/pull/53827 (by samueldr, 13 hours ago, open): tests: Logs timing in tests
<samueldr> (and/or with our tests infra)
<gchristensen> looks good from here ...
<gchristensen> pretty simple
orivej has quit [Ping timeout: 250 seconds]
drakonis has joined #nixos-dev
orivej has joined #nixos-dev
lopsided98 has quit [Ping timeout: 264 seconds]
MichaelRaskin has quit [Quit: gchristensen last call: if nobody is in the process of writing a good post about medium-term RFC SC selection rules, I will write a quick-hack «please state your opinions» one]
schmittlauch[m] has left #nixos-dev ["User left"]
drakonis has quit [Ping timeout: 245 seconds]
Lingjian has quit [Ping timeout: 250 seconds]