supersandro2000 has quit [Disconnected by services]
supersandro2000 has joined #nixos-dev
rajivr has joined #nixos-dev
<gchristensen>
it would be cool if error messages had a check list
<gchristensen>
like: The option value `networking.hostName' in `/home/grahamc/projects/github.com/grahamc/network/flexo/hardware.nix' is not of type `string matching the pattern ^$|^[[:alnum:]]([[:alnum:]_-]{0,61}[[:alnum:]])?$'.
<gchristensen>
compare that error to - The option definition `security.acme.certs.flexo.gsc.io.allowKeysForGroup' no longer has any effect; Please remove it.All certs are readable by the configured group. If this is undesired,consider changing security.acme.certs.flexo.gsc.io.group to an unused group. which doesn't mention where I set this option
<gchristensen>
I'm not intending on griping here, just noting that some error messages are amazing, and then others vary significantly
<gchristensen>
and am wondering what a check-list would look like
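For reference, the pattern quoted in the `networking.hostName` error can be tried out directly. This is a minimal sketch, not the NixOS implementation: Python's `re` has no POSIX character classes, so `[[:alnum:]]` is approximated as ASCII `[A-Za-z0-9]`, and the helper name `is_valid_hostname` is made up for illustration.

```python
import re

# Assumption: [[:alnum:]] is approximated by the ASCII range [A-Za-z0-9],
# because Python's `re` module does not support POSIX classes.
HOSTNAME_RE = re.compile(r"^$|^[A-Za-z0-9]([A-Za-z0-9_-]{0,61}[A-Za-z0-9])?$")


def is_valid_hostname(name: str) -> bool:
    """Return True if `name` matches the pattern from the error message."""
    # fullmatch avoids `$` accepting a trailing newline.
    return HOSTNAME_RE.fullmatch(name) is not None


for candidate in ["flexo", "", "-flexo", "flexo.gsc.io", "a" * 64]:
    print(repr(candidate[:20]), is_valid_hostname(candidate))
```

The pattern allows the empty string, forbids a leading or trailing `-` or `_`, forbids dots, and caps the length at 63 characters (one leading character, up to 61 in the middle, one trailing character).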
<samueldr>
gchristensen: it's not a Q, it's a thin and weird magnifier AFAIK
<gchristensen>
oh :)
<samueldr>
but it does have the Q vibe
<gchristensen>
brb
gchristensen has joined #nixos-dev
{^_^} has joined #nixos-dev
<abathur>
like regex, but it's invalid without matching examples, and requires one distinct non-overlapping match per character of regex?
<abathur>
maybe ignoring comments
mkaito has quit [Quit: WeeChat 3.0]
gchristensen has quit [Quit: WeeChat 2.9]
{^_^} has quit [Remote host closed the connection]
<eyJhb>
Anyone up to review this? https://github.com/NixOS/nixpkgs/pull/110404 It is not done, as there needs to be some descriptions etc. but wanted to know if the basis for it is OK
<{^_^}>
#110404 (by eyJhb, 1 week ago, open): WIP: module mautrix-* new service to handle all mautrix services
ScottHDev5 has quit [Quit: Ping timeout (120 seconds)]
ScottHDev5 has joined #nixos-dev
<siraben>
Profpatsch: I see. I might learn how to use hnix and take a stab at it
<siraben>
Or perform the redundant rec removal again and so on
<Profpatsch>
siraben: I guess with hnix you will have the same problem again, that it can’t do source spans for non-exprs
<Profpatsch>
that is you will be able to find out which vars are unused, but then you have no way to highlight the var that is unused
<Profpatsch>
unless you do hacks like ad-hoc parsing the source spans
<Profpatsch>
siraben: we merged the tree-sitter-nix grammar into nixpkgs recently
<Profpatsch>
Since you don’t need inter-file checks for this, you might as well just use tree-sitter and then implement a conservative subset of variable scoping
<siraben>
I see. How hard is it to use tree-sitter?
<Profpatsch>
collecting free variables recursively and removing them on each introduction site, collecting the ones which are introduced but not in the collection
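The scoping pass described above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: the node classes `Var`, `App`, `Lam` and `Let` are hypothetical stand-ins for whatever a tree-sitter parse would yield, and `Let` is treated as mutually recursive (like Nix `let`), so a binding that is referenced only by another unused binding still counts as used, which keeps the check conservative.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class App:
    func: object
    arg: object


@dataclass
class Lam:
    param: str
    body: object


@dataclass
class Let:
    bindings: dict  # name -> expression; all names are in scope everywhere
    body: object


def free_vars(expr, unused):
    """Return the free variables of `expr`; append unused binders to `unused`."""
    if isinstance(expr, Var):
        return {expr.name}
    if isinstance(expr, App):
        return free_vars(expr.func, unused) | free_vars(expr.arg, unused)
    if isinstance(expr, Lam):
        free = free_vars(expr.body, unused)
        if expr.param not in free:
            unused.append(expr.param)  # introduced but never referenced
        return free - {expr.param}     # remove at the introduction site
    if isinstance(expr, Let):
        free = free_vars(expr.body, unused)
        for value in expr.bindings.values():
            free |= free_vars(value, unused)
        for name in expr.bindings:
            if name not in free:
                unused.append(name)
        return free - set(expr.bindings)
    raise TypeError(f"unknown node: {expr!r}")


# let x = a; y = b; in f x
unused = []
expr = Let({"x": Var("a"), "y": Var("b")}, App(Var("f"), Var("x")))
free = free_vars(expr, unused)
print(sorted(free), unused)
```

Here `y` is reported as unused, while `a`, `b` and `f` remain free because nothing in the expression introduces them.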
<{^_^}>
nix#963 (by domenkozar, 4 years ago, open): error "value is a list while a set was expected" is too vague
<domenkozar[m]>
to fix a number of such issues, ideally, it would take something like opening a repl, but it's probably not feasible that it would always result in a working repl
<domenkozar[m]>
printing values is also tricky since one could do (import nixpkgs) + []
<siraben>
lol I ran into that today because I wrote `overlays = [ haskellPackages.ghcWithPackages (h: ...) ]`
<{^_^}>
nix#3901 (by edolstra, 25 weeks ago, open): Add a flag to start the REPL on evaluation errors
BaughnLogBot has joined #nixos-dev
tilpner_ has joined #nixos-dev
marek_ has joined #nixos-dev
marek_ is now known as marek
tilpner has quit [Ping timeout: 265 seconds]
tilpner has joined #nixos-dev
tilpner_ has quit [Ping timeout: 260 seconds]
<infinisil>
gchristensen: Could it be that you broke declarative flake jobsets with your recent hydra changes? Because it seems to be
<gchristensen>
is it possible? ... it is possible
<gchristensen>
what are you seeing?
<infinisil>
There's no error, the declarative jobset evaluates, it produces the correct result (a store path with jobsets saying `"type": 1` and `"flake": "github:..."`), but the actual jobsets still use the previous state (non-flake in my case)
<gchristensen>
any logs in the database?
<infinisil>
Oh how would I check?
<infinisil>
(but yeah I'm thinking some database update must've failed)
<gchristensen>
find the database, and journalctl -fu postgresql
<gchristensen>
or -eu
<gchristensen>
wait, you're moving from a non-flake to a flake jobset?
<infinisil>
Yeah
<infinisil>
Hold on, getting db logs
<gchristensen>
I don't know why that would be a problem, but good to know
<gchristensen>
infinisil: it looks like you're running inconsistent versions of code?
<infinisil>
Hmm, inconsistent between what?
<gchristensen>
like the revision of hydra you're running is incompatible with the database
<infinisil>
Oh, hmm I did see something about having to run `hydra-init` in an issue earlier
<infinisil>
I personally didn't do the update to hydra master, but I guess if that needs to be done manually it could easily be forgotten
<infinisil>
Though it is run automatically by NixOS' hydra module. Gonna check if that's being used
<gchristensen>
it should be run automatically by the module, indeed
<gchristensen>
you may need to manually restart the queue runner
<infinisil>
Yeah that ran indeed when it was updated to master. I see `upgrading Hydra schema from version 65 to 66` up to `upgrading Hydra schema from version 69 to 70`
AlwaysLivid has quit [Remote host closed the connection]
<infinisil>
Hmm the queue-runner has been restarted since the upgrade, I'll try again
<gchristensen>
infinisil: you should be up to version 72 at this point
<sterni>
infinisil: about the lib.generators.toPretty change, I noticed today that it doesn't fix anything actually, because tryEval apparently doesn't catch the kind of error generated by builtins.functionArgs
<sterni>
I must have tested this with the wrong nix version yesterday, I guess I'll make a PR reverting this change
<sterni>
unless there is another way to catch these kinds of failures
<gchristensen>
infinisil: yeah, that commit introduces schema version 72
<infinisil>
sterni: (-> #nix-lang?)
AlwaysLivid has quit [Read error: Connection reset by peer]
<infinisil>
Hmmm
<sterni>
infinisil: oh right
<infinisil>
gchristensen: Oh, the upgrades were done too
<gchristensen>
okay
<infinisil>
Up to 72, though it was a bit hidden within the SQL
<gchristensen>
so I'm going to hope restarting the queue runner fixes it ? :)
<gchristensen>
https://paste.infinisil.com/56vCC2Iito.log looking at this error "new row for relation "jobsets" violates check constraint "jobsets_check"" is interesting, that constraint is: alter table Jobsets add constraint jobsets_check check (schedulingShares > 0);
<infinisil>
Yeah saw that too, very weird
<gchristensen>
it sounds like you're not giving the jobset any shares?
jonringer has joined #nixos-dev
jonringer has quit [Remote host closed the connection]
<infinisil>
Unfortunately nothing is different with a restarted queue runner
<infinisil>
Note the last timestamps here ^ That's the time when I retried flakes, only these couple lines appeared
<infinisil>
28 Jan is after the upgrade
<infinisil>
gchristensen: Hmm maybe we could try out creating a new declarative flake jobset for hydra.nixos.org?
<infinisil>
s/jobset/project
<gchristensen>
can you connect to the hydra server, `ps auxfg | grep hydra` and share all the store paths? I'm still thinking that you're running out of sync versions of the code
kalbasit___ has joined #nixos-dev
<infinisil>
Hmm well it doesn't show the store path of hydra-queue-runner
<gchristensen>
and the deploy was dirty, what diff is present?
<andi->
my hydra on ~yesterday's master also throws some 500s when restarting failed jobs... :/
<gchristensen>
nice, let's get it fixed
justanotheruser has joined #nixos-dev
<infinisil>
I'll look into why that's DIRTY, I don't think it should be
<infinisil>
Maybe because it's fetched by niv somehow
<gchristensen>
andi-: refactoring schemas is hard without a compile time checker
<andi->
Mine looks more like a perl issue not a DB issue
<andi->
> Caught exception in Hydra::Controller::JobsetEval->restart_failed "Can't locate object method "project" via package "Hydra::Model::DB::JobsetEvals" at /nix/store/252137553mzhqmqjwv3i9g0wma7a4fpa-hydra-0.1.19700101.DIRTY/libexec/hydra/lib/Hydra/Controller/JobsetEval.pm line 155."
<{^_^}>
error: syntax error, unexpected IN, expecting ')', at (string):471:18
<gchristensen>
yeah, that is the hard part
<infinisil>
And I'm pretty convinced the DIRTY thing just comes from it being fetched from Niv and not with flakes. So I think it uses the `default.nix` with flake-compat
<andi->
infinisil: yeah, same happens if you use fetchgit
AlwaysLivid has quit [Remote host closed the connection]
kalbasit___ has quit [Ping timeout: 272 seconds]
<infinisil>
Hmm but yeah there's no declarative jobsets on hydra.nixos.org
<infinisil>
How about creating one, and with flakes too, just to test whether it works? Because if it does, then it's an issue with my deployment. Otherwise it's an issue with hydra itself
<andi->
gchristensen: deploying..
<gchristensen>
andi-: I just pushed an updated version which should be more clear and be identical
<andi->
ok
<gchristensen>
I tested that locally and it worked, fwiw
<siraben>
> builtins.currentTime
<infinisil>
gchristensen: I guess I'll file an issue for the problem I'm having
<{^_^}>
1612194385
<siraben>
> builtins.currentTime
<{^_^}>
1612194390
<gchristensen>
infinisil: can you `\d jobsets` ?
<gchristensen>
in a `psql` terminal
<infinisil>
Did not find any relation named "jobsets".
<infinisil>
Wait I'm probably not connected to the right db
<gchristensen>
hopefully you have such a relation :)
<immae>
What is the goal with `((type = 0) = (nixexprinput IS NOT NULL AND nixexprpath IS NOT NULL))` ?
<immae>
is it supposed to be an "imply" term? `type == 0 => (...)` ? If so, it’s incorrect
<gchristensen>
infinisil: UPDATE jobsets SET flake = 'github:input-output-hk/ECIP-Checkpointing/66dbb9c0117d2965617755c193dbb035f096a149', type = '1' WHERE ( ( name = 'pr-16' AND project = 'ecip-checkpointing' ) );
mikroskeem has quit [Quit: WeeChat 3.0]
<immae>
gchristensen: it won’t work: in your case nixexprinput and nixexprpath are non-null, so (type = 0) = (...) will be false and the check will fail, no?
<immae>
the check should be `(type != 0) OR (nixexprinput IS NOT NULL AND nixexprpath IS NOT NULL)` if I understand the goal of it correctly
kalbasit___ has joined #nixos-dev
<gchristensen>
good catch! we'd come to that at the same time =)
<gchristensen>
I don't think we should relax the constraint, I think we should nullify those columns when setting flake params
<immae>
it’s not "relaxed", the constraint seems incorrect to me in its current state
<gchristensen>
I don't think so... it is invalid to specify a nixexprinput / nixexprpath in combination with a flake
<infinisil>
immae: I think it's just flipped around, type = 0 is equal to (type != 1)
<immae>
hmm 0 is "non-flake" right?
<infinisil>
So it's `(type != 1) == (nixexprinput is not null ...)`
<infinisil>
Yea
<infinisil>
And that's then an implication again
<immae>
no an equal sign is not an implication
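The difference between the two readings can be checked with a small truth table. This is a sketch with plain booleans standing in for the SQL expressions; the function names are made up, and `type` is assumed to be only 0 (non-flake) or 1 (flake).

```python
from itertools import product


def as_equality(is_non_flake: bool, has_nixexpr: bool) -> bool:
    # (type = 0) = (nixexprinput IS NOT NULL AND nixexprpath IS NOT NULL)
    return is_non_flake == has_nixexpr


def as_implication(is_non_flake: bool, has_nixexpr: bool) -> bool:
    # type = 0 implies the nixexpr columns are set, i.e.
    # (type != 0) OR (nixexprinput IS NOT NULL AND nixexprpath IS NOT NULL)
    return (not is_non_flake) or has_nixexpr


# Collect every input where the two readings disagree.
differing = [
    (p, q)
    for p, q in product([False, True], repeat=2)
    if as_equality(p, q) != as_implication(p, q)
]
print(differing)
```

The two forms disagree on exactly one row: a flake jobset (`type = 1`) whose nixexprinput and nixexprpath are still set. The equality rejects that row, which is the case the UPDATE statement above runs into, while the implication would accept it.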
<immae>
it might be what you want though regarding what gchristensen said :)
<immae>
you want a check that says "type 0 => flake == null and nixexprinput != null AND nixexprpath != null", and reverse for "type 1", is that it?
<gchristensen>
if it is a flake, I want flake to not be null and nixexprinput and nixexprpath to be null
saschagrunert has quit [Remote host closed the connection]
<gchristensen>
if it is not a flake, I want flake to be null and nixexprinput and nixexprpath to not be null
<immae>
Then the boring and non-error-prone way to write it is `(flake IS NULL AND nixexprinput IS NOT NULL AND nixexprpath IS NOT NULL AND type = 0) OR (flake IS NOT NULL AND nixexprinput IS NULL AND nixexprpath IS NULL AND type = 1)`
<gchristensen>
I mean, the schema is correct
<gchristensen>
the update query is not correct
<immae>
if nixexprinput is null and nixexprpath is not null and flake is not null and type is 1 then your check passes but it shouldn’t
<gchristensen>
ah
<gchristensen>
good catch!
<immae>
(you might say that you don’t care, and I would accept it, but the assertion that the schema is correct is false regarding the constraints I understood :) )
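The counterexample can be replayed outside the database. In this sketch `None` stands in for SQL NULL, only the one check quoted in this conversation is modelled (the real schema may carry further constraints), and the function names and sample values are made up for illustration.

```python
def current_check(type_, nixexprinput, nixexprpath):
    # (type = 0) = (nixexprinput IS NOT NULL AND nixexprpath IS NOT NULL)
    return (type_ == 0) == (nixexprinput is not None and nixexprpath is not None)


def explicit_check(type_, flake, nixexprinput, nixexprpath):
    # The spelled-out version: each jobset kind lists every column it needs.
    non_flake = (
        type_ == 0
        and flake is None
        and nixexprinput is not None
        and nixexprpath is not None
    )
    is_flake = (
        type_ == 1
        and flake is not None
        and nixexprinput is None
        and nixexprpath is None
    )
    return non_flake or is_flake


# Flake jobset with a leftover nixexprpath (hypothetical values).
row = dict(type_=1, flake="github:example/repo", nixexprinput=None, nixexprpath="release.nix")

print(current_check(row["type_"], row["nixexprinput"], row["nixexprpath"]))
print(explicit_check(**row))
```

The quoted check accepts the half-cleared row because both sides of the equality are false, while the explicit form rejects it.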
kalbasit___ has quit [Quit: WeeChat 2.9]
mkaito has quit [Quit: WeeChat 3.0]
mkaito has joined #nixos-dev
<gchristensen>
immae: you're right
<gchristensen>
immae: can you open an issue and/or PR for it?
<gchristensen>
infinisil: my audio is busted ...... let me know the result of deploying that patch? also, can you link me to the patch?
<infinisil>
gchristensen: Ahh damn. But yeah will do :)
<infinisil>
Evaluation has been going for over 4 minutes, which might be reasonable considering it's heavy IFD, but at least there's no error, which is different from before the patch
<infinisil>
Or rather, at least something is happening, instead of it just showing the previous error
<immae>
gchristensen: I sure can, but I have absolutely no clue where this kind of thing is defined. Should I look somewhere in hydra repo?
<infinisil>
Well if it's already built, it should be faster
<NinjaTrappeur>
yup
<infinisil>
This is the first time it builds the new flake stuff though :)
<gchristensen>
immae: a file called hydra.sql
<immae>
found it thanks
<immae>
I should write some "migration" one too I guess?
<gchristensen>
that would be great!
<infinisil>
13 minutes and counting..
<gchristensen>
sounds about right lol
rajivr has quit [Quit: Connection closed for inactivity]
<infinisil>
Every time I have to wait for something I'm debating whether it's worth starting something new
<infinisil>
s/starting something new/working on something else
<infinisil>
If I knew it would take over 19 minutes I probably would've done so earlier lol
<gchristensen>
yeah it is a motivator in making things as fast as possible, to keep people from getting distracted :)
<infinisil>
Would be cool if Hydra showed every stage of IFD
<immae>
Or you can just look at your server working and take a rest :)
<andi->
I wish I knew why my hydra instance has problems with running nixos tests... regardless of runner (e.g. even when building them on my workstation) they appear stuck very often. When I run the same drv outside of the hydra remote builder setting it just works...
<andi->
Load can't be an issue as I've some machines on max-jobs=1
thefloweringash has quit [Ping timeout: 244 seconds]
<infinisil>
30 minutes and counting..
etu has joined #nixos-dev
<bennofs>
just kill IFD, it's never been a great thing...
thefloweringash has joined #nixos-dev
<infinisil>
It's great in some ways, not as much in others
<ajs124>
andi-: Good to know that someone else has that problem. Although we don't have max-jobs=1.
<ajs124>
I always assumed it was some part of our config, which is kind of weird, in some ways.
<dhess>
We also get stuck NixOS tests on our Hydra, from time to time. It's pretty rare, though.
<dhess>
We also see what look like races where we'll occasionally get spurious failures. They nearly always pass after the job is re-tried.
<ajs124>
Maybe there is something about our config, after all. Just checking right now, there's a nixos.tests.predictable-interface-names.unpredictableNetworkd.x86_64-linux running since 2d 2h 16m 16s.
<andi->
ajs124: exactly that!
<andi->
I have no idea why they aren't killed because of a timeout
<ajs124>
If it weren't for other people (who don't like my super hacky solutions), I'd honestly have a systemd timer that restarts the queue runner once a day.
<andi->
ajs124: the hydra builders are rebooted daily ;-)
<andi->
at least those from packet..
<bennofs>
is the hydra.nixos.org webinterface currently unreachable or is that something on my end?
<immae>
Ah does the travis check that the sql syntax is correct (i.e. does it start a VM with hydra?), or should I test it myself?
tilpner_ has joined #nixos-dev
tilpner has quit [Ping timeout: 246 seconds]
tilpner_ is now known as tilpner
<gchristensen>
please test it yourself too :P
<immae>
ok I’ll try that
<immae>
(it’s the first time I’m running a hydra :p )
<cole-h>
gchristensen: Semi-relatedly, but the instructions in hydra.sql (specifically step 2) seem to be "wrong"? I had to run `make -C src/sql -f Makefile.am update-dbix`, and dunno what the `hydra-postgresql.sql` arg is for / means.
<gchristensen>
did you do the bootstrap / configure phase stuff in the HACKING docs?
<cole-h>
o maybe not :D
<cole-h>
Yep, that removed the necessity for `-f Makefile.am`, but still unsure what `hydra-postgresql.sql` is for.
<gchristensen>
hrm maybe it is stale
<cole-h>
(The only reference to that file I can find is in hydra.sql)
<cole-h>
And even renaming that arg to `hydra.sql` shows "Nothing to be done for hydra.sql". So yeah, maybe stale
tilpner has quit [Remote host closed the connection]
<gchristensen>
cool
tilpner_ has joined #nixos-dev
<gchristensen>
this is so bizarre
<gchristensen>
a query is taking 5000ms according to the slow query log, and taking just 1ms if I run it locally
<cole-h>
Maybe your NVMe's are better than Hydra's? :D
<gchristensen>
this is all hydra
<cole-h>
I assumed "run it locally" meant you were running it on your replicated hydra backup, and that the "slow query log" was on Hydra (h.n.o) itself
<gchristensen>
ah, I mean manually
<immae>
gchristensen: what does the slow query look like?
<symphorien[m]>
https://github.com/symphorien/nix-du/issues/5#issuecomment-770752868 << a store path A is alive because `nix-store --query --roots` reports a root B -> C.drv but this root cannot depend on A since it points to derivation files... It looks like an issue to me, but I'd like a second opinion before asking the person to open an issue against nix.
<gchristensen>
I think one of my PRs must have introduced a query in a hot loop
<infinisil>
gchristensen: Can confirm the fix works :) Will PR shortly (and cross-link to immae's related PR)
<gchristensen>
cool
<immae>
infinisil: I didn’t feel like "fixing" anything, merely adding even more constraints to something that was already constrained :D
<immae>
But if you’re happy it’s all good :)
<infinisil>
Ah yeah, I've got a fix (created with the help of gchristensen) for the problem I was having earlier, which was very much related to those constraints
<gchristensen>
samueldr: I think the implementation of storing the eval error message in with the jobseteval was a bit naive :)
<samueldr>
I did say I wouldn't comment on the DB part :)
<gchristensen>
hehe
jonringer_ has joined #nixos-dev
<lukegb>
gchristensen: did it get blocked on another transaction?
nh2_ has joined #nixos-dev
alunduil_ has joined #nixos-dev
<gchristensen>
I think it is blocked on literally writing bytes to the wire
<samueldr>
(throwback to a previous query of mine) would it help if they were files on the FS?
mkaito has quit [Quit: WeeChat 3.0]
mkaito has joined #nixos-dev
mkaito has joined #nixos-dev
mkaito has quit [Changing host]
pmy_ has quit [Ping timeout: 256 seconds]
pmy_ has joined #nixos-dev
stolyaroleh_ has quit [Ping timeout: 265 seconds]
orivej has quit [Ping timeout: 265 seconds]
<gchristensen>
samueldr: imho that isn't so useful and makes dealing with them harder
<gchristensen>
one option is to go through and explicitly not select the column when it isn't needed, another option is to break it off in to its own table
<gchristensen>
seeing as there are quite a lot of queries for evals, the separate table is probably easier
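The "separate table" option might look roughly like the following. This is a sketch using an in-memory SQLite database rather than Hydra's PostgreSQL schema, and the table and column names (`evals`, `eval_errors`, `errormsg`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Narrow table that the frequent eval queries read.
    CREATE TABLE evals (
        id        INTEGER PRIMARY KEY,
        jobset    TEXT NOT NULL,
        timestamp INTEGER NOT NULL
    );
    -- Potentially large error text lives in its own table.
    CREATE TABLE eval_errors (
        eval_id  INTEGER PRIMARY KEY REFERENCES evals(id),
        errormsg TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO evals VALUES (1, 'pr-16', 1612194385)")
conn.execute("INSERT INTO eval_errors VALUES (1, ?)", ("error: " + "x" * 100_000,))

# Hot path: even SELECT * on evals never ships the error text.
recent = conn.execute("SELECT * FROM evals ORDER BY timestamp DESC").fetchall()

# The message is fetched only where it is actually displayed.
(errormsg,) = conn.execute(
    "SELECT errormsg FROM eval_errors WHERE eval_id = ?", (1,)
).fetchone()

print(recent, len(errormsg))
```

With this layout the existing eval queries need no column-by-column auditing, since the large value is simply not in the table they read.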
<cole-h>
As someone who has taken exactly one (1) database class, I like the "separate table" option better as well :)
<dhess>
Does a Hydra expect to be the only writer of an S3 binary cache? In other words, if I have another Hydra (or similar service) writing to the same binary cache, will that cause any problems?
<edef>
multi-writer might cause orphaned NARs sometimes
<edef>
but nothing serious
<edef>
(i'm not sure about the binary cache GC mechanism, that might collect stuff you want to keep)