<gchristensen>
Unix-domain socket path "/var/lib/buildkite-agent-pgloadndump/builds/kif-pgloadndump-1/grahamc/postgres-load-dump/socket/.s.PGSQL.5432" is too long (maximum 107 bytes)
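(A hedged aside: the usual workaround for that ~107-byte socket path limit is to point the throwaway server at a shorter socket directory; the data directory and database names below are assumptions.)

    # Start the ad-hoc server with a short socket directory so that
    # ".s.PGSQL.5432" fits under the unix socket path length limit.
    SOCKDIR=$(mktemp -d /tmp/pg.XXXXXX)
    pg_ctl -D "$PGDATA" -o "-k $SOCKDIR" start
    psql -h "$SOCKDIR" -p 5432 -d hydra -c 'SELECT 1'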
<cole-h>
Heh wow. I can open the revert, sure. I haven't been briefed on deployment yet, so I'll leave that to you
<gchristensen>
sure, thanks
<cole-h>
Oof, need to recompile openssl 1.0.2u before I can `cargo test`... Oh well, time to blindly open the PR
<gchristensen>
sure
<cole-h>
(also, small note: the old version of amqp we were using doesn't have the "Make the socket blocking during the initial connection" and "Set read timeout only after SSL is initialized" commits)
<LnL>
cole-h: those commits you mentioned sound pretty suspicious in combination with these threads
<cole-h>
Suspicious as in "we should have them" or "those might be causing problems"?
<cole-h>
I was really just noting that ofborg hadn't picked up the two commits from 2019 -- reverting what we did yesterday went back to a commit from 2018
<LnL>
actually this might be after since it recovered already
<LnL>
hmm, hydra uses the hydra role but the postgres role should also be there
<cole-h>
Maybe `-U postgresql` didn't work, but `-U postgres` will (slight typo?)
<gchristensen>
oh!
<gchristensen>
-U hydra is doing it!
<gchristensen>
look at it go!
<LnL>
\o/
<gchristensen>
I skipped the "Figure out how to upload it" step
<LnL>
the hydra role is guaranteed to be there so probably better to use that anyway :)
<gchristensen>
:)
<cole-h>
:)
<gchristensen>
LnL: so I guess I should upload these artifacts to S3 or something
<gchristensen>
and set a policy of delete after 3 days or whatever
<LnL>
first question is how big this is
<gchristensen>
$bigish
<gchristensen>
let's hope it doesn't clean up the working directory _after_ the build finishes, and only does the clean up _before_ the build starts
<LnL>
actually exporting might also not be needed (at least for the analyzer stuff) if that could talk to it directly
<gchristensen>
I want to do a full pgdump to validate the backup is good enough to complete a dump
<gchristensen>
(and that the output is ~Xgb)
<gchristensen>
but don't have to upload them
<MichaelRaskin>
Are you talking of a single-shot dump, or does «policy» imply periodic dumps?
<LnL>
this builds table is going to take forever, isn't it... :p
<cole-h>
gchristensen: Woo, 4 x86_64-linux builders are up and running again :)
<gchristensen>
nice!
<gchristensen>
LnL: probably :)
<gchristensen>
MichaelRaskin: so, every 5 minutes I get an incremental filesystem snapshot from the database server (so does Rob) and I'm thinking this load-and-dump job would run on a daily basis, and as part of that, uploading a database export
<LnL>
btw I'd love to see if a partial copy is small enough to use as testing data
<gchristensen>
yea
<MichaelRaskin>
Like the whole Nixpkgs history export, or last year?
<LnL>
just the last year
<LnL>
COPY (SELECT * FROM builds as b JOIN jobsetevalmembers as m ON b.id = m.build JOIN jobsetevals as e ON e.id = m.eval WHERE e.project = 'nixpkgs' AND e.jobset = 'trunk' AND e.id > 1515735) TO '/tmp/builds.csv';
<gchristensen>
LnL: maybe you could send a PR to that repo, adding that query as an export step?
<gchristensen>
also I'm not sure about this `--format directory` choice, I was expecting the directory structure to be ... reasonable ...
<LnL>
no idea what that looks like, I've only used -Fc
<MichaelRaskin>
Looks like PostgreSQL internal data layout
<LnL>
reminds me of the layout of timeseries dbs
<MichaelRaskin>
e.id > 1515735 is intended to be date_part('epoch',now())-e.timestamp < 3.3e7, right?
<gchristensen>
what's this?
<gchristensen>
oh
<gchristensen>
gotcha
<gchristensen>
I was hoping it'd be a file-per-table which would be easy to get just some data from, but if it is going to be this useless, might as well just use -Fc :)
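(For reference, a rough sketch of the two formats being compared, with the database name assumed to be hydra: -Fd does write one compressed file per table plus a toc.dat, but the files are named by internal dump IDs rather than table names, which is why the layout looks opaque; -Fc is a single archive, and both restore via pg_restore.)

    # Directory format: parallelizable, one numbered .dat.gz per table plus toc.dat
    pg_dump -Fd -j 4 -f hydra.dumpdir hydra

    # Custom format: a single archive file
    pg_dump -Fc -f hydra.pgdump hydra
    pg_restore -l hydra.pgdump | head    # inspect the table of contents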
<MichaelRaskin>
LnL: yes, but hardcoding a constant cutoff sounds like something that will be fine, then not fine
<MichaelRaskin>
I wrote the condition «get current time as Unix timestamp, and go back from it by just a bit over a year»
<LnL>
ah yeah, let's not put this in the "backup last year" script :D
<LnL>
there's a unix timestamp in one of the tables
<MichaelRaskin>
Yep, e is jobsetevals which carries a timestamp
<LnL>
oh! thought this was just an example
<LnL>
thanks :D
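(A sketch of that export query with the relative cutoff swapped in for the hardcoded eval id, run through psql so COPY writes client-side; role, database, and output path are assumptions.)

    # "a bit over a year" back (3.3e7 seconds), instead of e.id > 1515735
    psql -U hydra -d hydra -c "
      COPY (
        SELECT * FROM builds AS b
        JOIN jobsetevalmembers AS m ON b.id = m.build
        JOIN jobsetevals AS e ON e.id = m.eval
        WHERE e.project = 'nixpkgs' AND e.jobset = 'trunk'
          AND date_part('epoch', now()) - e.timestamp < 3.3e7
      ) TO STDOUT WITH CSV
    " > builds.csv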
<LnL>
gchristensen: zfs clone -o canmount=noauto
<gchristensen>
oh cool, that'll make the || true bit unneeded?
<LnL>
I think so
<gchristensen>
Let's give it a go :) want to include that in your PR?
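(Roughly what that clone step might look like, with hypothetical pool/dataset names; canmount=noauto keeps the clone from being mounted automatically, so mounting, unmounting, and destroying become explicit steps.)

    zfs clone -o canmount=noauto rpool/hydra-db@latest rpool/loaddump-scratch
    zfs mount rpool/loaddump-scratch     # explicit mount when the job needs it
    # ... load-and-dump work happens here ...
    zfs umount rpool/loaddump-scratch
    zfs destroy rpool/loaddump-scratch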
arianvp has joined #nixos-borg
<gchristensen>
hey arianvp
<arianvp>
hello
<cole-h>
o/
<cole-h>
Welcome to the cool kids club
<gchristensen>
moving back to my desk and I'll share some context / background
<arianvp>
cool
<gchristensen>
okay, back story arianvp
<gchristensen>
every 5min an incremental zfs snapshot goes from hydra's db server to my server, where I want to, daily, start a postgresql server with that data and do a pg_dump, plus run some batch queries.
<gchristensen>
I do it in a buildkite job, so the job runs as a buildkite user. since postgresql needs a mutable directory to work with, I have to clone the snapshot somewhere and mount it, then chown the contents to the buildkite user so it can write
<gchristensen>
the buildkite user can clone the snapshot and destroy the filesystem it clones to, the remaining issues are: mount, chown, unmount
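(A sketch of those three remaining privileged steps as described, with dataset name and mountpoint hypothetical; everything else, the clone, postgres, pg_dump, and destroy, runs as the buildkite user.)

    # run as root (or via a narrowly-scoped sudo rule or helper unit)
    zfs mount rpool/loaddump-scratch
    chown -R buildkite-agent:buildkite-agent /rpool/loaddump-scratch
    # ... buildkite job starts postgres against the clone and runs pg_dump ...
    zfs umount rpool/loaddump-scratch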
lordcirth__ has left #nixos-borg ["Leaving"]
<gchristensen>
I tried to put the filesystem in fstab with the "user" option set, but apparently / according to #zfsonlinux, that requires the mount.zfs helper to be setuid, which upstream doesn't really support
<arianvp>
so wait you have a postgres server on your server and basically you want to have the same content there as on hydra-db ?
<arianvp>
but not 'live' ?
<gchristensen>
right. the main purpose here is validate the snapshot is actually a functional backup
<arianvp>
And why not use Postgres's builtin WAL archiving or WAL streaming?
<gchristensen>
because these are the tools I have
<gchristensen>
besides, the snapshots are pretty good backups actually
<gchristensen>
anyway
<arianvp>
alright besides the point; was just wondering
<LnL>
gchristensen: where should I put the data? postgres wants an absolute path
<gchristensen>
uh
<arianvp>
so the goal is: this unprivileged user needs to be able to mount the snapshot, right?
<gchristensen>
maybe make a directory, $(pwd)/upload/SOMENAME.sql and I'll .gz each file in upload individually and upload into a datestamped path, LnL?
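(Something like the following would match that plan; the bucket name is hypothetical, and the 3-day expiry would live in a separate S3 lifecycle rule.)

    DATE=$(date +%F)
    for f in upload/*.sql; do
        gzip "$f"
        aws s3 cp "$f.gz" "s3://hydra-loaddump/$DATE/$(basename "$f").gz"
    done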
<gchristensen>
arianvp: mount it, have rw on the files, and then unmount it
<arianvp>
You could use udisksd / udisksctl for this
<arianvp>
to give unprivileged people the possibility to mount/unmount
<gchristensen>
cool
<arianvp>
that's what e.g. Gnome uses to implement mounting in their Files app
<gchristensen>
this is services.udisks2.enable?
<arianvp>
yes
<gchristensen>
I suppose it would be strictly better to do it that way than ... what I'm doing now..?
<arianvp>
you can then use `udisksctl mount` to mount
<arianvp>
more.. conventional? :P
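(For reference, udisksctl operates on block devices, so the device path below is purely illustrative; mounts land under /run/media/$USER without needing root, policy permitting.)

    udisksctl mount -b /dev/sdb1
    udisksctl unmount -b /dev/sdb1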
<gchristensen>
nice
<gchristensen>
:P
<gchristensen>
I dunno, I kind of like this :D
<gchristensen>
in a sick and twisted way
<arianvp>
yeh it's kinda simple
<arianvp>
but udisks does basically this but instead of touching files it sends a DBUS message
<arianvp>
:P
<gchristensen>
gotcha
<LnL>
btw, I have no idea how this works under the hood but you can zfs allow users to mount specific volumes
<gchristensen>
LnL: it doesn't work on linux :(
<LnL>
hmm, thought it did
<arianvp>
ok time for some dinner
<gchristensen>
thanks for the tips, arianvp
<gchristensen>
LnL: you can't even give someone a kernel capability to do it, since the capability required is literally CAP_SYS_ADMIN :(
<LnL>
assumed it would go through zed or something
<gchristensen>
unfortunately not
<gchristensen>
(though sort of fortunately so, makes it harder to create security bugs if things have to happen as the user)
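(For completeness, the delegation LnL had in mind presumably looks something like this, with user and dataset hypothetical; per the discussion above, the mount permission can be delegated but Linux still rejects the actual mount without CAP_SYS_ADMIN.)

    zfs allow -u buildkite-agent clone,create,destroy,mount rpool/hydra-db
    zfs allow rpool/hydra-db    # show the delegated permissions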
<gchristensen>
LnL: got final sizes
<gchristensen>
the `directory` dump is 32g
<LnL>
oh really? that's less than I was expecting
<cole-h>
That's less than most AAA games!
<gchristensen>
lol
<gchristensen>
yeah
<gchristensen>
it is like 300G+ imported into postgres
<cole-h>
That's slightly more than most AAA games!
<gchristensen>
lol
<gchristensen>
I think a lot of it is because there are tables with way too many indexes that are completely useless
<gchristensen>
«recommendation engine mode» People in #nixos-borg may also be interested in #nixos-infra
<cole-h>
Soon I won't be able to switch to these by hotkey anymore :(
<LnL>
_another_ nix channel, when was that created?
<gchristensen>
heh
<gchristensen>
not so long ago
* cole-h
is still in #nixos-baduk with manveru, all by our lonesomes
<gchristensen>
it used to be that things like hydra admin stuff would pollute #nixos-dev or be done in needlessly private places
<LnL>
does anybody else miss substitute/substituteInPlace when doing stuff outside of nix?
<gchristensen>
yes
<gchristensen>
LnL: have you seen abathur's resholver?
<LnL>
gchristensen: for my db this is actually bigger so I'm not sure if it's useful
<cole-h>
gchristensen: Was the problem with the comment filter because the systemd service was in a failing state (after the panic)? Or was it still running, but doing nothing?
andi- has quit [Ping timeout: 256 seconds]
andi- has joined #nixos-borg
<gchristensen>
still running doing nothing
<cole-h>
Whacky
<gchristensen>
a half-panicked state
<LnL>
still issues after the revert?
<cole-h>
Don't think so, was from way before the revert. Just didn't notice until emilazy pinged in -dev
<cole-h>
No output from the filter service until I checked the "last 12 hours" time window in Loki, so I thought something was strange (since last output was a panic)
<gchristensen>
LnL: zfs' permissions are very precise lol