<samueldr>
(and here also mixes the new kernel boot configuration thing needlessly?)
<samueldr>
or too shallow, where the article is basically "ftrace is cool! use this tool to ftrace this simple example"
<samueldr>
but nothing seems to map to boot time args
<samueldr>
ugh, I seem to recall having done maybe something similar in the past with net console
<samueldr>
but I'm unsure
<samueldr>
oh, uhm.... it might be tracing fine, but not on the console... from what I'm understanding
<samueldr>
which is not helpful because I need to trace a hang
<patagonicus>
Ugh. No wonder that the extlinux config doesn't show up in the image - making it copy the files to the dir that gets copied to the image doesn't help if you have the call to cp commented out.
<patagonicus>
I think tomorrow I'll finally order more HC2s. Having to build something on the machine, shut it down, swap SD cards, swap them again, boot the working installation, modify something and repeat is just super annoying.
<Ke>
samueldr: you are aware that ftrace keeps its own buffers and does not print to the kernel ring buffer or console?
<Ke>
oh I guess there is an option for that
zupo has joined #nixos-aarch64
alp has joined #nixos-aarch64
alp has quit [Ping timeout: 272 seconds]
t184256 has left #nixos-aarch64 ["Disconnected: Replaced by new connection"]
t184256 has joined #nixos-aarch64
Thra11 has joined #nixos-aarch64
wavirc22 has joined #nixos-aarch64
bennofs has joined #nixos-aarch64
<Ke>
samueldr: did you try kernel.tp_printk ftrace.tracer=function
bennofs_ has quit [Ping timeout: 258 seconds]
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #nixos-aarch64
evils has quit [Quit: Lost terminal]
evils has joined #nixos-aarch64
FRidh has joined #nixos-aarch64
<Ke>
tp_printk does not seem to be working; also, the trace is there only from the moment you cat it
<Ke>
tp_printk documentation says it's only for tracepoints, so probably not function tracer
<Ke>
I would guess tracer=function on serial console was deemed too insane to even try
<Ke>
I would guess might hit many rcu stalls or something
<Ke>
then there is some panic on hang options
<Ke>
do you get to userspace init at all?
alp has joined #nixos-aarch64
alp has quit [Ping timeout: 272 seconds]
FRidh has quit [Quit: Konversation terminated!]
alp has joined #nixos-aarch64
<patagonicus>
Woo! I have a working sd-image-armv7l-odroid-xu3.nix. :) Now I just need to get rid of the firmware partition (which means doctoring around with sd-image.nix).
<patagonicus>
It's also a bit hacky, I use the populateFirmwareCommands to dd the odroid bootloader onto the image. Those commands are only supposed to copy files to ./files, not mess with $img
alp has quit [Ping timeout: 272 seconds]
orivej has quit [Ping timeout: 265 seconds]
alp has joined #nixos-aarch64
alp has quit [Ping timeout: 272 seconds]
zupo has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
cole-h has joined #nixos-aarch64
<patagonicus>
And now I get an image without the vfat partition. :) Needs some cleanup and I'm honestly not sure if this is the right way to do it, but I'll send a PR soon-ish.
orivej has joined #nixos-aarch64
orivej has quit [Ping timeout: 256 seconds]
orivej_ has joined #nixos-aarch64
Guest49640 is now known as AmandaC
knerten has joined #nixos-aarch64
zupo has joined #nixos-aarch64
knerten1 has quit [Ping timeout: 256 seconds]
fpletz has quit [Quit: ^D]
orivej_ has quit [Ping timeout: 246 seconds]
<samueldr>
Ke: yeah, tp_printk worked for another type of event, but not for functions
<samueldr>
Ke: which is what confuses me
<samueldr>
the docs are *pretty thick* when you don't already know about all the deeper terms
<samueldr>
and do not point to a good primer
<samueldr>
and pretty much all articles about it are too high level, not really describing the scope, mostly "look, it's cool, it allows you to see function calls"
<samueldr>
so yeah :/ looks like I'll have to bisect between two major versions and find where it fails
<clever>
samueldr: hows this code look?
<Ke>
samueldr: I bet the reason this is not made available is that it's probably infeasible for most consoles
<samueldr>
it would have been easier to know in which function it hangs, so I could have looked at the history or diff between the functions
<samueldr>
Ke: tracing all events is already infeasible, but I think there is already a feature in the kernel to delay printks for that reason
<samueldr>
(when I tried to filter the events for "*" the console bugged out and I believe the kernel froze for a different reason)
<samueldr>
I'm also open to other ways to debug a hanging kernel
<samueldr>
I know it happens either in the display init or DRM
<samueldr>
I should probably get them built with -DDEBUG but I fear no human will have left useful debugging info there
<samueldr>
clever: what am I looking for?
<Ke>
if you leave those things as modules, do you get to userspace?
<clever>
more just how it's doing things, how clean it is
<lopsided98>
interesting, U-Boot 2020.07 added support for storing the environment on the SPI flash of the RockPro64, so it loaded the old env vars from the vendor U-Boot, which contain the wrong offsets, preventing it from booting
<patagonicus>
\o/ and a proof of concept install with root on LVM on LUKS. Just need to get ethernet to work in stage1.
<samueldr>
lopsided98: oh, that's naughty
<Ke>
well configuring u-boot to not look at the env is easy anyway
<samueldr>
sure, but we build the defconfig
<samueldr>
so when defconfig change their defaults, it's a breaking change, somewhat
<samueldr>
Ke: they're already modules, though present in the initrd for obvious reasons
<Ke>
I guess it's easy to override?
<samueldr>
the actual solution will be to saveenv or erase the spi flash to ensure there is no env
<samueldr>
but it's something that should be communicated as being an issue with the upgrade
<lopsided98>
I did 'env default -a' then 'saveenv' to fix it
<samueldr>
clever: the only thing I'd be wary of, but almost a nit, is echoing the cmdline, to future proof things it may be better to write a text file and copy that output
<samueldr>
anything that helps not having to escape for bash in addition to escaping for nix
<clever>
samueldr: but it can only break if the storepath for toplevel has a double-quote in it, heh
<samueldr>
sure, but future proofing for upcoming changes :)
<clever>
i think the "$(cat ...)" is safe from escaping problems
<Ke>
samueldr: if you get to userspace, you can start trace-cmd stream with output to console
<clever>
its only the nix level ${foo} that can evade bash's detection stuff
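A minimal sketch of the point about "$(cat ...)", assuming the command line has been written to a text file first. The function name `read_cmdline` and the file it reads are hypothetical, not taken from the code under review. The shell does not re-parse quotes inside the result of a command substitution, so a double quote in the file comes through untouched.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel command line from a file instead of
# interpolating it into the script text.
read_cmdline() {
    # "$(cat ...)" expands to the file's contents (minus trailing newlines).
    # Quotes inside the result are plain characters, not shell syntax.
    cmdline="$(cat "$1")"
    printf '%s\n' "$cmdline"
}

# Usage (the file name is an example only):
#   read_cmdline ./cmdline.txt
```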
<samueldr>
Ke: true
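If userspace is reachable, one alternative to trace-cmd is to drive the function tracer through tracefs directly and stream `trace_pipe` to the console. This is a sketch, not a tested recipe for this board: `setup_function_trace` is a hypothetical name, the mount point is assumed to be /sys/kernel/tracing (older kernels expose it under /sys/kernel/debug/tracing), and the `drm_*` glob is an example. A filter only matches functions the kernel already knows about when it is written, so it may be rejected for code that lives in a module that is not loaded yet.

```shell
#!/bin/sh
# Hypothetical helper: enable the function tracer via tracefs.
#   $1  tracefs directory (assumed default: /sys/kernel/tracing)
#   $2  optional function glob for set_ftrace_filter
setup_function_trace() {
    tracefs="${1:-/sys/kernel/tracing}"
    filter="$2"
    # Limit tracing to matching functions, if a glob was given.
    if [ -n "$filter" ]; then
        printf '%s\n' "$filter" > "$tracefs/set_ftrace_filter" || return 1
    fi
    printf '%s\n' function > "$tracefs/current_tracer" || return 1
    printf '%s\n' 1 > "$tracefs/tracing_on" || return 1
}

# Usage (as root, before the suspect drivers are loaded):
#   setup_function_trace /sys/kernel/tracing 'drm_*'
#   cat /sys/kernel/tracing/trace_pipe > /dev/console &
```

Streaming every function call to a serial console is likely to be very slow, which is why the filter argument is there.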
<clever>
samueldr: oh, but i cant easily add `passAsFile` to the derivation using `populateFirmwareCommands` so id need my own pkgs.writeText, and now it gets ugly
<Ke>
samueldr: I guess the options to make hung tasks panic etc. probably work better, if userland is running, but not sure
<samueldr>
I'm searching to see if it can be a kernel command line parameter
<samueldr>
(I don't know what constitutes a hung task and if it really ends up being one or if it would somehow evade this)
<Ke>
hung_task_panic=1 at least
<Ke>
sure no guarantees this will produce anything
<Ke>
there are also some lockup detection assertions at compile time
<Ke>
outside of lockdep
<samueldr>
CONFIG_BOOTPARAM_HUNG_TASK_PANIC <- unsure what option it adds, but the option exists and is not set
<samueldr>
oh, btw it's a hang that is caused by an "external input", it depends on how it's booted, and my strong assumptions that it's somewhere in the graphics stack are because of that
<samueldr>
(and where it stops printing)
<samueldr>
yeah, hung_task_panic requires that option to be set
* samueldr
preps a new build
<samueldr>
oh, I misread, no, that sets a build-time default
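Since the config option only sets the default, the effective setting can be checked at runtime through the `kernel.hung_task_panic` sysctl, provided userspace comes up. A small sketch follows. `hung_task_panic_state` is a hypothetical helper name, and treating a missing file as "detector not built in" is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: report what the hung task detector will do.
#   $1  path of the sysctl file (default: /proc/sys/kernel/hung_task_panic)
hung_task_panic_state() {
    knob="${1:-/proc/sys/kernel/hung_task_panic}"
    if [ ! -r "$knob" ]; then
        # Assumption: the kernel was built without hung task detection.
        echo "unavailable"
    elif [ "$(cat "$knob")" = "1" ]; then
        echo "panic"
    else
        echo "warn-only"
    fi
}

# Usage:
#   hung_task_panic_state
```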
<Ke>
I guess though, if that would help, you would already see the messages
<Ke>
so maybe a silly suggestion anyway
<Ke>
adding a panic there, is mostly useful, when you are not looking at the logs otherwise
<samueldr>
if I had a trace of the function where it hangs, things would be easier
<samueldr>
either a backtrace or a log of all called functions
<Ke>
I guess magic sysrq does not help
<samueldr>
(or any other strategy to debug kernels)
<samueldr>
hmm
<Ke>
you could try leaving in background something that pokes some value to sysrq-trigger, in case everything does not hang
<Ke>
like (while sleep 1; do echo t > /proc/sysrq-trigger; done ) &
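The same idea as a small function, in case it needs to go into an init script. This is a sketch: `sysrq_poke` and its parameters are hypothetical names. The key defaults to `t` (dump task states) as in the one-liner; `w` (blocked tasks) or `l` (backtraces of active CPUs) may be quieter choices.

```shell
#!/bin/sh
# Hypothetical helper: periodically write a sysrq key to the trigger file.
#   $1  trigger file (default: /proc/sysrq-trigger)
#   $2  sysrq key    (default: t)
#   $3  repetitions  (default: 0, meaning loop forever)
#   $4  seconds to sleep between writes (default: 1)
sysrq_poke() {
    trigger="${1:-/proc/sysrq-trigger}"
    key="${2:-t}"
    count="${3:-0}"
    delay="${4:-1}"
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        printf '%s\n' "$key" > "$trigger" || return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Usage (as root, left running in the background):
#   sysrq_poke &
```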
<samueldr>
init isn't started :(
<samueldr>
or maybe it is and I just never checked the order of the console= params
<samueldr>
>:|
<samueldr>
I changed the console parameters order and it's booting
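Order matters because kernel messages go to every console= that is listed, while /dev/console (and so init's output) is normally backed by the last one. A sketch for checking which console that is on a given command line follows. `last_console` is a hypothetical helper, and the sample values in the usage comment are examples, not the parameters used here.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last console= parameter in
# the given command line string. Runs in a subshell so that disabling
# globbing does not leak into the caller.
last_console() (
    set -f
    last=""
    for word in $1; do
        case "$word" in
            console=*) last="${word#console=}" ;;
        esac
    done
    printf '%s\n' "$last"
)

# Usage:
#   last_console "$(cat /proc/cmdline)"
#   last_console 'console=ttyS2,1500000n8 console=tty0'
```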
<Ke>
asking irc and not getting any useful advice always helps =o)
<samueldr>
I know
<samueldr>
which is why I was poking at IRC
<samueldr>
but it's not good
<samueldr>
it means that when the logs are sent to the display it'll lock-up
<samueldr>
>:| did this seriously kill another SD card?
lopsided98 has quit [Quit: Disconnected]
<samueldr>
well, half-kill
<samueldr>
I spent the entire day of thursday chasing a bug that doesn't exist because a specific card stopped being good in that device, but not on my main computer
<samueldr>
and now it's doing the same here
<samueldr>
or it could be the reader
lopsided98 has joined #nixos-aarch64
* samueldr
starts F3
<samueldr>
though it shouldn't be affecting the flash memory, the board says it doesn't answer to card select
<samueldr>
and on two computers it just shows up fine
* samueldr
sighs
alp has joined #nixos-aarch64
<fps>
hmm, it seems my kernelPatches extraConfig didn't get applied
<fps>
i forgot where i picked this up. i tried to go through nixpkgs a little bit to see if that even _should_ work :)
<fps>
but i failed
<fps>
kernelPatches is a list of attribute sets, but i couldn't find out what was allowed and effective in these attribute sets
<fps>
could i just as well just add an extraConfig field to the argument to ./kernel.nix?
<patagonicus>
Hmm. That's annoying. AFAICT the module for ethernet device is loaded so late that both ip= on the kernel command line and the udhcpc autoconfig don't catch it.
<fps>
or it might just be that PREEMPT_RT is the wrong option. trying with PREEMPT_RT_FULL y
<samueldr>
oh, good news, it failed
<samueldr>
with console= re-ordered
<samueldr>
so it was a spurious non-failing boot (the first) just before
<Ke>
samueldr: if this is u-boot loading something on pine rk3399, I have an issue where my pbp loads corrupt data from the good and fast card and I need to use a slow low end card to boot it
<samueldr>
Ke: on two cards, pbp, u-boot stopped being able to read the two different brand cards
<samueldr>
(and bought multiple years apart)
<Ke>
analogous to mmc init code learning too high a speed for the hw and reading corrupt data due to it
<samueldr>
stopped as in multiple dozens of good boots
<samueldr>
so yeah, init runs, I can hook anything from userspace before udev loads the drivers
<Ke>
my strong belief is that the u-boot driver and pbp hw combo is not good (maybe for high end cards)
<samueldr>
quite possible
<samueldr>
though it's weird that it was working well for a while, and stopped working
<samueldr>
f3probe tells me there's nothing wrong with the card
<Ke>
like your kernel might be corrupt, unless it's validated somehow after loading from sd
<samueldr>
u-boot says the card doesn't answer to voltage select
<samueldr>
so it's not even reading the card
<samueldr>
but u-boot booted from that same card!
<Ke>
this is not the only hw where I have sd compat issues, my mcbin maskrom fails to read from the sd card that can boot pbp
<Ke>
so I have learned to expect incompatibility on not broken hw
<samueldr>
oh, I'm not exactly surprised by difficulties
<samueldr>
only worried that something bad is happening that shouldn't be happening since it eventually stops working
<Ke>
I have heard of people using a patch to limit eMMC bus speed in u-boot on rk3399 hw
<Ke>
maybe something similar is going on with sd
<samueldr>
I would assume voltage select is separate from this all
<Ke>
I think it was just a dtb patch
<samueldr>
yeah
<clever>
ive been looking into SD stuff lately, for the rpi
<clever>
and one of the commands you run, tells you the max bus speed for the SD bus
<clever>
and another command, tells you what voltage ranges the card supports, and you must then tell the card what voltage youve set the rails to
<clever>
in the case of the rpi, i think the voltage is set with a mosfet controlled via gpio? but ive not looked into which one yet
<clever>
ah, there it is, rpi4b SIO_1V8_SEL pin 4 on the i2c expander
<clever>
rpi1b+, pin 38 internal gpio
<clever>
and its missing on every other model!
<clever>
so only rpi1b+ and rpi4b can do 1.8v, all others are stuck at 3.3v i think?
<clever>
but due to the lack of schematics, i cant confirm how its doing things on either
<samueldr>
whew, having initrd work is great, I slept 5 between each module load and I *think* I have enough knowledge to know which module it is
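The sleep-between-loads approach could look roughly like this. It is a sketch under assumptions: `load_modules_slowly` is a hypothetical name, the module names in the usage line are placeholders, and writing to /dev/kmsg is assumed so that the marker lands in the kernel log right before the module's own messages.

```shell
#!/bin/sh
# Hypothetical helper: load modules one at a time with a pause in between,
# logging which one is about to be loaded so the last marker before a hang
# points at the culprit.
#   $1  seconds to sleep after each load
#   $2  file to log to (e.g. /dev/kmsg)
#   $3  loader command (e.g. modprobe)
#   $4+ module names
load_modules_slowly() {
    delay="$1"
    log="$2"
    loader="$3"
    shift 3
    for mod in "$@"; do
        printf 'about to load %s\n' "$mod" >> "$log"
        "$loader" "$mod" || printf 'loading %s failed\n' "$mod" >> "$log"
        sleep "$delay"
    done
}

# Usage (module names are placeholders):
#   load_modules_slowly 5 /dev/kmsg modprobe module_a module_b
```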
rajivr has quit [Quit: Connection closed for inactivity]
<clever>
samueldr: i came across this 8gig SD "card" in discord today
<samueldr>
that looks fun
<clever>
tracing things out on the pcb, it seems to only have one chip, and the lines for the SD interface go directly to the chip (and pass by a few connectors)
<clever>
i think its just emmc?
<clever>
but with connectors for sniffing
<samueldr>
possibly
<samueldr>
AFAIUI SD can talk to eMMC
<samueldr>
for a while the hardmod methods for 3DS were using that fact to tamper with the image
<samueldr>
(and backup)
<clever>
yeah, sd and emmc share a lot of commands
<clever>
and the protocol is compatible for some modes
zupo has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
zupo has joined #nixos-aarch64
pbb has quit [Quit: No Ping reply in 210 seconds.]