System_Error has quit [Remote host closed the connection]
ungeskriptet has quit [Ping timeout: 260 seconds]
ungeskriptet has joined #linux-rockchip
System_Error has joined #linux-rockchip
ungeskriptet has quit [Ping timeout: 248 seconds]
ungeskriptet has joined #linux-rockchip
digetx has quit [Quit: No Ping reply in 180 seconds.]
digetx has joined #linux-rockchip
ungeskriptet has quit [Ping timeout: 276 seconds]
tlwoerner has quit [Remote host closed the connection]
tlwoerner has joined #linux-rockchip
ungeskriptet has joined #linux-rockchip
ungeskriptet has quit [Ping timeout: 252 seconds]
franoosh has joined #linux-rockchip
warpme has joined #linux-rockchip
ldevulder has joined #linux-rockchip
franoosh has quit [Ping timeout: 240 seconds]
franoosh has joined #linux-rockchip
chewitt has joined #linux-rockchip
stikonas has joined #linux-rockchip
xha has quit [Ping timeout: 252 seconds]
raster has joined #linux-rockchip
franoosh has quit [Read error: Connection reset by peer]
erg_ has joined #linux-rockchip
fleg has quit [Remote host closed the connection]
digetx has quit [Quit: No Ping reply in 180 seconds.]
sfo has quit [Remote host closed the connection]
digetx has joined #linux-rockchip
naoki has joined #linux-rockchip
naoki has quit [Client Quit]
stikonas has quit [Remote host closed the connection]
cbeznea has joined #linux-rockchip
digetx has quit [Remote host closed the connection]
digetx has joined #linux-rockchip
digetx has quit [Remote host closed the connection]
digetx has joined #linux-rockchip
digetx has quit [Client Quit]
digetx has joined #linux-rockchip
sfo has joined #linux-rockchip
fleg has joined #linux-rockchip
<diederik>
mmind00: Can we still send a revert of 1631cbdb8089 ("arm64: dts: rockchip: Improve LED config for NanoPi R5S") to Linus before 6.16 is released? And how would that work?
<mmind00>
diederik: what is the issue with those LEDs?
<diederik>
I've now done 20 warm reboots and 5 cold boots with 6.16-rc7 with that commit reverted and I haven't seen the hung-task problem yet
<diederik>
As that's the only port I'm using, that's a problem ... and a regression
<diederik>
See also the discussion from 2025-07-18 20:20:33 (CEST)
<mmind00>
I guess when you drop the "linux,default-trigger = "netdev";" line, the issue goes away?
ungeskriptet has joined #linux-rockchip
<diederik>
could be and sounds logical, but so far I've only tried a revert of that commit to make sure that is indeed the culprit.
<diederik>
but then again, without robmur01's hint I would have never thought it could be related to that commit
<mmind00>
also, while that DT change triggers the issue, the error seems to be in the trigger routine
<mmind00>
diederik: personally I'd like things minimal ... so if you could check if we could just drop the netdev trigger, that would be helpful
<mmind00>
ah ... just realized those are all netdev triggers :-D
<diederik>
yeah, the real problem is probably/possibly somewhere else and just triggered by my commit. So I figured that a proper investigation would be needed, but I'd prefer not to bring a regression into 6.16 before that is done
<diederik>
If there were an issue with the netdev triggers, I'd have expected it more with the LAN ports
<mmind00>
diederik: ok, so I guess I'll just do a revert, set you as "Reported-by" with your paste and send that to the armsoc people ... sounds ok?
<diederik>
Yep
<diederik>
I'll now do tests with the WAN trigger removed.
<diederik>
If you prefer you could wait for that, but I don't/didn't know if it's possible and how long it would take to get it to Linus (before 6.16 is released)
<qschulz>
diederik: worst case scenario you can ask for a backport to 6.16 and it would make it to 6.16.1 for example
<diederik>
dropping the trigger on the WAN port isn't enough. Now I'll put that back and drop the triggers on the LAN ports
<robmur01>
Got it: lock inversion between pid 615 and 758 - dev_change_flags holds rtnl_lock and ends up waiting for triggers_lock; meanwhile netdev_trig_activate() is trying to take rtnl_lock while led_trigger_register() holds triggers_lock
<robmur01>
diederik: any chance you could rebuild with lockdep enabled, confirm the splat and report it?
<diederik>
robmur01: I don't really understand what that means, but if you have a patch I'd be happy to try that (and report about it)
<diederik>
Dropping the LAN ports triggers wasn't enough either, so now testing with no netdev triggers
<robmur01>
I mean can you try enabling PROVE_LOCKING in your kernel config, and boot with the triggers enabled - that should spit out a report of the deadlock condition, which you can then give to the netdev/LED maintainers to fix
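The lock inversion described above can be illustrated with a small sketch. This is a toy model in Python, not kernel code and not how lockdep is actually implemented: it only records "lock A was held when lock B was taken" and flags a later path that takes the same locks in the opposite order. The class `LockOrderGraph` and its methods are hypothetical names made up for this example; the lock and function names in the comments are the ones robmur01 mentions.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, for illustration only).

    An edge A -> B means: some code path took B while already holding A.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search: is there an existing ordering src -> ... -> dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire_sequence(self, locks):
        """Record a path taking `locks` in order; return inversions found."""
        inversions = []
        held = []
        for lock in locks:
            for h in held:
                # If `lock` is already known to be taken before `h`,
                # taking `h` before `lock` now is an ordering inversion.
                if self._reachable(lock, h):
                    inversions.append((h, lock))
                self.edges[h].add(lock)
            held.append(lock)
        return inversions


graph = LockOrderGraph()

# Path 1 (per the chat): dev_change_flags holds rtnl_lock,
# then ends up waiting for triggers_lock.
path1 = graph.acquire_sequence(["rtnl_lock", "triggers_lock"])

# Path 2 (per the chat): led_trigger_register() holds triggers_lock,
# then netdev_trig_activate() tries to take rtnl_lock.
path2 = graph.acquire_sequence(["triggers_lock", "rtnl_lock"])

print(path1)  # []
print(path2)  # [('triggers_lock', 'rtnl_lock')]
```

If both paths run at the same time, each can end up holding the lock the other one needs, which would match the hung tasks seen at boot. A PROVE_LOCKING kernel is expected to report this kind of ordering problem even on a boot where the actual deadlock does not happen.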
<diederik>
robmur01: I can/will do that :)
ungeskriptet has quit [Remote host closed the connection]
ungeskriptet has joined #linux-rockchip
<diederik>
I've now warm rebooted 10 times in a row with no netdev triggers and that all went fine
System_Error has quit [Ping timeout: 244 seconds]
ungeskriptet has quit [Remote host closed the connection]
ungeskriptet has joined #linux-rockchip
System_Error has joined #linux-rockchip
<chewitt>
now that I have HEVC working nicely on 3588/3576 I'm thinking ... I wonder what's needed for HDR to work?
<chewitt>
however when I pick the commits, I end up with no DRM device for Kodi to render to
<chewitt>
not sure if Cristian lurks here or not, but thought I'd pass that info along :)
<diederik>
mmind00: I've now warm rebooted 20 times with the netdev triggers dropped, so I would be fine with a fixup instead of a full revert
<diederik>
Currently building new kernel with PROVE_LOCKING for further investigation ...
<mmind00>
diederik: nice ... but this one you should send to me :-)
<diederik>
ok, I can do that :)
<mmind00>
"mechanical changes" I can do myself, but when it comes to making stuff work, having seen things working on the actual hw is more helpful :-)
<diederik>
done :)
<Daanct12>
diederik: have you looked into rkvop2 module issue?
<diederik>
Daanct12: not sure what issue you're referring to, but likely not
<Daanct12>
so if you turn vop2 into a module (not builtin) your display would not work
<diederik>
I have built a kernel where the order of the 10-bit and 8-bit formats was reversed, but haven't gotten around to actually testing that
<diederik>
oh that one :) Piotr fixed that. Let me look up the ML post ...
<diederik>
chewitt: awesome :) I was pretty sure swapping wasn't a/the proper fix, but it would 'prove' where the problem lies
<diederik>
and I was also curious what effect that would have on 8-bit media
<chewitt>
none that I could see
<chewitt>
it was rendering 8-bit as NV12 and 10-bit as NV15, which was correct
<diederik>
great :)
<chewitt>
but the reorder was only ever a workaround until someone that actually reads/authors code eyeballed the real problem
<diederik>
Yeah, my tests would just be for a '+1' on your findings :)
System_Error has quit [Remote host closed the connection]
<detlevc>
chewitt: nice find! It was indeed supposed to keep 420/8 and 420/10; the decoder (as far as I can tell) doesn't support hevc 422
<detlevc>
I will change that in the next version of the series
<chewitt>
Alex Bee found/spotted the real problem, but good to see it will be fixed up
Daanct12 has quit [Quit: WeeChat 4.6.3]
System_Error has joined #linux-rockchip
xha has joined #linux-rockchip
mripard has joined #linux-rockchip
System_Error has quit [Remote host closed the connection]
System_Error has joined #linux-rockchip
lucaceresoli has joined #linux-rockchip
digetx has quit [Remote host closed the connection]
digetx has joined #linux-rockchip
dsimic has quit [Ping timeout: 240 seconds]
dsimic has joined #linux-rockchip
ldevulder has quit [Ping timeout: 240 seconds]
warpme has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<diederik>
Interesting. Booted into the kernel with PROVE_LOCKING enabled ... and 'memtest' as kernel parameter and all 3 boot attempts resulted in OOPS :-O
<diederik>
I just noticed that OOPS 1 was "error -EEXIST: failed to register extcon device" (again), but OOPS 2 & 3 are not
<robmur01>
the "corruption" itself is also puzzling: not a bad pointer, or NULL, or some numeric value, nor even ASCII... just a load of random-looking bytes written over where a pointer should be... what does that?
<diederik>
I have no idea how to interpret the printed out data, but if you click on line 1045 in OOPS 2 and line 447 in OOPS 3 and then switch between tabs ... there is a LOT of data the same
<diederik>
or e.g. in the x0..x29 fields: they either have the same values, or there seems to be a consistent pattern/difference between the ones that do differ
<robmur01>
yeah, the register state is likely to be pretty consistent for the same call stack - that "SUBSYSTEM=" in x16/17 is intriguing but I think unrelated :)
<diederik>
ok :)
<robmur01>
don't suppose you have DYNAMIC_DEBUG enabled so you can quickly boot 'dyndbg="file dd.c +p"' (or something to that effect) to signpost the driver probing?
<diederik>
AFAIK I do have that enabled. I have used it a couple of times
<robmur01>
if only the standard "failed to probe" message wasn't later than devres_release_all()...
<diederik>
do you want me to literally use "file dd.c +p" ? Or was that just an example
<robmur01>
just the ones in really_probe() should suffice to tell which driver is the culprit here
raster has quit [Quit: Gettin' stinky!]
<robmur01>
my hunch is the USB phy, since that does have a clk_bulk_get_all() and is already sometimes implicated by the extcon errors...
<diederik>
"platform fe310000.mmc: bus: 'platform': really_probe: probing driver sdhci-dwcmshc with device" I guess that confirms your suspicion?
<robmur01>
oh FFS, there it is: sdhci_pltfm_free() right at the end of dwcmshc_probe()... guess where that "priv" area is that various devres things are still pointing to when it defers because the regulator isn't ready?
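The failure mode robmur01 points at can be sketched as a toy model. This is a simplified Python illustration, not the real sdhci-dwcmshc code: `Priv`, `Devres`, `probe` and the `cleanup` action are hypothetical stand-ins, and "freed" memory is modelled with a flag. The assumption, taken from the discussion, is that the probe error path frees the private area by hand, and that the driver core releases devres-managed resources only afterwards.

```python
class Priv:
    """Stand-in for the driver's private area (hypothetical)."""

    def __init__(self):
        self.freed = False


class Devres:
    """Toy devres list: actions run in reverse order of registration."""

    def __init__(self):
        self._actions = []

    def add(self, name, action):
        self._actions.append((name, action))

    def release_all(self):
        # Models the driver core releasing managed resources after a
        # failed or deferred probe.
        log = []
        while self._actions:
            name, action = self._actions.pop()
            log.append((name, action()))
        return log


def probe(devres, regulator_ready, free_manually):
    priv = Priv()

    def cleanup():
        # A managed cleanup action that still points into the private area.
        return "use-after-free" if priv.freed else "ok"

    devres.add("cleanup", cleanup)

    if not regulator_ready:
        if free_manually:
            # Models freeing the private area in the probe error path,
            # before the managed actions have run.
            priv.freed = True
        return "EPROBE_DEFER"
    return "bound"


buggy = Devres()
result = probe(buggy, regulator_ready=False, free_manually=True)
print(result, buggy.release_all())

fixed = Devres()
result = probe(fixed, regulator_ready=False, free_manually=False)
print(result, fixed.release_all())
```

In the first case the managed action runs against an area that has already been freed, which in a real kernel could show up as random-looking bytes where a pointer should be. In the second case the private area is still valid when the managed actions run.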