michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
iive has quit [Quit: They came for me...]
BradleyS has quit [Read error: Connection reset by peer]
TheVibeCoder has quit [Ping timeout: 244 seconds]
TheVibeCoder has joined #ffmpeg-devel
BradleyS has joined #ffmpeg-devel
<bcheng>
Lynne, jkqxz: No, 26a2a763 breaks vulkan_vp9 when loop_filter_delta_update = 1 (and I presume if segmentation_update_data = 1 too)
<TheVibeCoder>
why did ffmpeg merge code that adds new global symbols in libavcodec/smpte_436m.c?
<TheVibeCoder>
some of them are unused outside libavcodec
<JEEB>
that sounds like a thing that could be highlighted by the CI to make it easier to spot during review
<JEEB>
since people *will* forget to make everything static
<JEEB>
plus the whole av_* prefix if that was utilized
<rix>
Is there any work going on to integrate ffmpeg with EBU ADM Renderer (EAR) for rendering object audio conforming to BS.2127?
<rix>
maybe part of the IAMF work?
wellsakus has quit [Server closed connection]
<JEEB>
rix: I think there was a filter patch using the videolan library?
<rix>
JEEB: didn't find anything in search, any link?
<TheVibeCoder>
JEEB: libspatialaudio?
<JEEB>
yea also mentally missed "object audio", I think libspatial was just ambisonics? or not?
<TheVibeCoder>
yes
<TheVibeCoder>
ambisonics only
<JEEB>
this is me on E_NOT_ENOUGH_COFFEE
Kimapr has quit [Remote host closed the connection]
<JEEB>
object based audio anyways needs to have the object channels to be as object channels
Kimapr has joined #ffmpeg-devel
<JEEB>
plus the per-object location metadata
<rix>
so.. there's no object audio renderer for IAMF support in ffmpeg yet?
<JEEB>
no, there is IIRC no support for object audio AVFrames yet.
* JEEB
goes check the header just in case
<rix>
i think that would also enable proper render of ATMOS too
<JEEB>
yea, nada in channel_layout.h or frame.h
<JEEB>
yes
<JEEB>
(of course you need to """"just"""" be able to decode the bit stream)
<rix>
decoding is not that hard; there is some recent progress, someone reversed the metadata format
<JEEB>
but I would like opinions on how to extend AVFrame and channel layouts to be compatible with and enable ISO, ITU and D object definitions in AVFrame
<JEEB>
if someone has poked at all of MPEG-H 3-D Audio, the ITU object audio rendering spec as well as the D *bleep*
microchip_ has quit [Remote host closed the connection]
<rix>
use coordinate system
<TheVibeCoder>
polar
<JEEB>
yea but stuff like amount of objects (essentially object channels) usually utilized, what coordinates etc are required for each object channel etc
<JEEB>
also is there something like object metadata switching WITHIN a decoded frame
<TheVibeCoder>
cant find that anywhere
<JEEB>
as in, is metadata related to these (say) 1024 samples I just decoded, or is it a separate time line
<TheVibeCoder>
i would like to add a renderer for object-based channels, 16ch in truehd
<TheVibeCoder>
looks like only Cavern has something, maybe suboptimal?
<rix>
that's standardized in ADM (BS.2076)
<rix>
EBU had libadm for that
<JEEB>
yea, truehd is the one where I think we're closest since the channels should be there, need to be parsed and interpreted correctly of course
<JEEB>
the extension to E-AC-3 is at least defined in ETSI
<TheVibeCoder>
rix: the metadata that is put into ADM BWF, it usually contains object coordinates, sometimes also frequency range of object
<TheVibeCoder>
is it similar what dolby does? is metadata format actually same?
<JEEB>
I expect metadata format in the container is different, but can be mapped into something else (I recall the JOC spec in ETSI having a mapping defined for something else)
<JEEB>
which is why I expect that one can extend AVFrames in a manner that should be compatible between MPEG-H 3-D Audio, the ITU definitions and D stuff
<TheVibeCoder>
i guess no extra fields in lavu channel layouts are necessary
<TheVibeCoder>
just add another type of frame side-data
<TheVibeCoder>
and pass it from decoder to filter doing actual rendering
<JEEB>
you need to specify which channels have object XYZ, no? you may have a better idea of course :D
<JEEB>
usually it's limited like 16 or so object channels (~=objects)
<JEEB>
and then you have metadata for them
<TheVibeCoder>
decoder passes all decoded objects to the filter, and each decoded audio frame has side data if something changes; the filter picks up the metadata in frame side-data and renders 2.0/5.1/7.1.4 or whatever the user needs
<TheVibeCoder>
there can be more objects, like wav with 98 channels
<JEEB>
yea
<TheVibeCoder>
but not all of them are active all the time
<TheVibeCoder>
and there is a mixed format IIRC, real channels + objects
<JEEB>
yes
<JEEB>
which is why you probably need to extend channel layout so that you can mark which are normal channels and which are object audio
<TheVibeCoder>
JEEB: yes, that's what i mean by frame side-data, the decoder puts all necessary metadata into it and that is later used by the filter
<TheVibeCoder>
of course, nothing prevents user playing all 16 channels without actual renderer/filter
<TheVibeCoder>
now if you want to integrate renderer inside decoder, that is big mess i do not want to do
<JEEB>
yea, no
<JEEB>
decoder puts the metadata, then you have a filter that possibly is also called from swresample (so you get smoother experience if someone "just wants XYZ channel layout output")
<TheVibeCoder>
swresample is ugly, obsolete, and low quality in many aspects
<JEEB>
yes, which is why I don't say put the impl into swresample
<JEEB>
just that it may make sense to call the filter from swresample's AVFrame API so that existing API callers can easily get results
<michaelni>
Lynne, Source plugins were added long ago, this is just documentation
<Lynne>
michaelni: no, its not
<Lynne>
this officially gives anyone who wants to have their own custom branch officiated by us a path
<michaelni>
Lynne, are you going to fix the security vulnerability in the AAC code that you maintain ? (which is open since 6 months)
<Lynne>
I did?
<Lynne>
the TNS stuff, right?
<TheVibeCoder>
new one
<Lynne>
literally, anyone can ask to have their custom closed-source binary CUDA j2k decoder easily enabled at compile time
<GewoonLeon>
> Checklist to be listed in merge-all-source-plugins
<GewoonLeon>
that seems to imply that this is the way to get listed in there which would be akin to an endorsement
<Lynne>
or H264
<Lynne>
yes, that's my issue with it
<michaelni>
"[FFmpeg-devel] [RFC] AAC (USAC) bug" or "avcodec/aac/aacdec_usac: Fix memory deallocation of pl_data", i did send a patch but my patch was bad
<michaelni>
someone needs to fix it properly, andreas explained how
<michaelni>
source plugins are strictly GPL / LGPL only no closed source allowed, thats in the documentation patch
<TheVibeCoder>
make decoder experimental
<Lynne>
michaelni: nothing's stopping anyone from using dlopen
<michaelni>
we are stopping that from being added
<Lynne>
manually?
<michaelni>
yeah, we manually approve pull requests that add stuff like new source plugins
<Lynne>
do we HAVE to monitor for more crazy stuff being merged?
<Lynne>
first, I HAVE to monitor all MAINTAINERS requests
<Lynne>
because that's not confusing enough
<Lynne>
then I HAVE to monitor all source plugins requests
<Lynne>
and I have, what, a day on a bad day, and 3 days on a good day to respond?
<Lynne>
I don't want that
<michaelni>
you dont HAVE to, but you want to
<Lynne>
of course I want to
<michaelni>
also plugins would get removed immediately if something bad goes in. This is not an area we disagree about
<Lynne>
everything that's official, like asking for push access and now this is so freely worded that we very often get requests from strangers with zero commits
<Lynne>
we don't want anything bad to go in, ever
<michaelni>
also goes in just means its in a list, the user is free to merge this or not
<Lynne>
I distrust developers, especially softworkz
<michaelni>
why sw ?
<Lynne>
merging random badly coded features with no reviews
<michaelni>
i exchanged several private mails with him, i am confident he has no bad intent and also i know his real name
<Lynne>
I am confident he has no bad intentions either but either way what he codes and merges are dumb stuff that we really have to fight to fix
<Lynne>
regardless, I am not comfortable with having this feature be officially usable by everyone, and I demand that this is made for official-use only
<GewoonLeon>
michaelni: another thing is that you waited not even 2 days *mostly on a weekend* to merge
<michaelni>
The patch is just documentation
<michaelni>
if you remove the documentation then nothing says it needs to be GPL / LGPL software
<michaelni>
and nothing says it needs to have support for security fixes
<TheVibeCoder>
NAK
<Lynne>
michaelni: its not *just* documentation
<Lynne>
its a procedure
<Lynne>
once again, I am not comfortable with this, and I want to either revert it, or append a sentence "Source plugins are only available for official repositories maintained by the FFmpeg project."
<Lynne>
you decide what
<michaelni>
FFmpeg can only play videos available for official repositories maintained by the FFmpeg project ?
<Lynne>
no, only official repositories associated with the project can be added to tools/merge-all-source-plugins is what I'd like
<Lynne>
if you want to have SDR as a source plugin -- feel free to add it
<michaelni>
i understand, but the idea of free software and open source is not to build a walled garden where only "our" things can be used, dont you agree / see this ?
<Lynne>
for this particular case, no, I only see ways in which it can be exploited in its current form
<michaelni>
please elaborate how this can be exploited
<Lynne>
I already did - custom dlopen() decoders that circumvent LGPL/GPL
<michaelni>
we remove these, simply
<Lynne>
this means that we need to check each of them periodically
<Lynne>
each branch
<Lynne>
and each repository
<michaelni>
someone would notice that without us checking
<Lynne>
we are busy enough that we have to ping each other for security fixes. we have no time to check all repositories all the time
<michaelni>
We can automate it even
<Lynne>
no.
<Lynne>
we cannot
<michaelni>
git grep dlopen ?
<Lynne>
its so easy to bypass names in source code, and you know it
<Lynne>
you merged this early too, and I gave you an alternative
<michaelni>
but this is not the case, we spell out the rules like pure GPL/LGPL and the person submitting the plugin to be added has to agree to this
<Lynne>
people don't play by the rules
<michaelni>
so now we have people hiding dlopen() in plugins, breaking contracts.
<michaelni>
and none of the users noticing
<Lynne>
for a third time, I am really not comfortable with this mechanism being open, nor being merged after barely a day. I would like to know whether you want to approve this temporary revert, after which we can discuss this on the mailing list, or would like to accept my addition of this being only available for official repositories
<michaelni>
We can discuss, that was my suggestion; there's no plugin from a non-FFmpeg developer, not one, so also not one with dlopen, none hiding dlopen, and none breaking the contract
<paulk>
hi, sorry if this is a question that often comes up, but is there any particular reason I can't ask libavcodec to decode a 420 subsampled video to nv12 instead of the default yuv420p?
<paulk>
it just makes me sad to have to convert yuv420p -> nv12
<paulk>
or did I miss the way to do it?
<BtbN>
That's just what the decoder has been written to output
<paulk>
so it could be extended to also support nv12 and it would work to request it with the get_format callback that's usually used for hwaccel?
<BtbN>
No idea if that's worth it.
<TheVibeCoder>
do moratorium on voting for new leader
<BtbN>
You'll have to read the code and figure out how deeply involved the pix fmt is
<paulk>
yes yes that's what I mean by extended, not just dropping the format in the list ;)
<BtbN>
converting yuv420p to nv12 is also incredibly fast since there are CPU instructions for it
<BtbN>
so it'll be a hard sell for that patch
<paulk>
incredibly fast on incredibly fast cpus
<paulk>
not on 200 MHz ARMv5
<BtbN>
Why are you decoding video via CPU on a 200Mhz armv5?
<paulk>
to push the limits :)
<paulk>
but really my point is that optimization is a good thing in itself
<BtbN>
Not if you introduce huge extra complexity for a rather theoretical benefit
<paulk>
right but that should not be the case here
<BtbN>
Well, I don't know the relevant code. There's a good chance the assumption of 3 separate planes is pretty hard-coded in there.
<paulk>
yes probably a matter of adding an indirection layer where samples are stored
<BtbN>
i.e.... build a planar->semi-planar converter right into the decoder?
<paulk>
anyway I wanted to make sure this is an actual limitation
<BtbN>
Might as well just do it after the fact then
<paulk>
no I mean storage
<paulk>
just writing memory to different offsets
<paulk>
not writing + reading + writing again
<BtbN>
You're gonna be running the same instructions to convert there as the scaler does
<paulk>
no the samples are the same
<paulk>
they are just stored differently
<BtbN>
but you need to interleave them
<BtbN>
so same computational effort
<BtbN>
plus, you can do it faster if you have larger chunks of it available for simd
<paulk>
interleaving is not an operation, it's just a different offset
<paulk>
there is no computational cost
<paulk>
compared to reading + writing
<paulk>
which has significant cost
<paulk>
that I'm trying to avoid
<paulk>
it's memory access cost, not cpu cost
<BtbN>
the cost is absolutely negligible on any CPU that can sensibly decode h264
<paulk>
no it's really not when it comes to serious frame sizes
<paulk>
and it's mostly the interconnect bw that matters, not the cpu
<BtbN>
you underestimate how fast even not so modern CPUs are at interlaving data like that
<paulk>
again, I'm talking about memory access, not cpu
<paulk>
the problem is storing the whole frame in memory and then reading it again and then writing it again
<paulk>
even if the cpu does no operation at all on the data
<paulk>
it has very significant cost
<BtbN>
You're welcome to develop a patch, but if it's a huge thing, and there is no tangible benefit outside of 200Mhz arm32 CPUs... I'm not sure the chances are good
<paulk>
I think it would be quite a nice optimization given that nv12 is very commonly used as the default format in many workflows
<BtbN>
I rarely ever see it outside of hwaccel contexts
<paulk>
another thing that would be nice is the ability to import source memory
<paulk>
but looking at the code that's a pretty big no
<BtbN>
What do you mean?
<BtbN>
You can construct frames that point to whatever data you like
<BtbN>
or packets
<paulk>
for the encode yes, but the decoder will use its own
<JEEB>
isn't get_buffer2 used for decoders
<paulk>
you can pre-allocate data in the AVFrame, it won't use it
<BtbN>
You very much can do that
<JEEB>
at least mpv seems to be utilizing direct rendering to get_buffer2 buffers successfully
<paulk>
oh
<paulk>
then that's one I missed, thanks!
<jkqxz>
Not all decoders can. AV_CODEC_CAP_DR1 means the decoder can do that.
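As a rough sketch of the direct-rendering idea behind get_buffer2 / AV_CODEC_CAP_DR1 (the types and names below are hypothetical, not the actual libavcodec API): the decoder asks a caller-supplied callback for its output buffer instead of allocating one itself, so the caller can hand over preallocated or device-visible memory and skip a copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for an output frame. */
typedef struct Frame {
    uint8_t *data;
    size_t   size;
} Frame;

typedef int (*get_buffer_cb)(Frame *f, size_t size, void *opaque);

/* "Decoder": obtains its output buffer via the callback, then writes
 * decoded data straight into it. */
static int decode_into(Frame *f, size_t size,
                       get_buffer_cb get_buffer, void *opaque)
{
    if (get_buffer(f, size, opaque) < 0)
        return -1;
    memset(f->data, 0x80, f->size); /* pretend this is decoded pixel data */
    return 0;
}

/* Caller-owned allocator: hands out a preallocated buffer, so no
 * allocation or extra copy happens in the decode path. */
static int pool_get_buffer(Frame *f, size_t size, void *opaque)
{
    static uint8_t pool[64];
    (void)opaque;
    if (size > sizeof(pool))
        return -1;
    f->data = pool;
    f->size = size;
    return 0;
}

/* Self-check helper: decode 16 bytes through the pool and return byte 0. */
static int demo_decode_first_byte(void)
{
    Frame f;
    if (decode_into(&f, 16, pool_get_buffer, NULL) < 0)
        return -1;
    return f.data[0];
}
```

This is the pattern mpv uses with get_buffer2 for direct rendering; as jkqxz notes, only decoders advertising AV_CODEC_CAP_DR1 support it.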
<jkqxz>
What codec are you interested in for NV12 output?
<paulk>
excellent
<jkqxz>
It would be a huge nightmare to put in any of the standard video codecs because it would change all the reference frame reading code as well.
<BtbN>
What do you even _need_ nv12 for? I'm not aware of anything in ffmpeg that strictly needs it
<paulk>
well usual video codecs, h.264/5, vp8/9, av1
<paulk>
jkqxz: ah right, that would definitely mess with references
<BtbN>
you'd pretty much have to template the entire code of the decoder, and output two variants of it per pixel format
<BtbN>
cause you don't want to introduce per-pixel branching for sure
<paulk>
like I said some abstraction for store/access of samples would do
<BtbN>
no, that'd run branching code on every single pixel
<BtbN>
that'd be slow af
<paulk>
but it would need to be hooked many places
<paulk>
BtbN: no.. that's not how it works.
<paulk>
you already have plenty of branching when reconstructing and storing samples
<jkqxz>
A lot of assembly depends on the format (anything for inter prediction, basically). You would need to rewrite a lot of the assembly functions.
<paulk>
ah yes the assembly
<jkqxz>
And if you didn't do that it would likely be slower than the current decode + convert.
<paulk>
yes for sure
<BtbN>
The current code is pretty optimized to not branch more than absolutely necessary.
<JEEB>
x264 found that nv12 was in some cases faster, and thus internally x264 operates on nv12. not sure of benefits for decoding, but you indeed would have to work on SIMD as well to at least get NV12 output to a comparable state
<jkqxz>
So... certainly theoretically possible, but a lot of work.
<paulk>
BtbN: you could introduce function pointers, etc
<paulk>
jkqxz: indeed, does not sound very realistic
<JEEB>
but yea, would require work to get to that state where you could finally compare whether it makes sense
<BtbN>
That's still a bunch of extra instructions running for _every single pixel_
<BtbN>
that stuff adds up fast
<BtbN>
You want to have differences like that be very high up, not at the almost lowest level
<paulk>
I guess I'm a bit surprised by it as a linux dev working with hardware that usually supports many output formats
<BtbN>
Hardware decoders usually support exactly one pixel format as well
<BtbN>
sometimes nv12, sometimes weird tiled formats you then have to deal with
<jkqxz>
Hardware generally cheats by not using the output as reference. Just write twice.
<paulk>
not usually, some do, but most will propose multiple formats
<jkqxz>
That might actually be easier for the software decoder implementation?
<paulk>
yes that's another approach
<paulk>
have rec buffers for decode too basically
<BtbN>
I'm really not convinced such a huge effort is worth it in any way
<paulk>
but it can be a big memory cost too
<iive>
iirc nv12 is two planes, one full Y and a 1/2*1/2 UV interleaved
<BtbN>
It's probably much easier to make whatever can't deal with yuv420p you got there be able to do so
<paulk>
anyway I see why it makes sense for the sw implementation to only support yuv420p
<JEEB>
iive: yes, half-packed
<JEEB>
you have luma on one, then chroma interleaved
<JEEB>
(the 1/2 x 1/2 comes from it being a 4:2:0 format)
<iive>
iirc the interleave is one block wide. so it could process both UV with a single operation.
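The layout being discussed can be shown in a few lines of plain C (a generic illustration, not FFmpeg code): NV12 keeps the full-resolution Y plane as-is and packs the two quarter-size chroma planes into one interleaved UV plane, so a yuv420p-to-NV12 pass is pure data movement with no arithmetic on the samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave separate U and V planes (yuv420p) into one UV plane (NV12).
 * Every sample keeps its value; only the offset it is stored at changes,
 * which is the storage-vs-computation distinction made above. The cost
 * is the extra read and write of the chroma data, not CPU arithmetic. */
static void uv_interleave(const uint8_t *u, const uint8_t *v,
                          uint8_t *uv, size_t chroma_samples)
{
    for (size_t i = 0; i < chroma_samples; i++) {
        uv[2 * i]     = u[i]; /* NV12 stores U first ... */
        uv[2 * i + 1] = v[i]; /* ... then V, per sample pair */
    }
}

/* Self-check helper: interleave 2x2 chroma planes and return the sample
 * at index idx of the resulting UV plane. */
static uint8_t uv_sample_after_interleave(size_t idx)
{
    const uint8_t u[4] = {1, 2, 3, 4};
    const uint8_t v[4] = {5, 6, 7, 8};
    uint8_t uv[8];
    uv_interleave(u, v, uv, 4);
    return uv[idx];
}
```

Doing this after decode means writing, re-reading, and re-writing the chroma planes, which is the memory-bandwidth cost paulk wants to avoid; doing it inside the decoder avoids the round trip but, as noted, collides with reference-frame reads and format-specific SIMD.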
<paulk>
jkqxz: btw if I'm not mistaken you're dealing with hwaccel stuff in ffmpeg?
<jkqxz>
I know a little about it.
<paulk>
I'm trying to see how to get the v4l2 stateless ffmpeg hwaccel series accepted
<jkqxz>
Has it ever been submitted?
<paulk>
(as the author of the uAPI for stateless decoding in linux and cedrus kernel driver)
<paulk>
jkqxz: there was a v1 in 2019 that received negative feedback
<BtbN>
iirc it was in not a great state
<jkqxz>
I remember it being discussed a long time ago, but the people owning it wanted to keep it as their own branch for some reason.
<paulk>
but it's not always very well implemented in userspace ;)
<jkqxz>
Tbh I find exposing the whole thing as a kernel API kindof horrifying.
<paulk>
it was that or introducing a bitstream parser in the kernel basically
<jkqxz>
The desktop hardware puts all of the complexity in user mode and just has simple buffer submission layers on the kernel for a reason.
<JEEB>
yea, I mean I understand that when kernel is the only thing you poke at you spit at things like vaapi interfaces etc
<paulk>
ah yes, when you have a virtual address space for everything :)
<JEEB>
or vdpau
<paulk>
it's very common in hardware to not have any sort of MMU/IOMMU outside of the x86 world
<paulk>
for dma access
<jkqxz>
Which is fine, becuase you have a nice DRM driver which can allocate you the buffers to use?
<jkqxz>
Anyway, suggest submitting again and maybe someone can review it this time.
<paulk>
ok
<Kwiboo>
paulk: have you tested v2 with the upstream rpi hevc driver yet? I was hoping to get around to sending a v3 some time in the future..., https://github.com/Kwiboo/FFmpeg/commits/v4l2request-2024-v3/ includes some cleanup compared to the v2 I sent last year; it may require some changes for rpi hevc to use the correct drm fourcc
<jkqxz>
I will try, but do not guarantee anything because obviously a large patch takes a very long time to review properly.
<BtbN>
If there were serious issues with it last time, just re-submitting alone is probably not enough
<paulk>
Kwiboo: nice to see you here!
<paulk>
no I don't do rpi stuff
<JEEB>
and I only have a rpi3 so none of the requests stuff :)
<JEEB>
since all of the newer rpis were meh
<JEEB>
(price wise)
<paulk>
well if someone is serious about reviewing and needs hardware let me know
<JEEB>
I at least have rk stuff that has AV1 I think
<JEEB>
and thank goodness that driver is upstream
<JEEB>
so my fedora exposes it
<paulk>
yes we have good rockchip support now
<paulk>
verisilicon and rkvdec drivers
<paulk>
including rk3588 which has av1
<JEEB>
trying to find the stateful one. but that's more or less not going anywhere unless someone guides me through why some big commit does the things it does in a big commit
<JEEB>
"rework XYZ" is kind of "how do I review this?" territory
<paulk>
Kwiboo: FYI I'm working on the stateless encode API these days, for Hantro VC8000E and later H1 and Allwinner VE
<jkqxz>
It is time, not hardware! Happy to buy whatever you suggest if it would actually make things easier to test, tbh. These things are all very cheap.
<JEEB>
but yea, I understand people hack on things and then you end up with a bunch of changes. but then such a big commit including XYZ changes is something that's not exactly easy to pull into upstream ^^;
<paulk>
but an orangepi with H3 is a good reference
<paulk>
CounterPillow: new ones lack DDR init in u-boot unfortunately, but still work with the proprietary first-stage loader
<paulk>
(new meaning after rk3399)
<CounterPillow>
I am aware
<JEEB>
I wonder if I can ask the Slop Machine to split that commit into logical change sets
<paulk>
could be reversed as it's the usual synopsys umctl2 controller, but no time to do it :(
<CounterPillow>
someone is working on reversing the rk3566 ddr init in Rust™
<Kwiboo>
paulk: cool! think I only tested the downstream rpi hevc decoder last time around; with the mainline driver now on list I think it is time to re-test. there was some ambiguity related to hevc slice decoding: v2 only sent one slice per request, for v3 there was a change to send multiple slices per request
<paulk>
ah yes we do support multiple slices per request
<paulk>
Kwiboo: ok then I can try to test your latest on rk and allwinner
<Kwiboo>
paulk: the version at https://github.com/jernejsk/FFmpeg/commits/v4l2-request-n7.1/ is the one used by LibreELEC at the moment and should be stable, I should check if anything differs compared to my own tree and start a rebase on ffmpeg master ;-)
rvalue- has joined #ffmpeg-devel
rvalue has quit [Ping timeout: 260 seconds]
<Kwiboo>
should also complete the av1 decoder part, I never tested it :-)
<Kwiboo>
hehe, yeah, they have implemented lots of changes related to v4l2 stateful decoding to optimize decoding for older rpi, and on top of all those changes implemented a v4l2 stateless hevc decoder that mostly depended on that, last time I checked
<Yalda>
BtbN: It doesn't bother me but just bringing it up in case its some issue. I've noticed that when I make inline comments on PRs it does not post a notification here. Example: #20215 and #20207
<BtbN>
I explicitly disabled that, cause it caused intense spam here
<BtbN>
Cause that can often be dozens and dozens of comments at once
<Yalda>
ok cool. just bringing it up in case it was bug. thank you!
<Kwiboo>
JEEB: my v4l2 request series instead focus on the generic v4l2 stateless api, however could use some guidance on what is good or bad from ffmpeg perspective, e.g. should it use a dedicated hwaccel type, is it okay to implement it separate from v4l2 stateful etc
<Kwiboo>
the v4l2 uapi part has been tested enough, and what is unknown to me is if the integration with ffmpeg api is good, bad or terrible