michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
cone-441 has joined #ffmpeg-devel
<cone-441>
ffmpeg James Almer master:17729aa80c61: avformat/movenc: fix writing reserved bits in EC3SpecificBox
minimal has quit [Quit: Leaving]
<kode54>
cripes, he's been in commit history for 9 years now?
<cone-983>
ffmpeg Dmitrii Ovchinnikov master:39ab61bd7550: avutil/hwcontext_amf: add support for d3d12va initialisation
IndecisiveTurtle has quit [Quit: IndecisiveTurtle]
TheVibeCoder has quit [Quit: Client closed]
mkver has joined #ffmpeg-devel
<haasn>
ramiro: so, I implemented plane splitting
<haasn>
but it's currently a bit too extreme; it tries to split even pure memcpy operations into separate passes
<haasn>
which becomes a slowdown again due to more overhead
<haasn>
since we invoke the memcpy backend 4 times for something as simple as yuv444p -> yuva444p
<haasn>
y, u and v are all independent here so they are copied separately, but copying them together from a single thread slice would be faster
<haasn>
(this is all of course setting aside the fact that, in reality, we should try and ref the planes instead of copying them; but that's a separate optimization I plan on tackling sometime soon-ish)
<haasn>
that said, this optimization is already doing wonders for e.g. yuv444p -> gbrap, where it splits off the gbr planes into their own subpass and then uses memset() for the alpha plane
<ePirat>
rcombs, I didn't even get the email you replied to :D
jamrial has joined #ffmpeg-devel
<toots5446>
hi all!
<toots5446>
mkver: did you get a chance to look over that ogg/vorbis patch?
<mkver>
no
<toots5446>
ok! Should I perhaps ask michaelni if they want to review it?
<michaelni>
no, don't ask michaelni
<michaelni>
;)
<cone-983>
ffmpeg Michael Niedermayer master:453ae55d63dd: tests/fate/mov: Add bitexact for fate-mov-mp4-frag-flush
<toots5446>
okok. I'll ping you again in a little while. Thanks for your help with this.
<toots5446>
and by you I mean mkver :-)
socrates1298 has joined #ffmpeg-devel
socrates1298 has quit [Client Quit]
socrates1298 has joined #ffmpeg-devel
socrates1298 has quit [Client Quit]
<ramiro>
haasn: nice! I'll have a look when I have time again.
<ramiro>
but why is the memcpy backend invoked 4 times? can't it be invoked once, with the knowledge that the planes are split?
socrates1298 has joined #ffmpeg-devel
<haasn>
ramiro: during plane splitting we construct a separate SwsOpList for each independent subgroup and pass them to ff_sws_ops_compile() separately
<haasn>
(consider: we may want different backends for each plane)
TheVibeCoder has joined #ffmpeg-devel
<haasn>
so this results in four SwsCompiledOp, each of which internally point back to backend_memcpy:process()
<haasn>
and each get wrapped into their own SwsPass in the SwsGraph
<haasn>
consider: different backends may have different requirements about block alignment, etc
<haasn>
or slice alignment
<haasn>
so we have to run op_pass_run() separately for each SwsCompiledOp, ergo each SwsOpList, ergo each plane
<haasn>
op_pass_run() is itself run by the threaded dispatch in SwsGraph
<haasn>
so after plane splitting we launch a thread task for each slice for each plane
socrates1298 has quit [Ping timeout: 272 seconds]
<haasn>
ramiro: 700 us -> 500 us after plane splitting for yuva444p9 -> yuva444p10
<haasn>
with three subpasses that just do u16 read { elems=1 } -> u16 lshift by 1 -> u16 write { elems=1 }
<haasn>
and one subpass that does the whole float/scale/dither for alpha
<ramiro>
haasn: for 852ade6b, could you add some kind of assert or comment about SWS_COMP_PLANEx being a sequential bitmask pattern, so that we know that it's safe to use (SWS_COMP_PLANE0 << plane)?
<haasn>
sure
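A compile-time guard along those lines might look like this minimal sketch (the SWS_COMP_PLANE1..3 names are assumed here to follow the same pattern as SWS_COMP_PLANE0):

/* Minimal sketch, assuming SWS_COMP_PLANE0..3 are single-bit flags:
 * guard the (SWS_COMP_PLANE0 << plane) idiom at compile time. */
_Static_assert(SWS_COMP_PLANE1 == (SWS_COMP_PLANE0 << 1) &&
               SWS_COMP_PLANE2 == (SWS_COMP_PLANE0 << 2) &&
               SWS_COMP_PLANE3 == (SWS_COMP_PLANE0 << 3),
               "SWS_COMP_PLANEx flags must form a sequential bitmask");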
<haasn>
I was thinking that maybe we should add the plane swizzle mask to SwsOpReadWrite instead
<haasn>
I will give that a try and see if I like the code better
<haasn>
not happy about the interaction between SwsReadWriteOp and SwsOpList.order_src/dst
<ramiro>
haasn: regarding yuva444p9 -> yuva444p10, that conversion to f32 to multiply by a power of 2 for 3/4 planes used to bother me quite a lot :)
<TheVibeCoder>
huh, why is float intermediate?
<haasn>
TheVibeCoder: faster than integers, at least on x86
<ramiro>
I haven't tested yet. I've just looked at the commits. for split planes (like yuva444p9 -> yuva444p10), does the dithering still take into account the alpha offset? i.e.: is it still y+5, or is it y for each plane?
<haasn>
alpha offset?
<haasn>
I don't understand your question
<ramiro>
haasn: I have "static const int y_off[4] = { 0, 3, 2, 5 };" to access data inside the dither matrix, based on the y of the line. the alpha plane would read from y+5 in the matrix. is this still the same?
<haasn>
..oh, yes, that is a problem
<haasn>
so we can't reorder past a SwsDither op
<ramiro>
we can, but dither needs to know about it.
<haasn>
or that
<haasn>
but how would you communicate that
<ramiro>
SwsDither could have an extra y_off[4] array?
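A rough sketch of that suggestion (the struct layout and field names here are assumptions, not the actual swscale code): the dither op carries a per-plane row offset, so a plane split into its own pass still reads the matrix row it would have used in the combined pass, e.g. y+5 for alpha.

/* Rough sketch, not the real SwsDither: a per-plane row offset keeps
 * a split-off alpha pass indexing the dither matrix at y + 5. */
typedef struct SwsDither {
    unsigned size_log2;   /* matrix has (1 << size_log2) rows (assumed field) */
    int      y_off[4];    /* proposed: per-plane row offset into the matrix */
} SwsDither;

static inline int dither_matrix_row(const SwsDither *d, int plane, int y)
{
    return (y + d->y_off[plane]) & ((1 << d->size_log2) - 1);
}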
microlappy has joined #ffmpeg-devel
<haasn>
I hard-code these coefficients in the x86 backend, and loading them from private data would be a bit nontrivial; though possible
<haasn>
hmm
<haasn>
I'll dwell on the issue
<haasn>
maybe we can find a better solution altogether
microlappy has quit [Client Quit]
<ramiro>
haasn: adding the plane swizzle mask to SwsOpReadWrite sounds like a good idea. it could save a step or two in the call chain. also save some movs in non-jit backends
<haasn>
the reason I don't like it is because it duplicates the plane swizzle code between backends, whereas currently it lives in the backend-independent setup code
<haasn>
and actually, it needs to be done outside the SwsOp, because we need this information to set up the correct pointer strides etc
<haasn>
so nvm, it can't be moved
<ramiro>
honestly the swizzle operation is a bit annoying to deal with. it can be a copy, a move, or a reorder
<ramiro>
iirc the swizzle mask doesn't always maintain identity for elements that haven't moved, so we always have to look at next->unused.
<haasn>
maybe we could set those to -1 or something
<ramiro>
I was thinking just setting them to their index. so 0123 is a noop, 0003 is "copy 0 to 1 and 2", 2103 is "swap 0 and 2", "0323" is "move 3 to 1"
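As a standalone illustration (a hypothetical helper, not actual swscale code), such a mask simply maps each output plane to a source plane index:

#include <stdint.h>

/* Hypothetical illustration of the encoding above: mask[i] is the source
 * plane for output plane i, so {0,1,2,3} is a no-op, {0,0,0,3} copies
 * plane 0 to planes 1 and 2, and {2,1,0,3} swaps planes 0 and 2. */
static void apply_plane_swizzle(uint8_t *dst[4], uint8_t *const src[4],
                                const uint8_t mask[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[mask[i]];
}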
<haasn>
I used to do that in the past but removed that optimization for some reason, maybe it's safe to re-enable them now
<haasn>
since the op matching code has gotten a lot better
<ramiro>
I started working on generating asm code at build time a couple of weeks ago. for that, I need to have a list of "signatures" for the functions I want to generate, which are basically strings. swizzle_xxxx (where xxxx is the indices, as above) seemed like the simplest for the swizzle op.
<ramiro>
haasn: btw this ^^ could be common to all non-jit backends. we'd have a list of which configurations of each operation we have to implement.
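For illustration, such a signature list could be as simple as a table of strings naming every specialization to generate (the entries below are made up, following the swizzle_xxxx scheme just described):

/* Made-up example of a build-time "signature" list; the real list would be
 * derived from the op configurations each non-jit backend must implement. */
static const char *const ops_to_generate[] = {
    "swizzle_0123",  /* identity */
    "swizzle_0003",  /* copy plane 0 to planes 1 and 2 */
    "swizzle_2103",  /* swap planes 0 and 2 */
};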
<cone-983>
ffmpeg Andreas Rheinhardt master:3cb37c0e715b: tests/fate-run: Remove intermediate files from enc-external tests
<cone-983>
ffmpeg Andreas Rheinhardt master:4bd1ce31fc72: avcodec/libaomenc: Avoid code duplication when setting options
<cone-983>
ffmpeg Andreas Rheinhardt master:abebdb1bdb73: avutil/frame: Always return error upon error
<cone-983>
ffmpeg Andreas Rheinhardt master:2f5f2c013cec: avutil/frame: Use av_memdup() for duplicating extended data array
<haasn>
I sort of consider checkasm/sw_ops.c to be that list
<cone-983>
ffmpeg Andreas Rheinhardt master:28c0a189b6d9: avcodec/psymodel: Use av_memdup() where appropriate
<cone-983>
ffmpeg Andreas Rheinhardt master:f0e1a315a144: avcodec/iirfilter: Remove iirfilter, psy-preprocessing
<cone-983>
ffmpeg Andreas Rheinhardt master:d0a27e01a649: avcodec/aacpsy: Remove always-true checks
<cone-983>
ffmpeg Andreas Rheinhardt master:81adbd2d3d63: avcodec/huffman: Combine allocations
<cone-983>
ffmpeg Andreas Rheinhardt master:834bedf3121e: avutil/frame: Fix av_realloc_array() argument order
<haasn>
at least that was the idea, to provide a checkasm test for every combination that can be hit in practice
<haasn>
(and some others that may reasonably be hit in the future)
<fflogger>
[newticket] mg3242: Ticket #11629 ([ffmpeg] Memory Leak in FFmpeg During Extended Processing) created https://trac.ffmpeg.org/ticket/11629
<fflogger>
[editedticket] mkver: Ticket #11620 ([avutil] av_malloc_array() and av_realloc_array(): nmemb and size arguments transposed) updated https://trac.ffmpeg.org/ticket/11620#comment:3
<fflogger>
[editedticket] mkver: Ticket #1357 ([undetermined] Memleak when avformat_open_input used together with avio_context) updated https://trac.ffmpeg.org/ticket/1357#comment:2
<softworkz>
TheVibeCoder: It doesn't include a range of patches from the past 3 years
<softworkz>
TheVibeCoder: The funny thing is that there doesn't even exist a patchset yet, based on latest FFmpeg.
<softworkz>
TheVibeCoder: What I posted was merely a question about the fields layout in AVFrame, whether a nested struct is okay or whether it should all be plain fields only (but with a prefix then)
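For context, the two layouts in question look roughly like this (struct and field names are placeholders from the discussion, not a committed API):

/* Placeholder sketch of the two layout options being asked about. */

/* Option A: nested sub-struct inside AVFrame */
typedef struct AVSubtitleTiming {
    int64_t start_pts;   /* actual display start of the subtitle event */
    int64_t duration;    /* actual display duration of the event */
} AVSubtitleTiming;
/* AVFrame would gain: AVSubtitleTiming subtitle_timing; */

/* Option B: plain fields with a common prefix, e.g.
 *     int64_t subtitle_start_pts;
 *     int64_t subtitle_duration;
 */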
<BtbN>
jamrial: any more thoughts on the nvenc MV-HEVC stuff? It appears functional to me now, however I still haven't found a way to fully verify that by playing back those files. I plan to push it soon and then work from there.
<TheVibeCoder>
softworkz: is it at least on a branch on github?
<softworkz>
On the link I posted above
<TheVibeCoder>
added more filters since then?
<softworkz>
If you want to try it out, I can link you to a binary with all fixes. For looking at the code the PR branch is fine
<softworkz>
No, no new filters yet
<softworkz>
but some ideas..
<softworkz>
One is about using Whisper lib for audio to subtitles
<softworkz>
Another one is hardware overlay without hwuploading full frames, but that's difficult
<TheVibeCoder>
this branch no longer works with threading in ffmpeg tool?
<softworkz>
No, that's why the rebase is non-trivial
<softworkz>
I just started doing it
<TheVibeCoder>
ok
<softworkz>
I was just asking about those fields, because they are used everywhere and I wouldn't like to change them later throughout 30 patches or so
<TheVibeCoder>
why cant those fields you added in AVFrame be part of AVFrame->data[] buffers ?
<softworkz>
So how should it go? 8 here and 12 there? Really?
<TheVibeCoder>
no
<softworkz>
How then?
<TheVibeCoder>
For planar audio which requires more than AV_NUM_DATA_POINTERS
<JEEB>
the docs for extended_buf is quite messy tho
<JEEB>
> Note that this is different from AVFrame.extended_data, which always contains all the pointers. This array only contains the extra pointers, which cannot fit into AVFrame.buf
<softworkz>
People had said I should use AVBufferRef *buf[AV_NUM_DATA_POINTERS];
<softworkz>
That's what it was about
<JEEB>
but does planar audio really split into multiple avbufferrefs?
<JEEB>
that sounds like :effort:
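For reference, walking every buffer ref of a frame under the documented buf/extended_buf split quoted above looks like this (a sketch of consumer code, not anything in the tree):

#include <libavutil/frame.h>

/* buf[] holds the first AV_NUM_DATA_POINTERS refs, extended_buf[] only the
 * overflow planes, per the AVFrame documentation quoted above. */
static void visit_frame_buffers(const AVFrame *frame,
                                void (*visit)(const AVBufferRef *ref))
{
    for (int i = 0; i < AV_NUM_DATA_POINTERS && frame->buf[i]; i++)
        visit(frame->buf[i]);
    for (int i = 0; i < frame->nb_extended_buf; i++)
        visit(frame->extended_buf[i]);
}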
<TheVibeCoder>
there is no need to store each subtitle in a separately allocated buffer
<TheVibeCoder>
it can be all one flat array
<fflogger>
[editedticket] mkver: Ticket #11494 ([avformat] Encoding to ffv1 produces broken bitstream for specific sample with specific settings) updated https://trac.ffmpeg.org/ticket/11494#comment:1
<softworkz>
Yes, you can also put all frame data into a contiguous buffer and access it by byte indexes :-)
<TheVibeCoder>
but, yes, changing from current subtitle struct to AVFrame is non-trivial task
<softworkz>
What I did is to add an array field and a nb_count field
<softworkz>
We must not forget that this is also a public api
<TheVibeCoder>
but if data is not ref-counted then it's not useful
<TheVibeCoder>
you are changing ABI/API all the time
<TheVibeCoder>
specially if you insist on adding another set of fields to current AVFrame
<TheVibeCoder>
and did you change anything in that function for the subtitle type of avframe?
<softworkz>
of course
<softworkz>
maybe it's better when you would take a look at the whole patchset..?
<TheVibeCoder>
just try to update the set to latest ffmpeg, good luck.
<softworkz>
Regarding refcounted buffers for subtitle text: this is not done anywhere in the current code. It would need to start from the decoders and go up to the encoders. Here I have just followed the existing principles.
<softworkz>
"ust try to update set to latest ffmpeg, good luck." << yup, will do
<softworkz>
thanks a lot for your interest in the set :-)
softworkz has quit [Read error: Connection reset by peer]
softworkz has joined #ffmpeg-devel
wormie has quit [Quit: Client closed]
softworkz has quit [Read error: Connection reset by peer]
softworkz has joined #ffmpeg-devel
<softworkz>
Just to sum up: I can use extended buffers, that's okay (just not the main buf as some wanted). Then there's the AVBufferRef for the ASS style header which I have as a new frame field. Should I put this into buf[1] instead so that "by convention" it's always there?
TheVibeCoder has quit [Quit: Client closed]
TheVibeCoder has joined #ffmpeg-devel
System_Error has quit [Remote host closed the connection]
TheVibeCoder has quit [Quit: Client closed]
TheVibeCoder has joined #ffmpeg-devel
<softworkz>
One more note: That "I won't move" element of the "concepts text" was NOT about the things we've been talking about here. This is about the "everything has to be done fundamentally differently" songs which NG is singing
cone-902 has joined #ffmpeg-devel
<cone-902>
ffmpeg Manuel Lauss master:b7662ddd15dc: avcodec/sanm: fix codec33/34 tile generator
<cone-902>
ffmpeg Manuel Lauss master:c06439073651: avcodec/sanm: remove codec4 block smoothing
<cone-902>
ffmpeg Manuel Lauss master:d9797544b45a: avcodec/sanm: codec31/32 support
<Traneptora>
ramiro: sent a posix ioctl rename patch
System_Error has joined #ffmpeg-devel
softworkz has quit [Read error: Connection reset by peer]
softworkz has joined #ffmpeg-devel
<softworkz>
TheVibeCoder: When you talked about the AVBufferRef inside AVSubtitleArea as "mess" - was it about nesting AVBufferRef inside another AVBufferRef? That's not the case right now, but when making the whole AVSubtitleArea ref-counted, that would happen and therefore the inner data should be owned by the area and not be ref-counted. Is that what you meant?
<TheVibeCoder>
i mean ref-counted, there is only a single buffer ever allocated per AVFrame for a subtitle
<TheVibeCoder>
not numerous special helper allocations
<TheVibeCoder>
for each new item
<softworkz>
But there's no known size nor number of areas. The total required memory can be anything from zero to big
<TheVibeCoder>
you add helper functions to allocate a single buffer, given some parameters
<softworkz>
What does that buffer contain? AVSubtitleArea items? Do you mean like the whole array of AVSubtitleArea items as contiguous data?
<TheVibeCoder>
yes
<softworkz>
Is this a good public API design?
<TheVibeCoder>
you can have helper pointers, but those are not real pointers - more like offsets within the buffer
<TheVibeCoder>
you are confused
<TheVibeCoder>
all data is internal to libavfilter and libavcodec
<TheVibeCoder>
users have no business touching it
<softworkz>
AVFrame is not public?
<TheVibeCoder>
not AVFrame
<TheVibeCoder>
but buffers that are refcounted inside it
<softworkz>
But API users need to be able to access the data. Same like AVSubtitle now
<TheVibeCoder>
which users?
<TheVibeCoder>
what part of API?
<softworkz>
AVSubtitle and AVSubtitleRect are public API => they get replaced by AVFrame and AVSubtitleArea
<TheVibeCoder>
data inside refcounted buffers is packed in smart manner
<softworkz>
AVSubtitleArea needs to be accessible in a non-obscure way
<TheVibeCoder>
you use variable size buffers, but instead of allocating pointers 1666 times for a single AVFrame subtitle you use smart offsets within the buffer
<TheVibeCoder>
this is not obscure at all
<softworkz>
typically it's 0-3 allocations when there's no bitmap. what's the problem about it?
<TheVibeCoder>
this rudimentary packaging of data inside encoders and decoders
<TheVibeCoder>
there should be only 0 or 1
<softworkz>
what's the advantage? and how should API users access it? especially without digging through the ffmpeg source code to understand how it's done
<TheVibeCoder>
you can add helper functions to do the "hard" work
<softworkz>
is there any precedent example in FFmpeg for what you are proposing?
<TheVibeCoder>
not really
<softworkz>
I'm not saying it's not doable but it's an unnatural way of storing something
<TheVibeCoder>
but for an unrelated example see any normal video codec and its bitstream layout
<TheVibeCoder>
it's a perfectly natural way to store compressed audio and video
<softworkz>
of course, but we're not talking about that kind of data. these are structs
<softworkz>
yes, for that perfect
<TheVibeCoder>
audio and video are also structs
<softworkz>
fixed layout?
<TheVibeCoder>
when you listen to mp3, you get a single packet and a single frame
<TheVibeCoder>
not 3 separate packets that need some obscure combination
<softworkz>
that's very different
<softworkz>
data
<TheVibeCoder>
the subtitle layout representation is not trivial so it needs this approach
<TheVibeCoder>
subtitle is also data
<TheVibeCoder>
you just need to design smart layout
<softworkz>
but there's a fixed structure
<TheVibeCoder>
which will not use hundreds of special allocations
<TheVibeCoder>
subtitles are also fixed structure
<TheVibeCoder>
if you need to add another Rect/Area or more text you create a new AVFrame and copy/add different data
<softworkz>
the count of areas can change
<TheVibeCoder>
yes, you use new AVFrame for new subtitle
<softworkz>
not always
<TheVibeCoder>
what changes count of areas?
<softworkz>
there were cases, but I'd need to look them up
<TheVibeCoder>
decoder outputs fixed data
<softworkz>
like the text from 3 areas getting merged into 1
<TheVibeCoder>
then allocate another frame
<softworkz>
the implementation is analogous to AVSubtitle and AVSubtitleRect. It wasn't necessary there, and I don't see it being required for this, I'm afraid
<softworkz>
it doesn't make sense to me for that case
<TheVibeCoder>
but it's not 100% reference counted
<TheVibeCoder>
thus it's a hack
<TheVibeCoder>
you can not use AVFrame if you do not plan to convert data
<softworkz>
When I change it to use extended buffers, then the whole areas will be refcounted
<softworkz>
each
<TheVibeCoder>
are you saying that multiple subtitles can have the same areas?
<TheVibeCoder>
aka sharing data between subtitle frames
<softworkz>
when a frame gets duplicated, yes
<TheVibeCoder>
that is bad
<TheVibeCoder>
i did not mean that case
<TheVibeCoder>
and it's obviously a hack to do
<softworkz>
what?
<TheVibeCoder>
correct way is to increase reference counter
<softworkz>
who said otherwise?
<softworkz>
at the moment, only image data is refcounted. everything else is copied
<softworkz>
I was talking about changing that to refcount areas as a whole
<TheVibeCoder>
so?
<softworkz>
what's a hack?
<TheVibeCoder>
size of area #1 | area #1 ; size of area #2 | area #2 ...
<TheVibeCoder>
could also add offsets
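A sketch of the packed layout being proposed (struct and helper names are hypothetical): one refcounted buffer carries a small offset header followed by all areas back to back, so a frame needs no per-area allocations.

#include <stdint.h>
#include <libavutil/buffer.h>

/* Hypothetical sketch of the single-buffer layout proposed above:
 * a header of offsets, then every area packed back to back. */
typedef struct PackedSubHeader {
    uint32_t nb_areas;
    uint32_t area_offset[];   /* byte offset of each area inside buf->data */
} PackedSubHeader;

static const uint8_t *packed_sub_area(const AVBufferRef *buf, uint32_t idx)
{
    const PackedSubHeader *hdr = (const PackedSubHeader *)buf->data;
    return idx < hdr->nb_areas ? buf->data + hdr->area_offset[idx] : NULL;
}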
<softworkz>
I'm not doing any such thing
<TheVibeCoder>
you should
<softworkz>
that's a mess IMO
<softworkz>
or hack
<TheVibeCoder>
obviously we have different meaning for word hack
<softworkz>
and mess
<softworkz>
:-)
<softworkz>
How about this: Then there's the AVBufferRef for the ASS style header which I have as a new frame field. Should I put this into buf[1] instead so that "by convention" it's always there?
<softworkz>
is that better?
<TheVibeCoder>
how would you know size of that field?
<softworkz>
It's a string IIRC
<TheVibeCoder>
make everything AVBufferRef
<softworkz>
It already is. The question is only whether it should be a field in AVFrame
<softworkz>
what about this sub-struct - ok or drop it?
<softworkz>
and use plain fields?
<TheVibeCoder>
whats wrong with AVFrame->pts/duration?
<softworkz>
this has a different meaning
<softworkz>
all are used
<TheVibeCoder>
what fills timing->start_pts and timing->duration ?
<softworkz>
it's filled and changed in various cases
<TheVibeCoder>
where/when?
<softworkz>
the reasons why all fields are needed are explained in what you called "pamphlet"-something
<softworkz>
"where/when" << too much to answer in short text
<TheVibeCoder>
can you link pamphlet again?
IndecisiveTurtle has joined #ffmpeg-devel
<softworkz>
I'll look it up, but could you please let me know what you think about the sub-struct nesting, independently of whether the fields are needed?
<TheVibeCoder>
irrelevant
<softworkz>
what?
<softworkz>
my question?
<TheVibeCoder>
proper way is add new struct
<TheVibeCoder>
typedef it
<TheVibeCoder>
and put it where current hack is
<softworkz>
why hack?
<TheVibeCoder>
do you use ever AVFrame->pts in your code?
<TheVibeCoder>
and for what?
<softworkz>
I've seen this pattern used elsewhere
<softworkz>
Yes, pts is used
<TheVibeCoder>
the code in question is a hack because it's problematic on numerous conceptual and practical levels - in other words it's a sub-optimal solution
<TheVibeCoder>
what you use pts for?
<softworkz>
as far as I have seen, you don't even know the code...?
<softworkz>
so the frame pts is not always the same as the actual display start
<softworkz>
and for duration, the frame duration is the time until the next frame, but that's different from the display time of a subtitle event
<softworkz>
which can be overlapping for example
<TheVibeCoder>
can you also move them into the #0 buffer?
<softworkz>
possible, but they are used in hundreds of places.
<softworkz>
they need to be accessible simply and immediately
<softworkz>
(should)
<TheVibeCoder>
which places?
<softworkz>
where subtitles are processed, decoded, encoded
<TheVibeCoder>
only decoders/filters/demuxers/muxers/encoders/parsers
<softworkz>
filtred
<softworkz>
filters
<softworkz>
but they are essential and need to be public
<TheVibeCoder>
why?
<softworkz>
you can't hide them away
<TheVibeCoder>
who fills such fields?
<TheVibeCoder>
s/who/what
<softworkz>
these are the _actual_ times for subtitles. all projects that are using ffmpeg for decoding and are doing anything with subtitles need to access them
<TheVibeCoder>
what is currently used instead?
<softworkz>
decoders fill them (in the meantime there's an adaptation layer until decoders are updated)
<softworkz>
AVSubtitle is currently used
<softworkz>
frames are not yet used for subs
<TheVibeCoder>
does AVSubtitle have 2 versions of pts?
<softworkz>
no, because these are handled differently than AVFrame. The duplication becomes required by using AVFrame for subs
<softworkz>
so that they can go through filtering
<softworkz>
and all other code which uses AVFrame
<softworkz>
means no more extra handling for AVSubtitle
<softworkz>
but the price is those two fields
<TheVibeCoder>
there is not much difference between using start_pts in a refcounted buffer and having a special field in AVFrame
<TheVibeCoder>
once you decode subtitle AVFrame you get start_pts
<softworkz>
it's crucial data that needs to be publicly visible
<TheVibeCoder>
why
<TheVibeCoder>
IIRC decoded text subtitles have ASS layout already
softworkz has quit [Read error: Connection reset by peer]