michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
<cone-580>
ffmpeg Martin Storsjö master:f4e72eb5a3db: configure: Enable -fno-common for Darwin targets, avoid linker warnings
Guest35 has joined #ffmpeg-devel
Guest35 has quit [Client Quit]
Nino has joined #ffmpeg-devel
mkver has joined #ffmpeg-devel
kasper93 has quit [Ping timeout: 265 seconds]
<haasn>
ramiro: 0,3,2,5 in my final branch; 0 2 3 5 didn't pass self-test (regression on rgb121 iirc)
<haasn>
0,3,2,5 was the best or close to the best setup according to some informal brute force search tests I made
<haasn>
probably because it isolates the G channel and also puts Y and A onto different parity
<haasn>
by far the biggest determinant of dither quality (in this scheme) is the low-order bit
<haasn>
for YCbCr in theory it would be slightly better to do 0235 but dither mostly matters for the ultra low bit depth RGB variants
mkver has quit [Remote host closed the connection]
mkver has joined #ffmpeg-devel
<ramiro>
haasn: thanks, I updated my code for 0,3,2,5.
<ramiro>
haasn: btw have you had a look at the other points I talked about yesterday?
<haasn>
oh I missed them, looking now
<haasn>
ramiro: about 4), is this about checkasm?
<haasn>
Also, I have a problem with checkasm after my latest refactor; since I made the entry point a callable asm function now (with no reliance on simd), there is no natural way for check_func to distinguish between different simd extensions anymore
<haasn>
And the higher level of abstraction means the checkasm code no longer knows whether the code was compiled with avx2 or sse or w/e
<haasn>
I can’t rely on the priv pointer being unique either because it’s freed and reallocated between runs
<haasn>
The only solution I see is for the backend to report the set of active simd extensions and for checkasm to hash that into the function ptr
<ramiro>
haasn: 4 is not about checkasm. I create a function name (for example asmjit_yuva444p_to_ya8_neon) and create a symbol map for perf to properly report the time spent in each asmjit function. also I can dump the generated assembly and run it through gas (the idea is to generate some converters at build time for systems that don't support jit).
<haasn>
But in the normal code flow (not checkasm), SwsContext fields should be populated
<haasn>
At least the non legacy ones
<haasn>
Oh, wait, those are all legacy disks
<haasn>
Fields*
<haasn>
ramiro: I’d rather give you the SwsFormat on compile()
<haasn>
I assume that’s sufficient?
<ramiro>
haasn: yes, that would work.
<haasn>
I’d rather not rely on fields in the public context especially because that would be mutating user visible fields
<haasn>
Which we’d then have to restore etc etc
<ramiro>
makes sense
<haasn>
The fields that are valid are the flags etc
<haasn>
Scale settings
<haasn>
ramiro: added SwsOpsList.src/dst to swscale6
<haasn>
be advised that these _will_ be AV_PIX_FMT_NONE during checkasm
DVedaa has quit [Read error: Connection reset by peer]
DVedaa has joined #ffmpeg-devel
Gramner has joined #ffmpeg-devel
kasper93 has joined #ffmpeg-devel
<haasn>
I also added SwsCompiledOp.cpu_flags which you should populate during compile()
<haasn>
checkasm will use it to distinguish between asm and C versions of compiled functions
<haasn>
and I guess we can add a diagnostic printout as well
<haasn>
I decided I'm done constantly splitting up my commits and squashing them also, so I'll just make refactors from now on their own commits that touch all relevant files in one go
<ramiro>
haasn: there's a typo in "swscale/graph: set SwsOpsList.src/dst", you're setting ops->src twice :)
<haasn>
oops :)
aljazmc has joined #ffmpeg-devel
cone-580 has quit [Quit: transmission timeout]
<haasn>
is there not a function to pretty print cpu flags?
<haasn>
like the inverse of av_parse_cpu_flags
any1 has quit [Remote host closed the connection]
any1 has joined #ffmpeg-devel
<haasn>
I guess it's more difficult because later flags may imply earlier flags
<haasn>
so ideally you'd print only the later flag
<JEEB>
libavutil/tests/cpu.c has a print_cpu_flags, but not sure how useful that is
<JEEB>
oh it actually has a list struct it seems?
<JEEB>
cpu_flag_tab[i].flag and cpu_flag_tab[i].name
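A minimal sketch of the table-driven approach JEEB points at: walk a flag/name table and print the name of every flag that is set. The flag values, the table `demo_flag_tab` and the function `demo_print_cpu_flags` are hypothetical and only for illustration; they are not libavutil's definitions. This sketch prints every set flag and does not attempt the "print only the later flag" collapsing haasn mentions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits, for illustration only (not libavutil's values). */
#define DEMO_FLAG_SSE2  (1 << 0)
#define DEMO_FLAG_SSSE3 (1 << 1)
#define DEMO_FLAG_AVX2  (1 << 2)

/* Hypothetical flag/name table, modelled on the cpu_flag_tab idea. */
static const struct {
    int flag;
    const char *name;
} demo_flag_tab[] = {
    { DEMO_FLAG_SSE2,  "sse2"  },
    { DEMO_FLAG_SSSE3, "ssse3" },
    { DEMO_FLAG_AVX2,  "avx2"  },
};

/* Writes the names of all set flags into buf, separated by '+'.
 * Returns buf. Stops early if the buffer is too small. */
static char *demo_print_cpu_flags(int flags, char *buf, size_t size)
{
    size_t len = 0;

    assert(size > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(demo_flag_tab) / sizeof(demo_flag_tab[0]); i++) {
        int n;
        if (!(flags & demo_flag_tab[i].flag))
            continue;
        n = snprintf(buf + len, size - len, "%s%s",
                     len ? "+" : "", demo_flag_tab[i].name);
        if (n < 0 || (size_t)n >= size - len)
            break; /* output truncated */
        len += (size_t)n;
    }
    return buf;
}
```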
aljazmc has quit [Quit: Leaving]
<ramiro>
haasn: I'm revisiting my pack/unpack operations. I had disabled the "Skip unpacking components that are not used" optimization, since that made it harder to calculate the offsets. what I currently do is "int offsets[4] = { op.pack.pattern[3]+[2]+[1], [3]+[2], [3], 0 };". I would like to restore that optimization, but then I don't get op.pack.pattern[3], so I did "int from_size_in_bits =
<ramiro>
ff_sws_pixel_type_size(op.type) << 3;" and "int offsets[4] = { from_size_in_bits - op.pack.pattern[0], from_size_in_bits - ([0]+[1]), ...};". the problem is that for bgr4_byte and such, from_size_in_bits is not the correct size. how should I calculate the msb for those formats?
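The first offset calculation ramiro describes (summing the widths of the lower components) can be sketched as below. This assumes `pattern[i]` holds the bit width of component i, with component 0 in the most significant bits; `demo_pack_offsets` and `demo_unpack_component` are hypothetical names, not swscale functions.

```c
#include <assert.h>
#include <stdint.h>

/* Shift of each component = total width of all components below it.
 * Mirrors "{ [3]+[2]+[1], [3]+[2], [3], 0 }" from the message above. */
static void demo_pack_offsets(const uint8_t pattern[4], int offsets[4])
{
    offsets[3] = 0;
    offsets[2] = pattern[3];
    offsets[1] = pattern[3] + pattern[2];
    offsets[0] = pattern[3] + pattern[2] + pattern[1];
}

/* Extracts component i from a packed pixel value. */
static unsigned demo_unpack_component(unsigned pixel, const uint8_t pattern[4], int i)
{
    int offsets[4];

    assert(i >= 0 && i < 4);
    demo_pack_offsets(pattern, offsets);
    return (pixel >> offsets[i]) & ((1u << pattern[i]) - 1u);
}
```

With a 5/6/5/0 pattern this gives shifts of 11, 5, 0 and 0. Because the shifts are derived only from the component widths, they do not depend on the size of the storage type.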
<haasn>
I think we can just remove that optimization
<haasn>
you can consult next->comps.unused[] anyway
<haasn>
to skip them
<haasn>
it's not doing anything for the other backends
jamrial has joined #ffmpeg-devel
<ramiro>
haasn: could you remove it on your branch?
Xe_ has joined #ffmpeg-devel
Xe has quit [Ping timeout: 248 seconds]
<haasn>
sure
<haasn>
also somehow the linesize -> stride change broke everything
<haasn>
I'll just drop that commit again; it's not doing anything on its own and you were wanting to move the y loop anyway
<haasn>
force pushing just this one time :)
<haasn>
ramiro: can you try haasn/swscale7 ? I pushed the line fusing commit there, maybe it makes moving the y loop unnecessary
<ramiro>
haasn: I haven't tried swscale7 yet, but I would still need to be able to do the y loop myself (not only for line fusing).
aljazmc has joined #ffmpeg-devel
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
microlappy has joined #ffmpeg-devel
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
microlappy has quit [Quit: Konversation terminated!]
microlappy has joined #ffmpeg-devel
microlappy has quit [Quit: Konversation terminated!]
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
<fflogger>
[newticket] juanitotc: Ticket #11575 ([ffplay] Unable to play an hevc video using hevc hardware decoding on an RPi5) created https://trac.ffmpeg.org/ticket/11575
Anthony_ZO has quit [Ping timeout: 245 seconds]
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
<jamrial>
mkver: did you send a patch to fix the memleaks fate is reporting? if so, please push it
cone-622 has joined #ffmpeg-devel
<cone-622>
ffmpeg Andreas Rheinhardt master:62f7b43b5347: tests/api/api-dump-stream-meta-test: Fix leaks
<mkver>
As it so happened, I was just about to push it.
j45 has joined #ffmpeg-devel
j45 has joined #ffmpeg-devel
j45 has quit [Changing host]
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
Nino has quit [Quit: Client closed]
<jkqxz>
jamrial: How strongly do you feel about the explicit relaxed ordering? It will make no difference here, so I weakly prefer leaving it as the default unless there is some other reason.
<jamrial>
jkqxz: not strong, but to increase a counter you don't really need to force ordering
<jamrial>
you do for decreasing it
<jamrial>
see how AVBufferRef and Refstruct do it
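A minimal C11 sketch of the pattern jamrial describes: a relaxed increment when taking a reference, and a stronger ordering when dropping one. `DemoRef`, `demo_ref` and `demo_unref` are hypothetical names for illustration; this is not the AVBufferRef or RefStruct code itself.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct DemoRef {
    atomic_uint refcount;
} DemoRef;

/* Taking a reference only needs atomicity: the caller already holds
 * a valid reference, so no ordering with other memory is required. */
static void demo_ref(DemoRef *r)
{
    atomic_fetch_add_explicit(&r->refcount, 1, memory_order_relaxed);
}

/* Dropping a reference needs acquire+release so that whoever frees the
 * object sees all writes made by the other former owners.
 * Returns 1 if the caller dropped the last reference. */
static int demo_unref(DemoRef *r)
{
    unsigned old = atomic_fetch_sub_explicit(&r->refcount, 1,
                                             memory_order_acq_rel);
    assert(old > 0);
    return old == 1;
}
```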
<jkqxz>
The thought is purely that an included non-seq-cst ordering parameter means the reader has to think about it (and see that it is synchronised by the thread ordering), while without that they don't.
aljazmc has quit [Remote host closed the connection]
<jkqxz>
(The store and load can be relaxed as well because of the thread synchronisation.)
aljazmc has joined #ffmpeg-devel
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
toots5446 has quit [Quit: toots5446]
<cone-622>
ffmpeg James Almer master:d34c7384351a: avcodec/hevc/hevcdec: ensure a bit was read when checking for alignment_bit_equal_to_one
<cone-622>
ffmpeg James Almer master:0af1d6995969: avcodec/hevc/hevcdec: move the slice header buffer overread check up in the function
<jamrial>
jkqxz: for that matter, are you ok with "avcodec/apv_decode: build the lut table only once", or will you change the affected code? (i recall you mentioned you were doing some refactoring)
aljazmc has quit [Remote host closed the connection]
<mkver>
jkqxz: When I read code with the default ordering, I always wonder: "Does this code really depend upon sequential consistency?"
aljazmc has joined #ffmpeg-devel
<jkqxz>
So you'd prefer relaxed on all three operations?
rvalue has quit [Read error: Connection reset by peer]
rvalue has joined #ffmpeg-devel
<jkqxz>
jamrial: Do we ever care about numa stuff? Keeping the tables local to the thread feels safer for avoiding some bad interaction, but I'm not sure exactly what that would be.
<jkqxz>
As long as the tables don't share a line with something mutable then normal cache hierarchies will happily load read-only copies into every core's L1 separately, but if they did it could be a disaster.
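One way to guarantee that a table never shares a cache line with mutable data is to align and pad it to the line size. A minimal sketch, assuming a 64-byte cache line (`DEMO_CACHE_LINE` is an assumption, and `DemoLUT` and `demo_lut` are hypothetical names, not the APV decoder's):

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64 /* assumed line size, not queried from the CPU */

/* Aligning the member aligns the whole struct, and sizeof() is rounded up
 * to a multiple of that alignment, so nothing else can land in the same
 * cache lines as the table. */
typedef struct DemoLUT {
    alignas(DEMO_CACHE_LINE) uint8_t table[100];
} DemoLUT;

static DemoLUT demo_lut;
```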
<jamrial>
i have no idea. we always build static tables only once for all modules
<jamrial>
i guess you could also hardcode it for non-small builds, or builds with hardcoded tables enabled
<jkqxz>
Static tables in the binary are on read-only pages and that isn't a concern.
<mkver>
jkqxz: If it is not used for synchronisation: Yes. If it is used for synchronisation: release+acquire if possible; if not, it should be documented why one needs sequential consistency.
<Lynne>
jkqxz: generally I think we rely on users manually pinning ffmpeg on cores/clusters
<jkqxz>
mkver: It is only used for atomicity, not for synchronisation.
<jkqxz>
That is a stronger opinion than any other I am seeing here (including my own), so I will change them all to relaxed.
<mkver>
jkqxz: Almost all of the stuff in .bss are tables that are initialized once when a codec is initialized. So it will happen only very rarely (if ever) that your LUT will share a cacheline with something that is actually modified.
<jkqxz>
Hmm, I was thinking it was on the heap but indeed jamrial's patch puts it in bss. If anyone did have anything in lavc bss which is mutable at runtime then they would be deemed Very Naughty already, so I think that's fair.
Nino has joined #ffmpeg-devel
<jkqxz>
jamrial: Ok, go ahead. I'll push my current error handling series after that if there are no other comments, then the multisymbol decode can rebase on top of it (it adds more tables which are larger but doesn't change the setup ordering, so no problem).
<jamrial>
ok
<cone-622>
ffmpeg James Almer master:a9557c1f26f2: avcodec/apv_decode: build the lut table only once
Traneptora has quit [Quit: Quit]
aljazmc has quit [Remote host closed the connection]
aljazmc has joined #ffmpeg-devel
<cone-622>
ffmpeg Mark Thompson master:135acc8e61cf: cbs_apv: Always restore tracing state on split fragment error
<cone-622>
ffmpeg Mark Thompson master:5acd2145a4cd: apv_decode: Fix memory leak on decode error
<cone-622>
ffmpeg Mark Thompson master:1a9a2bafc8a3: apv_decode: Improve reporting of decode errors
<cone-622>
ffmpeg Mark Thompson master:2aa2095bb497: cbs_apv: Better constrain tile_width/height_in_mbs
<cone-622>
ffmpeg Mark Thompson master:ea457e54e1b0: apv_entropy: Improve robustness to bitstream errors
<cone-622>
ffmpeg Mark Thompson master:9bf54cdb19f1: cbs_apv: Check tile component sizes
abdu has joined #ffmpeg-devel
abdu has quit [Ping timeout: 240 seconds]
Nino has quit [Quit: Client closed]
ngaullier has quit [Remote host closed the connection]
<cone-622>
ffmpeg Michael Niedermayer master:e5640e67d08c: libpostproc/tests: Factor ff_chksum() out
<cone-622>
ffmpeg Michael Niedermayer master:c644720e6869: postproc/tests/.gitignore: Add temptest
<ramiro>
haasn: could you have a look at "./libswscale/tests/swscale -unscaled 1 -src rgb48be -dst bgr48be"? it performs swap_bytes/swizzle/swap_bytes, which seems redundant.
Labnan has joined #ffmpeg-devel
<haasn>
true
<haasn>
though not a big deal for the x86 backend
<haasn>
since it gets compiled down to a single byte shuffle either way
<haasn>
in theory we could do swizzles before byte swapping to help eliminate this case exactly
<haasn>
though that would fail to optimize bgr48be -> bgr48le
<haasn>
since that would then end up as swizzle, swap, swizzle
<haasn>
a definitive fix would be to push swap past swizzle in one direction always
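The redundancy ramiro reports comes from the fact that a per-component byte swap commutes with reordering whole components, so swap_bytes, swizzle, swap_bytes collapses to a single swizzle. A small sketch on one 3x16-bit pixel (the `demo_*` helpers are hypothetical stand-ins, not the swscale ops themselves):

```c
#include <assert.h>
#include <stdint.h>

static uint16_t demo_bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Byte-swaps every 16-bit component of the pixel. */
static void demo_swap_bytes(uint16_t px[3])
{
    for (int i = 0; i < 3; i++)
        px[i] = demo_bswap16(px[i]);
}

/* Exchanges the first and last component (rgb <-> bgr). */
static void demo_swizzle_rb(uint16_t px[3])
{
    uint16_t tmp = px[0];
    px[0] = px[2];
    px[2] = tmp;
}

/* Returns 1 if swap/swizzle/swap gives the same result as swizzle alone. */
static int demo_chain_is_redundant(const uint16_t in[3])
{
    uint16_t a[3] = { in[0], in[1], in[2] };
    uint16_t b[3] = { in[0], in[1], in[2] };

    demo_swap_bytes(a);
    demo_swizzle_rb(a);
    demo_swap_bytes(a);

    demo_swizzle_rb(b);

    assert(a[1] == b[1]); /* middle component is untouched either way */
    return a[0] == b[0] && a[2] == b[2];
}
```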
abdu has joined #ffmpeg-devel
cone-802 has joined #ffmpeg-devel
<cone-802>
ffmpeg James Almer master:2b6303762fc0: tests/fate/cbs: add tests for APV