#ffmpeg-devel on 2025-04-21 — irc logs at libera.catirclogs.org

2025-03-03 01:04 michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct

00:15 IndecisiveTurtle has quit [Ping timeout: 260 seconds]

00:19 thilo has quit [Ping timeout: 276 seconds]

00:20 thilo has joined #ffmpeg-devel

01:12 HideoSugai has quit [Ping timeout: 240 seconds]

01:20 cone-421 has quit [Quit: transmission timeout]

01:31 minimal has quit [Quit: Leaving]

02:17 Marth64 has joined #ffmpeg-devel

02:34 MisterMinister has joined #ffmpeg-devel

02:40 Xaldafax has quit [Quit: Bye...]

03:06 HideoSugai has joined #ffmpeg-devel

03:35 Martchus has joined #ffmpeg-devel

03:36 Martchus_ has quit [Ping timeout: 248 seconds]

03:39 Traneptora has quit [Quit: Quit]

04:00 jamrial has quit []

05:37 HideoSugai has quit [Quit: Client closed]

05:42 MisterMinister has quit [Ping timeout: 252 seconds]

05:44 mkver has joined #ffmpeg-devel

06:06 Traneptora has joined #ffmpeg-devel

07:24 mkver has quit [Ping timeout: 244 seconds]

07:33 rvalue has quit [Ping timeout: 245 seconds]

07:34 rvalue has joined #ffmpeg-devel

08:02 kode547 has joined #ffmpeg-devel

08:04 kode54 has quit [Ping timeout: 276 seconds]

08:04 kode547 is now known as kode54

08:12 mkver has joined #ffmpeg-devel

08:28 <fflogger> [newticket] mytait: Ticket #11549 ([undetermined] Ffmpeg fontconfig broken on windows?) created https://trac.ffmpeg.org/ticket/11549

09:26 <fflogger> [editedticket] Cigaes: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:3

10:14 <rodeo> I get a 503 error on gitweb https://git.ffmpeg.org/ffmpeg and also git pull seems to stall -- is this expected?

10:16 <Lynne> in today's internet, yes

10:17 <Lynne> ai companies are mass scraping their way to AGI, to the mooooon!

10:46 <rodeo> lol

10:46 <rodeo> OK

11:21 <fflogger> [editedticket] mytait: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:4

11:26 <fflogger> [editedticket] Cigaes: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:5

11:44 pross has quit [Read error: Connection reset by peer]

11:52 jamrial has joined #ffmpeg-devel

12:30 <fflogger> [editedticket] mytait: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:6

12:35 Gramner has quit [Remote host closed the connection]

12:47 Gramner has joined #ffmpeg-devel

13:24 rvalue has quit [Read error: Connection reset by peer]

13:25 rvalue has joined #ffmpeg-devel

13:39 minimal has joined #ffmpeg-devel

13:58 mark4o has joined #ffmpeg-devel

14:02 markh has quit [Ping timeout: 265 seconds]

14:02 mark4o is now known as markh

14:21 <fflogger> [editedticket] Cigaes: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:7

15:15 cone-731 has joined #ffmpeg-devel

15:15 <cone-731> ffmpeg Andreas Rheinhardt master:494061a49aa4: avcodec/vp8: Maintain consistency of frame pointers

15:15 <cone-731> ffmpeg Andreas Rheinhardt master:df824211c253: avcodec/vc2enc: Improve error codes

15:27 <jkqxz> mkver: Sorry, overlapped.

15:27 <jkqxz> What is the lowest number of bytes that probe can be called with?

15:28 <mkver> The buffer is always padded with AVPROBE_PADDING_SIZE bytes (which are zero). The actually valid minimal buffer size is 1 (or maybe zero?).

15:29 <mkver> The padding is not accounted for in buf_size.

15:30 <jkqxz> Hmm, ok. Is there a way to say "don't know, need more" if the buffer is too small here?

15:30 <mkver> IIRC no. You just return 0 (meaning "not a file of the format this probe function is supposed to check for").

15:34 <jkqxz> Ok.
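
[editor's note] A minimal sketch of the probe contract mkver describes above: the buffer is followed by zeroed padding that buf_size does not count, and a buffer that is too short simply scores 0 because there is no "need more data" return. The DemoProbeData type, the DEMO_* macros and the "DEMO" magic are hypothetical stand-ins so the example compiles without libavformat; the real definitions (AVProbeData, AVPROBE_PADDING_SIZE, AVPROBE_SCORE_MAX) live in libavformat/avformat.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins (assumptions for this sketch, not FFmpeg's own names). */
#define DEMO_PROBE_PADDING_SIZE 32   /* mirrors AVPROBE_PADDING_SIZE */
#define DEMO_PROBE_SCORE_MAX    100  /* mirrors AVPROBE_SCORE_MAX */

typedef struct DemoProbeData {
    const uint8_t *buf;  /* buf_size valid bytes, then zeroed padding */
    int buf_size;        /* does NOT include the padding */
} DemoProbeData;

/* Hypothetical format: 4-byte magic "DEMO" followed by a nonzero version. */
static int demo_probe(const DemoProbeData *p)
{
    assert(p && p->buf && p->buf_size >= 0);

    /* Too few valid bytes: report "not this format" rather than
     * trusting the zero padding. */
    if (p->buf_size < 5)
        return 0;
    if (memcmp(p->buf, "DEMO", 4) != 0)
        return 0;
    if (p->buf[4] == 0)
        return 0;
    return DEMO_PROBE_SCORE_MAX;
}
```

Because the padding is zero, a probe that reads slightly past buf_size will not fault, but it should still compare against buf_size so that padding bytes are never mistaken for file content.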

15:37 <jkqxz> As you say the byte reader is probably better anyway, I will change it.

15:39 <jkqxz> How strong are opinions on exporting APVDecoderConfigurationRecord as extradata here?

15:39 <jkqxz> (How final is that definition, anyway? Currently all I see is the one markdown file in the reference codec repo.)

15:39 <mkver> I have none.

15:40 <mkver> As long as it is not final, we should not export it.

15:45 av500 has joined #ffmpeg-devel

15:49 <fflogger> [editedticket] brycechesternewman: Ticket #11217 ([ffmpeg] Output "-ss" memory consumption regression) updated https://trac.ffmpeg.org/ticket/11217#comment:29

17:09 <cone-731> ffmpeg Thierry Foucu master:1d244c641baf: libavformat/takdec.c: Fix msan error

17:09 <cone-731> ffmpeg James Almer master:b6c2498a5902: avformat/takdec.c: return proper error codes for avio_read() failures

17:42 ocrete has quit [Quit: The Lounge - https://thelounge.chat]

17:44 ocrete has joined #ffmpeg-devel

18:00 ocrete has quit [Quit: The Lounge - https://thelounge.chat]

18:02 ocrete has joined #ffmpeg-devel

18:10 <jkqxz> jamrial: Isn't there a penalty for mixing xmm and ymm access to the same register? (I have been careful to avoid that but your suggested change does it.)

18:31 <kurosu> the bottom of a ymm is an xmm, but you incur a penalty (implicit vzeroupper) whenever you use a ymm

18:32 <kurosu> another penalty is domain transition (between int and float) but no idea if that's that big even nowadays

18:35 <kurosu> jkqxz: looked at the code, no penalty

18:35 <jkqxz> Is there any penalty for having written a ymm register and then later addressing the xmm half only?

18:35 <kurosu> no

18:36 <kurosu> I mean, in the same function. What you refer to is likely conflict resolution and false dependency that vzeroupper solves

18:36 <kurosu> If you're going to do pure xmm in a 2nd part of the function, or a later one, then sure

18:37 <jkqxz> A false dependency on the upper half is fine as long as there isn't some unexpected lane penalty that I'm missing.

18:38 <kurosu> I kind of recall it depends on the instruction encoding as well (the vex evex etc)

18:38 <jkqxz> Trying to work this out from the Intel optimisation manual it does look like the problems are all in mixed VEXed/unVEXed cases.

18:39 <kurosu> and I think the xmm operations in an INIT_YMM avx function will be auto-encoded as the VEX form

18:39 <jkqxz> And the implicit upper-zeroing of writes to xmm registers doesn't cost anything.

18:40 <kurosu> So, no, I don't think it will cause a problem

18:40 <kurosu> RET will auto-insert a vzeroupper

18:40 <jkqxz> (Which does make sense when thinking about the lane split, because you just rename the upper half to be a reference to your zero register.)

18:43 <jkqxz> I think it makes sense to rewrite the final normalisation in pairs anyway, which avoids any xmm in the >8-bit case (write to memory with vextracti128).

18:44 <jkqxz> That doesn't quite work with the 8-bit case because it needs 64-bit writes, though maybe vpermq + movq.

18:49 <linkmauve> jkqxz, kurosu, until Skylake there was a very high performance cliff for using a ymm register as xmm without vzeroupper, that got fixed in Skylake which made vzeroupper more or less a noop IIRC.

19:01 <fflogger> [newticket] Levan: Ticket #11550 ([ffmpeg] Simple commend -c copy -t ** file.mp4 no longer works) created https://trac.ffmpeg.org/ticket/11550

19:03 <kurosu> jkqxz: btw, I imagine it's a high bitrate codec, but what is the nz count high enough that traditional dequant during inverse zz scan is slower?

19:03 <kurosu> -what

19:25 <jkqxz> The default compression ratio target is ~7 and that seems to average something like 10 nonzero coefficients per block (i.e. ~9 bits per nonzero coefficient).

19:26 <jkqxz> Having put the unzigzag inside the entropy the combination felt better this way around, but I admit I have not actually compared against the reverse.
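
[editor's note] A rough sketch of the two orderings being compared: dequantising each nonzero coefficient as the inverse zigzag scan places it, versus filling the block first and dequantising all 64 positions afterwards. The DemoCoeff type, the function names and the use of the conventional JPEG-style 8x8 zigzag order are assumptions for illustration; the decoder under discussion may use a different scan and different data layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Conventional 8x8 zigzag order (scan position -> raster position). */
static const uint8_t demo_zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10, 17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34, 27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36, 29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46, 53, 60, 61, 54, 47, 55, 62, 63,
};

/* One decoded nonzero coefficient (hypothetical layout). */
typedef struct DemoCoeff {
    uint8_t scan_pos;  /* position in scan order, 0..63 */
    int16_t level;     /* quantised value */
} DemoCoeff;

/* Dequantise while unzigzagging: work scales with the nonzero count. */
static void demo_dequant_sparse(int32_t block[64], const DemoCoeff *c, int n,
                                const int16_t qmat[64])
{
    assert(n >= 0 && n <= 64);
    memset(block, 0, 64 * sizeof(*block));
    for (int i = 0; i < n; i++) {
        int pos = demo_zigzag[c[i].scan_pos & 63];
        block[pos] = c[i].level * qmat[pos];
    }
}

/* Dequantise a whole block afterwards: always 64 multiplies, but a
 * regular loop that vectorises well. */
static void demo_dequant_dense(int32_t block[64], const int16_t qmat[64])
{
    for (int i = 0; i < 64; i++)
        block[i] *= qmat[i];
}
```

With roughly 10 nonzero coefficients per block, as quoted above, the sparse form does about 10 scalar multiplies while the dense form does 64 in a few vector operations, which is why the answer has to come from benchmarking rather than from counting operations.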

19:42 mkver has quit [Ping timeout: 276 seconds]

20:09 cone-731 has quit [Quit: transmission timeout]

20:23 <jkqxz> kurosu: Any idea whether there would be value in checking for zero rows to help that? It can do that in the dequant easily, but if the row transform is first then it can happen there as well. (As "ptest mN ; jz skip_row_N".)

20:24 <jkqxz> Those branches would be somewhat predictable, too.

20:31 <jkqxz> I guess in the entropy you'd know what nonzero coefficients you had written and then it could dispatch to one of 8x8, 4x8, 8x4, 4x4 or DC-only for the transform.

20:32 <jkqxz> Does dav1d do anything like that for the large transforms? (AV1 mandates the 64x64 to only have coefficients in the top-left 32x32, but I mean for others where it can do it opportunistically.)

20:33 <another|> yes

20:33 quietvoid has quit []

20:35 <another|> at least if I understand you correctly

20:36 <another|> you mean like early exits in case there is only data in the top left corner?

20:36 quietvoid has joined #ffmpeg-devel

20:39 <another|> here's a random example: https://code.videolan.org/videolan/dav1d/-/blob/master/src/x86/itx16_avx2.asm#L4629-L4630

21:22 <kurosu> Yes but the nz ratio is much lower

21:24 <kurosu> I kind of remember an older idct skipping empty rows, but again much lower ratio

21:26 <kurosu> If you get a gain by that, it's worth benchmarking doing the dequant in the entropy decoding loop. I think that's what is done again in older codecs

21:26 <kurosu> implementations in FFmpeg

21:27 <kurosu> (sorry no PC to easily and quickly look that up)

21:27 <jkqxz> It seems worth trying. At the higher compression ratios it's plausible that a decent proportion of blocks will fit in 4x4. Probably people won't use the sort of artificial content which gives you DC-only, though.

21:28 <jkqxz> And yes, dequant in entropy would be wanted to go with that.

21:30 <kurosu> Yep see COND macros in simple_idct asm

21:31 <kurosu> Anyway, maybe just go with whichever is simpler, get it merged, then experiment

21:33 <kurosu> Though simple idct explicitly ors coeffs to know what it can skip, and doesn't get the info from the entropy decoding

21:33 <kurosu> (like dav1d does)
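
[editor's note] A small sketch of the "OR the coefficients to find out what can be skipped" approach kurosu attributes to simple idct, extended to the dispatch classes jkqxz lists (DC-only, 4x4, the two half-block cases, full 8x8). The enum, its names and the function are hypothetical; a real decoder could instead carry this information out of the entropy decoder, as dav1d is said to do above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dispatch classes for an 8x8 block in raster order. */
typedef enum DemoTxClass {
    DEMO_TX_DC_ONLY,    /* nothing outside position 0 (includes all-zero) */
    DEMO_TX_4X4,        /* nonzero only in the top-left 4x4 */
    DEMO_TX_TOP_HALF,   /* nonzero only in the top 4 rows */
    DEMO_TX_LEFT_HALF,  /* nonzero only in the left 4 columns */
    DEMO_TX_FULL,       /* needs the full 8x8 transform */
} DemoTxClass;

static DemoTxClass demo_classify_block(const int16_t block[64])
{
    int ac = 0, right = 0, bottom = 0;

    assert(block);
    /* OR-ing keeps any set bit, so each accumulator is nonzero exactly
     * when some coefficient in its region is nonzero. */
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int v = block[y * 8 + x];
            if (x || y)
                ac |= v;
            if (x >= 4)
                right |= v;
            if (y >= 4)
                bottom |= v;
        }
    }
    if (!ac)
        return DEMO_TX_DC_ONLY;
    if (!right && !bottom)
        return DEMO_TX_4X4;
    if (!bottom)
        return DEMO_TX_TOP_HALF;
    if (!right)
        return DEMO_TX_LEFT_HALF;
    return DEMO_TX_FULL;
}
```

The scan itself costs a pass over the block, so it only pays off if the cheaper transforms are hit often enough; tracking the highest written row and column in the entropy decoder would give the same classification without the extra pass.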

21:50 paulk has quit [Ping timeout: 252 seconds]

21:56 paulk has joined #ffmpeg-devel

22:06 paulk has quit [Ping timeout: 260 seconds]

22:24 Luna_Rabbit has quit [Quit: Do you believe in magic?]

22:26 Moon_Rabbit has joined #ffmpeg-devel

22:47 <Lynne> jkqxz: how fast is the decoder, for typical video?

23:00 <haasn> ramiro: I may have to add some fudge code for 3-element writes (by assuming the actual write size is rounded up to some reasonable power of 2)

23:00 <haasn> did you implement that yet in your code?

23:00 <haasn> I guess with your strided writes it's easy?

23:01 <haasn> on x86 it's really hard to handle packed 3-element writes without any overwrite, though I'm sure it's possible (maybe I can just write the last 96 bits of the last XMM reg as two scalar writes)
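
[editor's note] One possible reading of the idea above, written as portable C rather than x86 asm: each pixel arrives as a 4-element (128-bit) group with an unused fourth lane, every group except the last is stored at full width (the spilled lane is overwritten by the next pixel), and the last group is stored as exactly 96 bits using one 64-bit and one 32-bit write. The function name and the flat source layout are assumptions for illustration, not haasn's or ramiro's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* src: n groups of 4 x uint32_t (lane 3 unused); dst: 3 * n packed values.
 * Never writes past dst[3 * n - 1]. */
static void demo_store_packed3(uint32_t *dst, const uint32_t *src, int n)
{
    assert(dst && src && n > 0);

    /* Full 128-bit stores: lane 3 lands on the next pixel's first slot
     * and is rewritten by the following iteration or the tail below. */
    for (int i = 0; i < n - 1; i++)
        memcpy(dst + 3 * i, src + 4 * i, 16);

    /* Tail: exactly 96 bits as a 64-bit plus a 32-bit write. */
    memcpy(dst + 3 * (n - 1),     src + 4 * (n - 1),     8);
    memcpy(dst + 3 * (n - 1) + 2, src + 4 * (n - 1) + 2, 4);
}
```

The alternative mentioned in the log, rounding the assumed write size up to a power of 2, would let the last store stay full width too, at the cost of requiring the destination to tolerate a few bytes of overwrite.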

23:02 paulk has joined #ffmpeg-devel

23:05 pross has joined #ffmpeg-devel

23:10 philipl has quit [Quit: leaving]

23:59 philipl has joined #ffmpeg-devel