michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 7.1.1 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
IndecisiveTurtle has quit [Ping timeout: 260 seconds]
thilo has quit [Ping timeout: 276 seconds]
thilo has joined #ffmpeg-devel
HideoSugai has quit [Ping timeout: 240 seconds]
cone-421 has quit [Quit: transmission timeout]
minimal has quit [Quit: Leaving]
Marth64 has joined #ffmpeg-devel
MisterMinister has joined #ffmpeg-devel
Xaldafax has quit [Quit: Bye...]
HideoSugai has joined #ffmpeg-devel
Martchus has joined #ffmpeg-devel
Martchus_ has quit [Ping timeout: 248 seconds]
Traneptora has quit [Quit: Quit]
jamrial has quit []
HideoSugai has quit [Quit: Client closed]
MisterMinister has quit [Ping timeout: 252 seconds]
mkver has joined #ffmpeg-devel
Traneptora has joined #ffmpeg-devel
mkver has quit [Ping timeout: 244 seconds]
rvalue has quit [Ping timeout: 245 seconds]
rvalue has joined #ffmpeg-devel
kode547 has joined #ffmpeg-devel
kode54 has quit [Ping timeout: 276 seconds]
kode547 is now known as kode54
mkver has joined #ffmpeg-devel
<fflogger> [newticket] mytait: Ticket #11549 ([undetermined] Ffmpeg fontconfig broken on windows?) created https://trac.ffmpeg.org/ticket/11549
<fflogger> [editedticket] Cigaes: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:3
<rodeo> I get a 503 error on gitweb https://git.ffmpeg.org/ffmpeg and also git pull seems to stall -- is this expected?
<Lynne> in today's internet, yes
<Lynne> ai companies are mass scraping their way to AGI, to the mooooon!
<rodeo> lol
<rodeo> OK
<fflogger> [editedticket] mytait: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:4
<fflogger> [editedticket] Cigaes: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:5
pross has quit [Read error: Connection reset by peer]
jamrial has joined #ffmpeg-devel
<fflogger> [editedticket] mytait: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:6
Gramner has quit [Remote host closed the connection]
Gramner has joined #ffmpeg-devel
rvalue has quit [Read error: Connection reset by peer]
rvalue has joined #ffmpeg-devel
minimal has joined #ffmpeg-devel
mark4o has joined #ffmpeg-devel
markh has quit [Ping timeout: 265 seconds]
mark4o is now known as markh
<fflogger> [editedticket] Cigaes: Ticket #11549 ([avfilter] Ffmpeg fontconfig broken on windows?) updated https://trac.ffmpeg.org/ticket/11549#comment:7
cone-731 has joined #ffmpeg-devel
<cone-731> ffmpeg Andreas Rheinhardt master:494061a49aa4: avcodec/vp8: Maintain consistency of frame pointers
<cone-731> ffmpeg Andreas Rheinhardt master:df824211c253: avcodec/vc2enc: Improve error codes
<jkqxz> mkver: Sorry, overlapped.
<jkqxz> What is the lowest number of bytes that probe can be called with?
<mkver> The buffer is always padded with AVPROBE_PADDING_SIZE bytes (which are zero). The actually valid minimal buffer size is 1 (or maybe zero?).
<mkver> The padding is not accounted for in buf_size.
<jkqxz> Hmm, ok. Is there a way to say "don't know, need more" if the buffer is too small here?
<mkver> IIRC no. You just return 0 (meaning "not a file of the format this probe function is supposed to check for").
<jkqxz> Ok.
<jkqxz> As you say the byte reader is probably better anyway, I will change it.
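A minimal sketch of the probe behaviour described above, assuming a made-up format (4-byte magic "EXMP" plus a nonzero version byte). ExampleProbeData, EXAMPLE_PADDING_SIZE, EXAMPLE_SCORE_MAX and example_probe are hypothetical stand-ins for illustration, not the libavformat definitions. The point is only that a too-short buffer yields 0, since there is no "need more data" result.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: buf holds buf_size valid bytes followed by zeroed
 * padding, and the padding is not counted in buf_size. */
#define EXAMPLE_PADDING_SIZE 32
#define EXAMPLE_SCORE_MAX    100

typedef struct ExampleProbeData {
    const uint8_t *buf;
    int            buf_size;
} ExampleProbeData;

/* Made-up format: magic "EXMP" followed by a nonzero version byte. */
static int example_probe(const ExampleProbeData *p)
{
    /* A short buffer cannot be reported as "undecided", so it is "no match". */
    if (p->buf_size < 5)
        return 0;
    if (memcmp(p->buf, "EXMP", 4))
        return 0;
    if (p->buf[4] == 0)
        return 0;
    return EXAMPLE_SCORE_MAX;
}
```

Checking buf_size explicitly keeps the result independent of whatever the zero padding happens to look like to the parser.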
<jkqxz> How strong are opinions on exporting APVDecoderConfigurationRecord as extradata here?
<jkqxz> (How final is that definition, anyway? Currently all I see is the one markdown file in the reference codec repo.)
<mkver> I have none.
<mkver> As long as it is not final, we should not export it.
av500 has joined #ffmpeg-devel
<fflogger> [editedticket] brycechesternewman: Ticket #11217 ([ffmpeg] Output "-ss" memory consumption regression) updated https://trac.ffmpeg.org/ticket/11217#comment:29
<cone-731> ffmpeg Thierry Foucu master:1d244c641baf: libavformat/takdec.c: Fix msan error
<cone-731> ffmpeg James Almer master:b6c2498a5902: avformat/takdec.c: return proper error codes for avio_read() failures
ocrete has quit [Quit: The Lounge - https://thelounge.chat]
ocrete has joined #ffmpeg-devel
ocrete has quit [Quit: The Lounge - https://thelounge.chat]
ocrete has joined #ffmpeg-devel
<jkqxz> jamrial: Isn't there a penalty for mixing xmm and ymm access to the same register? (I have been careful to avoid that but your suggested change does it.)
<kurosu> the bottom of a ymm is an xmm, but you incur a penalty (implicit vzeroupper) whenever you use a ymm
<kurosu> another penalty is domain transition (between int and float) but no idea if that's that big even nowadays
<kurosu> jkqxz: looked at the code, no penalty
<jkqxz> Is there any penalty for having written a ymm register and then later addressing the xmm half only?
<kurosu> no
<kurosu> I mean, in the same function. What you refer to is likely conflict resolution and false dependency that vzeroupper solves
<kurosu> If you're going to do pure xmm in a 2nd part of the function, or a later one, then sure
<jkqxz> A false dependency on the upper half is fine as long as there isn't some unexpected lane penalty that I'm missing.
<kurosu> I kind of recall it depends on the instruction encoding as well (the vex evex etc)
<jkqxz> Trying to work this out from the Intel optimisation manual it does look like the problems are all in mixed VEXed/unVEXed cases.
<kurosu> and I think the xmm operations in an INIT_YMM avx function will be auto-encoded as the VEX form
<jkqxz> And the implicit upper-zeroing of writes to xmm registers doesn't cost anything.
<kurosu> So, no, I don't think it will cause a problem
<kurosu> RET will auto-insert a vzeroupper
<jkqxz> (Which does make sense when thinking about the lane split, because you just rename the upper half to be a reference to your zero register.)
<jkqxz> I think it makes sense to rewrite the final normalisation in pairs anyway, which avoids any xmm in the >8-bit case (write to memory with vextracti128).
<jkqxz> That doesn't quite work with the 8-bit case because it needs 64-bit writes, though maybe vpermq + movq.
<linkmauve> jkqxz, kurosu, until Skylake there was a very high performance cliff for using a ymm register as xmm without vzeroupper, that got fixed in Skylake which made vzeroupper more or less a noop IIRC.
<fflogger> [newticket] Levan: Ticket #11550 ([ffmpeg] Simple commend -c copy -t ** file.mp4 no longer works) created https://trac.ffmpeg.org/ticket/11550
<kurosu> jkqxz: btw, I imagine it's a high bitrate codec, but is the nz count high enough that traditional dequant during inverse zz scan is slower?
<jkqxz> The default compression ratio target is ~7 and that seems to average something like 10 nonzero coefficients per block (i.e. ~9 bits per nonzero coefficient).
<jkqxz> Having put the unzigzag inside the entropy the combination felt better this way around, but I admit I have not actually compared against the reverse.
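For reference, a rough sketch of the "traditional" variant kurosu mentions: dequantising each nonzero coefficient while it is placed through the inverse zigzag scan, so the work scales with the nonzero count. The classic 8x8 zigzag table, the ExampleCoeff type, and the plain level * qmatrix scaling are assumptions for illustration; the actual codec's scan order and dequant formula may differ.

```c
#include <stdint.h>
#include <string.h>

/* Classic 8x8 zigzag order: scan position -> raster position (assumed). */
static const uint8_t example_zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10, 17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34, 27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36, 29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46, 53, 60, 61, 54, 47, 55, 62, 63,
};

/* Hypothetical output of the entropy decoder: one entry per nonzero level. */
typedef struct ExampleCoeff {
    uint8_t scan_pos; /* position in scan order, 0..63 */
    int16_t level;    /* quantised value */
} ExampleCoeff;

/* Clear the block, then unzigzag and dequantise only the nonzero levels. */
static void example_dequant_unzigzag(int16_t block[64],
                                     const ExampleCoeff *coeffs, int nb_coeffs,
                                     const uint8_t qmatrix[64])
{
    memset(block, 0, 64 * sizeof(*block));
    for (int i = 0; i < nb_coeffs; i++) {
        int pos = example_zigzag[coeffs[i].scan_pos];
        block[pos] = (int16_t)(coeffs[i].level * qmatrix[pos]);
    }
}
```

With around 10 nonzero coefficients per block this loop does about 10 multiplies, whereas a separate vectorised dequant pass touches all 64 positions; which one wins is exactly the thing that would need benchmarking.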
mkver has quit [Ping timeout: 276 seconds]
cone-731 has quit [Quit: transmission timeout]
<jkqxz> kurosu: Any idea whether there would be value in checking for zero rows to help that? It can do that in the dequant easily, but if the row transform is first then it can happen there as well. (As "ptest mN ; jz skip_row_N".)
<jkqxz> Those branches would be somewhat predictable, too.
<jkqxz> I guess in the entropy you'd know what nonzero coefficients you had written and then it could dispatch to one of 8x8, 4x8, 8x4, 4x4 or DC-only for the transform.
<jkqxz> Does dav1d do anything like that for the large transforms? (AV1 mandates the 64x64 to only have coefficients in the top-left 32x32, but I mean for others where it can do it opportunistically.)
<another|> yes
quietvoid has quit []
<another|> at least if I understand you correctly
<another|> you mean like early exits in case there is only data in the top left corner?
quietvoid has joined #ffmpeg-devel
<kurosu> Yes but the nz ratio is much lower
<kurosu> I kind of remember an older idct skipping empty rows, but again much lower ratio
<kurosu> If you get a gain by that, it's worth benchmarking doing the dequant in the entropy decoding loop. I think that's what is done again in older codecs
<kurosu> implementations in FFmpeg
<kurosu> (sorry no PC to easily and quickly look that up)
<jkqxz> It seems worth trying. At the higher compression ratios it's plausible that a decent proportion of blocks will fit in 4x4. Probably people won't use the sort of artificial content which gives you DC-only, though.
<jkqxz> And yes, dequant in entropy would be wanted to go with that.
<kurosu> Yep see COND macros in simple_idct asm
<kurosu> Anyway, maybe just go with whichever is simpler, get it merged, then experiment
<kurosu> Though simple idct explicitly ors coeffs to know what it can skip, and doesn't get the info from the entropy decoding
<kurosu> (like dav1d does)
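The two approaches discussed above can be sketched roughly as follows, assuming an 8x8 block. The first part tracks the extent of nonzero coefficients while the entropy decoder writes them and picks a transform size from it; the second ORs the coefficients of a row to see whether it can be skipped. All names here (ExampleExtent, example_pick_transform, example_row_is_zero, the enum values) are hypothetical and not taken from dav1d or simple_idct.

```c
#include <stdint.h>

/* Transform variants named width x height (hypothetical). */
enum ExampleTransform { EX_DC_ONLY, EX_W4_H4, EX_W8_H4, EX_W4_H8, EX_W8_H8 };

/* Largest row/column index holding a nonzero coefficient; -1 means none. */
typedef struct ExampleExtent {
    int max_row, max_col;
} ExampleExtent;

/* Called by the entropy decoder for every nonzero coefficient it writes. */
static void example_note_coeff(ExampleExtent *e, int row, int col)
{
    if (row > e->max_row) e->max_row = row;
    if (col > e->max_col) e->max_col = col;
}

/* Pick the smallest transform covering all nonzero coefficients.
 * An empty block is treated like DC-only here for simplicity. */
static enum ExampleTransform example_pick_transform(const ExampleExtent *e)
{
    int wide = e->max_col >= 4;
    int tall = e->max_row >= 4;

    if (e->max_row <= 0 && e->max_col <= 0)
        return EX_DC_ONLY;
    if (wide && tall)
        return EX_W8_H8;
    if (wide)
        return EX_W8_H4;
    if (tall)
        return EX_W4_H8;
    return EX_W4_H4;
}

/* Alternative without entropy-decoder help: OR a row's coefficients. */
static int example_row_is_zero(const int16_t row[8])
{
    int acc = 0;
    for (int i = 0; i < 8; i++)
        acc |= row[i];
    return acc == 0;
}
```

Tracking the extent in the entropy decoder costs two comparisons per nonzero coefficient but gives the transform its dispatch decision for free; the OR-based check needs no bookkeeping but reads the whole row first.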
paulk has quit [Ping timeout: 252 seconds]
paulk has joined #ffmpeg-devel
paulk has quit [Ping timeout: 260 seconds]
Luna_Rabbit has quit [Quit: Do you believe in magic?]
Moon_Rabbit has joined #ffmpeg-devel
<Lynne> jkqxz: how fast is the decoder, for typical video?
<haasn> ramiro: I may have to add some fudge code for 3-element writes (by assuming the actual write size is rounded up to some reasonable power of 2)
<haasn> did you implement that yet in your code?
<haasn> I guess with your strided writes it's easy?
<haasn> on x86 it's really hard to handle packed 3-element writes without any overwrite, though I'm sure it's possible (maybe I can just write the last 96 bits of the last XMM reg as two scalar writes)
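The "two scalar writes" idea can be sketched in portable C, assuming the last vector is available as 16 staged bytes of which only the first 12 (96 bits) are valid. example_store96 is a hypothetical helper; a real implementation would do this in asm with a 64-bit and a 32-bit store straight from the register.

```c
#include <stdint.h>
#include <string.h>

/* Store only the low 96 bits of a 16-byte vector: one 64-bit write followed
 * by one 32-bit write, so nothing past dst + 12 is touched. */
static void example_store96(uint8_t *dst, const uint8_t vec[16])
{
    uint64_t lo;
    uint32_t hi;

    memcpy(&lo, vec,     sizeof(lo)); /* bytes 0..7  */
    memcpy(&hi, vec + 8, sizeof(hi)); /* bytes 8..11 */

    memcpy(dst,     &lo, sizeof(lo));
    memcpy(dst + 8, &hi, sizeof(hi));
}
```

Fixed-size memcpy calls like these are normally compiled down to plain scalar loads and stores, so this avoids both the overwrite and any unaligned-access undefined behaviour.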
paulk has joined #ffmpeg-devel
pross has joined #ffmpeg-devel
philipl has quit [Quit: leaving]
philipl has joined #ffmpeg-devel