michaelni changed the topic of #ffmpeg-devel to: Welcome to the FFmpeg development channel | Questions about using FFmpeg or developing with libav* libs should be asked in #ffmpeg | This channel is publicly logged | FFmpeg 8.0 has been released! | Please read ffmpeg.org/developer.html#Code-of-conduct
justache is now known as KennyLog-in
DauntlessOne4985 has joined #ffmpeg-devel
<Guest40>
Well, since asking here saves me a little effort...
<Guest40>
I'm modifying avfilter/asrc_flite to add support for generating speech using user-provided .flitevox files, as well as using user-provided pronunciations from text files.
<Guest40>
I want to use the existing voice_entry struct to store cached loaded voices in a static data structure.
<Guest40>
I tried to use an av_tree for storing these entries, but I've concluded that there's no correct way to use it in a way that meets my needs.
<Guest40>
Namely, to retrieve a cached voice, I need to search for the voice_entry whose source file path matches the one passed as an option to the filter,
<Guest40>
and for removing a cached voice that is no longer in use, I need to search for a voice_entry whose voice is equal to flite->voice.
<Guest40>
As a result, I've decided to use a dynamic array of pointers with corresponding count and capacity variables.
<Guest40>
Is there an appropriate pre-existing libavutil data structure I could use, and if not, would it be appropriate to add my implementation to libavutil as part of its own patch?
<mkver>
Guest40: av_dynarray2_add
<mkver>
But how would one register user-provided files at all?
welder has quit [Quit: WeeChat 4.6.3]
<Guest40>
flite/flite.h exposes a function `cst_voice *flite_voice_load(char *)`, which I call using an argument passed as a filter option.
<mkver>
And how does flite know of the user-provided files?
<mkver>
And given that it is handled by flite, it is not properly reference-counted, is it?
<Guest40>
As part of unregistering the loaded voice in the created voice entry's unregister_fn, I call delete_voice.
<Guest40>
I've actually tested all of this and have gotten it to work with no apparent bugs.
<mkver>
My flite.h doesn't have a delete_voice.
<mkver>
Apparently it's in cst_voice.h
<Guest40>
yeah, I was gonna say that it was included in a different header that flite.h itself includes.
<Guest40>
With respect to using the av_dynarray_add family of functions, I didn't think to use them because I only saw them used in dense arrays. In my use case I'm nulling out the array indices of unregistered voice entries, while my insertion code puts the to-be-inserted item in the first null index, only expanding the array if it's full. But I suppose I
<Guest40>
can use one of those functions specifically in the branch of insertion code that gets run if the array needs expanding.
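A minimal sketch of the sparse pointer array Guest40 describes, written in plain C so it stands alone. The names `VoiceSlotEntry`, `VoiceSlots`, and the `slots_*` functions are hypothetical, and `realloc` stands in for the growth step; in-tree code would presumably call one of the av_dynarray functions in that branch instead. It covers both lookups mentioned earlier: by source path for retrieval, and by voice pointer for removal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the filter's cached voice entry. */
typedef struct VoiceSlotEntry {
    const char *path;   /* source file path given as a filter option */
    void       *voice;  /* would be a cst_voice * in the real filter */
} VoiceSlotEntry;

typedef struct VoiceSlots {
    VoiceSlotEntry **items;
    int              count;  /* number of slots, used or empty */
} VoiceSlots;

/* Put entry in the first NULL slot; grow by one slot only if none is free.
 * Returns the slot index, or -1 if the allocation fails. */
static int slots_insert(VoiceSlots *s, VoiceSlotEntry *entry)
{
    assert(s && entry);
    for (int i = 0; i < s->count; i++) {
        if (!s->items[i]) {
            s->items[i] = entry;
            return i;
        }
    }
    /* Growth branch: this is where an av_dynarray call would go (assumption). */
    VoiceSlotEntry **tmp = realloc(s->items,
                                   ((size_t)s->count + 1) * sizeof(*tmp));
    if (!tmp)
        return -1;
    s->items = tmp;
    s->items[s->count] = entry;
    return s->count++;
}

/* Retrieval: find the entry whose path matches, or NULL. */
static VoiceSlotEntry *slots_find_by_path(const VoiceSlots *s, const char *path)
{
    for (int i = 0; i < s->count; i++)
        if (s->items[i] && !strcmp(s->items[i]->path, path))
            return s->items[i];
    return NULL;
}

/* Removal: null out the slot holding this voice.
 * Returns the freed slot index, or -1 if the voice is not cached. */
static int slots_remove_by_voice(VoiceSlots *s, const void *voice)
{
    for (int i = 0; i < s->count; i++) {
        if (s->items[i] && s->items[i]->voice == voice) {
            s->items[i] = NULL;
            return i;
        }
    }
    return -1;
}
```

Both lookups are linear scans, which seems acceptable for the handful of voices a process is likely to load. The sketch does no locking and does not free the entries themselves.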
<mkver>
Could we actually use delete_voice() instead of calling the per-voice unregister_* function?
<Guest40>
I'll have to take a look at that in comparison to the unregister_cmu_us_XXX functions.
<Guest40>
unregister_cmu_us_XXX just calls delete_voice and nulls out the global pointer for the specified voice. That second part is essential because otherwise, calling register_cmu_us_XXX again after that would return a pointer to already-freed memory.
<Guest40>
So we can't just use delete_voice as the unregister_fn for the built-in voices.
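The pattern Guest40 describes can be illustrated with a self-contained mock. Nothing below is flite code: `DemoVoice`, `demo_register`, `demo_delete`, and `demo_unregister` are hypothetical names that only mimic the described behaviour of a register function returning a cached global and an unregister function that deletes it and clears the global.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for cst_voice. */
typedef struct DemoVoice {
    int id;
} DemoVoice;

/* Plays the role of the per-voice global pointer. */
static DemoVoice *demo_global_voice;

/* Like the described register function: create once, then hand out
 * the same pointer on every later call. */
static DemoVoice *demo_register(void)
{
    if (demo_global_voice)
        return demo_global_voice;
    demo_global_voice = calloc(1, sizeof(*demo_global_voice));
    return demo_global_voice;
}

/* Like delete_voice as described: frees the voice, knows nothing
 * about the global pointer. */
static void demo_delete(DemoVoice *v)
{
    free(v);
}

/* Like the described unregister function: delete AND clear the global.
 * With only demo_delete(), the next demo_register() would return
 * a pointer to freed memory. */
static void demo_unregister(DemoVoice *v)
{
    assert(v == demo_global_voice);
    demo_delete(v);
    demo_global_voice = NULL;
}
```

Calling `demo_delete` directly on a registered voice would leave `demo_global_voice` dangling, which is the hazard that rules out using the delete function alone as the unregister_fn for built-in voices.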
<mkver>
Is there a way to register one of the built-in voices like the user-provided ones?
<Guest40>
doesn't look like there is.
<mkver>
Damn. This would solve the refcounting issue.
kasper93 has quit [Remote host closed the connection]
kasper93 has joined #ffmpeg-devel
<mkver>
Guest40: How expensive is initializing a voice actually?
System_Error has quit [Remote host closed the connection]
minimal has quit [Quit: Leaving]
System_Error has joined #ffmpeg-devel
kode54 has quit [Quit: WeeChat 4.7.1]
kode54 has joined #ffmpeg-devel
<Guest40>
mkver: Can't seem to get perf on the distro I'm currently using.
<mkver>
And how much mem does it take (approximately)?
realies24 has joined #ffmpeg-devel
realies2 has quit [Read error: Connection reset by peer]
realies24 is now known as realies2
<Guest40>
I'm really not used to using profiling tools like this.
<Guest40>
Could I be pointed in the direction of what tools I should use?
<mkver>
valgrind: the default tool gives the number of allocations of your program as well as the combined size of them; and massif (https://valgrind.org/docs/manual/ms-manual.html) can do even more than that.
<Guest40>
And if you also want to know the execution time of initializing a built-in voice versus one loaded from a file?
<mkver>
I am not really interested in that.
<mkver>
The reason I am asking for this info is because I want to know whether sharing the voices provided by files is worth it at all.
<Guest40>
Ah, okay. I'll give you the memory costs. I'll be comparing a builtin voice and the same voice loaded from a flitevox file.
<mkver>
There is also a second reason: Who says that the same filename always refers to the same file? tmpfiles can change.
<mkver>
Actually, the comparison should be "voice loaded from flitevox file" versus "not loading a voice at all". Because this will show the cost of loading a voice.
<Guest40>
A clustergen voice (one of the more complex types of supported flitevox voice files) takes about 15MB.
<Guest40>
mkver: Is 15MB a big enough memory cost to justify caching voices?
<Guest40>
s/15/2
<Guest40>
ugh, not used to IRC
<mkver>
2MB does not sound like too much.
<Guest40>
Actually, it does look like 15 MB. Just thought I was misreading the units, but I wasn't.
<mkver>
15MB is indeed a bit much; but remember the "tmpfiles can change"? Who says that they don't?
<Guest40>
You have a good point.
<Guest40>
Just in case a tmpfile changes in the time between it being loaded for one filter, and a second filter requesting to load it BEFORE the first filter has finished using the voice it loaded from that file.
Kimapr has quit [Ping timeout: 264 seconds]
<Guest40>
There's a tradeoff between memory usage and the risks of what feels like an edge case to me at first glance.
Kimapr has joined #ffmpeg-devel
<Guest40>
Would it be appropriate to have the voice cache getter use the mtime of the file as part of its check to determine if we're loading the same file?
<Guest40>
If not, would the hash of the file be usable in a similar way?
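A rough sketch of the mtime idea, assuming a POSIX system with `stat()`. The `FileKey` struct and both functions are hypothetical and not part of the filter. The key also records the file size, an addition of this sketch rather than something proposed above. A match is only a heuristic: a file can be rewritten with the same size within the timestamp's granularity, and the file can still change between the `stat()` call and the actual load.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical cache key: path plus the metadata seen at load time. */
typedef struct FileKey {
    const char *path;
    time_t      mtime;
    off_t       size;
} FileKey;

/* Fill key from the file's current metadata.
 * Returns 0 on success, -1 if stat() fails (e.g. the file is missing). */
static int file_key_from_path(const char *path, FileKey *key)
{
    struct stat st;

    assert(path && key);
    if (stat(path, &st) != 0)
        return -1;
    key->path  = path;
    key->mtime = st.st_mtime;
    key->size  = st.st_size;
    return 0;
}

/* A cached voice is reused only if path, mtime and size all agree. */
static int file_key_matches(const FileKey *a, const FileKey *b)
{
    return !strcmp(a->path, b->path) &&
           a->mtime == b->mtime &&
           a->size  == b->size;
}
```

A content hash would catch the cases that mtime misses, at the price of reading the whole file before every cache lookup.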
<mkver>
I am undecided on all of this; it should best be discussed in the PR.