diff mbox series

[FFmpeg-devel] Added support for MB_INFO

Message ID 9642852e9da11714c8643833562b6e86133ce1a1.camel@amazon.it
State New
Headers show
Series [FFmpeg-devel] Added support for MB_INFO | expand

Checks

Context Check Description
andriy/commit_msg_x86 warning The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Carotti, Elias June 10, 2022, 10:11 a.m. UTC
Hi,
patch attached to add support for passing down to libx264 information
about which macroblock could be eligible for being coded as P_SKIP.




NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico

Comments

Carotti, Elias June 17, 2022, 8:41 a.m. UTC | #1
Hi all,
any chance someone could possibly have a look at this patch, please?
Thanks in advance



On Fri, 2022-06-10 at 10:11 +0000, Carotti, Elias wrote:
> Hi,
> patch attached to add support for passing down to libx264 information
> about which macroblock could be eligible for being coded as P_SKIP.
> 



NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
Timo Rothenpieler June 17, 2022, 9:40 a.m. UTC | #2
On 17.06.2022 10:41, Carotti, Elias wrote:
> Hi all,
> any chance someone could possibly have a look at this patch, please?
> Thanks in advance
> 
> 
> 
> On Fri, 2022-06-10 at 10:11 +0000, Carotti, Elias wrote:
>> Hi,
>> patch attached to add support for passing down to libx264 information
>> about which macroblock could be eligible for being coded as P_SKIP.
>>
> 
> 
> 
> NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
> 

There's a stray printf in the patch.

Also, what if a frame with such side data is passed to not-x264? Won't 
it leak the memory?
 From the looks if it, this entirely relies on x264 to free it.
Carotti, Elias June 17, 2022, 10:15 a.m. UTC | #3
Hi,
thanks for pointing out the printf. That's a left over which I removed.

I am not clear on the possible leak you are hinting at. 
The new side information only passes two pointers to libx264, the first
one being a buffer with the flags and a pointer to a deallocator which
may be NULL. 

If the deallocator is not NULL libx264 we're yielding the deallocation
responsibility to libx264, otherwise the ownership of the buffer and,
as such, the responsibility for the deallocation remains with the
caller. 
As such, the only possibility for a leak seems to me due to a
programming error.
Is that what you were asking or am I missing something else?

Please find attached the updated patch.

Elias



On Fri, 2022-06-17 at 11:40 +0200, Timo Rothenpieler wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
> 
> On 17.06.2022 10:41, Carotti, Elias wrote:
> > Hi all,
> > any chance someone could possibly have a look at this patch,
> > please?
> > Thanks in advance
> > 
> > 
> > 
> > On Fri, 2022-06-10 at 10:11 +0000, Carotti, Elias wrote:
> > > Hi,
> > > patch attached to add support for passing down to libx264
> > > information
> > > about which macroblock could be eligible for being coded as
> > > P_SKIP.
> > > 
> > 
> > 
> > NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro
> > delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale
> > Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052,
> > Societa con Socio Unico
> > 
> 
> There's a stray printf in the patch.
> 
> Also, what if a frame with such side data is passed to not-x264?
> Won't
> it leak the memory?
>  From the looks if it, this entirely relies on x264 to free it.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".



NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
Timo Rothenpieler June 17, 2022, 10:34 a.m. UTC | #4
On 17.06.2022 12:15, Carotti, Elias wrote:
> Hi,
> thanks for pointing out the printf. That's a left over which I removed.
> 
> I am not clear on the possible leak you are hinting at.
> The new side information only passes two pointers to libx264, the first
> one being a buffer with the flags and a pointer to a deallocator which
> may be NULL.
> 
> If the deallocator is not NULL libx264 we're yielding the deallocation
> responsibility to libx264, otherwise the ownership of the buffer and,
> as such, the responsibility for the deallocation remains with the
> caller.
> As such, the only possibility for a leak seems to me due to a
> programming error.
> Is that what you were asking or am I missing something else?
> 
> Please find attached the updated patch.
> 
> Elias
> 

Yes, exactly. It relies on x264 to free it.
What happens if x264 is not involved, but some other encoder, which does 
not check for that kind of side data?

How does the caller know that the data was consumed, and the ownership 
passed on?
The only sane way would be for the caller to hand over the ownership to 
ffmpeg, which then takes care of making sure it gets freed.
Carotti, Elias June 17, 2022, 10:59 a.m. UTC | #5
> 
> Yes, exactly. It relies on x264 to free it.

Not really. It can rely on x264 if you explicitly want that behavior.
If you do not set a deallocator, it remains the caller responsibility.

> What happens if x264 is not involved, but some other encoder, which
> does
> not check for that kind of side data?
> 
> How does the caller know that the data was consumed, and the
> ownership
> passed on?

Again, you don't have to pass the ownership, and in fact in my use case
I do not pass it since I actually recycle and update the same buffer
for subsequent frames. If you do intentionally pass the ownership you
need to be aware of what you are doing. As I said, I see a leak only in
case of a programming error. 
Maybe we could explicitly state it in the comment section in mb_info.h:
if you set the deallocator you're relinquishing ownership of the
buffer.

Plus, there's one more flag (b_mb_info_update) in libx264 which allows
to pass back information to the caller about which macroblocks among
the flagged ones were actually encoded as P_SKIP. I did not add support
for that though but in the future somebody may want to.

In principle I could've put the buffer into the side information  and
not just pass a pointer to it but I thought that it would require
adding more semantics which would imply a stronger dependency on
libx264. 
Right now, mb_info is a vector of uint8_t flags and the only possible
value - to date - is X264_MBINFO_CONSTANT. What if, say, one day
libx264 decides the buffer has to be a vector of uint16_t or still
uint8_t but the flags are packed? On the other hand, this, AFAIK, is
only supported by libx264. Other codecs may want to choose a different
semantic for a similar field and the only possibility to make it
generic is letting the caller handling the low level details.


> The only sane way would be for the caller to hand over the ownership
> to
> ffmpeg, which then takes care of making sure it gets freed.




> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".



NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
Timo Rothenpieler June 17, 2022, 3:10 p.m. UTC | #6
On 17.06.2022 12:59, Carotti, Elias wrote:
>>
>> Yes, exactly. It relies on x264 to free it.
> 
> Not really. It can rely on x264 if you explicitly want that behavior.
> If you do not set a deallocator, it remains the caller responsibility.
> 
>> What happens if x264 is not involved, but some other encoder, which
>> does
>> not check for that kind of side data?
>>
>> How does the caller know that the data was consumed, and the
>> ownership
>> passed on?
> 
> Again, you don't have to pass the ownership, and in fact in my use case
> I do not pass it since I actually recycle and update the same buffer
> for subsequent frames. If you do intentionally pass the ownership you
> need to be aware of what you are doing. As I said, I see a leak only in
> case of a programming error.
> Maybe we could explicitly state it in the comment section in mb_info.h:
> if you set the deallocator you're relinquishing ownership of the
> buffer.

I'm not sure if that's something you can sensibly do with side data?
What if it gets buffered, copied, and so on?

> Plus, there's one more flag (b_mb_info_update) in libx264 which allows
> to pass back information to the caller about which macroblocks among
> the flagged ones were actually encoded as P_SKIP. I did not add support
> for that though but in the future somebody may want to.

Yes, it's very x264 specific.
But the side data is generic. If for some reason x264 does not process a 
frame, or any other encoder ends up getting used, the data will leak if 
it relied on x264 to free it.

> In principle I could've put the buffer into the side information  and
> not just pass a pointer to it but I thought that it would require
> adding more semantics which would imply a stronger dependency on
> libx264.
> Right now, mb_info is a vector of uint8_t flags and the only possible
> value - to date - is X264_MBINFO_CONSTANT. What if, say, one day
> libx264 decides the buffer has to be a vector of uint16_t or still
> uint8_t but the flags are packed? On the other hand, this, AFAIK, is
> only supported by libx264. Other codecs may want to choose a different
> semantic for a similar field and the only possibility to make it
> generic is letting the caller handling the low level details.

I'm not aware of any other side data with a similar semantic. And I'm 
really not sure if it's sane or even valid to do it like that.
Can't the data be entirely contained within the side data?
Carotti, Elias June 17, 2022, 3:30 p.m. UTC | #7
On Fri, 2022-06-17 at 17:10 +0200, Timo Rothenpieler wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
> 
> On 17.06.2022 12:59, Carotti, Elias wrote:
> > > Yes, exactly. It relies on x264 to free it.
> > 
> > Not really. It can rely on x264 if you explicitly want that
> > behavior.
> > If you do not set a deallocator, it remains the caller
> > responsibility.
> > 
> > > What happens if x264 is not involved, but some other encoder,
> > > which
> > > does
> > > not check for that kind of side data?
> > > 
> > > How does the caller know that the data was consumed, and the
> > > ownership
> > > passed on?
> > 
> > Again, you don't have to pass the ownership, and in fact in my use
> > case
> > I do not pass it since I actually recycle and update the same
> > buffer
> > for subsequent frames. If you do intentionally pass the ownership
> > you
> > need to be aware of what you are doing. As I said, I see a leak
> > only in
> > case of a programming error.
> > Maybe we could explicitly state it in the comment section in
> > mb_info.h:
> > if you set the deallocator you're relinquishing ownership of the
> > buffer.
> 
> I'm not sure if that's something you can sensibly do with side data?
> What if it gets buffered, copied, and so on?
> 
> > Plus, there's one more flag (b_mb_info_update) in libx264 which
> > allows
> > to pass back information to the caller about which macroblocks
> > among
> > the flagged ones were actually encoded as P_SKIP. I did not add
> > support
> > for that though but in the future somebody may want to.
> 
> Yes, it's very x264 specific.
> But the side data is generic. If for some reason x264 does not
> process a
> frame, or any other encoder ends up getting used, the data will leak
> if
> it relied on x264 to free it.

Honestly, I do not see how this could ever happen. If you don't use libx264 you're likely not be using this kind of side data either.

If you use libx264 it's up to you to decide whether you want to reliquish control over it to libx264 or not. 
Otherwise, a solution perfectly compatible with what you suggest is to
get rid of the deallocator field altogether which would make it clear
it's the caller's responsibility to free the data.
I just see it like the frame data or any other data you're passing to
the encoder. You just pass the frame data to the library and if you
don't free the memory it's not the library fault, right?

> 
> > In principle I could've put the buffer into the side
> > information  and
> > not just pass a pointer to it but I thought that it would require
> > adding more semantics which would imply a stronger dependency on
> > libx264.
> > Right now, mb_info is a vector of uint8_t flags and the only
> > possible
> > value - to date - is X264_MBINFO_CONSTANT. What if, say, one day
> > libx264 decides the buffer has to be a vector of uint16_t or still
> > uint8_t but the flags are packed? On the other hand, this, AFAIK,
> > is
> > only supported by libx264. Other codecs may want to choose a
> > different
> > semantic for a similar field and the only possibility to make it
> > generic is letting the caller handling the low level details.
> 
> I'm not aware of any other side data with a similar semantic. And I'm
> really not sure if it's sane or even valid to do it like that.
> Can't the data be entirely contained within the side data?

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".



NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
Stefano Sabatini June 17, 2022, 11:41 p.m. UTC | #8
On date Friday 2022-06-17 17:10:25 +0200, Timo Rothenpieler wrote:
> On 17.06.2022 12:59, Carotti, Elias wrote:
[...]
> > Again, you don't have to pass the ownership, and in fact in my use case
> > I do not pass it since I actually recycle and update the same buffer
> > for subsequent frames. If you do intentionally pass the ownership you
> > need to be aware of what you are doing. As I said, I see a leak only in
> > case of a programming error.
> > Maybe we could explicitly state it in the comment section in mb_info.h:
> > if you set the deallocator you're relinquishing ownership of the
> > buffer.
> 

> I'm not sure if that's something you can sensibly do with side data?
> What if it gets buffered, copied, and so on?

This semantics is specific to libx264, but can be possibly extended to
other encoders.

AFAIU libx264 keeps the data in memory until it needs it, and it
finally calls the free when it is finished with the data. I think the
concerns are about:
1. How does the consumer (libx264 or another encoder) knows that the
   data is still valid?

This is a problem of the application injecting the data, it might use
reference counting, the deallocator in this case will unref the data
on the user application.

2. How does the application knows that the data is still in use by the
   consumer (the encoder)?

It doesn't have to, since this is handled by the consumer, when it is
finished it will call the deallocator.

> > Plus, there's one more flag (b_mb_info_update) in libx264 which allows
> > to pass back information to the caller about which macroblocks among
> > the flagged ones were actually encoded as P_SKIP. I did not add support
> > for that though but in the future somebody may want to.
> 
> Yes, it's very x264 specific.
> But the side data is generic. If for some reason x264 does not process a
> frame, or any other encoder ends up getting used, the data will leak if it
> relied on x264 to free it.

So, let's make up a use case for this.

The application injects the data, it might be for example a filter
computing the MB information and it might be unaware of what encoder
is used down in the processing pipeline.

For the simplest case the application knows that libx264 is the used
encoder, and assumes that all data will be consumed by libx264.
 
For the x264 API the problem does not exist, with generic side data
the problem can raise since side data must be agnostic of its final
destination.

So I agree, for the generic use case we can have a leak.

I see some possible ways to solve this:

1. We document the limitation and assume that when the MB metadata is
injected down in the pipeline there is a single consumer, the
application will leak otherwise. This works for the simple application
-> libx264 pipeline, but it can't work for a more generic case.

2. We provide a mechanism *at the library level* to cleanup the
unconsumed data. libavcodec sees that the encoder does not handle that
kind of user data (it might be a capability flag) and automatically
cleanups the data in that case. This would be still underkill though,
since the AVFrame is not necessarily injested into an encoder (for
example the MB info could be rendered or processed by an application
or by a filter). This means that the AVFrame unref needs to call this
side data deallocator (and that the consumer needs to "steal" the data
from the AVFrame before it is deallocated by the AVFrame unref). But
then what happens in case the AVFrame is shared by multiple endpoints
(e.g. multiple encoders)? Then you need a ref count for the side data
itself.

I don't know if this is something which is feasible with the current
side data API, and if it could be extended to support this kind of
semantics.

3. The simpler approach, we copy the data into MB info and deallocate
when we free it, but this might have a performance penalty because of
the unneeded memcpy.

> > In principle I could've put the buffer into the side information  and
> > not just pass a pointer to it but I thought that it would require
> > adding more semantics which would imply a stronger dependency on
> > libx264.
> > Right now, mb_info is a vector of uint8_t flags and the only possible
> > value - to date - is X264_MBINFO_CONSTANT. What if, say, one day
> > libx264 decides the buffer has to be a vector of uint16_t or still
> > uint8_t but the flags are packed? On the other hand, this, AFAIK, is
> > only supported by libx264. Other codecs may want to choose a different
> > semantic for a similar field and the only possibility to make it
> > generic is letting the caller handling the low level details.
> 
> I'm not aware of any other side data with a similar semantic. And I'm really
> not sure if it's sane or even valid to do it like that.
> Can't the data be entirely contained within the side data?

We can assume the same semantics of libx264 (at the moment the only
use case for this).
Anton Khirnov June 18, 2022, 3:06 p.m. UTC | #9
Quoting Carotti, Elias (2022-06-17 10:41:30)
> Hi all,
> any chance someone could possibly have a look at this patch, please?
> Thanks in advance

This looks like it belongs in AVVideoEncParams.
Carotti, Elias June 21, 2022, 8:48 a.m. UTC | #10
Hi,

extending AVVideoEncParams was the first hypothesis I made but it
didn't seem it was the proper place to add the mb_info flags.

I may be wrong but my impression is that AVVideoEncParams is used to
carry encoding parameters read from the bitstream at the decoder side
while here were going the other direction, i.e., were passing
information from the application to the encoder.

Secondly, mb_info can't be strictly considered encoding parameters and
it's not present in the bitstream at all. 
It's just a way to give hints to the libx264 encoder on which
macroblock we know have not changed since the previous frame and could
be coded as P_SKIPs. Libx264 however, may or may not oblige according
to its logic, and this specific information is not transmitted in the
bitstream nor can be recovered at the decoder.

Elias
Anton Khirnov June 23, 2022, 1:29 p.m. UTC | #11
Quoting Carotti, Elias (2022-06-21 10:48:07)
> Hi,
> 
> extending AVVideoEncParams was the first hypothesis I made but it
> didn't seem it was the proper place to add the mb_info flags.
> 
> I may be wrong but my impression is that AVVideoEncParams is used to
> carry encoding parameters read from the bitstream at the decoder side
> while here were going the other direction, i.e., were passing
> information from the application to the encoder.
> 
> Secondly, mb_info can't be strictly considered encoding parameters and
> it's not present in the bitstream at all. 
> It's just a way to give hints to the libx264 encoder on which
> macroblock we know have not changed since the previous frame and could
> be coded as P_SKIPs. Libx264 however, may or may not oblige according
> to its logic, and this specific information is not transmitted in the
> bitstream nor can be recovered at the decoder.

Right, seems I was too hasty in reading your patch.

But then I have to wonder whether this really needs a new installed
header, with a struct and a destructor, given that it's specific to a
single encoder for a single codec that is about 20 years old now.

Wouldn't AV_FRAME_DATA_X264_MBINFO that would be just a raw array of
uint8_t serve your needs just as fine? You could even get custom buffer
management by using AVFrameSideData.buf.
Andreas Rheinhardt June 23, 2022, 2:21 p.m. UTC | #12
Anton Khirnov:
> Quoting Carotti, Elias (2022-06-21 10:48:07)
>> Hi,
>>
>> extending AVVideoEncParams was the first hypothesis I made but it
>> didn't seem it was the proper place to add the mb_info flags.
>>
>> I may be wrong but my impression is that AVVideoEncParams is used to
>> carry encoding parameters read from the bitstream at the decoder side
>> while here were going the other direction, i.e., were passing
>> information from the application to the encoder.
>>
>> Secondly, mb_info can't be strictly considered encoding parameters and
>> it's not present in the bitstream at all. 
>> It's just a way to give hints to the libx264 encoder on which
>> macroblock we know have not changed since the previous frame and could
>> be coded as P_SKIPs. Libx264 however, may or may not oblige according
>> to its logic, and this specific information is not transmitted in the
>> bitstream nor can be recovered at the decoder.
> 
> Right, seems I was too hasty in reading your patch.
> 
> But then I have to wonder whether this really needs a new installed
> header, with a struct and a destructor, given that it's specific to a
> single encoder for a single codec that is about 20 years old now.
> 
> Wouldn't AV_FRAME_DATA_X264_MBINFO that would be just a raw array of
> uint8_t serve your needs just as fine? You could even get custom buffer
> management by using AVFrameSideData.buf.
> 

There is one problem though: libx264's free functions don't accept an
opaque parameter, so one can't easily create a reference for libx264 to
unref. I don't see a way around duplicating this buffer in the encoder.
(Or is there a way to know when libx264 is done with using this buffer?)

- Andreas
Anton Khirnov June 23, 2022, 2:48 p.m. UTC | #13
Quoting Andreas Rheinhardt (2022-06-23 16:21:18)
> Anton Khirnov:
> > Quoting Carotti, Elias (2022-06-21 10:48:07)
> >> Hi,
> >>
> >> extending AVVideoEncParams was the first hypothesis I made but it
> >> didn't seem it was the proper place to add the mb_info flags.
> >>
> >> I may be wrong but my impression is that AVVideoEncParams is used to
> >> carry encoding parameters read from the bitstream at the decoder side
> >> while here were going the other direction, i.e., were passing
> >> information from the application to the encoder.
> >>
> >> Secondly, mb_info can't be strictly considered encoding parameters and
> >> it's not present in the bitstream at all. 
> >> It's just a way to give hints to the libx264 encoder on which
> >> macroblock we know have not changed since the previous frame and could
> >> be coded as P_SKIPs. Libx264 however, may or may not oblige according
> >> to its logic, and this specific information is not transmitted in the
> >> bitstream nor can be recovered at the decoder.
> > 
> > Right, seems I was too hasty in reading your patch.
> > 
> > But then I have to wonder whether this really needs a new installed
> > header, with a struct and a destructor, given that it's specific to a
> > single encoder for a single codec that is about 20 years old now.
> > 
> > Wouldn't AV_FRAME_DATA_X264_MBINFO that would be just a raw array of
> > uint8_t serve your needs just as fine? You could even get custom buffer
> > management by using AVFrameSideData.buf.
> > 
> 
> There is one problem though: libx264's free functions don't accept an
> opaque parameter, so one can't easily create a reference for libx264 to
> unref. I don't see a way around duplicating this buffer in the encoder.
> (Or is there a way to know when libx264 is done with using this buffer?)

An ugly, but workable hack could be
- user allocates the AVBuffer with extra space at the end
- lavc/libx264.c checks that there is extra space AND the buffer is
  writable (so the same side data wasn't passed to multiple encoders),
  then stores its AVBufferRef* in there
Andreas Rheinhardt June 24, 2022, 9:14 a.m. UTC | #14
Anton Khirnov:
> Quoting Andreas Rheinhardt (2022-06-23 16:21:18)
>> Anton Khirnov:
>>> Quoting Carotti, Elias (2022-06-21 10:48:07)
>>>> Hi,
>>>>
>>>> extending AVVideoEncParams was the first hypothesis I made but it
>>>> didn't seem it was the proper place to add the mb_info flags.
>>>>
>>>> I may be wrong but my impression is that AVVideoEncParams is used to
>>>> carry encoding parameters read from the bitstream at the decoder side
>>>> while here were going the other direction, i.e., were passing
>>>> information from the application to the encoder.
>>>>
>>>> Secondly, mb_info can't be strictly considered encoding parameters and
>>>> it's not present in the bitstream at all. 
>>>> It's just a way to give hints to the libx264 encoder on which
>>>> macroblock we know have not changed since the previous frame and could
>>>> be coded as P_SKIPs. Libx264 however, may or may not oblige according
>>>> to its logic, and this specific information is not transmitted in the
>>>> bitstream nor can be recovered at the decoder.
>>>
>>> Right, seems I was too hasty in reading your patch.
>>>
>>> But then I have to wonder whether this really needs a new installed
>>> header, with a struct and a destructor, given that it's specific to a
>>> single encoder for a single codec that is about 20 years old now.
>>>
>>> Wouldn't AV_FRAME_DATA_X264_MBINFO that would be just a raw array of
>>> uint8_t serve your needs just as fine? You could even get custom buffer
>>> management by using AVFrameSideData.buf.
>>>
>>
>> There is one problem though: libx264's free functions don't accept an
>> opaque parameter, so one can't easily create a reference for libx264 to
>> unref. I don't see a way around duplicating this buffer in the encoder.
>> (Or is there a way to know when libx264 is done with using this buffer?)
> 
> An ugly, but workable hack could be
> - user allocates the AVBuffer with extra space at the end
> - lavc/libx264.c checks that there is extra space AND the buffer is
>   writable (so the same side data wasn't passed to multiple encoders),
>   then stores its AVBufferRef* in there
> 

Ugly? Certainly. Workable? Questionable: Depending upon how the user
uses the API the frame might not be writable when it reaches the encode
callback of our libx264 wrapper even if the user actually supplies
writable frames to avcodec_send_frame(). The reason is that if said
callback is called in avcodec_send_frame(), both the user's AVFrame as
well as the newly created reference to it (which is the frame that the
callback receives) will each own a reference to each side data.
The callback is called in avcodec_send_frame() if there is currently no
buffered packet ready to be forwarded to the user. This is true if the
user drains the encoder every time a frame has been sent. And that is
how we ordinarily use the API (our examples use this pattern as well as
ffmpeg.c). So I guess that this is the common way our API is used by
others as well.
(I think I once proposed adding a flag to muxers/encoders to select
whether an ownership transfer of the packet/frame is intended or not.)

- Andreas
diff mbox series

Patch

From b658ba5e34afeb59cd07b4d4110ddfce749847b0 Mon Sep 17 00:00:00 2001
From: Elias Carotti <eliascrt@amazon.it>
Date: Thu, 26 May 2022 21:17:43 +0000
Subject: [PATCH] Added support for MB_INFO

---
 doc/APIchanges       |  4 ++++
 libavcodec/libx264.c | 22 +++++++++++++++++++
 libavutil/Makefile   |  4 ++++
 libavutil/frame.h    | 10 +++++++++
 libavutil/mb_info.c  | 43 +++++++++++++++++++++++++++++++++++++
 libavutil/mb_info.h  | 50 ++++++++++++++++++++++++++++++++++++++++++++
 libavutil/version.h  |  2 +-
 7 files changed, 134 insertions(+), 1 deletion(-)
 create mode 100644 libavutil/mb_info.c
 create mode 100644 libavutil/mb_info.h

diff --git a/doc/APIchanges b/doc/APIchanges
index 337f1466d8..57dbf65551 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -14,6 +14,10 @@  libavutil:     2021-04-27
 
 API changes, most recent first:
 
+2022-06-08 - xxxxxxxxx - lavu 57.27.100 - frame.h
+  Add AV_FRAME_DATA_MB_INFO to AVFrameSideDataType and the supporting AVMBInfo
+  API.
+
 2022-05-23 - xxxxxxxxx - lavu 57.25.100 - avutil.h
   Deprecate av_fopen_utf8() without replacement.
 
diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
index 4ce3791ae8..c4ac06c428 100644
--- a/libavcodec/libx264.c
+++ b/libavcodec/libx264.c
@@ -29,6 +29,7 @@ 
 #include "libavutil/stereo3d.h"
 #include "libavutil/time.h"
 #include "libavutil/intreadwrite.h"
+#include "libavutil/mb_info.h"
 #include "avcodec.h"
 #include "codec_internal.h"
 #include "encode.h"
@@ -119,6 +120,8 @@  typedef struct X264Context {
      * encounter a frame with ROI side data.
      */
     int roi_warned;
+
+    int mb_info;
 } X264Context;
 
 static void X264_log(void *p, int level, const char *fmt, va_list args)
@@ -495,6 +498,20 @@  static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame,
         }
     }
 
+    if (frame && x4->mb_info) {
+        AVFrameSideData *mbinfo_sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MB_INFO);
+
+        if (mbinfo_sd) {
+            AVMBInfo *par = (AVMBInfo *)mbinfo_sd->data;
+
+            x4->pic.prop.mb_info = par->mb_info;
+            x4->pic.prop.mb_info_free = par->mb_info_free;
+        } else if (!mbinfo_sd || !mbinfo_sd->data) {
+            av_log(ctx, AV_LOG_WARNING,
+                    "mb_info flag set but no actual MB info was provided\n");
+        }
+    }
+
     do {
         if (x264_encoder_encode(x4->enc, &nal, &nnal, frame? &x4->pic: NULL, &pic_out) < 0)
             return AVERROR_EXTERNAL;
@@ -969,6 +986,10 @@  static av_cold int X264_init(AVCodecContext *avctx)
         }
     }
 
+    x4->params.analyse.b_mb_info = x4->mb_info;
+    x4->params.analyse.b_fast_pskip = 1;
+    printf("MB info is: %s\n", x4->mb_info ? "enabled" : "disabled");
+
     // update AVCodecContext with x264 parameters
     avctx->has_b_frames = x4->params.i_bframe ?
         x4->params.i_bframe_pyramid ? 2 : 1 : 0;
@@ -1176,6 +1197,7 @@  static const AVOption options[] = {
     { "noise_reduction", "Noise reduction",                               OFFSET(noise_reduction), AV_OPT_TYPE_INT, { .i64 = -1 }, INT_MIN, INT_MAX, VE },
     { "udu_sei",      "Use user data unregistered SEI if available",      OFFSET(udu_sei),  AV_OPT_TYPE_BOOL,   { .i64 = 0 }, 0, 1, VE },
     { "x264-params",  "Override the x264 configuration using a :-separated list of key=value parameters", OFFSET(x264_params), AV_OPT_TYPE_DICT, { 0 }, 0, 0, VE },
+    { "mb_info",      "Set mb_info data through AVSideData, only useful when used from the API", OFFSET(mb_info), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, VE },
     { NULL },
 };
 
diff --git a/libavutil/Makefile b/libavutil/Makefile
index 74d21a8103..4890da77f0 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -89,6 +89,7 @@  HEADERS = adler32.h                                                     \
           tea.h                                                         \
           tx.h                                                          \
           film_grain_params.h                                           \
+          mb_info.h							\
 
 ARCH_HEADERS = bswap.h                                                  \
                intmath.h                                                \
@@ -193,6 +194,7 @@  OBJS-$(CONFIG_VAAPI)                    += hwcontext_vaapi.o
 OBJS-$(CONFIG_VIDEOTOOLBOX)             += hwcontext_videotoolbox.o
 OBJS-$(CONFIG_VDPAU)                    += hwcontext_vdpau.o
 OBJS-$(CONFIG_VULKAN)                   += hwcontext_vulkan.o
+OBJS-$(CONFIG_LIBX264)                  += mb_info.o
 
 OBJS += $(COMPAT_OBJS:%=../compat/%)
 
@@ -214,6 +216,8 @@  SKIPHEADERS-$(CONFIG_VULKAN)           += hwcontext_vulkan.h vulkan.h   \
                                           vulkan_functions.h            \
                                           vulkan_loader.h
 
+SKIPHEADERS-$(CONFIG_LIBX264)  	       += mb_info.h
+
 TESTPROGS = adler32                                                     \
             aes                                                         \
             aes_ctr                                                     \
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 33fac2054c..08c84a1b63 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -209,6 +209,16 @@  enum AVFrameSideDataType {
      * volume transform - CUVA 005.1-2021.
      */
     AV_FRAME_DATA_DYNAMIC_HDR_VIVID,
+
+    /**
+     * Provide macro block encoder-specific hinting information for the encoder
+     * processing.  It can be used to pass information about which macroblock
+     * can be skipped because it hasn't changed from the corresponding one in
+     * the previous frame. This is useful for applications which know in
+     * advance this information to speed up real-time encoding.  Currently only
+     * used by libx264.
+     */
+    AV_FRAME_DATA_MB_INFO,
 };
 
 enum AVActiveFormatDescription {
diff --git a/libavutil/mb_info.c b/libavutil/mb_info.c
new file mode 100644
index 0000000000..d27d8dc412
--- /dev/null
+++ b/libavutil/mb_info.c
@@ -0,0 +1,43 @@ 
+/*
+ * Copyright (c) 2022 Elias Carotti eliascrt _AT_ amazon.it
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <string.h>
+
+#include "avstring.h"
+#include "frame.h"
+#include "macros.h"
+#include "mem.h"
+#include "mb_info.h"
+
+
+AVMBInfo *av_mb_info_create_side_data(AVFrame *frame, uint8_t* mb_info,
+                                            void (*mb_info_free)(void *))
+{
+    AVFrameSideData *side_data = av_frame_new_side_data(frame,
+                                                        AV_FRAME_DATA_MB_INFO,
+                                                        sizeof(AVMBInfo));
+    AVMBInfo *par = (AVMBInfo *)side_data->data;
+
+    par->mb_info = mb_info;
+    par->mb_info_free = mb_info_free;
+
+    return par;
+}
+
diff --git a/libavutil/mb_info.h b/libavutil/mb_info.h
new file mode 100644
index 0000000000..007f276dc4
--- /dev/null
+++ b/libavutil/mb_info.h
@@ -0,0 +1,50 @@ 
+/**
+ * Copyright (c) 2022 Elias Carotti eliascrt _AT_ amazon.it
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_MB_INFO_H
+#define AVUTIL_MB_INFO_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include "libavutil/avassert.h"
+#include "libavutil/frame.h"
+
+
+typedef struct AVMBInfo {
+    /**
+     * The actual mb_info data: one uint8_t per macroblock in raster-scan order.
+     * Currently the only flag defined in x264.h is X264_MB_INFO_CONSTANT
+     */
+    uint8_t *mb_info;
+
+    /* An optional pointer (may be NULL) to a de-allocator for the mb_info data */
+    void (*mb_info_free)(void *);
+} AVMBInfo;
+
+/**
+ * Allocate memory for AVMBInfo in the given AVFrame {@code frame}
+ * as AVFrameSideData of type AV_FRAME_DATA_MB_INFO
+ * and initializes the variables.
+ */
+AVMBInfo *av_mb_info_create_side_data(AVFrame *frame, uint8_t* mb_info,
+                                            void (*mb_info_free)(void *));
+
+#endif /* AVUTIL_MB_INFO_H */
diff --git a/libavutil/version.h b/libavutil/version.h
index 2c7f4f6b37..2e9e02dda8 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@ 
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  57
-#define LIBAVUTIL_VERSION_MINOR  26
+#define LIBAVUTIL_VERSION_MINOR  27
 #define LIBAVUTIL_VERSION_MICRO 100
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
-- 
2.25.1