diff mbox series

[FFmpeg-devel] libavutil/video_enc_params: add block type

Message ID 20200707224729.2788660-1-yonglel@google.com
State Superseded
Headers show
Series [FFmpeg-devel] libavutil/video_enc_params: add block type | expand

Checks

Context Check Description
andriy/default pending
andriy/make success Make finished
andriy/make_fate success Make fate finished

Commit Message

Yongle Lin July 7, 2020, 10:47 p.m. UTC
add block type field to AVVideoBlockParams so we could either export or visualize it later.
---
 libavutil/video_enc_params.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Lynne July 7, 2020, 10:54 p.m. UTC | #1
Jul 7, 2020, 23:47 by yongle.lin.94@gmail.com:

> add block type field to AVVideoBlockParams so we could either export or visualize it later.
> ---
>  libavutil/video_enc_params.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
> index 43fa443154..52c0058f5b 100644
> --- a/libavutil/video_enc_params.h
> +++ b/libavutil/video_enc_params.h
> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
>  AV_VIDEO_ENC_PARAMS_H264,
>  };
>  
> +enum AVVideoBlockFlags {
> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */
> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */
> +};
> +
>  /**
>  * Video encoding parameters for a given frame. This struct is allocated along
>  * with an optional array of per-block AVVideoBlockParams descriptors.
> @@ -126,6 +131,17 @@ typedef struct AVVideoBlockParams {
>  * corresponding per-frame value.
>  */
>  int32_t delta_qp;
> +
> +    /**
> +     * Type flag of the block
> +     * Each bit field indicates a type flag
> +     */
> +    enum AVVideoBlockFlags flags;
> +
> +    /**
> +     * Reference frames used for prediction
> +     */
> +    uint8_t ref[8];
>  } AVVideoBlockParams; 
>

After some discussion on IRC, could you clarify the ref array description to this:

> Each entry specifies the first/second/third/etc. reference frame the current frame uses.
> The value at each entry specifies the index inside the reference frame array for that current frame.

E.g. your current frame has 6 valid possible references, and your frame header specifies you
can use ref_frame[3] and ref_frame[5] as a reference.
So the values of ref[] for each block must be either 3 or 5.
Its convoluted because the array maps indices to indices but it makes sense.
Yongle Lin July 7, 2020, 11:06 p.m. UTC | #2
On Tue, Jul 7, 2020 at 10:54 PM Lynne <dev@lynne.ee> wrote:

> Jul 7, 2020, 23:47 by yongle.lin.94@gmail.com:
>
> > add block type field to AVVideoBlockParams so we could either export or
> visualize it later.
> > ---
> >  libavutil/video_enc_params.h | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
> > index 43fa443154..52c0058f5b 100644
> > --- a/libavutil/video_enc_params.h
> > +++ b/libavutil/video_enc_params.h
> > @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
> >  AV_VIDEO_ENC_PARAMS_H264,
> >  };
> >
> > +enum AVVideoBlockFlags {
> > +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses
> intra prediction */
> > +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not
> coded (skipped) */
> > +};
> > +
> >  /**
> >  * Video encoding parameters for a given frame. This struct is allocated
> along
> >  * with an optional array of per-block AVVideoBlockParams descriptors.
> > @@ -126,6 +131,17 @@ typedef struct AVVideoBlockParams {
> >  * corresponding per-frame value.
> >  */
> >  int32_t delta_qp;
> > +
> > +    /**
> > +     * Type flag of the block
> > +     * Each bit field indicates a type flag
> > +     */
> > +    enum AVVideoBlockFlags flags;
> > +
> > +    /**
> > +     * Reference frames used for prediction
> > +     */
> > +    uint8_t ref[8];
> >  } AVVideoBlockParams;
> >
>
> After some discussion on IRC, could you clarify the ref array description
> to this:
>
> > Each entry specifies the first/second/third/etc. reference frame the
> current frame uses.
> > The value at each entry specifies the index inside the reference frame
> array for that current frame.
>
> E.g. your current frame has 6 valid possible references, and your frame
> header specifies you
> can use ref_frame[3] and ref_frame[5] as a reference.
> So the values of ref[] for each block must be either 3 or 5.
> Its convoluted because the array maps indices to indices but it makes
> sense.
>
Yes. That makes sense to me. So the ref array stores the index of reference
frame in the ref_frame[]

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Mark Thompson July 7, 2020, 11:08 p.m. UTC | #3
On 07/07/2020 23:54, Lynne wrote:
> Jul 7, 2020, 23:47 by yongle.lin.94@gmail.com:
> 
>> add block type field to AVVideoBlockParams so we could either export or visualize it later.
>> ---
>>   libavutil/video_enc_params.h | 16 ++++++++++++++++
>>   1 file changed, 16 insertions(+)
>>
>> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
>> index 43fa443154..52c0058f5b 100644
>> --- a/libavutil/video_enc_params.h
>> +++ b/libavutil/video_enc_params.h
>> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
>>   AV_VIDEO_ENC_PARAMS_H264,
>>   };
>>   
>> +enum AVVideoBlockFlags {
>> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */
>> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */
>> +};
>> +
>>   /**
>>   * Video encoding parameters for a given frame. This struct is allocated along
>>   * with an optional array of per-block AVVideoBlockParams descriptors.
>> @@ -126,6 +131,17 @@ typedef struct AVVideoBlockParams {
>>   * corresponding per-frame value.
>>   */
>>   int32_t delta_qp;
>> +
>> +    /**
>> +     * Type flag of the block
>> +     * Each bit field indicates a type flag
>> +     */
>> +    enum AVVideoBlockFlags flags;
>> +
>> +    /**
>> +     * Reference frames used for prediction
>> +     */
>> +    uint8_t ref[8];
>>   } AVVideoBlockParams;
>>
> 
> After some discussion on IRC, could you clarify the ref array description to this:
> 
>> Each entry specifies the first/second/third/etc. reference frame the current frame uses.
>> The value at each entry specifies the index inside the reference frame array for that current frame.
> 
> E.g. your current frame has 6 valid possible references, and your frame header specifies you
> can use ref_frame[3] and ref_frame[5] as a reference.
> So the values of ref[] for each block must be either 3 or 5.
> Its convoluted because the array maps indices to indices but it makes sense.

Please also define it precisely for H.264, the other supported codec.

I came up with:

"""
For H.264, the values in this array are indices into the default RefPicList0 as constructed by 8.2.4.2 (before ref pic list modification has run and without any truncation).
If the block is intra-coded, no entries are valid.
If the block in inter-coded with reference to a single picture, ref[0] containes the index of that picture (which might have come from L0 or L1 list).
If the block is inter-coded using biprediction, ref[0] contains the index of the L0 picture and ref[1] contains the index of the L1 picture.
"""

Not sure if that's doing exactly the right thing or matches what you intended, but this is tricky so it needs that level of detail.

8 distinct reference pictures also seems slightly ambitious for a single lowest-level block, but I guess the future is always about ever-more-complex coding tools...

- Mark
Lynne July 7, 2020, 11:12 p.m. UTC | #4
Jul 8, 2020, 00:08 by sw@jkqxz.net:

> On 07/07/2020 23:54, Lynne wrote:
>
>> Jul 7, 2020, 23:47 by yongle.lin.94@gmail.com:
>>
>>> add block type field to AVVideoBlockParams so we could either export or visualize it later.
>>> ---
>>>  libavutil/video_enc_params.h | 16 ++++++++++++++++
>>>  1 file changed, 16 insertions(+)
>>>
>>> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
>>> index 43fa443154..52c0058f5b 100644
>>> --- a/libavutil/video_enc_params.h
>>> +++ b/libavutil/video_enc_params.h
>>> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
>>>  AV_VIDEO_ENC_PARAMS_H264,
>>>  };
>>>  +enum AVVideoBlockFlags {
>>> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */
>>> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */
>>> +};
>>> +
>>>  /**
>>>  * Video encoding parameters for a given frame. This struct is allocated along
>>>  * with an optional array of per-block AVVideoBlockParams descriptors.
>>> @@ -126,6 +131,17 @@ typedef struct AVVideoBlockParams {
>>>  * corresponding per-frame value.
>>>  */
>>>  int32_t delta_qp;
>>> +
>>> +    /**
>>> +     * Type flag of the block
>>> +     * Each bit field indicates a type flag
>>> +     */
>>> +    enum AVVideoBlockFlags flags;
>>> +
>>> +    /**
>>> +     * Reference frames used for prediction
>>> +     */
>>> +    uint8_t ref[8];
>>>  } AVVideoBlockParams;
>>>
>>
>> After some discussion on IRC, could you clarify the ref array description to this:
>>
>>> Each entry specifies the first/second/third/etc. reference frame the current frame uses.
>>> The value at each entry specifies the index inside the reference frame array for that current frame.
>>>
>>
>> E.g. your current frame has 6 valid possible references, and your frame header specifies you
>> can use ref_frame[3] and ref_frame[5] as a reference.
>> So the values of ref[] for each block must be either 3 or 5.
>> Its convoluted because the array maps indices to indices but it makes sense.
>>
>
> Please also define it precisely for H.264, the other supported codec.
>
> I came up with:
>
> """
> For H.264, the values in this array are indices into the default RefPicList0 as constructed by 8.2.4.2 (before ref pic list modification has run and without any truncation).
> If the block is intra-coded, no entries are valid.
> If the block in inter-coded with reference to a single picture, ref[0] containes the index of that picture (which might have come from L0 or L1 list).
> If the block is inter-coded using biprediction, ref[0] contains the index of the L0 picture and ref[1] contains the index of the L1 picture.
> """
>
> Not sure if that's doing exactly the right thing or matches what you intended, but this is tricky so it needs that level of detail.
>
> 8 distinct reference pictures also seems slightly ambitious for a single lowest-level block, but I guess the future is always about ever-more-complex coding tools...
>
> - Mark
>

Also specify that any entry that's unused will be set to -1. Zero is a valid index for non-keyframes after all.
Yongle Lin July 8, 2020, 12:16 a.m. UTC | #5
On Tue, Jul 7, 2020 at 4:08 PM Mark Thompson <sw@jkqxz.net> wrote:

> On 07/07/2020 23:54, Lynne wrote:
> > Jul 7, 2020, 23:47 by yongle.lin.94@gmail.com:
> >
> >> add block type field to AVVideoBlockParams so we could either export or
> visualize it later.
> >> ---
> >>   libavutil/video_enc_params.h | 16 ++++++++++++++++
> >>   1 file changed, 16 insertions(+)
> >>
> >> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
> >> index 43fa443154..52c0058f5b 100644
> >> --- a/libavutil/video_enc_params.h
> >> +++ b/libavutil/video_enc_params.h
> >> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
> >>   AV_VIDEO_ENC_PARAMS_H264,
> >>   };
> >>
> >> +enum AVVideoBlockFlags {
> >> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses
> intra prediction */
> >> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not
> coded (skipped) */
> >> +};
> >> +
> >>   /**
> >>   * Video encoding parameters for a given frame. This struct is
> allocated along
> >>   * with an optional array of per-block AVVideoBlockParams descriptors.
> >> @@ -126,6 +131,17 @@ typedef struct AVVideoBlockParams {
> >>   * corresponding per-frame value.
> >>   */
> >>   int32_t delta_qp;
> >> +
> >> +    /**
> >> +     * Type flag of the block
> >> +     * Each bit field indicates a type flag
> >> +     */
> >> +    enum AVVideoBlockFlags flags;
> >> +
> >> +    /**
> >> +     * Reference frames used for prediction
> >> +     */
> >> +    uint8_t ref[8];
> >>   } AVVideoBlockParams;
> >>
> >
> > After some discussion on IRC, could you clarify the ref array
> description to this:
> >
> >> Each entry specifies the first/second/third/etc. reference frame the
> current frame uses.
> >> The value at each entry specifies the index inside the reference frame
> array for that current frame.
> >
> > E.g. your current frame has 6 valid possible references, and your frame
> header specifies you
> > can use ref_frame[3] and ref_frame[5] as a reference.
> > So the values of ref[] for each block must be either 3 or 5.
> > Its convoluted because the array maps indices to indices but it makes
> sense.
>
> Please also define it precisely for H.264, the other supported codec.
>
> I came up with:
>
> """
> For H.264, the values in this array are indices into the default
> RefPicList0 as constructed by 8.2.4.2 (before ref pic list modification has
> run and without any truncation).
> If the block is intra-coded, no entries are valid.
> If the block in inter-coded with reference to a single picture, ref[0]
> containes the index of that picture (which might have come from L0 or L1
> list).
> If the block is inter-coded using biprediction, ref[0] contains the index
> of the L0 picture and ref[1] contains the index of the L1 picture.
> """
>
> Not sure if that's doing exactly the right thing or matches what you
> intended, but this is tricky so it needs that level of detail.
>
> 8 distinct reference pictures also seems slightly ambitious for a single
> lowest-level block, but I guess the future is always about
> ever-more-complex coding tools...
>


That's a good approach. I think currently we only support H.264 and VP9
codecs for block data. For VP9, we might use the same logic.

>
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Mark Thompson July 15, 2020, 11:06 p.m. UTC | #6
On 15/07/2020 18:43, Yongle Lin wrote:
> add block type field to AVVideoBlockParams so we could either export or visualize it later.
> ---
>   libavutil/video_enc_params.h | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
> index 43fa443154..8bf5f240c9 100644
> --- a/libavutil/video_enc_params.h
> +++ b/libavutil/video_enc_params.h
> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
>       AV_VIDEO_ENC_PARAMS_H264,
>   };
>   
> +enum AVVideoBlockFlags {
> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */

The ULL is only confusing matters here - standard-conforming enum values have type int.  Some compilers allow it to be a larger integer type, but I don't think we can rely on that extension.

> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */

Can you define more precisely what you mean by "not coded"?  Is that just that it has no residuals, or does it also indicate no motion vector, or no coded motion vector relative to prediction?

> +};
> +
>   /**
>    * Video encoding parameters for a given frame. This struct is allocated along
>    * with an optional array of per-block AVVideoBlockParams descriptors.
> @@ -126,6 +131,20 @@ typedef struct AVVideoBlockParams {
>        * corresponding per-frame value.
>        */
>       int32_t delta_qp;
> +
> +    /**
> +     * Type flag of the block
> +     * Each bit field indicates a type flag
> +     */
> +    enum AVVideoBlockFlags flags;

Don't make this an enum, since you aren't using it as an enum - you're going to assign combinations of flags.  (Also, the size of the field may change as more flags are added.)

> +
> +    /**
> +     * Reference frames used for prediction
> +     * Each entry specifies the first/second/third/etc. reference frame the current frame uses.
> +     * The value at each entry specifies the index inside the reference frame array for that current frame.

You'll need to define more carefully what "the reference frame array" means.  I can guess that it's the ref_frame_idx[] number for VP9, but it's not at all obvious what it would mean for H.264.

> +     * Any entry that is unused will be set to -1
> +     */
> +    int8_t ref[8];
>   } AVVideoBlockParams;
>   
>   /*
> 

- Mark
Mark Thompson July 15, 2020, 11:09 p.m. UTC | #7
On 15/07/2020 22:05, Yongle Lin wrote:
> ---
>   libavcodec/h264dec.c | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
> 
> diff --git a/libavcodec/h264dec.c b/libavcodec/h264dec.c
> index 1e2ca68449..b3de5290d0 100644
> --- a/libavcodec/h264dec.c
> +++ b/libavcodec/h264dec.c
> @@ -816,6 +816,20 @@ static int h264_export_enc_params(AVFrame *f, H264Picture *p)
>               b->h     = 16;
>   
>               b->delta_qp = p->qscale_table[mb_xy] - par->qp;
> +
> +            int mb_type = p->mb_type[mb_xy];
> +            if (IS_PCM(mb_type))
> +                b->flags |= AV_VIDEO_ENC_BLOCK_INTRA;

Can you explain your definition of what makes a block "intra"?  If you really do mean to only include PCM blocks then maybe it should have a different name, because that isn't what I would expect.

> +            if (IS_SKIP(mb_type))
> +                b->flags |= AV_VIDEO_ENC_BLOCK_SKIP; > +            if (!USES_LIST(mb_type, 1))
> +                b->ref[0] = p->ref_index[0];
> +            else if (!USES_LIST(mb_type, 0))
> +                b->ref[0] = p->ref_index[1];
> +            else {
> +                b->ref[0] = p->ref_index[0];
> +                b->ref[1] = p->ref_index[1];

I don't think ref_index is anything to do with what you think it is.

Also note that code of this form must be insufficient due to the lack dependence on both the slice and the block.  (Slice headers define the two reference picture lists, and blocks within a slice 
choose pictures from those lists.)

> +            }
>           }
>   
>       return 0;
> 

- Mark
Lynne July 15, 2020, 11:35 p.m. UTC | #8
Jul 16, 2020, 00:06 by sw@jkqxz.net:

> On 15/07/2020 18:43, Yongle Lin wrote:
>
>> add block type field to AVVideoBlockParams so we could either export or visualize it later.
>> ---
>>  libavutil/video_enc_params.h | 19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>>
>> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
>> index 43fa443154..8bf5f240c9 100644
>> --- a/libavutil/video_enc_params.h
>> +++ b/libavutil/video_enc_params.h
>> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
>>  AV_VIDEO_ENC_PARAMS_H264,
>>  };
>>  +enum AVVideoBlockFlags {
>> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */
>>
>
> The ULL is only confusing matters here - standard-conforming enum values have type int.  Some compilers allow it to be a larger integer type, but I don't think we can rely on that extension.
>

We already rely on this behavior internally in hwcontext_vulkan. Are you sure there's a compiler
we support that doesn't allow this? If so we ought to fix that.



>> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */
>>
>
> Can you define more precisely what you mean by "not coded"?  Is that just that it has no residuals, or does it also indicate no motion vector, or no coded motion vector relative to prediction?
>

In AV1 and VP9, which are AFAIK the only codecs to have this type of block skipping, are only able to
code deblocking level adjustments for skipped blocks. No residual is sent, no motion vector
either (hence the motion vector used is the likeliest candidate from the list of motion vectors).
If a block isn't skipped, a motion vector may or may not be coded, depending on what's in
the list of candidates for a block, so it would be inappropriate to apply the skip flag for anything
but AV1/VP9-style skipped blocks.
VP8 also supports all of that but no one cares about VP8 so its irrelevant.
Mark Thompson July 16, 2020, 12:09 a.m. UTC | #9
On 16/07/2020 00:35, Lynne wrote:
> Jul 16, 2020, 00:06 by sw@jkqxz.net:
> 
>> On 15/07/2020 18:43, Yongle Lin wrote:
>>
>>> add block type field to AVVideoBlockParams so we could either export or visualize it later.
>>> ---
>>>   libavutil/video_enc_params.h | 19 +++++++++++++++++++
>>>   1 file changed, 19 insertions(+)
>>>
>>> diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
>>> index 43fa443154..8bf5f240c9 100644
>>> --- a/libavutil/video_enc_params.h
>>> +++ b/libavutil/video_enc_params.h
>>> @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
>>>   AV_VIDEO_ENC_PARAMS_H264,
>>>   };
>>>   +enum AVVideoBlockFlags {
>>> +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */
>>>
>>
>> The ULL is only confusing matters here - standard-conforming enum values have type int.  Some compilers allow it to be a larger integer type, but I don't think we can rely on that extension.
>>
> 
> We already rely on this behavior internally in hwcontext_vulkan. Are you sure there's a compiler
> we support that doesn't allow this? If so we ought to fix that.
You don't actually need it in hwcontext_vulkan, since it's internal-only (no ABI issues) and everything would work on a conforming compiler which just converts the values to int.

IMO it's not a useful enough feature to add to the special list of non-C99 things allowed.  Here it seems highly unlikely that we will get anywhere near 32 separate flags.

>>> +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */
>>>
>>
>> Can you define more precisely what you mean by "not coded"?  Is that just that it has no residuals, or does it also indicate no motion vector, or no coded motion vector relative to prediction?
>>
> 
> In AV1 and VP9, which are AFAIK the only codecs to have this type of block skipping, are only able to
> code deblocking level adjustments for skipped blocks. No residual is sent, no motion vector
> either (hence the motion vector used is the likeliest candidate from the list of motion vectors).
> If a block isn't skipped, a motion vector may or may not be coded, depending on what's in
> the list of candidates for a block, so it would be inappropriate to apply the skip flag for anything
> but AV1/VP9-style skipped blocks.
> VP8 also supports all of that but no one cares about VP8 so its irrelevant.

There are block types P_Skip and B_Skip in H.264, which code nothing at all in the bitstream after the skip indication.

P_Skip uses the predicted motion vector against index 0 of RefPicList0.

B_Skip has a complex (but precisely-defined) procedure to try to choose a reference picture from each of the two RefPicLists and motion vectors in them, and will combine them as normal if it has both.

The first sounds exactly like what you're saying is in VP9/AV1.  The second is more complex, but should probably still count?  Dunno.

- Mark
Yongle Lin July 16, 2020, 5:48 p.m. UTC | #10
On Wed, Jul 15, 2020 at 4:37 PM Mark Thompson <sw@jkqxz.net> wrote:

> On 15/07/2020 22:05, Yongle Lin wrote:
> > ---
> >   libavcodec/h264dec.c | 14 ++++++++++++++
> >   1 file changed, 14 insertions(+)
> >
> > diff --git a/libavcodec/h264dec.c b/libavcodec/h264dec.c
> > index 1e2ca68449..b3de5290d0 100644
> > --- a/libavcodec/h264dec.c
> > +++ b/libavcodec/h264dec.c
> > @@ -816,6 +816,20 @@ static int h264_export_enc_params(AVFrame *f,
> H264Picture *p)
> >               b->h     = 16;
> >
> >               b->delta_qp = p->qscale_table[mb_xy] - par->qp;
> > +
> > +            int mb_type = p->mb_type[mb_xy];
> > +            if (IS_PCM(mb_type))
> > +                b->flags |= AV_VIDEO_ENC_BLOCK_INTRA;
>
> Can you explain your definition of what makes a block "intra"?  If you
> really do mean to only include PCM blocks then maybe it should have a
> different name, because that isn't what I would expect.
>

I think intra is the block that uses no reference frame to predict? The
purpose of exporting this data is to enable visualization of block type
which has been deprecated since version 58. And previously it only
supported MPEG video and I want to make it more general to other codecs as
well.
Probably I should include  IS_PCM, IS_INTRA
&& IS_ACPRED(mb_type), IS_INTRA16x16, IS_INTRA4x4 as "intra" type for H.264?


> > +            if (IS_SKIP(mb_type))
> > +                b->flags |= AV_VIDEO_ENC_BLOCK_SKIP; > +            if
> (!USES_LIST(mb_type, 1))
> > +                b->ref[0] = p->ref_index[0];
> > +            else if (!USES_LIST(mb_type, 0))
> > +                b->ref[0] = p->ref_index[1];
> > +            else {
> > +                b->ref[0] = p->ref_index[0];
> > +                b->ref[1] = p->ref_index[1];
>
> I don't think ref_index is anything to do with what you think it is.
>
> Also note that code of this form must be insufficient due to the lack
> dependence on both the slice and the block.  (Slice headers define the two
> reference picture lists, and blocks within a slice
> choose pictures from those lists.)
>

I am not good at H.264. I read the source code and because there isn't too
much comments I thought ref_index is the thing I want. I will try to
correct it to be the index of the L0, L1 picture.


> > +            }
> >           }
> >
> >       return 0;
> >
>
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


Best,
Yongle
Yongle Lin July 17, 2020, 9:25 p.m. UTC | #11
On Wed, Jul 15, 2020 at 4:13 PM Mark Thompson <sw@jkqxz.net> wrote:

> On 15/07/2020 18:43, Yongle Lin wrote:
> > add block type field to AVVideoBlockParams so we could either export or
> visualize it later.
> > ---
> >   libavutil/video_enc_params.h | 19 +++++++++++++++++++
> >   1 file changed, 19 insertions(+)
> >
> > diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
> > index 43fa443154..8bf5f240c9 100644
> > --- a/libavutil/video_enc_params.h
> > +++ b/libavutil/video_enc_params.h
> > @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
> >       AV_VIDEO_ENC_PARAMS_H264,
> >   };
> >
> > +enum AVVideoBlockFlags {
> > +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses
> intra prediction */
>
> The ULL is only confusing matters here - standard-conforming enum values
> have type int.  Some compilers allow it to be a larger integer type, but I
> don't think we can rely on that extension.
>

I am thinking of defining a bit field struct to for type flags like what we
did in other places:
struct AVVideoBlockFlags {
    unsigned int intra:1;
    unsigned int skip:1;
}

Or we could use the same way of mb_type defined in H.264 like
#define AV_VIDEO_ENC_BLOCK_INTRA 1ULL << 0
#define AV_VIDEO_ENC_BLOCK_SKIP 1ULL << 1

uint64_t b_type


> > +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not
> coded (skipped) */
>
> Can you define more precisely what you mean by "not coded"?  Is that just
> that it has no residuals, or does it also indicate no motion vector, or no
> coded motion vector relative to prediction?
>

 I want to make it more general which can be applied to more codec. So in
VP9, there is a skip flag to indicate if the block has residual
coefficients. As for H.264 you mentioned there are P_Skip and B_Skip, I
think both of them should be considered as skip.

>
> > +};
> > +
> >   /**
> >    * Video encoding parameters for a given frame. This struct is
> allocated along
> >    * with an optional array of per-block AVVideoBlockParams descriptors.
> > @@ -126,6 +131,20 @@ typedef struct AVVideoBlockParams {
> >        * corresponding per-frame value.
> >        */
> >       int32_t delta_qp;
> > +
> > +    /**
> > +     * Type flag of the block
> > +     * Each bit field indicates a type flag
> > +     */
> > +    enum AVVideoBlockFlags flags;
>
> Don't make this an enum, since you aren't using it as an enum - you're
> going to assign combinations of flags.  (Also, the size of the field may
> change as more flags are added.)
>
> > +
> > +    /**
> > +     * Reference frames used for prediction
> > +     * Each entry specifies the first/second/third/etc. reference frame
> the current frame uses.
> > +     * The value at each entry specifies the index inside the reference
> frame array for that current frame.
>
> You'll need to define more carefully what "the reference frame array"
> means.  I can guess that it's the ref_frame_idx[] number for VP9, but it's
> not at all obvious what it would mean for H.264.
>

Previously I planned to store if the block uses previous or future ref
frames for this field. I don't fully understand how H,264 stores the
reference frame. Perhaps we could change the definition of ref so that it
can be applied to both vp9 and h264.


> > +     * Any entry that is unused will be set to -1
> > +     */
> > +    int8_t ref[8];
> >   } AVVideoBlockParams;
> >
> >   /*
> >
>
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".


Best,
Yongle
Michael Niedermayer July 18, 2020, 7:01 a.m. UTC | #12
On Fri, Jul 17, 2020 at 02:25:24PM -0700, Yongle Lin wrote:
> On Wed, Jul 15, 2020 at 4:13 PM Mark Thompson <sw@jkqxz.net> wrote:
> 
> > On 15/07/2020 18:43, Yongle Lin wrote:
> > > add block type field to AVVideoBlockParams so we could either export or
> > visualize it later.
> > > ---
> > >   libavutil/video_enc_params.h | 19 +++++++++++++++++++
> > >   1 file changed, 19 insertions(+)
> > >
> > > diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
> > > index 43fa443154..8bf5f240c9 100644
> > > --- a/libavutil/video_enc_params.h
> > > +++ b/libavutil/video_enc_params.h
> > > @@ -57,6 +57,11 @@ enum AVVideoEncParamsType {
> > >       AV_VIDEO_ENC_PARAMS_H264,
> > >   };
> > >
> > > +enum AVVideoBlockFlags {
> > > +    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses
> > intra prediction */
> >
> > The ULL is only confusing matters here - standard-conforming enum values
> > have type int.  Some compilers allow it to be a larger integer type, but I
> > don't think we can rely on that extension.
> >
> 
> I am thinking of defining a bit field struct to for type flags like what we
> did in other places:
> struct AVVideoBlockFlags {
>     unsigned int intra:1;
>     unsigned int skip:1;
> }
> 
> Or we could use the same way of mb_type defined in H.264 like
> #define AV_VIDEO_ENC_BLOCK_INTRA 1ULL << 0
> #define AV_VIDEO_ENC_BLOCK_SKIP 1ULL << 1
> 
> uint64_t b_type
> 
> 
> > > +    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not
> > coded (skipped) */
> >
> > Can you define more precisely what you mean by "not coded"?  Is that just
> > that it has no residuals, or does it also indicate no motion vector, or no
> > coded motion vector relative to prediction?
> >
> 
>  I want to make it more general which can be applied to more codec. So in
> VP9, there is a skip flag to indicate if the block has residual
> coefficients. As for H.264 you mentioned there are P_Skip and B_Skip, I
> think both of them should be considered as skip.

If you want to make it general, maybe use multiple skip flags with clear meaning

SKIP_CODED_RESIDUAL  // No residual coded, the block may have a predicted residual
SKIP_CODED_MVD       // No motion vector difference coded, the block may have a non 0 motion vector
ZERO_RESIDUAL        // Block has no residual
ZERO_MV              // Block has a 0,0 motion vector

also in light of motion vectors, the intra AC prediction mode / direction
should probably be exported, its kind of the equivalent of what motion
vectors are in inter blocks but for intra

thx

[...]
diff mbox series

Patch

diff --git a/libavutil/video_enc_params.h b/libavutil/video_enc_params.h
index 43fa443154..52c0058f5b 100644
--- a/libavutil/video_enc_params.h
+++ b/libavutil/video_enc_params.h
@@ -57,6 +57,11 @@  enum AVVideoEncParamsType {
     AV_VIDEO_ENC_PARAMS_H264,
 };
 
+enum AVVideoBlockFlags {
+    AV_VIDEO_ENC_BLOCK_INTRA = 1ULL <<  0,  /* Indicates block uses intra prediction */
+    AV_VIDEO_ENC_BLOCK_SKIP = 1ULL <<  1,   /* Indicates block is not coded (skipped) */
+};
+
 /**
  * Video encoding parameters for a given frame. This struct is allocated along
  * with an optional array of per-block AVVideoBlockParams descriptors.
@@ -126,6 +131,17 @@  typedef struct AVVideoBlockParams {
      * corresponding per-frame value.
      */
     int32_t delta_qp;
+
+    /**
+     * Type flag of the block
+     * Each bit field indicates a type flag
+     */
+    enum AVVideoBlockFlags flags;
+
+    /**
+     * Reference frames used for prediction
+     */
+    uint8_t ref[8];
 } AVVideoBlockParams;
 
 /*