[FFmpeg-devel] Extract QP from h264 encoded videos

Message ID	20190805192007.45680-1-juandl@google.com
State	New
Headers	Return-Path: <ffmpeg-devel-bounces@ffmpeg.org> Date: Mon, 5 Aug 2019 12:20:08 -0700 In-Reply-To: <CAPsBPi2N=Xa=zQkzchbKCn4FWan1x5wwpgNY4nehZ4yMt8Y7UA@mail.gmail.com> Message-Id: <20190805192007.45680-1-juandl@google.com> Mime-Version: 1.0 References: <CAPsBPi2N=Xa=zQkzchbKCn4FWan1x5wwpgNY4nehZ4yMt8Y7UA@mail.gmail.com> From: "=?UTF-8?q?Juan=20De=20Le=C3=B3n?=" <juandl-at-google.com@ffmpeg.org> To: ffmpeg-devel@ffmpeg.org Subject: [FFmpeg-devel] [PATCH] Extract QP from h264 encoded videos Precedence: list Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Cc: =?UTF-8?q?Juan=20De=20Le=C3=B3n?= <juandl@google.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Commit Message

Juan De León Aug. 5, 2019, 7:20 p.m. UTC

Add the AVQuantizationParams data structure for extracting QP values and storing them as frame side data of type AV_FRAME_DATA_QUANTIZATION_PARAMS (an AVFrameSideDataType).
design doc: https://docs.google.com/document/d/1WClt3EqhjwdGXhEw386O0wfn3IBQ1Ib-_5emVM1gbnA/edit?usp=sharing

Signed-off-by: Juan De León <juandl@google.com>
---
 libavutil/Makefile              |   2 +
 libavutil/frame.h               |   6 ++
 libavutil/quantization_params.c |  83 ++++++++++++++++++++++++
 libavutil/quantization_params.h | 110 ++++++++++++++++++++++++++++++++
 4 files changed, 201 insertions(+)
 create mode 100644 libavutil/quantization_params.c
 create mode 100644 libavutil/quantization_params.h

Comments

Mark Thompson Aug. 6, 2019, 11:59 p.m. UTC | #1

On 05/08/2019 20:20, Juan De León wrote:
> AVQuantizationParams data structure for extracting qp and storing as AV_FRAME_DATA_QUANTIZATION_PARAMS AVFrameSideDataType
> design doc: https://docs.google.com/document/d/1WClt3EqhjwdGXhEw386O0wfn3IBQ1Ib-_5emVM1gbnA/edit?usp=sharing
> 
> Signed-off-by: Juan De León <juandl@google.com>
> ---
>  libavutil/Makefile              |   2 +
>  libavutil/frame.h               |   6 ++
>  libavutil/quantization_params.c |  83 ++++++++++++++++++++++++
>  libavutil/quantization_params.h | 110 ++++++++++++++++++++++++++++++++
>  4 files changed, 201 insertions(+)
>  create mode 100644 libavutil/quantization_params.c
>  create mode 100644 libavutil/quantization_params.h

This should be in libavcodec, not libavutil - it relates directly to codecs.  (Indeed, you've ended up having to define a special non-libavcodec enum of codecs below to make it work in libavutil at all.)

> diff --git a/libavutil/quantization_params.h b/libavutil/quantization_params.h
> new file mode 100644
> index 0000000000..1c1b474dca
> --- /dev/null
> +++ b/libavutil/quantization_params.h
> @@ -0,0 +1,110 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_QUANTIZATION_PARAMS_H
> +#define AVUTIL_QUANTIZATION_PARAMS_H
> +
> +/**
> + * Supported decoders for extraction and filter
> + */
> +enum AVExtractQPSupportedCodecs {
> +    AV_EXTRACT_QP_CODEC_ID_H264 = 0,
> +    AV_EXTRACT_QP_CODEC_ID_VP9,
> +    AV_EXTRACT_QP_CODEC_ID_AV1,
> +};
> +
> +/**
> + * Enums for different codecs to store qp in the type array
> + * Each enum must have an array of strings describing each field
> + * declared in libavutil/quantization_params.c
> + */
> +
> +enum AVQPArrIndexesH264 {  // varaible names in spec document

I don't think giving these enums a tag has any value?

> +    AV_QP_Y_H264 = 0,      // QPy
> +    AV_QP_U_H264,          // QPcb

There is no variable in the standard with this name, which will confuse attempts to search for its meaning.  I think you mean QPc for the Cb component, as described in section 8.5.8.

> +    AV_QP_V_H264,          // QPcr

Likewise here.

> +    AV_QP_ARR_SIZE_H264
> +};
> +
> +enum AVQPArrIndexesVP9 {   // variable names in spec document
> +    AV_QP_YAC_VP9 = 0,     // get_dc_quant[][base_q_idx]

This isn't right - get_dc_quant() is a function of one variable, not a two-dimensional array (confused with dc_qlookup[][] somehow?).

> +    AV_QP_YDC_VP9,         // get_dc_quant[][base_q_idx+delta_q_y_dc]
> +    AV_QP_UVDC_VP9,        // get_dc_quant[][base_q_idx+delta_q_uv_dc]
> +    AV_QP_UVAC_VP9,        // get_ac_quant[][base_q_idx+delta_q_uv_ac]
> +    AV_QP_INDEX_YAC_VP9,   // base_q_idx
> +    AV_QP_INDEX_YDC_VP9,   // base_q_idx+delta_q_y_dc
> +    AV_QP_INDEX_UVDC_VP9,  // base_q_idx+delta_q_uv_dc
> +    AV_QP_INDEX_UVAC_VP9,  // base_q_idx+delta_q_uv_ac

Why are you including higher-level frame values for VP9 and AV1, but not including similar ones for H.264?

> +    AV_QP_ARR_SIZE_VP9
> +};
> +
> +enum AVQPArrIndexesAV1 {  // variable names in spec document
> +    AV_QP_YAC_AV1 = 0,    // dc_q(base_q_idx)

What is the motivation for AV1 returning the exponential coefficient scaling values (range 4..29247) rather than the linear parameter (range 0..255) as you do for H.264?
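
For context on the linear-versus-exponential distinction, here is a minimal sketch (not part of the patch) of how the linear H.264 QP relates to an exponential quantiser step size: the step roughly doubles for every increase of 6 in QP. The helper name h264_qstep_sketch is hypothetical, the base table holds the commonly cited step sizes for QP 0..5, and the 0..51 range assumes 8-bit video.

```c
/* Hypothetical helper, not an FFmpeg API: maps a linear H.264 QP
 * (0..51 for 8-bit video) to the approximate quantiser step size. */
static double h264_qstep_sketch(int qp)
{
    /* Step sizes for QP 0..5; every further +6 in QP doubles the step. */
    static const double base[6] = { 0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125 };

    if (qp < 0 || qp > 51)
        return -1.0;                 /* out of range for this sketch */
    return base[qp % 6] * (double)(1 << (qp / 6));
}
```

A consumer that only receives the linear parameter can derive a scaling value like this itself, whereas the reverse mapping is less convenient; that asymmetry is one argument for exporting the linear index.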

> +    AV_QP_YDC_AV1,        // dc_q(base_q_idx+DeltaQYDc)
> +    AV_QP_UDC_AV1,        // dc_q(base_q_idx+DeltaQUDc)
> +    AV_QP_UAC_AV1,        // dc_q(base_q_idx+DeltaQUAc)
> +    AV_QP_VDC_AV1,        // dc_q(base_q_idx+DeltaQVDc)
> +    AV_QP_VAC_AV1,        // dc_q(base_q_idx+DeltaQVAc)
> +    AV_QP_INDEX_YAC_AV1,  // base_q_idx
> +    AV_QP_INDEX_YDC_AV1,  // base_q_idx+DeltaQYDc
> +    AV_QP_INDEX_UDC_AV1,  // base_q_idx+DeltaQUDc
> +    AV_QP_INDEX_UAC_AV1,  // base_q_idx+DeltaQUAc
> +    AV_QP_INDEX_VDC_AV1,  // base_q_idx+DeltaQVDc
> +    AV_QP_INDEX_VAC_AV1,  // base_q_idx+DeltaQVAc
> +    AV_QP_ARR_SIZE_AV1
> +};

The term "QP" is never used in VP9 or AV1.  Maybe these could have a less H.26[45]-oriented name?

> +
> +/**
> + * Update AV_QP_ARR_MAX_SIZE when a new enum is defined that
> + * exceeds the current max size.
> + */
> +
> +#define AV_QP_ARR_MAX_SIZE AV_QP_ARR_SIZE_AV1

Fixing this for all time for a particular codec which happens to need the most space when it is defined doesn't seem like a good idea.  E.g. you can't support JPEG with only this number (it would need all entries in up to four tables).

It might be better if the structure size wasn't fixed forever by the first version of the API/ABI.  Perhaps an approach something like that used for AVRegionOfInterest would work?
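
As an illustration of what such an approach could look like, here is a rough sketch loosely modelled on the self_size idea from AVRegionOfInterest. Every name below (BlockQuantSketch, block_quant_elem_size, block_quant_count, block_quant_at) is hypothetical and not an existing FFmpeg API; the point is only that each element records its own size, so the number of values per block is not frozen into the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element layout: self_size is the stride between elements,
 * so readers never need a compile-time array length. */
typedef struct BlockQuantSketch {
    uint32_t self_size;   /* bytes occupied by one element, values[] included */
    int left, top;        /* distance in pixels from the picture's left/top edge */
    int width, height;    /* block size in pixels */
    uint32_t nb_values;   /* number of entries in values[] */
    int32_t values[];     /* codec-dependent quantiser values */
} BlockQuantSketch;

/* Size of one element carrying nb_values quantiser values. */
static size_t block_quant_elem_size(uint32_t nb_values)
{
    return sizeof(BlockQuantSketch) + (size_t)nb_values * sizeof(int32_t);
}

/* Number of elements in a side-data buffer of the given size. */
static size_t block_quant_count(size_t side_data_size, const uint8_t *data)
{
    const BlockQuantSketch *first = (const BlockQuantSketch *)data;

    if (!data || side_data_size < sizeof(*first) || first->self_size == 0)
        return 0;
    return side_data_size / first->self_size;
}

/* Pointer to element i, stepping by the recorded stride. */
static const BlockQuantSketch *block_quant_at(const uint8_t *data, size_t i)
{
    const BlockQuantSketch *first = (const BlockQuantSketch *)data;

    return (const BlockQuantSketch *)(data + i * first->self_size);
}
```

With a layout like this the producer computes the allocation size at run time from nb_values, so no constant such as AV_QP_ARR_MAX_SIZE is needed.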

> +
> +/**
> + * Data structure for extracting Quantization Parameters, codec independent
> + */
> +typedef struct AVQuantizationParams {
> +    /**
> +     * x and y coordinates of the block in pixels
> +     */
> +    int x, y;

Don't call these x/y coordinates because it is not clear exactly what that means (what is the scale, where is the origin, which direction is positive, where in the block is being referred to, etc.).

Instead follow the same convention as other structures in FFmpeg and define them as the distance in pixels from the left or the top edge of the picture to the top-left corner of the block.

On 30/07/2019 03:19, Juan De León wrote:
> On Mon, Jul 29, 2019 at 12:48 PM Mark Thompson <sw@jkqxz.net> wrote: 
>>
>> How do these values interact with cropping?
> 
> I'm not sure I understand, could you elaborate?

For codecs which include cropping such as H.26[45], the decoder may directly apply cropping from the stream (controlled by AVCodecContext.apply_cropping), possibly modified by alignment (with AV_CODEC_FLAG_UNALIGNED), and then sets the AVFrame cropping fields to reflect the remainder.

For example, in H.264 try setting the frame_crop_left_offset/frame_crop_top_offset fields in a stream to large values (h264_metadata can do this for an existing stream).  What do your x/y values then refer to in the result?  They could be negative to indicate macroblocks which are off the edges of the cropped picture, or they might be relative to the uncropped picture in which case you would need additional information to reconstruct which blocks they refer to in the frame you actually have.
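
To make the second alternative concrete, here is a small sketch of the extra step a consumer would need if block positions were relative to the uncropped coded picture. It assumes the caller already knows the left/top crop amounts and the visible frame size (for instance from the AVFrame cropping fields or the stream headers); RectSketch and block_to_cropped are hypothetical names, not FFmpeg API.

```c
/* Hypothetical rectangle type: left/top are distances in pixels from the
 * left/top edge of the picture to the top-left corner of the block. */
typedef struct RectSketch {
    int left, top;
    int width, height;
} RectSketch;

/* Translate a block given in coded-picture coordinates into the cropped
 * frame and clip it to the visible area.  Returns 1 and fills *out if any
 * part of the block is visible, 0 if the block was cropped away entirely. */
static int block_to_cropped(const RectSketch *block,
                            int crop_left, int crop_top,
                            int visible_width, int visible_height,
                            RectSketch *out)
{
    int x0 = block->left - crop_left;
    int y0 = block->top  - crop_top;
    int x1 = x0 + block->width;
    int y1 = y0 + block->height;

    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > visible_width)  x1 = visible_width;
    if (y1 > visible_height) y1 = visible_height;

    if (x1 <= x0 || y1 <= y0)
        return 0;

    out->left   = x0;
    out->top    = y0;
    out->width  = x1 - x0;
    out->height = y1 - y0;
    return 1;
}
```

Whichever convention is chosen, the documentation of the structure would need to state it, since a consumer cannot tell the two apart from the values alone.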

> +    /**
> +     * width and height of the block in pixels
> +     * set to 0 for the last block in the array
> +     */
> +    int w, h;

Just call them width, height - there's no benefit to brevity here.

> +    /**
> +     * qp_type array indexed using the enum corresponding
> +     * to the codec extracting the QP
> +     * AV_QP_ARR_MAX_SIZE sould always be set to
> +     * the largest size of the supported codecs
> +     */
> +    int qp_type[AV_QP_ARR_MAX_SIZE];

Give this a less misleading name.  It's not an array of QP types, rather it's an array of quantiser values which is indexed by the codec-dependent type.

> +    /**
> +     * Stores an id corresponding to one of the supported codecs
> +     */
> +    enum AVExtractQPSupportedCodecs codec_id;

enum AVCodecID, with this in libavcodec.

> +} AVQuantizationParams;
> +
> +/**
> + * Get the string describing the qp type for the given codec
> + */
> +const char* av_get_qp_type_string(enum AVExtractQPSupportedCodecs codec_id, int index);

I'm not sure there is a good reason to embed this in the public API - what user is ever going to call this function?  Anyone using the enum values must already know exactly what each of them mean to do anything with them at all, so if they need string names they'll already have clearer ones than the cryptic short names you provide here.

I think it would probably be better to just include your string names in the showinfo filter (or some other) and not have it in the public API.

> +
> +#endif /* AVUTIL_QUANTIZATION_PARAMS_H */
> 

- Mark

Michael Niedermayer Aug. 7, 2019, 6:43 p.m. UTC | #2

On Wed, Aug 07, 2019 at 12:59:33AM +0100, Mark Thompson wrote:
> On 05/08/2019 20:20, Juan De León wrote:
[...]
> 
> > +    /**
> > +     * Stores an id corresponding to one of the supported codecs
> > +     */
> > +    enum AVExtractQPSupportedCodecs codec_id;
> 
> enum AVCodecID, with this in libavcodec.

This may have interesting corner cases,
like mpeg4, which can use h263-style or mpeg2-style quantization depending on a flag.

So for these we may want the stored type to differ from the actual
decoder codec_id.

And then the question of course arises: if a codec uses the same type
of quantization as another, should it use the same id as that other
or its own id? I think both have advantages and disadvantages.

Thanks

[...]

Juan De León Aug. 7, 2019, 10:46 p.m. UTC | #3

On Tue, Aug 6, 2019 at 5:07 PM Mark Thompson <sw@jkqxz.net> wrote:

> This should be in libavcodec, not libavutil - it relates directly to
> codecs.  (Indeed, you've ended up having to define a special non-libavcodec
> enum of codecs below to make it work in libavutil at all.)
>
If this belongs in avcodec I can move it there, but I don't see a similar
data structure in that library.
I believe declaring different IDs for supported codecs here is a better
approach.


> > +enum AVQPArrIndexesH264 {  // varaible names in spec document
>
> I don't think giving these enums a tag has any value?
>
They are not used in the code, but keeping them makes the purpose of each
enum clearer.


> > +    AV_QP_ARR_SIZE_H264
> > +};
> > +
> > +enum AVQPArrIndexesVP9 {   // variable names in spec document
> > +    AV_QP_YAC_VP9 = 0,     // get_dc_quant[][base_q_idx]
>
> This isn't right - get_dc_quant() is a function of one variable, not a
> two-dimensional array (confused with dc_qlookup[][] somehow?).
>
Thank you, I think I meant: ac_q(get_qindex()).

> Why are you including higher-level frame values for VP9 and AV1, but not
> including similar ones for H.264?
>
Again, I meant get_qindex(), it is supposed to represent the quant index
for the specified segment, not frame quant index.

> What is the motivation for AV1 returning the exponential coefficient
> scaling values (range 4..29247) rather than the linear parameter (range
> 0..255) as you do for H.264?

Exposing both the values was a requirement by my team.

> > +#define AV_QP_ARR_MAX_SIZE AV_QP_ARR_SIZE_AV1
>
> Fixing this for all time for a particular codec which happens to need the
> most space when it is defined doesn't seem like a good idea.  E.g. you
> can't support JPEG with only this number (it would need all entries in up
> to four tables).
>
> It might be better if the structure size wasn't fixed forever by the first
> version of the API/ABI.  Perhaps an approach something like that used for
> AVRegionOfInterest would work?
>
Each instance of AVQuantizationParams has an array of qp values/indexes
(qp_type[]) for which I need a constant to allocate memory.
The approach AVRegionOfInterest uses does not solve that problem.

> > +    /**
> > +     * x and y coordinates of the block in pixels
> > +     */
> > +    int x, y;
>
> Don't call these x/y coordinates because it is not clear exactly what that
> means (what is the scale, where is the origin, which direction is positive,
> where in the block is being referred to, etc.).
>
> Instead follow the same convention as other structures in FFmpeg and define
> them as the distance in pixels from the left or the top edge of the picture
> to the top-left corner of the block.
>
That's exactly their purpose, the distance in pixels from the top-left
corner of the frame, to the top-left corner of the block. I will make the
description clearer, thank you.


> On 30/07/2019 03:19, Juan De León wrote:
> > On Mon, Jul 29, 2019 at 12:48 PM Mark Thompson <sw@jkqxz.net> wrote:
> >>
> >> How do these values interact with cropping?
> >
> > I'm not sure I understand, could you elaborate?
>
> For codecs which include cropping such as H.26[45], the decoder may
> directly apply cropping from the stream (controlled by
> AVCodecContext.apply_cropping), possibly modified by alignment (with
> AV_CODEC_FLAG_UNALIGNED), and then sets the AVFrame cropping fields to
> reflect the remainder.
>
> For example, in H.264 try setting the
> frame_crop_left_offset/frame_crop_top_offset fields in a stream to large
> values (h264_metadata can do this for an existing stream).  What do your
> x/y values then refer to in the result?  They could be negative to indicate
> macroblocks which are off the edges of the cropped picture, or they might
> be relative to the uncropped picture in which case you would need
> additional information to reconstruct which blocks they refer to in the
> frame you actually have.
>
The coordinates of the blocks should correspond to the coded picture.
Quantization is still applied to cropped MBs outside of the frame, so those
should be considered for the logging and avg calculation, similar to an
analyzer.


> > +    /**
> > +     * Stores an id corresponding to one of the supported codecs
> > +     */
> > +    enum AVExtractQPSupportedCodecs codec_id;
>
> enum AVCodecID, with this in libavcodec.
>
Like Michael said, this could cause conflict when extracting QP. It might
be better to leave it as a separate ID.

> > +/**
> > + * Get the string describing the qp type for the given codec
> > + */
> > +const char* av_get_qp_type_string(enum AVExtractQPSupportedCodecs codec_id, int index);
>
> I'm not sure there is a good reason to embed this in the public API - what
> user is ever going to call this function?  Anyone using the enum values
> must already know exactly what each of them mean to do anything with them
> at all, so if they need string names they'll already have clearer ones than
> the cryptic short names you provide here.
>
> I think it would probably be better to just include your string names in
> the showinfo filter (or some other) and not have it in the public API.
>
I'm using this for logging purposes in a filter that calculates
min/max/avg. I believe it's better to leave them in the public API than
limit them only to the filter.

Lynne Aug. 7, 2019, 10:59 p.m. UTC | #4

Aug 7, 2019, 11:46 PM by juandl-at-google.com@ffmpeg.org:

> On Tue, Aug 6, 2019 at 5:07 PM Mark Thompson <sw@jkqxz.net> wrote:
>
>> This should be in libavcodec, not libavutil - it relates directly to
>> codecs.  (Indeed, you've ended up having to define a special non-libavcodec
>> enum of codecs below to make it work in libavutil at all.)
>>
> If this belongs in avcodec I can move it there, but I don't see a similar
> data structure in that library.
> I believe declaring different IDs for supported codecs here is a better
> approach.
>
>
>> > +enum AVQPArrIndexesH264 {  // varaible names in spec document
>>
>> I don't think giving these enums a tag has any value?
>>
> They are not used in the code, but keeping them makes the purpose of each
> enum clearer.
>
>
>> > +    AV_QP_ARR_SIZE_H264
>> > +};
>> > +
>> > +enum AVQPArrIndexesVP9 {   // variable names in spec document
>> > +    AV_QP_YAC_VP9 = 0,     // get_dc_quant[][base_q_idx]
>>
>> This isn't right - get_dc_quant() is a function of one variable, not a
>> two-dimensional array (confused with dc_qlookup[][] somehow?).
>>
> Thank you, I think I meant: ac_q(get_qindex()).
>
>> Why are you including higher-level frame values for VP9 and AV1, but not
>> including similar ones for H.264?
>>
> Again, I meant get_qindex(), it is supposed to represent the quant index
> for the specified segment, not frame quant index.
>
>> What is the motivation for AV1 returning the exponential coefficient
>> scaling values (range 4..29247) rather than the linear parameter (range
>> 0..255) as you do for H.264?
>>
> Exposing both the values was a requirement by my team.
>
>> > +#define AV_QP_ARR_MAX_SIZE AV_QP_ARR_SIZE_AV1
>>
>> Fixing this for all time for a particular codec which happens to need the
>> most space when it is defined doesn't seem like a good idea.  E.g. you
>> can't support JPEG with only this number (it would need all entries in up
>> to four tables).
>>
>> It might be better if the structure size wasn't fixed forever by the first
>> version of the API/ABI.  Perhaps an approach something like that used for
>> AVRegionOfInterest would work?
>>
> Each instance of AVQuantizationParams has an array of qp values/indexes
> (qp_type[]) for which I need a constant to allocate memory.
> The approach AVRegionOfInterest uses does not solve that problem.
>
>> > +    /**
>> > +     * x and y coordinates of the block in pixels
>> > +     */
>> > +    int x, y;
>>
>> Don't call these x/y coordinates because it is not clear exactly what that
>> means (what is the scale, where is the origin, which direction is positive,
>> where in the block is being referred to, etc.).
>>
>> Instead follow the same convention as other structures in FFmpeg and define
>> them as the distance in pixels from the left or the top edge of the picture
>> to the top-left corner of the block.
>>
> That's exactly their purpose, the distance in pixels from the top-left
> corner of the frame, to the top-left corner of the block. I will make the
> description clearer, thank you.
>
>
>> On 30/07/2019 03:19, Juan De León wrote:
>> > On Mon, Jul 29, 2019 at 12:48 PM Mark Thompson <sw@jkqxz.net> wrote:
>> >>
>> >> How do these values interact with cropping?
>> >
>> > I'm not sure I understand, could you elaborate?
>>
>> For codecs which include cropping such as H.26[45], the decoder may
>> directly apply cropping from the stream (controlled by
>> AVCodecContext.apply_cropping), possibly modified by alignment (with
>> AV_CODEC_FLAG_UNALIGNED), and then sets the AVFrame cropping fields to
>> reflect the remainder.
>>
>> For example, in H.264 try setting the
>> frame_crop_left_offset/frame_crop_top_offset fields in a stream to large
>> values (h264_metadata can do this for an existing stream).  What do your
>> x/y values then refer to in the result?  They could be negative to indicate
>> macroblocks which are off the edges of the cropped picture, or they might
>> be relative to the uncropped picture in which case you would need
>> additional information to reconstruct which blocks they refer to in the
>> frame you actually have.
>>
> The coordinates of the blocks should correspond to the coded picture,
> quantization is still applied to cropped MBs outside of the frame so that
> should be considered for the logging and avg calculation, similar to an
> analyzer.
>
>
>> > +    /**
>> > +     * Stores an id corresponding to one of the supported codecs
>> > +     */
>> > +    enum AVExtractQPSupportedCodecs codec_id;
>>
>> enum AVCodecID, with this in libavcodec.
>>
> Like Michael said, this could cause conflict when extracting QP. It might
> be better to leave it as a separate ID.
>

No one will start writing an mpeg2 encoder now, so I disagree. A codec ID is better.



>> > +/**
>> > + * Get the string describing the qp type for the given codec
>> > + */
>> > +const char* av_get_qp_type_string(enum AVExtractQPSupportedCodecs codec_id, int index);
>>
>> I'm not sure there is a good reason to embed this in the public API - what
>> user is ever going to call this function?  Anyone using the enum values
>> must already know exactly what each of them mean to do anything with them
>> at all, so if they need string names they'll already have clearer ones than
>> the cryptic short names you provide here.
>>
>> I think it would probably be better to just include your string names in
>> the showinfo filter (or some other) and not have it in the public API.
>>
> I'm using this for logging purposes in a filter that calculates
> min/max/avg. I believe it's better to leave them in the public API than
> limit them only to the filter.
>

I disagree, hardcoding it in the public API will make users do strcmp, which is worse.


Overall, I still disagree with this patch. I'd rather have an all-or-nothing API, not one exposing only quantization indices and a separate one for motion vectors.
I would agree to have it renamed to something more generic, like AVBlockData, which could only support quantization indices now but could later be extended to support motion vectors and other features without breaking the API.

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 8a7a44e4b5..be1a9c3a9c 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -60,6 +60,7 @@  HEADERS = adler32.h                                                     \
           pixdesc.h                                                     \
           pixelutils.h                                                  \
           pixfmt.h                                                      \
+          quantization_params.h                                         \
           random_seed.h                                                 \
           rc4.h                                                         \
           rational.h                                                    \
@@ -140,6 +141,7 @@  OBJS = adler32.o                                                        \
        parseutils.o                                                     \
        pixdesc.o                                                        \
        pixelutils.o                                                     \
+       quantization_params.o                                            \
        random_seed.o                                                    \
        rational.o                                                       \
        reverse.o                                                        \
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 5d3231e7bb..b64fd9c02c 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -179,6 +179,12 @@  enum AVFrameSideDataType {
      * array element is implied by AVFrameSideData.size / AVRegionOfInterest.self_size.
      */
     AV_FRAME_DATA_REGIONS_OF_INTEREST,
+    /**
+     * To extract quantization parameters from supported decoders.
+     * The data is stored as AVQuantizationParamsArray type, described in
+     * libavutil/quantization_params.h
+     */
+    AV_FRAME_DATA_QUANTIZATION_PARAMS,
 };
 
 enum AVActiveFormatDescription {
diff --git a/libavutil/quantization_params.c b/libavutil/quantization_params.c
new file mode 100644
index 0000000000..d0aff7b35a
--- /dev/null
+++ b/libavutil/quantization_params.c
@@ -0,0 +1,83 @@ 
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#include <stddef.h>
+
+#include "libavutil/quantization_params.h"
+
+/**
+ * Strings describing the corresponding qp_type for each of the enums
+ * listed in libavutil/quantization_params.h
+ * Used for logging.
+ */
+
+static const char* const QP_NAMES_H264[] = {          // enum AVQPArrIndexesH264:
+                                            "qp",     // AV_QP_Y_H264
+                                            "qpcb",   // AV_QP_U_H264
+                                            "qpcr"    // AV_QP_V_H264
+                                            };
+
+static const char* const QP_NAMES_VP9[] = {           // enum AVQPArrIndexesVP9:
+                                           "qyac",    // AV_QP_YAC_VP9
+                                           "qydc",    // AV_QP_YDC_VP9
+                                           "quvdc",   // AV_QP_UVDC_VP9
+                                           "quvac",   // AV_QP_UVAC_VP9
+                                           "qiyac",   // AV_QP_INDEX_YAC_VP9
+                                           "qiydc",   // AV_QP_INDEX_YDC_VP9
+                                           "qiuvdc",  // AV_QP_INDEX_UVDC_VP9
+                                           "qiuvac"   // AV_QP_INDEX_UVAC_VP9
+                                           };
+
+static const char* const QP_NAMES_AV1[] = {          // enum AVQPArrIndexesAV1:
+                                           "qyac",   // AV_QP_YAC_AV1
+                                           "qydc",   // AV_QP_YDC_AV1
+                                           "qudc",   // AV_QP_UDC_AV1
+                                           "quac",   // AV_QP_UAC_AV1
+                                           "qvdc",   // AV_QP_VDC_AV1
+                                           "qvac",   // AV_QP_VAC_AV1
+                                           "qiyac",  // AV_QP_INDEX_YAC_AV1
+                                           "qiydc",  // AV_QP_INDEX_YDC_AV1
+                                           "qiudc",  // AV_QP_INDEX_UDC_AV1
+                                           "qiuac",  // AV_QP_INDEX_UAC_AV1
+                                           "qivdc",  // AV_QP_INDEX_VDC_AV1
+                                           "qivac"   // AV_QP_INDEX_VAC_AV1
+                                           };
+
+/**
+ * Returns a pointer to the char string corresponding to the qp_type of the given parameters.
+ * Returns NULL for {@code index} values out of range or invalid {@code codec_id}.
+ * @param codec_id corresponds to one of the supported codecs described in
+ * libavutil/quantization_params.h
+ * @param index the enum corresponding to the qp_type to index the string array
+ */
+
+const char* av_get_qp_type_string(enum AVExtractQPSupportedCodecs codec_id, int index)
+{
+    if (index < 0) {
+        return NULL;
+    }
+    switch (codec_id) {
+        case AV_EXTRACT_QP_CODEC_ID_H264:
+            return index < AV_QP_ARR_SIZE_H264 ? QP_NAMES_H264[index] :NULL;
+        case AV_EXTRACT_QP_CODEC_ID_VP9:
+            return index < AV_QP_ARR_SIZE_VP9  ? QP_NAMES_VP9[index]  :NULL;
+        case AV_EXTRACT_QP_CODEC_ID_AV1:
+            return index < AV_QP_ARR_SIZE_AV1  ? QP_NAMES_AV1[index]  :NULL;
+        default:
+            return NULL;
+    }
+}
diff --git a/libavutil/quantization_params.h b/libavutil/quantization_params.h
new file mode 100644
index 0000000000..1c1b474dca
--- /dev/null
+++ b/libavutil/quantization_params.h
@@ -0,0 +1,110 @@ 
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_QUANTIZATION_PARAMS_H
+#define AVUTIL_QUANTIZATION_PARAMS_H
+
+/**
+ * Supported decoders for extraction and filter
+ */
+enum AVExtractQPSupportedCodecs {
+    AV_EXTRACT_QP_CODEC_ID_H264 = 0,
+    AV_EXTRACT_QP_CODEC_ID_VP9,
+    AV_EXTRACT_QP_CODEC_ID_AV1,
+};
+
+/**
+ * Enums for different codecs to store qp in the type array
+ * Each enum must have an array of strings describing each field
+ * declared in libavutil/quantization_params.c
+ */
+
+enum AVQPArrIndexesH264 {  // variable names in spec document
+    AV_QP_Y_H264 = 0,      // QPy
+    AV_QP_U_H264,          // QPcb
+    AV_QP_V_H264,          // QPcr
+    AV_QP_ARR_SIZE_H264
+};
+
+enum AVQPArrIndexesVP9 {   // variable names in spec document
+    AV_QP_YAC_VP9 = 0,     // get_dc_quant[][base_q_idx]
+    AV_QP_YDC_VP9,         // get_dc_quant[][base_q_idx+delta_q_y_dc]
+    AV_QP_UVDC_VP9,        // get_dc_quant[][base_q_idx+delta_q_uv_dc]
+    AV_QP_UVAC_VP9,        // get_ac_quant[][base_q_idx+delta_q_uv_ac]
+    AV_QP_INDEX_YAC_VP9,   // base_q_idx
+    AV_QP_INDEX_YDC_VP9,   // base_q_idx+delta_q_y_dc
+    AV_QP_INDEX_UVDC_VP9,  // base_q_idx+delta_q_uv_dc
+    AV_QP_INDEX_UVAC_VP9,  // base_q_idx+delta_q_uv_ac
+    AV_QP_ARR_SIZE_VP9
+};
+
+enum AVQPArrIndexesAV1 {  // variable names in spec document
+    AV_QP_YAC_AV1 = 0,    // dc_q(base_q_idx)
+    AV_QP_YDC_AV1,        // dc_q(base_q_idx+DeltaQYDc)
+    AV_QP_UDC_AV1,        // dc_q(base_q_idx+DeltaQUDc)
+    AV_QP_UAC_AV1,        // dc_q(base_q_idx+DeltaQUAc)
+    AV_QP_VDC_AV1,        // dc_q(base_q_idx+DeltaQVDc)
+    AV_QP_VAC_AV1,        // dc_q(base_q_idx+DeltaQVAc)
+    AV_QP_INDEX_YAC_AV1,  // base_q_idx
+    AV_QP_INDEX_YDC_AV1,  // base_q_idx+DeltaQYDc
+    AV_QP_INDEX_UDC_AV1,  // base_q_idx+DeltaQUDc
+    AV_QP_INDEX_UAC_AV1,  // base_q_idx+DeltaQUAc
+    AV_QP_INDEX_VDC_AV1,  // base_q_idx+DeltaQVDc
+    AV_QP_INDEX_VAC_AV1,  // base_q_idx+DeltaQVAc
+    AV_QP_ARR_SIZE_AV1
+};
+
+/**
+ * Update AV_QP_ARR_MAX_SIZE when a new enum is defined that
+ * exceeds the current max size.
+ */
+
+#define AV_QP_ARR_MAX_SIZE AV_QP_ARR_SIZE_AV1
+
+/**
+ * Data structure for extracting Quantization Parameters, codec independent
+ */
+typedef struct AVQuantizationParams {
+    /**
+     * x and y coordinates of the block in pixels
+     */
+    int x, y;
+    /**
+     * width and height of the block in pixels
+     * set to 0 for the last block in the array
+     */
+    int w, h;
+    /**
+     * qp_type array indexed using the enum corresponding
+     * to the codec extracting the QP
+     * AV_QP_ARR_MAX_SIZE should always be set to
+     * the largest size of the supported codecs
+     */
+    int qp_type[AV_QP_ARR_MAX_SIZE];
+    /**
+     * Stores an id corresponding to one of the supported codecs
+     */
+    enum AVExtractQPSupportedCodecs codec_id;
+} AVQuantizationParams;
+
+/**
+ * Get the string describing the qp type for the given codec
+ */
+const char* av_get_qp_type_string(enum AVExtractQPSupportedCodecs codec_id, int index);
+
+#endif /* AVUTIL_QUANTIZATION_PARAMS_H */
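
The thread mentions a filter that logs min/max/avg over the extracted values. The following is a minimal sketch of how a consumer might walk the array described in the header above. It is not code from the patch: QuantParamsSketch is a local stand-in mirroring the proposed AVQuantizationParams layout, qp_stats_sketch is a hypothetical helper, and it assumes that "set to 0 for the last block in the array" means a terminating entry with w == 0 and h == 0 that carries no data.

```c
#include <limits.h>
#include <stddef.h>

#define QP_ARR_MAX_SIZE_SKETCH 12   /* value of AV_QP_ARR_SIZE_AV1 in the patch */

/* Local stand-in for the proposed AVQuantizationParams. */
typedef struct QuantParamsSketch {
    int x, y;
    int w, h;
    int qp_type[QP_ARR_MAX_SIZE_SKETCH];
    int codec_id;
} QuantParamsSketch;

typedef struct QPStatsSketch {
    int min, max;
    double avg;
    int count;    /* number of blocks visited; 0 means no data */
} QPStatsSketch;

/* Compute min/max/avg of one qp_type[] entry over all blocks, stopping at
 * the assumed terminator entry (w == 0 and h == 0). */
static QPStatsSketch qp_stats_sketch(const QuantParamsSketch *blocks, int index)
{
    QPStatsSketch s = { INT_MAX, INT_MIN, 0.0, 0 };
    long long sum = 0;

    if (!blocks || index < 0 || index >= QP_ARR_MAX_SIZE_SKETCH)
        return s;

    for (const QuantParamsSketch *b = blocks; b->w != 0 || b->h != 0; b++) {
        int v = b->qp_type[index];
        if (v < s.min) s.min = v;
        if (v > s.max) s.max = v;
        sum += v;
        s.count++;
    }
    if (s.count > 0)
        s.avg = (double)sum / s.count;
    return s;
}
```

Note that a sentinel-terminated array gives the consumer no way to validate the buffer length, which is one practical reason the review suggests deriving the element count from the side data size instead.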
