[FFmpeg-devel] libavc/libx264: add support to propagate SSE values through encoder stats

Message ID	235b296cd67fed2c79862d4403dbe268d2f7e25b.camel@amazon.it
State	New
Headers	show Delivered-To: ffmpegpatchwork2@gmail.com Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; To: "ffmpeg-devel@ffmpeg.org" <ffmpeg-devel@ffmpeg.org> Thread-Topic: libavc/libx264: add support to propagate SSE values through encoder stats Thread-Index: AQHZ7gVWeZgr9yv0QEqkANBqjQmzOw== Date: Sat, 23 Sep 2023 10:04:28 +0000 Message-ID: <235b296cd67fed2c79862d4403dbe268d2f7e25b.camel@amazon.it> Accept-Language: en-US Content-Language: en-US Content-Type: multipart/mixed; boundary="_002_235b296cd67fed2c79862d4403dbe268d2f7e25bcamelamazonit_" MIME-Version: 1.0 Subject: [FFmpeg-devel] libavc/libx264: add support to propagate SSE values through encoder stats Precedence: list From: "Carotti, Elias via ffmpeg-devel" <ffmpeg-devel@ffmpeg.org> Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Cc: "Carotti, Elias" <eliascrt@amazon.it> Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Series	[FFmpeg-devel] libavc/libx264: add support to propagate SSE values through encoder stats \| expand [FFmpeg-devel] libavc/libx264: add support to propagate SSE values through encoder stats

Context	Check	Description
andriy/commit_msg_x86	warning	The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
andriy/make_x86	success	Make finished
andriy/make_fate_x86	success	Make fate finished

Diego Felix de Souza via ffmpeg-devel Sept. 23, 2023, 10:04 a.m. UTC

Hi,
please find attached a patch to propagate the SSE for a frame into the
encoder stats.
Since libx264 already provides PSNR values, this is done by basically
inverting the formula to recover the SSE values.

Would it be possible to also append other values to the errors vector?
E.g., libx264 also computes SSIM but other values could be provided.

Best,
Elias



NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico

Diego Felix de Souza via ffmpeg-devel Oct. 2, 2023, 5:35 p.m. UTC | #1

> Hi,
> please find attached a patch to propagate the SSE for a frame into
> the
> encoder stats.
> Since libx264 already provides PSNR values, this is done by basically
> inverting the formula to recover the SSE values.
> 
> Would it be possible to also append other values to the errors
> vector?
> E.g., libx264 also computes SSIM but other values could be provided.
> 
> Best,
> Elias

Hi,
very minor fix to a minor patch.
I noticed I wasn't rounding values properly.

Could someone have a look, please?
Elias



NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico

Anton Khirnov Oct. 10, 2023, 10:54 a.m. UTC | #2

Quoting Carotti, Elias via ffmpeg-devel (2023-10-02 19:35:09)
> diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
> index 77a9f173b4..4c643c9066 100644
> --- a/libavcodec/libx264.c
> +++ b/libavcodec/libx264.c
> @@ -129,6 +129,8 @@ typedef struct X264Context {
>      int roi_warned;
> 
>      int mb_info;
> +
> +    int64_t sse[3];

The values don't need to be preserved across frames, so might as well
put this on stack in the block calling ff_side_data_set_encoder_stats().

>  } X264Context;
> 
>  static void X264_log(void *p, int level, const char *fmt, va_list args)
> @@ -726,7 +728,40 @@ FF_ENABLE_DEPRECATION_WARNINGS
> 
>      pkt->flags |= AV_PKT_FLAG_KEY*pic_out.b_keyframe;
>      if (ret) {
> -        ff_side_data_set_encoder_stats(pkt, (pic_out.i_qpplus1 - 1) * FF_QP2LAMBDA, NULL, 0, pict_type);
> +        const AVPixFmtDescriptor *pix_desc = av_pix_fmt_desc_get(csp_to_pixfmt(pic_out.img.i_csp));
> +        int error_count = 0;
> +        int64_t *errors = NULL;
> +
> +        if (ctx->flags & AV_CODEC_FLAG_PSNR) {
> +            double scale[3] = { 1,
> +                (1 << pix_desc->log2_chroma_h) * (double)(1 << pix_desc->log2_chroma_w),
> +                (1 << pix_desc->log2_chroma_h) * (double)(1 << pix_desc->log2_chroma_w),

Any particular reason the cast is on the second value? It looks strange.

> +            };
> +            double mse;
> +            int i;
> +
> +            error_count = pix_desc->nb_components;
> +
> +            av_log(ctx, AV_LOG_DEBUG, "PSNR values from libx264: %.3f %.3f %.3f.\n",
> +                   pic_out.prop.f_psnr[0], pic_out.prop.f_psnr[1], pic_out.prop.f_psnr[2]);
> +
> +            for (i = 0; i < pix_desc->nb_components; ++i) {

for (int i ....

Diego Felix de Souza via ffmpeg-devel Oct. 11, 2023, 10:54 a.m. UTC | #3

Hi Anton, 

On Tue, 2023-10-10 at 12:54 +0200, Anton Khirnov wrote:
> 
> Quoting Carotti, Elias via ffmpeg-devel (2023-10-02 19:35:09)
> > diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
> > index 77a9f173b4..4c643c9066 100644
> > --- a/libavcodec/libx264.c
> > +++ b/libavcodec/libx264.c
> > @@ -129,6 +129,8 @@ typedef struct X264Context {
> >      int roi_warned;
> > 
> >      int mb_info;
> > +
> > +    int64_t sse[3];
> 
> The values don't need to be preserved across frames, so might as well
> put this on stack in the block calling
> ff_side_data_set_encoder_stats().

Agreed.

> 
> >  } X264Context;
> > 
> >  static void X264_log(void *p, int level, const char *fmt, va_list
> > args)
> > @@ -726,7 +728,40 @@ FF_ENABLE_DEPRECATION_WARNINGS
> > 
> >      pkt->flags |= AV_PKT_FLAG_KEY*pic_out.b_keyframe;
> >      if (ret) {
> > -        ff_side_data_set_encoder_stats(pkt, (pic_out.i_qpplus1 -
> > 1) * FF_QP2LAMBDA, NULL, 0, pict_type);
> > +        const AVPixFmtDescriptor *pix_desc =
> > av_pix_fmt_desc_get(csp_to_pixfmt(pic_out.img.i_csp));
> > +        int error_count = 0;
> > +        int64_t *errors = NULL;
> > +
> > +        if (ctx->flags & AV_CODEC_FLAG_PSNR) {
> > +            double scale[3] = { 1,
> > +                (1 << pix_desc->log2_chroma_h) * (double)(1 <<
> > pix_desc->log2_chroma_w),
> > +                (1 << pix_desc->log2_chroma_h) * (double)(1 <<
> > pix_desc->log2_chroma_w),
> 
> Any particular reason the cast is on the second value? It looks
> strange.
> 

Just my habit. Fixed.

> > +            };
> > +            double mse;
> > +            int i;
> > +
> > +            error_count = pix_desc->nb_components;
> > +
> > +            av_log(ctx, AV_LOG_DEBUG, "PSNR values from libx264:
> > %.3f %.3f %.3f.\n",
> > +                   pic_out.prop.f_psnr[0], pic_out.prop.f_psnr[1],
> > pic_out.prop.f_psnr[2]);
> > +
> > +            for (i = 0; i < pix_desc->nb_components; ++i) {
> 
> for (int i ....

Agreed.

I also found the - (minus) sign in the mse formula was wrong and I
removed it.
Numbers seem to be coherent with those from libx264.
Please find attached a new patch rebased against the latest master with
the above fixes.

There is an increasing error (over increasing PSNRs and resolutions)
when reconstructing the PSNR from the SSE as computed above due to the
approximations and the roundings back and forth, however it seems to
yield similar values as those computed by libx264.

Best





NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico

Anton Khirnov Oct. 13, 2023, 2:16 p.m. UTC | #4

Quoting Carotti, Elias via ffmpeg-devel (2023-10-11 12:54:21)
> diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
> index 77a9f173b4..85bd870f5d 100644
> --- a/libavcodec/libx264.c
> +++ b/libavcodec/libx264.c
> @@ -726,7 +726,39 @@ FF_ENABLE_DEPRECATION_WARNINGS
> 
>      pkt->flags |= AV_PKT_FLAG_KEY*pic_out.b_keyframe;
>      if (ret) {
> -        ff_side_data_set_encoder_stats(pkt, (pic_out.i_qpplus1 - 1) * FF_QP2LAMBDA, NULL, 0, pict_type);
> +        const AVPixFmtDescriptor *pix_desc = av_pix_fmt_desc_get(csp_to_pixfmt(pic_out.img.i_csp));

There's a problem here - we do not handle all values of i_csp.
E.g. we have no equivalent of X264_CSP_NV12 | X264_CSP_HIGH_DEPTH, which
x264 will use for YUV420P10 input.

The best solution is probably to use AVCodecContext.pix_fmt and assume
that x264 doesn't do any nontrivial (i.e. other than interleaving and
such) pixel format transformations internally.

> +        int error_count = 0;
> +        int64_t *errors = NULL;
> +        int64_t sse[3] = {0};
> +
> +        if (ctx->flags & AV_CODEC_FLAG_PSNR) {
> +            double scale[3] = { 1,
> +                (double)(1 << pix_desc->log2_chroma_h) * (1 << pix_desc->log2_chroma_w),
> +                (double)(1 << pix_desc->log2_chroma_h) * (1 << pix_desc->log2_chroma_w),
> +            };
> +
> +            error_count = pix_desc->nb_components;
> +
> +            av_log(ctx, AV_LOG_DEBUG, "PSNR values from libx264: %.3f %.3f %.3f.\n",
> +                   pic_out.prop.f_psnr[0], pic_out.prop.f_psnr[1], pic_out.prop.f_psnr[2]);

In my tests libx264 prints these values by itself, so this seems
redundant.

Diego Felix de Souza via ffmpeg-devel Oct. 13, 2023, 4:35 p.m. UTC | #5

On Fri, 2023-10-13 at 16:16 +0200, Anton Khirnov wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
> 
> Quoting Carotti, Elias via ffmpeg-devel (2023-10-11 12:54:21)
> > diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
> > index 77a9f173b4..85bd870f5d 100644
> > --- a/libavcodec/libx264.c
> > +++ b/libavcodec/libx264.c
> > @@ -726,7 +726,39 @@ FF_ENABLE_DEPRECATION_WARNINGS
> > 
> >      pkt->flags |= AV_PKT_FLAG_KEY*pic_out.b_keyframe;
> >      if (ret) {
> > -        ff_side_data_set_encoder_stats(pkt, (pic_out.i_qpplus1 -
> > 1) * FF_QP2LAMBDA, NULL, 0, pict_type);
> > +        const AVPixFmtDescriptor *pix_desc =
> > av_pix_fmt_desc_get(csp_to_pixfmt(pic_out.img.i_csp));
> 
> There's a problem here - we do not handle all values of i_csp.
> E.g. we have no equivalent of X264_CSP_NV12 | X264_CSP_HIGH_DEPTH,
> which
> x264 will use for YUV420P10 input.
> 
> The best solution is probably to use AVCodecContext.pix_fmt and
> assume
> that x264 doesn't do any nontrivial (i.e. other than interleaving and
> such) pixel format transformations internally.
> 


I see. Wouldn't not outputting the SSE values when csp_to_pixfmt()
returns AV_PIX_FMT_NONE work better? 
That wouldn't be worse than it is today (meaning that right now we
don't get those values for any pix_fmt).
Anyway, I did as you suggested and used AVCodecContext.pix_fmt.


> > <snip>
> > +            av_log(ctx, AV_LOG_DEBUG, "PSNR values from libx264:
> > %.3f %.3f %.3f.\n",
> > +                   pic_out.prop.f_psnr[0], pic_out.prop.f_psnr[1],
> > pic_out.prop.f_psnr[2]);
> 
> In my tests libx264 prints these values by itself, so this seems
> redundant.

removed.




NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico

Anton Khirnov Oct. 19, 2023, 11:48 a.m. UTC | #6

Quoting Carotti, Elias via ffmpeg-devel (2023-09-23 12:04:28)
> Hi,
> please find attached a patch to propagate the SSE for a frame into the
> encoder stats.
> Since libx264 already provides PSNR values, this is done by basically
> inverting the formula to recover the SSE values.

Thanks, pushed.

> Would it be possible to also append other values to the errors vector?
> E.g., libx264 also computes SSIM but other values could be provided.

It is certainly conceivable, but it's a nontrivial question whether it
should be added to this side data or to a new one. Ideally the same type
of data should be produced both by this and the various
metrics-computing filters.

Anton Khirnov Oct. 19, 2023, 11:50 a.m. UTC | #7

Quoting Carotti, Elias via ffmpeg-devel (2023-10-13 18:35:14)
> I see. Wouldn't not outputting the SSE values when csp_to_pixfmt()
> returns AV_PIX_FMT_NONE work better? 
> That wouldn't be worse than it is today (meaning that right now we
> don't get those values for any pix_fmt).

It would work in strictly fewer cases than this patch, so I don't see
why you'd want to go that way.

[FFmpeg-devel] libavc/libx264: add support to propagate SSE values through encoder stats

Checks

Commit Message

Comments

Patch