diff mbox series

[FFmpeg-devel,4/5] avcodec/on2avc: Avoid indirection when calling float dsp function

Message ID 20201019110706.857212-1-andreas.rheinhardt@gmail.com
State Withdrawn
Headers show
Series [FFmpeg-devel,1/3] avcodec/on2avcdata: Deduplicate symbol tables
Related show

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished

Commit Message

Andreas Rheinhardt Oct. 19, 2020, 11:07 a.m. UTC
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
---
 libavcodec/on2avc.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

Comments

Anton Khirnov Oct. 19, 2020, 1:35 p.m. UTC | #1
Quoting Andreas Rheinhardt (2020-10-19 13:07:05)
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavcodec/on2avc.c | 35 ++++++++++++++++++-----------------
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 

I don't quite see the point of this. I cannot imagine it is measurably
faster, the memory savings should also be negligible.
And strictly speaking, it is API abuse as the functions are tied to the
context. It is conceivable that they could be compiled at runtime and
freeing the context would invalidate them.
Andreas Rheinhardt Oct. 21, 2020, 5:37 a.m. UTC | #2
Anton Khirnov:
> Quoting Andreas Rheinhardt (2020-10-19 13:07:05)
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>>  libavcodec/on2avc.c | 35 ++++++++++++++++++-----------------
>>  1 file changed, 18 insertions(+), 17 deletions(-)
>>
> 
> I don't quite see the point of this. I cannot imagine it is measurably
> faster, the memory savings should also be negligible.
> And strictly speaking, it is API abuse as the functions are tied to the
> context. It is conceivable that they could be compiled at runtime and
> freeing the context would invalidate them.
> 
Seems like we disagree here: I don't think that something needs to be
measurably faster for it to be done; I don't think that small memory
savings are negligible. And I absolutely don't think it to be API abuse:
The API does not say that these functions are tied to the context. And
I'd really be interested to know how freeing the context could somehow
invalidate the functions -- after all, the AVFloatDSPContext doesn't
have a custom free function.

- Andreas
Andreas Rheinhardt Oct. 21, 2020, 5:38 a.m. UTC | #3
Andreas Rheinhardt:
> Anton Khirnov:
>> Quoting Andreas Rheinhardt (2020-10-19 13:07:05)
>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>>> ---
>>>  libavcodec/on2avc.c | 35 ++++++++++++++++++-----------------
>>>  1 file changed, 18 insertions(+), 17 deletions(-)
>>>
>>
>> I don't quite see the point of this. I cannot imagine it is measurably
>> faster, the memory savings should also be negligible.
>> And strictly speaking, it is API abuse as the functions are tied to the
>> context. It is conceivable that they could be compiled at runtime and
>> freeing the context would invalidate them.
>>
> Seems like we disagree here: I don't think that something needs to be
> measurably faster for it to be done; I don't think that small memory
> savings are negligible. And I absolutely don't think it to be API abuse:
> The API does not say that these functions are tied to the context. And
> I'd really be interested to know how freeing the context could somehow
> invalidate the functions -- after all, the AVFloatDSPContext doesn't
> have a custom free function.
> 
> - Andreas
> 
I forgot to add: Despite me disagreeing with you, I'll drop this patch.

- Andreas
Anton Khirnov Oct. 21, 2020, 9:04 a.m. UTC | #4
Quoting Andreas Rheinhardt (2020-10-21 07:37:03)
> Anton Khirnov:
> > Quoting Andreas Rheinhardt (2020-10-19 13:07:05)
> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> >> ---
> >>  libavcodec/on2avc.c | 35 ++++++++++++++++++-----------------
> >>  1 file changed, 18 insertions(+), 17 deletions(-)
> >>
> > 
> > I don't quite see the point of this. I cannot imagine it is measurably
> > faster, the memory savings should also be negligible.
> > And strictly speaking, it is API abuse as the functions are tied to the
> > context. It is conceivable that they could be compiled at runtime and
> > freeing the context would invalidate them.
> > 
> Seems like we disagree here: I don't think that something needs to be
> measurably faster for it to be done; I don't think that small memory
> savings are negligible.

Yeah, I guess we disagree. In my view, the "baseline" value (to the
project) of any change is negative, because it necessarily adds
- noise in git history
- noise on the ML
- a risk of introducing a new bug
IMO to counterbalance these, every change should overcome some minimum
threshold of usefulness in order to rise above the noise.

Saving something like 80 bytes per decoder instance is never ever going
to have a practical impact on anything. Same for potentially saving a
few cycles for the extra pointer dereference. Especially since this is a
rather obscure decoder few people use. So in my view the change is just
not useful enough to be worth it. But then again that's just my opinion,
so feel free to contest it if you care enough.

> And I absolutely don't think it to be API abuse:
> The API does not say that these functions are tied to the context.

It doesn't, but the usual convention is that objects are owned by their
parent unless specifically documented otherwise.

> And I'd really be interested to know how freeing the context could
> somehow invalidate the functions -- after all, the AVFloatDSPContext
> doesn't have a custom free function.

Right, I missed that fact.
diff mbox series

Patch

diff --git a/libavcodec/on2avc.c b/libavcodec/on2avc.c
index 625e733ca3..3b566e1e4b 100644
--- a/libavcodec/on2avc.c
+++ b/libavcodec/on2avc.c
@@ -46,7 +46,8 @@  enum WindowTypes {
 
 typedef struct On2AVCContext {
     AVCodecContext *avctx;
-    AVFloatDSPContext *fdsp;
+    void (*vector_fmul_window)(float *dst, const float *src0,
+                               const float *src1, const float *win, int len);
     FFTContext mdct, mdct_half, mdct_small;
     FFTContext fft128, fft256, fft512, fft1024;
     void (*wtf)(struct On2AVCContext *ctx, float *out, float *in, int size);
@@ -720,7 +721,7 @@  static int on2avc_reconstruct_channel_ext(On2AVCContext *c, AVFrame *dst, int of
         }
 
         memcpy(out, saved, 448 * sizeof(float));
-        c->fdsp->vector_fmul_window(wout, saved + 448, buf, c->short_win, 64);
+        c->vector_fmul_window(wout, saved + 448, buf, c->short_win, 64);
         memcpy(wout + 128,  buf + 64,         448 * sizeof(float));
         memcpy(saved,       buf + 512,        448 * sizeof(float));
         memcpy(saved + 448, buf + 7*128 + 64,  64 * sizeof(float));
@@ -756,20 +757,20 @@  static int on2avc_reconstruct_channel(On2AVCContext *c, int channel,
          c->prev_window_type == WINDOW_TYPE_LONG_STOP) &&
         (c->window_type == WINDOW_TYPE_LONG ||
          c->window_type == WINDOW_TYPE_LONG_START)) {
-        c->fdsp->vector_fmul_window(out, saved, buf, c->long_win, 512);
+        c->vector_fmul_window(out, saved, buf, c->long_win, 512);
     } else {
         float *wout = out + 448;
         memcpy(out, saved, 448 * sizeof(float));
 
         if (c->window_type == WINDOW_TYPE_8SHORT) {
-            c->fdsp->vector_fmul_window(wout + 0*128, saved + 448,      buf + 0*128, c->short_win, 64);
-            c->fdsp->vector_fmul_window(wout + 1*128, buf + 0*128 + 64, buf + 1*128, c->short_win, 64);
-            c->fdsp->vector_fmul_window(wout + 2*128, buf + 1*128 + 64, buf + 2*128, c->short_win, 64);
-            c->fdsp->vector_fmul_window(wout + 3*128, buf + 2*128 + 64, buf + 3*128, c->short_win, 64);
-            c->fdsp->vector_fmul_window(temp,         buf + 3*128 + 64, buf + 4*128, c->short_win, 64);
+            c->vector_fmul_window(wout + 0*128, saved + 448,      buf + 0*128, c->short_win, 64);
+            c->vector_fmul_window(wout + 1*128, buf + 0*128 + 64, buf + 1*128, c->short_win, 64);
+            c->vector_fmul_window(wout + 2*128, buf + 1*128 + 64, buf + 2*128, c->short_win, 64);
+            c->vector_fmul_window(wout + 3*128, buf + 2*128 + 64, buf + 3*128, c->short_win, 64);
+            c->vector_fmul_window(temp,         buf + 3*128 + 64, buf + 4*128, c->short_win, 64);
             memcpy(wout + 4*128, temp, 64 * sizeof(float));
         } else {
-            c->fdsp->vector_fmul_window(wout, saved + 448, buf, c->short_win, 64);
+            c->vector_fmul_window(wout, saved + 448, buf, c->short_win, 64);
             memcpy(wout + 128, buf + 64, 448 * sizeof(float));
         }
     }
@@ -778,9 +779,9 @@  static int on2avc_reconstruct_channel(On2AVCContext *c, int channel,
     switch (c->window_type) {
     case WINDOW_TYPE_8SHORT:
         memcpy(saved,       temp + 64,         64 * sizeof(float));
-        c->fdsp->vector_fmul_window(saved + 64,  buf + 4*128 + 64, buf + 5*128, c->short_win, 64);
-        c->fdsp->vector_fmul_window(saved + 192, buf + 5*128 + 64, buf + 6*128, c->short_win, 64);
-        c->fdsp->vector_fmul_window(saved + 320, buf + 6*128 + 64, buf + 7*128, c->short_win, 64);
+        c->vector_fmul_window(saved + 64,  buf + 4*128 + 64, buf + 5*128, c->short_win, 64);
+        c->vector_fmul_window(saved + 192, buf + 5*128 + 64, buf + 6*128, c->short_win, 64);
+        c->vector_fmul_window(saved + 320, buf + 6*128 + 64, buf + 7*128, c->short_win, 64);
         memcpy(saved + 448, buf + 7*128 + 64,  64 * sizeof(float));
         break;
     case WINDOW_TYPE_LONG_START:
@@ -906,6 +907,7 @@  static av_cold void on2avc_free_vlcs(On2AVCContext *c)
 static av_cold int on2avc_decode_init(AVCodecContext *avctx)
 {
     On2AVCContext *c = avctx->priv_data;
+    AVFloatDSPContext *fdsp;
     int i;
 
     if (avctx->channels > 2U) {
@@ -952,9 +954,11 @@  static av_cold int on2avc_decode_init(AVCodecContext *avctx)
     ff_fft_init(&c->fft256,  7, 0);
     ff_fft_init(&c->fft512,  8, 1);
     ff_fft_init(&c->fft1024, 9, 1);
-    c->fdsp = avpriv_float_dsp_alloc(avctx->flags & AV_CODEC_FLAG_BITEXACT);
-    if (!c->fdsp)
+    fdsp = avpriv_float_dsp_alloc(avctx->flags & AV_CODEC_FLAG_BITEXACT);
+    if (!fdsp)
         return AVERROR(ENOMEM);
+    c->vector_fmul_window = fdsp->vector_fmul_window;
+    av_free(fdsp);
 
     if (init_vlc(&c->scale_diff, 9, ON2AVC_SCALE_DIFFS,
                  ff_on2avc_scale_diff_bits,  1, 1,
@@ -975,7 +979,6 @@  static av_cold int on2avc_decode_init(AVCodecContext *avctx)
 vlc_fail:
     av_log(avctx, AV_LOG_ERROR, "Cannot init VLC\n");
     on2avc_free_vlcs(c);
-    av_freep(&c->fdsp);
     return AVERROR(ENOMEM);
 }
 
@@ -991,8 +994,6 @@  static av_cold int on2avc_decode_close(AVCodecContext *avctx)
     ff_fft_end(&c->fft512);
     ff_fft_end(&c->fft1024);
 
-    av_freep(&c->fdsp);
-
     on2avc_free_vlcs(c);
 
     return 0;