[FFmpeg-devel,v3] aacenc: add SIMD optimizations for abs_pow34 and quantization

On 18 October 2016 at 14:51, Michael Niedermayer <michael@niedermayer.cc> wrote:

> On Tue, Oct 18, 2016 at 09:02:19AM +0100, Rostislav Pehlivanov wrote:
> > On 17 October 2016 at 23:43, Michael Niedermayer <michael@niedermayer.cc> wrote:
> >
> > > On Mon, Oct 17, 2016 at 10:24:48PM +0100, Rostislav Pehlivanov wrote:
> > > > Should fix segfaults on x86-32
> > > >
> > > > Performance improvements:
> > > >
> > > > quant_bands:
> > > > with:     681 decicycles in quant_bands, 8388453 runs,    155 skips
> > > > without: 1190 decicycles in quant_bands, 8388386 runs,    222 skips
> > > > Around 42% for the function
> > > >
> > > > Twoloop coder:
> > > >
> > > > abs_pow34:
> > > > with/without: 7.82s/8.17s
> > > > Around 4% for the entire encoder
> > > >
> > > > Both:
> > > > with/without: 7.15s/8.17s
> > > > Around 12% for the entire encoder
> > > >
> > > > Fast coder:
> > > >
> > > > abs_pow34:
> > > > with/without: 3.40s/3.77s
> > > > Around 10% for the entire encoder
> > > >
> > > > Both:
> > > > with/without: 3.02s/3.77s
> > > > Around 20% faster for the entire encoder
> > > >
> > > > Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
> > > > ---
> > > >  libavcodec/aaccoder.c            | 27 +++++++------
> > > >  libavcodec/aaccoder_trellis.h    |  2 +-
> > > >  libavcodec/aaccoder_twoloop.h    |  2 +-
> > > >  libavcodec/aacenc.c              |  4 ++
> > > >  libavcodec/aacenc.h              |  6 +++
> > > >  libavcodec/aacenc_is.c           |  6 +--
> > > >  libavcodec/aacenc_ltp.c          |  4 +-
> > > >  libavcodec/aacenc_pred.c         |  6 +--
> > > >  libavcodec/aacenc_quantization.h |  4 +-
> > > >  libavcodec/aacenc_utils.h        |  4 +-
> > > >  libavcodec/x86/Makefile          |  2 +
> > > >  libavcodec/x86/aacencdsp.asm     | 87 ++++++++++++++++++++++++++++++++++++++++
> > > >  libavcodec/x86/aacencdsp_init.c  | 43 ++++++++++++++++++++
> > > >  13 files changed, 170 insertions(+), 27 deletions(-)
> > > >  create mode 100644 libavcodec/x86/aacencdsp.asm
> > > >  create mode 100644 libavcodec/x86/aacencdsp_init.c
> > >
> > > fate passes on linux32/64 x86, mingw32/64 x86
> > >
> > > build fails on arm:
> > >
> > > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > > collect2: ld returned 1 exit status
> > > make: *** [ffserver_g] Error 1
> > > make: *** Waiting for unfinished jobs....
> > > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > > collect2: ld returned 1 exit status
> > > make: *** [ffprobe_g] Error 1
> > > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > > collect2: ld returned 1 exit status
> > > make: *** [ffmpeg_g] Error 1
> > >
> > > [...]
> > > --
> > > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> > >
> > > While the State exists there can be no freedom; when there is freedom
> > > there will be no State. -- Vladimir Lenin
> > >
> > > _______________________________________________
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel@ffmpeg.org
> > > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > >
> > >
> > Attaching a new version with the fixes from James Almer which should also
> > fix non-x86 compilation
>
> >  aaccoder.c            |   27 +++++++--------
> >  aaccoder_trellis.h    |    2 -
> >  aaccoder_twoloop.h    |    2 -
> >  aacenc.c              |    4 ++
> >  aacenc.h              |    6 +++
> >  aacenc_is.c           |    6 +--
> >  aacenc_ltp.c          |    4 +-
> >  aacenc_pred.c         |    6 +--
> >  aacenc_quantization.h |    4 +-
> >  aacenc_utils.h        |    2 -
> >  x86/Makefile          |    2 +
> >  x86/aacencdsp.asm     |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  x86/aacencdsp_init.c  |   43 ++++++++++++++++++++++++
> >  13 files changed, 170 insertions(+), 26 deletions(-)
> > 84d67e14dbd62ef958a52a4027a8dff22f7480b6  0001-aacenc-add-SIMD-optimizations-for-abs_pow34-and-quan.patch
> > From d92003e23d82bc40fd85712538983209a7704248 Mon Sep 17 00:00:00 2001
> > From: Rostislav Pehlivanov <atomnuker@gmail.com>
> > Date: Sat, 8 Oct 2016 15:59:14 +0100
> > Subject: [PATCH] aacenc: add SIMD optimizations for abs_pow34 and quantization
> >
> > Performance improvements:
> >
> > quant_bands:
> > with:     681 decicycles in quant_bands, 8388453 runs,    155 skips
> > without: 1190 decicycles in quant_bands, 8388386 runs,    222 skips
> > Around 42% for the function
> >
> > Twoloop coder:
> >
> > abs_pow34:
> > with/without: 7.82s/8.17s
> > Around 4% for the entire encoder
> >
> > Both:
> > with/without: 7.15s/8.17s
> > Around 12% for the entire encoder
> >
> > Fast coder:
> >
> > abs_pow34:
> > with/without: 3.40s/3.77s
> > Around 10% for the entire encoder
> >
> > Both:
> > with/without: 3.02s/3.77s
> > Around 20% faster for the entire encoder
> >
> > Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
> > ---
> >  libavcodec/aaccoder.c            | 27 ++++++------
> >  libavcodec/aaccoder_trellis.h    |  2 +-
> >  libavcodec/aaccoder_twoloop.h    |  2 +-
> >  libavcodec/aacenc.c              |  4 ++
> >  libavcodec/aacenc.h              |  6 +++
> >  libavcodec/aacenc_is.c           |  6 +--
> >  libavcodec/aacenc_ltp.c          |  4 +-
> >  libavcodec/aacenc_pred.c         |  6 +--
> >  libavcodec/aacenc_quantization.h |  4 +-
> >  libavcodec/aacenc_utils.h        |  2 +-
> >  libavcodec/x86/Makefile          |  2 +
> >  libavcodec/x86/aacencdsp.asm     | 88 ++++++++++++++++++++++++++++++++++++++++
> >  libavcodec/x86/aacencdsp_init.c  | 43 ++++++++++++++++++++
> >  13 files changed, 170 insertions(+), 26 deletions(-)
> >  create mode 100644 libavcodec/x86/aacencdsp.asm
> >  create mode 100644 libavcodec/x86/aacencdsp_init.c
>
> still fails to build on arm-qemu:
> it looks like you call a function that's just not there on non-x86;
> a missing if (ARCH_X86) or #if, I assume
>
> LD      ffmpeg_g
> libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> /home/michael/ffmpeg-git/ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> collect2: ld returned 1 exit status
> make: *** [ffmpeg_g] Error 1
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> No snowflake in an avalanche ever feels responsible. -- Voltaire
>
Damn, forgot to amend the patch with that change. The attached version
should finally fix it.
