[FFmpeg-devel] h274: remove optimization pragma

Message ID	MhylPZz--3-2@lynne.ee
State	Accepted
Commit	033105a73901cf9ecfa6d410e96d7f347dc69c71
Headers	Delivered-To: ffmpegpatchwork2@gmail.com Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Date: Wed, 25 Aug 2021 21:24:49 +0200 (CEST) From: Lynne <dev@lynne.ee> To: Ffmpeg Devel <ffmpeg-devel@ffmpeg.org> Message-ID: <MhylPZz--3-2@lynne.ee> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_286104_2142586228.1629919489205" Subject: [FFmpeg-devel] [PATCH] h274: remove optimization pragma Precedence: list Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Series	[FFmpeg-devel] h274: remove optimization pragma

Checks

Context	Check	Description
andriy/make_x86	success	Make finished
andriy/make_fate_x86	success	Make fate finished
andriy/make_ppc	success	Make finished
andriy/make_fate_ppc	success	Make fate finished

Commit Message

Lynne Aug. 25, 2021, 7:24 p.m. UTC

This results in warnings on compilers which don't support it. Objections
were raised about it during the review process but went unnoticed, and the
speed benefit is highly compiler- and version-specific, and also not very
critical.

We generally hand-write assembly to optimize loops like that, rather
than use compiler magic, and for a 40% speedup in the best-case scenario,
it's simply not worth it.

Plus, tree vectorization is still problematic with GCC and disabled by default
for a good reason, so enabling it locally is sketchy.

Patch attached.
Subject: [PATCH] h274: remove optimization pragma

---
 libavcodec/h274.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Jan Poonthong Aug. 26, 2021, 3:13 a.m. UTC | #1

I didn't really understand what you meant. So I should install nasm and run
./configure or just ./configure --disable-x86asm?

On Thu, Aug 26, 2021 at 2:24 AM Lynne <dev@lynne.ee> wrote:

> This results in warnings on compilers which don't support it,
> objections were raised during the review process about it but went unnoticed,
> and the speed benefit is highly compiler and version specific, and
> also not very critical.
>
> We generally hand-write assembly to optimize loops like that, rather
> than use compiler magic, and for 40% best case scenario, it's simply
> not worth it.
>
> Plus, tree vectorization is still problematic with GCC and disabled by default
> for a good reason, so enabling it locally is sketchy.
>
> Patch attached.

Paul B Mahol Aug. 26, 2021, 7:11 a.m. UTC | #2

LGTM

diff --git a/libavcodec/h274.c b/libavcodec/h274.c
index 0efc00ca1d..5e2cf150ea 100644
--- a/libavcodec/h274.c
+++ b/libavcodec/h274.c
@@ -30,10 +30,6 @@ 
 
 #include "h274.h"
 
-// The code in this file has a lot of loops that vectorize very well, this is
-// about a 40% speedup for no obvious downside.
-#pragma GCC optimize("tree-vectorize")
-
 static const int8_t Gaussian_LUT[2048+256];
 static const uint32_t Seed_LUT[256];
 static const int8_t R64T[64][64];
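
For readers unfamiliar with the pragma removed above, here is a minimal sketch of the kind of loop it targets and of why it produces warnings on other compilers. It is not taken from libavcodec/h274.c: `add_grain_row` and `clip_u8` are hypothetical names invented for illustration. The compiler guard is likewise an assumption about how such a pragma could be restricted to GCC; it is not what this patch does, since the patch simply deletes the pragma.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Without a guard like this, compilers that do not recognise the pragma
 * typically emit an "unknown pragma" warning. The guard is an illustrative
 * assumption; the patch above removes the pragma outright. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize("tree-vectorize")
#endif

/* Hypothetical helper: saturate an int to the 8-bit unsigned range. */
static inline uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical example loop: add a signed noise sample to each pixel of a
 * row. Each iteration is independent of the others, which is the shape of
 * loop an auto-vectorizer can usually turn into SIMD code. */
void add_grain_row(uint8_t *dst, const uint8_t *src, const int8_t *grain,
                   size_t n)
{
    assert(dst && src && grain);
    for (size_t i = 0; i < n; i++)
        dst[i] = clip_u8(src[i] + grain[i]);
}
```

The result of the loop is the same with or without the pragma; only the generated code, and therefore the speed, can differ. That makes any gain dependent on the compiler and its version, as the commit message points out.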