From patchwork Tue Sep 17 13:39:38 2019
X-Patchwork-Submitter: "Fu, Ting"
X-Patchwork-Id: 15120
From: Ting Fu
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 17 Sep 2019 21:39:38 +0800
Message-Id: <20190917133938.949-3-ting.fu@intel.com>
In-Reply-To: <20190917133938.949-1-ting.fu@intel.com>
References: <20190917133938.949-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH 3/3] avfilter/x86/vf_eq: add SSE2 version

Signed-off-by: Ting Fu
---
 libavfilter/x86/vf_eq.asm    | 19 +++++++++++++++++--
 libavfilter/x86/vf_eq_init.c | 20 ++++++++++++++++++++
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/libavfilter/x86/vf_eq.asm b/libavfilter/x86/vf_eq.asm
index bf28691297..d6b51cf6df 100644
--- a/libavfilter/x86/vf_eq.asm
+++ b/libavfilter/x86/vf_eq.asm
@@ -24,14 +24,21 @@

 SECTION .text

-INIT_MMX mmx
+%macro PROCESS_ONE_LINE 1
 cglobal process_one_line, 5, 7, 5, src, dst, contrast, brightness, w
     movd m3, contrastd
     movd m4, brightnessd
     movsx r5d, contrastw
     movsx r6d, brightnessw
+%if mmsize == 8
     pshufw m3, m3, 0
     pshufw m4, m4, 0
+%elif mmsize == 16
+    pshuflw m3, m3, 0
+    movlhps m3, m3
+    pshuflw m4, m4, 0
+    movlhps m4, m4
+%endif

     DEFINE_ARGS src, dst, tmp, scalar, w
     xor tmpd, tmpd
@@ -39,7 +46,7 @@ cglobal process_one_line, 5, 7, 5, src, dst, contrast, brightness, w
     pxor m1, m1
     mov scalard, wd
     and scalard, mmsize-1
-    sar wd, 3
+    sar wd, %1
     cmp wd, 1
     jl .loop1

@@ -80,3 +87,11 @@ cglobal process_one_line, 5, 7, 5, src, dst, contrast, brightness, w

 .end:
     RET
+
+%endmacro
+
+INIT_MMX mmx
+PROCESS_ONE_LINE 3
+
+INIT_XMM sse2
+PROCESS_ONE_LINE 4
diff --git a/libavfilter/x86/vf_eq_init.c b/libavfilter/x86/vf_eq_init.c
index 63c69078fb..cdd5272220 100644
--- a/libavfilter/x86/vf_eq_init.c
+++ b/libavfilter/x86/vf_eq_init.c
@@ -28,6 +28,8 @@

 extern void ff_process_one_line_mmx(const uint8_t *src, uint8_t *dst, int contvec,
                                     int brvec, int w);
+extern void ff_process_one_line_sse2(const uint8_t *src, uint8_t *dst, int contvec,
+                                     int brvec, int w);

 static void process_mmx(EQParameters *param, uint8_t *dst, int dst_stride,
                         const uint8_t *src, int src_stride, int w, int h)
@@ -44,6 +46,21 @@ static void process_mmx(EQParameters *param, uint8_t *dst, int dst_stride,
     emms_c();
 }

+static void process_sse2(EQParameters *param, uint8_t *dst, int dst_stride,
+                         const uint8_t *src, int src_stride, int w, int h)
+{
+    short contrast = (short) (param->contrast * 256 * 16);
+    short brightness = ((short) (100.0 * param->brightness + 100.0) * 511)
+                       / 200 - 128 - contrast / 32;
+
+    while (h--) {
+        ff_process_one_line_sse2(src, dst, contrast, brightness, w);
+        src += src_stride;
+        dst += dst_stride;
+    }
+    emms_c();
+}
+
 av_cold void ff_eq_init_x86(EQContext *eq)
 {
     int cpu_flags = av_get_cpu_flags();
@@ -51,5 +68,8 @@ av_cold void ff_eq_init_x86(EQContext *eq)
     if (cpu_flags & AV_CPU_FLAG_MMX) {
         eq->process = process_mmx;
     }
+    if (cpu_flags & AV_CPU_FLAG_SSE2) {
+        eq->process = process_sse2;
+    }

 }
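
For anyone checking the arithmetic in this patch, here is a minimal C sketch (not part of the patch) of the two scalar computations it relies on: the fixed-point contrast and brightness values computed in process_sse2(), and the split of a row width into vector iterations plus a scalar tail implied by "and scalard, mmsize-1" and "sar wd, %1". The names EqFixed, eq_fixed_params, RowSplit and split_row are hypothetical, and the split assumes that the loop body not shown in the hunks consumes mmsize bytes per iteration.

```c
/* Hypothetical container for the two 16-bit values handed to the asm. */
typedef struct EqFixed {
    short contrast;
    short brightness;
} EqFixed;

/* Mirrors the expressions in process_sse2(); the double arguments stand in
 * for param->contrast and param->brightness. */
static EqFixed eq_fixed_params(double contrast, double brightness)
{
    EqFixed f;
    f.contrast   = (short) (contrast * 256 * 16);
    f.brightness = ((short) (100.0 * brightness + 100.0) * 511)
                   / 200 - 128 - f.contrast / 32;
    return f;
}

/* Hypothetical result of splitting one row of w pixels. */
typedef struct RowSplit {
    int vector_iters; /* iterations of the vector loop (assumed mmsize bytes each) */
    int scalar_tail;  /* leftover pixels handled one at a time */
} RowSplit;

/* shift is the macro argument: 3 for MMX (mmsize 8), 4 for SSE2 (mmsize 16). */
static RowSplit split_row(int w, int shift)
{
    RowSplit s;
    s.scalar_tail  = w & ((1 << shift) - 1); /* and scalard, mmsize-1 */
    s.vector_iters = w >> shift;             /* sar wd, %1 */
    return s;
}
```

With contrast 1.0 and brightness 0.0 this gives contrast = 4096 and brightness = -1. A 100-pixel row splits into 6 vector iterations plus 4 scalar pixels with shift 4 (SSE2), or 12 plus 4 with shift 3 (MMX).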