From patchwork Fri May 24 09:36:15 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lance Wang
X-Patchwork-Id: 13271
From: lance.lmwang@gmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: onemda@gmail.com, Limin Wang
Date: Fri, 24 May 2019 17:36:15 +0800
Message-Id: <20190524093616.74647-6-lance.lmwang@gmail.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190524093616.74647-1-lance.lmwang@gmail.com>
References: <20190524093616.74647-1-lance.lmwang@gmail.com>
Subject: [FFmpeg-devel] [PATCH 6/7] libavfilter/vf_overlay.c: using the nbits and depth for 8bits and 10bit support

From: Limin Wang

---
 libavfilter/vf_overlay.c | 69 +++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 25 deletions(-)

diff --git a/libavfilter/vf_overlay.c b/libavfilter/vf_overlay.c
index ee51a54659..8376494efc 100644
--- a/libavfilter/vf_overlay.c
+++ b/libavfilter/vf_overlay.c
@@ -464,22 +464,26 @@ static av_always_inline void blend_plane_##depth##_##nbits##bits(AVFilterContext
     int dst_hp = AV_CEIL_RSHIFT(dst_h, vsub); \
     int yp = y>>vsub; \
     int xp = x>>hsub; \
-    uint8_t *s, *sp, *d, *dp, *dap, *a, *da, *ap; \
+    uint##depth##_t *s, *sp, *d, *dp, *dap, *a, *da, *ap; \
     int jmax, j, k, kmax; \
     int slice_start, slice_end; \
+    const int max = (1 << nbits) - 1; \
+    const int mid = (1 << (nbits -1)) ; \
+    int bytes = depth / 8; \
     \
+    dst_step /= bytes; \
     j = FFMAX(-yp, 0); \
     jmax = FFMIN3(-yp + dst_hp, FFMIN(src_hp, dst_hp), yp + src_hp); \
     \
     slice_start = j + (jmax * jobnr) / nb_jobs; \
     slice_end = j + (jmax * (jobnr+1)) / nb_jobs; \
     \
-    sp = src->data[i] + (slice_start) * src->linesize[i]; \
-    dp = dst->data[dst_plane] \
+    sp = (uint##depth##_t *)(src->data[i] + (slice_start) * src->linesize[i]); \
+    dp = (uint##depth##_t *)(dst->data[dst_plane] \
                       + (yp + slice_start) * dst->linesize[dst_plane] \
-                      + dst_offset; \
-    ap = src->data[3] + (slice_start << vsub) * src->linesize[3]; \
-    dap = dst->data[3] + ((yp + slice_start) << vsub) * dst->linesize[3]; \
+                      + dst_offset); \
+    ap = (uint##depth##_t *)(src->data[3] + (slice_start << vsub) * src->linesize[3]); \
+    dap = (uint##depth##_t *)(dst->data[3] + ((yp + slice_start) << vsub) * dst->linesize[3]); \
     \
     for (j = slice_start; j < slice_end; j++) { \
         k = FFMAX(-xp, 0); \
@@ -489,7 +493,7 @@ static av_always_inline void blend_plane_##depth##_##nbits##bits(AVFilterContext
         da = dap + ((xp+k) << hsub); \
         kmax = FFMIN(-xp + dst_wp, src_wp); \
     \
-        if (((vsub && j+1 < src_hp) || !vsub) && octx->blend_row[i]) { \
+        if (nbits == 8 && ((vsub && j+1 < src_hp) || !vsub) && octx->blend_row[i]) { \
             int c = octx->blend_row[i](d, da, s, a, kmax - k, src->linesize[3]); \
     \
             s += c; \
@@ -515,7 +519,7 @@ static av_always_inline void blend_plane_##depth##_##nbits##bits(AVFilterContext
                 alpha = a[0]; \
             /* if the main channel has an alpha channel, alpha has to be calculated */ \
             /* to create an un-premultiplied (straight) alpha value */ \
-            if (main_has_alpha && alpha != 0 && alpha != 255) { \
+            if (main_has_alpha && alpha != 0 && alpha != max) { \
                 /* average alpha for color components, improve quality */ \
                 uint8_t alpha_d; \
                 if (hsub && vsub && j+1 < src_hp && k+1 < src_wp) { \
@@ -532,22 +536,32 @@ static av_always_inline void blend_plane_##depth##_##nbits##bits(AVFilterContext
                 alpha = UNPREMULTIPLY_ALPHA(alpha, alpha_d); \
             } \
             if (straight) { \
-                *d = FAST_DIV255(*d * (255 - alpha) + *s * alpha); \
-            } else { \
-                if (i && yuv) \
-                    *d = av_clip(FAST_DIV255((*d - 128) * (255 - alpha)) + *s - 128, -128, 128) + 128; \
+                if (nbits > 8) \
+                    *d = (*d * (max - alpha) + *s * alpha) / max; \
                 else \
-                    *d = FFMIN(FAST_DIV255(*d * (255 - alpha)) + *s, 255); \
+                    *d = FAST_DIV255(*d * (255 - alpha) + *s * alpha); \
+            } else { \
+                if (nbits > 8) { \
+                    if (i && yuv) \
+                        *d = av_clip((*d * (max - alpha) + *s * alpha) / max + *s - 128, -128, 128) + 128; \
+                    else \
+                        *d = FFMIN((*d * (max - alpha) + *s * alpha) / max + *s, 255); \
+                } else { \
+                    if (i && yuv) \
+                        *d = av_clip(FAST_DIV255((*d - 128) * (255 - alpha)) + *s - 128, -128, 128) + 128; \
+                    else \
+                        *d = FFMIN(FAST_DIV255(*d * (255 - alpha)) + *s, 255); \
+                } \
             } \
             s++; \
             d += dst_step; \
             da += 1 << hsub; \
             a += 1 << hsub; \
         } \
-        dp += dst->linesize[dst_plane]; \
-        sp += src->linesize[i]; \
-        ap += (1 << vsub) * src->linesize[3]; \
-        dap += (1 << vsub) * dst->linesize[3]; \
+        dp += dst->linesize[dst_plane] / bytes; \
+        sp += src->linesize[i] / bytes; \
+        ap += (1 << vsub) * src->linesize[3] / bytes; \
+        dap += (1 << vsub) * dst->linesize[3] / bytes; \
     } \
 }
 DEFINE_BLEND_PLANE(8, 8);
@@ -559,18 +573,20 @@ static inline void alpha_composite_##depth##_##nbits##bits(const AVFrame *src, c
                                    int x, int y, \
                                    int jobnr, int nb_jobs) \
 { \
-    uint8_t alpha; /* the amount of overlay to blend on to main */ \
-    uint8_t *s, *sa, *d, *da; \
+    uint##depth##_t alpha; /* the amount of overlay to blend on to main */ \
+    uint##depth##_t *s, *sa, *d, *da; \
     int i, imax, j, jmax; \
     int slice_start, slice_end; \
+    const int max = (1 << nbits) - 1; \
+    int bytes = depth / 8; \
     \
     imax = FFMIN(-y + dst_h, src_h); \
     slice_start = (imax * jobnr) / nb_jobs; \
     slice_end = ((imax * (jobnr+1)) / nb_jobs); \
     \
     i = FFMAX(-y, 0); \
-    sa = src->data[3] + (i + slice_start) * src->linesize[3]; \
-    da = dst->data[3] + (y + i + slice_start) * dst->linesize[3]; \
+    sa = (uint##depth##_t *)(src->data[3] + (i + slice_start) * src->linesize[3]); \
+    da = (uint##depth##_t *)(dst->data[3] + (y + i + slice_start) * dst->linesize[3]); \
     \
     for (i = i + slice_start; i < slice_end; i++) { \
         j = FFMAX(-x, 0); \
@@ -586,18 +602,21 @@ static inline void alpha_composite_##depth##_##nbits##bits(const AVFrame *src, c
             switch (alpha) { \
             case 0: \
                 break; \
-            case 255: \
+            case max: \
                 *d = *s; \
                 break; \
             default: \
                 /* apply alpha compositing: main_alpha += (1-main_alpha) * overlay_alpha */ \
-                *d += FAST_DIV255((255 - *d) * *s); \
+                if (nbits > 8) \
+                    *d += (max - *d) * *s / max; \
+                else \
+                    *d += FAST_DIV255((255 - *d) * *s); \
             } \
             d += 1; \
             s += 1; \
         } \
-        da += dst->linesize[3]; \
-        sa += src->linesize[3]; \
+        da += dst->linesize[3] / bytes; \
+        sa += src->linesize[3] / bytes; \
     } \
 }
 DEFINE_ALPHA_COMPOSITE(8, 8);
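
For readers following the arithmetic, the two ideas the patch leans on can be sketched outside the macro: replacing the constant 255 with `max = (1 << nbits) - 1` in the straight-alpha blend, and dividing a byte linesize by `depth / 8` to step typed pointers. This is only an illustrative sketch, not code from vf_overlay.c; the names `blend_straight` and `stride_in_samples` are hypothetical, and samples are assumed to be stored in 16-bit words with values no larger than `max`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of vf_overlay.c): straight-alpha blend of a
 * single sample for an nbits-deep format stored in 16-bit words. */
static uint16_t blend_straight(uint16_t dst, uint16_t src, uint16_t alpha, int nbits)
{
    /* Largest sample value: 255 for 8 bits, 1023 for 10 bits. */
    const uint32_t max = (1u << nbits) - 1;

    assert(nbits >= 8 && nbits <= 16);
    assert(dst <= max && src <= max && alpha <= max);

    /* The two weights sum to max, so the numerator is at most max * max,
     * which fits in 32 bits for nbits <= 16. */
    return (uint16_t)(((uint32_t)dst * (max - alpha) + (uint32_t)src * alpha) / max);
}

/* Hypothetical helper: convert a linesize given in bytes into a stride
 * counted in samples of `depth` bits (depth is 8 or 16 here). */
static ptrdiff_t stride_in_samples(ptrdiff_t linesize_bytes, int depth)
{
    return linesize_bytes / (depth / 8);
}
```

With `nbits` set to 8 the blend computes the exact quotient by 255, which is the value `FAST_DIV255` is meant to produce in the 8-bit path; the sketch uses a plain division for every depth to keep it short.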