From patchwork Mon Jul 12 11:07:06 2021
X-Patchwork-Submitter: Anton Khirnov
X-Patchwork-Id: 28898
From: Anton Khirnov
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Jul 2021 13:07:06 +0200
Message-Id: <20210712110709.15532-6-anton@khirnov.net>
X-Mailer: git-send-email 2.30.2
In-Reply-To: 
<20210712110709.15532-1-anton@khirnov.net>
References: <20210712110709.15532-1-anton@khirnov.net>
Subject: [FFmpeg-devel] [PATCH 5/8] sws: add a new scaling API

--- libswscale/swscale.c | 263 ++++++++++++++++++++++++++-------- libswscale/swscale.h | 80 +++++++++++ libswscale/swscale_internal.h | 19 +++ libswscale/utils.c | 70 +++++++++ 4 files changed, 374 insertions(+), 58 deletions(-) diff --git a/libswscale/swscale.c b/libswscale/swscale.c index 61dfcb4dff..8b32ce5a40 100644 --- a/libswscale/swscale.c +++ b/libswscale/swscale.c @@ -236,13 +236,16 @@ static void lumRangeFromJpeg16_c(int16_t *_dst, int width) av_log(c, AV_LOG_DEBUG, __VA_ARGS__) static int swscale(SwsContext *c, const uint8_t *src[], - int srcStride[], int srcSliceY, - int srcSliceH, uint8_t *dst[], int dstStride[]) + int srcStride[], int srcSliceY, int srcSliceH, + uint8_t *dst[], int dstStride[], + int dstSliceY, int dstSliceH) { + const int scale_dst = dstSliceY > 0 || dstSliceH < c->dstH; + /* load a few things into local vars to make the code more readable? * and faster */ const int dstW = c->dstW; - const int dstH = c->dstH; + int dstH = c->dstH; const enum AVPixelFormat dstFormat = c->dstFormat; const int flags = c->flags; @@ -331,10 +334,15 @@ static int swscale(SwsContext *c, const uint8_t *src[], } } - /* Note the user might start scaling the picture in the middle so this - * will not get executed. This is not really intended but works - * currently, so people might do it. 
*/ - if (srcSliceY == 0) { + if (scale_dst) { + dstY = dstSliceY; + dstH = dstY + dstSliceH; + lastInLumBuf = -1; + lastInChrBuf = -1; + } else if (srcSliceY == 0) { + /* Note the user might start scaling the picture in the middle so this + * will not get executed. This is not really intended but works + * currently, so people might do it. */ dstY = 0; lastInLumBuf = -1; lastInChrBuf = -1; @@ -352,8 +360,8 @@ static int swscale(SwsContext *c, const uint8_t *src[], srcSliceY, srcSliceH, chrSrcSliceY, chrSrcSliceH, 1); ff_init_slice_from_src(vout_slice, (uint8_t**)dst, dstStride, c->dstW, - dstY, dstH, dstY >> c->chrDstVSubSample, - AV_CEIL_RSHIFT(dstH, c->chrDstVSubSample), 0); + dstY, dstSliceH, dstY >> c->chrDstVSubSample, + AV_CEIL_RSHIFT(dstSliceH, c->chrDstVSubSample), scale_dst); if (srcSliceY == 0) { hout_slice->plane[0].sliceY = lastInLumBuf + 1; hout_slice->plane[1].sliceY = lastInChrBuf + 1; @@ -373,7 +381,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], // First line needed as input const int firstLumSrcY = FFMAX(1 - vLumFilterSize, vLumFilterPos[dstY]); - const int firstLumSrcY2 = FFMAX(1 - vLumFilterSize, vLumFilterPos[FFMIN(dstY | ((1 << c->chrDstVSubSample) - 1), dstH - 1)]); + const int firstLumSrcY2 = FFMAX(1 - vLumFilterSize, vLumFilterPos[FFMIN(dstY | ((1 << c->chrDstVSubSample) - 1), c->dstH - 1)]); // First line needed as input const int firstChrSrcY = FFMAX(1 - vChrFilterSize, vChrFilterPos[chrDstY]); @@ -477,7 +485,7 @@ static int swscale(SwsContext *c, const uint8_t *src[], c->chrDither8 = ff_dither_8x8_128[chrDstY & 7]; c->lumDither8 = ff_dither_8x8_128[dstY & 7]; } - if (dstY >= dstH - 2) { + if (dstY >= c->dstH - 2) { /* hmm looks like we can't use MMX here without overwriting * this array's tail */ ff_sws_init_output_funcs(c, &yuv2plane1, &yuv2planeX, &yuv2nv12cX, @@ -491,21 +499,22 @@ static int swscale(SwsContext *c, const uint8_t *src[], desc[i].process(c, &desc[i], dstY, 1); } if (isPlanar(dstFormat) && isALPHA(dstFormat) 
&& !needAlpha) { + int offset = lastDstY - dstSliceY; int length = dstW; int height = dstY - lastDstY; if (is16BPS(dstFormat) || isNBPS(dstFormat)) { const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat); - fillPlane16(dst[3], dstStride[3], length, height, lastDstY, + fillPlane16(dst[3], dstStride[3], length, height, offset, 1, desc->comp[3].depth, isBE(dstFormat)); } else if (is32BPS(dstFormat)) { const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat); - fillPlane32(dst[3], dstStride[3], length, height, lastDstY, + fillPlane32(dst[3], dstStride[3], length, height, offset, 1, desc->comp[3].depth, isBE(dstFormat), desc->flags & AV_PIX_FMT_FLAG_FLOAT); } else - fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255); + fillPlane(dst[3], dstStride[3], length, height, offset, 255); } #if HAVE_MMXEXT_INLINE @@ -809,33 +818,42 @@ static void update_palette(SwsContext *c, const uint32_t *pal) } } +static int scale_internal(SwsContext *c, + const uint8_t * const srcSlice[], const int srcStride[], + int srcSliceY, int srcSliceH, + uint8_t *const dstSlice[], const int dstStride[], + int dstSliceY, int dstSliceH); + static int scale_gamma(SwsContext *c, const uint8_t * const srcSlice[], const int srcStride[], int srcSliceY, int srcSliceH, - uint8_t * const dst[], const int dstStride[]) + uint8_t * const dstSlice[], const int dstStride[], + int dstSliceY, int dstSliceH) { - int ret = sws_scale(c->cascaded_context[0], - srcSlice, srcStride, srcSliceY, srcSliceH, - c->cascaded_tmp, c->cascaded_tmpStride); + int ret = scale_internal(c->cascaded_context[0], + srcSlice, srcStride, srcSliceY, srcSliceH, + c->cascaded_tmp, c->cascaded_tmpStride, 0, c->srcH); if (ret < 0) return ret; if (c->cascaded_context[2]) - ret = sws_scale(c->cascaded_context[1], (const uint8_t * const *)c->cascaded_tmp, - c->cascaded_tmpStride, srcSliceY, srcSliceH, c->cascaded1_tmp, - c->cascaded1_tmpStride); + ret = scale_internal(c->cascaded_context[1], (const uint8_t * const 
*)c->cascaded_tmp, + c->cascaded_tmpStride, srcSliceY, srcSliceH, + c->cascaded1_tmp, c->cascaded1_tmpStride, 0, c->dstH); else - ret = sws_scale(c->cascaded_context[1], (const uint8_t * const *)c->cascaded_tmp, - c->cascaded_tmpStride, srcSliceY, srcSliceH, dst, dstStride); + ret = scale_internal(c->cascaded_context[1], (const uint8_t * const *)c->cascaded_tmp, + c->cascaded_tmpStride, srcSliceY, srcSliceH, + dstSlice, dstStride, dstSliceY, dstSliceH); if (ret < 0) return ret; if (c->cascaded_context[2]) { - ret = sws_scale(c->cascaded_context[2], (const uint8_t * const *)c->cascaded1_tmp, - c->cascaded1_tmpStride, c->cascaded_context[1]->dstY - ret, - c->cascaded_context[1]->dstY, dst, dstStride); + ret = scale_internal(c->cascaded_context[2], (const uint8_t * const *)c->cascaded1_tmp, + c->cascaded1_tmpStride, c->cascaded_context[1]->dstY - ret, + c->cascaded_context[1]->dstY, + dstSlice, dstStride, dstSliceY, dstSliceH); } return ret; } @@ -843,56 +861,64 @@ static int scale_gamma(SwsContext *c, static int scale_cascaded(SwsContext *c, const uint8_t * const srcSlice[], const int srcStride[], int srcSliceY, int srcSliceH, - uint8_t * const dst[], const int dstStride[]) + uint8_t * const dstSlice[], const int dstStride[], + int dstSliceY, int dstSliceH) { - int ret = sws_scale(c->cascaded_context[0], - srcSlice, srcStride, srcSliceY, srcSliceH, - c->cascaded_tmp, c->cascaded_tmpStride); + int ret = scale_internal(c->cascaded_context[0], + srcSlice, srcStride, srcSliceY, srcSliceH, + c->cascaded_tmp, c->cascaded_tmpStride, + 0, c->cascaded_context[0]->dstH); if (ret < 0) return ret; - ret = sws_scale(c->cascaded_context[1], - (const uint8_t * const * )c->cascaded_tmp, c->cascaded_tmpStride, - 0, c->cascaded_context[0]->dstH, dst, dstStride); + ret = scale_internal(c->cascaded_context[1], + (const uint8_t * const * )c->cascaded_tmp, c->cascaded_tmpStride, + 0, c->cascaded_context[0]->dstH, + dstSlice, dstStride, dstSliceY, dstSliceH); return ret; } -/** - * swscale 
wrapper, so we don't need to export the SwsContext. - * Assumes planar YUV to be in YUV order instead of YVU. - */ -int attribute_align_arg sws_scale(struct SwsContext *c, - const uint8_t * const srcSlice[], - const int srcStride[], int srcSliceY, - int srcSliceH, uint8_t *const dst[], - const int dstStride[]) +static int scale_internal(SwsContext *c, + const uint8_t * const srcSlice[], const int srcStride[], + int srcSliceY, int srcSliceH, + uint8_t *const dstSlice[], const int dstStride[], + int dstSliceY, int dstSliceH) { - const int frame_start = !c->sliceDir; + const int scale_dst = dstSliceY > 0 || dstSliceH < c->dstH; + const int frame_start = scale_dst || !c->sliceDir; int i, ret; const uint8_t *src2[4]; uint8_t *dst2[4]; - int macro_height = isBayer(c->srcFormat) ? 2 : (1 << c->chrSrcVSubSample); + int macro_height_src = isBayer(c->srcFormat) ? 2 : (1 << c->chrSrcVSubSample); + int macro_height_dst = isBayer(c->dstFormat) ? 2 : (1 << c->chrDstVSubSample); // copy strides, so they can safely be modified int srcStride2[4]; int dstStride2[4]; int srcSliceY_internal = srcSliceY; - if (!srcStride || !dstStride || !dst || !srcSlice) { + if (!srcStride || !dstStride || !dstSlice || !srcSlice) { av_log(c, AV_LOG_ERROR, "One of the input parameters to sws_scale() is NULL, please check the calling code\n"); return AVERROR(EINVAL); } - if ((srcSliceY & (macro_height-1)) || - ((srcSliceH& (macro_height-1)) && srcSliceY + srcSliceH != c->srcH) || + if ((srcSliceY & (macro_height_src - 1)) || + ((srcSliceH & (macro_height_src - 1)) && srcSliceY + srcSliceH != c->srcH) || srcSliceY + srcSliceH > c->srcH) { av_log(c, AV_LOG_ERROR, "Slice parameters %d, %d are invalid\n", srcSliceY, srcSliceH); return AVERROR(EINVAL); } + if ((dstSliceY & (macro_height_dst - 1)) || + ((dstSliceH & (macro_height_dst - 1)) && dstSliceY + dstSliceH != c->dstH) || + dstSliceY + dstSliceH > c->dstH) { + av_log(c, AV_LOG_ERROR, "Slice parameters %d, %d are invalid\n", dstSliceY, dstSliceH); + 
return AVERROR(EINVAL); + } + if (!check_image_pointers(srcSlice, c->srcFormat, srcStride)) { av_log(c, AV_LOG_ERROR, "bad src image pointers\n"); return AVERROR(EINVAL); } - if (!check_image_pointers((const uint8_t* const*)dst, c->dstFormat, dstStride)) { + if (!check_image_pointers((const uint8_t* const*)dstSlice, c->dstFormat, dstStride)) { av_log(c, AV_LOG_ERROR, "bad dst image pointers\n"); return AVERROR(EINVAL); } @@ -902,10 +928,12 @@ int attribute_align_arg sws_scale(struct SwsContext *c, return 0; if (c->gamma_flag && c->cascaded_context[0]) - return scale_gamma(c, srcSlice, srcStride, srcSliceY, srcSliceH, dst, dstStride); + return scale_gamma(c, srcSlice, srcStride, srcSliceY, srcSliceH, + dstSlice, dstStride, dstSliceY, dstSliceH); if (c->cascaded_context[0] && srcSliceY == 0 && srcSliceH == c->cascaded_context[0]->srcH) - return scale_cascaded(c, srcSlice, srcStride, srcSliceY, srcSliceH, dst, dstStride); + return scale_cascaded(c, srcSlice, srcStride, srcSliceY, srcSliceH, + dstSlice, dstStride, dstSliceY, dstSliceH); if (!srcSliceY && (c->flags & SWS_BITEXACT) && c->dither == SWS_DITHER_ED && c->dither_error[0]) for (i = 0; i < 4; i++) @@ -915,18 +943,19 @@ int attribute_align_arg sws_scale(struct SwsContext *c, update_palette(c, (const uint32_t *)srcSlice[1]); memcpy(src2, srcSlice, sizeof(src2)); - memcpy(dst2, dst, sizeof(dst2)); + memcpy(dst2, dstSlice, sizeof(dst2)); memcpy(srcStride2, srcStride, sizeof(srcStride2)); memcpy(dstStride2, dstStride, sizeof(dstStride2)); - if (frame_start) { + if (frame_start && !scale_dst) { if (srcSliceY != 0 && srcSliceY + srcSliceH != c->srcH) { av_log(c, AV_LOG_ERROR, "Slices start in the middle!\n"); return AVERROR(EINVAL); } c->sliceDir = (srcSliceY == 0) ? 
1 : -1; - } + } else if (scale_dst) + c->sliceDir = 1; if (c->src0Alpha && !c->dst0Alpha && isALPHA(c->dstFormat)) { uint8_t *base; @@ -985,11 +1014,28 @@ int attribute_align_arg sws_scale(struct SwsContext *c, reset_ptr(src2, c->srcFormat); reset_ptr((void*)dst2, c->dstFormat); - if (c->convert_unscaled) - ret = c->convert_unscaled(c, src2, srcStride2, srcSliceY_internal, srcSliceH, + if (c->convert_unscaled) { + int offset = srcSliceY_internal; + int slice_h = srcSliceH; + + // for dst slice scaling, offset the src pointers to match the dst slice + if (scale_dst) { + av_assert0(offset == 0); + for (i = 0; i < 4 && src2[i]; i++) { + if (!src2[i] || (i > 0 && usePal(c->srcFormat))) + break; + src2[i] += (dstSliceY >> ((i == 1 || i == 2) ? c->chrSrcVSubSample : 0)) * srcStride2[i]; + } + offset = 0; + slice_h = dstSliceH; + } + + ret = c->convert_unscaled(c, src2, srcStride2, offset, slice_h, dst2, dstStride2); - else - ret = swscale(c, src2, srcStride2, srcSliceY_internal, srcSliceH, dst2, dstStride2); + } else { + ret = swscale(c, src2, srcStride2, srcSliceY_internal, srcSliceH, + dst2, dstStride2, dstSliceY, dstSliceH); + } if (c->dstXYZ && !(c->srcXYZ && c->srcW==c->dstW && c->srcH==c->dstH)) { int dstY = c->dstY ? 
c->dstY : srcSliceY + srcSliceH; @@ -1003,8 +1049,109 @@ int attribute_align_arg sws_scale(struct SwsContext *c, } /* reset slice direction at end of frame */ - if (srcSliceY_internal + srcSliceH == c->srcH) + if ((srcSliceY_internal + srcSliceH == c->srcH) || scale_dst) c->sliceDir = 0; return ret; } + +void sws_frame_end(struct SwsContext *c) +{ + av_frame_unref(c->frame_src); + av_frame_unref(c->frame_dst); + c->src_ranges.nb_ranges = 0; +} + +int sws_frame_start(struct SwsContext *c, AVFrame *dst, const AVFrame *src) +{ + int ret, allocated = 0; + + ret = av_frame_ref(c->frame_src, src); + if (ret < 0) + return ret; + + if (!dst->buf[0]) { + dst->width = c->dstW; + dst->height = c->dstH; + dst->format = c->dstFormat; + + ret = av_frame_get_buffer(dst, 0); + if (ret < 0) + return ret; + allocated = 1; + } + + ret = av_frame_ref(c->frame_dst, dst); + if (ret < 0) { + if (allocated) + av_frame_unref(dst); + + return ret; + } + + return 0; +} + +int sws_send_slice(struct SwsContext *c, unsigned int slice_start, + unsigned int slice_height) +{ + int ret; + + ret = ff_range_add(&c->src_ranges, slice_start, slice_height); + if (ret < 0) + return ret; + + return 0; +} + +int sws_receive_slice(struct SwsContext *c, unsigned int slice_start, + unsigned int slice_height) +{ + uint8_t *dst[4]; + + /* wait until complete input has been received */ + if (!(c->src_ranges.nb_ranges == 1 && + c->src_ranges.ranges[0].start == 0 && + c->src_ranges.ranges[0].len == c->srcH)) + return AVERROR(EAGAIN); + + for (int i = 0; i < FF_ARRAY_ELEMS(dst) && c->frame_dst->data[i]; i++) { + dst[i] = c->frame_dst->data[i] + + c->frame_dst->linesize[i] * (slice_start >> c->chrDstVSubSample); + } + + return scale_internal(c, (const uint8_t * const *)c->frame_src->data, + c->frame_src->linesize, 0, c->srcH, + dst, c->frame_dst->linesize, slice_start, slice_height); +} + +int sws_scale_frame(struct SwsContext *c, AVFrame *dst, const AVFrame *src) +{ + int ret; + + ret = sws_frame_start(c, dst, 
src); + if (ret < 0) + return ret; + + ret = sws_send_slice(c, 0, src->height); + if (ret >= 0) + ret = sws_receive_slice(c, 0, dst->height); + + sws_frame_end(c); + + return ret; +} + +/** + * swscale wrapper, so we don't need to export the SwsContext. + * Assumes planar YUV to be in YUV order instead of YVU. + */ +int attribute_align_arg sws_scale(struct SwsContext *c, + const uint8_t * const srcSlice[], + const int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *const dst[], + const int dstStride[]) +{ + return scale_internal(c, srcSlice, srcStride, srcSliceY, srcSliceH, + dst, dstStride, 0, c->dstH); +} diff --git a/libswscale/swscale.h b/libswscale/swscale.h index 50d6d46553..41eacd2dea 100644 --- a/libswscale/swscale.h +++ b/libswscale/swscale.h @@ -30,6 +30,7 @@ #include #include "libavutil/avutil.h" +#include "libavutil/frame.h" #include "libavutil/log.h" #include "libavutil/pixfmt.h" #include "version.h" @@ -218,6 +219,85 @@ int sws_scale(struct SwsContext *c, const uint8_t *const srcSlice[], const int srcStride[], int srcSliceY, int srcSliceH, uint8_t *const dst[], const int dstStride[]); +/** + * Scale source data from src and write the output to dst. + * + * This is merely a convenience wrapper around + * - sws_frame_start() + * - sws_send_slice(0, src->height) + * - sws_receive_slice(0, dst->height) + * - sws_frame_end() + * + * @param dst The destination frame. See documentation for sws_frame_start() for + * more details. + * @param src The source frame. + * + * @return 0 on success, a negative AVERROR code on failure + */ +int sws_scale_frame(struct SwsContext *c, AVFrame *dst, const AVFrame *src); + +/** + * Initialize the scaling process for a given pair of source/destination frames. + * Must be called before any calls to sws_send_slice() and sws_receive_slice(). + * + * This function will retain references to src and dst. + * + * @param dst The destination frame. 
+ * + * The data buffers may either be already allocated by the caller or + * left clear, in which case they will be allocated by the scaler. + * The latter may have performance advantages - e.g. in certain cases + * some output planes may be references to input planes, rather than + * copies. + * + * Output data will be written into this frame in successful + * sws_receive_slice() calls. + * @param src The source frame. The data buffers must be allocated, but the + * frame data does not have to be ready at this point. Data + * availability is then signalled by sws_send_slice(). + * @return 0 on success, a negative AVERROR code on failure + * + * @see sws_frame_end() + */ +int sws_frame_start(struct SwsContext *c, AVFrame *dst, const AVFrame *src); + +/** + * Finish the scaling process for a pair of source/destination frames previously + * submitted with sws_frame_start(). Must be called after all sws_send_slice() + * and sws_receive_slice() calls are done, before any new sws_frame_start() + * calls. + */ +void sws_frame_end(struct SwsContext *c); + +/** + * Indicate that a horizontal slice of input data is available in the source + * frame previously provided to sws_frame_start(). The slices may be provided in + * any order, but may not overlap. For vertically subsampled pixel formats, the + * slices must be aligned according to subsampling. + * + * @param slice_start first row of the slice + * @param slice_height number of rows in the slice + * + * @return 0 on success, a negative AVERROR code on failure. + */ +int sws_send_slice(struct SwsContext *c, unsigned int slice_start, + unsigned int slice_height); + +/** + * Request a horizontal slice of the output data to be written into the frame + * previously provided to sws_frame_start(). 
+ * + * @param slice_start first row of the slice + * @param slice_height number of rows in the slice + * + * @return 0 if the data was successfully written into the output + * AVERROR(EAGAIN) if more input data needs to be provided before the + * output can be produced + * another negative AVERROR code on other kinds of scaling failure + */ +int sws_receive_slice(struct SwsContext *c, unsigned int slice_start, + unsigned int slice_height); + /** * @param dstRange flag indicating the while-black range of the output (1=jpeg / 0=mpeg) * @param srcRange flag indicating the while-black range of the input (1=jpeg / 0=mpeg) diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 673407636a..c1098d6026 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -27,6 +27,7 @@ #include "libavutil/avassert.h" #include "libavutil/avutil.h" #include "libavutil/common.h" +#include "libavutil/frame.h" #include "libavutil/intreadwrite.h" #include "libavutil/log.h" #include "libavutil/mem_internal.h" @@ -80,6 +81,19 @@ typedef enum SwsAlphaBlend { SWS_ALPHA_BLEND_NB, } SwsAlphaBlend; +typedef struct Range { + unsigned int start; + unsigned int len; +} Range; + +typedef struct RangeList { + Range *ranges; + unsigned int nb_ranges; + int ranges_allocated; +} RangeList; + +int ff_range_add(RangeList *r, unsigned int start, unsigned int len); + typedef int (*SwsFunc)(struct SwsContext *context, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]); @@ -313,6 +327,11 @@ typedef struct SwsContext { int sliceDir; ///< Direction that slices are fed to the scaler (1 = top-to-bottom, -1 = bottom-to-top). double param[2]; ///< Input parameters for scaling algorithms that need them. 
+ AVFrame *frame_src; + AVFrame *frame_dst; + + RangeList src_ranges; + /* The cascaded_* fields allow spliting a scaler task into multiple * sequential steps, this is for example used to limit the maximum * downscaling factor that needs to be supported in one scaler. diff --git a/libswscale/utils.c b/libswscale/utils.c index 176fc6fd63..dbb907d761 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -1761,6 +1761,11 @@ av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter, if (!FF_ALLOCZ_TYPED_ARRAY(c->dither_error[i], c->dstW + 2)) goto nomem; + c->frame_src = av_frame_alloc(); + c->frame_dst = av_frame_alloc(); + if (!c->frame_src || !c->frame_dst) + goto nomem; + c->needAlpha = (CONFIG_SWSCALE_ALPHA && isALPHA(c->srcFormat) && isALPHA(c->dstFormat)) ? 1 : 0; // 64 / c->scalingBpp is the same as 16 / sizeof(scaling_intermediate) @@ -2250,6 +2255,11 @@ void sws_freeContext(SwsContext *c) for (i = 0; i < 4; i++) av_freep(&c->dither_error[i]); + av_frame_free(&c->frame_src); + av_frame_free(&c->frame_dst); + + av_freep(&c->src_ranges.ranges); + av_freep(&c->vLumFilter); av_freep(&c->vChrFilter); av_freep(&c->hLumFilter); @@ -2364,3 +2374,63 @@ struct SwsContext *sws_getCachedContext(struct SwsContext *context, int srcW, } return context; } + +int ff_range_add(RangeList *rl, unsigned int start, unsigned int len) +{ + Range *tmp; + unsigned int idx; + + /* find the first existing range after the new one */ + for (idx = 0; idx < rl->nb_ranges; idx++) + if (rl->ranges[idx].start > start) + break; + + /* check for overlap */ + if (idx > 0) { + Range *prev = &rl->ranges[idx - 1]; + if (prev->start + prev->len > start) + return AVERROR(EINVAL); + } + if (idx < rl->nb_ranges) { + Range *next = &rl->ranges[idx]; + if (start + len > next->start) + return AVERROR(EINVAL); + } + + tmp = av_fast_realloc(rl->ranges, &rl->ranges_allocated, + (rl->nb_ranges + 1) * sizeof(*rl->ranges)); + if (!tmp) + return AVERROR(ENOMEM); + rl->ranges = tmp; + + 
memmove(rl->ranges + idx + 1, rl->ranges + idx, + sizeof(*rl->ranges) * (rl->nb_ranges - idx)); + rl->ranges[idx].start = start; + rl->ranges[idx].len = len; + rl->nb_ranges++; + + /* merge ranges */ + if (idx > 0) { + Range *prev = &rl->ranges[idx - 1]; + Range *cur = &rl->ranges[idx]; + if (prev->start + prev->len == cur->start) { + prev->len += cur->len; + memmove(rl->ranges + idx, rl->ranges + idx + 1, + sizeof(*rl->ranges) * (rl->nb_ranges - idx - 1)); + rl->nb_ranges--; + idx--; + } + } + if (idx < rl->nb_ranges - 1) { + Range *cur = &rl->ranges[idx]; + Range *next = &rl->ranges[idx + 1]; + if (cur->start + cur->len == next->start) { + cur->len += next->len; + memmove(rl->ranges + idx + 1, rl->ranges + idx + 2, + sizeof(*rl->ranges) * (rl->nb_ranges - idx - 2)); + rl->nb_ranges--; + } + } + + return 0; +}
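
For anyone reviewing the slice bookkeeping in isolation: sws_send_slice() records each submitted slice in a sorted range list, and sws_receive_slice() returns AVERROR(EAGAIN) until that list has collapsed into a single range covering rows 0 to srcH. Below is a minimal standalone sketch of that idea, not the code from this patch. The names range_add() and input_complete() are hypothetical, plain realloc() stands in for av_fast_realloc(), and -1 stands in for the AVERROR codes. When two ranges are merged, the absorbed entry is the one removed from the array, so the widened entry survives.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the Range/RangeList types of the patch. */
typedef struct Range {
    unsigned int start;
    unsigned int len;
} Range;

typedef struct RangeList {
    Range       *ranges;
    unsigned int nb_ranges;
} RangeList;

/* Insert [start, start + len) keeping the list sorted, reject overlaps and
 * merge with adjacent neighbours. Returns 0 on success, -1 on overlap or
 * allocation failure (assumption: the real code returns AVERROR codes). */
static int range_add(RangeList *rl, unsigned int start, unsigned int len)
{
    Range *tmp;
    unsigned int idx;

    assert(rl != NULL);

    /* find the first existing range that starts after the new one */
    for (idx = 0; idx < rl->nb_ranges; idx++)
        if (rl->ranges[idx].start > start)
            break;

    /* reject overlap with the previous or the next range */
    if (idx > 0 &&
        rl->ranges[idx - 1].start + rl->ranges[idx - 1].len > start)
        return -1;
    if (idx < rl->nb_ranges && start + len > rl->ranges[idx].start)
        return -1;

    /* plain realloc() replaces av_fast_realloc() in this sketch */
    tmp = realloc(rl->ranges, (rl->nb_ranges + 1) * sizeof(*tmp));
    if (!tmp)
        return -1;
    rl->ranges = tmp;

    /* open a gap at idx and store the new range there */
    memmove(tmp + idx + 1, tmp + idx, sizeof(*tmp) * (rl->nb_ranges - idx));
    tmp[idx].start = start;
    tmp[idx].len   = len;
    rl->nb_ranges++;

    /* widen the previous range, then drop the absorbed entry at idx */
    if (idx > 0 && tmp[idx - 1].start + tmp[idx - 1].len == tmp[idx].start) {
        tmp[idx - 1].len += tmp[idx].len;
        memmove(tmp + idx, tmp + idx + 1,
                sizeof(*tmp) * (rl->nb_ranges - idx - 1));
        rl->nb_ranges--;
        idx--;
    }
    /* widen the current range, then drop the absorbed entry at idx + 1 */
    if (idx + 1 < rl->nb_ranges &&
        tmp[idx].start + tmp[idx].len == tmp[idx + 1].start) {
        tmp[idx].len += tmp[idx + 1].len;
        memmove(tmp + idx + 1, tmp + idx + 2,
                sizeof(*tmp) * (rl->nb_ranges - idx - 2));
        rl->nb_ranges--;
    }
    return 0;
}

/* Mirrors the readiness check in sws_receive_slice(): the input is complete
 * once a single range covers rows [0, src_h). */
static int input_complete(const RangeList *rl, unsigned int src_h)
{
    return rl->nb_ranges == 1 &&
           rl->ranges[0].start == 0 &&
           rl->ranges[0].len == src_h;
}
```

With a 48-row source, sending rows 0-15 and then 32-47 leaves two ranges and input_complete() stays false. Sending rows 16-31 bridges the gap, both merges fire, and the list collapses into a single range of 48 rows. An overlapping slice such as rows 8-23 is rejected.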