From patchwork Fri May 10 05:59:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philip Langdale X-Patchwork-Id: 13054 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id D9370448353 for ; Fri, 10 May 2019 08:59:24 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B5F32689AC1; Fri, 10 May 2019 08:59:24 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail.overt.org (mail.overt.org [157.230.92.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 2638C68832D for ; Fri, 10 May 2019 08:59:17 +0300 (EEST) Received: from authenticated-user (mail.overt.org [157.230.92.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.overt.org (Postfix) with ESMTPSA id 0137640991; Fri, 10 May 2019 00:59:15 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=overt.org; s=mail; t=1557467956; bh=T5OG5lr+985mcva4l/+S3bx2Z/nKHvBhoew6B5W0qmg=; h=From:To:Cc:Subject:Date:From; b=BKzk5LvwFhXMI7KMyESNb7pWdT5hduAilhe20nJvo6wAWIuAWJc7Uphu4su1PUfGv LPMRXRNz93nhxqt8yQ9E4/GlinGSwmW/vV8JDtpD4tT9HJ+HrCa1Xln6mYz+tYAE+z gfETUjUBzQ/tY3yYImZU8BVPh2FKLRKgf9IkNuN6VjH/9oCsV5tdGHbhjUtd6tJDrs LGkFRF2iD4z1pUtIpk+zuxfuUMAjPkkSdjOUbRlNJIQnmMtNWkzQrgKGXT72Shntmz aXgHY7swgbR6xGdey9rrb4sgTuaZFbnsDue6a0mxwt4K8hFGJogCB7zrdmnNugR1FV 2wPS1H72VpaQQ== From: Philip Langdale To: ffmpeg-devel@ffmpeg.org Date: Thu, 9 May 2019 22:59:12 -0700 Message-Id: <20190510055912.23140-1-philipl@overt.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH] swscale: Add support for NV24 and NV42 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Philip Langdale Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" I don't think this is terribly useful, as the only thing out there that can even handle NV24 content is VDPAU and the only time you have to deal with it is when doing VDPAU OpenGL interop where swscale is irrelevant. In the other cases you can use YV24 (YUV444P). But anyway, I was asked to do this for the sake of completeness. The implementation is pretty straight-forward. Most of the existing NV12 codepaths work regardless of subsampling and are re-used as is. Where necessary I wrote the slightly different NV24 versions. Finally, the one thing that confused me for a long time was the asm specific x86 path that did an explicit exclusion check for NV12. I replaced that with a semi-planar check and also updated the equivalent PPC code, but which I cannot test. Signed-off-by: Philip Langdale --- libswscale/input.c | 2 + libswscale/output.c | 6 ++- libswscale/ppc/swscale_altivec.c | 3 +- libswscale/ppc/swscale_vsx.c | 3 +- libswscale/swscale_unscaled.c | 51 ++++++++++++++++++++++++ libswscale/utils.c | 2 + libswscale/version.h | 2 +- libswscale/x86/swscale_template.c | 4 +- tests/ref/fate/filter-pixfmts-copy | 2 + tests/ref/fate/filter-pixfmts-crop | 2 + tests/ref/fate/filter-pixfmts-field | 2 + tests/ref/fate/filter-pixfmts-fieldorder | 2 + tests/ref/fate/filter-pixfmts-hflip | 2 + tests/ref/fate/filter-pixfmts-il | 2 + tests/ref/fate/filter-pixfmts-null | 2 + tests/ref/fate/filter-pixfmts-pad | 2 + tests/ref/fate/filter-pixfmts-scale | 2 + tests/ref/fate/filter-pixfmts-transpose | 2 + tests/ref/fate/filter-pixfmts-vflip | 2 + 19 files changed, 86 insertions(+), 9 deletions(-) diff --git a/libswscale/input.c b/libswscale/input.c index c2dc356b5d..064f8da314 100644 --- a/libswscale/input.c +++ b/libswscale/input.c @@ -1020,9 +1020,11 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c) c->chrToYV12 = uyvyToUV_c; break; case AV_PIX_FMT_NV12: + case AV_PIX_FMT_NV24: c->chrToYV12 = nv12ToUV_c; break; case AV_PIX_FMT_NV21: + case AV_PIX_FMT_NV42: c->chrToYV12 = nv21ToUV_c; break; case AV_PIX_FMT_RGB8: diff --git a/libswscale/output.c b/libswscale/output.c index d3401f0cd1..26b0ff3d48 100644 --- a/libswscale/output.c +++ b/libswscale/output.c @@ -410,7 +410,8 @@ static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS const uint8_t *chrDither = c->chrDither8; int i; - if (dstFormat == AV_PIX_FMT_NV12) + if (dstFormat == AV_PIX_FMT_NV12 || + dstFormat == AV_PIX_FMT_NV24) for (i=0; isrcBpc == 8 && c->dstBpc <= 14) { c->hyScale = c->hcScale = hScale_real_altivec; } - if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && - dstFormat != AV_PIX_FMT_NV12 && dstFormat != AV_PIX_FMT_NV21 && + if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && !isSemiPlanarYUV(dstFormat) && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE && !c->needAlpha) { c->yuv2planeX = yuv2planeX_altivec; diff --git a/libswscale/ppc/swscale_vsx.c b/libswscale/ppc/swscale_vsx.c index 2e20ab388a..069b77985c 100644 --- a/libswscale/ppc/swscale_vsx.c +++ b/libswscale/ppc/swscale_vsx.c @@ -1874,8 +1874,7 @@ av_cold void ff_sws_init_swscale_vsx(SwsContext *c) c->hcscale_fast = hcscale_fast_vsx; } } - if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && - dstFormat != AV_PIX_FMT_NV12 && dstFormat != AV_PIX_FMT_NV21 && + if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && !isSemiPlanarYUV(dstFormat) && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE && !c->needAlpha) { c->yuv2planeX = yuv2planeX_vsx; diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index be04a236d8..d7cc0bd4c5 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -180,6 +180,47 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int planarToNv24Wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam[], + int dstStride[]) +{ + uint8_t *dst = dstParam[1] + dstStride[1] * srcSliceY; + + copyPlane(src[0], srcStride[0], srcSliceY, srcSliceH, c->srcW, + dstParam[0], dstStride[0]); + + if (c->dstFormat == AV_PIX_FMT_NV24) + interleaveBytes(src[1], src[2], dst, c->chrSrcW, (srcSliceH + 1), + srcStride[1], srcStride[2], dstStride[1]); + else + interleaveBytes(src[2], src[1], dst, c->chrSrcW, (srcSliceH + 1), + srcStride[2], srcStride[1], dstStride[1]); + + return srcSliceH; +} + +static int nv24ToPlanarWrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam[], + int dstStride[]) +{ + uint8_t *dst1 = dstParam[1] + dstStride[1] * srcSliceY; + uint8_t *dst2 = dstParam[2] + dstStride[2] * srcSliceY; + + copyPlane(src[0], srcStride[0], srcSliceY, srcSliceH, c->srcW, + dstParam[0], dstStride[0]); + + if (c->srcFormat == AV_PIX_FMT_NV24) + deinterleaveBytes(src[1], dst1, dst2, c->chrSrcW, (srcSliceH + 1), + srcStride[1], dstStride[1], dstStride[2]); + else + deinterleaveBytes(src[1], dst2, dst1, c->chrSrcW, (srcSliceH + 1), + srcStride[1], dstStride[2], dstStride[1]); + + return srcSliceH; +} + static int planarToP01xWrapper(SwsContext *c, const uint8_t *src8[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dstParam8[], @@ -1872,11 +1913,21 @@ void ff_get_unscaled_swscale(SwsContext *c) (dstFormat == AV_PIX_FMT_NV12 || dstFormat == AV_PIX_FMT_NV21)) { c->swscale = planarToNv12Wrapper; } + /* yv24_to_nv24 */ + if ((srcFormat == AV_PIX_FMT_YUV444P || srcFormat == AV_PIX_FMT_YUVA444P) && + (dstFormat == AV_PIX_FMT_NV24 || dstFormat == AV_PIX_FMT_NV42)) { + c->swscale = planarToNv24Wrapper; + } /* nv12_to_yv12 */ if (dstFormat == AV_PIX_FMT_YUV420P && (srcFormat == AV_PIX_FMT_NV12 || srcFormat == AV_PIX_FMT_NV21)) { c->swscale = nv12ToPlanarWrapper; } + /* nv24_to_yv24 */ + if (dstFormat == AV_PIX_FMT_YUV444P && + (srcFormat == AV_PIX_FMT_NV24 || srcFormat == AV_PIX_FMT_NV42)) { + c->swscale = nv24ToPlanarWrapper; + } /* yuv2bgr */ if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUV422P || srcFormat == AV_PIX_FMT_YUVA420P) && isAnyRGB(dstFormat) && diff --git a/libswscale/utils.c b/libswscale/utils.c index df68bcc0d9..1b1f779532 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -264,6 +264,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = { [AV_PIX_FMT_YUVA422P12LE] = { 1, 1 }, [AV_PIX_FMT_YUVA444P12BE] = { 1, 1 }, [AV_PIX_FMT_YUVA444P12LE] = { 1, 1 }, + [AV_PIX_FMT_NV24] = { 1, 1 }, + [AV_PIX_FMT_NV42] = { 1, 1 }, }; int sws_isSupportedInput(enum AVPixelFormat pix_fmt) diff --git a/libswscale/version.h b/libswscale/version.h index 0e28a76e64..891c76d915 100644 --- a/libswscale/version.h +++ b/libswscale/version.h @@ -28,7 +28,7 @@ #define LIBSWSCALE_VERSION_MAJOR 5 #define LIBSWSCALE_VERSION_MINOR 4 -#define LIBSWSCALE_VERSION_MICRO 100 +#define LIBSWSCALE_VERSION_MICRO 101 #define LIBSWSCALE_VERSION_INT AV_VERSION_INT(LIBSWSCALE_VERSION_MAJOR, \ LIBSWSCALE_VERSION_MINOR, \ diff --git a/libswscale/x86/swscale_template.c b/libswscale/x86/swscale_template.c index 7c30470679..823056c2ea 100644 --- a/libswscale/x86/swscale_template.c +++ b/libswscale/x86/swscale_template.c @@ -1499,8 +1499,8 @@ static av_cold void RENAME(sws_init_swscale)(SwsContext *c) enum AVPixelFormat dstFormat = c->dstFormat; c->use_mmx_vfilter= 0; - if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && dstFormat != AV_PIX_FMT_NV12 - && dstFormat != AV_PIX_FMT_NV21 && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE + if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && !isSemiPlanarYUV(dstFormat) + && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE && !(c->flags & SWS_BITEXACT)) { if (c->flags & SWS_ACCURATE_RND) { if (!(c->flags & SWS_FULL_CHR_H_INT)) { diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy index 0609344c2a..4675b6e832 100644 --- a/tests/ref/fate/filter-pixfmts-copy +++ b/tests/ref/fate/filter-pixfmts-copy @@ -53,6 +53,8 @@ monob 8b04f859fee6a0be856be184acd7a0b5 monow 54d16d2c01abfd72ecdb5e51e283937c nv12 8e24feb2c544dc26a20047a71e4c27aa nv21 335d85c9af6110f26ae9e187a82ed2cf +nv24 f30fc8d0ac40af69e119ea919a314572 +nv42 29a212f70f8780fe0eb99abcae81894d p010be 7f9842d6015026136bad60d03c035cc3 p010le c453421b9f726bdaf2bacf59a492c43b p016be 7f9842d6015026136bad60d03c035cc3 diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop index 4e4b6e4a63..4b9f67cf6e 100644 --- a/tests/ref/fate/filter-pixfmts-crop +++ b/tests/ref/fate/filter-pixfmts-crop @@ -51,6 +51,8 @@ grayf32be cf40ec06a8abe54852b7f85a00549eec grayf32le b672526c9da9c8959ab881f242f6890a nv12 92cda427f794374731ec0321ee00caac nv21 1bcfc197f4fb95de85ba58182d8d2f69 +nv24 514c8f12082f0737e558778cbe7de258 +nv42 ece9baae1c5de579dac2c66a89e08ef3 p010be 8b2de2eb6b099bbf355bfc55a0694ddc p010le 373b50c766dfd0a8e79c9a73246d803a p016be 8b2de2eb6b099bbf355bfc55a0694ddc diff --git a/tests/ref/fate/filter-pixfmts-field b/tests/ref/fate/filter-pixfmts-field index d59c982880..059347e3e2 100644 --- a/tests/ref/fate/filter-pixfmts-field +++ b/tests/ref/fate/filter-pixfmts-field @@ -53,6 +53,8 @@ monob 2129cc72a484d7e10a44de9117aa9f80 monow 03d783611d265cae78293f88ea126ea1 nv12 16f7a46708ef25ebd0b72e47920cc11e nv21 7294574037cc7f9373ef5695d8ebe809 +nv24 3b100fb527b64ee2b2d7120da573faf5 +nv42 1841ce853152d86b27c130f319ea0db2 p010be a0311a09bba7383553267d2b3b9c075e p010le ee09a18aefa3ebe97715b3a7312cb8ff p016be a0311a09bba7383553267d2b3b9c075e diff --git a/tests/ref/fate/filter-pixfmts-fieldorder b/tests/ref/fate/filter-pixfmts-fieldorder index 1996649e10..066b944513 100644 --- a/tests/ref/fate/filter-pixfmts-fieldorder +++ b/tests/ref/fate/filter-pixfmts-fieldorder @@ -49,6 +49,8 @@ gray9be ec877f5bcf0ea275a6f36c12cc9adf11 gray9le fba944fde7923d5089f4f52d12988b9e grayf32be 1aa7960131f880c54fe3c77f13448674 grayf32le 4029ac9d197f255794c1b9e416520fc7 +nv24 4fdbef26042c77f012df114e666efdb2 +nv42 59608290fece913e6b7d61edf581a529 rgb0 2e3d8c91c7a83d451593dfd06607ff39 rgb24 b82577f8215d3dc2681be60f1da247af rgb444be 1c3afc3a0c53c51139c76504f59bb1f4 diff --git a/tests/ref/fate/filter-pixfmts-hflip b/tests/ref/fate/filter-pixfmts-hflip index f171a95fa3..100dd708c3 100644 --- a/tests/ref/fate/filter-pixfmts-hflip +++ b/tests/ref/fate/filter-pixfmts-hflip @@ -51,6 +51,8 @@ grayf32be a69add7bbf892a71fe81b3b75982dbe2 grayf32le 4563e176a35dc8a8a07e0829fad5eb88 nv12 801e58f1be5fd0b5bc4bf007c604b0b4 nv21 9f10dfff8963dc327d3395af21f0554f +nv24 f0c5b2f42970f8d4003621d8857a872f +nv42 4dcf9aec82b110712b396a8b365dcb13 p010be 744b13e44d39e1ff7588983fa03e0101 p010le a50b160346ab94f55a425065b57006f0 p016be 744b13e44d39e1ff7588983fa03e0101 diff --git a/tests/ref/fate/filter-pixfmts-il b/tests/ref/fate/filter-pixfmts-il index 0839a77ed2..979eb0ce3a 100644 --- a/tests/ref/fate/filter-pixfmts-il +++ b/tests/ref/fate/filter-pixfmts-il @@ -53,6 +53,8 @@ monob faba75df28033ba7ce3d82ff2a99ee68 monow 6e9cfb8d3a344c5f0c3e1d5e1297e580 nv12 3c3ba9b1b4c4dfff09c26f71b51dd146 nv21 ab586d8781246b5a32d8760a61db9797 +nv24 554153c71d142e3fd8e40b7dcaaec229 +nv42 d699724c8deaeb4f87faf2766512eec3 p010be 3df51286ef66b53e3e283dbbab582263 p010le eadcd8241e97e35b2b47d5eb2eaea6cd p016be 3df51286ef66b53e3e283dbbab582263 diff --git a/tests/ref/fate/filter-pixfmts-null b/tests/ref/fate/filter-pixfmts-null index 0609344c2a..4675b6e832 100644 --- a/tests/ref/fate/filter-pixfmts-null +++ b/tests/ref/fate/filter-pixfmts-null @@ -53,6 +53,8 @@ monob 8b04f859fee6a0be856be184acd7a0b5 monow 54d16d2c01abfd72ecdb5e51e283937c nv12 8e24feb2c544dc26a20047a71e4c27aa nv21 335d85c9af6110f26ae9e187a82ed2cf +nv24 f30fc8d0ac40af69e119ea919a314572 +nv42 29a212f70f8780fe0eb99abcae81894d p010be 7f9842d6015026136bad60d03c035cc3 p010le c453421b9f726bdaf2bacf59a492c43b p016be 7f9842d6015026136bad60d03c035cc3 diff --git a/tests/ref/fate/filter-pixfmts-pad b/tests/ref/fate/filter-pixfmts-pad index c863d541f6..41ccec8c29 100644 --- a/tests/ref/fate/filter-pixfmts-pad +++ b/tests/ref/fate/filter-pixfmts-pad @@ -23,6 +23,8 @@ gray16le 468bda6155bdc7a7a20c34d6e599fd16 gray9le f8f3dfe31ca5fcba828285bceefdab9a nv12 381574979cb04be10c9168540310afad nv21 0fdeb2cdd56cf5a7147dc273456fa217 +nv24 193b9eadcc06ad5081609f76249b3e47 +nv42 1738ad3c31c6c16e17679f5b09ce4677 rgb0 78d500c8361ab6423a4826a00268c908 rgb24 17f9e2e0c609009acaf2175c42d4a2a5 rgba b157c90191463d34fb3ce77b36c96386 diff --git a/tests/ref/fate/filter-pixfmts-scale b/tests/ref/fate/filter-pixfmts-scale index 3226e8b53c..2f38241d87 100644 --- a/tests/ref/fate/filter-pixfmts-scale +++ b/tests/ref/fate/filter-pixfmts-scale @@ -53,6 +53,8 @@ monob f01cb0b623357387827902d9d0963435 monow 35c68b86c226d6990b2dcb573a05ff6b nv12 b118d24a3653fe66e5d9e079033aef79 nv21 c74bb1c10dbbdee8a1f682b194486c4d +nv24 2aa6e805bf6d4179ed8d7dea37d75db3 +nv42 80714d1eb2d8bcaeab3abc3124df1abd p010be 1d6726d94bf1385996a9a9840dd0e878 p010le 4b316f2b9e18972299beb73511278fa8 p016be 31e204018cbb53f8988c4e1174ea8ce9 diff --git a/tests/ref/fate/filter-pixfmts-transpose b/tests/ref/fate/filter-pixfmts-transpose index 7bcb88c38b..b2ab3b72d9 100644 --- a/tests/ref/fate/filter-pixfmts-transpose +++ b/tests/ref/fate/filter-pixfmts-transpose @@ -51,6 +51,8 @@ grayf32be 823288e1ec497bb1f22c070e502e5272 grayf32le 6e9ec0e1cac3617f3041e681afd2c575 nv12 1965e3826144686748f2f6b516fca5ba nv21 292adaf5271c5c8516b71640458c01f4 +nv24 ea9de8b47faed722ee40182f89489beb +nv42 636af6cd6a4f3ac5edc0fc3ce3c56d63 p010be ad0de2cc9bff81688b182a870fcf7000 p010le e7ff5143595021246733ce6bd0a769e8 p016be ad0de2cc9bff81688b182a870fcf7000 diff --git a/tests/ref/fate/filter-pixfmts-vflip b/tests/ref/fate/filter-pixfmts-vflip index 933ea0c815..e4d58f9f14 100644 --- a/tests/ref/fate/filter-pixfmts-vflip +++ b/tests/ref/fate/filter-pixfmts-vflip @@ -53,6 +53,8 @@ monob 7810c4857822ccfc844d78f5e803269a monow 90a947bfcd5f2261e83b577f48ec57b1 nv12 261ebe585ae2aa4e70d39a10c1679294 nv21 2909feacd27bebb080c8e0fa41795269 +nv24 334420b9d3df84499d2ca16bb66eed2b +nv42 ba4063e2795c17fea3c8a646b01fd1f5 p010be 06e9354b6e0e38ba41736352cedc0bd5 p010le fd18d322bffbf5816902c13102872e22 p016be 06e9354b6e0e38ba41736352cedc0bd5