From patchwork Sat May 11 18:31:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philip Langdale X-Patchwork-Id: 13076 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id B66AC44928B for ; Sat, 11 May 2019 21:32:15 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A153D68AA22; Sat, 11 May 2019 21:32:15 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail.overt.org (mail.overt.org [157.230.92.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 31CC368A9B4 for ; Sat, 11 May 2019 21:32:08 +0300 (EEST) Received: from authenticated-user (mail.overt.org [157.230.92.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.overt.org (Postfix) with ESMTPSA id E1E2E409B1; Sat, 11 May 2019 13:32:06 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=overt.org; s=mail; t=1557599527; bh=T3LQVF/0UNjy+451G/CWFe7i1UOumgBBzlqr4ycUQOQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Isf3VHYMCteWdmcdAWXWCL1F8cvbp5CnAzhzUcD0kbNHaU0dxGevR4drn/8FKPGmu zXzYbC9qdTWzbMRVEhO3j7g6wqYT5+Hkj9xFzQ7tcOrQx4yVVfd9rKVKl0p4YLzru5 yyaeV4kn0uGtsRfI/eRUjg06Ylo0szrd4oChszacBgUlwek5R/eVc9Kk1RV+xOOi9m R+G1un4qsRi8+HUcqDLppMYwu5lF/RLvUnWen2nrcWn+1i09tVkg8qBhrf/5uuTgBO XzgWyTUIItyUG72aNdX/buhNNt3CilNZSS5qFLa83RIa1J1smqJNJJq01GiBCoLf85 vVOEjG5LbYmfg== From: Philip Langdale To: ffmpeg-devel@ffmpeg.org Date: Sat, 11 May 2019 11:31:56 -0700 Message-Id: <20190511183157.27909-3-philipl@overt.org> In-Reply-To: <20190511183157.27909-1-philipl@overt.org> References: <20190511183157.27909-1-philipl@overt.org> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 2/3] 
swscale: Add support for NV24 and NV42 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Philip Langdale Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel"

For the sake of completeness, I'm adding NV24/NV42 support to swscale, but the specific use-case I noted when adding the pixel formats doesn't require swscale support (because it's OpenGL interop).

The implementation is pretty straightforward. Most of the existing NV12 codepaths work regardless of subsampling and are re-used as is. Where necessary, I wrote slightly different NV24 versions.

Finally, the one thing that confused me for a long time was the asm-specific x86 path that did an explicit exclusion check for NV12 and NV21. I replaced that with a semi-planar check and also updated the equivalent PPC code, which Lauri kindly checked.

Signed-off-by: Philip Langdale --- libswscale/input.c | 2 + libswscale/output.c | 6 ++- libswscale/ppc/swscale_altivec.c | 3 +- libswscale/ppc/swscale_vsx.c | 3 +- libswscale/swscale_unscaled.c | 51 ++++++++++++++++++++++++ libswscale/utils.c | 2 + libswscale/version.h | 2 +- libswscale/x86/swscale_template.c | 4 +- tests/ref/fate/filter-pixfmts-copy | 2 + tests/ref/fate/filter-pixfmts-crop | 2 + tests/ref/fate/filter-pixfmts-field | 2 + tests/ref/fate/filter-pixfmts-fieldorder | 2 + tests/ref/fate/filter-pixfmts-hflip | 2 + tests/ref/fate/filter-pixfmts-il | 2 + tests/ref/fate/filter-pixfmts-null | 2 + tests/ref/fate/filter-pixfmts-pad | 2 + tests/ref/fate/filter-pixfmts-scale | 2 + tests/ref/fate/filter-pixfmts-transpose | 2 + tests/ref/fate/filter-pixfmts-vflip | 2 + tests/ref/fate/sws-pixdesc-query | 6 +++ 20 files changed, 92 insertions(+), 9 deletions(-) diff --git a/libswscale/input.c b/libswscale/input.c index c2dc356b5d..064f8da314 100644 --- a/libswscale/input.c +++ b/libswscale/input.c @@ -1020,9 +1020,11 @@ av_cold void ff_sws_init_input_funcs(SwsContext *c) c->chrToYV12 = uyvyToUV_c; break; case AV_PIX_FMT_NV12: + case AV_PIX_FMT_NV24: c->chrToYV12 = nv12ToUV_c; break; case AV_PIX_FMT_NV21: + case AV_PIX_FMT_NV42: c->chrToYV12 = nv21ToUV_c; break; case AV_PIX_FMT_RGB8: diff --git a/libswscale/output.c b/libswscale/output.c index d3401f0cd1..26b0ff3d48 100644 --- a/libswscale/output.c +++ b/libswscale/output.c @@ -410,7 +410,8 @@ static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS const uint8_t *chrDither = c->chrDither8; int i; - if (dstFormat == AV_PIX_FMT_NV12) + if (dstFormat == AV_PIX_FMT_NV12 || + dstFormat == AV_PIX_FMT_NV24) for (i=0; isrcBpc == 8 && c->dstBpc <= 14) { c->hyScale = c->hcScale = hScale_real_altivec; } - if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && - dstFormat != AV_PIX_FMT_NV12 && dstFormat != AV_PIX_FMT_NV21 && + if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && !isSemiPlanarYUV(dstFormat) && 
dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE && !c->needAlpha) { c->yuv2planeX = yuv2planeX_altivec; diff --git a/libswscale/ppc/swscale_vsx.c b/libswscale/ppc/swscale_vsx.c index a617f76741..75dee5ea58 100644 --- a/libswscale/ppc/swscale_vsx.c +++ b/libswscale/ppc/swscale_vsx.c @@ -2096,8 +2096,7 @@ av_cold void ff_sws_init_swscale_vsx(SwsContext *c) : hScale16To15_vsx; } } - if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && - dstFormat != AV_PIX_FMT_NV12 && dstFormat != AV_PIX_FMT_NV21 && + if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && !isSemiPlanarYUV(dstFormat) && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE && !c->needAlpha) { c->yuv2planeX = yuv2planeX_vsx; diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index be04a236d8..d7cc0bd4c5 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -180,6 +180,47 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int planarToNv24Wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam[], + int dstStride[]) +{ + uint8_t *dst = dstParam[1] + dstStride[1] * srcSliceY; + + copyPlane(src[0], srcStride[0], srcSliceY, srcSliceH, c->srcW, + dstParam[0], dstStride[0]); + + if (c->dstFormat == AV_PIX_FMT_NV24) + interleaveBytes(src[1], src[2], dst, c->chrSrcW, (srcSliceH + 1), + srcStride[1], srcStride[2], dstStride[1]); + else + interleaveBytes(src[2], src[1], dst, c->chrSrcW, (srcSliceH + 1), + srcStride[2], srcStride[1], dstStride[1]); + + return srcSliceH; +} + +static int nv24ToPlanarWrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam[], + int dstStride[]) +{ + uint8_t *dst1 = dstParam[1] + dstStride[1] * srcSliceY; + uint8_t *dst2 = dstParam[2] + dstStride[2] * srcSliceY; + + copyPlane(src[0], srcStride[0], srcSliceY, srcSliceH, c->srcW, + dstParam[0], 
dstStride[0]); + + if (c->srcFormat == AV_PIX_FMT_NV24) + deinterleaveBytes(src[1], dst1, dst2, c->chrSrcW, (srcSliceH + 1), + srcStride[1], dstStride[1], dstStride[2]); + else + deinterleaveBytes(src[1], dst2, dst1, c->chrSrcW, (srcSliceH + 1), + srcStride[1], dstStride[2], dstStride[1]); + + return srcSliceH; +} + static int planarToP01xWrapper(SwsContext *c, const uint8_t *src8[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dstParam8[], @@ -1872,11 +1913,21 @@ void ff_get_unscaled_swscale(SwsContext *c) (dstFormat == AV_PIX_FMT_NV12 || dstFormat == AV_PIX_FMT_NV21)) { c->swscale = planarToNv12Wrapper; } + /* yv24_to_nv24 */ + if ((srcFormat == AV_PIX_FMT_YUV444P || srcFormat == AV_PIX_FMT_YUVA444P) && + (dstFormat == AV_PIX_FMT_NV24 || dstFormat == AV_PIX_FMT_NV42)) { + c->swscale = planarToNv24Wrapper; + } /* nv12_to_yv12 */ if (dstFormat == AV_PIX_FMT_YUV420P && (srcFormat == AV_PIX_FMT_NV12 || srcFormat == AV_PIX_FMT_NV21)) { c->swscale = nv12ToPlanarWrapper; } + /* nv24_to_yv24 */ + if (dstFormat == AV_PIX_FMT_YUV444P && + (srcFormat == AV_PIX_FMT_NV24 || srcFormat == AV_PIX_FMT_NV42)) { + c->swscale = nv24ToPlanarWrapper; + } /* yuv2bgr */ if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUV422P || srcFormat == AV_PIX_FMT_YUVA420P) && isAnyRGB(dstFormat) && diff --git a/libswscale/utils.c b/libswscale/utils.c index df68bcc0d9..1b1f779532 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -264,6 +264,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = { [AV_PIX_FMT_YUVA422P12LE] = { 1, 1 }, [AV_PIX_FMT_YUVA444P12BE] = { 1, 1 }, [AV_PIX_FMT_YUVA444P12LE] = { 1, 1 }, + [AV_PIX_FMT_NV24] = { 1, 1 }, + [AV_PIX_FMT_NV42] = { 1, 1 }, }; int sws_isSupportedInput(enum AVPixelFormat pix_fmt) diff --git a/libswscale/version.h b/libswscale/version.h index 0e28a76e64..891c76d915 100644 --- a/libswscale/version.h +++ b/libswscale/version.h @@ -28,7 +28,7 @@ #define LIBSWSCALE_VERSION_MAJOR 5 #define LIBSWSCALE_VERSION_MINOR 4 
-#define LIBSWSCALE_VERSION_MICRO 100 +#define LIBSWSCALE_VERSION_MICRO 101 #define LIBSWSCALE_VERSION_INT AV_VERSION_INT(LIBSWSCALE_VERSION_MAJOR, \ LIBSWSCALE_VERSION_MINOR, \ diff --git a/libswscale/x86/swscale_template.c b/libswscale/x86/swscale_template.c index 7c30470679..823056c2ea 100644 --- a/libswscale/x86/swscale_template.c +++ b/libswscale/x86/swscale_template.c @@ -1499,8 +1499,8 @@ static av_cold void RENAME(sws_init_swscale)(SwsContext *c) enum AVPixelFormat dstFormat = c->dstFormat; c->use_mmx_vfilter= 0; - if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && dstFormat != AV_PIX_FMT_NV12 - && dstFormat != AV_PIX_FMT_NV21 && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE + if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && !isSemiPlanarYUV(dstFormat) + && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE && !(c->flags & SWS_BITEXACT)) { if (c->flags & SWS_ACCURATE_RND) { if (!(c->flags & SWS_FULL_CHR_H_INT)) { diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy index 0609344c2a..4675b6e832 100644 --- a/tests/ref/fate/filter-pixfmts-copy +++ b/tests/ref/fate/filter-pixfmts-copy @@ -53,6 +53,8 @@ monob 8b04f859fee6a0be856be184acd7a0b5 monow 54d16d2c01abfd72ecdb5e51e283937c nv12 8e24feb2c544dc26a20047a71e4c27aa nv21 335d85c9af6110f26ae9e187a82ed2cf +nv24 f30fc8d0ac40af69e119ea919a314572 +nv42 29a212f70f8780fe0eb99abcae81894d p010be 7f9842d6015026136bad60d03c035cc3 p010le c453421b9f726bdaf2bacf59a492c43b p016be 7f9842d6015026136bad60d03c035cc3 diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop index 4e4b6e4a63..4b9f67cf6e 100644 --- a/tests/ref/fate/filter-pixfmts-crop +++ b/tests/ref/fate/filter-pixfmts-crop @@ -51,6 +51,8 @@ grayf32be cf40ec06a8abe54852b7f85a00549eec grayf32le b672526c9da9c8959ab881f242f6890a nv12 92cda427f794374731ec0321ee00caac nv21 1bcfc197f4fb95de85ba58182d8d2f69 +nv24 514c8f12082f0737e558778cbe7de258 +nv42 
ece9baae1c5de579dac2c66a89e08ef3 p010be 8b2de2eb6b099bbf355bfc55a0694ddc p010le 373b50c766dfd0a8e79c9a73246d803a p016be 8b2de2eb6b099bbf355bfc55a0694ddc diff --git a/tests/ref/fate/filter-pixfmts-field b/tests/ref/fate/filter-pixfmts-field index d59c982880..059347e3e2 100644 --- a/tests/ref/fate/filter-pixfmts-field +++ b/tests/ref/fate/filter-pixfmts-field @@ -53,6 +53,8 @@ monob 2129cc72a484d7e10a44de9117aa9f80 monow 03d783611d265cae78293f88ea126ea1 nv12 16f7a46708ef25ebd0b72e47920cc11e nv21 7294574037cc7f9373ef5695d8ebe809 +nv24 3b100fb527b64ee2b2d7120da573faf5 +nv42 1841ce853152d86b27c130f319ea0db2 p010be a0311a09bba7383553267d2b3b9c075e p010le ee09a18aefa3ebe97715b3a7312cb8ff p016be a0311a09bba7383553267d2b3b9c075e diff --git a/tests/ref/fate/filter-pixfmts-fieldorder b/tests/ref/fate/filter-pixfmts-fieldorder index 1996649e10..066b944513 100644 --- a/tests/ref/fate/filter-pixfmts-fieldorder +++ b/tests/ref/fate/filter-pixfmts-fieldorder @@ -49,6 +49,8 @@ gray9be ec877f5bcf0ea275a6f36c12cc9adf11 gray9le fba944fde7923d5089f4f52d12988b9e grayf32be 1aa7960131f880c54fe3c77f13448674 grayf32le 4029ac9d197f255794c1b9e416520fc7 +nv24 4fdbef26042c77f012df114e666efdb2 +nv42 59608290fece913e6b7d61edf581a529 rgb0 2e3d8c91c7a83d451593dfd06607ff39 rgb24 b82577f8215d3dc2681be60f1da247af rgb444be 1c3afc3a0c53c51139c76504f59bb1f4 diff --git a/tests/ref/fate/filter-pixfmts-hflip b/tests/ref/fate/filter-pixfmts-hflip index f171a95fa3..100dd708c3 100644 --- a/tests/ref/fate/filter-pixfmts-hflip +++ b/tests/ref/fate/filter-pixfmts-hflip @@ -51,6 +51,8 @@ grayf32be a69add7bbf892a71fe81b3b75982dbe2 grayf32le 4563e176a35dc8a8a07e0829fad5eb88 nv12 801e58f1be5fd0b5bc4bf007c604b0b4 nv21 9f10dfff8963dc327d3395af21f0554f +nv24 f0c5b2f42970f8d4003621d8857a872f +nv42 4dcf9aec82b110712b396a8b365dcb13 p010be 744b13e44d39e1ff7588983fa03e0101 p010le a50b160346ab94f55a425065b57006f0 p016be 744b13e44d39e1ff7588983fa03e0101 diff --git a/tests/ref/fate/filter-pixfmts-il 
b/tests/ref/fate/filter-pixfmts-il index 0839a77ed2..979eb0ce3a 100644 --- a/tests/ref/fate/filter-pixfmts-il +++ b/tests/ref/fate/filter-pixfmts-il @@ -53,6 +53,8 @@ monob faba75df28033ba7ce3d82ff2a99ee68 monow 6e9cfb8d3a344c5f0c3e1d5e1297e580 nv12 3c3ba9b1b4c4dfff09c26f71b51dd146 nv21 ab586d8781246b5a32d8760a61db9797 +nv24 554153c71d142e3fd8e40b7dcaaec229 +nv42 d699724c8deaeb4f87faf2766512eec3 p010be 3df51286ef66b53e3e283dbbab582263 p010le eadcd8241e97e35b2b47d5eb2eaea6cd p016be 3df51286ef66b53e3e283dbbab582263 diff --git a/tests/ref/fate/filter-pixfmts-null b/tests/ref/fate/filter-pixfmts-null index 0609344c2a..4675b6e832 100644 --- a/tests/ref/fate/filter-pixfmts-null +++ b/tests/ref/fate/filter-pixfmts-null @@ -53,6 +53,8 @@ monob 8b04f859fee6a0be856be184acd7a0b5 monow 54d16d2c01abfd72ecdb5e51e283937c nv12 8e24feb2c544dc26a20047a71e4c27aa nv21 335d85c9af6110f26ae9e187a82ed2cf +nv24 f30fc8d0ac40af69e119ea919a314572 +nv42 29a212f70f8780fe0eb99abcae81894d p010be 7f9842d6015026136bad60d03c035cc3 p010le c453421b9f726bdaf2bacf59a492c43b p016be 7f9842d6015026136bad60d03c035cc3 diff --git a/tests/ref/fate/filter-pixfmts-pad b/tests/ref/fate/filter-pixfmts-pad index c863d541f6..41ccec8c29 100644 --- a/tests/ref/fate/filter-pixfmts-pad +++ b/tests/ref/fate/filter-pixfmts-pad @@ -23,6 +23,8 @@ gray16le 468bda6155bdc7a7a20c34d6e599fd16 gray9le f8f3dfe31ca5fcba828285bceefdab9a nv12 381574979cb04be10c9168540310afad nv21 0fdeb2cdd56cf5a7147dc273456fa217 +nv24 193b9eadcc06ad5081609f76249b3e47 +nv42 1738ad3c31c6c16e17679f5b09ce4677 rgb0 78d500c8361ab6423a4826a00268c908 rgb24 17f9e2e0c609009acaf2175c42d4a2a5 rgba b157c90191463d34fb3ce77b36c96386 diff --git a/tests/ref/fate/filter-pixfmts-scale b/tests/ref/fate/filter-pixfmts-scale index 3226e8b53c..2f38241d87 100644 --- a/tests/ref/fate/filter-pixfmts-scale +++ b/tests/ref/fate/filter-pixfmts-scale @@ -53,6 +53,8 @@ monob f01cb0b623357387827902d9d0963435 monow 35c68b86c226d6990b2dcb573a05ff6b nv12 
b118d24a3653fe66e5d9e079033aef79 nv21 c74bb1c10dbbdee8a1f682b194486c4d +nv24 2aa6e805bf6d4179ed8d7dea37d75db3 +nv42 80714d1eb2d8bcaeab3abc3124df1abd p010be 1d6726d94bf1385996a9a9840dd0e878 p010le 4b316f2b9e18972299beb73511278fa8 p016be 31e204018cbb53f8988c4e1174ea8ce9 diff --git a/tests/ref/fate/filter-pixfmts-transpose b/tests/ref/fate/filter-pixfmts-transpose index 7bcb88c38b..b2ab3b72d9 100644 --- a/tests/ref/fate/filter-pixfmts-transpose +++ b/tests/ref/fate/filter-pixfmts-transpose @@ -51,6 +51,8 @@ grayf32be 823288e1ec497bb1f22c070e502e5272 grayf32le 6e9ec0e1cac3617f3041e681afd2c575 nv12 1965e3826144686748f2f6b516fca5ba nv21 292adaf5271c5c8516b71640458c01f4 +nv24 ea9de8b47faed722ee40182f89489beb +nv42 636af6cd6a4f3ac5edc0fc3ce3c56d63 p010be ad0de2cc9bff81688b182a870fcf7000 p010le e7ff5143595021246733ce6bd0a769e8 p016be ad0de2cc9bff81688b182a870fcf7000 diff --git a/tests/ref/fate/filter-pixfmts-vflip b/tests/ref/fate/filter-pixfmts-vflip index 933ea0c815..e4d58f9f14 100644 --- a/tests/ref/fate/filter-pixfmts-vflip +++ b/tests/ref/fate/filter-pixfmts-vflip @@ -53,6 +53,8 @@ monob 7810c4857822ccfc844d78f5e803269a monow 90a947bfcd5f2261e83b577f48ec57b1 nv12 261ebe585ae2aa4e70d39a10c1679294 nv21 2909feacd27bebb080c8e0fa41795269 +nv24 334420b9d3df84499d2ca16bb66eed2b +nv42 ba4063e2795c17fea3c8a646b01fd1f5 p010be 06e9354b6e0e38ba41736352cedc0bd5 p010le fd18d322bffbf5816902c13102872e22 p016be 06e9354b6e0e38ba41736352cedc0bd5 diff --git a/tests/ref/fate/sws-pixdesc-query b/tests/ref/fate/sws-pixdesc-query index 6c41a86e1e..bc8147e3c7 100644 --- a/tests/ref/fate/sws-pixdesc-query +++ b/tests/ref/fate/sws-pixdesc-query @@ -178,6 +178,8 @@ isYUV: nv20be nv20le nv21 + nv24 + nv42 p010be p010le p016be @@ -268,6 +270,8 @@ isPlanarYUV: nv20be nv20le nv21 + nv24 + nv42 p010be p010le p016be @@ -703,6 +707,8 @@ Planar: nv20be nv20le nv21 + nv24 + nv42 p010be p010le p016be
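
For readers unfamiliar with the layout: NV24 stores a full-resolution Y plane followed by a single chroma plane of interleaved U,V byte pairs at full resolution (4:4:4), and NV42 is the same with the pair order swapped to V,U. The following is only a minimal standalone sketch of that interleave/deinterleave step, not the code from this patch. The function names and signatures are hypothetical and are not libswscale API; the patch itself goes through the existing interleaveBytes/deinterleaveBytes helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not libswscale API): interleave two 8-bit chroma
 * planes of width x height samples into one semi-planar chroma plane.
 * Pass (U, V) as (first, second) for NV24, or (V, U) for NV42.
 * Strides are in bytes; each dst row must hold 2 * width bytes.
 */
static void interleave_chroma_444(const uint8_t *first, ptrdiff_t first_stride,
                                  const uint8_t *second, ptrdiff_t second_stride,
                                  uint8_t *dst, ptrdiff_t dst_stride,
                                  int width, int height)
{
    assert(width >= 0 && height >= 0);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            dst[2 * x]     = first[x];
            dst[2 * x + 1] = second[x];
        }
        first  += first_stride;
        second += second_stride;
        dst    += dst_stride;
    }
}

/*
 * Hypothetical inverse: split a semi-planar chroma plane back into two
 * planes. Pass (U, V) as (first, second) for NV24, or (V, U) for NV42.
 */
static void deinterleave_chroma_444(const uint8_t *src, ptrdiff_t src_stride,
                                    uint8_t *first, ptrdiff_t first_stride,
                                    uint8_t *second, ptrdiff_t second_stride,
                                    int width, int height)
{
    assert(width >= 0 && height >= 0);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            first[x]  = src[2 * x];
            second[x] = src[2 * x + 1];
        }
        src    += src_stride;
        first  += first_stride;
        second += second_stride;
    }
}
```

Because there is no chroma subsampling in 4:4:4, the chroma row count in this sketch equals the luma row count of the slice, and the only difference between the two formats is which plane is passed first.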