From patchwork Thu Jan 18 14:23:57 2024
X-Patchwork-Submitter: Wu Jianhua
X-Patchwork-Id: 45647
From: toqsxw@outlook.com
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 18 Jan 2024 22:23:57 +0800
X-Mailer: git-send-email 2.34.1
X-Microsoft-Original-Message-ID: <20240118142404.68192-1-toqsxw@outlook.com>
Subject: [FFmpeg-devel] [PATCH 1/8] avcodec/vvc/vvc_inter_template: move put/put_luma/put_chroma template to h2656_inter_template.c
List-Id: FFmpeg development discussions and patches
Cc: Wu Jianhua

From: Wu Jianhua

Signed-off-by: Wu Jianhua
---
 libavcodec/h26x/h2656_inter_template.c | 577 +++++++++++++++++++++++++
 libavcodec/vvc/vvc_inter_template.c    | 559 +-----------------------
 2 files changed, 578 insertions(+), 558 deletions(-)
 create mode 100644 libavcodec/h26x/h2656_inter_template.c

diff --git a/libavcodec/h26x/h2656_inter_template.c b/libavcodec/h26x/h2656_inter_template.c
new file mode 100644
index 0000000000..864f6c7e7d
--- /dev/null
+++ b/libavcodec/h26x/h2656_inter_template.c
@@ -0,0 +1,577 @@
+/*
+ * inter prediction template for HEVC/VVC
+ *
+ * Copyright (C) 2022 Nuo Mi
+ * Copyright (C) 2024 Wu Jianhua
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#define CHROMA_EXTRA_BEFORE 1
+#define CHROMA_EXTRA 3
+#define LUMA_EXTRA_BEFORE 3
+#define LUMA_EXTRA 7
+
+static void FUNC(put_pixels)(int16_t *dst,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = src[x] << (14 - BIT_DEPTH);
+        src += src_stride;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_uni_pixels)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride, const int height,
+    const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+
+    for (int y = 0; y < height; y++) {
+        memcpy(dst, src, width * sizeof(pixel));
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_w_pixels)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride, const int height,
+    const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf,
+    const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int shift = denom + 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+    const int ox = _ox * (1 << (BIT_DEPTH - 8));
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++) {
+            const int v = (src[x] << (14 - BIT_DEPTH));
+            dst[x] = av_clip_pixel(((v * wx + offset) >> shift) + ox);
+        }
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+#define LUMA_FILTER(src, stride) \
+    (filter[0] * src[x - 3 * stride] + \
+     filter[1] * src[x - 2 * stride] + \
+     filter[2] * src[x - stride] + \
+     filter[3] * src[x ] + \
+     filter[4] * src[x + stride] + \
+     filter[5] * src[x + 2 * stride] + \
+     filter[6] * src[x + 3 * stride] + \
+     filter[7] * src[x + 4 * stride])
+
+static void FUNC(put_luma_h)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel*)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_luma_v)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (pixel*)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = vf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = LUMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_luma_hv)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    int16_t tmp_array[(MAX_PB_SIZE + LUMA_EXTRA) * MAX_PB_SIZE];
+    int16_t *tmp = tmp_array;
+    const pixel *src = (const pixel*)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+
+    src -= LUMA_EXTRA_BEFORE * src_stride;
+    for (int y = 0; y < height + LUMA_EXTRA; y++) {
+        for (int x = 0; x < width; x++)
+            tmp[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        tmp += MAX_PB_SIZE;
+    }
+
+    tmp = tmp_array + LUMA_EXTRA_BEFORE * MAX_PB_SIZE;
+    filter = vf;
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = LUMA_FILTER(tmp, MAX_PB_SIZE) >> 6;
+        tmp += MAX_PB_SIZE;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_uni_luma_h)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel*)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int shift = 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++) {
+            const int val = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+            dst[x] = av_clip_pixel((val + offset) >> shift);
+        }
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_luma_v)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+
+    const pixel *src = (const pixel*)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = vf;
+    const int shift = 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++) {
+            const int val = LUMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8);
+            dst[x] = av_clip_pixel((val + offset) >> shift);
+        }
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_luma_hv)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    int16_t tmp_array[(MAX_PB_SIZE + LUMA_EXTRA) * MAX_PB_SIZE];
+    int16_t *tmp = tmp_array;
+    const pixel *src = (const pixel*)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int shift = 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    src -= LUMA_EXTRA_BEFORE * src_stride;
+    for (int y = 0; y < height + LUMA_EXTRA; y++) {
+        for (int x = 0; x < width; x++)
+            tmp[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        tmp += MAX_PB_SIZE;
+    }
+
+    tmp = tmp_array + LUMA_EXTRA_BEFORE * MAX_PB_SIZE;
+    filter = vf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++) {
+            const int val = LUMA_FILTER(tmp, MAX_PB_SIZE) >> 6;
+            dst[x] = av_clip_pixel((val + offset) >> shift);
+        }
+        tmp += MAX_PB_SIZE;
+        dst += dst_stride;
+    }
+
+}
+
+static void FUNC(put_uni_luma_w_h)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride, int height,
+    const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf,
+    const int width)
+{
+    const pixel *src = (const pixel*)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int ox = _ox * (1 << (BIT_DEPTH - 8));
+    const int shift = denom + 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel((((LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox);
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_luma_w_v)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride, const int height,
+    const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf,
+    const int width)
+{
+    const pixel *src = (const pixel*)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = vf;
+    const int ox = _ox * (1 << (BIT_DEPTH - 8));
+    const int shift = denom + 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel((((LUMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox);
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_luma_w_hv)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride, const int height, const int denom,
+    const int wx, const int _ox, const int8_t *hf, const int8_t *vf, const int width)
+{
+    int16_t tmp_array[(MAX_PB_SIZE + LUMA_EXTRA) * MAX_PB_SIZE];
+    int16_t *tmp = tmp_array;
+    const pixel *src = (const pixel*)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int ox = _ox * (1 << (BIT_DEPTH - 8));
+    const int shift = denom + 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    src -= LUMA_EXTRA_BEFORE * src_stride;
+    for (int y = 0; y < height + LUMA_EXTRA; y++) {
+        for (int x = 0; x < width; x++)
+            tmp[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        tmp += MAX_PB_SIZE;
+    }
+
+    tmp = tmp_array + LUMA_EXTRA_BEFORE * MAX_PB_SIZE;
+    filter = vf;
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel((((LUMA_FILTER(tmp, MAX_PB_SIZE) >> 6) * wx + offset) >> shift) + ox);
+        tmp += MAX_PB_SIZE;
+        dst += dst_stride;
+    }
+}
+
+#define CHROMA_FILTER(src, stride) \
+    (filter[0] * src[x - stride] + \
+     filter[1] * src[x] + \
+     filter[2] * src[x + stride] + \
+     filter[3] * src[x + 2 * stride])
+
+static void FUNC(put_chroma_h)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_chroma_v)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = vf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = CHROMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_chroma_hv)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    int16_t tmp_array[(MAX_PB_SIZE + CHROMA_EXTRA) * MAX_PB_SIZE];
+    int16_t *tmp = tmp_array;
+    const pixel *src = (const pixel *)_src;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+
+    src -= CHROMA_EXTRA_BEFORE * src_stride;
+
+    for (int y = 0; y < height + CHROMA_EXTRA; y++) {
+        for (int x = 0; x < width; x++)
+            tmp[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        tmp += MAX_PB_SIZE;
+    }
+
+    tmp = tmp_array + CHROMA_EXTRA_BEFORE * MAX_PB_SIZE;
+    filter = vf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = CHROMA_FILTER(tmp, MAX_PB_SIZE) >> 6;
+        tmp += MAX_PB_SIZE;
+        dst += MAX_PB_SIZE;
+    }
+}
+
+static void FUNC(put_uni_chroma_h)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int shift = 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel(((CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8)) + offset) >> shift);
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_chroma_v)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = vf;
+    const int shift = 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel(((CHROMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8)) + offset) >> shift);
+        src += src_stride;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_chroma_hv)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride,
+    const int height, const int8_t *hf, const int8_t *vf, const int width)
+{
+    int16_t tmp_array[(MAX_PB_SIZE + CHROMA_EXTRA) * MAX_PB_SIZE];
+    int16_t *tmp = tmp_array;
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int shift = 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    src -= CHROMA_EXTRA_BEFORE * src_stride;
+
+    for (int y = 0; y < height + CHROMA_EXTRA; y++) {
+        for (int x = 0; x < width; x++)
+            tmp[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        tmp += MAX_PB_SIZE;
+    }
+
+    tmp = tmp_array + CHROMA_EXTRA_BEFORE * MAX_PB_SIZE;
+    filter = vf;
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel(((CHROMA_FILTER(tmp, MAX_PB_SIZE) >> 6) + offset) >> shift);
+        tmp += MAX_PB_SIZE;
+        dst += dst_stride;
+    }
+}
+
+static void FUNC(put_uni_chroma_w_h)(uint8_t *_dst, ptrdiff_t _dst_stride,
+    const uint8_t *_src, ptrdiff_t _src_stride, int height, int denom, int wx, int ox,
+    const int8_t *hf, const int8_t *vf, int width)
+{
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int shift = denom + 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    ox = ox * (1 << (BIT_DEPTH - 8));
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++) {
+            dst[x] = av_clip_pixel((((CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox);
+        }
+        dst += dst_stride;
+        src += src_stride;
+    }
+}
+
+static void FUNC(put_uni_chroma_w_v)(uint8_t *_dst, const ptrdiff_t _dst_stride,
+    const uint8_t *_src, const ptrdiff_t _src_stride, const int height,
+    const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf,
+    const int width)
+{
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = vf;
+    const int shift = denom + 14 - BIT_DEPTH;
+    const int ox = _ox * (1 << (BIT_DEPTH - 8));
+#if BIT_DEPTH < 14
+    int offset = 1 << (shift - 1);
+#else
+    int offset = 0;
+#endif
+
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++) {
+            dst[x] = av_clip_pixel((((CHROMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox);
+        }
+        dst += dst_stride;
+        src += src_stride;
+    }
+}
+
+static void FUNC(put_uni_chroma_w_hv)(uint8_t *_dst, ptrdiff_t _dst_stride,
+    const uint8_t *_src, ptrdiff_t _src_stride, int height, int denom, int wx, int ox,
+    const int8_t *hf, const int8_t *vf, int width)
+{
+    int16_t tmp_array[(MAX_PB_SIZE + CHROMA_EXTRA) * MAX_PB_SIZE];
+    int16_t *tmp = tmp_array;
+    const pixel *src = (const pixel *)_src;
+    pixel *dst = (pixel *)_dst;
+    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
+    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
+    const int8_t *filter = hf;
+    const int shift = denom + 14 - BIT_DEPTH;
+#if BIT_DEPTH < 14
+    const int offset = 1 << (shift - 1);
+#else
+    const int offset = 0;
+#endif
+
+    src -= CHROMA_EXTRA_BEFORE * src_stride;
+
+    for (int y = 0; y < height + CHROMA_EXTRA; y++) {
+        for (int x = 0; x < width; x++)
+            tmp[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
+        src += src_stride;
+        tmp += MAX_PB_SIZE;
+    }
+
+    tmp = tmp_array + CHROMA_EXTRA_BEFORE * MAX_PB_SIZE;
+    filter = vf;
+
+    ox = ox * (1 << (BIT_DEPTH - 8));
+    for (int y = 0; y < height; y++) {
+        for (int x = 0; x < width; x++)
+            dst[x] = av_clip_pixel((((CHROMA_FILTER(tmp, MAX_PB_SIZE) >> 6) * wx + offset) >> shift) + ox);
+        tmp += MAX_PB_SIZE;
+        dst += dst_stride;
+    }
+}
diff --git a/libavcodec/vvc/vvc_inter_template.c b/libavcodec/vvc/vvc_inter_template.c
index b67b66a2dc..9a9284a9a5 100644
--- a/libavcodec/vvc/vvc_inter_template.c
+++ b/libavcodec/vvc/vvc_inter_template.c
@@ -20,564 +20,7 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-////////////////////////////////////////////////////////////////////////////////
-//
-////////////////////////////////////////////////////////////////////////////////
-static void FUNC(put_pixels)(int16_t *dst,
-    const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-    const pixel *src = (const pixel *)_src;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++)
-            dst[x] = src[x] << (14 - BIT_DEPTH);
-        src += src_stride;
-        dst += MAX_PB_SIZE;
-    }
-}
-
-static void FUNC(put_uni_pixels)(uint8_t *_dst, const ptrdiff_t _dst_stride,
-    const uint8_t *_src, const ptrdiff_t _src_stride, const int height,
-    const int8_t *hf, const int8_t *vf, const int width)
-{
-    const pixel *src = (const pixel *)_src;
-    pixel *dst = (pixel *)_dst;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
-
-    for (int y = 0; y < height; y++) {
-        memcpy(dst, src, width * sizeof(pixel));
-        src += src_stride;
-        dst += dst_stride;
-    }
-}
-
-static void FUNC(put_uni_w_pixels)(uint8_t *_dst, const ptrdiff_t _dst_stride,
-    const uint8_t *_src, const ptrdiff_t _src_stride, const int height,
-    const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf,
-    const int width)
-{
-    const pixel *src = (const pixel *)_src;
-    pixel *dst = (pixel *)_dst;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
-    const int shift = denom + 14 - BIT_DEPTH;
-#if BIT_DEPTH < 14
-    const int offset = 1 << (shift - 1);
-#else
-    const int offset = 0;
-#endif
-    const int ox = _ox * (1 << (BIT_DEPTH - 8));
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++) {
-            const int v = (src[x] << (14 - BIT_DEPTH));
-            dst[x] = av_clip_pixel(((v * wx + offset) >> shift) + ox);
-        }
-        src += src_stride;
-        dst += dst_stride;
-    }
-}
-
-////////////////////////////////////////////////////////////////////////////////
-//
-////////////////////////////////////////////////////////////////////////////////
-#define LUMA_FILTER(src, stride) \
-    (filter[0] * src[x - 3 * stride] + \
-     filter[1] * src[x - 2 * stride] + \
-     filter[2] * src[x - stride] + \
-     filter[3] * src[x ] + \
-     filter[4] * src[x + stride] + \
-     filter[5] * src[x + 2 * stride] + \
-     filter[6] * src[x + 3 * stride] + \
-     filter[7] * src[x + 4 * stride])
-
-static void FUNC(put_luma_h)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-    const pixel *src = (const pixel*)_src;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const int8_t *filter = hf;
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++)
-            dst[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
-        src += src_stride;
-        dst += MAX_PB_SIZE;
-    }
-}
-
-static void FUNC(put_luma_v)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-    const pixel *src = (pixel*)_src;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const int8_t *filter = vf;
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++)
-            dst[x] = LUMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8);
-        src += src_stride;
-        dst += MAX_PB_SIZE;
-    }
-}
-
-static void FUNC(put_luma_hv)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-    int16_t tmp_array[(MAX_PB_SIZE + LUMA_EXTRA) * MAX_PB_SIZE];
-    int16_t *tmp = tmp_array;
-    const pixel *src = (const pixel*)_src;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const int8_t *filter = hf;
-
-    src -= LUMA_EXTRA_BEFORE * src_stride;
-    for (int y = 0; y < height + LUMA_EXTRA; y++) {
-        for (int x = 0; x < width; x++)
-            tmp[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
-        src += src_stride;
-        tmp += MAX_PB_SIZE;
-    }
-
-    tmp = tmp_array + LUMA_EXTRA_BEFORE * MAX_PB_SIZE;
-    filter = vf;
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++)
-            dst[x] = LUMA_FILTER(tmp, MAX_PB_SIZE) >> 6;
-        tmp += MAX_PB_SIZE;
-        dst += MAX_PB_SIZE;
-    }
-}
-
-static void FUNC(put_uni_luma_h)(uint8_t *_dst, const ptrdiff_t _dst_stride,
-    const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-    const pixel *src = (const pixel*)_src;
-    pixel *dst = (pixel *)_dst;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
-    const int8_t *filter = hf;
-    const int shift = 14 - BIT_DEPTH;
-#if BIT_DEPTH < 14
-    const int offset = 1 << (shift - 1);
-#else
-    const int offset = 0;
-#endif
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++) {
-            const int val = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
-            dst[x] = av_clip_pixel((val + offset) >> shift);
-        }
-        src += src_stride;
-        dst += dst_stride;
-    }
-}
-
-static void FUNC(put_uni_luma_v)(uint8_t *_dst, const ptrdiff_t _dst_stride,
-    const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-
-    const pixel *src = (const pixel*)_src;
-    pixel *dst = (pixel *)_dst;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
-    const int8_t *filter = vf;
-    const int shift = 14 - BIT_DEPTH;
-#if BIT_DEPTH < 14
-    const int offset = 1 << (shift - 1);
-#else
-    const int offset = 0;
-#endif
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++) {
-            const int val = LUMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8);
-            dst[x] = av_clip_pixel((val + offset) >> shift);
-        }
-        src += src_stride;
-        dst += dst_stride;
-    }
-}
-
-static void FUNC(put_uni_luma_hv)(uint8_t *_dst, const ptrdiff_t _dst_stride,
-    const uint8_t *_src, const ptrdiff_t _src_stride,
-    const int height, const int8_t *hf, const int8_t *vf, const int width)
-{
-    int16_t tmp_array[(MAX_PB_SIZE + LUMA_EXTRA) * MAX_PB_SIZE];
-    int16_t *tmp = tmp_array;
-    const pixel *src = (const pixel*)_src;
-    pixel *dst = (pixel *)_dst;
-    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const int8_t *filter = hf;
-    const int shift = 14 - BIT_DEPTH;
-#if BIT_DEPTH < 14
-    const int offset = 1 << (shift - 1);
-#else
-    const int offset = 0;
-#endif
-
-    src -= LUMA_EXTRA_BEFORE * src_stride;
-    for (int y = 0; y < height + LUMA_EXTRA; y++) {
-        for (int x = 0; x < width; x++)
-            tmp[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8);
-        src += src_stride;
-        tmp += MAX_PB_SIZE;
-    }
-
-    tmp = tmp_array + LUMA_EXTRA_BEFORE * MAX_PB_SIZE;
-    filter = vf;
-
-    for (int y = 0; y < height; y++) {
-        for (int x = 0; x < width; x++) {
-            const int val = LUMA_FILTER(tmp, MAX_PB_SIZE) >> 6;
-            dst[x] = av_clip_pixel((val + offset) >> shift);
-        }
-        tmp += MAX_PB_SIZE;
-        dst += dst_stride;
-    }
-
-}
-
-static void FUNC(put_uni_luma_w_h)(uint8_t *_dst, const ptrdiff_t _dst_stride,
-    const uint8_t *_src, const ptrdiff_t _src_stride, int height,
-    const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf,
-    const int width)
-{
-    const pixel *src = (const pixel*)_src;
-    pixel *dst = (pixel *)_dst;
-    const ptrdiff_t src_stride = _src_stride / sizeof(pixel);
-    const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel);
-    const int8_t *filter = hf;
-    const int ox = _ox * (1 << (BIT_DEPTH - 8));
-    const int shift = denom + 14 - BIT_DEPTH;
-#if BIT_DEPTH < 14
-    const int offset = 1 << (shift - 1);
-#else
-    const int offset = 0;
-#endif
- - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = av_clip_pixel((((LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox); - src += src_stride; - dst += dst_stride; - } -} - -static void FUNC(put_uni_luma_w_v)(uint8_t *_dst, const ptrdiff_t _dst_stride, - const uint8_t *_src, const ptrdiff_t _src_stride, const int height, - const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf, - const int width) -{ - const pixel *src = (const pixel*)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = vf; - const int ox = _ox * (1 << (BIT_DEPTH - 8)); - const int shift = denom + 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = av_clip_pixel((((LUMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox); - src += src_stride; - dst += dst_stride; - } -} - -static void FUNC(put_uni_luma_w_hv)(uint8_t *_dst, const ptrdiff_t _dst_stride, - const uint8_t *_src, const ptrdiff_t _src_stride, const int height, const int denom, - const int wx, const int _ox, const int8_t *hf, const int8_t *vf, const int width) -{ - int16_t tmp_array[(MAX_PB_SIZE + LUMA_EXTRA) * MAX_PB_SIZE]; - int16_t *tmp = tmp_array; - const pixel *src = (const pixel*)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = hf; - const int ox = _ox * (1 << (BIT_DEPTH - 8)); - const int shift = denom + 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - src -= LUMA_EXTRA_BEFORE * src_stride; - for (int y = 0; y < height + LUMA_EXTRA; y++) { - for (int x = 0; x < width; 
x++) - tmp[x] = LUMA_FILTER(src, 1) >> (BIT_DEPTH - 8); - src += src_stride; - tmp += MAX_PB_SIZE; - } - - tmp = tmp_array + LUMA_EXTRA_BEFORE * MAX_PB_SIZE; - filter = vf; - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = av_clip_pixel((((LUMA_FILTER(tmp, MAX_PB_SIZE) >> 6) * wx + offset) >> shift) + ox); - tmp += MAX_PB_SIZE; - dst += dst_stride; - } -} - -//////////////////////////////////////////////////////////////////////////////// -// -//////////////////////////////////////////////////////////////////////////////// -#define CHROMA_FILTER(src, stride) \ - (filter[0] * src[x - stride] + \ - filter[1] * src[x] + \ - filter[2] * src[x + stride] + \ - filter[3] * src[x + 2 * stride]) - -static void FUNC(put_chroma_h)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride, - const int height, const int8_t *hf, const int8_t *vf, const int width) -{ - const pixel *src = (const pixel *)_src; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const int8_t *filter = hf; - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8); - src += src_stride; - dst += MAX_PB_SIZE; - } -} - -static void FUNC(put_chroma_v)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride, - const int height, const int8_t *hf, const int8_t *vf, const int width) -{ - const pixel *src = (const pixel *)_src; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const int8_t *filter = vf; - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = CHROMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8); - src += src_stride; - dst += MAX_PB_SIZE; - } -} - -static void FUNC(put_chroma_hv)(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride, - const int height, const int8_t *hf, const int8_t *vf, const int width) -{ - int16_t tmp_array[(MAX_PB_SIZE + CHROMA_EXTRA) * MAX_PB_SIZE]; - int16_t *tmp = tmp_array; - const pixel *src = (const pixel 
*)_src; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const int8_t *filter = hf; - - src -= CHROMA_EXTRA_BEFORE * src_stride; - - for (int y = 0; y < height + CHROMA_EXTRA; y++) { - for (int x = 0; x < width; x++) - tmp[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8); - src += src_stride; - tmp += MAX_PB_SIZE; - } - - tmp = tmp_array + CHROMA_EXTRA_BEFORE * MAX_PB_SIZE; - filter = vf; - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = CHROMA_FILTER(tmp, MAX_PB_SIZE) >> 6; - tmp += MAX_PB_SIZE; - dst += MAX_PB_SIZE; - } -} - -static void FUNC(put_uni_chroma_h)(uint8_t *_dst, const ptrdiff_t _dst_stride, - const uint8_t *_src, const ptrdiff_t _src_stride, - const int height, const int8_t *hf, const int8_t *vf, const int width) -{ - const pixel *src = (const pixel *)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = hf; - const int shift = 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = av_clip_pixel(((CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8)) + offset) >> shift); - src += src_stride; - dst += dst_stride; - } -} - -static void FUNC(put_uni_chroma_v)(uint8_t *_dst, const ptrdiff_t _dst_stride, - const uint8_t *_src, const ptrdiff_t _src_stride, - const int height, const int8_t *hf, const int8_t *vf, const int width) -{ - const pixel *src = (const pixel *)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = vf; - const int shift = 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] 
= av_clip_pixel(((CHROMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8)) + offset) >> shift); - src += src_stride; - dst += dst_stride; - } -} - -static void FUNC(put_uni_chroma_hv)(uint8_t *_dst, const ptrdiff_t _dst_stride, - const uint8_t *_src, const ptrdiff_t _src_stride, - const int height, const int8_t *hf, const int8_t *vf, const int width) -{ - int16_t tmp_array[(MAX_PB_SIZE + CHROMA_EXTRA) * MAX_PB_SIZE]; - int16_t *tmp = tmp_array; - const pixel *src = (const pixel *)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = hf; - const int shift = 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - src -= CHROMA_EXTRA_BEFORE * src_stride; - - for (int y = 0; y < height + CHROMA_EXTRA; y++) { - for (int x = 0; x < width; x++) - tmp[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8); - src += src_stride; - tmp += MAX_PB_SIZE; - } - - tmp = tmp_array + CHROMA_EXTRA_BEFORE * MAX_PB_SIZE; - filter = vf; - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = av_clip_pixel(((CHROMA_FILTER(tmp, MAX_PB_SIZE) >> 6) + offset) >> shift); - tmp += MAX_PB_SIZE; - dst += dst_stride; - } -} - -static void FUNC(put_uni_chroma_w_h)(uint8_t *_dst, ptrdiff_t _dst_stride, - const uint8_t *_src, ptrdiff_t _src_stride, int height, int denom, int wx, int ox, - const int8_t *hf, const int8_t *vf, int width) -{ - const pixel *src = (const pixel *)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = hf; - const int shift = denom + 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - ox = ox * (1 << (BIT_DEPTH - 8)); - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) 
{ - dst[x] = av_clip_pixel((((CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox); - } - dst += dst_stride; - src += src_stride; - } -} - -static void FUNC(put_uni_chroma_w_v)(uint8_t *_dst, const ptrdiff_t _dst_stride, - const uint8_t *_src, const ptrdiff_t _src_stride, const int height, - const int denom, const int wx, const int _ox, const int8_t *hf, const int8_t *vf, - const int width) -{ - const pixel *src = (const pixel *)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = vf; - const int shift = denom + 14 - BIT_DEPTH; - const int ox = _ox * (1 << (BIT_DEPTH - 8)); -#if BIT_DEPTH < 14 - int offset = 1 << (shift - 1); -#else - int offset = 0; -#endif - - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) { - dst[x] = av_clip_pixel((((CHROMA_FILTER(src, src_stride) >> (BIT_DEPTH - 8)) * wx + offset) >> shift) + ox); - } - dst += dst_stride; - src += src_stride; - } -} - -static void FUNC(put_uni_chroma_w_hv)(uint8_t *_dst, ptrdiff_t _dst_stride, - const uint8_t *_src, ptrdiff_t _src_stride, int height, int denom, int wx, int ox, - const int8_t *hf, const int8_t *vf, int width) -{ - int16_t tmp_array[(MAX_PB_SIZE + CHROMA_EXTRA) * MAX_PB_SIZE]; - int16_t *tmp = tmp_array; - const pixel *src = (const pixel *)_src; - pixel *dst = (pixel *)_dst; - const ptrdiff_t src_stride = _src_stride / sizeof(pixel); - const ptrdiff_t dst_stride = _dst_stride / sizeof(pixel); - const int8_t *filter = hf; - const int shift = denom + 14 - BIT_DEPTH; -#if BIT_DEPTH < 14 - const int offset = 1 << (shift - 1); -#else - const int offset = 0; -#endif - - src -= CHROMA_EXTRA_BEFORE * src_stride; - - for (int y = 0; y < height + CHROMA_EXTRA; y++) { - for (int x = 0; x < width; x++) - tmp[x] = CHROMA_FILTER(src, 1) >> (BIT_DEPTH - 8); - src += src_stride; - tmp += MAX_PB_SIZE; - } - - tmp = tmp_array + 
CHROMA_EXTRA_BEFORE * MAX_PB_SIZE; - filter = vf; - - ox = ox * (1 << (BIT_DEPTH - 8)); - for (int y = 0; y < height; y++) { - for (int x = 0; x < width; x++) - dst[x] = av_clip_pixel((((CHROMA_FILTER(tmp, MAX_PB_SIZE) >> 6) * wx + offset) >> shift) + ox); - tmp += MAX_PB_SIZE; - dst += dst_stride; - } -} +#include "libavcodec/h26x/h2656_inter_template.c" static void FUNC(avg)(uint8_t *_dst, const ptrdiff_t _dst_stride, const int16_t *src0, const int16_t *src1, const int width, const int height)