From patchwork Sat Sep 21 12:42:35 2019
X-Patchwork-Submitter: Lance Wang
X-Patchwork-Id: 15205
From: lance.lmwang@gmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: Devin Heitmueller, Limin Wang
Date: Sat, 21 Sep 2019 20:42:35 +0800
Message-Id: <20190921124235.31729-1-lance.lmwang@gmail.com>
In-Reply-To: <20190920155517.5854-1-lance.lmwang@gmail.com>
References: <20190920155517.5854-1-lance.lmwang@gmail.com>
Subject: [FFmpeg-devel] [PATCH v2] avcodec/v210enc: add yuv420p/yuv420p10 input pixel format support

From: Limin Wang

With this patch the v210 encoder accepts yuv420p and yuv420p10 input
directly. It reuses the same source chroma line for each pair of output
lines, so decoder output in these formats can go to the v210 encoder
without swscale being auto-inserted and without the pixels being
touched at all.

Without the patch, swscale runs its full scaling algorithm against
lines that do not require any scaling or bit depth conversion, where a
simple memcpy() could achieve the same result.

This improves encoding speed by roughly 20% in fps (30 -> 36 for yuv420p,
26 -> 31 for yuv420p10). The benchmark results follow:

1. yuv420p
./ffmpeg -benchmark -y -lavfi testsrc2=s=4096x3072:r=10:d=10 -pix_fmt yuv420p -c:v v210 -f null -

master:
frame= 100 fps= 30 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=3.02x
bench: utime=2.762s stime=0.539s rtime=3.308s
bench: maxrss=93372416kB

applied the patch:
frame= 100 fps= 36 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=3.57x
video:3302400kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=2.258s stime=0.536s rtime=2.803s
bench: maxrss=80809984kB

2. yuv420p10
./ffmpeg -benchmark -y -lavfi testsrc2=s=4096x3072:r=10:d=10 -pix_fmt yuv420p10 -c:v v210 -f null -

master:
frame= 100 fps= 26 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=2.61x
video:3302400kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=3.257s stime=0.559s rtime=3.827s
bench: maxrss=152371200kB

applied the patch:
frame= 100 fps= 31 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=3.11x
video:3302400kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=2.625s stime=0.573s rtime=3.212s
bench: maxrss=127197184kB

Signed-off-by: Devin Heitmueller
Signed-off-by: Limin Wang
---
 libavcodec/v210_template.c | 20 ++++++++++++++++++++
 libavcodec/v210enc.c       |  8 +++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/libavcodec/v210_template.c b/libavcodec/v210_template.c
index 9e1d9f9..083a9f1 100644
--- a/libavcodec/v210_template.c
+++ b/libavcodec/v210_template.c
@@ -43,11 +43,31 @@ static void RENAME(v210_enc)(AVCodecContext *avctx,
     const TYPE *y = (const TYPE *)pic->data[0];
     const TYPE *u = (const TYPE *)pic->data[1];
     const TYPE *v = (const TYPE *)pic->data[2];
+    const TYPE *u_even = u;
+    const TYPE *v_even = v;
     const int sample_size = 6 * s->RENAME(sample_factor);
     const int sample_w = avctx->width / sample_size;
 
     for (h = 0; h < avctx->height; h++) {
         uint32_t val;
+
+        if (pic->format == AV_PIX_FMT_YUV420P10 ||
+            pic->format == AV_PIX_FMT_YUV420P) {
+            int mod = pic->interlaced_frame == 1 ? 4 : 2;
+            if (h % mod == 0) {
+                u_even = u;
+                v_even = v;
+            } else {
+                /* progressive chroma */
+                if (mod == 2) {
+                    u = u_even;
+                    v = v_even;
+                } else if (h % 4 == 2) {
+                    u = u_even;
+                    v = v_even;
+                }
+            }
+        }
         w = sample_w * sample_size;
         s->RENAME(pack_line)(y, u, v, dst, w);
 
diff --git a/libavcodec/v210enc.c b/libavcodec/v210enc.c
index 16e8810..2180737 100644
--- a/libavcodec/v210enc.c
+++ b/libavcodec/v210enc.c
@@ -131,9 +131,9 @@ static int encode_frame(AVCodecContext *avctx, AVPacket *pkt,
     }
     dst = pkt->data;
 
-    if (pic->format == AV_PIX_FMT_YUV422P10)
+    if (pic->format == AV_PIX_FMT_YUV422P10 || pic->format == AV_PIX_FMT_YUV420P10)
         v210_enc_10(avctx, dst, pic);
-    else if(pic->format == AV_PIX_FMT_YUV422P)
+    else if(pic->format == AV_PIX_FMT_YUV422P || pic->format == AV_PIX_FMT_YUV420P)
         v210_enc_8(avctx, dst, pic);
 
     side_data = av_frame_get_side_data(pic, AV_FRAME_DATA_A53_CC);
@@ -165,5 +165,7 @@ AVCodec ff_v210_encoder = {
     .priv_data_size = sizeof(V210EncContext),
     .init = encode_init,
     .encode2 = encode_frame,
-    .pix_fmts = (const enum AVPixelFormat[]){ AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P, AV_PIX_FMT_NONE },
+    .pix_fmts = (const enum AVPixelFormat[]){ AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P,
+                                              AV_PIX_FMT_YUV420P10, AV_PIX_FMT_YUV420P,
+                                              AV_PIX_FMT_NONE },
 };
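
Note on the chroma line selection: the save/restore of u_even and v_even in
the v210_template.c hunk decides which 4:2:0 chroma row each output line
reads. The standalone sketch below is not part of the patch. It replays that
logic with a row index in place of the pointers and compares it with a
closed-form mapping. Both function names are hypothetical, and the sketch
assumes that the rest of the loop body (outside the hunk) advances u and v by
exactly one chroma row per output line.

```c
#include <assert.h>

/* Hypothetical helper, not an FFmpeg function: closed-form index of the
 * 4:2:0 chroma row read by output line h. */
static int chroma_row_closed_form(int h, int interlaced)
{
    if (!interlaced)
        return h / 2;              /* lines 2k and 2k+1 share chroma row k */

    /* Interlaced: lines 4k and 4k+2 (top field) share chroma row 2k,
     * lines 4k+1 and 4k+3 (bottom field) share chroma row 2k+1. */
    return 2 * (h / 4) + (h % 2);
}

/* Replays the u_even/v_even logic from the hunk, with an integer row
 * index standing in for the u/v pointers.
 * ASSUMPTION: the chroma pointer advances one row after every line. */
static int chroma_row_simulated(int target_h, int interlaced)
{
    int row      = 0;              /* stands in for u / v           */
    int row_even = 0;              /* stands in for u_even / v_even */
    int used     = 0;
    int mod      = interlaced ? 4 : 2;

    for (int h = 0; h <= target_h; h++) {
        if (h % mod == 0)
            row_even = row;        /* remember the start of the group */
        else if (mod == 2 || h % 4 == 2)
            row = row_even;        /* rewind to the saved chroma row  */

        used = row;                /* row that pack_line would read   */
        row++;                     /* assumed per-line advance        */
    }
    return used;
}
```

Calling chroma_row_simulated(h, pic_is_interlaced) for each output line h and
comparing it with chroma_row_closed_form() is a quick way to check the
progressive and interlaced cases separately. If the assumption about the
per-line pointer advance does not hold for the real loop, the mapping does
not hold either.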