From patchwork Sun Oct 20 20:05:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Niklas Haas
X-Patchwork-Id: 52420
From: Niklas Haas
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 20 Oct 2024 22:05:26 +0200
Message-ID: <20241020200851.1414766-18-ffmpeg@haasn.xyz>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241020200851.1414766-1-ffmpeg@haasn.xyz>
References: <20241020200851.1414766-1-ffmpeg@haasn.xyz>
Subject: [FFmpeg-devel] [PATCH v3 17/18] tests/swscale: add a benchmarking mode
List-Id: FFmpeg development discussions and patches
Cc: Niklas Haas

From: Niklas Haas

With the ability to set the thread count as well. This benchmark includes
the constant overhead of context initialization.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas
---
 libswscale/tests/swscale.c | 93 ++++++++++++++++++++++++++++----------
 1 file changed, 68 insertions(+), 25 deletions(-)

diff --git a/libswscale/tests/swscale.c b/libswscale/tests/swscale.c
index c11a46024e..f5ad4b3132 100644
--- a/libswscale/tests/swscale.c
+++ b/libswscale/tests/swscale.c
@@ -31,21 +31,22 @@
 #include "libavutil/lfg.h"
 #include "libavutil/sfc64.h"
 #include "libavutil/frame.h"
+#include "libavutil/opt.h"
+#include "libavutil/time.h"
 #include "libavutil/pixfmt.h"
 #include "libavutil/avassert.h"
 #include "libavutil/macros.h"
 
 #include "libswscale/swscale.h"
 
-enum {
-    WIDTH = 96,
-    HEIGHT = 96,
-};
-
 struct options {
     enum AVPixelFormat src_fmt;
     enum AVPixelFormat dst_fmt;
     double prob;
+    int w, h;
+    int threads;
+    int iters;
+    int bench;
 };
 
 struct mode {
@@ -53,9 +54,6 @@ struct mode {
     SwsDither dither;
 };
 
-const int dst_w[] = { WIDTH, WIDTH - WIDTH / 3, WIDTH + WIDTH / 3 };
-const int dst_h[] = { HEIGHT, HEIGHT - HEIGHT / 3, HEIGHT + HEIGHT / 3 };
-
 const struct mode modes[] = {
     { SWS_FAST_BILINEAR },
     { SWS_BILINEAR },
@@ -114,7 +112,8 @@ static void get_mse(int mse[4], const AVFrame *a, const AVFrame *b, int comps)
     }
 }
 
-static int scale_legacy(AVFrame *dst, const AVFrame *src, struct mode mode)
+static int scale_legacy(AVFrame *dst, const AVFrame *src, struct mode mode,
+                        struct options opts)
 {
     SwsContext *sws_legacy;
     int ret;
@@ -131,23 +130,28 @@ static int scale_legacy(AVFrame *dst, const AVFrame *src, struct mode mode)
     sws_legacy->dst_format = dst->format;
     sws_legacy->flags = mode.flags;
     sws_legacy->dither = mode.dither;
+    sws_legacy->threads = opts.threads;
+
+    if ((ret = sws_init_context(sws_legacy, NULL, NULL)) < 0)
+        goto error;
 
-    ret = sws_init_context(sws_legacy, NULL, NULL);
-    if (!ret)
+    for (int i = 0; i < opts.iters; i++)
         ret = sws_scale_frame(sws_legacy, dst, src);
 
+error:
     sws_freeContext(sws_legacy);
     return ret;
 }
 
 /* Runs a series of ref -> src -> dst -> out, and compares out vs ref */
 static int run_test(enum AVPixelFormat src_fmt, enum AVPixelFormat dst_fmt,
-                    int dst_w, int dst_h, struct mode mode, const AVFrame *ref,
-                    const int mse_ref[4])
+                    int dst_w, int dst_h, struct mode mode, struct options opts,
+                    const AVFrame *ref, const int mse_ref[4])
 {
     AVFrame *src = NULL, *dst = NULL, *out = NULL;
     int mse[4], mse_sws[4], ret = -1;
     const int comps = fmt_comps(src_fmt) & fmt_comps(dst_fmt);
+    int64_t time, time_ref = 0;
 
     src = av_frame_alloc();
     dst = av_frame_alloc();
@@ -174,12 +178,20 @@ static int run_test(enum AVPixelFormat src_fmt, enum AVPixelFormat dst_fmt,
 
     sws[1]->flags = mode.flags;
     sws[1]->dither = mode.dither;
-    if (sws_scale_frame(sws[1], dst, src) < 0) {
-        fprintf(stderr, "Failed %s ---> %s\n", av_get_pix_fmt_name(src->format),
-                av_get_pix_fmt_name(dst->format));
-        goto error;
+    sws[1]->threads = opts.threads;
+
+    time = av_gettime_relative();
+
+    for (int i = 0; i < opts.iters; i++) {
+        if (sws_scale_frame(sws[1], dst, src) < 0) {
+            fprintf(stderr, "Failed %s ---> %s\n", av_get_pix_fmt_name(src->format),
+                    av_get_pix_fmt_name(dst->format));
+            goto error;
+        }
     }
 
+    time = av_gettime_relative() - time;
+
     if (sws_scale_frame(sws[2], out, dst) < 0) {
         fprintf(stderr, "Failed %s ---> %s\n", av_get_pix_fmt_name(dst->format),
                 av_get_pix_fmt_name(out->format));
@@ -196,11 +208,13 @@ static int run_test(enum AVPixelFormat src_fmt, enum AVPixelFormat dst_fmt,
 
     if (!mse_ref) {
         /* Compare against the legacy swscale API as a reference */
-        if (scale_legacy(dst, src, mode) < 0) {
+        time_ref = av_gettime_relative();
+        if (scale_legacy(dst, src, mode, opts) < 0) {
             fprintf(stderr, "Failed ref %s ---> %s\n", av_get_pix_fmt_name(src->format),
                     av_get_pix_fmt_name(dst->format));
             goto error;
         }
+        time_ref = av_gettime_relative() - time_ref;
 
         if (sws_scale_frame(sws[2], out, dst) < 0)
             goto error;
@@ -221,6 +235,15 @@ static int run_test(enum AVPixelFormat src_fmt, enum AVPixelFormat dst_fmt,
         }
     }
 
+    if (opts.bench && time_ref) {
+        printf(" time=%"PRId64" us, ref=%"PRId64" us, speedup=%.3fx %s\n",
+               time / opts.iters, time_ref / opts.iters,
+               (double) time_ref / time,
+               time <= time_ref ? "faster" : "\033[1;33mslower\033[0m");
+    } else if (opts.bench) {
+        printf(" time=%"PRId64" us\n", time / opts.iters);
+    }
+
     fflush(stdout);
     ret = 0; /* fall through */
  error:
@@ -232,6 +255,9 @@ static int run_test(enum AVPixelFormat src_fmt, enum AVPixelFormat dst_fmt,
 
 static int run_self_tests(const AVFrame *ref, struct options opts)
 {
+    const int dst_w[] = { opts.w, opts.w - opts.w / 3, opts.w + opts.w / 3 };
+    const int dst_h[] = { opts.h, opts.h - opts.h / 3, opts.h + opts.h / 3 };
+
     enum AVPixelFormat src_fmt, dst_fmt,
                        src_fmt_min = 0,
                        dst_fmt_min = 0,
@@ -254,8 +280,9 @@ static int run_self_tests(const AVFrame *ref, struct options opts)
                     for (int m = 0; m < FF_ARRAY_ELEMS(modes); m++) {
                         if (ff_sfc64_get(&prng_state) > UINT64_MAX * opts.prob)
                             continue;
+
                         if (run_test(src_fmt, dst_fmt, dst_w[w], dst_h[h],
-                                     modes[m], ref, NULL) < 0)
+                                     modes[m], opts, ref, NULL) < 0)
                             return -1;
                     }
         }
@@ -300,7 +327,7 @@ static int run_file_tests(const AVFrame *ref, FILE *fp, struct options opts)
             opts.dst_fmt != AV_PIX_FMT_NONE && dst_fmt != opts.dst_fmt)
             continue;
 
-        if (run_test(src_fmt, dst_fmt, dw, dh, mode, ref, mse) < 0)
+        if (run_test(src_fmt, dst_fmt, dw, dh, mode, opts, ref, mse) < 0)
             return -1;
     }
 
@@ -312,7 +339,11 @@ int main(int argc, char **argv)
     struct options opts = {
         .src_fmt = AV_PIX_FMT_NONE,
         .dst_fmt = AV_PIX_FMT_NONE,
-        .prob = 1.0,
+        .w = 96,
+        .h = 96,
+        .threads = 1,
+        .iters = 1,
+        .prob = 1.0,
     };
 
     AVFrame *rgb = NULL, *ref = NULL;
@@ -335,6 +366,10 @@ int main(int argc, char **argv)
                     " Only test the specified destination pixel format\n"
                     " -src \n"
                     " Only test the specified source pixel format\n"
+                    " -bench \n"
+                    " Run benchmarks with the specified number of iterations. This mode also increases the size of the test images\n"
+                    " -threads \n"
+                    " Use the specified number of threads\n"
                     " -cpuflags \n"
                     " Uses the specified cpuflags in the tests\n"
             );
@@ -368,6 +403,14 @@ int main(int argc, char **argv)
                 fprintf(stderr, "invalid pixel format %s\n", argv[i + 1]);
                 goto error;
             }
+        } else if (!strcmp(argv[i], "-bench")) {
+            opts.bench = 1;
+            opts.iters = atoi(argv[i + 1]);
+            opts.iters = FFMAX(opts.iters, 1);
+            opts.w = 1920;
+            opts.h = 1080;
+        } else if (!strcmp(argv[i], "-threads")) {
+            opts.threads = atoi(argv[i + 1]);
         } else if (!strcmp(argv[i], "-p")) {
             opts.prob = atof(argv[i + 1]);
         } else {
@@ -390,8 +433,8 @@ bad_option:
     rgb = av_frame_alloc();
     if (!rgb)
         goto error;
-    rgb->width = WIDTH / 12;
-    rgb->height = HEIGHT / 12;
+    rgb->width = opts.w / 12;
+    rgb->height = opts.h / 12;
     rgb->format = AV_PIX_FMT_RGBA;
     if (av_frame_get_buffer(rgb, 32) < 0)
         goto error;
@@ -406,8 +449,8 @@ bad_option:
     ref = av_frame_alloc();
     if (!ref)
         goto error;
-    ref->width = WIDTH;
-    ref->height = HEIGHT;
+    ref->width = opts.w;
+    ref->height = opts.h;
     ref->format = AV_PIX_FMT_YUVA420P;
 
     if (sws_scale_frame(sws[0], ref, rgb) < 0)
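
For readers who want the measurement pattern in isolation, the timing logic this patch adds to run_test() can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch's code: now_us() stands in for av_gettime_relative(), and bench_fn, bench_total_us(), speedup() and count_calls() are hypothetical names. The zero-duration guard in speedup() is an addition that the patch itself does not have.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in microseconds; a stand-in for av_gettime_relative(). */
static int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Hypothetical workload callback: returns < 0 on failure. */
typedef int (*bench_fn)(void *arg);

/* Runs fn `iters` times and returns the total elapsed time in microseconds,
 * or -1 as soon as one iteration fails. */
static int64_t bench_total_us(bench_fn fn, void *arg, int iters)
{
    assert(iters > 0);
    int64_t start = now_us();
    for (int i = 0; i < iters; i++) {
        if (fn(arg) < 0)
            return -1;
    }
    return now_us() - start;
}

/* Ratio of reference time to measured time (> 1.0 means the measured path
 * is faster). Returns 0.0 when time_us is not positive; this guard is an
 * assumption of the sketch, not something the patch does. */
static double speedup(int64_t time_us, int64_t ref_us)
{
    return time_us > 0 ? (double) ref_us / time_us : 0.0;
}

/* Trivial example workload: counts how often it was called. */
static int count_calls(void *arg)
{
    ++*(int *) arg;
    return 0;
}
```

As in the patch, the per-iteration figure would be the returned total divided by the iteration count, and anything executed inside the timed call (such as context initialization in scale_legacy()) is included in the measurement.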