From patchwork Sat Sep 10 01:07:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 37817
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 10 Sep 2022 03:07:18 +0200
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH 07/18] avcodec/vp8: Pass mb_y explicitly
Cc: Andreas Rheinhardt

Avoids atomic stores and loads and is a prerequisite for removing
all atomic synchronizations for VP7.
Notice that removing the explicit atomic_store() in
vp78_decode_mb_row_sliced() does not negatively affect parallelism
during slice-threading, because no check_thread_pos() ever waits for
an (mb_x, mb_y) pair with mb_x == 0 (which this atomic store signalled).

Signed-off-by: Andreas Rheinhardt
---
Btw: The code in update_pos looks fishy to me; namely the part that
tries to avoid the broadcast. Consider the scenario in which the other
threads (prev and next, A and B) are not waiting when the current thread C
checks their wait_mb_pos. Then one of the other threads reads C's
thread_mb_pos and notices that it needs to wait for an update from C.
It therefore locks C's mutex, stores its wait_mb_pos, checks C's
thread_mb_pos again (still reading the old value in this scenario)
and waits via pthread_cond_wait(). Then C updates its thread_mb_pos,
but because C uses outdated values for A and B's wait_mb_pos,
it never signals a broadcast. Who will then wake up the waiting thread?
This should be fixable by moving the loads after C's update of
thread_mb_pos: In case C's read of A's wait_mb_pos value happens before
A updates it, then C's update of its thread_mb_pos happens before A
updates its wait_mb_pos and A will therefore read C's updated value of
thread_mb_pos in its atomic_load while holding C's lock (and will
therefore never call pthread_cond_wait()). In case C's read of A's
wait_mb_pos value happens after A updates it, C will emit its broadcast,
waking A which reads the updated value and stops.

 libavcodec/vp8.c | 30 +++++++++++++++---------------
 libavcodec/vp8.h |  4 ++--
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/libavcodec/vp8.c b/libavcodec/vp8.c
index c259f3588c..5ecb9b07e5 100644
--- a/libavcodec/vp8.c
+++ b/libavcodec/vp8.c
@@ -2389,11 +2389,11 @@ static int vp8_decode_mv_mb_modes(AVCodecContext *avctx, VP8Frame *cur_frame,
 #endif
 
 static av_always_inline int decode_mb_row_no_filter(AVCodecContext *avctx, void *tdata,
-                                         int jobnr, int threadnr, int is_vp7)
+                                         int jobnr, int threadnr, int mb_y,
+                                         int is_vp7)
 {
     VP8Context *s = avctx->priv_data;
     VP8ThreadData *prev_td, *next_td, *td = &s->thread_data[threadnr];
-    int mb_y = atomic_load(&td->thread_mb_pos) >> 16;
     int mb_x, mb_xy = mb_y * s->mb_width;
     int num_jobs = s->num_jobs;
     const VP8Frame *prev_frame = s->prev_frame;
@@ -2518,23 +2518,24 @@ static av_always_inline int decode_mb_row_no_filter(AVCodecContext *avctx, void
 }
 
 static int vp7_decode_mb_row_no_filter(AVCodecContext *avctx, void *tdata,
-                                       int jobnr, int threadnr)
+                                       int jobnr, int threadnr, int mb_y)
 {
-    return decode_mb_row_no_filter(avctx, tdata, jobnr, threadnr, 1);
+    return decode_mb_row_no_filter(avctx, tdata, jobnr, threadnr, mb_y, 1);
 }
 
 static int vp8_decode_mb_row_no_filter(AVCodecContext *avctx, void *tdata,
-                                       int jobnr, int threadnr)
+                                       int jobnr, int threadnr, int mb_y)
 {
-    return decode_mb_row_no_filter(avctx, tdata, jobnr, threadnr, 0);
+    return decode_mb_row_no_filter(avctx, tdata, jobnr, threadnr, mb_y, 0);
 }
 
 static av_always_inline void filter_mb_row(AVCodecContext *avctx, void *tdata,
-                              int jobnr, int threadnr, int is_vp7)
+                              int jobnr, int threadnr, int mb_y,
+                              int is_vp7)
 {
     VP8Context *s = avctx->priv_data;
     VP8ThreadData *td = &s->thread_data[threadnr];
-    int mb_x, mb_y = atomic_load(&td->thread_mb_pos) >> 16, num_jobs = s->num_jobs;
+    int mb_x, num_jobs = s->num_jobs;
     AVFrame *curframe = s->curframe->tf.f;
     VP8Macroblock *mb;
     VP8ThreadData *prev_td, *next_td;
@@ -2589,15 +2590,15 @@ static av_always_inline void filter_mb_row(AVCodecContext *avctx, void *tdata,
 }
 
 static void vp7_filter_mb_row(AVCodecContext *avctx, void *tdata,
-                              int jobnr, int threadnr)
+                              int jobnr, int threadnr, int mb_y)
 {
-    filter_mb_row(avctx, tdata, jobnr, threadnr, 1);
+    filter_mb_row(avctx, tdata, jobnr, threadnr, mb_y, 1);
 }
 
 static void vp8_filter_mb_row(AVCodecContext *avctx, void *tdata,
-                              int jobnr, int threadnr)
+                              int jobnr, int threadnr, int mb_y)
 {
-    filter_mb_row(avctx, tdata, jobnr, threadnr, 0);
+    filter_mb_row(avctx, tdata, jobnr, threadnr, mb_y, 0);
 }
 
 static av_always_inline
@@ -2615,14 +2616,13 @@ int vp78_decode_mb_row_sliced(AVCodecContext *avctx, void *tdata, int jobnr,
     td->mv_bounds.mv_min.y = -MARGIN - 64 * threadnr;
     td->mv_bounds.mv_max.y = ((s->mb_height - 1) << 6) + MARGIN - 64 * threadnr;
     for (mb_y = jobnr; mb_y < s->mb_height; mb_y += num_jobs) {
-        atomic_store(&td->thread_mb_pos, mb_y << 16);
-        ret = s->decode_mb_row_no_filter(avctx, tdata, jobnr, threadnr);
+        ret = s->decode_mb_row_no_filter(avctx, tdata, jobnr, threadnr, mb_y);
         if (ret < 0) {
             update_pos(td, s->mb_height, INT_MAX & 0xFFFF);
             return ret;
         }
         if (s->deblock_filter)
-            s->filter_mb_row(avctx, tdata, jobnr, threadnr);
+            s->filter_mb_row(avctx, tdata, jobnr, threadnr, mb_y);
         update_pos(td, mb_y, INT_MAX & 0xFFFF);
 
         td->mv_bounds.mv_min.y -= 64 * num_jobs;
diff --git a/libavcodec/vp8.h b/libavcodec/vp8.h
index 30aeb4cb06..ed79bc79c1 100644
--- a/libavcodec/vp8.h
+++ b/libavcodec/vp8.h
@@ -330,8 +330,8 @@ typedef struct VP8Context {
      */
     int mb_layout;
 
-    int (*decode_mb_row_no_filter)(AVCodecContext *avctx, void *tdata, int jobnr, int threadnr);
-    void (*filter_mb_row)(AVCodecContext *avctx, void *tdata, int jobnr, int threadnr);
+    int (*decode_mb_row_no_filter)(AVCodecContext *avctx, void *tdata, int jobnr, int threadnr, int mb_y);
+    void (*filter_mb_row)(AVCodecContext *avctx, void *tdata, int jobnr, int threadnr, int mb_y);
 
     int vp7;
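
Illustration (not part of the patch): the change above relies on the row
index read back from thread_mb_pos being the same value the caller already
holds in its loop variable. The standalone sketch below checks that
equivalence. It assumes the position is packed as (mb_y << 16) | mb_x,
which is what the ">> 16" and "mb_y << 16" in the removed lines suggest.
SketchThreadData, pack_pos(), row_via_atomic(), row_via_argument() and
styles_agree() are hypothetical names that do not exist in libavcodec.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the relevant field of VP8ThreadData. */
typedef struct SketchThreadData {
    atomic_int thread_mb_pos; /* assumed layout: (mb_y << 16) | mb_x */
} SketchThreadData;

/* Pack a macroblock position into one int (assumed layout). */
static int pack_pos(int mb_y, int mb_x)
{
    return (mb_y << 16) | (mb_x & 0xFFFF);
}

/* Old style: the row is published through the atomic and read back. */
static int row_via_atomic(SketchThreadData *td)
{
    return atomic_load(&td->thread_mb_pos) >> 16;
}

/* New style: the caller already knows the row and simply passes it. */
static int row_via_argument(int mb_y)
{
    return mb_y;
}

/* Returns 1 if both styles agree for every row handled by job jobnr. */
static int styles_agree(int jobnr, int num_jobs, int mb_height)
{
    SketchThreadData td;

    atomic_init(&td.thread_mb_pos, 0);
    for (int mb_y = jobnr; mb_y < mb_height; mb_y += num_jobs) {
        /* The store that the patch removes from the sliced row loop. */
        atomic_store(&td.thread_mb_pos, mb_y << 16);
        if (row_via_atomic(&td) != row_via_argument(mb_y))
            return 0;
        /* End-of-row marker, mirroring update_pos(td, mb_y, INT_MAX & 0xFFFF). */
        atomic_store(&td.thread_mb_pos, pack_pos(mb_y, INT_MAX & 0xFFFF));
        if (row_via_atomic(&td) != mb_y)
            return 0;
    }
    return 1;
}
```

Calling styles_agree(1, 4, 68) walks rows 1, 5, 9, ... of a 68-row frame
for job 1 of 4 and returns 1 when the packed row always matches the loop
variable. The sketch is single-threaded on purpose: it only shows that the
load is redundant for the owning thread, and says nothing about what other
threads observe.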