From patchwork Fri Sep 17 02:08:03 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 30301
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 17 Sep 2021 04:08:03 +0200
X-Mailer: git-send-email 2.30.2
X-Microsoft-Original-Message-ID: <20210917020808.275498-7-andreas.rheinhardt@outlook.com>
Subject: [FFmpeg-devel] [PATCH 08/13] avcodec/elbg: Keep buffers to avoid allocations and frees
List-Id: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt

Up until now, each call to avpriv_elbg_do() would result in at least six allocations.
And this function is called a lot: A typical FATE run results in
52213653 calls to av_malloc; of these, 34974671 originate from
av_malloc_array and from these 34783679 originate from avpriv_elbg_do;
the msvideo1 encoder tests are behind most of these.

This commit changes this by keeping the buffers and only reallocating
them when needed. E.g. for the encoding part of fate-vsynth1-msvideo1
total heap usage went down from 11,407,939 allocs and frees with
468,106,207 bytes allocated to 3,149 allocs and frees with 13,181,847
bytes allocated. The time for one encode2-call went down by 69%.

Signed-off-by: Andreas Rheinhardt
---
 libavcodec/elbg.c | 84 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 54 insertions(+), 30 deletions(-)

diff --git a/libavcodec/elbg.c b/libavcodec/elbg.c
index 24c6f06f54..4397bff1ef 100644
--- a/libavcodec/elbg.c
+++ b/libavcodec/elbg.c
@@ -53,8 +53,20 @@ typedef struct ELBGContext {
     int64_t *utility_inc;
     int *nearest_cb;
     int *points;
+    int *size_part;
     AVLFG *rand_state;
     int *scratchbuf;
+    cell *cell_buffer;
+
+    /* Sizes for the buffers above. Pointers without such a field
+     * are not allocated by us and only valid for the duration
+     * of a single call to avpriv_elbg_do(). */
+    unsigned utility_allocated;
+    unsigned utility_inc_allocated;
+    unsigned size_part_allocated;
+    unsigned cells_allocated;
+    unsigned scratchbuf_allocated;
+    unsigned cell_buffer_allocated;
 } ELBGContext;
 
 static inline int distance_limited(int *a, int *b, int dim, int limit)
@@ -332,32 +344,19 @@ static void do_shiftings(ELBGContext *elbg)
     }
 }
 
-static int do_elbg(ELBGContext *elbg, int *points, int numpoints,
-                   int max_steps)
+static void do_elbg(ELBGContext *elbg, int *points, int numpoints,
+                    int max_steps)
 {
-    int i, j, steps = 0, ret = 0;
-    int *size_part = av_malloc_array(elbg->num_cb, sizeof(int));
-    cell *list_buffer = av_malloc_array(numpoints, sizeof(cell));
-    cell *free_cells;
+    int *const size_part = elbg->size_part;
+    int i, j, steps = 0;
     int best_idx = 0;
     int64_t last_error;
 
     elbg->error = INT64_MAX;
-    elbg->cells = av_malloc_array(elbg->num_cb, sizeof(cell *));
-    elbg->utility = av_malloc_array(elbg->num_cb, sizeof(*elbg->utility));
     elbg->points = points;
-    elbg->utility_inc = av_malloc_array(elbg->num_cb, sizeof(*elbg->utility_inc));
-    elbg->scratchbuf = av_malloc_array(5 * elbg->dim, sizeof(int));
-
-    if (!size_part || !list_buffer || !elbg->cells ||
-        !elbg->utility || !elbg->utility_inc || !elbg->scratchbuf) {
-        ret = AVERROR(ENOMEM);
-        goto out;
-    }
-
 
     do {
-        free_cells = list_buffer;
+        cell *free_cells = elbg->cell_buffer;
         last_error = elbg->error;
         steps++;
         memset(elbg->utility, 0, elbg->num_cb * sizeof(*elbg->utility));
@@ -408,15 +407,6 @@ static int do_elbg(ELBGContext *elbg, int *points, int numpoints,
 
     } while(((last_error - elbg->error) > DELTA_ERR_MAX*elbg->error) &&
             (steps < max_steps));
-
-out:
-    av_free(size_part);
-    av_free(elbg->utility);
-    av_free(list_buffer);
-    av_free(elbg->cells);
-    av_free(elbg->utility_inc);
-    av_free(elbg->scratchbuf);
-    return ret;
 }
 
 #define BIG_PRIME 433494437LL
@@ -450,13 +440,13 @@ static int init_elbg(ELBGContext *elbg, int *points, int numpoints,
             av_freep(&temp_points);
             return ret;
         }
-        ret = do_elbg(elbg, temp_points, numpoints / 8, 2 * max_steps);
+        do_elbg(elbg, temp_points, numpoints / 8, 2 * max_steps);
         av_free(temp_points);
     } else // If not, initialize the codebook with random positions
         for (int i = 0; i < elbg->num_cb; i++)
             memcpy(elbg->codebook + i * dim, points + ((i*BIG_PRIME)%numpoints)*dim,
                    dim * sizeof(*elbg->codebook));
-    return ret;
+    return 0;
 }
 
 int avpriv_elbg_do(ELBGContext **elbgp, int *points, int dim, int numpoints,
@@ -476,13 +466,47 @@ int avpriv_elbg_do(ELBGContext **elbgp, int *points, int dim, int numpoints,
     elbg->num_cb = num_cb;
     elbg->dim = dim;
 
+#define ALLOCATE_IF_NECESSARY(field, new_elements, multiplicator)            \
+    if (elbg->field ## _allocated < new_elements) {                          \
+        av_freep(&elbg->field);                                              \
+        elbg->field = av_malloc_array(new_elements,                          \
+                                      multiplicator * sizeof(*elbg->field)); \
+        if (!elbg->field) {                                                  \
+            elbg->field ## _allocated = 0;                                   \
+            return AVERROR(ENOMEM);                                          \
+        }                                                                    \
+        elbg->field ## _allocated = new_elements;                            \
+    }
+    /* Allocating the buffers for do_elbg() here once relies
+     * on their size being always the same even when do_elbg()
+     * is called from init_elbg(). It also relies on do_elbg()
+     * never calling itself recursively. */
+    ALLOCATE_IF_NECESSARY(cells, num_cb, 1)
+    ALLOCATE_IF_NECESSARY(utility, num_cb, 1)
+    ALLOCATE_IF_NECESSARY(utility_inc, num_cb, 1)
+    ALLOCATE_IF_NECESSARY(size_part, num_cb, 1)
+    ALLOCATE_IF_NECESSARY(cell_buffer, numpoints, 1)
+    ALLOCATE_IF_NECESSARY(scratchbuf, dim, 5)
+
     ret = init_elbg(elbg, points, numpoints, max_steps);
     if (ret < 0)
         return ret;
-    return do_elbg (elbg, points, numpoints, max_steps);
+    do_elbg (elbg, points, numpoints, max_steps);
+    return 0;
 }
 
 av_cold void avpriv_elbg_free(ELBGContext **elbgp)
 {
+    ELBGContext *elbg = *elbgp;
+    if (!elbg)
+        return;
+
+    av_freep(&elbg->size_part);
+    av_freep(&elbg->utility);
+    av_freep(&elbg->cell_buffer);
+    av_freep(&elbg->cells);
+    av_freep(&elbg->utility_inc);
+    av_freep(&elbg->scratchbuf);
+
     av_freep(elbgp);
 }
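
Note for readers of the diff: the grow-only reuse that ALLOCATE_IF_NECESSARY
performs on each buffer/size pair can be looked at in isolation. What follows
is a minimal sketch and not code from the patch. GrowBuf, growbuf_reserve()
and growbuf_free() are hypothetical names, and plain malloc()/free() stand in
for av_malloc_array()/av_freep(). As in the macro, a buffer that is too small
is freed and allocated anew rather than realloc'ed, on the assumption that the
caller overwrites its contents on every call anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one "pointer + *_allocated" pair of ELBGContext. */
typedef struct GrowBuf {
    void    *data;       /* owned buffer, NULL if nothing is allocated */
    unsigned allocated;  /* capacity in elements, 0 if data is NULL    */
} GrowBuf;

/* Make sure buf can hold at least nb_elems elements of elem_size bytes.
 * An existing buffer that is large enough is reused untouched; otherwise
 * it is freed and replaced (old contents are NOT preserved).
 * Returns 0 on success and -1 on allocation failure, in which case the
 * buffer is left empty with a capacity of 0. */
static int growbuf_reserve(GrowBuf *buf, unsigned nb_elems, size_t elem_size)
{
    assert(elem_size > 0);

    if (buf->allocated >= nb_elems)
        return 0;                       /* fast path: no allocator call */

    free(buf->data);
    buf->data      = NULL;
    buf->allocated = 0;

    if (nb_elems > SIZE_MAX / elem_size)
        return -1;                      /* size computation would overflow */

    buf->data = malloc(nb_elems * elem_size);
    if (!buf->data)
        return -1;

    buf->allocated = nb_elems;
    return 0;
}

/* Release the buffer and reset the bookkeeping so that the struct
 * can be reused or freed again safely. */
static void growbuf_free(GrowBuf *buf)
{
    free(buf->data);
    buf->data      = NULL;
    buf->allocated = 0;
}
```

With a context that lives across calls, only the first call and calls that
need a larger buffer reach the allocator. That is consistent with the drop in
allocation counts quoted in the commit message, though the sketch itself
measures nothing.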