From patchwork Sat Jul 4 13:17:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: hanishkvc
X-Patchwork-Id: 20797
From: hanishkvc
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 Jul 2020 18:47:13 +0530
Message-Id: <20200704131717.49428-2-hanishkvc@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200704131717.49428-1-hanishkvc@gmail.com>
References: <20200704131717.49428-1-hanishkvc@gmail.com>
Subject: [FFmpeg-devel] [PATCH v06 1/5] KMSGrab: getfb2 format_modifier if user doesnt specify
List-Id: FFmpeg development discussions and patches
Cc: hanishkvc

If the user doesn't specify a format_modifier explicitly, use GetFB2 to
identify the format_modifier of the framebuffer being grabbed.
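For readers following along, the fallback described above can be sketched
roughly as below. This is an illustration only, not code from the patch:
sketch_resolve_modifier and SKETCH_MOD_INVALID are made-up names, and the
constant merely mirrors the value drm_fourcc.h assigns to
DRM_FORMAT_MOD_INVALID so that the snippet builds without libdrm. In the
patch the detected value is fb2->modifier, as returned by drmModeGetFB2().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same value as DRM_FORMAT_MOD_INVALID in drm_fourcc.h
 * (vendor 0, low 56 bits all set); redefined so libdrm is not needed. */
#define SKETCH_MOD_INVALID UINT64_C(0x00ffffffffffffff)

/* Hypothetical helper: keep an explicit user choice, otherwise fall back
 * to the modifier detected for the framebuffer being grabbed. */
static uint64_t sketch_resolve_modifier(uint64_t user_modifier,
                                        uint64_t detected_modifier)
{
    if (user_modifier == SKETCH_MOD_INVALID)
        return detected_modifier;
    return user_modifier;
}
```

Because the option default moves from DRM_FORMAT_MOD_NONE (0) to
DRM_FORMAT_MOD_INVALID, an explicit format_modifier of 0 is still honoured
as "linear" and only the unset case triggers the lookup.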
---
 Changelog             |  1 +
 libavdevice/kmsgrab.c | 22 +++++++++++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Changelog b/Changelog
index a60e7d2eb8..3881587caa 100644
--- a/Changelog
+++ b/Changelog
@@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release,
 releases are sorted from youngest to oldest.
 
 version :
+- kmsgrab GetFB2 format_modifier, if user doesnt specify
 - AudioToolbox output device
 - MacCaption demuxer
 
diff --git a/libavdevice/kmsgrab.c b/libavdevice/kmsgrab.c
index d0de774871..10ed707e60 100644
--- a/libavdevice/kmsgrab.c
+++ b/libavdevice/kmsgrab.c
@@ -239,6 +239,7 @@ static av_cold int kmsgrab_read_header(AVFormatContext *avctx)
     drmModePlaneRes *plane_res = NULL;
     drmModePlane *plane = NULL;
     drmModeFB *fb = NULL;
+    drmModeFB2 *fb2 = NULL;
     AVStream *stream;
     int err, i;
 
@@ -364,6 +365,23 @@ static av_cold int kmsgrab_read_header(AVFormatContext *avctx)
         goto fail;
     }
 
+    fb2 = drmModeGetFB2(ctx->hwctx->fd, plane->fb_id);
+    if (!fb2) {
+        err = errno;
+        av_log(avctx, AV_LOG_ERROR, "Failed to get "
+               "framebuffer2 %"PRIu32": %s.\n",
+               plane->fb_id, strerror(err));
+        err = AVERROR(err);
+        goto fail;
+    }
+
+    av_log(avctx, AV_LOG_INFO, "Template framebuffer2 is %"PRIu32": "
+           "%"PRIu32"x%"PRIu32", pixel_format: 0x%"PRIx32", format_modifier: 0x%"PRIx64".\n",
+           fb2->fb_id, fb2->width, fb2->height, fb2->pixel_format, fb2->modifier);
+
+    if (ctx->drm_format_modifier == DRM_FORMAT_MOD_INVALID)
+        ctx->drm_format_modifier = fb2->modifier;
+
     stream = avformat_new_stream(avctx, NULL);
     if (!stream) {
         err = AVERROR(ENOMEM);
@@ -408,6 +426,8 @@ fail:
         drmModeFreePlane(plane);
     if (fb)
         drmModeFreeFB(fb);
+    if (fb2)
+        drmModeFreeFB2(fb2);
 
     return err;
 }
@@ -433,7 +453,7 @@ static const AVOption options[] = {
       { .i64 = AV_PIX_FMT_BGR0 }, 0, UINT32_MAX, FLAGS },
     { "format_modifier", "DRM format modifier for framebuffer",
       OFFSET(drm_format_modifier), AV_OPT_TYPE_INT64,
-      { .i64 = DRM_FORMAT_MOD_NONE }, 0, INT64_MAX, FLAGS },
+      { .i64 = DRM_FORMAT_MOD_INVALID}, 0, INT64_MAX, FLAGS },
     { "crtc_id", "CRTC ID to define capture source",
       OFFSET(source_crtc), AV_OPT_TYPE_INT64,
       { .i64 = 0 }, 0, UINT32_MAX, FLAGS },

From patchwork Sat Jul 4 13:17:14 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: hanishkvc
X-Patchwork-Id: 20798
From: hanishkvc
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 Jul 2020 18:47:14 +0530
Message-Id: <20200704131717.49428-3-hanishkvc@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200704131717.49428-1-hanishkvc@gmail.com>
References: <20200704131717.49428-1-hanishkvc@gmail.com>
Subject: [FFmpeg-devel] [PATCH v06 2/5] fbtile helperRoutines cpu based framebuffer detiling
List-Id: FFmpeg development discussions and patches
Cc: hanishkvc

Add helper routines that detile tiled framebuffer layouts into a linear
layout on the CPU.

The layouts currently supported are legacy Intel Tile-X, legacy Intel
Tile-Y and the newer Intel Tile-Yf. The only pixel format currently
supported is 32-bit RGB.

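To make "detiling" concrete, here is a minimal sketch of the simplest
case, a Tile-X style layout in which tiles are stored one after another
and each tile is row-major inside. It is not the code from this patch:
sketch_detile_rowmajor and its parameters are hypothetical, and it assumes
the surface is a whole number of tiles wide and high. For real Tile-X with
32-bit pixels, tile_w_bytes would be 512 and tile_h would be 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy a tiled surface into a linear one.
 * Tiles are consumed in storage order (left to right, then top to
 * bottom); each tile holds tile_h rows of tile_w_bytes bytes.
 * Returns 0 on success, -1 if the geometry is not supported.
 */
static int sketch_detile_rowmajor(uint8_t *dst, size_t dst_linesize,
                                  const uint8_t *src,
                                  size_t width_bytes, size_t height,
                                  size_t tile_w_bytes, size_t tile_h)
{
    if (!tile_w_bytes || !tile_h ||
        width_bytes % tile_w_bytes || height % tile_h ||
        dst_linesize < width_bytes)
        return -1;

    for (size_t ty = 0; ty < height; ty += tile_h) {
        for (size_t tx = 0; tx < width_bytes; tx += tile_w_bytes) {
            /* One tile: each stored row goes to its own linear scanline. */
            for (size_t row = 0; row < tile_h; row++) {
                memcpy(dst + (ty + row) * dst_linesize + tx,
                       src, tile_w_bytes);
                src += tile_w_bytes;
            }
        }
    }
    return 0;
}
```

Rejecting surfaces that are not tile aligned keeps the sketch short; a
real implementation would also have to deal with a source pitch that is
larger than the visible width.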
It also contains a detile_generic routine, which can easily be configured
to support other tiling layouts, at the cost of some processing speed
compared to a detiler written for one specific layout.
---
 libavutil/Makefile |   2 +
 libavutil/fbtile.c | 441 +++++++++++++++++++++++++++++++++++++++++++++
 libavutil/fbtile.h | 228 +++++++++++++++++++++++
 3 files changed, 671 insertions(+)
 create mode 100644 libavutil/fbtile.c
 create mode 100644 libavutil/fbtile.h

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 9b08372eb2..9b58ac5980 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -84,6 +84,7 @@ HEADERS = adler32.h \
           xtea.h \
           tea.h \
           tx.h \
+          fbtile.h \
 
 HEADERS-$(CONFIG_LZO) += lzo.h
 
@@ -169,6 +170,7 @@ OBJS = adler32.o \
        tx_float.o \
        tx_double.o \
        tx_int32.o \
+       fbtile.o \
        video_enc_params.o \
 
 
diff --git a/libavutil/fbtile.c b/libavutil/fbtile.c
new file mode 100644
index 0000000000..ca04f0a7d2
--- /dev/null
+++ b/libavutil/fbtile.c
@@ -0,0 +1,441 @@
+/*
+ * CPU based Framebuffer Tile DeTile logic
+ * Copyright (c) 2020 C Hanish Menon
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "avutil.h" +#include "common.h" +#include "fbtile.h" +#if CONFIG_LIBDRM +#include +#endif + + +int fbtilemode_from_formatmodifier(uint64_t formatModifier) +{ + int mode = TILE_NONE_END; + +#if CONFIG_LIBDRM + switch(formatModifier) { + case DRM_FORMAT_MOD_LINEAR: + mode = TILE_NONE; + break; + case I915_FORMAT_MOD_X_TILED: + mode = TILE_INTELX; + break; + case I915_FORMAT_MOD_Y_TILED: + mode = TILE_INTELY; + break; + case I915_FORMAT_MOD_Yf_TILED: + mode = TILE_INTELYF; + break; + default: + mode = TILE_NONE_END; + break; + } +#endif +#ifdef DEBUG_FBTILE_FORMATMODIFIER_MAPPING + av_log(NULL, AV_LOG_DEBUG, "fbtile:formatmodifier[%lx] mapped to mode[%d]\n", formatModifier, mode); +#endif + return mode; +} + + +/** + * Supported pixel formats + * Currently only RGB based 32bit formats are specified + * TODO: Technically the logic is transparent to 16bit RGB formats also to a great extent + */ +const enum AVPixelFormat fbtilePixFormats[] = {AV_PIX_FMT_RGB0, AV_PIX_FMT_0RGB, AV_PIX_FMT_BGR0, AV_PIX_FMT_0BGR, + AV_PIX_FMT_RGBA, AV_PIX_FMT_ARGB, AV_PIX_FMT_BGRA, AV_PIX_FMT_ABGR, + AV_PIX_FMT_NONE}; + +int fbtile_checkpixformats(const enum AVPixelFormat srcPixFormat, const enum AVPixelFormat dstPixFormat) +{ + int okSrc = 0; + int okDst = 0; + for (int i = 0; fbtilePixFormats[i] != AV_PIX_FMT_NONE; i++) { + if (fbtilePixFormats[i] == srcPixFormat) + okSrc = 1; + if (fbtilePixFormats[i] == dstPixFormat) + okDst = 1; + } + return (okSrc && okDst); +} + + +void detile_intelx(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize) +{ + // Offsets and LineSize are in bytes + const int pixBytes = 4; // bytes per pixel + const int tileW = 128; // tileWidth inPixels, 512/4, For a 32Bits/Pixel 
framebuffer + const int tileH = 8; // tileHeight inPixelLines + const int tileWBytes = tileW*pixBytes; // tileWidth inBytes + + if (w*pixBytes != srcLineSize) { + av_log(NULL, AV_LOG_ERROR, "fbdetile:intelx: w%dxh%d, dL%d, sL%d\n", w, h, dstLineSize, srcLineSize); + av_log(NULL, AV_LOG_ERROR, "fbdetile:intelx: dont support LineSize | Pitch going beyond width\n"); + } + int sO = 0; // srcOffset inBytes + int dX = 0; // destX inPixels + int dY = 0; // destY inPixels + int nTLines = (w*h)/tileW; // numTileLines; One TileLine = One TileWidth + int cTL = 0; // curTileLine + while (cTL < nTLines) { + int dO = dY*dstLineSize + dX*pixBytes; +#ifdef DEBUG_FBTILE + av_log(NULL, AV_LOG_DEBUG, "fbdetile:intelx: dX%d dY%d, sO%d, dO%d\n", dX, dY, sO, dO); +#endif + memcpy(dst+dO+0*dstLineSize, src+sO+0*tileWBytes, tileWBytes); + memcpy(dst+dO+1*dstLineSize, src+sO+1*tileWBytes, tileWBytes); + memcpy(dst+dO+2*dstLineSize, src+sO+2*tileWBytes, tileWBytes); + memcpy(dst+dO+3*dstLineSize, src+sO+3*tileWBytes, tileWBytes); + memcpy(dst+dO+4*dstLineSize, src+sO+4*tileWBytes, tileWBytes); + memcpy(dst+dO+5*dstLineSize, src+sO+5*tileWBytes, tileWBytes); + memcpy(dst+dO+6*dstLineSize, src+sO+6*tileWBytes, tileWBytes); + memcpy(dst+dO+7*dstLineSize, src+sO+7*tileWBytes, tileWBytes); + dX += tileW; + if (dX >= w) { + dX = 0; + dY += tileH; + } + sO = sO + tileW*tileH*pixBytes; + cTL += tileH; + } +} + + +/* + * Intel Legacy Tile-Y layout conversion support + * + * currently done in a simple dumb way. Two low hanging optimisations + * that could be readily applied are + * + * a) unrolling the inner for loop + * --- Given small size memcpy, should help, DONE + * + * b) using simd based 128bit loading and storing along with prefetch + * hinting. + * + * TOTHINK|CHECK: Does memcpy already does this and more if situation + * is right?! + * + * As code (or even intrinsics) would be specific to each architecture, + * avoiding for now. 
Later have to check if vector_size attribute and + * corresponding implementation by gcc can handle different architectures + * properly, such that it wont become worse than memcpy provided for that + * architecture. + * + * Or maybe I could even merge the two intel detiling logics into one, as + * the semantic and flow is almost same for both logics. + * + */ +void detile_intely(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize) +{ + // Offsets and LineSize are in bytes + const int pixBytes = 4; // bytesPerPixel + // tileW represents subTileWidth here, as it can be repeated to fill a tile + const int tileW = 4; // tileWidth inPixels, 16/4, For a 32Bits/Pixel framebuffer + const int tileH = 32; // tileHeight inPixelLines + const int tileWBytes = tileW*pixBytes; // tileWidth inBytes + + if (w*pixBytes != srcLineSize) { + av_log(NULL, AV_LOG_ERROR, "fbdetile:intely: w%dxh%d, dL%d, sL%d\n", w, h, dstLineSize, srcLineSize); + av_log(NULL, AV_LOG_ERROR, "fbdetile:intely: dont support LineSize | Pitch going beyond width\n"); + } + int sO = 0; + int dX = 0; + int dY = 0; + const int nTLines = (w*h)/tileW; + int cTL = 0; + while (cTL < nTLines) { + int dO = dY*dstLineSize + dX*pixBytes; +#ifdef DEBUG_FBTILE + av_log(NULL, AV_LOG_DEBUG, "fbdetile:intely: dX%d dY%d, sO%d, dO%d\n", dX, dY, sO, dO); +#endif + + memcpy(dst+dO+0*dstLineSize, src+sO+0*tileWBytes, tileWBytes); + memcpy(dst+dO+1*dstLineSize, src+sO+1*tileWBytes, tileWBytes); + memcpy(dst+dO+2*dstLineSize, src+sO+2*tileWBytes, tileWBytes); + memcpy(dst+dO+3*dstLineSize, src+sO+3*tileWBytes, tileWBytes); + memcpy(dst+dO+4*dstLineSize, src+sO+4*tileWBytes, tileWBytes); + memcpy(dst+dO+5*dstLineSize, src+sO+5*tileWBytes, tileWBytes); + memcpy(dst+dO+6*dstLineSize, src+sO+6*tileWBytes, tileWBytes); + memcpy(dst+dO+7*dstLineSize, src+sO+7*tileWBytes, tileWBytes); + memcpy(dst+dO+8*dstLineSize, src+sO+8*tileWBytes, tileWBytes); + memcpy(dst+dO+9*dstLineSize, src+sO+9*tileWBytes, 
tileWBytes); + memcpy(dst+dO+10*dstLineSize, src+sO+10*tileWBytes, tileWBytes); + memcpy(dst+dO+11*dstLineSize, src+sO+11*tileWBytes, tileWBytes); + memcpy(dst+dO+12*dstLineSize, src+sO+12*tileWBytes, tileWBytes); + memcpy(dst+dO+13*dstLineSize, src+sO+13*tileWBytes, tileWBytes); + memcpy(dst+dO+14*dstLineSize, src+sO+14*tileWBytes, tileWBytes); + memcpy(dst+dO+15*dstLineSize, src+sO+15*tileWBytes, tileWBytes); + memcpy(dst+dO+16*dstLineSize, src+sO+16*tileWBytes, tileWBytes); + memcpy(dst+dO+17*dstLineSize, src+sO+17*tileWBytes, tileWBytes); + memcpy(dst+dO+18*dstLineSize, src+sO+18*tileWBytes, tileWBytes); + memcpy(dst+dO+19*dstLineSize, src+sO+19*tileWBytes, tileWBytes); + memcpy(dst+dO+20*dstLineSize, src+sO+20*tileWBytes, tileWBytes); + memcpy(dst+dO+21*dstLineSize, src+sO+21*tileWBytes, tileWBytes); + memcpy(dst+dO+22*dstLineSize, src+sO+22*tileWBytes, tileWBytes); + memcpy(dst+dO+23*dstLineSize, src+sO+23*tileWBytes, tileWBytes); + memcpy(dst+dO+24*dstLineSize, src+sO+24*tileWBytes, tileWBytes); + memcpy(dst+dO+25*dstLineSize, src+sO+25*tileWBytes, tileWBytes); + memcpy(dst+dO+26*dstLineSize, src+sO+26*tileWBytes, tileWBytes); + memcpy(dst+dO+27*dstLineSize, src+sO+27*tileWBytes, tileWBytes); + memcpy(dst+dO+28*dstLineSize, src+sO+28*tileWBytes, tileWBytes); + memcpy(dst+dO+29*dstLineSize, src+sO+29*tileWBytes, tileWBytes); + memcpy(dst+dO+30*dstLineSize, src+sO+30*tileWBytes, tileWBytes); + memcpy(dst+dO+31*dstLineSize, src+sO+31*tileWBytes, tileWBytes); + + dX += tileW; + if (dX >= w) { + dX = 0; + dY += tileH; + } + sO = sO + tileW*tileH*pixBytes; + cTL += tileH; + } +} + + +/* + * Generic detile logic + */ + +/* + * Direction Change Entry + * Used to specify the tile walking of subtiles within a tile. + */ +/** + * Settings for Intel Tile-Yf framebuffer layout. 
+ * May need to swap the 4 pixel wide subtile, have to check doc bit more + */ +const int tyfBytesPerPixel = 4; +const int tyfSubTileWidth = 4; +const int tyfSubTileHeight = 8; +const int tyfSubTileWidthBytes = tyfSubTileWidth*tyfBytesPerPixel; //16 +const int tyfTileWidth = 32; +const int tyfTileHeight = 32; +const int tyfNumDirChanges = 6; +struct dirChange tyfDirChanges[] = { {8, 4, 0}, {16, -4, 8}, {32, 4, -8}, {64, -12, 8 }, {128, 4, -24}, {256, 4, -24} }; + +/** + * Setting for Intel Tile-X framebuffer layout + */ +const int txBytesPerPixel = 4; +const int txSubTileWidth = 128; +const int txSubTileHeight = 8; +const int txSubTileWidthBytes = txSubTileWidth*txBytesPerPixel; //512 +const int txTileWidth = 128; +const int txTileHeight = 8; +const int txNumDirChanges = 1; +struct dirChange txDirChanges[] = { {8, 128, 0} }; + +/** + * Setting for Intel Tile-Y framebuffer layout + * Even thou a simple generic detiling logic doesnt require the + * dummy 256 posOffset entry. The pseudo parallel detiling based + * opti logic requires to know about the Tile boundry. 
+ */ +const int tyBytesPerPixel = 4; +const int tySubTileWidth = 4; +const int tySubTileHeight = 32; +const int tySubTileWidthBytes = tySubTileWidth*tyBytesPerPixel; //16 +const int tyTileWidth = 32; +const int tyTileHeight = 32; +const int tyNumDirChanges = 2; +struct dirChange tyDirChanges[] = { {32, 4, 0}, {256, 4, 0} }; + + +void detile_generic_simple(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize, + int bytesPerPixel, + int subTileWidth, int subTileHeight, int subTileWidthBytes, + int tileWidth, int tileHeight, + int numDirChanges, struct dirChange *dirChanges) +{ + + if (w*bytesPerPixel != srcLineSize) { + av_log(NULL, AV_LOG_ERROR, "fbdetile:generic: w%dxh%d, dL%d, sL%d\n", w, h, dstLineSize, srcLineSize); + av_log(NULL, AV_LOG_ERROR, "fbdetile:generic: dont support LineSize | Pitch going beyond width\n"); + } + int sO = 0; + int dX = 0; + int dY = 0; + int nSTLines = (w*h)/subTileWidth; // numSubTileLines + int cSTL = 0; // curSubTileLine + while (cSTL < nSTLines) { + int dO = dY*dstLineSize + dX*bytesPerPixel; +#ifdef DEBUG_FBTILE + av_log(NULL, AV_LOG_DEBUG, "fbdetile:generic: dX%d dY%d, sO%d, dO%d\n", dX, dY, sO, dO); +#endif + + for (int k = 0; k < subTileHeight; k++) { + memcpy(dst+dO+k*dstLineSize, src+sO+k*subTileWidthBytes, subTileWidthBytes); + } + sO = sO + subTileHeight*subTileWidthBytes; + + cSTL += subTileHeight; + for (int i=numDirChanges-1; i>=0; i--) { + if ((cSTL%dirChanges[i].posOffset) == 0) { + dX += dirChanges[i].xDelta; + dY += dirChanges[i].yDelta; + break; + } + } + if (dX >= w) { + dX = 0; + dY += tileHeight; + } + } +} + + +void detile_generic_opti(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize, + int bytesPerPixel, + int subTileWidth, int subTileHeight, int subTileWidthBytes, + int tileWidth, int tileHeight, + int numDirChanges, struct dirChange *dirChanges) +{ + int parallel = 1; + + if (w*bytesPerPixel != srcLineSize) { + av_log(NULL, AV_LOG_ERROR, 
"fbdetile:generic: w%dxh%d, dL%d, sL%d\n", w, h, dstLineSize, srcLineSize); + av_log(NULL, AV_LOG_ERROR, "fbdetile:generic: dont support LineSize | Pitch going beyond width\n"); + } + if (w%tileWidth != 0) { + av_log(NULL, AV_LOG_ERROR, "fbdetile:generic:NotSupported:NonMultWidth: width%d, tileWidth%d\n", w, tileWidth); + } + int sO = 0; + int sOPrev = 0; + int dX = 0; + int dY = 0; + int nSTLines = (w*h)/subTileWidth; + //int nSTLinesInATile = (tileWidth*tileHeight)/subTileWidth; + int nTilesInARow = w/tileWidth; + for (parallel=8; parallel>0; parallel--) { + if (nTilesInARow%parallel == 0) + break; + } + int cSTL = 0; + int curTileInRow = 0; + while (cSTL < nSTLines) { + int dO = dY*dstLineSize + dX*bytesPerPixel; +#ifdef DEBUG_FBTILE + av_log(NULL, AV_LOG_DEBUG, "fbdetile:generic: dX%d dY%d, sO%d, dO%d\n", dX, dY, sO, dO); +#endif + + // As most tiling layouts have a minimum subtile of 4x4, if I remember correctly, + // so this loop has been unrolled to be multiples of 4, and speed up a bit. + // However tiling involving 3x3 or 2x2 wont be handlable. Use detile_generic_simple + // for such tile layouts. + // Detile parallely to a limited extent. To avoid any cache set-associativity and or + // limited cache based thrashing, keep it spacially and inturn temporaly small at one level. 
+ for (int k = 0; k < subTileHeight; k+=4) { + for (int p = 0; p < parallel; p++) { + int pSrcOffset = p*tileWidth*tileHeight*bytesPerPixel; + int pDstOffset = p*tileWidth*bytesPerPixel; + memcpy(dst+dO+k*dstLineSize+pDstOffset, src+sO+k*subTileWidthBytes+pSrcOffset, subTileWidthBytes); + memcpy(dst+dO+(k+1)*dstLineSize+pDstOffset, src+sO+(k+1)*subTileWidthBytes+pSrcOffset, subTileWidthBytes); + memcpy(dst+dO+(k+2)*dstLineSize+pDstOffset, src+sO+(k+2)*subTileWidthBytes+pSrcOffset, subTileWidthBytes); + memcpy(dst+dO+(k+3)*dstLineSize+pDstOffset, src+sO+(k+3)*subTileWidthBytes+pSrcOffset, subTileWidthBytes); + } + } + sO = sO + subTileHeight*subTileWidthBytes; + + cSTL += subTileHeight; + for (int i=numDirChanges-1; i>=0; i--) { + if ((cSTL%dirChanges[i].posOffset) == 0) { + if (i == numDirChanges-1) { + curTileInRow += parallel; + dX = curTileInRow*tileWidth; + sO = sOPrev + tileWidth*tileHeight*bytesPerPixel*(parallel); + sOPrev = sO; + } else { + dX += dirChanges[i].xDelta; + } + dY += dirChanges[i].yDelta; + break; + } + } + if (dX >= w) { + dX = 0; + curTileInRow = 0; + dY += tileHeight; + if (dY >= h) { + break; + } + } + } +} + + +int detile_this(int mode, uint64_t arg1, + int w, int h, + uint8_t *dst, int dstLineSize, + uint8_t *src, int srcLineSize, + int bytesPerPixel) +{ + static int logState=0; + if (mode == TILE_AUTO) { + mode = fbtilemode_from_formatmodifier(arg1); + } + if (mode == TILE_NONE) { + return 1; + } + + if (mode == TILE_INTELX) { + detile_intelx(w, h, dst, dstLineSize, src, srcLineSize); + } else if (mode == TILE_INTELY) { + detile_intely(w, h, dst, dstLineSize, src, srcLineSize); + } else if (mode == TILE_INTELYF) { + detile_generic(w, h, dst, dstLineSize, src, srcLineSize, + tyfBytesPerPixel, tyfSubTileWidth, tyfSubTileHeight, tyfSubTileWidthBytes, + tyfTileWidth, tyfTileHeight, + tyfNumDirChanges, tyfDirChanges); + } else if (mode == TILE_INTELGX) { + detile_generic(w, h, dst, dstLineSize, src, srcLineSize, + txBytesPerPixel, 
txSubTileWidth, txSubTileHeight, txSubTileWidthBytes, + txTileWidth, txTileHeight, + txNumDirChanges, txDirChanges); + } else if (mode == TILE_INTELGY) { + detile_generic(w, h, dst, dstLineSize, src, srcLineSize, + tyBytesPerPixel, tySubTileWidth, tySubTileHeight, tySubTileWidthBytes, + tyTileWidth, tyTileHeight, + tyNumDirChanges, tyDirChanges); + } else if (mode == TILE_NONE_END) { + av_log_once(NULL, AV_LOG_WARNING, AV_LOG_VERBOSE, &logState, "fbtile:detile_this:TILE_AUTOOr???: invalid or unsupported format_modifier:%"PRIx64"\n",arg1); + return 1; + } else { + av_log(NULL, AV_LOG_ERROR, "fbtile:detile_this:????: unknown mode specified, check caller\n"); + return 1; + } + return 0; +} + + +// vim: set expandtab sts=4: // diff --git a/libavutil/fbtile.h b/libavutil/fbtile.h new file mode 100644 index 0000000000..51556db93a --- /dev/null +++ b/libavutil/fbtile.h @@ -0,0 +1,228 @@ +/* + * CPU based Framebuffer Tile DeTile logic + * Copyright (c) 2020 C Hanish Menon + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_FBTILE_H +#define AVUTIL_FBTILE_H + +#include +#include + +/** + * @file + * @brief CPU based Framebuffer tiler detiler + * @author C Hanish Menon + * @{ + */ + + +enum FBTileMode { + TILE_NONE, + TILE_AUTO, + TILE_INTELX, + TILE_INTELY, + TILE_INTELYF, + TILE_INTELGX, + TILE_INTELGY, + TILE_NONE_END, +}; + + +/** + * Map from formatmodifier to fbtile's internal mode. + * + * @param formatModifier the format_modifier to map + * @return the fbtile's equivalent internal mode + */ +#undef DEBUG_FBTILE_FORMATMODIFIER_MAPPING +int fbtilemode_from_formatmodifier(uint64_t formatModifier); + + +/** + * Supported pixel formats by the fbtile logics + */ +extern const enum AVPixelFormat fbtilePixFormats[]; +/** + * Check if the given pixel formats are supported by fbtile logic. + * + * @param srcPixFormat pixel format of source image + * @param dstPixFormat pixel format of destination image + */ +int fbtile_checkpixformats(const enum AVPixelFormat srcPixFormat, const enum AVPixelFormat dstPixFormat); + + +/** + * Detile legacy intel tile-x layout into linear layout. + * + * @param w width of the image + * @param h height of the image + * @param dst the destination image buffer + * @param dstLineSize the size of each row in dst image, in bytes + * @param src the source image buffer + * @param srcLineSize the size of each row in src image, in bytes + */ +void detile_intelx(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize); + + +/** + * Detile legacy intel tile-y layout into linear layout. 
+ * + * @param w width of the image + * @param h height of the image + * @param dst the destination image buffer + * @param dstLineSize the size of each row in dst image, in bytes + * @param src the source image buffer + * @param srcLineSize the size of each row in src image, in bytes + */ +void detile_intely(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize); + + +/** + * Generic Logic. + */ + +/* + * Direction Change Entry + * Used to specify the tile walking of subtiles within a tile. + */ +struct dirChange { + int posOffset; + int xDelta; + int yDelta; +}; +/** + * Settings for Intel Tile-Yf framebuffer layout. + * May need to swap the 4 pixel wide subtile, have to check doc bit more + */ +extern const int tyfBytesPerPixel; +extern const int tyfSubTileWidth; +extern const int tyfSubTileHeight; +extern const int tyfSubTileWidthBytes; +extern const int tyfTileWidth; +extern const int tyfTileHeight; +extern const int tyfNumDirChanges; +extern struct dirChange tyfDirChanges[]; +/** + * Setting for Intel Tile-X framebuffer layout + */ +extern const int txBytesPerPixel; +extern const int txSubTileWidth; +extern const int txSubTileHeight; +extern const int txSubTileWidthBytes; +extern const int txTileWidth; +extern const int txTileHeight; +extern const int txNumDirChanges; +extern struct dirChange txDirChanges[]; +/** + * Setting for Intel Tile-Y framebuffer layout + * Even thou a simple generic detiling logic doesnt require the + * dummy 256 posOffset entry. The pseudo parallel detiling based + * opti logic requires to know about the Tile boundry. + */ +extern const int tyBytesPerPixel; +extern const int tySubTileWidth; +extern const int tySubTileHeight; +extern const int tySubTileWidthBytes; +extern const int tyTileWidth; +extern const int tyTileHeight; +extern const int tyNumDirChanges; +extern struct dirChange tyDirChanges[]; + +/** + * Generic Logic to Detile into linear layout. 
+ * + * @param w width of the image + * @param h height of the image + * @param dst the destination image buffer + * @param dstLineSize the size of each row in dst image, in bytes + * @param src the source image buffer + * @param srcLineSize the size of each row in src image, in bytes + * @param bytesPerPixel the bytes per pixel for the image + * @param subTileWidth the width of subtile within the tile, in pixels + * @param subTileHeight the height of subtile within the tile, in pixels + * @param subTileWidthBytes the width of subtile within the tile, in bytes + * @param tileWidth the width of the tile, in pixels + * @param tileHeight the height of the tile, in pixels + */ + + +/** + * Generic detile simple version, which is fine-grained. + */ +void detile_generic_simple(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize, + int bytesPerPixel, + int subTileWidth, int subTileHeight, int subTileWidthBytes, + int tileWidth, int tileHeight, + int numDirChanges, struct dirChange *dirChanges); + + +/** + * Generic detile optimised version, minimum subtile supported 4x4. + */ +void detile_generic_opti(int w, int h, + uint8_t *dst, int dstLineSize, + const uint8_t *src, int srcLineSize, + int bytesPerPixel, + int subTileWidth, int subTileHeight, int subTileWidthBytes, + int tileWidth, int tileHeight, + int numDirChanges, struct dirChange *dirChanges); + + +#ifdef DETILE_GENERIC_OPTI +#define detile_generic detile_generic_opti +#else +#define detile_generic detile_generic_simple +#endif + + +/** + * detile demuxer. 
+ * + * @param mode the fbtile mode based detiling to call + * @param arg1 the format_modifier, in case mode is TILE_AUTO + * @param w width of the image + * @param h height of the image + * @param dst the destination image buffer + * @param dstLineSize the size of each row in dst image, in bytes + * @param src the source image buffer + * @param srcLineSize the size of each row in src image, in bytes + * @param bytesPerPixel the bytes per pixel for the image + * + * @return 0 if detiled, 1 if not + */ +int detile_this(int mode, uint64_t arg1, + int w, int h, + uint8_t *dst, int dstLineSize, + uint8_t *src, int srcLineSize, + int bytesPerPixel); + + +/** + * @} + */ + +#endif /* AVUTIL_FBTILE_H */ +// vim: set expandtab sts=4: // From patchwork Sat Jul 4 13:17:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: hanishkvc X-Patchwork-Id: 20799 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id ED1654498AB for ; Sat, 4 Jul 2020 16:19:17 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D3A9F68B4C3; Sat, 4 Jul 2020 16:19:17 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 9E62868B414 for ; Sat, 4 Jul 2020 16:19:10 +0300 (EEST) Received: by mail-pj1-f52.google.com with SMTP id k5so5513568pjg.3 for ; Sat, 04 Jul 2020 06:19:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=X73mpZPOht/b+HX0SbCvXAXfylNaVU4HacnUBXFPbvk=; 
b=OT5IrAsHKB+dlgrz4grHq/xzIp0G8dbKu6NbeWMlhWU7qKjtZcOpVy8pP4/8u4ukJh YRGTEDXJ5uNTtmccqLisvrb2c8l3EbzYfypYyKPxV/tqTrYkkJABzmr2saJ6RqBrERqK qr4f4kYuuMKHxyBc6wojRHh4iICXhvc76TwnrCqRfMU6wZG7OfBUhRbLORegrmSqCGt5 1+WbfSSfg/3Sd6NsxS11d/C4roG/TNF5qQmFvAMVwvnn5+QomqpMGGTMjE0hLye8juWg XnC8fP4JnCS7U8x9gNnrtv/RyHAbPY+g70co2jC5n6h8Rvdc4LKAKD6cv+psjWDP+bjt S49A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=X73mpZPOht/b+HX0SbCvXAXfylNaVU4HacnUBXFPbvk=; b=sqlNgnBmPUtafpiw727xuS44yinNuXk1OSPFpyrL3bRVomeZhasC5WGXvuAgnJN6SQ +BZlCSR0MDRzeQdnkicpgeJ3CPuSXRrY+c57ZHjwUyArlBQoBvFVTMpm7YRipQqRtY4Y d7Q1PRqGIB8eZ1AtdeOuIHWmZCOSZZHjw/aUsfdRttUAF237hjD860RQ+/E19DylkJSq xL/IUe16E9eD1LIpF6BJZHHONzX3UstG1+TZs0O4jp9gJfiYAN+/2ZqO0Spg9PWVYMOa wIQTvyUQLoFnuzc7x24FYm88qw06DzPEGm+JR4QLCUWWZaZBZ3Z6r5pMvxpi6KU+e234 Kk2A== X-Gm-Message-State: AOAM532Dy0X68CXwknRwB5rOZx1iA3ExWmicHoKwSJPg6aNigQvOycA4 NzMKMrsEg8DivD53dP15mKPl4Djs X-Google-Smtp-Source: ABdhPJw+xFXevpdlf2hJjK68NfFEInRgghfA0pIdAvOEjQVuKH99cGSLR0S1j1IhTYoMx8t3KvNsCg== X-Received: by 2002:a17:902:c206:: with SMTP id 6mr11883711pll.30.1593868748525; Sat, 04 Jul 2020 06:19:08 -0700 (PDT) Received: from localhost.localdomain ([122.179.70.80]) by smtp.gmail.com with ESMTPSA id b191sm14657507pga.13.2020.07.04.06.19.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 04 Jul 2020 06:19:07 -0700 (PDT) From: hanishkvc To: ffmpeg-devel@ffmpeg.org Date: Sat, 4 Jul 2020 18:47:15 +0530 Message-Id: <20200704131717.49428-4-hanishkvc@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200704131717.49428-1-hanishkvc@gmail.com> References: <20200704131717.49428-1-hanishkvc@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v06 3/5] hwcontext_drm detile non linear layout, if possible X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list 
List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: hanishkvc Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" If the framebuffer uses a tiled layout, use the fbtile helper routines to try to detile it into a linear layout, if fbtile supports that layout. The format_modifier associated with the framebuffer decides whether detiling is applied and, in turn, which specific detiling logic is used. A user of kmsgrab has to use the -format_modifier option of kmsgrab to force a specific detile logic, in case they don't want the detiling implied by the original format_modifier. They can also pass -format_modifier 0 to make hwcontext_drm bypass this detiling. --- Changelog | 1 + libavutil/hwcontext_drm.c | 32 ++++++++++++++++++++++++++++++-- 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/Changelog b/Changelog index 3881587caa..b6a4ad1b34 100644 --- a/Changelog +++ b/Changelog @@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release, releases are sorted from youngest to oldest. version : +- hwcontext_drm detiles non linear layouts, if possible - kmsgrab GetFB2 format_modifier, if user doesnt specify - AudioToolbox output device - MacCaption demuxer diff --git a/libavutil/hwcontext_drm.c b/libavutil/hwcontext_drm.c index 32cbde82eb..bd74b3f13d 100644 --- a/libavutil/hwcontext_drm.c +++ b/libavutil/hwcontext_drm.c @@ -21,6 +21,7 @@ #include #include +#include #include #include "avassert.h" @@ -28,6 +29,7 @@ #include "hwcontext_drm.h" #include "hwcontext_internal.h" #include "imgutils.h" +#include "fbtile.h" static void drm_device_free(AVHWDeviceContext *hwdev) @@ -185,6 +187,32 @@ static int drm_transfer_get_formats(AVHWFramesContext *ctx, return 0; } +// Can be overridden during compiling, if required.
+#ifndef HWCTXDRM_SYNCRELATED_FORMATMODIFIER +#define HWCTXDRM_SYNCRELATED_FORMATMODIFIER 1 +#endif +static int drm_transfer_with_detile(const AVFrame *hwAVFrame, AVFrame *dst, const AVFrame *src) +{ + int err = 0; + + if (hwAVFrame->format == AV_PIX_FMT_DRM_PRIME) { + AVDRMFrameDescriptor *drmFrame = (AVDRMFrameDescriptor*)hwAVFrame->data[0]; + uint64_t formatModifier = drmFrame->objects[0].format_modifier; + if (formatModifier != DRM_FORMAT_MOD_LINEAR) { + err = detile_this(TILE_AUTO, formatModifier, dst->width, dst->height, + dst->data[0], dst->linesize[0], + src->data[0], src->linesize[0], 4); + if (!err) { +#if HWCTXDRM_SYNCRELATED_FORMATMODIFIER + drmFrame->objects[0].format_modifier = DRM_FORMAT_MOD_LINEAR; +#endif + return 0; + } + } + } + return av_frame_copy(dst, src); +} + static int drm_transfer_data_from(AVHWFramesContext *hwfc, AVFrame *dst, const AVFrame *src) { @@ -206,7 +234,7 @@ static int drm_transfer_data_from(AVHWFramesContext *hwfc, map->width = dst->width; map->height = dst->height; - err = av_frame_copy(dst, map); + err = drm_transfer_with_detile(src, dst, map); if (err) goto fail; @@ -238,7 +266,7 @@ static int drm_transfer_data_to(AVHWFramesContext *hwfc, map->width = src->width; map->height = src->height; - err = av_frame_copy(map, src); + err = drm_transfer_with_detile(dst, map, src); if (err) goto fail; From patchwork Sat Jul 4 13:17:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: hanishkvc X-Patchwork-Id: 20800 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 3C6914498AB for ; Sat, 4 Jul 2020 16:19:22 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1AB6968B4DF; Sat, 4 Jul 2020 16:19:22 +0300 (EEST) X-Original-To: 
ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C457E68B490 for ; Sat, 4 Jul 2020 16:19:19 +0300 (EEST) Received: by mail-pf1-f173.google.com with SMTP id 207so14876123pfu.3 for ; Sat, 04 Jul 2020 06:19:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QDZMz80FcmqrcFdzvu5UD2WrCNCtjJ352jbHJosKPcg=; b=AUSTYg9OGzoGYkg7onPVxUy3s8nlUSuI79r7kZJRQ9LbbgiGgJug19YGtuOLt/iqSa I6SSu+HvztCK3MCbRtfu2SFWWmSS0zCWgedc2Vd3kYUAMidCRViOFRCYk8/FZw3dhoJN ld0TvAgJWv7dGIPkUJeHwBbgOORc4ynm+2V+GI+SQdb6Ka/eC9DrN2V0R02hVO+lU3wb FEao2eAx7ASlCHdeHyXGN3/WdoLB416t1BnxIaLwNIUmL8rNan/EdblfLSjsMW5NYn+o pPSFFt2oa1B/qwcbWk1lGukyG9IJIUVf/VG9OoYnoUF3A3LdAfd1Jn4wiua7Rj3bK0VX XxRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QDZMz80FcmqrcFdzvu5UD2WrCNCtjJ352jbHJosKPcg=; b=MuQGw/yhU6AnJ53eiFWoEJhzqQi1WBI2aGA0SESECiNPgr1/tvXizXhea5irUVziw2 oqGW2Kd9RsVNR9rB3TH1VE4akZn/52W7DxjAgud4w9iwdmDTJ4clO/Z9llg2G/T/Bd3E HmKcmsCKJTus5EKxQPZBFr37mT+Xke626AjzcQVEbhte2eabwm04+51b4ufJUaX7aRyP D/fU+5Fzqp7pk9FPVCthHSF9MaO6DY+JkA7wnohoersZzktqtDRV8pG2UUVzKy69559z fQ6DkXCbFgb6F46Ab4Kmsq4sxQ1Dq8By/vNPcHiD5fBKJSAbQlVQ3QOh1kJHj48IxJsg 3f0g== X-Gm-Message-State: AOAM5326EH+u3u1R0/i3GvmfCgsjPX+CSA7d/oJxzI/JMfitl2KKlSw+ DzS8Wn8Z9RGjdxbLrONVbQDqrxQk X-Google-Smtp-Source: ABdhPJy9q0quqV+AII/GsoObv+kfDt7R1uYWHVmDq/TN7KQMMPz2bFQA0JmUkfvtHD3NrIlTf2rsBw== X-Received: by 2002:a63:9246:: with SMTP id s6mr32185547pgn.22.1593868757470; Sat, 04 Jul 2020 06:19:17 -0700 (PDT) Received: from localhost.localdomain ([122.179.70.80]) by smtp.gmail.com with ESMTPSA id 
b191sm14657507pga.13.2020.07.04.06.19.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 04 Jul 2020 06:19:16 -0700 (PDT) From: hanishkvc To: ffmpeg-devel@ffmpeg.org Date: Sat, 4 Jul 2020 18:47:16 +0530 Message-Id: <20200704131717.49428-5-hanishkvc@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200704131717.49428-1-hanishkvc@gmail.com> References: <20200704131717.49428-1-hanishkvc@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v06 4/5] hwdownload detile framebuffer, if requested by user X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: hanishkvc Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Add logic to support detiling of the framebuffer. It is disabled by default and is triggered only if the user requests it. It uses the fbtile helper routines to do the detiling. Currently framebuffers with a 32bit RGB pixel format are supported. If the underlying hardware context provides linear layouts, nothing needs to be done. If the underlying hardware context generates a tiled layout, the user can use this option to detile it, where possible. ./ffmpeg -f kmsgrab -i - -vf hwdownload=1,format=bgr0 out.mp4 --- Changelog | 1 + doc/filters.texi | 19 ++++++++++ libavfilter/vf_hwdownload.c | 74 ++++++++++++++++++++++++++++++++++++- 3 files changed, 92 insertions(+), 2 deletions(-) diff --git a/Changelog b/Changelog index b6a4ad1b34..6174770ce1 100644 --- a/Changelog +++ b/Changelog @@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release, releases are sorted from youngest to oldest.
version : +- hwdownload framebuffer layout detiling (Intel tile-x|y|yf layouts) - hwcontext_drm detiles non linear layouts, if possible - kmsgrab GetFB2 format_modifier, if user doesnt specify - AudioToolbox output device diff --git a/doc/filters.texi b/doc/filters.texi index 67892e0afb..c783e059c2 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -12097,6 +12097,25 @@ Not all formats will be supported on the output - it may be necessary to insert an additional @option{format} filter immediately following in the graph to get the output in a supported format. +It supports the following optional parameters: + +@table @option +@item fbdetile +Specify the type of CPU based framebuffer layout detiling to apply. The supported values are: +@table @var +@item 0 +Don't do software detiling (the default). +@item 1 +Auto detect the detile logic to apply (for hwcontext_drm). +@item 2 +Intel tile-x to linear conversion. +@item 3 +Intel tile-y to linear conversion. +@item 4 +Intel tile-yf to linear conversion. +@end table +@end table + @section hwmap Map hardware frames to system memory or to another device.
diff --git a/libavfilter/vf_hwdownload.c b/libavfilter/vf_hwdownload.c index 33af30cf40..5413ff104d 100644 --- a/libavfilter/vf_hwdownload.c +++ b/libavfilter/vf_hwdownload.c @@ -22,6 +22,10 @@ #include "libavutil/mem.h" #include "libavutil/opt.h" #include "libavutil/pixdesc.h" +#include "libavutil/fbtile.h" +#if CONFIG_LIBDRM +#include "libavutil/hwcontext_drm.h" +#endif #include "avfilter.h" #include "formats.h" @@ -33,8 +37,23 @@ typedef struct HWDownloadContext { AVBufferRef *hwframes_ref; AVHWFramesContext *hwframes; + int fbdetile; } HWDownloadContext; +#define OFFSET(x) offsetof(HWDownloadContext, x) +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM +static const AVOption hwdownload_options[] = { + { "fbdetile", "set framebuffer detile mode", OFFSET(fbdetile), AV_OPT_TYPE_INT, {.i64=TILE_NONE}, 0, TILE_NONE_END-1, FLAGS, "fbdetile" }, + { "none", "No SW detiling", 0, AV_OPT_TYPE_CONST, {.i64=TILE_NONE}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { "auto", "auto select based on format_modifier", 0, AV_OPT_TYPE_CONST, {.i64=TILE_AUTO}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { "intelx", "Intel Tile-X layout", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELX}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { "intely", "Intel Tile-Y layout", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELY}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { "intelyf", "Intel Tile-Yf layout", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELYF}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { "intelgx", "Intel Tile-X layout, GenericDetile", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELGX}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { "intelgy", "Intel Tile-Y layout, GenericDetile", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELGY}, INT_MIN, INT_MAX, FLAGS, "fbdetile" }, + { NULL } +}; + static int hwdownload_query_formats(AVFilterContext *avctx) { AVFilterFormats *infmts = NULL; @@ -81,6 +100,16 @@ static int hwdownload_config_input(AVFilterLink *inlink) ctx->hwframes = (AVHWFramesContext*)ctx->hwframes_ref->data; + int found 
= 0; + if (ctx->fbdetile != 0) { + found = fbtile_checkpixformats(ctx->hwframes->sw_format, fbtilePixFormats[0]); + if (!found) { + av_log(ctx, AV_LOG_ERROR, "Invalid input format %s for fbdetile.\n", + av_get_pix_fmt_name(ctx->hwframes->sw_format)); + return AVERROR(EINVAL); + } + } + return 0; } @@ -116,6 +145,15 @@ static int hwdownload_config_output(AVFilterLink *outlink) return AVERROR(EINVAL); } + if (ctx->fbdetile != 0) { + found = fbtile_checkpixformats(outlink->format, fbtilePixFormats[0]); + if (!found) { + av_log(ctx, AV_LOG_ERROR, "Invalid output format %s for fbdetile.\n", + av_get_pix_fmt_name(outlink->format)); + return AVERROR(EINVAL); + } + } + outlink->w = inlink->w; outlink->h = inlink->h; @@ -128,6 +166,7 @@ static int hwdownload_filter_frame(AVFilterLink *link, AVFrame *input) AVFilterLink *outlink = avctx->outputs[0]; HWDownloadContext *ctx = avctx->priv; AVFrame *output = NULL; + AVFrame *output2 = NULL; int err; if (!ctx->hwframes_ref || !input->hw_frames_ctx) { @@ -162,13 +201,44 @@ static int hwdownload_filter_frame(AVFilterLink *link, AVFrame *input) if (err < 0) goto fail; + if (ctx->fbdetile == 0) { + av_frame_free(&input); + return ff_filter_frame(avctx->outputs[0], output); + } + + output2 = ff_get_video_buffer(outlink, ctx->hwframes->width, + ctx->hwframes->height); + if (!output2) { + err = AVERROR(ENOMEM); + goto fail; + } + + output2->width = outlink->w; + output2->height = outlink->h; + uint64_t formatModifier = 0; +#if CONFIG_LIBDRM + if (input->format == AV_PIX_FMT_DRM_PRIME) { + AVDRMFrameDescriptor *drmFrame = (AVDRMFrameDescriptor*)input->data[0]; + formatModifier = drmFrame->objects[0].format_modifier; + } +#endif + detile_this(ctx->fbdetile, formatModifier, output2->width, output2->height, + output2->data[0], output2->linesize[0], + output->data[0], output->linesize[0], 4); + + err = av_frame_copy_props(output2, input); + if (err < 0) + goto fail; + av_frame_free(&input); + av_frame_free(&output); - return ff_filter_frame(avctx->outputs[0],
output); + return ff_filter_frame(avctx->outputs[0], output2); fail: av_frame_free(&input); av_frame_free(&output); + av_frame_free(&output2); return err; } @@ -182,7 +252,7 @@ static av_cold void hwdownload_uninit(AVFilterContext *avctx) static const AVClass hwdownload_class = { .class_name = "hwdownload", .item_name = av_default_item_name, - .option = NULL, + .option = hwdownload_options, .version = LIBAVUTIL_VERSION_INT, }; From patchwork Sat Jul 4 13:17:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: hanishkvc X-Patchwork-Id: 20801 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 34AF44498AB for ; Sat, 4 Jul 2020 16:19:33 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 19C6E68B4B2; Sat, 4 Jul 2020 16:19:33 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id CDD0268B464 for ; Sat, 4 Jul 2020 16:19:25 +0300 (EEST) Received: by mail-pj1-f52.google.com with SMTP id cm21so5589380pjb.3 for ; Sat, 04 Jul 2020 06:19:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oOCrAqbZlwMEKiTEhmO9Mtxfesr4Xe5LSV9XP39BQhg=; b=b2MtaHY5fhXxAFkQyEheYlviHDgps7NA3FxKtCRblu1GagF3zfjESsl+PiilEmfg1s tilDj3+xqRLcasDmn/jO5DNNs4+BzOSOLhctM9jU2ccEqTZKLTIqkm4Ss6DjwhP7t/ia pJWAcdBCV21d04TQ9oWa1y9r/8riiiaWpy3da7gyoeJy5A5eqmN+ujGtoVW/kWH7t2jt CRHiD+YrJOEyeIt964NX+PRCrRpqLmJjope+JNyunmGUiJhtKUIJmp34Se+bk5a43gz7 
vsSKABHCBNRU9Z9e8GsaqtNl/GfKRKGAeN799aT23EGxQg4MESR8hAo2ciPyEgFT5sTk ydbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oOCrAqbZlwMEKiTEhmO9Mtxfesr4Xe5LSV9XP39BQhg=; b=W5AH0zVYAZxSHIAkA9Bt7P26DvJ2vv2qHwSkcSxCxOwERNEBSfkisf24blDstanNVd U3T7K7/r3EasiGOW2jeXU061i7P7IdqRw2Am2FQbDe6andqr0vPndSafvGu+YJ9rKxWQ 16psHf6eBySz4l4YvMxWFy86oiXepkwo8Y1u9NqgphWEGHbcHk6QDvVTyHzR57CZlxy4 Z674RYXCuQ0s8O6gFMOpH+lQJ+ovEMG7dBSrK2ucSyfid9xckEM62eeo4/z53meUC27l BleQ92E4/2PjZUX/ChqVdTh+Feh/683jMxug0Nkoq7mUV2EtHdQrWM37WgMnua052KCq cT8Q== X-Gm-Message-State: AOAM530XFJNGspQCBxtY5fhbVAYzr0Cads/zWJmpqjHY3IaIJelyycJs 0Dyb1UmS2YuLkw9dUzKJy3mX9kDU X-Google-Smtp-Source: ABdhPJyOgLVqLLxpIVLH09+gS4v/1hel8nix4xR49O8eLA3RUyhnEfIY5PdnQ+4dscGVQR69F+KdUw== X-Received: by 2002:a17:90a:2465:: with SMTP id h92mr12225292pje.26.1593868763586; Sat, 04 Jul 2020 06:19:23 -0700 (PDT) Received: from localhost.localdomain ([122.179.70.80]) by smtp.gmail.com with ESMTPSA id b191sm14657507pga.13.2020.07.04.06.19.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 04 Jul 2020 06:19:22 -0700 (PDT) From: hanishkvc To: ffmpeg-devel@ffmpeg.org Date: Sat, 4 Jul 2020 18:47:17 +0530 Message-Id: <20200704131717.49428-6-hanishkvc@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200704131717.49428-1-hanishkvc@gmail.com> References: <20200704131717.49428-1-hanishkvc@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v06 5/5] fbdetile videofilter cpu based framebuffer detiling X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: hanishkvc Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" This adds a video 
filter called fbdetile, which lets the user detile a tiled framebuffer layout into a linear layout, if required. It uses the fbtile helper routines to do the detiling. This is useful if a) the user doesn't want to detile while capturing some tiled content/framebuffer, OR b) the user already has tiled content, OR c) a developer wants to experiment with tiled data. --- Changelog | 1 + doc/filters.texi | 78 +++++++++++++ libavfilter/Makefile | 1 + libavfilter/allfilters.c | 1 + libavfilter/vf_fbdetile.c | 238 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 319 insertions(+) create mode 100644 libavfilter/vf_fbdetile.c diff --git a/Changelog b/Changelog index 6174770ce1..a4e098f94f 100644 --- a/Changelog +++ b/Changelog @@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release, releases are sorted from youngest to oldest. version : +- fbdetile cpu based framebuffer layout detiling video filter - hwdownload framebuffer layout detiling (Intel tile-x|y|yf layouts) - hwcontext_drm detiles non linear layouts, if possible - kmsgrab GetFB2 format_modifier, if user doesnt specify diff --git a/doc/filters.texi b/doc/filters.texi index c783e059c2..4ff8b7edc4 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -12229,6 +12229,84 @@ It accepts the following optional parameters: The number of the CUDA device to use @end table +@anchor{fbdetile} +@section fbdetile + +Detiles the framebuffer tile layout into a linear layout using the CPU. + +It currently supports conversion from the legacy Intel tile-x and tile-y as well +as the newer Intel tile-yf layouts into a linear layout. This is useful if +one is using kmsgrab and hwdownload to capture a screen which is using one +of these non-linear layouts. + +NOTE: It also provides a generic detiling logic, which can be easily configured +to detile many different tiling schemes in future, if required. The same is +used for detiling the Intel tile-yf layout.
A sample configuration to handle +Intel tile-x and tile-y using the generic detile logic is also included in the code, +for reference. + +Currently it expects the data to be in a 32bit RGB based pixel format. However, +the logic doesn't do any pixel format conversion. 16bit RGB data may be enabled +later, as the logic is, at one level, transparent to it. + +One can either insert this into the filter chain during capture itself or, +if that slows things down, insert it instead +into the filter chain during playback or transcoding. + +It supports the following optional parameters: + +@table @option +@item type +Specify which detiling conversion to apply. The supported values are: +@table @var +@item 0 +Don't do detiling. +@item 1 +Auto detect the detile logic to apply (supported in vf_hwdownload, not in vf_fbdetile). +@item 2 +Intel tile-x to linear conversion (the default). +@item 3 +Intel tile-y to linear conversion. +@item 4 +Intel tile-yf to linear conversion. +@end table +@end table + +To convert during capture itself, one could use +@example +ffmpeg -f kmsgrab -i - -vf "hwdownload,format=bgr0,fbdetile" OUTPUT +@end example + +To convert after the tiled data has already been captured, one could use +@example +ffmpeg -i INPUT -vf "fbdetile" OUTPUT +@end example +@example +ffplay -i INPUT -vf "fbdetile" +@end example + +NOTE: While transcoding a test 1080p h264 stream with 276 frames, the +average times taken by the different detile logics were as below.
+@example +rm out.mp4; time ./ffmpeg -i input.mp4 out.mp4 +rm out.mp4; time ./ffmpeg -i input.mp4 -vf fbdetile=2 out.mp4 +rm out.mp4; time ./ffmpeg -i input.mp4 -vf fbdetile=3 out.mp4 +rm out.mp4; time ./ffmpeg -i input.mp4 -vf fbdetile=4 out.mp4 +@end example +@table @option +@item with no fbdetile filter +it took ~7.28 secs, i5-8th Gen +it took ~10.1 secs, i7-7th Gen +@item with fbdetile=2 filter, Intel Tile-X +it took ~8.69 secs, i5-8th Gen +it took ~13.3 secs, i7-7th Gen +@item with fbdetile=3 filter, Intel Tile-Y +it took ~9.20 secs. i5-8th Gen +it took ~13.5 secs. i7-7th Gen +@item with fbdetile=4 filter, Intel Tile-Yf +it took ~13.8 secs. i7-7th Gen +@end table + @section hqx Apply a high-quality magnification filter designed for pixel art. This filter diff --git a/libavfilter/Makefile b/libavfilter/Makefile index 5123540653..bdb0c379ae 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -280,6 +280,7 @@ OBJS-$(CONFIG_HWDOWNLOAD_FILTER) += vf_hwdownload.o OBJS-$(CONFIG_HWMAP_FILTER) += vf_hwmap.o OBJS-$(CONFIG_HWUPLOAD_CUDA_FILTER) += vf_hwupload_cuda.o OBJS-$(CONFIG_HWUPLOAD_FILTER) += vf_hwupload.o +OBJS-$(CONFIG_FBDETILE_FILTER) += vf_fbdetile.o OBJS-$(CONFIG_HYSTERESIS_FILTER) += vf_hysteresis.o framesync.o OBJS-$(CONFIG_IDET_FILTER) += vf_idet.o OBJS-$(CONFIG_IL_FILTER) += vf_il.o diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c index 1183e40267..f8dceb2a88 100644 --- a/libavfilter/allfilters.c +++ b/libavfilter/allfilters.c @@ -265,6 +265,7 @@ extern AVFilter ff_vf_hwdownload; extern AVFilter ff_vf_hwmap; extern AVFilter ff_vf_hwupload; extern AVFilter ff_vf_hwupload_cuda; +extern AVFilter ff_vf_fbdetile; extern AVFilter ff_vf_hysteresis; extern AVFilter ff_vf_idet; extern AVFilter ff_vf_il; diff --git a/libavfilter/vf_fbdetile.c b/libavfilter/vf_fbdetile.c new file mode 100644 index 0000000000..bfc28da465 --- /dev/null +++ b/libavfilter/vf_fbdetile.c @@ -0,0 +1,238 @@ +/* + * Copyright (c) 2020 HanishKVC + * + * This file 
is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * Detile the Frame buffer's tile layout using the cpu + * Currently it supports detiling of following layouts + * legacy Intel Tile-X + * legacy Intel Tile-Y + * newer Intel Tile-Yf + * More tiling layouts can be easily supported by adding configuration data + * for the generic detile logic, wrt the required tiling schemes. + * + */ + +/* + * ToThink|Check: Optimisations + * + * Does gcc setting used by ffmpeg allows memcpy | stringops inlining, + * loop unrolling, better native matching instructions, additional + * optimisations, ... + * + * Does gcc map to optimal memcpy logic, based on the situation it is + * used in i.e like + * based on size of transfer, alignment, architecture, etc + * a suitable combination of inlining and or rep movsb and or + * simd load/store and or unrolling and or ... + * + * If not, may be look at vector_size or intrinsics or appropriate arch + * and cpu specific inline asm or ... 
+ * + */ + +/* + * Performance check results on i7-7500u + * TileYf, TileGX, TileGY using detile_generic_opti + * This mainly impacts TileYf, due to its deeper subtiling + * Without opti, its TSCCnt rises to around 11.XYM + * Run Type : Type : Seconds Max, Min : TSCCnt Min, Max + * Non filter run: : 10.11s, 09.96s : + * fbdetile=2 run: TileX : 13.45s, 13.20s : 05.95M, 06.10M + * fbdetile=3 run: TileY : 13.50s, 13.39s : 06.22M, 06.39M + * fbdetile=4 run: TileYf : 13.75s, 13.63s : 09.82M, 09.90M + * fbdetile=5 run: TileGX : 13.70s, 13.32s : 06.15M, 06.24M + * fbdetile=6 run: TileGY : 14.12s, 13.57s : 08.75M, 09.10M + */ + +#include "libavutil/avassert.h" +#include "libavutil/imgutils.h" +#include "libavutil/opt.h" +#include "libavutil/fbtile.h" +#include "avfilter.h" +#include "formats.h" +#include "internal.h" +#include "video.h" + +// Use Optimised detile_generic or the Simpler but more fine grained one +#define DETILE_GENERIC_OPTI 1 +// Enable printing of the tile walk +#undef DEBUG_FBTILE +// Print time taken by detile using performance counter +#if ARCH_X86 +#define DEBUG_PERF 1 +#else +#undef DEBUG_PERF +#endif + +#ifdef DEBUG_PERF +#include +static uint64_t perfTime = 0; +static int perfCnt = 0; +#endif + +typedef struct FBDetileContext { + const AVClass *class; + int width, height; + int type; +} FBDetileContext; + +#define OFFSET(x) offsetof(FBDetileContext, x) +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM +static const AVOption fbdetile_options[] = { + { "type", "set framebuffer tile|format_modifier conversion type", OFFSET(type), AV_OPT_TYPE_INT, {.i64=TILE_INTELX}, 0, TILE_NONE_END-1, FLAGS, "type" }, + { "None", "Don't detile", 0, AV_OPT_TYPE_CONST, {.i64=TILE_NONE}, INT_MIN, INT_MAX, FLAGS, "type" }, + { "Auto", "Auto detect tile conversion type, NotImplemented", 0, AV_OPT_TYPE_CONST, {.i64=TILE_AUTO}, INT_MIN, INT_MAX, FLAGS, "type" }, + { "intelx", "Intel Tile-X layout", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELX}, INT_MIN, INT_MAX, FLAGS, "type" },
+ { "intely", "Intel Tile-Y layout", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELY}, INT_MIN, INT_MAX, FLAGS, "type" }, + { "intelyf", "Intel Tile-Yf layout", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELYF}, INT_MIN, INT_MAX, FLAGS, "type" }, + { "intelgx", "Intel Tile-X layout, GenericDetile", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELGX}, INT_MIN, INT_MAX, FLAGS, "type" }, + { "intelgy", "Intel Tile-Y layout, GenericDetile", 0, AV_OPT_TYPE_CONST, {.i64=TILE_INTELGY}, INT_MIN, INT_MAX, FLAGS, "type" }, + { NULL } +}; + +AVFILTER_DEFINE_CLASS(fbdetile); + +static av_cold int init(AVFilterContext *ctx) +{ + FBDetileContext *fbdetile = ctx->priv; + + if (fbdetile->type == TILE_NONE) { + av_log(ctx, AV_LOG_INFO, "init: Wont detile, pass through\n"); + } else if (fbdetile->type == TILE_AUTO) { + av_log(ctx, AV_LOG_WARNING, "init: Auto detile mode detect, not supported, pass through\n"); + fbdetile->type = TILE_NONE; + } else if (fbdetile->type == TILE_INTELX) { + av_log(ctx, AV_LOG_INFO, "init: Intel tile-x to linear\n"); + } else if (fbdetile->type == TILE_INTELY) { + av_log(ctx, AV_LOG_INFO, "init: Intel tile-y to linear\n"); + } else if (fbdetile->type == TILE_INTELYF) { + av_log(ctx, AV_LOG_INFO, "init: Intel tile-yf to linear\n"); + } else if (fbdetile->type == TILE_INTELGX) { + av_log(ctx, AV_LOG_INFO, "init: Intel tile-x to linear, using generic detile\n"); + } else if (fbdetile->type == TILE_INTELGY) { + av_log(ctx, AV_LOG_INFO, "init: Intel tile-y to linear, using generic detile\n"); + } else { + av_log(ctx, AV_LOG_ERROR, "init: Unknown Tile format specified, shouldnt reach here\n"); + } + fbdetile->width = 1920; + fbdetile->height = 1080; + return 0; +} + +static int query_formats(AVFilterContext *ctx) +{ + AVFilterFormats *fmts_list; + + fmts_list = ff_make_format_list(fbtilePixFormats); + if (!fmts_list) + return AVERROR(ENOMEM); + return ff_set_common_formats(ctx, fmts_list); +} + +static int config_props(AVFilterLink *inlink) +{ + AVFilterContext *ctx = inlink->dst; + 
FBDetileContext *fbdetile = ctx->priv; + + fbdetile->width = inlink->w; + fbdetile->height = inlink->h; + av_log(ctx, AV_LOG_INFO, "config_props: %d x %d\n", fbdetile->width, fbdetile->height); + + return 0; +} + + +static int filter_frame(AVFilterLink *inlink, AVFrame *in) +{ + AVFilterContext *ctx = inlink->dst; + FBDetileContext *fbdetile = ctx->priv; + AVFilterLink *outlink = ctx->outputs[0]; + AVFrame *out; + + if (fbdetile->type == TILE_NONE) + return ff_filter_frame(outlink, in); + + out = ff_get_video_buffer(outlink, outlink->w, outlink->h); + if (!out) { + av_frame_free(&in); + return AVERROR(ENOMEM); + } + av_frame_copy_props(out, in); + +#ifdef DEBUG_PERF + uint64_t perfStart = __rdtsc(); +#endif + + detile_this(fbdetile->type, 0, fbdetile->width, fbdetile->height, + out->data[0], out->linesize[0], + in->data[0], in->linesize[0], 4); + +#ifdef DEBUG_PERF + uint64_t perfEnd = __rdtsc(); + perfTime += (perfEnd - perfStart); + perfCnt += 1; +#endif + + av_frame_free(&in); + return ff_filter_frame(outlink, out); +} + +static av_cold void uninit(AVFilterContext *ctx) +{ +#ifdef DEBUG_PERF + if (perfCnt == 0) + perfCnt = 1; + av_log(ctx, AV_LOG_INFO, "uninit:perf: AvgTSCCnt %"PRIu64"\n", perfTime/perfCnt); +#endif +} + +static const AVFilterPad fbdetile_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_VIDEO, + .config_props = config_props, + .filter_frame = filter_frame, + }, + { NULL } +}; + +static const AVFilterPad fbdetile_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_VIDEO, + }, + { NULL } +}; + +AVFilter ff_vf_fbdetile = { + .name = "fbdetile", + .description = NULL_IF_CONFIG_SMALL("Detile Framebuffer using CPU"), + .priv_size = sizeof(FBDetileContext), + .init = init, + .uninit = uninit, + .query_formats = query_formats, + .inputs = fbdetile_inputs, + .outputs = fbdetile_outputs, + .priv_class = &fbdetile_class, + .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC, +}; + +// vim: set expandtab sts=4: //
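For readers unfamiliar with the layouts this series targets, below is a minimal, self-contained sketch of what detiling the legacy Intel Tile-X layout amounts to. It is not the fbtile implementation from this series: the function name sketch_detile_tilex and the TX_* constants are hypothetical, chosen only for illustration. It assumes 4 KiB tiles of 512 bytes x 8 rows stored one after another across the surface, a source pitch that is a multiple of 512 bytes, a source buffer padded to whole tiles, and no address swizzling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed Tile-X geometry: each tile is 512 bytes wide and 8 rows high. */
enum {
    TX_TILE_WIDTH_BYTES = 512,
    TX_TILE_HEIGHT      = 8,
    TX_TILE_SIZE        = TX_TILE_WIDTH_BYTES * TX_TILE_HEIGHT
};

/* Hypothetical helper, not part of the patch: copy a Tile-X surface
 * into a linear destination, one tile row segment at a time. */
static void sketch_detile_tilex(int w, int h,
                                uint8_t *dst, int dstLineSize,
                                const uint8_t *src, int srcLineSize,
                                int bytesPerPixel)
{
    int rowBytes = w * bytesPerPixel;
    int tilesPerRow;

    /* The pitch of a tiled surface covers whole tiles. */
    assert(srcLineSize % TX_TILE_WIDTH_BYTES == 0);
    assert(rowBytes <= srcLineSize && rowBytes <= dstLineSize);
    tilesPerRow = srcLineSize / TX_TILE_WIDTH_BYTES;

    for (int y = 0; y < h; y++) {
        int tileRow   = y / TX_TILE_HEIGHT;   /* which row of tiles       */
        int rowInTile = y % TX_TILE_HEIGHT;   /* which row inside a tile  */

        for (int xb = 0; xb < rowBytes; xb += TX_TILE_WIDTH_BYTES) {
            int tileCol = xb / TX_TILE_WIDTH_BYTES;
            int remain  = rowBytes - xb;
            int n       = remain < TX_TILE_WIDTH_BYTES ? remain
                                                       : TX_TILE_WIDTH_BYTES;
            /* Start of the tile, plus the offset of the row inside it. */
            size_t srcOff = ((size_t)tileRow * tilesPerRow + tileCol) * TX_TILE_SIZE
                          + (size_t)rowInTile * TX_TILE_WIDTH_BYTES;

            memcpy(dst + (size_t)y * dstLineSize + xb, src + srcOff, n);
        }
    }
}
```

Because a Tile-X tile row is a contiguous 512 byte run, each segment can be moved with a single memcpy. Layouts with deeper subtiling, such as Tile-Y and Tile-Yf, need a finer-grained walk, which is what the dirChange tables in fbtile.h describe.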