From patchwork Wed May 18 21:56:20 2022
X-Patchwork-Submitter: Vignesh Venkat
X-Patchwork-Id: 35826
Date: Wed, 18 May 2022 14:56:20 -0700
Message-Id: <20220518215620.1718203-1-vigneshv@google.com>
From: Vignesh Venkatasubramanian
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH] avformat/movenc: Support alpha channel for AVIF

The AVIF specification allows an alpha channel to be stored as an
auxiliary item (for still images) or as an auxiliary track (for
animated images). Add support for both.

When alpha is present, the AVIF muxer takes exactly two streams as
input: the first carries the YUV planes and the second carries the
alpha plane.
The input can come from two different images (one carrying the color
planes and the other the alpha plane), or from a single source file
with the alpha channel extracted using the "alphaextract" filter.

Example using alphaextract:
ffmpeg -i rgba.png -filter_complex "[0:v]alphaextract[a]" -map 0 -map "[a]" -still-picture 1 avif_with_alpha.avif

Example using two sources (the first source can be in any pixel format;
the second source has to be in a monochrome grey pixel format):
ffmpeg -i color.avif -i grey.avif -map 0 -map 1 -c copy avif_with_alpha.avif

The generated files pass the compliance checks in Compliance Warden:
https://github.com/gpac/ComplianceWarden

libavif (the reference AVIF library) is able to decode the files
generated using this patch.

They also play back properly (with a transparent background) in:
1) Chrome
2) Firefox (only still AVIF, no animation support)

Signed-off-by: Vignesh Venkatasubramanian
---
 libavformat/movenc.c | 185 +++++++++++++++++++++++++++++--------------
 libavformat/movenc.h |   4 +-
 2 files changed, 128 insertions(+), 61 deletions(-)

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index de971f94e8..00e42b7abb 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -2852,7 +2852,7 @@ static int mov_write_hdlr_tag(AVFormatContext *s, AVIOContext *pb, MOVTrack *tra
         hdlr = (track->mode == MODE_MOV) ? "mhlr" : "\0\0\0\0";
         if (track->par->codec_type == AVMEDIA_TYPE_VIDEO) {
             if (track->mode == MODE_AVIF) {
-                hdlr_type = "pict";
+                hdlr_type = (track == &mov->tracks[0]) ? "pict" : "auxv";
                 descr     = "PictureHandler";
             } else {
                 hdlr_type = "vide";
@@ -2940,57 +2940,83 @@ static int mov_write_iloc_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatConte
     avio_wb32(pb, 0); /* Version & flags */
     avio_w8(pb, (4 << 4) + 4); /* offset_size(4) and length_size(4) */
     avio_w8(pb, 0); /* base_offset_size(4) and reserved(4) */
-    avio_wb16(pb, 1); /* item_count */
+    avio_wb16(pb, s->nb_streams); /* item_count */
 
-    avio_wb16(pb, 1); /* item_id */
-    avio_wb16(pb, 0); /* data_reference_index */
-    avio_wb16(pb, 1); /* extent_count */
-    mov->avif_extent_pos = avio_tell(pb);
-    avio_wb32(pb, 0); /* extent_offset (written later) */
-    // For animated AVIF, we simply write the first packet's size.
-    avio_wb32(pb, mov->avif_extent_length); /* extent_length */
+    for (int i = 0; i < s->nb_streams; i++) {
+        avio_wb16(pb, i + 1); /* item_id */
+        avio_wb16(pb, 0); /* data_reference_index */
+        avio_wb16(pb, 1); /* extent_count */
+        mov->avif_extent_pos[i] = avio_tell(pb);
+        avio_wb32(pb, 0); /* extent_offset (written later) */
+        // For animated AVIF, we simply write the first packet's size.
+        avio_wb32(pb, mov->avif_extent_length[i]); /* extent_length */
+    }
 
     return update_size(pb, pos);
 }
 
 static int mov_write_iinf_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s)
 {
-    int64_t infe_pos;
     int64_t iinf_pos = avio_tell(pb);
     avio_wb32(pb, 0); /* size */
     ffio_wfourcc(pb, "iinf");
     avio_wb32(pb, 0); /* Version & flags */
-    avio_wb16(pb, 1); /* entry_count */
+    avio_wb16(pb, s->nb_streams); /* entry_count */
 
-    infe_pos = avio_tell(pb);
-    avio_wb32(pb, 0); /* size */
-    ffio_wfourcc(pb, "infe");
-    avio_w8(pb, 0x2); /* Version */
-    avio_wb24(pb, 0); /* flags */
-    avio_wb16(pb, 1); /* item_id */
-    avio_wb16(pb, 0); /* item_protection_index */
-    avio_write(pb, "av01", 4); /* item_type */
-    avio_write(pb, "Color\0", 6); /* item_name */
-    update_size(pb, infe_pos);
+    for (int i = 0; i < s->nb_streams; i++) {
+        int64_t infe_pos = avio_tell(pb);
+        avio_wb32(pb, 0); /* size */
+        ffio_wfourcc(pb, "infe");
+        avio_w8(pb, 0x2); /* Version */
+        avio_wb24(pb, 0); /* flags */
+        avio_wb16(pb, i + 1); /* item_id */
+        avio_wb16(pb, 0); /* item_protection_index */
+        avio_write(pb, "av01", 4); /* item_type */
+        avio_write(pb, !i ? "Color\0" : "Alpha\0", 6); /* item_name */
+        update_size(pb, infe_pos);
+    }
 
     return update_size(pb, iinf_pos);
 }
 
-static int mov_write_ispe_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s)
+
+static int mov_write_iref_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s)
+{
+    int64_t auxl_pos;
+    int64_t iref_pos = avio_tell(pb);
+    avio_wb32(pb, 0); /* size */
+    ffio_wfourcc(pb, "iref");
+    avio_wb32(pb, 0); /* Version & flags */
+
+    auxl_pos = avio_tell(pb);
+    avio_wb32(pb, 0); /* size */
+    ffio_wfourcc(pb, "auxl");
+    avio_wb16(pb, 2); /* from_item_ID */
+    avio_wb16(pb, 1); /* reference_count */
+    avio_wb16(pb, 1); /* to_item_ID */
+    update_size(pb, auxl_pos);
+
+    return update_size(pb, iref_pos);
+}
+
+static int mov_write_ispe_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s,
+                              int stream_index)
 {
     int64_t pos = avio_tell(pb);
     avio_wb32(pb, 0); /* size */
     ffio_wfourcc(pb, "ispe");
     avio_wb32(pb, 0); /* Version & flags */
-    avio_wb32(pb, s->streams[0]->codecpar->width); /* image_width */
-    avio_wb32(pb, s->streams[0]->codecpar->height); /* image_height */
+    avio_wb32(pb, s->streams[stream_index]->codecpar->width); /* image_width */
+    avio_wb32(pb, s->streams[stream_index]->codecpar->height); /* image_height */
     return update_size(pb, pos);
 }
 
-static int mov_write_pixi_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s)
+static int mov_write_pixi_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s,
+                              int stream_index)
 {
     int64_t pos = avio_tell(pb);
-    const AVPixFmtDescriptor *pixdesc = av_pix_fmt_desc_get(s->streams[0]->codecpar->format);
+    const AVPixFmtDescriptor *pixdesc =
+        av_pix_fmt_desc_get(s->streams[stream_index]->codecpar->format);
     avio_wb32(pb, 0); /* size */
     ffio_wfourcc(pb, "pixi");
     avio_wb32(pb, 0); /* Version & flags */
@@ -3001,15 +3027,30 @@ static int mov_write_pixi_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatConte
     return update_size(pb, pos);
 }
 
+static int mov_write_auxC_tag(AVIOContext *pb)
+{
+    int64_t pos = avio_tell(pb);
+    avio_wb32(pb, 0); /* size */
+    ffio_wfourcc(pb, "auxC");
+    avio_wb32(pb, 0); /* Version & flags */
+    avio_write(pb, "urn:mpeg:mpegB:cicp:systems:auxiliary:alpha\0", 44);
+    return update_size(pb, pos);
+}
+
 static int mov_write_ipco_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatContext *s)
 {
     int64_t pos = avio_tell(pb);
     avio_wb32(pb, 0); /* size */
     ffio_wfourcc(pb, "ipco");
-    mov_write_ispe_tag(pb, mov, s);
-    mov_write_pixi_tag(pb, mov, s);
-    mov_write_av1c_tag(pb, &mov->tracks[0]);
-    mov_write_colr_tag(pb, &mov->tracks[0], 0);
+    for (int i = 0; i < s->nb_streams; i++) {
+        mov_write_ispe_tag(pb, mov, s, i);
+        mov_write_pixi_tag(pb, mov, s, i);
+        mov_write_av1c_tag(pb, &mov->tracks[i]);
+        if (!i)
+            mov_write_colr_tag(pb, &mov->tracks[0], 0);
+        else
+            mov_write_auxC_tag(pb);
+    }
     return update_size(pb, pos);
 }
 
@@ -3019,18 +3060,21 @@ static int mov_write_ipma_tag(AVIOContext *pb, MOVMuxContext *mov, AVFormatConte
     avio_wb32(pb, 0); /* size */
     ffio_wfourcc(pb, "ipma");
     avio_wb32(pb, 0); /* Version & flags */
-    avio_wb32(pb, 1); /* entry_count */
-    avio_wb16(pb, 1); /* item_ID */
-    avio_w8(pb, 4); /* association_count */
-
-    // ispe association.
-    avio_w8(pb, 1); /* essential and property_index */
-    // pixi association.
-    avio_w8(pb, 2); /* essential and property_index */
-    // av1C association.
-    avio_w8(pb, 0x80 | 3); /* essential and property_index */
-    // colr association.
-    avio_w8(pb, 4); /* essential and property_index */
+    avio_wb32(pb, s->nb_streams); /* entry_count */
+
+    for (int i = 0, index = 1; i < s->nb_streams; i++) {
+        avio_wb16(pb, i + 1); /* item_ID */
+        avio_w8(pb, 4); /* association_count */
+
+        // ispe association.
+        avio_w8(pb, index++); /* essential and property_index */
+        // pixi association.
+        avio_w8(pb, index++); /* essential and property_index */
+        // av1C association.
+        avio_w8(pb, 0x80 | index++); /* essential and property_index */
+        // colr/auxC association.
+        avio_w8(pb, index++); /* essential and property_index */
+    }
     return update_size(pb, pos);
 }
 
@@ -4112,6 +4156,8 @@ static int mov_write_meta_tag(AVIOContext *pb, MOVMuxContext *mov,
         mov_write_pitm_tag(pb, 1);
         mov_write_iloc_tag(pb, mov, s);
         mov_write_iinf_tag(pb, mov, s);
+        if (s->nb_streams > 1)
+            mov_write_iref_tag(pb, mov, s);
         mov_write_iprp_tag(pb, mov, s);
     } else {
         /* iTunes metadata tag */
@@ -6040,8 +6086,8 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt)
             avio_write(pb, reformatted_data, size);
         } else {
             size = ff_av1_filter_obus(pb, pkt->data, pkt->size);
-            if (trk->mode == MODE_AVIF && !mov->avif_extent_length) {
-                mov->avif_extent_length = size;
+            if (trk->mode == MODE_AVIF && !mov->avif_extent_length[pkt->stream_index]) {
+                mov->avif_extent_length[pkt->stream_index] = size;
             }
         }
 
@@ -6874,14 +6920,23 @@ static int mov_init(AVFormatContext *s)
 
     /* AVIF output must have exactly one video stream */
     if (mov->mode == MODE_AVIF) {
-        if (s->nb_streams > 1) {
-            av_log(s, AV_LOG_ERROR, "AVIF output requires exactly one stream\n");
+        if (s->nb_streams > 2) {
+            av_log(s, AV_LOG_ERROR, "AVIF output requires exactly one or two streams\n");
             return AVERROR(EINVAL);
         }
-        if (s->streams[0]->codecpar->codec_type != AVMEDIA_TYPE_VIDEO) {
-            av_log(s, AV_LOG_ERROR, "AVIF output requires one video stream\n");
+        if (s->streams[0]->codecpar->codec_type != AVMEDIA_TYPE_VIDEO &&
+            (s->nb_streams > 1 && s->streams[1]->codecpar->codec_type != AVMEDIA_TYPE_VIDEO)) {
+            av_log(s, AV_LOG_ERROR, "AVIF output supports only video streams\n");
             return AVERROR(EINVAL);
         }
+        if (s->nb_streams > 1) {
+            const AVPixFmtDescriptor *pixdesc =
+                av_pix_fmt_desc_get(s->streams[1]->codecpar->format);
+            if (pixdesc->nb_components != 1) {
+                av_log(s, AV_LOG_ERROR, "Second stream for AVIF (alpha) output must have exactly one plane\n");
+                return AVERROR(EINVAL);
+            }
+        }
         s->streams[0]->disposition |= AV_DISPOSITION_DEFAULT;
     }
 
@@ -7543,18 +7598,25 @@ static int avif_write_trailer(AVFormatContext *s)
 {
     AVIOContext *pb = s->pb;
     MOVMuxContext *mov = s->priv_data;
-    int64_t pos_backup, mdat_pos;
+    int64_t pos_backup, extent_offsets[2];
     uint8_t *buf;
-    int buf_size, moov_size;
+    int buf_size, moov_size, i;
 
     if (mov->moov_written) return 0;
 
     mov->is_animated_avif = s->streams[0]->nb_frames > 1;
+    if (mov->is_animated_avif && s->nb_streams > 1) {
+        // For animated avif with alpha channel, we need to write the tref
+        // tag with type "auxl".
+        mov->tracks[1].tref_tag = MKTAG('a', 'u', 'x', 'l');
+        mov->tracks[1].tref_id = 1;
+    }
     mov_write_identification(pb, s);
     mov_write_meta_tag(pb, mov, s);
 
     moov_size = get_moov_size(s);
-    mov->tracks[0].data_offset = avio_tell(pb) + moov_size + 8;
+    for (i = 0; i < s->nb_streams; i++)
+        mov->tracks[i].data_offset = avio_tell(pb) + moov_size + 8;
 
     if (mov->is_animated_avif) {
         int ret;
@@ -7565,19 +7627,24 @@ static int avif_write_trailer(AVFormatContext *s)
     buf_size = avio_get_dyn_buf(mov->mdat_buf, &buf);
     avio_wb32(pb, buf_size + 8);
     ffio_wfourcc(pb, "mdat");
-    mdat_pos = avio_tell(pb);
 
-    if (mdat_pos != (uint32_t)mdat_pos) {
-        av_log(s, AV_LOG_ERROR, "mdat offset does not fit in 32 bits\n");
-        return AVERROR_INVALIDDATA;
-    }
+    // The offset for the YUV planes is the starting position of mdat.
+    extent_offsets[0] = avio_tell(pb);
+    // The offset for alpha plane is YUV offset + YUV size.
+    extent_offsets[1] = extent_offsets[0] + mov->avif_extent_length[0];
 
     avio_write(pb, buf, buf_size);
 
-    // write extent offset.
+    // write extent offsets.
     pos_backup = avio_tell(pb);
-    avio_seek(pb, mov->avif_extent_pos, SEEK_SET);
-    avio_wb32(pb, mdat_pos); /* rewrite offset */
+    for (i = 0; i < s->nb_streams; i++) {
+        if (extent_offsets[i] != (uint32_t)extent_offsets[i]) {
+            av_log(s, AV_LOG_ERROR, "extent offset does not fit in 32 bits\n");
+            return AVERROR_INVALIDDATA;
+        }
+        avio_seek(pb, mov->avif_extent_pos[i], SEEK_SET);
+        avio_wb32(pb, extent_offsets[i]); /* rewrite offset */
+    }
     avio_seek(pb, pos_backup, SEEK_SET);
 
     return 0;
diff --git a/libavformat/movenc.h b/libavformat/movenc.h
index 281576cc66..e4550f7900 100644
--- a/libavformat/movenc.h
+++ b/libavformat/movenc.h
@@ -246,8 +246,8 @@ typedef struct MOVMuxContext {
     int empty_hdlr_name;
     int movie_timescale;
 
-    int64_t avif_extent_pos;
-    int avif_extent_length;
+    int64_t avif_extent_pos[2]; // index 0 is YUV and 1 is Alpha.
+    int avif_extent_length[2]; // index 0 is YUV and 1 is Alpha.
     int is_animated_avif;
 } MOVMuxContext;
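
Note on the index and offset arithmetic (illustration only, not part of
the patch): the property indices written by mov_write_ipma_tag and the
extent offsets rewritten in avif_write_trailer can be checked in
isolation. The sketch below is a minimal standalone version of that
arithmetic. The helper names avif_ipma_assoc_byte and
avif_extent_offsets are hypothetical and do not exist in libavformat.
It assumes, as the patch does, that each item owns four consecutive
1-based properties in ipco (ispe, pixi, av1C, then colr or auxC) and
that the alpha extent starts right after the first color packet in mdat.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in libavformat): association byte written to
 * ipma for property slot `slot` (0 = ispe, 1 = pixi, 2 = av1C,
 * 3 = colr or auxC) of the zero-based item `item`. */
static uint8_t avif_ipma_assoc_byte(int item, int slot)
{
    uint8_t index;

    assert(item >= 0 && item < 2);
    assert(slot >= 0 && slot < 4);

    /* Property indices are 1-based; each item owns four consecutive
     * entries in ipco, which mirrors the running index++ in the patch. */
    index = (uint8_t)(item * 4 + slot + 1);

    /* Only the av1C association carries the "essential" bit (0x80). */
    return slot == 2 ? (uint8_t)(0x80 | index) : index;
}

/* Hypothetical helper (not in libavformat): computes the iloc extent
 * offsets for the color item (offsets[0]) and the alpha item
 * (offsets[1]). `mdat_payload_pos` is the file position right after the
 * mdat header and `color_extent_length` is the size of the first color
 * packet. Only the first `nb_items` offsets are range-checked.
 * Returns 0 on success, -1 if a checked offset does not fit in 32 bits. */
static int avif_extent_offsets(int64_t mdat_payload_pos,
                               int64_t color_extent_length,
                               int nb_items, int64_t offsets[2])
{
    offsets[0] = mdat_payload_pos;
    offsets[1] = mdat_payload_pos + color_extent_length;

    for (int i = 0; i < nb_items && i < 2; i++) {
        if (offsets[i] != (int64_t)(uint32_t)offsets[i])
            return -1;
    }
    return 0;
}
```

With two items, the association bytes come out as 1, 2, 0x83, 4 for the
color item and 5, 6, 0x87, 8 for the alpha item, matching the running
index in mov_write_ipma_tag. The offset helper rejects any checked
offset above 0xFFFFFFFF, since iloc is written with offset_size 4.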