Message ID | 20220630210434.1551769-2-vigneshv@google.com |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel,1/2] avformat/mov: Rework the AVIF parser to handle multiple items | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
andriy/make_armv7_RPi4 | success | Make finished |
andriy/make_fate_armv7_RPi4 | success | Make fate finished |
Quoting Vignesh Venkatasubramanian (2022-06-30 23:04:34) > Parse the alpha channel for still AVIF images and expose it as a > separate track. This is the simplest way of supporting AVIF alpha > channel in a codec independent manner (similar to how ffmpeg > supports animated AVIF with alpha channel). > > One can use the alphamerge filter to get a transparent image with > a single command. For example: > > ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png > > Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> > --- > libavformat/isom.h | 1 + > libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 67 insertions(+) I am against this patch, this is a digusting hack. These are not two streams, it is a single stream and should be exposed as such.
On Sat, Jul 2, 2022 at 2:35 AM Anton Khirnov <anton@khirnov.net> wrote: > > Quoting Vignesh Venkatasubramanian (2022-06-30 23:04:34) > > Parse the alpha channel for still AVIF images and expose it as a > > separate track. This is the simplest way of supporting AVIF alpha > > channel in a codec independent manner (similar to how ffmpeg > > supports animated AVIF with alpha channel). > > > > One can use the alphamerge filter to get a transparent image with > > a single command. For example: > > > > ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png > > > > Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> > > --- > > libavformat/isom.h | 1 + > > libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 67 insertions(+) > > I am against this patch, this is a digusting hack. > > These are not two streams, it is a single stream and should be exposed > as such. > Yes, while it is a hack, it is also the simplest way of supporting AVIF images with alpha. Since AVIF alpha images require two separate decoders, i could not think of any other solution that does not involve modifying the underlying av1 codec itself. If there are any alternative solutions where we can expose a single stream that needs two codecs, then i am open to implementing that. Otherwise, this solution improves the current situation where there is no way for the user to extract the alpha channel (albeit in a hacky way). > -- > Anton Khirnov
On Sat, Jul 2, 2022 at 7:32 PM Vignesh Venkatasubramanian <vigneshv-at-google.com@ffmpeg.org> wrote: > > On Sat, Jul 2, 2022 at 2:35 AM Anton Khirnov <anton@khirnov.net> wrote: > > > > Quoting Vignesh Venkatasubramanian (2022-06-30 23:04:34) > > > Parse the alpha channel for still AVIF images and expose it as a > > > separate track. This is the simplest way of supporting AVIF alpha > > > channel in a codec independent manner (similar to how ffmpeg > > > supports animated AVIF with alpha channel). > > > > > > One can use the alphamerge filter to get a transparent image with > > > a single command. For example: > > > > > > ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png > > > > > > Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> > > > --- > > > libavformat/isom.h | 1 + > > > libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 67 insertions(+) > > > > I am against this patch, this is a digusting hack. > > > > These are not two streams, it is a single stream and should be exposed > > as such. > > > > Yes, while it is a hack, it is also the simplest way of supporting > AVIF images with alpha. Since AVIF alpha images require two separate > decoders, i could not think of any other solution that does not > involve modifying the underlying av1 codec itself. > > If there are any alternative solutions where we can expose a single > stream that needs two codecs, then i am open to implementing that. > > Otherwise, this solution improves the current situation where there is > no way for the user to extract the alpha channel (albeit in a hacky > way). I have discussed this on IRC, and as far as I can see we have multiple cases where an additional image has to be decoded by a separate decoder context to get alpha, namely: - vp8/9 (currently handled by AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL) - HEIF-like formats (such as AVIF), which are codec agnostic (you can put JPEG, HEVC, AV1 etc into HEIF) This means the logic (preferably) should not be codec-specific, and would allow to implement at least two separate My proposal - which Anton didn't really like - was to add a side data entry AV_PKT_DATA_AUXILIARY_IMAGE , which could have type (ALPHA, DEPTH etc) as well as the relevant packet(s) - as well as if any of specifications define codec initialization data, that. Then libavcodec/decode or so could initialize a secondary decoder context for alpha images and add the third plane if parameters match what's required. This would also make remux possible, since containers that do not support such auxiliary images would ignore the alpha plane, and those that do would properly pass it through. As for encoding, not fully sure how it should be integrated, if any encoders actually at this moment do proper alpha coding, or do all API clients have to separately encode with one context the primary image, and the alpha with another? That said, the multi-stream implementation is already merged on the muxer side, so that can be utilized in the mean time, even though the auxiliary images are not technically a separate stream within HEIF-likes. Jan
On Sat, Jul 2, 2022 at 12:35 PM Jan Ekström <jeebjp@gmail.com> wrote: > > On Sat, Jul 2, 2022 at 7:32 PM Vignesh Venkatasubramanian > <vigneshv-at-google.com@ffmpeg.org> wrote: > > > > On Sat, Jul 2, 2022 at 2:35 AM Anton Khirnov <anton@khirnov.net> wrote: > > > > > > Quoting Vignesh Venkatasubramanian (2022-06-30 23:04:34) > > > > Parse the alpha channel for still AVIF images and expose it as a > > > > separate track. This is the simplest way of supporting AVIF alpha > > > > channel in a codec independent manner (similar to how ffmpeg > > > > supports animated AVIF with alpha channel). > > > > > > > > One can use the alphamerge filter to get a transparent image with > > > > a single command. For example: > > > > > > > > ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png > > > > > > > > Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> > > > > --- > > > > libavformat/isom.h | 1 + > > > > libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > 2 files changed, 67 insertions(+) > > > > > > I am against this patch, this is a digusting hack. > > > > > > These are not two streams, it is a single stream and should be exposed > > > as such. > > > > > > > Yes, while it is a hack, it is also the simplest way of supporting > > AVIF images with alpha. Since AVIF alpha images require two separate > > decoders, i could not think of any other solution that does not > > involve modifying the underlying av1 codec itself. > > > > If there are any alternative solutions where we can expose a single > > stream that needs two codecs, then i am open to implementing that. > > > > Otherwise, this solution improves the current situation where there is > > no way for the user to extract the alpha channel (albeit in a hacky > > way). > > I have discussed this on IRC, Thank you for discussing this! > and as far as I can see we have multiple > cases where an additional image has to be decoded by a separate > decoder context to get alpha, namely: > > - vp8/9 (currently handled by AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL) > - HEIF-like formats (such as AVIF), which are codec agnostic (you can > put JPEG, HEVC, AV1 etc into HEIF) > > This means the logic (preferably) should not be codec-specific, and > would allow to implement at least two separate > > My proposal - which Anton didn't really like - was to add a side data > entry AV_PKT_DATA_AUXILIARY_IMAGE , which could have type (ALPHA, > DEPTH etc) as well as the relevant packet(s) - as well as if any of > specifications define codec initialization data, that. Then > libavcodec/decode or so could initialize a secondary decoder context > for alpha images and add the third plane if parameters match what's > required. > My only concern about this solution is that, while the demuxer can easily put the alpha data in the side data, we will then have to modify the decoder (in case of HEIF, we will have to change each underlying codec like av1, hevc, jpeg, etc) to support handling of this side data. I was just hoping it would be nicer if we can have something simpler without having to change every individual libavcodec implementation. ffmpeg already has a way to cleanly initialize codecs and pass data to them, so it would be nicer if we can leverage that somehow instead of making each libavcodec implementation handle this side data type. For example, the proposed solution is what we do for the VP8 alpha demuxing and there is already some fragmentation. The libvpx based vp8 decoder supports alpha channel decoding while the native ffmpeg vp8 decoder does not support it. So, a user has to explicitly pass -c:v libvpx-vp8 to be able to decode files with alpha channel properly. > This would also make remux possible, since containers that do not > support such auxiliary images would ignore the alpha plane, and those > that do would properly pass it through. > The remux is possible in the multi-stream approach/hack as well. But I do see your point. > As for encoding, not fully sure how it should be integrated, if any > encoders actually at this moment do proper alpha coding, or do all API > clients have to separately encode with one context the primary image, > and the alpha with another? I am not sure about other codecs, but in the case of AVIF/AV1, the encoder does not understand/support alpha channels. The only way to do it is to use two separate encoders. > That said, the multi-stream implementation > is already merged on the muxer side, so that can be utilized in the > mean time, even though the auxiliary images are not technically a > separate stream within HEIF-likes. > > Jan > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On Sun, Jul 3, 2022 at 12:15 AM Vignesh Venkatasubramanian <vigneshv-at-google.com@ffmpeg.org> wrote: > > On Sat, Jul 2, 2022 at 12:35 PM Jan Ekström <jeebjp@gmail.com> wrote: > > > > On Sat, Jul 2, 2022 at 7:32 PM Vignesh Venkatasubramanian > > <vigneshv-at-google.com@ffmpeg.org> wrote: > > > > > > On Sat, Jul 2, 2022 at 2:35 AM Anton Khirnov <anton@khirnov.net> wrote: > > > > > > > > Quoting Vignesh Venkatasubramanian (2022-06-30 23:04:34) > > > > > Parse the alpha channel for still AVIF images and expose it as a > > > > > separate track. This is the simplest way of supporting AVIF alpha > > > > > channel in a codec independent manner (similar to how ffmpeg > > > > > supports animated AVIF with alpha channel). > > > > > > > > > > One can use the alphamerge filter to get a transparent image with > > > > > a single command. For example: > > > > > > > > > > ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png > > > > > > > > > > Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> > > > > > --- > > > > > libavformat/isom.h | 1 + > > > > > libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > > 2 files changed, 67 insertions(+) > > > > > > > > I am against this patch, this is a digusting hack. > > > > > > > > These are not two streams, it is a single stream and should be exposed > > > > as such. > > > > > > > > > > Yes, while it is a hack, it is also the simplest way of supporting > > > AVIF images with alpha. Since AVIF alpha images require two separate > > > decoders, i could not think of any other solution that does not > > > involve modifying the underlying av1 codec itself. > > > > > > If there are any alternative solutions where we can expose a single > > > stream that needs two codecs, then i am open to implementing that. > > > > > > Otherwise, this solution improves the current situation where there is > > > no way for the user to extract the alpha channel (albeit in a hacky > > > way). > > > > I have discussed this on IRC, > > Thank you for discussing this! > > > and as far as I can see we have multiple > > cases where an additional image has to be decoded by a separate > > decoder context to get alpha, namely: > > > > - vp8/9 (currently handled by AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL) > > - HEIF-like formats (such as AVIF), which are codec agnostic (you can > > put JPEG, HEVC, AV1 etc into HEIF) > > > > This means the logic (preferably) should not be codec-specific, and > > would allow to implement at least two separate > > > > My proposal - which Anton didn't really like - was to add a side data > > entry AV_PKT_DATA_AUXILIARY_IMAGE , which could have type (ALPHA, > > DEPTH etc) as well as the relevant packet(s) - as well as if any of > > specifications define codec initialization data, that. Then > > libavcodec/decode or so could initialize a secondary decoder context > > for alpha images and add the third plane if parameters match what's > > required. > > > > My only concern about this solution is that, while the demuxer can > easily put the alpha data in the side data, we will then have to > modify the decoder (in case of HEIF, we will have to change each > underlying codec like av1, hevc, jpeg, etc) to support handling of > this side data. I was just hoping it would be nicer if we can have > something simpler without having to change every individual libavcodec > implementation. ffmpeg already has a way to cleanly initialize codecs > and pass data to them, so it would be nicer if we can leverage that > somehow instead of making each libavcodec implementation handle this > side data type. Yes This is why I specifically noted that general decode logic should be modified, not separate decoders. Just like the recent ICC profile <-> color space information patch set by Niklas. It is clearly a thing that is going to get utilized by multiple things, so having it specific to specific codecs makes no sense :) . Jan
On Sun, Jul 3, 2022 at 5:18 AM Jan Ekström <jeebjp@gmail.com> wrote: > > On Sun, Jul 3, 2022 at 12:15 AM Vignesh Venkatasubramanian > <vigneshv-at-google.com@ffmpeg.org> wrote: > > > > On Sat, Jul 2, 2022 at 12:35 PM Jan Ekström <jeebjp@gmail.com> wrote: > > > > > > On Sat, Jul 2, 2022 at 7:32 PM Vignesh Venkatasubramanian > > > <vigneshv-at-google.com@ffmpeg.org> wrote: > > > > > > > > On Sat, Jul 2, 2022 at 2:35 AM Anton Khirnov <anton@khirnov.net> wrote: > > > > > > > > > > Quoting Vignesh Venkatasubramanian (2022-06-30 23:04:34) > > > > > > Parse the alpha channel for still AVIF images and expose it as a > > > > > > separate track. This is the simplest way of supporting AVIF alpha > > > > > > channel in a codec independent manner (similar to how ffmpeg > > > > > > supports animated AVIF with alpha channel). > > > > > > > > > > > > One can use the alphamerge filter to get a transparent image with > > > > > > a single command. For example: > > > > > > > > > > > > ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png > > > > > > > > > > > > Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> > > > > > > --- > > > > > > libavformat/isom.h | 1 + > > > > > > libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ > > > > > > 2 files changed, 67 insertions(+) > > > > > > > > > > I am against this patch, this is a digusting hack. > > > > > > > > > > These are not two streams, it is a single stream and should be exposed > > > > > as such. > > > > > > > > > > > > > Yes, while it is a hack, it is also the simplest way of supporting > > > > AVIF images with alpha. Since AVIF alpha images require two separate > > > > decoders, i could not think of any other solution that does not > > > > involve modifying the underlying av1 codec itself. > > > > > > > > If there are any alternative solutions where we can expose a single > > > > stream that needs two codecs, then i am open to implementing that. > > > > > > > > Otherwise, this solution improves the current situation where there is > > > > no way for the user to extract the alpha channel (albeit in a hacky > > > > way). > > > > > > I have discussed this on IRC, > > > > Thank you for discussing this! > > > > > and as far as I can see we have multiple > > > cases where an additional image has to be decoded by a separate > > > decoder context to get alpha, namely: > > > > > > - vp8/9 (currently handled by AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL) > > > - HEIF-like formats (such as AVIF), which are codec agnostic (you can > > > put JPEG, HEVC, AV1 etc into HEIF) > > > > > > This means the logic (preferably) should not be codec-specific, and > > > would allow to implement at least two separate > > > > > > My proposal - which Anton didn't really like - was to add a side data > > > entry AV_PKT_DATA_AUXILIARY_IMAGE , which could have type (ALPHA, > > > DEPTH etc) as well as the relevant packet(s) - as well as if any of > > > specifications define codec initialization data, that. Then > > > libavcodec/decode or so could initialize a secondary decoder context > > > for alpha images and add the third plane if parameters match what's > > > required. > > > > > > > My only concern about this solution is that, while the demuxer can > > easily put the alpha data in the side data, we will then have to > > modify the decoder (in case of HEIF, we will have to change each > > underlying codec like av1, hevc, jpeg, etc) to support handling of > > this side data. I was just hoping it would be nicer if we can have > > something simpler without having to change every individual libavcodec > > implementation. ffmpeg already has a way to cleanly initialize codecs > > and pass data to them, so it would be nicer if we can leverage that > > somehow instead of making each libavcodec implementation handle this > > side data type. > > Yes This is why I specifically noted that general decode logic should > be modified, not separate decoders. Just like the recent ICC profile > <-> color space information patch set by Niklas. > Ah okay, i see what you are saying now. This makes sense. I am also in favor of this solution then. We can abandon this second patch for now and i will see if i can implement the side data based solution. The first patch in this series is still a refactor that will be useful for the future. So i will leave that open for review. > It is clearly a thing that is going to get utilized by multiple > things, so having it specific to specific codecs makes no sense :) . Yes, sounds good! > > Jan > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Quoting Vignesh Venkatasubramanian (2022-07-02 23:15:35) > > As for encoding, not fully sure how it should be integrated, if any > > encoders actually at this moment do proper alpha coding, or do all API > > clients have to separately encode with one context the primary image, > > and the alpha with another? > > I am not sure about other codecs, but in the case of AVIF/AV1, the > encoder does not understand/support alpha channels. The only way to do > it is to use two separate encoders. Can this not be handled in our encoder wrapper? We should strive as much as possible to shield our callers from codec-specific implementation details.
On Tue, Jul 5, 2022 at 9:54 AM Anton Khirnov <anton@khirnov.net> wrote: > > Quoting Vignesh Venkatasubramanian (2022-07-02 23:15:35) > > > As for encoding, not fully sure how it should be integrated, if any > > > encoders actually at this moment do proper alpha coding, or do all API > > > clients have to separately encode with one context the primary image, > > > and the alpha with another? > > > > I am not sure about other codecs, but in the case of AVIF/AV1, the > > encoder does not understand/support alpha channels. The only way to do > > it is to use two separate encoders. > > Can this not be handled in our encoder wrapper? > By encoder wrapper, you mean codec wrappers in libavcodec like libaomenc.c right ? Yes, it is possible to do this in the encoder wrapper by keeping multiple instances of the encoder. But each encoder wrapper has to be updated to support it. AV1, for example, has three different encoders that can be used as of today (aom, rav1e and svt-av1). > We should strive as much as possible to shield our callers from > codec-specific implementation details. > > -- > Anton Khirnov > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff --git a/libavformat/isom.h b/libavformat/isom.h index d8b262e915..62b95b40ff 100644 --- a/libavformat/isom.h +++ b/libavformat/isom.h @@ -318,6 +318,7 @@ typedef struct MOVContext { uint32_t max_stts_delta; int is_still_picture_avif; int primary_item_id; + int alpha_item_id; int *avif_item_ids; int avif_item_ids_size; int *avif_extent_lengths; diff --git a/libavformat/mov.c b/libavformat/mov.c index 9df5055d4e..72b17b618d 100644 --- a/libavformat/mov.c +++ b/libavformat/mov.c @@ -4758,6 +4758,7 @@ static int mov_read_meta(MOVContext *c, AVIOContext *pb, MOVAtom atom) int ret; avio_seek(pb, -8, SEEK_CUR); atom.size += 8; + c->alpha_item_id = -1; ret = mov_read_default(c, pb, atom); if (ret < 0) return ret; @@ -4767,6 +4768,12 @@ static int mov_read_meta(MOVContext *c, AVIOContext *pb, MOVAtom atom) ret = avif_add_stream(c, c->primary_item_id); if (ret) return ret; + if (c->alpha_item_id != -1) { + // Add a stream for the Alpha plane. + ret = avif_add_stream(c, c->alpha_item_id); + if (ret) + return ret; + } // For still AVIF images, the meta box contains all the // necessary information that would generally be provided by the // moov box. So simply mark that we have found the moov box so @@ -7556,6 +7563,64 @@ static int mov_read_pitm(MOVContext *c, AVIOContext *pb, MOVAtom atom) return atom.size; } +static int mov_read_iprp(MOVContext *c, AVIOContext *pb, MOVAtom atom) +{ + int entry_count, size, version, flags; + int index = 0, auxC_alpha_index = -1; + size = avio_rb32(pb); + if (avio_rl32(pb) != MKTAG('i','p','c','o')) + return AVERROR_INVALIDDATA; + size -= 8; + while (size > 0) { + int sub_size, sub_type; + sub_size = avio_rb32(pb); + sub_type = avio_rl32(pb); + sub_size -= 8; + size -= sub_size + 8; + index++; + if (sub_type == MKTAG('a','u','x','C')) { + const char *expected_alpha_urn = "urn:mpeg:mpegB:cicp:systems:auxiliary:alpha"; + avio_rb32(pb); // version & flags. + sub_size -= 4; + if (sub_size >= strlen(expected_alpha_urn) + 1) { + char alpha_urn[44]; + avio_read(pb, alpha_urn, 44); + sub_size -= 44; + if (!strncmp(alpha_urn, expected_alpha_urn, 44)) { + auxC_alpha_index = index; + } + } + } + avio_skip(pb, sub_size); + } + if (auxC_alpha_index == -1) + return atom.size; + + // ipma. + size = avio_rb32(pb); + if (avio_rl32(pb) != MKTAG('i','p','m','a')) + return AVERROR_INVALIDDATA; + version = avio_r8(pb); + flags = avio_rb24(pb); + entry_count = avio_rb32(pb); + for (int i = 0; i < entry_count; i++) { + int item_id, association_count; + item_id = (version < 1) ? avio_rb16(pb) : avio_rb32(pb); + association_count = avio_r8(pb); + for (int j = 0; j < association_count; j++) { + int property_index; + if (flags & 1) + property_index = avio_rb16(pb) & 0x7fff; + else + property_index = avio_r8(pb) & 0x7f; + if (property_index == auxC_alpha_index) { + c->alpha_item_id = item_id; + } + } + } + return atom.size; +} + static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom) { int version, offset_size, length_size, base_offset_size, index_size; @@ -7732,6 +7797,7 @@ static const MOVParseTableEntry mov_default_parse_table[] = { { MKTAG('i','l','o','c'), mov_read_iloc }, { MKTAG('p','c','m','C'), mov_read_pcmc }, /* PCM configuration box */ { MKTAG('p','i','t','m'), mov_read_pitm }, +{ MKTAG('i','p','r','p'), mov_read_iprp }, { 0, NULL } };
Parse the alpha channel for still AVIF images and expose it as a separate track. This is the simplest way of supporting AVIF alpha channel in a codec independent manner (similar to how ffmpeg supports animated AVIF with alpha channel). One can use the alphamerge filter to get a transparent image with a single command. For example: ffmpeg -i image_with_alpha.avif -filter_complex alphamerge image_with_alpha.png Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com> --- libavformat/isom.h | 1 + libavformat/mov.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+)