diff mbox series

[FFmpeg-devel] avformat/img2dec: return immediately in jpeg_probe() when EOI is found

Message ID 20200401160041.595923-1-matthieu.bouron@gmail.com
State New
Headers show
Series [FFmpeg-devel] avformat/img2dec: return immediately in jpeg_probe() when EOI is found
Related show

Checks

Context Check Description
andriy/ffmpeg-patchwork pending
andriy/ffmpeg-patchwork warning Failed to apply patch

Commit Message

Matthieu Bouron April 1, 2020, 4 p.m. UTC
Fixes probing of JPEG files containing MPF metadata appended at the end
of the file.

The MPF metadata chunk can contains multiple JPEG images (thumbnails)
which makes the jpeg_probe fails (return 0) because it finds a SOI
marker after EOI.
---

This patch fixes probing of JPEG files containing MPF metadata [1] appended
at the end of the file.

Such files can be produced by GoPro camera (which produces JPEG files
with MPF metadata).

You can find a sample here:
https://0x5c.me/gopro_jpg_mpf_probe_fail

To reproduce the issue using ffmpeg master:
wget https://0x5c.me/gopro_jpg_mpf_probe_fail
./ffmpeg -formatprobesize 5000000 -i gopro_jpg_mpf_probe_fail

I removed intentionally the jpeg extension from the filename so the
image2 demuxer is not used.

Current ffmpeg master won't still be able to decode this file because of a
"regression" introduced by ec3d8a0e6945fe015d16cd98a1e7dbb4be815c15 on
the mjpeg_parser.
The jpeg_pipe demuxer outputs partial chunks of the jpeg file that the
mjpeg decoder can't handle (it needs the whole data). Before
ec3d8a0e6945fe015d16cd98a1e7dbb4be815c15, the mjpeg_parser was
outputting a complete frame.

If the parser is correct as is, another way to fix this issue whould be
to add AVSTREAM_PARSE_HEADERS to the need_parsing flags from the
jpeg_pipe demuxer, ie:

Comments

Carl Eugen Hoyos April 1, 2020, 4:19 p.m. UTC | #1
Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> Fixes probing of JPEG files containing MPF metadata appended at the end
> of the file.
>
> The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> which makes the jpeg_probe fails (return 0) because it finds a SOI
> marker after EOI.
> ---
>
> This patch fixes probing of JPEG files containing MPF metadata [1] appended
> at the end of the file.
>
> Such files can be produced by GoPro camera (which produces JPEG files
> with MPF metadata).
>
> You can find a sample here:
> https://0x5c.me/gopro_jpg_mpf_probe_fail
>
> To reproduce the issue using ffmpeg master:
> wget https://0x5c.me/gopro_jpg_mpf_probe_fail
> ./ffmpeg -formatprobesize 5000000 -i gopro_jpg_mpf_probe_fail
>
> I removed intentionally the jpeg extension from the filename so the
> image2 demuxer is not used.
>
> Current ffmpeg master won't still be able to decode this file because of a
> "regression" introduced by ec3d8a0e6945fe015d16cd98a1e7dbb4be815c15 on
> the mjpeg_parser.
> The jpeg_pipe demuxer outputs partial chunks of the jpeg file that the
> mjpeg decoder can't handle (it needs the whole data). Before
> ec3d8a0e6945fe015d16cd98a1e7dbb4be815c15, the mjpeg_parser was
> outputting a complete frame.
>
> If the parser is correct as is, another way to fix this issue whould be
> to add AVSTREAM_PARSE_HEADERS to the need_parsing flags from the
> jpeg_pipe demuxer, ie:
>
> --- a/libavformat/img2dec.c
> +++ b/libavformat/img2dec.c
> @@ -209,7 +209,7 @@ int ff_img_read_header(AVFormatContext *s1)
>          s->is_pipe = 0;
>      else {
>          s->is_pipe       = 1;
> -        st->need_parsing = AVSTREAM_PARSE_FULL;
> +        st->need_parsing = AVSTREAM_PARSE_FULL | AVSTREAM_PARSE_HEADERS;
>
> Settings AVSTREAM_PARSE_HEADERS makes avformat requests complete frames
> from the parser in libavformat/utils.c
>
> What do you think ?

Why is your sample not detected by the mjpeg demuxer?

I suspect your patch breaks the mjpeg demuxer.

> [1] https://exiftool.org/TagNames/MPF.html
>
> ---
>  libavformat/img2dec.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> index 93cd51c1932..decd8023e02 100644
> --- a/libavformat/img2dec.c
> +++ b/libavformat/img2dec.c
> @@ -773,9 +773,7 @@ static int jpeg_probe(const AVProbeData *p)
>          case EOI:
>              if (state != SOS)
>                  return 0;
> -            state = EOI;
> -            break;
> -        case DHT:
> +            return AVPROBE_SCORE_EXTENSION + 1;

I don't think this is correct.

>          case DQT:
>          case APP0:
>          case APP1:
> @@ -803,8 +801,6 @@ static int jpeg_probe(const AVProbeData *p)
>          }
>      }
>

> -    if (state == EOI)
> -        return AVPROBE_SCORE_EXTENSION + 1;

>      if (state == SOS)
>          return AVPROBE_SCORE_EXTENSION / 2;
>      return AVPROBE_SCORE_EXTENSION / 8;

Carl Eugen
Carl Eugen Hoyos April 1, 2020, 4:29 p.m. UTC | #2
Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> Fixes probing of JPEG files containing MPF metadata appended at the end
> of the file.
>
> The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> which makes the jpeg_probe fails (return 0) because it finds a SOI
> marker after EOI.
> ---
>
> This patch fixes probing of JPEG files containing MPF metadata [1] appended
> at the end of the file.
>
> Such files can be produced by GoPro camera (which produces JPEG files
> with MPF metadata).
>
> You can find a sample here:
> https://0x5c.me/gopro_jpg_mpf_probe_fail

The sample works fine here with FFmpeg 4.2: Is there a regression?

Carl Eugen
Matthieu Bouron April 1, 2020, 4:34 p.m. UTC | #3
On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> <matthieu.bouron@gmail.com>:
> >
> > Fixes probing of JPEG files containing MPF metadata appended at the end
> > of the file.
> >
> > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > marker after EOI.
> > ---
> >
> > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > at the end of the file.
> >
> > Such files can be produced by GoPro camera (which produces JPEG files
> > with MPF metadata).
> >
> > You can find a sample here:
> > https://0x5c.me/gopro_jpg_mpf_probe_fail
> 
> The sample works fine here with FFmpeg 4.2: Is there a regression?

This sample does not work on FFmpeg 4.2 and master if you set a big
enought formatprobesize (which matches the file size of the sample):

ffmpeg -formatprobesize 5000000 -i ~/gopro_jpg_mpf_probe_fail gives:

/home/mateo/gopro_jpg_mpf_probe_fail: Invalid data found when processing
input

> 
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Matthieu Bouron April 1, 2020, 4:54 p.m. UTC | #4
On Wed, Apr 01, 2020 at 06:19:56PM +0200, Carl Eugen Hoyos wrote:
> Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> <matthieu.bouron@gmail.com>:
> >
> > Fixes probing of JPEG files containing MPF metadata appended at the end
> > of the file.
> >
> > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > marker after EOI.
> > ---
> >
> > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > at the end of the file.
> >
> > Such files can be produced by GoPro camera (which produces JPEG files
> > with MPF metadata).
> >
> > You can find a sample here:
> > https://0x5c.me/gopro_jpg_mpf_probe_fail
> >
> > To reproduce the issue using ffmpeg master:
> > wget https://0x5c.me/gopro_jpg_mpf_probe_fail
> > ./ffmpeg -formatprobesize 5000000 -i gopro_jpg_mpf_probe_fail
> >
> > I removed intentionally the jpeg extension from the filename so the
> > image2 demuxer is not used.
> >
> > Current ffmpeg master won't still be able to decode this file because of a
> > "regression" introduced by ec3d8a0e6945fe015d16cd98a1e7dbb4be815c15 on
> > the mjpeg_parser.
> > The jpeg_pipe demuxer outputs partial chunks of the jpeg file that the
> > mjpeg decoder can't handle (it needs the whole data). Before
> > ec3d8a0e6945fe015d16cd98a1e7dbb4be815c15, the mjpeg_parser was
> > outputting a complete frame.
> >
> > If the parser is correct as is, another way to fix this issue whould be
> > to add AVSTREAM_PARSE_HEADERS to the need_parsing flags from the
> > jpeg_pipe demuxer, ie:
> >
> > --- a/libavformat/img2dec.c
> > +++ b/libavformat/img2dec.c
> > @@ -209,7 +209,7 @@ int ff_img_read_header(AVFormatContext *s1)
> >          s->is_pipe = 0;
> >      else {
> >          s->is_pipe       = 1;
> > -        st->need_parsing = AVSTREAM_PARSE_FULL;
> > +        st->need_parsing = AVSTREAM_PARSE_FULL | AVSTREAM_PARSE_HEADERS;
> >
> > Settings AVSTREAM_PARSE_HEADERS makes avformat requests complete frames
> > from the parser in libavformat/utils.c
> >
> > What do you think ?
> 
> Why is your sample not detected by the mjpeg demuxer?

I don't know yet, it looks like nb_invalid is incremented in the default
case of mjpeg_probe(). I will investigate why.

This shouldn't change the fact that the jpeg_probe() should be able to
recognize this kind of files.

> 
> I suspect your patch breaks the mjpeg demuxer.

Why would it break the mjpeg demuxer ?

> 
> > [1] https://exiftool.org/TagNames/MPF.html
> >
> > ---
> >  libavformat/img2dec.c | 6 +-----
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > index 93cd51c1932..decd8023e02 100644
> > --- a/libavformat/img2dec.c
> > +++ b/libavformat/img2dec.c
> > @@ -773,9 +773,7 @@ static int jpeg_probe(const AVProbeData *p)
> >          case EOI:
> >              if (state != SOS)
> >                  return 0;
> > -            state = EOI;
> > -            break;
> > -        case DHT:
> > +            return AVPROBE_SCORE_EXTENSION + 1;
> 
> I don't think this is correct.

Why ? Isn't the data JPEG if we find an EOI marker after having found a
SOS marker ?

> 
> >          case DQT:
> >          case APP0:
> >          case APP1:
> > @@ -803,8 +801,6 @@ static int jpeg_probe(const AVProbeData *p)
> >          }
> >      }
> >
> 
> > -    if (state == EOI)
> > -        return AVPROBE_SCORE_EXTENSION + 1;
> 
> >      if (state == SOS)
> >          return AVPROBE_SCORE_EXTENSION / 2;
> >      return AVPROBE_SCORE_EXTENSION / 8;
> 
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Carl Eugen Hoyos April 1, 2020, 4:55 p.m. UTC | #5
Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > <matthieu.bouron@gmail.com>:
> > >
> > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > of the file.
> > >
> > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > marker after EOI.
> > > ---
> > >
> > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > at the end of the file.
> > >
> > > Such files can be produced by GoPro camera (which produces JPEG files
> > > with MPF metadata).
> > >
> > > You can find a sample here:
> > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> >
> > The sample works fine here with FFmpeg 4.2: Is there a regression?
>
> This sample does not work on FFmpeg 4.2 and master if you set a big
> enought formatprobesize (which matches the file size of the sample):

Inlined is a patch that fixes detection but I wonder if this shouldn't be
detected as mjpeg instead

Carl Eugen

diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
index 40f3e3d499..4e63b3494e 100644
--- a/libavformat/img2dec.c
+++ b/libavformat/img2dec.c
@@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
 static int jpeg_probe(const AVProbeData *p)
 {
     const uint8_t *b = p->buf;
-    int i, state = SOI;
+    int i, state = SOI, MPF = 0;

     if (AV_RB16(b) != 0xFFD8 ||
         AV_RB32(b) == 0xFFD8FFF7)
     return 0;

     b += 2;
-    for (i = 0; i < p->buf_size - 3; i++) {
+    for (i = 0; i < p->buf_size - 8; i++) {
         int c;
         if (b[i] != 0xFF)
             continue;
         c = b[i + 1];
         switch (c) {
         case SOI:
+            if (state == EOI && MPF)
+                return AVPROBE_SCORE_EXTENSION + 1;
             return 0;
         case SOF0:
         case SOF1:
@@ -775,10 +777,12 @@ static int jpeg_probe(const AVProbeData *p)
                 return 0;
             state = EOI;
             break;
+        case APP2:
+            if (AV_RB24(&b[i + 4]) == AV_RB24("MPF"))
+                MPF = 1;
         case DQT:
         case APP0:
         case APP1:
-        case APP2:
         case APP3:
         case APP4:
         case APP5:
Matthieu Bouron April 1, 2020, 5:15 p.m. UTC | #6
On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> <matthieu.bouron@gmail.com>:
> >
> > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > <matthieu.bouron@gmail.com>:
> > > >
> > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > of the file.
> > > >
> > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > marker after EOI.
> > > > ---
> > > >
> > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > at the end of the file.
> > > >
> > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > with MPF metadata).
> > > >
> > > > You can find a sample here:
> > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > >
> > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> >
> > This sample does not work on FFmpeg 4.2 and master if you set a big
> > enought formatprobesize (which matches the file size of the sample):
> 
> Inlined is a patch that fixes detection but I wonder if this shouldn't be
> detected as mjpeg instead
> 
> Carl Eugen
> 
> diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> index 40f3e3d499..4e63b3494e 100644
> --- a/libavformat/img2dec.c
> +++ b/libavformat/img2dec.c
> @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
>  static int jpeg_probe(const AVProbeData *p)
>  {
>      const uint8_t *b = p->buf;
> -    int i, state = SOI;
> +    int i, state = SOI, MPF = 0;
> 
>      if (AV_RB16(b) != 0xFFD8 ||
>          AV_RB32(b) == 0xFFD8FFF7)
>      return 0;
> 
>      b += 2;
> -    for (i = 0; i < p->buf_size - 3; i++) {
> +    for (i = 0; i < p->buf_size - 8; i++) {
>          int c;
>          if (b[i] != 0xFF)
>              continue;
>          c = b[i + 1];
>          switch (c) {
>          case SOI:
> +            if (state == EOI && MPF)
> +                return AVPROBE_SCORE_EXTENSION + 1;

Your patch might work but the code is trying to find JPEG markers in the
MPF metadata until we reach the first EOI in the MPF JPEG thumbnails, even
if we already found the SOS and EOI markes of the main JPEG chunk. This
logic does not seem right to me.

Is there a specific reason why we couldn't end the probe when we reach the
EOI marker in the SOS state (ie: what my patch is doing) ?

>              return 0;
>          case SOF0:
>          case SOF1:
> @@ -775,10 +777,12 @@ static int jpeg_probe(const AVProbeData *p)
>                  return 0;
>              state = EOI;
>              break;
> +        case APP2:
> +            if (AV_RB24(&b[i + 4]) == AV_RB24("MPF"))
> +                MPF = 1;
>          case DQT:
>          case APP0:
>          case APP1:
> -        case APP2:
>          case APP3:
>          case APP4:
>          case APP5:
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Carl Eugen Hoyos April 1, 2020, 5:30 p.m. UTC | #7
Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > <matthieu.bouron@gmail.com>:
> > >
> > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > <matthieu.bouron@gmail.com>:
> > > > >
> > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > of the file.
> > > > >
> > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > marker after EOI.
> > > > > ---
> > > > >
> > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > at the end of the file.
> > > > >
> > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > with MPF metadata).
> > > > >
> > > > > You can find a sample here:
> > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > >
> > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > >
> > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > enought formatprobesize (which matches the file size of the sample):
> >
> > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > detected as mjpeg instead
> >
> > Carl Eugen
> >
> > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > index 40f3e3d499..4e63b3494e 100644
> > --- a/libavformat/img2dec.c
> > +++ b/libavformat/img2dec.c
> > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> >  static int jpeg_probe(const AVProbeData *p)
> >  {
> >      const uint8_t *b = p->buf;
> > -    int i, state = SOI;
> > +    int i, state = SOI, MPF = 0;
> >
> >      if (AV_RB16(b) != 0xFFD8 ||
> >          AV_RB32(b) == 0xFFD8FFF7)
> >      return 0;
> >
> >      b += 2;
> > -    for (i = 0; i < p->buf_size - 3; i++) {
> > +    for (i = 0; i < p->buf_size - 8; i++) {
> >          int c;
> >          if (b[i] != 0xFF)
> >              continue;
> >          c = b[i + 1];
> >          switch (c) {
> >          case SOI:
> > +            if (state == EOI && MPF)
> > +                return AVPROBE_SCORE_EXTENSION + 1;
>
> Your patch might work

You could test it.

> but the code is trying to find JPEG markers in the
> MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,

No, and I am quite sure that I tested this sufficiently.

Shouldn't there be a mpf demuxer that also provides the attachments?

Carl Eugen
Matthieu Bouron April 1, 2020, 5:54 p.m. UTC | #8
On Wed, Apr 01, 2020 at 07:30:55PM +0200, Carl Eugen Hoyos wrote:
> Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
> <matthieu.bouron@gmail.com>:
> >
> > On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > > <matthieu.bouron@gmail.com>:
> > > >
> > > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > > <matthieu.bouron@gmail.com>:
> > > > > >
> > > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > > of the file.
> > > > > >
> > > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > > marker after EOI.
> > > > > > ---
> > > > > >
> > > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > > at the end of the file.
> > > > > >
> > > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > > with MPF metadata).
> > > > > >
> > > > > > You can find a sample here:
> > > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > > >
> > > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > > >
> > > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > > enought formatprobesize (which matches the file size of the sample):
> > >
> > > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > > detected as mjpeg instead
> > >
> > > Carl Eugen
> > >
> > > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > > index 40f3e3d499..4e63b3494e 100644
> > > --- a/libavformat/img2dec.c
> > > +++ b/libavformat/img2dec.c
> > > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> > >  static int jpeg_probe(const AVProbeData *p)
> > >  {
> > >      const uint8_t *b = p->buf;
> > > -    int i, state = SOI;
> > > +    int i, state = SOI, MPF = 0;
> > >
> > >      if (AV_RB16(b) != 0xFFD8 ||
> > >          AV_RB32(b) == 0xFFD8FFF7)
> > >      return 0;
> > >
> > >      b += 2;
> > > -    for (i = 0; i < p->buf_size - 3; i++) {
> > > +    for (i = 0; i < p->buf_size - 8; i++) {
> > >          int c;
> > >          if (b[i] != 0xFF)
> > >              continue;
> > >          c = b[i + 1];
> > >          switch (c) {
> > >          case SOI:
> > > +            if (state == EOI && MPF)
> > > +                return AVPROBE_SCORE_EXTENSION + 1;
> >
> > Your patch might work
> 
> You could test it.
> 
> > but the code is trying to find JPEG markers in the
> > MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,
> 

I meant SOI sorry.

> No, and I am quite sure that I tested this sufficiently.

My point is, we are trying to parse way more data than necessary (the data
we parse might not be jpeg at this point): the code looks for a marker and
handles it, if the marker is unknown it repeats this operation for the
next byte until it reaches the end of the buffer or it finds SOI in the
EOI+MPF state (with your patch).

Why cant' we end the processing at EOI ? If there is a particular reason
for that ?

> 
> Shouldn't there be a mpf demuxer that also provides the attachments?

I already have a patch to parse the MPF metadata. The only thing left to
do is to expose the attachments (it currently only exposes those as
file offsets in stream metadata). I will send the patch later on.

> 
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Carl Eugen Hoyos April 1, 2020, 5:59 p.m. UTC | #9
Am Mi., 1. Apr. 2020 um 19:54 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> On Wed, Apr 01, 2020 at 07:30:55PM +0200, Carl Eugen Hoyos wrote:
> > Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
> > <matthieu.bouron@gmail.com>:
> > >
> > > On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > > > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > > > <matthieu.bouron@gmail.com>:
> > > > >
> > > > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > > > <matthieu.bouron@gmail.com>:
> > > > > > >
> > > > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > > > of the file.
> > > > > > >
> > > > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > > > marker after EOI.
> > > > > > > ---
> > > > > > >
> > > > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > > > at the end of the file.
> > > > > > >
> > > > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > > > with MPF metadata).
> > > > > > >
> > > > > > > You can find a sample here:
> > > > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > > > >
> > > > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > > > >
> > > > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > > > enought formatprobesize (which matches the file size of the sample):
> > > >
> > > > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > > > detected as mjpeg instead
> > > >
> > > > Carl Eugen
> > > >
> > > > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > > > index 40f3e3d499..4e63b3494e 100644
> > > > --- a/libavformat/img2dec.c
> > > > +++ b/libavformat/img2dec.c
> > > > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> > > >  static int jpeg_probe(const AVProbeData *p)
> > > >  {
> > > >      const uint8_t *b = p->buf;
> > > > -    int i, state = SOI;
> > > > +    int i, state = SOI, MPF = 0;
> > > >
> > > >      if (AV_RB16(b) != 0xFFD8 ||
> > > >          AV_RB32(b) == 0xFFD8FFF7)
> > > >      return 0;
> > > >
> > > >      b += 2;
> > > > -    for (i = 0; i < p->buf_size - 3; i++) {
> > > > +    for (i = 0; i < p->buf_size - 8; i++) {
> > > >          int c;
> > > >          if (b[i] != 0xFF)
> > > >              continue;
> > > >          c = b[i + 1];
> > > >          switch (c) {
> > > >          case SOI:
> > > > +            if (state == EOI && MPF)
> > > > +                return AVPROBE_SCORE_EXTENSION + 1;
> > >
> > > Your patch might work
> >
> > You could test it.
> >
> > > but the code is trying to find JPEG markers in the
> > > MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,
> >
>
> I meant SOI sorry.
>
> > No, and I am quite sure that I tested this sufficiently.
>
> My point is, we are trying to parse way more data than necessary (the data
> we parse might not be jpeg at this point): the code looks for a marker and
> handles it, if the marker is unknown it repeats this operation for the
> next byte until it reaches the end of the buffer or it finds SOI in the
> EOI+MPF state (with your patch).
>
> Why cant' we end the processing at EOI ? If there is a particular reason
> for that ?

You claim (that's how I understood it) was that a valid MPF jpg contains a
second (smaller) jpeg immediately after the big jpeg (that contains
an additional jpg in its APP2 data). I test if this is the case.
If you tell me a valid MPF jpg can contain any data after EOI, my
test if of course not correct.

> > Shouldn't there be a mpf demuxer that also provides the attachments?
>
> I already have a patch to parse the MPF metadata. The only thing left to
> do is to expose the attachments (it currently only exposes those as
> file offsets in stream metadata). I will send the patch later on.

But this parsing cannot output the image(s) after the main image as
attachments, or do I misunderstand?

Carl Eugen
Matthieu Bouron April 1, 2020, 7:07 p.m. UTC | #10
On Wed, Apr 01, 2020 at 07:59:46PM +0200, Carl Eugen Hoyos wrote:
> Am Mi., 1. Apr. 2020 um 19:54 Uhr schrieb Matthieu Bouron
> <matthieu.bouron@gmail.com>:
> >
> > On Wed, Apr 01, 2020 at 07:30:55PM +0200, Carl Eugen Hoyos wrote:
> > > Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
> > > <matthieu.bouron@gmail.com>:
> > > >
> > > > On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > > > > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > > > > <matthieu.bouron@gmail.com>:
> > > > > >
> > > > > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > > > > <matthieu.bouron@gmail.com>:
> > > > > > > >
> > > > > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > > > > of the file.
> > > > > > > >
> > > > > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > > > > marker after EOI.
> > > > > > > > ---
> > > > > > > >
> > > > > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > > > > at the end of the file.
> > > > > > > >
> > > > > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > > > > with MPF metadata).
> > > > > > > >
> > > > > > > > You can find a sample here:
> > > > > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > > > > >
> > > > > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > > > > >
> > > > > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > > > > enought formatprobesize (which matches the file size of the sample):
> > > > >
> > > > > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > > > > detected as mjpeg instead
> > > > >
> > > > > Carl Eugen
> > > > >
> > > > > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > > > > index 40f3e3d499..4e63b3494e 100644
> > > > > --- a/libavformat/img2dec.c
> > > > > +++ b/libavformat/img2dec.c
> > > > > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> > > > >  static int jpeg_probe(const AVProbeData *p)
> > > > >  {
> > > > >      const uint8_t *b = p->buf;
> > > > > -    int i, state = SOI;
> > > > > +    int i, state = SOI, MPF = 0;
> > > > >
> > > > >      if (AV_RB16(b) != 0xFFD8 ||
> > > > >          AV_RB32(b) == 0xFFD8FFF7)
> > > > >      return 0;
> > > > >
> > > > >      b += 2;
> > > > > -    for (i = 0; i < p->buf_size - 3; i++) {
> > > > > +    for (i = 0; i < p->buf_size - 8; i++) {
> > > > >          int c;
> > > > >          if (b[i] != 0xFF)
> > > > >              continue;
> > > > >          c = b[i + 1];
> > > > >          switch (c) {
> > > > >          case SOI:
> > > > > +            if (state == EOI && MPF)
> > > > > +                return AVPROBE_SCORE_EXTENSION + 1;
> > > >
> > > > Your patch might work
> > >
> > > You could test it.
> > >
> > > > but the code is trying to find JPEG markers in the
> > > > MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,
> > >
> >
> > I meant SOI sorry.
> >
> > > No, and I am quite sure that I tested this sufficiently.
> >
> > My point is, we are trying to parse way more data than necessary (the data
> > we parse might not be jpeg at this point): the code looks for a marker and
> > handles it, if the marker is unknown it repeats this operation for the
> > next byte until it reaches the end of the buffer or it finds SOI in the
> > EOI+MPF state (with your patch).
> >
> > Why cant' we end the processing at EOI ? If there is a particular reason
> > for that ?
> 
> You claim (that's how I understood it) was that a valid MPF jpg contains a
> second (smaller) jpeg immediately after the big jpeg (that contains
> an additional jpg in its APP2 data). I test if this is the case.
> If you tell me a valid MPF jpg can contain any data after EOI, my
> test if of course not correct.

The MPF metadata contained in the APP2 segment can reference multiple
images. For each image, an offset and a size is given.
The multiple images are stored after the main JPEG chunk. So the EOI of
the main JPEG chunk might be followed by the SOI of the first referenced
image (and this is the case in the sample I provided). So your patch do
not introduce any unnecessary processing if there is no padding between
EOI and SOI.

I wrongly remembered how MPF worked and assumed there was MPF data between
the SOI and EOI. Last time I worked on this was two years ago and I am
trying to upstream the patches I made back then.

Sorry to ask again but, why can't we stop parsing at EOI in the SOS state
?

> 
> > > Shouldn't there be a mpf demuxer that also provides the attachments?
> >
> > I already have a patch to parse the MPF metadata. The only thing left to
> > do is to expose the attachments (it currently only exposes those as
> > file offsets in stream metadata). I will send the patch later on.
> 
> But this parsing cannot output the image(s) after the main image as
> attachments, or do I misunderstand?

Yes, this is why I need to finish my patch so the thumbnails referenced by
the MPF metadata are exposed as attachments of the main jpeg image.

> 
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Carl Eugen Hoyos April 1, 2020, 7:10 p.m. UTC | #11
Am Mi., 1. Apr. 2020 um 21:07 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> On Wed, Apr 01, 2020 at 07:59:46PM +0200, Carl Eugen Hoyos wrote:
> > Am Mi., 1. Apr. 2020 um 19:54 Uhr schrieb Matthieu Bouron
> > <matthieu.bouron@gmail.com>:
> > >
> > > On Wed, Apr 01, 2020 at 07:30:55PM +0200, Carl Eugen Hoyos wrote:
> > > > Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
> > > > <matthieu.bouron@gmail.com>:
> > > > >
> > > > > On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > > > > > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > > > > > <matthieu.bouron@gmail.com>:
> > > > > > >
> > > > > > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > > > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > > > > > <matthieu.bouron@gmail.com>:
> > > > > > > > >
> > > > > > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > > > > > of the file.
> > > > > > > > >
> > > > > > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > > > > > marker after EOI.
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > > > > > at the end of the file.
> > > > > > > > >
> > > > > > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > > > > > with MPF metadata).
> > > > > > > > >
> > > > > > > > > You can find a sample here:
> > > > > > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > > > > > >
> > > > > > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > > > > > >
> > > > > > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > > > > > enought formatprobesize (which matches the file size of the sample):
> > > > > >
> > > > > > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > > > > > detected as mjpeg instead
> > > > > >
> > > > > > Carl Eugen
> > > > > >
> > > > > > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > > > > > index 40f3e3d499..4e63b3494e 100644
> > > > > > --- a/libavformat/img2dec.c
> > > > > > +++ b/libavformat/img2dec.c
> > > > > > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> > > > > >  static int jpeg_probe(const AVProbeData *p)
> > > > > >  {
> > > > > >      const uint8_t *b = p->buf;
> > > > > > -    int i, state = SOI;
> > > > > > +    int i, state = SOI, MPF = 0;
> > > > > >
> > > > > >      if (AV_RB16(b) != 0xFFD8 ||
> > > > > >          AV_RB32(b) == 0xFFD8FFF7)
> > > > > >      return 0;
> > > > > >
> > > > > >      b += 2;
> > > > > > -    for (i = 0; i < p->buf_size - 3; i++) {
> > > > > > +    for (i = 0; i < p->buf_size - 8; i++) {
> > > > > >          int c;
> > > > > >          if (b[i] != 0xFF)
> > > > > >              continue;
> > > > > >          c = b[i + 1];
> > > > > >          switch (c) {
> > > > > >          case SOI:
> > > > > > +            if (state == EOI && MPF)
> > > > > > +                return AVPROBE_SCORE_EXTENSION + 1;
> > > > >
> > > > > Your patch might work
> > > >
> > > > You could test it.
> > > >
> > > > > but the code is trying to find JPEG markers in the
> > > > > MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,
> > > >
> > >
> > > I meant SOI sorry.
> > >
> > > > No, and I am quite sure that I tested this sufficiently.
> > >
> > > My point is, we are trying to parse way more data than necessary (the data
> > > we parse might not be jpeg at this point): the code looks for a marker and
> > > handles it, if the marker is unknown it repeats this operation for the
> > > next byte until it reaches the end of the buffer or it finds SOI in the
> > > EOI+MPF state (with your patch).
> > >
> > > Why cant' we end the processing at EOI ? If there is a particular reason
> > > for that ?
> >
> > You claim (that's how I understood it) was that a valid MPF jpg contains a
> > second (smaller) jpeg immediately after the big jpeg (that contains
> > an additional jpg in its APP2 data). I test if this is the case.
> > If you tell me a valid MPF jpg can contain any data after EOI, my
> > test if of course not correct.
>
> The MPF metadata contained in the APP2 segment can reference multiple
> images. For each image, an offset and a size is given.
> The multiple images are stored after the main JPEG chunk. So the EOI of
> the main JPEG chunk might be followed by the SOI of the first referenced
> image (and this is the case in the sample I provided). So your patch do
> not introduce any unnecessary processing if there is no padding between
> EOI and SOI.
>
> I wrongly remembered how MPF worked and assumed there was MPF data between
> the SOI and EOI. Last time I worked on this was two years ago and I am
> trying to upstream the patches I made back then.
>
> Sorry to ask again but, why can't we stop parsing at EOI in the SOS state
> ?

Sorry for the repeated answer: It would break mjpeg detection.

We can of course stop at EOI for MPF but if every valid MPF file either ends
at EOI or contains another jpeg image, we are still good.

Carl Eugen
Matthieu Bouron April 1, 2020, 7:39 p.m. UTC | #12
On Wed, Apr 01, 2020 at 09:10:19PM +0200, Carl Eugen Hoyos wrote:
> Am Mi., 1. Apr. 2020 um 21:07 Uhr schrieb Matthieu Bouron
> <matthieu.bouron@gmail.com>:
> >
> > On Wed, Apr 01, 2020 at 07:59:46PM +0200, Carl Eugen Hoyos wrote:
> > > Am Mi., 1. Apr. 2020 um 19:54 Uhr schrieb Matthieu Bouron
> > > <matthieu.bouron@gmail.com>:
> > > >
> > > > On Wed, Apr 01, 2020 at 07:30:55PM +0200, Carl Eugen Hoyos wrote:
> > > > > Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
> > > > > <matthieu.bouron@gmail.com>:
> > > > > >
> > > > > > On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > > > > > > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > > > > > > <matthieu.bouron@gmail.com>:
> > > > > > > >
> > > > > > > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > > > > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > > > > > > <matthieu.bouron@gmail.com>:
> > > > > > > > > >
> > > > > > > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > > > > > > of the file.
> > > > > > > > > >
> > > > > > > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > > > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > > > > > > marker after EOI.
> > > > > > > > > > ---
> > > > > > > > > >
> > > > > > > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > > > > > > at the end of the file.
> > > > > > > > > >
> > > > > > > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > > > > > > with MPF metadata).
> > > > > > > > > >
> > > > > > > > > > You can find a sample here:
> > > > > > > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > > > > > > >
> > > > > > > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > > > > > > >
> > > > > > > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > > > > > > enought formatprobesize (which matches the file size of the sample):
> > > > > > >
> > > > > > > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > > > > > > detected as mjpeg instead
> > > > > > >
> > > > > > > Carl Eugen
> > > > > > >
> > > > > > > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > > > > > > index 40f3e3d499..4e63b3494e 100644
> > > > > > > --- a/libavformat/img2dec.c
> > > > > > > +++ b/libavformat/img2dec.c
> > > > > > > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> > > > > > >  static int jpeg_probe(const AVProbeData *p)
> > > > > > >  {
> > > > > > >      const uint8_t *b = p->buf;
> > > > > > > -    int i, state = SOI;
> > > > > > > +    int i, state = SOI, MPF = 0;
> > > > > > >
> > > > > > >      if (AV_RB16(b) != 0xFFD8 ||
> > > > > > >          AV_RB32(b) == 0xFFD8FFF7)
> > > > > > >      return 0;
> > > > > > >
> > > > > > >      b += 2;
> > > > > > > -    for (i = 0; i < p->buf_size - 3; i++) {
> > > > > > > +    for (i = 0; i < p->buf_size - 8; i++) {
> > > > > > >          int c;
> > > > > > >          if (b[i] != 0xFF)
> > > > > > >              continue;
> > > > > > >          c = b[i + 1];
> > > > > > >          switch (c) {
> > > > > > >          case SOI:
> > > > > > > +            if (state == EOI && MPF)
> > > > > > > +                return AVPROBE_SCORE_EXTENSION + 1;
> > > > > >
> > > > > > Your patch might work
> > > > >
> > > > > You could test it.
> > > > >
> > > > > > but the code is trying to find JPEG markers in the
> > > > > > MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,
> > > > >
> > > >
> > > > I meant SOI sorry.
> > > >
> > > > > No, and I am quite sure that I tested this sufficiently.
> > > >
> > > > My point is, we are trying to parse way more data than necessary (the data
> > > > we parse might not be jpeg at this point): the code looks for a marker and
> > > > handles it, if the marker is unknown it repeats this operation for the
> > > > next byte until it reaches the end of the buffer or it finds SOI in the
> > > > EOI+MPF state (with your patch).
> > > >
> > > > Why cant' we end the processing at EOI ? If there is a particular reason
> > > > for that ?
> > >
> > > You claim (that's how I understood it) was that a valid MPF jpg contains a
> > > second (smaller) jpeg immediately after the big jpeg (that contains
> > > an additional jpg in its APP2 data). I test if this is the case.
> > > If you tell me a valid MPF jpg can contain any data after EOI, my
> > > test if of course not correct.
> >
> > The MPF metadata contained in the APP2 segment can reference multiple
> > images. For each image, an offset and a size is given.
> > The multiple images are stored after the main JPEG chunk. So the EOI of
> > the main JPEG chunk might be followed by the SOI of the first referenced
> > image (and this is the case in the sample I provided). So your patch do
> > not introduce any unnecessary processing if there is no padding between
> > EOI and SOI.
> >
> > I wrongly remembered how MPF worked and assumed there was MPF data between
> > the SOI and EOI. Last time I worked on this was two years ago and I am
> > trying to upstream the patches I made back then.
> >
> > Sorry to ask again but, why can't we stop parsing at EOI in the SOS state
> > ?
> 
> Sorry for the repeated answer: It would break mjpeg detection.
> 
> We can of course stop at EOI for MPF but if every valid MPF file either ends
> at EOI or contains another jpeg image, we are still good.

Ok, I think I understand why it would break the mjpeg detection now. This
is why jpeg_probe() returns 0 if it finds SOI to give a chance to the
mjpeg probe to find something (and because the mjpeg demuxer is supposed
to handle multiple JPEG chunks). Is this correct ?

> 
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Carl Eugen Hoyos April 1, 2020, 7:53 p.m. UTC | #13
Am Mi., 1. Apr. 2020 um 21:39 Uhr schrieb Matthieu Bouron
<matthieu.bouron@gmail.com>:
>
> On Wed, Apr 01, 2020 at 09:10:19PM +0200, Carl Eugen Hoyos wrote:
> > Am Mi., 1. Apr. 2020 um 21:07 Uhr schrieb Matthieu Bouron
> > <matthieu.bouron@gmail.com>:
> > >
> > > On Wed, Apr 01, 2020 at 07:59:46PM +0200, Carl Eugen Hoyos wrote:
> > > > Am Mi., 1. Apr. 2020 um 19:54 Uhr schrieb Matthieu Bouron
> > > > <matthieu.bouron@gmail.com>:
> > > > >
> > > > > On Wed, Apr 01, 2020 at 07:30:55PM +0200, Carl Eugen Hoyos wrote:
> > > > > > Am Mi., 1. Apr. 2020 um 19:15 Uhr schrieb Matthieu Bouron
> > > > > > <matthieu.bouron@gmail.com>:
> > > > > > >
> > > > > > > On Wed, Apr 01, 2020 at 06:55:42PM +0200, Carl Eugen Hoyos wrote:
> > > > > > > > Am Mi., 1. Apr. 2020 um 18:35 Uhr schrieb Matthieu Bouron
> > > > > > > > <matthieu.bouron@gmail.com>:
> > > > > > > > >
> > > > > > > > > On Wed, Apr 01, 2020 at 06:29:03PM +0200, Carl Eugen Hoyos wrote:
> > > > > > > > > > Am Mi., 1. Apr. 2020 um 18:01 Uhr schrieb Matthieu Bouron
> > > > > > > > > > <matthieu.bouron@gmail.com>:
> > > > > > > > > > >
> > > > > > > > > > > Fixes probing of JPEG files containing MPF metadata appended at the end
> > > > > > > > > > > of the file.
> > > > > > > > > > >
> > > > > > > > > > > The MPF metadata chunk can contains multiple JPEG images (thumbnails)
> > > > > > > > > > > which makes the jpeg_probe fails (return 0) because it finds a SOI
> > > > > > > > > > > marker after EOI.
> > > > > > > > > > > ---
> > > > > > > > > > >
> > > > > > > > > > > This patch fixes probing of JPEG files containing MPF metadata [1] appended
> > > > > > > > > > > at the end of the file.
> > > > > > > > > > >
> > > > > > > > > > > Such files can be produced by GoPro camera (which produces JPEG files
> > > > > > > > > > > with MPF metadata).
> > > > > > > > > > >
> > > > > > > > > > > You can find a sample here:
> > > > > > > > > > > https://0x5c.me/gopro_jpg_mpf_probe_fail
> > > > > > > > > >
> > > > > > > > > > The sample works fine here with FFmpeg 4.2: Is there a regression?
> > > > > > > > >
> > > > > > > > > This sample does not work on FFmpeg 4.2 and master if you set a big
> > > > > > > > > enought formatprobesize (which matches the file size of the sample):
> > > > > > > >
> > > > > > > > Inlined is a patch that fixes detection but I wonder if this shouldn't be
> > > > > > > > detected as mjpeg instead
> > > > > > > >
> > > > > > > > Carl Eugen
> > > > > > > >
> > > > > > > > diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
> > > > > > > > index 40f3e3d499..4e63b3494e 100644
> > > > > > > > --- a/libavformat/img2dec.c
> > > > > > > > +++ b/libavformat/img2dec.c
> > > > > > > > @@ -737,20 +737,22 @@ static int j2k_probe(const AVProbeData *p)
> > > > > > > >  static int jpeg_probe(const AVProbeData *p)
> > > > > > > >  {
> > > > > > > >      const uint8_t *b = p->buf;
> > > > > > > > -    int i, state = SOI;
> > > > > > > > +    int i, state = SOI, MPF = 0;
> > > > > > > >
> > > > > > > >      if (AV_RB16(b) != 0xFFD8 ||
> > > > > > > >          AV_RB32(b) == 0xFFD8FFF7)
> > > > > > > >      return 0;
> > > > > > > >
> > > > > > > >      b += 2;
> > > > > > > > -    for (i = 0; i < p->buf_size - 3; i++) {
> > > > > > > > +    for (i = 0; i < p->buf_size - 8; i++) {
> > > > > > > >          int c;
> > > > > > > >          if (b[i] != 0xFF)
> > > > > > > >              continue;
> > > > > > > >          c = b[i + 1];
> > > > > > > >          switch (c) {
> > > > > > > >          case SOI:
> > > > > > > > +            if (state == EOI && MPF)
> > > > > > > > +                return AVPROBE_SCORE_EXTENSION + 1;
> > > > > > >
> > > > > > > Your patch might work
> > > > > >
> > > > > > You could test it.
> > > > > >
> > > > > > > but the code is trying to find JPEG markers in the
> > > > > > > MPF metadata until we reach the first EOI in the MPF JPEG thumbnails,
> > > > > >
> > > > >
> > > > > I meant SOI sorry.
> > > > >
> > > > > > No, and I am quite sure that I tested this sufficiently.
> > > > >
> > > > > My point is, we are trying to parse way more data than necessary (the data
> > > > > we parse might not be jpeg at this point): the code looks for a marker and
> > > > > handles it, if the marker is unknown it repeats this operation for the
> > > > > next byte until it reaches the end of the buffer or it finds SOI in the
> > > > > EOI+MPF state (with your patch).
> > > > >
> > > > > Why cant' we end the processing at EOI ? If there is a particular reason
> > > > > for that ?
> > > >
> > > > You claim (that's how I understood it) was that a valid MPF jpg contains a
> > > > second (smaller) jpeg immediately after the big jpeg (that contains
> > > > an additional jpg in its APP2 data). I test if this is the case.
> > > > If you tell me a valid MPF jpg can contain any data after EOI, my
> > > > test if of course not correct.
> > >
> > > The MPF metadata contained in the APP2 segment can reference multiple
> > > images. For each image, an offset and a size is given.
> > > The multiple images are stored after the main JPEG chunk. So the EOI of
> > > the main JPEG chunk might be followed by the SOI of the first referenced
> > > image (and this is the case in the sample I provided). So your patch do
> > > not introduce any unnecessary processing if there is no padding between
> > > EOI and SOI.
> > >
> > > I wrongly remembered how MPF worked and assumed there was MPF data between
> > > the SOI and EOI. Last time I worked on this was two years ago and I am
> > > trying to upstream the patches I made back then.
> > >
> > > Sorry to ask again but, why can't we stop parsing at EOI in the SOS state
> > > ?
> >
> > Sorry for the repeated answer: It would break mjpeg detection.
> >
> > We can of course stop at EOI for MPF but if every valid MPF file either ends
> > at EOI or contains another jpeg image, we are still good.
>
> Ok, I think I understand why it would break the mjpeg detection now. This
> is why jpeg_probe() returns 0 if it finds SOI to give a chance to the
> mjpeg probe to find something (and because the mjpeg demuxer is supposed
> to handle multiple JPEG chunks). Is this correct ?

Yes.

Carl Eugen
diff mbox series

Patch

--- a/libavformat/img2dec.c
+++ b/libavformat/img2dec.c
@@ -209,7 +209,7 @@  int ff_img_read_header(AVFormatContext *s1)
         s->is_pipe = 0;
     else {
         s->is_pipe       = 1;
-        st->need_parsing = AVSTREAM_PARSE_FULL;
+        st->need_parsing = AVSTREAM_PARSE_FULL | AVSTREAM_PARSE_HEADERS;

Settings AVSTREAM_PARSE_HEADERS makes avformat requests complete frames
from the parser in libavformat/utils.c

What do you think ?

[1] https://exiftool.org/TagNames/MPF.html

---
 libavformat/img2dec.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
index 93cd51c1932..decd8023e02 100644
--- a/libavformat/img2dec.c
+++ b/libavformat/img2dec.c
@@ -773,9 +773,7 @@  static int jpeg_probe(const AVProbeData *p)
         case EOI:
             if (state != SOS)
                 return 0;
-            state = EOI;
-            break;
-        case DHT:
+            return AVPROBE_SCORE_EXTENSION + 1;
         case DQT:
         case APP0:
         case APP1:
@@ -803,8 +801,6 @@  static int jpeg_probe(const AVProbeData *p)
         }
     }
 
-    if (state == EOI)
-        return AVPROBE_SCORE_EXTENSION + 1;
     if (state == SOS)
         return AVPROBE_SCORE_EXTENSION / 2;
     return AVPROBE_SCORE_EXTENSION / 8;