[FFmpeg-devel] avformat: Fix probing on some JPEGs

Submitted by Nikolas Bowe on Aug. 23, 2019, 11:03 p.m.

Details

Message ID CANCrOA=TQuXp-QKT1E6gXQFZjXVZyE5Oft_AicxxmJ5agUHFeA@mail.gmail.com
State New
Headers show

Commit Message

Nikolas Bowe Aug. 23, 2019, 11:03 p.m.
On Thu, Aug 22, 2019 at 2:30 AM Paul B Mahol <onemda@gmail.com> wrote:

> On Thu, Aug 22, 2019 at 11:19 AM Carl Eugen Hoyos <ceffmpeg@gmail.com>
> wrote:
>
> > Am Mi., 21. Aug. 2019 um 23:05 Uhr schrieb Niki Bowe
> > <nbowe-at-google.com@ffmpeg.org>:
> > >
> > > On Mon, Aug 19, 2019 at 7:22 PM Carl Eugen Hoyos <ceffmpeg@gmail.com>
> > wrote:
> > >
> > > >
> > > > This score would mean that mjpeg can never be detected.
> > > > I suspect you have to reduce one of the demuxers to "- 1".
> > > >
> > > >
> > > Thanks Carl.
> > > Attached patch to reduce mpeg probe by -1, which also fixes the issue.
> >
> > Sorry, I misread the original report, it looked to me as if mJpeg was
> > the culprit.
> >
> > Imo, the mpeg probing should be fixed (return a smaller value) for your
> > sample
> > by detecting that it is not mpeg, not by returning a smaller value for
> > all samples.
> >
>
> 1000% agree.
>
>
It didn't return a smaller value for all samples, only the "invalid VDR
files and short PES streams" case.
Most mpegps files still return 26 immediately, because they have pack
headers.

However, here is another patch where I try to limit it to only changing
score for these jpegs.
I noticed that these jpegs had a lot of 0x00000100 sequences, which matches
mpeg picture header start code. I added another heuristic which matches
these jpegs, but not any mpegps files I could find.

Alternatively I could make reduce score if it doesn't start with a start
code? At the moment its happy to search until it finds start codes.


Is everyone really sure the best approach is to modify mpegps_probe for
this?
The mpegps_probe function returns 25 in many instances where it may not be
mpegps. It does only minimal structural checking, and allows invalid data
to still classify as mpegps.
jpeg probing returns 25 in some cases where it is almost certainly a jpeg
(Has to go through multiple tags to get to SOS, many of which early out if
they find invalid data).
Note that 25 is still treated as "low confidence" for jpeg. It logs "Format
jpeg_pipe detected only with low score of 25, misdetection possible!" for
these jpegs.
So I still think a score of 25 is too low for these jpegs, and that a
better fix would be to return 26 for jpeg_pipe and mjpeg if it makes it
past multiple tags to SOS.

Comments

Michael Niedermayer Aug. 25, 2019, 6:39 p.m.
On Fri, Aug 23, 2019 at 04:03:10PM -0700, Niki Bowe wrote:
> On Thu, Aug 22, 2019 at 2:30 AM Paul B Mahol <onemda@gmail.com> wrote:
> 
> > On Thu, Aug 22, 2019 at 11:19 AM Carl Eugen Hoyos <ceffmpeg@gmail.com>
> > wrote:
> >
> > > Am Mi., 21. Aug. 2019 um 23:05 Uhr schrieb Niki Bowe
> > > <nbowe-at-google.com@ffmpeg.org>:
> > > >
> > > > On Mon, Aug 19, 2019 at 7:22 PM Carl Eugen Hoyos <ceffmpeg@gmail.com>
> > > wrote:
> > > >
> > > > >
> > > > > This score would mean that mjpeg can never be detected.
> > > > > I suspect you have to reduce one of the demuxers to "- 1".
> > > > >
> > > > >
> > > > Thanks Carl.
> > > > Attached patch to reduce mpeg probe by -1, which also fixes the issue.
> > >
> > > Sorry, I misread the original report, it looked to me as if mJpeg was
> > > the culprit.
> > >
> > > Imo, the mpeg probing should be fixed (return a smaller value) for your
> > > sample
> > > by detecting that it is not mpeg, not by returning a smaller value for
> > > all samples.
> > >
> >
> > 1000% agree.
> >
> >
> It didn't return a smaller value for all samples, only the "invalid VDR
> files and short PES streams" case.
> Most mpegps files still return 26 immediately, because they have pack
> headers.
> 
> However, here is another patch where I try to limit it to only changing
> score for these jpegs.
> I noticed that these jpegs had a lot of 0x00000100 sequences, which matches
> mpeg picture header start code. I added another heuristic which matches
> these jpegs, but not any mpegps files I could find.
> 
> Alternatively I could make reduce score if it doesn't start with a start
> code? At the moment its happy to search until it finds start codes.
> 
> 
> Is everyone really sure the best approach is to modify mpegps_probe for
> this?
> The mpegps_probe function returns 25 in many instances where it may not be
> mpegps. It does only minimal structural checking, and allows invalid data
> to still classify as mpegps.
> jpeg probing returns 25 in some cases where it is almost certainly a jpeg
> (Has to go through multiple tags to get to SOS, many of which early out if
> they find invalid data).
> Note that 25 is still treated as "low confidence" for jpeg. It logs "Format
> jpeg_pipe detected only with low score of 25, misdetection possible!" for
> these jpegs.
> So I still think a score of 25 is too low for these jpegs, and that a
> better fix would be to return 26 for jpeg_pipe and mjpeg if it makes it
> past multiple tags to SOS.

jpegs can be in other container formats its not jpeg in that case but the
other container format

about this patch
it breaks this:

./ffplay tickets//3327/issue3327-libc-2.17.so

https://trac.ffmpeg.org/raw-attachment/ticket/3327/issue3327-libc-2.17.so

which is detected as mpeg after the patch.
really nothing should be detected as mpeg after this that was not before
the idea IIUC is make something that was detected as mpeg to be not anymore

Thanks

[...]
Nikolas Bowe Aug. 28, 2019, 1:30 a.m.
On Sun, Aug 25, 2019 at 11:39 AM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Fri, Aug 23, 2019 at 04:03:10PM -0700, Niki Bowe wrote:
> > On Thu, Aug 22, 2019 at 2:30 AM Paul B Mahol <onemda@gmail.com> wrote:
> >
> > > On Thu, Aug 22, 2019 at 11:19 AM Carl Eugen Hoyos <ceffmpeg@gmail.com>
> > > wrote:
> > >
> > > > Am Mi., 21. Aug. 2019 um 23:05 Uhr schrieb Niki Bowe
> > > > <nbowe-at-google.com@ffmpeg.org>:
> > > > >
> > > > > On Mon, Aug 19, 2019 at 7:22 PM Carl Eugen Hoyos <
> ceffmpeg@gmail.com>
> > > > wrote:
> > > > >
> > > > > >
> > > > > > This score would mean that mjpeg can never be detected.
> > > > > > I suspect you have to reduce one of the demuxers to "- 1".
> > > > > >
> > > > > >
> > > > > Thanks Carl.
> > > > > Attached patch to reduce mpeg probe by -1, which also fixes the
> issue.
> > > >
> > > > Sorry, I misread the original report, it looked to me as if mJpeg was
> > > > the culprit.
> > > >
> > > > Imo, the mpeg probing should be fixed (return a smaller value) for
> your
> > > > sample
> > > > by detecting that it is not mpeg, not by returning a smaller value
> for
> > > > all samples.
> > > >
> > >
> > > 1000% agree.
> > >
> > >
> > It didn't return a smaller value for all samples, only the "invalid VDR
> > files and short PES streams" case.
> > Most mpegps files still return 26 immediately, because they have pack
> > headers.
> >
> > However, here is another patch where I try to limit it to only changing
> > score for these jpegs.
> > I noticed that these jpegs had a lot of 0x00000100 sequences, which
> matches
> > mpeg picture header start code. I added another heuristic which matches
> > these jpegs, but not any mpegps files I could find.
> >
> > Alternatively I could make reduce score if it doesn't start with a start
> > code? At the moment its happy to search until it finds start codes.
> >
> >
> > Is everyone really sure the best approach is to modify mpegps_probe for
> > this?
> > The mpegps_probe function returns 25 in many instances where it may not
> be
> > mpegps. It does only minimal structural checking, and allows invalid data
> > to still classify as mpegps.
> > jpeg probing returns 25 in some cases where it is almost certainly a jpeg
> > (Has to go through multiple tags to get to SOS, many of which early out
> if
> > they find invalid data).
> > Note that 25 is still treated as "low confidence" for jpeg. It logs
> "Format
> > jpeg_pipe detected only with low score of 25, misdetection possible!" for
> > these jpegs.
> > So I still think a score of 25 is too low for these jpegs, and that a
> > better fix would be to return 26 for jpeg_pipe and mjpeg if it makes it
> > past multiple tags to SOS.
>
> jpegs can be in other container formats its not jpeg in that case but the
> other container format
>
about this patch
> it breaks this:
>
> ./ffplay tickets//3327/issue3327-libc-2.17.so
>
> https://trac.ffmpeg.org/raw-attachment/ticket/3327/issue3327-libc-2.17.so
>
> which is detected as mpeg after the patch.
> really nothing should be detected as mpeg after this that was not before
> the idea IIUC is make something that was detected as mpeg to be not anymore
>
>
Thanks Michael. Good find.
Attached patch which only applies the extra sanity checks to the "Invalid
VDR files and short PES streams" path.



> Thanks
>
> [...]
>
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Awnsering whenever a program halts or runs forever is
> On a turing machine, in general impossible (turings halting problem).
> On any real computer, always possible as a real computer has a finite
> number
> of states N, and will either halt in less than N cycles or never halt.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".

Patch hide | download patch | download mbox

From 168485f954452de1e3ff78932c896cff01c2a303 Mon Sep 17 00:00:00 2001
From: Nikolas Bowe <nbowe@google.com>
Date: Thu, 8 Aug 2019 15:32:51 -0700
Subject: [PATCH] avformat: Fix probing on some JPEGs

Fixes "Invalid data found when processing input" on some JPEGs.

Some large extensionless JPEGs can get probe score collisions with mpeg
eg
$ ffprobe -loglevel trace  /tmp/foo
[NULL @ 0x55c130ab04c0] Opening '/tmp/foo' for reading
[file @ 0x55c130ab0f40] Setting default whitelist 'file,crypto'
Probing jpeg_pipe score:6 size:2048
Probing jpeg_pipe score:6 size:4096
Probing jpeg_pipe score:6 size:8192
Probing jpeg_pipe score:6 size:16384
Probing jpeg_pipe score:25 size:32768
Probing jpeg_pipe score:25 size:65536
Probing jpeg_pipe score:25 size:131072
Probing jpeg_pipe score:25 size:262144
Probing jpeg_pipe score:25 size:524288
Probing mpeg score:25 size:1048576
Probing jpeg_pipe score:25 size:1048576
[AVIOContext @ 0x55c130ab9300] Statistics: 1048576 bytes read, 0 seeks
/tmp/foo: Invalid data found when processing input

This patch fixes this by reducing probe score for mpeg
---
 libavformat/mpeg.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/libavformat/mpeg.c b/libavformat/mpeg.c
index 3205f209e6..6dcaefb5de 100644
--- a/libavformat/mpeg.c
+++ b/libavformat/mpeg.c
@@ -68,7 +68,7 @@  static int mpegps_probe(const AVProbeData *p)
 {
     uint32_t code = -1;
     int i;
-    int sys = 0, pspack = 0, priv1 = 0, vid = 0;
+    int sys = 0, pspack = 0, priv1 = 0, vid = 0, pic = 0;
     int audio = 0, invalid = 0, score = 0;
     int endpes = 0;
 
@@ -92,18 +92,23 @@  static int mpegps_probe(const AVProbeData *p)
             else if ((code & 0xe0) == AUDIO_ID &&  pes) {audio++; i+=len;}
             else if (code == PRIVATE_STREAM_1  &&  pes) {priv1++; i+=len;}
             else if (code == 0x1fd             &&  pes) vid++; //VC1
-
             else if ((code & 0xf0) == VIDEO_ID && !pes) invalid++;
             else if ((code & 0xe0) == AUDIO_ID && !pes) invalid++;
             else if (code == PRIVATE_STREAM_1  && !pes) invalid++;
+            else if (code == 0x100) pic++;
         }
     }
 
+    // If we don't find a pack header, and we have much more picture headers codes than 
+    // mpeg video stream codes, then its probably not mpeg.
+    if (pspack == 0 && vid > 0 && vid * 10 < pic)
+        return AVPROBE_SCORE_EXTENSION / 2 - 1;
+
     if (vid + audio > invalid + 1) /* invalid VDR files nd short PES streams */
         score = AVPROBE_SCORE_EXTENSION / 2;
 
-//     av_log(NULL, AV_LOG_ERROR, "vid:%d aud:%d sys:%d pspack:%d invalid:%d size:%d \n",
-//            vid, audio, sys, pspack, invalid, p->buf_size);
+//     av_log(NULL, AV_LOG_ERROR, "vid:%d aud:%d sys:%d pspack:%d pic:%d invalid:%d size:%d \n",
+//            vid, audio, sys, pspack, pic, invalid, p->buf_size);
 
     if (sys > invalid && sys * 9 <= pspack * 10)
         return (audio > 12 || vid > 3 || pspack > 2) ? AVPROBE_SCORE_EXTENSION + 2
-- 
2.23.0.187.g17f5b7556c-goog