diff mbox

[FFmpeg-devel] mov: Evaluate the movie display matrix

Message ID 20161005212533.90669-1-vittorio.giovara@gmail.com
State Superseded
Headers show

Commit Message

Vittorio Giovara Oct. 5, 2016, 9:25 p.m. UTC
This matrix needs to be applied after all others have (currently only
display matrix from trak), but cannot be handled in movie box, since
streams are not allocated yet.

So store it in main context and if not identity, apply it when appropriate,
handling the case when trak display matrix is identity and when it is not.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
---
Please keep my in CC.
Vittorio

 libavformat/isom.h |  2 ++
 libavformat/mov.c  | 63 +++++++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 55 insertions(+), 10 deletions(-)

Comments

Michael Niedermayer Oct. 6, 2016, 2:07 a.m. UTC | #1
On Wed, Oct 05, 2016 at 05:25:33PM -0400, Vittorio Giovara wrote:
> This matrix needs to be applied after all others have (currently only
> display matrix from trak), but cannot be handled in movie box, since
> streams are not allocated yet.
> 
> So store it in main context and if not identity, apply it when appropriate,
> handling the case when trak display matrix is identity and when it is not.

do you have a testcase for this which you can share ?

thx

[...]
Vittorio Giovara Oct. 6, 2016, 4:12 a.m. UTC | #2
On Wed, Oct 5, 2016 at 10:07 PM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Wed, Oct 05, 2016 at 05:25:33PM -0400, Vittorio Giovara wrote:
>> This matrix needs to be applied after all others have (currently only
>> display matrix from trak), but cannot be handled in movie box, since
>> streams are not allocated yet.
>>
>> So store it in main context and if not identity, apply it when appropriate,
>> handling the case when trak display matrix is identity and when it is not.
>
> do you have a testcase for this which you can share ?
>
> thx
>
> [...]
> --

I created one for the occasion
https://www.dropbox.com/s/w2uu37o11rvoz1q/moviedispmat.mp4?dl=1
Michael Niedermayer Oct. 6, 2016, 8:07 p.m. UTC | #3
On Thu, Oct 06, 2016 at 12:12:07AM -0400, Vittorio Giovara wrote:
> On Wed, Oct 5, 2016 at 10:07 PM, Michael Niedermayer
> <michael@niedermayer.cc> wrote:
> > On Wed, Oct 05, 2016 at 05:25:33PM -0400, Vittorio Giovara wrote:
> >> This matrix needs to be applied after all others have (currently only
> >> display matrix from trak), but cannot be handled in movie box, since
> >> streams are not allocated yet.
> >>
> >> So store it in main context and if not identity, apply it when appropriate,
> >> handling the case when trak display matrix is identity and when it is not.
> >
> > do you have a testcase for this which you can share ?
> >
> > thx
> >
> > [...]
> > --
> 
> I created one for the occasion
> https://www.dropbox.com/s/w2uu37o11rvoz1q/moviedispmat.mp4?dl=1

with ffplay before the patch the image looks round, after the patch
it looks quite squished, is that intended ?
what is the intended / correct SAR / DAR for this sample ?

thx

[...]
Vittorio Giovara Oct. 6, 2016, 8:40 p.m. UTC | #4
On Thu, Oct 6, 2016 at 4:07 PM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Thu, Oct 06, 2016 at 12:12:07AM -0400, Vittorio Giovara wrote:
>> On Wed, Oct 5, 2016 at 10:07 PM, Michael Niedermayer
>> <michael@niedermayer.cc> wrote:
>> > On Wed, Oct 05, 2016 at 05:25:33PM -0400, Vittorio Giovara wrote:
>> >> This matrix needs to be applied after all others have (currently only
>> >> display matrix from trak), but cannot be handled in movie box, since
>> >> streams are not allocated yet.
>> >>
>> >> So store it in main context and if not identity, apply it when appropriate,
>> >> handling the case when trak display matrix is identity and when it is not.
>> >
>> > do you have a testcase for this which you can share ?
>> >
>> > thx
>> >
>> > [...]
>> > --
>>
>> I created one for the occasion
>> https://www.dropbox.com/s/w2uu37o11rvoz1q/moviedispmat.mp4?dl=1
>
> with ffplay before the patch the image looks round, after the patch
> it looks quite squished, is that intended ?

yes that is correct

> what is the intended / correct SAR / DAR for this sample ?

without the patch     Stream #0:0(und): Video: h264 (High) (avc1 /
0x31637661), yuv420p, 540x576 [SAR 1:1 DAR 15:16], 102 kb/s, 25 fps,
25 tbr, 12800 tbn, 50 tbc (default)

with the patch     Stream #0:0(und): Video: h264 (High) (avc1 /
0x31637661), yuv420p, 540x576 [SAR 1:1 DAR 15:16], 102 kb/s, SAR
93207:65536 DAR 1016804:762601, 25 fps, 25 tbr, 12800 tbn, 50 tbc
(default)

This last one is the intended and correct one.
compn Oct. 6, 2016, 10:11 p.m. UTC | #5
On Thu, 6 Oct 2016 16:40:12 -0400
Vittorio Giovara <vittorio.giovara@gmail.com> wrote:

> > what is the intended / correct SAR / DAR for this sample ?
> 
> without the patch     Stream #0:0(und): Video: h264 (High) (avc1 /
> 0x31637661), yuv420p, 540x576 [SAR 1:1 DAR 15:16], 102 kb/s, 25 fps,
> 25 tbr, 12800 tbn, 50 tbc (default)
> 
> with the patch     Stream #0:0(und): Video: h264 (High) (avc1 /
> 0x31637661), yuv420p, 540x576 [SAR 1:1 DAR 15:16], 102 kb/s, SAR
> 93207:65536 DAR 1016804:762601, 25 fps, 25 tbr, 12800 tbn, 50 tbc
> (default)
> 
> This last one is the intended and correct one.

why are SAR and DAR in the output twice now? it might be confusing to
users? or just me...

-compn
Michael Niedermayer Oct. 6, 2016, 11:10 p.m. UTC | #6
On Wed, Oct 05, 2016 at 05:25:33PM -0400, Vittorio Giovara wrote:
> This matrix needs to be applied after all others have (currently only
> display matrix from trak), but cannot be handled in movie box, since
> streams are not allocated yet.
> 
> So store it in main context and if not identity, apply it when appropriate,
> handling the case when trak display matrix is identity and when it is not.
> 
> Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
> ---
> Please keep my in CC.
> Vittorio
> 
>  libavformat/isom.h |  2 ++
>  libavformat/mov.c  | 63 +++++++++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 55 insertions(+), 10 deletions(-)
> 
> diff --git a/libavformat/isom.h b/libavformat/isom.h
> index 2246fed..2aeb8fa 100644
> --- a/libavformat/isom.h
> +++ b/libavformat/isom.h
> @@ -238,6 +238,8 @@ typedef struct MOVContext {
>      uint8_t *decryption_key;
>      int decryption_key_len;
>      int enable_drefs;
> +
> +    int32_t movie_display_matrix[3][3]; ///< display matrix from mvhd
>  } MOVContext;
>  
>  int ff_mp4_read_descr_len(AVIOContext *pb);
> diff --git a/libavformat/mov.c b/libavformat/mov.c
> index a15c8d1..26b332c 100644
> --- a/libavformat/mov.c
> +++ b/libavformat/mov.c
> @@ -1211,6 +1211,7 @@ static int mov_read_mdhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>  
>  static int mov_read_mvhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>  {
> +    int i;
>      int64_t creation_time;
>      int version = avio_r8(pb); /* version */
>      avio_rb24(pb); /* flags */
> @@ -1238,7 +1239,12 @@ static int mov_read_mvhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>  
>      avio_skip(pb, 10); /* reserved */
>  
> -    avio_skip(pb, 36); /* display matrix */
> +    /* movie display matrix, store it in main context and use it later on */
> +    for (i = 0; i < 3; i++) {
> +        c->movie_display_matrix[i][0] = avio_rb32(pb); // 16.16 fixed point
> +        c->movie_display_matrix[i][1] = avio_rb32(pb); // 16.16 fixed point
> +        c->movie_display_matrix[i][2] = avio_rb32(pb); //  2.30 fixed point
> +    }
>  
>      avio_rb32(pb); /* preview time */
>      avio_rb32(pb); /* preview duration */
> @@ -3798,9 +3804,24 @@ static int mov_read_meta(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>      return 0;
>  }
>  
> +// return 0 when matrix is identity, 1 otherwise
> +#define IS_MATRIX_FULL(matrix)       \
> +    (matrix[0][0] != (1 << 16) ||    \
> +     matrix[1][1] != (1 << 16) ||    \
> +     matrix[2][2] != (1 << 30) ||    \
> +     matrix[0][1] || matrix[0][2] || \
> +     matrix[1][0] || matrix[1][2] || \
> +     matrix[2][0] || matrix[2][1])
> +
> +// fixed point to double
> +#define CONV_FP(x, sh) ((double) (x)) / (1 << sh)
> +
> +// double to fixed point
> +#define CONV_DB(x, sh) (int32_t) ((x) * (1 << sh))
> +
>  static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>  {
> -    int i;
> +    int i, j, e;
>      int width;
>      int height;
>      int display_matrix[3][3];
> @@ -3855,13 +3876,7 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>  
>      // save the matrix and add rotate metadata when it is not the default
>      // identity
> -    if (display_matrix[0][0] != (1 << 16) ||
> -        display_matrix[1][1] != (1 << 16) ||
> -        display_matrix[2][2] != (1 << 30) ||
> -        display_matrix[0][1] || display_matrix[0][2] ||
> -        display_matrix[1][0] || display_matrix[1][2] ||
> -        display_matrix[2][0] || display_matrix[2][1]) {
> -        int i, j;
> +    if (IS_MATRIX_FULL(display_matrix)) {
>          double rotate;
>  
>          av_freep(&sc->display_matrix);
> @@ -3884,13 +3899,41 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>          }
>      }
>  
> +    // if movie display matrix is not identity, and if this is a video track
> +    if (IS_MATRIX_FULL(c->movie_display_matrix) && width && height) {
> +        // if trak display matrix was identity, just copy the movie one
> +        if (!sc->display_matrix) {
> +            sc->display_matrix = av_malloc(sizeof(int32_t) * 9);
> +            if (!sc->display_matrix)
> +                return AVERROR(ENOMEM);
> +
> +            for (i = 0; i < 3; i++)
> +                for (j = 0; j < 3; j++)
> +                    sc->display_matrix[i * 3 + j] = c->movie_display_matrix[i][j];
> +        } else { // otherwise multiply the two and store the result
> +            double val = 0;
> +            for (i = 0; i < 3; i++) {
> +                for (j = 0; j < 3; j++) {
> +                    int sh = j == 2 ? 30 : 16;
> +                    for (e = 0; e < 3; e++) {
> +                        val += CONV_FP(display_matrix[i][e], sh) *
> +                               CONV_FP(c->movie_display_matrix[e][j], sh);
> +                    }
> +                    sc->display_matrix[i * 3 + j] = CONV_DB(val, sh);
> +                    val = 0;

the matrixes are 32bit the product of their entries would thus fit in
64bit and double should not be needed, int64_t should work and
avoid rounding errors

ive also uploaded your sample to fatesamples, please add a fate test
so this doesnt break in the future
(mov/moviedispmat.mp4)

thx

[...]
Carl Eugen Hoyos Oct. 7, 2016, 1:56 p.m. UTC | #7
2016-10-06 22:40 GMT+02:00 Vittorio Giovara <vittorio.giovara@gmail.com>:
>>> https://www.dropbox.com/s/w2uu37o11rvoz1q/moviedispmat.mp4?dl=1

>> what is the intended / correct SAR / DAR for this sample ?
>
> without the patch     Stream #0:0(und): Video: h264 (High) (avc1 /
> 0x31637661), yuv420p, 540x576 [SAR 1:1 DAR 15:16], 102 kb/s, 25 fps,
> 25 tbr, 12800 tbn, 50 tbc (default)
>
> with the patch     Stream #0:0(und): Video: h264 (High) (avc1 /
> 0x31637661), yuv420p, 540x576 [SAR 1:1 DAR 15:16], 102 kb/s, SAR
> 93207:65536 DAR 1016804:762601, 25 fps, 25 tbr, 12800 tbn, 50 tbc
> (default)
>
> This last one is the intended and correct one.

Which application (except patched FFmpeg) allows to reproduce this?

Carl Eugen
diff mbox

Patch

diff --git a/libavformat/isom.h b/libavformat/isom.h
index 2246fed..2aeb8fa 100644
--- a/libavformat/isom.h
+++ b/libavformat/isom.h
@@ -238,6 +238,8 @@  typedef struct MOVContext {
     uint8_t *decryption_key;
     int decryption_key_len;
     int enable_drefs;
+
+    int32_t movie_display_matrix[3][3]; ///< display matrix from mvhd
 } MOVContext;
 
 int ff_mp4_read_descr_len(AVIOContext *pb);
diff --git a/libavformat/mov.c b/libavformat/mov.c
index a15c8d1..26b332c 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -1211,6 +1211,7 @@  static int mov_read_mdhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
 static int mov_read_mvhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
+    int i;
     int64_t creation_time;
     int version = avio_r8(pb); /* version */
     avio_rb24(pb); /* flags */
@@ -1238,7 +1239,12 @@  static int mov_read_mvhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
     avio_skip(pb, 10); /* reserved */
 
-    avio_skip(pb, 36); /* display matrix */
+    /* movie display matrix, store it in main context and use it later on */
+    for (i = 0; i < 3; i++) {
+        c->movie_display_matrix[i][0] = avio_rb32(pb); // 16.16 fixed point
+        c->movie_display_matrix[i][1] = avio_rb32(pb); // 16.16 fixed point
+        c->movie_display_matrix[i][2] = avio_rb32(pb); //  2.30 fixed point
+    }
 
     avio_rb32(pb); /* preview time */
     avio_rb32(pb); /* preview duration */
@@ -3798,9 +3804,24 @@  static int mov_read_meta(MOVContext *c, AVIOContext *pb, MOVAtom atom)
     return 0;
 }
 
+// return 0 when matrix is identity, 1 otherwise
+#define IS_MATRIX_FULL(matrix)       \
+    (matrix[0][0] != (1 << 16) ||    \
+     matrix[1][1] != (1 << 16) ||    \
+     matrix[2][2] != (1 << 30) ||    \
+     matrix[0][1] || matrix[0][2] || \
+     matrix[1][0] || matrix[1][2] || \
+     matrix[2][0] || matrix[2][1])
+
+// fixed point to double
+#define CONV_FP(x, sh) ((double) (x)) / (1 << sh)
+
+// double to fixed point
+#define CONV_DB(x, sh) (int32_t) ((x) * (1 << sh))
+
 static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 {
-    int i;
+    int i, j, e;
     int width;
     int height;
     int display_matrix[3][3];
@@ -3855,13 +3876,7 @@  static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
 
     // save the matrix and add rotate metadata when it is not the default
     // identity
-    if (display_matrix[0][0] != (1 << 16) ||
-        display_matrix[1][1] != (1 << 16) ||
-        display_matrix[2][2] != (1 << 30) ||
-        display_matrix[0][1] || display_matrix[0][2] ||
-        display_matrix[1][0] || display_matrix[1][2] ||
-        display_matrix[2][0] || display_matrix[2][1]) {
-        int i, j;
+    if (IS_MATRIX_FULL(display_matrix)) {
         double rotate;
 
         av_freep(&sc->display_matrix);
@@ -3884,13 +3899,41 @@  static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         }
     }
 
+    // if movie display matrix is not identity, and if this is a video track
+    if (IS_MATRIX_FULL(c->movie_display_matrix) && width && height) {
+        // if trak display matrix was identity, just copy the movie one
+        if (!sc->display_matrix) {
+            sc->display_matrix = av_malloc(sizeof(int32_t) * 9);
+            if (!sc->display_matrix)
+                return AVERROR(ENOMEM);
+
+            for (i = 0; i < 3; i++)
+                for (j = 0; j < 3; j++)
+                    sc->display_matrix[i * 3 + j] = c->movie_display_matrix[i][j];
+        } else { // otherwise multiply the two and store the result
+            double val = 0;
+            for (i = 0; i < 3; i++) {
+                for (j = 0; j < 3; j++) {
+                    int sh = j == 2 ? 30 : 16;
+                    for (e = 0; e < 3; e++) {
+                        val += CONV_FP(display_matrix[i][e], sh) *
+                               CONV_FP(c->movie_display_matrix[e][j], sh);
+                    }
+                    sc->display_matrix[i * 3 + j] = CONV_DB(val, sh);
+                    val = 0;
+                }
+            }
+        }
+    }
+
     // transform the display width/height according to the matrix
     // to keep the same scale, use [width height 1<<16]
     if (width && height && sc->display_matrix) {
         double disp_transform[2];
 
         for (i = 0; i < 2; i++)
-            disp_transform[i] = hypot(display_matrix[i][0], display_matrix[i][1]);
+            disp_transform[i] = hypot(sc->display_matrix[i * 3],
+                                      sc->display_matrix[i * 3 + 1]);
 
         if (disp_transform[0] > 0       && disp_transform[1] > 0 &&
             disp_transform[0] < (1<<24) && disp_transform[1] < (1<<24) &&