diff mbox

[FFmpeg-devel,2/3] lavc: add a framework to fix alignment problems.

Message ID 20170510154920.GA1581372@phare.normalesup.org
State Superseded
Headers show

Commit Message

Nicolas George May 10, 2017, 3:49 p.m. UTC
Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
> breaks (green stuff on edges)
>  ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vframes 5  -vf uspp=4:8 -qscale 1 -y file.avi

It is a bug in vf_uspp: 


gives:

  Stream #0:1 -> #0:0 (mpeg2video (native) -> mpeg4 (native))
encode 720 x 576 with 736 x 592
zsh: segmentation fault  ./ffmpeg_g -nostdin -i ~/tmp/samples/matrixbench_mpeg2.mpg -an -vframes 5 -vf

vf_uspp is giving a 720×576 frame to a 736×592 encoder.

I do not know how to fix that.

Regards,

Comments

Michael Niedermayer May 10, 2017, 8:16 p.m. UTC | #1
On Wed, May 10, 2017 at 05:49:20PM +0200, Nicolas George wrote:
> Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
> > breaks (green stuff on edges)
> >  ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vframes 5  -vf uspp=4:8 -qscale 1 -y file.avi
> 
> It is a bug in vf_uspp: 
> 
> diff --git a/libavfilter/vf_uspp.c b/libavfilter/vf_uspp.c
> index ef493b860f..6e378253a0 100644
> --- a/libavfilter/vf_uspp.c
> +++ b/libavfilter/vf_uspp.c
> @@ -250,6 +250,9 @@ static void filter(USPPContext *p, uint8_t *dst[3], uint8_t *src[3],
>          p->frame->data[2] = p->src[2] + x1c  + y1c  * p->frame->linesize[2];
>          p->frame->format  = p->avctx_enc[i]->pix_fmt;
>  
> +        av_log(0, 16, "encode %d x %d with %d x %d\n",
> +               p->frame->width, p->frame->height,
> +               p->avctx_enc[i]->width, p->avctx_enc[i]->height);
>          ret = avcodec_encode_video2(p->avctx_enc[i], &pkt, p->frame, &got_pkt_ptr);
>          if (ret < 0) {
>              av_log(p->avctx_enc[i], AV_LOG_ERROR, "Encoding failed\n");
> 
> gives:
> 
>   Stream #0:1 -> #0:0 (mpeg2video (native) -> mpeg4 (native))
> encode 720 x 576 with 736 x 592
> zsh: segmentation fault  ./ffmpeg_g -nostdin -i ~/tmp/samples/matrixbench_mpeg2.mpg -an -vframes 5 -vf
> 
> vf_uspp is giving a 720×576 frame to a 736×592 encoder.
> 
> I do not know how to fix that.

fixed, and now your patchset seems to cause a 1% slowdown of the
example above

real    0m2.182s
user    0m2.129s
sys     0m0.064s

real    0m2.182s
user    0m2.131s
sys     0m0.064s

real    0m2.182s
user    0m2.124s
sys     0m0.072s

vs:
real    0m2.204s
user    0m2.158s
sys     0m0.056s

real    0m2.203s
user    0m2.151s
sys     0m0.063s

real    0m2.204s
user    0m2.167s
sys     0m0.048s


[...]
Paul B Mahol May 10, 2017, 8:25 p.m. UTC | #2
On 5/10/17, Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Wed, May 10, 2017 at 05:49:20PM +0200, Nicolas George wrote:
>> Le primidi 21 floreal, an CCXXV, Michael Niedermayer a ecrit :
>> > breaks (green stuff on edges)
>> >  ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vframes 5  -vf uspp=4:8
>> > -qscale 1 -y file.avi
>>
>> It is a bug in vf_uspp:
>>
>> diff --git a/libavfilter/vf_uspp.c b/libavfilter/vf_uspp.c
>> index ef493b860f..6e378253a0 100644
>> --- a/libavfilter/vf_uspp.c
>> +++ b/libavfilter/vf_uspp.c
>> @@ -250,6 +250,9 @@ static void filter(USPPContext *p, uint8_t *dst[3],
>> uint8_t *src[3],
>>          p->frame->data[2] = p->src[2] + x1c  + y1c  *
>> p->frame->linesize[2];
>>          p->frame->format  = p->avctx_enc[i]->pix_fmt;
>>
>> +        av_log(0, 16, "encode %d x %d with %d x %d\n",
>> +               p->frame->width, p->frame->height,
>> +               p->avctx_enc[i]->width, p->avctx_enc[i]->height);
>>          ret = avcodec_encode_video2(p->avctx_enc[i], &pkt, p->frame,
>> &got_pkt_ptr);
>>          if (ret < 0) {
>>              av_log(p->avctx_enc[i], AV_LOG_ERROR, "Encoding failed\n");
>>
>> gives:
>>
>>   Stream #0:1 -> #0:0 (mpeg2video (native) -> mpeg4 (native))
>> encode 720 x 576 with 736 x 592
>> zsh: segmentation fault  ./ffmpeg_g -nostdin -i
>> ~/tmp/samples/matrixbench_mpeg2.mpg -an -vframes 5 -vf
>>
>> vf_uspp is giving a 720 *576 frame to a 736 *592 encoder.
>>
>> I do not know how to fix that.
>
> fixed, and now your patchset seems to cause a 1% slowdown of the
> example above
>
> real    0m2.182s
> user    0m2.129s
> sys     0m0.064s
>
> real    0m2.182s
> user    0m2.131s
> sys     0m0.064s
>
> real    0m2.182s
> user    0m2.124s
> sys     0m0.072s
>
> vs:
> real    0m2.204s
> user    0m2.158s
> sys     0m0.056s
>
> real    0m2.203s
> user    0m2.151s
> sys     0m0.063s
>
> real    0m2.204s
> user    0m2.167s
> sys     0m0.048s

Of course it does. Your example exhibits unaligned behaviour.
Nicolas George May 10, 2017, 8:28 p.m. UTC | #3
Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
> fixed, and now your patchset seems to cause a 1% slowdown of the
> example above

Thanks. The slowdown is to be expected, since this filter used
non-aligned frames which are now realigned. I guess you can get back the
previous speed by setting "avctx->alignment = 0" in snow_encoder.init().

The default in this patchset is 5 to fix all the alignment problems, at
the cost of some slowdown in corner cases such as this one. I am not
qualified to decide the best way of setting an optimal value to get
speed without crashes. My opinion would be:

- have avctx->alignment default to 0;

- have all codecs with explicit assembly set avctx->alignment to the
  required value;

- have all codecs that use generic DSP code init it through a function
  that sets the alignment accordingly.

Regards,
Michael Niedermayer May 10, 2017, 9:02 p.m. UTC | #4
On Wed, May 10, 2017 at 10:28:22PM +0200, Nicolas George wrote:
> Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
> > fixed, and now your patchset seems to cause a 1% slowdown of the
> > example above
> 
> Thanks. The slowdown is to be expected, since this filter used
> non-aligned frames which are now realigned. I guess you can get back the
> previous speed by setting "avctx->alignment = 0" in snow_encoder.init().
> 

> The default in this patchset is 5 to fix all the alignment problems, at
> the cost of some slowdown in corner cases such as this one. I am not
> qualified to decide the best way of setting an optimal value to get
> speed without crashes. My opinion would be:
> 
> - have avctx->alignment default to 0;
> 
> - have all codecs with explicit assembly set avctx->alignment to the
>   required value;
> 
> - have all codecs that use generic DSP code init it through a function
>   that sets the alignment accordingly.

hmm
run fate with intentionally misaligned frames, and make a list of
all codecs that dont crash and work fine.
These should not use realignment as they dont need it and would be
slowed down.

[...]
Hendrik Leppkes May 10, 2017, 9:06 p.m. UTC | #5
On Wed, May 10, 2017 at 10:28 PM, Nicolas George <george@nsup.org> wrote:
> Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
>> fixed, and now your patchset seems to cause a 1% slowdown of the
>> example above
>
> Thanks. The slowdown is to be expected, since this filter used
> non-aligned frames which are now realigned. I guess you can get back the
> previous speed by setting "avctx->alignment = 0" in snow_encoder.init().
>
> The default in this patchset is 5 to fix all the alignment problems, at
> the cost of some slowdown in corner cases such as this one. I am not
> qualified to decide the best way of setting an optimal value to get
> speed without crashes. My opinion would be:
>
> - have avctx->alignment default to 0;
>
> - have all codecs with explicit assembly set avctx->alignment to the
>   required value;
>
> - have all codecs that use generic DSP code init it through a function
>   that sets the alignment accordingly.
>

If you want to fix these problems, you should rather white-list codecs
which are known to be safe, instead of the other way around.

- Hendrik
Ronald S. Bultje May 10, 2017, 10:46 p.m. UTC | #6
Hi,

On Wed, May 10, 2017 at 5:02 PM, Michael Niedermayer <michael@niedermayer.cc
> wrote:

> On Wed, May 10, 2017 at 10:28:22PM +0200, Nicolas George wrote:
> > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
> > > fixed, and now your patchset seems to cause a 1% slowdown of the
> > > example above
> >
> > Thanks. The slowdown is to be expected, since this filter used
> > non-aligned frames which are now realigned. I guess you can get back the
> > previous speed by setting "avctx->alignment = 0" in snow_encoder.init().
> >
>
> > The default in this patchset is 5 to fix all the alignment problems, at
> > the cost of some slowdown in corner cases such as this one. I am not
> > qualified to decide the best way of setting an optimal value to get
> > speed without crashes. My opinion would be:
> >
> > - have avctx->alignment default to 0;
> >
> > - have all codecs with explicit assembly set avctx->alignment to the
> >   required value;
> >
> > - have all codecs that use generic DSP code init it through a function
> >   that sets the alignment accordingly.
>
> hmm
> run fate with intentionally misaligned frames, and make a list of
> all codecs that dont crash and work fine.


That's generally arch-specific. Codecs using AVX2 need 32byte, SSE2/Neon
only need 16, MMX need none (although 8 would probably be slightly faster).

Ronald
Michael Niedermayer May 11, 2017, 12:51 a.m. UTC | #7
On Wed, May 10, 2017 at 06:46:41PM -0400, Ronald S. Bultje wrote:
> Hi,
> 
> On Wed, May 10, 2017 at 5:02 PM, Michael Niedermayer <michael@niedermayer.cc
> > wrote:
> 
> > On Wed, May 10, 2017 at 10:28:22PM +0200, Nicolas George wrote:
> > > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit :
> > > > fixed, and now your patchset seems to cause a 1% slowdown of the
> > > > example above
> > >
> > > Thanks. The slowdown is to be expected, since this filter used
> > > non-aligned frames which are now realigned. I guess you can get back the
> > > previous speed by setting "avctx->alignment = 0" in snow_encoder.init().
> > >
> >
> > > The default in this patchset is 5 to fix all the alignment problems, at
> > > the cost of some slowdown in corner cases such as this one. I am not
> > > qualified to decide the best way of setting an optimal value to get
> > > speed without crashes. My opinion would be:
> > >
> > > - have avctx->alignment default to 0;
> > >
> > > - have all codecs with explicit assembly set avctx->alignment to the
> > >   required value;
> > >
> > > - have all codecs that use generic DSP code init it through a function
> > >   that sets the alignment accordingly.
> >
> > hmm
> > run fate with intentionally misaligned frames, and make a list of
> > all codecs that dont crash and work fine.
> 
> 
> That's generally arch-specific. Codecs using AVX2 need 32byte, SSE2/Neon
> only need 16, MMX need none (although 8 would probably be slightly faster).

yes

If a test is written to check codecs like described above. It should
be easy to run it on multiple archs to find out which codecs never
require alignment so as to find an initial default for each.


[...]
diff mbox

Patch

diff --git a/libavfilter/vf_uspp.c b/libavfilter/vf_uspp.c
index ef493b860f..6e378253a0 100644
--- a/libavfilter/vf_uspp.c
+++ b/libavfilter/vf_uspp.c
@@ -250,6 +250,9 @@  static void filter(USPPContext *p, uint8_t *dst[3], uint8_t *src[3],
         p->frame->data[2] = p->src[2] + x1c  + y1c  * p->frame->linesize[2];
         p->frame->format  = p->avctx_enc[i]->pix_fmt;
 
+        av_log(0, 16, "encode %d x %d with %d x %d\n",
+               p->frame->width, p->frame->height,
+               p->avctx_enc[i]->width, p->avctx_enc[i]->height);
         ret = avcodec_encode_video2(p->avctx_enc[i], &pkt, p->frame, &got_pkt_ptr);
         if (ret < 0) {
             av_log(p->avctx_enc[i], AV_LOG_ERROR, "Encoding failed\n");