Message ID | 20170510154920.GA1581372@phare.normalesup.org |
---|---|
State | Superseded |
Headers | show |
On Wed, May 10, 2017 at 05:49:20PM +0200, Nicolas George wrote: > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit : > > breaks (green stuff on edges) > > ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vframes 5 -vf uspp=4:8 -qscale 1 -y file.avi > > It is a bug in vf_uspp: > > diff --git a/libavfilter/vf_uspp.c b/libavfilter/vf_uspp.c > index ef493b860f..6e378253a0 100644 > --- a/libavfilter/vf_uspp.c > +++ b/libavfilter/vf_uspp.c > @@ -250,6 +250,9 @@ static void filter(USPPContext *p, uint8_t *dst[3], uint8_t *src[3], > p->frame->data[2] = p->src[2] + x1c + y1c * p->frame->linesize[2]; > p->frame->format = p->avctx_enc[i]->pix_fmt; > > + av_log(0, 16, "encode %d x %d with %d x %d\n", > + p->frame->width, p->frame->height, > + p->avctx_enc[i]->width, p->avctx_enc[i]->height); > ret = avcodec_encode_video2(p->avctx_enc[i], &pkt, p->frame, &got_pkt_ptr); > if (ret < 0) { > av_log(p->avctx_enc[i], AV_LOG_ERROR, "Encoding failed\n"); > > gives: > > Stream #0:1 -> #0:0 (mpeg2video (native) -> mpeg4 (native)) > encode 720 x 576 with 736 x 592 > zsh: segmentation fault ./ffmpeg_g -nostdin -i ~/tmp/samples/matrixbench_mpeg2.mpg -an -vframes 5 -vf > > vf_uspp is giving a 720×576 frame to a 736×592 encoder. > > I do not know how to fix that. fixed, and now your patchset seems to cause a 1% slowdown of the example above real 0m2.182s user 0m2.129s sys 0m0.064s real 0m2.182s user 0m2.131s sys 0m0.064s real 0m2.182s user 0m2.124s sys 0m0.072s vs: real 0m2.204s user 0m2.158s sys 0m0.056s real 0m2.203s user 0m2.151s sys 0m0.063s real 0m2.204s user 0m2.167s sys 0m0.048s [...]
On 5/10/17, Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, May 10, 2017 at 05:49:20PM +0200, Nicolas George wrote: >> Le primidi 21 floreal, an CCXXV, Michael Niedermayer a ecrit : >> > breaks (green stuff on edges) >> > ./ffmpeg -i ~/videos/matrixbench_mpeg2.mpg -an -vframes 5 -vf uspp=4:8 >> > -qscale 1 -y file.avi >> >> It is a bug in vf_uspp: >> >> diff --git a/libavfilter/vf_uspp.c b/libavfilter/vf_uspp.c >> index ef493b860f..6e378253a0 100644 >> --- a/libavfilter/vf_uspp.c >> +++ b/libavfilter/vf_uspp.c >> @@ -250,6 +250,9 @@ static void filter(USPPContext *p, uint8_t *dst[3], >> uint8_t *src[3], >> p->frame->data[2] = p->src[2] + x1c + y1c * >> p->frame->linesize[2]; >> p->frame->format = p->avctx_enc[i]->pix_fmt; >> >> + av_log(0, 16, "encode %d x %d with %d x %d\n", >> + p->frame->width, p->frame->height, >> + p->avctx_enc[i]->width, p->avctx_enc[i]->height); >> ret = avcodec_encode_video2(p->avctx_enc[i], &pkt, p->frame, >> &got_pkt_ptr); >> if (ret < 0) { >> av_log(p->avctx_enc[i], AV_LOG_ERROR, "Encoding failed\n"); >> >> gives: >> >> Stream #0:1 -> #0:0 (mpeg2video (native) -> mpeg4 (native)) >> encode 720 x 576 with 736 x 592 >> zsh: segmentation fault ./ffmpeg_g -nostdin -i >> ~/tmp/samples/matrixbench_mpeg2.mpg -an -vframes 5 -vf >> >> vf_uspp is giving a 720 *576 frame to a 736 *592 encoder. >> >> I do not know how to fix that. > > fixed, and now your patchset seems to cause a 1% slowdown of the > example above > > real 0m2.182s > user 0m2.129s > sys 0m0.064s > > real 0m2.182s > user 0m2.131s > sys 0m0.064s > > real 0m2.182s > user 0m2.124s > sys 0m0.072s > > vs: > real 0m2.204s > user 0m2.158s > sys 0m0.056s > > real 0m2.203s > user 0m2.151s > sys 0m0.063s > > real 0m2.204s > user 0m2.167s > sys 0m0.048s Of course it does. Your example exhibits unaligned behaviour.
Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit : > fixed, and now your patchset seems to cause a 1% slowdown of the > example above Thanks. The slowdown is to be expected, since this filter used non-aligned frames which are now realigned. I guess you can get back the previous speed by setting "avctx->alignment = 0" in snow_encoder.init(). The default in this patchset is 5 to fix all the alignment problems, at the cost of some slowdown in corner cases such as this one. I am not qualified to decide the best way of setting an optimal value to get speed without crashes. My opinion would be: - have avctx->alignment default to 0; - have all codecs with explicit assembly set avctx->alignment to the required value; - have all codecs that use generic DSP code init it through a function that sets the alignment accordingly. Regards,
On Wed, May 10, 2017 at 10:28:22PM +0200, Nicolas George wrote: > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit : > > fixed, and now your patchset seems to cause a 1% slowdown of the > > example above > > Thanks. The slowdown is to be expected, since this filter used > non-aligned frames which are now realigned. I guess you can get back the > previous speed by setting "avctx->alignment = 0" in snow_encoder.init(). > > The default in this patchset is 5 to fix all the alignment problems, at > the cost of some slowdown in corner cases such as this one. I am not > qualified to decide the best way of setting an optimal value to get > speed without crashes. My opinion would be: > > - have avctx->alignment default to 0; > > - have all codecs with explicit assembly set avctx->alignment to the > required value; > > - have all codecs that use generic DSP code init it through a function > that sets the alignment accordingly. hmm run fate with intentionally misaligned frames, and make a list of all codecs that dont crash and work fine. These should not use realignment as they dont need it and would be slowed down. [...]
On Wed, May 10, 2017 at 10:28 PM, Nicolas George <george@nsup.org> wrote: > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit : >> fixed, and now your patchset seems to cause a 1% slowdown of the >> example above > > Thanks. The slowdown is to be expected, since this filter used > non-aligned frames which are now realigned. I guess you can get back the > previous speed by setting "avctx->alignment = 0" in snow_encoder.init(). > > The default in this patchset is 5 to fix all the alignment problems, at > the cost of some slowdown in corner cases such as this one. I am not > qualified to decide the best way of setting an optimal value to get > speed without crashes. My opinion would be: > > - have avctx->alignment default to 0; > > - have all codecs with explicit assembly set avctx->alignment to the > required value; > > - have all codecs that use generic DSP code init it through a function > that sets the alignment accordingly. > If you want to fix these problems, you should rather white-list codecs which are known to be safe, instead of the other way around. - Hendrik
Hi, On Wed, May 10, 2017 at 5:02 PM, Michael Niedermayer <michael@niedermayer.cc > wrote: > On Wed, May 10, 2017 at 10:28:22PM +0200, Nicolas George wrote: > > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit : > > > fixed, and now your patchset seems to cause a 1% slowdown of the > > > example above > > > > Thanks. The slowdown is to be expected, since this filter used > > non-aligned frames which are now realigned. I guess you can get back the > > previous speed by setting "avctx->alignment = 0" in snow_encoder.init(). > > > > > The default in this patchset is 5 to fix all the alignment problems, at > > the cost of some slowdown in corner cases such as this one. I am not > > qualified to decide the best way of setting an optimal value to get > > speed without crashes. My opinion would be: > > > > - have avctx->alignment default to 0; > > > > - have all codecs with explicit assembly set avctx->alignment to the > > required value; > > > > - have all codecs that use generic DSP code init it through a function > > that sets the alignment accordingly. > > hmm > run fate with intentionally misaligned frames, and make a list of > all codecs that dont crash and work fine. That's generally arch-specific. Codecs using AVX2 need 32byte, SSE2/Neon only need 16, MMX need none (although 8 would probably be slightly faster). Ronald
On Wed, May 10, 2017 at 06:46:41PM -0400, Ronald S. Bultje wrote: > Hi, > > On Wed, May 10, 2017 at 5:02 PM, Michael Niedermayer <michael@niedermayer.cc > > wrote: > > > On Wed, May 10, 2017 at 10:28:22PM +0200, Nicolas George wrote: > > > Le primidi 21 floréal, an CCXXV, Michael Niedermayer a écrit : > > > > fixed, and now your patchset seems to cause a 1% slowdown of the > > > > example above > > > > > > Thanks. The slowdown is to be expected, since this filter used > > > non-aligned frames which are now realigned. I guess you can get back the > > > previous speed by setting "avctx->alignment = 0" in snow_encoder.init(). > > > > > > > > The default in this patchset is 5 to fix all the alignment problems, at > > > the cost of some slowdown in corner cases such as this one. I am not > > > qualified to decide the best way of setting an optimal value to get > > > speed without crashes. My opinion would be: > > > > > > - have avctx->alignment default to 0; > > > > > > - have all codecs with explicit assembly set avctx->alignment to the > > > required value; > > > > > > - have all codecs that use generic DSP code init it through a function > > > that sets the alignment accordingly. > > > > hmm > > run fate with intentionally misaligned frames, and make a list of > > all codecs that dont crash and work fine. > > > That's generally arch-specific. Codecs using AVX2 need 32byte, SSE2/Neon > only need 16, MMX need none (although 8 would probably be slightly faster). yes If a test is written to check codecs like described above. It should be easy to run it on multiple archs to find out which codecs never require alignment so as to find an initial default for each. [...]
diff --git a/libavfilter/vf_uspp.c b/libavfilter/vf_uspp.c index ef493b860f..6e378253a0 100644 --- a/libavfilter/vf_uspp.c +++ b/libavfilter/vf_uspp.c @@ -250,6 +250,9 @@ static void filter(USPPContext *p, uint8_t *dst[3], uint8_t *src[3], p->frame->data[2] = p->src[2] + x1c + y1c * p->frame->linesize[2]; p->frame->format = p->avctx_enc[i]->pix_fmt; + av_log(0, 16, "encode %d x %d with %d x %d\n", + p->frame->width, p->frame->height, + p->avctx_enc[i]->width, p->avctx_enc[i]->height); ret = avcodec_encode_video2(p->avctx_enc[i], &pkt, p->frame, &got_pkt_ptr); if (ret < 0) { av_log(p->avctx_enc[i], AV_LOG_ERROR, "Encoding failed\n");