Message ID | 20170506092006.22397-2-george@nsup.org |
---|---|
State | Superseded |
Headers | show |
On Sat, May 6, 2017 at 11:20 AM, Nicolas George <george@nsup.org> wrote: > A lot of codecs require aligned frame data, but do not document it. > It can result in crashes with perfectly valid uses of the API. > > TODO Implementation missing. > > Design rationale: > > - requiring frame data to be always aligned would be wasteful; > > - alignment requirements will evolve > (the 16->32 bump is still recent); > > - requiring applications to worry about alignment is not convenient. > > Signed-off-by: Nicolas George <george@nsup.org> > --- > libavcodec/avcodec.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h > index 35df4f6ced..51a7e2db21 100644 > --- a/libavcodec/avcodec.h > +++ b/libavcodec/avcodec.h > @@ -3644,6 +3644,16 @@ typedef struct AVCodecContext { > * AVCodecContext.get_format callback) > */ > int hwaccel_flags; > + > + /** > + * Minimum alignment of frame data required by the codec. > + * All frame data pointers must have the alignment lower bits cleared, > + * i.e. be a multiple of 1<<alignment. > + * - encoding: set by the encoder and used by the framework > + * - decoding: unused > + */ > + unsigned alignment; > + > } AVCodecContext; > I don't think this is very practical. We have a bunch of generic DSPs and if a codec uses those it would need to be constantly updated if the requirements of those DSPs change. Today it may only use SSE and be happy with 16-byte alignment, but what if someone adds an AVX2 implementation, or AVX512? Going through all codecs that use some DSP and somehow set their alignment value seems rather unfortunate - and on top of that, it'll add arch-specific conditions, which leads to a lot of unnecessary changes to a bunch of encoders (and filters) over time. Or the alternative is that its initialized to some default value based on the current CPU and only overriden in a minority of cases, which still makes me wonder if the effort is worth it. Right now the general consensus is that frames (and stride) should be aligned to the maximum your CPU might require, ie. 32 byte on a AVX2 CPU, even if this is not documented. I think it would be far better to officially document this. Making applications aware of the alignment (which they likely are already anyway, otherwise they would crash, even if they had to learn the hard way) can make the entire process more efficient, since re-aligning can be extremely expensive (ie. high res video) and should be avoided at all cost if possible (which it not always is when you crop, for example). There is no particular reason we can't make the framework check the alignment and re-align if required to the maximum CPU value (with a loud warning, perhaps), but I don't think a per-codec or per-filter alignment value is very practical. Related, Libav introduced an API called av_cpu_max_align to query the alignment requirements for the CPU, in case someone is not using av_malloc to allocate their buffers, but their own allocation mechanisms. - Hendrik
Hi, On Sat, May 6, 2017 at 5:20 AM, Nicolas George <george@nsup.org> wrote: > + /** > + * Minimum alignment of frame data required by the codec. > + * All frame data pointers must have the alignment lower bits cleared, > + * i.e. be a multiple of 1<<alignment. > + * - encoding: set by the encoder and used by the framework > + * - decoding: unused > + */ > + unsigned alignment; > + > } AVCodecContext; I agree it's likely that one codec (e.g. h264) would need 32-byte alignment on a particular system (e.g. x86/haswell), whereas another codec on the same system (e.g. wmavoice) might not. However, I find it unlikely that one codec *instance* would need different alignment from another codec *instance* (for the same codec). The above allows seems specifically designed for that. Do you think adding alignment to AVCodec would make more sense (and possibly set it at runtime using static_init)? Ronald
On Sat, 6 May 2017, Ronald S. Bultje wrote: > Hi, > > On Sat, May 6, 2017 at 5:20 AM, Nicolas George <george@nsup.org> wrote: > >> + /** >> + * Minimum alignment of frame data required by the codec. >> + * All frame data pointers must have the alignment lower bits cleared, >> + * i.e. be a multiple of 1<<alignment. >> + * - encoding: set by the encoder and used by the framework >> + * - decoding: unused >> + */ >> + unsigned alignment; >> + >> } AVCodecContext; > > > I agree it's likely that one codec (e.g. h264) would need 32-byte alignment > on a particular system (e.g. x86/haswell), whereas another codec on the > same system (e.g. wmavoice) might not. > > However, I find it unlikely that one codec *instance* would need different > alignment from another codec *instance* (for the same codec). I can imagine a case where alignment requirements are dependant on pixel format. Not the best example, because it is not a "codec", and it is using bitpacked formats, but the Decklink output device would require 128 byte or even 256 byte alignment for 10/12 bit, according to the SDK. Regards, Marton
Le septidi 17 floréal, an CCXXV, Ronald S. Bultje a écrit : > I agree it's likely that one codec (e.g. h264) would need 32-byte alignment > on a particular system (e.g. x86/haswell), whereas another codec on the > same system (e.g. wmavoice) might not. > > However, I find it unlikely that one codec *instance* would need different > alignment from another codec *instance* (for the same codec). The above > allows seems specifically designed for that. Do you think adding alignment > to AVCodec would make more sense (and possibly set it at runtime using > static_init)? I think Marton's mail gives a reasonable example. As for me, I do not know exactly what is needed, this is one of the reason I did not want to draft the proposal myself. This way seemed more future-proof, but it can be discussed. Also, I thought our AVCodec structure were const, so putting the field there would have been incompatible with run-time CPU detection, but apparently I was wrong. Still, I do not think codecs are supposed to modify it. Regards,
Le septidi 17 floréal, an CCXXV, Hendrik Leppkes a écrit : > I don't think this is very practical. We have a bunch of generic DSPs > and if a codec uses those it would need to be constantly updated if > the requirements of those DSPs change. Today it may only use SSE and > be happy with 16-byte alignment, but what if someone adds an AVX2 > implementation, or AVX512? Well, exactly: pray tell me, with your design what happens when Intel adds AVX512? I will tell you what happens: applications will have been written for 32-octets alignment, and as soon as their libavcodec is updated, the crashes will start. > Going through all codecs that use some DSP and somehow set their > alignment value seems rather unfortunate - and on top of that, it'll > add arch-specific conditions, which leads to a lot of unnecessary > changes to a bunch of encoders (and filters) over time. The way I see it, the functions that are used to init the DSP should return the required alignment. That will require a little change since they do not take an AVCodecContext argument, but only trivial and mechanical changes, so even if hundreds of files were touched it is really not an issue. Of course, setting a high default value by default is a possible short-term solution, especially in the branches. > Right now the general consensus is that frames (and stride) should be > aligned to the maximum your CPU might require, ie. 32 byte on a AVX2 > CPU, even if this is not documented. It may have been the consensus for people who work on the vectorized implementations, but not the general consensus, especially for people who work on user-facing API. > I think it would be far better to officially document this. Making > applications aware of the alignment (which they likely are already > anyway, otherwise they would crash, even if they had to learn the hard > way) can make the entire process more efficient, since re-aligning can > be extremely expensive (ie. high res video) and should be avoided at > all cost if possible (which it not always is when you crop, for > example). Encouraging the applications to be aware of the issue for performances reasons is a good thing. Forcing them to do so when it is not convenient, on pain of crash, when it can be done for them just as easily, not so much. > There is no particular reason we can't make the framework check the > alignment and re-align if required to the maximum CPU value (with a > loud warning, perhaps), but I don't think a per-codec or per-filter > alignment value is very practical. If the framework does the realigning, doing it per-codec or per-filter is almost free. And someone really smart said that "since re-aligning can be extremely expensive" it "should be avoided at all cost if possible", doing the realigning only if the codecs needs it is really the right thing to do. Regards,
On Sat, May 6, 2017 at 8:34 PM, Nicolas George <george@nsup.org> wrote: > > Encouraging the applications to be aware of the issue for performances > reasons is a good thing. > > Forcing them to do so when it is not convenient, on pain of crash, when > it can be done for them just as easily, not so much. > >> There is no particular reason we can't make the framework check the >> alignment and re-align if required to the maximum CPU value (with a >> loud warning, perhaps), but I don't think a per-codec or per-filter >> alignment value is very practical. > > If the framework does the realigning, doing it per-codec or per-filter > is almost free. And someone really smart said that "since re-aligning > can be extremely expensive" it "should be avoided at all cost if > possible", doing the realigning only if the codecs needs it is really > the right thing to do. > One of my main concerns is that FATE will not actually be testing any of this (otherwise it would be crashing right now), and as such it can (and we all know, will) lead to subtle bugs that only occur with certain callers on certain platforms. This is usually not a very good situation to be in. Ronald suggested on IRC that the only parts suffering from this are the ASM DSP components, and we should just make checkasm tests for all of them and let checkasm test the alignment they claim to support, but alas this won't be done for months, if not years. There is potentially nothing wrong with supporting per-codec values in the framework, but setting them to something lower then the global alignment (ie. 32-byte) should only be done if we have sufficient automated testing to ensure this value is and remains correct. - Hendrik
Hi, On Sun, May 7, 2017 at 11:15 AM, Hendrik Leppkes <h.leppkes@gmail.com> wrote: > Ronald suggested on IRC that the only parts suffering from this are > the ASM DSP components, and we should just make checkasm tests for all > of them and let checkasm test the alignment they claim to support, but > alas this won't be done for months, if not years. This really isn't that complicated once we get the general design and goals explicitly documented. We test all kind of crazy stuff in checkasm, like xmm register clobbering. This is just another strange thing that can be done in generic code. We just need to decide what is OK so we know what to test. Ronald
On Sun, May 7, 2017 at 7:17 PM, Ronald S. Bultje <rsbultje@gmail.com> wrote: > Hi, > > On Sun, May 7, 2017 at 11:15 AM, Hendrik Leppkes <h.leppkes@gmail.com> > wrote: > >> Ronald suggested on IRC that the only parts suffering from this are >> the ASM DSP components, and we should just make checkasm tests for all >> of them and let checkasm test the alignment they claim to support, but >> alas this won't be done for months, if not years. > > > This really isn't that complicated once we get the general design and goals > explicitly documented. > > We test all kind of crazy stuff in checkasm, like xmm register clobbering. > This is just another strange thing that can be done in generic code. We > just need to decide what is OK so we know what to test. > I'm not worried about checkasm perfoming the test, but all DSP components actually having tests. - Hendrik
Hi, On Sun, May 7, 2017 at 2:55 PM, Hendrik Leppkes <h.leppkes@gmail.com> wrote: > On Sun, May 7, 2017 at 7:17 PM, Ronald S. Bultje <rsbultje@gmail.com> > wrote: > > Hi, > > > > On Sun, May 7, 2017 at 11:15 AM, Hendrik Leppkes <h.leppkes@gmail.com> > > wrote: > > > >> Ronald suggested on IRC that the only parts suffering from this are > >> the ASM DSP components, and we should just make checkasm tests for all > >> of them and let checkasm test the alignment they claim to support, but > >> alas this won't be done for months, if not years. > > > > > > This really isn't that complicated once we get the general design and > goals > > explicitly documented. > > > > We test all kind of crazy stuff in checkasm, like xmm register > clobbering. > > This is just another strange thing that can be done in generic code. We > > just need to decide what is OK so we know what to test. > > > > I'm not worried about checkasm perfoming the test, but all DSP > components actually having tests. Test coverage in general is a fair point, we can work on that... Ronald
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index 35df4f6ced..51a7e2db21 100644 --- a/libavcodec/avcodec.h +++ b/libavcodec/avcodec.h @@ -3644,6 +3644,16 @@ typedef struct AVCodecContext { * AVCodecContext.get_format callback) */ int hwaccel_flags; + + /** + * Minimum alignment of frame data required by the codec. + * All frame data pointers must have the alignment lower bits cleared, + * i.e. be a multiple of 1<<alignment. + * - encoding: set by the encoder and used by the framework + * - decoding: unused + */ + unsigned alignment; + } AVCodecContext; AVRational av_codec_get_pkt_timebase (const AVCodecContext *avctx);
A lot of codecs require aligned frame data, but do not document it. It can result in crashes with perfectly valid uses of the API. TODO Implementation missing. Design rationale: - requiring frame data to be always aligned would be wasteful; - alignment requirements will evolve (the 16->32 bump is still recent); - requiring applications to worry about alignment is not convenient. Signed-off-by: Nicolas George <george@nsup.org> --- libavcodec/avcodec.h | 10 ++++++++++ 1 file changed, 10 insertions(+)