Message ID | 20170211212145.GO1516@example.net |
---|---|
State | New |
Headers | show |
On Sat, Feb 11, 2017 at 10:25:03PM +0100, u-9iep@aetey.se wrote: > Hello, > > This is my best effort attempt to make the patch acceptable > by the upstream's criteria. > > Daniel, do you mind that I referred to your message in the commit? > I believe is is best to indicate numbers from a third party measurement. > > The code seems to be equvalent to the previous patch, > with about 20% less LOC. > > This hurts readability (my subjective impression) but on the positive side > the change makes the structure of the code more explicit. > > Attaching the patch. > > Now I have done what I can, have to leave. > Unless there are bugs there in the patch, my attempt to contribute ends > at this point. > > Thanks to everyone who cared to objectively discuss a specific case > of ffmpeg usage, the implications of techniques around VQ and whether/why > some non-traditional approaches can make sense. > > Good luck to the ffmpeg project, it is very useful and valuable. > > Best regards, > Rune > cinepak.c | 844 ++++++++++++++++++++++++++++++++++++++++++++++++++------------ > 1 file changed, 692 insertions(+), 152 deletions(-) > cc2ab45b7633651bc0ff80ca57c78ef4fc649d3c 0001-Cinepak-speed-up-decoding-several-fold-depending-on-.patch > From 0c9badec5d144b995c0bb52c7a80939b672be3f5 Mon Sep 17 00:00:00 2001 > From: Rl <addr-see-the-website@aetey.se> > Date: Sat, 11 Feb 2017 20:28:54 +0100 > Subject: [PATCH] Cinepak: speed up decoding several-fold, depending on the > scenario, by supporting multiple output pixel formats. > > Decoding to rgb24 and pal8 is optimized. > > Added rgb32, rgb565, yuv420p, each with faster decoding than to rgb24. > > The most noticeable gain is achieved by the created possibility > to skip format conversions, for example when decoding to rgb565 > ---- > Using matrixbench_mpeg2.mpg (720x567) encoded with ffmpeg into Cinepak > using default settings, decoding on an i5 3570K, 3.4 GHz: > > bicubic (default): ~24x realtime > fast_bilinear: ~65x realtime > patch w/rgb565 override: ~154x realtime > ---- > (https://ffmpeg.org/pipermail/ffmpeg-devel/2017-February/206799.html) > > palettized input can be decoded to any of the output formats, > pal8 output is still limited to palettized input > > with input other than palettized/grayscale > yuv420 is approximated by the Cinepak colorspace > > The output format can be chosen at runtime by an option or via the API. > --- > libavcodec/cinepak.c | 844 +++++++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 692 insertions(+), 152 deletions(-) you may want to add yourself to MAINTAINERs (after talking with roberto, who i belive has less interrest in cinepak than you do nowadays) > > diff --git a/libavcodec/cinepak.c b/libavcodec/cinepak.c > index d657e9c0c1..7b08e20e06 100644 > --- a/libavcodec/cinepak.c > +++ b/libavcodec/cinepak.c > @@ -31,6 +31,8 @@ > * > * Cinepak colorspace support (c) 2013 Rl, Aetey Global Technologies AB > * @author Cinepak colorspace, Rl, Aetey Global Technologies AB > + * Extra output formats / optimizations (c) 2017 Rl, Aetey Global Technologies AB > + * @author Extra output formats / optimizations, Rl, Aetey Global Technologies AB > */ > > #include <stdio.h> > @@ -39,23 +41,48 @@ > > #include "libavutil/common.h" > #include "libavutil/intreadwrite.h" > +#include "libavutil/opt.h" > +/* #include "libavutil/avassert.h" */ useless commented out code [...] > + switch (avctx->pix_fmt) { > + case AV_PIX_FMT_RGB32: > + s->decode_codebook = cinepak_decode_codebook_rgb32; > + s->decode_vectors = cinepak_decode_vectors_rgb32; > + break; > + case AV_PIX_FMT_RGB24: > + s->decode_codebook = cinepak_decode_codebook_rgb24; > + s->decode_vectors = cinepak_decode_vectors_rgb24; > + break; > + case AV_PIX_FMT_RGB565: > + s->decode_codebook = cinepak_decode_codebook_rgb565; > + s->decode_vectors = cinepak_decode_vectors_rgb565; > + break; > + case AV_PIX_FMT_YUV420P: > + s->decode_codebook = cinepak_decode_codebook_yuv420; > + s->decode_vectors = cinepak_decode_vectors_yuv420; > + break; > + case AV_PIX_FMT_PAL8: > + if (!s->palette_video) { > + av_log(avctx, AV_LOG_ERROR, "Palettized output not supported without palettized input\n"); > + return AVERROR(EINVAL); > + } > + s->decode_codebook = cinepak_decode_codebook_pal8; > + s->decode_vectors = cinepak_decode_vectors_pal8; > + break; > + default: > + av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n", avctx->pix_fmt); av_get_pix_fmt_name() [...] > @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = { > .close = cinepak_decode_end, > .decode = cinepak_decode_frame, > .capabilities = AV_CODEC_CAP_DR1, > + .pix_fmts = pixfmt_list, This is possibly unneeded thx [...]
On 2/13/17, Michael Niedermayer <michael@niedermayer.cc> wrote: > On Sat, Feb 11, 2017 at 10:25:03PM +0100, u-9iep@aetey.se wrote: >> Hello, >> >> This is my best effort attempt to make the patch acceptable >> by the upstream's criteria. >> >> Daniel, do you mind that I referred to your message in the commit? >> I believe is is best to indicate numbers from a third party measurement. >> >> The code seems to be equvalent to the previous patch, >> with about 20% less LOC. >> >> This hurts readability (my subjective impression) but on the positive >> side >> the change makes the structure of the code more explicit. >> >> Attaching the patch. >> >> Now I have done what I can, have to leave. >> Unless there are bugs there in the patch, my attempt to contribute ends >> at this point. >> >> Thanks to everyone who cared to objectively discuss a specific case >> of ffmpeg usage, the implications of techniques around VQ and whether/why >> some non-traditional approaches can make sense. >> >> Good luck to the ffmpeg project, it is very useful and valuable. >> >> Best regards, >> Rune > >> cinepak.c | 844 >> ++++++++++++++++++++++++++++++++++++++++++++++++++------------ >> 1 file changed, 692 insertions(+), 152 deletions(-) >> cc2ab45b7633651bc0ff80ca57c78ef4fc649d3c >> 0001-Cinepak-speed-up-decoding-several-fold-depending-on-.patch >> From 0c9badec5d144b995c0bb52c7a80939b672be3f5 Mon Sep 17 00:00:00 2001 >> From: Rl <addr-see-the-website@aetey.se> >> Date: Sat, 11 Feb 2017 20:28:54 +0100 >> Subject: [PATCH] Cinepak: speed up decoding several-fold, depending on >> the >> scenario, by supporting multiple output pixel formats. >> >> Decoding to rgb24 and pal8 is optimized. >> >> Added rgb32, rgb565, yuv420p, each with faster decoding than to rgb24. >> >> The most noticeable gain is achieved by the created possibility >> to skip format conversions, for example when decoding to rgb565 >> ---- >> Using matrixbench_mpeg2.mpg (720x567) encoded with ffmpeg into Cinepak >> using default settings, decoding on an i5 3570K, 3.4 GHz: >> >> bicubic (default): ~24x realtime >> fast_bilinear: ~65x realtime >> patch w/rgb565 override: ~154x realtime >> ---- >> (https://ffmpeg.org/pipermail/ffmpeg-devel/2017-February/206799.html) >> >> palettized input can be decoded to any of the output formats, >> pal8 output is still limited to palettized input >> >> with input other than palettized/grayscale >> yuv420 is approximated by the Cinepak colorspace >> >> The output format can be chosen at runtime by an option or via the API. >> --- >> libavcodec/cinepak.c | 844 >> +++++++++++++++++++++++++++++++++++++++++---------- >> 1 file changed, 692 insertions(+), 152 deletions(-) > > you may want to add yourself to MAINTAINERs (after talking with > roberto, who i belive has less interrest in cinepak than you do > nowadays) > > >> >> diff --git a/libavcodec/cinepak.c b/libavcodec/cinepak.c >> index d657e9c0c1..7b08e20e06 100644 >> --- a/libavcodec/cinepak.c >> +++ b/libavcodec/cinepak.c >> @@ -31,6 +31,8 @@ >> * >> * Cinepak colorspace support (c) 2013 Rl, Aetey Global Technologies AB >> * @author Cinepak colorspace, Rl, Aetey Global Technologies AB >> + * Extra output formats / optimizations (c) 2017 Rl, Aetey Global >> Technologies AB >> + * @author Extra output formats / optimizations, Rl, Aetey Global >> Technologies AB >> */ >> >> #include <stdio.h> >> @@ -39,23 +41,48 @@ >> >> #include "libavutil/common.h" >> #include "libavutil/intreadwrite.h" >> +#include "libavutil/opt.h" > >> +/* #include "libavutil/avassert.h" */ > > useless commented out code > > [...] >> + switch (avctx->pix_fmt) { >> + case AV_PIX_FMT_RGB32: >> + s->decode_codebook = cinepak_decode_codebook_rgb32; >> + s->decode_vectors = cinepak_decode_vectors_rgb32; >> + break; >> + case AV_PIX_FMT_RGB24: >> + s->decode_codebook = cinepak_decode_codebook_rgb24; >> + s->decode_vectors = cinepak_decode_vectors_rgb24; >> + break; >> + case AV_PIX_FMT_RGB565: >> + s->decode_codebook = cinepak_decode_codebook_rgb565; >> + s->decode_vectors = cinepak_decode_vectors_rgb565; >> + break; >> + case AV_PIX_FMT_YUV420P: >> + s->decode_codebook = cinepak_decode_codebook_yuv420; >> + s->decode_vectors = cinepak_decode_vectors_yuv420; >> + break; >> + case AV_PIX_FMT_PAL8: >> + if (!s->palette_video) { >> + av_log(avctx, AV_LOG_ERROR, "Palettized output not supported >> without palettized input\n"); >> + return AVERROR(EINVAL); >> + } >> + s->decode_codebook = cinepak_decode_codebook_pal8; >> + s->decode_vectors = cinepak_decode_vectors_pal8; >> + break; >> + default: > >> + av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n", >> avctx->pix_fmt); > > av_get_pix_fmt_name() > > > [...] > >> @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = { >> .close = cinepak_decode_end, >> .decode = cinepak_decode_frame, >> .capabilities = AV_CODEC_CAP_DR1, >> + .pix_fmts = pixfmt_list, > > This is possibly unneeded > > thx > > [...] > > -- > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > > Many things microsoft did are stupid, but not doing something just because > microsoft did it is even more stupid. If everything ms did were stupid they > would be bankrupt already. > Is this new mini swscale?
Thanks Michael, Your corrections are appreciated. On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > you may want to add yourself to MAINTAINERs (after talking with > roberto, who i belive has less interrest in cinepak than you do > nowadays) Sounds ok for me. Roberto, what do you think (if you read this)? > > +/* #include "libavutil/avassert.h" */ > > useless commented out code I hesitated but left it together with the corresponding commented out assert() statement to serve as an indication of the validity assumption we make for pal8. Will change. > > + av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n", avctx->pix_fmt); > > av_get_pix_fmt_name() Thanks. > > @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = { > > .close = cinepak_decode_end, > > .decode = cinepak_decode_frame, > > .capabilities = AV_CODEC_CAP_DR1, > > + .pix_fmts = pixfmt_list, > > This is possibly unneeded ok! Regards, Rune
On Mon, Feb 13, 2017 at 02:23:47PM +0100, Paul B Mahol wrote: > >> @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = { > >> .close = cinepak_decode_end, > >> .decode = cinepak_decode_frame, > >> .capabilities = AV_CODEC_CAP_DR1, > >> + .pix_fmts = pixfmt_list, > > [...] > Is this new mini swscale? Generality of swscaler is traded away to be able to fuse the conversion with the decoder. In this special case this is remarkably fruitful, while definitely not applicable "en masse". If there were at least another practical codec with similar properties, it might be nice to refactor such "codebook-targeting format conversion" to make it reusable. It would still be much smaller in scope than swscaler and hence could be optimized more specifically. I would like to find a rival to Cinepak and would happily do such a refactoring (if applicable) for the sake of another fast-decodable codec, but nothing I know of comes near enough to be practical. Well, while we are at it, "general purpose" vs "game" codec aside, RoQ decoding could be possibly made faster by building a full second level codebook in advance and avoiding the double indirection later. (This looks like what the "dreamroq" decoder does btw.) While I do not expect that RoQ can be made faster than Cinepak (it is more complex at decoding), it can be optimized against pixel format conversions in the same way so it may be worth to give this a try. Then abstracting a "mini-swscale" could become justifiable. Regards, Rune
On Mon, 13 Feb 2017 18:51:39 +0100
u-9iep@aetey.se wrote:
> Then abstracting a "mini-swscale" could become justifiable.
And this is why we should probably reject this patch. What you wrote
paints a horrifying future.
Note that we would have this discussion even if it'd speed up the h264
decoder. Pissing all over modularization is not a good thing to do.
If you really want to get anything applied, you should probably try
looking at outputting ycgco, which appears to be the native colorspace
of the codec (and convert it vf_colorspace, I guess).
On Tue, Feb 14, 2017 at 06:51:46AM +0100, wm4 wrote: > On Mon, 13 Feb 2017 18:51:39 +0100 > u-9iep@aetey.se wrote: > > > Then abstracting a "mini-swscale" could become justifiable. > > And this is why we should probably reject this patch. > What you wrote paints a horrifying future. --^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Note that we would have this discussion even if it'd speed up the h264 > decoder. Pissing all over modularization is not a good thing to do. -----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > If you really want to get anything applied, you should probably try > looking at outputting ycgco, which appears to be the native colorspace > of the codec (and convert it vf_colorspace, I guess). Dear wm4, Your readiness to give feedback and your endeavor to keep the high quality of the code are appreciated. Nevertheless: I kindly ask you to mind your use of disparaging terms (emotionally charged expressions like "horrifying" or "pissing" which attribute a negative quality or attitude to your opponent), the corresponding phrases are marked above for your reference and please check your data. For additional information I would suggest https://ffmpeg.org/developer.html#Code-of-conduct https://multimedia.cx/mirror/cinepak.txt the contents of this thread Regards, Rune
On Tue, 14 Feb 2017 09:51:54 +0100 u-9iep@aetey.se wrote: > On Tue, Feb 14, 2017 at 06:51:46AM +0100, wm4 wrote: > > On Mon, 13 Feb 2017 18:51:39 +0100 > > u-9iep@aetey.se wrote: > > > > > Then abstracting a "mini-swscale" could become justifiable. > > > > And this is why we should probably reject this patch. > > What you wrote paints a horrifying future. > --^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > Note that we would have this discussion even if it'd speed up the h264 > > decoder. Pissing all over modularization is not a good thing to do. > -----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > If you really want to get anything applied, you should probably try > > looking at outputting ycgco, which appears to be the native colorspace > > of the codec (and convert it vf_colorspace, I guess). > > Dear wm4, > > Your readiness to give feedback and your endeavor to keep the high quality > of the code are appreciated. Nevertheless: > > I kindly ask you to mind your use of disparaging terms > (emotionally charged expressions like "horrifying" or "pissing" > which attribute a negative quality or attitude to your opponent), > the corresponding phrases are marked above for your reference > > and please check your data. > > For additional information I would suggest > > https://ffmpeg.org/developer.html#Code-of-conduct > > https://multimedia.cx/mirror/cinepak.txt > > the contents of this thread Nevertheless your patches are rejected.
On 2/14/17, wm4 <nfxjfg@googlemail.com> wrote: > On Tue, 14 Feb 2017 09:51:54 +0100 > u-9iep@aetey.se wrote: > >> On Tue, Feb 14, 2017 at 06:51:46AM +0100, wm4 wrote: >> > On Mon, 13 Feb 2017 18:51:39 +0100 >> > u-9iep@aetey.se wrote: >> > >> > > Then abstracting a "mini-swscale" could become justifiable. >> > >> > And this is why we should probably reject this patch. >> > What you wrote paints a horrifying future. >> --^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> > >> > Note that we would have this discussion even if it'd speed up the h264 >> > decoder. Pissing all over modularization is not a good thing to do. >> -----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> > >> > If you really want to get anything applied, you should probably try >> > looking at outputting ycgco, which appears to be the native colorspace >> > of the codec (and convert it vf_colorspace, I guess). >> >> Dear wm4, >> >> Your readiness to give feedback and your endeavor to keep the high quality >> of the code are appreciated. Nevertheless: >> >> I kindly ask you to mind your use of disparaging terms >> (emotionally charged expressions like "horrifying" or "pissing" >> which attribute a negative quality or attitude to your opponent), >> the corresponding phrases are marked above for your reference >> >> and please check your data. >> >> For additional information I would suggest >> >> https://ffmpeg.org/developer.html#Code-of-conduct >> >> https://multimedia.cx/mirror/cinepak.txt >> >> the contents of this thread > > Nevertheless your patches are rejected. Correct way in solving this is outputing in cinepak decoder actual native format that it uses and not do any conversions of colorspace which is currently done. Implement correct colorspace conversions of cinepak format to others in libswscale.
On Tue, Feb 14, 2017 at 10:03:41AM +0100, Paul B Mahol wrote: > Correct way in solving this is outputing in cinepak decoder actual > native format that it > uses and not do any conversions of colorspace which is currently done. > Implement correct colorspace conversions of cinepak format to others > in libswscale. Would you comment on what is the difference between now and the situation https://ffmpeg.org/pipermail/ffmpeg-devel/2013-February/138811.html when you (?) wrote on the same topic ----------- > i think hes asking if it should be a new colorspace in swscale, or > added into the cinepak decoder/encoder common code. > > by your answer i think it has to be a new swscale colorspace? Please leave libswscale alone. ----------- That position was IMHO reasonable. Commenting on your current suggestion: What would be the advantage of moving the Cinepak-specific conversions into the general purpose library? Either this would mean as many variations of the conversion code as there are in the decoder now or the conversion would be done through some intermediate format, which is a loss per se. The code still only would be used by Cinepak. Besides this, for a purely technical reason the conversions done in Cinepak are not suitable for delayed delegation to swscaler. The reason has been already pointed out in the thread: The huge difference in the amount of the data to be processed; in other words the very essence of the vector quantization technology where frame data is represented by a codebook, by design meant to be much smaller. (BTW mathematically DCT to the contrary is equivalent to referencing an extremely large codebook. This codebook does not have to be transmitted or stored, neither of which would be practical. Instead its entries are generated on-demand, for each frame data block. This is the situation where swscale is an efficient tool.) Hope this helps. Regards, Rune
Hi Rune, On Tue, Feb 14, 2017 at 6:14 AM, <u-9iep@aetey.se> wrote: > The huge difference in the amount of the data to be processed; in other > words the very essence of the vector quantization technology where frame > data is represented by a codebook, by design meant to be much smaller. We acknowledge that. We understand this. Nobody disputes this. But we still don't think breaking the modularization is the right way forward. I'm sorry. We're thinking about this in terms of maintainability as well as speed. The problem is that once we allow this, people will ask for 16bit output in h264 for all native bitdepths, or even packed formats (Kieran already asked). We know it's faster. We also know it's unreasonable from a maintenance perspective. We have to draw a line somewhere. It's an imperfect line but there's a reason for it. I'm sorry. That's life. Ronald
Hi Ronald, On Tue, Feb 14, 2017 at 09:46:55AM -0500, Ronald S. Bultje wrote: > > The huge difference in the amount of the data to be processed; in other > > words the very essence of the vector quantization technology where frame > > data is represented by a codebook, by design meant to be much smaller. > > > We acknowledge that. We understand this. Nobody disputes this. Just several lines below you assume (why?) that using this specific advantage in Cinepak would create any ground for a "corresponding change" in h264. As far as I know h264 does not use vector quantization on the final output data, or what do you mean by the following: > But we still don't think breaking the modularization is the right way > forward. I'm sorry. We're thinking about this in terms of maintainability > as well as speed. The problem is that once we allow this, people will ask > for 16bit output in h264 for all native bitdepths, or even packed formats > (Kieran already asked). Unfortunately I do not see this as having any real relevance. If there indeed is a gain to be collected for h264, it should be weighed against the cost to be caused by a change _there_. As a matter of fact, I do not suggest "breaking modularity in ffmpeg". Modularity, a very useful concept, like any other concept has its area of usefulness. An approach very reasonable in most situations should not be mistaken for "the best one in all cases". Here we have a case where enforced application of this generally useful concept is remarkably far from optimal. > We know it's faster. We also know it's unreasonable from a maintenance You do not know the latter. Your guess possibly reflects the feeling of a fundamental concept being neglected, but this in not a case of neglection/ignorance. > perspective. We have to draw a line somewhere. It's an imperfect line but > there's a reason for it. I'm sorry. That's life. You do not have to be sorry, just be substantial. I have checked the changes done to Cinepak decoder during the 4 years since it began to decode correctly. If the change being discussed now would have been applied back then, one of the later commits would have had about 15 extra lines to change (btw, in a very regular fashion: "frame." => "frame->" nothing else). Another commit would have to change 5 lines instead of 1. The latter indicated indeed an unnecessary duplication in variable declarations, sorry for that. I now change the patch to avoid this. That's all. Tell me if I missed something. The decoder is used rarely but it is indispensable when maximal speed is needed. There is no substitute. Is a 3-fold improvement in the decoding speed worth 15 extra lines to change once in 4 years? Regards, Rune
On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > > you may want to add yourself to MAINTAINERs (after talking with > > roberto, who i belive has less interrest in cinepak than you do > > nowadays) > > Sounds ok for me. Roberto, what do you think (if you read this)? The only address to him which I found (in an old commit) bounced, there was no reply here on the list either. Both can be a coincidence, but otherwise it looks like the change should be OK. Regards, Rune
Hi, On Sun, Mar 5, 2017 at 2:22 PM, <u-9iep@aetey.se> wrote: > On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: > > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > > > you may want to add yourself to MAINTAINERs (after talking with > > > roberto, who i belive has less interrest in cinepak than you do > > > nowadays) > > > > Sounds ok for me. Roberto, what do you think (if you read this)? > > The only address to him which I found (in an old commit) bounced, > there was no reply here on the list either. > > Both can be a coincidence, but otherwise it looks like the change should > be OK. No. This has been discussed repeatedly. Stop trying to push this through. Ronald
Ronald, On Sun, Mar 05, 2017 at 02:38:31PM -0500, Ronald S. Bultje wrote: > Hi, > > On Sun, Mar 5, 2017 at 2:22 PM, <u-9iep@aetey.se> wrote: > > > On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: > > > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > > > > you may want to add yourself to MAINTAINERs (after talking with > > > > roberto, who i belive has less interrest in cinepak than you do > > > > nowadays) > > > > > > Sounds ok for me. Roberto, what do you think (if you read this)? > > > > The only address to him which I found (in an old commit) bounced, > > there was no reply here on the list either. > > > > Both can be a coincidence, but otherwise it looks like the change should > > be OK. > > > No. This has been discussed repeatedly. Stop trying to push this through. My maintainership (for the code which I have contributed to, you may be unaware about this fact) was not discussed otherwise than cited here. Please check what you are commenting, especially when you mean to sound like having a definite power and being quite rude. What _has_ been discussed are the proposed Cinepak decoder improvements. There has not been even a single substantial technical argument against any version of the patch, nor any style/duplication argument against the last two versions. (Did you read the discussion? Did you check the validity of the presented arguments from all the involved parties, yours truly included? You did not say a word after I addressed your concerns.) This makes it even harder to put your statement into a proper context. Regards, Rune
Hi Rune, On Sun, Mar 5, 2017 at 4:26 PM, <u-9iep@aetey.se> wrote: > Ronald, > > On Sun, Mar 05, 2017 at 02:38:31PM -0500, Ronald S. Bultje wrote: > > Hi, > > > > On Sun, Mar 5, 2017 at 2:22 PM, <u-9iep@aetey.se> wrote: > > > > > On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: > > > > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > > > > > you may want to add yourself to MAINTAINERs (after talking with > > > > > roberto, who i belive has less interrest in cinepak than you do > > > > > nowadays) > > > > > > > > Sounds ok for me. Roberto, what do you think (if you read this)? > > > > > > The only address to him which I found (in an old commit) bounced, > > > there was no reply here on the list either. > > > > > > Both can be a coincidence, but otherwise it looks like the change > should > > > be OK. > > > > > > No. This has been discussed repeatedly. Stop trying to push this through. > > My maintainership (for the code which I have contributed to, you may be > unaware about this fact) was not discussed otherwise than cited here. > > Please check what you are commenting, > especially when you mean to sound like having a definite power The rule is that a patch cannot be committed while a developer has outstanding comments. There's outstanding comments, including some from me. You said "the change should be OK", and I'm simply saying "no" to that, because it's not. The patch is not OK until review comments from other developers have been addressed by the patch submitter. Ronald
On Sun, Mar 05, 2017 at 06:20:34PM -0500, Ronald S. Bultje wrote: > Hi Rune, > > On Sun, Mar 5, 2017 at 4:26 PM, <u-9iep@aetey.se> wrote: > > > Ronald, > > > > On Sun, Mar 05, 2017 at 02:38:31PM -0500, Ronald S. Bultje wrote: > > > Hi, > > > > > > On Sun, Mar 5, 2017 at 2:22 PM, <u-9iep@aetey.se> wrote: > > > > > > > On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: > > > > > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > > > > > > you may want to add yourself to MAINTAINERs (after talking with > > > > > > roberto, who i belive has less interrest in cinepak than you do > > > > > > nowadays) > > > > > > > > > > Sounds ok for me. Roberto, what do you think (if you read this)? > > > > > > > > The only address to him which I found (in an old commit) bounced, > > > > there was no reply here on the list either. > > > > > > > > Both can be a coincidence, but otherwise it looks like the change > > should > > > > be OK. > > > > > > > > > No. This has been discussed repeatedly. Stop trying to push this through. > > > > My maintainership (for the code which I have contributed to, you may be > > unaware about this fact) was not discussed otherwise than cited here. > > > > Please check what you are commenting, > > especially when you mean to sound like having a definite power > > > The rule is that a patch cannot be committed while a developer has > outstanding comments. There's outstanding comments, including some from me. > You said "the change should be OK", and I'm simply saying "no" to that, > because it's not. The patch is not OK until review comments from other > developers have been addressed by the patch submitter. Thanks for the explanation Ronald, 1. Apparently you did not aim at the maintainership question. It would be nice if you confirm this point, to avoid further misunderstandings. 2. Would you cite the outstanding comment or comments about the patches which you feel are valid and not addressed? I kindly ask you check the latest iteration of the patch series where I tried to summarize all discussion points in the containing letter. TL;DR: I do have respect for your work and competence. Please do the same to others. Even if we all too often meet people who lack in those areas, there are some exceptions. To be fair your comments concerning the patches were among the most detailed and friendly, this is appreciated! But even your well meant comments happened to miss the point, being based in unfounded assumptions. I answered and explained and there it stopped. Frankly the only outstanding comments which I am aware of are of the kind "you the patch submitter do not understand why 'this' is better than 'that'". The fact is that I _do_ understand why you believe that something is better or not. I just do not feel a belief is sufficient per se. I invited you and others to look at _what_ makes something better or not and got literally no answers. Besides of course "imagine if someone else will do something else, it would be terrible, thus you are leading us to hell" :) or otherwise, citing literally: "we know this code, we know it can do this, don't tell us it's not possible" without specifying (and in fact misunderstanding) what "this" we were talking about: making the speedup usable with an unprepared application, which the overhelming majority of applications are. Regrettably, the present code is not prepared nor meant to be able to. The proposed code could, but this possiblity is now cut off, just to avoid contention. As a third example, your comment "We [...] know it's unreasonable from a maintenance perspective". The very presence of the cinepak decoder is a proof to the opposite because it worked this way (generating two different pixel formats inside the decoder) for years. Again, what "it's" we are talking about? The same "it" in your mind as in the patch? I went so long as to quantify/estimate the expected extra maintenance burden while you did not even mention any tangible criteria. This makes me doubt that you or others who commented have time to read the answers or are prepared to expect competence of your counterparts. Unfortunately this affects the quality of the judgement. I do have respect for your work and competence. Please do the same to others. Regards, Rune
On 3/6/17, u-9iep@aetey.se <u-9iep@aetey.se> wrote: > On Sun, Mar 05, 2017 at 06:20:34PM -0500, Ronald S. Bultje wrote: >> Hi Rune, >> >> On Sun, Mar 5, 2017 at 4:26 PM, <u-9iep@aetey.se> wrote: >> >> > Ronald, >> > >> > On Sun, Mar 05, 2017 at 02:38:31PM -0500, Ronald S. Bultje wrote: >> > > Hi, >> > > >> > > On Sun, Mar 5, 2017 at 2:22 PM, <u-9iep@aetey.se> wrote: >> > > >> > > > On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: >> > > > > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer >> > > > > wrote: >> > > > > > you may want to add yourself to MAINTAINERs (after talking with >> > > > > > roberto, who i belive has less interrest in cinepak than you do >> > > > > > nowadays) >> > > > > >> > > > > Sounds ok for me. Roberto, what do you think (if you read this)? >> > > > >> > > > The only address to him which I found (in an old commit) bounced, >> > > > there was no reply here on the list either. >> > > > >> > > > Both can be a coincidence, but otherwise it looks like the change >> > should >> > > > be OK. >> > > >> > > >> > > No. This has been discussed repeatedly. Stop trying to push this >> > > through. >> > >> > My maintainership (for the code which I have contributed to, you may be >> > unaware about this fact) was not discussed otherwise than cited here. >> > >> > Please check what you are commenting, >> > especially when you mean to sound like having a definite power >> >> >> The rule is that a patch cannot be committed while a developer has >> outstanding comments. There's outstanding comments, including some from >> me. >> You said "the change should be OK", and I'm simply saying "no" to that, >> because it's not. The patch is not OK until review comments from other >> developers have been addressed by the patch submitter. > > Thanks for the explanation Ronald, > > 1. Apparently you did not aim at the maintainership question. > > It would be nice if you confirm this point, > to avoid further misunderstandings. > > 2. Would you cite the outstanding comment or comments about the patches > which you feel are valid and not addressed? > > I kindly ask you check the latest iteration of the patch series > where I tried to summarize all discussion points in the containing > letter. > > TL;DR: I do have respect for your work and competence. > Please do the same to others. Even if we all too often meet people > who lack in those areas, there are some exceptions. > > To be fair your comments concerning the patches were among the most > detailed and friendly, this is appreciated! > > But even your well meant comments happened to miss the point, being based > in unfounded assumptions. I answered and explained and there it stopped. > > Frankly the only outstanding comments which I am aware of are of the kind > "you the patch submitter do not understand why 'this' is better than > 'that'". > > The fact is that I _do_ understand why you believe that something is > better or not. I just do not feel a belief is sufficient per se. > > I invited you and others to look at _what_ makes something better > or not and got literally no answers. > > Besides of course > "imagine if someone else will do something else, > it would be terrible, thus you are leading us to hell" :) > > or otherwise, citing literally: > "we know this code, we know it can do this, don't tell us it's not > possible" without specifying (and in fact misunderstanding) what > "this" we were talking about: making the speedup usable with an > unprepared application, which the overhelming majority of applications > are. Regrettably, the present code is not prepared nor meant to be able to. > The proposed code could, but this possiblity is now cut off, > just to avoid contention. > > As a third example, your comment > "We [...] know it's unreasonable from a maintenance perspective". > > The very presence of the cinepak decoder is a proof to the opposite > because it worked this way (generating two different pixel formats > inside the decoder) for years. Again, what "it's" we are talking about? > The same "it" in your mind as in the patch? > > I went so long as to quantify/estimate the expected extra maintenance > burden while you did not even mention any tangible criteria. > > This makes me doubt that you or others who commented have time to read > the answers or are prepared to expect competence of your counterparts. > Unfortunately this affects the quality of the judgement. > > I do have respect for your work and competence. > Please do the same to others. What is your Work and Competence in FFmpeg?
On Mon, 6 Mar 2017 10:15:44 +0100 u-9iep@aetey.se wrote: > On Sun, Mar 05, 2017 at 06:20:34PM -0500, Ronald S. Bultje wrote: > > Hi Rune, > > > > On Sun, Mar 5, 2017 at 4:26 PM, <u-9iep@aetey.se> wrote: > > > > > Ronald, > > > > > > On Sun, Mar 05, 2017 at 02:38:31PM -0500, Ronald S. Bultje wrote: > > > > Hi, > > > > > > > > On Sun, Mar 5, 2017 at 2:22 PM, <u-9iep@aetey.se> wrote: > > > > > > > > > On Mon, Feb 13, 2017 at 02:41:40PM +0100, u-9iep@aetey.se wrote: > > > > > > On Mon, Feb 13, 2017 at 02:19:45PM +0100, Michael Niedermayer wrote: > > > > > > > you may want to add yourself to MAINTAINERs (after talking with > > > > > > > roberto, who i belive has less interrest in cinepak than you do > > > > > > > nowadays) > > > > > > > > > > > > Sounds ok for me. Roberto, what do you think (if you read this)? > > > > > > > > > > The only address to him which I found (in an old commit) bounced, > > > > > there was no reply here on the list either. > > > > > > > > > > Both can be a coincidence, but otherwise it looks like the change > > > should > > > > > be OK. > > > > > > > > > > > > No. This has been discussed repeatedly. Stop trying to push this through. > > > > > > My maintainership (for the code which I have contributed to, you may be > > > unaware about this fact) was not discussed otherwise than cited here. > > > > > > Please check what you are commenting, > > > especially when you mean to sound like having a definite power > > > > > > The rule is that a patch cannot be committed while a developer has > > outstanding comments. There's outstanding comments, including some from me. > > You said "the change should be OK", and I'm simply saying "no" to that, > > because it's not. The patch is not OK until review comments from other > > developers have been addressed by the patch submitter. > > Thanks for the explanation Ronald, > > 1. Apparently you did not aim at the maintainership question. > > It would be nice if you confirm this point, > to avoid further misunderstandings. > > 2. Would you cite the outstanding comment or comments about the patches > which you feel are valid and not addressed? > > I kindly ask you check the latest iteration of the patch series > where I tried to summarize all discussion points in the containing > letter. > > TL;DR: I do have respect for your work and competence. > Please do the same to others. Even if we all too often meet people > who lack in those areas, there are some exceptions. > > To be fair your comments concerning the patches were among the most > detailed and friendly, this is appreciated! > > But even your well meant comments happened to miss the point, being based > in unfounded assumptions. I answered and explained and there it stopped. > > Frankly the only outstanding comments which I am aware of are of the kind > "you the patch submitter do not understand why 'this' is better than > 'that'". > > The fact is that I _do_ understand why you believe that something is > better or not. I just do not feel a belief is sufficient per se. > > I invited you and others to look at _what_ makes something better > or not and got literally no answers. > > Besides of course > "imagine if someone else will do something else, > it would be terrible, thus you are leading us to hell" :) > > or otherwise, citing literally: > "we know this code, we know it can do this, don't tell us it's not > possible" without specifying (and in fact misunderstanding) what > "this" we were talking about: making the speedup usable with an > unprepared application, which the overhelming majority of applications > are. Regrettably, the present code is not prepared nor meant to be able to. > The proposed code could, but this possiblity is now cut off, > just to avoid contention. > > As a third example, your comment > "We [...] know it's unreasonable from a maintenance perspective". > > The very presence of the cinepak decoder is a proof to the opposite > because it worked this way (generating two different pixel formats > inside the decoder) for years. Again, what "it's" we are talking about? > The same "it" in your mind as in the patch? > > I went so long as to quantify/estimate the expected extra maintenance > burden while you did not even mention any tangible criteria. > > This makes me doubt that you or others who commented have time to read > the answers or are prepared to expect competence of your counterparts. > Unfortunately this affects the quality of the judgement. > > I do have respect for your work and competence. > Please do the same to others. I think we've repeatedly made clear that: - we don't really want the decoder to output multiple pixfmts by choice - but if it's limited to a very small number of formats (say, 2) it might be ok - but ideally the decoder should output the "native" pixfmt/colorspace of the codec, which apparently is YCgCo (which would also require libswscale modifications) - we're not even interested in a faster cinepak decoder, because we see no need for it (even you haven't demonstrated any need for it - I'm pretty sure everyone would be all over a fast intermediate codec, but people don't use cinepak for that) - trying to trick us into applying this patch by playing policy lawyer won't work - whether or not having these multiple code paths would be so horrible is not even the main question - by now we're just annoyed by this topic and how much attention it seems to drain, so we'd rather ignore this (don't blame us - people tend to ignore unimportant things to get important work done, and not applying your patches seems to be the safer option) - again, nobody understands your needs, and some we find extremely absurd and out of place, like using getenv() in a decoder (we'd probably suggest a better alternative if we did, maybe)
On Mon, Mar 06, 2017 at 10:19:33AM +0100, Paul B Mahol wrote: > On 3/6/17, u-9iep@aetey.se <u-9iep@aetey.se> wrote: > > I do have respect for your work and competence. > > Please do the same to others. > > What is your Work and Competence in FFmpeg? This is offtopic and looks like ad hominem "a logical fallacy in which an argument is rebutted by attacking the character, motive, or other attribute of the person making the argument" (Wikipedia) It is more of a fallacy because you make it look like the "competence in FFmpeg" (whatever you mean with this) is the only competence kind to be respected or/and to be relevant while doing work for FFmpeg. As for your curiosity, feel free to do the research. The data is public. Rune
On 3/6/17, u-9iep@aetey.se <u-9iep@aetey.se> wrote: > On Mon, Mar 06, 2017 at 10:19:33AM +0100, Paul B Mahol wrote: >> On 3/6/17, u-9iep@aetey.se <u-9iep@aetey.se> wrote: >> > I do have respect for your work and competence. >> > Please do the same to others. >> >> What is your Work and Competence in FFmpeg? > > This is offtopic and looks like ad hominem > "a logical fallacy in which an argument is rebutted by attacking the > character, motive, or other attribute of the person making the argument" > (Wikipedia) > > It is more of a fallacy because you make it look like the "competence > in FFmpeg" (whatever you mean with this) is the only competence kind to > be respected or/and to be relevant while doing work for FFmpeg. > > As for your curiosity, feel free to do the research. > The data is public. > > Rune Your mail doesn't give me anything useful when googling.
To everyone: I am really sorry for having to react to this kind of irrational arguments. OTOH keeping silence could be interpreted as accepting them. As far as my common sense goes, I can not count these as "pending comments". TL;DR: trying to reason, given the arbitrary statements and careless behaviour of the opponent On Mon, Mar 06, 2017 at 10:31:42AM +0100, wm4 wrote: > I think we've repeatedly made clear that: > - we don't really want the decoder to output multiple pixfmts by choice It is clear that you personally do not want the Cinepak decoder to be able to output multiple pixel formats, for unspecified reasons ("by choice"). You make it look like it is your discretion who decides whether something is deemed to be good or not for FFmpeg. I do not believe this. > - but if it's limited to a very small number of formats (say, 2) it > might be ok You are again referring to nothing but your discretion. Anyway, if you take your time to look at the current version of the patch, it _happens_ to do exactly this. So this point is moot. > - but ideally the decoder should output the "native" pixfmt/colorspace > of the codec, which apparently is YCgCo (which would also require > libswscale modifications) What makes it "ideal", by which criteria? Be technical please. (BTW "which apparently is" is a wrong assumption. You still did not use the references I offered you for some background information on the matter. Please learn, before posting next time.) > - we're not even interested in a faster cinepak decoder, because we see > no need for it (even you haven't demonstrated any need for it - I'm > pretty sure everyone would be all over a fast intermediate codec, but > people don't use cinepak for that) Are you saying no need for everyone? Nor have they ever tasted a 3 times faster decoder but you already know they will not like it? Sorry, seriously I have no reason to assume that you have prerequisites to know for everyone and especially in advance. You are talking about what is inside the horizon of yourself. > - trying to trick us into applying this patch by playing policy lawyer > won't work You make it again sound like you decide for the project. Frankly, such behaviour makes the project look bad. Also you are accusing me of "tricking" and "playing". Please be specific about what I am doing wrong or apologize otherwise. While at it, I am still expecting your apology for the use of abusive language when you were commenting on my proposal last time. > - whether or not having these multiple code paths would be so horrible > is not even the main question - by now we're just annoyed by this > topic and how much attention it seems to drain, so we'd rather ignore > this (don't blame us - people tend to ignore unimportant things to > get important work done, and not applying your patches seems to be > the safer option) You did not have to pay attention to the patch, given your limited understanding of the matter (which _you_ stated repeatedly) and the limited capacity to participate, which you demonstrated by relying on wrong assumptions and neglecting to correct your mistakes in this discussion. > - again, nobody understands your needs, and some we find extremely You are assuming that everyone has the same level of understanding as you do. This is not necessarily true. > absurd and out of place, like using getenv() in a decoder (we'd > probably suggest a better alternative if we did, maybe) Regrettably, the proposed alternatives were either to put the corresponding functionality into every application or to avoid ffmpeg and start a separate project. Neither of those alternatives would be practically useful. Never mind, this is not part of the current patch. Thanks for reading this long! Regards, Rune
Hi, On Mon, Mar 6, 2017 at 7:08 AM, <u-9iep@aetey.se> wrote: > It is clear that you personally do not want the Cinepak decoder to be able > to output multiple pixel formats, for unspecified reasons ("by choice"). > > You make it look like it is your discretion who decides whether something > is deemed to be good or not for FFmpeg. I do not believe this. > It is a general project design decision. If you want to change that, fine, propose it and let's discuss that on its own merit. But right now this is how the project works. You can't change that without changing official project policy. I'm fundamentally against it, and as such I'm against the patch. I understand why you want it, but I don't think that's sufficient reason. > You did not have to pay attention to the patch, given your limited > understanding of the matter And this is a CoC [1] violation, please don't do that again. Ronald [1] https://ffmpeg.org/developer.html#Code-of-conduct
On Mon, Mar 06, 2017 at 08:19:06AM -0500, Ronald S. Bultje wrote: > Hi, > > On Mon, Mar 6, 2017 at 7:08 AM, <u-9iep@aetey.se> wrote: > > > You did not have to pay attention to the patch, given your limited > > understanding of the matter > > > And this is a CoC [1] violation, please don't do that again. Actually what I wrote is: > > You did not have to pay attention to the patch, given your limited > > understanding of the matter (which _you_ stated repeatedly) I am sorry for saying this. Pointing to one's behaviour is correct, presenting it as one's inherent quality is not. I did wrong. Indeed that person did not state her/his limited understanding nor did this repeatedly. What was repeated was erroneous technical comments about which "native" pixel format Cinepak has. The formulation about understanding I meant to refer to was that of wm4: > I don't know what you're thinking, but that's just wrong. It looks bizarre to me to judge without knowing but this was _not_ a confession of a "limited understanding of the matter". I am sorry for misrepresentation of the statements of wm4! Now Ronald where were you with your policing when the person in question (wm4) repeatedly violated CoC? I point out that your behaviour of selective punishment is most probably not conscious but it is a fact. Regards, Rune
On Mon, Mar 06, 2017 at 08:19:06AM -0500, Ronald S. Bultje wrote: > On Mon, Mar 6, 2017 at 7:08 AM, <u-9iep@aetey.se> wrote: > > > It is clear that you personally do not want the Cinepak decoder to be able > > to output multiple pixel formats, for unspecified reasons ("by choice"). > > > > You make it look like it is your discretion who decides whether something > > is deemed to be good or not for FFmpeg. I do not believe this. > It is a general project design decision. Did the project (who?) ever make a general decision about Cinepak or delegate to wm4 to represent the project's stance? I question her/his tone, not the policy, but what is the latter when not refracted through your, wm4's or someone else's interpretation? The design decision you refer to I would love to read. What I have seen on the ffmpeg.org did not say "decoders shall not produce multiple pixel formats". > If you want to change that, fine, propose it and let's discuss that on its I am really interested to know what is "that" you refer to. > own merit. But right now this is how the project works. You can't change > that without changing official project policy. > > I'm fundamentally against it, and as such I'm against the patch. I > understand why you want it, but I don't think that's sufficient reason. Thanks for stating your position as you do, constructively and coherently. Regards, Rune
On Mon, 6 Mar 2017 16:23:15 +0100 u-9iep@aetey.se wrote: > Did the project (who?) ever make a general decision about Cinepak > or delegate to wm4 to represent the project's stance? > > I question her/his tone, not the policy, but what is the latter > when not refracted through your, wm4's or someone else's interpretation? > > The design decision you refer to I would love to read. What I have > seen on the ffmpeg.org did not say "decoders shall not produce multiple > pixel formats". Before you get too obsessed with me, I have never represented anything, or intended to represent. I merely contributed my own opinion about these patches. I have not more powers than Paul Mahol or Ronald Bultje (if anything, I have less powers than them).
Hi, On Mon, Mar 6, 2017 at 9:52 AM, <u-9iep@aetey.se> wrote: > I am sorry for misrepresentation of the statements of wm4! > No problem. Now Ronald where were you with your policing > when the person in question (wm4) repeatedly violated CoC? > > I point out that your behaviour of selective punishment > is most probably not conscious but it is a fact. I will deal with people privately on IRC where I can. But thank you for pointing it out. Ronald
On Mon, Mar 06, 2017 at 04:31:32PM +0100, wm4 wrote: > On Mon, 6 Mar 2017 16:23:15 +0100 > u-9iep@aetey.se wrote: > > > Did the project (who?) ever make a general decision about Cinepak > > or delegate to wm4 to represent the project's stance? > > > > I question her/his tone, not the policy, but what is the latter > > when not refracted through your, wm4's or someone else's interpretation? > > > > The design decision you refer to I would love to read. What I have > > seen on the ffmpeg.org did not say "decoders shall not produce multiple > > pixel formats". > > Before you get too obsessed with me, I have never represented anything, > or intended to represent. I merely contributed my own opinion about > these patches. I have not more powers than Paul Mahol or Ronald Bultje > (if anything, I have less powers than them). It is nice of you to say this. Thanks for the clarification. Regards, Rune
diff --git a/libavcodec/cinepak.c b/libavcodec/cinepak.c index d657e9c0c1..7b08e20e06 100644 --- a/libavcodec/cinepak.c +++ b/libavcodec/cinepak.c @@ -31,6 +31,8 @@ * * Cinepak colorspace support (c) 2013 Rl, Aetey Global Technologies AB * @author Cinepak colorspace, Rl, Aetey Global Technologies AB + * Extra output formats / optimizations (c) 2017 Rl, Aetey Global Technologies AB + * @author Extra output formats / optimizations, Rl, Aetey Global Technologies AB */ #include <stdio.h> @@ -39,23 +41,48 @@ #include "libavutil/common.h" #include "libavutil/intreadwrite.h" +#include "libavutil/opt.h" +/* #include "libavutil/avassert.h" */ #include "avcodec.h" #include "internal.h" +/* rounding to nearest; truncation would be slightly faster + * but it noticeably affects the picture quality; + * unless we become extremely desperate to use every single cycle + * we do not bother implementing a choice -- rl */ +#define PACK_RGB_RGB565(r,g,b) (((av_clip_uint8((r)+4)>>3)<<11)|((av_clip_uint8((g)+2)>>2)<<5)|(av_clip_uint8((b)+4)>>3)) -typedef uint8_t cvid_codebook[12]; +/* + * more "desperate/ultimate" optimization possibilites: + * - possibly (hardly?) spare a cycle or two by not ensuring to stay + * inside the frame at vector decoding (the frame is allocated with + * a margin for us as an extra precaution, we can as well use this) + * - skip filling in opacity when it is not needed by the data consumer, + * in many cases rgb32 is almost as fast as rgb565, with full quality, + * improving its speed can make sense + */ + +typedef union cvid_codebook { + uint32_t rgb32[256][ 4]; + uint8_t rgb24[256][12]; + uint16_t rgb565[256][ 4]; + uint8_t yuv420[256][ 6]; + uint8_t pal8[256][ 4]; +} cvid_codebook; -#define MAX_STRIPS 32 +#define MAX_STRIPS 32 /* an arbitrary limit -- rl */ typedef struct cvid_strip { uint16_t id; uint16_t x1, y1; uint16_t x2, y2; - cvid_codebook v4_codebook[256]; - cvid_codebook v1_codebook[256]; + cvid_codebook v4_codebook; + cvid_codebook v1_codebook; } cvid_strip; -typedef struct CinepakContext { +typedef struct CinepakContext CinepakContext; +struct CinepakContext { + const AVClass *class; AVCodecContext *avctx; AVFrame *frame; @@ -71,57 +98,192 @@ typedef struct CinepakContext { int sega_film_skip_bytes; uint32_t pal[256]; -} CinepakContext; -static void cinepak_decode_codebook (cvid_codebook *codebook, - int chunk_id, int size, const uint8_t *data) -{ - const uint8_t *eod = (data + size); - uint32_t flag, mask; - int i, n; - uint8_t *p; - - /* check if this chunk contains 4- or 6-element vectors */ - n = (chunk_id & 0x04) ? 4 : 6; - flag = 0; - mask = 0; - - p = codebook[0]; - for (i=0; i < 256; i++) { - if ((chunk_id & 0x01) && !(mask >>= 1)) { - if ((data + 4) > eod) - break; - - flag = AV_RB32 (data); - data += 4; - mask = 0x80000000; - } + void (*decode_codebook)(CinepakContext *s, + cvid_codebook *codebook, int chunk_id, + int size, const uint8_t *data); + int (*decode_vectors)(CinepakContext *s, cvid_strip *strip, + int chunk_id, int size, const uint8_t *data); +/* options */ + enum AVPixelFormat out_pixfmt; +}; - if (!(chunk_id & 0x01) || (flag & mask)) { - int k, kk; +#define OFFSET(x) offsetof(CinepakContext, x) +#define VD AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_DECODING_PARAM +static const AVOption options[] = { +{"output_pixel_format", "set output pixel format: rgb24/rgb32/rgb565/yuv420p/pal8; yuv420p is approximate", OFFSET(out_pixfmt), AV_OPT_TYPE_PIXEL_FMT, {.i64=AV_PIX_FMT_NONE}, -1, INT_MAX, VD }, + { NULL }, +}; - if ((data + n) > eod) - break; +static const AVClass cinepak_class = { + .class_name = "cinepak decoder", + .item_name = av_default_item_name, + .option = options, + .version = LIBAVUTIL_VERSION_INT, +}; - for (k = 0; k < 4; ++k) { - int r = *data++; - for (kk = 0; kk < 3; ++kk) - *p++ = r; +/* this is an attempt to satisfy the requirement to reduce code duplication + * feel free to do this in a more elegant fashion, but keep the speed + * -- rl */ +#define CODEBOOK_HEAD \ + const uint8_t *eod;\ + uint32_t flag, mask;\ + int i, n;\ + int palette_video;\ + int selective_update;\ + +#define CODEBOOK_STREAM_PARSING \ + for (i=0; i < 256; i++) {\ + if (selective_update && !(mask >>= 1)) {\ + if ((data + 4) > eod)\ + break;\ +\ + flag = AV_RB32 (data);\ + data += 4;\ + mask = 0x80000000;\ + }\ +\ + if (!selective_update || (flag & mask)) {\ + int k;\ +\ + if ((data + n) > eod)\ + break;\ + +#define CODEBOOK_INTRO \ + selective_update = (chunk_id & 0x01);\ + eod = (data + size);\ + flag = 0;\ + mask = 0;\ + +#define CODEBOOK_FULL_COLOR \ + /* check if this chunk contains 4- or 6-element vectors */\ + n = (chunk_id & 0x04) ? 4 : 6;\ + palette_video = s->palette_video;\ + CODEBOOK_INTRO\ + CODEBOOK_STREAM_PARSING\ + +#define VECTOR_INTRO \ + CODEBOOK_INTRO\ + v1_only = (chunk_id & 0x02);\ +\ + for (y=strip->y1; y < strip->y2; y+=4) {\ + +#define VECTOR_STREAM_PARSING \ + for (x=strip->x1; x < strip->x2; x+=4) {\ + if (selective_update && !(mask >>= 1)) {\ + if ((data + 4) > eod)\ + return AVERROR_INVALIDDATA;\ +\ + flag = AV_RB32 (data);\ + data += 4;\ + mask = 0x80000000;\ + }\ +\ + if (!selective_update || (flag & mask)) {\ + if (!v1_only && !(mask >>= 1)) {\ + if ((data + 4) > eod)\ + return AVERROR_INVALIDDATA;\ +\ + flag = AV_RB32 (data);\ + data += 4;\ + mask = 0x80000000;\ + }\ +\ + if (v1_only || (~flag & mask)) {\ + POINTER_TYPE *p;\ + if (data >= eod)\ + return AVERROR_INVALIDDATA;\ + +#define VECTOR_DO \ +/* take care of y dimension not being multiple of 4, such streams exist */\ + if(s->avctx->height - y > 1) {\ + ip1 = ip0 + s->frame->linesize[0];\ + if(s->avctx->height - y > 2) {\ + ip2 = ip1 + s->frame->linesize[0];\ + if(s->avctx->height - y > 3) {\ + ip3 = ip2 + s->frame->linesize[0];\ + }\ + }\ + }\ +/* to get the correct picture for not-multiple-of-4 cases let us fill each\ + * block from the bottom up, thus possibly overwriting the bottommost line\ + * more than once but ending with the correct data in place\ + * (instead of in-loop checking) */\ + VECTOR_STREAM_PARSING\ + +static void cinepak_decode_codebook_rgb32 (CinepakContext *s, + cvid_codebook *codebook, int chunk_id, int size, const uint8_t *data) +{ + CODEBOOK_HEAD + uint32_t *p = codebook->rgb32[0]; + + CODEBOOK_FULL_COLOR + + if (n == 4) + if (palette_video) + for (k = 0; k < 4; ++k) + *p++ = s->pal[*data++]; /* this is easy */ + else + for (k = 0; k < 4; ++k) { + int r = *data++; +/* in some situations we might not have to set opacity */ + *p++ = /**/ (255<<24)| /**/ (r<<16)|(r<<8)|r; + } + else { /* n == 6 */ + int y, u, v; + u = (int8_t)data[4]; + v = (int8_t)data[5]; + for(k=0; k<4; ++k) { + y = *data++; +/* in some situations we might not have to set opacity */ + *p++ = /**/ (255<<24)| /**/ +/* here the cinepak color space excels */ + (av_clip_uint8(y + v*2)<<16)| + (av_clip_uint8(y - (u/2) - v)<<8)| + av_clip_uint8(y + u*2); + } + data += 2; } - if (n == 6) { - int r, g, b, u, v; - u = *(int8_t *)data++; - v = *(int8_t *)data++; - p -= 12; + } else { + p += 4; + } + } +} + +static void cinepak_decode_codebook_rgb24 (CinepakContext *s, + cvid_codebook *codebook, int chunk_id, int size, const uint8_t *data) +{ + CODEBOOK_HEAD + uint8_t *p = codebook->rgb24[0]; + + CODEBOOK_FULL_COLOR + + if (n == 4) + if (palette_video) + for (k = 0; k < 4; ++k) { + uint32_t r = s->pal[*data++]; + *p++ = (r>>16)&0xff; + *p++ = (r>>8) &0xff; + *p++ = r &0xff; + } + else + for (k = 0; k < 4; ++k) { + int kk, r = *data++; + for (kk = 0; kk < 3; ++kk) + *p++ = r; + } + else { /* n == 6 */ + int y, u, v; + u = (int8_t)data[4]; + v = (int8_t)data[5]; for(k=0; k<4; ++k) { - r = *p++ + v*2; - g = *p++ - (u/2) - v; - b = *p + u*2; - p -= 2; - *p++ = av_clip_uint8(r); - *p++ = av_clip_uint8(g); - *p++ = av_clip_uint8(b); + y = *data++; +/* here the cinepak color space excels */ + *p++ = av_clip_uint8(y + v*2); + *p++ = av_clip_uint8(y - (u/2) - v); + *p++ = av_clip_uint8(y + u*2); } + data += 2; } } else { p += 12; @@ -129,134 +291,448 @@ static void cinepak_decode_codebook (cvid_codebook *codebook, } } -static int cinepak_decode_vectors (CinepakContext *s, cvid_strip *strip, +static void cinepak_decode_codebook_rgb565 (CinepakContext *s, + cvid_codebook *codebook, int chunk_id, int size, const uint8_t *data) +{ + CODEBOOK_HEAD + uint16_t *p = codebook->rgb565[0]; + + CODEBOOK_FULL_COLOR + + if (n == 4) + if (palette_video) + for (k = 0; k < 4; ++k) { + uint32_t r = s->pal[*data++]; + *p++ = PACK_RGB_RGB565((r>>16)&0xff, + (r>>8)&0xff, + r&0xff); + } + else + for (k = 0; k < 4; ++k) { + int r = *data++; + *p++ = PACK_RGB_RGB565(r,r,r); + } + else { /* n == 6 */ + int y, u, v; + u = (int8_t)data[4]; + v = (int8_t)data[5]; + for(k=0; k<4; ++k) { + y = *data++; +/* here the cinepak color space excels */ + *p++ = PACK_RGB_RGB565(y + v*2, + y - (u/2) - v, + y + u*2); + } + data += 2; + } + } else { + p += 4; + } + } +} + +/* a simplistic version to begin with, it is also fast -- rl */ +static void cinepak_decode_codebook_yuv420 (CinepakContext *s, + cvid_codebook *codebook, int chunk_id, int size, const uint8_t *data) +{ + CODEBOOK_HEAD + uint8_t *p = codebook->yuv420[0]; + + CODEBOOK_FULL_COLOR + + if (n == 4) + if (palette_video) { +/* here we have kind of "more" data than the output format can express */ + int r, g, b, u = 0, v = 0; + for (k = 0; k < 4; ++k) { + uint32_t rr = s->pal[*data++]; + r = (rr>>16)&0xff; + g = (rr>>8) &0xff; + b = rr &0xff; +/* calculate the components (https://en.wikipedia.org/wiki/YUV) */ + *p++ = ((r*66+g*129+b*25+128)>>8)+16; + u += (-r*38-g*74+b*112+128)>>8; + v += (r*112-g*94-b*18+128)>>8; + } + *p++ = (u+2)/4+128; + *p++ = (v+2)/4+128; + } else { /* grayscale, easy */ + for (k = 0; k < 4; ++k) { + *p++ = *data++; + } + *p++ = 128; + *p++ = 128; + } + else { /* n == 6 */ +/* here we'd have to handle double format conversion + * Cinepak=>rgb24 and then rgb24=>yuv420p, which can not be shortcut; + * for the moment just copying as-is, for simplicity and speed, + * color will be slightly off but not much */ + *p++ = *data++; + *p++ = *data++; + *p++ = *data++; + *p++ = *data++; + *p++ = *data++ + 128; + *p++ = *data++ + 128; + } + } else { + p += 6; + } + } +} + +/* here we do not expect anything besides palettized video, + * nor check the data for validity, which should be ok + * as long as we do not write beyond the bounds */ +static void cinepak_decode_codebook_pal8 (CinepakContext *s, + cvid_codebook *codebook, int chunk_id, int size, const uint8_t *data) +{ + const uint8_t *eod; + uint32_t flag, mask; + int i; + int selective_update; + uint8_t *p = codebook->pal8[0]; + +#define PAL8_VECTOR_LENGTH 4 +#define n PAL8_VECTOR_LENGTH +/* av_assert0(chunk_id & 0x04); */ + + CODEBOOK_INTRO + CODEBOOK_STREAM_PARSING + +#undef n + + for (k = 0; k < 4; ++k) + *p++ = *data++; + } else { + p += 4; + } + } +} + +static int cinepak_decode_vectors_rgb32 (CinepakContext *s, cvid_strip *strip, + int chunk_id, int size, const uint8_t *data) +{ + const uint8_t *eod; + uint32_t flag, mask; + uint32_t *cb0, *cb1, *cb2, *cb3; + int x, y; + char *ip0, *ip1, *ip2, *ip3; + int selective_update; + int v1_only; + + VECTOR_INTRO + + ip0 = ip1 = ip2 = ip3 = s->frame->data[0] + + strip->x1*4 + y*s->frame->linesize[0]; +#define POINTER_TYPE uint32_t + VECTOR_DO +#undef POINTER_TYPE + + p = strip->v1_codebook.rgb32[*data++] + 2; /* ... + 8 */ + memcpy(ip3 + 0, p, 4); memcpy(ip3 + 4, p, 4); + memcpy(ip2 + 0, p, 4); memcpy(ip2 + 4, p, 4); + p += 1; /* ... + 12 */ + memcpy(ip3 + 8, p, 4); memcpy(ip3 + 12, p, 4); + memcpy(ip2 + 8, p, 4); memcpy(ip2 + 12, p, 4); + p -= 3; /* ... + 0 */ + memcpy(ip1 + 0, p, 4); memcpy(ip1 + 4, p, 4); + memcpy(ip0 + 0, p, 4); memcpy(ip0 + 4, p, 4); + p += 1; /* ... + 4 */ + memcpy(ip1 + 8, p, 4); memcpy(ip1 + 12, p, 4); + memcpy(ip0 + 8, p, 4); memcpy(ip0 + 12, p, 4); + + } else if (flag & mask) { + if ((data + 4) > eod) + return AVERROR_INVALIDDATA; + + cb0 = strip->v4_codebook.rgb32[*data++]; + cb1 = strip->v4_codebook.rgb32[*data++]; + cb2 = strip->v4_codebook.rgb32[*data++]; + cb3 = strip->v4_codebook.rgb32[*data++]; + memcpy(ip3 + 0, cb2 + 2, 8); + memcpy(ip3 + 8, cb3 + 2, 8); + memcpy(ip2 + 0, cb2 + 0, 8); + memcpy(ip2 + 8, cb3 + 0, 8); + memcpy(ip1 + 0, cb0 + 2, 8); + memcpy(ip1 + 8, cb1 + 2, 8); + memcpy(ip0 + 0, cb0 + 0, 8); + memcpy(ip0 + 8, cb1 + 0, 8); + + } + } + + ip0 += 16; ip1 += 16; + ip2 += 16; ip3 += 16; + } + } + + return 0; +} + +static int cinepak_decode_vectors_rgb24 (CinepakContext *s, cvid_strip *strip, int chunk_id, int size, const uint8_t *data) { - const uint8_t *eod = (data + size); + const uint8_t *eod; uint32_t flag, mask; uint8_t *cb0, *cb1, *cb2, *cb3; - int x, y; + int x, y; char *ip0, *ip1, *ip2, *ip3; + int selective_update; + int v1_only; - flag = 0; - mask = 0; + VECTOR_INTRO - for (y=strip->y1; y < strip->y2; y+=4) { + ip0 = ip1 = ip2 = ip3 = s->frame->data[0] + + strip->x1*3 + y*s->frame->linesize[0]; + +#define POINTER_TYPE uint8_t + VECTOR_DO +#undef POINTER_TYPE + + p = strip->v1_codebook.rgb24[*data++] + 6; + memcpy(ip3 + 0, p, 3); memcpy(ip3 + 3, p, 3); + memcpy(ip2 + 0, p, 3); memcpy(ip2 + 3, p, 3); + p += 3; /* ... + 9 */ + memcpy(ip3 + 6, p, 3); memcpy(ip3 + 9, p, 3); + memcpy(ip2 + 6, p, 3); memcpy(ip2 + 9, p, 3); + p -= 9; /* ... + 0 */ + memcpy(ip1 + 0, p, 3); memcpy(ip1 + 3, p, 3); + memcpy(ip0 + 0, p, 3); memcpy(ip0 + 3, p, 3); + p += 3; /* ... + 3 */ + memcpy(ip1 + 6, p, 3); memcpy(ip1 + 9, p, 3); + memcpy(ip0 + 6, p, 3); memcpy(ip0 + 9, p, 3); + + } else if (flag & mask) { + if ((data + 4) > eod) + return AVERROR_INVALIDDATA; + + cb0 = strip->v4_codebook.rgb24[*data++]; + cb1 = strip->v4_codebook.rgb24[*data++]; + cb2 = strip->v4_codebook.rgb24[*data++]; + cb3 = strip->v4_codebook.rgb24[*data++]; + memcpy(ip3 + 0, cb2 + 6, 6); + memcpy(ip3 + 6, cb3 + 6, 6); + memcpy(ip2 + 0, cb2 + 0, 6); + memcpy(ip2 + 6, cb3 + 0, 6); + memcpy(ip1 + 0, cb0 + 6, 6); + memcpy(ip1 + 6, cb1 + 6, 6); + memcpy(ip0 + 0, cb0 + 0, 6); + memcpy(ip0 + 6, cb1 + 0, 6); + + } + } + + ip0 += 12; ip1 += 12; + ip2 += 12; ip3 += 12; + } + } + + return 0; +} + +static int cinepak_decode_vectors_rgb565 (CinepakContext *s, cvid_strip *strip, + int chunk_id, int size, const uint8_t *data) +{ + const uint8_t *eod; + uint32_t flag, mask; + uint16_t *cb0, *cb1, *cb2, *cb3; + int x, y; + char *ip0, *ip1, *ip2, *ip3; + int selective_update; + int v1_only; + + VECTOR_INTRO + + ip0 = ip1 = ip2 = ip3 = s->frame->data[0] + + strip->x1*2 + y*s->frame->linesize[0]; + +#define POINTER_TYPE uint16_t + VECTOR_DO +#undef POINTER_TYPE + + p = strip->v1_codebook.rgb565[*data++]; + * (uint16_t *)ip3 = *((uint16_t *)ip3+1) = + * (uint16_t *)ip2 = *((uint16_t *)ip2+1) = p[2]; + *((uint16_t *)ip3+2) = *((uint16_t *)ip3+3) = + *((uint16_t *)ip2+2) = *((uint16_t *)ip2+3) = p[3]; + * (uint16_t *)ip1 = *((uint16_t *)ip1+1) = + * (uint16_t *)ip0 = *((uint16_t *)ip0+1) = p[0]; + *((uint16_t *)ip1+2) = *((uint16_t *)ip1+3) = + *((uint16_t *)ip0+2) = *((uint16_t *)ip0+3) = p[1]; + + } else if (flag & mask) { + if ((data + 4) > eod) + return AVERROR_INVALIDDATA; + + cb0 = strip->v4_codebook.rgb565[*data++]; + cb1 = strip->v4_codebook.rgb565[*data++]; + cb2 = strip->v4_codebook.rgb565[*data++]; + cb3 = strip->v4_codebook.rgb565[*data++]; + memcpy(ip3 + 0, cb2 + 2, 4); + memcpy(ip3 + 4, cb3 + 2, 4); + memcpy(ip2 + 0, cb2 + 0, 4); + memcpy(ip2 + 4, cb3 + 0, 4); + memcpy(ip1 + 0, cb0 + 2, 4); + memcpy(ip1 + 4, cb1 + 2, 4); + memcpy(ip0 + 0, cb0 + 0, 4); + memcpy(ip0 + 4, cb1 + 0, 4); + + } + } + + ip0 += 8; ip1 += 8; + ip2 += 8; ip3 += 8; + } + } + + return 0; +} + +static int cinepak_decode_vectors_yuv420 (CinepakContext *s, cvid_strip *strip, + int chunk_id, int size, const uint8_t *data) +{ + const uint8_t *eod; + uint32_t flag, mask; + uint8_t *cb0, *cb1, *cb2, *cb3; + int x, y; + char *ip0, *ip1, *ip2, *ip3, + *up01, *up23, *vp01, *vp23; + int selective_update; + int v1_only; + + VECTOR_INTRO -/* take care of y dimension not being multiple of 4, such streams exist */ ip0 = ip1 = ip2 = ip3 = s->frame->data[0] + - (s->palette_video?strip->x1:strip->x1*3) + (y * s->frame->linesize[0]); + strip->x1*3 + y*s->frame->linesize[0]; + up01 = up23 = s->frame->data[1] + strip->x1 + y/2*s->frame->linesize[1]; + vp01 = vp23 = s->frame->data[2] + strip->x1 + y/2*s->frame->linesize[2]; if(s->avctx->height - y > 1) { ip1 = ip0 + s->frame->linesize[0]; if(s->avctx->height - y > 2) { ip2 = ip1 + s->frame->linesize[0]; + up23 = up01 + s->frame->linesize[1]; + vp23 = vp01 + s->frame->linesize[2]; if(s->avctx->height - y > 3) { ip3 = ip2 + s->frame->linesize[0]; } } } + /* to get the correct picture for not-multiple-of-4 cases let us fill each * block from the bottom up, thus possibly overwriting the bottommost line * more than once but ending with the correct data in place * (instead of in-loop checking) */ - for (x=strip->x1; x < strip->x2; x+=4) { - if ((chunk_id & 0x01) && !(mask >>= 1)) { - if ((data + 4) > eod) - return AVERROR_INVALIDDATA; +#define POINTER_TYPE uint8_t + VECTOR_STREAM_PARSING +#undef POINTER_TYPE - flag = AV_RB32 (data); - data += 4; - mask = 0x80000000; - } + p = strip->v1_codebook.yuv420[*data++]; + ip3[0] = ip3[1] = ip2[0] = ip2[1] = p[2]; + ip3[2] = ip3[3] = ip2[2] = ip2[3] = p[3]; + ip1[0] = ip1[1] = ip0[0] = ip0[1] = p[0]; + ip1[2] = ip1[3] = ip0[2] = ip0[3] = p[1]; + p += 4; + up01[0] = up01[1] = up23[0] = up23[1] = *p++; + vp01[0] = vp01[1] = vp23[0] = vp23[1] = *p++; - if (!(chunk_id & 0x01) || (flag & mask)) { - if (!(chunk_id & 0x02) && !(mask >>= 1)) { + } else if (flag & mask) { if ((data + 4) > eod) return AVERROR_INVALIDDATA; - flag = AV_RB32 (data); - data += 4; - mask = 0x80000000; + cb0 = strip->v4_codebook.yuv420[*data++]; + cb1 = strip->v4_codebook.yuv420[*data++]; + cb2 = strip->v4_codebook.yuv420[*data++]; + cb3 = strip->v4_codebook.yuv420[*data++]; + memcpy(ip3 + 0, cb2 + 2, 2); + memcpy(ip3 + 2, cb3 + 2, 2); + memcpy(ip2 + 0, cb2 + 0, 2); + memcpy(ip2 + 2, cb3 + 0, 2); + memcpy(ip1 + 0, cb0 + 2, 2); + memcpy(ip1 + 2, cb1 + 2, 2); + memcpy(ip0 + 0, cb0 + 0, 2); + memcpy(ip0 + 2, cb1 + 0, 2); + cb0 += 4; cb1 += 4; cb2 += 4; cb3 += 4; + up23[0] = *cb2++; vp23[0] = *cb2; + up23[1] = *cb3++; vp23[1] = *cb3; + up01[0] = *cb0++; vp01[0] = *cb0; + up01[1] = *cb1++; vp01[1] = *cb1; + } + } - if ((chunk_id & 0x02) || (~flag & mask)) { - uint8_t *p; - if (data >= eod) - return AVERROR_INVALIDDATA; + ip0 += 4; ip1 += 4; + ip2 += 4; ip3 += 4; + up01 += 2; up23 += 2; + vp01 += 2; vp23 += 2; + } + } - p = strip->v1_codebook[*data++]; - if (s->palette_video) { - ip3[0] = ip3[1] = ip2[0] = ip2[1] = p[6]; - ip3[2] = ip3[3] = ip2[2] = ip2[3] = p[9]; - ip1[0] = ip1[1] = ip0[0] = ip0[1] = p[0]; - ip1[2] = ip1[3] = ip0[2] = ip0[3] = p[3]; - } else { - p += 6; - memcpy(ip3 + 0, p, 3); memcpy(ip3 + 3, p, 3); - memcpy(ip2 + 0, p, 3); memcpy(ip2 + 3, p, 3); - p += 3; /* ... + 9 */ - memcpy(ip3 + 6, p, 3); memcpy(ip3 + 9, p, 3); - memcpy(ip2 + 6, p, 3); memcpy(ip2 + 9, p, 3); - p -= 9; /* ... + 0 */ - memcpy(ip1 + 0, p, 3); memcpy(ip1 + 3, p, 3); - memcpy(ip0 + 0, p, 3); memcpy(ip0 + 3, p, 3); - p += 3; /* ... + 3 */ - memcpy(ip1 + 6, p, 3); memcpy(ip1 + 9, p, 3); - memcpy(ip0 + 6, p, 3); memcpy(ip0 + 9, p, 3); - } + return 0; +} + +static int cinepak_decode_vectors_pal8 (CinepakContext *s, cvid_strip *strip, + int chunk_id, int size, const uint8_t *data) +{ + const uint8_t *eod; + uint32_t flag, mask; + uint8_t *cb0, *cb1, *cb2, *cb3; + int x, y; + char *ip0, *ip1, *ip2, *ip3; + int selective_update; + int v1_only; + + VECTOR_INTRO + + ip0 = ip1 = ip2 = ip3 = s->frame->data[0] + + strip->x1 + y*s->frame->linesize[0]; + +#define POINTER_TYPE uint8_t + VECTOR_DO +#undef POINTER_TYPE + + p = strip->v1_codebook.pal8[*data++]; + ip3[0] = ip3[1] = ip2[0] = ip2[1] = p[2]; + ip3[2] = ip3[3] = ip2[2] = ip2[3] = p[3]; + ip1[0] = ip1[1] = ip0[0] = ip0[1] = p[0]; + ip1[2] = ip1[3] = ip0[2] = ip0[3] = p[1]; } else if (flag & mask) { + uint8_t *p; if ((data + 4) > eod) return AVERROR_INVALIDDATA; - cb0 = strip->v4_codebook[*data++]; - cb1 = strip->v4_codebook[*data++]; - cb2 = strip->v4_codebook[*data++]; - cb3 = strip->v4_codebook[*data++]; - if (s->palette_video) { - uint8_t *p; - p = ip3; - *p++ = cb2[6]; - *p++ = cb2[9]; - *p++ = cb3[6]; - *p = cb3[9]; - p = ip2; - *p++ = cb2[0]; - *p++ = cb2[3]; - *p++ = cb3[0]; - *p = cb3[3]; - p = ip1; - *p++ = cb0[6]; - *p++ = cb0[9]; - *p++ = cb1[6]; - *p = cb1[9]; - p = ip0; - *p++ = cb0[0]; - *p++ = cb0[3]; - *p++ = cb1[0]; - *p = cb1[3]; - } else { - memcpy(ip3 + 0, cb2 + 6, 6); - memcpy(ip3 + 6, cb3 + 6, 6); - memcpy(ip2 + 0, cb2 + 0, 6); - memcpy(ip2 + 6, cb3 + 0, 6); - memcpy(ip1 + 0, cb0 + 6, 6); - memcpy(ip1 + 6, cb1 + 6, 6); - memcpy(ip0 + 0, cb0 + 0, 6); - memcpy(ip0 + 6, cb1 + 0, 6); - } + cb0 = strip->v4_codebook.pal8[*data++]; + cb1 = strip->v4_codebook.pal8[*data++]; + cb2 = strip->v4_codebook.pal8[*data++]; + cb3 = strip->v4_codebook.pal8[*data++]; + p = ip3; + *p++ = cb2[2]; + *p++ = cb2[3]; + *p++ = cb3[2]; + *p = cb3[3]; + p = ip2; + *p++ = cb2[0]; + *p++ = cb2[1]; + *p++ = cb3[0]; + *p = cb3[1]; + p = ip1; + *p++ = cb0[2]; + *p++ = cb0[3]; + *p++ = cb1[2]; + *p = cb1[3]; + p = ip0; + *p++ = cb0[0]; + *p++ = cb0[1]; + *p++ = cb1[0]; + *p = cb1[1]; } } - if (s->palette_video) { - ip0 += 4; ip1 += 4; - ip2 += 4; ip3 += 4; - } else { - ip0 += 12; ip1 += 12; - ip2 += 12; ip3 += 12; - } + ip0 += 4; ip1 += 4; + ip2 += 4; ip3 += 4; } } @@ -290,22 +766,22 @@ static int cinepak_decode_strip (CinepakContext *s, case 0x21: case 0x24: case 0x25: - cinepak_decode_codebook (strip->v4_codebook, chunk_id, - chunk_size, data); + s->decode_codebook(s, &strip->v4_codebook, + chunk_id, chunk_size, data); break; case 0x22: case 0x23: case 0x26: case 0x27: - cinepak_decode_codebook (strip->v1_codebook, chunk_id, - chunk_size, data); + s->decode_codebook (s, &strip->v1_codebook, + chunk_id, chunk_size, data); break; case 0x30: case 0x31: case 0x32: - return cinepak_decode_vectors (s, strip, chunk_id, + return s->decode_vectors (s, strip, chunk_id, chunk_size, data); } @@ -385,9 +861,9 @@ static int cinepak_decode (CinepakContext *s) strip_size = ((s->data + strip_size) > eod) ? (eod - s->data) : strip_size; if ((i > 0) && !(frame_flags & 0x01)) { - memcpy (s->strips[i].v4_codebook, s->strips[i-1].v4_codebook, + memcpy (&s->strips[i].v4_codebook, &s->strips[i-1].v4_codebook, sizeof(s->strips[i].v4_codebook)); - memcpy (s->strips[i].v1_codebook, s->strips[i-1].v1_codebook, + memcpy (&s->strips[i].v1_codebook, &s->strips[i-1].v1_codebook, sizeof(s->strips[i].v1_codebook)); } @@ -402,23 +878,85 @@ static int cinepak_decode (CinepakContext *s) return 0; } +/* given a palettized input */ +static const enum AVPixelFormat pixfmt_list[] = { + AV_PIX_FMT_RGB24, + AV_PIX_FMT_RGB32, + AV_PIX_FMT_RGB565, + AV_PIX_FMT_YUV420P, + AV_PIX_FMT_PAL8, /* only when input is palettized */ + AV_PIX_FMT_NONE +}; + +/* given a non-palettized input */ +static const enum AVPixelFormat pixfmt_list_2[] = { + AV_PIX_FMT_RGB24, + AV_PIX_FMT_RGB32, + AV_PIX_FMT_RGB565, + AV_PIX_FMT_YUV420P, + AV_PIX_FMT_NONE +}; + static av_cold int cinepak_decode_init(AVCodecContext *avctx) { CinepakContext *s = avctx->priv_data; +/* we take advantage of VQ to efficiently support + * multiple output formats */ + s->avctx = avctx; s->width = (avctx->width + 3) & ~3; s->height = (avctx->height + 3) & ~3; s->sega_film_skip_bytes = -1; /* uninitialized state */ - // check for paletted data - if (avctx->bits_per_coded_sample != 8) { - s->palette_video = 0; - avctx->pix_fmt = AV_PIX_FMT_RGB24; - } else { - s->palette_video = 1; - avctx->pix_fmt = AV_PIX_FMT_PAL8; + /* check for paletted data */ + s->palette_video = (avctx->bits_per_coded_sample == 8); + +/* If you are in a crisis, needing to influence the format choice + * in the decoder to workaround dumb/misbehaving applications, + * here would be the place to insert checking some dedicated + * environment variable -- rl + * NOTE that ffmpeg API policies as of 2017 STRONGLY DISCOURAGE + * taking such freedoms, do not bother the developers with this, + * even as an example such code is not welcome */ + + if (s->out_pixfmt != AV_PIX_FMT_NONE) /* the option is set to something */ + avctx->pix_fmt = s->out_pixfmt; + else + if (s->palette_video) + avctx->pix_fmt = ff_get_format(avctx, pixfmt_list); + else + avctx->pix_fmt = ff_get_format(avctx, pixfmt_list_2); + + switch (avctx->pix_fmt) { + case AV_PIX_FMT_RGB32: + s->decode_codebook = cinepak_decode_codebook_rgb32; + s->decode_vectors = cinepak_decode_vectors_rgb32; + break; + case AV_PIX_FMT_RGB24: + s->decode_codebook = cinepak_decode_codebook_rgb24; + s->decode_vectors = cinepak_decode_vectors_rgb24; + break; + case AV_PIX_FMT_RGB565: + s->decode_codebook = cinepak_decode_codebook_rgb565; + s->decode_vectors = cinepak_decode_vectors_rgb565; + break; + case AV_PIX_FMT_YUV420P: + s->decode_codebook = cinepak_decode_codebook_yuv420; + s->decode_vectors = cinepak_decode_vectors_yuv420; + break; + case AV_PIX_FMT_PAL8: + if (!s->palette_video) { + av_log(avctx, AV_LOG_ERROR, "Palettized output not supported without palettized input\n"); + return AVERROR(EINVAL); + } + s->decode_codebook = cinepak_decode_codebook_pal8; + s->decode_vectors = cinepak_decode_vectors_pal8; + break; + default: + av_log(avctx, AV_LOG_ERROR, "Unsupported pixel format %d\n", avctx->pix_fmt); + return AVERROR(EINVAL); } s->frame = av_frame_alloc(); @@ -457,7 +995,7 @@ static int cinepak_decode_frame(AVCodecContext *avctx, av_log(avctx, AV_LOG_ERROR, "cinepak_decode failed\n"); } - if (s->palette_video) + if (avctx->pix_fmt == AV_PIX_FMT_PAL8) memcpy (s->frame->data[1], s->pal, AVPALETTE_SIZE); if ((ret = av_frame_ref(data, s->frame)) < 0) @@ -488,4 +1026,6 @@ AVCodec ff_cinepak_decoder = { .close = cinepak_decode_end, .decode = cinepak_decode_frame, .capabilities = AV_CODEC_CAP_DR1, + .pix_fmts = pixfmt_list, + .priv_class = &cinepak_class, };
Hello, This is my best effort attempt to make the patch acceptable by the upstream's criteria. Daniel, do you mind that I referred to your message in the commit? I believe is is best to indicate numbers from a third party measurement. The code seems to be equvalent to the previous patch, with about 20% less LOC. This hurts readability (my subjective impression) but on the positive side the change makes the structure of the code more explicit. Attaching the patch. Now I have done what I can, have to leave. Unless there are bugs there in the patch, my attempt to contribute ends at this point. Thanks to everyone who cared to objectively discuss a specific case of ffmpeg usage, the implications of techniques around VQ and whether/why some non-traditional approaches can make sense. Good luck to the ffmpeg project, it is very useful and valuable. Best regards, Rune From 0c9badec5d144b995c0bb52c7a80939b672be3f5 Mon Sep 17 00:00:00 2001 From: Rl <addr-see-the-website@aetey.se> Date: Sat, 11 Feb 2017 20:28:54 +0100 Subject: [PATCH] Cinepak: speed up decoding several-fold, depending on the scenario, by supporting multiple output pixel formats. Decoding to rgb24 and pal8 is optimized. Added rgb32, rgb565, yuv420p, each with faster decoding than to rgb24. The most noticeable gain is achieved by the created possibility to skip format conversions, for example when decoding to rgb565 ---- Using matrixbench_mpeg2.mpg (720x567) encoded with ffmpeg into Cinepak using default settings, decoding on an i5 3570K, 3.4 GHz: bicubic (default): ~24x realtime fast_bilinear: ~65x realtime patch w/rgb565 override: ~154x realtime ---- (https://ffmpeg.org/pipermail/ffmpeg-devel/2017-February/206799.html) palettized input can be decoded to any of the output formats, pal8 output is still limited to palettized input with input other than palettized/grayscale yuv420 is approximated by the Cinepak colorspace The output format can be chosen at runtime by an option or via the API. --- libavcodec/cinepak.c | 844 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 692 insertions(+), 152 deletions(-)