[FFmpeg-devel,4/6] avcodec/hnm4video: Optimize postprocess_current_frame()

Submitted by Michael Niedermayer on Aug. 3, 2019, 4:57 p.m.

Details

Message ID 20190803165739.GY3219@michaelspb
State New

Commit Message

Michael Niedermayer Aug. 3, 2019, 4:57 p.m.
On Sat, Aug 03, 2019 at 04:07:22PM +0200, Tomas Härdin wrote:
> Sat 2019-08-03 at 01:49 +0200, Michael Niedermayer wrote:
> > -    uint32_t x, y, src_x, src_y;
> > +    uint32_t x, y, src_y;
> > +    int width = hnm->width;
> >  
> >      for (y = 0; y < hnm->height; y++) {
> > +        uint8_t *dst = hnm->processed + y * width;
> > +        const uint8_t *src = hnm->current;
> >          src_y = y - (y % 2);
> > -        src_x = src_y * hnm->width + (y % 2);
> > -        for (x = 0; x < hnm->width; x++) {
> > -            hnm->processed[(y * hnm->width) + x] = hnm->current[src_x];
> > -            src_x += 2;
> > +        src += src_y * width + (y % 2);
> > +        for (x = 0; x < width; x++) {
> > +            dst[x] = *src;
> > +            src += 2;
> 
> Looks OK. Maybe telling the compiler that src and dst don't alias would
> be worthwhile?

I can add restrict keywords if you want?



[...]

Comments

Tomas Härdin Aug. 5, 2019, 9:58 a.m.
Sat 2019-08-03 at 18:57 +0200, Michael Niedermayer wrote:
> On Sat, Aug 03, 2019 at 04:07:22PM +0200, Tomas Härdin wrote:
> > Sat 2019-08-03 at 01:49 +0200, Michael Niedermayer wrote:
> > > -    uint32_t x, y, src_x, src_y;
> > > +    uint32_t x, y, src_y;
> > > +    int width = hnm->width;
> > >  
> > >      for (y = 0; y < hnm->height; y++) {
> > > +        uint8_t *dst = hnm->processed + y * width;
> > > +        const uint8_t *src = hnm->current;
> > >          src_y = y - (y % 2);
> > > -        src_x = src_y * hnm->width + (y % 2);
> > > -        for (x = 0; x < hnm->width; x++) {
> > > -            hnm->processed[(y * hnm->width) + x] = hnm->current[src_x];
> > > -            src_x += 2;
> > > +        src += src_y * width + (y % 2);
> > > +        for (x = 0; x < width; x++) {
> > > +            dst[x] = *src;
> > > +            src += 2;
> > 
> > Looks OK. Maybe telling the compiler that src and dst don't alias would
> > be worthwhile?
> 
> i can add restrict keywords if you want:
> ?
> 
> diff --git a/libavcodec/hnm4video.c b/libavcodec/hnm4video.c
> index 68d0baef6d..1c2501afab 100644
> --- a/libavcodec/hnm4video.c
> +++ b/libavcodec/hnm4video.c
> @@ -121,8 +121,8 @@ static void postprocess_current_frame(AVCodecContext *avctx)
>      int width = hnm->width;
>  
>      for (y = 0; y < hnm->height; y++) {
> -        uint8_t *dst = hnm->processed + y * width;
> -        const uint8_t *src = hnm->current;
> +        uint8_t * restrict dst = hnm->processed + y * width;
> +        const uint8_t * restrict src = hnm->current;

Does it improve performance? Else there's little point

/Tomas

Michael Niedermayer Aug. 6, 2019, 10:37 p.m.
On Mon, Aug 05, 2019 at 11:58:01AM +0200, Tomas Härdin wrote:
> Sat 2019-08-03 at 18:57 +0200, Michael Niedermayer wrote:
> > On Sat, Aug 03, 2019 at 04:07:22PM +0200, Tomas Härdin wrote:
> > > Sat 2019-08-03 at 01:49 +0200, Michael Niedermayer wrote:
> > > > -    uint32_t x, y, src_x, src_y;
> > > > +    uint32_t x, y, src_y;
> > > > +    int width = hnm->width;
> > > >  
> > > >      for (y = 0; y < hnm->height; y++) {
> > > > +        uint8_t *dst = hnm->processed + y * width;
> > > > +        const uint8_t *src = hnm->current;
> > > >          src_y = y - (y % 2);
> > > > -        src_x = src_y * hnm->width + (y % 2);
> > > > -        for (x = 0; x < hnm->width; x++) {
> > > > -            hnm->processed[(y * hnm->width) + x] = hnm->current[src_x];
> > > > -            src_x += 2;
> > > > +        src += src_y * width + (y % 2);
> > > > +        for (x = 0; x < width; x++) {
> > > > +            dst[x] = *src;
> > > > +            src += 2;
> > > 
> > > Looks OK. Maybe telling the compiler that src and dst don't alias would
> > > be worthwhile?
> > 
> > i can add restrict keywords if you want:
> > ?
> > 
> > diff --git a/libavcodec/hnm4video.c b/libavcodec/hnm4video.c
> > index 68d0baef6d..1c2501afab 100644
> > --- a/libavcodec/hnm4video.c
> > +++ b/libavcodec/hnm4video.c
> > @@ -121,8 +121,8 @@ static void postprocess_current_frame(AVCodecContext *avctx)
> >      int width = hnm->width;
> >  
> >      for (y = 0; y < hnm->height; y++) {
> > -        uint8_t *dst = hnm->processed + y * width;
> > -        const uint8_t *src = hnm->current;
> > +        uint8_t * restrict dst = hnm->processed + y * width;
> > +        const uint8_t * restrict src = hnm->current;
> 
> Does it improve performance? Else there's little point

I cannot measure a performance difference for it; it's within 1%.

thx

[...]


diff --git a/libavcodec/hnm4video.c b/libavcodec/hnm4video.c
index 68d0baef6d..1c2501afab 100644
--- a/libavcodec/hnm4video.c
+++ b/libavcodec/hnm4video.c
@@ -121,8 +121,8 @@  static void postprocess_current_frame(AVCodecContext *avctx)
     int width = hnm->width;
 
     for (y = 0; y < hnm->height; y++) {
-        uint8_t *dst = hnm->processed + y * width;
-        const uint8_t *src = hnm->current;
+        uint8_t * restrict dst = hnm->processed + y * width;
+        const uint8_t * restrict src = hnm->current;
         src_y = y - (y % 2);
         src += src_y * width + (y % 2);
         for (x = 0; x < width; x++) {