Message ID | 20220713204716.3114529-1-martin@martin.st |
---|---|
State | Accepted |
Series | [FFmpeg-devel,1/2] x86: Don't hardcode the height to 8 in sad8_xy2_mmx |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Wed, 13 Jul 2022, Martin Storsjö wrote:

> The height is hardcoded in some of the me_cmp functions, but not
> in all of them. But in the case of all other functions, it's hardcoded
> in the same place in SIMD functions as in the C reference functions,
> while this one function differs from the behaviour of the C code.
>
> (Before 542765ce3eccbca587d54262a512cbdb1407230d, there were a
> couple other sad8_*_mmx functions with similar hardcoded height.)
> ---
>  libavcodec/x86/me_cmp_init.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
> index 61e9396b8f..dcc2621276 100644
> --- a/libavcodec/x86/me_cmp_init.c
> +++ b/libavcodec/x86/me_cmp_init.c
> @@ -202,13 +202,12 @@ static inline int sum_mmx(void)
>  static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
>                              uint8_t *blk1, ptrdiff_t stride, int h) \
>  { \
> -    av_assert2(h == 8); \
>      __asm__ volatile ( \
>          "pxor %%mm7, %%mm7 \n\t" \
>          "pxor %%mm6, %%mm6 \n\t" \
>          ::); \
>  \
> -    sad8_4_ ## suf(blk1, blk2, stride, 8); \
> +    sad8_4_ ## suf(blk1, blk2, stride, h); \
>  \
>      return sum_ ## suf(); \
>  } \
> --
> 2.25.1

Ping, does this seem reasonable? Michael indicated a desire to make the
me_cmp functions more general and flexible than what they are today, and
this would be a first step to making checkasm test such cases.

// Martin

On Thu, Aug 04, 2022 at 10:47:34AM +0300, Martin Storsjö wrote:
> On Wed, 13 Jul 2022, Martin Storsjö wrote:
> > [...]
>
> Ping, does this seem reasonable? Michael indicated a desire to make the
> me_cmp functions more general and flexible than what they are today, and
> this would be a first step to making checkasm test such cases.

LGTM, assuming it doesn't have any problematic performance impact.

thx

[...]

On Thu, 4 Aug 2022, Michael Niedermayer wrote:

> On Thu, Aug 04, 2022 at 10:47:34AM +0300, Martin Storsjö wrote:
>> On Wed, 13 Jul 2022, Martin Storsjö wrote:
>>> [...]
>>
>> Ping, does this seem reasonable? Michael indicated a desire to make the
>> me_cmp functions more general and flexible than what they are today, and
>> this would be a first step to making checkasm test such cases.
>
> LGTM, assuming it doesn't have any problematic performance impact.

Thanks - I didn't notice any significant change in the checkasm bench
numbers for it.

// Martin

diff --git a/libavcodec/x86/me_cmp_init.c b/libavcodec/x86/me_cmp_init.c
index 61e9396b8f..dcc2621276 100644
--- a/libavcodec/x86/me_cmp_init.c
+++ b/libavcodec/x86/me_cmp_init.c
@@ -202,13 +202,12 @@ static inline int sum_mmx(void)
 static int sad8_xy2_ ## suf(MpegEncContext *v, uint8_t *blk2, \
                             uint8_t *blk1, ptrdiff_t stride, int h) \
 { \
-    av_assert2(h == 8); \
     __asm__ volatile ( \
         "pxor %%mm7, %%mm7 \n\t" \
         "pxor %%mm6, %%mm6 \n\t" \
         ::); \
 \
-    sad8_4_ ## suf(blk1, blk2, stride, 8); \
+    sad8_4_ ## suf(blk1, blk2, stride, h); \
 \
     return sum_ ## suf(); \
 } \
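
For readers less familiar with me_cmp, here is a rough scalar sketch of why the hardcoded height matters. It is not FFmpeg's code: sad8_xy2_model, sad8_xy2_fixed8 and demo_sad are hypothetical names made up for illustration, and the model is plain C rather than MMX. It assumes the xy2 variant compares each pixel against the rounded average of its 2x2 half-pel neighbourhood in the reference block.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Rounded average of four neighbouring pixels (the xy2 half-pel position). */
static int avg4(int a, int b, int c, int d)
{
    return (a + b + c + d + 2) >> 2;
}

/* Hypothetical scalar model: SAD of an 8-wide block against the reference
 * interpolated at (x + 1/2, y + 1/2), accumulated over h rows. */
static int sad8_xy2_model(const uint8_t *cur, const uint8_t *ref,
                          ptrdiff_t stride, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        const uint8_t *below = ref + stride;
        for (int x = 0; x < 8; x++)
            sum += abs(cur[x] - avg4(ref[x], ref[x + 1],
                                     below[x], below[x + 1]));
        cur += stride;
        ref += stride;
    }
    return sum;
}

/* Models the pre-patch behaviour: h is accepted but ignored,
 * and 8 rows are always processed. */
static int sad8_xy2_fixed8(const uint8_t *cur, const uint8_t *ref,
                           ptrdiff_t stride, int h)
{
    (void)h;
    return sad8_xy2_model(cur, ref, stride, 8);
}

/* Small driver: cur is all zeros and ref is all ones, so every pixel
 * contributes exactly 1 to the sum (8 per row). */
static int demo_sad(int h, int honour_h)
{
    enum { STRIDE = 16, ROWS = 17 }; /* xy2 reads one extra row and column */
    uint8_t cur[STRIDE * ROWS];
    uint8_t ref[STRIDE * ROWS];

    memset(cur, 0, sizeof(cur));
    memset(ref, 1, sizeof(ref));
    return honour_h ? sad8_xy2_model(cur, ref, STRIDE, h)
                    : sad8_xy2_fixed8(cur, ref, STRIDE, h);
}
```

With h == 8 the two variants agree, which is why existing callers would not notice the difference. With h == 16 the fixed-height variant covers only half of the rows, so a test that compares the SIMD function against the C reference for heights other than 8 would report a mismatch.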