diff mbox

[FFmpeg-devel,5/5] avcodec/vc1: store zero MVs for all blocks in a MB

Message ID 3784d0c2-03b8-2533-83ea-72c60b857b3a@carpalis.nl
State New
Headers show

Commit Message

Jerome Borsboom May 18, 2018, 3:06 p.m. UTC
Direct prediction for interlace frame B pictures references the mv in the
second block in an MB in the backward reference frame for the twomv case.
When the backward reference frame is an I frame, this value may be unset.

Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
---
 libavcodec/vc1_block.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Michael Niedermayer May 19, 2018, 9:40 p.m. UTC | #1
On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote:
> Direct prediction for interlace frame B pictures references the mv in the
> second block in an MB in the backward reference frame for the twomv case.
> When the backward reference frame is an I frame, this value may be unset.
> 
> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
> ---
>  libavcodec/vc1_block.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
> index 74935ec9e9..9c170a1e3c 100644
> --- a/libavcodec/vc1_block.c
> +++ b/libavcodec/vc1_block.c
> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v)
>              s->bdsp.clear_blocks(block[0]);
>              mb_pos = s->mb_x + s->mb_y * s->mb_stride;
>              s->current_picture.mb_type[mb_pos + v->mb_off]                         = MB_TYPE_INTRA;
> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0;
> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0;
> +            for (int i = 0; i < 4; i++) {
> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0;
> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0;
> +            }

see AV_ZERO*

[...]
James Almer May 19, 2018, 11:23 p.m. UTC | #2
On 5/19/2018 6:40 PM, Michael Niedermayer wrote:
> On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote:
>> Direct prediction for interlace frame B pictures references the mv in the
>> second block in an MB in the backward reference frame for the twomv case.
>> When the backward reference frame is an I frame, this value may be unset.
>>
>> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
>> ---
>>  libavcodec/vc1_block.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
>> index 74935ec9e9..9c170a1e3c 100644
>> --- a/libavcodec/vc1_block.c
>> +++ b/libavcodec/vc1_block.c
>> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v)
>>              s->bdsp.clear_blocks(block[0]);
>>              mb_pos = s->mb_x + s->mb_y * s->mb_stride;
>>              s->current_picture.mb_type[mb_pos + v->mb_off]                         = MB_TYPE_INTRA;
>> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0;
>> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0;
>> +            for (int i = 0; i < 4; i++) {
>> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0;
>> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0;
>> +            }
> 
> see AV_ZERO*

Those may use mmx instructions, which may require calling emms_c() after
the loop.
memset() may be a better choice here. For a small amount of bytes the
compiler should be able to optimize it into an inlined mov/xor on its own.

Or we could get rid of the mmx implementations from both AV_ZERO* and
AV_COPY* and forget about it.
Michael Niedermayer May 20, 2018, 12:13 a.m. UTC | #3
On Sat, May 19, 2018 at 08:23:59PM -0300, James Almer wrote:
> On 5/19/2018 6:40 PM, Michael Niedermayer wrote:
> > On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote:
> >> Direct prediction for interlace frame B pictures references the mv in the
> >> second block in an MB in the backward reference frame for the twomv case.
> >> When the backward reference frame is an I frame, this value may be unset.
> >>
> >> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
> >> ---
> >>  libavcodec/vc1_block.c | 6 ++++--
> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
> >> index 74935ec9e9..9c170a1e3c 100644
> >> --- a/libavcodec/vc1_block.c
> >> +++ b/libavcodec/vc1_block.c
> >> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v)
> >>              s->bdsp.clear_blocks(block[0]);
> >>              mb_pos = s->mb_x + s->mb_y * s->mb_stride;
> >>              s->current_picture.mb_type[mb_pos + v->mb_off]                         = MB_TYPE_INTRA;
> >> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0;
> >> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0;
> >> +            for (int i = 0; i < 4; i++) {
> >> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0;
> >> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0;
> >> +            }
> > 
> > see AV_ZERO*
> 
> Those may use mmx instructions, which may require calling emms_c() after
> the loop.
> memset() may be a better choice here. For a small amount of bytes the
> compiler should be able to optimize it into an inlined mov/xor on its own.

This calls clear_blocks() a few lines before which can use mmx too unless iam
missing something


[...]
James Almer May 20, 2018, 1:45 a.m. UTC | #4
On 5/19/2018 9:13 PM, Michael Niedermayer wrote:
> On Sat, May 19, 2018 at 08:23:59PM -0300, James Almer wrote:
>> On 5/19/2018 6:40 PM, Michael Niedermayer wrote:
>>> On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote:
>>>> Direct prediction for interlace frame B pictures references the mv in the
>>>> second block in an MB in the backward reference frame for the twomv case.
>>>> When the backward reference frame is an I frame, this value may be unset.
>>>>
>>>> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl>
>>>> ---
>>>>  libavcodec/vc1_block.c | 6 ++++--
>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
>>>> index 74935ec9e9..9c170a1e3c 100644
>>>> --- a/libavcodec/vc1_block.c
>>>> +++ b/libavcodec/vc1_block.c
>>>> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v)
>>>>              s->bdsp.clear_blocks(block[0]);
>>>>              mb_pos = s->mb_x + s->mb_y * s->mb_stride;
>>>>              s->current_picture.mb_type[mb_pos + v->mb_off]                         = MB_TYPE_INTRA;
>>>> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0;
>>>> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0;
>>>> +            for (int i = 0; i < 4; i++) {
>>>> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0;
>>>> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0;
>>>> +            }
>>>
>>> see AV_ZERO*
>>
>> Those may use mmx instructions, which may require calling emms_c() after
>> the loop.
>> memset() may be a better choice here. For a small amount of bytes the
>> compiler should be able to optimize it into an inlined mov/xor on its own.
> 
> This calls clear_blocks() a few lines before which can use mmx too unless iam
> missing something

I see no emms in either ff_clear_blocks_mmx() or anywhere in
vc1_block.c, so either nobody tried this decoder on machines without
sse, or it's a non issue.

Keep in mind AV_ZERO* will choose to use mmx or not at compile time, not
runtime. So an x86_32 build with -msse (but not -msse2) for example
would use it, regardless of it being run on a PIII or a Haswell.
Carl Eugen Hoyos May 20, 2018, 7:54 a.m. UTC | #5
2018-05-20 3:45 GMT+02:00, James Almer <jamrial@gmail.com>:

> I see no emms in either ff_clear_blocks_mmx() or anywhere in
> vc1_block.c, so either nobody tried this decoder on machines
> without sse

I do have access to an SSE-only system, I doubt FFmpeg was
used or tested on MMX-only systems in a very, very long time.

musl doesn't work regardless of this particular possible issue,
so it cannot be used to test it.

Carl Eugen
Jerome Borsboom May 20, 2018, 9:05 a.m. UTC | #6
>> Direct prediction for interlace frame B pictures references the mv in the
>> second block in an MB in the backward reference frame for the twomv case.
>> When the backward reference frame is an I frame, this value may be unset.
>> 
>> Signed-off-by: Jerome Borsboom <jerome.borsboom at carpalis.nl>
>> ---
>>  libavcodec/vc1_block.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>> 
>> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
>> index 74935ec9e9..9c170a1e3c 100644
>> --- a/libavcodec/vc1_block.c
>> +++ b/libavcodec/vc1_block.c
>> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v)
>>              s->bdsp.clear_blocks(block[0]);
>>              mb_pos = s->mb_x + s->mb_y * s->mb_stride;
>>              s->current_picture.mb_type[mb_pos + v->mb_off]                         = MB_TYPE_INTRA;
>> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0;
>> -            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0;
>> +            for (int i = 0; i < 4; i++) {
>> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0;
>> +                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0;
>> +            }
> 
> see AV_ZERO*

This style of setting motion_val to zero is used all over the VC-1
decoder. vc1_decode_p_mb_intfr uses the exact same code. Changing to
AV_ZERO may certainly a good point for improvement, however, for the
sake of consistency the proposed code might be preferable.

Regards,
Jerome
Carl Eugen Hoyos May 28, 2018, 9:52 p.m. UTC | #7
2018-05-20 11:05 GMT+02:00, Jerome Borsboom <jerome.borsboom@carpalis.nl>:
>>> Direct prediction for interlace frame B pictures references the mv in the
>>> second block in an MB in the backward reference frame for the twomv case.
>>> When the backward reference frame is an I frame, this value may be unset.
>>>
>>> Signed-off-by: Jerome Borsboom <jerome.borsboom at carpalis.nl>
>>> ---
>>>  libavcodec/vc1_block.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
>>> index 74935ec9e9..9c170a1e3c 100644
>>> --- a/libavcodec/vc1_block.c
>>> +++ b/libavcodec/vc1_block.c
>>> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v)
>>>              s->bdsp.clear_blocks(block[0]);
>>>              mb_pos = s->mb_x + s->mb_y * s->mb_stride;
>>>              s->current_picture.mb_type[mb_pos + v->mb_off]
>>>           = MB_TYPE_INTRA;
>>> -            s->current_picture.motion_val[1][s->block_index[0] +
>>> v->blocks_off][0] = 0;
>>> -            s->current_picture.motion_val[1][s->block_index[0] +
>>> v->blocks_off][1] = 0;
>>> +            for (int i = 0; i < 4; i++) {
>>> +                s->current_picture.motion_val[1][s->block_index[i] +
>>> v->blocks_off][0] = 0;
>>> +                s->current_picture.motion_val[1][s->block_index[i] +
>>> v->blocks_off][1] = 0;
>>> +            }
>>
>> see AV_ZERO*
>
> This style of setting motion_val to zero is used all over the VC-1
> decoder. vc1_decode_p_mb_intfr uses the exact same code. Changing to
> AV_ZERO may certainly a good point for improvement, however, for the
> sake of consistency the proposed code might be preferable.

Patch applied.

Thank you, Carl Eugen
diff mbox

Patch

diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c
index 74935ec9e9..9c170a1e3c 100644
--- a/libavcodec/vc1_block.c
+++ b/libavcodec/vc1_block.c
@@ -2678,8 +2678,10 @@  static void vc1_decode_i_blocks_adv(VC1Context *v)
             s->bdsp.clear_blocks(block[0]);
             mb_pos = s->mb_x + s->mb_y * s->mb_stride;
             s->current_picture.mb_type[mb_pos + v->mb_off]                         = MB_TYPE_INTRA;
-            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0;
-            s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0;
+            for (int i = 0; i < 4; i++) {
+                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0;
+                s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0;
+            }
 
             // do actual MB decoding and displaying
             if (v->fcm == ILACE_FRAME && v->fieldtx_is_raw)