Message ID | 3784d0c2-03b8-2533-83ea-72c60b857b3a@carpalis.nl |
---|---|
State | New |
Headers | show |
On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote: > Direct prediction for interlace frame B pictures references the mv in the > second block in an MB in the backward reference frame for the twomv case. > When the backward reference frame is an I frame, this value may be unset. > > Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl> > --- > libavcodec/vc1_block.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c > index 74935ec9e9..9c170a1e3c 100644 > --- a/libavcodec/vc1_block.c > +++ b/libavcodec/vc1_block.c > @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) > s->bdsp.clear_blocks(block[0]); > mb_pos = s->mb_x + s->mb_y * s->mb_stride; > s->current_picture.mb_type[mb_pos + v->mb_off] = MB_TYPE_INTRA; > - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0; > - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0; > + for (int i = 0; i < 4; i++) { > + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0; > + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0; > + } see AV_ZERO* [...]
On 5/19/2018 6:40 PM, Michael Niedermayer wrote: > On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote: >> Direct prediction for interlace frame B pictures references the mv in the >> second block in an MB in the backward reference frame for the twomv case. >> When the backward reference frame is an I frame, this value may be unset. >> >> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl> >> --- >> libavcodec/vc1_block.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c >> index 74935ec9e9..9c170a1e3c 100644 >> --- a/libavcodec/vc1_block.c >> +++ b/libavcodec/vc1_block.c >> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) >> s->bdsp.clear_blocks(block[0]); >> mb_pos = s->mb_x + s->mb_y * s->mb_stride; >> s->current_picture.mb_type[mb_pos + v->mb_off] = MB_TYPE_INTRA; >> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0; >> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0; >> + for (int i = 0; i < 4; i++) { >> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0; >> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0; >> + } > > see AV_ZERO* Those may use mmx instructions, which may require calling emms_c() after the loop. memset() may be a better choice here. For a small amount of bytes the compiler should be able to optimize it into an inlined mov/xor on its own. Or we could get rid of the mmx implementations from both AV_ZERO* and AV_COPY* and forget about it.
On Sat, May 19, 2018 at 08:23:59PM -0300, James Almer wrote: > On 5/19/2018 6:40 PM, Michael Niedermayer wrote: > > On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote: > >> Direct prediction for interlace frame B pictures references the mv in the > >> second block in an MB in the backward reference frame for the twomv case. > >> When the backward reference frame is an I frame, this value may be unset. > >> > >> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl> > >> --- > >> libavcodec/vc1_block.c | 6 ++++-- > >> 1 file changed, 4 insertions(+), 2 deletions(-) > >> > >> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c > >> index 74935ec9e9..9c170a1e3c 100644 > >> --- a/libavcodec/vc1_block.c > >> +++ b/libavcodec/vc1_block.c > >> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) > >> s->bdsp.clear_blocks(block[0]); > >> mb_pos = s->mb_x + s->mb_y * s->mb_stride; > >> s->current_picture.mb_type[mb_pos + v->mb_off] = MB_TYPE_INTRA; > >> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0; > >> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0; > >> + for (int i = 0; i < 4; i++) { > >> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0; > >> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0; > >> + } > > > > see AV_ZERO* > > Those may use mmx instructions, which may require calling emms_c() after > the loop. > memset() may be a better choice here. For a small amount of bytes the > compiler should be able to optimize it into an inlined mov/xor on its own. This calls clear_blocks() a few lines before which can use mmx too unless iam missing something [...]
On 5/19/2018 9:13 PM, Michael Niedermayer wrote: > On Sat, May 19, 2018 at 08:23:59PM -0300, James Almer wrote: >> On 5/19/2018 6:40 PM, Michael Niedermayer wrote: >>> On Fri, May 18, 2018 at 05:06:36PM +0200, Jerome Borsboom wrote: >>>> Direct prediction for interlace frame B pictures references the mv in the >>>> second block in an MB in the backward reference frame for the twomv case. >>>> When the backward reference frame is an I frame, this value may be unset. >>>> >>>> Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl> >>>> --- >>>> libavcodec/vc1_block.c | 6 ++++-- >>>> 1 file changed, 4 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c >>>> index 74935ec9e9..9c170a1e3c 100644 >>>> --- a/libavcodec/vc1_block.c >>>> +++ b/libavcodec/vc1_block.c >>>> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) >>>> s->bdsp.clear_blocks(block[0]); >>>> mb_pos = s->mb_x + s->mb_y * s->mb_stride; >>>> s->current_picture.mb_type[mb_pos + v->mb_off] = MB_TYPE_INTRA; >>>> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0; >>>> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0; >>>> + for (int i = 0; i < 4; i++) { >>>> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0; >>>> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0; >>>> + } >>> >>> see AV_ZERO* >> >> Those may use mmx instructions, which may require calling emms_c() after >> the loop. >> memset() may be a better choice here. For a small amount of bytes the >> compiler should be able to optimize it into an inlined mov/xor on its own. > > This calls clear_blocks() a few lines before which can use mmx too unless iam > missing something I see no emms in either ff_clear_blocks_mmx() or anywhere in vc1_block.c, so either nobody tried this decoder on machines without sse, or it's a non issue. Keep in mind AV_ZERO* will choose to use mmx or not at compile time, not runtime. So an x86_32 build with -msse (but not -msse2) for example would use it, regardless of it being run on a PIII or a Haswell.
2018-05-20 3:45 GMT+02:00, James Almer <jamrial@gmail.com>: > I see no emms in either ff_clear_blocks_mmx() or anywhere in > vc1_block.c, so either nobody tried this decoder on machines > without sse I do have access to an SSE-only system, I doubt FFmpeg was used or tested on MMX-only systems in a very, very long time. musl doesn't work regardless of this particular possible issue, so it cannot be used to test it. Carl Eugen
>> Direct prediction for interlace frame B pictures references the mv in the >> second block in an MB in the backward reference frame for the twomv case. >> When the backward reference frame is an I frame, this value may be unset. >> >> Signed-off-by: Jerome Borsboom <jerome.borsboom at carpalis.nl> >> --- >> libavcodec/vc1_block.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c >> index 74935ec9e9..9c170a1e3c 100644 >> --- a/libavcodec/vc1_block.c >> +++ b/libavcodec/vc1_block.c >> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) >> s->bdsp.clear_blocks(block[0]); >> mb_pos = s->mb_x + s->mb_y * s->mb_stride; >> s->current_picture.mb_type[mb_pos + v->mb_off] = MB_TYPE_INTRA; >> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0; >> - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0; >> + for (int i = 0; i < 4; i++) { >> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0; >> + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0; >> + } > > see AV_ZERO* This style of setting motion_val to zero is used all over the VC-1 decoder. vc1_decode_p_mb_intfr uses the exact same code. Changing to AV_ZERO may certainly a good point for improvement, however, for the sake of consistency the proposed code might be preferable. Regards, Jerome
2018-05-20 11:05 GMT+02:00, Jerome Borsboom <jerome.borsboom@carpalis.nl>: >>> Direct prediction for interlace frame B pictures references the mv in the >>> second block in an MB in the backward reference frame for the twomv case. >>> When the backward reference frame is an I frame, this value may be unset. >>> >>> Signed-off-by: Jerome Borsboom <jerome.borsboom at carpalis.nl> >>> --- >>> libavcodec/vc1_block.c | 6 ++++-- >>> 1 file changed, 4 insertions(+), 2 deletions(-) >>> >>> diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c >>> index 74935ec9e9..9c170a1e3c 100644 >>> --- a/libavcodec/vc1_block.c >>> +++ b/libavcodec/vc1_block.c >>> @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) >>> s->bdsp.clear_blocks(block[0]); >>> mb_pos = s->mb_x + s->mb_y * s->mb_stride; >>> s->current_picture.mb_type[mb_pos + v->mb_off] >>> = MB_TYPE_INTRA; >>> - s->current_picture.motion_val[1][s->block_index[0] + >>> v->blocks_off][0] = 0; >>> - s->current_picture.motion_val[1][s->block_index[0] + >>> v->blocks_off][1] = 0; >>> + for (int i = 0; i < 4; i++) { >>> + s->current_picture.motion_val[1][s->block_index[i] + >>> v->blocks_off][0] = 0; >>> + s->current_picture.motion_val[1][s->block_index[i] + >>> v->blocks_off][1] = 0; >>> + } >> >> see AV_ZERO* > > This style of setting motion_val to zero is used all over the VC-1 > decoder. vc1_decode_p_mb_intfr uses the exact same code. Changing to > AV_ZERO may certainly a good point for improvement, however, for the > sake of consistency the proposed code might be preferable. Patch applied. Thank you, Carl Eugen
diff --git a/libavcodec/vc1_block.c b/libavcodec/vc1_block.c index 74935ec9e9..9c170a1e3c 100644 --- a/libavcodec/vc1_block.c +++ b/libavcodec/vc1_block.c @@ -2678,8 +2678,10 @@ static void vc1_decode_i_blocks_adv(VC1Context *v) s->bdsp.clear_blocks(block[0]); mb_pos = s->mb_x + s->mb_y * s->mb_stride; s->current_picture.mb_type[mb_pos + v->mb_off] = MB_TYPE_INTRA; - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][0] = 0; - s->current_picture.motion_val[1][s->block_index[0] + v->blocks_off][1] = 0; + for (int i = 0; i < 4; i++) { + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][0] = 0; + s->current_picture.motion_val[1][s->block_index[i] + v->blocks_off][1] = 0; + } // do actual MB decoding and displaying if (v->fcm == ILACE_FRAME && v->fieldtx_is_raw)
Direct prediction for interlace frame B pictures references the mv in the second block in an MB in the backward reference frame for the twomv case. When the backward reference frame is an I frame, this value may be unset. Signed-off-by: Jerome Borsboom <jerome.borsboom@carpalis.nl> --- libavcodec/vc1_block.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)