[FFmpeg-devel] lavc/vc1dec: add multi-slice decoding support for hwaccel.

Message ID e60bc5d6-2e59-16b6-c36c-b49a81172fe5@gmail.com
State Superseded

Commit Message

Jun Zhao Nov. 11, 2016, 8:09 a.m. UTC
From 95eebc4d94a2f2db9f03e569b660d94ae083d26c Mon Sep 17 00:00:00 2001
From: Jun Zhao <jun.zhao@intel.com>
Date: Fri, 11 Nov 2016 16:05:57 +0800
Subject: [PATCH] lavc/vc1dec: add multi-slice decoding support for hwaccel.

Add multi-slice decoding support for hwaccel. So far this has only
been tested with VAAPI as the backend.

Reviewed-by: Jun Zhao <jun.zhao@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---
 libavcodec/vc1dec.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

Comments

Hendrik Leppkes Nov. 11, 2016, 8:29 a.m. UTC | #1
On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>

Do you have a sample file for this case? AFAIK all vc1 files I ever
saw worked with the DXVA2 hwaccel before, just want to make sure they
are not getting broken.

- Hendrik
Jun Zhao Nov. 14, 2016, 12:57 a.m. UTC | #2
On 2016/11/11 16:29, Hendrik Leppkes wrote:
> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>
> 
> Do you have a sample file for this case? AFAIK all vc1 files I ever
> saw worked with the DXVA2 hwaccel before, just want to make sure they
> are not getting broken.
> 
> - Hendrik

We used the file fate-suite/vc1/SA10091.vc1. You can get the files with
the command:
rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite

Thanks.

Mark Thompson Nov. 14, 2016, 9:31 p.m. UTC | #3
On 14/11/16 00:57, Jun Zhao wrote:
> On 2016/11/11 16:29, Hendrik Leppkes wrote:
>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>
>>
>> Do you have a sample file for this case? AFAIK all vc1 files I ever
>> saw worked with the DXVA2 hwaccel before, just want to make sure they
>> are not getting broken.
>>
>> - Hendrik
> 
> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with
> the command:
> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite.

Can you describe your test setup(s) a bit more?

I had a go at testing this with VAAPI.

With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it.

With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it.

I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass.  Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change).

More detailed results below.

Thanks,

- Mark



Without patch, i965 + Skylake GT2:

$ make HWACCEL='vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format yuv420p' fate-vc1
TEST    vc1_sa00040
TEST    vc1_sa00050
TEST    vc1_sa10091
--- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091     2016-05-09 17:55:19.599803161 +0100
+++ tests/data/fate/vc1_sa10091 2016-11-14 21:18:10.550918661 +0000
[...]
Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details.
/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed
make: *** [fate-vc1_sa10091] Error 1


Without patch, mesa + Polaris 11:

$ LIBVA_DRIVER_NAME=radeonsi make HWACCEL='vaapi -vaapi_device /dev/dri/renderD129 -hwaccel_output_format yuv420p' fate-vc1
TEST    vc1_sa00040
TEST    vc1_sa00050
TEST    vc1_sa10091
TEST    vc1_sa10143
--- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10143     2016-05-09 17:55:19.599803161 +0100
+++ tests/data/fate/vc1_sa10143 2016-11-14 21:17:03.712198274 +0000
[...]
Test vc1_sa10143 failed. Look at tests/data/fate/vc1_sa10143.err for details.
/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10143' failed
make: *** [fate-vc1_sa10143] Error 1


With patch, i965 + Skylake GT2:

$ make HWACCEL='vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format yuv420p' fate-vc1
TEST    vc1_sa00040
TEST    vc1_sa00050
TEST    vc1_sa10091
^C^C/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed
make: *** [fate-vc1_sa10091] Interrupt

dmesg output:
[1300652.142872] [drm] stuck on bsd ring
[1300652.143138] [drm] GPU HANG: ecode 9:2:0xcbfcffe7, in ffmpeg [31688], reason: Engine(s) hung, action: reset
[1300652.144959] drm/i915: Resetting chip after gpu hang
[1300654.130765] [drm] RC6 on
[1300660.106392] [drm] stuck on bsd ring
[1300660.106927] [drm] GPU HANG: ecode 9:2:0xa8dfbffd, reason: Engine(s) hung, action: reset
[1300660.107078] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
[1300660.109931] drm/i915: Resetting chip after gpu hang
[1300661.142622] [drm] RC6 on


With patch, mesa + Polaris 11:

$ LIBVA_DRIVER_NAME=radeonsi make HWACCEL='vaapi -vaapi_device /dev/dri/renderD129 -hwaccel_output_format yuv420p' fate-vc1
TEST    vc1_sa00040
TEST    vc1_sa00050
TEST    vc1_sa10091
--- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091     2016-05-09 17:55:19.599803161 +0100
+++ tests/data/fate/vc1_sa10091 2016-11-14 21:20:24.324357577 +0000
[...]
Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details.
/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed
make: *** [fate-vc1_sa10091] Error 1
Hendrik Leppkes Nov. 14, 2016, 10:15 p.m. UTC | #4
On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson <sw@jkqxz.net> wrote:
> On 14/11/16 00:57, Jun Zhao wrote:
>> On 2016/11/11 16:29, Hendrik Leppkes wrote:
>>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>>
>>>
>>> Do you have a sample file for this case? AFAIK all vc1 files I ever
>>> saw worked with the DXVA2 hwaccel before, just want to make sure they
>>> are not getting broken.
>>>
>>> - Hendrik
>>
>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with
>> the command:
>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite.
>
> Can you describe your test setup(s) a bit more?
>
> I had a go at testing this with VAAPI.
>
> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it.
>
> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it.
>
> I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass.  Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change).
>
> More detailed results below.

For the record, this makes DXVA2 decoding break more as well. Decoding
is not correct before this patch on that sample (only the first slice
looks OK), but afterwards it's entirely broken, with error messages to boot.

I have some experience with vc1 hwaccel, at least with dxva2, and if I
find some time I might look into what might be needed to make it work,
but this patch seems to have issues.

- Hendrik
Hendrik Leppkes Nov. 14, 2016, 10:21 p.m. UTC | #5
On Mon, Nov 14, 2016 at 11:15 PM, Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson <sw@jkqxz.net> wrote:
>> On 14/11/16 00:57, Jun Zhao wrote:
>>> On 2016/11/11 16:29, Hendrik Leppkes wrote:
>>>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>>>
>>>>
>>>> Do you have a sample file for this case? AFAIK all vc1 files I ever
>>>> saw worked with the DXVA2 hwaccel before, just want to make sure they
>>>> are not getting broken.
>>>>
>>>> - Hendrik
>>>
>>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with
>>> the command:
>>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite.
>>
>> Can you describe your test setup(s) a bit more?
>>
>> I had a go at testing this with VAAPI.
>>
>> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it.
>>
>> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it.
>>
>> I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass.  Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change).
>>
>> More detailed results below.
>
> For the record, this makes DXVA2 decoding break more as well. Decoding
> is not correct before this patch on that sample (only the first slice
> looks ok), but after its entirely broken, with error messages to boot.
>
> I have some experience with vc1 hwaccel, at least with dxva2, and if I
> find some time I might look into what might be needed to make it work,
> but this patch seems to have issues.
>

After a quick look - one key problem is that hwaccels typically want
raw/escaped buffers, but the buffers in the slice GetBitContext are
unescaped, so that won't work.
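
To illustrate the difference: the raw bitstream carries emulation
prevention bytes (a 0x03 inserted after 0x00 0x00 when the next byte is
0x03 or less), and the software decoder strips them before parsing. A
minimal sketch of that stripping step follows. The function name is
hypothetical and this is not the decoder's actual helper, only an
illustration of why sizes and offsets in the unescaped copy no longer
match the raw data a hwaccel expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy src to dst, dropping each emulation
 * prevention byte, i.e. a 0x03 that follows 0x00 0x00 and precedes a
 * byte in the range 0x00..0x03.
 * Returns the number of bytes written; dst must hold at least size bytes. */
static size_t unescape_sketch(const uint8_t *src, size_t size, uint8_t *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < size; i++) {
        if (i >= 2 && i + 1 < size &&
            src[i] == 0x03 && src[i - 1] == 0x00 && src[i - 2] == 0x00 &&
            src[i + 1] <= 0x03)
            continue; /* skip the inserted 0x03 */
        dst[out++] = src[i];
    }
    return out;
}
```

For the input 00 00 03 01 AA this writes 00 00 01 AA, so the unescaped
buffer is one byte shorter than the raw one and any pointer into it
refers to different data than the hardware would see.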

- Hendrik
Mark Thompson Nov. 14, 2016, 10:28 p.m. UTC | #6
On 14/11/16 21:31, Mark Thompson wrote:
> On 14/11/16 00:57, Jun Zhao wrote:
>> On 2016/11/11 16:29, Hendrik Leppkes wrote:
>>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>>
>>>
>>> Do you have a sample file for this case? AFAIK all vc1 files I ever
>>> saw worked with the DXVA2 hwaccel before, just want to make sure they
>>> are not getting broken.
>>>
>>> - Hendrik
>>
>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with
>> the command:
>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite.
> 
> Can you describe your test setup(s) a bit more?
> 
> I had a go at testing this with VAAPI.
> 
> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it.
> 
> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it.
> 
> I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass.  Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change).
> 
> More detailed results below.
> 

VDPAU has the same results as VAAPI.


Without patch, mesa + Polaris 11:

$ DISPLAY=:0 make HWACCEL='vdpau' fate-vc1
TEST    vc1_sa00040
TEST    vc1_sa00050
TEST    vc1_sa10091
TEST    vc1_sa10143
--- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10143     2016-05-09 17:55:19.599803161 +0100
+++ tests/data/fate/vc1_sa10143 2016-11-14 22:27:41.903250981 +0000
[...]
Test vc1_sa10143 failed. Look at tests/data/fate/vc1_sa10143.err for details.
/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10143' failed
make: *** [fate-vc1_sa10143] Error 1


With patch, mesa + Polaris 11:

$ DISPLAY=:0 make HWACCEL='vdpau' fate-vc1
TEST    vc1_sa00040
TEST    vc1_sa00050
TEST    vc1_sa10091
--- /home/mrt/video/ffmpeg/vaapi/tests/ref/fate/vc1_sa10091     2016-05-09 17:55:19.599803161 +0100
+++ tests/data/fate/vc1_sa10091 2016-11-14 22:26:17.108873397 +0000
[...]
Test vc1_sa10091 failed. Look at tests/data/fate/vc1_sa10091.err for details.
/home/mrt/video/ffmpeg/vaapi/tests/Makefile:218: recipe for target 'fate-vc1_sa10091' failed
make: *** [fate-vc1_sa10091] Error 1
James Almer Nov. 14, 2016, 10:47 p.m. UTC | #7
On 11/14/2016 7:15 PM, Hendrik Leppkes wrote:
> On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson <sw@jkqxz.net> wrote:
>> On 14/11/16 00:57, Jun Zhao wrote:
>>> On 2016/11/11 16:29, Hendrik Leppkes wrote:
>>>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>>>
>>>>
>>>> Do you have a sample file for this case? AFAIK all vc1 files I ever
>>>> saw worked with the DXVA2 hwaccel before, just want to make sure they
>>>> are not getting broken.
>>>>
>>>> - Hendrik
>>>
>>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with
>>> the command:
>>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite.
>>
>> Can you describe your test setup(s) a bit more?
>>
>> I had a go at testing this with VAAPI.
>>
>> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it.
>>
>> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it.
>>
>> I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass.  Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change).
>>
>> More detailed results below.
> 
> For the record, this makes DXVA2 decoding break more as well. Decoding
> is not correct before this patch on that sample (only the first slice
> looks ok), but after its entirely broken, with error messages to boot.
> 
> I have some experience with vc1 hwaccel, at least with dxva2, and if I
> find some time I might look into what might be needed to make it work,
> but this patch seems to have issues.
> 
> - Hendrik

On a first gen GCN GPU, vc1_sa10091 passes but vc1_sa10143 fails using
DXVA2 and a recent driver.

Did not test with this patch applied.
Hendrik Leppkes Nov. 15, 2016, midnight UTC | #8
On Mon, Nov 14, 2016 at 11:47 PM, James Almer <jamrial@gmail.com> wrote:
> On 11/14/2016 7:15 PM, Hendrik Leppkes wrote:
>> On Mon, Nov 14, 2016 at 10:31 PM, Mark Thompson <sw@jkqxz.net> wrote:
>>> On 14/11/16 00:57, Jun Zhao wrote:
>>>> On 2016/11/11 16:29, Hendrik Leppkes wrote:
>>>>> On Fri, Nov 11, 2016 at 9:09 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>>>>
>>>>>
>>>>> Do you have a sample file for this case? AFAIK all vc1 files I ever
>>>>> saw worked with the DXVA2 hwaccel before, just want to make sure they
>>>>> are not getting broken.
>>>>>
>>>>> - Hendrik
>>>>
>>>> We used the file fate-suite/vc1/SA10091.vc1, you can get the files with
>>>> the command:
>>>> rsync -aL rsync://fate-suite.ffmpeg.org:/fate-suite/ fate-suite.
>>>
>>> Can you describe your test setup(s) a bit more?
>>>
>>> I had a go at testing this with VAAPI.
>>>
>>> With the i965 driver on Skylake GT2, fate-vc1_sa10091 fails cleanly without the patch, but gives a GPU hang with it.
>>>
>>> With the mesa driver on Polaris 11, fate-vc1_sa10091 passes without the patch, but fails with it.
>>>
>>> I haven't really looked at VC-1 decode much before, so I'm not sure which of these tests should pass.  Still, the GPU hang is certainly bad (and possibly the fault of the driver, but it would be helpful to be sure of that if we want to apply this sort of change).
>>>
>>> More detailed results below.
>>
>> For the record, this makes DXVA2 decoding break more as well. Decoding
>> is not correct before this patch on that sample (only the first slice
>> looks ok), but after its entirely broken, with error messages to boot.
>>
>> I have some experience with vc1 hwaccel, at least with dxva2, and if I
>> find some time I might look into what might be needed to make it work,
>> but this patch seems to have issues.
>>
>> - Hendrik
>
> On a first gen GCN GPU, vc1_sa10091 passes but vc1_sa10143 fails using
> DXVA2 and a recent driver.
>
> Did not test with this patch applied.
>

I made 10091 work on my NVIDIA with DXVA2 by rewriting hwaccel slice
support from scratch. 10143 still fails.
I'll post patches once I have it all figured out.

- Hendrik
Carl Eugen Hoyos Nov. 15, 2016, 12:19 a.m. UTC | #9
2016-11-14 23:47 GMT+01:00 James Almer <jamrial@gmail.com>:

> but vc1_sa10143 fails using DXVA2 and a recent driver.

I suspect it actually passes with DXVA2: FFmpeg is not
bit-exact for vc1.

Carl Eugen

Correct output:
0,          0,          0,        1,   518400, 0x34fa7f55
0,          1,          1,        1,   518400, 0x60466bc1
0,          2,          2,        1,   518400, 0xe68dff1e
0,          3,          3,        1,   518400, 0x790ac06a
0,          4,          4,        1,   518400, 0xb3b26b27
0,          5,          5,        1,   518400, 0x8840096c
0,          6,          6,        1,   518400, 0xf75c3d61
0,          7,          7,        1,   518400, 0xca071781
0,          8,          8,        1,   518400, 0xa8e6edf9
0,          9,          9,        1,   518400, 0xabb61984
0,         10,         10,        1,   518400, 0x0b31dedd
0,         11,         11,        1,   518400, 0xf44378ef
0,         12,         12,        1,   518400, 0xf7268996
0,         13,         13,        1,   518400, 0x8c5b1ff4
0,         14,         14,        1,   518400, 0xda356fd2
0,         15,         15,        1,   518400, 0x0e091c57
0,         16,         16,        1,   518400, 0x17645e68
0,         17,         17,        1,   518400, 0xf47a71ef
0,         18,         18,        1,   518400, 0x6c440498
0,         19,         19,        1,   518400, 0xd705bd32
0,         20,         20,        1,   518400, 0x0800edd0
0,         21,         21,        1,   518400, 0x902be119
0,         22,         22,        1,   518400, 0x0f7d7bc4
0,         23,         23,        1,   518400, 0x9f4dc421
0,         24,         24,        1,   518400, 0x3b8c8d5a
0,         25,         25,        1,   518400, 0xbcdfb2b9
0,         26,         26,        1,   518400, 0xa02a46c3
0,         27,         27,        1,   518400, 0x8ecde915
0,         28,         28,        1,   518400, 0x20576bfd
0,         29,         29,        1,   518400, 0xac40bc36
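
Since a mismatch against the FATE reference does not by itself prove
the hwaccel is wrong here, it can help to compare two outputs line by
line. A rough sketch, assuming each line has the six fields shown above
(stream, dts, pts, duration, size, checksum). The struct and function
names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of the listing above; field names are illustrative. */
struct crc_line {
    int stream;
    long long dts, pts, duration;
    int size;
    unsigned int checksum;
};

/* Parse "stream, dts, pts, duration, size, 0xchecksum".
 * Returns 0 on success, -1 if the line does not have six fields. */
static int parse_crc_line(const char *text, struct crc_line *out)
{
    int n;

    assert(text != NULL && out != NULL);
    n = sscanf(text, "%d,%lld,%lld,%lld,%d,%x",
               &out->stream, &out->dts, &out->pts, &out->duration,
               &out->size, &out->checksum);
    return n == 6 ? 0 : -1;
}

/* Returns 1 if both lines describe the same frame with the same
 * checksum, 0 if they differ, -1 if either line cannot be parsed. */
static int same_frame_output(const char *ref, const char *got)
{
    struct crc_line a, b;

    if (parse_crc_line(ref, &a) < 0 || parse_crc_line(got, &b) < 0)
        return -1;
    return a.stream == b.stream && a.pts == b.pts &&
           a.size == b.size && a.checksum == b.checksum;
}
```

A result of 0 only says the two outputs differ. Which of them is
correct still has to be decided separately.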
Hendrik Leppkes Nov. 15, 2016, 12:39 a.m. UTC | #10
On Tue, Nov 15, 2016 at 1:19 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
> 2016-11-14 23:47 GMT+01:00 James Almer <jamrial@gmail.com>:
>
>> but vc1_sa10143 fails using DXVA2 and a recent driver.
>
> I suspect it actually passes with DXVA2: FFmpeg is not
> bit-exact for vc1.

Looks like you are right, those are the hashes I get as well.

In any case, I have a working WIP patch that fixes sa10091 and sa20021
with DXVA2, which were broken before.
I'll clean it up tomorrow and send it for testing.

Unfortunately I don't have a sample for field mode with slices, so
that remains un-implemented. If someone comes across such a thing,
that would be nice to have.

- Hendrik
Hendrik Leppkes Nov. 15, 2016, 12:07 p.m. UTC | #11
On Tue, Nov 15, 2016 at 1:39 AM, Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 1:19 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>> 2016-11-14 23:47 GMT+01:00 James Almer <jamrial@gmail.com>:
>>
>>> but vc1_sa10143 fails using DXVA2 and a recent driver.
>>
>> I suspect it actually passes with DXVA2: FFmpeg is not
>> bit-exact for vc1.
>
> Looks like you are right, thats the hashes I get as well.
>
> In any case, I have a working WIP patch that fixes sa10091 and sa20021
> with DXVA2, which were broken before.
> I'll clean it up tomorrow and send it for testing.
>
> Unfortunately I don't have a sample for field mode with slices, so
> that remains un-implemented. If someone comes across such a thing,
> that would be nice to have.
>

Here is my current work in progress:
https://github.com/Nevcairiel/FFmpeg/commits/vc1slices

It fixes sa10091 and sa20021 on NVIDIA with DXVA2 for me. Note that
sa10143 breakage is from the software decoder not being bitexact, and
not a hwaccel failure (it doesn't even use slices).

Appreciate any testing on other hardware or on VAAPI/VDPAU.

- Hendrik
James Almer Nov. 15, 2016, 1:21 p.m. UTC | #12
On 11/15/2016 9:07 AM, Hendrik Leppkes wrote:
> On Tue, Nov 15, 2016 at 1:39 AM, Hendrik Leppkes <h.leppkes@gmail.com> wrote:
>> On Tue, Nov 15, 2016 at 1:19 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>>> 2016-11-14 23:47 GMT+01:00 James Almer <jamrial@gmail.com>:
>>>
>>>> but vc1_sa10143 fails using DXVA2 and a recent driver.
>>>
>>> I suspect it actually passes with DXVA2: FFmpeg is not
>>> bit-exact for vc1.
>>
>> Looks like you are right, thats the hashes I get as well.
>>
>> In any case, I have a working WIP patch that fixes sa10091 and sa20021
>> with DXVA2, which were broken before.
>> I'll clean it up tomorrow and send it for testing.
>>
>> Unfortunately I don't have a sample for field mode with slices, so
>> that remains un-implemented. If someone comes across such a thing,
>> that would be nice to have.
>>
> 
> Here is my current work in progress:
> https://github.com/Nevcairiel/FFmpeg/commits/vc1slices
> 
> It fixes sa10091 and sa20021 on NVIDIA with DXVA2 for me. Note that
> sa10143 breakage is from the software decoder not being bitexact, and
> not a hwaccel failure (it doesn't even use slices).
> 
> Appreciate any testing on other hardware or on VAAPI/VDPAU.
> 
> - Hendrik

On the same setup I mentioned yesterday, I get the same results with
your patch as without it.
All tests pass except sa10143 (where I get the hashes Carl said are the
actual bit-exact ones) and vc1_ilaced_twomv, where I get the following:

-0,          0,          0,        1,  3110400, 0x764f8856
-0,          2,          2,        1,  3110400, 0x3b615b79
-0,          3,          3,        1,  3110400, 0x4fbb6f84
-0,          4,          4,        1,  3110400, 0xc1ca8532
-0,          5,          5,        1,  3110400, 0xb6e7d363
-0,          6,          6,        1,  3110400, 0x1beb5c34
-0,          7,          7,        1,  3110400, 0xcb8cb061
-0,          8,          8,        1,  3110400, 0x13ddbd61
-0,          9,          9,        1,  3110400, 0xde8f052f
-0,         10,         10,        1,  3110400, 0x4d4072db
-0,         11,         11,        1,  3110400, 0x4e5d29e3
-0,         12,         12,        1,  3110400, 0x75300531
-0,         13,         13,        1,  3110400, 0x1114285a
+0,          0,          0,        1,  3110400, 0xc95e8861
+0,          2,          2,        1,  3110400, 0xf58b5cbf
+0,          3,          3,        1,  3110400, 0x2f866f33
+0,          4,          4,        1,  3110400, 0x05c18415
+0,          5,          5,        1,  3110400, 0x4077ca93
+0,          6,          6,        1,  3110400, 0x44d105fc
+0,          7,          7,        1,  3110400, 0xa0608374
+0,          8,          8,        1,  3110400, 0x407689dc
+0,          9,          9,        1,  3110400, 0x4707d00a
+0,         10,         10,        1,  3110400, 0x74986831
+0,         11,         11,        1,  3110400, 0xa5912619
+0,         12,         12,        1,  3110400, 0x44aa5565
+0,         13,         13,        1,  3110400, 0xb9752774

Maybe this one is the same situation as with sa10143?
Carl Eugen Hoyos Nov. 15, 2016, 1:28 p.m. UTC | #13
2016-11-15 14:21 GMT+01:00 James Almer <jamrial@gmail.com>:
> and vc1_ilaced_twomv

0,          0,          0,        1,  3110400, 0xc95e8861
0,          1,          1,        1,  3110400, 0xf58b5cbf
0,          2,          2,        1,  3110400, 0x2f866f33
0,          3,          3,        1,  3110400, 0x05c18415
0,          4,          4,        1,  3110400, 0x4077ca93
0,          5,          5,        1,  3110400, 0x44d105fc
0,          6,          6,        1,  3110400, 0xa0608374
0,          7,          7,        1,  3110400, 0x407689dc
0,          8,          8,        1,  3110400, 0x4707d00a
0,          9,          9,        1,  3110400, 0x74986831
0,         10,         10,        1,  3110400, 0xa5912619
0,         11,         11,        1,  3110400, 0x44aa5565
0,         12,         12,        1,  3110400, 0xb9752774

Carl Eugen
Hendrik Leppkes Nov. 15, 2016, 1:46 p.m. UTC | #14
On Tue, Nov 15, 2016 at 2:21 PM, James Almer <jamrial@gmail.com> wrote:
> On 11/15/2016 9:07 AM, Hendrik Leppkes wrote:
>> On Tue, Nov 15, 2016 at 1:39 AM, Hendrik Leppkes <h.leppkes@gmail.com> wrote:
>>> On Tue, Nov 15, 2016 at 1:19 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>>>> 2016-11-14 23:47 GMT+01:00 James Almer <jamrial@gmail.com>:
>>>>
>>>>> but vc1_sa10143 fails using DXVA2 and a recent driver.
>>>>
>>>> I suspect it actually passes with DXVA2: FFmpeg is not
>>>> bit-exact for vc1.
>>>
>>> Looks like you are right, thats the hashes I get as well.
>>>
>>> In any case, I have a working WIP patch that fixes sa10091 and sa20021
>>> with DXVA2, which were broken before.
>>> I'll clean it up tomorrow and send it for testing.
>>>
>>> Unfortunately I don't have a sample for field mode with slices, so
>>> that remains un-implemented. If someone comes across such a thing,
>>> that would be nice to have.
>>>
>>
>> Here is my current work in progress:
>> https://github.com/Nevcairiel/FFmpeg/commits/vc1slices
>>
>> It fixes sa10091 and sa20021 on NVIDIA with DXVA2 for me. Note that
>> sa10143 breakage is from the software decoder not being bitexact, and
>> not a hwaccel failure (it doesn't even use slices).
>>
>> Appreciate any testing on other hardware or on VAAPI/VDPAU.
>>
>> - Hendrik
>
> On the same setup i mentioned yesterday, with your patch i get the same
> results as without it.
> All pass except sa10143 (Where i get the hashes Carl mentioned are the
> actually bitexact ones) and vc1_ilaced_twomv, where i get the following
>
> -0,          0,          0,        1,  3110400, 0x764f8856
> -0,          2,          2,        1,  3110400, 0x3b615b79
> -0,          3,          3,        1,  3110400, 0x4fbb6f84
> -0,          4,          4,        1,  3110400, 0xc1ca8532
> -0,          5,          5,        1,  3110400, 0xb6e7d363
> -0,          6,          6,        1,  3110400, 0x1beb5c34
> -0,          7,          7,        1,  3110400, 0xcb8cb061
> -0,          8,          8,        1,  3110400, 0x13ddbd61
> -0,          9,          9,        1,  3110400, 0xde8f052f
> -0,         10,         10,        1,  3110400, 0x4d4072db
> -0,         11,         11,        1,  3110400, 0x4e5d29e3
> -0,         12,         12,        1,  3110400, 0x75300531
> -0,         13,         13,        1,  3110400, 0x1114285a
> +0,          0,          0,        1,  3110400, 0xc95e8861
> +0,          2,          2,        1,  3110400, 0xf58b5cbf
> +0,          3,          3,        1,  3110400, 0x2f866f33
> +0,          4,          4,        1,  3110400, 0x05c18415
> +0,          5,          5,        1,  3110400, 0x4077ca93
> +0,          6,          6,        1,  3110400, 0x44d105fc
> +0,          7,          7,        1,  3110400, 0xa0608374
> +0,          8,          8,        1,  3110400, 0x407689dc
> +0,          9,          9,        1,  3110400, 0x4707d00a
> +0,         10,         10,        1,  3110400, 0x74986831
> +0,         11,         11,        1,  3110400, 0xa5912619
> +0,         12,         12,        1,  3110400, 0x44aa5565
> +0,         13,         13,        1,  3110400, 0xb9752774
>
> Maybe this one is the same situation as with sa10143?
>

Yes, that one also uses field coding, which is not bit-exact in our
decoder, so a mismatch is expected.
Thanks for confirming I didn't break anything on other setups, or
more precisely, that I fixed last night's breakage again :)

- Hendrik
Mark Thompson Nov. 15, 2016, 10:21 p.m. UTC | #15
On 15/11/16 12:07, Hendrik Leppkes wrote:
> On Tue, Nov 15, 2016 at 1:39 AM, Hendrik Leppkes <h.leppkes@gmail.com> wrote:
>> On Tue, Nov 15, 2016 at 1:19 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>>> 2016-11-14 23:47 GMT+01:00 James Almer <jamrial@gmail.com>:
>>>
>>>> but vc1_sa10143 fails using DXVA2 and a recent driver.
>>>
>>> I suspect it actually passes with DXVA2: FFmpeg is not
>>> bit-exact for vc1.
>>
>> Looks like you are right, thats the hashes I get as well.
>>
>> In any case, I have a working WIP patch that fixes sa10091 and sa20021
>> with DXVA2, which were broken before.
>> I'll clean it up tomorrow and send it for testing.
>>
>> Unfortunately I don't have a sample for field mode with slices, so
>> that remains un-implemented. If someone comes across such a thing,
>> that would be nice to have.
>>
> 
> Here is my current work in progress:
> https://github.com/Nevcairiel/FFmpeg/commits/vc1slices
> 
> It fixes sa10091 and sa20021 on NVIDIA with DXVA2 for me. Note that
> sa10143 breakage is from the software decoder not being bitexact, and
> not a hwaccel failure (it doesn't even use slices).
> 
> Appreciate any testing on other hardware or on VAAPI/VDPAU.

Sample:                  SA00040  SA00050  SA10091  SA10143  SA20021  ilaced_twomv  ism

Without patch:
Polaris 11    VDPAU      p        p        p        p*       p        p*            p
              VAAPI      p        p        p        f        p        f             p
Skylake GT2   VAAPI      p        p        f        n        f        n             p

With patch:
Polaris 11    VDPAU      p        p        p        p*       p        p*            p
              VAAPI      p        p        f        f        f        f             p
Skylake GT2   VAAPI      p        p        p        n        p        n             p

p  = passes the fate test
p* = fail (output looks sensible and matches hashes Carl posted elsewhere in this thread)
f  = fail (output produced but is totally incorrect)
n  = fail (no output - fails to decode at all)


- Mark
Patch

diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
index 4f78aa8..1c8f03c 100644
--- a/libavcodec/vc1dec.c
+++ b/libavcodec/vc1dec.c
@@ -953,8 +953,24 @@  static int vc1_decode_frame(AVCodecContext *avctx, void *data,
             s->picture_structure = PICT_FRAME;
             if ((ret = avctx->hwaccel->start_frame(avctx, buf_start, (buf + buf_size) - buf_start)) < 0)
                 goto err;
-            if ((ret = avctx->hwaccel->decode_slice(avctx, buf_start, (buf + buf_size) - buf_start)) < 0)
-                goto err;
+            if (n_slices == 0) {
+                if ((ret = avctx->hwaccel->decode_slice(avctx, buf_start, (buf + buf_size) - buf_start)) < 0)
+                    goto err;
+            } else {
+                int i;
+                ret = avctx->hwaccel->decode_slice(avctx, s->gb.buffer, s->gb.buffer_end - s->gb.buffer);
+                if (ret < 0)
+                    goto err;
+                for (i = 0 ; i < n_slices; i++) {
+                     s->gb = slices[i].gb;
+                     s->mb_y = slices[i].mby_start;
+                     if (get_bits(&s->gb, 1))
+                         ff_vc1_parse_frame_header_adv(v, &s->gb);
+                     ret = avctx->hwaccel->decode_slice(avctx, s->gb.buffer, s->gb.buffer_end - s->gb.buffer);
+                     if (ret < 0)
+                         goto err;
+                }
+            }
             if ((ret = avctx->hwaccel->end_frame(avctx)) < 0)
                 goto err;
         }
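
For comparison, one possible shape for per-slice submission is to hand
the backend consecutive ranges of the raw (still escaped) frame data
instead of the GetBitContext buffers used above. The sketch below is
self-contained and uses mock types only. The struct names, the
raw_slice record and the assumption that a backend wants exactly these
ranges are all hypothetical and not taken from the patch or from any
real hwaccel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three hwaccel callbacks used above. */
struct mock_hwaccel {
    int (*start_frame)(void *ctx, const uint8_t *buf, size_t size);
    int (*decode_slice)(void *ctx, const uint8_t *buf, size_t size);
    int (*end_frame)(void *ctx);
};

/* Hypothetical slice record: where the slice starts in the raw data. */
struct raw_slice {
    const uint8_t *start;
};

/* Submit the frame as n_slices + 1 consecutive raw ranges:
 * [frame, slice 0), [slice 0, slice 1), ..., [last slice, frame end).
 * With n_slices == 0 this is a single call covering the whole frame. */
static int submit_frame(const struct mock_hwaccel *hw, void *ctx,
                        const uint8_t *frame, size_t frame_size,
                        const struct raw_slice *slices, int n_slices)
{
    const uint8_t *frame_end = frame + frame_size;
    const uint8_t *cur = frame;
    int ret;

    assert(hw != NULL && frame != NULL && n_slices >= 0);
    if ((ret = hw->start_frame(ctx, frame, frame_size)) < 0)
        return ret;
    for (int i = 0; i <= n_slices; i++) {
        const uint8_t *next = i < n_slices ? slices[i].start : frame_end;
        if ((ret = hw->decode_slice(ctx, cur, (size_t)(next - cur))) < 0)
            return ret;
        cur = next;
    }
    return hw->end_frame(ctx);
}

/* Minimal recorder callbacks for observing the call sequence. */
struct call_log {
    int starts, slices, ends;
    size_t slice_bytes;
};

static int log_start(void *ctx, const uint8_t *buf, size_t size)
{
    (void)buf; (void)size;
    ((struct call_log *)ctx)->starts++;
    return 0;
}

static int log_slice(void *ctx, const uint8_t *buf, size_t size)
{
    struct call_log *log = ctx;
    (void)buf;
    log->slices++;
    log->slice_bytes += size;
    return 0;
}

static int log_end(void *ctx)
{
    ((struct call_log *)ctx)->ends++;
    return 0;
}
```

With a 10-byte frame and two slices starting at offsets 4 and 7, the
recorder sees one start_frame, three decode_slice calls covering 4, 3
and 3 bytes, and one end_frame.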