diff mbox

[FFmpeg-devel,1/5] lavc : yami : add libyami decoder/encoder

Message ID 2b717d09-d570-b4dc-bbf4-c55962ca9199@gmail.com
State Rejected
Headers show

Commit Message

Jun Zhao Aug. 15, 2016, 8:22 a.m. UTC
add libyami decoder/encoder/vpp in ffmpeg; for the build steps,
please refer to the link: https://github.com/01org/ffmpeg_libyami/wiki/Build
From 7147fdb375cb7241d69823d8b9b6e94f66df3a32 Mon Sep 17 00:00:00 2001
From: Jun Zhao <jun.zhao@intel.com>
Date: Mon, 15 Aug 2016 15:36:14 +0800
Subject: [PATCH 1/5] lavc : yami : add libyami decoder/encoder.

add libyami decoder/encoder in ffmpeg, supported
decoder:
    - libyami mpeg2
    - libyami vc1
    - libyami vp8
    - libyami vp9
    - libyami h264
    - libyami h265
supported encoder:
    - libyami vp8
    - libyami h264

Signed-off-by: Jun Zhao <jun.zhao@intel.com>
---
 Makefile                   |   1 +
 configure                  |  27 +++
 ffmpeg.c                   |   4 +
 ffmpeg.h                   |   1 +
 ffmpeg_libyami.c           |  85 +++++++
 libavcodec/Makefile        |   8 +
 libavcodec/allcodecs.c     |   6 +
 libavcodec/libyami.cpp     | 429 +++++++++++++++++++++++++++++++++++
 libavcodec/libyami.h       |  59 +++++
 libavcodec/libyami_dec.cpp | 527 +++++++++++++++++++++++++++++++++++++++++++
 libavcodec/libyami_dec.h   |  56 +++++
 libavcodec/libyami_enc.cpp | 551 +++++++++++++++++++++++++++++++++++++++++++++
 libavcodec/libyami_enc.h   |  70 ++++++
 libavutil/pixdesc.c        |   4 +
 libavutil/pixfmt.h         |   5 +
 15 files changed, 1833 insertions(+)
 create mode 100644 ffmpeg_libyami.c
 create mode 100644 libavcodec/libyami.cpp
 create mode 100644 libavcodec/libyami.h
 create mode 100644 libavcodec/libyami_dec.cpp
 create mode 100644 libavcodec/libyami_dec.h
 create mode 100644 libavcodec/libyami_enc.cpp
 create mode 100644 libavcodec/libyami_enc.h

Comments

Hendrik Leppkes Aug. 15, 2016, 9:46 a.m. UTC | #1
On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com> wrote:
> add libyami decoder/encoder/vpp in ffmpeg, about build step,
> please refer to the link: https://github.com/01org/ffmpeg_libyami/wiki/Build
>

We've had patches for yami before, and they were not applied because
many developers did not agree with adding more wrappers for the same
hardware decoders which we already support.
Please refer to the discussion in this thread:
https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html

The concerns and reasons brought up there should not really have changed.

- Hendrik
Jean-Baptiste Kempf Aug. 15, 2016, 5:48 p.m. UTC | #2
On 15 Aug, Hendrik Leppkes wrote :
> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com> wrote:
> > add libyami decoder/encoder/vpp in ffmpeg, about build step,
> > please refer to the link: https://github.com/01org/ffmpeg_libyami/wiki/Build
> >
> 
> We've had patches for yami before, and they were not applied because
> many developers did not agree with adding more wrappers for the same
> hardware decoders which we already support.
> Please refer to the discussion in this thread:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
> 
> The concerns and reasons brought up there should not really have changed.

I still object very strongly against yami.

It is a library that does not bring much that we could not do ourselves,
it duplicates a lot of our code, it is the wrong level of abstraction
for libavcodec, it is using a bad license and there is no guarantee of
maintainership in the future.
Jun Zhao Aug. 16, 2016, 1 a.m. UTC | #3
On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
> On 15 Aug, Hendrik Leppkes wrote :
>> > On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com> wrote:
>>> > > add libyami decoder/encoder/vpp in ffmpeg, about build step,
>>> > > please refer to the link: https://github.com/01org/ffmpeg_libyami/wiki/Build
>>> > >
>> > 
>> > We've had patches for yami before, and they were not applied because
>> > many developers did not agree with adding more wrappers for the same
>> > hardware decoders which we already support.
>> > Please refer to the discussion in this thread:
>> > https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
>> > 
>> > The concerns and reasons brought up there should not really have changed.
> I still object very strongly against yami.
> 
> It is a library that does not bring much that we could not do ourselves,
> it duplicates a lot of our code, it is the wrong level of abstraction
> for libavcodec, it is using a bad license and there is no guarantee of
> maintainership in the future.

I understand the concerns after reading the above thread. For Intel GPU hardware-accelerated
decode/encode, there are now 3 options in ffmpeg:

1. ffmpeg and QSV (Media SDK)
2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
3. ffmpeg and libyami
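For reference, the three options differ mainly in how they are selected on the command line. This is a hedged sketch based on the invocations shown in the logs later in this thread; the device path and file names are placeholders:

```shell
# 1. QSV (Media SDK): decode and encode through the h264_qsv wrapper
ffmpeg -hwaccel qsv -c:v h264_qsv -i input.mp4 -c:v h264_qsv output.mp4

# 2. Native VAAPI: hw-accelerated decode plus the h264_vaapi encoder
#    (/dev/dri/renderD128 is a placeholder for your render node)
ffmpeg -vaapi_device /dev/dri/renderD128 -hwaccel vaapi \
       -hwaccel_output_format vaapi -i input.mp4 \
       -vf 'format=nv12|vaapi,hwupload' -c:v h264_vaapi output.mp4

# 3. libyami (this patch set): libyami_h264 decoder/encoder
ffmpeg -c:v libyami_h264 -i input.mp4 -c:v libyami_h264 output.mp4
```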

And I know people prefer option 2 over 3, but I have a small question: what is the
difference between ffmpeg/libyami and the other external codec libraries (e.g. openh264,
videotoolbox...)?

As far as I know, Intel has 3 full-time libyami developers, so "no guarantee of
maintainership" may be wrong. :)

Thanks.
Timothy Gu Aug. 16, 2016, 1:32 a.m. UTC | #4
On Mon, Aug 15, 2016 at 6:00 PM Jun Zhao <mypopydev@gmail.com> wrote:

> I know the worry after read the above thread.For Intel GPU HW accelerate
> decode/encode,
> now have 3 options in ffmpeg:
>
> 1. ffmpeg and QSV (Media SDK)
> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
> 3. ffmpeg and libyami
>
> And I know the guys prefer option 2 than 3, but I have a little question,
> what's the
> difference about ffmpeg/libyami and the other external codec library(e,g
> openh264,
> videotoolbox...)?
>

OpenH264 is software-only. VideoToolbox is Apple-only.


> As I know Intel have 3 full time Libyami developers, so no guarantee maybe
> is wrong.:)
>

That's true right now, but Intel has offered no guarantee for the future.

Timothy
Chao Liu Aug. 16, 2016, 2:14 a.m. UTC | #5
On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:

>
>
> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
> > On 15 Aug, Hendrik Leppkes wrote :
> >> > On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
> wrote:
> >>> > > add libyami decoder/encoder/vpp in ffmpeg, about build step,
> >>> > > please refer to the link: https://github.com/01org/
> ffmpeg_libyami/wiki/Build
> >>> > >
> >> >
> >> > We've had patches for yami before, and they were not applied because
> >> > many developers did not agree with adding more wrappers for the same
> >> > hardware decoders which we already support.
> >> > Please refer to the discussion in this thread:
> >> > https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
> >> >
> >> > The concerns and reasons brought up there should not really have
> changed.
> > I still object very strongly against yami.
> >
> > It is a library that does not bring much that we could not do ourselves,
> > it duplicates a lot of our code, it is the wrong level of abstraction
> > for libavcodec, it is using a bad license and there is no guarantee of
> > maintainership in the future.
>
> I know the worry after read the above thread.For Intel GPU HW accelerate
> decode/encode,
> now have 3 options in ffmpeg:
>
> 1. ffmpeg and QSV (Media SDK)
> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
> 3. ffmpeg and libyami
>
Sorry for this little diversion: what are the differences between QSV and
vaapi?
My understanding is that QSV has better performance, while vaapi supports
more decoders / encoders. Is that correct?
It would be nice if there were some data showing the speed of these HW-accelerated
decoders / encoders.

>
> And I know the guys prefer option 2 than 3, but I have a little question,
> what's the
> difference about ffmpeg/libyami and the other external codec library(e,g
> openh264,
> videotoolbox...)?
>
Is 2 available in ffmpeg today or is it something planned?

>
> As I know Intel have 3 full time Libyami developers, so no guarantee maybe
> is wrong.:)
>
> Tks.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
Jun Zhao Aug. 16, 2016, 2:44 a.m. UTC | #6
On 2016/8/16 10:14, Chao Liu wrote:
> On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> 
>>
>>
>> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
>>> On 15 Aug, Hendrik Leppkes wrote :
>>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
>> wrote:
>>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
>>>>>>> please refer to the link: https://github.com/01org/
>> ffmpeg_libyami/wiki/Build
>>>>>>>
>>>>>
>>>>> We've had patches for yami before, and they were not applied because
>>>>> many developers did not agree with adding more wrappers for the same
>>>>> hardware decoders which we already support.
>>>>> Please refer to the discussion in this thread:
>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
>>>>>
>>>>> The concerns and reasons brought up there should not really have
>> changed.
>>> I still object very strongly against yami.
>>>
>>> It is a library that does not bring much that we could not do ourselves,
>>> it duplicates a lot of our code, it is the wrong level of abstraction
>>> for libavcodec, it is using a bad license and there is no guarantee of
>>> maintainership in the future.
>>
>> I know the worry after read the above thread.For Intel GPU HW accelerate
>> decode/encode,
>> now have 3 options in ffmpeg:
>>
>> 1. ffmpeg and QSV (Media SDK)
>> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
>> 3. ffmpeg and libyami
>>
> Sorry for this little diversion: what are the differences between QSV and
> vaapi?
> My understanding is that QSV has better performance, while vaapi supports
> more decoders / encoders. Is that correct?
> It would be nice if there are some data showing the speed of these HW
> accelerated decoders / encoders.

It is right that QSV has better performance, but libyami has more decoders/encoders
than the vaapi hw-accel decoder/encoder. :)

According to our profiling, the speed ordering is: QSV > ffmpeg with libyami >
vaapi hw-accel decoder with native vaapi encoder

> 
>>
>> And I know the guys prefer option 2 than 3, but I have a little question,
>> what's the
>> difference about ffmpeg/libyami and the other external codec library(e,g
>> openh264,
>> videotoolbox...)?
>>
> Is 2 available in ffmpeg today or is it sth. planned?
> 

Option 2 is available today :). I think the wiki page (https://wiki.libav.org/Hardware/vaapi)
is a good reference for option 2, if you want to try it. :)

>>
>> As I know Intel have 3 full time Libyami developers, so no guarantee maybe
>> is wrong.:)
>>
>> Tks.
Timothy Gu Aug. 16, 2016, 3:07 a.m. UTC | #7
Hi

On Mon, Aug 15, 2016 at 7:44 PM Jun Zhao <mypopydev@gmail.com> wrote:

>
>
> On 2016/8/16 10:14, Chao Liu wrote:
> > Sorry for this little diversion: what are the differences between QSV and
> > vaapi?
> > My understanding is that QSV has better performance, while vaapi supports
> > more decoders / encoders. Is that correct?
> > It would be nice if there are some data showing the speed of these HW
> > accelerated decoders / encoders.
>
> QSV has better performance is right, but libyami have more
> decoders/encoders than
> vaapi hw accel decoder/encoder. :)
>

I am not sure where you got this information.

On Intel platforms they all use the same chip. Because VAAPI supports more
than just Intel platforms, VAAPI supports all codecs libyami and QSV
support, if not more.

QSV works on both Windows and Linux, although it is a pain to set up a
Linux QSV environment (you have to have the right distro, right kernel,
etc.).


>
> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder and
> native
> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
> native
> vaapi encoder
>

You didn't mention _how_ you profiled things, and for HW encoding different
ways of profiling can cause wildly different results. If for example you
are not doing zero-copy VAAPI operations, you are inherently giving the
other two methods an edge.
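The zero-copy distinction can be illustrated with a sketch (assuming the VAAPI command-line syntax used elsewhere in this thread; the device path and file names are placeholders):

```shell
# Copy-back: decoded frames are downloaded to system memory and then
# re-uploaded for the encoder; the extra transfers handicap VAAPI.
ffmpeg -vaapi_device /dev/dri/renderD128 -hwaccel vaapi -i input.mp4 \
       -vf 'format=nv12,hwupload' -c:v h264_vaapi output.mp4

# Zero-copy: -hwaccel_output_format vaapi keeps the decoded surfaces on
# the GPU, so the encoder consumes them directly.
ffmpeg -vaapi_device /dev/dri/renderD128 -hwaccel vaapi \
       -hwaccel_output_format vaapi -i input.mp4 \
       -vf 'format=nv12|vaapi,hwupload' -c:v h264_vaapi output.mp4
```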

Timothy
Jun Zhao Aug. 16, 2016, 5:22 a.m. UTC | #8
On 2016/8/16 11:07, Timothy Gu wrote:
> Hi
> 
> On Mon, Aug 15, 2016 at 7:44 PM Jun Zhao <mypopydev@gmail.com> wrote:
> 
>>
>>
>> On 2016/8/16 10:14, Chao Liu wrote:
>>> Sorry for this little diversion: what are the differences between QSV and
>>> vaapi?
>>> My understanding is that QSV has better performance, while vaapi supports
>>> more decoders / encoders. Is that correct?
>>> It would be nice if there are some data showing the speed of these HW
>>> accelerated decoders / encoders.
>>
>> QSV has better performance is right, but libyami have more
>> decoders/encoders than
>> vaapi hw accel decoder/encoder. :)
>>
> 
> I am not sure where you got this information.
> 
> On Intel platforms they all use the same chip. Because VAAPI supports more
> than just Intel platforms, VAAPI supports all codecs libyami and QSV
> support, if not more.
> 
> QSV works on both Windows and Linux, although it is a pain to set up a
> Linux QSV environment (you have to have the right distro, right kernel,
> etc.).
> 
> 

I meant the ffmpeg VAAPI hw-accel decoder / native VAAPI encoder, not VAAPI as an
interface.

>>
>> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder and
>> native
>> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
>> native
>> vaapi encoder
>>
> 
> You didn't mention _how_ you profiled things, and for HW encoding different
> ways of profiling can cause wildly different results. If for example you
> are not doing zero-copy VAAPI operations, you are inherently giving the
> other two methods an edge.
> 

I used ffmpeg_QSV/ffmpeg_libyami/ffmpeg_vaapi to do a zero-copy-mode transcode with default settings as the profiling case.

> Timothy
Chao Liu Aug. 16, 2016, 7:37 a.m. UTC | #9
On Mon, Aug 15, 2016 at 7:44 PM, Jun Zhao <mypopydev@gmail.com> wrote:

>
>
> On 2016/8/16 10:14, Chao Liu wrote:
> > On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> >
> >>
> >>
> >> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
> >>> On 15 Aug, Hendrik Leppkes wrote :
> >>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
> >> wrote:
> >>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
> >>>>>>> please refer to the link: https://github.com/01org/
> >> ffmpeg_libyami/wiki/Build
> >>>>>>>
> >>>>>
> >>>>> We've had patches for yami before, and they were not applied because
> >>>>> many developers did not agree with adding more wrappers for the same
> >>>>> hardware decoders which we already support.
> >>>>> Please refer to the discussion in this thread:
> >>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
> >>>>>
> >>>>> The concerns and reasons brought up there should not really have
> >> changed.
> >>> I still object very strongly against yami.
> >>>
> >>> It is a library that does not bring much that we could not do
> ourselves,
> >>> it duplicates a lot of our code, it is the wrong level of abstraction
> >>> for libavcodec, it is using a bad license and there is no guarantee of
> >>> maintainership in the future.
> >>
> >> I know the worry after read the above thread.For Intel GPU HW accelerate
> >> decode/encode,
> >> now have 3 options in ffmpeg:
> >>
> >> 1. ffmpeg and QSV (Media SDK)
> >> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
> >> 3. ffmpeg and libyami
> >>
> > Sorry for this little diversion: what are the differences between QSV and
> > vaapi?
> > My understanding is that QSV has better performance, while vaapi supports
> > more decoders / encoders. Is that correct?
> > It would be nice if there are some data showing the speed of these HW
> > accelerated decoders / encoders.
>
> QSV has better performance is right, but libyami have more
> decoders/encoders than
> vaapi hw accel decoder/encoder. :)
>
> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder and
> native
> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
> native
> vaapi encoder
>
> >
> >>
> >> And I know the guys prefer option 2 than 3, but I have a little
> question,
> >> what's the
> >> difference about ffmpeg/libyami and the other external codec library(e,g
> >> openh264,
> >> videotoolbox...)?
> >>
> > Is 2 available in ffmpeg today or is it sth. planned?
> >
>
> Option 2 is available today :), I think the wiki page (
> https://wiki.libav.org/Hardware/vaapi)
> is good refer to for option 2, if you want to try. :)

Thanks. But that's for libav. These decoders and encoders are not available
for ffmpeg.

>


> >>
> >> As I know Intel have 3 full time Libyami developers, so no guarantee
> maybe
> >> is wrong.:)
> >>
> >> Tks.
Chao Liu Aug. 16, 2016, 7:40 a.m. UTC | #10
On Mon, Aug 15, 2016 at 10:22 PM, Jun Zhao <mypopydev@gmail.com> wrote:

>
>
> On 2016/8/16 11:07, Timothy Gu wrote:
> > Hi
> >
> > On Mon, Aug 15, 2016 at 7:44 PM Jun Zhao <mypopydev@gmail.com> wrote:
> >
> >>
> >>
> >> On 2016/8/16 10:14, Chao Liu wrote:
> >>> Sorry for this little diversion: what are the differences between QSV
> and
> >>> vaapi?
> >>> My understanding is that QSV has better performance, while vaapi
> supports
> >>> more decoders / encoders. Is that correct?
> >>> It would be nice if there are some data showing the speed of these HW
> >>> accelerated decoders / encoders.
> >>
> >> QSV has better performance is right, but libyami have more
> >> decoders/encoders than
> >> vaapi hw accel decoder/encoder. :)
> >>
> >
> > I am not sure where you got this information.
> >
> > On Intel platforms they all use the same chip. Because VAAPI supports
> more
> > than just Intel platforms, VAAPI supports all codecs libyami and QSV
> > support, if not more.
> >
> > QSV works on both Windows and Linux, although it is a pain to set up a
> > Linux QSV environment (you have to have the right distro, right kernel,
> > etc.).
> >
> >
>
> I means ffmpeg_VAAPI hw accel decoder/native VAAPI encoder, not the VAAPI
> as
> interface.
>
> >>
> >> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder
> and
> >> native
> >> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
> >> native
> >> vaapi encoder
> >>
> >
> > You didn't mention _how_ you profiled things, and for HW encoding
> different
> > ways of profiling can cause wildly different results. If for example you
> > are not doing zero-copy VAAPI operations, you are inherently giving the
> > other two methods an edge.
> >
>
> I used the ffmpeg_QSV/ffmpeg_libyami/ffmpeg_vaapi to do zero-copy mode
> transcode with default setting as profile case.
>
Perhaps you could share your test environment settings and the results, so
others could reproduce and confirm what you said, which would make this patch
more appealing.
IIUC, there is no hardware-accelerated encoder for VP8 in ffmpeg yet.
That's another value of this patch.
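As a sketch of that added value (the libyami_vp8 encoder name is an assumption based on the supported-codec list in the cover letter; the file names are placeholders):

```shell
# Hypothetical: hardware-accelerated VP8 encode via the proposed
# libyami_vp8 encoder, which plain ffmpeg currently lacks.
ffmpeg -i input.mp4 -an -c:v libyami_vp8 output.webm
```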

>
> > Timothy
Jun Zhao Aug. 16, 2016, 8:06 a.m. UTC | #11
On 2016/8/16 15:37, Chao Liu wrote:
> On Mon, Aug 15, 2016 at 7:44 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> 
>>
>>
>> On 2016/8/16 10:14, Chao Liu wrote:
>>> On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
>>>>> On 15 Aug, Hendrik Leppkes wrote :
>>>>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
>>>> wrote:
>>>>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
>>>>>>>>> please refer to the link: https://github.com/01org/
>>>> ffmpeg_libyami/wiki/Build
>>>>>>>>>
>>>>>>>
>>>>>>> We've had patches for yami before, and they were not applied because
>>>>>>> many developers did not agree with adding more wrappers for the same
>>>>>>> hardware decoders which we already support.
>>>>>>> Please refer to the discussion in this thread:
>>>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
>>>>>>>
>>>>>>> The concerns and reasons brought up there should not really have
>>>> changed.
>>>>> I still object very strongly against yami.
>>>>>
>>>>> It is a library that does not bring much that we could not do
>> ourselves,
>>>>> it duplicates a lot of our code, it is the wrong level of abstraction
>>>>> for libavcodec, it is using a bad license and there is no guarantee of
>>>>> maintainership in the future.
>>>>
>>>> I know the worry after read the above thread.For Intel GPU HW accelerate
>>>> decode/encode,
>>>> now have 3 options in ffmpeg:
>>>>
>>>> 1. ffmpeg and QSV (Media SDK)
>>>> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
>>>> 3. ffmpeg and libyami
>>>>
>>> Sorry for this little diversion: what are the differences between QSV and
>>> vaapi?
>>> My understanding is that QSV has better performance, while vaapi supports
>>> more decoders / encoders. Is that correct?
>>> It would be nice if there are some data showing the speed of these HW
>>> accelerated decoders / encoders.
>>
>> QSV has better performance is right, but libyami have more
>> decoders/encoders than
>> vaapi hw accel decoder/encoder. :)
>>
>> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder and
>> native
>> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
>> native
>> vaapi encoder
>>
>>>
>>>>
>>>> And I know the guys prefer option 2 than 3, but I have a little
>> question,
>>>> what's the
>>>> difference about ffmpeg/libyami and the other external codec library(e,g
>>>> openh264,
>>>> videotoolbox...)?
>>>>
>>> Is 2 available in ffmpeg today or is it sth. planned?
>>>
>>
>> Option 2 is available today :), I think the wiki page (
>> https://wiki.libav.org/Hardware/vaapi)
>> is good refer to for option 2, if you want to try. :)
> 
> Thanks. But that's for libav. These decoders and encoders are not available
> for ffmpeg.
> 

I can run the ffmpeg vaapi hw-accel decoder and vaapi encoder in zero-copy mode for
the transcode case; I don't know why you couldn't get it to work.

Did you rebuild intel-driver/libva from the master branch?
Jun Zhao Aug. 16, 2016, 8:12 a.m. UTC | #12
On 2016/8/16 15:40, Chao Liu wrote:
> On Mon, Aug 15, 2016 at 10:22 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> 
>>
>>
>> On 2016/8/16 11:07, Timothy Gu wrote:
>>> Hi
>>>
>>> On Mon, Aug 15, 2016 at 7:44 PM Jun Zhao <mypopydev@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On 2016/8/16 10:14, Chao Liu wrote:
>>>>> Sorry for this little diversion: what are the differences between QSV
>> and
>>>>> vaapi?
>>>>> My understanding is that QSV has better performance, while vaapi
>> supports
>>>>> more decoders / encoders. Is that correct?
>>>>> It would be nice if there are some data showing the speed of these HW
>>>>> accelerated decoders / encoders.
>>>>
>>>> QSV has better performance is right, but libyami have more
>>>> decoders/encoders than
>>>> vaapi hw accel decoder/encoder. :)
>>>>
>>>
>>> I am not sure where you got this information.
>>>
>>> On Intel platforms they all use the same chip. Because VAAPI supports
>> more
>>> than just Intel platforms, VAAPI supports all codecs libyami and QSV
>>> support, if not more.
>>>
>>> QSV works on both Windows and Linux, although it is a pain to set up a
>>> Linux QSV environment (you have to have the right distro, right kernel,
>>> etc.).
>>>
>>>
>>
>> I means ffmpeg_VAAPI hw accel decoder/native VAAPI encoder, not the VAAPI
>> as
>> interface.
>>
>>>>
>>>> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder
>> and
>>>> native
>>>> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
>>>> native
>>>> vaapi encoder
>>>>
>>>
>>> You didn't mention _how_ you profiled things, and for HW encoding
>> different
>>> ways of profiling can cause wildly different results. If for example you
>>> are not doing zero-copy VAAPI operations, you are inherently giving the
>>> other two methods an edge.
>>>
>>
>> I used the ffmpeg_QSV/ffmpeg_libyami/ffmpeg_vaapi to do zero-copy mode
>> transcode with default setting as profile case.
>>
> Perhaps you could share your test environment settings and the results, so
> others could repro and confirm what you said, which could make this patch
> more appealing.
> IIUC, there is no hardware accelerated encoder for VP8 in ffmpeg yet.
> That's another value of this patch..
> 

Yes, you are right; ffmpeg is currently missing a VP8 hardware-accelerated encoder/decoder.

If you want to reproduce the performance results, I think you can use the codebase at
https://github.com/01org/ffmpeg_libyami, branch rebase-upstream; when building the
source code, please configure with --enable-vaapi. :)

Then you can try the transcode case with the yami decoder/encoder and the vaapi hardware-accelerated encoder/decoder.
Jun Zhao Aug. 16, 2016, 8:51 a.m. UTC | #13
On 2016/8/16 15:40, Chao Liu wrote:
> On Mon, Aug 15, 2016 at 10:22 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> 
[...]
>>
>> I used the ffmpeg_QSV/ffmpeg_libyami/ffmpeg_vaapi to do zero-copy mode
>> transcode with default setting as profile case.
>>
> Perhaps you could share your test environment settings and the results, so
> others could repro and confirm what you said, which could make this patch
> more appealing.
> IIUC, there is no hardware accelerated encoder for VP8 in ffmpeg yet.
> That's another value of this patch..

Here are some logs you can refer to; right now I can't find the QSV test bed :(

barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -vaapi_device /dev/dri/card0 -hwaccel vaapi -hwaccel_output_format vaapi -i ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an -vf 'format=nv12|vaapi,hwupload' -c:v h264_vaapi -profile 77 -level:v 40  -b 4000k  output_vaapi_transcode.mp4
ffmpeg version N-81825-g80f8fc9 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --prefix=/opt/ffmpeg --enable-libyami --disable-doc --enable-version3 --enable-vaapi
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 51.102 / 57. 51.102
  libavformat    57. 46.101 / 57. 46.101
  libavdevice    57.  0.102 / 57.  0.102
  libavfilter     6. 51.100 /  6. 51.100
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
libva info: VA-API version 0.39.3
libva info: va_getDriverName() returns 0
libva info: Trying to open /opt/yami/vaapi/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_39
libva info: va_openDriver() returns 0
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '../ffmpeg_yami_testcase/skyfall2-trailer.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isom
    creation_time   : 2012-07-31 00:31:48
  Duration: 00:02:30.77, start: 0.000000, bitrate: 4002 kb/s
    Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv), 1920x1080 [SAR 1:1 DAR 16:9], 3937 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 61 kb/s (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Sound Media Handler
Please use -profile:a or -profile:v, -profile is ambiguous
Please use -b:a or -b:v, -b is ambiguous
[mp4 @ 0x36bedc0] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, mp4, to 'output_vaapi_transcode.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isom
    encoder         : Lavf57.46.101
    Stream #0:0(eng): Video: h264 (h264_vaapi) (Main) ([33][0][0][0] / 0x0021), vaapi_vld, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 4000 kb/s, 23.98 fps, 24k tbn, 23.98 tbc (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Video Media Handler
      encoder         : Lavc57.51.102 h264_vaapi
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_vaapi))
Press [q] to stop, [?] for help
[h264 @ 0x382ba40] Hardware accelerated decoding with frame threading is known to be unstable and its use is discouraged.
Input stream #0:0 frame changed from size:1920x1080 fmt:yuv420p to size:1920x1080 fmt:vaapi_vld
Unrepairable overflow!-0.0 size=     797kB time=00:00:01.79 bitrate=3639.5kbits/s dup=1 drop=0 speed=3.54x    
frame= 3615 fps=132 q=-0.0 Lsize=   71470kB time=00:02:30.69 bitrate=3885.3kbits/s dup=1 drop=0 speed=5.49x    
video:71435kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.048997%
barry@barry:~/Source/video/yami/ffmpeg_libyami$ mediainfo output_vaapi_transcode.mp4 
General
Complete name                            : output_vaapi_transcode.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom
File size                                : 69.8 MiB
Duration                                 : 2mn 30s
Overall bit rate mode                    : Variable
Overall bit rate                         : 3 883 Kbps
Writing application                      : Lavf57.46.101

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Main@L4.0
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 2 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 2mn 30s
Bit rate mode                            : Variable
Bit rate                                 : 3 881 Kbps
Maximum bit rate                         : 256 Mbps
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 23.976 fps
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.078
Stream size                              : 69.8 MiB (100%)
Language                                 : English


barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -c:v libyami_h264 ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -c:v libyami_h264 output_yami_transcode.mp4 ffmpeg version N-81825-g80f8fc9 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --prefix=/opt/ffmpeg --enable-libyami --disable-doc --enable-version3 --enable-vaapi
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 51.102 / 57. 51.102
  libavformat    57. 46.101 / 57. 46.101
  libavdevice    57.  0.102 / 57.  0.102
  libavfilter     6. 51.100 /  6. 51.100
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
Output #0, mp4, to '../ffmpeg_yami_testcase/skyfall2-trailer.mp4':
Output file #0 does not contain any stream
barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -c:v libyami_h264 -i ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -c:v libyami_h264 output_yami_transcode.mp4 
ffmpeg version N-81825-g80f8fc9 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --prefix=/opt/ffmpeg --enable-libyami --disable-doc --enable-version3 --enable-vaapi
  libavutil      55. 28.100 / 55. 28.100
  libavcodec     57. 51.102 / 57. 51.102
  libavformat    57. 46.101 / 57. 46.101
  libavdevice    57.  0.102 / 57.  0.102
  libavfilter     6. 51.100 /  6. 51.100
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  1.100 /  2.  1.100
libva info: VA-API version 0.39.3
libva info: va_getDriverName() returns 0
libva info: Trying to open /opt/yami/vaapi/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_39
libva info: va_openDriver() returns 0
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '../ffmpeg_yami_testcase/skyfall2-trailer.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isom
    creation_time   : 2012-07-31 00:31:48
  Duration: 00:02:30.77, start: 0.000000, bitrate: 4002 kb/s
    Stream #0:0(eng): Video: h264 (avc1 / 0x31637661), nv12, 1920x1080, 3937 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 48k tbc (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 61 kb/s (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Sound Media Handler
[libyami_h264 @ 0x2711e20] Using the main profile as default.
[mp4 @ 0x26ea120] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
    Last message repeated 1 times
Output #0, mp4, to 'output_yami_transcode.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isom
    encoder         : Lavf57.46.101
    Stream #0:0(eng): Video: h264 (libyami_h264) ([33][0][0][0] / 0x0021), yami, 1920x1080, q=2-31, 200 kb/s, 23.98 fps, 24k tbn, 23.98 tbc (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Video Media Handler
      encoder         : Lavc57.51.102 libyami_h264
    Stream #0:1(eng): Audio: aac (LC) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      creation_time   : 2012-07-31 00:31:48
      handler_name    : MP4 Sound Media Handler
      encoder         : Lavc57.51.102 aac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (libyami_h264) -> h264 (libyami_h264))
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[aac @ 0x26eb7e0] SSR is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[aac @ 0x26eb7e0] If you want to help, upload a sample of this file to ftp://upload.ffmpeg.org/incoming/ and contact the ffmpeg-devel mailing list. (ffmpeg-devel@ffmpeg.org)
Error while decoding stream #0:1: Not yet implemented in FFmpeg, patches welcome
frame= 3615 fps=182 q=-0.0 Lsize=   75955kB time=00:02:30.76 bitrate=4127.0kbits/s dup=1 drop=0 speed=7.58x    
video:73474kB audio:2393kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.115290%
[aac @ 0x26f49c0] Qavg: 1272.917
                 
barry@barry:~/Source/video/yami/ffmpeg_libyami$ mediainfo output_yami_transcode.mp4 
General
Complete name                            : output_yami_transcode.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom
File size                                : 74.2 MiB
Duration                                 : 2mn 30s
Overall bit rate                         : 4 126 Kbps
Writing application                      : Lavf57.46.101

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Main@L4.0
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 1 frame
Format settings, GOP                     : M=1, N=12
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 2mn 30s
Bit rate                                 : 3 992 Kbps
Width                                    : 1 920 pixels
Height                                   : 1 080 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 23.976 fps
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.080
Stream size                              : 71.8 MiB (97%)
Language                                 : English

Audio
ID                                       : 2
Format                                   : AAC
Format/Info                              : Advanced Audio Codec
Format profile                           : LC
Codec ID                                 : 40
Duration                                 : 2mn 30s
Bit rate mode                            : Constant
Bit rate                                 : 132 Kbps
Channel(s)                               : 2 channels
Channel positions                        : Front: L R
Sampling rate                            : 44.1 KHz
Compression mode                         : Lossy
Stream size                              : 2.34 MiB (3%)
Language                                 : English
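As a sanity check on the two mediainfo dumps above, the overall bit rate follows directly from file size and duration. A quick recomputation for both outputs (assuming mediainfo's MiB means 2^20 bytes and Kbps means 1000 bit/s; small differences come from rounding in the reported size):

```python
def overall_kbps(size_mib: float, duration_s: float) -> float:
    """Overall bit rate in kbit/s from a file size in MiB and a duration in seconds."""
    return size_mib * 1024 * 1024 * 8 / duration_s / 1000

# The trailer's duration is 00:02:30.77 -> 150.77 s.
print(round(overall_kbps(69.8, 150.77)))  # vaapi output: ~3884 kbps (mediainfo reports 3 883)
print(round(overall_kbps(74.2, 150.77)))  # yami output:  ~4128 kbps (mediainfo reports 4 126)
```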
Mark Thompson Aug. 16, 2016, 6:27 p.m. UTC | #14
On 16/08/16 03:44, Jun Zhao wrote:
> 
> 
> On 2016/8/16 10:14, Chao Liu wrote:
>> On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
>>
>>>
>>>
>>> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
>>>> On 15 Aug, Hendrik Leppkes wrote :
>>>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
>>> wrote:
>>>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
>>>>>>>> please refer to the link: https://github.com/01org/
>>> ffmpeg_libyami/wiki/Build
>>>>>>>>
>>>>>>
>>>>>> We've had patches for yami before, and they were not applied because
>>>>>> many developers did not agree with adding more wrappers for the same
>>>>>> hardware decoders which we already support.
>>>>>> Please refer to the discussion in this thread:
>>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
>>>>>>
>>>>>> The concerns and reasons brought up there should not really have
>>> changed.
>>>> I still object very strongly against yami.
>>>>
>>>> It is a library that does not bring much that we could not do ourselves,
>>>> it duplicates a lot of our code, it is the wrong level of abstraction
>>>> for libavcodec, it is using a bad license and there is no guarantee of
>>>> maintainership in the future.
>>>
>>> I know the worry after read the above thread.For Intel GPU HW accelerate
>>> decode/encode,
>>> now have 3 options in ffmpeg:
>>>
>>> 1. ffmpeg and QSV (Media SDK)
>>> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
>>> 3. ffmpeg and libyami
>>>
>> Sorry for this little diversion: what are the differences between QSV and
>> vaapi?
>> My understanding is that QSV has better performance, while vaapi supports
>> more decoders / encoders. Is that correct?
>> It would be nice if there are some data showing the speed of these HW
>> accelerated decoders / encoders.
> 
> QSV has better performance is right, but libyami have more decoders/encoders than 
> vaapi hw accel decoder/encoder. :)
> 
> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder and native
> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and native
> vaapi encoder

In a single ffmpeg process I believe that result, but I'm not sure that it's the question you really want to ask.

The lavc VAAPI hwaccel/encoder are both single-threaded, and while they overlap operations internally where possible the single-threadedness of ffmpeg (the program) itself means that they will not achieve the maximum performance.  If you really want to compare the single-transcode performance like this then you will want to make a test program which does the threading outside lavc.
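For illustration only (this is not part of the patch, and the frame/packet objects are dummies standing in for lavc calls), the shape of such a test program is a pipeline where decode and encode run on separate threads connected by a bounded queue, so the two stages overlap instead of alternating inside one loop:

```python
import threading
import queue

FRAMES = 100

def decode_stream(out_q):
    """Stand-in for an avcodec_receive_frame() loop on a decoder thread."""
    for n in range(FRAMES):
        out_q.put(("frame", n))    # a decoded surface would go here
    out_q.put(None)                # end-of-stream sentinel

def encode_stream(in_q, packets):
    """Stand-in for avcodec_send_frame()/receive_packet() on an encoder thread."""
    while (item := in_q.get()) is not None:
        packets.append(item[1])    # an encoded packet would be muxed here

q = queue.Queue(maxsize=4)         # small bound gives backpressure between stages
packets = []
t_dec = threading.Thread(target=decode_stream, args=(q,))
t_enc = threading.Thread(target=encode_stream, args=(q, packets))
t_dec.start(); t_enc.start()
t_dec.join(); t_enc.join()
print(len(packets))                # 100: every decoded frame reached the encoder
```

With real lavc calls, the queue would carry hardware surfaces, and the bound keeps the surface pool from being exhausted.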


In any case, I don't believe that the single generic transcode setup is a use that many people are interested in (beyond testing to observe that hardware encoders kind of suck relative to libx264, then using that instead).

To my mind, the cases where it is interesting to use VAAPI (or really any hardware encoder on a normal PC-like system) are:

* You want to do /lots/ of simultaneous transcodes in some sort of server setup (often with some simple transformation, like a scale or codec change), and want to maximise the number you can do while maintaining some minimum level of throughput on each one.  You can benchmark this case for VAAPI by running lots of instances of ffmpeg, and I expect that the libyami numbers will be precisely equivalent because libyami is using VAAPI anyway and the hardware is identical.

* You want to do other things with the surfaces on your GPU.  Here, using VAAPI directly is good because the DRM objects are easily exposed so you can move surfaces to and from whatever other stuff you want to use (OpenCL, DRI2 in X11, etc.).

* You want to minimise CPU/power use when doing one or a small number of live encodes/decodes (for example, video calling or screen recording).  Here performance is not really the issue - any of these solutions suffices but we should try to avoid it being too hard to use.

So, what do you think libyami brings to any of these cases?  I don't really see anything beyond the additional codec support* - have I missed something?

libyami also (I believe, correct me if I'm wrong) has Intel-specificity - this is significant given that mesa/gallium has very recently gained VAAPI encode support on AMD VCE (though I think it doesn't currently work well with lavc, I'm going to look into that soon).

I haven't done any detailed review of the patches; I'm happy to do so if people are generally in favour of having the library.

Thanks,

- Mark


* Which is fixable.  Wrt VP8, I wrote a bit of code but abandoned it because I don't know of anyone who actually cares about it.  Do you have some useful case for it?  If so, I'd be happy to implement it.  I am already intending to do VP9 encode when I have hardware available; VP9 decode apparently already works though I don't have hardware myself.
Mark Thompson Aug. 16, 2016, 6:33 p.m. UTC | #15
On 16/08/16 09:51, Jun Zhao wrote:
> 
> barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -vaapi_device /dev/dri/card0 -hwaccel vaapi -hwaccel_output_format vaapi -i ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an -vf 'format=nv12|vaapi,hwupload' -c:v h264_vaapi -profile 77 -level:v 40  -b 4000k  output_vaapi_transcode.mp4
> ...
> barry@barry:~/Source/video/yami/ffmpeg_libyami$ mediainfo output_vaapi_transcode.mp4 
> File size                                : 69.8 MiB
> ...
> barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -c:v libyami_h264 -i ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -c:v libyami_h264 output_yami_transcode.mp4 
> ...
> barry@barry:~/Source/video/yami/ffmpeg_libyami$ mediainfo output_yami_transcode.mp4 
> File size                                : 74.2 MiB

I'm assuming you are trying to show them with identical options?  Since the hardware is the same, you really should be able to get those two encodes to produce pretty much identical results.

Here I think the significant difference is probably that h264_vaapi is using 2 B-frames by default, but there might be more subtle differences to remove as well.
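As a hypothetical illustration of why the B-frame default matters for the file sizes above: the yami output reports "M=1, N=12" (M = distance between anchor frames, N = GOP length), i.e. no B-frames at all, whereas 2 B-frames corresponds to M=3. A simplified sketch of the display-order frame pattern each setting produces:

```python
def gop_pattern(m: int, n: int) -> str:
    """Frame types for one GOP in display order: M = anchor distance, N = GOP length."""
    return "".join(["I"] + ["B" if i % m else "P" for i in range(1, n)])

print(gop_pattern(1, 12))  # M=1 (libyami output above): anchors only, no B-frames
print(gop_pattern(3, 12))  # M=3 (2 B-frames between anchors, the h264_vaapi default)
```

B-frames predict from both directions, so the M=3 stream typically needs fewer bits per frame at the same quality, which is consistent with the vaapi file coming out smaller.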

- Mark
Jun Zhao Aug. 17, 2016, 12:56 a.m. UTC | #16
On 2016/8/17 2:33, Mark Thompson wrote:
> On 16/08/16 09:51, Jun Zhao wrote:
>>
>> barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -vaapi_device /dev/dri/card0 -hwaccel vaapi -hwaccel_output_format vaapi -i ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an -vf 'format=nv12|vaapi,hwupload' -c:v h264_vaapi -profile 77 -level:v 40  -b 4000k  output_vaapi_transcode.mp4
>> ...
>> barry@barry:~/Source/video/yami/ffmpeg_libyami$ mediainfo output_vaapi_transcode.mp4 
>> File size                                : 69.8 MiB
>> ...
>> barry@barry:~/Source/video/yami/ffmpeg_libyami$ ./ffmpeg -y -c:v libyami_h264 -i ../ffmpeg_yami_testcase/skyfall2-trailer.mp4 -c:v libyami_h264 output_yami_transcode.mp4 
>> ...
>> barry@barry:~/Source/video/yami/ffmpeg_libyami$ mediainfo output_yami_transcode.mp4 
>> File size                                : 74.2 MiB
> 
> I'm assuming you are trying to show them with identical options?  Since the hardware is the same, you really should be able to get those two encodes to produce pretty much identical results.
> 
> Here I think the significant difference is probably that h264_vaapi is using 2 B-frames by default, but there might be more subtle differences to remove as well.
> 
> - Mark

Hi, Mark:

I just used this to show how to run ffmpeg/vaapi and ffmpeg/libyami :)

For the performance gap, I think the root cause is that the ffmpeg/vaapi transcode uses VPP in the pipeline, while the ffmpeg/libyami transcode does not.

I will double-check this case.



> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
Jun Zhao Aug. 17, 2016, 1:18 a.m. UTC | #17
On 2016/8/17 2:27, Mark Thompson wrote:
> On 16/08/16 03:44, Jun Zhao wrote:
>>
>>
>> On 2016/8/16 10:14, Chao Liu wrote:
>>> On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
>>>>> On 15 Aug, Hendrik Leppkes wrote :
>>>>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
>>>> wrote:
>>>>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
>>>>>>>>> please refer to the link: https://github.com/01org/
>>>> ffmpeg_libyami/wiki/Build
>>>>>>>>>
>>>>>>>
>>>>>>> We've had patches for yami before, and they were not applied because
>>>>>>> many developers did not agree with adding more wrappers for the same
>>>>>>> hardware decoders which we already support.
>>>>>>> Please refer to the discussion in this thread:
>>>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
>>>>>>>
>>>>>>> The concerns and reasons brought up there should not really have
>>>> changed.
>>>>> I still object very strongly against yami.
>>>>>
>>>>> It is a library that does not bring much that we could not do ourselves,
>>>>> it duplicates a lot of our code, it is the wrong level of abstraction
>>>>> for libavcodec, it is using a bad license and there is no guarantee of
>>>>> maintainership in the future.
>>>>
>>>> I know the worry after read the above thread.For Intel GPU HW accelerate
>>>> decode/encode,
>>>> now have 3 options in ffmpeg:
>>>>
>>>> 1. ffmpeg and QSV (Media SDK)
>>>> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
>>>> 3. ffmpeg and libyami
>>>>
>>> Sorry for this little diversion: what are the differences between QSV and
>>> vaapi?
>>> My understanding is that QSV has better performance, while vaapi supports
>>> more decoders / encoders. Is that correct?
>>> It would be nice if there are some data showing the speed of these HW
>>> accelerated decoders / encoders.
>>
>> QSV has better performance is right, but libyami have more decoders/encoders than 
>> vaapi hw accel decoder/encoder. :)
>>
>> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder and native
>> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and native
>> vaapi encoder
> 
> In a single ffmpeg process I believe that result, but I'm not sure that it's the question you really want to ask.
> 
> The lavc VAAPI hwaccel/encoder are both single-threaded, and while they overlap operations internally where possible the single-threadedness of ffmpeg (the program) itself means that they will not achieve the maximum performance.  If you really want to compare the single-transcode performance like this then you will want to make a test program which does the threading outside lavc.

I agree with you :). Currently I use threads in the ffmpeg/yami encoder/decoder,
and QSV (Media SDK) uses threads inside the library, so in this respect comparing
one-way (1 input/1 output) transcode speed is unfair to ffmpeg/vaapi.

> 
> In any case, I don't believe that the single generic transcode setup is a use that many people are interested in (beyond testing to observe that hardware encoders kindof suck relative to libx264, then using that instead).
> 
> To my mind, the cases where it is interesting to use VAAPI (or really any hardware encoder on a normal PC-like system) are:
> 
> * You want to do /lots/ of simultaneous transcodes in some sort of server setup (often with some simple transformation, like a scale or codec change), and want to maximise the number you can do while maintaining some minimum level of throughput on each one.  You can benchmark this case for VAAPI by running lots of instances of ffmpeg, and I expect that the libyami numbers will be precisely equivalent because libyami is using VAAPI anyway and the hardware is identical.
> 
> * You want to do other things with the surfaces on your GPU.  Here, using VAAPI directly is good because the DRM objects are easily exposed so you can move surfaces to and from whatever other stuff you want to use (OpenCL, DRI2 in X11, etc.).
> 
> * You want to minimise CPU/power use when doing one or a small number of live encodes/decodes (for example, video calling or screen recording).  Here performance is not really the issue - any of these solutions suffices but we should try to avoid it being too hard to use.
> 
> So, what do you think libyami brings to any of these cases?  I don't really see anything beyond the additional codec support* - have I missed something?

VPP is missing some features, e.g. de-noise/de-interlace/..., but I think filling
the gap is not difficult; I hope I can submit some patches for this. :)

> 
> libyami also (I believe, correct me if I'm wrong) has Intel-specificity - this is significant given that mesa/gallium has very recently gained VAAPI encode support on AMD VCE (though I think it doesn't currently work well with lavc, I'm going to look into that soon).
> 
> I haven't done any detailed review of the patches; I'm happy to do so if people are generally in favour of having the library.
> 
> Thanks,
> 
> - Mark
> 
> 
> * Which is fixable.  Wrt VP8, I wrote a bit of code but abandoned it because I don't know of anyone who actually cares about it.  Do you have some useful case for it?  If so, I'd be happy to implement it.  I am already intending to do VP9 encode when I have hardware available; VP9 decode apparently already works though I don't have hardware myself.

Glad to hear you will implement a VP9 encoder; as for the VP8 decoder/encoder,
I think a lot of webm files will benefit from it.

Chao Liu Aug. 17, 2016, 4:19 a.m. UTC | #18
On Tue, Aug 16, 2016 at 1:06 AM, Jun Zhao <mypopydev@gmail.com> wrote:

>
>
> On 2016/8/16 15:37, Chao Liu wrote:
> > On Mon, Aug 15, 2016 at 7:44 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> >
> >>
> >>
> >> On 2016/8/16 10:14, Chao Liu wrote:
> >>> On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
> >>>>> On 15 Aug, Hendrik Leppkes wrote :
> >>>>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
> >>>> wrote:
> >>>>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
> >>>>>>>>> please refer to the link: https://github.com/01org/
> >>>> ffmpeg_libyami/wiki/Build
> >>>>>>>>>
> >>>>>>>
> >>>>>>> We've had patches for yami before, and they were not applied
> because
> >>>>>>> many developers did not agree with adding more wrappers for the
> same
> >>>>>>> hardware decoders which we already support.
> >>>>>>> Please refer to the discussion in this thread:
> >>>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
> >>>>>>>
> >>>>>>> The concerns and reasons brought up there should not really have
> >>>> changed.
> >>>>> I still object very strongly against yami.
> >>>>>
> >>>>> It is a library that does not bring much that we could not do
> >> ourselves,
> >>>>> it duplicates a lot of our code, it is the wrong level of abstraction
> >>>>> for libavcodec, it is using a bad license and there is no guarantee
> of
> >>>>> maintainership in the future.
> >>>>
> >>>> I know the worry after read the above thread.For Intel GPU HW
> accelerate
> >>>> decode/encode,
> >>>> now have 3 options in ffmpeg:
> >>>>
> >>>> 1. ffmpeg and QSV (Media SDK)
> >>>> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
> >>>> 3. ffmpeg and libyami
> >>>>
> >>> Sorry for this little diversion: what are the differences between QSV
> and
> >>> vaapi?
> >>> My understanding is that QSV has better performance, while vaapi
> supports
> >>> more decoders / encoders. Is that correct?
> >>> It would be nice if there are some data showing the speed of these HW
> >>> accelerated decoders / encoders.
> >>
> >> QSV has better performance is right, but libyami have more
> >> decoders/encoders than
> >> vaapi hw accel decoder/encoder. :)
> >>
> >> According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder
> and
> >> native
> >> vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
> >> native
> >> vaapi encoder
> >>
> >>>
> >>>>
> >>>> And I know the guys prefer option 2 than 3, but I have a little
> >> question,
> >>>> what's the
> >>>> difference about ffmpeg/libyami and the other external codec
> library(e,g
> >>>> openh264,
> >>>> videotoolbox...)?
> >>>>
> >>> Is 2 available in ffmpeg today or is it sth. planned?
> >>>
> >>
> >> Option 2 is available today :), I think the wiki page (
> >> https://wiki.libav.org/Hardware/vaapi)
> >> is good refer to for option 2, if you want to try. :)
> >
> > Thanks. But that's for libav. These decoders and encoders are not
> available
> > for ffmpeg.
> >
>
> I can run ffmpeg vaapi hw accel decoder and vaapi encoder with zero-copy
> mode for
> transcode case, I don't know why you can't succeed.
>
> Do you re-build intel-driver/libva with master branch?
>
Right. I am using an old version. There is an h264_vaapi encoder in the
latest release.

Chao Liu Aug. 17, 2016, 4:44 a.m. UTC | #19
On Tue, Aug 16, 2016 at 11:27 AM, Mark Thompson <sw@jkqxz.net> wrote:

> On 16/08/16 03:44, Jun Zhao wrote:
> >
> >
> > On 2016/8/16 10:14, Chao Liu wrote:
> >> On Mon, Aug 15, 2016 at 6:00 PM, Jun Zhao <mypopydev@gmail.com> wrote:
> >>
> >>>
> >>>
> >>> On 2016/8/16 1:48, Jean-Baptiste Kempf wrote:
> >>>> On 15 Aug, Hendrik Leppkes wrote :
> >>>>>> On Mon, Aug 15, 2016 at 10:22 AM, Jun Zhao <mypopydev@gmail.com>
> >>> wrote:
> >>>>>>>> add libyami decoder/encoder/vpp in ffmpeg, about build step,
> >>>>>>>> please refer to the link: https://github.com/01org/
> >>> ffmpeg_libyami/wiki/Build
> >>>>>>>>
> >>>>>>
> >>>>>> We've had patches for yami before, and they were not applied because
> >>>>>> many developers did not agree with adding more wrappers for the same
> >>>>>> hardware decoders which we already support.
> >>>>>> Please refer to the discussion in this thread:
> >>>>>> https://ffmpeg.org/pipermail/ffmpeg-devel/2015-January/167388.html
> >>>>>>
> >>>>>> The concerns and reasons brought up there should not really have
> >>> changed.
> >>>> I still object very strongly against yami.
> >>>>
> >>>> It is a library that does not bring much that we could not do
> ourselves,
> >>>> it duplicates a lot of our code, it is the wrong level of abstraction
> >>>> for libavcodec, it is using a bad license and there is no guarantee of
> >>>> maintainership in the future.
> >>>
> >>> I know the worry after read the above thread.For Intel GPU HW
> accelerate
> >>> decode/encode,
> >>> now have 3 options in ffmpeg:
> >>>
> >>> 1. ffmpeg and QSV (Media SDK)
> >>> 2. ffmpeg vaapi hw accelerate decoder/native vaapi encoder
> >>> 3. ffmpeg and libyami
> >>>
> >> Sorry for this little diversion: what are the differences between QSV
> and
> >> vaapi?
> >> My understanding is that QSV has better performance, while vaapi
> supports
> >> more decoders / encoders. Is that correct?
> >> It would be nice if there are some data showing the speed of these HW
> >> accelerated decoders / encoders.
> >
> > QSV has better performance is right, but libyami have more
> decoders/encoders than
> > vaapi hw accel decoder/encoder. :)
> >
> > According our profile, the speed of QSV/Libyami/vaapi-hw accel decoder
> and native
> > vaapi encoder are: QSV > ffmpeg and libyami > vaapi-hw accel decoder and
> native
> > vaapi encoder
>
> In a single ffmpeg process I believe that result, but I'm not sure that
> it's the question you really want to ask.
>
> The lavc VAAPI hwaccel/encoder are both single-threaded, and while they
> overlap operations internally where possible the single-threadedness of
> ffmpeg (the program) itself means that they will not achieve the maximum
> performance.  If you really want to compare the single-transcode
> performance like this then you will want to make a test program which does
> the threading outside lavc.
>
>
> In any case, I don't believe that the single generic transcode setup is a
> use that many people are interested in (beyond testing to observe that
> hardware encoders kindof suck relative to libx264, then using that instead).
>
> To my mind, the cases where it is interesting to use VAAPI (or really any
> hardware encoder on a normal PC-like system) are:
>
> * You want to do /lots/ of simultaneous transcodes in some sort of server
> setup (often with some simple transformation, like a scale or codec
> change), and want to maximise the number you can do while maintaining some
> minimum level of throughput on each one.  You can benchmark this case for
> VAAPI by running lots of instances of ffmpeg, and I expect that the libyami
> numbers will be precisely equivalent because libyami is using VAAPI anyway
> and the hardware is identical.
>
Our use case is similar to this one. In one process, we have multiple
threads that decode the input video streams, process the decoded frames
and encode.
To process the frames efficiently, we would like the decoded frames to
be in some format like yuv420p, which has a separate luminance channel.
We would like to use whatever hardware acceleration is available. So
far, we have only tried QSV. It works, with some problems though, like
no support for VP8 and being available only on relatively new Intel CPUs.
I just took a look at the vaapi hwaccel; I'm curious why its pixel format
has to be AV_PIX_FMT_VAAPI. Jun's patch does support other pixel formats
like yuv420p.
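For what it's worth, NV12 (what the hardware decoders above output) already keeps luma in its own plane; the difference from yuv420p is only that the chroma samples are interleaved in a single plane. A rough pure-Python sketch of the deinterleave (a real conversion would of course use libswscale or av_image_copy, not byte slicing):

```python
def nv12_to_yuv420p(data: bytes, width: int, height: int):
    """Split an NV12 buffer into Y, U, V planes (yuv420p layout).

    NV12: Y plane (w*h bytes) followed by one interleaved UVUV... plane (w*h/2).
    yuv420p: Y plane, then separate U and V planes (w*h/4 each).
    """
    y_size = width * height
    y = data[:y_size]
    uv = data[y_size:]
    u = uv[0::2]  # even bytes of the interleaved plane are U samples
    v = uv[1::2]  # odd bytes are V samples
    return y, u, v

# Toy 4x2 frame: 8 luma bytes followed by 4 interleaved chroma bytes.
y, u, v = nv12_to_yuv420p(bytes(range(12)), 4, 2)
print(list(u), list(v))  # U = [8, 10], V = [9, 11]
```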

>
> * You want to do other things with the surfaces on your GPU.  Here, using
> VAAPI directly is good because the DRM objects are easily exposed so you
> can move surfaces to and from whatever other stuff you want to use (OpenCL,
> DRI2 in X11, etc.).
>
> * You want to minimise CPU/power use when doing one or a small number of
> live encodes/decodes (for example, video calling or screen recording).
> Here performance is not really the issue - any of these solutions suffices
> but we should try to avoid it being too hard to use.
>
> So, what do you think libyami brings to any of these cases?  I don't
> really see anything beyond the additional codec support* - have I missed
> something?


> libyami also (I believe, correct me if I'm wrong) has Intel-specificity -
> this is significant given that mesa/gallium has very recently gained VAAPI
> encode support on AMD VCE (though I think it doesn't currently work well
> with lavc, I'm going to look into that soon).
>
> I haven't done any detailed review of the patches; I'm happy to do so if
> people are generally in favour of having the library.
>
> Thanks,
>
> - Mark
>
>
> * Which is fixable.  Wrt VP8, I wrote a bit of code but abandoned it
> because I don't know of anyone who actually cares about it.  Do you have
> some useful case for it?  If so, I'd be happy to implement it.  I am
> already intending to do VP9 encode when I have hardware available; VP9
> decode apparently already works though I don't have hardware myself.
>
We would like to have that; I'd appreciate it if you could make this happen.
BTW, compared to libvpx, how much faster could it be? I know it depends on
the GPU, just want to get a rough idea.

>
Michael Niedermayer Sept. 24, 2016, 12:34 a.m. UTC | #20
On Mon, Aug 15, 2016 at 04:22:33PM +0800, Jun Zhao wrote:
> add libyami decoder/encoder/vpp in ffmpeg, about build step, 
> please refer to the link: https://github.com/01org/ffmpeg_libyami/wiki/Build

>  Makefile                   |    1 
>  configure                  |   27 ++
>  ffmpeg.c                   |    4 
>  ffmpeg.h                   |    1 
>  ffmpeg_libyami.c           |   85 ++++++
>  libavcodec/Makefile        |    8 
>  libavcodec/allcodecs.c     |    6 
>  libavcodec/libyami.cpp     |  429 +++++++++++++++++++++++++++++++++++
>  libavcodec/libyami.h       |   59 ++++
>  libavcodec/libyami_dec.cpp |  527 +++++++++++++++++++++++++++++++++++++++++++
>  libavcodec/libyami_dec.h   |   56 ++++
>  libavcodec/libyami_enc.cpp |  551 +++++++++++++++++++++++++++++++++++++++++++++
>  libavcodec/libyami_enc.h   |   70 +++++
>  libavutil/pixdesc.c        |    4 
>  libavutil/pixfmt.h         |    5 
>  15 files changed, 1833 insertions(+)
> d5ebbaa497e6f36026a4482dc6e0f26b370561b5  0001-lavc-yami-add-libyami-decoder-encoder.patch
> From 7147fdb375cb7241d69823d8b9b6e94f66df3a32 Mon Sep 17 00:00:00 2001
> From: Jun Zhao <jun.zhao@intel.com>
> Date: Mon, 15 Aug 2016 15:36:14 +0800
> Subject: [[PATCH] 1/5] lavc : yami : add libyami decoder/encoder.

It seems people are not in favor of this patchset, judging from this
thread.
If you are interested in maintaining this code externally as a patch
or git repository, then please add a reasonable link/mention to
some page on https://trac.ffmpeg.org/wiki so users are aware of its
existence and can find it.

If you believe that's incorrect and people in fact largely support this
patchset, then you can of course also start a vote.

I'll mark this patchset as rejected on patchwork, as that seems to be
the de-facto current situation.

Thanks

[...]
wm4 Sept. 24, 2016, 1:18 p.m. UTC | #21
On Sat, 24 Sep 2016 02:34:56 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:


From one person who tried to use it (and who is also on this list), I
heard that ffmpeg's native vaapi decoding/encoding works better for him.

So there's probably no reason to use this patch at all.
Chao Liu Sept. 28, 2016, 7:18 p.m. UTC | #22
On Sat, Sep 24, 2016 at 6:18 AM, wm4 <nfxjfg@googlemail.com> wrote:

>
> From one person who tried to use it (and who's also in the list), I
> heard that ffmpeg native vaapi decoding/encoding works better for him.
>
I don't know how he reached that conclusion. Maybe he only uses the command
line?
We are building a product using the ffmpeg C interface. For me, hwaccel is way
too complicated to use. IIUC, I have to copy a thousand lines of code from
ffmpeg_*.c to use it ...
With this patch, it's trivial to switch between codecs like qsv_h264,
libyami_h264 and libyami_vp8.

We have been trying different hardware acceleration solutions in our
product. So far, QSV works best for us.
However, QSV itself has a lot of problems: too much work to use it
under Linux, quite a few critical bugs, and no VP8 support.
Even worse, it only supports the latest CPUs. We cannot use it in production
because we don't know when they'll stop supporting our hardware, which would
leave us with no option.

So far, libyami looks like the best option for people like us. If you guys
reject this patch in the end, we'll have to patch it ourselves. That would be
awful. I hope we won't need to do that.

>
> So there's probably no reason to use this patch at all.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
wm4 Sept. 28, 2016, 9:45 p.m. UTC | #23
On Wed, 28 Sep 2016 12:18:38 -0700
Chao Liu <yijinliu@gmail.com> wrote:

> I don't know how he made that conclusion. Maybe he only uses the command
> line?
> We are building a product using ffmpeg C interface. For me, hwaccel is way
> too complicated to use. IIUC, I have to copy thousand lines of code from
> ffmpeg_*.c to use it ...

Much less with the latest Libav changes once they're merged into FFmpeg.
Only at most 200 lines (all pretty trivial glue code, much of it
just there to hook it up to ffmpeg.c specifics). The new code will remove the
requirement to manually create the VAAPI context in the decoding case.

Since libyami requires using weird libyami-specific buffers instead of
vaapi surfaces, it's unlikely that libyami could profit from further
developments, such as vaapi filters within libavfilter, unless someone
changes the libyami wrapper to input/output native vaapi surfaces.

Also, there were issues that were never fixed in the libyami wrapper.

> With this patch, it's trivial to switch between codecs like qsv_h264,
> libyami_h264, libyami_vp8.
> 
> We have been trying different hardware acceleration solutions in our
> product. So far, QSV works best for us.
> However, QSV itself has a lot of problems, like too much work to use it
> under Linux, quite a few critical bugs, no support for VP8.
> Even worse, it only supports latest CPUs. We cannot use it in production
> because we don't know when they'll stop supporting our hardware, which'll
> leave us no option.
> 
> So far, libyami looks like the best options for people like us. If you guys
> in the end reject this patch, we'll have to patch it ourselves. That'll be
> awful. I hope we won't need to do that..

So it's better if _we_ have to maintain code that is redundant to our
"proper" APIs, and that has a bunch of issues the patch submitters
don't want to fix? Sounds wrong to me.
Chao Liu Sept. 28, 2016, 9:57 p.m. UTC | #24
On Wed, Sep 28, 2016 at 2:45 PM, wm4 <nfxjfg@googlemail.com> wrote:

>
> Much less with the latest Libav changes once they're merged in FFmpeg.
> Only at most 200 lines (all pretty trivial glue code, much of that
> just to hook it up to ffmpeg.c-specifics). The new code will remove the
> requirement to manually create the VAAPI context in the decoding case.
>
Oh, that's great! When do you think it'll be ready? Cannot wait to give it
a try!

I didn't know about the unfixed issues in the libyami wrapper that you
mentioned above.
I agree that if you can make hwaccels easy to use, this patch is not very
useful.
BTW, is there any plan to support VP8 with the vaapi hwaccel?

wm4 Sept. 28, 2016, 10:14 p.m. UTC | #25
On Wed, 28 Sep 2016 14:57:50 -0700
Chao Liu <yijinliu@gmail.com> wrote:

> Oh, that's great! When do you think it'll be ready? Cannot wait to give it
> a try!

Merging is hard work, and last I heard we were 300 commits behind, so
probably a while.

Until then, you can see it here:
https://git.libav.org/?p=libav.git;a=blob;f=avconv_vaapi.c

> BTW, is there any plan to support VP8 with vaapi hwaccel?

ffmpeg_vaapi.c seems to have some VP8 support (and the recent Libav
rework keeps that); no idea if it works.
Mark Thompson Sept. 28, 2016, 10:23 p.m. UTC | #26
On 28/09/16 22:57, Chao Liu wrote:
> BTW, is there any plan to support VP8 with vaapi hwaccel?

No plan; already done: <https://git.libav.org/?p=libav.git;a=commit;h=a9fb134730da1f9642eb5a2baa50943b8a4aa245>.

(Depends on the changed decode infrastructure though, so it won't cherry-pick easily.)

- Mark
Chao Liu Sept. 28, 2016, 10:30 p.m. UTC | #27
On Wed, Sep 28, 2016 at 3:23 PM, Mark Thompson <sw@jkqxz.net> wrote:

> On 28/09/16 22:57, Chao Liu wrote:
> > BTW, is there any plan to support VP8 with vaapi hwaccel?
>
> No plan; already done: <https://git.libav.org/?p=libav.git;a=commit;h=
> a9fb134730da1f9642eb5a2baa50943b8a4aa245>.
>
Cool. Thanks!
What about VP8 encoder?

>
> (Depends on the changed decode infrastructure though, so it won't
> cherry-pick easily.)


> - Mark
>
Mark Thompson Sept. 28, 2016, 11:18 p.m. UTC | #28
On 28/09/16 23:30, Chao Liu wrote:
> On Wed, Sep 28, 2016 at 3:23 PM, Mark Thompson <sw@jkqxz.net> wrote:
> 
>> On 28/09/16 22:57, Chao Liu wrote:
>>> BTW, is there any plan to support VP8 with vaapi hwaccel?
>>
>> No plan; already done: <https://git.libav.org/?p=libav.git;a=commit;h=
>> a9fb134730da1f9642eb5a2baa50943b8a4aa245>.
>>
> Cool. Thanks!
> What about VP8 encoder?

Do go ahead.  I don't have any current plans to do it and I don't know of anyone else who would, so you wouldn't be conflicting with anyone.

Thanks,

- Mark
diff mbox

Patch

diff --git a/Makefile b/Makefile
index 8aa72fd..7932570 100644
--- a/Makefile
+++ b/Makefile
@@ -36,6 +36,7 @@  OBJS-ffmpeg-$(CONFIG_VAAPI)   += ffmpeg_vaapi.o
 ifndef CONFIG_VIDEOTOOLBOX
 OBJS-ffmpeg-$(CONFIG_VDA)     += ffmpeg_videotoolbox.o
 endif
+OBJS-ffmpeg-$(CONFIG_LIBYAMI) += ffmpeg_libyami.o
 OBJS-ffmpeg-$(CONFIG_CUVID)   += ffmpeg_cuvid.o
 OBJS-ffmpeg-$(HAVE_DXVA2_LIB) += ffmpeg_dxva2.o
 OBJS-ffmpeg-$(HAVE_VDPAU_X11) += ffmpeg_vdpau.o
diff --git a/configure b/configure
index 9b92426..ba50f22 100755
--- a/configure
+++ b/configure
@@ -258,6 +258,7 @@  External library support:
   --enable-libspeex        enable Speex de/encoding via libspeex [no]
   --enable-libssh          enable SFTP protocol via libssh [no]
   --enable-libtesseract    enable Tesseract, needed for ocr filter [no]
+  --enable-libyami         enable Libyami video encoding/decoding/post-processing [no]
   --enable-libtheora       enable Theora encoding via libtheora [no]
   --enable-libtwolame      enable MP2 encoding via libtwolame [no]
   --enable-libv4l2         enable libv4l2/v4l-utils [no]
@@ -1519,6 +1520,7 @@  EXTERNAL_LIBRARY_LIST="
     libspeex
     libssh
     libtesseract
+    libyami
     libtheora
     libtwolame
     libv4l2
@@ -2787,6 +2789,26 @@  libshine_encoder_select="audio_frame_queue"
 libspeex_decoder_deps="libspeex"
 libspeex_encoder_deps="libspeex"
 libspeex_encoder_select="audio_frame_queue"
+libyami_decoder_deps="libyami pthreads"
+libyami_decoder_extralibs="-lstdc++"
+libyami_encoder_deps="libyami pthreads"
+libyami_encoder_extralibs="-lstdc++"
+libyami_h264_decoder_deps="libyami"
+libyami_h264_decoder_select="libyami_decoder"
+libyami_hevc_decoder_deps="libyami"
+libyami_hevc_decoder_select="libyami_decoder"
+libyami_vp8_decoder_deps="libyami"
+libyami_vp8_decoder_select="libyami_decoder"
+libyami_mpeg2_decoder_deps="libyami"
+libyami_mpeg2_decoder_select="libyami_decoder"
+libyami_vc1_decoder_deps="libyami"
+libyami_vc1_decoder_select="libyami_decoder"
+libyami_vp9_decoder_deps="libyami"
+libyami_vp9_decoder_select="libyami_decoder"
+libyami_vp8_encoder_deps="libyami"
+libyami_vp8_encoder_select="libyami_encoder"
+libyami_h264_encoder_deps="libyami"
+libyami_h264_encoder_select="libyami_encoder"
 libtheora_encoder_deps="libtheora"
 libtwolame_encoder_deps="libtwolame"
 libvo_amrwbenc_encoder_deps="libvo_amrwbenc"
@@ -3080,6 +3102,8 @@  zmq_filter_deps="libzmq"
 zoompan_filter_deps="swscale"
 zscale_filter_deps="libzimg"
 scale_vaapi_filter_deps="vaapi VAProcPipelineParameterBuffer"
+yamivpp_filter_deps="libyami"
+yamivpp_filter_extralibs="-lstdc++"
 
 # examples
 avcodec_example_deps="avcodec avutil"
@@ -5056,6 +5080,7 @@  die_license_disabled version3 libopencore_amrnb
 die_license_disabled version3 libopencore_amrwb
 die_license_disabled version3 libsmbclient
 die_license_disabled version3 libvo_amrwbenc
+die_license_disabled version3 libyami
 
 enabled version3 && { enabled gpl && enable gplv3 || enable lgplv3; }
 
@@ -5706,6 +5731,7 @@  enabled libsoxr           && require libsoxr soxr.h soxr_create -lsoxr && LIBSOX
 enabled libssh            && require_pkg_config libssh libssh/sftp.h sftp_init
 enabled libspeex          && require_pkg_config speex speex/speex.h speex_decoder_init -lspeex
 enabled libtesseract      && require_pkg_config tesseract tesseract/capi.h TessBaseAPICreate
+enabled libyami           && require_pkg_config libyami VideoDecoderDefs.h "" -lstdc++
 enabled libtheora         && require libtheora theora/theoraenc.h th_info_init -ltheoraenc -ltheoradec -logg
 enabled libtwolame        && require libtwolame twolame.h twolame_init -ltwolame &&
                              { check_lib twolame.h twolame_encode_buffer_float32_interleaved -ltwolame ||
@@ -6347,6 +6373,7 @@  enabled spectrumsynth_filter && prepend avfilter_deps "avcodec"
 enabled subtitles_filter    && prepend avfilter_deps "avformat avcodec"
 enabled uspp_filter         && prepend avfilter_deps "avcodec"
 
+
 enabled lavfi_indev         && prepend avdevice_deps "avfilter"
 
 enabled opus_decoder    && prepend avcodec_deps "swresample"
diff --git a/ffmpeg.c b/ffmpeg.c
index bae515d..daad9ce 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -3054,6 +3054,10 @@  static int transcode_init(void)
                 exit_program(1);
 #endif
 
+#if CONFIG_LIBYAMI
+            if (yami_transcode_init(ist, ost))
+                exit_program(1);
+#endif
 #if CONFIG_CUVID
             if (cuvid_transcode_init(ost))
                 exit_program(1);
diff --git a/ffmpeg.h b/ffmpeg.h
index 49d65d8..c31ddc7 100644
--- a/ffmpeg.h
+++ b/ffmpeg.h
@@ -585,6 +585,7 @@  int vda_init(AVCodecContext *s);
 int videotoolbox_init(AVCodecContext *s);
 int qsv_init(AVCodecContext *s);
 int qsv_transcode_init(OutputStream *ost);
+int yami_transcode_init(InputStream *ist, OutputStream *ost);
 int vaapi_decode_init(AVCodecContext *avctx);
 int vaapi_device_init(const char *device);
 int cuvid_init(AVCodecContext *s);
diff --git a/ffmpeg_libyami.c b/ffmpeg_libyami.c
new file mode 100644
index 0000000..dbb6d36
--- /dev/null
+++ b/ffmpeg_libyami.c
@@ -0,0 +1,85 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#include "libavutil/dict.h"
+#include "libavutil/mem.h"
+#include "libavutil/opt.h"
+
+#include "ffmpeg.h"
+
+int yami_transcode_init(InputStream *inst, OutputStream *ost)
+{
+    InputStream *ist;
+    const enum AVPixelFormat *pix_fmt;
+
+    AVDictionaryEntry *e;
+    const AVOption *opt;
+    int flags = 0;
+
+    int i;
+
+    if (ost && inst && 0 == strncmp(ost->enc_ctx->codec->name, "libyami", strlen("libyami")) &&
+        0 == strncmp(inst->dec_ctx->codec->name, "libyami", strlen("libyami"))) {
+        /* check if the encoder supports LIBYAMI */
+        if (!ost->enc->pix_fmts)
+            return 0;
+        for (pix_fmt = ost->enc->pix_fmts; *pix_fmt != AV_PIX_FMT_NONE; pix_fmt++)
+            if (*pix_fmt == AV_PIX_FMT_YAMI)
+                break;
+        if (*pix_fmt == AV_PIX_FMT_NONE)
+            return 0;
+
+        if (ost->source_index < 0)
+            return 0;
+
+        /* check if the decoder supports libyami and the output only goes to this stream */
+        ist = input_streams[ost->source_index];
+        if ((ist->nb_filters > 1) ||
+            !ist->dec || !ist->dec->pix_fmts)
+            return 0;
+        for (pix_fmt = ist->dec->pix_fmts; *pix_fmt != AV_PIX_FMT_NONE; pix_fmt++)
+            if (*pix_fmt == AV_PIX_FMT_YAMI)
+                break;
+        if (*pix_fmt == AV_PIX_FMT_NONE)
+            return 0;
+
+        for (i = 0; i < nb_output_streams; i++)
+            if (output_streams[i] != ost &&
+                output_streams[i]->source_index == ost->source_index)
+                return 0;
+
+        av_log(NULL, AV_LOG_VERBOSE, "Setting up libyami transcoding\n");
+
+        e = av_dict_get(ost->encoder_opts, "flags", NULL, 0);
+        opt = av_opt_find(ost->enc_ctx, "flags", NULL, 0, 0);
+        if (e && opt)
+            av_opt_eval_flags(ost->enc_ctx, opt, e->value, &flags);
+
+        ost->enc_ctx->pix_fmt         = AV_PIX_FMT_YAMI;
+
+        ist->dec_ctx->pix_fmt         = AV_PIX_FMT_YAMI;
+        ist->resample_pix_fmt         = AV_PIX_FMT_YAMI;
+    }
+
+    return 0;
+}
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index b375720..2b798d9 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -883,6 +883,14 @@  OBJS-$(CONFIG_LIBSCHROEDINGER_ENCODER)    += libschroedingerenc.o \
 OBJS-$(CONFIG_LIBSHINE_ENCODER)           += libshine.o
 OBJS-$(CONFIG_LIBSPEEX_DECODER)           += libspeexdec.o
 OBJS-$(CONFIG_LIBSPEEX_ENCODER)           += libspeexenc.o
+OBJS-$(CONFIG_LIBYAMI_H264_DECODER)       += libyami_dec.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_H264_ENCODER)       += libyami_enc.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_HEVC_DECODER)       += libyami_dec.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_VP8_DECODER)        += libyami_dec.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_VP8_ENCODER)        += libyami_enc.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_MPEG2_DECODER)      += libyami_dec.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_VC1_DECODER)        += libyami_dec.o libyami.o
+OBJS-$(CONFIG_LIBYAMI_VP9_DECODER)        += libyami_dec.o libyami.o
 OBJS-$(CONFIG_LIBTHEORA_ENCODER)          += libtheoraenc.o
 OBJS-$(CONFIG_LIBTWOLAME_ENCODER)         += libtwolame.o
 OBJS-$(CONFIG_LIBVO_AMRWBENC_ENCODER)     += libvo-amrwbenc.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index a1ae61f..55920bf 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -642,6 +642,12 @@  void avcodec_register_all(void)
     REGISTER_ENCODER(LIBKVAZAAR,        libkvazaar);
     REGISTER_ENCODER(MJPEG_VAAPI,       mjpeg_vaapi);
     REGISTER_ENCODER(MPEG2_QSV,         mpeg2_qsv);
+    REGISTER_ENCDEC (LIBYAMI_H264,      libyami_h264);
+    REGISTER_DECODER(LIBYAMI_HEVC,      libyami_hevc);
+    REGISTER_ENCDEC(LIBYAMI_VP8,        libyami_vp8);
+    REGISTER_DECODER(LIBYAMI_MPEG2,     libyami_mpeg2);
+    REGISTER_DECODER(LIBYAMI_VC1,       libyami_vc1);
+    REGISTER_DECODER(LIBYAMI_VP9,       libyami_vp9);
     REGISTER_DECODER(VC1_CUVID,         vc1_cuvid);
     REGISTER_DECODER(VP8_CUVID,         vp8_cuvid);
     REGISTER_DECODER(VP9_CUVID,         vp9_cuvid);
diff --git a/libavcodec/libyami.cpp b/libavcodec/libyami.cpp
new file mode 100644
index 0000000..e8fef55
--- /dev/null
+++ b/libavcodec/libyami.cpp
@@ -0,0 +1,429 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+extern "C" {
+#include "avcodec.h"
+#include "libavutil/imgutils.h"
+#include "internal.h"
+}
+
+#include "VideoCommonDefs.h"
+#include "libyami.h"
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#define HAVE_VAAPI_DRM 1
+
+#if HAVE_VAAPI_X11
+#include <X11/Xlib.h>
+#endif
+
+VADisplay ff_vaapi_create_display(void)
+{
+    static VADisplay display = NULL;
+
+    if (!display) {
+#if !HAVE_VAAPI_DRM
+        const char *device = NULL;
+        /* Try to open the device as an X11 display */
+        Display *x11_display = XOpenDisplay(device);
+        if (!x11_display) {
+            return NULL;
+        } else {
+            display = vaGetDisplay(x11_display);
+            if (!display) {
+                XCloseDisplay(x11_display);
+            }
+        }
+#else
+        const char *devices[] = {
+            "/dev/dri/renderD128",
+            "/dev/dri/card0",
+            NULL
+        };
+        // Try to open the device as a DRM path.
+        int i;
+        int drm_fd;
+        for (i = 0; !display && devices[i]; i++) {
+            drm_fd = open(devices[i], O_RDWR);
+            if (drm_fd < 0)
+                continue;
+
+            display = vaGetDisplayDRM(drm_fd);
+            if (!display)
+                close(drm_fd);
+        }
+#endif
+        if (!display)
+            return NULL;
+        int majorVersion, minorVersion;
+        VAStatus vaStatus = vaInitialize(display, &majorVersion, &minorVersion);
+        if (vaStatus != VA_STATUS_SUCCESS) {
+#if HAVE_VAAPI_DRM
+            close(drm_fd);
+#endif
+            display = NULL;
+            return NULL;
+        }
+        return display;
+    } else {
+        return display;
+    }
+}
+
+/*
+ * Use the SSE4 MOVNTDQA instruction to improve the performance of data copies
+ * from Uncacheable Speculative Write Combining (USWC) memory to ordinary
+ * write-back (WB) system memory.
+ * https://software.intel.com/en-us/articles/copying-accelerated-video-decode-frame-buffers/
+ */
+#if HAVE_SSE4
+#define COPY16(dstp, srcp, load, store) \
+    __asm__ volatile (                  \
+        load "  0(%[src]), %%xmm1\n"    \
+        store " %%xmm1,    0(%[dst])\n" \
+        : : [dst]"r"(dstp), [src]"r"(srcp) : "memory", "xmm1")
+
+#define COPY128(dstp, srcp, load, store) \
+    __asm__ volatile (                   \
+        load "  0(%[src]), %%xmm1\n"     \
+        load " 16(%[src]), %%xmm2\n"     \
+        load " 32(%[src]), %%xmm3\n"     \
+        load " 48(%[src]), %%xmm4\n"     \
+        load " 64(%[src]), %%xmm5\n"     \
+        load " 80(%[src]), %%xmm6\n"     \
+        load " 96(%[src]), %%xmm7\n"     \
+        load " 112(%[src]), %%xmm8\n"    \
+        store " %%xmm1,    0(%[dst])\n"  \
+        store " %%xmm2,   16(%[dst])\n"  \
+        store " %%xmm3,   32(%[dst])\n"  \
+        store " %%xmm4,   48(%[dst])\n"  \
+        store " %%xmm5,   64(%[dst])\n"  \
+        store " %%xmm6,   80(%[dst])\n"  \
+        store " %%xmm7,   96(%[dst])\n"  \
+        store " %%xmm8,   112(%[dst])\n" \
+        : : [dst]"r"(dstp), [src]"r"(srcp) : "memory", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7", "xmm8")
+
+void *ff_copy_from_uswc(void *dst, void *src, size_t size)
+{
+    char aligned;
+    int remain;
+    int i, round;
+    uint8_t *pDst, *pSrc;
+
+    if (dst == NULL || src == NULL || size == 0) {
+        return NULL;
+    }
+
+    aligned = (((size_t) dst) | ((size_t) src)) & 0x0F;
+
+    if (aligned != 0) {
+        return NULL;
+    }
+
+    pDst = (uint8_t *) dst;
+    pSrc = (uint8_t *) src;
+    remain = size & 0x7F;
+    round = size >> 7;
+
+    __asm__ volatile ("mfence");
+
+    for (i = 0; i < round; i++) {
+        COPY128(pDst, pSrc, "movntdqa", "movdqa");
+        pSrc += 128;
+        pDst += 128;
+    }
+
+    if (remain >= 16) {
+        size = remain;
+        remain = size & 0xF;
+        round = size >> 4;
+
+        for (i = 0; i < round; i++) {
+            COPY16(pDst, pSrc, "movntdqa", "movdqa");
+            pSrc += 16;
+            pDst += 16;
+        }
+    }
+
+    if (remain > 0) {
+        char *ps = (char *)(pSrc);
+        char *pd = (char *)(pDst);
+
+        for (i = 0; i < remain; i++) {
+            pd[i] = ps[i];
+        }
+    }
+    __asm__ volatile ("mfence");
+
+    return dst;
+}
+#else
+void *ff_copy_from_uswc(void *dst, void *src, size_t size)
+{
+    return memcpy(dst, src, size);
+}
+#endif
+
+bool ff_check_vaapi_status(VAStatus status, const char *msg)
+{
+    if (status != VA_STATUS_SUCCESS) {
+        av_log(NULL, AV_LOG_ERROR, "%s: %s", msg, vaErrorStr(status));
+        return false;
+    }
+    return true;
+}
+
+SharedPtr<VideoFrame>
+ff_vaapi_create_surface(uint32_t rt_fmt, int pix_fmt, uint32_t w, uint32_t h)
+{
+    SharedPtr<VideoFrame> frame;
+    VAStatus status;
+    VASurfaceID id;
+    VASurfaceAttrib attrib;
+
+    VADisplay m_vaDisplay = ff_vaapi_create_display();
+
+    attrib.type =  VASurfaceAttribPixelFormat;
+    attrib.flags = VA_SURFACE_ATTRIB_SETTABLE;
+    attrib.value.type = VAGenericValueTypeInteger;
+    attrib.value.value.i = pix_fmt;
+
+    status = vaCreateSurfaces(m_vaDisplay, rt_fmt, w, h, &id, 1, &attrib, 1);
+    if (!ff_check_vaapi_status(status, "vaCreateSurfaces"))
+        return frame;
+    frame.reset(new VideoFrame);
+    memset(frame.get(), 0, sizeof(VideoFrame));
+    frame->surface = (intptr_t)id;
+    frame->crop.x = frame->crop.y = 0;
+    frame->crop.width = w;
+    frame->crop.height = h;
+    frame->fourcc = pix_fmt;
+
+    return frame;
+}
+
+bool ff_vaapi_destory_surface(SharedPtr<VideoFrame>& frame)
+{
+    VADisplay m_vaDisplay = ff_vaapi_create_display();
+    VASurfaceID id = (VASurfaceID)(frame->surface);
+    VAStatus status = vaDestroySurfaces((VADisplay)m_vaDisplay, &id, 1);
+    if (!ff_check_vaapi_status(status, "vaDestroySurfaces"))
+        return false;
+
+    return true;
+}
+
+bool ff_vaapi_load_image(SharedPtr<VideoFrame>& frame, AVFrame *in)
+{
+    VASurfaceID surface = (VASurfaceID)frame->surface;
+    VAImage image;
+
+    uint32_t dest_linesize[4] = {0};
+    const uint8_t *src_data[4];
+    uint8_t *dest_data[4];
+
+    VADisplay m_vaDisplay = ff_vaapi_create_display();
+
+    VAStatus status = vaDeriveImage(m_vaDisplay, surface, &image);
+    if (!ff_check_vaapi_status(status, "vaDeriveImage"))
+        return false;
+
+    uint8_t *buf = NULL;
+    status = vaMapBuffer(m_vaDisplay, image.buf, (void**)&buf);
+    if (!ff_check_vaapi_status(status, "vaMapBuffer")) {
+        vaDestroyImage(m_vaDisplay, image.image_id);
+        return false;
+    }
+
+    src_data[0] = in->data[0];
+    src_data[1] = in->data[1];
+    src_data[2] = in->data[2];
+
+    dest_data[0] = buf + image.offsets[0];
+    dest_data[1] = buf + image.offsets[1];
+    dest_data[2] = buf + image.offsets[2];
+
+    if (in->format == AV_PIX_FMT_YUV420P || in->format == AV_PIX_FMT_NV12) {
+        dest_linesize[0] = image.pitches[0];
+        dest_linesize[1] = image.pitches[1];
+        dest_linesize[2] = image.pitches[2];
+    } else {
+        av_log(NULL, AV_LOG_ERROR, "Unsupported pixel format: %s.\n",
+               av_pix_fmt_desc_get((AVPixelFormat)in->format)->name);
+        return false;
+    }
+
+    av_image_copy(dest_data, (int *)dest_linesize, src_data,
+                  (int *)in->linesize, (AVPixelFormat)in->format,
+                  in->width, in->height);
+    frame->timeStamp = in->pts;
+
+    ff_check_vaapi_status(vaUnmapBuffer(m_vaDisplay, image.buf), "vaUnmapBuffer");
+    ff_check_vaapi_status(vaDestroyImage(m_vaDisplay, image.image_id), "vaDestroyImage");
+    return true;
+}
+
+bool ff_vaapi_get_image(SharedPtr<VideoFrame>& frame, AVFrame *out)
+{
+    VASurfaceID surface = (VASurfaceID)frame->surface;
+    VAImage image;
+    VAStatus status;
+    uint32_t src_linesize[4] = { 0 };
+    uint32_t dest_linesize[4] = { 0 };
+    const uint8_t *src_data[4];
+    uint8_t *dest_data[4];
+
+    VADisplay m_vaDisplay = ff_vaapi_create_display();
+
+    if (out->format == AV_PIX_FMT_NV12) {
+        status = vaDeriveImage(m_vaDisplay, surface, &image);
+        if (!ff_check_vaapi_status(status, "vaDeriveImage"))
+            return false;
+    } else {
+        VAImageFormat image_format;
+        image_format.fourcc = VA_FOURCC_I420;
+        image_format.byte_order = 1;
+        image_format.bits_per_pixel = 12;
+        status = vaCreateImage(m_vaDisplay, &image_format,
+                               frame->crop.width, frame->crop.height, &image);
+        if (!ff_check_vaapi_status(status, "vaCreateImage"))
+            return false;
+        status = vaGetImage(m_vaDisplay, surface, 0, 0,
+                            out->width, out->height, image.image_id);
+        if (!ff_check_vaapi_status(status, "vaGetImage"))
+            return false;
+    }
+
+    uint8_t *buf = NULL;
+    status = vaMapBuffer(m_vaDisplay, image.buf, (void**)&buf);
+    if (!ff_check_vaapi_status(status, "vaMapBuffer")) {
+        vaDestroyImage(m_vaDisplay, image.image_id);
+        return false;
+    }
+
+    dest_data[0] = out->data[0];
+    dest_data[1] = out->data[1];
+    dest_data[2] = out->data[2];
+
+    int plane_size = image.data_size;
+    uint8_t *plane_buf = (uint8_t *)av_malloc(FFMAX(image.width * image.height * 3, plane_size));
+    if (!plane_buf) {
+        ff_check_vaapi_status(vaUnmapBuffer(m_vaDisplay, image.buf), "vaUnmapBuffer");
+        ff_check_vaapi_status(vaDestroyImage(m_vaDisplay, image.image_id), "vaDestroyImage");
+        return false;
+    }
+
+    ff_copy_from_uswc((void *)plane_buf, (void *)buf, plane_size);
+
+    src_data[0] = plane_buf + image.offsets[0];
+    src_data[1] = plane_buf + image.offsets[1];
+    src_data[2] = plane_buf + image.offsets[2];
+
+    if (out->format == AV_PIX_FMT_YUV420P || out->format == AV_PIX_FMT_NV12) {
+        dest_linesize[0] = out->linesize[0];
+        dest_linesize[1] = out->linesize[1];
+        dest_linesize[2] = out->linesize[2];
+
+        src_linesize[0] = image.pitches[0];
+        src_linesize[1] = image.pitches[1];
+        src_linesize[2] = image.pitches[2];
+    } else {
+        av_log(NULL, AV_LOG_ERROR, "Unsupported pixel format: %s.\n",
+               av_pix_fmt_desc_get((AVPixelFormat)out->format)->name);
+        return false;
+    }
+
+    av_image_copy(dest_data, (int *)dest_linesize, src_data,
+                  (int *)src_linesize, (AVPixelFormat)out->format,
+                  out->width, out->height);
+
+    av_free(plane_buf);
+
+    ff_check_vaapi_status(vaUnmapBuffer(m_vaDisplay, image.buf), "vaUnmapBuffer");
+    ff_check_vaapi_status(vaDestroyImage(m_vaDisplay, image.image_id), "vaDestroyImage");
+    return true;
+}
+
+YamiStatus ff_yami_alloc_surface (SurfaceAllocator* thiz, SurfaceAllocParams* params)
+{
+    if (!params)
+        return YAMI_INVALID_PARAM;
+    uint32_t size = params->size;
+    uint32_t width = params->width;
+    uint32_t height = params->height;
+    if (!width || !height || !size)
+        return YAMI_INVALID_PARAM;
+
+    size += EXTRA_SIZE;
+
+    VASurfaceID *v = new VASurfaceID[size];
+    VAStatus status = vaCreateSurfaces(ff_vaapi_create_display(), VA_RT_FORMAT_YUV420, width,
+                                       height, &v[0], size, NULL, 0);
+    if (!ff_check_vaapi_status(status, "vaCreateSurfaces")) {
+        delete[] v;
+        return YAMI_FAIL;
+    }
+
+    params->surfaces = new intptr_t[size];
+    for (uint32_t i = 0; i < size; i++) {
+        params->surfaces[i] = (intptr_t)v[i];
+    }
+    delete[] v;
+    params->size = size;
+    return YAMI_SUCCESS;
+}
+
+YamiStatus ff_yami_free_surface (SurfaceAllocator* thiz, SurfaceAllocParams* params)
+{
+    if (!params || !params->size || !params->surfaces)
+        return YAMI_INVALID_PARAM;
+    uint32_t size = params->size;
+    VADisplay m_vaDisplay = ff_vaapi_create_display();
+    VASurfaceID *surfaces = new VASurfaceID[size];
+    for (uint32_t i = 0; i < size; i++) {
+        surfaces[i] = params->surfaces[i];
+    }
+    VAStatus status = vaDestroySurfaces((VADisplay) m_vaDisplay, &surfaces[0], size);
+    delete[] surfaces;
+    if (!ff_check_vaapi_status(status, "vaDestroySurfaces"))
+        return YAMI_FAIL;
+
+    delete[] params->surfaces;
+    return YAMI_SUCCESS;
+}
+
+void ff_yami_unref_surface (SurfaceAllocator* thiz)
+{
+    //TODO
+}
diff --git a/libavcodec/libyami.h b/libavcodec/libyami.h
new file mode 100644
index 0000000..b118521
--- /dev/null
+++ b/libavcodec/libyami.h
@@ -0,0 +1,59 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef LIBAVCODEC_LIBYAMI_H_
+#define LIBAVCODEC_LIBYAMI_H_
+
+#include <va/va_drm.h>
+#if HAVE_VAAPI_X11
+#include <va/va_x11.h>
+#endif
+
+#ifndef VA_FOURCC_I420
+#define VA_FOURCC_I420 VA_FOURCC('I','4','2','0')
+#endif
+
+typedef struct {
+    SharedPtr<VideoFrame> output_frame;
+    VADisplay va_display;
+} YamiImage;
+
+VADisplay ff_vaapi_create_display(void);
+SharedPtr<VideoFrame>
+ff_vaapi_create_surface(uint32_t rt_fmt, int pix_fmt, uint32_t w, uint32_t h);
+bool ff_vaapi_destory_surface(SharedPtr<VideoFrame>& frame);
+bool ff_vaapi_load_image(SharedPtr<VideoFrame>& frame, AVFrame *in);
+bool ff_vaapi_get_image(SharedPtr<VideoFrame>& frame, AVFrame *out);
+bool ff_check_vaapi_status(VAStatus status, const char *msg);
+
+YamiStatus ff_yami_alloc_surface (SurfaceAllocator* thiz, SurfaceAllocParams* params);
+YamiStatus ff_yami_free_surface (SurfaceAllocator* thiz, SurfaceAllocParams* params);
+void ff_yami_unref_surface (SurfaceAllocator* thiz);
+
+#define DECODE_QUEUE_SIZE 8
+#define ENCODE_QUEUE_SIZE 4
+
+/* EXTRA_SIZE must be greater than DECODE_QUEUE_SIZE + ENCODE_QUEUE_SIZE + DPB - 19,
+   or the decode thread will block */
+#define EXTRA_SIZE (DECODE_QUEUE_SIZE + ENCODE_QUEUE_SIZE + 2)
+#endif /* LIBAVCODEC_LIBYAMI_H_ */
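The head-room constants above can be sanity-checked in isolation; `pool_size()` is a hypothetical helper mirroring the `size += EXTRA_SIZE` step in `ff_yami_alloc_surface()`:

```cpp
// The allocator pads every surface request by EXTRA_SIZE so that frames held
// in the decode queue, the encode queue, and the codec's DPB cannot starve
// each other (same definitions as libyami.h).
#define DECODE_QUEUE_SIZE 8
#define ENCODE_QUEUE_SIZE 4
#define EXTRA_SIZE (DECODE_QUEUE_SIZE + ENCODE_QUEUE_SIZE + 2)

static unsigned pool_size(unsigned requested)
{
    return requested + EXTRA_SIZE;  // mirrors ff_yami_alloc_surface()
}
```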
diff --git a/libavcodec/libyami_dec.cpp b/libavcodec/libyami_dec.cpp
new file mode 100644
index 0000000..e7f20c7
--- /dev/null
+++ b/libavcodec/libyami_dec.cpp
@@ -0,0 +1,527 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#include <pthread.h>
+#include <unistd.h>
+#include <deque>
+
+extern "C" {
+#include "avcodec.h"
+#include "libavutil/avassert.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/opt.h"
+#include "libavutil/time.h"
+#include "libavutil/mem.h"
+#include "libavutil/pixdesc.h"
+#include "internal.h"
+#include "libavutil/internal.h"
+}
+#include "VideoDecoderHost.h"
+#include "libyami.h"
+#include "libyami_dec.h"
+
+using namespace YamiMediaCodec;
+
+static int ff_yami_decode_thread_init(YamiDecContext *s)
+{
+    int ret = 0;
+    if (!s)
+        return -1;
+    if ((ret = pthread_mutex_init(&s->ctx_mutex, NULL)) < 0)
+        return ret;
+    if ((ret = pthread_mutex_init(&s->in_mutex, NULL)) < 0)
+        return ret;
+    if ((ret = pthread_cond_init(&s->in_cond, NULL)) < 0)
+        return ret;
+    s->decode_status = DECODE_THREAD_NOT_INIT;
+    return 0;
+}
+
+static int ff_yami_decode_thread_close(YamiDecContext *s)
+{
+    if (!s)
+        return -1;
+    pthread_mutex_lock(&s->ctx_mutex);
+    /* if the decode thread was never created, do not loop */
+    while (s->decode_status != DECODE_THREAD_EXIT
+           && s->decode_status != DECODE_THREAD_NOT_INIT) {
+        s->decode_status = DECODE_THREAD_GOT_EOS;
+        pthread_mutex_unlock(&s->ctx_mutex);
+        pthread_cond_signal(&s->in_cond);
+        av_usleep(10000);
+        pthread_mutex_lock(&s->ctx_mutex);
+    }
+    pthread_mutex_unlock(&s->ctx_mutex);
+    pthread_mutex_destroy(&s->ctx_mutex);
+    pthread_mutex_destroy(&s->in_mutex);
+    pthread_cond_destroy(&s->in_cond);
+    return 0;
+}
+
+static void *ff_yami_decode_thread(void *arg)
+{
+    AVCodecContext *avctx = (AVCodecContext *)arg;
+    YamiDecContext *s = (YamiDecContext *)avctx->priv_data;
+    while (1) {
+        VideoDecodeBuffer *in_buffer = NULL;
+
+        av_log(avctx, AV_LOG_VERBOSE, "decode thread running ...\n");
+        /* if the input queue is empty and EOS has not been received, wait;
+           otherwise flush the decoder with a NULL buffer */
+        pthread_mutex_lock(&s->in_mutex);
+        if (s->in_queue->empty()) {
+            if (s->decode_status == DECODE_THREAD_GOT_EOS) {
+                /* flush the decode buffer with NULL when get EOS */
+                VideoDecodeBuffer flush_buffer;
+                flush_buffer.data = NULL;
+                flush_buffer.size = 0;
+                s->decoder->decode(&flush_buffer);
+                pthread_mutex_unlock(&s->in_mutex);
+                break;
+            }
+
+            av_log(avctx, AV_LOG_VERBOSE, "decode thread waiting with empty queue.\n");
+            pthread_cond_wait(&s->in_cond, &s->in_mutex); /* wait the packet to decode */
+            pthread_mutex_unlock(&s->in_mutex);
+            continue;
+        }
+
+        av_log(avctx, AV_LOG_VERBOSE, "in queue size %zu\n", s->in_queue->size());
+        /* get a packet from in queue and decode */
+        in_buffer = s->in_queue->front();
+        pthread_mutex_unlock(&s->in_mutex);
+        av_log(avctx, AV_LOG_VERBOSE, "process input buffer, [data=%p, size=%zu]\n",
+               in_buffer->data, in_buffer->size);
+        Decode_Status status = s->decoder->decode(in_buffer);
+        av_log(avctx, AV_LOG_VERBOSE, "decode status %d, decoded count %d render count %d\n",
+               status, s->decode_count_yami, s->render_count);
+        /* get the format info after the first successful decode */
+        if (DECODE_SUCCESS == status && !s->format_info) {
+            s->format_info = s->decoder->getFormatInfo();
+            if (s->format_info) {
+                av_log(avctx, AV_LOG_VERBOSE, "decode format %dx%d\n",
+                       s->format_info->width, s->format_info->height);
+                avctx->width  = s->format_info->width;
+                avctx->height = s->format_info->height;
+            }
+        }
+
+        /* on a format change, update the format info and re-send
+           the packet to the decoder */
+        if (DECODE_FORMAT_CHANGE == status) {
+            s->format_info = s->decoder->getFormatInfo();
+            if (s->format_info) {
+                avctx->width  = s->format_info->width;
+                avctx->height = s->format_info->height;
+                av_log(avctx, AV_LOG_VERBOSE, "decode format change %dx%d\n",
+                       s->format_info->width, s->format_info->height);
+            }
+            status = s->decoder->decode(in_buffer);
+            if (status < 0) {
+                av_log(avctx, AV_LOG_ERROR, "decode error %d\n", status);
+            }
+        }
+
+        if (status < 0 || !s->format_info) {
+            av_log(avctx, AV_LOG_ERROR, "decode error %d\n", status);
+            break;
+        }
+
+        s->decode_count_yami++;
+
+        pthread_mutex_lock(&s->in_mutex);
+        s->in_queue->pop_front();
+        pthread_mutex_unlock(&s->in_mutex);
+        av_free(in_buffer->data);
+        av_free(in_buffer);
+    }
+
+    av_log(avctx, AV_LOG_VERBOSE, "decode thread exit\n");
+    pthread_mutex_lock(&s->ctx_mutex);
+    s->decode_status = DECODE_THREAD_EXIT;
+    pthread_mutex_unlock(&s->ctx_mutex);
+    return NULL;
+}
+
+static void ff_yami_recycle_frame(void *opaque, uint8_t *data)
+{
+    AVCodecContext *avctx = (AVCodecContext *)opaque;
+    YamiDecContext *s = (YamiDecContext *)avctx->priv_data;
+    YamiImage *yami_image = (YamiImage *)data;
+    if (!s || !s->decoder || !yami_image)
+        return;
+    pthread_mutex_lock(&s->ctx_mutex);
+    /* XXX: should I delete frame buffer?? */
+    yami_image->output_frame.reset();
+    av_free(yami_image);
+    pthread_mutex_unlock(&s->ctx_mutex);
+    av_log(avctx, AV_LOG_DEBUG, "recycle previous frame: %p\n", yami_image);
+}
+
+/*
+ * When the decode output format is YAMI, the decoded data stays on the GPU
+ * and is not copied back to the CPU; otherwise the USWC memory copy is used.
+ * This may later be replaced by the generic hardware surface
+ * upload/download filters "hwupload"/"hwdownload".
+ */
+static int ff_convert_to_frame(AVCodecContext *avctx, YamiImage *from, AVFrame *to)
+{
+    if(!avctx || !from || !to)
+        return -1;
+    if (avctx->pix_fmt == AV_PIX_FMT_YAMI) {
+        to->pts = from->output_frame->timeStamp;
+        to->width = avctx->width;
+        to->height = avctx->height;
+        to->format = AV_PIX_FMT_YAMI;
+        to->extended_data = to->data;
+        /* XXX: store the surface id in data[3] */
+        to->data[3] = reinterpret_cast<uint8_t *>(from);
+        to->buf[0] = av_buffer_create((uint8_t *)from,
+                                      sizeof(YamiImage),
+                                      ff_yami_recycle_frame, avctx, 0);
+    } else {
+        if (ff_get_buffer(avctx, to, 0) < 0)
+            return -1;
+
+        to->pkt_pts = AV_NOPTS_VALUE;
+        to->pkt_dts = from->output_frame->timeStamp;
+        to->pts = AV_NOPTS_VALUE;
+        to->width = avctx->width;
+        to->height = avctx->height;
+        to->format = avctx->pix_fmt;
+        to->extended_data = to->data;
+        ff_vaapi_get_image(from->output_frame, to);
+        to->buf[3] = av_buffer_create((uint8_t *) from,
+                                      sizeof(YamiImage),
+                                      ff_yami_recycle_frame, avctx, 0);
+    }
+    return 0;
+}
+
+static const char *get_mime(AVCodecID id)
+{
+    switch (id) {
+    case AV_CODEC_ID_H264:
+        return YAMI_MIME_H264;
+    case AV_CODEC_ID_HEVC:
+        return YAMI_MIME_H265;
+    case AV_CODEC_ID_VP8:
+        return YAMI_MIME_VP8;
+    case AV_CODEC_ID_MPEG2VIDEO:
+        return YAMI_MIME_MPEG2;
+    case AV_CODEC_ID_VC1:
+        return YAMI_MIME_VC1;
+    case AV_CODEC_ID_VP9:
+        return YAMI_MIME_VP9;
+    default:
+        av_assert0(!"Invalid codec ID!");
+        return NULL;
+    }
+}
+
+static av_cold int yami_dec_init(AVCodecContext *avctx)
+{
+    YamiDecContext *s = (YamiDecContext *)avctx->priv_data;
+    Decode_Status status;
+    s->decoder = NULL;
+    enum AVPixelFormat pix_fmts[4] =
+        {
+            AV_PIX_FMT_NV12,
+            AV_PIX_FMT_YUV420P,
+            AV_PIX_FMT_YAMI,
+            AV_PIX_FMT_NONE
+        };
+
+    if (avctx->pix_fmt == AV_PIX_FMT_NONE) {
+        int ret = ff_get_format(avctx, pix_fmts);
+        if (ret < 0)
+            return ret;
+
+        avctx->pix_fmt = (AVPixelFormat)ret;
+    }
+
+    VADisplay va_display = ff_vaapi_create_display();
+    if (!va_display) {
+        av_log(avctx, AV_LOG_ERROR, "failed to create VA display\n");
+        return AVERROR_BUG;
+    }
+    av_log(avctx, AV_LOG_VERBOSE, "yami_dec_init\n");
+    const char *mime_type = get_mime(avctx->codec_id);
+    s->decoder = createVideoDecoder(mime_type);
+    if (!s->decoder) {
+        av_log(avctx, AV_LOG_ERROR, "failed to create decoder\n");
+        return AVERROR_BUG;
+    }
+    NativeDisplay native_display;
+    native_display.type = NATIVE_DISPLAY_VA;
+    native_display.handle = (intptr_t)va_display;
+    s->decoder->setNativeDisplay(&native_display);
+
+    /* set external surface allocator */
+    s->p_alloc = (SurfaceAllocator *) av_mallocz(sizeof(SurfaceAllocator));
+    s->p_alloc->alloc = ff_yami_alloc_surface;
+    s->p_alloc->free = ff_yami_free_surface;
+    s->p_alloc->unref = ff_yami_unref_surface;
+    s->decoder->setAllocator(s->p_alloc);
+
+    /* follow h264.c style */
+    if (avctx->codec_id == AV_CODEC_ID_H264) {
+        if (avctx->ticks_per_frame == 1) {
+            if (avctx->time_base.den < INT_MAX / 2) {
+                avctx->time_base.den *= 2;
+            } else
+                avctx->time_base.num /= 2;
+        }
+        avctx->ticks_per_frame = 2;
+    }
+
+    VideoConfigBuffer config_buffer;
+    memset(&config_buffer, 0, sizeof(VideoConfigBuffer));
+    if (avctx->extradata && avctx->extradata_size) {
+        config_buffer.data = avctx->extradata;
+        config_buffer.size = avctx->extradata_size;
+    }
+    config_buffer.profile = VAProfileNone;
+    status = s->decoder->start(&config_buffer);
+    if (status != DECODE_SUCCESS && status != DECODE_FORMAT_CHANGE) {
+        av_log(avctx, AV_LOG_ERROR, "libyami decoder failed to start\n");
+        return AVERROR_BUG;
+    }
+    s->in_queue = new std::deque<VideoDecodeBuffer*>;
+
+#if HAVE_PTHREADS
+    if (ff_yami_decode_thread_init(s) < 0)
+        return AVERROR(ENOMEM);
+#else
+    av_log(avctx, AV_LOG_ERROR, "pthread library support is required\n");
+    return AVERROR(ENOSYS);
+#endif
+    s->decode_count = 0;
+    s->decode_count_yami = 0;
+    s->render_count = 0;
+    return 0;
+}
+
+static int ff_get_best_pkt_dts(AVFrame *frame, YamiDecContext *s)
+{
+    if (frame->pkt_dts == AV_NOPTS_VALUE && frame->pts == AV_NOPTS_VALUE) {
+        frame->pkt_dts = s->render_count * s->duration;
+    }
+    return 1;
+}
+
+static int yami_dec_frame(AVCodecContext *avctx, void *data,
+                          int *got_frame, AVPacket *avpkt)
+{
+    YamiDecContext *s = (YamiDecContext *)avctx->priv_data;
+    if (!s || !s->decoder)
+        return -1;
+    VideoDecodeBuffer *in_buffer = NULL;
+    Decode_Status status = DECODE_FAIL;
+    YamiImage *yami_image =  NULL;
+    int ret = 0;
+    AVFrame *frame = (AVFrame *)data;
+    av_log(avctx, AV_LOG_VERBOSE, "yami_dec_frame\n");
+
+    /* append packet to input buffer queue */
+    in_buffer = (VideoDecodeBuffer *)av_mallocz(sizeof(VideoDecodeBuffer));
+    if (!in_buffer)
+        return AVERROR(ENOMEM);
+    /* copy the packet payload: avpkt may be freed by the caller while the
+       buffer is still queued for the decode thread */
+    if (avpkt->data && avpkt->size) {
+        in_buffer->data = (uint8_t *)av_mallocz(avpkt->size);
+        if (!in_buffer->data) {
+            av_free(in_buffer);
+            return AVERROR(ENOMEM);
+        }
+        memcpy(in_buffer->data, avpkt->data, avpkt->size);
+    }
+    in_buffer->size = avpkt->size;
+    in_buffer->timeStamp = avpkt->pts;
+    if (avpkt->duration != 0)
+        s->duration = avpkt->duration;
+
+    while (s->decode_status < DECODE_THREAD_GOT_EOS) {
+        /* the EOS buffer may need to be enqueued more than once */
+        pthread_mutex_lock(&s->in_mutex);
+        if (s->in_queue->size() < DECODE_QUEUE_SIZE) {
+            s->in_queue->push_back(in_buffer);
+            av_log(avctx, AV_LOG_VERBOSE, "wakeup decode thread ...\n");
+            pthread_cond_signal(&s->in_cond);
+            pthread_mutex_unlock(&s->in_mutex);
+            break;
+        }
+        pthread_mutex_unlock(&s->in_mutex);
+        av_log(avctx, AV_LOG_DEBUG,
+               "in queue size %zu, decode count %d, decoded count %d, "
+               "too many buffers are being decoded, waiting ...\n",
+               s->in_queue->size(), s->decode_count, s->decode_count_yami);
+        av_usleep(1000);
+    }
+    s->decode_count++;
+
+    /* thread status update */
+    pthread_mutex_lock(&s->ctx_mutex);
+    switch (s->decode_status) {
+    case DECODE_THREAD_NOT_INIT:
+    case DECODE_THREAD_EXIT:
+        if (avpkt->data && avpkt->size) {
+            s->decode_status = DECODE_THREAD_RUNING;
+            pthread_create(&s->decode_thread_id, NULL, &ff_yami_decode_thread, avctx);
+        }
+        break;
+    case DECODE_THREAD_RUNING:
+        if (!avpkt->data || !avpkt->size) {
+            s->decode_status = DECODE_THREAD_GOT_EOS;
+            pthread_cond_signal(&s->in_cond);
+        }
+        break;
+    case DECODE_THREAD_GOT_EOS:
+        pthread_cond_signal(&s->in_cond);
+        break;
+    default:
+        break;
+    }
+    pthread_mutex_unlock(&s->ctx_mutex);
+
+    /* get an output buffer from yami */
+    do {
+        if (!s->format_info) {
+            av_usleep(10000);
+            continue;
+        }
+
+        yami_image = (YamiImage *)av_mallocz(sizeof(YamiImage));
+        if (!yami_image) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        do {
+            yami_image->output_frame = s->decoder->getOutput();
+            av_log(avctx, AV_LOG_DEBUG, "getoutput() status=%d\n", status);
+            pthread_mutex_lock(&s->ctx_mutex);
+            if (avpkt->data || yami_image->output_frame || s->decode_status == DECODE_THREAD_EXIT) {
+                pthread_mutex_unlock(&s->ctx_mutex);
+                break;
+            }
+            pthread_mutex_unlock(&s->ctx_mutex);
+            av_usleep(100);
+        } while (1);
+
+        if (yami_image->output_frame) {
+            yami_image->va_display = ff_vaapi_create_display();
+            status = DECODE_SUCCESS;
+            break;
+        }
+        *got_frame = 0;
+        av_free(yami_image);
+        return avpkt->size;
+    } while (s->decode_status == DECODE_THREAD_RUNING);
+    if (status != DECODE_SUCCESS) {
+        av_log(avctx, AV_LOG_VERBOSE, "after processed EOS, return\n");
+        return avpkt->size;
+    }
+
+    /* process the output frame */
+    if (ff_convert_to_frame(avctx, yami_image, frame) < 0)
+        av_log(avctx, AV_LOG_VERBOSE, "yami frame convert av_frame failed\n");
+    ff_get_best_pkt_dts(frame, s);
+    *got_frame = 1;
+    s->render_count++;
+    av_log(avctx, AV_LOG_VERBOSE,
+           "decode_count_yami=%d, decode_count=%d, render_count=%d\n",
+           s->decode_count_yami, s->decode_count, s->render_count);
+    return avpkt->size;
+
+fail:
+    if (yami_image) {
+        yami_image->output_frame.reset();
+        av_free(yami_image);
+    }
+    return ret;
+}
+
+static av_cold int yami_dec_close(AVCodecContext *avctx)
+{
+    YamiDecContext *s = (YamiDecContext *)avctx->priv_data;
+
+    ff_yami_decode_thread_close(s);
+    if (s->decoder) {
+        s->decoder->stop();
+        releaseVideoDecoder(s->decoder);
+        s->decoder = NULL;
+    }
+    if (s->p_alloc)
+        av_free(s->p_alloc);
+    while (!s->in_queue->empty()) {
+        VideoDecodeBuffer *in_buffer = s->in_queue->front();
+        s->in_queue->pop_front();
+        av_free(in_buffer->data);
+        av_free(in_buffer);
+    }
+    delete s->in_queue;
+    av_log(avctx, AV_LOG_VERBOSE, "yami_dec_close\n");
+    return 0;
+}
+
+#define YAMI_DEC(NAME, ID) \
+AVCodec ff_libyami_##NAME##_decoder = { \
+    /* name */                  "libyami_" #NAME, \
+    /* long_name */             NULL_IF_CONFIG_SMALL(#NAME " (libyami)"), \
+    /* type */                  AVMEDIA_TYPE_VIDEO, \
+    /* id */                    ID, \
+    /* capabilities */          CODEC_CAP_DELAY, \
+    /* supported_framerates */  NULL, \
+    /* pix_fmts */              (const enum AVPixelFormat[]) { AV_PIX_FMT_YAMI, \
+                                                               AV_PIX_FMT_NV12, \
+                                                               AV_PIX_FMT_YUV420P, \
+                                                               AV_PIX_FMT_NONE}, \
+    /* supported_samplerates */ NULL, \
+    /* sample_fmts */           NULL, \
+    /* channel_layouts */       NULL, \
+    /* max_lowres */            0, \
+    /* priv_class */            NULL, \
+    /* profiles */              NULL, \
+    /* priv_data_size */        sizeof(YamiDecContext), \
+    /* next */                  NULL, \
+    /* init_thread_copy */      NULL, \
+    /* update_thread_context */ NULL, \
+    /* defaults */              NULL, \
+    /* init_static_data */      NULL, \
+    /* init */                  yami_dec_init, \
+    /* encode_sub */            NULL, \
+    /* encode2 */               NULL, \
+    /* decode */                yami_dec_frame, \
+    /* close */                 yami_dec_close, \
+    /* send_frame */            NULL, \
+    /* send_packet */           NULL, \
+    /* receive_frame */         NULL, \
+    /* receive_packet */        NULL, \
+    /* flush */                 NULL, \
+    /* caps_internal */         FF_CODEC_CAP_SETS_PKT_DTS, \
+};
+
+YAMI_DEC(h264, AV_CODEC_ID_H264)
+YAMI_DEC(hevc, AV_CODEC_ID_HEVC)
+YAMI_DEC(vp8, AV_CODEC_ID_VP8)
+YAMI_DEC(mpeg2, AV_CODEC_ID_MPEG2VIDEO)
+YAMI_DEC(vc1, AV_CODEC_ID_VC1)
+YAMI_DEC(vp9, AV_CODEC_ID_VP9)
diff --git a/libavcodec/libyami_dec.h b/libavcodec/libyami_dec.h
new file mode 100644
index 0000000..67161e8
--- /dev/null
+++ b/libavcodec/libyami_dec.h
@@ -0,0 +1,56 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef LIBAVCODEC_LIBYAMI_DEC_H_
+#define LIBAVCODEC_LIBYAMI_DEC_H_
+
+typedef enum {
+    DECODE_THREAD_NOT_INIT = 0,
+    DECODE_THREAD_RUNING,
+    DECODE_THREAD_GOT_EOS,
+    DECODE_THREAD_EXIT,
+} DecodeThreadStatus;
+
+struct YamiDecContext {
+    AVCodecContext *avctx;
+    pthread_mutex_t ctx_mutex; /* mutex for YamiContext */
+
+    YamiMediaCodec::IVideoDecoder *decoder;
+    const VideoFormatInfo *format_info;
+    pthread_t decode_thread_id;
+    std::deque<VideoDecodeBuffer *> *in_queue;
+    pthread_mutex_t in_mutex; /* mutex for in queue */
+    pthread_cond_t in_cond;   /* decode thread condition wait */
+    DecodeThreadStatus decode_status;
+
+    SurfaceAllocator *p_alloc;
+    /* fallback value used when the input packet carries no pts */
+    int duration;
+    /* debug use */
+    int decode_count;
+    int decode_count_yami;
+    int render_count;
+};
+
+#endif /* LIBAVCODEC_LIBYAMI_DEC_H_ */
diff --git a/libavcodec/libyami_enc.cpp b/libavcodec/libyami_enc.cpp
new file mode 100644
index 0000000..fd83126
--- /dev/null
+++ b/libavcodec/libyami_enc.cpp
@@ -0,0 +1,551 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <pthread.h>
+#include <unistd.h>
+#include <deque>
+
+extern "C" {
+#include "avcodec.h"
+#include "libavutil/avassert.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/opt.h"
+#include "libavutil/time.h"
+#include "libavutil/internal.h"
+#include "internal.h"
+}
+
+#include "VideoEncoderHost.h"
+
+#include "libyami_enc.h"
+#include "libyami.h"
+using namespace YamiMediaCodec;
+
+static int ff_yami_encode_thread_init(YamiEncContext *s)
+{
+    int ret = 0;
+    if (!s)
+        return -1;
+    if ((ret = pthread_mutex_init(&s->ctx_mutex, NULL)) < 0)
+        return ret;
+    if ((ret = pthread_mutex_init(&s->in_mutex, NULL)) < 0)
+        return ret;
+    if ((ret = pthread_mutex_init(&s->out_mutex, NULL)) < 0)
+        return ret;
+    if ((ret = pthread_cond_init(&s->in_cond, NULL)) < 0)
+        return ret;
+    s->encode_status = ENCODE_THREAD_NOT_INIT;
+    return 0;
+}
+
+static int ff_yami_encode_thread_close(YamiEncContext *s)
+{
+    if (!s)
+        return -1;
+    pthread_mutex_lock(&s->ctx_mutex);
+    while (s->encode_status == ENCODE_THREAD_RUNING) {
+        s->encode_status = ENCODE_THREAD_GOT_EOS;
+        pthread_mutex_unlock(&s->ctx_mutex);
+        pthread_cond_signal(&s->in_cond);
+        av_usleep(10000);
+        pthread_mutex_lock(&s->ctx_mutex);
+    }
+    pthread_mutex_unlock(&s->ctx_mutex);
+    pthread_mutex_destroy(&s->ctx_mutex);
+    pthread_mutex_destroy(&s->in_mutex);
+    pthread_mutex_destroy(&s->out_mutex);
+    pthread_cond_destroy(&s->in_cond);
+    return 0;
+}
+
+static int ff_convert_to_yami(AVCodecContext *avctx, AVFrame *from, YamiImage *to)
+{
+    int pix_fmt = VA_FOURCC_NV12;
+    if (avctx->pix_fmt == AV_PIX_FMT_YUV420P) {
+        pix_fmt =  VA_FOURCC_I420;
+    } else if (avctx->pix_fmt == AV_PIX_FMT_NV12) {
+        pix_fmt =  VA_FOURCC_NV12;
+    } else {
+        av_log(avctx, AV_LOG_VERBOSE, "unsupported pixel format, falling back to NV12\n");
+    }
+    to->output_frame = ff_vaapi_create_surface(VA_RT_FORMAT_YUV420, pix_fmt, avctx->width, avctx->height);
+    ff_vaapi_load_image(to->output_frame, from);
+    if (from->key_frame)
+        to->output_frame->flags |= VIDEO_FRAME_FLAGS_KEY;
+    to->va_display = ff_vaapi_create_display();
+    from->data[3] = reinterpret_cast<uint8_t *>(to);
+    return 0;
+}
+
+static void *ff_yami_encode_thread(void *arg)
+{
+    AVCodecContext *avctx = (AVCodecContext *)arg;
+    YamiEncContext *s = (YamiEncContext *)avctx->priv_data;
+    while (1) {
+        AVFrame *frame;
+        /* deque one input buffer */
+        av_log(avctx, AV_LOG_VERBOSE, "encode thread: starting one cycle ...\n");
+        pthread_mutex_lock(&s->in_mutex);
+        if (s->in_queue->empty()) {
+            if (s->encode_status == ENCODE_THREAD_GOT_EOS) {
+                pthread_mutex_unlock(&s->in_mutex);
+                break;
+            }
+
+            av_log(avctx, AV_LOG_VERBOSE, "encode thread wait because in queue is empty\n");
+            pthread_cond_wait(&s->in_cond, &s->in_mutex);
+            pthread_mutex_unlock(&s->in_mutex);
+            continue;
+        }
+
+        av_log(avctx, AV_LOG_VERBOSE, "encode in queue size %ld\n", s->in_queue->size());
+        frame = s->in_queue->front();
+        pthread_mutex_unlock(&s->in_mutex);
+        /* encode one input buffer */
+        Encode_Status status;
+        YamiImage *yami_image = NULL;
+        if (frame->format != AV_PIX_FMT_YAMI) { /* non zero-copy mode */
+            yami_image = (YamiImage *)av_mallocz(sizeof(YamiImage));
+            if (ff_convert_to_yami(avctx, frame, yami_image) < 0)
+                av_log(avctx, AV_LOG_ERROR,
+                       "ff_convert_to_yami failed to convert the frame\n");
+        } else { /* zero-copy mode */
+            yami_image = (YamiImage *)frame->data[3];
+            /* encode use the AVFrame pts */
+            yami_image->output_frame->timeStamp = frame->pts;
+        }
+
+        /* handle encoder busy case */
+        do {
+            status = s->encoder->encode(yami_image->output_frame);
+        } while (status == ENCODE_IS_BUSY);
+        av_log(avctx, AV_LOG_VERBOSE, "encode status %d, encode count %d\n",
+               status, s->encode_count_yami);
+        if (status < 0) {
+            av_log(avctx, AV_LOG_ERROR,
+                   "encode error %d frame %d\n", status, s->encode_count_yami);
+        }
+        s->encode_count_yami++;
+        pthread_mutex_lock(&s->out_mutex);
+        s->out_queue->push_back(frame);
+        pthread_mutex_unlock(&s->out_mutex);
+        s->in_queue->pop_front();
+    }
+    av_log(avctx, AV_LOG_VERBOSE, "encode thread exit\n");
+    pthread_mutex_lock(&s->ctx_mutex);
+    s->encode_status = ENCODE_THREAD_EXIT;
+    pthread_mutex_unlock(&s->ctx_mutex);
+
+    return NULL;
+}
+
+static bool
+ff_out_buffer_create(VideoEncOutputBuffer *enc_out_buf, int max_out_size)
+{
+    enc_out_buf->data = static_cast<uint8_t *>(malloc(max_out_size));
+    if (!enc_out_buf->data)
+        return false;
+    enc_out_buf->bufferSize = max_out_size;
+    enc_out_buf->format = OUTPUT_EVERYTHING;
+    return true;
+}
+
+static const char *get_mime(AVCodecID id)
+{
+    switch (id) {
+    case AV_CODEC_ID_H264:
+        return YAMI_MIME_H264;
+    case AV_CODEC_ID_VP8:
+        return YAMI_MIME_VP8;
+    default:
+        av_assert0(!"Invalid codec ID!");
+        return 0;
+    }
+}
+
+static void ff_out_buffer_destroy(VideoEncOutputBuffer *enc_out_buf)
+{
+    if (enc_out_buf->data)
+        free(enc_out_buf->data);
+}
+
+static av_cold int yami_enc_init(AVCodecContext *avctx)
+{
+    YamiEncContext *s = (YamiEncContext *) avctx->priv_data;
+    Encode_Status status;
+    enum AVPixelFormat pix_fmts[4] =
+        {
+            AV_PIX_FMT_NV12,
+            AV_PIX_FMT_YUV420P,
+            AV_PIX_FMT_YAMI,
+            AV_PIX_FMT_NONE
+        };
+    if (avctx->pix_fmt == AV_PIX_FMT_NONE) {
+        int ret = ff_get_format(avctx, pix_fmts);
+        if (ret < 0)
+            return ret;
+        avctx->pix_fmt      = (AVPixelFormat)ret;
+    }
+
+    if (avctx->codec_id == AV_CODEC_ID_H264 &&
+        (avctx->width % 2 != 0 || avctx->height % 2 != 0)) {
+        av_log(avctx, AV_LOG_ERROR,
+               "width or height not divisible by 2 (%dx%d).\n",
+               avctx->width, avctx->height);
+        return AVERROR(EINVAL);
+    }
+    av_log(avctx, AV_LOG_VERBOSE, "yami_enc_init\n");
+    const char *mime_type = get_mime(avctx->codec_id);
+    s->encoder = createVideoEncoder(mime_type);
+    if (!s->encoder) {
+        av_log(avctx, AV_LOG_ERROR, "failed to create the libyami encoder\n");
+        return AVERROR_BUG;
+    }
+    NativeDisplay native_display;
+    native_display.type = NATIVE_DISPLAY_VA;
+    VADisplay va_display = ff_vaapi_create_display();
+    native_display.handle = (intptr_t)va_display;
+    s->encoder->setNativeDisplay(&native_display);
+
+    /* configure encoding parameters */
+    VideoParamsCommon encVideoParams;
+    encVideoParams.size = sizeof(VideoParamsCommon);
+    s->encoder->getParameters(VideoParamsTypeCommon, &encVideoParams);
+    encVideoParams.resolution.width  = avctx->width;
+    encVideoParams.resolution.height = avctx->height;
+    /* frame rate setting */
+    if (avctx->framerate.den > 0 && avctx->framerate.num > 0) {
+        encVideoParams.frameRate.frameRateDenom = avctx->framerate.den;
+        encVideoParams.frameRate.frameRateNum = avctx->framerate.num;
+    } else {
+        encVideoParams.frameRate.frameRateNum = avctx->time_base.den;
+        encVideoParams.frameRate.frameRateDenom = avctx->time_base.num;
+    }
+    /* picture type and bitrate setting */
+    encVideoParams.intraPeriod = av_clip(avctx->gop_size, 1, 250);
+    s->ip_period = encVideoParams.ipPeriod = avctx->max_b_frames < 2 ? 1 : 3;
+    s->max_inqueue_size = FFMAX(encVideoParams.ipPeriod, ENCODE_QUEUE_SIZE);
+
+    /* ratecontrol method selection:
+     * When 'global_quality' is specified, a quality-based mode is used.
+     * Specifically this means
+     *     - CQP - constant quantizer scale, when the 'qscale' codec
+     *       flag is also set (the '-qscale' ffmpeg option).
+     * Otherwise, a bitrate-based mode is used; for all of those, you
+     * should specify at least the desired average bitrate with the 'b' option.
+     *     - CBR - constant bitrate, when 'maxrate' is specified and
+     *       equal to the average bitrate.
+     *     - VBR - variable bitrate, when 'maxrate' is specified, but
+     *       is higher than the average bitrate.
+     */
+    const char *rc_desc;
+    float quant;
+    int want_qscale = !!(avctx->flags & AV_CODEC_FLAG_QSCALE);
+
+    if (want_qscale) {
+        encVideoParams.rcMode = RATE_CONTROL_CQP;
+        quant = avctx->global_quality / FF_QP2LAMBDA;
+        encVideoParams.rcParams.initQP = av_clip(quant, 1, 52);
+
+        rc_desc = "constant quantization parameter (CQP)";
+    } else if (avctx->rc_max_rate > avctx->bit_rate) {
+        encVideoParams.rcMode = RATE_CONTROL_VBR;
+        encVideoParams.rcParams.bitRate = avctx->rc_max_rate;
+
+        encVideoParams.rcParams.targetPercentage = (100 * avctx->bit_rate)/avctx->rc_max_rate;
+        rc_desc = "variable bitrate (VBR)";
+
+        av_log(avctx, AV_LOG_WARNING,
+               "Using the %s ratecontrol method, but the driver may not support it.\n", rc_desc);
+    } else if (avctx->rc_max_rate == avctx->bit_rate) {
+        encVideoParams.rcMode = RATE_CONTROL_CBR;
+        encVideoParams.rcParams.bitRate = avctx->bit_rate;
+        encVideoParams.rcParams.targetPercentage = 100;
+
+        rc_desc = "constant bitrate (CBR)";
+    } else {
+        encVideoParams.rcMode = RATE_CONTROL_CQP;
+        encVideoParams.rcParams.initQP = 26;
+
+        rc_desc = "constant quantization parameter (CQP) as default";
+    }
+
+    av_log(avctx, AV_LOG_VERBOSE, "Using the %s ratecontrol method\n", rc_desc);
+
+    if (s->level) {
+        encVideoParams.level = atoi(s->level);
+    } else {
+        encVideoParams.level = 40;
+    }
+
+    if (avctx->codec_id == AV_CODEC_ID_H264) {
+        encVideoParams.profile = VAProfileH264Main;
+        if (s->profile) {
+            if (!strcmp(s->profile, "high")) {
+                encVideoParams.profile = VAProfileH264High;
+            } else if (!strcmp(s->profile, "main")) {
+                encVideoParams.profile = VAProfileH264Main;
+            } else if (!strcmp(s->profile, "baseline")) {
+                encVideoParams.profile = VAProfileH264Baseline;
+            }
+        } else {
+            av_log(avctx, AV_LOG_WARNING, "Using the main profile as default.\n");
+        }
+    }
+    encVideoParams.size = sizeof(VideoParamsCommon);
+    s->encoder->setParameters(VideoParamsTypeCommon, &encVideoParams);
+
+    if (avctx->codec_id == AV_CODEC_ID_H264) {
+        VideoConfigAVCStreamFormat streamFormat;
+        streamFormat.size = sizeof(VideoConfigAVCStreamFormat);
+        streamFormat.streamFormat = AVC_STREAM_FORMAT_ANNEXB;
+        s->encoder->setParameters(VideoConfigTypeAVCStreamFormat, &streamFormat);
+    }
+
+#if HAVE_PTHREADS
+    if (ff_yami_encode_thread_init(s) < 0)
+        return AVERROR(ENOMEM);
+#else
+    av_log(avctx, AV_LOG_ERROR, "the pthread library is required\n");
+    return AVERROR(ENOSYS);
+#endif
+    status = s->encoder->start();
+    if (status != ENCODE_SUCCESS) {
+        av_log(avctx, AV_LOG_ERROR, "the libyami encoder failed to start\n");
+        return AVERROR_BUG;
+    }
+    /* init encoder output buffer */
+    s->encoder->getMaxOutSize(&(s->max_out_size));
+
+    if (!ff_out_buffer_create(&s->enc_out_buf, s->max_out_size)) {
+        av_log(avctx, AV_LOG_ERROR, "failed to create the output buffer\n");
+        return AVERROR(ENOMEM);
+    }
+    s->enc_frame_size = FFALIGN(avctx->width, 32) * FFALIGN(avctx->height, 32) * 3;
+    s->enc_frame_buf = static_cast<uint8_t *>(av_mallocz(s->enc_frame_size));
+    s->in_queue = new std::deque<AVFrame *>;
+    s->out_queue = new std::deque<AVFrame *>;
+
+    s->encode_count = 0;
+    s->encode_count_yami = 0;
+    s->render_count = 0;
+    av_log(avctx, AV_LOG_DEBUG, "yami_enc_init\n");
+    return 0;
+}
+
+static int yami_enc_frame(AVCodecContext *avctx, AVPacket *pkt,
+                          const AVFrame *frame, int *got_packet)
+{
+    YamiEncContext *s = (YamiEncContext *)avctx->priv_data;
+    Encode_Status status;
+    int ret;
+    if (!s->encoder)
+        return AVERROR(EINVAL);
+    if (frame) {
+        AVFrame *qframe = av_frame_alloc();
+        if (!qframe) {
+            return AVERROR(ENOMEM);
+        }
+        /* take a reference on the source frame; it is freed after
+         * encoding, when it is popped from the output queue */
+        ret = av_frame_ref(qframe, frame);
+        if (ret < 0) {
+            av_frame_free(&qframe);
+            return ret;
+        }
+        while (s->encode_status < ENCODE_THREAD_GOT_EOS) {
+            pthread_mutex_lock(&s->in_mutex);
+            if (s->in_queue->size() < 2/*s->max_inqueue_size*/) {
+                /* XXX : libyami decode dpb will use 16 surfaces */
+                s->in_queue->push_back(qframe);
+                av_log(avctx, AV_LOG_VERBOSE, "wakeup encode thread ...\n");
+                pthread_cond_signal(&s->in_cond);
+                pthread_mutex_unlock(&s->in_mutex);
+                break;
+            }
+            pthread_mutex_unlock(&s->in_mutex);
+            av_log(avctx, AV_LOG_DEBUG,
+                   "in queue size %ld, encode count %d, encoded count %d; too many buffers are being encoded, waiting ...\n",
+                   s->in_queue->size(), s->encode_count, s->encode_count_yami);
+            av_usleep(1000);
+        }
+        s->encode_count++;
+    }
+
+    /* encode thread status update */
+    pthread_mutex_lock(&s->ctx_mutex);
+    switch (s->encode_status) {
+    case ENCODE_THREAD_NOT_INIT:
+    case ENCODE_THREAD_EXIT:
+        if (frame) {
+            s->encode_status = ENCODE_THREAD_RUNING;
+            pthread_create(&s->encode_thread_id, NULL, &ff_yami_encode_thread, avctx);
+        }
+        break;
+    case ENCODE_THREAD_RUNING:
+        if (!frame) {
+            s->encode_status = ENCODE_THREAD_GOT_EOS;
+        }
+        break;
+    case ENCODE_THREAD_GOT_EOS:
+        if (s->in_queue->empty())
+            s->encode_status = ENCODE_THREAD_NOT_INIT;
+        break;
+    default:
+        break;
+    }
+
+    pthread_mutex_unlock(&s->ctx_mutex);
+    do {
+        status = s->encoder->getOutput(&s->enc_out_buf, true);
+    } while (!frame && status != ENCODE_SUCCESS && s->in_queue->size() > 0);
+    if (status != ENCODE_SUCCESS)
+        return 0;
+    if ((ret = ff_alloc_packet2(avctx, pkt, s->enc_out_buf.dataSize, 0)) < 0)
+        return ret;
+
+    pthread_mutex_lock(&s->out_mutex);
+    if (!s->out_queue->empty()) {
+        AVFrame *qframe = s->out_queue->front();
+        if (qframe) {
+            pkt->pts = s->enc_out_buf.timeStamp;
+            /* XXX: DTS must be smaller than PTS, used ip_period as offset */
+            pkt->dts = qframe->pts - s->ip_period;
+            if (qframe->format != AV_PIX_FMT_YAMI) {
+                YamiImage *yami_image = (YamiImage *)qframe->data[3];
+                ff_vaapi_destory_surface(yami_image->output_frame);
+                yami_image->output_frame.reset();
+                av_free(yami_image);
+            }
+            av_frame_free(&qframe);
+        }
+        s->out_queue->pop_front();
+    }
+    pthread_mutex_unlock(&s->out_mutex);
+
+    s->render_count++;
+    /* get extradata when build the first frame */
+    int offset = 0;
+    if (avctx->codec_id == AV_CODEC_ID_H264) {
+        if (avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER && !avctx->extradata) {
+            /* find the start code of the first IDR NAL (type 5);
+             * everything before it (SPS/PPS) becomes extradata */
+            uint8_t *ptr = s->enc_out_buf.data;
+            for (uint32_t i = 0; i + 4 < s->enc_out_buf.dataSize; i++) {
+                if (*(ptr + i) == 0x0 && *(ptr + i + 1) == 0x0
+                    && *(ptr + i + 2) == 0x0 && *(ptr + i + 3) == 0x1
+                    && (*(ptr + i + 4) & 0x1f) == 5) {
+                    offset = i;
+                    break;
+                }
+            }
+            avctx->extradata = (uint8_t *) av_mallocz(
+                offset + AV_INPUT_BUFFER_PADDING_SIZE);
+            if (!avctx->extradata)
+                return AVERROR(ENOMEM);
+            memcpy(avctx->extradata, s->enc_out_buf.data, offset);
+            avctx->extradata_size = offset;
+        }
+    }
+    void *p = pkt->data;
+    memcpy(p, s->enc_out_buf.data + offset,
+           s->enc_out_buf.dataSize - offset);
+    pkt->size = s->enc_out_buf.dataSize - offset;
+
+    if (s->enc_out_buf.flag & ENCODE_BUFFERFLAG_SYNCFRAME)
+        pkt->flags |= AV_PKT_FLAG_KEY;
+    *got_packet = 1;
+
+    return 0;
+}
+
+static av_cold int yami_enc_close(AVCodecContext *avctx)
+{
+    YamiEncContext *s = (YamiEncContext *)avctx->priv_data;
+    ff_out_buffer_destroy(&s->enc_out_buf);
+    ff_yami_encode_thread_close(s);
+    if (s->encoder) {
+        s->encoder->stop();
+        releaseVideoEncoder(s->encoder);
+        s->encoder = NULL;
+    }
+    while (!s->in_queue->empty()) {
+        AVFrame *in_buffer = s->in_queue->front();
+        s->in_queue->pop_front();
+        av_frame_free(&in_buffer);
+    }
+    while (!s->out_queue->empty()) {
+        AVFrame *out_buffer = s->out_queue->front();
+        s->out_queue->pop_front();
+        av_frame_free(&out_buffer);
+    }
+    delete s->in_queue;
+    delete s->out_queue;
+    av_free(s->enc_frame_buf);
+    s->enc_frame_size = 0;
+    av_log(avctx, AV_LOG_DEBUG, "yami_enc_close\n");
+    return 0;
+}
+
+#define OFFSET(x) offsetof(YamiEncContext, x)
+#define VE AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM
+static const AVOption options[] = {
+    { "profile",       "Set profile restrictions", OFFSET(profile),       AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, VE},
+    { "level",         "Specify level (as defined by Annex A)", OFFSET(level), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, VE},
+    { NULL },
+};
+
+#define YAMI_ENC(NAME, ID) \
+static const AVClass yami_enc_##NAME##_class = { \
+    .class_name = "libyami_" #NAME, \
+    .item_name  = av_default_item_name, \
+    .option     = options, \
+    .version    = LIBAVUTIL_VERSION_INT, \
+}; \
+AVCodec ff_libyami_##NAME##_encoder = { \
+    /* name */                  "libyami_" #NAME, \
+    /* long_name */             NULL_IF_CONFIG_SMALL(#NAME " (libyami)"), \
+    /* type */                  AVMEDIA_TYPE_VIDEO, \
+    /* id */                    ID, \
+    /* capabilities */          AV_CODEC_CAP_DELAY, \
+    /* supported_framerates */  NULL, \
+    /* pix_fmts */              (const enum AVPixelFormat[]) { AV_PIX_FMT_YAMI, \
+                                                               AV_PIX_FMT_NV12, \
+                                                               AV_PIX_FMT_YUV420P, \
+                                                               AV_PIX_FMT_NONE}, \
+    /* supported_samplerates */ NULL, \
+    /* sample_fmts */           NULL, \
+    /* channel_layouts */       NULL, \
+    /* max_lowres */            0, \
+    /* priv_class */            &yami_enc_##NAME##_class, \
+    /* profiles */              NULL, \
+    /* priv_data_size */        sizeof(YamiEncContext), \
+    /* next */                  NULL, \
+    /* init_thread_copy */      NULL, \
+    /* update_thread_context */ NULL, \
+    /* defaults */              NULL, \
+    /* init_static_data */      NULL, \
+    /* init */                  yami_enc_init, \
+    /* encode_sub */            NULL, \
+    /* encode2 */               yami_enc_frame, \
+    /* decode */                NULL, \
+    /* close */                 yami_enc_close, \
+};
+
+YAMI_ENC(h264, AV_CODEC_ID_H264)
+YAMI_ENC(vp8, AV_CODEC_ID_VP8)
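A note on the extradata extraction in yami_enc_frame(): with AV_CODEC_FLAG_GLOBAL_HEADER set, the Annex-B headers (SPS/PPS) emitted ahead of the first IDR slice are split off into avctx->extradata by scanning for the 4-byte start code whose following NAL unit type is 5. A self-contained sketch of that scan (`find_idr_offset()` is an invented name, and the `i + 4 < size` bound is tightened here relative to the patch's loop):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Return the byte offset of the 4-byte start code that precedes the
 * first IDR slice (NAL unit type 5), or -1 if none is found. */
static ptrdiff_t find_idr_offset(const uint8_t *buf, size_t size)
{
    for (size_t i = 0; i + 4 < size; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 &&
            buf[i + 2] == 0 && buf[i + 3] == 1 &&
            (buf[i + 4] & 0x1f) == 5)   /* low 5 bits = NAL unit type */
            return (ptrdiff_t)i;
    }
    return -1;
}
```

Note that this only recognizes 4-byte start codes; 3-byte start codes (00 00 01), which Annex B also allows, would be missed, and the same limitation exists in the patch.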
diff --git a/libavcodec/libyami_enc.h b/libavcodec/libyami_enc.h
new file mode 100644
index 0000000..edde635
--- /dev/null
+++ b/libavcodec/libyami_enc.h
@@ -0,0 +1,70 @@ 
+/*
+ * Intel Yet Another Media Infrastructure video decoder/encoder
+ *
+ * Copyright (c) 2016 Intel Corporation
+ *     Zhou Yun(yunx.z.zhou@intel.com)
+ *     Jun Zhao(jun.zhao@intel.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#ifndef LIBAVCODEC_LIBYAMI_ENC_H_
+#define LIBAVCODEC_LIBYAMI_ENC_H_
+
+typedef enum {
+    ENCODE_THREAD_NOT_INIT = 0,
+    ENCODE_THREAD_RUNING,
+    ENCODE_THREAD_GOT_EOS,
+    ENCODE_THREAD_EXIT,
+} EncodeThreadStatus;
+
+struct YamiEncContext {
+    AVCodecContext *avctx;
+
+    pthread_mutex_t ctx_mutex; // mutex for encoder->getOutput() and YamiEncContext itself update (encode_status, etc)
+    YamiMediaCodec::IVideoEncoder *encoder;
+    VideoEncOutputBuffer enc_out_buf;
+
+    pthread_t encode_thread_id;
+    uint32_t max_inqueue_size;
+    std::deque<AVFrame *> *in_queue;
+    std::deque<AVFrame *> *out_queue;
+    pthread_mutex_t in_mutex;  // mutex for in_queue
+    pthread_mutex_t out_mutex; // mutex for out_queue
+    pthread_cond_t in_cond;    // encode thread condition wait
+    EncodeThreadStatus encode_status;
+
+    uint8_t *enc_frame_buf;
+    uint32_t enc_frame_size;
+    /* common video parameters */
+    uint32_t cqp;           // QP value, 0-52
+    uint32_t frame_rate;    // frame rate used to derive timestamps
+    char *rcmod;            // rate-control mode: CQP|CBR|VBR
+    uint32_t gop;           // group-of-pictures size, 1-250
+    uint32_t ip_period;     // B-frame pattern: 0 - I only, 1 - IP, 3 - IPBB
+    char *level;            // level: 40|41|50|51
+    char *profile;          // profile: main|baseline|high
+
+    uint32_t max_out_size;
+
+    // debug use
+    int encode_count;
+    int encode_count_yami;
+    int render_count;
+};
+
+#endif /* LIBAVCODEC_LIBYAMI_ENC_H_ */
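The rate-control selection in yami_enc_init() reduces to a small decision table: the qscale flag selects CQP, a maxrate above the average bitrate selects VBR, a maxrate equal to it selects CBR, and anything else falls back to CQP with a default QP. A pure-function sketch of that table (`pick_rc_mode()` is an invented name, and the `bit_rate > 0` guard on the CBR branch is an addition not present in the patch):

```c
#include <assert.h>
#include <string.h>
#include <stdint.h>

/* Mirrors the rcMode selection in yami_enc_init(); returns a label
 * rather than a RATE_CONTROL_* enum so it stays self-contained. */
static const char *pick_rc_mode(int want_qscale, int64_t bit_rate,
                                int64_t rc_max_rate)
{
    if (want_qscale)                              /* AV_CODEC_FLAG_QSCALE set */
        return "CQP";
    if (rc_max_rate > bit_rate)                   /* maxrate above average */
        return "VBR";
    if (bit_rate > 0 && rc_max_rate == bit_rate)  /* maxrate equals average */
        return "CBR";
    return "CQP-default";                         /* initQP = 26 in the patch */
}
```

This also makes the VBR corner case easy to see: `targetPercentage` in the patch is the average bitrate expressed as a percentage of maxrate, so VBR is only selected when maxrate is strictly greater than the average.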
diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
index a147a2d..5d874fb 100644
--- a/libavutil/pixdesc.c
+++ b/libavutil/pixdesc.c
@@ -1974,6 +1974,10 @@  static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
         .name = "qsv",
         .flags = AV_PIX_FMT_FLAG_HWACCEL,
     },
+    [AV_PIX_FMT_YAMI] = {
+        .name = "yami",
+        .flags = AV_PIX_FMT_FLAG_HWACCEL,
+    },
     [AV_PIX_FMT_MEDIACODEC] = {
         .name = "mediacodec",
         .flags = AV_PIX_FMT_FLAG_HWACCEL,
diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
index 6f71ac0..b95f907 100644
--- a/libavutil/pixfmt.h
+++ b/libavutil/pixfmt.h
@@ -293,6 +293,11 @@  enum AVPixelFormat {
     AV_PIX_FMT_AYUV64BE,    ///< packed AYUV 4:4:4,64bpp (1 Cr & Cb sample per 1x1 Y & A samples), big-endian
 
     AV_PIX_FMT_VIDEOTOOLBOX, ///< hardware decoding through Videotoolbox
+    /**
+     *  HW acceleration through libyami; data[3] contains a pointer to the
+     *  YamiImage structure.
+     */
+    AV_PIX_FMT_YAMI,
 
     AV_PIX_FMT_P010LE, ///< like NV12, with 10bpp per component, data in the high bits, zeros in the low bits, little-endian
     AV_PIX_FMT_P010BE, ///< like NV12, with 10bpp per component, data in the high bits, zeros in the low bits, big-endian