Message ID: 20170503032604.1711-1-nfxjfg@googlemail.com
State: Accepted
On Wed, 3 May 2017 05:26:04 +0200 wm4 <nfxjfg@googlemail.com> wrote:

> This adds tons of code for no other benefit than making VideoToolbox
> support conform with the new hwaccel API (using hw_device_ctx and
> hw_frames_ctx).
>
> Since VideoToolbox decoding does not actually require the user to
> allocate frames, the new code does mostly nothing.
>
> One benefit is that ffmpeg_videotoolbox.c can be dropped once generic
> hwaccel support for ffmpeg.c is merged from Libav.
>
> Does not consider VDA or VideoToolbox encoding.
>
> Fun fact: the frame transfer functions are copied from vaapi, as the
> mapping makes copying generic boilerplate. Mapping itself is not
> exported by the VT code, because I don't know how to test.
>
> TODO: API bumps
> ---

If nobody wants to review this, I'll push it on Monday or so.
Are you also planning to change ffmpeg_videotoolbox.c? See below for
more comments.

Aaron Levinson

On 5/2/2017 8:26 PM, wm4 wrote:
> This adds tons of code for no other benefit than making VideoToolbox
> support conform with the new hwaccel API (using hw_device_ctx and
> hw_frames_ctx).
>
> Since VideoToolbox decoding does not actually require the user to
> allocate frames, the new code does mostly nothing.
>
> One benefit is that ffmpeg_videotoolbox.c can be dropped once generic
> hwaccel support for ffmpeg.c is merged from Libav.
>
> Does not consider VDA or VideoToolbox encoding.
>
> Fun fact: the frame transfer functions are copied from vaapi, as the
> mapping makes copying generic boilerplate. Mapping itself is not
> exported by the VT code, because I don't know how to test.
>
> TODO: API bumps
> ---
>  doc/APIchanges                     |   8 ++
>  libavcodec/vda_vt_internal.h       |   7 ++
>  libavcodec/videotoolbox.c          | 186 ++++++++++++++++++++++++++--
>  libavutil/Makefile                 |   3 +
>  libavutil/hwcontext.c              |   3 +
>  libavutil/hwcontext.h              |   1 +
>  libavutil/hwcontext_internal.h     |   1 +
>  libavutil/hwcontext_videotoolbox.c | 243 +++++++++++++++++++++++++++++++++++++
>  libavutil/hwcontext_videotoolbox.h |  54 +++++++++
>  9 files changed, 496 insertions(+), 10 deletions(-)
>  create mode 100644 libavutil/hwcontext_videotoolbox.c
>  create mode 100644 libavutil/hwcontext_videotoolbox.h
>
> diff --git a/doc/APIchanges b/doc/APIchanges
> index fcd3423d58..71f5563f03 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -15,6 +15,14 @@ libavutil: 2015-08-28

Note that the APIchanges part prevents the entire patch from applying,
but that's to be expected.

>
>  API changes, most recent first:
>
> +2017-05-03 - xxxxxxxxxx - lavc 57.xx.100 - avcodec.h
> +  VideoToolbox hardware accelerated decoding now supports the new hwaccel API,
> +  which can create the decoder context and allocate hardware frame automatically.
> +  See AVCodecContext.hw_device_ctx and AVCodecContext.hw_frames_ctx.
I'd change the first sentence as follows: "VideoToolbox hardware-accelerated
decoding now supports the new hwaccel API, which can create the decoder
context and allocate hardware frames automatically."  Changes are
"hardware accelerated" -> "hardware-accelerated" and "hardware frame
automatically" -> "hardware frames automatically".

> +
> +2017-05-03 - xxxxxxxxxx - lavu 57.xx.100 - hwcontext.h
> +  Add AV_HWDEVICE_TYPE_VIDEOTOOLBOX and implementation.
> +
>  2017-04-11 - 8378466507 - lavu 55.61.100 - avstring.h
>    Add av_strireplace().
>
> diff --git a/libavcodec/vda_vt_internal.h b/libavcodec/vda_vt_internal.h
> index 9ff63ccc52..e55a813899 100644
> --- a/libavcodec/vda_vt_internal.h
> +++ b/libavcodec/vda_vt_internal.h
> @@ -40,6 +40,13 @@ typedef struct VTContext {
>
>      // The core video buffer
>      CVImageBufferRef frame;
> +
> +    // Current dummy frames context (depends on exact CVImageBufferRef params).
> +    struct AVBufferRef *cached_hw_frames_ctx;
> +
> +    // Non-NULL if the new hwaccel API is used. This is only a separate struct
> +    // to ease compatibility with the old API.
> +    struct AVVideotoolboxContext *vt_ctx;
>  } VTContext;
>
>  int ff_videotoolbox_alloc_frame(AVCodecContext *avctx, AVFrame *frame);
> diff --git a/libavcodec/videotoolbox.c b/libavcodec/videotoolbox.c
> index 67adad53ed..910ac25ea7 100644
> --- a/libavcodec/videotoolbox.c
> +++ b/libavcodec/videotoolbox.c
> @@ -23,11 +23,13 @@
>  #include "config.h"
>  #if CONFIG_VIDEOTOOLBOX
>  #   include "videotoolbox.h"
> +#   include "libavutil/hwcontext_videotoolbox.h"
>  #else
>  #   include "vda.h"
>  #endif
>  #include "vda_vt_internal.h"
>  #include "libavutil/avutil.h"
> +#include "libavutil/hwcontext.h"
>  #include "bytestream.h"
>  #include "h264dec.h"
>  #include "mpegvideo.h"
> @@ -188,6 +190,79 @@ int ff_videotoolbox_uninit(AVCodecContext *avctx)
>  }
>
>  #if CONFIG_VIDEOTOOLBOX
> +// Return the AVVideotoolboxContext that matters currently. Where it comes from
> +// depends on the API used.
> +static AVVideotoolboxContext *videotoolbox_get_context(AVCodecContext *avctx)
> +{
> +    // Somewhat tricky because the API user can call av_videotoolbox_default_free()
> +    // at any time.

Comment will make more sense if "API" is dropped from the sentence.

> +    if (avctx->internal && avctx->internal->hwaccel_priv_data) {
> +        VTContext *vtctx = avctx->internal->hwaccel_priv_data;
> +        if (vtctx->vt_ctx)
> +            return vtctx->vt_ctx;
> +    }

From your comment, and your answers to my questions on IRC, it is clear
that these various checks are only needed for the case that
av_videotoolbox_default_free() may be called after the codec is closed.
However, this situation isn't relevant for most of the functions in your
patch that call videotoolbox_get_context().  I suggest moving this check
into videotoolbox_default_free() (which would replace the call to
videotoolbox_get_context() there).  If that's done, then
videotoolbox_get_context() can be implemented as:

    VTContext *vtctx = avctx->internal->hwaccel_priv_data;
    if (vtctx->vt_ctx)
        return vtctx->vt_ctx;
    return avctx->hwaccel_context;

Also, I suggest improving the comment to make it clear why it is
necessary to check for internal in videotoolbox_default_free().

> +    return avctx->hwaccel_context;
> +}
> +
> +static int videotoolbox_buffer_create(AVCodecContext *avctx, AVFrame *frame)
> +{
> +    VTContext *vtctx = avctx->internal->hwaccel_priv_data;
> +    CVPixelBufferRef pixbuf = (CVPixelBufferRef)vtctx->frame;
> +    OSType pixel_format = CVPixelBufferGetPixelFormatType(pixbuf);
> +    enum AVPixelFormat sw_format = av_map_videotoolbox_format_to_pixfmt(pixel_format);
> +    int width = CVPixelBufferGetWidth(pixbuf);
> +    int height = CVPixelBufferGetHeight(pixbuf);
> +    AVHWFramesContext *cached_frames;
> +    int ret;
> +
> +    ret = ff_videotoolbox_buffer_create(vtctx, frame);
> +    if (ret < 0)
> +        return ret;
> +
> +    // Old API code path.
> +    if (!vtctx->cached_hw_frames_ctx)
> +        return 0;
> +
> +    // We can still return frames with unknown underlying format, except we need
> +    // "some" AVPixelFormat for it. Use AV_PIX_FMT_VIDEOTOOLBOX to signal an
> +    // opaque/unknown format, which is very sketchy, but you can't sue me.

Um, I would guess that this sort of comment doesn't really belong in the
ffmpeg code base :-) .

> +    if (sw_format == AV_PIX_FMT_NONE)
> +        sw_format = AV_PIX_FMT_VIDEOTOOLBOX;
> +
> +    cached_frames = (AVHWFramesContext*)vtctx->cached_hw_frames_ctx->data;
> +
> +    if (cached_frames->sw_format != sw_format ||
> +        cached_frames->width != width ||
> +        cached_frames->height != height) {
> +        AVBufferRef *hw_frames_ctx = av_hwframe_ctx_alloc(cached_frames->device_ref);
> +        AVHWFramesContext *hw_frames;
> +        if (!hw_frames_ctx)
> +            return AVERROR(ENOMEM);
> +
> +        hw_frames = (AVHWFramesContext*)hw_frames_ctx->data;
> +        hw_frames->format = cached_frames->format;
> +        hw_frames->sw_format = sw_format;
> +        hw_frames->width = width;
> +        hw_frames->height = height;
> +
> +        ret = av_hwframe_ctx_init(hw_frames_ctx);
> +        if (ret < 0) {
> +            av_buffer_unref(&hw_frames_ctx);
> +            return ret;
> +        }
> +
> +        av_buffer_unref(&vtctx->cached_hw_frames_ctx);
> +        vtctx->cached_hw_frames_ctx = hw_frames_ctx;
> +    }
> +
> +    av_assert0(!frame->hw_frames_ctx);
> +    frame->hw_frames_ctx = av_buffer_ref(vtctx->cached_hw_frames_ctx);
> +    if (!frame->hw_frames_ctx)
> +        return AVERROR(ENOMEM);
> +
> +    return 0;
> +}
> +
>  static void videotoolbox_write_mp4_descr_length(PutByteContext *pb, int length)
>  {
>      int i;
> @@ -323,7 +398,7 @@ static OSStatus videotoolbox_session_decode_frame(AVCodecContext *avctx)
>  {
>      OSStatus status;
>      CMSampleBufferRef sample_buf;
> -    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
> +    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
>      VTContext *vtctx = avctx->internal->hwaccel_priv_data;
>
>      sample_buf = videotoolbox_sample_buffer_create(videotoolbox->cm_fmt_desc,
> @@ -349,7 +424,7 @@ static OSStatus videotoolbox_session_decode_frame(AVCodecContext *avctx)
>  static int videotoolbox_common_end_frame(AVCodecContext *avctx, AVFrame *frame)
>  {
>      int status;
> -    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
> +    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
>      VTContext *vtctx = avctx->internal->hwaccel_priv_data;
>
>      if (!videotoolbox->session || !vtctx->bitstream)
> @@ -365,7 +440,7 @@ static int videotoolbox_common_end_frame(AVCodecContext *avctx, AVFrame *frame)
>      if (!vtctx->frame)
>          return AVERROR_UNKNOWN;
>
> -    return ff_videotoolbox_buffer_create(vtctx, frame);
> +    return videotoolbox_buffer_create(avctx, frame);
>  }
>
>  static int videotoolbox_h264_end_frame(AVCodecContext *avctx)
> @@ -513,7 +588,7 @@ static CMVideoFormatDescriptionRef videotoolbox_format_desc_create(CMVideoCodecT
>
>  static int videotoolbox_default_init(AVCodecContext *avctx)
>  {
> -    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
> +    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
>      OSStatus status;
>      VTDecompressionOutputCallbackRecord decoder_cb;
>      CFDictionaryRef decoder_spec;
> @@ -594,7 +669,7 @@ static int videotoolbox_default_init(AVCodecContext *avctx)
>
>  static void videotoolbox_default_free(AVCodecContext *avctx)
>  {
> -    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
> +    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
>
>      if (videotoolbox) {
>          if (videotoolbox->cm_fmt_desc)
> @@ -607,6 +682,92 @@ static void videotoolbox_default_free(AVCodecContext *avctx)
>      }
>  }
>
> +static int videotoolbox_uninit(AVCodecContext *avctx)
> +{
> +    VTContext *vtctx = avctx->internal->hwaccel_priv_data;
> +    if (!vtctx)
> +        return 0;
> +
> +    ff_videotoolbox_uninit(avctx);
> +
> +    if (vtctx->vt_ctx)
> +        videotoolbox_default_free(avctx);

Unclear why the call to videotoolbox_default_free() is dependent on the
existence of vt_ctx.  Why not eliminate this and just call
av_videotoolbox_default_free() at the end of the function?  That way, it
will work in the off chance that it gets to this code and
hwaccel_context is valid (in which case vt_ctx will be null).

> +
> +    av_buffer_unref(&vtctx->cached_hw_frames_ctx);
> +    av_freep(&vtctx->vt_ctx);

vt_ctx is allocated using av_videotoolbox_alloc_context().  While using
av_freep() is correct, since av_videotoolbox_alloc_context() uses
av_mallocz() to allocate the AVVideotoolboxContext object, I think it
would be preferable to have an av_videotoolbox_free_context() function,
which will continue to do the right thing if the implementation of
av_videotoolbox_alloc_context() ever changes (say, to allocate
additional memory in the AVVideotoolboxContext object).  This is
technically an issue with the already existing code though, and in
addition, doing this would constitute a change to the public APIs and
documentation, so not really relevant for this patch.  There is also
already precedent for this approach--for example,
avcodec_alloc_context3()/avcodec_free_context().

Also, should probably add a call to av_freep(&avctx->hwaccel_context)
here just in case there is a hwaccel_context, since it doesn't call
av_videotoolbox_default_free() in this case (unless you change to call
av_videotoolbox_default_free()).

> +
> +    return 0;
> +}
> +
> +static int videotoolbox_common_init(AVCodecContext *avctx)
> +{
> +    VTContext *vtctx = avctx->internal->hwaccel_priv_data;
> +    AVHWFramesContext *hw_frames;
> +    int err;
> +
> +    // Old API - do nothing.
> +    if (avctx->hwaccel_context)
> +        return 0;
> +
> +    if (!avctx->hw_frames_ctx && !avctx->hw_device_ctx) {
> +        av_log(avctx, AV_LOG_ERROR,
> +               "Either hw_frames_ctx or hw_device_ctx must be set.\n");
> +        return AVERROR(EINVAL);
> +    }
> +
> +    vtctx->vt_ctx = av_videotoolbox_alloc_context();
> +    if (!vtctx->vt_ctx) {
> +        err = AVERROR(ENOMEM);
> +        goto fail;
> +    }
> +
> +    if (avctx->hw_frames_ctx) {
> +        hw_frames = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
> +    } else {
> +        avctx->hw_frames_ctx = av_hwframe_ctx_alloc(avctx->hw_device_ctx);
> +        if (!avctx->hw_frames_ctx) {
> +            err = AVERROR(ENOMEM);
> +            goto fail;
> +        }
> +
> +        hw_frames = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
> +        hw_frames->format = AV_PIX_FMT_VIDEOTOOLBOX;
> +        hw_frames->sw_format = AV_PIX_FMT_NV12; // same as av_videotoolbox_alloc_context()
> +        hw_frames->width = avctx->width;
> +        hw_frames->height = avctx->height;
> +
> +        err = av_hwframe_ctx_init(avctx->hw_frames_ctx);
> +        if (err < 0) {
> +            av_buffer_unref(&avctx->hw_frames_ctx);
> +            goto fail;
> +        }
> +    }
> +
> +    vtctx->cached_hw_frames_ctx = av_buffer_ref(avctx->hw_frames_ctx);
> +    if (!vtctx->cached_hw_frames_ctx) {
> +        err = AVERROR(ENOMEM);
> +        goto fail;
> +    }
> +
> +    vtctx->vt_ctx->cv_pix_fmt_type =
> +        av_map_videotoolbox_format_from_pixfmt(hw_frames->sw_format);
> +    if (!vtctx->vt_ctx->cv_pix_fmt_type) {
> +        av_log(avctx, AV_LOG_ERROR, "Unknown sw_format.\n");
> +        err = AVERROR(EINVAL);
> +        goto fail;
> +    }
> +
> +    err = videotoolbox_default_init(avctx);
> +    if (err < 0)
> +        goto fail;
> +
> +    return 0;
> +
> +fail:
> +    videotoolbox_uninit(avctx);
> +    return err;
> +}
> +
>  AVHWAccel ff_h263_videotoolbox_hwaccel = {
>      .name           = "h263_videotoolbox",
>      .type           = AVMEDIA_TYPE_VIDEO,
> @@ -616,7 +777,8 @@ AVHWAccel ff_h263_videotoolbox_hwaccel = {
>      .start_frame    = videotoolbox_mpeg_start_frame,
>      .decode_slice   = videotoolbox_mpeg_decode_slice,
>      .end_frame      = videotoolbox_mpeg_end_frame,
> -    .uninit         = ff_videotoolbox_uninit,
> +    .init           = videotoolbox_common_init,
> +    .uninit         = videotoolbox_uninit,
>      .priv_data_size = sizeof(VTContext),
>  };
>
> @@ -629,7 +791,8 @@ AVHWAccel ff_h264_videotoolbox_hwaccel = {
>      .start_frame    = ff_videotoolbox_h264_start_frame,
>      .decode_slice   = ff_videotoolbox_h264_decode_slice,
>      .end_frame      = videotoolbox_h264_end_frame,
> -    .uninit         = ff_videotoolbox_uninit,
> +    .init           = videotoolbox_common_init,
> +    .uninit         = videotoolbox_uninit,
>      .priv_data_size = sizeof(VTContext),
>  };
>
> @@ -642,7 +805,8 @@ AVHWAccel ff_mpeg1_videotoolbox_hwaccel = {
>      .start_frame    = videotoolbox_mpeg_start_frame,
>      .decode_slice   = videotoolbox_mpeg_decode_slice,
>      .end_frame      = videotoolbox_mpeg_end_frame,
> -    .uninit         = ff_videotoolbox_uninit,
> +    .init           = videotoolbox_common_init,
> +    .uninit         = videotoolbox_uninit,
>      .priv_data_size = sizeof(VTContext),
>  };
>
> @@ -655,7 +819,8 @@ AVHWAccel ff_mpeg2_videotoolbox_hwaccel = {
>      .start_frame    = videotoolbox_mpeg_start_frame,
>      .decode_slice   = videotoolbox_mpeg_decode_slice,
>      .end_frame      = videotoolbox_mpeg_end_frame,
> -    .uninit         = ff_videotoolbox_uninit,
> +    .init           = videotoolbox_common_init,
> +    .uninit         = videotoolbox_uninit,
>      .priv_data_size = sizeof(VTContext),
>  };
>
> @@ -668,7 +833,8 @@ AVHWAccel ff_mpeg4_videotoolbox_hwaccel = {
>      .start_frame    = videotoolbox_mpeg_start_frame,
>      .decode_slice   = videotoolbox_mpeg_decode_slice,
>      .end_frame      = videotoolbox_mpeg_end_frame,
> -    .uninit         = ff_videotoolbox_uninit,
> +    .init           = videotoolbox_common_init,
> +    .uninit         = videotoolbox_uninit,
>      .priv_data_size = sizeof(VTContext),
>  };
>
> diff --git a/libavutil/Makefile b/libavutil/Makefile
> index d669a924b0..e1fce7732c 100644
> --- a/libavutil/Makefile
> +++ b/libavutil/Makefile
> @@ -37,6 +37,7 @@ HEADERS = adler32.h \
>            hwcontext_dxva2.h \
>            hwcontext_qsv.h \
>            hwcontext_vaapi.h \
> +          hwcontext_videotoolbox.h \
>            hwcontext_vdpau.h \
>            imgutils.h \
>            intfloat.h \
> @@ -161,6 +162,7 @@ OBJS-$(CONFIG_QSV) += hwcontext_qsv.o
>  OBJS-$(CONFIG_LZO)          += lzo.o
>  OBJS-$(CONFIG_OPENCL)       += opencl.o opencl_internal.o
>  OBJS-$(CONFIG_VAAPI)        += hwcontext_vaapi.o
> +OBJS-$(CONFIG_VIDEOTOOLBOX) += hwcontext_videotoolbox.o
>  OBJS-$(CONFIG_VDPAU)        += hwcontext_vdpau.o
>
>  OBJS += $(COMPAT_OBJS:%=../compat/%)
> @@ -173,6 +175,7 @@ SKIPHEADERS-$(CONFIG_CUDA) += hwcontext_cuda_internal.h
>  SKIPHEADERS-$(CONFIG_DXVA2)  += hwcontext_dxva2.h
>  SKIPHEADERS-$(CONFIG_QSV)    += hwcontext_qsv.h
>  SKIPHEADERS-$(CONFIG_VAAPI)  += hwcontext_vaapi.h
> +SKIPHEADERS-$(CONFIG_VDPAU)  += hwcontext_videotoolbox.h

Hmm, seems like this should use CONFIG_VIDEOTOOLBOX, not CONFIG_VDPAU.

>  SKIPHEADERS-$(CONFIG_VDPAU)  += hwcontext_vdpau.h
>  SKIPHEADERS-$(HAVE_ATOMICS_GCC) += atomic_gcc.h
>  SKIPHEADERS-$(HAVE_ATOMICS_SUNCC) += atomic_suncc.h
> diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c
> index 4cfe377982..8d50a32b84 100644
> --- a/libavutil/hwcontext.c
> +++ b/libavutil/hwcontext.c
> @@ -44,6 +44,9 @@ static const HWContextType *hw_table[] = {
>  #if CONFIG_VDPAU
>      &ff_hwcontext_type_vdpau,
>  #endif
> +#if CONFIG_VIDEOTOOLBOX
> +    &ff_hwcontext_type_videotoolbox,
> +#endif
>      NULL,
>  };
>
> diff --git a/libavutil/hwcontext.h b/libavutil/hwcontext.h
> index 284b091209..cfc6ad0e28 100644
> --- a/libavutil/hwcontext.h
> +++ b/libavutil/hwcontext.h
> @@ -30,6 +30,7 @@ enum AVHWDeviceType {
>      AV_HWDEVICE_TYPE_VAAPI,
>      AV_HWDEVICE_TYPE_DXVA2,
>      AV_HWDEVICE_TYPE_QSV,
> +    AV_HWDEVICE_TYPE_VIDEOTOOLBOX,
>  };
>
>  typedef struct AVHWDeviceInternal AVHWDeviceInternal;
> diff --git a/libavutil/hwcontext_internal.h b/libavutil/hwcontext_internal.h
> index 30fce2afd9..cf05323e15 100644
> --- a/libavutil/hwcontext_internal.h
> +++ b/libavutil/hwcontext_internal.h
> @@ -144,5 +144,6 @@ extern const HWContextType ff_hwcontext_type_dxva2;
>  extern const HWContextType ff_hwcontext_type_qsv;
>  extern const HWContextType ff_hwcontext_type_vaapi;
>  extern const HWContextType ff_hwcontext_type_vdpau;
> +extern const HWContextType ff_hwcontext_type_videotoolbox;
>
>  #endif /* AVUTIL_HWCONTEXT_INTERNAL_H */
> diff --git a/libavutil/hwcontext_videotoolbox.c b/libavutil/hwcontext_videotoolbox.c
> new file mode 100644
> index 0000000000..cc00f1f2f2
> --- /dev/null
> +++ b/libavutil/hwcontext_videotoolbox.c
> @@ -0,0 +1,243 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "config.h"
> +
> +#include <stdint.h>
> +#include <string.h>
> +
> +#include <VideoToolbox/VideoToolbox.h>
> +
> +#include "buffer.h"
> +#include "common.h"
> +#include "hwcontext.h"
> +#include "hwcontext_internal.h"
> +#include "hwcontext_videotoolbox.h"
> +#include "mem.h"
> +#include "pixfmt.h"
> +#include "pixdesc.h"
> +
> +static const struct {
> +    uint32_t cv_fmt;
> +    enum AVPixelFormat pix_fmt;
> +} cv_pix_fmts[] = {
> +    { kCVPixelFormatType_420YpCbCr8Planar,             AV_PIX_FMT_YUV420P },
> +    { kCVPixelFormatType_422YpCbCr8,                   AV_PIX_FMT_UYVY422 },
> +    { kCVPixelFormatType_32BGRA,                       AV_PIX_FMT_BGRA },
> +#ifdef kCFCoreFoundationVersionNumber10_7
> +    { kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, AV_PIX_FMT_NV12 },
> +#endif
> +};
> +
> +enum AVPixelFormat av_map_videotoolbox_format_to_pixfmt(uint32_t cv_fmt)
> +{
> +    int i;
> +    for (i = 0; i < FF_ARRAY_ELEMS(cv_pix_fmts); i++) {
> +        if (cv_pix_fmts[i].cv_fmt == cv_fmt)
> +            return cv_pix_fmts[i].pix_fmt;
> +    }
> +    return AV_PIX_FMT_NONE;
> +}
> +
> +uint32_t av_map_videotoolbox_format_from_pixfmt(enum AVPixelFormat pix_fmt)
> +{
> +    int i;
> +    for (i = 0; i < FF_ARRAY_ELEMS(cv_pix_fmts); i++) {
> +        if (cv_pix_fmts[i].pix_fmt == pix_fmt)
> +            return cv_pix_fmts[i].cv_fmt;
> +    }
> +    return 0;
> +}
> +
> +static int vt_get_buffer(AVHWFramesContext *ctx, AVFrame *frame)
> +{
> +    frame->buf[0] = av_buffer_pool_get(ctx->pool);
> +    if (!frame->buf[0])
> +        return AVERROR(ENOMEM);
> +
> +    frame->data[3] = frame->buf[0]->data;
> +    frame->format  = AV_PIX_FMT_VIDEOTOOLBOX;
> +    frame->width   = ctx->width;
> +    frame->height  = ctx->height;
> +
> +    return 0;
> +}
> +
> +static int vt_transfer_get_formats(AVHWFramesContext *ctx,
> +                                   enum AVHWFrameTransferDirection dir,
> +                                   enum AVPixelFormat **formats)
> +{
> +    enum AVPixelFormat *fmts = av_malloc_array(2, sizeof(*fmts));
> +    if (!fmts)
> +        return AVERROR(ENOMEM);
> +
> +    fmts[0] = ctx->sw_format;
> +    fmts[1] = AV_PIX_FMT_NONE;
> +
> +    *formats = fmts;
> +    return 0;
> +}
> +
> +static void vt_unmap(AVHWFramesContext *ctx, HWMapDescriptor *hwmap)
> +{
> +    CVPixelBufferRef pixbuf = (CVPixelBufferRef)hwmap->source->data[3];
> +
> +    CVPixelBufferUnlockBaseAddress(pixbuf, (uintptr_t)hwmap->priv);
> +}
> +
> +static int vt_map_frame(AVHWFramesContext *ctx, AVFrame *dst, const AVFrame *src,
> +                        int flags)
> +{
> +    CVPixelBufferRef pixbuf = (CVPixelBufferRef)src->data[3];
> +    OSType pixel_format = CVPixelBufferGetPixelFormatType(pixbuf);
> +    CVReturn err;
> +    uint32_t map_flags = 0;
> +    int ret;
> +    int i;
> +    enum AVPixelFormat format;
> +
> +    format = av_map_videotoolbox_format_to_pixfmt(pixel_format);
> +    if (dst->format != format) {
> +        av_log(ctx, AV_LOG_ERROR, "Unsupported or mismatching pixel format: %s\n",
> +               av_fourcc2str(pixel_format));
> +        return AVERROR_UNKNOWN;
> +    }
> +
> +    if (CVPixelBufferGetWidth(pixbuf) != ctx->width ||
> +        CVPixelBufferGetHeight(pixbuf) != ctx->height) {
> +        av_log(ctx, AV_LOG_ERROR, "Inconsistent frame dimensions.\n");
> +        return AVERROR_UNKNOWN;
> +    }
> +
> +    if (flags == AV_HWFRAME_MAP_READ)
> +        map_flags = kCVPixelBufferLock_ReadOnly;
> +
> +    err = CVPixelBufferLockBaseAddress(pixbuf, map_flags);
> +    if (err != kCVReturnSuccess) {
> +        av_log(ctx, AV_LOG_ERROR, "Error locking the pixel buffer.\n");
> +        return AVERROR_UNKNOWN;
> +    }
> +
> +    if (CVPixelBufferIsPlanar(pixbuf)) {
> +        int planes = CVPixelBufferGetPlaneCount(pixbuf);
> +        for (i = 0; i < planes; i++) {
> +            dst->data[i]     = CVPixelBufferGetBaseAddressOfPlane(pixbuf, i);
> +            dst->linesize[i] = CVPixelBufferGetBytesPerRowOfPlane(pixbuf, i);
> +        }
> +    } else {
> +        dst->data[0]     = CVPixelBufferGetBaseAddress(pixbuf);
> +        dst->linesize[0] = CVPixelBufferGetBytesPerRow(pixbuf);
> +    }
> +
> +    ret = ff_hwframe_map_create(src->hw_frames_ctx, dst, src, vt_unmap,
> +                                (void *)(uintptr_t)map_flags);
> +    if (ret < 0)
> +        goto unlock;
> +
> +    return 0;
> +
> +unlock:
> +    CVPixelBufferUnlockBaseAddress(pixbuf, map_flags);
> +    return ret;
> +}
> +
> +static int vt_transfer_data_from(AVHWFramesContext *hwfc,
> +                                 AVFrame *dst, const AVFrame *src)
> +{
> +    AVFrame *map;
> +    int err;
> +
> +    if (dst->width > hwfc->width || dst->height > hwfc->height)
> +        return AVERROR(EINVAL);
> +
> +    map = av_frame_alloc();
> +    if (!map)
> +        return AVERROR(ENOMEM);
> +    map->format = dst->format;
> +
> +    err = vt_map_frame(hwfc, map, src, AV_HWFRAME_MAP_READ);
> +    if (err)
> +        goto fail;
> +
> +    map->width  = dst->width;
> +    map->height = dst->height;
> +
> +    err = av_frame_copy(dst, map);
> +    if (err)
> +        goto fail;
> +
> +    err = 0;
> +fail:
> +    av_frame_free(&map);
> +    return err;
> +}
> +
> +static int vt_transfer_data_to(AVHWFramesContext *hwfc,
> +                               AVFrame *dst, const AVFrame *src)
> +{
> +    AVFrame *map;
> +    int err;
> +
> +    if (src->width > hwfc->width || src->height > hwfc->height)
> +        return AVERROR(EINVAL);
> +
> +    map = av_frame_alloc();
> +    if (!map)
> +        return AVERROR(ENOMEM);
> +    map->format = src->format;
> +
> +    err = vt_map_frame(hwfc, map, dst, AV_HWFRAME_MAP_WRITE | AV_HWFRAME_MAP_OVERWRITE);
> +    if (err)
> +        goto fail;
> +
> +    map->width  = src->width;
> +    map->height = src->height;
> +
> +    err = av_frame_copy(map, src);
> +    if (err)
> +        goto fail;
> +
> +    err = 0;

For consistency with the rest of the file and past precedent, it would be
preferable to do:

    av_frame_free(&map);
    return 0;

instead of falling through to fail, which gives the appearance that
something might not have been done properly.  Also applies to
vt_transfer_data_from().

> +fail:
> +    av_frame_free(&map);
> +    return err;
> +}
> +
> +static int vt_device_create(AVHWDeviceContext *ctx, const char *device,
> +                            AVDictionary *opts, int flags)
> +{
> +    if (device && device[0]) {
> +        av_log(ctx, AV_LOG_ERROR, "Device selection unsupported.\n");
> +        return AVERROR_UNKNOWN;
> +    }
> +
> +    return 0;
> +}
> +
> +const HWContextType ff_hwcontext_type_videotoolbox = {
> +    .type                 = AV_HWDEVICE_TYPE_VIDEOTOOLBOX,
> +    .name                 = "videotoolbox",
> +
> +    .device_create        = vt_device_create,
> +    .frames_get_buffer    = vt_get_buffer,
> +    .transfer_get_formats = vt_transfer_get_formats,
> +    .transfer_data_to     = vt_transfer_data_to,
> +    .transfer_data_from   = vt_transfer_data_from,
> +
> +    .pix_fmts = (const enum AVPixelFormat[]){ AV_PIX_FMT_VIDEOTOOLBOX, AV_PIX_FMT_NONE },
> +};
> diff --git a/libavutil/hwcontext_videotoolbox.h b/libavutil/hwcontext_videotoolbox.h
> new file mode 100644
> index 0000000000..dc7b873204
> --- /dev/null
> +++ b/libavutil/hwcontext_videotoolbox.h
> @@ -0,0 +1,54 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVUTIL_HWCONTEXT_VT_H
> +#define AVUTIL_HWCONTEXT_VT_H
> +
> +#include <stdint.h>
> +
> +#include <VideoToolbox/VideoToolbox.h>
> +
> +#include "pixfmt.h"
> +
> +/**
> + * @file
> + * An API-specific header for AV_HWDEVICE_TYPE_VIDEOTOOLBOX.
> + *
> + * This API currently does not support frame allocation, as the raw VideoToolbox
> + * API does allocation, and FFmpeg itself never has the need to allocate frames.
> + *
> + * If the API user sets a custom pool, AVHWFramesContext.pool must return
> + * AVBufferRefs whose data pointer is a CVImageBufferRef or CVPixelBufferRef.
> + *
> + * Currently AVHWDeviceContext.hwctx and AVHWFramesContext.hwctx are always
> + * NULL.
> + */
> +
> +/**
> + * Convert a VideoToolbox (actually CoreVideo) format to AVPixelFormat.
> + * Returns AV_PIX_FMT_NONE if no known equivalent was found.
> + */
> +enum AVPixelFormat av_map_videotoolbox_format_to_pixfmt(uint32_t cv_fmt);
> +
> +/**
> + * Convert a AVPixelFormat to a VideoToolbox (actually CoreVideo) format.

"a AVPixelFormat" -> "an AVPixelFormat"

> + * Returns 0 if no known equivalent was found.
> + */
> +uint32_t av_map_videotoolbox_format_from_pixfmt(enum AVPixelFormat pix_fmt);
> +
> +#endif /* AVUTIL_HWCONTEXT_VT_H */
On Sun, 7 May 2017 12:01:11 -0700 Aaron Levinson <alevinsn@aracnet.com> wrote:

> Are you also planning to change ffmpeg_videotoolbox.c? See below for
> more comments.
>
> Aaron Levinson

Pushed. Changed what I felt like I could change. Thanks for the review.
diff --git a/doc/APIchanges b/doc/APIchanges index fcd3423d58..71f5563f03 100644 --- a/doc/APIchanges +++ b/doc/APIchanges @@ -15,6 +15,14 @@ libavutil: 2015-08-28 API changes, most recent first: +2017-05-03 - xxxxxxxxxx - lavc 57.xx.100 - avcodec.h + VideoToolbox hardware accelerated decoding now supports the new hwaccel API, + which can create the decoder context and allocate hardware frame automatically. + See AVCodecContext.hw_device_ctx and AVCodecContext.hw_frames_ctx. + +2017-05-03 - xxxxxxxxxx - lavu 57.xx.100 - hwcontext.h + Add AV_HWDEVICE_TYPE_VIDEOTOOLBOX and implementation. + 2017-04-11 - 8378466507 - lavu 55.61.100 - avstring.h Add av_strireplace(). diff --git a/libavcodec/vda_vt_internal.h b/libavcodec/vda_vt_internal.h index 9ff63ccc52..e55a813899 100644 --- a/libavcodec/vda_vt_internal.h +++ b/libavcodec/vda_vt_internal.h @@ -40,6 +40,13 @@ typedef struct VTContext { // The core video buffer CVImageBufferRef frame; + + // Current dummy frames context (depends on exact CVImageBufferRef params). + struct AVBufferRef *cached_hw_frames_ctx; + + // Non-NULL if the new hwaccel API is used. This is only a separate struct + // to ease compatibility with the old API. + struct AVVideotoolboxContext *vt_ctx; } VTContext; int ff_videotoolbox_alloc_frame(AVCodecContext *avctx, AVFrame *frame); diff --git a/libavcodec/videotoolbox.c b/libavcodec/videotoolbox.c index 67adad53ed..910ac25ea7 100644 --- a/libavcodec/videotoolbox.c +++ b/libavcodec/videotoolbox.c @@ -23,11 +23,13 @@ #include "config.h" #if CONFIG_VIDEOTOOLBOX # include "videotoolbox.h" +# include "libavutil/hwcontext_videotoolbox.h" #else # include "vda.h" #endif #include "vda_vt_internal.h" #include "libavutil/avutil.h" +#include "libavutil/hwcontext.h" #include "bytestream.h" #include "h264dec.h" #include "mpegvideo.h" @@ -188,6 +190,79 @@ int ff_videotoolbox_uninit(AVCodecContext *avctx) } #if CONFIG_VIDEOTOOLBOX +// Return the AVVideotoolboxContext that matters currently. 
Where it comes from +// depends on the API used. +static AVVideotoolboxContext *videotoolbox_get_context(AVCodecContext *avctx) +{ + // Somewhat tricky because the API user can call av_videotoolbox_default_free() + // at any time. + if (avctx->internal && avctx->internal->hwaccel_priv_data) { + VTContext *vtctx = avctx->internal->hwaccel_priv_data; + if (vtctx->vt_ctx) + return vtctx->vt_ctx; + } + return avctx->hwaccel_context; +} + +static int videotoolbox_buffer_create(AVCodecContext *avctx, AVFrame *frame) +{ + VTContext *vtctx = avctx->internal->hwaccel_priv_data; + CVPixelBufferRef pixbuf = (CVPixelBufferRef)vtctx->frame; + OSType pixel_format = CVPixelBufferGetPixelFormatType(pixbuf); + enum AVPixelFormat sw_format = av_map_videotoolbox_format_to_pixfmt(pixel_format); + int width = CVPixelBufferGetWidth(pixbuf); + int height = CVPixelBufferGetHeight(pixbuf); + AVHWFramesContext *cached_frames; + int ret; + + ret = ff_videotoolbox_buffer_create(vtctx, frame); + if (ret < 0) + return ret; + + // Old API code path. + if (!vtctx->cached_hw_frames_ctx) + return 0; + + // We can still return frames with unknown underlying format, except we need + // "some" AVPixelFormat for it. Use AV_PIX_FMT_VIDEOTOOLBOX to signal an + // opaque/unknown format, which is very sketchy, but you can't sue me. 
+    if (sw_format == AV_PIX_FMT_NONE)
+        sw_format = AV_PIX_FMT_VIDEOTOOLBOX;
+
+    cached_frames = (AVHWFramesContext*)vtctx->cached_hw_frames_ctx->data;
+
+    if (cached_frames->sw_format != sw_format ||
+        cached_frames->width != width ||
+        cached_frames->height != height) {
+        AVBufferRef *hw_frames_ctx = av_hwframe_ctx_alloc(cached_frames->device_ref);
+        AVHWFramesContext *hw_frames;
+        if (!hw_frames_ctx)
+            return AVERROR(ENOMEM);
+
+        hw_frames = (AVHWFramesContext*)hw_frames_ctx->data;
+        hw_frames->format = cached_frames->format;
+        hw_frames->sw_format = sw_format;
+        hw_frames->width = width;
+        hw_frames->height = height;
+
+        ret = av_hwframe_ctx_init(hw_frames_ctx);
+        if (ret < 0) {
+            av_buffer_unref(&hw_frames_ctx);
+            return ret;
+        }
+
+        av_buffer_unref(&vtctx->cached_hw_frames_ctx);
+        vtctx->cached_hw_frames_ctx = hw_frames_ctx;
+    }
+
+    av_assert0(!frame->hw_frames_ctx);
+    frame->hw_frames_ctx = av_buffer_ref(vtctx->cached_hw_frames_ctx);
+    if (!frame->hw_frames_ctx)
+        return AVERROR(ENOMEM);
+
+    return 0;
+}
+
 static void videotoolbox_write_mp4_descr_length(PutByteContext *pb, int length)
 {
     int i;
@@ -323,7 +398,7 @@ static OSStatus videotoolbox_session_decode_frame(AVCodecContext *avctx)
 {
     OSStatus status;
     CMSampleBufferRef sample_buf;
-    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
+    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
     VTContext *vtctx = avctx->internal->hwaccel_priv_data;
 
     sample_buf = videotoolbox_sample_buffer_create(videotoolbox->cm_fmt_desc,
@@ -349,7 +424,7 @@ static OSStatus videotoolbox_session_decode_frame(AVCodecContext *avctx)
 static int videotoolbox_common_end_frame(AVCodecContext *avctx, AVFrame *frame)
 {
     int status;
-    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
+    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
     VTContext *vtctx = avctx->internal->hwaccel_priv_data;
 
     if (!videotoolbox->session || !vtctx->bitstream)
@@ -365,7 +440,7 @@ static int videotoolbox_common_end_frame(AVCodecContext *avctx, AVFrame *frame)
     if (!vtctx->frame)
         return AVERROR_UNKNOWN;
 
-    return ff_videotoolbox_buffer_create(vtctx, frame);
+    return videotoolbox_buffer_create(avctx, frame);
 }
 
 static int videotoolbox_h264_end_frame(AVCodecContext *avctx)
@@ -513,7 +588,7 @@ static CMVideoFormatDescriptionRef videotoolbox_format_desc_create(CMVideoCodecT
 static int videotoolbox_default_init(AVCodecContext *avctx)
 {
-    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
+    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
     OSStatus status;
     VTDecompressionOutputCallbackRecord decoder_cb;
     CFDictionaryRef decoder_spec;
@@ -594,7 +669,7 @@ static int videotoolbox_default_init(AVCodecContext *avctx)
 static void videotoolbox_default_free(AVCodecContext *avctx)
 {
-    AVVideotoolboxContext *videotoolbox = avctx->hwaccel_context;
+    AVVideotoolboxContext *videotoolbox = videotoolbox_get_context(avctx);
 
     if (videotoolbox) {
         if (videotoolbox->cm_fmt_desc)
@@ -607,6 +682,92 @@ static void videotoolbox_default_free(AVCodecContext *avctx)
     }
 }
 
+static int videotoolbox_uninit(AVCodecContext *avctx)
+{
+    VTContext *vtctx = avctx->internal->hwaccel_priv_data;
+    if (!vtctx)
+        return 0;
+
+    ff_videotoolbox_uninit(avctx);
+
+    if (vtctx->vt_ctx)
+        videotoolbox_default_free(avctx);
+
+    av_buffer_unref(&vtctx->cached_hw_frames_ctx);
+    av_freep(&vtctx->vt_ctx);
+
+    return 0;
+}
+
+static int videotoolbox_common_init(AVCodecContext *avctx)
+{
+    VTContext *vtctx = avctx->internal->hwaccel_priv_data;
+    AVHWFramesContext *hw_frames;
+    int err;
+
+    // Old API - do nothing.
+    if (avctx->hwaccel_context)
+        return 0;
+
+    if (!avctx->hw_frames_ctx && !avctx->hw_device_ctx) {
+        av_log(avctx, AV_LOG_ERROR,
+               "Either hw_frames_ctx or hw_device_ctx must be set.\n");
+        return AVERROR(EINVAL);
+    }
+
+    vtctx->vt_ctx = av_videotoolbox_alloc_context();
+    if (!vtctx->vt_ctx) {
+        err = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    if (avctx->hw_frames_ctx) {
+        hw_frames = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
+    } else {
+        avctx->hw_frames_ctx = av_hwframe_ctx_alloc(avctx->hw_device_ctx);
+        if (!avctx->hw_frames_ctx) {
+            err = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        hw_frames = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
+        hw_frames->format = AV_PIX_FMT_VIDEOTOOLBOX;
+        hw_frames->sw_format = AV_PIX_FMT_NV12; // same as av_videotoolbox_alloc_context()
+        hw_frames->width = avctx->width;
+        hw_frames->height = avctx->height;
+
+        err = av_hwframe_ctx_init(avctx->hw_frames_ctx);
+        if (err < 0) {
+            av_buffer_unref(&avctx->hw_frames_ctx);
+            goto fail;
+        }
+    }
+
+    vtctx->cached_hw_frames_ctx = av_buffer_ref(avctx->hw_frames_ctx);
+    if (!vtctx->cached_hw_frames_ctx) {
+        err = AVERROR(ENOMEM);
+        goto fail;
+    }
+
+    vtctx->vt_ctx->cv_pix_fmt_type =
+        av_map_videotoolbox_format_from_pixfmt(hw_frames->sw_format);
+    if (!vtctx->vt_ctx->cv_pix_fmt_type) {
+        av_log(avctx, AV_LOG_ERROR, "Unknown sw_format.\n");
+        err = AVERROR(EINVAL);
+        goto fail;
+    }
+
+    err = videotoolbox_default_init(avctx);
+    if (err < 0)
+        goto fail;
+
+    return 0;
+
+fail:
+    videotoolbox_uninit(avctx);
+    return err;
+}
+
 AVHWAccel ff_h263_videotoolbox_hwaccel = {
     .name           = "h263_videotoolbox",
     .type           = AVMEDIA_TYPE_VIDEO,
@@ -616,7 +777,8 @@ AVHWAccel ff_h263_videotoolbox_hwaccel = {
     .start_frame    = videotoolbox_mpeg_start_frame,
     .decode_slice   = videotoolbox_mpeg_decode_slice,
     .end_frame      = videotoolbox_mpeg_end_frame,
-    .uninit         = ff_videotoolbox_uninit,
+    .init           = videotoolbox_common_init,
+    .uninit         = videotoolbox_uninit,
     .priv_data_size = sizeof(VTContext),
 };
 
@@ -629,7 +791,8 @@ AVHWAccel ff_h264_videotoolbox_hwaccel = {
     .start_frame    = ff_videotoolbox_h264_start_frame,
     .decode_slice   = ff_videotoolbox_h264_decode_slice,
     .end_frame      = videotoolbox_h264_end_frame,
-    .uninit         = ff_videotoolbox_uninit,
+    .init           = videotoolbox_common_init,
+    .uninit         = videotoolbox_uninit,
     .priv_data_size = sizeof(VTContext),
 };
 
@@ -642,7 +805,8 @@ AVHWAccel ff_mpeg1_videotoolbox_hwaccel = {
     .start_frame    = videotoolbox_mpeg_start_frame,
     .decode_slice   = videotoolbox_mpeg_decode_slice,
     .end_frame      = videotoolbox_mpeg_end_frame,
-    .uninit         = ff_videotoolbox_uninit,
+    .init           = videotoolbox_common_init,
+    .uninit         = videotoolbox_uninit,
     .priv_data_size = sizeof(VTContext),
 };
 
@@ -655,7 +819,8 @@ AVHWAccel ff_mpeg2_videotoolbox_hwaccel = {
     .start_frame    = videotoolbox_mpeg_start_frame,
     .decode_slice   = videotoolbox_mpeg_decode_slice,
     .end_frame      = videotoolbox_mpeg_end_frame,
-    .uninit         = ff_videotoolbox_uninit,
+    .init           = videotoolbox_common_init,
+    .uninit         = videotoolbox_uninit,
     .priv_data_size = sizeof(VTContext),
 };
 
@@ -668,7 +833,8 @@ AVHWAccel ff_mpeg4_videotoolbox_hwaccel = {
     .start_frame    = videotoolbox_mpeg_start_frame,
     .decode_slice   = videotoolbox_mpeg_decode_slice,
     .end_frame      = videotoolbox_mpeg_end_frame,
-    .uninit         = ff_videotoolbox_uninit,
+    .init           = videotoolbox_common_init,
+    .uninit         = videotoolbox_uninit,
     .priv_data_size = sizeof(VTContext),
 };
 
diff --git a/libavutil/Makefile b/libavutil/Makefile
index d669a924b0..e1fce7732c 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -37,6 +37,7 @@ HEADERS = adler32.h \
           hwcontext_dxva2.h \
           hwcontext_qsv.h \
           hwcontext_vaapi.h \
+          hwcontext_videotoolbox.h \
           hwcontext_vdpau.h \
           imgutils.h \
           intfloat.h \
@@ -161,6 +162,7 @@ OBJS-$(CONFIG_QSV) += hwcontext_qsv.o
 OBJS-$(CONFIG_LZO) += lzo.o
 OBJS-$(CONFIG_OPENCL) += opencl.o opencl_internal.o
 OBJS-$(CONFIG_VAAPI) += hwcontext_vaapi.o
+OBJS-$(CONFIG_VIDEOTOOLBOX) += hwcontext_videotoolbox.o
 OBJS-$(CONFIG_VDPAU) += hwcontext_vdpau.o
 
 OBJS += $(COMPAT_OBJS:%=../compat/%)
@@ -173,6 +175,7 @@ SKIPHEADERS-$(CONFIG_CUDA) += hwcontext_cuda_internal.h
 SKIPHEADERS-$(CONFIG_DXVA2) += hwcontext_dxva2.h
 SKIPHEADERS-$(CONFIG_QSV) += hwcontext_qsv.h
 SKIPHEADERS-$(CONFIG_VAAPI) += hwcontext_vaapi.h
+SKIPHEADERS-$(CONFIG_VIDEOTOOLBOX) += hwcontext_videotoolbox.h
 SKIPHEADERS-$(CONFIG_VDPAU) += hwcontext_vdpau.h
 SKIPHEADERS-$(HAVE_ATOMICS_GCC) += atomic_gcc.h
 SKIPHEADERS-$(HAVE_ATOMICS_SUNCC) += atomic_suncc.h
diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c
index 4cfe377982..8d50a32b84 100644
--- a/libavutil/hwcontext.c
+++ b/libavutil/hwcontext.c
@@ -44,6 +44,9 @@ static const HWContextType *hw_table[] = {
 #if CONFIG_VDPAU
     &ff_hwcontext_type_vdpau,
 #endif
+#if CONFIG_VIDEOTOOLBOX
+    &ff_hwcontext_type_videotoolbox,
+#endif
     NULL,
 };
 
diff --git a/libavutil/hwcontext.h b/libavutil/hwcontext.h
index 284b091209..cfc6ad0e28 100644
--- a/libavutil/hwcontext.h
+++ b/libavutil/hwcontext.h
@@ -30,6 +30,7 @@ enum AVHWDeviceType {
     AV_HWDEVICE_TYPE_VAAPI,
     AV_HWDEVICE_TYPE_DXVA2,
     AV_HWDEVICE_TYPE_QSV,
+    AV_HWDEVICE_TYPE_VIDEOTOOLBOX,
 };
 
 typedef struct AVHWDeviceInternal AVHWDeviceInternal;
diff --git a/libavutil/hwcontext_internal.h b/libavutil/hwcontext_internal.h
index 30fce2afd9..cf05323e15 100644
--- a/libavutil/hwcontext_internal.h
+++ b/libavutil/hwcontext_internal.h
@@ -144,5 +144,6 @@ extern const HWContextType ff_hwcontext_type_dxva2;
 extern const HWContextType ff_hwcontext_type_qsv;
 extern const HWContextType ff_hwcontext_type_vaapi;
 extern const HWContextType ff_hwcontext_type_vdpau;
+extern const HWContextType ff_hwcontext_type_videotoolbox;
 
 #endif /* AVUTIL_HWCONTEXT_INTERNAL_H */
diff --git a/libavutil/hwcontext_videotoolbox.c b/libavutil/hwcontext_videotoolbox.c
new file mode 100644
index 0000000000..cc00f1f2f2
--- /dev/null
+++ b/libavutil/hwcontext_videotoolbox.c
@@ -0,0 +1,243 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#include <stdint.h>
+#include <string.h>
+
+#include <VideoToolbox/VideoToolbox.h>
+
+#include "buffer.h"
+#include "common.h"
+#include "hwcontext.h"
+#include "hwcontext_internal.h"
+#include "hwcontext_videotoolbox.h"
+#include "mem.h"
+#include "pixfmt.h"
+#include "pixdesc.h"
+
+static const struct {
+    uint32_t cv_fmt;
+    enum AVPixelFormat pix_fmt;
+} cv_pix_fmts[] = {
+    { kCVPixelFormatType_420YpCbCr8Planar,             AV_PIX_FMT_YUV420P },
+    { kCVPixelFormatType_422YpCbCr8,                   AV_PIX_FMT_UYVY422 },
+    { kCVPixelFormatType_32BGRA,                       AV_PIX_FMT_BGRA },
+#ifdef kCFCoreFoundationVersionNumber10_7
+    { kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, AV_PIX_FMT_NV12 },
+#endif
+};
+
+enum AVPixelFormat av_map_videotoolbox_format_to_pixfmt(uint32_t cv_fmt)
+{
+    int i;
+    for (i = 0; i < FF_ARRAY_ELEMS(cv_pix_fmts); i++) {
+        if (cv_pix_fmts[i].cv_fmt == cv_fmt)
+            return cv_pix_fmts[i].pix_fmt;
+    }
+    return AV_PIX_FMT_NONE;
+}
+
+uint32_t av_map_videotoolbox_format_from_pixfmt(enum AVPixelFormat pix_fmt)
+{
+    int i;
+    for (i = 0; i < FF_ARRAY_ELEMS(cv_pix_fmts); i++) {
+        if (cv_pix_fmts[i].pix_fmt == pix_fmt)
+            return cv_pix_fmts[i].cv_fmt;
+    }
+    return 0;
+}
+
+static int vt_get_buffer(AVHWFramesContext *ctx, AVFrame *frame)
+{
+    frame->buf[0] = av_buffer_pool_get(ctx->pool);
+    if (!frame->buf[0])
+        return AVERROR(ENOMEM);
+
+    frame->data[3] = frame->buf[0]->data;
+    frame->format = AV_PIX_FMT_VIDEOTOOLBOX;
+    frame->width = ctx->width;
+    frame->height = ctx->height;
+
+    return 0;
+}
+
+static int vt_transfer_get_formats(AVHWFramesContext *ctx,
+                                   enum AVHWFrameTransferDirection dir,
+                                   enum AVPixelFormat **formats)
+{
+    enum AVPixelFormat *fmts = av_malloc_array(2, sizeof(*fmts));
+    if (!fmts)
+        return AVERROR(ENOMEM);
+
+    fmts[0] = ctx->sw_format;
+    fmts[1] = AV_PIX_FMT_NONE;
+
+    *formats = fmts;
+    return 0;
+}
+
+static void vt_unmap(AVHWFramesContext *ctx, HWMapDescriptor *hwmap)
+{
+    CVPixelBufferRef pixbuf = (CVPixelBufferRef)hwmap->source->data[3];
+
+    CVPixelBufferUnlockBaseAddress(pixbuf, (uintptr_t)hwmap->priv);
+}
+
+static int vt_map_frame(AVHWFramesContext *ctx, AVFrame *dst, const AVFrame *src,
+                        int flags)
+{
+    CVPixelBufferRef pixbuf = (CVPixelBufferRef)src->data[3];
+    OSType pixel_format = CVPixelBufferGetPixelFormatType(pixbuf);
+    CVReturn err;
+    uint32_t map_flags = 0;
+    int ret;
+    int i;
+    enum AVPixelFormat format;
+
+    format = av_map_videotoolbox_format_to_pixfmt(pixel_format);
+    if (dst->format != format) {
+        av_log(ctx, AV_LOG_ERROR, "Unsupported or mismatching pixel format: %s\n",
+               av_fourcc2str(pixel_format));
+        return AVERROR_UNKNOWN;
+    }
+
+    if (CVPixelBufferGetWidth(pixbuf) != ctx->width ||
+        CVPixelBufferGetHeight(pixbuf) != ctx->height) {
+        av_log(ctx, AV_LOG_ERROR, "Inconsistent frame dimensions.\n");
+        return AVERROR_UNKNOWN;
+    }
+
+    if (flags == AV_HWFRAME_MAP_READ)
+        map_flags = kCVPixelBufferLock_ReadOnly;
+
+    err = CVPixelBufferLockBaseAddress(pixbuf, map_flags);
+    if (err != kCVReturnSuccess) {
+        av_log(ctx, AV_LOG_ERROR, "Error locking the pixel buffer.\n");
+        return AVERROR_UNKNOWN;
+    }
+
+    if (CVPixelBufferIsPlanar(pixbuf)) {
+        int planes = CVPixelBufferGetPlaneCount(pixbuf);
+        for (i = 0; i < planes; i++) {
+            dst->data[i] = CVPixelBufferGetBaseAddressOfPlane(pixbuf, i);
+            dst->linesize[i] = CVPixelBufferGetBytesPerRowOfPlane(pixbuf, i);
+        }
+    } else {
+        dst->data[0] = CVPixelBufferGetBaseAddress(pixbuf);
+        dst->linesize[0] = CVPixelBufferGetBytesPerRow(pixbuf);
+    }
+
+    ret = ff_hwframe_map_create(src->hw_frames_ctx, dst, src, vt_unmap,
+                                (void *)(uintptr_t)map_flags);
+    if (ret < 0)
+        goto unlock;
+
+    return 0;
+
+unlock:
+    CVPixelBufferUnlockBaseAddress(pixbuf, map_flags);
+    return ret;
+}
+
+static int vt_transfer_data_from(AVHWFramesContext *hwfc,
+                                 AVFrame *dst, const AVFrame *src)
+{
+    AVFrame *map;
+    int err;
+
+    if (dst->width > hwfc->width || dst->height > hwfc->height)
+        return AVERROR(EINVAL);
+
+    map = av_frame_alloc();
+    if (!map)
+        return AVERROR(ENOMEM);
+    map->format = dst->format;
+
+    err = vt_map_frame(hwfc, map, src, AV_HWFRAME_MAP_READ);
+    if (err)
+        goto fail;
+
+    map->width = dst->width;
+    map->height = dst->height;
+
+    err = av_frame_copy(dst, map);
+    if (err)
+        goto fail;
+
+    err = 0;
+fail:
+    av_frame_free(&map);
+    return err;
+}
+
+static int vt_transfer_data_to(AVHWFramesContext *hwfc,
+                               AVFrame *dst, const AVFrame *src)
+{
+    AVFrame *map;
+    int err;
+
+    if (src->width > hwfc->width || src->height > hwfc->height)
+        return AVERROR(EINVAL);
+
+    map = av_frame_alloc();
+    if (!map)
+        return AVERROR(ENOMEM);
+    map->format = src->format;
+
+    err = vt_map_frame(hwfc, map, dst, AV_HWFRAME_MAP_WRITE | AV_HWFRAME_MAP_OVERWRITE);
+    if (err)
+        goto fail;
+
+    map->width = src->width;
+    map->height = src->height;
+
+    err = av_frame_copy(map, src);
+    if (err)
+        goto fail;
+
+    err = 0;
+fail:
+    av_frame_free(&map);
+    return err;
+}
+
+static int vt_device_create(AVHWDeviceContext *ctx, const char *device,
+                            AVDictionary *opts, int flags)
+{
+    if (device && device[0]) {
+        av_log(ctx, AV_LOG_ERROR, "Device selection unsupported.\n");
+        return AVERROR_UNKNOWN;
+    }
+
+    return 0;
+}
+
+const HWContextType ff_hwcontext_type_videotoolbox = {
+    .type = AV_HWDEVICE_TYPE_VIDEOTOOLBOX,
+    .name = "videotoolbox",
+
+    .device_create = vt_device_create,
+    .frames_get_buffer = vt_get_buffer,
+    .transfer_get_formats = vt_transfer_get_formats,
+    .transfer_data_to = vt_transfer_data_to,
+    .transfer_data_from = vt_transfer_data_from,
+
+    .pix_fmts = (const enum AVPixelFormat[]){ AV_PIX_FMT_VIDEOTOOLBOX, AV_PIX_FMT_NONE },
+};
diff --git a/libavutil/hwcontext_videotoolbox.h b/libavutil/hwcontext_videotoolbox.h
new file mode 100644
index 0000000000..dc7b873204
--- /dev/null
+++ b/libavutil/hwcontext_videotoolbox.h
@@ -0,0 +1,54 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_HWCONTEXT_VT_H
+#define AVUTIL_HWCONTEXT_VT_H
+
+#include <stdint.h>
+
+#include <VideoToolbox/VideoToolbox.h>
+
+#include "pixfmt.h"
+
+/**
+ * @file
+ * An API-specific header for AV_HWDEVICE_TYPE_VIDEOTOOLBOX.
+ *
+ * This API currently does not support frame allocation, as the raw VideoToolbox
+ * API does allocation, and FFmpeg itself never has the need to allocate frames.
+ *
+ * If the API user sets a custom pool, AVHWFramesContext.pool must return
+ * AVBufferRefs whose data pointer is a CVImageBufferRef or CVPixelBufferRef.
+ *
+ * Currently AVHWDeviceContext.hwctx and AVHWFramesContext.hwctx are always
+ * NULL.
+ */
+
+/**
+ * Convert a VideoToolbox (actually CoreVideo) format to AVPixelFormat.
+ * Returns AV_PIX_FMT_NONE if no known equivalent was found.
+ */
+enum AVPixelFormat av_map_videotoolbox_format_to_pixfmt(uint32_t cv_fmt);
+
+/**
+ * Convert an AVPixelFormat to a VideoToolbox (actually CoreVideo) format.
+ * Returns 0 if no known equivalent was found.
+ */
+uint32_t av_map_videotoolbox_format_from_pixfmt(enum AVPixelFormat pix_fmt);
+
+#endif /* AVUTIL_HWCONTEXT_VT_H */