
[FFmpeg-devel] avdevice/v4l2: add limited support for multiplanar API

Message ID 20240620154039.91460-1-ramiro.polla@gmail.com
State New
Series [FFmpeg-devel] avdevice/v4l2: add limited support for multiplanar API

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Ramiro Polla June 20, 2024, 3:40 p.m. UTC
This commit adds support for V4L2's multiplanar API, but only when the
number of planes is 1.

Adding full support for the multiplanar API would require a device that
actually uses more than 1 plane, which I have not found yet.
---
 libavdevice/v4l2.c | 93 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 25 deletions(-)

Comments

Anton Khirnov June 25, 2024, 9:19 a.m. UTC | #1
Quoting Ramiro Polla (2024-06-20 17:40:39)
> This commit adds support for V4L2's multiplanar API, but only when the
> number of planes is 1.
> 
> Adding full support for the multiplanar API would require a device that
> actually uses more than 1 plane, which I have not found yet.

Out of curiosity, what's the point of a multiplanar API when the plane
count is always one?
Ramiro Polla June 25, 2024, 11:56 a.m. UTC | #2
On Tue, Jun 25, 2024 at 11:19 AM Anton Khirnov <anton@khirnov.net> wrote:
> Quoting Ramiro Polla (2024-06-20 17:40:39)
> > This commit adds support for V4L2's multiplanar API, but only when the
> > number of planes is 1.
> >
> > Adding full support for the multiplanar API would require a device that
> > actually uses more than 1 plane, which I have not found yet.
>
> Out of curiosity, what's the point of a multiplanar API when the plane
> count is always one?

Good question, and I wish I knew the answer. Some SBCs are providing
HDMI input through v4l2, and they decided to use the multiplanar API,
even though the count is always one.

I couldn't find many examples of how to use this API correctly, and a
couple of the implementations I did find are buggy to the point where
they only work when the count is one.

I wish I could find a device that uses the multiplanar API and
actually provides more than one buffer so that I could properly test,
but by looking at the kernel source code I couldn't find a device that
I could easily obtain yet.
Ramiro Polla June 27, 2024, 2:13 p.m. UTC | #3
On Tue, Jun 25, 2024 at 1:56 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> On Tue, Jun 25, 2024 at 11:19 AM Anton Khirnov <anton@khirnov.net> wrote:
> > Quoting Ramiro Polla (2024-06-20 17:40:39)
> > > This commit adds support for V4L2's multiplanar API, but only when the
> > > number of planes is 1.
> > >
> > > Adding full support for the multiplanar API would require a device that
> > > actually uses more than 1 plane, which I have not found yet.
> >
> > Out of curiosity, what's the point of a multiplanar API when the plane
> > count is always one?
>
> Good question, and I wish I knew the answer. Some SBCs are providing
> HDMI input through v4l2, and they decided to use the multiplanar API,
> even though the count is always one.
>
> I couldn't find many examples on how to use this API correctly, with a
> couple of implementations being buggy to a point where they will only
> work when the count is one.
>
> I wish I could find a device that uses the multiplanar API and
> actually provides more than one buffer so that I could properly test,
> but by looking at the kernel source code I couldn't find a device that
> I could easily obtain yet.

Is anyone fundamentally opposed to this approach to implement limited
support for multiplanar API? I figure it could still be useful even
when full multiplanar API support is implemented, because multiple
memory buffers per packet will likely mean memcpy()s, which can be
avoided if the plane count is one.
Anton Khirnov June 27, 2024, 4:05 p.m. UTC | #4
Quoting Ramiro Polla (2024-06-27 16:13:24)
> On Tue, Jun 25, 2024 at 1:56 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> > On Tue, Jun 25, 2024 at 11:19 AM Anton Khirnov <anton@khirnov.net> wrote:
> > > Quoting Ramiro Polla (2024-06-20 17:40:39)
> > > > This commit adds support for V4L2's multiplanar API, but only when the
> > > > number of planes is 1.
> > > >
> > > > Adding full support for the multiplanar API would require a device that
> > > > actually uses more than 1 plane, which I have not found yet.
> > >
> > > Out of curiosity, what's the point of a multiplanar API when the plane
> > > count is always one?
> >
> > Good question, and I wish I knew the answer. Some SBCs are providing
> > HDMI input through v4l2, and they decided to use the multiplanar API,
> > even though the count is always one.
> >
> > I couldn't find many examples on how to use this API correctly, with a
> > couple of implementations being buggy to a point where they will only
> > work when the count is one.
> >
> > I wish I could find a device that uses the multiplanar API and
> > actually provides more than one buffer so that I could properly test,
> > but by looking at the kernel source code I couldn't find a device that
> > I could easily obtain yet.
> 
> Is anyone fundamentally opposed to this approach to implement limited
> support for multiplanar API? I figure it could still be useful even
> when full multiplanar API support is implemented, because multiple
> memory buffers per packet will likely mean memcpy()s, which can be
> avoided if the plane count is one.

Ideally this should be turned into a lavfi source, then multiple planes
can be exported individually.

(this is not an objection to this patch)
Ramiro Polla June 28, 2024, 11:47 a.m. UTC | #5
On Thu, Jun 27, 2024 at 6:06 PM Anton Khirnov <anton@khirnov.net> wrote:
> Quoting Ramiro Polla (2024-06-27 16:13:24)
> > Is anyone fundamentally opposed to this approach to implement limited
> > support for multiplanar API? I figure it could still be useful even
> > when full multiplanar API support is implemented, because multiple
> > memory buffers per packet will likely mean memcpy()s, which can be
> > avoided if the plane count is one.
>
> Ideally this should be turned into a lavfi source, then multiple planes
> can be exported individually.

Good idea. Is there any kind of roadmap on how this conversion from
input device to source filter should take place, or is this something
that still lacks consensus?

I assume we want to keep the functionality from the command line (so
that "-i /dev/video0" would still use the lavfi source internally
somehow). Would the code be shared between libraries, or would it be
moved to lavfi and a wrapper written in lavd instead?
Ramiro Polla June 28, 2024, 12:54 p.m. UTC | #6
On Thu, Jun 27, 2024 at 4:13 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> On Tue, Jun 25, 2024 at 1:56 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> > On Tue, Jun 25, 2024 at 11:19 AM Anton Khirnov <anton@khirnov.net> wrote:
> > > Quoting Ramiro Polla (2024-06-20 17:40:39)
> > > > This commit adds support for V4L2's multiplanar API, but only when the
> > > > number of planes is 1.
> > > >
> > > > Adding full support for the multiplanar API would require a device that
> > > > actually uses more than 1 plane, which I have not found yet.
> > >
> > > Out of curiosity, what's the point of a multiplanar API when the plane
> > > count is always one?
> >
> > Good question, and I wish I knew the answer. Some SBCs are providing
> > HDMI input through v4l2, and they decided to use the multiplanar API,
> > even though the count is always one.
> >
> > I couldn't find many examples on how to use this API correctly, with a
> > couple of implementations being buggy to a point where they will only
> > work when the count is one.
> >
> > I wish I could find a device that uses the multiplanar API and
> > actually provides more than one buffer so that I could properly test,
> > but by looking at the kernel source code I couldn't find a device that
> > I could easily obtain yet.
>
> Is anyone fundamentally opposed to this approach to implement limited
> support for multiplanar API? I figure it could still be useful even
> when full multiplanar API support is implemented, because multiple
> memory buffers per packet will likely mean memcpy()s, which can be
> avoided if the plane count is one.

I'll apply this after the weekend if there are no objections.
Anton Khirnov July 1, 2024, 2:15 p.m. UTC | #7
Quoting Ramiro Polla (2024-06-28 13:47:13)
> On Thu, Jun 27, 2024 at 6:06 PM Anton Khirnov <anton@khirnov.net> wrote:
> > Quoting Ramiro Polla (2024-06-27 16:13:24)
> > > Is anyone fundamentally opposed to this approach to implement limited
> > > support for multiplanar API? I figure it could still be useful even
> > > when full multiplanar API support is implemented, because multiple
> > > memory buffers per packet will likely mean memcpy()s, which can be
> > > avoided if the plane count is one.
> >
> > Ideally this should be turned into a lavfi source, then multiple planes
> > can be exported individually.
> 
> Good idea. Is there any kind of roadmap on how this conversion from
> input device to source filter should take place, or is this something
> that still lacks consensus?

AFAIK there is a wide consensus that this is something that should be
done in general, though of course that does not directly translate into
volunteers actually doing the conversion.

> I assume we want to keep the functionality from the command line (so
> that "-i /dev/video0" would still use the lavfi source internally
> somehow). Would the code be shared between libraries, or would it be
> moved to lavfi and a wrapper written in lavd instead?

V4L2 is trickier than most devices, since it can also produce compressed
video. So at least that part of the functionality will have to stay as a
demuxer.
Ramiro Polla July 1, 2024, 5:38 p.m. UTC | #8
On Fri, Jun 28, 2024 at 2:54 PM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> I'll apply this after the weekend if there are no objections.

I realized this patch would have caused a regression on buffers with
corrupted data or an unexpected size.

New patch attached. Alexander, Stephen, since you worked on this
before, could you check that this patch indeed doesn't cause this
regression? I'll apply in a few days if there are no comments.

Patch

diff --git a/libavdevice/v4l2.c b/libavdevice/v4l2.c
index 74f43ef6a9..305c1f25dd 100644
--- a/libavdevice/v4l2.c
+++ b/libavdevice/v4l2.c
@@ -92,6 +92,9 @@  struct video_data {
     TimeFilter *timefilter;
     int64_t last_time_m;
 
+    int multiplanar;
+    enum v4l2_buf_type buf_type;
+
     int buffers;
     atomic_int buffers_queued;
     void **buf_start;
@@ -182,7 +185,13 @@  static int device_open(AVFormatContext *ctx, const char* device_path)
     av_log(ctx, AV_LOG_VERBOSE, "fd:%d capabilities:%x\n",
            fd, cap.capabilities);
 
-    if (!(cap.capabilities & V4L2_CAP_VIDEO_CAPTURE)) {
+    if (cap.capabilities & V4L2_CAP_VIDEO_CAPTURE) {
+        s->multiplanar = 0;
+        s->buf_type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+    } else if (cap.capabilities & V4L2_CAP_VIDEO_CAPTURE_MPLANE) {
+        s->multiplanar = 1;
+        s->buf_type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
+    } else {
         av_log(ctx, AV_LOG_ERROR, "Not a video capture device.\n");
         err = AVERROR(ENODEV);
         goto fail;
@@ -206,7 +215,7 @@  static int device_init(AVFormatContext *ctx, int *width, int *height,
                        uint32_t pixelformat)
 {
     struct video_data *s = ctx->priv_data;
-    struct v4l2_format fmt = { .type = V4L2_BUF_TYPE_VIDEO_CAPTURE };
+    struct v4l2_format fmt = { .type = s->buf_type };
     int res = 0;
 
     fmt.fmt.pix.width = *width;
@@ -288,7 +297,7 @@  static void list_framesizes(AVFormatContext *ctx, uint32_t pixelformat)
 static void list_formats(AVFormatContext *ctx, int type)
 {
     const struct video_data *s = ctx->priv_data;
-    struct v4l2_fmtdesc vfd = { .type = V4L2_BUF_TYPE_VIDEO_CAPTURE };
+    struct v4l2_fmtdesc vfd = { .type = s->buf_type };
 
     while(!v4l2_ioctl(s->fd, VIDIOC_ENUM_FMT, &vfd)) {
         enum AVCodecID codec_id = ff_fmt_v4l2codec(vfd.pixelformat);
@@ -352,7 +361,7 @@  static int mmap_init(AVFormatContext *ctx)
     int i, res;
     struct video_data *s = ctx->priv_data;
     struct v4l2_requestbuffers req = {
-        .type   = V4L2_BUF_TYPE_VIDEO_CAPTURE,
+        .type   = s->buf_type,
         .count  = desired_video_buffers,
         .memory = V4L2_MEMORY_MMAP
     };
@@ -381,10 +390,14 @@  static int mmap_init(AVFormatContext *ctx)
     }
 
     for (i = 0; i < req.count; i++) {
+        unsigned int buf_length, buf_offset;
+        struct v4l2_plane planes[VIDEO_MAX_PLANES];
         struct v4l2_buffer buf = {
-            .type   = V4L2_BUF_TYPE_VIDEO_CAPTURE,
+            .type   = s->buf_type,
             .index  = i,
-            .memory = V4L2_MEMORY_MMAP
+            .memory = V4L2_MEMORY_MMAP,
+            .m.planes = s->multiplanar ? planes : NULL,
+            .length   = s->multiplanar ? VIDEO_MAX_PLANES : 0,
         };
         if (v4l2_ioctl(s->fd, VIDIOC_QUERYBUF, &buf) < 0) {
             res = AVERROR(errno);
@@ -392,16 +405,28 @@  static int mmap_init(AVFormatContext *ctx)
             return res;
         }
 
-        s->buf_len[i] = buf.length;
+        if (s->multiplanar) {
+            if (buf.length != 1) {
+                av_log(ctx, AV_LOG_ERROR, "multiplanar only supported when buf.length == 1\n");
+                return AVERROR_PATCHWELCOME;
+            }
+            buf_length = buf.m.planes[0].length;
+            buf_offset = buf.m.planes[0].m.mem_offset;
+        } else {
+            buf_length = buf.length;
+            buf_offset = buf.m.offset;
+        }
+
+        s->buf_len[i] = buf_length;
         if (s->frame_size > 0 && s->buf_len[i] < s->frame_size) {
             av_log(ctx, AV_LOG_ERROR,
                    "buf_len[%d] = %d < expected frame size %d\n",
                    i, s->buf_len[i], s->frame_size);
             return AVERROR(ENOMEM);
         }
-        s->buf_start[i] = v4l2_mmap(NULL, buf.length,
+        s->buf_start[i] = v4l2_mmap(NULL, buf_length,
                                PROT_READ | PROT_WRITE, MAP_SHARED,
-                               s->fd, buf.m.offset);
+                               s->fd, buf_offset);
 
         if (s->buf_start[i] == MAP_FAILED) {
             res = AVERROR(errno);
@@ -429,13 +454,16 @@  static int enqueue_buffer(struct video_data *s, struct v4l2_buffer *buf)
 
 static void mmap_release_buffer(void *opaque, uint8_t *data)
 {
+    struct v4l2_plane planes[VIDEO_MAX_PLANES];
     struct v4l2_buffer buf = { 0 };
     struct buff_data *buf_descriptor = opaque;
     struct video_data *s = buf_descriptor->s;
 
-    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+    buf.type = s->buf_type;
     buf.memory = V4L2_MEMORY_MMAP;
     buf.index = buf_descriptor->index;
+    buf.m.planes = s->multiplanar ? planes : NULL;
+    buf.length   = s->multiplanar ? VIDEO_MAX_PLANES : 0;
     av_free(buf_descriptor);
 
     enqueue_buffer(s, &buf);
@@ -505,11 +533,15 @@  static int convert_timestamp(AVFormatContext *ctx, int64_t *ts)
 static int mmap_read_frame(AVFormatContext *ctx, AVPacket *pkt)
 {
     struct video_data *s = ctx->priv_data;
+    struct v4l2_plane planes[VIDEO_MAX_PLANES];
     struct v4l2_buffer buf = {
-        .type   = V4L2_BUF_TYPE_VIDEO_CAPTURE,
-        .memory = V4L2_MEMORY_MMAP
+        .type   = s->buf_type,
+        .memory = V4L2_MEMORY_MMAP,
+        .m.planes = s->multiplanar ? planes : NULL,
+        .length   = s->multiplanar ? VIDEO_MAX_PLANES : 0,
     };
     struct timeval buf_ts;
+    unsigned int bytesused;
     int res;
 
     pkt->size = 0;
@@ -536,11 +568,16 @@  static int mmap_read_frame(AVFormatContext *ctx, AVPacket *pkt)
     // always keep at least one buffer queued
     av_assert0(atomic_load(&s->buffers_queued) >= 1);
 
+    bytesused = s->multiplanar ? buf.m.planes[0].bytesused : buf.bytesused;
+
 #ifdef V4L2_BUF_FLAG_ERROR
     if (buf.flags & V4L2_BUF_FLAG_ERROR) {
         av_log(ctx, AV_LOG_WARNING,
                "Dequeued v4l2 buffer contains corrupted data (%d bytes).\n",
-               buf.bytesused);
+               bytesused);
+        if (s->multiplanar)
+            buf.m.planes[0].bytesused = 0;
+        else
         buf.bytesused = 0;
     } else
 #endif
@@ -548,12 +585,15 @@  static int mmap_read_frame(AVFormatContext *ctx, AVPacket *pkt)
         /* CPIA is a compressed format and we don't know the exact number of bytes
          * used by a frame, so set it here as the driver announces it. */
         if (ctx->video_codec_id == AV_CODEC_ID_CPIA)
-            s->frame_size = buf.bytesused;
+            s->frame_size = bytesused;
 
-        if (s->frame_size > 0 && buf.bytesused != s->frame_size) {
+        if (s->frame_size > 0 && bytesused != s->frame_size) {
             av_log(ctx, AV_LOG_WARNING,
                    "Dequeued v4l2 buffer contains %d bytes, but %d were expected. Flags: 0x%08X.\n",
-                   buf.bytesused, s->frame_size, buf.flags);
+                   bytesused, s->frame_size, buf.flags);
+            if (s->multiplanar)
+                buf.m.planes[0].bytesused = 0;
+            else
             buf.bytesused = 0;
         }
     }
@@ -561,13 +601,13 @@  static int mmap_read_frame(AVFormatContext *ctx, AVPacket *pkt)
     /* Image is at s->buff_start[buf.index] */
     if (atomic_load(&s->buffers_queued) == FFMAX(s->buffers / 8, 1)) {
         /* when we start getting low on queued buffers, fall back on copying data */
-        res = av_new_packet(pkt, buf.bytesused);
+        res = av_new_packet(pkt, bytesused);
         if (res < 0) {
             av_log(ctx, AV_LOG_ERROR, "Error allocating a packet.\n");
             enqueue_buffer(s, &buf);
             return res;
         }
-        memcpy(pkt->data, s->buf_start[buf.index], buf.bytesused);
+        memcpy(pkt->data, s->buf_start[buf.index], bytesused);
 
         res = enqueue_buffer(s, &buf);
         if (res) {
@@ -578,7 +618,7 @@  static int mmap_read_frame(AVFormatContext *ctx, AVPacket *pkt)
         struct buff_data *buf_descriptor;
 
         pkt->data     = s->buf_start[buf.index];
-        pkt->size     = buf.bytesused;
+        pkt->size     = bytesused;
 
         buf_descriptor = av_malloc(sizeof(struct buff_data));
         if (!buf_descriptor) {
@@ -615,10 +655,13 @@  static int mmap_start(AVFormatContext *ctx)
     int i, res;
 
     for (i = 0; i < s->buffers; i++) {
+        struct v4l2_plane planes[VIDEO_MAX_PLANES];
         struct v4l2_buffer buf = {
-            .type   = V4L2_BUF_TYPE_VIDEO_CAPTURE,
+            .type   = s->buf_type,
             .index  = i,
-            .memory = V4L2_MEMORY_MMAP
+            .memory = V4L2_MEMORY_MMAP,
+            .m.planes = s->multiplanar ? planes : NULL,
+            .length   = s->multiplanar ? VIDEO_MAX_PLANES : 0,
         };
 
         if (v4l2_ioctl(s->fd, VIDIOC_QBUF, &buf) < 0) {
@@ -630,7 +673,7 @@  static int mmap_start(AVFormatContext *ctx)
     }
     atomic_store(&s->buffers_queued, s->buffers);
 
-    type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+    type = s->buf_type;
     if (v4l2_ioctl(s->fd, VIDIOC_STREAMON, &type) < 0) {
         res = AVERROR(errno);
         av_log(ctx, AV_LOG_ERROR, "ioctl(VIDIOC_STREAMON): %s\n",
@@ -646,7 +689,7 @@  static void mmap_close(struct video_data *s)
     enum v4l2_buf_type type;
     int i;
 
-    type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+    type = s->buf_type;
     /* We do not check for the result, because we could
      * not do anything about it anyway...
      */
@@ -733,7 +776,7 @@  static int v4l2_set_parameters(AVFormatContext *ctx)
         tpf = &streamparm.parm.capture.timeperframe;
     }
 
-    streamparm.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+    streamparm.type = s->buf_type;
     if (v4l2_ioctl(s->fd, VIDIOC_G_PARM, &streamparm) < 0) {
         ret = AVERROR(errno);
         av_log(ctx, AV_LOG_WARNING, "ioctl(VIDIOC_G_PARM): %s\n", av_err2str(ret));
@@ -921,7 +964,7 @@  static int v4l2_read_header(AVFormatContext *ctx)
     }
 
     if (!s->width && !s->height) {
-        struct v4l2_format fmt = { .type = V4L2_BUF_TYPE_VIDEO_CAPTURE };
+        struct v4l2_format fmt = { .type = s->buf_type };
 
         av_log(ctx, AV_LOG_VERBOSE,
                "Querying the device for the current frame size\n");