diff mbox series

[FFmpeg-devel,v3,3/7] avcodec/mediacodecenc: use bsf to handle crop

Message ID tencent_3B14692E006969427CCEAC8E388DC021B70A@qq.com
State Accepted
Commit e3f2d01d709f35e6e9544d542825eee4ef1d13b5

Commit Message

Zhao Zhili Dec. 9, 2022, 5:22 p.m. UTC
From: Zhao Zhili <zhilizhao@tencent.com>

It's well known that the mediacodec encoder requires 16x16 alignment.
Use our bsf to fix the crop info.
---
v3: don't change the dimensions for AV_PIX_FMT_MEDIACODEC. It can have
side effects.

 configure                  |  2 +
 libavcodec/mediacodecenc.c | 78 +++++++++++++++++++++++++++++++++++---
 2 files changed, 75 insertions(+), 5 deletions(-)

Comments

Tomas Härdin Dec. 12, 2022, 3:27 p.m. UTC | #1
lör 2022-12-10 klockan 01:22 +0800 skrev Zhao Zhili:
> From: Zhao Zhili <zhilizhao@tencent.com>
> 
> It's well known that mediacodec encoder requires 16x16 alignment.
> Use our bsf to fix the crop info.
> ---
> v3: don't change the dimension for AV_PIX_FMT_MEDIACODEC. It can have
> side effect.

Looks like this silently crops? Is that really a good idea? We usually
don't do stuff like that. For example codecs that require even
dimensions complain loudly then fail.

/Tomas
Zhao Zhili Dec. 13, 2022, 3:20 a.m. UTC | #2
> On Dec 12, 2022, at 23:27, Tomas Härdin <git@haerdin.se> wrote:
> 
> lör 2022-12-10 klockan 01:22 +0800 skrev Zhao Zhili:
>> From: Zhao Zhili <zhilizhao@tencent.com>
>> 
>> It's well known that mediacodec encoder requires 16x16 alignment.
>> Use our bsf to fix the crop info.
>> ---
>> v3: don't change the dimension for AV_PIX_FMT_MEDIACODEC. It can have
>> side effect.
> 
> Looks like this silently crops? Is that really a good idea? We usually
> don't do stuff like that. For example codecs that require even
> dimensions complain loudly then fail.

It’s reasonable to require even dimensions. Requiring dimensions aligned
to 16 is uncommon. Everyone will complain that 1080x1920 doesn’t work.

A lot of apps just use aligned dimensions, and users have no control over
those apps. FFmpeg is different: users (developers or not) can specify
the dimensions directly.

If we don’t fix it, we can either:

1. Reject and fail directly. Users will complain and ask why.
2. Accept and keep going. Sometimes it works, sometimes it doesn’t. It
depends on the device, which leaves users in a confusing situation.

I know there are getWidthAlignment()/getHeightAlignment() to get the
alignment info of codecs. The results are unreliable. The only
reliable way I can find is to not depend on those APIs and fix it
ourselves.

I’d like to know if there are better choices.

> 
> /Tomas
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Tomas Härdin Dec. 14, 2022, 5:08 p.m. UTC | #3
tis 2022-12-13 klockan 11:20 +0800 skrev zhilizhao(赵志立):
> 
> 
> > On Dec 12, 2022, at 23:27, Tomas Härdin <git@haerdin.se> wrote:
> > 
> > lör 2022-12-10 klockan 01:22 +0800 skrev Zhao Zhili:
> > > From: Zhao Zhili <zhilizhao@tencent.com>
> > > 
> > > It's well known that mediacodec encoder requires 16x16 alignment.
> > > Use our bsf to fix the crop info.
> > > ---
> > > v3: don't change the dimension for AV_PIX_FMT_MEDIACODEC. It can
> > > have
> > > side effect.
> > 
> > Looks like this silently crops? Is that really a good idea? We
> > usually
> > don't do stuff like that. For example codecs that require even
> > dimensions complain loudly then fail.
> 
> It’s reasonable to require even dimensions. Require dimensions
> aligned
> to 16 is uncommon. Everyone will complain why 1080x1920 doesn’t work.
> 
> A lot of apps just use aligned dimensions. Users have no control on
> these apps. It’s not the same with FFmpeg, users (developer or not)
> can specify the dimension directly.

Wait a sec, I think I was misunderstanding what the code is doing.
FFALIGN rounds *up*. Does this mean you insert fake data in the border
that then gets cropped away, meaning the original essence is still
"there"? That's a different thing and probably perfectly OK.

I think we might want something for this inside lavf somewhere, so that
encoders can signal dimension alignment requirements. Some containers
(MXF, MOV) support such cropping in a codec-agnostic manner.
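
As a concrete illustration of the rounding up and cropping discussed above, here is a minimal sketch that is not taken from the patch. ALIGN_UP mirrors the arithmetic of FFALIGN for power-of-two alignments; struct padded_size and pad_to_16 are hypothetical names used only for this example.

```c
/* Round x up to the next multiple of a (a must be a power of two).
 * Same arithmetic as FFALIGN; renamed here so the example is standalone. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical container for the result, not an FFmpeg type. */
struct padded_size {
    int coded_w;     /* width handed to the encoder */
    int coded_h;     /* height handed to the encoder */
    int crop_right;  /* padding columns to crop away again */
    int crop_bottom; /* padding rows to crop away again */
};

/* Compute the 16-aligned coded size and the crop that restores the
 * original picture. The original samples are all kept; only padding
 * is removed by the crop. */
static struct padded_size pad_to_16(int width, int height)
{
    struct padded_size p;

    p.coded_w     = ALIGN_UP(width, 16);
    p.coded_h     = ALIGN_UP(height, 16);
    p.crop_right  = p.coded_w - width;
    p.crop_bottom = p.coded_h - height;
    return p;
}
```

For 1080x1920 this gives a coded size of 1088x1920 with 8 columns cropped on the right; for 1920x1080 it gives 1920x1088 with 8 rows cropped at the bottom.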

> 
> If we don’t fix it, either:
> 
> 1. Reject and fail directly. User complain why.
> 2. Accept and keep going. Sometimes it works, sometimes don’t. It
> depends on the device and get into a confused situation.
> 
> I know there are getWidthAlignment()/getHeightAlignment() to get
> alignment info of codecs. The results are unreliable. The only
> reliable way I can find is don’t depends on those API and fix it
> by ourself.

Given how temperamental MC seems to be, a "belt and braces" approach
might be appropriate when dealing with it. Telling users (ffmpeg.c is a
user here) that dimensions must be aligned to 16x16 and then
automagically doing the required padding and cropping somewhere (lavf
or ffmpeg.c) feels like a decent solution.

/Tomas
Tomas Härdin Dec. 14, 2022, 5:19 p.m. UTC | #4
ons 2022-12-14 klockan 18:08 +0100 skrev Tomas Härdin:
> tis 2022-12-13 klockan 11:20 +0800 skrev zhilizhao(赵志立):
> > 
> > 
> > > On Dec 12, 2022, at 23:27, Tomas Härdin <git@haerdin.se> wrote:
> > > 
> > > lör 2022-12-10 klockan 01:22 +0800 skrev Zhao Zhili:
> > > > From: Zhao Zhili <zhilizhao@tencent.com>
> > > > 
> > > > It's well known that mediacodec encoder requires 16x16
> > > > alignment.
> > > > Use our bsf to fix the crop info.
> > > > ---
> > > > v3: don't change the dimension for AV_PIX_FMT_MEDIACODEC. It
> > > > can
> > > > have
> > > > side effect.
> > > 
> > > Looks like this silently crops? Is that really a good idea? We
> > > usually
> > > don't do stuff like that. For example codecs that require even
> > > dimensions complain loudly then fail.
> > 
> > It’s reasonable to require even dimensions. Require dimensions
> > aligned
> > to 16 is uncommon. Everyone will complain why 1080x1920 doesn’t
> > work.
> > 
> > A lot of apps just use aligned dimensions. Users have no control on
> > these apps. It’s not the same with FFmpeg, users (developer or not)
> > can specify the dimension directly.
> 
> Wait a sec, I think I was misunderstanding what the code is doing.
> FFALIGN rounds *up*. Does this mean you insert fake data in the
> border
> that then gets cropped away, meaning the original essence is still
> "there"? That's a different thing and probably perfectly OK.
> 
> I think we might want something for this inside lavf somewhere, so
> that
> encoders can signal dimension alignment requirements. Some containers
> (MXF, MOV) support such cropping in a codec-agnostic manner.
> 
> > 
> > If we don’t fix it, either:
> > 
> > 1. Reject and fail directly. User complain why.
> > 2. Accept and keep going. Sometimes it works, sometimes don’t. It
> > depends on the device and get into a confused situation.
> > 
> > I know there are getWidthAlignment()/getHeightAlignment() to get
> > alignment info of codecs. The results are unreliable. The only
> > reliable way I can find is don’t depends on those API and fix it
> > by ourself.
> 
> Given how temperamental MC seems to be a "belt and braces" approach
> might be appropriate when dealing with it. Tell users (ffmpeg.c is a
> user here) that dimensions must be aligned by 16x16 and then
> automagically doing the required padding and cropping somewhere (lavf
> or ffmpeg.c) feels like a decent solution.

Come to think of it this kind of 16x16 requirement is very common and
is already being handled silently: it's the macroblock size for almost
every DCT codec when using 4:2:0 subsampling.

/Tomas
Zhao Zhili Dec. 14, 2022, 5:37 p.m. UTC | #5
On Wed, 2022-12-14 at 18:08 +0100, Tomas Härdin wrote:
> tis 2022-12-13 klockan 11:20 +0800 skrev zhilizhao(赵志立):
> > 
> > > On Dec 12, 2022, at 23:27, Tomas Härdin <git@haerdin.se> wrote:
> > > 
> > > lör 2022-12-10 klockan 01:22 +0800 skrev Zhao Zhili:
> > > > From: Zhao Zhili <zhilizhao@tencent.com>
> > > > 
> > > > It's well known that mediacodec encoder requires 16x16
> > > > alignment.
> > > > Use our bsf to fix the crop info.
> > > > ---
> > > > v3: don't change the dimension for AV_PIX_FMT_MEDIACODEC. It
> > > > can
> > > > have
> > > > side effect.
> > > 
> > > Looks like this silently crops? Is that really a good idea? We
> > > usually
> > > don't do stuff like that. For example codecs that require even
> > > dimensions complain loudly then fail.
> > 
> > It’s reasonable to require even dimensions. Require dimensions
> > aligned
> > to 16 is uncommon. Everyone will complain why 1080x1920 doesn’t
> > work.
> > 
> > A lot of apps just use aligned dimensions. Users have no control on
> > these apps. It’s not the same with FFmpeg, users (developer or not)
> > can specify the dimension directly.
> 
> Wait a sec, I think I was misunderstanding what the code is doing.
> FFALIGN rounds *up*. Does this mean you insert fake data in the
> border
> that then gets cropped away, meaning the original essence is still
> "there"? That's a different thing and probably perfectly OK.

Yes, the dimensions passed to MC are rounded up, then the bsf is used to
remove that border. It depends on our AVFrame data being aligned properly.
This job is supposed to be done by any decent encoder, not by a
wrapper.
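
The patch turns the padding into a bitstream filter description of the form "h264_metadata=crop_right=%d:crop_bottom=%d". A rough standalone sketch of that string-building step follows; build_crop_filter and its return convention are hypothetical and exist only for illustration, while the filter and option names are the ones used in the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<bsf_name>=crop_right=R:crop_bottom=B"
 * into buf.
 * Returns the string length on success, 0 if no crop is needed (buf is
 * left empty), and -1 if the buffer is too small. */
static int build_crop_filter(char *buf, size_t size, const char *bsf_name,
                             int crop_right, int crop_bottom)
{
    int n;

    if (size > 0)
        buf[0] = '\0';

    /* Already aligned: no filter is required. */
    if (!crop_right && !crop_bottom)
        return 0;

    n = snprintf(buf, size, "%s=crop_right=%d:crop_bottom=%d",
                 bsf_name, crop_right, crop_bottom);
    if (n < 0 || (size_t)n >= size)
        return -1; /* output was truncated or formatting failed */

    return n;
}
```

With a 128-byte buffer, bsf_name "h264_metadata" and a crop of 8 columns on the right, the buffer holds "h264_metadata=crop_right=8:crop_bottom=0". In the patch, a string like this is passed to av_bsf_list_parse_str().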

> 
> I think we might want something for this inside lavf somewhere, so
> that
> encoders can signal dimension alignment requirements. Some containers
> (MXF, MOV) support such cropping in a codec-agnostic manner.

From my own experience, dimension mismatches between codec and container
cause a lot of trouble. The ISO base media file format specification
clearly specifies how to crop/scale after decoding; however, I don't think
that has been widely supported, including in FFmpeg. We can fix that inside
FFmpeg, but we should avoid such cases as much as we can.

> 
> > If we don’t fix it, either:
> > 
> > 1. Reject and fail directly. User complain why.
> > 2. Accept and keep going. Sometimes it works, sometimes don’t. It
> > depends on the device and get into a confused situation.
> > 
> > I know there are getWidthAlignment()/getHeightAlignment() to get
> > alignment info of codecs. The results are unreliable. The only
> > reliable way I can find is don’t depends on those API and fix it
> > by ourself.
> 
> Given how temperamental MC seems to be a "belt and braces" approach
> might be appropriate when dealing with it. Tell users (ffmpeg.c is a
> user here) that dimensions must be aligned by 16x16 and then
> automagically doing the required padding and cropping somewhere (lavf
> or ffmpeg.c) feels like a decent solution.
> 
> /Tomas
> 
Zhao Zhili Dec. 14, 2022, 5:43 p.m. UTC | #6
On Thu, 2022-12-15 at 01:37 +0800, Zhao Zhili wrote:
> On Wed, 2022-12-14 at 18:08 +0100, Tomas Härdin wrote:
> > tis 2022-12-13 klockan 11:20 +0800 skrev zhilizhao(赵志立):
> > > > On Dec 12, 2022, at 23:27, Tomas Härdin <git@haerdin.se> wrote:
> > > > 
> > > > lör 2022-12-10 klockan 01:22 +0800 skrev Zhao Zhili:
> > > > > From: Zhao Zhili <zhilizhao@tencent.com>
> > > > > 
> > > > > It's well known that mediacodec encoder requires 16x16
> > > > > alignment.
> > > > > Use our bsf to fix the crop info.
> > > > > ---
> > > > > v3: don't change the dimension for AV_PIX_FMT_MEDIACODEC. It
> > > > > can
> > > > > have
> > > > > side effect.
> > > > 
> > > > Looks like this silently crops? Is that really a good idea? We
> > > > usually
> > > > don't do stuff like that. For example codecs that require even
> > > > dimensions complain loudly then fail.
> > > 
> > > It’s reasonable to require even dimensions. Require dimensions
> > > aligned
> > > to 16 is uncommon. Everyone will complain why 1080x1920 doesn’t
> > > work.
> > > 
> > > A lot of apps just use aligned dimensions. Users have no control
> > > on
> > > these apps. It’s not the same with FFmpeg, users (developer or
> > > not)
> > > can specify the dimension directly.
> > 
> > Wait a sec, I think I was misunderstanding what the code is doing.
> > FFALIGN rounds *up*. Does this mean you insert fake data in the
> > border
> > that then gets cropped away, meaning the original essence is still
> > "there"? That's a different thing and probably perfectly OK.
> 
> Yes, the dimension passed to MC is rounding up, then use bsf to
> remove
> that border. It depends on our AVFrame data has been aligned
> properly.

Actually there's no such dependency.

> This job is supposed to be done by any decent encoder, not by a
> wrapper.
> 
> > I think we might want something for this inside lavf somewhere, so
> > that
> > encoders can signal dimension alignment requirements. Some
> > containers
> > (MXF, MOV) support such cropping in a codec-agnostic manner.
> 
> From my own experience, dimension mismatch between codec and
> container
> makes a lot of trouble. ISO base format specification specified how
> to
> crop/scale after decoding clear, however, I don't think it has been
> widely supported, including FFmpeg. We can fix that inside of FFmpeg,
> but we should avoid such cases as much as we can.
> 
> > > If we don’t fix it, either:
> > > 
> > > 1. Reject and fail directly. User complain why.
> > > 2. Accept and keep going. Sometimes it works, sometimes don’t. It
> > > depends on the device and get into a confused situation.
> > > 
> > > I know there are getWidthAlignment()/getHeightAlignment() to get
> > > alignment info of codecs. The results are unreliable. The only
> > > reliable way I can find is don’t depends on those API and fix it
> > > by ourself.
> > 
> > Given how temperamental MC seems to be a "belt and braces" approach
> > might be appropriate when dealing with it. Tell users (ffmpeg.c is
> > a
> > user here) that dimensions must be aligned by 16x16 and then
> > automagically doing the required padding and cropping somewhere
> > (lavf
> > or ffmpeg.c) feels like a decent solution.
> > 
> > /Tomas
> > 
Tomas Härdin Dec. 20, 2022, 6:24 p.m. UTC | #7
tor 2022-12-15 klockan 01:37 +0800 skrev Zhao Zhili:
> On Wed, 2022-12-14 at 18:08 +0100, Tomas Härdin wrote:
> 
> > 
> > I think we might want something for this inside lavf somewhere, so
> > that
> > encoders can signal dimension alignment requirements. Some
> > containers
> > (MXF, MOV) support such cropping in a codec-agnostic manner.
> 
> From my own experience, dimension mismatch between codec and
> container
> makes a lot of trouble. ISO base format specification specified how
> to
> crop/scale after decoding clear, however, I don't think it has been
> widely supported, including FFmpeg. We can fix that inside of FFmpeg,
> but we should avoid such cases as much as we can.

This is the difference between stored, sampled and display dimensions
in MXF. For example 1080i video has StoredHeight = 544, SampledHeight =
540 and DisplayHeight = 540 (see AS-10). When you add VBLANK and HBLANK
to the mix then all three dimensions are typically different.
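
The 544 figure is consistent with rounding the 540-line field up to a multiple of 16. A small sketch of that arithmetic, under the assumption (not stated in the thread) that the stored height is simply the per-field sampled height aligned to 16; align_up and stored_field_height are hypothetical names.

```c
/* Round value up to the next multiple of alignment (alignment > 0). */
static int align_up(int value, int alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: stored height of one field of an interlaced
 * frame, assuming it is the sampled field height rounded up to 16. */
static int stored_field_height(int frame_height)
{
    int sampled_height = frame_height / 2; /* one field holds half the lines */

    return align_up(sampled_height, 16);
}
```

For a 1080-line interlaced frame the sampled field height is 540 and the stored height comes out as 544, matching the numbers quoted above.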

Anyway, signalling the crop at the NAL level whenever the essence isn't a
multiple of 16x16 is obviously normal. The only complication I can
think of is 4:2:2 and 4:4:4. Does MC require 16x16 in those cases too?
I'd expect 16x8 and 8x8 respectively.

/Tomas
Zhao Zhili Dec. 21, 2022, 7:17 a.m. UTC | #8
> On Dec 21, 2022, at 02:24, Tomas Härdin <git@haerdin.se> wrote:
> 
> tor 2022-12-15 klockan 01:37 +0800 skrev Zhao Zhili:
>> On Wed, 2022-12-14 at 18:08 +0100, Tomas Härdin wrote:
>> 
>>> 
>>> I think we might want something for this inside lavf somewhere, so
>>> that
>>> encoders can signal dimension alignment requirements. Some
>>> containers
>>> (MXF, MOV) support such cropping in a codec-agnostic manner.
>> 
>> From my own experience, dimension mismatch between codec and
>> container
>> makes a lot of trouble. ISO base format specification specified how
>> to
>> crop/scale after decoding clear, however, I don't think it has been
>> widely supported, including FFmpeg. We can fix that inside of FFmpeg,
>> but we should avoid such cases as much as we can.
> 
> This is the difference between stored, sampled and display dimensions
> in MXF. For example 1080i video has StoredHeight = 544, SampledHeight =
> 540 and DisplayHeight = 540 (see AS-10). When you add VBLANK and HBLANK
> to the mix then all three dimensions are typically different.
> 
> Anyway specifying at the NAL level whenever the essence isn't a
> multiple of 16x16 is obviously normal. The only complication I can
> think of is 4:2:2 and 4:4:4. Does MC require 16x16 also in those cases?
> I'd expect 16x8 and 8x8 respectively.

It’s still 16x16. From the H.264 specification:

macroblock: A 16x16 block of luma samples and two corresponding blocks of
chroma samples of a picture that has three sample arrays, or a 16x16 block
of samples of a monochrome picture or a picture that is coded using three
separate colour planes.

In H.265 the macroblock has been replaced by the coding tree unit, which
can be between 16×16 and 64×64 pixels in size.
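
To show why the coded size ends up as a multiple of the block size, here is a minimal sketch that counts how many blocks are needed to cover a picture. blocks_across and coded_size are hypothetical names; the block size would be 16 for an H.264 macroblock or 16 to 64 for an H.265 coding tree unit.

```c
/* Number of blocks of block_size samples needed to cover `samples`
 * luma samples in one direction (ceiling division). */
static int blocks_across(int samples, int block_size)
{
    return (samples + block_size - 1) / block_size;
}

/* Coded dimension in samples once the last partial block is padded
 * out to a full block. */
static int coded_size(int samples, int block_size)
{
    return blocks_across(samples, block_size) * block_size;
}
```

A 1920x1080 picture needs 120x68 blocks of 16x16, so its coded height is 1088 and the bottom 8 rows are padding. With 64x64 blocks it needs 30x17 blocks, which also gives a coded height of 1088.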

> 
> /Tomas
> 
Tomas Härdin Dec. 21, 2022, 10:06 a.m. UTC | #9
ons 2022-12-21 klockan 15:17 +0800 skrev zhilizhao(赵志立):
> 
> 
> > On Dec 21, 2022, at 02:24, Tomas Härdin <git@haerdin.se> wrote:
> > 
> > tor 2022-12-15 klockan 01:37 +0800 skrev Zhao Zhili:
> > > On Wed, 2022-12-14 at 18:08 +0100, Tomas Härdin wrote:
> > > 
> > > > 
> > > > I think we might want something for this inside lavf somewhere,
> > > > so
> > > > that
> > > > encoders can signal dimension alignment requirements. Some
> > > > containers
> > > > (MXF, MOV) support such cropping in a codec-agnostic manner.
> > > 
> > > From my own experience, dimension mismatch between codec and
> > > container
> > > makes a lot of trouble. ISO base format specification specified
> > > how
> > > to
> > > crop/scale after decoding clear, however, I don't think it has
> > > been
> > > widely supported, including FFmpeg. We can fix that inside of
> > > FFmpeg,
> > > but we should avoid such cases as much as we can.
> > 
> > This is the difference between stored, sampled and display
> > dimensions
> > in MXF. For example 1080i video has StoredHeight = 544,
> > SampledHeight =
> > 540 and DisplayHeight = 540 (see AS-10). When you add VBLANK and
> > HBLANK
> > to the mix then all three dimensions are typically different.
> > 
> > Anyway specifying at the NAL level whenever the essence isn't a
> > multiple of 16x16 is obviously normal. The only complication I can
> > think of is 4:2:2 and 4:4:4. Does MC require 16x16 also in those
> > cases?
> > I'd expect 16x8 and 8x8 respectively.
> 
> It’s still 16x16. From H.264 specification:

I stand corrected. I did give the spec a once-over a while back, but
not everything sticks.

/Tomas

Patch

diff --git a/configure b/configure
index f4eedfc207..2180ebb4f1 100755
--- a/configure
+++ b/configure
@@ -3169,6 +3169,7 @@  h264_mediacodec_decoder_extralibs="-landroid"
 h264_mediacodec_decoder_select="h264_mp4toannexb_bsf h264_parser"
 h264_mediacodec_encoder_deps="mediacodec"
 h264_mediacodec_encoder_extralibs="-landroid"
+h264_mediacodec_encoder_select="h264_metadata"
 h264_mf_encoder_deps="mediafoundation"
 h264_mmal_decoder_deps="mmal"
 h264_nvenc_encoder_deps="nvenc"
@@ -3190,6 +3191,7 @@  hevc_mediacodec_decoder_extralibs="-landroid"
 hevc_mediacodec_decoder_select="hevc_mp4toannexb_bsf hevc_parser"
 hevc_mediacodec_encoder_deps="mediacodec"
 hevc_mediacodec_encoder_extralibs="-landroid"
+hevc_mediacodec_encoder_select="hevc_metadata"
 hevc_mf_encoder_deps="mediafoundation"
 hevc_nvenc_encoder_deps="nvenc"
 hevc_nvenc_encoder_select="atsc_a53"
diff --git a/libavcodec/mediacodecenc.c b/libavcodec/mediacodecenc.c
index 2f78567451..4e8716e3a5 100644
--- a/libavcodec/mediacodecenc.c
+++ b/libavcodec/mediacodecenc.c
@@ -28,6 +28,7 @@ 
 #include "libavutil/opt.h"
 
 #include "avcodec.h"
+#include "bsf.h"
 #include "codec_internal.h"
 #include "encode.h"
 #include "hwconfig.h"
@@ -78,6 +79,7 @@  typedef struct MediaCodecEncContext {
     int eof_sent;
 
     AVFrame *frame;
+    AVBSFContext *bsf;
 
     int bitrate_mode;
     int level;
@@ -119,6 +121,42 @@  static void mediacodec_output_format(AVCodecContext *avctx)
     ff_AMediaFormat_delete(out_format);
 }
 
+static int mediacodec_init_bsf(AVCodecContext *avctx)
+{
+    MediaCodecEncContext *s = avctx->priv_data;
+    char str[128];
+    int ret;
+    int crop_right = s->width - avctx->width;
+    int crop_bottom = s->height - avctx->height;
+
+    if (!crop_right && !crop_bottom)
+        return 0;
+
+    if (avctx->codec_id == AV_CODEC_ID_H264)
+        ret = snprintf(str, sizeof(str), "h264_metadata=crop_right=%d:crop_bottom=%d",
+                 crop_right, crop_bottom);
+    else if (avctx->codec_id == AV_CODEC_ID_HEVC)
+        ret = snprintf(str, sizeof(str), "hevc_metadata=crop_right=%d:crop_bottom=%d",
+                 crop_right, crop_bottom);
+    else
+        return 0;
+
+    if (ret >= sizeof(str))
+        return AVERROR_BUFFER_TOO_SMALL;
+
+    ret = av_bsf_list_parse_str(str, &s->bsf);
+    if (ret < 0)
+        return ret;
+
+    ret = avcodec_parameters_from_context(s->bsf->par_in, avctx);
+    if (ret < 0)
+        return ret;
+    s->bsf->time_base_in = avctx->time_base;
+    ret = av_bsf_init(s->bsf);
+
+    return ret;
+}
+
 static av_cold int mediacodec_init(AVCodecContext *avctx)
 {
     const char *codec_mime = NULL;
@@ -158,8 +196,19 @@  static av_cold int mediacodec_init(AVCodecContext *avctx)
     }
 
     ff_AMediaFormat_setString(format, "mime", codec_mime);
-    s->width = FFALIGN(avctx->width, 16);
-    s->height = avctx->height;
+    // Workaround the alignment requirement of mediacodec. We can't do it
+    // silently for AV_PIX_FMT_MEDIACODEC.
+    if (avctx->pix_fmt != AV_PIX_FMT_MEDIACODEC) {
+        s->width = FFALIGN(avctx->width, 16);
+        s->height = FFALIGN(avctx->height, 16);
+    } else {
+        s->width = avctx->width;
+        s->height = avctx->height;
+        if (s->width % 16 || s->height % 16)
+            av_log(avctx, AV_LOG_WARNING,
+                    "Video size %dx%d isn't align to 16, it may have device compatibility issue\n",
+                    s->width, s->height);
+    }
     ff_AMediaFormat_setInt32(format, "width", s->width);
     ff_AMediaFormat_setInt32(format, "height", s->height);
 
@@ -252,6 +301,10 @@  static av_cold int mediacodec_init(AVCodecContext *avctx)
         goto bailout;
     }
 
+    ret = mediacodec_init_bsf(avctx);
+    if (ret)
+        goto bailout;
+
     mediacodec_output_format(avctx);
 
     s->frame = av_frame_alloc();
@@ -444,10 +497,24 @@  static int mediacodec_encode(AVCodecContext *avctx, AVPacket *pkt)
     // 2. Got a packet success
     // 3. No AVFrame is available yet (don't return if get_frame return EOF)
     while (1) {
+        if (s->bsf) {
+            ret = av_bsf_receive_packet(s->bsf, pkt);
+            if (!ret)
+                return 0;
+            if (ret != AVERROR(EAGAIN))
+                return ret;
+        }
+
         ret = mediacodec_receive(avctx, pkt, &got_packet);
-        if (!ret)
-            return 0;
-        else if (ret != AVERROR(EAGAIN))
+        if (s->bsf) {
+            if (!ret || ret == AVERROR_EOF)
+                ret = av_bsf_send_packet(s->bsf, pkt);
+        } else {
+            if (!ret)
+                return 0;
+        }
+
+        if (ret != AVERROR(EAGAIN))
             return ret;
 
         if (!s->frame->buf[0]) {
@@ -480,6 +547,7 @@  static av_cold int mediacodec_close(AVCodecContext *avctx)
         s->window = NULL;
     }
 
+    av_bsf_free(&s->bsf);
     av_frame_free(&s->frame);
 
     return 0;