Message ID | 20180514212950.15764-1-cus@passwd.hu |
---|---|
State | New |
Headers | show |
On 14.05.2018 23:29, Marton Balint wrote: > Regression since ece068a771ac3f725e854c681ecbef08e792addc. > > Signed-off-by: Marton Balint <cus@passwd.hu> > --- > libavutil/hwcontext_cuda.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/libavutil/hwcontext_cuda.c b/libavutil/hwcontext_cuda.c > index cb5d15c9d9..e16d0a2b4b 100644 > --- a/libavutil/hwcontext_cuda.c > +++ b/libavutil/hwcontext_cuda.c > @@ -159,10 +159,13 @@ static int cuda_get_buffer(AVHWFramesContext *ctx, AVFrame *frame) > return res; > > // YUV420P is a special case. > - // Nvenc expects the U/V planes in swapped order from how ffmpeg expects them. > + // Nvenc expects the U/V planes in swapped order from how ffmpeg expects them, also chroma is half-aligned > if (ctx->sw_format == AV_PIX_FMT_YUV420P) { > - FFSWAP(uint8_t*, frame->data[1], frame->data[2]); > - FFSWAP(int, frame->linesize[1], frame->linesize[2]); > + int aligned_width = FFALIGN(ctx->width, CUDA_FRAME_ALIGNMENT); > + frame->data[2] = frame->data[0] + aligned_width * ctx->height; > + frame->data[1] = frame->data[2] + aligned_width * ctx->height / 4; > + frame->linesize[1] = aligned_width / 2; > + frame->linesize[2] = aligned_width / 2; > } Are you sure it isn't already that way? I tested specially YUV420P and could not make out any issues. And if this is really necessary, please don't use the old aligned_width term, it's incredibly confusing and made the old code a mess. linesize[0] should be identical to it if I'm not mistaken. So something like frame->linesize[1] = frame->linesize[2] = frame->linesize[0] / 2; frame->data[2] = frame->data[1]; frame->data[1] = frame->data[2] + frame->linesize[2] * ctx->height / 2;
On Tue, 15 May 2018, Timo Rothenpieler wrote: > On 14.05.2018 23:29, Marton Balint wrote: >> Regression since ece068a771ac3f725e854c681ecbef08e792addc. >> >> Signed-off-by: Marton Balint <cus@passwd.hu> >> --- >> libavutil/hwcontext_cuda.c | 9 ++++++--- >> 1 file changed, 6 insertions(+), 3 deletions(-) >> >> diff --git a/libavutil/hwcontext_cuda.c b/libavutil/hwcontext_cuda.c >> index cb5d15c9d9..e16d0a2b4b 100644 >> --- a/libavutil/hwcontext_cuda.c >> +++ b/libavutil/hwcontext_cuda.c >> @@ -159,10 +159,13 @@ static int cuda_get_buffer(AVHWFramesContext *ctx, AVFrame *frame) >> return res; >> >> // YUV420P is a special case. >> - // Nvenc expects the U/V planes in swapped order from how ffmpeg expects them. >> + // Nvenc expects the U/V planes in swapped order from how ffmpeg expects them, also chroma is half-aligned >> if (ctx->sw_format == AV_PIX_FMT_YUV420P) { >> - FFSWAP(uint8_t*, frame->data[1], frame->data[2]); >> - FFSWAP(int, frame->linesize[1], frame->linesize[2]); >> + int aligned_width = FFALIGN(ctx->width, CUDA_FRAME_ALIGNMENT); >> + frame->data[2] = frame->data[0] + aligned_width * ctx->height; >> + frame->data[1] = frame->data[2] + aligned_width * ctx->height / 4; >> + frame->linesize[1] = aligned_width / 2; >> + frame->linesize[2] = aligned_width / 2; >> } > > Are you sure it isn't already that way? I tested specially YUV420P and > could not make out any issues. For 4K content linesize[0] is 3840, but linesize[1] becomes 2048 instead of 1920. > > And if this is really necessary, please don't use the old aligned_width > term, it's incredibly confusing and made the old code a mess. > linesize[0] should be identical to it if I'm not mistaken. > > So something like > > frame->linesize[1] = frame->linesize[2] = frame->linesize[0] / 2; > frame->data[2] = frame->data[1]; > frame->data[1] = frame->data[2] + frame->linesize[2] * ctx->height / 2; Ok, will use this. Regards, Marton
On 15.05.2018 13:37, Marton Balint wrote: >> Are you sure it isn't already that way? I tested specially YUV420P and >> could not make out any issues. > > For 4K content linesize[0] is 3840, but linesize[1] becomes 2048 instead > of 1920. That's odd, 2048 seems more correct since it's a multiple of the recommended 2048. But I guess the whole yuv420p in nvenc is some weird hack on the driver side. >> >> So something like >> >> frame->linesize[1] = frame->linesize[2] = frame->linesize[0] / 2; >> frame->data[2] = frame->data[1]; >> frame->data[1] = frame->data[2] + frame->linesize[2] * ctx->height >> / 2; > > Ok, will use this. Feel free to push if it works. Don't have access to an nvidia machine right now, so can't properly test.
On Tue, 15 May 2018, Timo Rothenpieler wrote: > On 15.05.2018 13:37, Marton Balint wrote: >>> Are you sure it isn't already that way? I tested specially YUV420P and >>> could not make out any issues. >> >> For 4K content linesize[0] is 3840, but linesize[1] becomes 2048 instead >> of 1920. > > That's odd, 2048 seems more correct since it's a multiple of the > recommended 2048. But I guess the whole yuv420p in nvenc is some weird > hack on the driver side. > >>> >>> So something like >>> >>> frame->linesize[1] = frame->linesize[2] = frame->linesize[0] / 2; >>> frame->data[2] = frame->data[1]; >>> frame->data[1] = frame->data[2] + frame->linesize[2] * ctx->height >>> / 2; >> >> Ok, will use this. > > Feel free to push if it works. Don't have access to an nvidia machine > right now, so can't properly test. Ok, pushed. Thanks, Marton
diff --git a/libavutil/hwcontext_cuda.c b/libavutil/hwcontext_cuda.c index cb5d15c9d9..e16d0a2b4b 100644 --- a/libavutil/hwcontext_cuda.c +++ b/libavutil/hwcontext_cuda.c @@ -159,10 +159,13 @@ static int cuda_get_buffer(AVHWFramesContext *ctx, AVFrame *frame) return res; // YUV420P is a special case. - // Nvenc expects the U/V planes in swapped order from how ffmpeg expects them. + // Nvenc expects the U/V planes in swapped order from how ffmpeg expects them, also chroma is half-aligned if (ctx->sw_format == AV_PIX_FMT_YUV420P) { - FFSWAP(uint8_t*, frame->data[1], frame->data[2]); - FFSWAP(int, frame->linesize[1], frame->linesize[2]); + int aligned_width = FFALIGN(ctx->width, CUDA_FRAME_ALIGNMENT); + frame->data[2] = frame->data[0] + aligned_width * ctx->height; + frame->data[1] = frame->data[2] + aligned_width * ctx->height / 4; + frame->linesize[1] = aligned_width / 2; + frame->linesize[2] = aligned_width / 2; } frame->format = AV_PIX_FMT_CUDA;
Regression since ece068a771ac3f725e854c681ecbef08e792addc. Signed-off-by: Marton Balint <cus@passwd.hu> --- libavutil/hwcontext_cuda.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)