[FFmpeg-devel,4/7] Adds gray floating-point pixel formats.

Message ID CAAeE=qoX4-tWKdi7xS7g=edodxaFEE38XN8mhJG_Y-P+EXsh-A@mail.gmail.com
State Superseded
Commit Message

Sergey Lavrushkin Aug. 13, 2018, 1:58 p.m. UTC
>
> Just use av_clipf instead of FFMIN/FFMAX.


Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
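
A minimal sketch of the "scale, round, then clamp" pattern this revision describes. The helpers clip_uint8, clip_uint16 and round_to_long are hypothetical stand-ins for av_clip_uint8, av_clip_uint16 and lrintf, written out so the snippet builds without FFmpeg headers. The stand-in rounds halves away from zero, which is an assumption of this sketch and not necessarily what lrintf does.

```c
#include <stdint.h>

/* Hypothetical stand-in for lrintf(): round to nearest, halves away from zero.
 * Assumes |v| is small enough to fit in a long. */
static long round_to_long(float v)
{
    return (long)(v < 0.0f ? v - 0.5f : v + 0.5f);
}

/* Hypothetical stand-in for av_clip_uint8(): clamp to [0, 255]. */
static uint8_t clip_uint8(long v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Hypothetical stand-in for av_clip_uint16(): clamp to [0, 65535]. */
static uint16_t clip_uint16(long v)
{
    if (v < 0)
        return 0;
    if (v > 65535)
        return 65535;
    return (uint16_t)v;
}

/* Convert a nominally [0, 1] float sample to 8-bit gray. */
static uint8_t float_to_gray8(float x)
{
    return clip_uint8(round_to_long(255.0f * x));
}

/* Convert a nominally [0, 1] float sample to 16-bit gray. */
static uint16_t float_to_gray16(float x)
{
    return clip_uint16(round_to_long(65535.0f * x));
}
```

With these helpers, out-of-range inputs such as -0.5f or 2.0f saturate at the ends of the integer range instead of wrapping.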

Comments

Michael Niedermayer Aug. 14, 2018, 4:23 p.m. UTC | #1
On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> >
> > Just use av_clipf instead of FFMIN/FFMAX.
> 
> 
> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.

will apply

thanks

[...]
James Almer Aug. 17, 2018, 3:46 a.m. UTC | #2
On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
>>>
>>> Just use av_clipf instead of FFMIN/FFMAX.
>>
>>
>> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> 
> will apply
> 
> thanks

This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
tested for bitexact output. The gbrpf32 ones aren't, for example.
http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx

Was a float gray pixfmt needed for this filter? Gray16 was not an option?
Sergey Lavrushkin Aug. 17, 2018, 12:21 p.m. UTC | #3
Fri, Aug 17, 2018, 6:47 James Almer <jamrial@gmail.com>:

> On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> >>>
> >>> Just use av_clipf instead of FFMIN/FFMAX.
> >>
> >>
> >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> >
> > will apply
> >
> > thanks
>
> This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> tested for bitexact output. The gbrpf32 ones aren't, for example.
>
> http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx


If I am not mistaken, gbrpf32 formats are not supported in libswscale and
not tested because of that.

>
> Was a float gray pixfmt needed for this filter? Gray16 was not an option?
>

All calculations in the neural network are done using floats.

What can I do to fix this issue? Can I get a VM image for this host to test
it?

>
Carl Eugen Hoyos Aug. 17, 2018, 12:52 p.m. UTC | #4
2018-08-17 14:21 GMT+02:00, Sergey Lavrushkin <dualfal@gmail.com>:
> Fri, Aug 17, 2018, 6:47 James Almer <jamrial@gmail.com>:
>
>> On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
>> > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
>> >>>
>> >>> Just use av_clipf instead of FFMIN/FFMAX.
>> >>
>> >>
>> >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
>> >
>> > will apply
>> >
>> > thanks
>>
>> This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
>> tested for bitexact output. The gbrpf32 ones aren't, for example.
>>
>> http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
>
>
> If I am not mistaken, gbrpf32 formats are not supported in libswscale
> [because of that]

I sincerely hope that this is not true...

> and not tested because of that.
>
>>
>> Was a float gray pixfmt needed for this filter? Gray16 was not an option?
>>
>
> All calculations in the neural network are done using floats.
>
> What can I do to fix this issue?

Reverting to doing the conversion in the filter comes to mind...

Carl Eugen
Michael Niedermayer Aug. 17, 2018, 8:28 p.m. UTC | #5
On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote:
> On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> >>>
> >>> Just use av_clipf instead of FFMIN/FFMAX.
> >>
> >>
> >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> > 
> > will apply
> > 
> > thanks
> 
> This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> tested for bitexact output. The gbrpf32 ones aren't, for example.
> http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx

Hmm, I remember testing this locally on 32-bit.
Can something be slightly adjusted (like an offset or factor) to keep any
values from landing close to n + 0.5 and rounding differently across platforms?
If not, the tests should skip float pixel formats or try the nearest-neighbor
scaler.
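
To illustrate the concern about values near n + 0.5, here is a small sketch that flags a scaled sample whose rounded result flips under a tiny perturbation. The names round_half_up and is_rounding_fragile are hypothetical, and the relative perturbation of FLT_EPSILON is only an assumed stand-in for whatever platform difference is actually involved, which the thread does not establish.

```c
#include <float.h>

/* Hypothetical rounding helper for non-negative values: round half up. */
static long round_half_up(float v)
{
    return (long)(v + 0.5f);
}

/* Returns 1 if nudging the already scaled value by a relative FLT_EPSILON
 * in either direction changes the rounded integer, 0 otherwise. */
static int is_rounding_fragile(float scaled)
{
    float lo = scaled * (1.0f - FLT_EPSILON);
    float hi = scaled * (1.0f + FLT_EPSILON);

    return round_half_up(lo) != round_half_up(hi);
}
```

A value such as 127.5f is reported as fragile, while 127.25f is not; a test input that avoids fragile values would be less sensitive to last-bit differences between hosts.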

Sergey, can you look into this (it's your patch)? (Just asking to make sure
not everyone thinks someone else will work on this.)

thx


[...]
Sergey Lavrushkin Aug. 18, 2018, 11:10 a.m. UTC | #6
2018-08-17 23:28 GMT+03:00 Michael Niedermayer <michael@niedermayer.cc>:

> On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote:
> > On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> > >>>
> > >>> Just use av_clipf instead of FFMIN/FFMAX.
> > >>
> > >>
> > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> > >
> > > will apply
> > >
> > > thanks
> >
> > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> > tested for bitexact output. The gbrpf32 ones aren't, for example.
> > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
>
> Hmm, I remember testing this locally on 32-bit.
> Can something be slightly adjusted (like an offset or factor) to keep any
> values from landing close to n + 0.5 and rounding differently across platforms?
> If not, the tests should skip float pixel formats or try the nearest-neighbor
> scaler.
>

Can it really be a problem with the scaler? Do all of these failed tests use
scaling? Isn't the problem that different platforms can give slightly
different results for floating-point operations? Is the input for the float
format somehow generated for these tests, so that the input conversion is
tested? Maybe it uses the output conversion first?
If the problem is that floating-point operations give different results on
different platforms, maybe it is possible to use a precomputed LUT for the
output conversion, so that it gives the same results? Or is it possible to
modify the tests for the float format so that they check whether the pixels
of the result are just close to some reference?
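
The LUT idea can be sketched for the 8-bit case as follows. It follows the same multiply-by-1/255 table fill that the patch uses for uint2float_lut, but the function names build_uint8_to_float_lut and gray8_to_grayf32 are hypothetical. Whether a table actually gives identical output on every host depends on how its entries are produced, so this is an illustration rather than a guaranteed fix.

```c
#include <stdint.h>

/* Fill a 256-entry table mapping an 8-bit gray level to a float in [0, 1]. */
static void build_uint8_to_float_lut(float lut[256])
{
    const float float_mult = 1.0f / 255.0f;
    int i;

    for (i = 0; i < 256; ++i)
        lut[i] = (float)i * float_mult;
}

/* Convert one row of 8-bit gray samples to float using the table. */
static void gray8_to_grayf32(const uint8_t *src, float *dst, int width,
                             const float lut[256])
{
    int x;

    for (x = 0; x < width; ++x)
        dst[x] = lut[src[x]];
}
```

After the table is built once, the per-pixel work is a single lookup, so no floating-point arithmetic happens in the inner loop.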


> Sergey, can you look into this (it's your patch)? (Just asking to make sure
> not everyone thinks someone else will work on this.)
>

Yes, I can; I just need to know what can be done to fix this issue
besides skipping the tests.
James Almer Aug. 18, 2018, 12:59 p.m. UTC | #7
On 8/17/2018 9:52 AM, Carl Eugen Hoyos wrote:
> 2018-08-17 14:21 GMT+02:00, Sergey Lavrushkin <dualfal@gmail.com>:
>> Fri, Aug 17, 2018, 6:47 James Almer <jamrial@gmail.com>:
>>
>>> On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
>>>> On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
>>>>>>
>>>>>> Just use av_clipf instead of FFMIN/FFMAX.
>>>>>
>>>>>
>>>>> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
>>>>
>>>> will apply
>>>>
>>>> thanks
>>>
>>> This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
>>> tested for bitexact output. The gbrpf32 ones aren't, for example.
>>>
>>> http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
>>
>>
>> If I am not mistaken, gbrpf32 formats are not supported in libswscale
>> [because of that]
> 
> I sincerely hope that this is not true...
> 
>> and not tested because of that.
>>
>>>
>>> Was a float gray pixfmt needed for this filter? Gray16 was not an option?
>>>
>>
>> All calculations in the neural network are done using floats.
>>
>> What can I do to fix this issue?
> 
> Reverting to doing the conversion in the filter comes to mind...
> 
> Carl Eugen

We asked him to remove the conversions from the filter, so we're not
going to tell him to roll everything back...
Michael Niedermayer Aug. 18, 2018, 8:20 p.m. UTC | #8
On Sat, Aug 18, 2018 at 02:10:21PM +0300, Sergey Lavrushkin wrote:
> 2018-08-17 23:28 GMT+03:00 Michael Niedermayer <michael@niedermayer.cc>:
> 
> > On Fri, Aug 17, 2018 at 12:46:52AM -0300, James Almer wrote:
> > > On 8/14/2018 1:23 PM, Michael Niedermayer wrote:
> > > > On Mon, Aug 13, 2018 at 04:58:42PM +0300, Sergey Lavrushkin wrote:
> > > >>>
> > > >>> Just use av_clipf instead of FFMIN/FFMAX.
> > > >>
> > > >>
> > > >> Changed FFMIN/FFMAX to av_clip_uint16 and av_clip_uint8.
> > > >
> > > > will apply
> > > >
> > > > thanks
> > >
> > > This broke fate on some 32bit hosts. Guess float pixfmts shouldn't be
> > > tested for bitexact output. The gbrpf32 ones aren't, for example.
> > > http://fate.ffmpeg.org/report.cgi?time=20180816131312&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
> >
> > Hmm, I remember testing this locally on 32-bit.
> > Can something be slightly adjusted (like an offset or factor) to keep any
> > values from landing close to n + 0.5 and rounding differently across platforms?
> > If not, the tests should skip float pixel formats or try the nearest-neighbor
> > scaler.
> >
> 
> Can it really be a problem with the scaler? Do all of these failed tests
> use scaling? Isn't the problem that different platforms can give slightly
> different results for floating-point operations? Is the input for the
> float format somehow generated for these tests, so that the input
> conversion is tested? Maybe it uses the output conversion first?
> If the problem is that floating-point operations give different results
> on different platforms,

> maybe it is possible to use a precomputed LUT for the output conversion, so it

I don't think we should change the "algorithm" to achieve "bitexactness".
We could, of course, but it feels like the wrong reason to make such a
change. How it's done should be chosen based on what is fast (and, to a
lesser extent, clean, simple and maintainable).



> will give
> the same results? Or is it possible to modify the tests for the float
> format so that they check whether the pixels of the result are just
> close to some reference?

It's possible to compare to a reference, we do this in some other tests,
but that's surely more work than just disabling the specific tests or trying
to nudge them a little to see if that keeps anything from falling too close
to n + 0.5.
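
For comparison, a reference check with a tolerance could look roughly like the sketch below. The function pixels_close is hypothetical and is not how the existing reference-comparing tests are implemented; the tolerance is left to the caller, and one 16-bit step (1.0f / 65535.0f) is just an assumed example value.

```c
#include <stddef.h>

/* Returns 1 if every sample of out is within tol of the matching sample
 * of ref, 0 otherwise. A NaN difference counts as a mismatch. */
static int pixels_close(const float *out, const float *ref, size_t n, float tol)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        float d = out[i] - ref[i];

        if (d < 0.0f)
            d = -d;
        if (!(d <= tol))
            return 0;
    }
    return 1;
}
```

Unlike a checksum over the output bytes, this accepts results that differ only in the last bits, at the cost of having to store and read reference pixels.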

> 
> 
> > Sergey, can you look into this (it's your patch)? (Just asking to make sure
> > not everyone thinks someone else will work on this.)
> >
> 
> Yes, I can; I just need to know what can be done to fix this issue
> besides skipping the tests.

most things are possible



Patch

From 210e497d76328947fdf424b169728fa728cc18f2 Mon Sep 17 00:00:00 2001
From: Sergey Lavrushkin <dualfal@gmail.com>
Date: Fri, 3 Aug 2018 18:06:50 +0300
Subject: [PATCH 5/9] libswscale: Adds conversions from/to float gray format.

---
 libswscale/input.c                       |  38 +++++++++++
 libswscale/output.c                      | 105 +++++++++++++++++++++++++++++++
 libswscale/ppc/swscale_altivec.c         |   1 +
 libswscale/swscale_internal.h            |   9 +++
 libswscale/swscale_unscaled.c            |  54 +++++++++++++++-
 libswscale/utils.c                       |  20 +++++-
 libswscale/x86/swscale_template.c        |   3 +-
 tests/ref/fate/filter-pixdesc-grayf32be  |   1 +
 tests/ref/fate/filter-pixdesc-grayf32le  |   1 +
 tests/ref/fate/filter-pixfmts-copy       |   2 +
 tests/ref/fate/filter-pixfmts-crop       |   2 +
 tests/ref/fate/filter-pixfmts-field      |   2 +
 tests/ref/fate/filter-pixfmts-fieldorder |   2 +
 tests/ref/fate/filter-pixfmts-hflip      |   2 +
 tests/ref/fate/filter-pixfmts-il         |   2 +
 tests/ref/fate/filter-pixfmts-null       |   2 +
 tests/ref/fate/filter-pixfmts-scale      |   2 +
 tests/ref/fate/filter-pixfmts-transpose  |   2 +
 tests/ref/fate/filter-pixfmts-vflip      |   2 +
 19 files changed, 248 insertions(+), 4 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-grayf32be
 create mode 100644 tests/ref/fate/filter-pixdesc-grayf32le

diff --git a/libswscale/input.c b/libswscale/input.c
index 3fd3a5d81e..4099c19c2b 100644
--- a/libswscale/input.c
+++ b/libswscale/input.c
@@ -942,6 +942,30 @@  static av_always_inline void planar_rgb16_to_uv(uint8_t *_dstU, uint8_t *_dstV,
 }
 #undef rdpx
 
+static av_always_inline void grayf32ToY16_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
+                                            const uint8_t *unused2, int width, uint32_t *unused)
+{
+    int i;
+    const float *src = (const float *)_src;
+    uint16_t *dst    = (uint16_t *)_dst;
+
+    for (i = 0; i < width; ++i){
+        dst[i] = av_clip_uint16(lrintf(65535.0f * src[i]));
+    }
+}
+
+static av_always_inline void grayf32ToY16_bswap_c(uint8_t *_dst, const uint8_t *_src, const uint8_t *unused1,
+                                                  const uint8_t *unused2, int width, uint32_t *unused)
+{
+    int i;
+    const uint32_t *src = (const uint32_t *)_src;
+    uint16_t *dst    = (uint16_t *)_dst;
+
+    for (i = 0; i < width; ++i){
+        dst[i] = av_clip_uint16(lrintf(65535.0f * av_int2float(av_bswap32(src[i]))));
+    }
+}
+
 #define rgb9plus_planar_funcs_endian(nbits, endian_name, endian)                                    \
 static void planar_rgb##nbits##endian_name##_to_y(uint8_t *dst, const uint8_t *src[4],              \
                                                   int w, int32_t *rgb2yuv)                          \
@@ -1538,6 +1562,20 @@  av_cold void ff_sws_init_input_funcs(SwsContext *c)
     case AV_PIX_FMT_P010BE:
         c->lumToYV12 = p010BEToY_c;
         break;
+    case AV_PIX_FMT_GRAYF32LE:
+#if HAVE_BIGENDIAN
+        c->lumToYV12 = grayf32ToY16_bswap_c;
+#else
+        c->lumToYV12 = grayf32ToY16_c;
+#endif
+        break;
+    case AV_PIX_FMT_GRAYF32BE:
+#if HAVE_BIGENDIAN
+        c->lumToYV12 = grayf32ToY16_c;
+#else
+        c->lumToYV12 = grayf32ToY16_bswap_c;
+#endif
+        break;
     }
     if (c->needAlpha) {
         if (is16BPS(srcFormat) || isNBPS(srcFormat)) {
diff --git a/libswscale/output.c b/libswscale/output.c
index 0af2fffea4..de8637aa3b 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -208,6 +208,105 @@  static void yuv2p016cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS
     }
 }
 
+static av_always_inline void
+yuv2plane1_float_c_template(const int32_t *src, float *dest, int dstW)
+{
+    static const int big_endian = HAVE_BIGENDIAN;
+    static const int shift = 3;
+    static const float float_mult = 1.0f / 65535.0f;
+    int i, val;
+    uint16_t val_uint;
+
+    for (i = 0; i < dstW; ++i){
+        val = src[i] + (1 << (shift - 1));
+        output_pixel(&val_uint, val, 0, uint);
+        dest[i] = float_mult * (float)val_uint;
+    }
+}
+
+static av_always_inline void
+yuv2plane1_float_bswap_c_template(const int32_t *src, uint32_t *dest, int dstW)
+{
+    static const int big_endian = HAVE_BIGENDIAN;
+    static const int shift = 3;
+    static const float float_mult = 1.0f / 65535.0f;
+    int i, val;
+    uint16_t val_uint;
+
+    for (i = 0; i < dstW; ++i){
+        val = src[i] + (1 << (shift - 1));
+        output_pixel(&val_uint, val, 0, uint);
+        dest[i] = av_bswap32(av_float2int(float_mult * (float)val_uint));
+    }
+}
+
+static av_always_inline void
+yuv2planeX_float_c_template(const int16_t *filter, int filterSize, const int32_t **src,
+                            float *dest, int dstW)
+{
+    static const int big_endian = HAVE_BIGENDIAN;
+    static const int shift = 15;
+    static const float float_mult = 1.0f / 65535.0f;
+    int i, j, val;
+    uint16_t val_uint;
+
+    for (i = 0; i < dstW; ++i){
+        val = (1 << (shift - 1)) - 0x40000000;
+        for (j = 0; j < filterSize; ++j){
+            val += src[j][i] * (unsigned)filter[j];
+        }
+        output_pixel(&val_uint, val, 0x8000, int);
+        dest[i] = float_mult * (float)val_uint;
+    }
+}
+
+static av_always_inline void
+yuv2planeX_float_bswap_c_template(const int16_t *filter, int filterSize, const int32_t **src,
+                            uint32_t *dest, int dstW)
+{
+    static const int big_endian = HAVE_BIGENDIAN;
+    static const int shift = 15;
+    static const float float_mult = 1.0f / 65535.0f;
+    int i, j, val;
+    uint16_t val_uint;
+
+    for (i = 0; i < dstW; ++i){
+        val = (1 << (shift - 1)) - 0x40000000;
+        for (j = 0; j < filterSize; ++j){
+            val += src[j][i] * (unsigned)filter[j];
+        }
+        output_pixel(&val_uint, val, 0x8000, int);
+        dest[i] = av_bswap32(av_float2int(float_mult * (float)val_uint));
+    }
+}
+
+#define yuv2plane1_float(template, dest_type, BE_LE) \
+static void yuv2plane1_float ## BE_LE ## _c(const int16_t *src, uint8_t *dest, int dstW, \
+                                            const uint8_t *dither, int offset) \
+{ \
+    template((const int32_t *)src, (dest_type *)dest, dstW); \
+}
+
+#define yuv2planeX_float(template, dest_type, BE_LE) \
+static void yuv2planeX_float ## BE_LE ## _c(const int16_t *filter, int filterSize, \
+                                            const int16_t **src, uint8_t *dest, int dstW, \
+                                            const uint8_t *dither, int offset) \
+{ \
+    template(filter, filterSize, (const int32_t **)src, (dest_type *)dest, dstW); \
+}
+
+#if HAVE_BIGENDIAN
+yuv2plane1_float(yuv2plane1_float_c_template,       float,    BE)
+yuv2plane1_float(yuv2plane1_float_bswap_c_template, uint32_t, LE)
+yuv2planeX_float(yuv2planeX_float_c_template,       float,    BE)
+yuv2planeX_float(yuv2planeX_float_bswap_c_template, uint32_t, LE)
+#else
+yuv2plane1_float(yuv2plane1_float_c_template,       float,    LE)
+yuv2plane1_float(yuv2plane1_float_bswap_c_template, uint32_t, BE)
+yuv2planeX_float(yuv2planeX_float_c_template,       float,    LE)
+yuv2planeX_float(yuv2planeX_float_bswap_c_template, uint32_t, BE)
+#endif
+
 #undef output_pixel
 
 #define output_pixel(pos, val) \
@@ -2303,6 +2402,12 @@  av_cold void ff_sws_init_output_funcs(SwsContext *c,
             *yuv2plane1 = isBE(dstFormat) ? yuv2plane1_14BE_c  : yuv2plane1_14LE_c;
         } else
             av_assert0(0);
+    } else if (dstFormat == AV_PIX_FMT_GRAYF32BE) {
+        *yuv2planeX = yuv2planeX_floatBE_c;
+        *yuv2plane1 = yuv2plane1_floatBE_c;
+    } else if (dstFormat == AV_PIX_FMT_GRAYF32LE) {
+        *yuv2planeX = yuv2planeX_floatLE_c;
+        *yuv2plane1 = yuv2plane1_floatLE_c;
     } else {
         *yuv2plane1 = yuv2plane1_8_c;
         *yuv2planeX = yuv2planeX_8_c;
diff --git a/libswscale/ppc/swscale_altivec.c b/libswscale/ppc/swscale_altivec.c
index 9438a63ff2..2fb2337769 100644
--- a/libswscale/ppc/swscale_altivec.c
+++ b/libswscale/ppc/swscale_altivec.c
@@ -339,6 +339,7 @@  av_cold void ff_sws_init_swscale_ppc(SwsContext *c)
     }
     if (!is16BPS(dstFormat) && !isNBPS(dstFormat) &&
         dstFormat != AV_PIX_FMT_NV12 && dstFormat != AV_PIX_FMT_NV21 &&
+        dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE &&
         !c->needAlpha) {
         c->yuv2planeX = yuv2planeX_altivec;
     }
diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h
index 1703856ab2..4fa59386a6 100644
--- a/libswscale/swscale_internal.h
+++ b/libswscale/swscale_internal.h
@@ -336,6 +336,8 @@  typedef struct SwsContext {
     uint32_t pal_yuv[256];
     uint32_t pal_rgb[256];
 
+    float uint2float_lut[256];
+
     /**
      * @name Scaled horizontal lines ring buffer.
      * The horizontal scaler keeps just enough scaled lines in a ring buffer
@@ -764,6 +766,13 @@  static av_always_inline int isAnyRGB(enum AVPixelFormat pix_fmt)
             pix_fmt == AV_PIX_FMT_MONOBLACK || pix_fmt == AV_PIX_FMT_MONOWHITE;
 }
 
+static av_always_inline int isFloat(enum AVPixelFormat pix_fmt)
+{
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
+    av_assert0(desc);
+    return desc->flags & AV_PIX_FMT_FLAG_FLOAT;
+}
+
 static av_always_inline int isALPHA(enum AVPixelFormat pix_fmt)
 {
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 6480070cbf..973fa4875f 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -1467,6 +1467,46 @@  static int yvu9ToYv12Wrapper(SwsContext *c, const uint8_t *src[],
     return srcSliceH;
 }
 
+static int uint_y_to_float_y_wrapper(SwsContext *c, const uint8_t *src[],
+                                     int srcStride[], int srcSliceY,
+                                     int srcSliceH, uint8_t *dst[], int dstStride[])
+{
+    int y, x;
+    ptrdiff_t dstStrideFloat = dstStride[0] >> 2;
+    const uint8_t *srcPtr = src[0];
+    float *dstPtr = (float *)(dst[0] + dstStride[0] * srcSliceY);
+
+    for (y = 0; y < srcSliceH; ++y){
+        for (x = 0; x < c->srcW; ++x){
+            dstPtr[x] = c->uint2float_lut[srcPtr[x]];
+        }
+        srcPtr += srcStride[0];
+        dstPtr += dstStrideFloat;
+    }
+
+    return srcSliceH;
+}
+
+static int float_y_to_uint_y_wrapper(SwsContext *c, const uint8_t* src[],
+                                     int srcStride[], int srcSliceY,
+                                     int srcSliceH, uint8_t* dst[], int dstStride[])
+{
+    int y, x;
+    ptrdiff_t srcStrideFloat = srcStride[0] >> 2;
+    const float *srcPtr = (const float *)src[0];
+    uint8_t *dstPtr = dst[0] + dstStride[0] * srcSliceY;
+
+    for (y = 0; y < srcSliceH; ++y){
+        for (x = 0; x < c->srcW; ++x){
+            dstPtr[x] = av_clip_uint8(lrintf(255.0f * srcPtr[x]));
+        }
+        srcPtr += srcStrideFloat;
+        dstPtr += dstStride[0];
+    }
+
+    return srcSliceH;
+}
+
 /* unscaled copy like stuff (assumes nearly identical formats) */
 static int packedCopyWrapper(SwsContext *c, const uint8_t *src[],
                              int srcStride[], int srcSliceY, int srcSliceH,
@@ -1899,6 +1939,16 @@  void ff_get_unscaled_swscale(SwsContext *c)
             c->swscale = yuv422pToUyvyWrapper;
     }
 
+    /* uint Y to float Y */
+    if (srcFormat == AV_PIX_FMT_GRAY8 && dstFormat == AV_PIX_FMT_GRAYF32){
+        c->swscale = uint_y_to_float_y_wrapper;
+    }
+
+    /* float Y to uint Y */
+    if (srcFormat == AV_PIX_FMT_GRAYF32 && dstFormat == AV_PIX_FMT_GRAY8){
+        c->swscale = float_y_to_uint_y_wrapper;
+    }
+
     /* LQ converters if -sws 0 or -sws 4*/
     if (c->flags&(SWS_FAST_BILINEAR|SWS_POINT)) {
         /* yv12_to_yuy2 */
@@ -1925,13 +1975,13 @@  void ff_get_unscaled_swscale(SwsContext *c)
     if ( srcFormat == dstFormat ||
         (srcFormat == AV_PIX_FMT_YUVA420P && dstFormat == AV_PIX_FMT_YUV420P) ||
         (srcFormat == AV_PIX_FMT_YUV420P && dstFormat == AV_PIX_FMT_YUVA420P) ||
-        (isPlanarYUV(srcFormat) && isPlanarGray(dstFormat)) ||
+        (isFloat(srcFormat) == isFloat(dstFormat)) && ((isPlanarYUV(srcFormat) && isPlanarGray(dstFormat)) ||
         (isPlanarYUV(dstFormat) && isPlanarGray(srcFormat)) ||
         (isPlanarGray(dstFormat) && isPlanarGray(srcFormat)) ||
         (isPlanarYUV(srcFormat) && isPlanarYUV(dstFormat) &&
          c->chrDstHSubSample == c->chrSrcHSubSample &&
          c->chrDstVSubSample == c->chrSrcVSubSample &&
-         !isSemiPlanarYUV(srcFormat) && !isSemiPlanarYUV(dstFormat)))
+         !isSemiPlanarYUV(srcFormat) && !isSemiPlanarYUV(dstFormat))))
     {
         if (isPacked(c->srcFormat))
             c->swscale = packedCopyWrapper;
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 61b47182f8..5e56371180 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -258,6 +258,8 @@  static const FormatEntry format_entries[AV_PIX_FMT_NB] = {
     [AV_PIX_FMT_P010BE]      = { 1, 1 },
     [AV_PIX_FMT_P016LE]      = { 1, 1 },
     [AV_PIX_FMT_P016BE]      = { 1, 1 },
+    [AV_PIX_FMT_GRAYF32LE]   = { 1, 1 },
+    [AV_PIX_FMT_GRAYF32BE]   = { 1, 1 },
 };
 
 int sws_isSupportedInput(enum AVPixelFormat pix_fmt)
@@ -1173,6 +1175,7 @@  av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter,
     const AVPixFmtDescriptor *desc_dst;
     int ret = 0;
     enum AVPixelFormat tmpFmt;
+    static const float float_mult = 1.0f / 255.0f;
 
     cpu_flags = av_get_cpu_flags();
     flags     = c->flags;
@@ -1537,6 +1540,19 @@  av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter,
         }
     }
 
+    if (unscaled && c->srcBpc == 8 && dstFormat == AV_PIX_FMT_GRAYF32){
+        for (i = 0; i < 256; ++i){
+            c->uint2float_lut[i] = (float)i * float_mult;
+        }
+    }
+
+    // float will be converted to uint16_t
+    if ((srcFormat == AV_PIX_FMT_GRAYF32BE || srcFormat == AV_PIX_FMT_GRAYF32LE) &&
+        (!unscaled || unscaled && dstFormat != srcFormat && (srcFormat != AV_PIX_FMT_GRAYF32 ||
+        dstFormat != AV_PIX_FMT_GRAY8))){
+        c->srcBpc = 16;
+    }
+
     if (CONFIG_SWSCALE_ALPHA && isALPHA(srcFormat) && !isALPHA(dstFormat)) {
         enum AVPixelFormat tmpFormat = alphaless_fmt(srcFormat);
 
@@ -1793,7 +1809,9 @@  av_cold int sws_init_context(SwsContext *c, SwsFilter *srcFilter,
 
     /* unscaled special cases */
     if (unscaled && !usesHFilter && !usesVFilter &&
-        (c->srcRange == c->dstRange || isAnyRGB(dstFormat))) {
+        (c->srcRange == c->dstRange || isAnyRGB(dstFormat) ||
+         srcFormat == AV_PIX_FMT_GRAYF32 && dstFormat == AV_PIX_FMT_GRAY8 ||
+         srcFormat == AV_PIX_FMT_GRAY8 && dstFormat == AV_PIX_FMT_GRAYF32)) {
         ff_get_unscaled_swscale(c);
 
         if (c->swscale) {
diff --git a/libswscale/x86/swscale_template.c b/libswscale/x86/swscale_template.c
index b8bdcd4d03..7c30470679 100644
--- a/libswscale/x86/swscale_template.c
+++ b/libswscale/x86/swscale_template.c
@@ -1500,7 +1500,8 @@  static av_cold void RENAME(sws_init_swscale)(SwsContext *c)
 
     c->use_mmx_vfilter= 0;
     if (!is16BPS(dstFormat) && !isNBPS(dstFormat) && dstFormat != AV_PIX_FMT_NV12
-        && dstFormat != AV_PIX_FMT_NV21 && !(c->flags & SWS_BITEXACT)) {
+        && dstFormat != AV_PIX_FMT_NV21 && dstFormat != AV_PIX_FMT_GRAYF32BE && dstFormat != AV_PIX_FMT_GRAYF32LE
+        && !(c->flags & SWS_BITEXACT)) {
             if (c->flags & SWS_ACCURATE_RND) {
                 if (!(c->flags & SWS_FULL_CHR_H_INT)) {
                     switch (c->dstFormat) {
diff --git a/tests/ref/fate/filter-pixdesc-grayf32be b/tests/ref/fate/filter-pixdesc-grayf32be
new file mode 100644
index 0000000000..423bbfbebc
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-grayf32be
@@ -0,0 +1 @@ 
+pixdesc-grayf32be   381c8d0f19d286809b91cd6e6c0048ab
diff --git a/tests/ref/fate/filter-pixdesc-grayf32le b/tests/ref/fate/filter-pixdesc-grayf32le
new file mode 100644
index 0000000000..a76e0a995e
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-grayf32le
@@ -0,0 +1 @@ 
+pixdesc-grayf32le   381c8d0f19d286809b91cd6e6c0048ab
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index 013b33f1c0..5385036a82 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -47,6 +47,8 @@  gray16be            08d997a3faa25a3db9d6be272d282eef
 gray16le            df65eb804360795e3e38a2701fa9641a
 gray9be             6382a14594a8b68f0ec7de25531f9334
 gray9le             4eb1dda58706436e3b69aef29b0089db
+grayf32be           f3bf178835f8146aa09d1da94bba4d8a
+grayf32le           fb6ea85bfbc8cd21c51fc0e110197294
 monob               8b04f859fee6a0be856be184acd7a0b5
 monow               54d16d2c01abfd72ecdb5e51e283937c
 nv12                8e24feb2c544dc26a20047a71e4c27aa
diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop
index 750ea27404..ae48c2bf42 100644
--- a/tests/ref/fate/filter-pixfmts-crop
+++ b/tests/ref/fate/filter-pixfmts-crop
@@ -47,6 +47,8 @@  gray16be            38f599da990224de86e3dc7a543121a9
 gray16le            9ff7c866bd98def4e6c91542c1c45f80
 gray9be             8ffcb18d699480f55414bfc21ab33321
 gray9le             4d1932d4968a248584f5e39c25f1dd43
+grayf32be           cf40ec06a8abe54852b7f85a00549eec
+grayf32le           b672526c9da9c8959ab881f242f6890a
 nv12                92cda427f794374731ec0321ee00caac
 nv21                1bcfc197f4fb95de85ba58182d8d2f69
 p010be              8b2de2eb6b099bbf355bfc55a0694ddc
diff --git a/tests/ref/fate/filter-pixfmts-field b/tests/ref/fate/filter-pixfmts-field
index 4fdc214781..857ded1c41 100644
--- a/tests/ref/fate/filter-pixfmts-field
+++ b/tests/ref/fate/filter-pixfmts-field
@@ -47,6 +47,8 @@  gray16be            e1700e056de9917744a7ff4ab2ca63fd
 gray16le            338de7ac5f7d36d5ad5ac2c8d5bbea68
 gray9be             25e50940fa300a8f09edfb6eba4fd250
 gray9le             1146cfc1b92bfd07ed238e65ffcd134f
+grayf32be           72fbfa47b2863658a8a80d588f23b3e7
+grayf32le           6b856bdbf2a2bfcd2bc7d50f109daaf0
 monob               2129cc72a484d7e10a44de9117aa9f80
 monow               03d783611d265cae78293f88ea126ea1
 nv12                16f7a46708ef25ebd0b72e47920cc11e
diff --git a/tests/ref/fate/filter-pixfmts-fieldorder b/tests/ref/fate/filter-pixfmts-fieldorder
index 7fc158b0af..fc003457fc 100644
--- a/tests/ref/fate/filter-pixfmts-fieldorder
+++ b/tests/ref/fate/filter-pixfmts-fieldorder
@@ -47,6 +47,8 @@  gray16be            293a36548ce16543494790f8f7f76a05
 gray16le            84f83f5fcbb5d458efb8395a50a3797e
 gray9be             ec877f5bcf0ea275a6f36c12cc9adf11
 gray9le             fba944fde7923d5089f4f52d12988b9e
+grayf32be           1aa7960131f880c54fe3c77f13448674
+grayf32le           4029ac9d197f255794c1b9e416520fc7
 rgb0                2e3d8c91c7a83d451593dfd06607ff39
 rgb24               b82577f8215d3dc2681be60f1da247af
 rgb444be            1c3afc3a0c53c51139c76504f59bb1f4
diff --git a/tests/ref/fate/filter-pixfmts-hflip b/tests/ref/fate/filter-pixfmts-hflip
index 70a43d9959..e97c185f6e 100644
--- a/tests/ref/fate/filter-pixfmts-hflip
+++ b/tests/ref/fate/filter-pixfmts-hflip
@@ -47,6 +47,8 @@  gray16be            cf7294d9aa23e1b838692ec01ade587b
 gray16le            d91ce41e304419bcf32ac792f01bd64f
 gray9be             ac8d260669479ae720a5b6d4d8639e34
 gray9le             424fc581947bc8c357c9ec5e3c1c04d1
+grayf32be           a69add7bbf892a71fe81b3b75982dbe2
+grayf32le           4563e176a35dc8a8a07e0829fad5eb88
 nv12                801e58f1be5fd0b5bc4bf007c604b0b4
 nv21                9f10dfff8963dc327d3395af21f0554f
 p010be              744b13e44d39e1ff7588983fa03e0101
diff --git a/tests/ref/fate/filter-pixfmts-il b/tests/ref/fate/filter-pixfmts-il
index ba06851e24..a006fc19a3 100644
--- a/tests/ref/fate/filter-pixfmts-il
+++ b/tests/ref/fate/filter-pixfmts-il
@@ -47,6 +47,8 @@  gray16be            92c3b09f371b610cc1b6a9776034f4d0
 gray16le            1db278d23a554e01910cedacc6c02521
 gray9be             ed7db5bb2ddc09bc26068c8b858db204
 gray9le             2ec9188f0dcfefef76a09f371d7beb8e
+grayf32be           f36197c9e2ef5c50a995e980c1a37203
+grayf32le           8bf3d295c3ffd53da0e06d0702e7c1ca
 monob               faba75df28033ba7ce3d82ff2a99ee68
 monow               6e9cfb8d3a344c5f0c3e1d5e1297e580
 nv12                3c3ba9b1b4c4dfff09c26f71b51dd146
diff --git a/tests/ref/fate/filter-pixfmts-null b/tests/ref/fate/filter-pixfmts-null
index 013b33f1c0..5385036a82 100644
--- a/tests/ref/fate/filter-pixfmts-null
+++ b/tests/ref/fate/filter-pixfmts-null
@@ -47,6 +47,8 @@  gray16be            08d997a3faa25a3db9d6be272d282eef
 gray16le            df65eb804360795e3e38a2701fa9641a
 gray9be             6382a14594a8b68f0ec7de25531f9334
 gray9le             4eb1dda58706436e3b69aef29b0089db
+grayf32be           f3bf178835f8146aa09d1da94bba4d8a
+grayf32le           fb6ea85bfbc8cd21c51fc0e110197294
 monob               8b04f859fee6a0be856be184acd7a0b5
 monow               54d16d2c01abfd72ecdb5e51e283937c
 nv12                8e24feb2c544dc26a20047a71e4c27aa
diff --git a/tests/ref/fate/filter-pixfmts-scale b/tests/ref/fate/filter-pixfmts-scale
index 559355be49..05879ee3c7 100644
--- a/tests/ref/fate/filter-pixfmts-scale
+++ b/tests/ref/fate/filter-pixfmts-scale
@@ -47,6 +47,8 @@  gray16be            32891cb0928b1119d8d43a6e1bef0e2b
 gray16le            f96cfb5652b090dad52615930f0ce65f
 gray9be             779dec0c6c2df008128b91622a20daf8
 gray9le             fa87a96ca275f82260358635f838b514
+grayf32be           5e4c715519f53c15f1345df90481e5f5
+grayf32le           2ff1b84023e820307b1ba7a9550115bc
 monob               f01cb0b623357387827902d9d0963435
 monow               35c68b86c226d6990b2dcb573a05ff6b
 nv12                b118d24a3653fe66e5d9e079033aef79
diff --git a/tests/ref/fate/filter-pixfmts-transpose b/tests/ref/fate/filter-pixfmts-transpose
index 78218cda4e..44644099c6 100644
--- a/tests/ref/fate/filter-pixfmts-transpose
+++ b/tests/ref/fate/filter-pixfmts-transpose
@@ -47,6 +47,8 @@  gray16be            4aef307021a91b1de67f1d4381a39132
 gray16le            76f2afe156edca7ae05cfa4e5867126e
 gray9be             2c425fa532c940d226822da8b3592310
 gray9le             bcc575942910b3c72eaa72e8794f3acd
+grayf32be           823288e1ec497bb1f22c070e502e5272
+grayf32le           6e9ec0e1cac3617f3041e681afd2c575
 nv12                1965e3826144686748f2f6b516fca5ba
 nv21                292adaf5271c5c8516b71640458c01f4
 p010be              ad0de2cc9bff81688b182a870fcf7000
diff --git a/tests/ref/fate/filter-pixfmts-vflip b/tests/ref/fate/filter-pixfmts-vflip
index 3cb99e7d8d..51628f14ce 100644
--- a/tests/ref/fate/filter-pixfmts-vflip
+++ b/tests/ref/fate/filter-pixfmts-vflip
@@ -47,6 +47,8 @@  gray16be            29f24ba7cb0fc4fd2ae78963d008f6e6
 gray16le            a37e9c4ea76e8eeddc2af8f600ba2c10
 gray9be             dda11d4ffd62b414012ffc4667fb4971
 gray9le             159bf6482d217b2b8276eb2216cd7a09
+grayf32be           c1ba5943a0d24d70e6a280f37e4f4593
+grayf32le           8e6c048a5b3b8b26d3a5ddfce255f3f6
 monob               7810c4857822ccfc844d78f5e803269a
 monow               90a947bfcd5f2261e83b577f48ec57b1
 nv12                261ebe585ae2aa4e70d39a10c1679294
-- 
2.14.1