[FFmpeg-devel,RFC] GSoC: FLIF16 Image format parser

Message ID 20200226065637.7943-1-aghorui@teknik.io
State Superseded

Checks

Context Check Description
andriy/ffmpeg-patchwork success Make fate finished

Commit Message

Anamitra Ghorui Feb. 26, 2020, 6:56 a.m. UTC
This is a buildable "skeleton" of my component (the FLIF16 parser)
i.e. everything is present aside from the logic itself.

***

Hello, I am trying to implement a parser for the FLIF16 file format as
a GSoC 2020 qualification project. So far I think I have managed to
register the parser (along with the format) and set up the basic
structure of the parser code.

I have now reached a point where moving forward is going to be quite 
difficult without outside help and references, and so I have a number 
of questions regarding the conceptual understanding of FFmpeg:

a. Please tell me if I am right or wrong here:
1. Each audio/video/image file format has a parser for converting the
   file data into a format that can be understood by a decoder.

2. A decoder converts a given, recognised encoded data stream into a
   form that can be processed by physical hardware.

3. A file format can be independent of the encoding it uses.
   E.g. WebM

4. The general Audio parsing/decoding process is as follows:
     i. Allocate space for a packet of data
    ii. Try to find a hit for the codec of the given data format
   iii. Now, with the codec id, attempt to init a parser
    iv. Allocate a context for the codec
     v. Initialize the codec context
    vi. Initialize the codec
   vii. Allocate space for frame data
  viii. Open the input file
    ix. While file pointer isn't EOF:
            Read data into buffer
            Parse data into a single frame
            Decode the data
     x. Flush the file and free stuff.

5. Every parser has its own parser context extended from the default parser
   context. The byte offsets/positions in the file are kept by the parser
   context.

6. An image can be thought of as a video with a single frame

b. In libavcodec/parser.h:

    typedef struct ParseContext{
        ...
        int frame_start_found;
        ...
    } ParseContext;
    
Is frame_start_found the determined position of the start of the frame
in the data stream?


c. I have been looking at the decoder/encoder/parser of the BMP format
   (which is one of the simplest image formats), the actual decoding work
   (according to me), i.e. Finding the magic numbers, seeing the various
   segments is being done by the decoder function and not the parser.
   
   The parser function, from what I can see in png_parser and
   bmp_parser, simply sets appropriate values in the ParseContext
   and does not do much else. What exactly is it doing here?

If there are any books or articles I should read, please tell me.
---
 libavcodec/Makefile        |  1 +
 libavcodec/avcodec.h       |  1 +
 libavcodec/flif16_parser.c | 51 ++++++++++++++++++++++++++++++++++++++
 libavcodec/parsers.c       |  1 +
 libavformat/img2.c         |  1 +
 5 files changed, 55 insertions(+)
 create mode 100644 libavcodec/flif16_parser.c

Comments

Anamitra Ghorui Feb. 27, 2020, 9:09 a.m. UTC | #1
February 26, 2020 12:27 PM, "Anamitra Ghorui" <aghorui@teknik.io> wrote:

> c. I have been looking at the decoder/encoder/parser of the BMP format
> (which is one of the simplest image formats), the actual decoding work
> (according to me), i.e. Finding the magic numbers, seeing the various
> segments is being done by the decoder function and not the parser.

I meant to say, "the actual 'parsing' work, according to me, is
being done by the decoder"

I'm sorry if I had come off as rude in my post. It wasn't my intention.
Lynne Feb. 27, 2020, 10:56 a.m. UTC | #2
Feb 27, 2020, 09:09 by aghorui@teknik.io:

> February 26, 2020 12:27 PM, "Anamitra Ghorui" <aghorui@teknik.io> wrote:
>
>> c. I have been looking at the decoder/encoder/parser of the BMP format
>> (which is one of the simplest image formats), the actual decoding work
>> (according to me), i.e. Finding the magic numbers, seeing the various
>> segments is being done by the decoder function and not the parser.
>>
>
> I meant to say, "the actual 'parsing' work, according to me, is
> being done by the decoder"
>
> I'm sorry if I had come off as rude in my post. It wasn't my intention.
>

AFAIK, FLIF isn't stable (no spec, bitstream may change) and is kind of dead since the author
moved on to work on JPEG XL.

I don't think we should be implementing this decoder.
Thilo Borgmann Feb. 27, 2020, 11:14 a.m. UTC | #3
On 27.02.20 at 11:56, Lynne wrote:
> Feb 27, 2020, 09:09 by aghorui@teknik.io:
> 
>> February 26, 2020 12:27 PM, "Anamitra Ghorui" <aghorui@teknik.io> wrote:
>>
>>> c. I have been looking at the decoder/encoder/parser of the BMP format
>>> (which is one of the simplest image formats), the actual decoding work
>>> (according to me), i.e. Finding the magic numbers, seeing the various
>>> segments is being done by the decoder function and not the parser.
>>>
>>
>> I meant to say, "the actual 'parsing' work, according to me, is
>> being done by the decoder"
>>
>> I'm sorry if I had come off as rude in my post. It wasn't my intention.
>>
> 
> AFAIK, flif isn't stable (no spec, bitstream may change) and is kind of dead since the author
> moved on to work on JPEG XL to my knowledge.
> 
> I don't think we should be implementing this decoder.

We already had this discussion before creating a project out of it.
They proclaimed it stable themselves. What do we care if its author now works on something else?

Anamitra, please don't get confused about that.

-Thilo
Jai Luthra Feb. 27, 2020, 11:15 a.m. UTC | #4
Hi Anamitra,

On Wed, Feb 26, 2020 at 12:26:37PM +0530, Anamitra Ghorui wrote:
>This is a buildable "skeleton" of my component (the FLIF16 parser)
>i.e. everything is present aside from the logic itself.
>
>***
>
>Hello, I am trying to implement a parser for the FLIF16 file format as
>a GSoC 2020 qualification project. So far I think I have managed to
>register the parser (along with the format) and set up the basic
>structure of the parser code.
>
>I have now reached a point where moving forward is going to be quite
>difficult without outside help and references, and so I have a number
>of questions regarding the conceptual understanding of FFmpeg:
>
>a. Please tell me if I am right or wrong here:
>1. Each audio/video/image file format has a parser for converting the
>   file data into a format that can be understood by a decoder.

Yes

>
>2. A decoder converts a given, recognised encoded data stream into a
>   form that can be processed by physical hardware.

Yes. To be exact, the decoder turns encoded data packets into raw frames or
samples, which can then be re-encoded with some other codec or displayed/played.

>
>3. A file format can be independent of the encoding it uses.
>   E.g. WebM

Yes, a single container format can support different codecs.

>
>4. The general Audio parsing/decoding process is as follows:
>     i. Allocate space for a packet of data
>    ii. Try to find a hit for the codec of the given data format
>   iii. Now, with the codec id, attempt to init a parser
>    iv. Allocate a context for the codec
>     v. Initialize the codec context
>    vi. Initialize the codec
>   vii. Allocate space for frame data
>  viii. Open the input file
>    ix. While file pointer isn't EOF:
>            Read data into buffer
>            Parse data into a single frame
>            Decode the data
>     x. Flush the file and free stuff.

Yes, there may also be some form of probing taking place, i.e. checking the 
first few packets to find what file format and codec is used. 

>
>5. Every parser has its own parser context extended from the default parser
>   context. The byte offsets/positions in the file are kept by the parser
>   context.
>
>6. An image can be thought of as a video with a single frame

For some purposes this high level distinction may work. But many image formats 
also support multiple frames and animations like GIF and even FLIF. 

>
>b. In libavcodec/parser.h:
>
>    typedef struct ParseContext{
>        ...
>        int frame_start_found;
>        ...
>    } ParseContext;
>
>Is frame_start_found the determined position of the start of the frame
>in the data stream?
>
>
>c. I have been looking at the decoder/encoder/parser of the BMP format
>   (which is one of the simplest image formats), the actual decoding work
>   (according to me), i.e. Finding the magic numbers, seeing the various
>   segments is being done by the decoder function and not the parser.
>
>   The parser function, from what I can see in png_parser and
>   bmp_parser, simply sets appropriate values in the ParseContext
>   and does not do much else. What exactly is it doing here?

You are correct. The parser is usually used for video formats, to read and
iterate over encoded packets/frames in a bitstream. The main decoding work and
filling in the contexts for a particular packet are usually done within the
decoder module.

FLIF does have multiple frames, so having a parser is a good idea. But you may
choose to read the other header information into the decoder context instead;
that is up to you, whichever you find better.

>
>If there are any books or articles I should read, please tell me.
>---
> libavcodec/Makefile        |  1 +
> libavcodec/avcodec.h       |  1 +
> libavcodec/flif16_parser.c | 51 ++++++++++++++++++++++++++++++++++++++
> libavcodec/parsers.c       |  1 +
> libavformat/img2.c         |  1 +
> 5 files changed, 55 insertions(+)
> create mode 100644 libavcodec/flif16_parser.c
>
>diff --git a/libavcodec/Makefile b/libavcodec/Makefile
>index 1e894c8049..ce18632d2c 100644
>--- a/libavcodec/Makefile
>+++ b/libavcodec/Makefile
>@@ -1045,6 +1045,7 @@ OBJS-$(CONFIG_DVD_NAV_PARSER)          += dvd_nav_parser.o
> OBJS-$(CONFIG_DVDSUB_PARSER)           += dvdsub_parser.o
> OBJS-$(CONFIG_FLAC_PARSER)             += flac_parser.o flacdata.o flac.o \
>                                           vorbis_data.o
>+OBJS-$(CONFIG_FLAC_PARSER)             += flif16_parser.o
> OBJS-$(CONFIG_G723_1_PARSER)           += g723_1_parser.o
> OBJS-$(CONFIG_G729_PARSER)             += g729_parser.o
> OBJS-$(CONFIG_GIF_PARSER)              += gif_parser.o
>diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>index 978f36d12a..c6b8c6a1eb 100644
>--- a/libavcodec/avcodec.h
>+++ b/libavcodec/avcodec.h
>@@ -461,6 +461,7 @@ enum AVCodecID {
>     AV_CODEC_ID_MVDV,
>     AV_CODEC_ID_MVHA,
>     AV_CODEC_ID_CDTOONS,
>+    AV_CODEC_ID_FLIF16,
>
>     /* various PCM "codecs" */
>     AV_CODEC_ID_FIRST_AUDIO = 0x10000,     ///< A dummy id pointing at the start of audio codecs
>diff --git a/libavcodec/flif16_parser.c b/libavcodec/flif16_parser.c
>new file mode 100644
>index 0000000000..54bd93d499
>--- /dev/null
>+++ b/libavcodec/flif16_parser.c
>@@ -0,0 +1,51 @@
>+/*
>+ * FLIF16 parser
>+ * Copyright (c) 2020 Anamitra Ghorui
>+ *
>+ * This file is part of FFmpeg.
>+ *
>+ * FFmpeg is free software; you can redistribute it and/or
>+ * modify it under the terms of the GNU Lesser General Public
>+ * License as published by the Free Software Foundation; either
>+ * version 2.1 of the License, or (at your option) any later version.
>+ *
>+ * FFmpeg is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>+ * Lesser General Public License for more details.
>+ *
>+ * You should have received a copy of the GNU Lesser General Public
>+ * License along with FFmpeg; if not, write to the Free Software
>+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>+ */
>+
>+ /**
>+  * @file
>+  * FLIF16 parser
>+  */
>+
>+#include "parser.h"
>+#include <stdio.h>
>+
>+typedef struct FLIF16ParseContext {
>+    ParseContext pc;
>+
>+} FLIF16ParseContext;
>+
>+static int flif16_parse(AVCodecParserContext *s, AVCodecContext *avctx,
>+                     const uint8_t **poutbuf, int *poutbuf_size,
>+                     const uint8_t *buf, int buf_size)
>+{
>+    FLIF16ParseContext *fpc = s->priv_data;
>+    int next = END_NOT_FOUND;
>+
>+    return next;
>+}
>+
>+AVCodecParser ff_flif16_parser = {
>+    .codec_ids      = { AV_CODEC_ID_FLIF16 },
>+    .priv_data_size = sizeof(FLIF16ParseContext),
>+    .parser_parse   = flif16_parse,
>+    .parser_close   = ff_parse_close,
>+};
>+
>diff --git a/libavcodec/parsers.c b/libavcodec/parsers.c
>index 33a71de8a0..8b6eb954b3 100644
>--- a/libavcodec/parsers.c
>+++ b/libavcodec/parsers.c
>@@ -40,6 +40,7 @@ extern AVCodecParser ff_dvbsub_parser;
> extern AVCodecParser ff_dvdsub_parser;
> extern AVCodecParser ff_dvd_nav_parser;
> extern AVCodecParser ff_flac_parser;
>+extern AVCodecParser ff_flif16_parser;
> extern AVCodecParser ff_g723_1_parser;
> extern AVCodecParser ff_g729_parser;
> extern AVCodecParser ff_gif_parser;
>diff --git a/libavformat/img2.c b/libavformat/img2.c
>index 16bc9d2abd..14c11d0c82 100644
>--- a/libavformat/img2.c
>+++ b/libavformat/img2.c
>@@ -81,6 +81,7 @@ const IdStrMap ff_img_tags[] = {
>     { AV_CODEC_ID_XPM,        "xpm"      },
>     { AV_CODEC_ID_XFACE,      "xface"    },
>     { AV_CODEC_ID_XWD,        "xwd"      },
>+    { AV_CODEC_ID_FLIF16,     "flif16"   },
>     { AV_CODEC_ID_NONE,       NULL       }
> };
>
>-- 
>2.17.1
>

Looks good to me. Try to parse an animated FLIF file and see if you can find
the right frame boundaries. Then move on to reading other parameters from the
bitstream headers into a context.

Cheers!

--
Jai (darkapex)
Moritz Barsnick Feb. 27, 2020, 12:44 p.m. UTC | #5
Welcome to ffmpeg!

Since review has now started, I want to point out what was missed:

On Wed, Feb 26, 2020 at 12:26:37 +0530, Anamitra Ghorui wrote:

> a. Please tell me if I am right or wrong here:
> 1. Each audio/video/image file format has a parser for converting the
>    file data into a format that can be understood by a decoder.
>
> 2. A decoder converts a given, recognised encoded data stream into a
>    form that can be processed by physical hardware.
>
> 3. A file format can be independent of the encoding it uses.
>    E.g. WebM

Welcome to the world of multimedia. For many (but not all)
contributions, it is important to understand these concepts. Do play
around with ffmpeg and perhaps some examples from the ffmpeg wiki. And
feel free to ask. :-)

Also observe the contribution flow on this mailing list. Your mentor
will assist you though, because not everything may be obvious enough.

>  libavcodec/Makefile        |  1 +
>  libavcodec/avcodec.h       |  1 +
>  libavcodec/flif16_parser.c | 51 ++++++++++++++++++++++++++++++++++++++
>  libavcodec/parsers.c       |  1 +
>  libavformat/img2.c         |  1 +
>  5 files changed, 55 insertions(+)
>  create mode 100644 libavcodec/flif16_parser.c

Just to jump ahead: As soon as you start an actual decoder and demuxer,
you will ultimately (i.e. in a future version, before push) require:

- a Changelog entry (as soo
- documentation in doc/*.texi
- a minor version bump of libavcodec (or libavformat, with the commit
  in which each gains a new codec/format), resetting micro to 100

>  OBJS-$(CONFIG_FLAC_PARSER)             += flac_parser.o flacdata.o flac.o \
>                                            vorbis_data.o
> +OBJS-$(CONFIG_FLAC_PARSER)             += flif16_parser.o
                 ^^^^

This looks wrong. You are adding this object to a compile of the FLAC
parser, not your new parser. (Probably a copy/paste mistake.)

Cheers,
Moritz
Anamitra Ghorui Feb. 27, 2020, 1:35 p.m. UTC | #6
February 27, 2020 6:15 PM, "Moritz Barsnick" <barsnick@gmx.net> wrote:

>> OBJS-$(CONFIG_FLAC_PARSER) += flac_parser.o flacdata.o flac.o \
>> vorbis_data.o
>> +OBJS-$(CONFIG_FLAC_PARSER) += flif16_parser.o
> 
> ^^^^
> 
> This looks wrong. You are adding this object to a compile of the FLAC
> parser, not your new parser. (Probably a copy/paste mistake.)
> 

Thanks for pointing it out
Anamitra Ghorui Feb. 29, 2020, 4:50 a.m. UTC | #7
Hello,
I have been reading through the parsing API and other things and here's what 
I've managed to gather (I will be ignoring overruns in these functions for now).
Please tell me if I am right or wrong:

1. As long as the parse function determines next == END_NOT_FOUND,
   ff_combine_frame will keep increasing the ParseContext index by buf_size.
   Once next is no longer END_NOT_FOUND, buf_size will be set to index + next.

   The bytes from the input chunks are copied into the buffer of ParseContext
   during this process.

   While next == END_NOT_FOUND, and the thing being decoded is a video, we
   cannot really determine the end of frame, and hence poutbuf and poutbuf_size
   are set to zero by the function. However, this doesn't really matter for
   still images since they have a single frame.

2. av_parser_parse2 will look for whether poutbuf_size is greater than zero.
   If it is, the next frame start offset will be advanced, and the frame offset
   pointer will be set to the previous value of the next frame offset in
   AVCodecParserContext.

3. In https://ffmpeg.org/doxygen/trunk/decode_video_8c-example.html
   pkt->size will be set to zero as long as a frame has not been returned.
   Hence decode will not be triggered as long as a frame has not been found.

Now, Regarding FLIF16:
1. The pixels of the image are stored in this format (non interlaced):
(see https://flif.info/spec.html#_part_4_pixel_data)
      _______________________________________________
     |     _________________________________________ |
     |    |     ___________________________________ ||
all  |    |    |     _____________________________ |||
     |    |    |    |                             ||||
     |    |    | f1 | x1 x2 x3 ..... xw           ||||
     |    |    |    |                             ||||
     |    | y1 |    |_____________________________||||
     | c1 |    |                ...                |||
     |    |    |     _____________________________ |||
     |    |    |    |                             ||||
     |    |    | fn | x1 x2 x3 ..... xw           ||||
     |    |    |    |                             ||||
     |    |    |    |_____________________________||||
     |    |    |                                   |||
     |    |    |___________________________________|||
     |    |                 ...                     ||
     |    |     ___________________________________ ||
     |    |    |     _____________________________ |||
     |    |    |    |                             ||||
     |    |    | f1 | x1 x2 x3 ..... xw           ||||
     |    |    |    |                             ||||
     |    | yh |    |_____________________________||||
     |    |    |               ...                 |||
     |    |    |     _____________________________ |||
     |    |    |    |                             ||||
     |    |    | fn | x1 x2 x3 ..... xw           ||||
     |    |    |    |                             ||||
     |    |    |    |_____________________________||||
     |    |    |                                   |||
     |    |    |___________________________________|||
     |    |_________________________________________||
     |                                               |
     |                      ...                      |
     | cn                                            |
     |_______________________________________________|

where: ci: color channel
       yi: pixel row
       fi: frame number
       xi: individual pixel

As can be seen, the frames are not stored contiguously. How should I be
getting a frame here? It doesn't seem possible without either putting the
whole pixel data chunk in memory, or allocating space for all the frames at once
and then putting data in them.

I guess what the parser has to do in that case is that it will have to either
return the whole file length as the buffer to the decoder function, or make the
parser manage frames by itself through its own data structures and component
functions.

What should I be doing here?

2. The FLIF format spec refers to a thing known as the 24 bit RAC. Is it an
   abbreviation for 24 bit RAnge Coding? (https://en.wikipedia.org/wiki/Range_encoding)
   What does the "24 bit" mean? Is it the size of each symbol that is processed
   by the range coder?

I started going through the reference implementation of FLIF. I'll see what I
can make out of it. The decoder itself is under the Apache license, so we could
refer to it or borrow some things from it: https://github.com/FLIF-hub/FLIF.

Thanks
Anamitra Ghorui Feb. 29, 2020, 11:22 a.m. UTC | #8
Oh, sorry about that.
Kartik K. Khullar Feb. 29, 2020, 5:09 p.m. UTC | #9
Just a reminder, @Anamitra: I am already working on the transformations
involved in FLIF and on the functions they use, such as symbol encoding.
It would be helpful if someone could clarify what RAC refers to in the
FLIF spec. It is mentioned under Symbol Encoding and is used repeatedly.
Thanks

On Sat, Feb 29, 2020 at 4:52 PM Anamitra Ghorui <aghorui@teknik.io> wrote:

> Oh, sorry about that.
Jai Luthra Feb. 29, 2020, 5:47 p.m. UTC | #10
Hi Anamitra,

On Sat, Feb 29, 2020 at 04:50:23AM +0000, Anamitra Ghorui wrote:
>Hello,
>I have been reading through the parsing API and other things and here's what
>I've managed to gather (I will be ignoring overruns in these functions for now).
>Please tell me if I am right or wrong:
>
>1. As long as the parse function determines next == END_NOT_FOUND,
>   ff_combine_frame will keep increasing the ParseContext index by buf_size.
>   Once next is no longer END_NOT_FOUND, buf_size will be set to index + next.
>
>   The bytes from the input chunks are copied into the buffer of ParseContext
>   during this process.
>
>   While next == END_NOT_FOUND, and the thing being decoded is a video, we
>   cannot really determine the end of frame, and hence poutbuf and poutbuf_size
>   are set to zero by the function. However, this doesn't really matter for
>   still images since they have a single frame.
>
>2. av_parser_parse2 will look for whether poutbuf_size is greater than zero.
>   If it is, the next frame start offset will be advanced, and the frame offset
>   pointer will be set to the previous value of the next frame offset in
>   AVCodecParserContext.
>
>3. In https://ffmpeg.org/doxygen/trunk/decode_video_8c-example.html
>   pkt->size will be set to zero as long as a frame has not been returned.
>   Hence decode will not be triggered as long as a frame has not been found.

Yes this is all correct. Good work of looking at different parsers to 
understand this.
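
To make point 1 concrete, here is a minimal, self-contained model of the
accumulate-until-boundary behaviour described above. It is only a sketch and
not the FFmpeg implementation: ToyParseContext, toy_combine_frame and
TOY_END_NOT_FOUND are hypothetical stand-ins, the buffer is fixed-size, and
overreads (negative next) and the EOF flush are deliberately left out.

```c
#include <assert.h>
#include <string.h>

#define TOY_END_NOT_FOUND (-100) /* hypothetical stand-in for END_NOT_FOUND */
#define TOY_CAPACITY      256    /* fixed size, only to keep the sketch short */

typedef struct ToyParseContext {
    unsigned char buffer[TOY_CAPACITY];
    int index; /* number of bytes buffered so far */
} ToyParseContext;

/*
 * Returns -1 while the frame is still incomplete (output is zeroed),
 * and 0 once *out / *out_size describe one complete frame.
 * "next" is the offset of the frame end inside buf, or TOY_END_NOT_FOUND.
 */
static int toy_combine_frame(ToyParseContext *pc, int next,
                             const unsigned char **out, int *out_size,
                             const unsigned char *buf, int buf_size)
{
    /* Without a boundary the whole chunk is kept, otherwise only "next" bytes. */
    int take = next == TOY_END_NOT_FOUND ? buf_size : next;

    assert(take >= 0 && take <= buf_size);
    assert(pc->index + take <= TOY_CAPACITY);

    memcpy(pc->buffer + pc->index, buf, (size_t)take);
    pc->index += take;

    if (next == TOY_END_NOT_FOUND) {
        *out      = NULL;
        *out_size = 0;
        return -1;
    }

    *out      = pc->buffer;
    *out_size = pc->index; /* previously buffered bytes + next */
    pc->index = 0;         /* start collecting the following frame */
    return 0;
}
```

Feeding "abcd" with no boundary and then "efgh" with next == 2 leaves the
output empty after the first call and yields the six bytes "abcdef" after the
second, which is the "index + next" size mentioned in point 1.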

>
>Now, Regarding FLIF16:
>1. The pixels of the image are stored in this format (non interlaced):
>(see https://flif.info/spec.html#_part_4_pixel_data)
>      _______________________________________________
>     |     _________________________________________ |
>     |    |     ___________________________________ ||
>all  |    |    |     _____________________________ |||
>     |    |    |    |                             ||||
>     |    |    | f1 | x1 x2 x3 ..... xw           ||||
>     |    |    |    |                             ||||
>     |    | y1 |    |_____________________________||||
>     | c1 |    |                ...                |||
>     |    |    |     _____________________________ |||
>     |    |    |    |                             ||||
>     |    |    | fn | x1 x2 x3 ..... xw           ||||
>     |    |    |    |                             ||||
>     |    |    |    |_____________________________||||
>     |    |    |                                   |||
>     |    |    |___________________________________|||
>     |    |                 ...                     ||
>     |    |     ___________________________________ ||
>     |    |    |     _____________________________ |||
>     |    |    |    |                             ||||
>     |    |    | f1 | x1 x2 x3 ..... xw           ||||
>     |    |    |    |                             ||||
>     |    | yh |    |_____________________________||||
>     |    |    |               ...                 |||
>     |    |    |     _____________________________ |||
>     |    |    |    |                             ||||
>     |    |    | fn | x1 x2 x3 ..... xw           ||||
>     |    |    |    |                             ||||
>     |    |    |    |_____________________________||||
>     |    |    |                                   |||
>     |    |    |___________________________________|||
>     |    |_________________________________________||
>     |                                               |
>     |                      ...                      |
>     | cn                                            |
>     |_______________________________________________|
>
>where: ci: color channel
>       yi: pixel row
>       fi: frame number
>       xi: individual pixel

Ah FLIF is a bit wacky. I can see why this might be helpful for decoding 
partial images on-the-fly, but I don't think it will be easy or even possible 
to do with the current AVFrame API.
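
As a rough illustration of why the frames end up scattered, the sketch below
maps a (channel, row, frame, column) tuple to its position in the
non-interlaced order quoted above. Layout, sample_index and
row_stride_within_frame are hypothetical names, not part of FLIF or FFmpeg,
and the index counts decoded samples, not byte offsets, because the data is
entropy coded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the image dimensions. */
typedef struct Layout {
    size_t channels; /* number of colour channels (c1..cn) */
    size_t height;   /* number of pixel rows      (y1..yh) */
    size_t frames;   /* number of frames          (f1..fn) */
    size_t width;    /* pixels per row            (x1..xw) */
} Layout;

/* Position of one sample when iterating channel, then row, then frame,
 * then column, as in the diagram. */
static size_t sample_index(const Layout *l,
                           size_t c, size_t y, size_t f, size_t x)
{
    assert(c < l->channels && y < l->height &&
           f < l->frames   && x < l->width);
    return ((c * l->height + y) * l->frames + f) * l->width + x;
}

/* Distance between two consecutive rows of the SAME frame: the rows of
 * all other frames sit in between. */
static size_t row_stride_within_frame(const Layout *l)
{
    return l->frames * l->width;
}
```

With 4 frames of width 5, row 1 of frame 0 starts 20 samples after row 0 of
frame 0, so a single frame is only complete once the last row of the last
channel has been decoded.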

>
>As can be seen, the frames are not stored contiguously. How should I be
>getting a frame here? It doesn't seem possible without either putting the
>whole pixel data chunk in memory, or allocating space for all the frames at once
>and then putting data in them.
>
>I guess what the parser has to do in that case is that it will have to either
>return the whole file length as the buffer to the decoder function, or make the
>parser manage frames by itself through its own data structures and component
>functions.
>
>What should I be doing here?

For now go with the approach of reading all the data into a single AVPacket.
This does mean that the parser isn't splitting frames. We can figure out how to do
progressive decoding as intended by FLIF later.
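
A minimal sketch of that "everything in one packet" policy, written without
any FFmpeg types so it stands alone: chunks are appended to a growing buffer
and the whole thing is handed out only when an empty chunk signals EOF.
WholeFileParser, whole_file_feed and whole_file_reset are hypothetical names
for illustration; a real parser would go through the ParseContext helpers
instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct WholeFileParser {
    unsigned char *data; /* everything received so far */
    size_t size;
} WholeFileParser;

/*
 * Feed one chunk; buf_size == 0 signals EOF.
 * Returns 1 when *out / *out_size hold the complete packet,
 *         0 when more input is needed (or nothing was buffered at EOF),
 *        -1 on allocation failure.
 */
static int whole_file_feed(WholeFileParser *p,
                           const unsigned char *buf, size_t buf_size,
                           const unsigned char **out, size_t *out_size)
{
    unsigned char *grown;

    assert(p && out && out_size);
    *out      = NULL;
    *out_size = 0;

    if (buf_size == 0) {         /* EOF: hand out all buffered data at once */
        if (p->size == 0)
            return 0;
        *out      = p->data;
        *out_size = p->size;
        return 1;
    }

    grown = realloc(p->data, p->size + buf_size);
    if (!grown)
        return -1;
    memcpy(grown + p->size, buf, buf_size);
    p->data  = grown;
    p->size += buf_size;
    return 0;                    /* never report a frame end before EOF */
}

/* Release the buffer once the packet has been consumed. */
static void whole_file_reset(WholeFileParser *p)
{
    free(p->data);
    p->data = NULL;
    p->size = 0;
}
```

The cost of this choice is memory: the complete file stays buffered until
EOF, which is acceptable for a first version and can be revisited together
with progressive decoding.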

>
>2. The FLIF format spec refers to a thing known as the 24 bit RAC. Is it an
>   abbreviation for 24 bit RAnge Coding? (https://en.wikipedia.org/wiki/Range_encoding)
>   What does the "24 bit" mean? Is it the size of each symbol that is processed
>   by the range coder?
>

Yes, RAC refers to Range Coding [1]. You can try to match what the reference
codec does in [2] with the explanation in [1].

"24 bit" here is the working range of the entropy coder.

In range coding the sequence of all bits is stored as an arbitrarily long
integer, which cannot be held in working memory, so we define a range (like
16-24 bits used in FLIF) in which we always keep our working variable. When
it would leave that range during encoding, we write its top bits to the stream
and shift it to bring it back in range.

12 bit is the precision with which probabilities are stored here.
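
To illustrate only the bookkeeping, here is a toy sketch of the two
operations mentioned above: splitting the current range with a 12-bit
probability, and shifting the range back into the 16-24 bit window. It is
not the FLIF reference code and does not read or write a bitstream; the
TOY_* constants and function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_RANGE (1u << 24) /* upper end of the working window */
#define TOY_MIN_RANGE (1u << 16) /* at or below this we must renormalise */

/* Part of "range" assigned to a symbol whose probability is p12 / 4096. */
static uint32_t toy_split(uint32_t range, uint32_t p12)
{
    assert(p12 > 0 && p12 < 4096);
    assert(range > TOY_MIN_RANGE && range <= TOY_MAX_RANGE);
    return (uint32_t)(((uint64_t)range * p12) >> 12);
}

/*
 * Shift the range up by whole bytes until it is back above TOY_MIN_RANGE.
 * Returns how many bytes a real coder would have to move to or from the
 * stream at this point.
 */
static int toy_renormalize(uint32_t *range)
{
    int bytes = 0;

    assert(*range > 0);
    while (*range <= TOY_MIN_RANGE) {
        *range <<= 8;
        bytes++;
    }
    return bytes;
}
```

Starting from the full 24-bit range, a symbol with probability 2048/4096
halves the range, while a symbol with probability 1/4096 shrinks it to 4096
and forces one byte of renormalisation.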

For the sake of your qualification task, just use a pseudo function RAC()
wherever you feel the need, as Yasiru should be working on its implementation.

>I started going through the reference implementation of FLIF. I'll see what I
>can make out of it. The decoder itself is under the Apache license, so we could
>refer to it or borrow some things from it: https://github.com/FLIF-hub/FLIF.
>
>Thanks
>

Cheers

[1]: https://people.xiph.org/~tterribe/notes/range.html
[2]: https://github.com/FLIF-hub/FLIF/blob/master/src/maniac/rac.hpp

--
Jai (darkapex)
Patch

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 1e894c8049..ce18632d2c 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1045,6 +1045,7 @@  OBJS-$(CONFIG_DVD_NAV_PARSER)          += dvd_nav_parser.o
 OBJS-$(CONFIG_DVDSUB_PARSER)           += dvdsub_parser.o
 OBJS-$(CONFIG_FLAC_PARSER)             += flac_parser.o flacdata.o flac.o \
                                           vorbis_data.o
+OBJS-$(CONFIG_FLAC_PARSER)             += flif16_parser.o
 OBJS-$(CONFIG_G723_1_PARSER)           += g723_1_parser.o
 OBJS-$(CONFIG_G729_PARSER)             += g729_parser.o
 OBJS-$(CONFIG_GIF_PARSER)              += gif_parser.o
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 978f36d12a..c6b8c6a1eb 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -461,6 +461,7 @@  enum AVCodecID {
     AV_CODEC_ID_MVDV,
     AV_CODEC_ID_MVHA,
     AV_CODEC_ID_CDTOONS,
+    AV_CODEC_ID_FLIF16,
 
     /* various PCM "codecs" */
     AV_CODEC_ID_FIRST_AUDIO = 0x10000,     ///< A dummy id pointing at the start of audio codecs
diff --git a/libavcodec/flif16_parser.c b/libavcodec/flif16_parser.c
new file mode 100644
index 0000000000..54bd93d499
--- /dev/null
+++ b/libavcodec/flif16_parser.c
@@ -0,0 +1,51 @@ 
+/*
+ * FLIF16 parser
+ * Copyright (c) 2020 Anamitra Ghorui
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+ 
+ /**
+  * @file
+  * FLIF16 parser
+  */
+
+#include "parser.h"
+#include <stdio.h>
+
+typedef struct FLIF16ParseContext {
+    ParseContext pc;
+    
+} FLIF16ParseContext;
+
+static int flif16_parse(AVCodecParserContext *s, AVCodecContext *avctx,
+                     const uint8_t **poutbuf, int *poutbuf_size,
+                     const uint8_t *buf, int buf_size)
+{
+    FLIF16ParseContext *fpc = s->priv_data;
+    int next = END_NOT_FOUND;
+
+    return next;
+}
+
+AVCodecParser ff_flif16_parser = {
+    .codec_ids      = { AV_CODEC_ID_FLIF16 },
+    .priv_data_size = sizeof(FLIF16ParseContext),
+    .parser_parse   = flif16_parse,
+    .parser_close   = ff_parse_close,
+};
+
diff --git a/libavcodec/parsers.c b/libavcodec/parsers.c
index 33a71de8a0..8b6eb954b3 100644
--- a/libavcodec/parsers.c
+++ b/libavcodec/parsers.c
@@ -40,6 +40,7 @@  extern AVCodecParser ff_dvbsub_parser;
 extern AVCodecParser ff_dvdsub_parser;
 extern AVCodecParser ff_dvd_nav_parser;
 extern AVCodecParser ff_flac_parser;
+extern AVCodecParser ff_flif16_parser;
 extern AVCodecParser ff_g723_1_parser;
 extern AVCodecParser ff_g729_parser;
 extern AVCodecParser ff_gif_parser;
diff --git a/libavformat/img2.c b/libavformat/img2.c
index 16bc9d2abd..14c11d0c82 100644
--- a/libavformat/img2.c
+++ b/libavformat/img2.c
@@ -81,6 +81,7 @@  const IdStrMap ff_img_tags[] = {
     { AV_CODEC_ID_XPM,        "xpm"      },
     { AV_CODEC_ID_XFACE,      "xface"    },
     { AV_CODEC_ID_XWD,        "xwd"      },
+    { AV_CODEC_ID_FLIF16,     "flif16"   },
     { AV_CODEC_ID_NONE,       NULL       }
 };