[FFmpeg-devel] Add FITS Demuxer

Message ID	1499008697-26462-1-git-send-email-paraschadha18@gmail.com
State	Superseded
Headers	show Delivered-To: ffmpegpatchwork@gmail.com Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; From: Paras Chadha <paraschadha18@gmail.com> To: ffmpeg-devel@ffmpeg.org Date: Sun, 2 Jul 2017 20:48:17 +0530 Message-Id: <1499008697-26462-1-git-send-email-paraschadha18@gmail.com> In-Reply-To: <1497886658-3500-1-git-send-email-paraschadha18@gmail.com> References: <1497886658-3500-1-git-send-email-paraschadha18@gmail.com> Subject: [FFmpeg-devel] [PATCH] Add FITS Demuxer Precedence: list Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Cc: Paras Chadha <paraschadha18@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Paras July 2, 2017, 3:18 p.m. UTC

Filled buf with 0 to prevent overfow
Also added checks for integer overflow

Signed-off-by: Paras Chadha <paraschadha18@gmail.com>
---
 libavformat/Makefile     |   1 +
 libavformat/allformats.c |   1 +
 libavformat/fitsdec.c    | 224 +++++++++++++++++++++++++++++++++++++++++++++++
 libavformat/version.h    |   2 +-
 4 files changed, 227 insertions(+), 1 deletion(-)
 create mode 100644 libavformat/fitsdec.c

--
2.4.11

Moritz Barsnick July 2, 2017, 10:26 p.m. UTC | #1

On Sun, Jul 02, 2017 at 20:48:17 +0530, Paras Chadha wrote:
> +    int64_t header_size = 0, data_size=0, ret, pcount=0, gcount=1, d;
[...]
> +    header_size += 80;
[...]
> +    header_size += 80;
[...]
> +    header_size += 80;
[...]
> +    for (i = 0; i < naxis; i++) {
[...]
> +        header_size += 80;
[...]
> +    header_size += 80;
[...]
> +    while (strncmp(buf, "END", 3)) {
[...]
> +        header_size += 80;
> +    }
> +
> +    header_size = ceil(header_size/2880.0)*2880;
> +    if (header_size < 0)
> +        return AVERROR_INVALIDDATA;

How can this happen, except by integer overflow?

> +    if (data_size < 0)
> +        return AVERROR_INVALIDDATA;
> +
> +    if (!data_size) {
> +        fits->image = 0;
> +    } else {
> +        data_size = ceil(data_size/2880.0)*2880;
> +        if (data_size < 0)
> +            return AVERROR_INVALIDDATA;

How can this occur?

Moritz

Paras July 3, 2017, 7:33 a.m. UTC | #2

On Mon, Jul 3, 2017 at 3:56 AM, Moritz Barsnick <barsnick@gmx.net> wrote:

> On Sun, Jul 02, 2017 at 20:48:17 +0530, Paras Chadha wrote:
> > +    int64_t header_size = 0, data_size=0, ret, pcount=0, gcount=1, d;
> [...]
> > +    header_size += 80;
> [...]
> > +    header_size += 80;
> [...]
> > +    header_size += 80;
> [...]
> > +    for (i = 0; i < naxis; i++) {
> [...]
> > +        header_size += 80;
> [...]
> > +    header_size += 80;
> [...]
> > +    while (strncmp(buf, "END", 3)) {
> [...]
> > +        header_size += 80;
> > +    }
> > +
> > +    header_size = ceil(header_size/2880.0)*2880;
> > +    if (header_size < 0)
> > +        return AVERROR_INVALIDDATA;
>
> How can this happen, except by integer overflow?
>

It will not happen except in case of integer overflow.


>
> > +    if (data_size < 0)
> > +        return AVERROR_INVALIDDATA;
> > +
> > +    if (!data_size) {
> > +        fits->image = 0;
> > +    } else {
> > +        data_size = ceil(data_size/2880.0)*2880;
> > +        if (data_size < 0)
> > +            return AVERROR_INVALIDDATA;
>
> How can this occur?


It will not occur except in case of integer overflow.
If you are asking for specific case when this 'if' will execute then, set
data_size = LLONG_MAX - 1. Due to ceil function the final value will become
greater than LLONG_MAX. So, the statement inside 'if' will execute.


> Moritz
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Reimar Döffinger July 3, 2017, 10:42 p.m. UTC | #3

> +static int64_t find_size(AVIOContext * pb, FITSContext * fits)
> +{
> +    int bitpix, naxis, dim_no, i, naxisn[999], groups=0;
> +    int64_t header_size = 0, data_size=0, ret, pcount=0, gcount=1, d;
> +    char buf[81], c;

This is more than 4kB of data on the stack.
Large stack arrays have a huge amount of security implications, please put such buffers (if really needed) into the context.

> +    memset(buf, 0, 81);
> +    if ((ret = avio_read(pb, buf, 80)) != 80)
> +        return AVERROR_EOF;

Seems a bit overkill to memset all if only one is not being read.
Though mostly I wanted to comment that I still think it's really bad style to put the assignment into the if, it makes it quite a bit harder to read and we've had a few bugs because of it just to avoid one line of code...

> +    if (!strncmp(buf, "SIMPLE", 6) || !strncmp(buf, "XTENSION= 'IMAGE", 16)) {
> +        fits->image = 1;
> +    } else {
> +        fits->image = 0;
> +    }

Could be simplified to fits->image = !strncmp...

> +    header_size = ceil(header_size/2880.0)*2880;

Please avoid floating point. It rarely ends well to use it, especially since its range is smaller than the range of the int64 type you operate on.
It's the same as doing ((header_size + 2879)/2880)*2880, though there might be nicer-looking ways.

> +    if (header_size < 0)
> +        return AVERROR_INVALIDDATA;

As you said, this can only happen in case of overflow.
But if it overflowed (though I would claim this is not really possible), you already invoked undefined behaviour. Checking after undefined behaviour happened is like putting a bandage on the broken hand of a beheaded person...
Not to mention that it doesn't even catch the loss of precision in the floating-point operation above.

> +            data_size *= naxisn[i];
> +            if (data_size < 0)
> +                return AVERROR_INVALIDDATA;

If this is supposed to be overlow check as well: same issue as before: you need to PREVENT the overflow, afterwards it's too late, at least for signed types.
Also the golden rule of input sanitization: sanitize/range check FIRST, THEN do the arithmetic.
NEVER do it the other way run without at least spending 30 minutes per operation making sure it's really still safe.

> 
> +    fits->image = 0;
> +    pos = avio_tell(s->pb);
> +    while (!fits->image) {
> +        if ((ret = avio_seek(s->pb, pos+size, SEEK_SET)) < 0)
> +            return ret;
> +
> +        if (avio_feof(s->pb))
> +            return AVERROR_EOF;
> +
> +        pos = avio_tell(s->pb);
> +        if ((size = find_size(s->pb, fits)) < 0)
> +            return size;
> +    }
> +
> +    if ((ret = avio_seek(s->pb, pos, SEEK_SET)) < 0)
> +        return ret;

Does this seek ever do anything?
Either way, I do not like all this seeking, especially not when they all use SEEK_SET, it makes it rather hard to see if this all works when the input is being streamed or not...

> +    ret = av_get_packet(s->pb, pkt, size);
> +    if (ret != size) {
> +        if (ret > 0) av_packet_unref(pkt);
> +        return AVERROR(EIO);
> +    }

I don't know what the current rules are, but back when I was more active the rule was to always extract as much data as possible.
So if you only get a partial packet, return that, and maybe the application can still do at least something with it.

Nicolas George July 3, 2017, 10:51 p.m. UTC | #4

Hi. Nice to see you back.

Le sextidi 16 messidor, an CCXXV, Reimar Döffinger a écrit :
> This is more than 4kB of data on the stack.
> Large stack arrays have a huge amount of security implications, please
> put such buffers (if really needed) into the context.

4 ko is not large, and neither is what is used here. We have a lot stack
allocations of that size and more and a few significantly larger.

And data that do not survive the function call do not belong in the
context.

Regards,

Reimar Döffinger July 4, 2017, 6:42 a.m. UTC | #5

On 04.07.2017, at 00:51, Nicolas George <george@nsup.org> wrote:

> Hi. Nice to see you back.
> 
> Le sextidi 16 messidor, an CCXXV, Reimar Döffinger a écrit :
>> This is more than 4kB of data on the stack.
>> Large stack arrays have a huge amount of security implications, please
>> put such buffers (if really needed) into the context.
> 
> 4 ko is not large, and neither is what is used here. We have a lot stack
> allocations of that size and more and a few significantly larger.

Ok, I won't try to change policy, but the guard pages (if even implemented) are 4kB and thus anything not significantly smaller increases security risks.
As does any type of array that presents an overflow risk.
Those may rather be kernel issues admittedly, but considering all OS kernels seem to have the same issues they should probably not be entirely ignored by application.

> And data that do not survive the function call do not belong in the
> context.

From a security standpoint, I believe any array and anything that is more than a handful bytes ideally should not be on the stack, if the added complexity is minimal.

wm4 July 4, 2017, 8:33 a.m. UTC | #6

On Tue, 4 Jul 2017 08:42:56 +0200
Reimar Döffinger <Reimar.Doeffinger@gmx.de> wrote:

> On 04.07.2017, at 00:51, Nicolas George <george@nsup.org> wrote:
> 
> > Hi. Nice to see you back.
> > 
> > Le sextidi 16 messidor, an CCXXV, Reimar Döffinger a écrit :  
> >> This is more than 4kB of data on the stack.
> >> Large stack arrays have a huge amount of security implications, please
> >> put such buffers (if really needed) into the context.  
> > 
> > 4 ko is not large, and neither is what is used here. We have a lot stack
> > allocations of that size and more and a few significantly larger.  
> 
> Ok, I won't try to change policy, but the guard pages (if even implemented) are 4kB and thus anything not significantly smaller increases security risks.
> As does any type of array that presents an overflow risk.
> Those may rather be kernel issues admittedly, but considering all OS kernels seem to have the same issues they should probably not be entirely ignored by application.

If you're interested in security hacks, you should probably use a
Microsoft compiler, which will touch at least every 4K of stack
allocation, to avoid skipping a guard page.

Nicolas George July 4, 2017, 9:21 a.m. UTC | #7

Le quartidi 14 messidor, an CCXXV, Paras Chadha a écrit :
> Filled buf with 0 to prevent overfow
> Also added checks for integer overflow
> 
> Signed-off-by: Paras Chadha <paraschadha18@gmail.com>
> ---
>  libavformat/Makefile     |   1 +
>  libavformat/allformats.c |   1 +
>  libavformat/fitsdec.c    | 224 +++++++++++++++++++++++++++++++++++++++++++++++
>  libavformat/version.h    |   2 +-
>  4 files changed, 227 insertions(+), 1 deletion(-)
>  create mode 100644 libavformat/fitsdec.c
> 
> diff --git a/libavformat/Makefile b/libavformat/Makefile
> index 80aeed2..1d43c05 100644
> --- a/libavformat/Makefile
> +++ b/libavformat/Makefile
> @@ -164,6 +164,7 @@ OBJS-$(CONFIG_FFMETADATA_MUXER)          += ffmetaenc.o
>  OBJS-$(CONFIG_FIFO_MUXER)                += fifo.o
>  OBJS-$(CONFIG_FILMSTRIP_DEMUXER)         += filmstripdec.o
>  OBJS-$(CONFIG_FILMSTRIP_MUXER)           += filmstripenc.o
> +OBJS-$(CONFIG_FITS_DEMUXER)              += fitsdec.o
>  OBJS-$(CONFIG_FLAC_DEMUXER)              += flacdec.o rawdec.o \
>                                              flac_picture.o   \
>                                              oggparsevorbis.o \
> diff --git a/libavformat/allformats.c b/libavformat/allformats.c
> index a0e2fb8..74a2e15 100644
> --- a/libavformat/allformats.c
> +++ b/libavformat/allformats.c
> @@ -121,6 +121,7 @@ static void register_all(void)
>      REGISTER_MUXDEMUX(FFMETADATA,       ffmetadata);
>      REGISTER_MUXER   (FIFO,             fifo);
>      REGISTER_MUXDEMUX(FILMSTRIP,        filmstrip);
> +    REGISTER_DEMUXER (FITS,             fits);
>      REGISTER_MUXDEMUX(FLAC,             flac);
>      REGISTER_DEMUXER (FLIC,             flic);
>      REGISTER_MUXDEMUX(FLV,              flv);
> diff --git a/libavformat/fitsdec.c b/libavformat/fitsdec.c
> new file mode 100644
> index 0000000..3670fbf
> --- /dev/null
> +++ b/libavformat/fitsdec.c
> @@ -0,0 +1,224 @@
> +/*
> + * FITS demuxer
> + * Copyright (c) 2017 Paras Chadha
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * FITS demuxer.
> + */
> +
> +#include "libavutil/intreadwrite.h"
> +#include "internal.h"
> +#include "libavutil/opt.h"
> +
> +typedef struct FITSContext {
> +    const AVClass *class;
> +    AVRational framerate;
> +    int image;
> +    int64_t pts;
> +} FITSContext;
> +

> +static int fits_probe(AVProbeData *p)
> +{
> +    const uint8_t *b = p->buf;
> +
> +    if (AV_RB32(b) == 0x53494d50 && AV_RB16(b+4) == 0x4c45)
> +        return AVPROBE_SCORE_MAX - 1;
> +    return 0;
> +}

This will match any text file starting with the word "simple" in
uppercase: I think it needs to be much more strict.

> +
> +static int fits_read_header(AVFormatContext *s)
> +{
> +    AVStream *st = avformat_new_stream(s, NULL);
> +    FITSContext * fits = s->priv_data;
> +
> +    if (!st)
> +        return AVERROR(ENOMEM);
> +
> +    st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
> +    st->codecpar->codec_id = AV_CODEC_ID_FITS;
> +
> +    avpriv_set_pts_info(st, 64, fits->framerate.den, fits->framerate.num);
> +    fits->pts = 0;
> +    return 0;
> +}
> +
> +static int64_t find_size(AVIOContext * pb, FITSContext * fits)
> +{
> +    int bitpix, naxis, dim_no, i, naxisn[999], groups=0;
> +    int64_t header_size = 0, data_size=0, ret, pcount=0, gcount=1, d;
> +    char buf[81], c;
> +

> +    memset(buf, 0, 81);

Since you read all 80 octets, you can just set the last one to 0.

Alternatively, "char buf[81] = { 0 }" should be equivalent.

> +    if ((ret = avio_read(pb, buf, 80)) != 80)
> +        return AVERROR_EOF;

Here and everywhere: please do not discard error codes: ret < 0 must be
returned as is, and only 0 <= ret < 80 should generate EOF.

> +    if (!strncmp(buf, "SIMPLE", 6) || !strncmp(buf, "XTENSION= 'IMAGE", 16)) {
> +        fits->image = 1;
> +    } else {
> +        fits->image = 0;
> +    }
> +    header_size += 80;
> +
> +    if ((ret = avio_read(pb, buf, 80)) != 80)
> +        return AVERROR_EOF;
> +    if (sscanf(buf, "BITPIX = %d", &bitpix) != 1)
> +        return AVERROR_INVALIDDATA;
> +    header_size += 80;
> +
> +    if ((ret = avio_read(pb, buf, 80)) != 80)
> +        return AVERROR_EOF;
> +    if (sscanf(buf, "NAXIS = %d", &naxis) != 1)
> +        return AVERROR_INVALIDDATA;
> +    header_size += 80;
> +
> +    for (i = 0; i < naxis; i++) {
> +        if ((ret = avio_read(pb, buf, 80)) != 80)
> +            return AVERROR_EOF;
> +        if (sscanf(buf, "NAXIS%d = %d", &dim_no, &naxisn[i]) != 2)
> +            return AVERROR_INVALIDDATA;
> +        if (dim_no != i+1)
> +            return AVERROR_INVALIDDATA;
> +        header_size += 80;
> +    }
> +
> +    if ((ret = avio_read(pb, buf, 80)) != 80)
> +        return AVERROR_EOF;
> +    header_size += 80;
> +
> +    while (strncmp(buf, "END", 3)) {
> +        if (sscanf(buf, "PCOUNT = %ld", &d) == 1) {
> +            pcount = d;
> +        } else if (sscanf(buf, "GCOUNT = %ld", &d) == 1) {
> +            gcount = d;
> +        } else if (sscanf(buf, "GROUPS = %c", &c) == 1) {
> +            if (c == 'T') {
> +                groups = 1;
> +            }
> +        }
> +
> +        if ((ret = avio_read(pb, buf, 80)) != 80)
> +            return AVERROR_EOF;
> +        header_size += 80;
> +    }
> +

> +    header_size = ceil(header_size/2880.0)*2880;

As said before, use integer arithmetic. And if nothing prevents it, make
header_size unsigned.

> +    if (header_size < 0)
> +        return AVERROR_INVALIDDATA;
> +
> +    if (groups) {
> +        fits->image = 0;
> +        if (naxis > 1)
> +            data_size = 1;
> +        for (i = 1; i < naxis; i++) {
> +            data_size *= naxisn[i];
> +            if (data_size < 0)
> +                return AVERROR_INVALIDDATA;
> +        }
> +    } else if (naxis) {
> +        data_size = 1;
> +        for (i = 0; i < naxis; i++) {
> +            data_size *= naxisn[i];
> +            if (data_size < 0)
> +                return AVERROR_INVALIDDATA;
> +        }
> +    } else {
> +        fits->image = 0;
> +    }
> +
> +    data_size += pcount;
> +    data_size *= (abs(bitpix) >> 3) * gcount;
> +
> +    if (data_size < 0)
> +        return AVERROR_INVALIDDATA;
> +
> +    if (!data_size) {
> +        fits->image = 0;
> +    } else {
> +        data_size = ceil(data_size/2880.0)*2880;
> +        if (data_size < 0)
> +            return AVERROR_INVALIDDATA;
> +    }
> +
> +    if(header_size + data_size < 0)
> +        return AVERROR_INVALIDDATA;
> +
> +    return header_size + data_size;
> +}
> +
> +static int fits_read_packet(AVFormatContext *s, AVPacket *pkt)
> +{
> +    int64_t size=0, pos, ret;
> +    FITSContext * fits = s->priv_data;
> +
> +    fits->image = 0;
> +    pos = avio_tell(s->pb);
> +    while (!fits->image) {

> +        if ((ret = avio_seek(s->pb, pos+size, SEEK_SET)) < 0)

Note that we have avio_skip() for that.

> +            return ret;
> +
> +        if (avio_feof(s->pb))
> +            return AVERROR_EOF;
> +

> +        pos = avio_tell(s->pb);

I think you could just "pos += size".

> +        if ((size = find_size(s->pb, fits)) < 0)
> +            return size;
> +    }
> +

> +    if ((ret = avio_seek(s->pb, pos, SEEK_SET)) < 0)
> +        return ret;

I really REALLY do not like this.

For one, it makes seeking mandatory for no good reason. But that is the
least of it.

The worst part is that the same chunk of data is both decoded by this
demuxer and read into the packet payload to be decoded by the decoder. A
very visible consequence of that is the function find_size() above that
shares a lot of logic with the read_header() function in the decoder,
with a different interface to access the data (avio instead of a full
buffer).

I was about to suggest you find a way of sharing that logic, but now I
realize it is only the tip of the problem.

Is there a standard audio/video file format that supports FITS video
streams? Matroska? MPEG-TS?

If the answer is yes, then we must be compatible, and that ends the
question. But I think the answer is no.

If the answer is no, then we, i.e. you, must decide what is considered
part of the format and must be handled in the demuxer, and what is
considered part of the packet payload and must be handled by the
decoder.

This is quite easy, but it must be done properly, because other projects
may imitate us.

What I suggest is this: take the dump of a typical FITS file with three
images and annotate it with "this is the global header", "this is the
header of the first packet", "this is the payload of the first packet",
"this is the header of the second packet", etc. Then post it here for
review.

Note that if there is no global header, then this demuxer should
probably be just a parser for the image2pipe demuxer.

> +
> +    ret = av_get_packet(s->pb, pkt, size);

> +    if (ret != size) {
> +        if (ret > 0) av_packet_unref(pkt);
> +        return AVERROR(EIO);
> +    }

I think the correct handling would be a packet with AV_PKT_FLAG_CORRUPT.

> +
> +    pkt->stream_index = 0;
> +    pkt->flags |= AV_PKT_FLAG_KEY;
> +    pkt->pos = pos;
> +    pkt->size = size;
> +    pkt->pts = fits->pts;
> +    fits->pts++;
> +
> +    return size;
> +}
> +
> +static const AVOption fits_options[] = {
> +    { "framerate", "set the framerate", offsetof(FITSContext, framerate), AV_OPT_TYPE_VIDEO_RATE, {.str = "1"}, 0, INT_MAX, AV_OPT_FLAG_DECODING_PARAM},
> +    { NULL },
> +};
> +
> +static const AVClass fits_demuxer_class = {
> +    .class_name = "FITS demuxer",
> +    .item_name  = av_default_item_name,
> +    .option     = fits_options,
> +    .version    = LIBAVUTIL_VERSION_INT,
> +};
> +
> +AVInputFormat ff_fits_demuxer = {
> +    .name           = "fits",
> +    .long_name      = NULL_IF_CONFIG_SMALL("Flexible Image Transport System"),
> +    .priv_data_size = sizeof(FITSContext),
> +    .read_probe     = fits_probe,
> +    .read_header    = fits_read_header,
> +    .read_packet    = fits_read_packet,
> +    .priv_class     = &fits_demuxer_class,
> +    .raw_codec_id   = AV_CODEC_ID_FITS,
> +};
> diff --git a/libavformat/version.h b/libavformat/version.h
> index 44c16ac..48b81f2 100644
> --- a/libavformat/version.h
> +++ b/libavformat/version.h
> @@ -32,7 +32,7 @@
>  // Major bumping may affect Ticket5467, 5421, 5451(compatibility with Chromium)
>  // Also please add any ticket numbers that you believe might be affected here
>  #define LIBAVFORMAT_VERSION_MAJOR  57
> -#define LIBAVFORMAT_VERSION_MINOR  75
> +#define LIBAVFORMAT_VERSION_MINOR  76
>  #define LIBAVFORMAT_VERSION_MICRO 100
> 
>  #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \

Regards,

Nicolas George July 4, 2017, 9:34 a.m. UTC | #8

Le sextidi 16 messidor, an CCXXV, Reimar Döffinger a écrit :
> From a security standpoint, I believe any array and anything that is
> more than a handful bytes ideally should not be on the stack, if the
> added complexity is minimal.

If you change that into "a handful of kilo-octets", then for a project
like FFmpeg (which is not a monster like a Gui toolkit but neither meant
for embedded systems with tiny limits) I agree.

But "a handful bytes", I consider the added security to be the same
level as stopping people at the entrances of a mall to have a passing
glance at their handbag: pure theater. The wasted time could be more
efficiently be used to other security-related tasks. Reimplementing
FFmpeg in Rust for example.

Regards,

Nicolas George July 4, 2017, 9:43 a.m. UTC | #9

Le sextidi 16 messidor, an CCXXV, Nicolas George a écrit :
> If you change that into "a handful of kilo-octets", then for a project
> like FFmpeg (which is not a monster like a Gui toolkit but neither meant
> for embedded systems with tiny limits) I agree.
> 
> But "a handful bytes", I consider the added security to be the same
> level as stopping people at the entrances of a mall to have a passing
> glance at their handbag: pure theater. The wasted time could be more
> efficiently be used to other security-related tasks. Reimplementing
> FFmpeg in Rust for example.

Forgot to add: since this is really an OS and compiler problem and "you
don't have to run faster than the bear to get away, you just have to run
faster than the guy next to you", until all other programs that may have
to deal with untrusted data is as careful as FFmpeg, enhancing the
security here would have little effect anyway. We can sleep on both
ears for the time being.

Regards,

Paras July 4, 2017, 4:20 p.m. UTC | #10

On Tue, Jul 4, 2017 at 4:12 AM, Reimar Döffinger <Reimar.Doeffinger@gmx.de>
wrote:

>
> > +static int64_t find_size(AVIOContext * pb, FITSContext * fits)
> > +{
> > +    int bitpix, naxis, dim_no, i, naxisn[999], groups=0;
> > +    int64_t header_size = 0, data_size=0, ret, pcount=0, gcount=1, d;
> > +    char buf[81], c;
>
> This is more than 4kB of data on the stack.
> Large stack arrays have a huge amount of security implications, please put
> such buffers (if really needed) into the context.
>
> > +    memset(buf, 0, 81);
> > +    if ((ret = avio_read(pb, buf, 80)) != 80)
> > +        return AVERROR_EOF;
>
> Seems a bit overkill to memset all if only one is not being read.
> Though mostly I wanted to comment that I still think it's really bad style
> to put the assignment into the if, it makes it quite a bit harder to read
> and we've had a few bugs because of it just to avoid one line of code...
>

Ok, i will move this assignment outside.


>
> > +    if (!strncmp(buf, "SIMPLE", 6) || !strncmp(buf, "XTENSION= 'IMAGE",
> 16)) {
> > +        fits->image = 1;
> > +    } else {
> > +        fits->image = 0;
> > +    }
>
> Could be simplified to fits->image = !strncmp...
>

Yes, will do this.


>
> > +    header_size = ceil(header_size/2880.0)*2880;
>
> Please avoid floating point. It rarely ends well to use it, especially
> since its range is smaller than the range of the int64 type you operate on.
> It's the same as doing ((header_size + 2879)/2880)*2880, though there
> might be nicer-looking ways.
>

ok, i will make this change.


>
> > +    if (header_size < 0)
> > +        return AVERROR_INVALIDDATA;
>
> As you said, this can only happen in case of overflow.
> But if it overflowed (though I would claim this is not really possible),
> you already invoked undefined behaviour. Checking after undefined behaviour
> happened is like putting a bandage on the broken hand of a beheaded
> person...
> Not to mention that it doesn't even catch the loss of precision in the
> floating-point operation above.
>

okay, i know it is really not possible. Just to be on safe side, i added
this check. Should i remove it ?


>
> > +            data_size *= naxisn[i];
> > +            if (data_size < 0)
> > +                return AVERROR_INVALIDDATA;
>
> If this is supposed to be overlow check as well: same issue as before: you
> need to PREVENT the overflow, afterwards it's too late, at least for signed
> types.
> Also the golden rule of input sanitization: sanitize/range check FIRST,
> THEN do the arithmetic.
> NEVER do it the other way run without at least spending 30 minutes per
> operation making sure it's really still safe.
>

Actually, here also overflow is not possible unless someone intentionally
put values of naxisn[i] that would cause overflow. Most of the FITS files
are in MB's or at max in GB's (although i have not seen any in GB). So, if
someone intentionally put wrong values of size in header then it would
crash or cause undefined behavior. Just to prevent that i have added a
check. So, can suggest a better way to do this ? I need the size of data to
allocate a packet and send it to decoder. Regarding range check, FITS
standard does not have any limit on the range of values of dimension,
theoretically it can be anything but practically ofcourse it cannot exceed
the limits of 64 bit integer. So, how should i go about this ?


>
> >
> > +    fits->image = 0;
> > +    pos = avio_tell(s->pb);
> > +    while (!fits->image) {
> > +        if ((ret = avio_seek(s->pb, pos+size, SEEK_SET)) < 0)
> > +            return ret;
> > +
> > +        if (avio_feof(s->pb))
> > +            return AVERROR_EOF;
> > +
> > +        pos = avio_tell(s->pb);
> > +        if ((size = find_size(s->pb, fits)) < 0)
> > +            return size;
> > +    }
> > +
> > +    if ((ret = avio_seek(s->pb, pos, SEEK_SET)) < 0)
> > +        return ret;
>
> Does this seek ever do anything?
>

Yes, it seeks back the pointer to point to the begining of the header.


> Either way, I do not like all this seeking, especially not when they all
> use SEEK_SET, it makes it rather hard to see if this all works when the
> input is being streamed or not...


> > +    ret = av_get_packet(s->pb, pkt, size);
> > +    if (ret != size) {
> > +        if (ret > 0) av_packet_unref(pkt);
> > +        return AVERROR(EIO);
> > +    }
>
> I don't know what the current rules are, but back when I was more active
> the rule was to always extract as much data as possible.
> So if you only get a partial packet, return that, and maybe the
> application can still do at least something with it.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Paras July 4, 2017, 4:20 p.m. UTC | #11

On Tue, Jul 4, 2017 at 2:51 PM, Nicolas George <george@nsup.org> wrote:

> Le quartidi 14 messidor, an CCXXV, Paras Chadha a écrit :
> > Filled buf with 0 to prevent overfow
> > Also added checks for integer overflow
> >
> > Signed-off-by: Paras Chadha <paraschadha18@gmail.com>
> > ---
> >  libavformat/Makefile     |   1 +
> >  libavformat/allformats.c |   1 +
> >  libavformat/fitsdec.c    | 224 ++++++++++++++++++++++++++++++
> +++++++++++++++++
> >  libavformat/version.h    |   2 +-
> >  4 files changed, 227 insertions(+), 1 deletion(-)
> >  create mode 100644 libavformat/fitsdec.c
> >
> > diff --git a/libavformat/Makefile b/libavformat/Makefile
> > index 80aeed2..1d43c05 100644
> > --- a/libavformat/Makefile
> > +++ b/libavformat/Makefile
> > @@ -164,6 +164,7 @@ OBJS-$(CONFIG_FFMETADATA_MUXER)          +=
> ffmetaenc.o
> >  OBJS-$(CONFIG_FIFO_MUXER)                += fifo.o
> >  OBJS-$(CONFIG_FILMSTRIP_DEMUXER)         += filmstripdec.o
> >  OBJS-$(CONFIG_FILMSTRIP_MUXER)           += filmstripenc.o
> > +OBJS-$(CONFIG_FITS_DEMUXER)              += fitsdec.o
> >  OBJS-$(CONFIG_FLAC_DEMUXER)              += flacdec.o rawdec.o \
> >                                              flac_picture.o   \
> >                                              oggparsevorbis.o \
> > diff --git a/libavformat/allformats.c b/libavformat/allformats.c
> > index a0e2fb8..74a2e15 100644
> > --- a/libavformat/allformats.c
> > +++ b/libavformat/allformats.c
> > @@ -121,6 +121,7 @@ static void register_all(void)
> >      REGISTER_MUXDEMUX(FFMETADATA,       ffmetadata);
> >      REGISTER_MUXER   (FIFO,             fifo);
> >      REGISTER_MUXDEMUX(FILMSTRIP,        filmstrip);
> > +    REGISTER_DEMUXER (FITS,             fits);
> >      REGISTER_MUXDEMUX(FLAC,             flac);
> >      REGISTER_DEMUXER (FLIC,             flic);
> >      REGISTER_MUXDEMUX(FLV,              flv);
> > diff --git a/libavformat/fitsdec.c b/libavformat/fitsdec.c
> > new file mode 100644
> > index 0000000..3670fbf
> > --- /dev/null
> > +++ b/libavformat/fitsdec.c
> > @@ -0,0 +1,224 @@
> > +/*
> > + * FITS demuxer
> > + * Copyright (c) 2017 Paras Chadha
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +/**
> > + * @file
> > + * FITS demuxer.
> > + */
> > +
> > +#include "libavutil/intreadwrite.h"
> > +#include "internal.h"
> > +#include "libavutil/opt.h"
> > +
> > +typedef struct FITSContext {
> > +    const AVClass *class;
> > +    AVRational framerate;
> > +    int image;
> > +    int64_t pts;
> > +} FITSContext;
> > +
>
> > +static int fits_probe(AVProbeData *p)
> > +{
> > +    const uint8_t *b = p->buf;
> > +
> > +    if (AV_RB32(b) == 0x53494d50 && AV_RB16(b+4) == 0x4c45)
> > +        return AVPROBE_SCORE_MAX - 1;
> > +    return 0;
> > +}
>
> This will match any text file starting with the word "simple" in
> uppercase: I think it needs to be much more strict.
>

Okay, i will replace it with a strict signature which consists of SIMPLE in
1-6 bytes, spaces in 7-8 bytes, = followed by space in 9-10 bytes and
spaces followed by a single character T/F in byte 30. This would be
sufficient i guess.


>
> > +
> > +static int fits_read_header(AVFormatContext *s)
> > +{
> > +    AVStream *st = avformat_new_stream(s, NULL);
> > +    FITSContext * fits = s->priv_data;
> > +
> > +    if (!st)
> > +        return AVERROR(ENOMEM);
> > +
> > +    st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
> > +    st->codecpar->codec_id = AV_CODEC_ID_FITS;
> > +
> > +    avpriv_set_pts_info(st, 64, fits->framerate.den,
> fits->framerate.num);
> > +    fits->pts = 0;
> > +    return 0;
> > +}
> > +
> > +static int64_t find_size(AVIOContext * pb, FITSContext * fits)
> > +{
> > +    int bitpix, naxis, dim_no, i, naxisn[999], groups=0;
> > +    int64_t header_size = 0, data_size=0, ret, pcount=0, gcount=1, d;
> > +    char buf[81], c;
> > +
>
> > +    memset(buf, 0, 81);
>
> Since you read all 80 octets, you can just set the last one to 0.
>
> Alternatively, "char buf[81] = { 0 }" should be equivalent.
>

Okay, will do this.


>
> > +    if ((ret = avio_read(pb, buf, 80)) != 80)
> > +        return AVERROR_EOF;
>
> Here and everywhere: please do not discard error codes: ret < 0 must be
> returned as is, and only 0 <= ret < 80 should generate EOF.
>

okay


>
> > +    if (!strncmp(buf, "SIMPLE", 6) || !strncmp(buf, "XTENSION= 'IMAGE",
> 16)) {
> > +        fits->image = 1;
> > +    } else {
> > +        fits->image = 0;
> > +    }
> > +    header_size += 80;
> > +
> > +    if ((ret = avio_read(pb, buf, 80)) != 80)
> > +        return AVERROR_EOF;
> > +    if (sscanf(buf, "BITPIX = %d", &bitpix) != 1)
> > +        return AVERROR_INVALIDDATA;
> > +    header_size += 80;
> > +
> > +    if ((ret = avio_read(pb, buf, 80)) != 80)
> > +        return AVERROR_EOF;
> > +    if (sscanf(buf, "NAXIS = %d", &naxis) != 1)
> > +        return AVERROR_INVALIDDATA;
> > +    header_size += 80;
> > +
> > +    for (i = 0; i < naxis; i++) {
> > +        if ((ret = avio_read(pb, buf, 80)) != 80)
> > +            return AVERROR_EOF;
> > +        if (sscanf(buf, "NAXIS%d = %d", &dim_no, &naxisn[i]) != 2)
> > +            return AVERROR_INVALIDDATA;
> > +        if (dim_no != i+1)
> > +            return AVERROR_INVALIDDATA;
> > +        header_size += 80;
> > +    }
> > +
> > +    if ((ret = avio_read(pb, buf, 80)) != 80)
> > +        return AVERROR_EOF;
> > +    header_size += 80;
> > +
> > +    while (strncmp(buf, "END", 3)) {
> > +        if (sscanf(buf, "PCOUNT = %ld", &d) == 1) {
> > +            pcount = d;
> > +        } else if (sscanf(buf, "GCOUNT = %ld", &d) == 1) {
> > +            gcount = d;
> > +        } else if (sscanf(buf, "GROUPS = %c", &c) == 1) {
> > +            if (c == 'T') {
> > +                groups = 1;
> > +            }
> > +        }
> > +
> > +        if ((ret = avio_read(pb, buf, 80)) != 80)
> > +            return AVERROR_EOF;
> > +        header_size += 80;
> > +    }
> > +
>
> > +    header_size = ceil(header_size/2880.0)*2880;
>
> As said before, use integer arithmetic. And if nothing prevents it, make
> header_size unsigned.
>

Yes, will do it.


>
> > +    if (header_size < 0)
> > +        return AVERROR_INVALIDDATA;
> > +
> > +    if (groups) {
> > +        fits->image = 0;
> > +        if (naxis > 1)
> > +            data_size = 1;
> > +        for (i = 1; i < naxis; i++) {
> > +            data_size *= naxisn[i];
> > +            if (data_size < 0)
> > +                return AVERROR_INVALIDDATA;
> > +        }
> > +    } else if (naxis) {
> > +        data_size = 1;
> > +        for (i = 0; i < naxis; i++) {
> > +            data_size *= naxisn[i];
> > +            if (data_size < 0)
> > +                return AVERROR_INVALIDDATA;
> > +        }
> > +    } else {
> > +        fits->image = 0;
> > +    }
> > +
> > +    data_size += pcount;
> > +    data_size *= (abs(bitpix) >> 3) * gcount;
> > +
> > +    if (data_size < 0)
> > +        return AVERROR_INVALIDDATA;
> > +
> > +    if (!data_size) {
> > +        fits->image = 0;
> > +    } else {
> > +        data_size = ceil(data_size/2880.0)*2880;
> > +        if (data_size < 0)
> > +            return AVERROR_INVALIDDATA;
> > +    }
> > +
> > +    if(header_size + data_size < 0)
> > +        return AVERROR_INVALIDDATA;
> > +
> > +    return header_size + data_size;
> > +}
> > +
> > +static int fits_read_packet(AVFormatContext *s, AVPacket *pkt)
> > +{
> > +    int64_t size=0, pos, ret;
> > +    FITSContext * fits = s->priv_data;
> > +
> > +    fits->image = 0;
> > +    pos = avio_tell(s->pb);
> > +    while (!fits->image) {
>
> > +        if ((ret = avio_seek(s->pb, pos+size, SEEK_SET)) < 0)
>
> Note that we have avio_skip() for that.
>

okay


>
> > +            return ret;
> > +
> > +        if (avio_feof(s->pb))
> > +            return AVERROR_EOF;
> > +
>
> > +        pos = avio_tell(s->pb);
>
> I think you could just "pos += size".
>

okay


>
> > +        if ((size = find_size(s->pb, fits)) < 0)
> > +            return size;
> > +    }
> > +
>
> > +    if ((ret = avio_seek(s->pb, pos, SEEK_SET)) < 0)
> > +        return ret;
>
> I really REALLY do not like this.
>
> For one, it makes seeking mandatory for no good reason. But that is the
> least of it.
>
> The worst part is that the same chunk of data is both decoded by this
> demuxer and read into the packet payload to be decoded by the decoder. A
> very visible consequence of that is the function find_size() above that
> shares a lot of logic with the read_header() function in the decoder,
> with a different interface to access the data (avio instead of a full
> buffer).
>
> I was about to suggest you find a way of sharing that logic, but now I
> realize it is only the tip of the problem.
>
> Is there a standard audio/video file format that supports FITS video
> streams? Matroska? MPEG-TS?
>
> If the answer is yes, then we must be compatible, and that ends the
> question. But I think the answer is no.
>

yes, the answer is no.


>
> If the answer is no, then we, i.e. you, must decide what is considered
> part of the format and must be handled in the demuxer, and what is
> considered part of the packet payload and must be handled by the
> decoder.
>
> This is quite easy, but it must be done properly, because other projects
> may imitate us.
>
> What I suggest is this: take the dump of a typical FITS file with three
> images and annotate it with "this is the global header", "this is the
> header of the first packet", "this is the payload of the first packet",
> "this is the header of the second packet", etc. Then post it here for
> review.
>
> Note that if there is no global header, then this demuxer should
> probably be just a parser for the image2pipe demuxer.
>

There is no global header.

Basically FITS files can have multiple images. To support them, i had added
a demuxer. Each image consists of a header part and a data part. The header
part of image begins with a SIMPLE keyword or XTENSION = IMAGE keyword,
followed by some mandatory keywords, followed by some optional keywords and
ends with the END keyword. The size of header must be a multiple of 2880
bytes. If it is not, then it is padded with spaces at the end to make it a
multiple of 2880. After this comes the data part. It's size is determined
by the NAXISN fields of the header part. Each NAXISN corresponds to one
dimension of the data and there can be maximum 999 dimensions. So,
basically the product of all NAXISN keywords give the data size (there are
other keywords too which may affect this like PCOUNT, GCOUNT, GROUPS
keyword ). Similar to the header part, the size of data part must also be a
multiple of 2880. It is padded with 0 to make it multiple of 2880.

So, what i am doing here is I am calculating the header size (which is no
of bytes read til the END keyword, including END) and data size (calculated
using information from header) in the find_size function. If the data
contains image (i.e. if SIMPLE or IMAGE XTENSION is present) then i am
allocating a packet of appropriate size and sending it to decoder.
Otherwise i am simply skipping over the data to read the next header (if
any).

Now, i know i am reading the same header twice once in demuxer and then in
decoder. But i guess, there is no other way around. In order to calculate
the size of header i must read the header till the END keyword. In order to
calculate the data size, i must read the NAXIS keywords in the header.

Regarding separation of separation of data that must be handled by demuxer
and decoder, FITS is a generat transport format too. Besides images, it
also supports BInary Tables, ASCII tables which are used to store the data.
Along with it there is something known as Random groups structure. In order
to find the size of their data portions, i must read some other keywords
like GROUP, PCOUNT, GCOUNT etc. So, i must also read them to skip the data
portions of these structures (as there my be an image after them). Besides
this, in decoder i have to read some other keywords like DATAMIN, DATAMAX,
BSCALE, BZERO etc. These are used to manipulate the image to show them
correctly and these keywords can occur in any order within the header
before the END keyword.

One way, i initially thought to solve this problem was to read the header
in the demuxer only, extract all the required keywords and their values and
then send the information to the decoder for processing along with the data
portions. But i could not find a way to do this.

So, now the whole problem is clear. Please suggest a way (if there is any)
to read the header only once and avoid the use of avio_seek.


>
> > +
> > +    ret = av_get_packet(s->pb, pkt, size);
>
> > +    if (ret != size) {
> > +        if (ret > 0) av_packet_unref(pkt);
> > +        return AVERROR(EIO);
> > +    }
>
> I think the correct handling would be a packet with AV_PKT_FLAG_CORRUPT.
>

So, i should send the corrupted packet to the decoder ? But i have added
checks in the decoder that if the packet contains any missing mandatory
keyword it will return AVERROR_INVALIDDATA. If the data portion is also
incomplete, then also it will return AVERROR_INVALIDDATA. So, should i
really send a partial packet to the decoder ?


>
> > +
> > +    pkt->stream_index = 0;
> > +    pkt->flags |= AV_PKT_FLAG_KEY;
> > +    pkt->pos = pos;
> > +    pkt->size = size;
> > +    pkt->pts = fits->pts;
> > +    fits->pts++;
> > +
> > +    return size;
> > +}
> > +
> > +static const AVOption fits_options[] = {
> > +    { "framerate", "set the framerate", offsetof(FITSContext,
> framerate), AV_OPT_TYPE_VIDEO_RATE, {.str = "1"}, 0, INT_MAX,
> AV_OPT_FLAG_DECODING_PARAM},
> > +    { NULL },
> > +};
> > +
> > +static const AVClass fits_demuxer_class = {
> > +    .class_name = "FITS demuxer",
> > +    .item_name  = av_default_item_name,
> > +    .option     = fits_options,
> > +    .version    = LIBAVUTIL_VERSION_INT,
> > +};
> > +
> > +AVInputFormat ff_fits_demuxer = {
> > +    .name           = "fits",
> > +    .long_name      = NULL_IF_CONFIG_SMALL("Flexible Image Transport
> System"),
> > +    .priv_data_size = sizeof(FITSContext),
> > +    .read_probe     = fits_probe,
> > +    .read_header    = fits_read_header,
> > +    .read_packet    = fits_read_packet,
> > +    .priv_class     = &fits_demuxer_class,
> > +    .raw_codec_id   = AV_CODEC_ID_FITS,
> > +};
> > diff --git a/libavformat/version.h b/libavformat/version.h
> > index 44c16ac..48b81f2 100644
> > --- a/libavformat/version.h
> > +++ b/libavformat/version.h
> > @@ -32,7 +32,7 @@
> >  // Major bumping may affect Ticket5467, 5421, 5451(compatibility with
> Chromium)
> >  // Also please add any ticket numbers that you believe might be
> affected here
> >  #define LIBAVFORMAT_VERSION_MAJOR  57
> > -#define LIBAVFORMAT_VERSION_MINOR  75
> > +#define LIBAVFORMAT_VERSION_MINOR  76
> >  #define LIBAVFORMAT_VERSION_MICRO 100
> >
> >  #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR,
> \
>
> Regards,
>
> --
>   Nicolas George

Nicolas George July 4, 2017, 4:50 p.m. UTC | #12

Le sextidi 16 messidor, an CCXXV, Paras Chadha a écrit :
> There is no global header.
> 
> Basically FITS files can have multiple images.
<snip>

Thanks for all the details. When there are several images, they are all
one after the other?

If so, then I really think you should stop the demuxer and integrate
FITS into the image2 system. See libavformat/img2dec.c: you only need a
parser and a probe function, which is basically a subset of what you
have written already.

One of the benefits is that the parser is in libavcodec, so it is much
easier to have it use directly the function fits_read_header() from the
decoder.

Regards,

Paras July 4, 2017, 5:11 p.m. UTC | #13

On Tue, Jul 4, 2017 at 10:20 PM, Nicolas George <george@nsup.org> wrote:

> Le sextidi 16 messidor, an CCXXV, Paras Chadha a écrit :
> > There is no global header.
> >
> > Basically FITS files can have multiple images.
> <snip>
>
> Thanks for all the details. When there are several images, they are all
> one after the other?
>

No, it is not mandatory that they are after one another. There can be other
extensions between image extensions too.


>
> If so, then I really think you should stop the demuxer and integrate
> FITS into the image2 system. See libavformat/img2dec.c: you only need a
> parser and a probe function, which is basically a subset of what you
> have written already.
>
> One of the benefits is that the parser is in libavcodec, so it is much
> easier to have it use directly the function fits_read_header() from the
> decoder.


So, now should i do this ? Even if i am able to call read_header() from
there, i will have to modify read_header() a lot and it will become very
complex and long.


>
> Regards,
>
> --
>   Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>

Nicolas George July 8, 2017, 10:59 a.m. UTC | #14

Le sextidi 16 messidor, an CCXXV, Paras Chadha a écrit :
> So, now should i do this ?

Based on what you explained here, FITS seems like just an image format,
without provisions for animations like GIF or PNG. Therefore, it should
have been integrated in the img2 framework in the first place and
writing a dedicated demuxer was a mistake.

>			     Even if i am able to call read_header() from
> there, i will have to modify read_header() a lot and it will become very
> complex and long.

I do not see why you would need to modify fits_read_header() a lot:
fits_read_header() populates the FITSContext structure, and the parser
uses the fields to compute the size. Done.

You may need to make minimal changes to fits_read_header() to prevent it
from doing extra works, for example building metadata that will not be
used, but that is very minor.

Regards,

Reimar Döffinger July 8, 2017, 8:47 p.m. UTC | #15

On Sat, Jul 08, 2017 at 12:59:09PM +0200, Nicolas George wrote:
> Le sextidi 16 messidor, an CCXXV, Paras Chadha a écrit :
> > So, now should i do this ?
> 
> Based on what you explained here, FITS seems like just an image format,
> without provisions for animations like GIF or PNG. Therefore, it should
> have been integrated in the img2 framework in the first place and
> writing a dedicated demuxer was a mistake.

I don't think that's a correct description then.
First, the format is made to change and be extended, so what is
true now might not stay true.
But also the images can have arbitrary dimensions, in particular
they can be "3D" images with the third dimension being time,
thus being a video.
This may not be well enough specified for the demuxer to be
able to produce a proper "video stream" at this point,
but implementing it in the img2 framework to me seems to
have a significant risk of turning out a dead-end.

Reimar Döffinger July 8, 2017, 9:11 p.m. UTC | #16

On Tue, Jul 04, 2017 at 09:50:31PM +0530, Paras Chadha wrote:
> On Tue, Jul 4, 2017 at 4:12 AM, Reimar Döffinger <Reimar.Doeffinger@gmx.de>
> wrote:
> > > +            data_size *= naxisn[i];
> > > +            if (data_size < 0)
> > > +                return AVERROR_INVALIDDATA;
> >
> > If this is supposed to be overlow check as well: same issue as before: you
> > need to PREVENT the overflow, afterwards it's too late, at least for signed
> > types.
> > Also the golden rule of input sanitization: sanitize/range check FIRST,
> > THEN do the arithmetic.
> > NEVER do it the other way run without at least spending 30 minutes per
> > operation making sure it's really still safe.
> >
> 
> Actually, here also overflow is not possible unless someone intentionally
> put values of naxisn[i] that would cause overflow. Most of the FITS files
> are in MB's or at max in GB's (although i have not seen any in GB). So, if
> someone intentionally put wrong values of size in header then it would
> crash or cause undefined behavior.

crash == usually exploitable
undefined behaviour == possibly exploitable with risk of becoming
exploitable at any point of time with new compiler versions etc.

Both are completely unacceptable.

> Just to prevent that i have added a
> check.

The problem is, your check doesn't prevent it.
It SOMETIMES MIGHT detect it, AFTER it happened.
You can reduce the problem somewhat by using unsigned types,
then at least the overflow would not be undefined behaviour,
but then you have lots of weird corner-case that ALL the
following code must ALWAYS deal correctly with.
For example (not that you would have this specific code in FFmpeg):
ptr = malloc(data_size);
fread(ptr, 1, naxisn[0], somefile);
would be exploitable if you had an overflow.

> So, can suggest a better way to do this ?

As said, always check the inputs BEFORE use.
One way would be
if (naxisn[i] > INT_MAX / data_size) goto err_out;
However you still have to carefully decide on
which limit value (INT_MAX in my example) is the
right compromise between limiting use-cases and
making the code easy to write and maintain.
If you choose it too close, the following code
can't even safely do "data_size + 1" without
another overflow check.
av_image_check_size might be a guideline,
as well as the default value of max_alloc_size.

> Regarding range check, FITS
> standard does not have any limit on the range of values of dimension,
> theoretically it can be anything but practically ofcourse it cannot exceed
> the limits of 64 bit integer. So, how should i go about this ?

As you can see from max_alloc_size, anything
that results in more than 2 GB in a single
packet won't really work by default anyway.
(3D images COULD have their 3rd dimension split
up into separate packages for example, but the
might be something better to leave for later?).

> > > +    fits->image = 0;
> > > +    pos = avio_tell(s->pb);
> > > +    while (!fits->image) {
> > > +        if ((ret = avio_seek(s->pb, pos+size, SEEK_SET)) < 0)
> > > +            return ret;
> > > +
> > > +        if (avio_feof(s->pb))
> > > +            return AVERROR_EOF;
> > > +
> > > +        pos = avio_tell(s->pb);
> > > +        if ((size = find_size(s->pb, fits)) < 0)
> > > +            return size;
> > > +    }
> > > +
> > > +    if ((ret = avio_seek(s->pb, pos, SEEK_SET)) < 0)
> > > +        return ret;
> >
> > Does this seek ever do anything?
> >
> 
> Yes, it seeks back the pointer to point to the begining of the header.

Yes, I failed at reading code (missed that the find_size does
the whole header reading).
Maybe you should just be using av_append_packet while reading the
header? So that when you are done with parsing the header, you
already have all the header data in the packet and the just need
to use av_append_packet again to get the image data itself.
(av_append_packet is not perfect/the most efficient way of doing it,
but if the last bit of speed does not matter so much it tends
to be easy way to avoid seeking in demuxer without increasing complexity
much, sometimes even simplifying things).

Nicolas George July 11, 2017, 10:11 a.m. UTC | #17

Le decadi 20 messidor, an CCXXV, Reimar Döffinger a écrit :
> I don't think that's a correct description then.
> First, the format is made to change and be extended, so what is
> true now might not stay true.
> But also the images can have arbitrary dimensions, in particular
> they can be "3D" images with the third dimension being time,
> thus being a video.
> This may not be well enough specified for the demuxer to be
> able to produce a proper "video stream" at this point,
> but implementing it in the img2 framework to me seems to
> have a significant risk of turning out a dead-end.

Ok. To me, it looks identical to GIF: it is primarily an image format
but with non-essential animation features, it was first implemented as
part of img2 and only when somebody did actually implement the animation
features was it implemented as an actual demuxer. I think it was the
right approach.

Anyway, whatever solution is chosen to fix it, the big chunk of logic
duplication between the demuxer and the decoder as they are is IMO
unacceptable in new and non-essential code. Even more so in the scope of
an internship project, where the primary goal is not to produce code as
fast as possible but to teach good coding practices.

Therefore I urge again to post a dump of a typical three-images FITS
file and comment it to see which part belong in which data structure.

Regards,

Paras July 11, 2017, 2:26 p.m. UTC | #18

On Tue, Jul 11, 2017 at 3:41 PM, Nicolas George <george@nsup.org> wrote:

> Le decadi 20 messidor, an CCXXV, Reimar Döffinger a écrit :
> > I don't think that's a correct description then.
> > First, the format is made to change and be extended, so what is
> > true now might not stay true.
> > But also the images can have arbitrary dimensions, in particular
> > they can be "3D" images with the third dimension being time,
> > thus being a video.
> > This may not be well enough specified for the demuxer to be
> > able to produce a proper "video stream" at this point,
> > but implementing it in the img2 framework to me seems to
> > have a significant risk of turning out a dead-end.
>
> Ok. To me, it looks identical to GIF: it is primarily an image format
> but with non-essential animation features, it was first implemented as
> part of img2 and only when somebody did actually implement the animation
> features was it implemented as an actual demuxer. I think it was the
> right approach.
>
> Anyway, whatever solution is chosen to fix it, the big chunk of logic
> duplication between the demuxer and the decoder as they are is IMO
> unacceptable in new and non-essential code. Even more so in the scope of
> an internship project, where the primary goal is not to produce code as
> fast as possible but to teach good coding practices.
>
> Therefore I urge again to post a dump of a typical three-images FITS
> file and comment it to see which part belong in which data structure.
>
> Regards,
>
> --
>   Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>
Hi, I have attached the dump of a FITS File containing 5 images and 4
binary table xtensions. The dump is taken using fdump utility. Please take
a look.
This is the header of the first (primary) image.

SIMPLE  =                    T / FITS STANDARD
BITPIX  =                    8 / Character information
NAXIS   =                    0 / No image data array present
EXTEND  =                    T / There may be standard extensions
DATE    = '21/09/99'           / Date file was written (dd/mm/yy) 19yy
ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
OBJECT  = '        '           / Name of observed object
RA_OBJ  =             79.17265 / R.A. of the object (degrees)
DEC_OBJ =             45.99862 / Declination of the object (degrees)
RA_PNT  =     79.1724000000003 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =     987466469.376002 / Time of start of observation (seconds)
TSTOP   =     987977263.104001 / Time of end of observation (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
COMMENT     ' '
COMMENT     'This file is part of the EUVE Science Archive. It contains'
COMMENT     'images and filter limits for one EUVE observation.'
COMMENT     ' '
COMMENT     'The EUVE Science Archive contains the science data from'
COMMENT     'observations performed with the EUVE telescopes. It forms one'
COMMENT     'part of the EUVE Permanent Archive. The other part of the'
COMMENT     'permanent archive is the EUVE Telemetry Archive, which is a'
COMMENT     'complete record of the raw telemetry from the EUVE mission.'
COMMENT     ' '
COMMENT     'For documentation of the contents of the EUVE Science Archive,'
COMMENT     'see the "EUVE Science Archive User's Guide".  The contents of'
COMMENT     'the EUVE Telemetry Archive are described in the "EUVE'
COMMENT     'Telemetry Archive User's Guide".'
COMMENT     ' '
COMMENT     'The EUVE Permanent Archive was produced by the Center for EUV'
COMMENT     'Astrophysics, a division of UC Berkeley's Space Science'
COMMENT     Laboratory.
COMMENT     ' '
END

This will be followed by the data part whose size (in bits) is given by - |BITPIX| * GCOUNT * (PCOUNT + NAXIS1 * NAXIS2 * ... * NAXISm) = 0 in this case, so no data part is there

This is the header of second image.

XTENSION= 'IMAGE   '           / IMAGE extension
BITPIX  =                   16 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional image
NAXIS1  =                 2048 / Number of image columns
NAXIS2  =                 2048 / Number of image rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
EXTNAME = 'ds      '           / Name of image
BSCALE  =                1.0E0 / Scale factor for pixel values
BZERO   =                0.0E0 / Offset for pixel values
CTYPE1  = 'RA---TAN'           / RA in tangent plane projection
CRVAL1  =     79.1726455688478 / RA at reference pixel
CRPIX1  =               1024.5 / reference pixel, axis 1
CDELT1  =  0.00126953120343387 / scale, axis 1
CUNIT1  = 'deg     '           / units of CRVAL1, CDELT1
CTYPE2  = 'DEC--TAN'           / DEC in tangent plane projection
CRVAL2  =     45.9986228942872 / DEC at reference pixel
CRPIX2  =               1024.5 / reference pixel, axis 2
CDELT2  =  0.00126953120343387 / scale, axis 2
CUNIT2  = 'deg     '           / units of CRVAL2, CDELT2
EXPTIME =             64971.06 / Primbsch/deadtime corrected exposure time
RAWEXP  =             86784.98 / Uncorrected exposure time
FILENAME= 'ds      '
ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'DS      '           / Deep Survey
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =     79.1724000000002 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =     987466469.376001 / Time of start of observation (seconds)
TSTOP   =     987977263.104001 / Time of end of observation (seconds)
TIERRABS=                0.001 / Timing precision (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This extension contains an image of the Deep Survey detector'
COMMENT     'for this observation. All events have been remapped onto the'
COMMENT     'sky.  The filter limits used to select events for inclusion'
COMMENT     'in this image are in the binary table extension named'
COMMENT     '"ds_limits" in this file.'
COMMENT     ' '
END

Here also there is data part whose size is given by the same expression

Header of third image.

XTENSION= 'IMAGE   '           / IMAGE extension
BITPIX  =                   16 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional image
NAXIS1  =                 2048 / Number of image columns
NAXIS2  =                  300 / Number of image rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
EXTNAME = 'sw_night'           / Name of image
BSCALE  =                1.0E0 / Scale factor for pixel values
BZERO   =                0.0E0 / Offset for pixel values
CTYPE1  = 'WAVELENGTH'         / Wavelength
CRVAL1  =                 133. / Wavelength at reference pixel
CRPIX1  =               1024.5 / reference pixel, axis 1
CDELT1  =   0.0673828125000001 / scale, axis 1
CUNIT1  = 'angstrom'           / units of CRVAL1, CDELT1
CTYPE2  = 'ANGLE   '           / Imaging angle
CRVAL2  =                  0.0 / Imaging angle at reference pixel
CRPIX2  =                150.5 / reference pixel, axis 2
CDELT2  =  0.00121093750931323 / scale, axis 2
CUNIT2  = 'deg     '           / units of CRVAL2, CDELT2
EXPTIME =             83184.02 / Primbsch/deadtime corrected exposure time
RAWEXP  =             85043.08 / Uncorrected exposure time
FILENAME= 'sw_night'
ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'SW      '           / Short Wavelength Spectrometer
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =     79.1724000000002 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
DETSTRIP= '875 : 1174'         / Detector strip
VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =     987466469.376001 / Time of start of observation (seconds)
TSTOP   =     987977263.104001 / Time of end of observation (seconds)
TIERRABS=                0.001 / Timing precision (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This extension contains an image of the Short Wavelength'
COMMENT     'spectrometer for the nighttime portion of this observation.'
COMMENT     'All events have been dispersion corrected.  The filter limits'
COMMENT     'used to select events for inclusion in this image are in the'
COMMENT     'binary table extension named "sw_night_limits" in this file.'
COMMENT     ' '
COMMENT     'The coordinate system of this image is non-standard. Axis 1'
COMMENT     'is linear in wavelength. Axis 2 is linear in imaging angle.'
COMMENT     'Imaging angle is the angular distance from the source in a'
COMMENT     'direction perpendicular to the dispersion.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Header of fourth image.

XTENSION= 'IMAGE   '           / IMAGE extension
BITPIX  =                   16 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional image
NAXIS1  =                 2048 / Number of image columns
NAXIS2  =                  300 / Number of image rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
EXTNAME = 'mw      '           / Name of image
BSCALE  =                1.0E0 / Scale factor for pixel values
BZERO   =                0.0E0 / Offset for pixel values
CTYPE1  = 'WAVELENGTH'         / Wavelength
CRVAL1  =                 266. / Wavelength at reference pixel
CRPIX1  =               1024.5 / reference pixel, axis 1
CDELT1  =          0.134765625 / scale, axis 1
CUNIT1  = 'angstrom'           / units of CRVAL1, CDELT1
CTYPE2  = 'ANGLE   '           / Imaging angle
CRVAL2  =                  0.0 / Imaging angle at reference pixel
CRPIX2  =                150.5 / reference pixel, axis 2
CDELT2  =  0.00121093750931323 / scale, axis 2
CUNIT2  = 'deg     '           / units of CRVAL2, CDELT2
EXPTIME =              81476.3 / Primbsch/deadtime corrected exposure time
RAWEXP  =             86539.14 / Uncorrected exposure time
FILENAME= 'mw      '
ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'MW      '           / Medium Wavelength Spectrometer
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =     79.1724000000002 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
DETSTRIP= '875 : 1174'         / Detector strip
VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =     987466469.376001 / Time of start of observation (seconds)
TSTOP   =     987977263.104001 / Time of end of observation (seconds)
TIERRABS=                0.001 / Timing precision (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This extension contains an image of the Medium Wavelength'
COMMENT     'spectrometer for this observation. All events have been'
COMMENT     'dispersion corrected.  The filter limits used to select events'
COMMENT     'for inclusion in this image are in the binary table extension'
COMMENT     'named "mw_limits" in this file.'
COMMENT     ' '
COMMENT     'The coordinate system of this image is non-standard. Axis 1'
COMMENT     'is linear in wavelength. Axis 2 is linear in imaging angle.'
COMMENT     'Imaging angle is the angular distance from the source in a'
COMMENT     'direction perpendicular to the dispersion.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Header of fifth image.

XTENSION= 'IMAGE   '           / IMAGE extension
BITPIX  =                   16 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional image
NAXIS1  =                 2048 / Number of image columns
NAXIS2  =                  300 / Number of image rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
EXTNAME = 'lw      '           / Name of image
BSCALE  =                1.0E0 / Scale factor for pixel values
BZERO   =                0.0E0 / Offset for pixel values
CTYPE1  = 'WAVELENGTH'         / Wavelength
CRVAL1  =                 532. / Wavelength at reference pixel
CRPIX1  =               1024.5 / reference pixel, axis 1
CDELT1  =           0.26953125 / scale, axis 1
CUNIT1  = 'angstrom'           / units of CRVAL1, CDELT1
CTYPE2  = 'ANGLE   '           / Imaging angle
CRVAL2  =                  0.0 / Imaging angle at reference pixel
CRPIX2  =                150.5 / reference pixel, axis 2
CDELT2  =  0.00121093750931323 / scale, axis 2
CUNIT2  = 'deg     '           / units of CRVAL2, CDELT2
EXPTIME =               84906. / Primbsch/deadtime corrected exposure time
RAWEXP  =             91046.76 / Uncorrected exposure time
FILENAME= 'lw      '
ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'LW      '           / Long Wavelength Spectrometer
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =     79.1724000000002 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
DETSTRIP= '875 : 1174'         / Detector strip
VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =     987466469.376001 / Time of start of observation (seconds)
TSTOP   =     987977263.104001 / Time of end of observation (seconds)
TIERRABS=                0.001 / Timing precision (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This extension contains an image of the Long Wavelength'
COMMENT     'spectrometer for this observation. All events have been'
COMMENT     'dispersion corrected.  The filter limits used to select events'
COMMENT     'for inclusion in this image are in the binary table extension'
COMMENT     'named "lw_limits" in this file.'
COMMENT     ' '
COMMENT     'The coordinate system of this image is non-standard. Axis 1'
COMMENT     'is linear in wavelength. Axis 2 is linear in imaging angle.'
COMMENT     'Imaging angle is the angular distance from the source in a'
COMMENT     'direction perpendicular to the dispersion.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Header of first binary table extension.

XTENSION= 'BINTABLE'           / Binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                   16 / Width of table in bytes
NAXIS2  =                    3 / The number of rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
TFIELDS =                    3 / Number of fields per row
EXTNAME = 'ds_limits'          / Name of table

TTYPE1  = 'NAME    '           / Name of limited value
TFORM1  = '8A      '           /
TUNIT1  = '        '           /
TDISP1  = 'A       '           / %s

TTYPE2  = 'LOW     '           / Lower limit
TFORM2  = '1E      '           /
TUNIT2  = 'UNKNOWN '           /
TDISP2  = 'G15.7   '           / %15.7g

TTYPE3  = 'HIGH    '           / Upper limit
TFORM3  = '1E      '           /
TUNIT3  = 'UNKNOWN '           /
TDISP3  = 'G15.7   '           / %15.7g

ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'DS      '           / Deep Survey
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =    79.17240000000026 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
OBSERVER= '        '           / Original observing P.I. (EUVE = calibration)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
VALIDTIM=    132369.4058322902 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =    987466469.3760021 / Time of start of observation (seconds)
TSTOP   =    987977263.1040014 / Time of end of observation (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This table contains the limits used to filter the Deep Survey'
COMMENT     'image for this observation. The image is in the extension'
COMMENT     'named "ds" in this file.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Header of second binary table extension.

XTENSION= 'BINTABLE'           / Binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                   20 / Width of table in bytes
NAXIS2  =                    2 / The number of rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
TFIELDS =                    3 / Number of fields per row
EXTNAME = 'sw_night_limits'    / Name of table

TTYPE1  = 'NAME    '           / Name of limited value
TFORM1  = '12A     '           /
TUNIT1  = '        '           /
TDISP1  = 'A       '           / %s

TTYPE2  = 'LOW     '           / Lower limit
TFORM2  = '1E      '           /
TUNIT2  = 'UNKNOWN '           /
TDISP2  = 'G15.7   '           / %15.7g

TTYPE3  = 'HIGH    '           / Upper limit
TFORM3  = '1E      '           /
TUNIT3  = 'UNKNOWN '           /
TDISP3  = 'G15.7   '           / %15.7g

ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'SW      '           / Short Wavelength Spectrometer
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =    79.17240000000026 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
OBSERVER= '        '           / Original observing P.I. (EUVE = calibration)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
VALIDTIM=    132369.4058322902 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =    987466469.3760021 / Time of start of observation (seconds)
TSTOP   =    987977263.1040014 / Time of end of observation (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This table contains the limits used to filter the Short'
COMMENT     'Wavelength spectrometer nighttime image for this observation.'
COMMENT     'The image is in the extension named "sw_night" in this file.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Header of third binary table extension.

XTENSION= 'BINTABLE'           / Binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                   20 / Width of table in bytes
NAXIS2  =                    2 / The number of rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
TFIELDS =                    3 / Number of fields per row
EXTNAME = 'mw_limits'          / Name of table

TTYPE1  = 'NAME    '           / Name of limited value
TFORM1  = '12A     '           /
TUNIT1  = '        '           /
TDISP1  = 'A       '           / %s

TTYPE2  = 'LOW     '           / Lower limit
TFORM2  = '1E      '           /
TUNIT2  = 'UNKNOWN '           /
TDISP2  = 'G15.7   '           / %15.7g

TTYPE3  = 'HIGH    '           / Upper limit
TFORM3  = '1E      '           /
TUNIT3  = 'UNKNOWN '           /
TDISP3  = 'G15.7   '           / %15.7g

ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'MW      '           / Medium Wavelength Spectrometer
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =    79.17240000000026 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
OBSERVER= '        '           / Original observing P.I. (EUVE = calibration)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
VALIDTIM=    132369.4058322902 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =    987466469.3760021 / Time of start of observation (seconds)
TSTOP   =    987977263.1040014 / Time of end of observation (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This table contains the limits used to filter the Medium'
COMMENT     'Wavelength spectrometer image for this observation.'
COMMENT     'The image is in the extension named "mw" in this file.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Header of fourth binary table extension.

XTENSION= 'BINTABLE'           / Binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                   20 / Width of table in bytes
NAXIS2  =                    2 / The number of rows
PCOUNT  =                    0 / Size of special data area
GCOUNT  =                    1 / Only one group
TFIELDS =                    3 / Number of fields per row
EXTNAME = 'lw_limits'          / Name of table

TTYPE1  = 'NAME    '           / Name of limited value
TFORM1  = '12A     '           /
TUNIT1  = '        '           /
TDISP1  = 'A       '           / %s

TTYPE2  = 'LOW     '           / Lower limit
TFORM2  = '1E      '           /
TUNIT2  = 'UNKNOWN '           /
TDISP2  = 'G15.7   '           / %15.7g

TTYPE3  = 'HIGH    '           / Upper limit
TFORM3  = '1E      '           /
TUNIT3  = 'UNKNOWN '           /
TDISP3  = 'G15.7   '           / %15.7g

ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
DETNAM  = 'LW      '           / Long Wavelength Spectrometer
OBJECT  = '        '           / Name of observed object
RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
DEC_OBJ = '45.998624'          / Declination of the object (degrees)
RA_PNT  =    79.17240000000026 / R.A. of the pointing direction (degrees)
DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
RA_PROC =             79.17586 / R.A. used to process data (degrees)
DEC_PROC=             45.99578 / Declination used to process data (degrees)
OBSERVER= '        '           / Original observing P.I. (EUVE = calibration)
DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
OBS_MODE= 'POINTING'           / Inertial pointing mode
DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
OFF-AXIS=                    F / Was this pointing done off-axis
MOVING  =                    F / Did the source position vary during observation
DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
VALIDTIM=    132369.4058322902 / Amount of telemetry present (seconds)
RA_UNIT = 'deg     '           / Units for Right Ascension
DEC_UNIT= 'deg     '           / Units for Declination
EQUINOX =                2000. / Coordinate equinox
RADECSYS= 'FK5     '           / Frame of reference of coordinates
TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
TIMEZERO=                   0. / No time offset required for EUVE event times
TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
CLOCKCOR= 'NO      '           / Not corrected to UT
TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
TSTART  =    987466469.3760021 / Time of start of observation (seconds)
TSTOP   =    987977263.1040014 / Time of end of observation (seconds)
MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
INHERIT =                    F / Do not inherit cards from the primary header
COMMENT     ' '
COMMENT     'This table contains the limits used to filter the Long'
COMMENT     'Wavelength spectrometer image for this observation.'
COMMENT     'The image is in the extension named "lw" in this file.'
COMMENT     ' '
END

Data part whose size is given by the same expression

Paras July 13, 2017, 7:02 p.m. UTC | #19

On Tue, Jul 11, 2017 at 3:41 PM, Nicolas George <george@nsup.org> wrote:

> Le decadi 20 messidor, an CCXXV, Reimar Döffinger a écrit :
> > I don't think that's a correct description then.
> > First, the format is made to change and be extended, so what is
> > true now might not stay true.
> > But also the images can have arbitrary dimensions, in particular
> > they can be "3D" images with the third dimension being time,
> > thus being a video.
> > This may not be well enough specified for the demuxer to be
> > able to produce a proper "video stream" at this point,
> > but implementing it in the img2 framework to me seems to
> > have a significant risk of turning out a dead-end.
>
> Ok. To me, it looks identical to GIF: it is primarily an image format
> but with non-essential animation features, it was first implemented as
> part of img2 and only when somebody did actually implement the animation
> features was it implemented as an actual demuxer. I think it was the
> right approach.
>
> Anyway, whatever solution is chosen to fix it, the big chunk of logic
> duplication between the demuxer and the decoder as they are is IMO
> unacceptable in new and non-essential code. Even more so in the scope of
> an internship project, where the primary goal is not to produce code as
> fast as possible but to teach good coding practices.
>
> Therefore I urge again to post a dump of a typical three-images FITS
> file and comment it to see which part belong in which data structure.
>

Hi, i had attached a dump of FITS file in my previous reply. Please take a
look at it and suggest changes or should i make the one's suggested
previously to the demuxer ?


>
> Regards,
>
> --
>   Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>

Nicolas George July 14, 2017, 5:45 p.m. UTC | #20

Le tridi 23 messidor, an CCXXV, Paras Chadha a écrit :
> Hi, I have attached the dump of a FITS File containing 5 images and 4
> binary table xtensions. The dump is taken using fdump utility. Please take
> a look.

Thanks for the dump. But it is you who know the format, thus it is you
who can tell the most accurate things about it.

Anyway, distributing the task between the demuxer and the decoder and
deciding the format of the payload is absolutely necessary before any
real work can proceed on the implementation.


> This is the header of the first (primary) image.
> 
> SIMPLE  =                    T / FITS STANDARD
> BITPIX  =                    8 / Character information
> NAXIS   =                    0 / No image data array present
> EXTEND  =                    T / There may be standard extensions
> DATE    = '21/09/99'           / Date file was written (dd/mm/yy) 19yy
> ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
> CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
> TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
> INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
> OBJECT  = '        '           / Name of observed object
> RA_OBJ  =             79.17265 / R.A. of the object (degrees)
> DEC_OBJ =             45.99862 / Declination of the object (degrees)
> RA_PNT  =     79.1724000000003 / R.A. of the pointing direction (degrees)
> DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
> RA_PROC =             79.17586 / R.A. used to process data (degrees)
> DEC_PROC=             45.99578 / Declination used to process data (degrees)
> DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
> TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
> DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
> TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
> OBS_MODE= 'POINTING'           / Inertial pointing mode
> DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
> DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
> OFF-AXIS=                    F / Was this pointing done off-axis
> MOVING  =                    F / Did the source position vary during observation
> DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
> VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
> RA_UNIT = 'deg     '           / Units for Right Ascension
> DEC_UNIT= 'deg     '           / Units for Declination
> EQUINOX =                2000. / Coordinate equinox
> RADECSYS= 'FK5     '           / Frame of reference of coordinates
> TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
> TIMEZERO=                   0. / No time offset required for EUVE event times
> TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
> CLOCKCOR= 'NO      '           / Not corrected to UT
> TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
> TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
> TSTART  =     987466469.376002 / Time of start of observation (seconds)
> TSTOP   =     987977263.104001 / Time of end of observation (seconds)
> MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
> EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
> REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
> COMMENT     ' '
> COMMENT     'This file is part of the EUVE Science Archive. It contains'
> COMMENT     'images and filter limits for one EUVE observation.'
> COMMENT     ' '
> COMMENT     'The EUVE Science Archive contains the science data from'
> COMMENT     'observations performed with the EUVE telescopes. It forms one'
> COMMENT     'part of the EUVE Permanent Archive. The other part of the'
> COMMENT     'permanent archive is the EUVE Telemetry Archive, which is a'
> COMMENT     'complete record of the raw telemetry from the EUVE mission.'
> COMMENT     ' '
> COMMENT     'For documentation of the contents of the EUVE Science Archive,'
> COMMENT     'see the "EUVE Science Archive User's Guide".  The contents of'
> COMMENT     'the EUVE Telemetry Archive are described in the "EUVE'
> COMMENT     'Telemetry Archive User's Guide".'
> COMMENT     ' '
> COMMENT     'The EUVE Permanent Archive was produced by the Center for EUV'
> COMMENT     'Astrophysics, a division of UC Berkeley's Space Science'
> COMMENT     Laboratory.
> COMMENT     ' '
> END
> 
> This will be followed by the data part whose size (in bits) is given by - |BITPIX| * GCOUNT * (PCOUNT + NAXIS1 * NAXIS2 * ... * NAXISm) = 0 in this case, so no data part is there

You forgot to explain what should be done with an image without data.

> 
> This is the header of second image.
> 
> XTENSION= 'IMAGE   '           / IMAGE extension
> BITPIX  =                   16 / 8-bit bytes
> NAXIS   =                    2 / 2-dimensional image
> NAXIS1  =                 2048 / Number of image columns
> NAXIS2  =                 2048 / Number of image rows
> PCOUNT  =                    0 / Size of special data area
> GCOUNT  =                    1 / Only one group
> EXTNAME = 'ds      '           / Name of image
> BSCALE  =                1.0E0 / Scale factor for pixel values
> BZERO   =                0.0E0 / Offset for pixel values
> CTYPE1  = 'RA---TAN'           / RA in tangent plane projection
> CRVAL1  =     79.1726455688478 / RA at reference pixel
> CRPIX1  =               1024.5 / reference pixel, axis 1
> CDELT1  =  0.00126953120343387 / scale, axis 1
> CUNIT1  = 'deg     '           / units of CRVAL1, CDELT1
> CTYPE2  = 'DEC--TAN'           / DEC in tangent plane projection
> CRVAL2  =     45.9986228942872 / DEC at reference pixel
> CRPIX2  =               1024.5 / reference pixel, axis 2
> CDELT2  =  0.00126953120343387 / scale, axis 2
> CUNIT2  = 'deg     '           / units of CRVAL2, CDELT2
> EXPTIME =             64971.06 / Primbsch/deadtime corrected exposure time
> RAWEXP  =             86784.98 / Uncorrected exposure time
> FILENAME= 'ds      '
> ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
> CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
> TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
> INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
> INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
> DETNAM  = 'DS      '           / Deep Survey
> OBJECT  = '        '           / Name of observed object
> RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
> DEC_OBJ = '45.998624'          / Declination of the object (degrees)
> RA_PNT  =     79.1724000000002 / R.A. of the pointing direction (degrees)
> DEC_PNT =               46.007 / Declination of the pointing direction (degrees)
> RA_PROC =             79.17586 / R.A. used to process data (degrees)
> DEC_PROC=             45.99578 / Declination used to process data (degrees)
> DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy) 19yy
> TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
> DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
> TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
> OBS_MODE= 'POINTING'           / Inertial pointing mode
> DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED, SPIRAL, NONE)
> DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ or XY)
> OFF-AXIS=                    F / Was this pointing done off-axis
> MOVING  =                    F / Did the source position vary during observation
> DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT, BOTH)
> VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
> RA_UNIT = 'deg     '           / Units for Right Ascension
> DEC_UNIT= 'deg     '           / Units for Declination
> EQUINOX =                2000. / Coordinate equinox
> RADECSYS= 'FK5     '           / Frame of reference of coordinates
> TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
> TIMEZERO=                   0. / No time offset required for EUVE event times
> TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
> CLOCKCOR= 'NO      '           / Not corrected to UT
> TIMEREF = 'LOCAL   '           / No corrections applied (barycentric, etc.)
> TASSIGN = 'SATELLITE'          / Event times are assigned at the satellite
> TSTART  =     987466469.376001 / Time of start of observation (seconds)
> TSTOP   =     987977263.104001 / Time of end of observation (seconds)
> TIERRABS=                0.001 / Timing precision (seconds)
> MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
> EGOCSVER= 'egocs1.7.2'         / Software version used to produce this data
> REFVERS = 'egodata1.15.2'      / Reference calibration dataset version used
> INHERIT =                    F / Do not inherit cards from the primary header
> COMMENT     ' '
> COMMENT     'This extension contains an image of the Deep Survey detector'
> COMMENT     'for this observation. All events have been remapped onto the'
> COMMENT     'sky.  The filter limits used to select events for inclusion'
> COMMENT     'in this image are in the binary table extension named'
> COMMENT     '"ds_limits" in this file.'
> COMMENT     ' '
> END
> 
> Here also there is data part whose size is given by the same expression

At a guess, all this could be parsed by the demuxer, using BITPIX and
NAXIS* to populate codecpar->width, codecpar->height and
codecpar->format and putting all the rest in string metadata, and only
the data itself go in the packet payload.

But you probably know better, so please have a loot at the fields in
AVPacket and AVCodecParameters, and tell us what you think.

Regards,

Paras July 16, 2017, 2:19 p.m. UTC | #21

On Fri, Jul 14, 2017 at 11:15 PM, Nicolas George <george@nsup.org> wrote:

> Le tridi 23 messidor, an CCXXV, Paras Chadha a écrit :
> > Hi, I have attached the dump of a FITS File containing 5 images and 4
> > binary table xtensions. The dump is taken using fdump utility. Please
> take
> > a look.
>
> Thanks for the dump. But it is you who know the format, thus it is you
> who can tell the most accurate things about it.
>
> Anyway, distributing the task between the demuxer and the decoder and
> deciding the format of the payload is absolutely necessary before any
> real work can proceed on the implementation.
>
>
> > This is the header of the first (primary) image.
> >
> > SIMPLE  =                    T / FITS STANDARD
> > BITPIX  =                    8 / Character information
> > NAXIS   =                    0 / No image data array present
> > EXTEND  =                    T / There may be standard extensions
> > DATE    = '21/09/99'           / Date file was written (dd/mm/yy) 19yy
> > ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
> > CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
> > TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
> > INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
> > OBJECT  = '        '           / Name of observed object
> > RA_OBJ  =             79.17265 / R.A. of the object (degrees)
> > DEC_OBJ =             45.99862 / Declination of the object (degrees)
> > RA_PNT  =     79.1724000000003 / R.A. of the pointing direction (degrees)
> > DEC_PNT =               46.007 / Declination of the pointing direction
> (degrees)
> > RA_PROC =             79.17586 / R.A. used to process data (degrees)
> > DEC_PROC=             45.99578 / Declination used to process data
> (degrees)
> > DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy)
> 19yy
> > TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
> > DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
> > TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
> > OBS_MODE= 'POINTING'           / Inertial pointing mode
> > DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED,
> SPIRAL, NONE)
> > DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ
> or XY)
> > OFF-AXIS=                    F / Was this pointing done off-axis
> > MOVING  =                    F / Did the source position vary during
> observation
> > DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT,
> BOTH)
> > VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
> > RA_UNIT = 'deg     '           / Units for Right Ascension
> > DEC_UNIT= 'deg     '           / Units for Declination
> > EQUINOX =                2000. / Coordinate equinox
> > RADECSYS= 'FK5     '           / Frame of reference of coordinates
> > TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
> > TIMEZERO=                   0. / No time offset required for EUVE event
> times
> > TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
> > CLOCKCOR= 'NO      '           / Not corrected to UT
> > TIMEREF = 'LOCAL   '           / No corrections applied (barycentric,
> etc.)
> > TASSIGN = 'SATELLITE'          / Event times are assigned at the
> satellite
> > TSTART  =     987466469.376002 / Time of start of observation (seconds)
> > TSTOP   =     987977263.104001 / Time of end of observation (seconds)
> > MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
> > EGOCSVER= 'egocs1.7.2'         / Software version used to produce this
> data
> > REFVERS = 'egodata1.15.2'      / Reference calibration dataset version
> used
> > COMMENT     ' '
> > COMMENT     'This file is part of the EUVE Science Archive. It contains'
> > COMMENT     'images and filter limits for one EUVE observation.'
> > COMMENT     ' '
> > COMMENT     'The EUVE Science Archive contains the science data from'
> > COMMENT     'observations performed with the EUVE telescopes. It forms
> one'
> > COMMENT     'part of the EUVE Permanent Archive. The other part of the'
> > COMMENT     'permanent archive is the EUVE Telemetry Archive, which is a'
> > COMMENT     'complete record of the raw telemetry from the EUVE mission.'
> > COMMENT     ' '
> > COMMENT     'For documentation of the contents of the EUVE Science
> Archive,'
> > COMMENT     'see the "EUVE Science Archive User's Guide".  The contents
> of'
> > COMMENT     'the EUVE Telemetry Archive are described in the "EUVE'
> > COMMENT     'Telemetry Archive User's Guide".'
> > COMMENT     ' '
> > COMMENT     'The EUVE Permanent Archive was produced by the Center for
> EUV'
> > COMMENT     'Astrophysics, a division of UC Berkeley's Space Science'
> > COMMENT     Laboratory.
> > COMMENT     ' '
> > END
> >
> > This will be followed by the data part whose size (in bits) is given by
> - |BITPIX| * GCOUNT * (PCOUNT + NAXIS1 * NAXIS2 * ... * NAXISm) = 0 in this
> case, so no data part is there
>
> You forgot to explain what should be done with an image without data.
>

Image without data means that there is no image. It is just the header
which can be ignored.
So, currently i am skipping over the header if it contains no image.


>
> >
> > This is the header of second image.
> >
> > XTENSION= 'IMAGE   '           / IMAGE extension
> > BITPIX  =                   16 / 8-bit bytes
> > NAXIS   =                    2 / 2-dimensional image
> > NAXIS1  =                 2048 / Number of image columns
> > NAXIS2  =                 2048 / Number of image rows
> > PCOUNT  =                    0 / Size of special data area
> > GCOUNT  =                    1 / Only one group
> > EXTNAME = 'ds      '           / Name of image
> > BSCALE  =                1.0E0 / Scale factor for pixel values
> > BZERO   =                0.0E0 / Offset for pixel values
> > CTYPE1  = 'RA---TAN'           / RA in tangent plane projection
> > CRVAL1  =     79.1726455688478 / RA at reference pixel
> > CRPIX1  =               1024.5 / reference pixel, axis 1
> > CDELT1  =  0.00126953120343387 / scale, axis 1
> > CUNIT1  = 'deg     '           / units of CRVAL1, CDELT1
> > CTYPE2  = 'DEC--TAN'           / DEC in tangent plane projection
> > CRVAL2  =     45.9986228942872 / DEC at reference pixel
> > CRPIX2  =               1024.5 / reference pixel, axis 2
> > CDELT2  =  0.00126953120343387 / scale, axis 2
> > CUNIT2  = 'deg     '           / units of CRVAL2, CDELT2
> > EXPTIME =             64971.06 / Primbsch/deadtime corrected exposure
> time
> > RAWEXP  =             86784.98 / Uncorrected exposure time
> > FILENAME= 'ds      '
> > ORIGIN  = 'CEA/SSL UC Berkeley' / EUVE Science Archive
> > CREATOR = 'STWFITS '           / Fitsio version 05-Aug-1997
> > TELESCOP= 'EUVE    '           / Extreme Ultraviolet Explorer
> > INSTTYPE= 'DS/S    '           / Instrument type (DS/S, SCANNER)
> > INSTRUME= 'DS/S    '           / Deep Survey - Spectrometer
> > DETNAM  = 'DS      '           / Deep Survey
> > OBJECT  = '        '           / Name of observed object
> > RA_OBJ  = '79.172646'          / R.A. of the object (degrees)
> > DEC_OBJ = '45.998624'          / Declination of the object (degrees)
> > RA_PNT  =     79.1724000000002 / R.A. of the pointing direction (degrees)
> > DEC_PNT =               46.007 / Declination of the pointing direction
> (degrees)
> > RA_PROC =             79.17586 / R.A. used to process data (degrees)
> > DEC_PROC=             45.99578 / Declination used to process data
> (degrees)
> > DATE-OBS= '08/09/99 GMT'       / Start date of observation (dd/mm/yy)
> 19yy
> > TIME-OBS= '00:14:29 GMT'       / Start time of observation (hh:mm:ss GMT)
> > DATE-END= '13/09/99 GMT'       / End date of observation (dd/mm/yy) 19yy
> > TIME-END= '22:07:43 GMT'       / End time of observation (hh:mm:ss GMT)
> > OBS_MODE= 'POINTING'           / Inertial pointing mode
> > DITHER  = 'SPIRAL  '           / Spacecraft dither type (DITHERED,
> SPIRAL, NONE)
> > DETMODE = 'WSZ     '           / Detector position conversion mode (WSZ
> or XY)
> > OFF-AXIS=                    F / Was this pointing done off-axis
> > MOVING  =                    F / Did the source position vary during
> observation
> > DAYNIGHT= 'NIGHT   '           / Day/night data indicator (DAY, NIGHT,
> BOTH)
> > VALIDTIM=      132369.40583229 / Amount of telemetry present (seconds)
> > RA_UNIT = 'deg     '           / Units for Right Ascension
> > DEC_UNIT= 'deg     '           / Units for Declination
> > EQUINOX =                2000. / Coordinate equinox
> > RADECSYS= 'FK5     '           / Frame of reference of coordinates
> > TIMESYS = 'MJD     '           / MJD = JD - 2400000.5
> > TIMEZERO=                   0. / No time offset required for EUVE event
> times
> > TIMEUNIT= 's       '           / Units for TSTART, TSTOP, TIMEZERO
> > CLOCKCOR= 'NO      '           / Not corrected to UT
> > TIMEREF = 'LOCAL   '           / No corrections applied (barycentric,
> etc.)
> > TASSIGN = 'SATELLITE'          / Event times are assigned at the
> satellite
> > TSTART  =     987466469.376001 / Time of start of observation (seconds)
> > TSTOP   =     987977263.104001 / Time of end of observation (seconds)
> > TIERRABS=                0.001 / Timing precision (seconds)
> > MJDREF  =               40000. / MJD of SC clock start, 24.00 May 1968
> > EGOCSVER= 'egocs1.7.2'         / Software version used to produce this
> data
> > REFVERS = 'egodata1.15.2'      / Reference calibration dataset version
> used
> > INHERIT =                    F / Do not inherit cards from the primary
> header
> > COMMENT     ' '
> > COMMENT     'This extension contains an image of the Deep Survey
> detector'
> > COMMENT     'for this observation. All events have been remapped onto
> the'
> > COMMENT     'sky.  The filter limits used to select events for inclusion'
> > COMMENT     'in this image are in the binary table extension named'
> > COMMENT     '"ds_limits" in this file.'
> > COMMENT     ' '
> > END
> >
> > Here also there is data part whose size is given by the same expression
>
> At a guess, all this could be parsed by the demuxer, using BITPIX and
> NAXIS* to populate codecpar->width, codecpar->height and
> codecpar->format and putting all the rest in string metadata, and only
> the data itself go in the packet payload.
>

Okay, i can fill height, width and format in demuxer itself.
By metadata above, do you mean AVStream meatdata ?
Actually i need BLANK, BSCALE, BZERO, DATAMIN and DATAMIN keywords in the
decoder to decode the image correctly. How can i access them there ?


>
> But you probably know better, so please have a loot at the fields in
> AVPacket and AVCodecParameters, and tell us what you think.
>
> Regards,
>
> --
>   Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>

Paul B Mahol July 16, 2017, 4:36 p.m. UTC | #22

Can you give link to file which holds more than one FITS image?

Paras July 16, 2017, 4:53 p.m. UTC | #23

On Sun, Jul 16, 2017 at 10:06 PM, Paul B Mahol <onemda@gmail.com> wrote:

> Can you give link to file which holds more than one FITS image?
>

Yes, here is the file: https://fits.gsfc.nasa.gov/samples/EUVEngc4151imgx.
fits

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Paul B Mahol July 18, 2017, 1:09 p.m. UTC | #24

On 7/16/17, Paras Chadha <paraschadha18@gmail.com> wrote:
> On Sun, Jul 16, 2017 at 10:06 PM, Paul B Mahol <onemda@gmail.com> wrote:
>
>> Can you give link to file which holds more than one FITS image?
>>
>
> Yes, here is the file: https://fits.gsfc.nasa.gov/samples/EUVEngc4151imgx.
> fits
>

I believe your approach is fine. There could not be parser as format
does not allow that.

Nicolas George July 18, 2017, 1:27 p.m. UTC | #25

Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
> I believe your approach is fine.

The huge duplication between decoder and demuxer is definitely not fine.
You need to find a solution before pushing anything.

Regards,

Paul B Mahol July 18, 2017, 2:11 p.m. UTC | #26

On 7/18/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Paul B Mahol a ecrit :
>> I believe your approach is fine.
>
> The huge duplication between decoder and demuxer is definitely not fine.
> You need to find a solution before pushing anything.

I'm not going to listen your non-useful comments.

Nicolas George July 18, 2017, 2:14 p.m. UTC | #27

Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
> I'm not going to listen your non-useful comments.

My comments have been constructive. Pushing with outstanding objections
like that is against the project policy. If you do it on purpose, your
Git right should be revoked.

Paul B Mahol July 18, 2017, 2:15 p.m. UTC | #28

On 7/18/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Paul B Mahol a ecrit :
>> I'm not going to listen your non-useful comments.
>
> My comments have been constructive. Pushing with outstanding objections
> like that is against the project policy. If you do it on purpose, your
> Git right should be revoked.

No it will not. I will fork FFmpeg. And leave you alone with CE.

Nicolas George July 18, 2017, 2:18 p.m. UTC | #29

Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
>		  I will fork FFmpeg.

It is the third time you make that threat. Out of principle, I want to
reply to anybody making it : please go ahead and good riddance.

Seriously, only somebody unable to working in a team and respecting the
diverse wishes and sensibilities of their colleagues make that kind of
threat.

Regards,

Paul B Mahol July 18, 2017, 2:24 p.m. UTC | #30

On 7/18/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Paul B Mahol a ecrit :
>>		  I will fork FFmpeg.
>
> It is the third time you make that threat. Out of principle, I want to
> reply to anybody making it : please go ahead and good riddance.
>
> Seriously, only somebody unable to working in a team and respecting the
> diverse wishes and sensibilities of their colleagues make that kind of
> threat.

You are unable to work with as a team.

I see nothing wrong with either demuxer or decoder code and no
duplicated lines of code.

Nicolas George July 18, 2017, 2:42 p.m. UTC | #31

Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
> You are unable to work with as a team.

You may notice that my message did not contain anything ad-hominem.
This, on the other hand, is pure ad-hoinem.

> I see nothing wrong with either demuxer or decoder code and no
> duplicated lines of code.

The codde is not duplicated, but the logic is, one reading from a
buffer, one reading from avio. Proof that this is duplicated : if a typo
is found in the name of one of the fields, it needs fixing in both lavc
and lavf. This is not good design. And as such, it cannot be accepted in
FFmpeg.

Paul B Mahol July 18, 2017, 6:24 p.m. UTC | #32

On 7/18/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Paul B Mahol a ecrit :
>> You are unable to work with as a team.
>
> You may notice that my message did not contain anything ad-hominem.
> This, on the other hand, is pure ad-hoinem.
>
>> I see nothing wrong with either demuxer or decoder code and no
>> duplicated lines of code.
>
> The codde is not duplicated, but the logic is, one reading from a
> buffer, one reading from avio. Proof that this is duplicated : if a typo
> is found in the name of one of the fields, it needs fixing in both lavc
> and lavf. This is not good design. And as such, it cannot be accepted in
> FFmpeg.

Design is just fine. You need to be more constructive, otherwise I will
just assume you want to be evil and ignore your comments.

Nicolas George July 18, 2017, 6:30 p.m. UTC | #33

Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
> Design is just fine. You need to be more constructive, otherwise I will
> just assume you want to be evil and ignore your comments.

This is not true, anybody who reads the mailing-list can see it (I can
provide links if necessary).

Please stop your ad-hominem attacks. Immediately.

If the patches gets pushed with this issue unresolved, I will revert.

Regards,

Paul B Mahol July 18, 2017, 6:38 p.m. UTC | #34

On 7/18/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Paul B Mahol a ecrit :
>> Design is just fine. You need to be more constructive, otherwise I will
>> just assume you want to be evil and ignore your comments.
>
> This is not true, anybody who reads the mailing-list can see it (I can
> provide links if necessary).
>
> Please stop your ad-hominem attacks. Immediately.
>
> If the patches gets pushed with this issue unresolved, I will revert.

To not cause drama on git repo I kindly ask Michael to remove your git
push access.

Nicolas George July 18, 2017, 6:42 p.m. UTC | #35

Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
> To not cause drama on git repo I kindly ask Michael to remove your git
> push access.

I will leave the people who have that kind of power judge who must be
banished :

The one who made useful reviews to the patches, with constructive
comments, and insists on high-quality code.

Or the one who makes ad-hominem attacks, threatens to fork when he
cannot have his way and never addresses the actual technical question.

Rostislav Pehlivanov July 18, 2017, 8:35 p.m. UTC | #36

On 18 July 2017 at 19:42, Nicolas George <george@nsup.org> wrote:

> Le decadi 30 messidor, an CCXXV, Paul B Mahol a écrit :
> > To not cause drama on git repo I kindly ask Michael to remove your git
> > push access.
>
> I will leave the people who have that kind of power judge who must be
> banished :
>
> The one who made useful reviews to the patches, with constructive
> comments, and insists on high-quality code.
>
> Or the one who makes ad-hominem attacks, threatens to fork when he
> cannot have his way and never addresses the actual technical question.
>
> --
>   Nicolas George
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Chill, both of you.

Its a crappy format, no reason to blame anyone else but the format.
We have plenty of crappy formats which have no clear separation between
packets so demuxers have to give not-entirely-demuxed packets to the
decoder which also has to skip past junk. Its both wrong and its not,
because it isn't defined anywhere and the packets packed in the file were
never meant to go anywhere but the format they were packed in.

I think the image2 demuxer shouldn't handle this type of junk/useless data
filtering and would rather see a separate demuxer like the current patch
which deals with crap. Its not elegant but blame the format and the fact
that back when it was designed there weren't even any universal containers
for audio/video.

Nicolas George July 19, 2017, 10:03 a.m. UTC | #37

Le decadi 30 messidor, an CCXXV, Rostislav Pehlivanov a écrit :
> Its a crappy format, no reason to blame anyone else but the format.
> We have plenty of crappy formats which have no clear separation between
> packets so demuxers have to give not-entirely-demuxed packets to the
> decoder which also has to skip past junk. Its both wrong and its not,
> because it isn't defined anywhere and the packets packed in the file were
> never meant to go anywhere but the format they were packed in.

I mostly agree, but I would say that exactly the same applies to many
image formats, especially old ones, including some amongst GIF, Sun
Raster, XPM / XBM, XFace, Targa, etc.

Do you disagree?

> I think the image2 demuxer shouldn't handle this type of junk/useless data
> filtering and would rather see a separate demuxer like the current patch
> which deals with crap.

I am somewhat confused by your statement, because in my mind, this is
exactly the purpose and task of the image2(pipe) demuxers.

The image2pipe demuxer reads the input stream, calls a parser to find in
it the size of the images and returns them as individual packets. Note
that the parsers are not part of the image2(pipe) demuxers, they are
separate entities of their own, each specific to one image format.

I do not know what the image2pipe demuxer would be good for, if not for
that. Or did I miss something?

Still, I do not insist that it is done with the image2(pipe) demuxer. It
only seems the best solution according to what was said about the format
until now.

What I do insist on, is this:

Look at the find_size() function in this patch:
https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213076.html

Then look at the fits_read_header() in this patch:
https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213061.html

They are the same. One works on an AVIOContext using read operations,
the other works on a buffer using pointer arithmetic, but the logic is
the same. And it is not a trivial logic, here, we are talking about
dozens of lines of code.

This is not good design, this is not good for maintenance: if a bug
needs fixing, it will need fixing consistently in both implementations;
if a feature is added, it will require the same.

Good design and maintenance require a single implementation.

This is especially true for an obscure format like FITS, where
maintenance workforce is scarce.

And this is especially true for a GSoC project, where the student is
supposed to be learning good practices.

I think the best and simplest way of achieving that is to have both the
parser and the decoder call the same function. And I think the simplest
way of implementing it is to make a parser and rely on image2pipe.

Because with minimal changes to fits_read_header(), the parser would
look just slightly more complex than that:

static int fitsparse(AVCodecParserContext *s, AVCodecContext *avctx,
                      const uint8_t **poutbuf, int *poutbuf_size,
                      const uint8_t *buf, int buf_size)
{
    fits_read_header(buf, buf_size, &size);
    return size;
}

So as it stands now, I will oppose any patch that duplicates the
header-reading logic, and I think anybody should do the same, since it
is everybody's responsibility to uphold the high standards of code
quality in our project.

Regards,

Paul B Mahol July 19, 2017, 10:08 a.m. UTC | #38

On 7/19/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Rostislav Pehlivanov a ecrit :
>> Its a crappy format, no reason to blame anyone else but the format.
>> We have plenty of crappy formats which have no clear separation between
>> packets so demuxers have to give not-entirely-demuxed packets to the
>> decoder which also has to skip past junk. Its both wrong and its not,
>> because it isn't defined anywhere and the packets packed in the file were
>> never meant to go anywhere but the format they were packed in.
>
> I mostly agree, but I would say that exactly the same applies to many
> image formats, especially old ones, including some amongst GIF, Sun
> Raster, XPM / XBM, XFace, Targa, etc.
>
> Do you disagree?
>
>> I think the image2 demuxer shouldn't handle this type of junk/useless
>> data
>> filtering and would rather see a separate demuxer like the current patch
>> which deals with crap.
>
> I am somewhat confused by your statement, because in my mind, this is
> exactly the purpose and task of the image2(pipe) demuxers.
>
> The image2pipe demuxer reads the input stream, calls a parser to find in
> it the size of the images and returns them as individual packets. Note
> that the parsers are not part of the image2(pipe) demuxers, they are
> separate entities of their own, each specific to one image format.
>
> I do not know what the image2pipe demuxer would be good for, if not for
> that. Or did I miss something?
>
> Still, I do not insist that it is done with the image2(pipe) demuxer. It
> only seems the best solution according to what was said about the format
> until now.
>
> What I do insist on, is this:
>
> Look at the find_size() function in this patch:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213076.html
>
> Then look at the fits_read_header() in this patch:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213061.html
>
> They are the same. One works on an AVIOContext using read operations,
> the other works on a buffer using pointer arithmetic, but the logic is
> the same. And it is not a trivial logic, here, we are talking about
> dozens of lines of code.
>
> This is not good design, this is not good for maintenance: if a bug
> needs fixing, it will need fixing consistently in both implementations;
> if a feature is added, it will require the same.
>
> Good design and maintenance require a single implementation.
>
> This is especially true for an obscure format like FITS, where
> maintenance workforce is scarce.
>
> And this is especially true for a GSoC project, where the student is
> supposed to be learning good practices.
>
> I think the best and simplest way of achieving that is to have both the
> parser and the decoder call the same function. And I think the simplest
> way of implementing it is to make a parser and rely on image2pipe.
>
> Because with minimal changes to fits_read_header(), the parser would
> look just slightly more complex than that:
>
> static int fitsparse(AVCodecParserContext *s, AVCodecContext *avctx,
>                       const uint8_t **poutbuf, int *poutbuf_size,
>                       const uint8_t *buf, int buf_size)
> {
>     fits_read_header(buf, buf_size, &size);
>     return size;
> }
>
> So as it stands now, I will oppose any patch that duplicates the
> header-reading logic, and I think anybody should do the same, since it
> is everybody's responsibility to uphold the high standards of code
> quality in our project.

Except when those standards need to be applied to your work.

Nicolas George July 19, 2017, 10:14 a.m. UTC | #39

Le primidi 1er thermidor, an CCXXV, Paul B Mahol a écrit :
> Except when those standards need to be applied to your work.

Yet another ad-hominem attack.

Paul B Mahol July 19, 2017, 10:33 a.m. UTC | #40

On 7/19/17, Nicolas George <george@nsup.org> wrote:
> Le decadi 30 messidor, an CCXXV, Rostislav Pehlivanov a ecrit :
>> Its a crappy format, no reason to blame anyone else but the format.
>> We have plenty of crappy formats which have no clear separation between
>> packets so demuxers have to give not-entirely-demuxed packets to the
>> decoder which also has to skip past junk. Its both wrong and its not,
>> because it isn't defined anywhere and the packets packed in the file were
>> never meant to go anywhere but the format they were packed in.
>
> I mostly agree, but I would say that exactly the same applies to many
> image formats, especially old ones, including some amongst GIF, Sun
> Raster, XPM / XBM, XFace, Targa, etc.
>
> Do you disagree?
>
>> I think the image2 demuxer shouldn't handle this type of junk/useless
>> data
>> filtering and would rather see a separate demuxer like the current patch
>> which deals with crap.
>
> I am somewhat confused by your statement, because in my mind, this is
> exactly the purpose and task of the image2(pipe) demuxers.
>
> The image2pipe demuxer reads the input stream, calls a parser to find in
> it the size of the images and returns them as individual packets. Note
> that the parsers are not part of the image2(pipe) demuxers, they are
> separate entities of their own, each specific to one image format.
>
> I do not know what the image2pipe demuxer would be good for, if not for
> that. Or did I miss something?
>
> Still, I do not insist that it is done with the image2(pipe) demuxer. It
> only seems the best solution according to what was said about the format
> until now.
>
> What I do insist on, is this:
>
> Look at the find_size() function in this patch:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213076.html
>
> Then look at the fits_read_header() in this patch:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213061.html
>
> They are the same. One works on an AVIOContext using read operations,
> the other works on a buffer using pointer arithmetic, but the logic is
> the same. And it is not a trivial logic, here, we are talking about
> dozens of lines of code.
>
> This is not good design, this is not good for maintenance: if a bug
> needs fixing, it will need fixing consistently in both implementations;
> if a feature is added, it will require the same.
>
> Good design and maintenance require a single implementation.
>
> This is especially true for an obscure format like FITS, where
> maintenance workforce is scarce.
>
> And this is especially true for a GSoC project, where the student is
> supposed to be learning good practices.
>
> I think the best and simplest way of achieving that is to have both the
> parser and the decoder call the same function. And I think the simplest
> way of implementing it is to make a parser and rely on image2pipe.
>
> Because with minimal changes to fits_read_header(), the parser would
> look just slightly more complex than that:
>
> static int fitsparse(AVCodecParserContext *s, AVCodecContext *avctx,
>                       const uint8_t **poutbuf, int *poutbuf_size,
>                       const uint8_t *buf, int buf_size)
> {
>     fits_read_header(buf, buf_size, &size);
>     return size;
> }

If you had ever write parser, you would know that this above is giberish.

Nicolas George July 19, 2017, 12:10 p.m. UTC | #41

Le primidi 1er thermidor, an CCXXV, Paul B Mahol a écrit :
> If you had ever write parser, you would know that this above is giberish.

Please enlighten us.

Regards,

Paul B Mahol July 19, 2017, 3:33 p.m. UTC | #42

On 7/19/17, Nicolas George <george@nsup.org> wrote:
> Le primidi 1er thermidor, an CCXXV, Paul B Mahol a ecrit :
>> If you had ever write parser, you would know that this above is giberish.
>
> Please enlighten us.

Are there more than one Nicolas here?

Anyway the input can be of any size so one would need to hold complete
state of parsing.

Reimar Döffinger July 19, 2017, 9:25 p.m. UTC | #43

On 19.07.2017, at 12:03, Nicolas George <george@nsup.org> wrote:

> Le decadi 30 messidor, an CCXXV, Rostislav Pehlivanov a écrit :
> 
>> I think the image2 demuxer shouldn't handle this type of junk/useless data
>> filtering and would rather see a separate demuxer like the current patch
>> which deals with crap.
> 
> I am somewhat confused by your statement, because in my mind, this is
> exactly the purpose and task of the image2(pipe) demuxers.
> 
> The image2pipe demuxer reads the input stream, calls a parser to find in
> it the size of the images and returns them as individual packets. Note
> that the parsers are not part of the image2(pipe) demuxers, they are
> separate entities of their own, each specific to one image format.
> 
> I do not know what the image2pipe demuxer would be good for, if not for
> that. Or did I miss something?

I am a bit unclear on the details, but my memory:
Well, they work better (only?) if the images are completely independent.
I am not sure this is entirely the case here, for example the first image would have a different header it looks like from the muxer patch.
There is also the question: how would the encoding work?
Because if encoding will be using a muxer, having things symmetric by having a demuxer is an advantage.

> Good design and maintenance require a single implementation.

I don't think we have the right frameworks for that. Other cases like the MPEG video format parsers and decoders also duplicate that kind of code.
There might be some ways though, I am mostly thinking of an iterative function that is called with a pointer to each 80 byte header buffer and updates a state structure with the info it extracts from there. Would still get messy if the decoder needs quite different data than the demuxer.
I have not yet checked how similar these functions actually are.

> I think the best and simplest way of achieving that is to have both the
> parser and the decoder call the same function. And I think the simplest
> way of implementing it is to make a parser and rely on image2pipe.

Parsers work best when parsing requires very limited context.
My impression is that there is no fixed maximum header size, which can easily make things really painful in a parser.

Also considering the complexity: I suspect that the header parsing can be simplified a good bit, and maybe some common helper functions extracted.
I wouldn't insist on avoiding code duplication when trying to share it might actually result in more complicated code. But it probably needs some thinking to figure out what makes sense.

Reimar Döffinger July 19, 2017, 9:48 p.m. UTC | #44

On 19.07.2017, at 12:03, Nicolas George <george@nsup.org> wrote:
> What I do insist on, is this:
> 
> Look at the find_size() function in this patch:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213076.html
> 
> Then look at the fits_read_header() in this patch:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213061.html
> 
> They are the same. One works on an AVIOContext using read operations,
> the other works on a buffer using pointer arithmetic, but the logic is
> the same. And it is not a trivial logic, here, we are talking about
> dozens of lines of code.

Ok, I think it is possible to use the same code for those, for example by doing this:
- in the demuxer (or parser, but that is harder because you cannot just read what you need but have to deal with the data as it is provided), get data until the end of the header.
The end of the header detection would be somewhat duplicated, but should be fairly trivial.
- Then use a function that takes such a buffer and returns a AVDictionary with name/value pairs. The decoder and demuxer can then each decide what they want to do with them.

There is one issue at least though: for files with very large meta-data/header this would almost double memory usage in the demuxer due to having copies in the dictionary.
There is also to me the issue, while both parse the same things, what these functions do is not at all the same semantically, at least currently.
The demuxer only parses a select few keywords with well-known format and as a result can e.g, use sscanf directly (not that I like using sscanf personally).
The decoder one builts a metadata dictionary and thus needs a lot of special-case handling like proper parsing, handling possibly new keywords, deciding what to do with garbage (instead of just skipping), handling escaping for the values, ...
Personally if it was my job to get this in, I would start by simplifying at least one of those functions while making sure to have proper bounds checking and error handling to see where the complexity really lies and how complex the problem actually is compared to how the first try solution looks.

Paul B Mahol July 20, 2017, 4:14 p.m. UTC | #45

On 7/19/17, Paul B Mahol <onemda@gmail.com> wrote:
> On 7/19/17, Nicolas George <george@nsup.org> wrote:
>> Le primidi 1er thermidor, an CCXXV, Paul B Mahol a ecrit :
>>> If you had ever write parser, you would know that this above is
>>> giberish.
>>
>> Please enlighten us.
>
> Are there more than one Nicolas here?
>
> Anyway the input can be of any size so one would need to hold complete
> state of parsing.
>

Assuming there are no comments. I will apply decoder patch as is.

Nicolas George July 20, 2017, 4:23 p.m. UTC | #46

Le duodi 2 thermidor, an CCXXV, Paul B Mahol a écrit :
> Assuming there are no comments.

Wrong. Give me more time.

Regards,

Paul B Mahol July 20, 2017, 4:29 p.m. UTC | #47

On 7/20/17, Nicolas George <george@nsup.org> wrote:
> Le duodi 2 thermidor, an CCXXV, Paul B Mahol a ecrit :
>> Assuming there are no comments.
>
> Wrong. Give me more time.

You should be more constructive, better go porting dualinput to framesync2,
that wasting everybodies precious time here.

Nicolas George July 20, 2017, 4:54 p.m. UTC | #48

Le duodi 2 thermidor, an CCXXV, Paul B Mahol a écrit :
> You should be more constructive, better go porting dualinput to framesync2,
> that wasting everybodies precious time here.

Keeping high standards of code quality seems to me a more important use
of my time. And since you are neither my boss nor my king, you have no
authority to tell me how I should spend my time.

I have an outstanding technical objection to these patches. Not Reimar,
not anybody else, I. Until I drop it, you are not allowed to push, until
you gain support to overrule me. That is the rule.

In the meantime, you can perfectly continue working, Git makes it
entirely convenient. You can also understand that I am right and start
fixing the patches.

I will now continue the discussion with Reimar.

Regards,

Paras July 20, 2017, 6:37 p.m. UTC | #49

On Thu, Jul 20, 2017 at 11:17 PM, Nicolas George <george@nsup.org> wrote:

> Le primidi 1er thermidor, an CCXXV, Reimar Döffinger a écrit :
> > I am a bit unclear on the details, but my memory:
> > Well, they work better (only?) if the images are completely independent.
>
> > I am not sure this is entirely the case here, for example the first
> > image would have a different header it looks like from the muxer
> > patch.
>
> According to this mail from Paras Chadha:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213180.html
> there is no global header. A first frame that is not identical to the
> next one would count as a global header.
>
> I hope Paras Chadha can tell us more about this issue.
>

The only difference between first image and rest is that, the first image
will always contain the keyword, SIMPLE = T as first keyword while all
other images will have XTENSION = 'IMAGE ' as first keyword. Also, PCOUNT
and GCOUNT are mandatory in all image extensions but not in the first
image. Rest all keywords are same.


>
> > There is also the question: how would the encoding work?
> > Because if encoding will be using a muxer, having things symmetric by
> > having a demuxer is an advantage.
>
> If it is an image format (without global headers), it should work just
> the same as the other image formats, nothing more: the image2pipe muxer
> concatenates the packets.
>
> > I don't think we have the right frameworks for that. Other cases like
> > the MPEG video format parsers and decoders also duplicate that kind of
> > code.
>
> If we do not have the right framework to avoid code duplication, then we
> need to invent it. What you say here only means that sub-standard code
> have been accepted in the past, and that should not count as a
> precedent.
>
> Also, MPEG is a bit more important than FITS, so rushing things for it
> may have been acceptable.
>
> Still, in this particular case, I have reviewed the patches and I know
> the code enough to say that merging both functions would be quite easy
> even without a framework.
>
> > There might be some ways though, I am mostly thinking of an iterative
> > function that is called with a pointer to each 80 byte header buffer
> > and updates a state structure with the info it extracts from there.
> [1]
>
> Yes, exactly, that is one of the possibilities.
>
> > Would still get messy if the decoder needs quite different data than
> > the demuxer.
> > I have not yet checked how similar these functions actually are.
>
> I have looked at the code (enough to make a full review of a previous
> version), and I can say it is not the case. The information required by
> the demuxer is more or less a subset of the information required by the
> decoder.
>
> I say more or less because I just saw something fishy in the code:
>
> +    data_size += pcount;
> +    data_size *= (abs(bitpix) >> 3) * gcount;
>
> Paras Chadha, can you tell us why this is counted in the data size in
> the demuxer and not in the decoder?
>

It is because for image extensions, PCOUNT = 0 and GCOUNT = 1 are mandatory
in the header according to the standard. These keywords are not even
mandatory in the header of first image.
Anyways as i have told before, it is not necessary FITS file will contain
an image always. It is a general data storage and transport format too. It
supports various extensions - ASCII table extension and binary table
extension that are used to store and transport data. After that there is
something called Random Groups structure which is also used for the same
purpose. PCOUNT and GCOUNT can affect the size of data in those extensions.
In demuxer, i am skipping over these extensions and only sending the images
to the decoder. So, these keywords are taken into account in demuxer but
not in decoder.


>
> > Parsers work best when parsing requires very limited context.
> > My impression is that there is no fixed maximum header size, which can
> > easily make things really painful in a parser.
>
> Urgh, I thought the framework was better than that. I concede that
> turning the existing code into a parser would be more work than I
>
> Nonetheless, with the coding organization you suggested above (see [1]),
> adding buf[80] to the parsing structure would do the trick.
>
> But we still need Paras Chadha's knowledge about the possible
> differences in the first frame.
>

I have described them above


>
> > Also considering the complexity: I suspect that the header parsing can
> > be simplified a good bit, and maybe some common helper functions
> > extracted.
>
> I think so too.
>
> > I wouldn't insist on avoiding code duplication when trying to share it
> > might actually result in more complicated code. But it probably needs
> > some thinking to figure out what makes sense.
>
> Having looked at the code, I really think it would not lead to more
> complicated code.
>
> > Ok, I think it is possible to use the same code for those, for example
> > by doing this:
>
> > - in the demuxer (or parser, but that is harder because you cannot
> > just read what you need but have to deal with the data as it is
> > provided), get data until the end of the header.
>
> > The end of the header detection would be somewhat duplicated, but
> > should be fairly trivial.
>
> Or it could go the other way round: code the function to use an AVIO,
> and in the decoder create an AVIO that reads from the packet data.
>
> > - Then use a function that takes such a buffer and returns a
> > AVDictionary with name/value pairs. The decoder and demuxer can then
> > each decide what they want to do with them.
>
> Wait, what?!?
>
> The decoder makes an AVDictionary for the strings metadata that are
> returned as is in the frame, but apart from that, both the decoder and
> demuxer only use a few numeric fields: just return these fields in a
> structure.
>
> > There is one issue at least though: for files with very large
> > meta-data/header this would almost double memory usage in the demuxer
> > due to having copies in the dictionary.
>
> I would say many of our components have this kind of issue. I have not
> checked, but I think cases where a component reads a full block and then
> splits it into string metadata is very common and we never cared.
>
> > There is also to me the issue, while both parse the same things, what
> > these functions do is not at all the same semantically, at least
> > currently.
>
> > The demuxer only parses a select few keywords with well-known format
> > and as a result can e.g, use sscanf directly (not that I like using
> > sscanf personally).
>
> > The decoder one builts a metadata dictionary and thus needs a lot of
> > special-case handling like proper parsing, handling possibly new
> > keywords, deciding what to do with garbage (instead of just skipping),
> > handling escaping for the values, ...
>
> It does not seem very complicated to me: just enclose the parts that are
> needed by the decoder in "if (decoding) { ... }" blocks. Or even, do the
> parsing and discard the value in the demuxer, a few extraneous string
> comparisons or conversions to numeric are not a problem.
>
> I just looked at the code again, and it seems to me only the
> av_dict_set() calls would need to be disabled in the demuxer, the rest
> is just populating a structure with numeric values.
>
> > Personally if it was my job to get this in, I would start by
> > simplifying at least one of those functions while making sure to have
> > proper bounds checking and error handling to see where the complexity
> > really lies and how complex the problem actually is compared to how
> > the first try solution looks.
>
> Indeed.
>
> Regards,
>
> --
>   Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Nicolas George July 21, 2017, 2:42 p.m. UTC | #50

Le tridi 3 thermidor, an CCXXV, Paras Chadha a écrit :
> The only difference between first image and rest is that, the first image
> will always contain the keyword, SIMPLE = T as first keyword while all
> other images will have XTENSION = 'IMAGE ' as first keyword.

Then I would say that "SIMPLE = T" counts as a global header, if one
void of information.

And thus, I strongly suggest to have the demuxer remove the "SIMPLE = T"
and "XTENSION = 'IMAGE '" and only return the rest of the block.
Otherwise, this codec would not be really intra-only and that would be
really a problem.

> Also, PCOUNT
> and GCOUNT are mandatory in all image extensions but not in the first
> image. Rest all keywords are same.

In the first image, are they optional or forbidden?

> It is because for image extensions, PCOUNT = 0 and GCOUNT = 1 are mandatory
> in the header according to the standard.

Ok. But I think you neglected to check that this constraint is met by
the file being read.

> In demuxer, i am skipping over these extensions and only sending the images
> to the decoder.

Ok, that is another argument for having a real demuxer.

Let us find the most efficient way of getting rid of the logic
duplication.

I suggest something like this:

* Create libavcodec/fits.h with:

    typedef enum FITSHeaderState {
        STATE_SIMPLE,
        STATE_XTENSION,
        STATE_BITPIX,
        STATE_NAXIS,
        STATE_NAXIS_I,
        STATE_REST,
    } FITSHeaderState;

    typedef struct FITSHeader {
        FITSHeaderState state;
        unsigned axis_index;
        ...
    } FITSHeader;

    int avpriv_fits_header_init(FITSHeader *header, FITSHeaderState state);

    int avpriv_fits_header_parse_line(FITSHeader *header,
                                      uint8_t line[80],
                                      AVDictionary ***metadata);

* In libavcodec/fitsdec.c, rename fits_read_header() into
  avpriv_fits_header_parse_line(), and in the body of the function,
  surround each block that parses a keyword with:

    if (header->state == STATE_...) {
        /* existing code */
        header->state = STATE_next;
    }

  (or even better with a switch/case statement). Signal END by returning
  a non-zero value.
  
  Of course, all local variables need to be moved to the structure.
  (Sorry I made you do the exact opposite earlier, but I had not yet
  seen the demuxer at the time.)

* The actual fits_read_header() becomes very simple:

    avpriv_fits_header_init(&header, STATE_BITPIX);
    do {
        if (ptr8 - end < 80)
            return AVERROR_INVALIDDATA;
        ret = avpriv_fits_header_parse_line(&header, ptr8, &metadata);
        ptr8 += 80;
    } while (!ret);
    if (ret < 0)
        return ret;
    /* validation of the values, fill_data_min_max, etc. almost
       unchanged */

* Move the code for PCOUNT, GCOUNT, GROUPS from the demuxer to
  avpriv_fits_header_parse_line() and add the corresponding fields in
  the structure.

* The find_size() function is simplified the same way as
  avpriv_fits_header_init() with just a loop calling
  avpriv_fits_header_parse_line(), but with avio_read() instead of ptr8.

  I strongly suggest you take the occasion for keeping a copy of all the
  data in memory to avoid the need for seeking later. You can use the
  AVBPrint API for that, for example.

  Also, the demuxer would pass NULL for the metadata pointer, so each
  metadata affectation must be made conditional. It can be factored with
  a helper function dict_set_if_not_null() or something.

If you follow these steps, you get rid of the huge code duplication, and
you also gain a lot of simplification for error checking. For example:

+    if ((ret = avio_read(pb, buf, 80)) != 80)
+        return AVERROR_EOF;

would only exist once (and you can take the time to fix it, because
negative values must be returned as is, not converted to EOF).

Of course, it is only a set of suggestions. Feel free to discuss them or
choose another path if you see fit. But based on my experience, it is
probably the simplest way of getting code that is simple and easy to
maintain.

Regards,

[FFmpeg-devel] Add FITS Demuxer

Commit Message

Comments

Patch