diff mbox

[FFmpeg-devel] frama-c:ify qt-faststart.c, proof-of-concept (kinda)

Message ID 1529674887.26162.7.camel@acc.umu.se
State New
Headers show

Commit Message

Tomas Härdin June 22, 2018, 1:41 p.m. UTC
Hi

I've recently been looking at formal proof systems for minimizing bugs.
One common example is SPARK, which is an Ada variant which uses
contracts, guaranteed termination and lack of dynamic allocation to
make reasoning about SPARK programs much easier. This allows automatic
checking for things like overflows, out-of-bounds access and so on. A
great thing for code quality.

There is something similar for C called Frama-C (https://frama-c.com/).
It allows you to add contracts to functions and then attempts to prove
that the functions fulfill said contracts. Like SPARK it is also
possible to check for things like out-of-bounds accesses or
over/underflow. So I spent a few days going over qt-faststart.c (then
yet again after the recent patch), adding ACSL annotations here and
there, and rewriting parts of it to make the job of alt-ergo/why3/coq
much easier. It's far from done, roughly 29% of goals are yet to be
proven. But the purpose is to raise discussion rather than provide a
perfect solution.

I'd also like to know if anyone else on here has fiddled with frama-c,
and might know how one could make this part of the test system. For
example, I'd like a plot of % proved goals over time and the ability to
fail a build if a file that was 100% proved suddenly no longer is.

Doing this for a large project like FFmpeg is a serious undertaking,
but something I feel should really be done for all C projects. In my
case this is motivated by the recent cargo cult discussion about checks
in mxfdec.c, which an automated prover could take care of. There are of
course many more cases.

Patch attached. One thing I discovered is that all atom parsers expect
at least 8 bytes of atom data, so I was able to add contracts and
checks for that. Here's what "frama-c -wp -wp-rte qt-faststart.c" says
about it right now:

    [wp] Proved goals:  240 / 340
         Qed:           156  (4ms-16ms-1.2s)
         Alt-Ergo:       84  (28ms-133ms-776ms) (403) (interrupted: 22) (unknown: 78)

/Tomas

Comments

Eran Kornblau June 22, 2018, 2:07 p.m. UTC | #1
> Hi

> 

> I've recently been looking at formal proof systems for minimizing bugs.

> One common example is SPARK, which is an Ada variant which uses contracts, guaranteed termination and lack of dynamic allocation to make reasoning about SPARK programs much easier. This allows automatic checking for things like overflows, out-of-bounds access and so on. A great thing for code quality.

> 

> There is something similar for C called Frama-C (https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fframa-c.com%2F&data=02%7C01%7Ceran.kornblau%40kaltura.com%7C41e71cf613e041e2d75908d5d845e373%7C0c503748de3f4e2597e26819d53a42b6%7C0%7C0%7C636652717037149779&sdata=cnGg2SD1mSGhn3qhGR2YRVexfA17%2F6Lrx52mnxPN1nA%3D&reserved=0).

> It allows you to add contracts to functions and then attempts to prove that the functions fulfill said contracts. Like SPARK it is also possible to check for things like out-of-bounds accesses or over/underflow. So I spent a few days going over qt-faststart.c (then yet again after the recent patch), adding ACSL annotations here and there, and rewriting parts of it to make the job of alt-ergo/why3/coq much easier. It's far from done, roughly 29% of goals are yet to be proven. But the purpose is to raise discussion rather than provide a perfect solution.

> 

> I'd also like to know if anyone else on here has fiddled with frama-c, and might know how one could make this part of the test system. For example, I'd like a plot of % proved goals over time and the ability to fail a build if a file that was 100% proved suddenly no longer is.

> 

> Doing this for a large project like FFmpeg is a serious undertaking, but something I feel should really be done for all C projects. In my case this is motivated by the recent cargo cult discussion about checks in mxfdec.c, which an automated prover could take care of. There are of course many more cases.

> 

> Patch attached. One thing I discovered is that all atom parsers expect at least 8 bytes of atom data, so I was able to add contracts and checks for that. Here's what "frama-c -wp -wp-rte qt-faststart.c" says about it right now:

> 

First, regarding the if you added, it's redundant - if you look a few lines above, you'll see 'if (atom.size < atom.header_size)'.
atom.header_size is either 8 or 16, it can't be anything else, so atom.size can't be < 8 at this point.

I'll leave it to the maintainers to decide whether this tool is helpful or not, IMHO, all these comments make the 
code less readable, and some of the changes make it less efficient. I don't think this slight reduction of performance 
matters much in the context of faststart, but in other parts of ffmpeg it can be significant.

Few examples to why I think it's less efficient -
1. the change of macros to functions - maybe the compiler will inline them, but maybe it won't...
2. this extra statement - 'pos = atom->data + 8 + 4*i;' - 
	why for on 'i' when we only care about 'pos'?
3. this code - 'current_offset = ((uint64_t)current_offset + (context->moov_atom_size & 0xFFFFFFFF)) & 0xFFFFFFFF;'
	why go through 64 bit when we later cast it to 32 bit?

Other than that, your patch implies that function pointers should not be used, I think that in many cases,
they can be helpful. In this specific case, the function 'parse_atoms' was a generic function for parsing atoms,
after the change, it can't be reused for any other task.

Eran

>     [wp] Proved goals:  240 / 340

>          Qed:           156  (4ms-16ms-1.2s)

>          Alt-Ergo:       84  (28ms-133ms-776ms) (403) (interrupted: 22) (unknown: 78)

> 

> /Tomas

>
Tomas Härdin June 22, 2018, 2:51 p.m. UTC | #2
fre 2018-06-22 klockan 14:07 +0000 skrev Eran Kornblau:
> First, regarding the if you added, it's redundant - if you look a few lines above, you'll see 'if (atom.size < atom.header_size)'.
> atom.header_size is either 8 or 16, it can't be anything else, so atom.size can't be < 8 at this point.

If you look closely you'll see that check is after subtracting
atom.header_size.

> I'll leave it to the maintainers to decide whether this tool is helpful or not, IMHO, all these comments make the 
> code less readable, and some of the changes make it less efficient. I don't think this slight reduction of performance 
> matters much in the context of faststart, but in other parts of ffmpeg it can be significant.
> 
> Few examples to why I think it's less efficient -
> 1. the change of macros to functions - maybe the compiler will inline them, but maybe it won't...

You're assuming inlining actually makes the code faster. It's not the
80's anymore.

> 2. this extra statement - 'pos = atom->data + 8 + 4*i;' - 
> 	why for on 'i' when we only care about 'pos'?

It's an attempt at making that code easier to prove. If I can convince
alt-ergo that atom_data[0..(atom_size-1)] is valid and that the code
only touches bytes in that range then the code will pass inspection.
Using end-pos for that ended up with some rather torturous ACSL
annotations.

> 3. this code - 'current_offset = ((uint64_t)current_offset + (context->moov_atom_size & 0xFFFFFFFF)) & 0xFFFFFFFF;'
> 	why go through 64 bit when we later cast it to 32 bit?

Yeah that's just my attempt at saying "relax, overflow is fine".
Judging by the ACSL manual [1], adding an explicit cast to uint32_t may
be enough.

> Other than that, your patch implies that function pointers should not be used, I think that in many cases,
> they can be helpful. In this specific case, the function 'parse_atoms' was a generic function for parsing atoms,
> after the change, it can't be reused for any other task.

Yeah, but it only ever calls two functions. So calling them explicitly
makes the job of the prover much easier. If it were possible to add
annotations to the function pointer typedef such that assigning from a
pointer to a function that does not fulfill the contract is illegal,
that would be fine. That doesn't seem possible yet.

Of course, for something like struct AVCodec function pointers are
almost unavoidable.

Perhaps when I get a better hang of this I'll try adding annotations to
libavutil, and mxfdec.c. In the end I want less mental burden.

/Tomas

[1] http://frama-c.com/download/acsl_1.13.pdf
Eran Kornblau June 22, 2018, 7:34 p.m. UTC | #3
> -----Original Message-----

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of Tomas Härdin

> Sent: Friday, June 22, 2018 5:51 PM

> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>

> Subject: Re: [FFmpeg-devel] frama-c:ify qt-faststart.c, proof-of-concept (kinda)

> 

> > fre 2018-06-22 klockan 14:07 +0000 skrev Eran Kornblau:

> > First, regarding the if you added, it's redundant - if you look a few lines above, you'll see 'if (atom.size < atom.header_size)'.

> > atom.header_size is either 8 or 16, it can't be anything else, so atom.size can't be < 8 at this point.

> 

> If you look closely you'll see that check is after subtracting atom.header_size.

> 

You're right, I missed that, but now that I see it - it's not redundant, it's just wrong.
It's perfectly ok for an arbitrary atom to have size smaller than 8 bytes (frma is, for example, 
only 4 bytes), it's not ok for the two specific atoms that we're parsing - stco & co64. 
The validation of size >= 8 already existed in all branches that required it 
(parse_atoms / update_stco_offsets / update_co64_offsets)

> > I'll leave it to the maintainers to decide whether this tool is 

> > helpful or not, IMHO, all these comments make the code less readable, 

> > and some of the changes make it less efficient. I don't think this slight reduction of performance matters much in the context of faststart, but in other parts of ffmpeg it can be significant.

> > 

> > Few examples to why I think it's less efficient - 1. the change of 

> > macros to functions - maybe the compiler will inline them, but maybe it won't...

> 

> You're assuming inlining actually makes the code faster. It's not the 80's anymore.

>

> > 2. this extra statement - 'pos = atom->data + 8 + 4*i;' - 

> > 	why for on 'i' when we only care about 'pos'?

> 

> It's an attempt at making that code easier to prove. If I can convince alt-ergo that atom_data[0..(atom_size-1)] is valid and that the code only touches bytes in that range then the code will pass inspection.

> Using end-pos for that ended up with some rather torturous ACSL annotations.

> 

> > 3. this code - 'current_offset = ((uint64_t)current_offset + (context->moov_atom_size & 0xFFFFFFFF)) & 0xFFFFFFFF;'

> > 	why go through 64 bit when we later cast it to 32 bit?

> 

> Yeah that's just my attempt at saying "relax, overflow is fine".

> Judging by the ACSL manual [1], adding an explicit cast to uint32_t may be enough.

> 

Sounds a bit problematic to me to demand that everyone here learn all this stuff now, 
but again, it's not my call...

> > Other than that, your patch implies that function pointers should not 

> > be used, I think that in many cases, they can be helpful. In this 

> > specific case, the function 'parse_atoms' was a generic function for parsing atoms, after the change, it can't be reused for any other task.

> 

> Yeah, but it only ever calls two functions. So calling them explicitly makes the job of the prover much easier. If it were possible to add annotations to the function pointer typedef such that assigning from a pointer to a function that does not fulfill the contract is illegal, that would be fine. That doesn't seem possible yet.

> 

> Of course, for something like struct AVCodec function pointers are almost unavoidable.

> 

> Perhaps when I get a better hang of this I'll try adding annotations to libavutil, and mxfdec.c. In the end I want less mental burden.

> 

> /Tomas

> 

> [1] https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fframa-c.com%2Fdownload%2Facsl_1.13.pdf&data=02%7C01%7Ceran.kornblau%40kaltura.com%7Ce74b7daf0c4846621a6e08d5d84fa248%7C0c503748de3f4e2597e26819d53a42b6%7C0%7C1%7C636652758896295662&sdata=CjOaX1Uvl%2FsvYLmL6Pdvm2yg8RFNZ%2BcJUTJ11hyu4c8%3D&reserved=0

> _______________________________________________

> ffmpeg-devel mailing list

> ffmpeg-devel@ffmpeg.org

> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fffmpeg.org%2Fmailman%2Flistinfo%2Fffmpeg-devel&data=02%7C01%7Ceran.kornblau%40kaltura.com%7Ce74b7daf0c4846621a6e08d5d84fa248%7C0c503748de3f4e2597e26819d53a42b6%7C0%7C1%7C636652758896295662&sdata=HhRDsQBrzYopJhMPb7xuVtlo99ka9btAbcnE8091Yu8%3D&reserved=0

> 


Eran
Michael Niedermayer June 22, 2018, 9:12 p.m. UTC | #4
On Fri, Jun 22, 2018 at 04:51:09PM +0200, Tomas Härdin wrote:
> fre 2018-06-22 klockan 14:07 +0000 skrev Eran Kornblau:
> > First, regarding the if you added, it's redundant - if you look a few lines above, you'll see 'if (atom.size < atom.header_size)'.
> > atom.header_size is either 8 or 16, it can't be anything else, so atom.size can't be < 8 at this point.
> 
> If you look closely you'll see that check is after subtracting
> atom.header_size.
> 
> > I'll leave it to the maintainers to decide whether this tool is helpful or not, IMHO, all these comments make the 
> > code less readable, and some of the changes make it less efficient. I don't think this slight reduction of performance 
> > matters much in the context of faststart, but in other parts of ffmpeg it can be significant.
> > 
> > Few examples to why I think it's less efficient -
> > 1. the change of macros to functions - maybe the compiler will inline them, but maybe it won't...
> 
> You're assuming inlining actually makes the code faster. It's not the
> 80's anymore.

Until someone tests it, both claims are assumtations.
The date on the calender surely is not a good argument though even though it
has a somewhat "authorative" vibe to it.

Personally i prefer inline/always inline functions over macros when they are
equally fast though ...

Speaking about the date. I would have thought the need to manually annotate the
code for analyzers was a thing more for the past and for academia. But quite
possibly i have missed something ...

If you compare the time it takes a human to anotate a piece of code (some of
this anotation itself may be incorrect) and subsequently a analyzer to find bugs.
To the alternative
of a human manually reviewing the code line by line and a analyzer running over
the code which does not need anotations

which finds more bugs ?
The question matters, because i think we want the maximum return for the time that
is invested


[...]
Tomas Härdin June 23, 2018, 9:40 a.m. UTC | #5
fre 2018-06-22 klockan 23:12 +0200 skrev Michael Niedermayer:
> On Fri, Jun 22, 2018 at 04:51:09PM +0200, Tomas Härdin wrote:
> > fre 2018-06-22 klockan 14:07 +0000 skrev Eran Kornblau:
> > > First, regarding the if you added, it's redundant - if you look a few lines above, you'll see 'if (atom.size < atom.header_size)'.
> > > atom.header_size is either 8 or 16, it can't be anything else, so atom.size can't be < 8 at this point.
> > 
> > If you look closely you'll see that check is after subtracting
> > atom.header_size.
> > 
> > > I'll leave it to the maintainers to decide whether this tool is helpful or not, IMHO, all these comments make the 
> > > code less readable, and some of the changes make it less efficient. I don't think this slight reduction of performance 
> > > matters much in the context of faststart, but in other parts of ffmpeg it can be significant.
> > > 
> > > Few examples to why I think it's less efficient -
> > > 1. the change of macros to functions - maybe the compiler will inline them, but maybe it won't...
> > 
> > You're assuming inlining actually makes the code faster. It's not the
> > 80's anymore.
> 
> Until someone tests it, both claims are assumtations.
> The date on the calender surely is not a good argument though even though it
> has a somewhat "authorative" vibe to it.
> 
> Personally i prefer inline/always inline functions over macros when they are
> equally fast though ...
> 
> Speaking about the date. I would have thought the need to manually annotate the
> code for analyzers was a thing more for the past and for academia. But quite
> possibly i have missed something ...

It's common in industries where you need to guarantee that a piece of
code will never crash. Aerospace comes to mind.

> If you compare the time it takes a human to anotate a piece of code (some of
> this anotation itself may be incorrect) and subsequently a analyzer to find bugs.
> To the alternative
> of a human manually reviewing the code line by line and a analyzer running over
> the code which does not need anotations
> 
> which finds more bugs ?
> The question matters, because i think we want the maximum return for the time that
> is invested

The point is to let the machine automatically catch when certain
assumptions are violated, since mistakes happen. You are correct that
it can take considerable time though. I wouldn't insist on anyone else
doing it, this is mostly for my own sanity. And of course if the
contracts are wrong then you're screwed. Just like if ones assumptions
of what a function does are wrong.

As an example of just how wordy these annotations can get, I've
attached binary_search_tree_proved.c from [1]. It's 8 lines of code and
over 100 lines of annotations. But it does prove correct.

To cap off this post, I'll link to what I've said on the 'ol blogaroo
about Ada/SPARK: http://härdin.se/blog/2018/01/20/trying-out-ada-spark/

/Tomas

[1] https://bts.frama-c.com/dokuwiki/lib/exe/fetch.php?media=mantis:fra
ma-c:tutorial:lifo-2014:solutions.tar.gz
Tomas Härdin June 23, 2018, 9:42 a.m. UTC | #6
fre 2018-06-22 klockan 19:34 +0000 skrev Eran Kornblau:
> > -----Original Message-----
> > From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On
> > Behalf Of Tomas Härdin
> > Sent: Friday, June 22, 2018 5:51 PM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg
> > .org>
> > Subject: Re: [FFmpeg-devel] frama-c:ify qt-faststart.c, proof-of-
> > concept (kinda)
> > 
> > > fre 2018-06-22 klockan 14:07 +0000 skrev Eran Kornblau:
> > > First, regarding the if you added, it's redundant - if you look a
> > > few lines above, you'll see 'if (atom.size < atom.header_size)'.
> > > atom.header_size is either 8 or 16, it can't be anything else, so
> > > atom.size can't be < 8 at this point.
> > 
> > If you look closely you'll see that check is after subtracting
> > atom.header_size.
> > 
> 
> You're right, I missed that, but now that I see it - it's not
> redundant, it's just wrong.
> It's perfectly ok for an arbitrary atom to have size smaller than 8
> bytes (frma is, for example, 
> only 4 bytes), it's not ok for the two specific atoms that we're
> parsing - stco & co64. 
> The validation of size >= 8 already existed in all branches that
> required it 
> (parse_atoms / update_stco_offsets / update_co64_offsets)
> 
> > > I'll leave it to the maintainers to decide whether this tool is 
> > > helpful or not, IMHO, all these comments make the code less
> > > readable, 
> > > and some of the changes make it less efficient. I don't think
> > > this slight reduction of performance matters much in the context
> > > of faststart, but in other parts of ffmpeg it can be significant.
> > > 
> > > Few examples to why I think it's less efficient - 1. the change
> > > of 
> > > macros to functions - maybe the compiler will inline them, but
> > > maybe it won't...
> > 
> > You're assuming inlining actually makes the code faster. It's not
> > the 80's anymore.
> > 
> > > 2. this extra statement - 'pos = atom->data + 8 + 4*i;' - 
> > > 	why for on 'i' when we only care about 'pos'?
> > 
> > It's an attempt at making that code easier to prove. If I can
> > convince alt-ergo that atom_data[0..(atom_size-1)] is valid and
> > that the code only touches bytes in that range then the code will
> > pass inspection.
> > Using end-pos for that ended up with some rather torturous ACSL
> > annotations.
> > 
> > > 3. this code - 'current_offset = ((uint64_t)current_offset +
> > > (context->moov_atom_size & 0xFFFFFFFF)) & 0xFFFFFFFF;'
> > > 	why go through 64 bit when we later cast it to 32 bit?
> > 
> > Yeah that's just my attempt at saying "relax, overflow is fine".
> > Judging by the ACSL manual [1], adding an explicit cast to uint32_t
> > may be enough.
> > 
> 
> Sounds a bit problematic to me to demand that everyone here learn all
> this stuff now, 
> but again, it's not my call...

I wouldn't demand everyone do this, but I might want it for stuff I
maintain.

/Tomas
Michael Niedermayer June 23, 2018, 3:03 p.m. UTC | #7
On Sat, Jun 23, 2018 at 11:40:52AM +0200, Tomas Härdin wrote:
> fre 2018-06-22 klockan 23:12 +0200 skrev Michael Niedermayer:
> > On Fri, Jun 22, 2018 at 04:51:09PM +0200, Tomas Härdin wrote:
> > > fre 2018-06-22 klockan 14:07 +0000 skrev Eran Kornblau:
> > > > First, regarding the if you added, it's redundant - if you look a few lines above, you'll see 'if (atom.size < atom.header_size)'.
> > > > atom.header_size is either 8 or 16, it can't be anything else, so atom.size can't be < 8 at this point.
> > > 
> > > If you look closely you'll see that check is after subtracting
> > > atom.header_size.
> > > 
> > > > I'll leave it to the maintainers to decide whether this tool is helpful or not, IMHO, all these comments make the 
> > > > code less readable, and some of the changes make it less efficient. I don't think this slight reduction of performance 
> > > > matters much in the context of faststart, but in other parts of ffmpeg it can be significant.
> > > > 
> > > > Few examples to why I think it's less efficient -
> > > > 1. the change of macros to functions - maybe the compiler will inline them, but maybe it won't...
> > > 
> > > You're assuming inlining actually makes the code faster. It's not the
> > > 80's anymore.
> > 
> > Until someone tests it, both claims are assumtations.
> > The date on the calender surely is not a good argument though even though it
> > has a somewhat "authorative" vibe to it.
> > 
> > Personally i prefer inline/always inline functions over macros when they are
> > equally fast though ...
> > 
> > Speaking about the date. I would have thought the need to manually annotate the
> > code for analyzers was a thing more for the past and for academia. But quite
> > possibly i have missed something ...
> 
> It's common in industries where you need to guarantee that a piece of
> code will never crash. Aerospace comes to mind.

If you want to make FFmpeg reach Aerospace / Medicine like levels of
bug-free-ness, that is welcome.

Thanks

[...]
diff mbox

Patch

From d4e00db4d0922e745eb23b791b4f51e9d4e4d079 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <tjoppen@acc.umu.se>
Date: Tue, 5 Jun 2018 10:26:00 +0200
Subject: [PATCH] frama-c:ify qt-faststart.c

It's far form perfect.
Some modifications have been made to make it easier to prove:

* some macros have been changed to inline functions
* function pointers have been replaced with an integer specifying the desired callback
* all atoms must be at least of size 8 (this already seemed to be the case, just not explicitly)
* update_stco_offsets() got a bit of a rework
---
 tools/qt-faststart.c | 263 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 220 insertions(+), 43 deletions(-)

diff --git a/tools/qt-faststart.c b/tools/qt-faststart.c
index 5e88c38e6b..4b4733a418 100644
--- a/tools/qt-faststart.c
+++ b/tools/qt-faststart.c
@@ -42,33 +42,90 @@ 
 #define ftello(x)       _ftelli64(x)
 #endif
 
-#define MIN(a,b) ((a) > (b) ? (b) : (a))
+/**
+First we do Proof by Weakest Precondition Calculus.
+We'll also want to do value analysis since thing can over/underflow.
+
+Some notes on WP options I've looked at:
+
+Option          What I have to say                      What the manual says
+-----------------------------------------------------------------------------------------------------------------
+-wp             Required to get it going                Generate proof obligations for all (selected) properties.
+-wp-rte         For dealing with run-time exceptions?   Generate RTE guards before WP.
+-wp-dynamic                                             Handle dynamic calls with specific annotations.
+-wp-init-const                                          Use initializers for global const variables.
+-wp-invariants                                          Handle generalized invariants inside loops.
+-wp-split                                               Split conjunctions into sub-goals.
+-wp-model       For setting memory model, such as
+                Hoare (simple, but incompatible with
+                heap memory pointers)
+
+frama-c -wp -wp-rte qt-faststart.c
+[wp] Proved goals:  240 / 340
+     Qed:           156  (4ms-18ms-1.4s)
+     Alt-Ergo:       84  (16ms-134ms-768ms) (403) (interrupted: 22) (unknown: 78)
+*/
+
+/*@ ensures a <= b ==> \result == a;
+    ensures a >  b ==> \result == b;
+    assigns \nothing;
+ */
+static int64_t MIN(int64_t a, int64_t b)
+{
+    return a <= b ? a : b;
+}
 
-#define BE_32(x) (((uint32_t)(((uint8_t*)(x))[0]) << 24) |  \
-                             (((uint8_t*)(x))[1]  << 16) |  \
-                             (((uint8_t*)(x))[2]  <<  8) |  \
-                              ((uint8_t*)(x))[3])
+/*@ requires \valid(x + (0..3));
+    assigns \nothing;
+ */
+static uint32_t BE_32(const uint8_t *x)
+{
+    return ((uint32_t)x[0] << 24) +
+           ((uint32_t)x[1] << 16) +
+           ((uint32_t)x[2] <<  8) +
+            x[3];
+}
 
-#define BE_64(x) (((uint64_t)(((uint8_t*)(x))[0]) << 56) |  \
-                  ((uint64_t)(((uint8_t*)(x))[1]) << 48) |  \
-                  ((uint64_t)(((uint8_t*)(x))[2]) << 40) |  \
-                  ((uint64_t)(((uint8_t*)(x))[3]) << 32) |  \
-                  ((uint64_t)(((uint8_t*)(x))[4]) << 24) |  \
-                  ((uint64_t)(((uint8_t*)(x))[5]) << 16) |  \
-                  ((uint64_t)(((uint8_t*)(x))[6]) <<  8) |  \
-                  ((uint64_t)( (uint8_t*)(x))[7]))
+/*@ requires \valid(x + (0..3));
+    assigns x[0..3];
+ */
+static void AV_WB32(uint8_t *x, uint32_t a)
+{
+    x[0] = a >> 24;
+    x[1] = a >> 16;
+    x[2] = a >> 8;
+    x[3] = a;
+}
 
-#define AV_WB32(p, val)    {                    \
-    ((uint8_t*)(p))[0] = ((val) >> 24) & 0xff;  \
-    ((uint8_t*)(p))[1] = ((val) >> 16) & 0xff;  \
-    ((uint8_t*)(p))[2] = ((val) >> 8) & 0xff;   \
-    ((uint8_t*)(p))[3] = (val) & 0xff;          \
-    }
+/*@ requires \valid(x + (0..7));
+    assigns \nothing;
+ */
+static uint64_t BE_64(const uint8_t *x)
+{
+    return ((uint64_t)x[0] << 56) +
+           ((uint64_t)x[1] << 48) +
+           ((uint64_t)x[2] << 40) +
+           ((uint64_t)x[3] << 32) +
+           ((uint64_t)x[4] << 24) +
+           ((uint64_t)x[5] << 16) +
+           ((uint64_t)x[6] <<  8) +
+            x[7];
+}
 
-#define AV_WB64(p, val)    {                    \
-    AV_WB32(p, (val) >> 32)                     \
-    AV_WB32(p + 4, val)                         \
-    }
+/*@ requires \valid(x + (0..7));
+    assigns x[0..7];
+ */
+static void AV_WB64(uint8_t *x, uint64_t a)
+{
+    x[0] = a >> 56;
+    x[1] = a >> 48;
+    x[2] = a >> 40;
+    x[3] = a >> 32;
+    x[4] = a >> 24;
+    x[5] = a >> 16;
+    x[6] = a >> 8;
+    x[7] = a;
+}
 
 #define BE_FOURCC(ch0, ch1, ch2, ch3)           \
     ( (uint32_t)(unsigned char)(ch3)        |   \
@@ -122,23 +179,55 @@  typedef struct {
     uint64_t new_moov_size;
 } upgrade_stco_context_t;
 
-typedef int (*parse_atoms_callback_t)(void *context, atom_t *atom);
+//Called when fn == 0 in parse_atoms()
+static int update_chunk_offsets_callback(update_chunk_offsets_context_t *context, atom_t *atom);
+//Called when fn == 1 in parse_atoms()
+static int upgrade_stco_callback(upgrade_stco_context_t *context, atom_t *atom);
 
+/*@ requires size >= ATOM_PREAMBLE_SIZE;
+    requires \valid(buf + (0..size-1));
+    ensures -1 <= \result <= 0;
+
+    behavior update_chunk_offsets:
+        requires fn == 0;
+        requires \valid(chunk_context);
+        requires stco_context == NULL;
+
+    behavior upgrade_stco:
+        requires fn == 1;
+        requires chunk_context == NULL;
+        requires \valid(stco_context);
+
+    complete behaviors update_chunk_offsets, upgrade_stco;
+ */
 static int parse_atoms(
     unsigned char *buf,
     uint64_t size,
-    parse_atoms_callback_t callback,
-    void *context)
+    int fn,  //avoid function pointers
+    update_chunk_offsets_context_t *chunk_context,
+    upgrade_stco_context_t *stco_context)
 {
     unsigned char *pos = buf;
     unsigned char *end = pos + size;
+    //@ assert end >= pos + ATOM_PREAMBLE_SIZE;
     atom_t atom;
     int ret;
 
+    //this needs work
+    /*@ loop invariant \valid(pos + (0..(ATOM_PREAMBLE_SIZE-1)));
+        loop invariant end - pos >= ATOM_PREAMBLE_SIZE;
+        loop invariant \valid(pos + (0 .. (end-pos-1)));
+        loop assigns pos;
+        loop variant end - pos;
+     */
     while (end - pos >= ATOM_PREAMBLE_SIZE) {
+        //@ assert \valid(pos + (0..7));
         atom.size = BE_32(pos);
         atom.type = BE_32(pos + 4);
         pos += ATOM_PREAMBLE_SIZE;
+        //we probably need to handle pos == end specially
+        //@ assert pos <= end;
+        //@ assert pos < end ==> \valid(pos + (0 .. (end-pos-1)));
         atom.header_size = ATOM_PREAMBLE_SIZE;
 
         switch (atom.size) {
@@ -148,6 +237,7 @@  static int parse_atoms(
                 return -1;
             }
 
+            //@ assert \valid(pos + (0..7));
             atom.size = BE_64(pos);
             pos += 8;
             atom.header_size = ATOM_PREAMBLE_SIZE + 8;
@@ -158,6 +248,7 @@  static int parse_atoms(
             break;
         }
 
+        //@ assert atom.header_size == 8 || atom.header_size == 16;
         if (atom.size < atom.header_size) {
             fprintf(stderr, "atom size %"PRIu64" too small\n", atom.size);
             return -1;
@@ -170,8 +261,20 @@  static int parse_atoms(
             return -1;
         }
 
+        //both callbacks expect >= 8 bytes
+        if (atom.size < 8) {
+            fprintf(stderr, "atom size %"PRIu64" too small\n", atom.size);
+            return -1;
+        }
+
         atom.data = pos;
-        ret = callback(context, &atom);
+        //@ assert \valid(atom.data + ((-atom.header_size)..(atom.size-1)));
+        if (fn == 0) {
+            ret = update_chunk_offsets_callback(chunk_context, &atom);
+        } else { //1
+            ret = upgrade_stco_callback(stco_context, &atom);
+        }
+        //@ assert -1 <= ret <= 0;
         if (ret < 0) {
             return ret;
         }
@@ -182,12 +285,15 @@  static int parse_atoms(
     return 0;
 }
 
+/*@ requires \valid(context);
+    requires \valid(atom);
+    requires 8 <= atom->size ==> \valid(atom->data + (0..(atom->size-1)));
+    ensures -1 <= \result <= 0;
+ */
 static int update_stco_offsets(update_chunk_offsets_context_t *context, atom_t *atom)
 {
     uint32_t current_offset;
     uint32_t offset_count;
-    unsigned char *pos;
-    unsigned char *end;
 
     printf(" patching stco atom...\n");
     if (atom->size < 8) {
@@ -195,29 +301,48 @@  static int update_stco_offsets(update_chunk_offsets_context_t *context, atom_t *
         return -1;
     }
 
+    //@ assert atom->size >= 8;
+    //@ assert \valid(atom->data + (0..7));
+    //@ assert \valid(atom->data + (0..(atom->size-1)));
     offset_count = BE_32(atom->data + 4);
     if (offset_count > (atom->size - 8) / 4) {
         fprintf(stderr, "stco offset count %"PRIu32" too big\n", offset_count);
         return -1;
     }
+    //@ assert offset_count*4 + 8 <= atom->size;
+    //@ assert \valid(atom->data + (8 .. (8 + 4*offset_count - 1)));
 
     context->stco_offset_count += offset_count;
     context->stco_data_size += atom->size - 8;
 
-    for (pos = atom->data + 8, end = pos + offset_count * 4;
-        pos < end;
-        pos += 4) {
+    /*@ loop invariant \valid(atom->data + ((8 + 4*i) .. (8 + 4*i + 3)));
+        loop assigns i;
+        loop variant offset_count - i;
+     */
+    for (uint32_t i = 0; i < offset_count; i++) {
+        //@ assert \valid(atom->data + ((8 + 4*i) .. (8 + 4*i + 3)));
+        unsigned char *pos = atom->data + 8 + 4*i;
+        //@ assert \valid(pos + (0..3));
         current_offset = BE_32(pos);
         if (current_offset > UINT_MAX - context->moov_atom_size) {
             context->stco_overflow = 1;
+            //make frama-c happy
+            current_offset = ((uint64_t)current_offset + (context->moov_atom_size & 0xFFFFFFFF)) & 0xFFFFFFFF;
+        } else {
+            current_offset += context->moov_atom_size;
         }
-        current_offset += context->moov_atom_size;
+        //@ assert \valid(pos + (0..3));
         AV_WB32(pos, current_offset);
     }
 
     return 0;
 }
 
+/*@ requires \valid(context);
+    requires \valid(atom);
+    requires 8 <= atom->size ==> \valid(atom->data + (0..(atom->size-1)));
+    ensures -1 <= \result <= 0;
+ */
 static int update_co64_offsets(update_chunk_offsets_context_t *context, atom_t *atom)
 {
     uint64_t current_offset;
@@ -248,9 +373,15 @@  static int update_co64_offsets(update_chunk_offsets_context_t *context, atom_t *
     return 0;
 }
 
-static int update_chunk_offsets_callback(void *ctx, atom_t *atom)
+/*@ requires \valid(context);
+    requires \valid(atom);
+    requires 8 <= atom->size;
+    requires 8 <= atom->header_size <= 16;
+    requires \valid(atom->data + ((-atom->header_size) .. (atom->size-1)));
+    ensures -1 <= \result <= 0;
+ */
+static int update_chunk_offsets_callback(update_chunk_offsets_context_t *context, atom_t *atom)
 {
-    update_chunk_offsets_context_t *context = ctx;
     int ret;
 
     switch (atom->type) {
@@ -274,15 +405,29 @@  static int update_chunk_offsets_callback(void *ctx, atom_t *atom)
         ret = parse_atoms(
             atom->data,
             atom->size,
-            update_chunk_offsets_callback,
-            context);
+            0,
+            context,
+            NULL);
         context->depth--;
+        //@ assert -1 <= ret <= 0;
         return ret;
     }
 
     return 0;
 }
 
+/*@ behavior narrow_header:
+        requires header_size == 8;
+        requires \valid(header + (0..3));
+        assigns header[0..3];
+
+    behavior wide_header:
+        requires header_size == 16;
+        requires \valid(header + (8..15));
+        assigns header[8..15];
+
+    complete behaviors narrow_header, wide_header;
+ */
 static void set_atom_size(unsigned char *header, uint32_t header_size, uint64_t size)
 {
     switch (header_size) {
@@ -296,6 +441,12 @@  static void set_atom_size(unsigned char *header, uint32_t header_size, uint64_t
     }
 }
 
+/*@ requires \valid(context);
+    requires \valid(atom);
+    requires 8 <= atom->size;
+    requires \valid(atom->data + (0 .. (atom->size-1)));
+    requires \valid(context->dest + (0..7));
+ */
 static void upgrade_stco_atom(upgrade_stco_context_t *context, atom_t *atom)
 {
     unsigned char *pos;
@@ -325,9 +476,15 @@  static void upgrade_stco_atom(upgrade_stco_context_t *context, atom_t *atom)
     }
 }
 
-static int upgrade_stco_callback(void *ctx, atom_t *atom)
+/*@ requires \valid(context);
+    requires \valid(atom);
+    requires 8 <= atom->size;
+    requires 8 <= atom->header_size <= 16;
+    requires \valid(atom->data + ((-atom->header_size) .. (atom->size-1)));
+    ensures -1 <= \result <= 0;
+ */
+static int upgrade_stco_callback(upgrade_stco_context_t *context, atom_t *atom)
 {
-    upgrade_stco_context_t *context = ctx;
     unsigned char *start_pos;
     uint64_t copy_size;
 
@@ -350,7 +507,8 @@  static int upgrade_stco_callback(void *ctx, atom_t *atom)
         if (parse_atoms(
             atom->data,
             atom->size,
-            upgrade_stco_callback,
+            1,
+            NULL,
             context) < 0) {
             return -1;
         }
@@ -369,6 +527,12 @@  static int upgrade_stco_callback(void *ctx, atom_t *atom)
     return 0;
 }
 
+/*@ requires \valid(moov_atom);
+    requires \valid(moov_atom_size);
+    requires \valid((*moov_atom) + (0..(*moov_atom_size)));
+    requires *moov_atom_size >= 16;
+    ensures -1 <= \result <= 0;
+ */
 static int update_moov_atom(
     unsigned char **moov_atom,
     uint64_t *moov_atom_size)
@@ -382,8 +546,9 @@  static int update_moov_atom(
     if (parse_atoms(
         *moov_atom,
         *moov_atom_size,
-        update_chunk_offsets_callback,
-        &update_context) < 0) {
+        0,
+        &update_context,
+        NULL) < 0) {
         return -1;
     }
 
@@ -409,7 +574,8 @@  static int update_moov_atom(
     if (parse_atoms(
         *moov_atom,
         *moov_atom_size,
-        upgrade_stco_callback,
+        1,
+        NULL,
         &upgrade_context) < 0) {
         free(new_moov_atom);
         return -1;
@@ -427,6 +593,11 @@  static int update_moov_atom(
     return 0;
 }
 
+/*@ requires argc >= 1;
+    requires \valid(argv + (0..argc-1));
+    requires \forall int i; 0 <= i <= argc-1 ==> \valid(argv[i]);
+    ensures 0 <= \result <= 1;
+ */
 int main(int argc, char *argv[])
 {
     FILE *infile  = NULL;
@@ -450,6 +621,8 @@  int main(int argc, char *argv[])
         return 0;
     }
 
+    //@ assert argc == 3;
+    //@ assert \valid(argv + (0..2));
     if (!strcmp(argv[1], argv[2])) {
         fprintf(stderr, "input and output files need to be different\n");
         return 1;
@@ -485,6 +658,7 @@  int main(int argc, char *argv[])
                        atom_size);
                 goto error_out;
             }
+            //@ assert \valid(ftyp_atom + (0..(ftyp_atom_size-1)));
             if (fseeko(infile, -ATOM_PREAMBLE_SIZE, SEEK_CUR) ||
                 fread(ftyp_atom, atom_size, 1, infile) != 1 ||
                 (start_offset = ftello(infile)) < 0) {
@@ -566,6 +740,7 @@  int main(int argc, char *argv[])
         fprintf(stderr, "could not allocate %"PRIu64" bytes for moov atom\n", atom_size);
         goto error_out;
     }
+    //@ assert \valid(moov_atom + (0..(moov_atom_size-1)));
     if (fread(moov_atom, atom_size, 1, infile) != 1) {
         perror(argv[1]);
         goto error_out;
@@ -631,9 +806,11 @@  int main(int argc, char *argv[])
         fprintf(stderr, "could not allocate %d bytes for copy_buffer\n", bytes_to_copy);
         goto error_out;
     }
+    //@ assert \valid(copy_buffer + (0..(bytes_to_copy-1)));
     printf(" copying rest of file...\n");
     while (last_offset) {
         bytes_to_copy = MIN(bytes_to_copy, last_offset);
+        //@ assert \valid(copy_buffer + (0..(bytes_to_copy-1)));
 
         if (fread(copy_buffer, bytes_to_copy, 1, infile) != 1) {
             perror(argv[1]);
-- 
2.11.0