diff mbox series

[FFmpeg-devel,2/2] doc/utils/eval: clarify meaning of random* seed value

Message ID 20240101193859.1038078-2-stefasab@gmail.com
State New
Headers show
Series [FFmpeg-devel,1/2] libavutil/eval: introduce UINT64_MAX constant | expand

Checks

Context Check Description
andriy/configure_x86 warning Failed to apply patch
yinshiyou/configure_loongarch64 warning Failed to apply patch

Commit Message

Stefano Sabatini Jan. 1, 2024, 7:38 p.m. UTC
Possible address trac issue:
http://trac.ffmpeg.org/ticket/10763
---
 doc/utils.texi | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Michael Koch Jan. 3, 2024, 11:20 a.m. UTC | #1
> Possible address trac issue:

> http://trac.ffmpeg.org/ticket/10763

I don't like the random generator as it is, because the first two random numbers
are very close to zero, as can be shown with this command line:
ffmpeg -loglevel repeat -f lavfi -i nullsrc=size=1x1,format=gray -vf 
"geq=lum='print(random(0));print(random(0));print(random(0));print(random(0))'" 
-frames 1 -y out.png 0.000000 0.000091 0.285346 0.929202 This behaviour 
can be improved by inizializing the generator with a large number as 
seed value. I'm not sure if it's a good idea to share the same variables 
for ld() and st() (as double) and random() (as unsigned int). Wouldn't 
it be better to use separate variables for random()? Michael
Stefano Sabatini Jan. 3, 2024, 4:14 p.m. UTC | #2
On date Wednesday 2024-01-03 12:20:12 +0100, Michael Koch wrote:
> > Possible address trac issue:
> 
> > http://trac.ffmpeg.org/ticket/10763
> 

> I don't like the random generator as it is, because the first two random numbers
> are very close to zero, as can be shown with this command line:
> ffmpeg -loglevel repeat -f lavfi -i nullsrc=size=1x1,format=gray -vf "geq=lum='print(random(0));print(random(0));print(random(0));print(random(0))'"
> -frames 1 -y out.png 0.000000 0.000091 0.285346 0.929202 This behaviour can
> be improved by inizializing the generator with a large number as seed value.

OTOH this seems only to happen with 0, which is also the default seed
value used by the random generator (since all variables are set to 0).

$ echo "st(0, UINT64_MAX/3); st(1, 10); while(gt(ld(1), 0), print(random(0)); st(1, ld(1)-1))" | tools/ffeval 
0.666667
0.333374
0.616496
0.023515
0.931519
0.317050
0.393449
0.969531
0.842403
0.129183
=> 0.000000
$ echo "st(0, UINT64_MAX/2); st(1, 10); while(gt(ld(1), 0), print(random(0)); st(1, ld(1)-1))" | tools/ffeval 
0.500000
0.500091
0.785217
0.673490
0.103073
0.829142
0.144794
0.508226
0.735092
0.655693
=> 0.000000
$ echo "st(0, UINT64_MAX/7); st(1, 10); while(gt(ld(1), 0), print(random(0)); st(1, ld(1)-1))" | tools/ffeval 
0.285714
0.571498
0.855853
0.398648
0.009648
0.802972
0.608529
0.007854
0.405221
0.959683
=> 0.000000
$ echo "st(0, UINT64_MAX/13); st(1, 10); while(gt(ld(1), 0), print(random(0)); st(1, ld(1)-1))" | tools/ffeval 
0.384615
0.923180
0.593572
0.754748
0.415309
0.067556
0.905708
0.928599
0.448587
0.772728
=> 0.000000
$ echo "st(0, UINT64_MAX/6); st(1, 10); while(gt(ld(1), 0), print(random(0)); st(1, ld(1)-1))" | tools/ffeval 
0.833333
0.166733
0.451010
0.605203
0.240940
0.007228
0.778975
0.354595
0.413333
0.898240
=> 0.000000
$ echo "st(0, 0); st(1, 10); while(gt(ld(1), 0), print(random(0)); st(1, ld(1)-1))" | tools/ffeval 
0.000000
0.000091
0.285346
0.929202
0.239519
0.140744
0.686872
0.275489
0.537404
0.583466
=> 0.000000

(BTW it could be handy to have an inc and dec function as well to
simplify this kind of loops.)

> I'm not sure if it's a good idea to share the same variables for ld() and
> st() (as double) and random() (as unsigned int).

All functions work with double precision floating point numbers, but
they can be used also to represent integers, but with a few
limitations (e.g. you cannot represent many integers with a float
exactly).

> Wouldn't it be better to use separate variables for random()?
> Michael

One idea would be to use a random_seed(idx) function, but this would mean
that we need to use separate registers only for the random function,
and this would also break backward compatibility.
Michael Koch Jan. 3, 2024, 5:04 p.m. UTC | #3
On one hand, it's good that the 10 variables for ld() and st() are 
initialized with zero.
But on the other hand, zero is obviously the worst possible seed value 
for the random generator.

For example, make a 1000x1000 image with red color, and then fill one 
pixel at a random position with yellow color. Surprise surprise, the 
pixel is in the top left corner. That's not what a user expects from a 
random generator. It took me some time to figure out what's the problem, 
and how to set a better seed value.

Michael
Michael Koch Jan. 3, 2024, 6:36 p.m. UTC | #4
Possible better algorithm for the random generator, which might fix the 
problem:

eval.c  line 235

if (r == 0)
   r = UINT64_MAX/13;
else
   r = r*1664525+1013904223;

Michael
Michael Niedermayer Jan. 3, 2024, 10:29 p.m. UTC | #5
On Wed, Jan 03, 2024 at 07:36:00PM +0100, Michael Koch wrote:
> Possible better algorithm for the random generator, which might fix the
> problem:
> 
> eval.c  line 235
> 
> if (r == 0)
>   r = UINT64_MAX/13;
> else
>   r = r*1664525+1013904223;

This modifies the properties of the PRNG as it doesnt just change
the seed but jumps the generator from 0 to a differnt value, that
may happen more often than just at the start

posted a more correct fix for the low quality numbers

thx

[...]
diff mbox series

Patch

diff --git a/doc/utils.texi b/doc/utils.texi
index ac9b63826e..554a07f057 100644
--- a/doc/utils.texi
+++ b/doc/utils.texi
@@ -944,11 +944,17 @@  Return a pseudo random value between 0.0 and 1.0. @var{idx} is the
 index of the internal variable which will be used to save the
 seed/state.
 
+The seed is represented as 64-bit unsigned integer. You can use a
+fraction of @code{UINT64_MAX} to initialize it.
+
 @item randomi(idx, min, max)
 Return a pseudo random value in the interval between @var{min} and
 @var{max}. @var{idx} is the index of the internal variable which will
 be used to save the seed/state.
 
+The seed is represented as 64-bit unsigned integer. You can use a
+fraction of @code{UINT64_MAX} to initialize it.
+
 @item root(expr, max)
 Find an input value for which the function represented by @var{expr}
 with argument @var{ld(0)} is 0 in the interval 0..@var{max}.