[FFmpeg-devel] configure: speedup x2-x8

Message ID 378315057.2531045.1535207584068@mail.yahoo.com
State Superseded
Commit Message

avih Aug. 25, 2018, 2:33 p.m. UTC
Hi,

I noticed that configure can be a bit slow - a few minutes on my macOS
system, 1:30m on linux/bash, 30s on linux/dash, and, even if
pathological, a still-too-long 10 minutes on Windows (MSYS2) (the same
duration as a full build with make -j4 after configure).

I added some timing information printouts for various parts in
configure, and eventually identified a few culprits.

The attachment "config-timing.patch" adds these printouts and a sorted
summary. It's attached for reference and not intended to be merged.
It applies cleanly before or after the main patch.


The attached "main.patch" addresses three areas I identified as slow:

1. About 50-70% of configure runtime was being spent inside one
   function: flatten_extralibs() and its callees resolve() and unique().
   It manipulated strings and invoked nearly 20K (20000) subshells.
   It was rewritten to avoid subshells, and ended up x50-x250 faster.

2. print_enabled_components() was invoking sed about 350 times on one
   file. This is never instant, and it takes many seconds where fork is
   slow (Windows). Invoking sed only once instead gives a x4-x10 speedup.

3. After the previous speedups, configure spent 20-60% of its runtime
   at check_deps(). It's particularly slow with bash. After some local
   optimizations - mainly avoiding pushvar/popvar and aborting early in
   one notable case (empty deps) - it's now x4-x25 faster.
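
To illustrate the first point, here is a minimal sketch of the general
technique (it is not the code from main.patch): a helper stores its result
in a variable named by the caller instead of printing it for $(...) capture,
so no subshell is forked per call. The names prefix_sub and prefix_nosub are
hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only; not taken from main.patch.

# Subshell style: the result is printed and the caller captures it with
# $(...), which forks a subshell on every call.
prefix_sub() {
    for item in $2; do printf '%s ' "$1$item"; done
}

# No-subshell style: the result is stored in the variable named by $1,
# so the caller needs no command substitution.
prefix_nosub() {
    _out=
    for _item in $3; do _out="$_out$2$_item "; done
    eval "$1=\$_out"
}

a=$(prefix_sub -l "avcodec avutil")
prefix_nosub b -l "avcodec avutil"
echo "subshell:    $a"
echo "no subshell: $b"
```

Both calls produce the same string; only the second avoids the fork, which
is where the cost adds up when a function is called thousands of times.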


Some resulting speedups (more details at "before-after.txt"):

- macOS/bash:  total: 98s -> 22s    flatten_extralibs:  53s -> 0.7s
- Linux/bash:  total: 87s -> 11s    flatten_extralibs:  59s -> 0.9s
- Linux/dash:  total: 27s ->  8s    flatten_extralibs:  17s -> 0.2s
- FreeBSD/sh:  total: 34s ->  9s    flatten_extralibs:  23s -> 0.2s
- MSYS2/bash:  total: 10m -> 2:30m  flatten_extralibs: 400s -> 1.5s (!)


Notes:

- unique() is modified together with flatten_extralibs(). It now outputs
  a different order: it was keeping the last instance of recurring
  items, now it keeps the first. It affects the libs order at
  ffbuild/config.{mak,sh} - but I don't think it matters. If it does,
  "opt1-reorder-unique.patch"
  restores the original order. Let me know if/why it matters and I'll
  squash it and update the commit message accordingly if required.

- After the check_deps() patch, pushvar() and popvar() are not used but
  I was hesitant to remove them (nice to have). If you think they should
  be removed, the patch "opt2-remove-pushvar.patch" removes them.

- The patches assume POSIX shell and don't use anything "tricky".
  It was tested with dash, bash, busybox-ash, freebsd-sh, ksh93u, mksh.

- Thanks to tmm1 and atomnuker for their help with testing.
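
The ordering change in unique() described in the first note can be
illustrated with a small sketch. This is not the unique() from configure or
from the patch; unique_first and unique_last are hypothetical names, and
both return their result in the variable named by the first argument so
that no subshell is needed.

```shell
#!/bin/sh
# Hypothetical sketch; not the unique() from configure or from the patch.

# Keep the first instance of each item.
unique_first() {
    _uf_var=$1; shift
    _uf_out=
    for _uf_item in "$@"; do
        case " $_uf_out " in
            *" $_uf_item "*) ;;   # already in the result: skip
            *) _uf_out="${_uf_out:+$_uf_out }$_uf_item" ;;
        esac
    done
    eval "$_uf_var=\$_uf_out"
}

# Keep the last instance of each item.
unique_last() {
    _ul_var=$1; shift
    _ul_out=
    while [ $# -gt 0 ]; do
        _ul_item=$1; shift
        case " $* " in
            *" $_ul_item "*) ;;   # occurs again later: skip this one
            *) _ul_out="${_ul_out:+$_ul_out }$_ul_item" ;;
        esac
    done
    eval "$_ul_var=\$_ul_out"
}

unique_first first -lm -lz -lm -lpthread
unique_last  last  -lm -lz -lm -lpthread
echo "keep first: $first"
echo "keep last:  $last"
```

For the sample input the two variants differ only in where -lm ends up,
which is the kind of difference that would show up in the libs order at
ffbuild/config.{mak,sh}.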


This is my first mail to this list, and I wasn't sure how to split the
patches. After asking a veteran, it was suggested to me that I should
put all of them in one email. Just let me know if you prefer something
else (dev mailing lists are not my forte...).

For convenience, the patches are also available at:
https://gist.github.com/avih/f51008225d4a20a0981daed1faca4bc2

Regards,
Avi
configure - speedup - before/after details

Systems:
- Linux: Ubuntu 16.04 VM: bash 4.3/4.4, dash 0.5.8, busybox-ash 1.29, ksh93u
- macOS: iMac 5k 2017 i7: bash 3.2
- macOS: MBA late 2010:   bash 3.2, dash 0.5.10
- FreeBSD: GhostBSD 11.1: sh
- MSYS2: latest (gcc 8):  bash 4.4, dash 0.5.10
- Linux: debian unstable: dash, bash

---------------------------------------------
Linux - Ubuntu 16.04 - VM (i7 3630qm)

/bin/sh (dash):  28s ->  8.3s  = x3.3
bash 4.3      : 103s -> 12.5s  = x8
bash 4.4      :  87s -> 11.3s  = x8
busybox ash   :  37s ->  9s    = x4
ksh93         :  23s ->  9s    = x2.5
---------------------------------------------

bash 4.3 BEFORE:
102787 ms, 1 ms/measure, Linux/bash (4.3.48)
-----
 72526 ms  71 %  flatten_extralibs_wrapper()
 18329 ms  18 %  check_deps()
  8295 ms   8 %  Compilation tests
  2062 ms   2 %  print_enabled_components()
   992 ms   1 %  Writing config
   343 ms   0 %  Prepare deps
   190 ms   0 %  Options review
    33 ms   0 %  Init
    17 ms   0 %  Finalize

bash 4.3 AFTER:
12529 ms, 1 ms/measure, Linux/bash (4.3.48)
-----
 8358 ms  67 %  Compilation tests
 1408 ms  11 %  check_deps()
  920 ms   7 %  flatten_extralibs_wrapper()
  735 ms   6 %  Writing config
  520 ms   4 %  print_enabled_components()
  346 ms   3 %  Prepare deps
  190 ms   2 %  Options review
   35 ms   0 %  Init
   17 ms   0 %  Finalize


bash 4.4 BEFORE:
86601 ms, 1 ms/measure, Linux/bash (4.4.19)
-----
58889 ms  68 %  flatten_extralibs_wrapper()
16288 ms  19 %  check_deps()
 8127 ms   9 %  Compilation tests
 2013 ms   2 %  print_enabled_components()
  793 ms   1 %  Writing config
  253 ms   0 %  Prepare deps
  188 ms   0 %  Options review
   34 ms   0 %  Init
   16 ms   0 %  Finalize

bash 4.4 AFTER:
11373 ms, 1 ms/measure, Linux/bash (4.4.19)
-----
 8170 ms  72 %  Compilation tests
  985 ms   9 %  check_deps()
  758 ms   7 %  flatten_extralibs_wrapper()
  555 ms   5 %  Writing config
  425 ms   4 %  print_enabled_components()
  250 ms   2 %  Prepare deps
  182 ms   2 %  Options review
   32 ms   0 %  Init
   16 ms   0 %  Finalize


dash 0.5.8 BEFORE:
28308 ms, 1 ms/measure, Linux/dash
-----
18306 ms  65 %  flatten_extralibs_wrapper()
 7450 ms  26 %  Compilation tests
 1260 ms   4 %  check_deps()
  895 ms   3 %  print_enabled_components()
  227 ms   1 %  Writing config
   88 ms   0 %  Options review
   59 ms   0 %  Prepare deps
   14 ms   0 %  Init
    9 ms   0 %  Finalize

dash 0.5.8 AFTER:
8312 ms, 1 ms/measure, Linux/dash
-----
7390 ms  89 %  Compilation tests
 298 ms   4 %  check_deps()
 218 ms   3 %  flatten_extralibs_wrapper()
 136 ms   2 %  Writing config
 100 ms   1 %  print_enabled_components()
  88 ms   1 %  Options review
  59 ms   1 %  Prepare deps
  14 ms   0 %  Init
   9 ms   0 %  Finalize


busybox-ash 1.29 BEFORE:
36580 ms, 1 ms/measure, Linux/busybox (ash)
-----
25559 ms  70 %  flatten_extralibs_wrapper()
 7580 ms  21 %  Compilation tests
 1974 ms   5 %  check_deps()
  962 ms   3 %  print_enabled_components()
  277 ms   1 %  Writing config
  107 ms   0 %  Options review
   92 ms   0 %  Prepare deps
   20 ms   0 %  Init
    9 ms   0 %  Finalize

busybox-ash 1.29 AFTER:
8888 ms, 1 ms/measure, Linux/busybox (ash)
-----
7603 ms  86 %  Compilation tests
 452 ms   5 %  check_deps()
 311 ms   3 %  flatten_extralibs_wrapper()
 189 ms   2 %  Writing config
 113 ms   1 %  print_enabled_components()
 101 ms   1 %  Options review
  91 ms   1 %  Prepare deps
  18 ms   0 %  Init
  10 ms   0 %  Finalize


ksh93u BEFORE:
22483 ms, 0 ms/measure, Linux/ksh93
-----
11985 ms  53 %  flatten_extralibs_wrapper()
 7353 ms  33 %  Compilation tests
 1690 ms   8 %  check_deps()
  914 ms   4 %  print_enabled_components()
  330 ms   1 %  Writing config
  108 ms   0 %  Options review
   79 ms   0 %  Prepare deps
   14 ms   0 %  Init
   10 ms   0 %  Finalize

ksh93u AFTER:
8898 ms, 1 ms/measure, Linux/ksh93
-----
7511 ms  84 %  Compilation tests
 353 ms   4 %  check_deps()
 333 ms   4 %  flatten_extralibs_wrapper()
 328 ms   4 %  Writing config
 162 ms   2 %  print_enabled_components()
 107 ms   1 %  Options review
  80 ms   1 %  Prepare deps
  15 ms   0 %  Init
   9 ms   0 %  Finalize


---------------------------------------------
tmm1:
macOS (iMac 5k 2017 i7)
/bin/sh (bash 3.2):  98s -> 22s = x4.5
---------------------------------------------

bash BEFORE:
98134 ms, 30 ms/measure, Darwin/sh
-----
52947 ms  54 %  flatten_extralibs_wrapper()
23608 ms  24 %  check_deps()
17392 ms  18 %  Compilation tests
 2603 ms   3 %  print_enabled_components()
  994 ms   1 %  Writing config
  281 ms   0 %  Prepare deps
  251 ms   0 %  Options review
   39 ms   0 %  Init
   19 ms   0 %  Finalize

bash AFTER:
21981 ms, 29 ms/measure, Darwin/sh
-----
17783 ms  81 %  Compilation tests
 1396 ms   6 %  check_deps()
  791 ms   4 %  Writing config
  733 ms   3 %  print_enabled_components()
  682 ms   3 %  flatten_extralibs_wrapper()
  285 ms   1 %  Prepare deps
  248 ms   1 %  Options review
   40 ms   0 %  Init
   23 ms   0 %  Finalize



---------------------------------------------
macOS (MBA late 2010)
/bin/sh (bash 3.2):  272s -> 61s = x4.5
dash              :  114s -> 51s = x2.3
---------------------------------------------

bash 3.2 BEFORE:
272643 ms, 67 ms/measure, Darwin/sh (bash 3.2.57)
-----
140802 ms  52 %  flatten_extralibs_wrapper()
 71122 ms  26 %  check_deps()
 48587 ms  18 %  Compilation tests
  7445 ms   3 %  print_enabled_components()
  2898 ms   1 %  Writing config
   797 ms   0 %  Prepare deps
   791 ms   0 %  Options review
   133 ms   0 %  Init
    68 ms   0 %  Finalize

bash 3.2 AFTER:
60802 ms, 68 ms/measure, Darwin/sh (bash 3.2.57)
-----
48835 ms  80 %  Compilation tests
 3663 ms   6 %  check_deps()
 2212 ms   4 %  Writing config
 2209 ms   4 %  print_enabled_components()
 2084 ms   3 %  flatten_extralibs_wrapper()
  807 ms   1 %  Options review
  790 ms   1 %  Prepare deps
  147 ms   0 %  Init
   55 ms   0 %  Finalize


dash 0.5.10 before and after:
Note: I got some reports that configure sometimes fails in dash, possibly under
specific configurations. This wasn't investigated further.

dash 0.5.10 BEFORE:
114689 ms, 66 ms/measure, Darwin/dash
-----
 56977 ms  50 %  flatten_extralibs_wrapper()
 46503 ms  41 %  Compilation tests
  5466 ms   5 %  check_deps()
  3732 ms   3 %  print_enabled_components()
  1216 ms   1 %  Writing config
   393 ms   0 %  Options review
   256 ms   0 %  Prepare deps
   102 ms   0 %  Init
    44 ms   0 %  Finalize

dash 0.5.10 AFTER:
51010 ms, 67 ms/measure, Darwin/dash
-----
46599 ms  91 %  Compilation tests
 1201 ms   2 %  check_deps()
  999 ms   2 %  flatten_extralibs_wrapper()
  835 ms   2 %  Writing config
  577 ms   1 %  print_enabled_components()
  405 ms   1 %  Options review
  254 ms   0 %  Prepare deps
   93 ms   0 %  Init
   47 ms   0 %  Finalize


---------------------------------------------
FreeBSD - GhostBSD 11.1 live DVD - VM (i7 4500u)
/bin/sh:  34.5s -> 9s  = x4
---------------------------------------------

sh BEFORE:
34410 ms, 15 ms/measure, FreeBSD/sh
-----
23270 ms  68 %  flatten_extralibs_wrapper()
 7606 ms  22 %  Compilation tests
 1875 ms   5 %  print_enabled_components()
  789 ms   2 %  check_deps()
  506 ms   1 %  Writing config
  271 ms   1 %  Options review
   53 ms   0 %  Prepare deps
   23 ms   0 %  Init
   17 ms   0 %  Finalize

sh AFTER:
9045 ms, 15 ms/measure, FreeBSD/sh
-----
7691 ms  85 %  Compilation tests
 381 ms   4 %  Writing config
 270 ms   3 %  check_deps()
 266 ms   3 %  Options review
 186 ms   2 %  flatten_extralibs_wrapper()
 149 ms   2 %  print_enabled_components()
  59 ms   1 %  Prepare deps
  23 ms   0 %  Init
  20 ms   0 %  Finalize



---------------------------------------------
MSYS2 - Windows 8.1 (i7 4500u)
/bin/sh (bash 4.4.19): 600s -> 148s = x4
dash                 : 498s -> 139s = x3.6
---------------------------------------------

bash BEFORE:
599395 ms, 37 ms/measure, MINGW64_NT-6.3/sh
-----
396352 ms  66 %  flatten_extralibs_wrapper()
133812 ms  22 %  Compilation tests
 36044 ms   6 %  check_deps()
 21561 ms   4 %  print_enabled_components()
  6902 ms   1 %  Writing config
  3364 ms   1 %  Options review
   742 ms   0 %  Prepare deps
   351 ms   0 %  Init
   267 ms   0 %  Finalize

bash AFTER:
147484 ms, 36 ms/measure, MINGW64_NT-6.3/sh (bash 4.4.19)
-----
133968 ms  91 %  Compilation tests
  3502 ms   2 %  Options review
  3247 ms   2 %  Writing config
  2307 ms   2 %  print_enabled_components()
  1553 ms   1 %  flatten_extralibs_wrapper()
  1504 ms   1 %  check_deps()
   728 ms   0 %  Prepare deps
   379 ms   0 %  Init
   296 ms   0 %  Finalize


FFmpeg's configure doesn't work in MSYS2's dash out of the box.
mingw pkg-config outputs \r\n line endings, and while MSYS2's bash is patched
to handle them, MSYS2's dash is not, so the \r characters have to be stripped,
for instance using this pkg-config-wrapper file:
```
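#!/bin/sh
# Run the real pkg-config and append "-<exit status>" to its output, so the
# status can be recovered after the command substitution.
out=$("$MINGW_PREFIX"/bin/pkg-config "$@"; printf %s "-$?")
e=${out##*-}    # exit status: the text after the last "-"
out=${out%-*}   # original output: everything before the last "-"
printf %s "$out" | tr -d '\r'   # strip the carriage returns
exit $e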
```
and then invoke configure --pkg-config=path/to/pkg-config-wrapper ...
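
The capture pattern used by the wrapper can be tried without MSYS2. The
sketch below is an illustration under assumptions: fake_tool is a
hypothetical stand-in for a program that prints CRLF-terminated lines and
exits with a non-zero status, and strip_cr is a hypothetical function that
wraps the same steps as the wrapper file.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that emits CRLF output and fails with 3.
fake_tool() { printf 'one\r\ntwo\r\n'; return 3; }

# Same steps as the wrapper file, as a function so it can be tested.
strip_cr() {
    _out=$("$@"; printf %s "-$?")   # output followed by "-<status>"
    _e=${_out##*-}                  # status: text after the last "-"
    _out=${_out%-*}                 # output: everything before the last "-"
    printf %s "$_out" | tr -d '\r'  # drop carriage returns, keep newlines
    return "$_e"
}

clean=$(strip_cr fake_tool)
status=$?
printf 'status=%s\n%s\n' "$status" "$clean"
```

The status of the wrapped command is returned unchanged, and the output
reaches the caller with \n line endings only.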

dash BEFORE:
497919 ms, 30 ms/measure, MINGW64_NT-6.3/dash
-----
337320 ms  68 %  flatten_extralibs_wrapper()
129702 ms  26 %  Compilation tests
 18802 ms   4 %  print_enabled_components()
  6150 ms   1 %  Writing config
  3269 ms   1 %  Options review
  1707 ms   0 %  check_deps()
   400 ms   0 %  Prepare deps
   331 ms   0 %  Init
   238 ms   0 %  Finalize

dash AFTER:
138592 ms, 29 ms/measure, MINGW64_NT-6.3/dash
-----
128895 ms  93 %  Compilation tests
  3067 ms   2 %  Options review
  2570 ms   2 %  Writing config
  1508 ms   1 %  print_enabled_components()
  1013 ms   1 %  flatten_extralibs_wrapper()
   425 ms   0 %  Prepare deps
   422 ms   0 %  check_deps()
   374 ms   0 %  Finalize
   318 ms   0 %  Init


------------------------------------------
atomnuker:
Debian Unstable, Thinkpad X1 Carbon 4th Gen, Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
dash: 16s -> 6.3s  = x2.6
bash: 50s -> 8.3s  = x6
------------------------------------------

dash BEFORE:
Command line: dash ./configure --disable-autodetect
-----
16132 ms, 2 ms/measure, Linux/dash
-----
 8402 ms  52 %  flatten_extralibs_wrapper()
 5796 ms  36 %  Compilation tests
 1024 ms   6 %  check_deps()
  626 ms   4 %  print_enabled_components()
  129 ms   1 %  Writing config
   72 ms   0 %  Options review
   48 ms   0 %  Prepare deps
   30 ms   0 %  Init
    5 ms   0 %  Finalize

dash AFTER:
Command line: dash ./configure --disable-autodetect
-----
6366 ms, 0 ms/measure, Linux/dash
-----
5661 ms  89 %  Compilation tests
 244 ms   4 %  check_deps()
 205 ms   3 %  flatten_extralibs_wrapper()
  90 ms   1 %  Writing config
  52 ms   1 %  print_enabled_components()
  49 ms   1 %  Prepare deps
  48 ms   1 %  Options review
   9 ms   0 %  Init
   8 ms   0 %  Finalize


bash BEFORE:
Command line: bash ./configure --disable-autodetect
-----
50069 ms, 3 ms/measure, Linux/bash
-----
31212 ms  62 %  flatten_extralibs_wrapper()
 9665 ms  19 %  check_deps()
 6309 ms  13 %  Compilation tests
 1955 ms   4 %  print_enabled_components()
  591 ms   1 %  Writing config
  168 ms   0 %  Prepare deps
   99 ms   0 %  Options review
   61 ms   0 %  Init
    9 ms   0 %  Finalize

bash AFTER:
-----
8279 ms, 3 ms/measure, Linux/bash
-----
6255 ms  76 %  Compilation tests
 623 ms   8 %  check_deps()
 432 ms   5 %  flatten_extralibs_wrapper()
 397 ms   5 %  Writing config
 208 ms   3 %  print_enabled_components()
 193 ms   2 %  Prepare deps
 102 ms   1 %  Options review
  61 ms   1 %  Init
   8 ms   0 %  Finalize

Comments

Timo Rothenpieler Aug. 25, 2018, 4:55 p.m. UTC | #1
Please use git send-email to send your patches, or at least send each 
patch, created by git format-patch, as individual attachment. Your files 
seem to contain multiple patches one after another, which makes them 
very hard to follow.

But nice work! Let's hope this does not cause any regressions.
Michael Niedermayer Aug. 25, 2018, 6 p.m. UTC | #2
with the main patch
make distclean ; dash ./configure  --enable-gpl && make -j12 testprogs
fails:

LD	libavfilter/tests/filtfmts
libavformat/libavformat.a(utils.o): In function `av_apply_bitstream_filters':
ffmpeg/libavformat/utils.c:5577: undefined reference to `av_bitstream_filter_filter'
libavformat/libavformat.a(codec2.o): In function `codec2_read_header_common':
ffmpeg/libavformat/codec2.c:74: undefined reference to `avpriv_codec2_mode_bit_rate'
ffmpeg/libavformat/codec2.c:75: undefined reference to `avpriv_codec2_mode_frame_size'
ffmpeg/libavformat/codec2.c:76: undefined reference to `avpriv_codec2_mode_block_align'
ffmpeg/libavformat/codec2.c:74: undefined reference to `avpriv_codec2_mode_bit_rate'
ffmpeg/libavformat/codec2.c:75: undefined reference to `avpriv_codec2_mode_frame_size'
ffmpeg/libavformat/codec2.c:76: undefined reference to `avpriv_codec2_mode_block_align'
libavformat/libavformat.a(spdifdec.o): In function `spdif_get_offset_and_codec':
ffmpeg/libavformat/spdifdec.c:63: undefined reference to `av_adts_header_parse'
ffmpeg/libavformat/spdifdec.c:63: undefined reference to `av_adts_header_parse'
libavformat/libavformat.a(spdifenc.o): In function `spdif_header_aac':
ffmpeg/libavformat/spdifenc.c:357: undefined reference to `av_adts_header_parse'
collect2: error: ld returned 1 exit status
make: *** [libavfilter/tests/filtfmts] Error 1

[...]
avih Aug. 25, 2018, 6:05 p.m. UTC | #3
Thanks.
I'll post the 3 parts of main.patch unmodified as individual emails like Timo requested, and then I'll look at the failures.
 

Dave Yeo Aug. 26, 2018, 8:27 p.m. UTC | #4
On 08/25/18 11:11 AM, avih wrote:
> After the previous speedups, configure spent 20-60% of its runtime
> at check_deps(). It's particularly slow with bash. After some local
> optimizations - mainly avoid pushvar/popvar and abort early in one
> notable case (empty deps), it's now x4-x25 faster.

Works great on OS/2, from 700 seconds to 144 seconds
Dave
Reino Wijnsma Aug. 26, 2018, 10:18 p.m. UTC | #5
On 26-8-2018 22:27, Dave Yeo <daveryeo@telus.net> wrote:
> On 08/25/18 11:11 AM, avih wrote:
>> After the previous speedups, configure spent 20-60% of its runtime
>> at check_deps(). It's particularly slow with bash. After some local
>> optimizations - mainly avoid pushvar/popvar and abort early in one
>> notable case (empty deps), it's now x4-x25 faster.
>
> Works great on OS/2, from 700 seconds to 144 seconds
> Dave

I've gone from 857s to 268s on my old WinXP pc. That's really impressive!

-- Reino
Patch

From 77f897c8ed4eec9119d758037b0311629f549a5b Mon Sep 17 00:00:00 2001
From: "Avi Halachmi (:avih)" <avihpit@yahoo.com>
Date: Wed, 1 Aug 2018 09:10:12 +0300
Subject: [PATCH] configure: remove unused pushvar()/popvar()

---
 configure | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/configure b/configure
index e9cb7703..d6c1d032 100755
--- a/configure
+++ b/configure
@@ -619,25 +619,6 @@  get_sanitized(){
     eval echo \$$(sanitize_var_name "$1")
 }
 
-pushvar(){
-    for pvar in $*; do
-        eval level=\${${pvar}_level:=0}
-        eval ${pvar}_${level}="\$$pvar"
-        eval ${pvar}_level=$(($level+1))
-    done
-}
-
-popvar(){
-    for pvar in $*; do
-        eval level=\${${pvar}_level:-0}
-        test $level = 0 && continue
-        eval level=$(($level-1))
-        eval $pvar="\${${pvar}_${level}}"
-        eval ${pvar}_level=$level
-        eval unset ${pvar}_${level}
-    done
-}
-
 request(){
     for var in $*; do
         eval ${var}_requested=yes
-- 
2.17.1
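
For reference, the helpers removed by this patch implement a small
per-variable stack: pushvar saves the current value of each named variable,
and popvar restores the most recently saved one. The sketch below copies the
two functions from the diff above; the variable name dep and the values
outer/inner are made up for the example.

```shell
#!/bin/sh
# pushvar()/popvar() as they appear in the removed hunk.
pushvar(){
    for pvar in $*; do
        eval level=\${${pvar}_level:=0}
        eval ${pvar}_${level}="\$$pvar"
        eval ${pvar}_level=$(($level+1))
    done
}

popvar(){
    for pvar in $*; do
        eval level=\${${pvar}_level:-0}
        test $level = 0 && continue
        eval level=$(($level-1))
        eval $pvar="\${${pvar}_${level}}"
        eval ${pvar}_level=$level
        eval unset ${pvar}_${level}
    done
}

# Example usage with a made-up variable.
dep=outer
pushvar dep          # saves "outer" as dep_0, sets dep_level=1
dep=inner
inside=$dep
popvar dep           # restores dep from dep_0, sets dep_level=0
echo "inside: $inside, restored: $dep"
```

Each push and pop costs several eval calls per variable, which may help
explain why avoiding them in a hot path such as check_deps() pays off.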