diff mbox

[FFmpeg-devel,V5,1/2] configure: sort decoder/encoder/filter/... names in alphabet order

Message ID 20190428011818.GB1955@kappa.local
State New
Headers show

Commit Message

Alexander Strasser April 28, 2019, 1:18 a.m. UTC
Hi both of you,

nice work iterating on this patch set!

I want to discuss a little more, please don't take my comments
the wrong way, I just couldn't get back to this discussion earlier.

On 2019-04-24 13:36 +0000, Guo, Yejun wrote:
> >
> > From: avih [mailto:avihpit@yahoo.com]
> > Sent: Wednesday, April 24, 2019 9:22 PM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> > Cc: Guo, Yejun <yejun.guo@intel.com>
> > Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort
> > decoder/encoder/filter/... names in alphabet order
> >
> > >  print_in_columns() {
> > > -    cols=$(expr $ncols / 24)
> > > -    cat | tr ' ' '\n' | sort | pr -r "-$cols" -w $ncols -t
> > > +    # the input should not contain chars such as '*',
> > > +    # otherwise, '*' will be expanded to be all files in the current
> > > +    # working directory which don't begin with a dot (`.`)
> > > +    set -- $(tr ' ' '\n' | sort)
> > > +    col_width=24
> > > +    if [ $ncols -lt $col_width ]; then
> > > +        col_width=$ncols
> > > +    fi
> > > +    cols=$(($ncols / $col_width))
> > > +    rows=$(($(($# + $cols - 1)) / $cols))
> > > +    cols_seq=$(seq $cols)
> > > +    rows_seq=$(seq $rows)
> > > +    for row in $rows_seq; do
> > > +        print_index=$row
> > > +        print_line=""
> > > +        for col in $cols_seq; do
> > > +            if [ $print_index -le $# ]; then
> > > +                eval print_line='"$print_line "${'$print_index'}'
> > > +            fi
> > > +            print_index=$(($print_index + $rows))
> > > +        done
> > > +        printf "%-${col_width}s" $print_line
> > > +        printf "\n"
> > > +    done | sed 's/ *$//'
> > > }
> >
> > Looks good to me. No further comments (but I don't push).
> >
> > Next time, know that you can use e.g. `$((x + y))` instead of `$(($x + $y))`,
> > though in this case it doesn't matter and not worth another version.
>
> got it, thanks.

What do you think about using awk instead of shell?

I have 2 POC patches attached. It's probably not 100% correct yet,
but it kind of demonstrates what it would look like.

The main advantage of using awk, is that it's more elegant and
shorter. It's probably also less risky, because it's more isolated,
e.g. as it was explained by avih, there is no widely supported way
for locals across shells. We already use awk in configure for
mandatory functions, so it's no new dependency.

Please comment on the awk approach.

I'm not against the shell way, or a mixed approach, but before going
either way and pushing I would rather have some more testing;
especially on more exotic platforms.


Thank you
  Alexander

Comments

avih April 28, 2019, 3:11 a.m. UTC | #1
> What do you think about using awk instead of shell?

No objection here, especially if it's more robust in some ways than this
or other shell code (though personally I'm not fluent in awk).

My only concern was preventing a considerable performance impact which could
otherwise be avoided relatively easily, and with which I thought I could help.
Guo, Yejun April 28, 2019, 7:42 a.m. UTC | #2
> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of
> Alexander Strasser
> Sent: Sunday, April 28, 2019 9:18 AM
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort
> decoder/encoder/filter/... names in alphabet order
> 
> 
> What do you think about using awk instead of shell?
> 
> I have 2 POC patches attached. It's probably not 100% correct yet,
> but it kind of demonstrates what it would look like.
> 
> The main advantage of using awk, is that it's more elegant and
> shorter. It's probably also less risky, because it's more isolated,
> e.g. as it was explained by avih, there is no widely supported way
> for locals across shells. We already use awk in configure for
> mandatory functions, so it's no new dependency.
> 
> Please comment on the awk approach.

I did some change to make it correct on ubuntu 16.04, but looks issue on windows mingw.

print_in_columns() {
    sort | tr '\n' ' ' | awk -v width="$ncols" '
    {
        #add int()
        num_cols = int(width / 24); num_rows = int((NF + num_cols-1) / num_cols);
        y = x = 1;
        for (i = 1; i <= NF; i++) {
            if (y > num_rows) {
                y = 1; x++;
            }
            d[x,y] = $i;
            y++;
        }
		
		print width
		print num_cols
		print num_rows
		print NF
		
        for (y = 1; y <= num_rows; y++) {
            for (x = 1; x <= num_cols; x++) {
                # it does not work for the case that the last column is not fully filled
                #line = sprintf(x != num_cols ? "%s%-24s" : "%s%s", line, d[x,y]);
                line = sprintf("%s%-24s", line, d[x,y]);
            }
            print line;
            line = "";
        }
    }' | sed 's/ *$//'
}

on ubuntu 16.04, the output is:
Enabled bsfs:
135
5
7
33
aac_adtstoasc           extract_extradata       hevc_mp4toannexb        mpeg4_unpack_bframes    truehd_core
av1_frame_split         filter_units            imx_dump_header         noise                   vp9_metadata
av1_metadata            h264_metadata           mjpeg2jpeg              null                    vp9_raw_reorder
chomp                   h264_mp4toannexb        mjpega_dump_header      prores_metadata         vp9_superframe
dca_core                h264_redundant_pps      mov2textsub             remove_extradata        vp9_superframe_split
dump_extradata          hapqa_extract           mp3_header_decompress   text2movsub
eac3_core               hevc_metadata           mpeg2_metadata          trace_headers

while on windows, the output is:
Enabled bsfs:
72
3
11
33
     noiseh264_redundant_pps
          nullextract
          prores_metadata
       remove_extradatamp4toannexb
        text2movsubdump_header
             trace_headers
     truehd_corepega_dump_header
            vp9_metadata
  vp9_raw_reorderader_decompress
         vp9_superframea
   vp9_superframe_splitames


I did a quick check, but has not found the root cause yet.


> 
> I'm not against the shell way, or a mixed approach, but before going
> either way and pushing I would rather have some more testing;
> especially on more exotic platforms.
> 
> 
> Thank you
>   Alexander
Alexander Strasser April 28, 2019, 2:44 p.m. UTC | #3
On 2019-04-28 03:11 +0000, avih wrote:
> > What do you think about using awk instead of shell?
>
> No objection here, especially if it's more robust in some ways than this
> or other shell code (though personally I'm not fluent in awk).
>
> My only concern was preventing a considerable performance impact which could
> otherwise be avoided relatively easily, and with which I thought I could help.

Yes, thank you. Very much appreciated!

  Alexander
Alexander Strasser April 28, 2019, 3:28 p.m. UTC | #4
On 2019-04-28 07:42 +0000, Guo, Yejun wrote:
> > -----Original Message-----
> > From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of
> > Alexander Strasser
> > Sent: Sunday, April 28, 2019 9:18 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort
> > decoder/encoder/filter/... names in alphabet order
> >
> >
> > What do you think about using awk instead of shell?
> >
> > I have 2 POC patches attached. It's probably not 100% correct yet,
> > but it kind of demonstrates what it would look like.
> >
> > The main advantage of using awk, is that it's more elegant and
> > shorter. It's probably also less risky, because it's more isolated,
> > e.g. as it was explained by avih, there is no widely supported way
> > for locals across shells. We already use awk in configure for
> > mandatory functions, so it's no new dependency.
> >
> > Please comment on the awk approach.
>

> I did some change to make it correct on ubuntu 16.04, but looks issue on windows mingw.
>
> print_in_columns() {
>     sort | tr '\n' ' ' | awk -v width="$ncols" '
>     {
>         #add int()
>         num_cols = int(width / 24); num_rows = int((NF + num_cols-1) / num_cols);

The added int() calls looks right.


>         y = x = 1;
>         for (i = 1; i <= NF; i++) {
>             if (y > num_rows) {
>                 y = 1; x++;
>             }
>             d[x,y] = $i;
>             y++;
>         }
>
> 		print width
> 		print num_cols
> 		print num_rows
> 		print NF
>
>         for (y = 1; y <= num_rows; y++) {
>             for (x = 1; x <= num_cols; x++) {
>                 # it does not work for the case that the last column is not fully filled
>                 #line = sprintf(x != num_cols ? "%s%-24s" : "%s%s", line, d[x,y]);
>                 line = sprintf("%s%-24s", line, d[x,y]);

You mean it doesn't work in the way that there will be trailing spaces
in the column before the last one which is partially empty, right?

But the output looked right and there were never trailing spaces in the
last column, right?

BTW I changed the algorithm here, because I thought it might be easier
to unterstand. I will also try the algorithm, you implemented in shell.


>             }
>             print line;
>             line = "";
>         }
>     }' | sed 's/ *$//'
> }
>
> on ubuntu 16.04, the output is:
> Enabled bsfs:
> 135
> 5
> 7
> 33
> aac_adtstoasc           extract_extradata       hevc_mp4toannexb        mpeg4_unpack_bframes    truehd_core
> av1_frame_split         filter_units            imx_dump_header         noise                   vp9_metadata
> av1_metadata            h264_metadata           mjpeg2jpeg              null                    vp9_raw_reorder
> chomp                   h264_mp4toannexb        mjpega_dump_header      prores_metadata         vp9_superframe
> dca_core                h264_redundant_pps      mov2textsub             remove_extradata        vp9_superframe_split
> dump_extradata          hapqa_extract           mp3_header_decompress   text2movsub
> eac3_core               hevc_metadata           mpeg2_metadata          trace_headers
>
> while on windows, the output is:
> Enabled bsfs:
> 72
> 3
> 11
> 33
>      noiseh264_redundant_pps
>           nullextract
>           prores_metadata
>        remove_extradatamp4toannexb
>         text2movsubdump_header
>              trace_headers
>      truehd_corepega_dump_header
>             vp9_metadata
>   vp9_raw_reorderader_decompress
>          vp9_superframea
>    vp9_superframe_splitames

Wild guess: CR LF line endings are emitted somewhere and the CRs stay in
the input. Your terminal resets the cursor to the start of the line when
interpreting the midline CRs.

Does it work if you extend the tr in print_in_columns like below?

     sort | tr '\r\n' '  ' | awk -v width="$ncols" '


> I did a quick check, but has not found the root cause yet.
>
>
> >
> > I'm not against the shell way, or a mixed approach, but before going
> > either way and pushing I would rather have some more testing;
> > especially on more exotic platforms.

Thanks for testing and looking into it!

  Alexander
Guo, Yejun April 29, 2019, 12:55 a.m. UTC | #5
> -----Original Message-----

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of

> Alexander Strasser

> Sent: Sunday, April 28, 2019 11:29 PM

> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>

> Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort

> decoder/encoder/filter/... names in alphabet order

> 

> On 2019-04-28 07:42 +0000, Guo, Yejun wrote:

> > > -----Original Message-----

> > > From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf

> Of

> > > Alexander Strasser

> > > Sent: Sunday, April 28, 2019 9:18 AM

> > > To: FFmpeg development discussions and patches

> <ffmpeg-devel@ffmpeg.org>

> > > Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort

> > > decoder/encoder/filter/... names in alphabet order

> > >

> > >

> > > What do you think about using awk instead of shell?

> > >

> > > I have 2 POC patches attached. It's probably not 100% correct yet,

> > > but it kind of demonstrates what it would look like.

> > >

> > > The main advantage of using awk, is that it's more elegant and

> > > shorter. It's probably also less risky, because it's more isolated,

> > > e.g. as it was explained by avih, there is no widely supported way

> > > for locals across shells. We already use awk in configure for

> > > mandatory functions, so it's no new dependency.

> > >

> > > Please comment on the awk approach.

> >

> 

> > I did some change to make it correct on ubuntu 16.04, but looks issue on

> windows mingw.

> >

> > print_in_columns() {

> >     sort | tr '\n' ' ' | awk -v width="$ncols" '

> >     {

> >         #add int()

> >         num_cols = int(width / 24); num_rows = int((NF + num_cols-1) /

> num_cols);

> 

> The added int() calls looks right.

> 

> 

> >         y = x = 1;

> >         for (i = 1; i <= NF; i++) {

> >             if (y > num_rows) {

> >                 y = 1; x++;

> >             }

> >             d[x,y] = $i;

> >             y++;

> >         }

> >

> > 		print width

> > 		print num_cols

> > 		print num_rows

> > 		print NF

> >

> >         for (y = 1; y <= num_rows; y++) {

> >             for (x = 1; x <= num_cols; x++) {

> >                 # it does not work for the case that the last column is not

> fully filled

> >                 #line = sprintf(x != num_cols ? "%s%-24s" : "%s%s", line,

> d[x,y]);

> >                 line = sprintf("%s%-24s", line, d[x,y]);

> 

> You mean it doesn't work in the way that there will be trailing spaces

> in the column before the last one which is partially empty, right?

> 

> But the output looked right and there were never trailing spaces in the

> last column, right?


suppose num_cols = 3, see below, the spaces after 'c' is trimmed, while the spaces after 'e' is still there.

a<spaces>b<spaces>c
d<spaces>e<spaces>


> 

> BTW I changed the algorithm here, because I thought it might be easier

> to unterstand. I will also try the algorithm, you implemented in shell.

> 

> 

> >             }

> >             print line;

> >             line = "";

> >         }

> >     }' | sed 's/ *$//'

> > }

> >

> > on ubuntu 16.04, the output is:

> > Enabled bsfs:

> > 135

> > 5

> > 7

> > 33

> > aac_adtstoasc           extract_extradata       hevc_mp4toannexb

> mpeg4_unpack_bframes    truehd_core

> > av1_frame_split         filter_units            imx_dump_header

> noise                   vp9_metadata

> > av1_metadata            h264_metadata           mjpeg2jpeg

> null                    vp9_raw_reorder

> > chomp                   h264_mp4toannexb

> mjpega_dump_header      prores_metadata         vp9_superframe

> > dca_core                h264_redundant_pps      mov2textsub

> remove_extradata        vp9_superframe_split

> > dump_extradata          hapqa_extract

> mp3_header_decompress   text2movsub

> > eac3_core               hevc_metadata           mpeg2_metadata

> trace_headers

> >

> > while on windows, the output is:

> > Enabled bsfs:

> > 72

> > 3

> > 11

> > 33

> >      noiseh264_redundant_pps

> >           nullextract

> >           prores_metadata

> >        remove_extradatamp4toannexb

> >         text2movsubdump_header

> >              trace_headers

> >      truehd_corepega_dump_header

> >             vp9_metadata

> >   vp9_raw_reorderader_decompress

> >          vp9_superframea

> >    vp9_superframe_splitames

> 

> Wild guess: CR LF line endings are emitted somewhere and the CRs stay in

> the input. Your terminal resets the cursor to the start of the line when

> interpreting the midline CRs.

> 

> Does it work if you extend the tr in print_in_columns like below?

> 

>      sort | tr '\r\n' '  ' | awk -v width="$ncols" '

> 


yes, it works!

> 

> > I did a quick check, but has not found the root cause yet.

> >

> >

> > >

> > > I'm not against the shell way, or a mixed approach, but before going

> > > either way and pushing I would rather have some more testing;

> > > especially on more exotic platforms.

> 

> Thanks for testing and looking into it!

> 

>   Alexander

> _______________________________________________

> ffmpeg-devel mailing list

> ffmpeg-devel@ffmpeg.org

> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

> 

> To unsubscribe, visit link above, or email

> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Alexander Strasser May 1, 2019, 8:51 p.m. UTC | #6
On 2019-04-29 00:55 +0000, Guo, Yejun wrote:
[...]
> > Wild guess: CR LF line endings are emitted somewhere and the CRs stay in
> > the input. Your terminal resets the cursor to the start of the line when
> > interpreting the midline CRs.
> >
> > Does it work if you extend the tr in print_in_columns like below?
> >
> >      sort | tr '\r\n' '  ' | awk -v width="$ncols" '
> >
>
> yes, it works!
>
> >
> > > I did a quick check, but has not found the root cause yet.
> > >
> > >
> > > >
> > > > I'm not against the shell way, or a mixed approach, but before going
> > > > either way and pushing I would rather have some more testing;
> > > > especially on more exotic platforms.
> >
> > Thanks for testing and looking into it!

New patchset sent with subject:

    [PATCH 0/2] configure: Replace pr with awk and fix column sorting


Your last version (v5) of the log_file rewrite performs better, so maybe
we should use it instead of my awk version, which is at best as fast
as current version (depends on the awk implementation).

  Alexander
Carl Eugen Hoyos May 1, 2019, 8:57 p.m. UTC | #7
2019-04-28 3:18 GMT+02:00, Alexander Strasser <eclipse7@gmx.net>:

> What do you think about using awk instead of shell?

Do we only use awk for --enable-random and the dependency
files so far? Does configure also work without awk now and
would this change?

Thank you, Carl Eugen
Lauri Kasanen May 2, 2019, 5:48 a.m. UTC | #8
On Wed, 1 May 2019 22:57:47 +0200
Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:

> 2019-04-28 3:18 GMT+02:00, Alexander Strasser <eclipse7@gmx.net>:
>
> > What do you think about using awk instead of shell?
>
> Do we only use awk for --enable-random and the dependency
> files so far? Does configure also work without awk now and
> would this change?

It seems awk is unconditionally required already. However I wanted to
say that it's a very nice dep to have: easy to build, present almost
everywhere, even in busybox. Nothing like perl/tcl or worse,
python/java/rust/go.

- Lauri
avih May 2, 2019, 8:55 a.m. UTC | #9
> It seems awk is unconditionally required already. However I wanted to
> say that it's a very nice dep to have

While it's possibly nicer than other deps to have, it's still better to use
it IMHO only when it adds some value, like simpler code, better performance,
compliance with some things, etc.

With this patchset, for part 1 I'm not sure I see the added value with awk,
as it looks about the same complexity of the code compared to the shell
version. Performance-wise it's negligible as print_in_columns() is only called
about 5 times (but with thousands of values to process), so as long as the
loop is efficient, the additional awk process per call is barely measurable.

For part 2 the awk version is simpler code, but `log_file()` is called nearly
300 times, mostly with files with less than 5 lines to print. At this case it
does add a performance penalty, especially on systems where a new process is
expensive.
Alexander Strasser May 3, 2019, 11:09 p.m. UTC | #10
Hi Avih!

On 2019-05-02 08:55 +0000, avih wrote:
> > It seems awk is unconditionally required already. However I wanted to
> > say that it's a very nice dep to have
>
> While it's possibly nicer than other deps to have, it's still better to use
> it IMHO only when it adds some value, like simpler code, better performance,
> compliance with some things, etc.

Agreed; of course we shouldn't just use awk because we can.

Though I think not implementing things in shell is often
lower risk, as we have no isolation in POSIX shell, we all
share the same variables etc. and the configure script is
quite big. Then shell is not suited for many tasks because
of the way it works, like you explained really good before.
So even if we find an acceptable solution, it will not be
understood that easily by most people, because most people
are not that deep into shell programming.

Also in my experience the shells differ, when using more
advanced concepts and in things that aren't used widely.

Some of what I said applies to the alternatives too,
except that they will always be better isolated.

Sorry, that turned out longer than I wanted it to be :(


> With this patchset, for part 1 I'm not sure I see the added value with awk,
> as it looks about the same complexity of the code compared to the shell
> version.

Did you look at the version I attached in this thread? Or the one I
posted in the new patch set?

I changed it to use an algorithm more similar to the latest shell
version discussed here. The version I attached in this mail thread,
was an experiment to see if it could be easier understood that way.


> Performance-wise it's negligible as print_in_columns() is only called
> about 5 times (but with thousands of values to process), so as long as the
> loop is efficient, the additional awk process per call is barely measurable.


> For part 2 the awk version is simpler code, but `log_file()` is called nearly
> 300 times, mostly with files with less than 5 lines to print. At this case it
> does add a performance penalty, especially on systems where a new process is
> expensive.

Yes, exactly. Very well worded!

Though I am still surprised about the Windows results. I did not
investigate more, but I found it quite strange, that it showed
a 8% performance hit the first time, and in subsequent runs there
were no performance differences measurable for the runtime of the
complete configure script.

As mentioned in the other patch set, I am fine going with the shell
version for part 2 (log_file).

A third option would be to just use cat and forget about the line
numbers. I am not sure the numbers really add much value, for those
dumps of mostly very small source files with little ambiguity.


  Alexander
Alexander Strasser May 3, 2019, 11:28 p.m. UTC | #11
Hi Carl Eugen!

On 2019-05-01 22:57 +0200, Carl Eugen Hoyos wrote:
> 2019-04-28 3:18 GMT+02:00, Alexander Strasser <eclipse7@gmx.net>:
>
> > What do you think about using awk instead of shell?
>
> Do we only use awk for --enable-random and the dependency
> files so far? Does configure also work without awk now and
> would this change?

No, it wouldn't as I have written when initially proposing
to use it here.

We unconditionally use awk for generating and writing the
primary configure outputs: the config files.

Upon your question I looked it up exactly; the routine was
rewritten by Mans to use awk near the end of 2012 in this
commit:

  f454e879238ce317c6d905d187e7608c461a7087


  Alexander
avih May 4, 2019, 6:28 a.m. UTC | #12
> On 2019-05-02 08:55 +0000, avih wrote:
> > > It seems awk is unconditionally required already. However I wanted to
> > > say that it's a very nice dep to have
> >
> > While it's possibly nicer than other deps to have, it's still better to use
> > it IMHO only when it adds some value, like simpler code, better performance,
> > compliance with some things, etc.
> 
> Agreed; of course we shouldn't just use awk because we can.
> 
> Though I think not implementing things in shell is often
> lower risk, as we have no isolation in POSIX shell, we all
> share the same variables etc. and the configure script is
> quite big. Then shell is not suited for many tasks because
> of the way it works
> ...

This actually sounds to me like you're saying we shouldn't use awk because
we can, but rather use it where possible because it'd be better than shell.

In other words: we should write new configure code in awk.

Did I misinterpret the statement or its implications?

Assuming I interpreted it correctly, my opinion on this is that configure
is already written in shell, so unless there's a consensus to migrate it
to awk, things should stay shell except where other tools add some value
beyond "this doesn't have the general issues of shell". I.e. use tools
where they are better suited for some specific tasks than shell.

Personally I don't have an opinion about migrating configure, though I'd
think that if such migration ever happens, it should be to a proper build
system and not just a different scripting language.


> Also in my experience the shells differ, when using more
> advanced concepts and in things that aren't used widely.

True, though I don't think configure uses exotic shell features, and
as far as I know it's currently posix and real-world compliant with
many new and old shells. If you know of a valid issue with some shells,
you can report it or send a patch.

Note that I'm not arguing that shell is a great programming language.
It's not - it's hard and it's treacherous. But currently that's what we
have, and for now it does its job.


> > With this patchset, for part 1 I'm not sure I see the added value with awk,
> > as it looks about the same complexity of the code compared to the shell
> > version.
> 
> Did you look at the version I attached in this thread? Or the one I
> posted in the new patch set?
> 
> I changed it to use an algorithm more similar to the latest shell
> version discussed here.

I think so, yes. As I said, it's similar to the shell version. I don't
think it's worse in any way, but I also didn't see an added value.

Please post a link to the actual patch if you think I'm not looking
at the patch version you refer to.
Alexander Strasser May 4, 2019, 7:43 p.m. UTC | #13
Hi!

On 2019-05-04 06:28 +0000, avih wrote:
> > On 2019-05-02 08:55 +0000, avih wrote:
> > > > It seems awk is unconditionally required already. However I wanted to
> > > > say that it's a very nice dep to have
> > >
> > > While it's possibly nicer than other deps to have, it's still better to use
> > > it IMHO only when it adds some value, like simpler code, better performance,
> > > compliance with some things, etc.
> >
> > Agreed; of course we shouldn't just use awk because we can.
> >
> > Though I think not implementing things in shell is often
> > lower risk, as we have no isolation in POSIX shell, we all
> > share the same variables etc. and the configure script is
> > quite big. Then shell is not suited for many tasks because
> > of the way it works
> > ...
>
> This actually sounds to me like you're saying we shouldn't use awk because
> we can, but rather use it where possible because it'd be better than shell.
>
> In other words: we should write new configure code in awk.
>
> Did I misinterpret the statement or its implications?

You got me totally wrong :(

[...]
> >
> > Did you look at the version I attached in this thread? Or the one I
> > posted in the new patch set?
> >
> > I changed it to use an algorithm more similar to the latest shell
> > version discussed here.
>
> I think so, yes. As I said, it's similar to the shell version. I don't
> think it's worse in any way, but I also didn't see an added value.
>
> Please post a link to the actual patch if you think I'm not looking
> at the patch version you refer to.

I guess you were looking at the right patch. I mean this one:

  http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html


  Alexander
avih May 5, 2019, 9:14 p.m. UTC | #14
> I guess you were looking at the right patch. I mean this one:
>   http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html

I was referring to this patch indeed. Thanks.


> > > Agreed; of course we shouldn't just use awk because we can.
> > >
> > > Though I think not implementing things in shell is often
> > > lower risk, as we have no isolation in POSIX shell, we all
> > > share the same variables etc. and the configure script is
> > > quite big. Then shell is not suited for many tasks because
> > > of the way it works
> > > ...
> >
> > This actually sounds to me like you're saying we shouldn't use awk because
> > we can, but rather use it where possible because it'd be better than shell.
> >
> > In other words: we should write new configure code in awk.
> > 
> > Did I misinterpret the statement or its implications?
> 
> You got me totally wrong :(


I'm only human, it happens. But you didn't explain what you actually meant.

Specifically:

- What makes this patch a good candidate to use awk rather than shell like
  the rest of configure?

- What should be the general criteria to choose a scripting language for
  future patches?
 

    On Saturday, May 4, 2019 10:43 PM, Alexander Strasser <eclipse7@gmx.net> wrote:
 

 Hi!

On 2019-05-04 06:28 +0000, avih wrote:
> > On 2019-05-02 08:55 +0000, avih wrote:
> > > > It seems awk is unconditionally required already. However I wanted to
> > > > say that it's a very nice dep to have
> > >
> > > While it's possibly nicer than other deps to have, it's still better to use
> > > it IMHO only when it adds some value, like simpler code, better performance,
> > > compliance with some things, etc.
> >
> > Agreed; of course we shouldn't just use awk because we can.
> >
> > Though I think not implementing things in shell is often
> > lower risk, as we have no isolation in POSIX shell, we all
> > share the same variables etc. and the configure script is
> > quite big. Then shell is not suited for many tasks because
> > of the way it works
> > ...
>
> This actually sounds to me like you're saying we shouldn't use awk because
> we can, but rather use it where possible because it'd be better than shell.
>
> In other words: we should write new configure code in awk.
>
> Did I misinterpret the statement or its implications?

You got me totally wrong :(

[...]
> >
> > Did you look at the version I attached in this thread? Or the one I
> > posted in the new patch set?
> >
> > I changed it to use an algorithm more similar to the latest shell
> > version discussed here.
>
> I think so, yes. As I said, it's similar to the shell version. I don't
> think it's worse in any way, but I also didn't see an added value.
>
> Please post a link to the actual patch if you think I'm not looking
> at the patch version you refer to.

I guess you were looking at the right patch. I mean this one:

  http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html


  Alexander
Alexander Strasser May 6, 2019, 10:21 p.m. UTC | #15
On 2019-05-05 21:14 +0000, avih wrote:
> > I guess you were looking at the right patch. I mean this one:
> >   http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html
>
> I was referring to this patch indeed. Thanks.
>
>
> > > > Agreed; of course we shouldn't just use awk because we can.
> > > >
> > > > Though I think not implementing things in shell is often
> > > > lower risk, as we have no isolation in POSIX shell, we all
> > > > share the same variables etc. and the configure script is
> > > > quite big. Then shell is not suited for many tasks because
> > > > of the way it works
> > > > ...
> > >
> > > This actually sounds to me like you're saying we shouldn't use awk because
> > > we can, but rather use it where possible because it'd be better than shell.
> > >
> > > In other words: we should write new configure code in awk.
> > >
> > > Did I misinterpret the statement or its implications?
> >
> > You got me totally wrong :(
>
>
> I'm only human, it happens. But you didn't explain what you actually meant.

It's a communication problem. I tried to explain why I prefer
the awk version over the shell version of this patch in the email
before. Seems I was too vague.

I will try to answer your specific questions. If there is more
to discuss, please don't hesitate to ask.

> Specifically:
>
> - What makes this patch a good candidate to use awk rather than shell like
>   the rest of configure?

I would like to rephrase this question a bit. I guess the other aspect
will be in my answer to your next question anyway.

patch1 (awk) configure: print_in_columns: replace pr with awk version: http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html
patch2 (shell) configure: sort decoder/encoder/filter/... names in alphabet order (v5 as posted in this thread)

- Why do you prefer patch1 over patch2?

1. Statistics
    * patch1: 1 file changed, 16 insertions(+), 2 deletions(-)
    * patch2: 1 file changed, 24 insertions(+), 2 deletions(-)
2. patch2 uses lots of variables (which is good in itself), but those
   should be local and we cannot express that portably in shell.
   In turn we have raised potential for clashes now or in the future.
3. patch2 uses eval combined with non-trivial quoting, which is hard
   to read and hard to grasp quickly. It's not wrong, but it's not as
   easy and straight forward as in patch1.
4. Depending on the input (stdin) unexpected things can happen in the
   "eval" and "set" lines. One example is given in the comment.
5. patch2 to uses shell for the parts easily expressed in shell and
   uses awk for the parts that are non-trivial in shell.
6. The awk implementation should be easier to read and understand for
   the majority of readers/developers.


> - What should be the general criteria to choose a scripting language for
>   future patches?

I don't see anywhere that we switch away from shell for configure. So in
general things should be implemented in shell. Shell though is not really
suited for implementing all kinds of algorithms. OTOH shell is an excellent
choice for starting processes that communicate via stdin/stdout/stderr
and plugging them together.

So I think programming shell, sticking only to shell itself (builtins) is
almost never a good option. Shell is best when coordinating the execution
of other programs. Sometimes when you do not have a particular program
around to help you in your shell script, that void needs to be filled.
For this purpose awk is often a good choice, because it was especially
designed to be a fit for this purpose amongst other things. Using only awk
to implement bigger systems or as a general purpose programming language
is IMHO nearly as wrong as using shell-only.


>     On Saturday, May 4, 2019 10:43 PM, Alexander Strasser <eclipse7@gmx.net> wrote:
[...]


I'm feeling like I might not have expressed my reasoning good and
precise enough in every level of detail. I apologize for that. My
natural language skills have their limits and I am not a native
English speaker. I will try to refine specific points, should you
or anyone else have more questions or comments.


  Alexander
Guo, Yejun May 7, 2019, 2:07 a.m. UTC | #16
> -----Original Message-----

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of

> Alexander Strasser

> Sent: Tuesday, May 07, 2019 6:21 AM

> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>

> Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort

> decoder/encoder/filter/... names in alphabet order

> 

> On 2019-05-05 21:14 +0000, avih wrote:

> > > I guess you were looking at the right patch. I mean this one:

> > >   http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html

> >

> > I was referring to this patch indeed. Thanks.

> >

> >

> > > > > Agreed; of course we shouldn't just use awk because we can.

> > > > >

> > > > > Though I think not implementing things in shell is often

> > > > > lower risk, as we have no isolation in POSIX shell, we all

> > > > > share the same variables etc. and the configure script is

> > > > > quite big. Then shell is not suited for many tasks because

> > > > > of the way it works

> > > > > ...

> > > >

> > > > This actually sounds to me like you're saying we shouldn't use awk

> because

> > > > we can, but rather use it where possible because it'd be better than shell.

> > > >

> > > > In other words: we should write new configure code in awk.

> > > >

> > > > Did I misinterpret the statement or its implications?

> > >

> > > You got me totally wrong :(

> >

> >

> > I'm only human, it happens. But you didn't explain what you actually meant.

> 

> It's a communication problem. I tried to explain why I prefer

> the awk version over the shell version of this patch in the email

> before. Seems I was too vague.

> 

> I will try to answer your specific questions. If there is more

> to discuss, please don't hesitate to ask.

> 

> > Specifically:

> >

> > - What makes this patch a good candidate to use awk rather than shell like

> >   the rest of configure?

> 

> I would like to rephrase this question a bit. I guess the other aspect

> will be in my answer to your next question anyway.

> 

> patch1 (awk) configure: print_in_columns: replace pr with awk version:

> http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html

> patch2 (shell) configure: sort decoder/encoder/filter/... names in alphabet

> order (v5 as posted in this thread)

> 

> - Why do you prefer patch1 over patch2?

> 

> 1. Statistics

>     * patch1: 1 file changed, 16 insertions(+), 2 deletions(-)

>     * patch2: 1 file changed, 24 insertions(+), 2 deletions(-)


I have no idea which one is better, but I think this point is not the reason.

we can refine the shell patch as below with 15 insertions(+), 2 deletions(-1). Actually, the line number does not mean something. :)

print_in_columns() {
    # the input should not contain chars such as '*', otherwise it will be expanded.
    set -- $(tr ' ' '\n' | sort)
    test $ncols -lt 24 && col_width=$ncols || col_width=24
    cols=$(($ncols / $col_width))
    rows=$(($(($# + $cols - 1)) / $cols))
    cols_seq=$(seq $cols)
    for row in $(seq $rows); do
        print_index=$row
        print_line=""
        for col in $cols_seq; do
            test $print_index -le $# && eval print_line='"$print_line "${'$print_index'}'
            print_index=$(($print_index + $rows))
        done
        printf "%-${col_width}s" $print_line; printf "\n"
    done | sed 's/ *$//'
}

the usage of "test && ||" is already used in configure.

> 2. patch2 uses lots of variables (which is good in itself), but those

>    should be local and we cannot express that portably in shell.

>    In turn we have raised potential for clashes now or in the future.

> 3. patch2 uses eval combined with non-trivial quoting, which is hard

>    to read and hard to grasp quickly. It's not wrong, but it's not as

>    easy and straight forward as in patch1.

> 4. Depending on the input (stdin) unexpected things can happen in the

>    "eval" and "set" lines. One example is given in the comment.

> 5. patch2 to uses shell for the parts easily expressed in shell and

>    uses awk for the parts that are non-trivial in shell.

> 6. The awk implementation should be easier to read and understand for

>    the majority of readers/developers.

> 

> 

> > - What should be the general criteria to choose a scripting language for

> >   future patches?

> 

> I don't see anywhere that we switch away from shell for configure. So in

> general things should be implemented in shell. Shell though is not really

> suited for implementing all kinds of algorithms. OTOH shell is an excellent

> choice for starting processes that communicate via stdin/stdout/stderr

> and plugging them together.

> 

> So I think programming shell, sticking only to shell itself (builtins) is

> almost never a good option. Shell is best when coordinating the execution

> of other programs. Sometimes when you do not have a particular program

> around to help you in your shell script, that void needs to be filled.

> For this purpose awk is often a good choice, because it was especially

> designed to be a fit for this purpose amongst other things. Using only awk

> to implement bigger systems or as a general purpose programming language

> is IMHO nearly as wrong as using shell-only.

> 

> 

> >     On Saturday, May 4, 2019 10:43 PM, Alexander Strasser

> <eclipse7@gmx.net> wrote:

> [...]

> 

> 

> I'm feeling like I might not have expressed my reasoning good and

> precise enough in every level of detail. I apologize for that. My

> natural language skills have their limits and I am not a native

> English speaker. I will try to refine specific points, should you

> or anyone else have more questions or comments.

> 

> 

>   Alexander

> _______________________________________________

> ffmpeg-devel mailing list

> ffmpeg-devel@ffmpeg.org

> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

> 

> To unsubscribe, visit link above, or email

> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Alexander Strasser May 7, 2019, 6:49 a.m. UTC | #17
Am 7. Mai 2019 04:07:23 MESZ schrieb "Guo, Yejun" <yejun.guo@intel.com>:
>
>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf
>Of
>> Alexander Strasser
>> Sent: Tuesday, May 07, 2019 6:21 AM
>> To: FFmpeg development discussions and patches
><ffmpeg-devel@ffmpeg.org>
>> Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort
>> decoder/encoder/filter/... names in alphabet order
>> 
>> On 2019-05-05 21:14 +0000, avih wrote:
>> > > I guess you were looking at the right patch. I mean this one:
[...]
>> > > >
>> > > > In other words: we should write new configure code in awk.
>> > > >
>> > > > Did I misinterpret the statement or its implications?
>> > >
>> > > You got me totally wrong :(
>> >
>> > I'm only human, it happens. But you didn't explain what you
>actually meant.
>> 
>> It's a communication problem. I tried to explain why I prefer
>> the awk version over the shell version of this patch in the email
>> before. Seems I was too vague.
>> 
>> I will try to answer your specific questions. If there is more
>> to discuss, please don't hesitate to ask.
>> 
>> > Specifically:
>> >
>> > - What makes this patch a good candidate to use awk rather than
>shell like
>> >   the rest of configure?
>> 
>> I would like to rephrase this question a bit. I guess the other
>aspect
>> will be in my answer to your next question anyway.
>> 
>> patch1 (awk) configure: print_in_columns: replace pr with awk
>version:
>> http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html
>> patch2 (shell) configure: sort decoder/encoder/filter/... names in
>alphabet
>> order (v5 as posted in this thread)
>> 
>> - Why do you prefer patch1 over patch2?
>> 
>> 1. Statistics
>>     * patch1: 1 file changed, 16 insertions(+), 2 deletions(-)
>>     * patch2: 1 file changed, 24 insertions(+), 2 deletions(-)
>
>I have no idea which one is better, but I think this point is not the
>reason.

My fault. I wanted to add that those points are in no particular order and that this one is not a very strong Point; but it's an indication.

>we can refine the shell patch as below with 15 insertions(+), 2
>deletions(-1). Actually, the line number does not mean something. :)
>
>print_in_columns() {
># the input should not contain chars such as '*', otherwise it will be
>expanded.
>    set -- $(tr ' ' '\n' | sort)
>    test $ncols -lt 24 && col_width=$ncols || col_width=24
>    cols=$(($ncols / $col_width))
>    rows=$(($(($# + $cols - 1)) / $cols))
>    cols_seq=$(seq $cols)
>    for row in $(seq $rows); do
>        print_index=$row
>        print_line=""
>        for col in $cols_seq; do
>test $print_index -le $# && eval print_line='"$print_line
>"${'$print_index'}'
>            print_index=$(($print_index + $rows))
>        done
>        printf "%-${col_width}s" $print_line; printf "\n"
>    done | sed 's/ *$//'
>}
>
>the usage of "test && ||" is already used in configure.

Lines can bei sqeezed, but that isn't a very good Idea beyond some point. I'm quite sure the shell Implementation will end up with a few lines more or I will not feel comfortable pushing a patch that has been compacted too much.

As I said above already, that alone isn't a strong reason to prefer patch2. Though all points together are in my opinion.


Best regards,
  Alexander

>> 2. patch2 uses lots of variables (which is good in itself), but those
>>    should be local and we cannot express that portably in shell.
>>    In turn we have raised potential for clashes now or in the future.
>> 3. patch2 uses eval combined with non-trivial quoting, which is hard
>>    to read and hard to grasp quickly. It's not wrong, but it's not as
>>    easy and straight forward as in patch1.
>> 4. Depending on the input (stdin) unexpected things can happen in the
>>    "eval" and "set" lines. One example is given in the comment.
>> 5. patch2 to uses shell for the parts easily expressed in shell and
>>    uses awk for the parts that are non-trivial in shell.
>> 6. The awk implementation should be easier to read and understand for
>>    the majority of readers/developers.
>> 
>> 
>> > - What should be the general criteria to choose a scripting
>language for
>> >   future patches?
>> 
>> I don't see anywhere that we switch away from shell for configure. So
>in
>> general things should be implemented in shell. Shell though is not
>really
>> suited for implementing all kinds of algorithms. OTOH shell is an
>excellent
>> choice for starting processes that communicate via
>stdin/stdout/stderr
>> and plugging them together.
>> 
>> So I think programming shell, sticking only to shell itself
>(builtins) is
>> almost never a good option. Shell is best when coordinating the
>execution
>> of other programs. Sometimes when you do not have a particular
>program
>> around to help you in your shell script, that void needs to be
>filled.
>> For this purpose awk is often a good choice, because it was
>especially
>> designed to be a fit for this purpose amongst other things. Using
>only awk
>> to implement bigger systems or as a general purpose programming
>language
>> is IMHO nearly as wrong as using shell-only.
>> 
>> 
>> >     On Saturday, May 4, 2019 10:43 PM, Alexander Strasser
>> <eclipse7@gmx.net> wrote:
>> [...]
>> 
>> 
>> I'm feeling like I might not have expressed my reasoning good and
>> precise enough in every level of detail. I apologize for that. My
>> natural language skills have their limits and I am not a native
>> English speaker. I will try to refine specific points, should you
>> or anyone else have more questions or comments.
avih May 7, 2019, 12:22 p.m. UTC | #18
> patch1 (awk) configure: print_in_columns: replace pr with awk version: http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html
> patch2 (shell) configure: sort decoder/encoder/filter/... names in alphabet
> order (v5 as posted in this thread)
> 
> - Why do you prefer patch1 over patch2?

In general, I agree with your reasoning, but not entirely sure about the
conclusion and its implications.

Few exceptions first:


> 1. Statistics
>     * patch1: 1 file changed, 16 insertions(+), 2 deletions(-)
>     * patch2: 1 file changed, 24 insertions(+), 2 deletions(-)

As you said later, and I agree, it's a weak argument, as both versions
are the same code and complexity, but differ mostly in style and language
syntax.


> 2. patch2 uses lots of variables (which is good in itself), but those
>   should be local and we cannot express that portably in shell.
>   In turn we have raised potential for clashes now or in the future.

This is incorrect. A subshell provides identical isolation as an external
process, it's fully portable, it can be used here, and it's not slower than
an external process, e.g. `print_in_columns() ( ... )`.


> 4. Depending on the input (stdin) unexpected things can happen in the
>   "eval" and "set" lines. One example is given in the comment.

"Unexpected" is a bit general. Specifically, the "eval" line does not
execute arbitrary inputs, and neither does the "set" line.

The only potential surprise is globbing, as documented, and the technique
of expanding variables unquoted - like here, is used throughout configure.

Look no further than 100% of the very places which pipe the input stream
into print_in_columns, and multiple times earlier than that.

It's not great of course, but it's not worse than elsewhere, and it's at
least documented. Some day maybe someone would improve it.


I generally agree with rest of your comments/arguments, including:

> 1. Statistics ... (patch2 is more verbose).
> 3. patch2 uses eval combined with non-trivial quoting, which is hard
>   to read and hard to grasp quickly. It's not wrong, but it's not as
>   easy and straight forward as in patch1.
> 5. patch2 to uses shell for the parts easily expressed in shell and
>   uses awk for the parts that are non-trivial in shell.
> 6. The awk implementation should be easier to read and understand for
>   the majority of readers/developers.

which basically say that shell is hard, and, pardon the pun, awkward.

And also I agree with:

> Shell though is not really suited for implementing all kinds of algorithms.
> OTOH shell is an excellent choice for starting processes that communicate
> via stdin/stdout/stderr and plugging them together.

and with

> Shell is best when coordinating the execution of other programs.

And similar assertions, however, let's keep the following facts in mind:

- We both agree that we should keep to shell except where an external tool
  provides some meaningful value over shell code.

- An external tool only helps by being more suitable for the task than
  shell code (readability, portability, correctness, performance, etc),
  but it doesn't help with isolation more than a subshell would.

- Shell is capable in a portable way, including for some tasks which can
  be considered "programming".

- configure is currently written in shell, including some non trivial
  functionality, and doesn't use external script engines for "scriptlets".

- Being a shell script, it already requires competence and reasonable
  understanding of shell, including quoting, expansions, etc.

- Maintaining code in more than one scripting system does carry a cost.

- Yes, shell is a lousy programming language.


I really don't care if this specific patch ends up as shell or awk. What
I do care about is being consistent, and understanding the implications.

IMHO, because I'm not convinced that the awk version provides a meaningful
additional value other than not being shell, if we use the awk version then
it sets a precedence of using awk (or maybe also other languages) for small
scripts even if they're basically the same complexity and code as they
would be in shell.

It's not necessarily a bad precedence (even if I wouldn't do that myself),
because I do agree that not being shell code does have value, but I think
it's important to acknowledge that it does set such precedence.

If there was already such practice of using awk scriptlets where possible
then I wouldn't have said anything. So I mainly care about the precedence.

I guess it comes down to a judgment call on whether or not this awk script
can be considered meaningfully more suitable for the task than shell.

I think not, but clearly it's only my subjective assessment.

Regards,
Avi
Alexander Strasser June 2, 2019, 9:35 p.m. UTC | #19
Hi all,
hi Avi!

Sorry for the late reply.

On 2019-05-07 12:22 +0000, avih wrote:
> > patch1 (awk) configure: print_in_columns: replace pr with awk version: http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html
> > patch2 (shell) configure: sort decoder/encoder/filter/... names in alphabet
> > order (v5 as posted in this thread)
> >
> > - Why do you prefer patch1 over patch2?
>
> In general, I agree with your reasoning, but not entirely sure about the
> conclusion and its implications.
>
> Few exceptions first:
>
>
> > 1. Statistics
> >     * patch1: 1 file changed, 16 insertions(+), 2 deletions(-)
> >     * patch2: 1 file changed, 24 insertions(+), 2 deletions(-)
>
> As you said later, and I agree, it's a weak argument, as both versions
> are the same code and complexity, but differ mostly in style and language
> syntax.
>
> > 2. patch2 uses lots of variables (which is good in itself), but those
> >   should be local and we cannot express that portably in shell.
> >   In turn we have raised potential for clashes now or in the future.
>
> This is incorrect. A subshell provides identical isolation as an external
> process, it's fully portable, it can be used here, and it's not slower than
> an external process, e.g. `print_in_columns() ( ... )`.

No disagreement regarding those 2 points. Though with all other things
being equal, it's nice to have patches ready and tested as they are.


> > 4. Depending on the input (stdin) unexpected things can happen in the
> >   "eval" and "set" lines. One example is given in the comment.
>
> "Unexpected" is a bit general. Specifically, the "eval" line does not
> execute arbitrary inputs, and neither does the "set" line.

Regarding eval it seems I misremembered something or it was in
previous iteration. I looked at it again and I can't see how the
latest version can do something unexpected trough the eval.


> The only potential surprise is globbing, as documented, and the technique
> of expanding variables unquoted - like here, is used throughout configure.
>
> Look no further than 100% of the very places which pipe the input stream
> into print_in_columns, and multiple times earlier than that.
>
> It's not great of course, but it's not worse than elsewhere, and it's at
> least documented. Some day maybe someone would improve it.
>
>
> I generally agree with rest of your comments/arguments, including:
>
> > 1. Statistics ... (patch2 is more verbose).
> > 3. patch2 uses eval combined with non-trivial quoting, which is hard
> >   to read and hard to grasp quickly. It's not wrong, but it's not as
> >   easy and straight forward as in patch1.
> > 5. patch2 to uses shell for the parts easily expressed in shell and
> >   uses awk for the parts that are non-trivial in shell.
> > 6. The awk implementation should be easier to read and understand for
> >   the majority of readers/developers.
>
> which basically say that shell is hard, and, pardon the pun, awkward.

:)


> And also I agree with:
>
> > Shell though is not really suited for implementing all kinds of algorithms.
> > OTOH shell is an excellent choice for starting processes that communicate
> > via stdin/stdout/stderr and plugging them together.
>
> and with
>
> > Shell is best when coordinating the execution of other programs.
>
> And similar assertions, however, let's keep the following facts in mind:
>
> - We both agree that we should keep to shell except where an external tool
>   provides some meaningful value over shell code.
>
> - An external tool only helps by being more suitable for the task than
>   shell code (readability, portability, correctness, performance, etc),
>   but it doesn't help with isolation more than a subshell would.
>
> - Shell is capable in a portable way, including for some tasks which can
>   be considered "programming".
>
> - configure is currently written in shell, including some non trivial
>   functionality, and doesn't use external script engines for "scriptlets".
>
> - Being a shell script, it already requires competence and reasonable
>   understanding of shell, including quoting, expansions, etc.
>
> - Maintaining code in more than one scripting system does carry a cost.
>
> - Yes, shell is a lousy programming language.
>
>
> I really don't care if this specific patch ends up as shell or awk. What
> I do care about is being consistent, and understanding the implications.
>
> IMHO, because I'm not convinced that the awk version provides a meaningful
> additional value other than not being shell, if we use the awk version then
> it sets a precedence of using awk (or maybe also other languages) for small
> scripts even if they're basically the same complexity and code as they
> would be in shell.
>
> It's not necessarily a bad precedence (even if I wouldn't do that myself),
> because I do agree that not being shell code does have value, but I think
> it's important to acknowledge that it does set such precedence.
>
> If there was already such practice of using awk scriptlets where possible
> then I wouldn't have said anything. So I mainly care about the precedence.

Interesting point. We seem to disagree regarding the reach of the
precedence. I don't believe it sets a precedence "of using awk (or
maybe also other languages) for small scripts even if they're basically
the same complexity and code as they would be in shell". For me this
is just an individual instance where we decided to implement the needed
"leaf" functionality in awk. With leaf I mean it is lower level code that
does not need to call into e.g. shell functions we provide on the
same or lower level. Comparable to e.g. calling an external process,
like we do with pr in the current implementation.

I would oppose to implement more upper level configure logic in
any language other than shell.


> I guess it comes down to a judgment call on whether or not this awk script
> can be considered meaningfully more suitable for the task than shell.
>
> I think not, but clearly it's only my subjective assessment.

I would still prefer to commit the awk version. IMHO it is better
suited for this task. Surely it's not ideal and it's not clearly
better in all ways.

I intent to push the awk version. I will wait at least until
Thursday, so people can further test, comment, or object.


  Alexander
avih June 2, 2019, 10:28 p.m. UTC | #20
> I intent to push the awk version. I will wait at least until
> Thursday, so people can further test, comment, or object.

No further comments here.

Thanks!
Avi
Guo, Yejun June 3, 2019, 6:24 a.m. UTC | #21
> -----Original Message-----

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of

> avih

> Sent: Monday, June 03, 2019 6:29 AM

> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>

> Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort

> decoder/encoder/filter/... names in alphabet order

> 

> > I intent to push the awk version. I will wait at least until

> > Thursday, so people can further test, comment, or object.

> 

> No further comments here.


I'm ok with the awk version.

> 

> Thanks!

> Avi

> _______________________________________________

> ffmpeg-devel mailing list

> ffmpeg-devel@ffmpeg.org

> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

> 

> To unsubscribe, visit link above, or email

> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Alexander Strasser June 24, 2019, 9:34 p.m. UTC | #22
On 2019-06-03 06:24 +0000, Guo, Yejun wrote:
>
>
> > -----Original Message-----
> > From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of
> > avih
> > Sent: Monday, June 03, 2019 6:29 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH V5 1/2] configure: sort
> > decoder/encoder/filter/... names in alphabet order
> >
> > > I intent to push the awk version. I will wait at least until
> > > Thursday, so people can further test, comment, or object.
> >
> > No further comments here.
>
> I'm ok with the awk version.

Finally pushed the awk version.

Thanks again for your work and comments.


  Alexander
diff mbox

Patch

From 7dc4043db4ea32f6e85b17b6ec8060dac8c85f6b Mon Sep 17 00:00:00 2001
From: Alexander Strasser <eclipse7@gmx.net>
Date: Sat, 27 Apr 2019 01:06:59 +0200
Subject: [PATCH 2/2] configure: log_file: Replace pr with awk invocation

Fixes remaining part of ticket #TODO
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 7e739b3e55..491d4d50e9 100755
--- a/configure
+++ b/configure
@@ -503,7 +503,7 @@  log(){

 log_file(){
     log BEGIN $1
-    pr -n -t $1 >> $logfile
+    awk '{ printf("%5d\t%s\n", NR, $0) }' $1 >> $logfile
     log END $1
 }

--
2.20.1