Message ID | 20170611155845.27939-1-michael@niedermayer.cc |
---|---|
State | Accepted |
Commit | 4132218b87cd6fb13abd162e3037ef4563286baa |
Headers | show |
On Sun, 11 Jun 2017 17:58:45 +0200 Michael Niedermayer <michael@niedermayer.cc> wrote: > This reduces the worst case from O(n²) to O(n) time > > Fixes Timeout > Fixes: 2127/clusterfuzz-testcase-minimized-6595787859427328 > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > --- > libavcodec/htmlsubtitles.c | 20 +++++++++++++++----- > 1 file changed, 15 insertions(+), 5 deletions(-) > > diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c > index 16295daa0c..70311c66d5 100644 > --- a/libavcodec/htmlsubtitles.c > +++ b/libavcodec/htmlsubtitles.c > @@ -56,6 +56,7 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) > char *param, buffer[128], tmp[128]; > int len, tag_close, sptr = 1, line_start = 1, an = 0, end = 0; > SrtStack stack[16]; > + int closing_brace_missing = 0; > > stack[0].tag[0] = 0; > strcpy(stack[0].param[PARAM_SIZE], "{\\fs}"); > @@ -83,11 +84,20 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) > and all microdvd like styles such as {Y:xxx} */ > len = 0; > an += sscanf(in, "{\\an%*1u}%n", &len) >= 0 && len > 0; > - if ((an != 1 && (len = 0, sscanf(in, "{\\%*[^}]}%n", &len) >= 0 && len > 0)) || > - (len = 0, sscanf(in, "{%*1[CcFfoPSsYy]:%*[^}]}%n", &len) >= 0 && len > 0)) { > - in += len - 1; > - } else > - av_bprint_chars(dst, *in, 1); > + > + if (!closing_brace_missing) { > + if ( (an != 1 && in[1] == '\\') > + || (in[1] && strchr("CcFfoPSsYy", in[1]) && in[2] == ':')) { > + char *bracep = strchr(in+2, '}'); > + if (bracep) { > + in = bracep; > + break; > + } else > + closing_brace_missing = 1; > + } > + } > + > + av_bprint_chars(dst, *in, 1); > break; > case '<': > tag_close = in[1] == '/'; IMO better than before - now anyone can understand this code, and it's faster. I'm not maintainer of this though. Would it be possible to move this switch case to a separate function? ff_htmlmarkup_to_ass is a bit too big. Btw. as far as ASS tag stripping is concerned, this seems to ignore drawings, but maybe they're typically not used in SRT.
On Mon, Jun 12, 2017 at 11:35:14AM +0200, wm4 wrote: > On Sun, 11 Jun 2017 17:58:45 +0200 > Michael Niedermayer <michael@niedermayer.cc> wrote: > > > This reduces the worst case from O(n²) to O(n) time > > > > Fixes Timeout > > Fixes: 2127/clusterfuzz-testcase-minimized-6595787859427328 > > > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > > --- > > libavcodec/htmlsubtitles.c | 20 +++++++++++++++----- > > 1 file changed, 15 insertions(+), 5 deletions(-) > > > > diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c > > index 16295daa0c..70311c66d5 100644 > > --- a/libavcodec/htmlsubtitles.c > > +++ b/libavcodec/htmlsubtitles.c > > @@ -56,6 +56,7 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) > > char *param, buffer[128], tmp[128]; > > int len, tag_close, sptr = 1, line_start = 1, an = 0, end = 0; > > SrtStack stack[16]; > > + int closing_brace_missing = 0; > > > > stack[0].tag[0] = 0; > > strcpy(stack[0].param[PARAM_SIZE], "{\\fs}"); > > @@ -83,11 +84,20 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) > > and all microdvd like styles such as {Y:xxx} */ > > len = 0; > > an += sscanf(in, "{\\an%*1u}%n", &len) >= 0 && len > 0; > > - if ((an != 1 && (len = 0, sscanf(in, "{\\%*[^}]}%n", &len) >= 0 && len > 0)) || > > - (len = 0, sscanf(in, "{%*1[CcFfoPSsYy]:%*[^}]}%n", &len) >= 0 && len > 0)) { > > - in += len - 1; > > - } else > > - av_bprint_chars(dst, *in, 1); > > + > > + if (!closing_brace_missing) { > > + if ( (an != 1 && in[1] == '\\') > > + || (in[1] && strchr("CcFfoPSsYy", in[1]) && in[2] == ':')) { > > + char *bracep = strchr(in+2, '}'); > > + if (bracep) { > > + in = bracep; > > + break; > > + } else > > + closing_brace_missing = 1; > > + } > > + } > > + > > + av_bprint_chars(dst, *in, 1); > > break; > > case '<': > > tag_close = in[1] == '/'; > > IMO better than before - now anyone can understand this code, and it's > faster. I'm not maintainer of this though. > > Would it be possible to move this switch case to a separate function? > ff_htmlmarkup_to_ass is a bit too big. patch posted which moves the code on top of this patch about the "an +=" stuff, i tried to find some description of the subtitle format that trigers it but i had no luck. So i tried not to change its behavior ... [...]
diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c index 16295daa0c..70311c66d5 100644 --- a/libavcodec/htmlsubtitles.c +++ b/libavcodec/htmlsubtitles.c @@ -56,6 +56,7 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) char *param, buffer[128], tmp[128]; int len, tag_close, sptr = 1, line_start = 1, an = 0, end = 0; SrtStack stack[16]; + int closing_brace_missing = 0; stack[0].tag[0] = 0; strcpy(stack[0].param[PARAM_SIZE], "{\\fs}"); @@ -83,11 +84,20 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) and all microdvd like styles such as {Y:xxx} */ len = 0; an += sscanf(in, "{\\an%*1u}%n", &len) >= 0 && len > 0; - if ((an != 1 && (len = 0, sscanf(in, "{\\%*[^}]}%n", &len) >= 0 && len > 0)) || - (len = 0, sscanf(in, "{%*1[CcFfoPSsYy]:%*[^}]}%n", &len) >= 0 && len > 0)) { - in += len - 1; - } else - av_bprint_chars(dst, *in, 1); + + if (!closing_brace_missing) { + if ( (an != 1 && in[1] == '\\') + || (in[1] && strchr("CcFfoPSsYy", in[1]) && in[2] == ':')) { + char *bracep = strchr(in+2, '}'); + if (bracep) { + in = bracep; + break; + } else + closing_brace_missing = 1; + } + } + + av_bprint_chars(dst, *in, 1); break; case '<': tag_close = in[1] == '/';
This reduces the worst case from O(n²) to O(n) time Fixes Timeout Fixes: 2127/clusterfuzz-testcase-minimized-6595787859427328 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> --- libavcodec/htmlsubtitles.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-)