
[FFmpeg-devel,v2] avcodec/ccaption_dec: honor transparency of leading non-breaking space

Message ID 20240311004411.140656-1-marth64@proxyid.net
State New

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  success  Make fate finished
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             success  Make fate finished

Commit Message

Marth64 March 11, 2024, 12:44 a.m. UTC
In Closed Captions (US), the non-breaking space (U+00A0, code 0x39 in the
special character set) can be used as a leading character to align text
horizontally from the left. However, the CC decoder does not skip it as a
leading character the way it skips an ordinary space, so the padding is
rendered as blank cells of the black CC box instead of being left transparent.
This is not the intended viewing experience.

Ignore leading non-breaking spaces, which produces the intended transparency
and aligns the text. Since all characters are fixed-width in CC, they can be
handled the same way as leading ordinary spaces are handled today.
Also, as a nit, lowercase the NBSP's hex code in the entry table to match
the casing of the other hex codes.

v2 only updates the commit message, which mistakenly referenced avformat.

Signed-off-by: Marth64 <marth64@proxyid.net>
---
 libavcodec/ccaption_dec.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
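
The rule described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the decoder's code: the enum, the macro, and both function names are hypothetical stand-ins, and only the 0x39 code and the pairing of each blank with its character set are taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the decoder's charset tags; the real enum
 * lives in libavcodec/ccaption_dec.c and its values may differ. */
enum cc_charset_sketch {
    SKETCH_BASIC_AMERICAN,
    SKETCH_SPECIAL_AMERICAN,
};

/* Code of the non-breaking space in the special character set
 * (0x39 in the entry table touched by the patch). */
#define SKETCH_NBSP_CODE 0x39

/* A cell counts as a transparent leading blank if it is an ordinary space
 * from the basic set or the non-breaking space from the special set.
 * The charset check matters: 0x39 in the basic set is the digit '9'. */
static int is_leading_blank(char ch, char charset)
{
    return (ch == ' ' && charset == SKETCH_BASIC_AMERICAN) ||
           (ch == SKETCH_NBSP_CODE && charset == SKETCH_SPECIAL_AMERICAN);
}

/* Count leading blanks in a row, never skipping more than `limit` cells. */
static size_t count_leading_blanks(const char *row, const char *charset,
                                   size_t limit)
{
    size_t j = 0;

    /* The bound is tested first and applies to both kinds of blank. */
    while (j < limit && is_leading_blank(row[j], charset[j]))
        j++;
    return j;
}
```

Folding both alternatives into one predicate keeps the bound outside the `||`. In C, `&&` binds tighter than `||`, so a condition written as `a && b || c` is parsed as `(a && b) || c`, and a bound placed in front would guard only the first alternative.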

Comments

Stefano Sabatini March 11, 2024, 4:38 p.m. UTC | #1
On date Sunday 2024-03-10 19:44:11 -0500, Marth64 wrote:
> In Closed Captions (US), the non-breaking space (0xA0) can be used to align
> text horizontally from the left by using it as a leading character.
> However, CC decoder does not ignore it as a leading character like it does
> an ordinary space, so a blank padding is rendered over the black CC box.
> This is not the intended viewing experience.
> 
> Ignore the leading non-breaking spaces, thus creating the intended transparency
> which aligns the text. Since all characters are fixed-width in CC, it
> can be handled the same way as we currently treat leading ordinary spaces.
> Also, as a nit, lowercase the NBSP's hex code in the entry table to match
> casing of the other hex codes.
> 
> v2 only updates the commit message which mistakenly referenced avformat.
> 
> Signed-off-by: Marth64 <marth64@proxyid.net>
> ---
>  libavcodec/ccaption_dec.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/libavcodec/ccaption_dec.c b/libavcodec/ccaption_dec.c
> index faf058ce97..591013d202 100644
> --- a/libavcodec/ccaption_dec.c
> +++ b/libavcodec/ccaption_dec.c
> @@ -91,7 +91,7 @@ enum cc_charset {
>          ENTRY(0x36, "\u00a3")                            \
>          ENTRY(0x37, "\u266a")                            \
>          ENTRY(0x38, "\u00e0")                            \
> -        ENTRY(0x39, "\u00A0")                            \
> +        ENTRY(0x39, "\u00a0")                            \
>          ENTRY(0x3a, "\u00e8")                            \
>          ENTRY(0x3b, "\u00e2")                            \
>          ENTRY(0x3c, "\u00ea")                            \
> @@ -471,7 +471,8 @@ static int capture_screen(CCaptionSubContext *ctx)
>              const char *row = screen->characters[i];
>              const char *charset = screen->charsets[i];
>              j = 0;
> -            while (row[j] == ' ' && charset[j] == CCSET_BASIC_AMERICAN)
> +            while ((row[j] == ' '  && charset[j] == CCSET_BASIC_AMERICAN) ||
> +                   (row[j] == 0x39 && charset[j] == CCSET_SPECIAL_AMERICAN))
>                  j++;
>              if (!tab || j < tab)
>                  tab = j;
> @@ -491,7 +492,9 @@ static int capture_screen(CCaptionSubContext *ctx)
>              j = 0;
>  
>              /* skip leading space */
> -            while (row[j] == ' ' && charset[j] == CCSET_BASIC_AMERICAN && j < tab)
> +            while (j < tab &&
> +                   (row[j] == ' '  && charset[j] == CCSET_BASIC_AMERICAN) ||
> +                   (row[j] == 0x39 && charset[j] == CCSET_SPECIAL_AMERICAN))
>                  j++;

Patch LGTM, will apply the complete patchset if I see no comments
after a few days.

Thanks.

Marth64 March 12, 2024, 2:29 a.m. UTC | #2
I am working on an improved patchset that consolidates these patches,
addresses feedback, and includes other improvements. Will submit soon, thank you!


Patch

diff --git a/libavcodec/ccaption_dec.c b/libavcodec/ccaption_dec.c
index faf058ce97..591013d202 100644
--- a/libavcodec/ccaption_dec.c
+++ b/libavcodec/ccaption_dec.c
@@ -91,7 +91,7 @@  enum cc_charset {
         ENTRY(0x36, "\u00a3")                            \
         ENTRY(0x37, "\u266a")                            \
         ENTRY(0x38, "\u00e0")                            \
-        ENTRY(0x39, "\u00A0")                            \
+        ENTRY(0x39, "\u00a0")                            \
         ENTRY(0x3a, "\u00e8")                            \
         ENTRY(0x3b, "\u00e2")                            \
         ENTRY(0x3c, "\u00ea")                            \
@@ -471,7 +471,8 @@  static int capture_screen(CCaptionSubContext *ctx)
             const char *row = screen->characters[i];
             const char *charset = screen->charsets[i];
             j = 0;
-            while (row[j] == ' ' && charset[j] == CCSET_BASIC_AMERICAN)
+            while ((row[j] == ' '  && charset[j] == CCSET_BASIC_AMERICAN) ||
+                   (row[j] == 0x39 && charset[j] == CCSET_SPECIAL_AMERICAN))
                 j++;
             if (!tab || j < tab)
                 tab = j;
@@ -491,7 +492,9 @@  static int capture_screen(CCaptionSubContext *ctx)
             j = 0;
 
             /* skip leading space */
-            while (row[j] == ' ' && charset[j] == CCSET_BASIC_AMERICAN && j < tab)
+            while (j < tab &&
+                   ((row[j] == ' '  && charset[j] == CCSET_BASIC_AMERICAN) ||
+                    (row[j] == 0x39 && charset[j] == CCSET_SPECIAL_AMERICAN)))
                 j++;
 
             x = ASS_DEFAULT_PLAYRESX * (0.1 + 0.0250 * j);