Message ID | 20170203084154.9612-2-h.leppkes@gmail.com |
---|---|
State | Accepted |
Commit | 8b80feb9a70bca07e6ea2e1a0b870915e88f13f7 |
Headers | show |
On 2/3/2017 5:41 AM, Hendrik Leppkes wrote: > Without the /UTF-8 switch, the MSVC compiler treats all files as in the > system codepage, instead of in UTF-8, which causes UTF-8 string literals > to be interpreted wrong. > > This switch was only introduced in VS2015 Update 2, and any earlier > versions do not have an equivalent solution. > > Fixes fate-sub-scc on MSVC 2015+ > --- > configure | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/configure b/configure > index d3d652f0f4..231cc3eca7 100755 > --- a/configure > +++ b/configure > @@ -6327,6 +6327,9 @@ EOF > # Issue has been fixed in MSVC v19.00.24218. > check_cpp_condition windows.h "_MSC_FULL_VER >= 190024218" || > check_cflags -d2SSAOptimizer- > + # enable utf-8 source processing on VS2015 U2 and newer > + check_cpp_condition windows.h "_MSC_FULL_VER >= 190023918" && > + add_cflags -utf-8 Probably better use check_cflags, just in case. > fi > > for pfx in "" host_; do >
On Fri, Feb 3, 2017 at 3:05 PM, James Almer <jamrial@gmail.com> wrote: > On 2/3/2017 5:41 AM, Hendrik Leppkes wrote: >> Without the /UTF-8 switch, the MSVC compiler treats all files as in the >> system codepage, instead of in UTF-8, which causes UTF-8 string literals >> to be interpreted wrong. >> >> This switch was only introduced in VS2015 Update 2, and any earlier >> versions do not have an equivalent solution. >> >> Fixes fate-sub-scc on MSVC 2015+ >> --- >> configure | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/configure b/configure >> index d3d652f0f4..231cc3eca7 100755 >> --- a/configure >> +++ b/configure >> @@ -6327,6 +6327,9 @@ EOF >> # Issue has been fixed in MSVC v19.00.24218. >> check_cpp_condition windows.h "_MSC_FULL_VER >= 190024218" || >> check_cflags -d2SSAOptimizer- >> + # enable utf-8 source processing on VS2015 U2 and newer >> + check_cpp_condition windows.h "_MSC_FULL_VER >= 190023918" && >> + add_cflags -utf-8 > > Probably better use check_cflags, just in case. > check_cflags doesn't work, since most wrong options just cause it to emit a warning but not error out (although confusingly some do error out, like the d2 option above, since the d2 prefix directly targets the c2 compiler stage, and unknown options there error instead of warn). Thats the whole reason I added a version check in the first place instead of solely using check_cflags with it. I mean, no real harm to use check_cflags together with the version check, but since it doesn't do anything, I figured I would save the extra check and a few forks. - Hendrik
On 4 February 2017 at 02:32, Hendrik Leppkes <h.leppkes@gmail.com> wrote: > On Fri, Feb 3, 2017 at 3:05 PM, James Almer <jamrial@gmail.com> wrote: > > On 2/3/2017 5:41 AM, Hendrik Leppkes wrote: > >> Without the /UTF-8 switch, the MSVC compiler treats all files as in the > >> system codepage, instead of in UTF-8, which causes UTF-8 string literals > >> to be interpreted wrong. > >> > >> This switch was only introduced in VS2015 Update 2, and any earlier > >> versions do not have an equivalent solution. > >> > >> Fixes fate-sub-scc on MSVC 2015+ > >> --- > >> configure | 3 +++ > >> 1 file changed, 3 insertions(+) > >> > >> diff --git a/configure b/configure > >> index d3d652f0f4..231cc3eca7 100755 > >> --- a/configure > >> +++ b/configure > >> @@ -6327,6 +6327,9 @@ EOF > >> # Issue has been fixed in MSVC v19.00.24218. > >> check_cpp_condition windows.h "_MSC_FULL_VER >= 190024218" || > >> check_cflags -d2SSAOptimizer- > >> + # enable utf-8 source processing on VS2015 U2 and newer > >> + check_cpp_condition windows.h "_MSC_FULL_VER >= 190023918" && > >> + add_cflags -utf-8 > > > > Probably better use check_cflags, just in case. > > > > check_cflags doesn't work, since most wrong options just cause it to > emit a warning but not error out (although confusingly some do error > out, like the d2 option above, since the d2 prefix directly targets > the c2 compiler stage, and unknown options there error instead of > warn). > Thats the whole reason I added a version check in the first place > instead of solely using check_cflags with it. > > I mean, no real harm to use check_cflags together with the version > check, but since it doesn't do anything, I figured I would save the > extra check and a few forks. Is there any possibility to also find a fix for this on older msvc versions? If you direct me to the specific lines of code that are mis-compiling/executing ill have a look. In the mean time this patch looks good to me.
On Sat, Feb 4, 2017 at 5:50 AM, Matt Oliver <protogonoi@gmail.com> wrote: > On 4 February 2017 at 02:32, Hendrik Leppkes <h.leppkes@gmail.com> wrote: > >> On Fri, Feb 3, 2017 at 3:05 PM, James Almer <jamrial@gmail.com> wrote: >> > On 2/3/2017 5:41 AM, Hendrik Leppkes wrote: >> >> Without the /UTF-8 switch, the MSVC compiler treats all files as in the >> >> system codepage, instead of in UTF-8, which causes UTF-8 string literals >> >> to be interpreted wrong. >> >> >> >> This switch was only introduced in VS2015 Update 2, and any earlier >> >> versions do not have an equivalent solution. >> >> >> >> Fixes fate-sub-scc on MSVC 2015+ >> >> --- >> >> configure | 3 +++ >> >> 1 file changed, 3 insertions(+) >> >> >> >> diff --git a/configure b/configure >> >> index d3d652f0f4..231cc3eca7 100755 >> >> --- a/configure >> >> +++ b/configure >> >> @@ -6327,6 +6327,9 @@ EOF >> >> # Issue has been fixed in MSVC v19.00.24218. >> >> check_cpp_condition windows.h "_MSC_FULL_VER >= 190024218" || >> >> check_cflags -d2SSAOptimizer- >> >> + # enable utf-8 source processing on VS2015 U2 and newer >> >> + check_cpp_condition windows.h "_MSC_FULL_VER >= 190023918" && >> >> + add_cflags -utf-8 >> > >> > Probably better use check_cflags, just in case. >> > >> >> check_cflags doesn't work, since most wrong options just cause it to >> emit a warning but not error out (although confusingly some do error >> out, like the d2 option above, since the d2 prefix directly targets >> the c2 compiler stage, and unknown options there error instead of >> warn). >> Thats the whole reason I added a version check in the first place >> instead of solely using check_cflags with it. >> >> I mean, no real harm to use check_cflags together with the version >> check, but since it doesn't do anything, I figured I would save the >> extra check and a few forks. > > > Is there any possibility to also find a fix for this on older msvc > versions? If you direct me to the specific lines of code that are > mis-compiling/executing ill have a look. > In the mean time this patch looks good to me. For this particular fate test, ccaption_dec.c line 73 onwards has various unicode escape sequences, which don't get interpreted properly. The one generic workaround is encoding byte sequences directly, which is really ugly and code-changes are usually hard to justify. Another MSVC workaround is an undocumented compiler pragma, which works in 2013, but not in 2012 (#pragma execution_character_set("utf-8")), but adding pragmas to files is also not something you can just easily hide away without it becoming ugly. - Hendrik
On Sat, Feb 4, 2017 at 10:25 AM, Hendrik Leppkes <h.leppkes@gmail.com> wrote: > On Sat, Feb 4, 2017 at 5:50 AM, Matt Oliver <protogonoi@gmail.com> wrote: >> On 4 February 2017 at 02:32, Hendrik Leppkes <h.leppkes@gmail.com> wrote: >> >>> On Fri, Feb 3, 2017 at 3:05 PM, James Almer <jamrial@gmail.com> wrote: >>> > On 2/3/2017 5:41 AM, Hendrik Leppkes wrote: >>> >> Without the /UTF-8 switch, the MSVC compiler treats all files as in the >>> >> system codepage, instead of in UTF-8, which causes UTF-8 string literals >>> >> to be interpreted wrong. >>> >> >>> >> This switch was only introduced in VS2015 Update 2, and any earlier >>> >> versions do not have an equivalent solution. >>> >> >>> >> Fixes fate-sub-scc on MSVC 2015+ >>> >> --- >>> >> configure | 3 +++ >>> >> 1 file changed, 3 insertions(+) >>> >> >>> >> diff --git a/configure b/configure >>> >> index d3d652f0f4..231cc3eca7 100755 >>> >> --- a/configure >>> >> +++ b/configure >>> >> @@ -6327,6 +6327,9 @@ EOF >>> >> # Issue has been fixed in MSVC v19.00.24218. >>> >> check_cpp_condition windows.h "_MSC_FULL_VER >= 190024218" || >>> >> check_cflags -d2SSAOptimizer- >>> >> + # enable utf-8 source processing on VS2015 U2 and newer >>> >> + check_cpp_condition windows.h "_MSC_FULL_VER >= 190023918" && >>> >> + add_cflags -utf-8 >>> > >>> > Probably better use check_cflags, just in case. >>> > >>> >>> check_cflags doesn't work, since most wrong options just cause it to >>> emit a warning but not error out (although confusingly some do error >>> out, like the d2 option above, since the d2 prefix directly targets >>> the c2 compiler stage, and unknown options there error instead of >>> warn). >>> Thats the whole reason I added a version check in the first place >>> instead of solely using check_cflags with it. >>> >>> I mean, no real harm to use check_cflags together with the version >>> check, but since it doesn't do anything, I figured I would save the >>> extra check and a few forks. >> >> >> Is there any possibility to also find a fix for this on older msvc >> versions? If you direct me to the specific lines of code that are >> mis-compiling/executing ill have a look. >> In the mean time this patch looks good to me. > > For this particular fate test, ccaption_dec.c line 73 onwards has > various unicode escape sequences, which don't get interpreted > properly. > > The one generic workaround is encoding byte sequences directly, which > is really ugly and code-changes are usually hard to justify. > Another MSVC workaround is an undocumented compiler pragma, which > works in 2013, but not in 2012 (#pragma > execution_character_set("utf-8")), but adding pragmas to files is also > not something you can just easily hide away without it becoming ugly. > Oh, and one other option is re-saving the file with a UTF-8 BOM, which the compiler will also use as a hint to support UTF-8 sequences, however UTF-8 BOM is also usually frowned upon, not to mention that all files using such features would need to be identified first. - Hendrik
2017-02-04 10:25 GMT+01:00 Hendrik Leppkes <h.leppkes@gmail.com>: > Another MSVC workaround is an undocumented compiler pragma, which > works in 2013, but not in 2012 (#pragma > execution_character_set("utf-8")), but adding pragmas to files is also > not something you can just easily hide away without it becoming ugly. Is there a real disadvantage adding the pragma after your patch gets applied? Carl Eugen
On Sat, Feb 4, 2017 at 10:29 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote: > 2017-02-04 10:25 GMT+01:00 Hendrik Leppkes <h.leppkes@gmail.com>: > >> Another MSVC workaround is an undocumented compiler pragma, which >> works in 2013, but not in 2012 (#pragma >> execution_character_set("utf-8")), but adding pragmas to files is also >> not something you can just easily hide away without it becoming ugly. > > Is there a real disadvantage adding the pragma after your patch > gets applied? > Adding pragmas is ugly because you can't hide them away, but its independent of this patch either way, since it targets a different compiler version. Applied this patch in the meantime, as well as 1/2 from the set. - Hendrik
On 4 February 2017 at 21:23, Hendrik Leppkes <h.leppkes@gmail.com> wrote: > On Sat, Feb 4, 2017 at 10:29 AM, Carl Eugen Hoyos <ceffmpeg@gmail.com> > wrote: > > 2017-02-04 10:25 GMT+01:00 Hendrik Leppkes <h.leppkes@gmail.com>: > > > >> Another MSVC workaround is an undocumented compiler pragma, which > >> works in 2013, but not in 2012 (#pragma > >> execution_character_set("utf-8")), but adding pragmas to files is also > >> not something you can just easily hide away without it becoming ugly. > > > > Is there a real disadvantage adding the pragma after your patch > > gets applied? > > > > Adding pragmas is ugly because you can't hide them away, but its > independent of this patch either way, since it targets a different > compiler version. > > Applied this patch in the meantime, as well as 1/2 from the set. > Well given that the pragma doesn't work in 2012 and we are still claiming support for those older msvc versions then it wouldn't really be a suitable fix anyway. As far as I see it it the only way to cover all currently supported msvc versions would be to convert the utf chars directly to binary. Which is a bit ugly granted, but im not sure there is anyway around it. MSVC appears to output a warning about characters that dont fit in the current code page and at the moment only ccaption_dec.c is mentioned so it appears tat luckily only 1 file needs to be changed.
diff --git a/configure b/configure index d3d652f0f4..231cc3eca7 100755 --- a/configure +++ b/configure @@ -6327,6 +6327,9 @@ EOF # Issue has been fixed in MSVC v19.00.24218. check_cpp_condition windows.h "_MSC_FULL_VER >= 190024218" || check_cflags -d2SSAOptimizer- + # enable utf-8 source processing on VS2015 U2 and newer + check_cpp_condition windows.h "_MSC_FULL_VER >= 190023918" && + add_cflags -utf-8 fi for pfx in "" host_; do