...

Code Block
% java Example
TITLE
% 

However, most languages that use the Latin alphabet treat the letter I as the uppercase version of i. Turkish is an exception: it has a dotted i whose uppercase version is also dotted (İ) and an undotted ı whose uppercase version is undotted (I). Changing capitalization on most strings in the Turkish locale [API 2006] may produce unexpected results:

...

Many programs use locale-dependent methods only for outputting information, such as dates. Provided that the locale-dependent data is not inspected by the program, it may safely rely on the default locale.

...

In HTML, tags are case-insensitive and can therefore be specified using uppercase, lowercase, or any mixture of cases. This noncompliant code example uses the locale-dependent String.toUpperCase() method to convert an HTML tag to uppercase so that it can be checked before further processing. The code must ignore <SCRIPT> tags, as they indicate code that is to be discarded. Whereas the English locale would convert "script" to "SCRIPT", the Turkish locale will convert "script" to "SCRİPT", and the check will fail to detect the <SCRIPT> tag.
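The failure described above can be reproduced without changing the platform's default locale by passing a locale explicitly. The following is a minimal sketch, not the guideline's own example; the class TagCheckDemo and the helper isScriptTag are hypothetical names introduced only for illustration.

```java
import java.util.Locale;

public class TagCheckDemo {
    // Hypothetical helper: uppercases the tag using the given locale's
    // case-mapping rules, then compares it with the literal "SCRIPT".
    static boolean isScriptTag(String tag, Locale locale) {
        return tag.toUpperCase(locale).equals("SCRIPT");
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // English rules map i to I, so the tag is detected.
        System.out.println(isScriptTag("script", Locale.ENGLISH)); // true

        // Turkish rules map i to the dotted capital (U+0130),
        // so the comparison with "SCRIPT" fails.
        System.out.println(isScriptTag("script", turkish)); // false
    }
}
```

A program that calls toUpperCase() with no argument behaves like the second call whenever the default locale happens to be Turkish.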

...

Specifying Locale.ROOT is a suitable alternative when an English-specific locale would not be appropriate.
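As a rough illustration of the Locale.ROOT approach, the sketch below performs the tag check with locale-neutral case mapping, so the result does not depend on the default locale. The class RootLocaleTagCheck and the helper isScriptTag are hypothetical names, and the default locale is switched to Turkish in main only to show that the check is unaffected.

```java
import java.util.Locale;

public class RootLocaleTagCheck {
    // Hypothetical helper: Locale.ROOT applies locale-neutral case mapping,
    // so i always maps to I regardless of the platform's default locale.
    static boolean isScriptTag(String tag) {
        return tag.toUpperCase(Locale.ROOT).equals("SCRIPT");
    }

    public static void main(String[] args) {
        Locale saved = Locale.getDefault();
        try {
            // Simulate running on a Turkish system.
            Locale.setDefault(Locale.forLanguageTag("tr-TR"));
            System.out.println(isScriptTag("script")); // true
            System.out.println(isScriptTag("ScRiPt")); // true
            System.out.println(isScriptTag("div"));    // false
        } finally {
            // Restore the original default locale.
            Locale.setDefault(saved);
        }
    }
}
```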

...

Java provides classes for handling input and output, which can be based on either bytes or characters. The byte I/O families derive from the InputStream and OutputStream abstract classes and are independent of locale and character encoding. The character I/O families, however, derive from Reader and Writer and must convert byte sequences into strings and back, so they rely on a specified character encoding to do their conversion. By default, this encoding is indicated by the file.encoding system property, which is part of the current locale. Consequently, a file encoded with one encoding, such as UTF-8, must not be read by a character input method using a different encoding, such as UTF-16.

Programs that read character data (whether directly using a Reader or indirectly using some method such as constructing a String from a byte array) must be aware of the source of the data. If the encoding of the data is fixed (for example, if the data comes from a file resource that is shipped with the program), then that encoding must be specified by the program. Failure to specify the encoding enables an attacker to change the default encoding and thereby force the program to read the data using the wrong encoding.
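One way to pin the encoding is to pass a Charset to every byte-to-character conversion. The sketch below is illustrative only: the class ExplicitCharsetDemo and its helpers decodeUtf8 and readFirstLine are hypothetical, and it assumes the data is known to be UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    // Hypothetical helper: decodes bytes as UTF-8 no matter what the
    // platform default encoding (file.encoding) is.
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: the same idea with a Reader; the charset is
    // passed to InputStreamReader instead of relying on the default.
    static String readFirstLine(byte[] data) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "\u00dcber"; // a word with one non-ASCII letter
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the text.
        System.out.println(decodeUtf8(utf8).equals(original));    // true
        System.out.println(readFirstLine(utf8).equals(original)); // true

        // Decoding the same bytes as UTF-16 yields different characters.
        String wrong = new String(utf8, StandardCharsets.UTF_16);
        System.out.println(wrong.equals(original)); // false
    }
}
```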

This risk does not apply to programs that read data known to be in the encoding specified by the platform running the program. For example, if the program must open a file provided by the user, it is reasonable to rely on the default encoding, expecting that it will be set correctly.

This noncompliant code example reads its own source code and prints it out, prepending each line with a line number. If the program is run with the argument -Dfile.encoding=UTF16 while its source file is stored as UTF-8, the program will save garbage in the output file.

...

This compliant solution forces the date to be printed in an English format, regardless of the current locale:

Code Block
String myString = DateFormat.getDateInstance(DateFormat.MEDIUM, Locale.US).format(rightNow.getTime());
/* ...rest of code unchanged...*/

...