JEP 254 proposes changing the internal representation
of strings inside the JVM. As most readers surely know,
strings are stored using UTF-16, which uses two bytes per
char. This proposal suggests a more compact internal representation
that uses one byte per character for strings containing only
Latin-1 characters and keeps UTF-16 for all other strings: “Data
gathered from many different applications indicates that
strings are a major component of heap usage and, moreover,
that most String objects contain only Latin-1 characters.
Such characters require only one byte of storage,
hence half of the space in the internal char arrays of such
String objects is going unused,” says the JEP.
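
As a rough illustration of the space argument in that quote, the following Java sketch compares the character payload of a string under a two-bytes-per-char layout with a compact layout that spends one byte per character when every character fits in Latin-1. The class and method names (CompactStringSketch, isLatin1, utf16PayloadBytes, compactPayloadBytes) are hypothetical and exist only for this example; the sketch is not the JVM's implementation and ignores object headers and array overhead.

```java
public class CompactStringSketch {

    // True if every UTF-16 code unit of s lies in the Latin-1 range (U+0000..U+00FF).
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Character payload in bytes when each char takes two bytes (UTF-16 layout).
    static int utf16PayloadBytes(String s) {
        return 2 * s.length();
    }

    // Character payload in bytes under the assumed compact layout:
    // one byte per char for Latin-1-only strings, two bytes per char otherwise.
    static int compactPayloadBytes(String s) {
        return isLatin1(s) ? s.length() : 2 * s.length();
    }

    public static void main(String[] args) {
        // "caf\u00e9" is Latin-1 only; "\u03c0" (Greek pi) is outside Latin-1.
        String[] samples = { "hello", "caf\u00e9", "\u03c0 = 3.14" };
        for (String s : samples) {
            System.out.println(s.length() + " chars: "
                    + utf16PayloadBytes(s) + " bytes as UTF-16, "
                    + compactPayloadBytes(s) + " bytes compact");
        }
    }
}
```

In this sketch a single character outside Latin-1 sends the whole string back to two bytes per char, so the saving applies only to strings that are Latin-1 throughout.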
Changing to the more compact form would not affect existing code or any APIs; it would be a purely internal change inside the JVM, invisible to programmers. Interestingly, the JEP’s web page notes that an optional string compression feature was introduced in an update release of JDK 6. It changed String.value to an Object reference that pointed either to a byte array, for strings containing only 7-bit US-ASCII characters, or to a regular char array. That feature, though, was subsequently removed.