Friday, March 13, 2009

Java String Literals: No String Constructor Required

I think that most experienced Java developers are aware of many of the characteristics that make the Java String a little different from other objects. One particular nuance that people new to Java sometimes don't fully appreciate is that a String literal is already a String object.

When first learning Java it is really easy to write a String assignment like this:


// Unnecessary and redundant String instantiation
String blogUrlString = new String("http://marxsoftware.blogspot.com/");


This will compile, and the initialized String blogUrlString will support anything one might expect from a String. However, the downside of this particular statement is that two String objects are involved and one of them is unnecessary. Because the String literal "http://marxsoftware.blogspot.com/" is already a full-fledged Java String, the new operator is unnecessary and results in an extraneous instantiation. The code above can be rewritten as follows:


// The 'new' keyword is not needed because the literal String is a full String object
String blogUrlString = "http://marxsoftware.blogspot.com/";
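
One way to see that a literal is already a String object is to compare object identity. The following is a minimal sketch rather than part of the original example; the class name LiteralIdentityDemo and its method names are hypothetical and chosen only for illustration. It relies on the Java Language Specification's guarantee that String literals with the same content refer to the same interned instance, whereas new String(...) always yields a distinct object.

```java
/**
 * Sketch showing that a String literal is already a String object and
 * that new String(...) creates an additional, distinct object.
 * The class and method names here are illustrative assumptions.
 */
class LiteralIdentityDemo
{
   /** Two occurrences of the same literal refer to one interned String. */
   static boolean literalsShareInstance()
   {
      final String first = "http://marxsoftware.blogspot.com/";
      final String second = "http://marxsoftware.blogspot.com/";
      return first == second;
   }

   /** The constructor produces a new object even though the content is equal. */
   static boolean constructorSharesInstance()
   {
      final String literal = "http://marxsoftware.blogspot.com/";
      final String constructed = new String("http://marxsoftware.blogspot.com/");
      return literal == constructed;
   }

   /** Methods can be invoked directly on a literal because it is a String. */
   static int literalLength()
   {
      return "http://marxsoftware.blogspot.com/".length();
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Literals share instance:     " + literalsShareInstance());
      System.out.println("Constructor shares instance: " + constructorSharesInstance());
      System.out.println("Length of literal:           " + literalLength());
   }
}
```

The first check yields true and the second yields false, which is the extra object the constructor call introduces. Content comparisons should still use equals(); the == comparisons here are only a way to make object identity visible.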


The unnecessary String instantiation demonstrated first can reduce performance in Java applications. If the extraneous instantiation occurs in limited cases outside of loops, it is unlikely to cause significant performance degradation. If it occurs within a loop, however, its performance impact can be much more significant. Even when the performance issue is only slight, I still find the extra "new" instantiation to be less readable than the second approach shown above.

Joshua Bloch uses an example similar to mine above to illustrate Item 5 ("Avoid Creating Unnecessary Objects") in the Second Edition of Effective Java. He points out that this extra instantiation in frequently called code can lead to performance problems.
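
To make the loop case concrete, the sketch below counts how many distinct String objects each style produces inside a loop. It is an illustration under assumptions, not code from Effective Java or from my test class: the class name LoopAllocationDemo and the method countDistinctObjects are hypothetical, and an identity-based set is used only as a convenient way to count objects rather than contents.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/**
 * Sketch counting distinct String objects created in a loop.
 * The class and method names are illustrative assumptions.
 */
class LoopAllocationDemo
{
   /**
    * Count distinct String objects produced by a loop.
    *
    * @param useConstructor true to wrap the literal in new String(...).
    * @param numberOfLoops Number of loop iterations.
    * @return Number of distinct String objects seen.
    */
   static int countDistinctObjects(final boolean useConstructor, final int numberOfLoops)
   {
      // Identity-based set: two Strings are the same entry only if they
      // are the very same object, regardless of their contents.
      final Set<String> distinctObjects =
         Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
      for (int counter = 0; counter < numberOfLoops; counter++)
      {
         final String value = useConstructor
            ? new String("http://marxsoftware.blogspot.com/")
            : "http://marxsoftware.blogspot.com/";
         distinctObjects.add(value);
      }
      return distinctObjects.size();
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Literal:     " + countDistinctObjects(false, 1000));
      System.out.println("Constructor: " + countDistinctObjects(true, 1000));
   }
}
```

With 1000 iterations, the literal version reports a single object while the constructor version reports 1000, one per pass through the loop.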

To demonstrate the effect of this unnecessary extra instantiation of a String, I put together the following simple class (with a nested member class and a nested enum). The full code for it appears next.

RedundantStringExample.java


package dustin.examples;

import java.util.ArrayList;
import java.util.List;

/**
* Example demonstrating effect of redundant String instantiation.
*/
public class RedundantStringExample
{
/** Operating System-independent New line character. */
private static final String NEW_LINE = System.getProperty("line.separator");

/** List of Strings. */
private List<String> strings = new ArrayList<String>();

/** No-arguments constructor. */
public RedundantStringExample() {}

/**
* Test performance in loop over single String instantiation that is
* executed the number of times as provided by the passed-in argument.
*
* @param numberOfLoops Number of times to instantiate Single String.
* @return Results of this test.
*/
public TestResult testSingleString(final int numberOfLoops)
{
final TestResult result = new TestResult(numberOfLoops, TestType.SINGLE);
result.startTimer();
for (int counter = 0; counter < numberOfLoops; counter++)
{
strings.add("http://marxsoftware.blogspot.com/");
}
result.stopTimer();
return result;
}

/**
* Test performance in loop over redundant String instantiations that is
* executed the number of times as provided by the passed-in argument.
*
* @param numberOfLoops Number of times to instantiate a redundant String.
* @return Results of this test.
*/
public TestResult testRedundantStrings(final int numberOfLoops)
{
final TestResult result = new TestResult(numberOfLoops, TestType.REDUNDANT);
result.startTimer();
for (int counter = 0; counter < numberOfLoops; counter++)
{
strings.add(new String("http://marxsoftware.blogspot.com/"));
}
result.stopTimer();
return result;
}

/**
* Run the examples based on provided command-line arguments.
*
* @param arguments Command-line arguments where the first argument should
* be an integer (not decimal) numeral and the second the type of test to run.
*/
public static void main(final String[] arguments)
{
final int numberArguments = arguments.length;
if (numberArguments < 2)
{
System.err.println("Please provide two command-line arguments:");
System.err.println("\tIntegral number of times to instantiate Strings");
System.err.println("\tType of test to run ('redundant' or 'single')");
System.exit(-2);
}

final int numberOfExecutions = Integer.parseInt(arguments[0]);
final String testChoice = arguments[1];
if (testChoice == null || testChoice.isEmpty())
{
System.err.println("The second argument must be a test choice.");
System.exit(-1);
}
final RedundantStringExample me = new RedundantStringExample();
TestResult testResult = null;
if (testChoice.equalsIgnoreCase("redundant"))
{
testResult = me.testRedundantStrings(numberOfExecutions);
}
else // testChoice is "single" or something unexpected
{
testResult = me.testSingleString(numberOfExecutions);
}
System.out.println(testResult);
}

/**
* Class used to pass test results back to caller.
*/
private static class TestResult
{
/** Number of milliseconds per second. */
private static final long MILLISECONDS_PER_SECOND = 1000;

/** Number of String instantiations. */
private int numberOfExecutions;

/** Type of test this result applies to. */
private TestType testType;

/** Test beginning time. */
private long startTime = -1L;

/** Test ending time. */
private long finishTime = -1L;

/**
* Constructor accepts arguments indicating the number of times the
* applicable test should be run and the type of that test.
*
* @param newNumberOfExecutions Times test whose result this is will be/was
* executed.
* @param newTestType Type of test executed for this result
*/
public TestResult(final int newNumberOfExecutions, final TestType newTestType)
{
numberOfExecutions = newNumberOfExecutions;
testType = newTestType;
}

/**
* Start timer.
*/
public void startTimer()
{
startTime = System.currentTimeMillis();
}

/**
* Stop timer.
*
* @throws IllegalStateException Thrown if this stopTimer() method is
* called and the corresponding startTimer() method was never called or
* if the calculated finish time is earlier than the start time.
*/
public void stopTimer()
{
if (startTime < 0 )
{
throw new IllegalStateException(
"Cannot stop timer because it was never started!");
}
finishTime = System.currentTimeMillis();
if (finishTime < startTime)
{
throw new IllegalStateException(
"Cannot have a stop time [" + finishTime + "] that is less than "
+ "the start time [" + startTime + "]");
}
}

/**
* Provide the number of milliseconds spent in execution of test.
*
* @return Number of milliseconds spent in execution of test.
* @throws IllegalStateException Thrown if the time spent is invalid
* due to the finish time being less than (earlier than) the start time.
*/
public long getMillisecondsSpent()
{
if (finishTime < startTime)
{
throw new IllegalStateException(
"The time spent is invalid because the finish time ["
+ finishTime + "] is earlier than the start time ["
+ startTime + "].");
}
return finishTime - startTime;
}

/**
* Provide the number of seconds spent in execution of test.
*
* @return Number of seconds spent in execution of test.
*/
public double getSecondsSpent()
{
// Cast to double first so fractional seconds are not lost to integer division.
return (double) getMillisecondsSpent() / MILLISECONDS_PER_SECOND;
}

/**
* Provide the number of executions run as part of this test.
*
* @return Number of executions of this test.
*/
public int getNumberOfExecution()
{
return numberOfExecutions;
}

/**
* Provide the type of this test.
*
* @return Type of this test.
*/
public TestType getTestType()
{
return testType;
}

/**
* Provide String representation of me.
*
* @return My String representation.
*/
@Override
public String toString()
{
final StringBuilder builder = new StringBuilder();
builder.append("TEST RESULTS:").append(NEW_LINE);
builder.append("Type of Test: ").append(testType).append(NEW_LINE);
builder.append("Number of Executions: ").append(numberOfExecutions).append(NEW_LINE);
builder.append("Elapsed Time (milliseconds): ").append(getMillisecondsSpent()).append(NEW_LINE);
builder.append("\t\tStart: ").append(startTime);
builder.append(" ; Stop: ").append(finishTime);
return builder.toString();
}
}

/** Enum representing type of Test. */
private static enum TestType
{
SINGLE,
REDUNDANT
}
}


For the very simple code example used in the tests above, I needed to run the tests with many loops to see truly dramatic differences. However, the performance difference was obvious. I ran each test several times and averaged the results. In general, when the loop counts were large enough for the difference to be clearly measurable, I found the approach using the extraneous String instantiation to take roughly four times as long to execute as the loops using the String literal directly without the extra "new."
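
The run-several-times-and-average approach described above could be sketched as follows. This is not the harness I actually used: the class AveragedTimingSketch and its methods are hypothetical, and it uses System.nanoTime() instead of the System.currentTimeMillis() calls in RedundantStringExample. It is also not a rigorous benchmark, because JIT warm-up and garbage collection can distort results; a dedicated harness such as JMH would be more trustworthy.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of averaging elapsed time over several runs.
 * The class and method names are illustrative assumptions.
 */
class AveragedTimingSketch
{
   /** Number of nanoseconds per millisecond. */
   private static final double NANOSECONDS_PER_MILLISECOND = 1000000.0;

   /**
    * Run the task the given number of times and average the elapsed time.
    *
    * @param task Work to be timed.
    * @param numberOfRuns Number of timed runs; must be at least one.
    * @return Average elapsed time per run in milliseconds.
    */
   static double averageMillis(final Runnable task, final int numberOfRuns)
   {
      if (numberOfRuns < 1)
      {
         throw new IllegalArgumentException("At least one run is required.");
      }
      long totalNanoseconds = 0L;
      for (int run = 0; run < numberOfRuns; run++)
      {
         final long start = System.nanoTime();
         task.run();
         totalNanoseconds += System.nanoTime() - start;
      }
      return (totalNanoseconds / (double) numberOfRuns) / NANOSECONDS_PER_MILLISECOND;
   }

   /** Fill a list with Strings, either redundantly constructed or literal. */
   static int fill(final int numberOfLoops, final boolean redundant)
   {
      final List<String> strings = new ArrayList<String>();
      for (int counter = 0; counter < numberOfLoops; counter++)
      {
         strings.add(redundant
            ? new String("http://marxsoftware.blogspot.com/")
            : "http://marxsoftware.blogspot.com/");
      }
      return strings.size();
   }

   public static void main(final String[] arguments)
   {
      final int numberOfLoops = 1000000;
      System.out.println("Literal (ms):   " + averageMillis(() -> fill(numberOfLoops, false), 5));
      System.out.println("Redundant (ms): " + averageMillis(() -> fill(numberOfLoops, true), 5));
   }
}
```

The lambda expressions require Java 8 or later. The absolute numbers will vary by machine and JVM, so only the relative difference between the two lines is of interest.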

Although I ran each test for each number of loops, I show just one representative sample run for a few key data points in the following screen capture. I mark the results of running tests with 1 million loops in yellow and running with 10 million loops in red.



There are many cases in which the extra String instantiation demonstrated above might not have any significant performance impact. However, there is no benefit to specifying an extra String instantiation and, in addition to the reduced performance, it has the drawback of extra code clutter.

Note that the examples above extend to similar String uses. Here is another slightly altered example.


// the way NOT to do it
String someString = new String("http://" + theDomain + ":" + thePort + "/serviceContext");

// better way to do this; don't need extra new instantiation
String someString = "http://" + theDomain + ":" + thePort + "/serviceContext";

// NOTE: If you start using loops to assemble long Strings similar to those
// shown above, performance needs will likely dictate use of StringBuilder
// or StringBuffer instead. See
// http://marxsoftware.blogspot.com/2008/05/string-stringbuffer-and-stringbuilder.html
// for additional details.


Finally, as a reminder for anyone new to Java and Java Strings, if you find yourself assembling a large String from a large number of pieces, you will typically be better off using a StringBuilder or StringBuffer instead of a String. The root cause for this again has to do with too many String instantiations.
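
The sketch below contrasts the two approaches when a String is assembled across many loop iterations. The class StringAssemblySketch and its method names are hypothetical and used only for illustration. Note that this concern applies to concatenation spread over a loop or multiple statements; a single-statement concatenation like the someString example above is already handled efficiently by the compiler.

```java
/**
 * Sketch contrasting String concatenation in a loop with StringBuilder.
 * The class and method names are illustrative assumptions.
 */
class StringAssemblySketch
{
   /** Assemble pieces with +=, producing an intermediate String per iteration. */
   static String assembleWithConcatenation(final String[] pieces)
   {
      String result = "";
      for (final String piece : pieces)
      {
         // Each += produces a new String holding everything accumulated so far.
         result += piece;
      }
      return result;
   }

   /** Assemble pieces with a single StringBuilder and one final String. */
   static String assembleWithBuilder(final String[] pieces)
   {
      final StringBuilder builder = new StringBuilder();
      for (final String piece : pieces)
      {
         builder.append(piece);
      }
      return builder.toString();
   }

   public static void main(final String[] arguments)
   {
      final String[] pieces = {"http://", "marxsoftware", ".blogspot", ".com", "/"};
      System.out.println(assembleWithConcatenation(pieces));
      System.out.println(assembleWithBuilder(pieces));
   }
}
```

Both methods return the same text; the difference is only in how many intermediate String objects are created along the way. StringBuilder is generally preferred over StringBuffer unless the builder is shared between threads, because StringBuffer's methods are synchronized.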

The Java String's behavior can seem a little strange until one gets used to it and even then it still might seem a little strange. The main point to remember related to this blog posting is that String literals are full-fledged String objects and so do not require the String constructor to be explicitly invoked.


Additional Resources

Java Tutorial: Strings

String Constructor Considered Useless Turns Out to be Useful After All

Use of the String(String) Constructor in Java

What is the Purpose of the Expression "new String(...)" in Java?

Java String@Everything2.com

4 comments:

Mathias Ricken said...

Generally this is true, there are exceptions, however. Strings share memory, and in some cases, using the constructor can improve garbage collection.

For more information, take a look at this post:

http://kjetilod.blogspot.com/2008/09/string-constructor-considered-useless.html

Joe Enos said...

Maybe it's just me, but I would never have imagined anyone would write:
String s = new String("myValue");
instead of
String s = "myValue";

I suppose it could just depend on your programming background...

Scott Vachalek said...

"if you find yourself assembling a large String from a large number of pieces, you will typically be better off using a StringBuilder or StringBuffer instead of a String."

The statement "S = S1 + S2 + S3;" is no more efficient if you convert it to a StringBuilder (which is how it's compiled anyway) but it will be harder to read. StringBuilders are for loops and other cases of string catenation across multiple *statements* and not just multiple pieces.

Unknown said...

Normally, the VM will optimize a S1 + S2 + S3 into a StringBuilder-like structure, as it can perfectly allocate enough size for the StringBuilder (S1.length() + S2.length() + S3.length()). It's a lot more of an issue if you're using String concatenation in a loop. In that case you should ALWAYS use a StringBuilder or StringBuffer. However, stating that you should always use StringBuilder or StringBuffer when handling a known number of Strings is incorrect.