Sunday, October 7, 2012

Performance of string comparison in Java

Have you ever wondered how fast Java can compare two strings? Which method is the best? The answer is: it depends on what you are comparing. Let's run some tests. We will compare interned strings and dynamically created strings using both the '==' operator and the equals() method. Here is the code that does the checking:
String s1 = "test";
String s2 = "test";
String s3 = s1 + "";
String s4 = s3 + "";
String s5 = s4;
long iterations = 100000000;
long time = System.currentTimeMillis();

for (int i = 0; i < iterations; i++) {
    if (s1 == s2);
}
System.out.println("s1==s2: " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s1 == s3.intern());
}
System.out.println("s1==s3.intern(): " + (System.currentTimeMillis() - time));

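// Note: time is not reset before this loop, so the value printed below also includes the previous loop.
for (int i = 0; i < iterations; i++) {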
    if (s1.equals(s3.intern()));
}
System.out.println("s1.equals(s3.intern()): " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s1.equals(s2));
}
System.out.println("s1.equals(s2): " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s1.equals(s3));
}
System.out.println("s1.equals(s3): " + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s3.equals(s1));
}
System.out.println("s3.equals(s1): " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s3.equals(s4));
}
System.out.println("s3.equals(s4): " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s3.intern() == s4.intern());
}
System.out.println("s3.intern()==s4.intern(): " + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s3.intern().equals(s4.intern()));

}
System.out.println("s3.intern().equals(s4.intern()): " + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
    if (s5 == s4);
}
System.out.println("s5==s4: " + (System.currentTimeMillis() - time));
And the results, in milliseconds, are:
 1. s1==s2: 95
 2. s1==s3.intern(): 12925
 3. s1.equals(s3.intern()): 25778
 4. s1.equals(s2): 88
 5. s1.equals(s3): 90
 6. s3.equals(s1): 89
 7. s3.equals(s4): 314
 8. s3.intern()==s4.intern(): 26148
 9. s3.intern().equals(s4.intern()): 26083
10. s5==s4: 310

So, as you can see, there is no big difference between comparing strings by reference and with the equals() method (tests 1 and 4 for interned strings, 7 and 10 for dynamically created ones). Interestingly, comparing two dynamically created strings takes nearly 3.5 times longer than when at least one of the arguments is an interned string. As you can also see, calling intern() is really time-consuming. Comparing the second and third cases is interesting too: why does the time roughly double when equals() is used? The listing gives the answer: time is not reset before the third loop, so the third figure also includes the second loop. Subtracting the two (25778 - 12925 = 12853 ms) shows that equals() costs about the same as '==' here, and that the time is dominated by intern().
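To make it clearer which of the tested pairs actually share one object, here is a minimal sketch. The class name StringIdentityDemo and the helper sameReference are made up for this illustration and are not part of the benchmark above. It only prints the outcome of each comparison and measures nothing.

```java
public class StringIdentityDemo {

    // Reference comparison: true only when both variables point to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String s1 = "test";
        String s2 = "test";   // same literal, so the same pooled object as s1
        String s3 = s1 + "";  // s1 is not a compile-time constant, so a new object is built at run time
        String s4 = s3 + "";  // another new object
        String s5 = s4;       // just a second reference to s4

        System.out.println("s1 == s2: " + sameReference(s1, s2));                   // true
        System.out.println("s1 == s3: " + sameReference(s1, s3));                   // false
        System.out.println("s1 == s3.intern(): " + sameReference(s1, s3.intern())); // true
        System.out.println("s3 == s4: " + sameReference(s3, s4));                   // false
        System.out.println("s3.intern() == s4.intern(): "
                + sameReference(s3.intern(), s4.intern()));                         // true
        System.out.println("s5 == s4: " + sameReference(s5, s4));                   // true
        System.out.println("s3.equals(s4): " + s3.equals(s4));                      // true
    }
}
```

So s1, s2 and every intern() result are one and the same object, while s3 and s4 are separate objects with equal content. In OpenJDK, String.equals() returns true straight away when both references point to the same object, which may be part of the reason tests 1 and 4 are so close.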

Now, let's check how the results change when we use a longer string. The tests are the same, but now the s1 and s2 strings are 248 characters long.

 1. s1==s2: 135
 2. s1==s3.intern(): 187042
 3. s1.equals(s3.intern()): 374946
 4. s1.equals(s2): 90
 5. s1.equals(s3): 89
 6. s3.equals(s1): 89
 7. s3.equals(s4): 316
 8. s3.intern()==s4.intern(): 373573
 9. s3.intern().equals(s4.intern()): 374662
10. s5==s4: 311

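The results are quite surprising. In this test the time taken by equals() doesn't depend on the length of its arguments! The intern() method, however, needs much more time for the longer strings (about 14 times more in test 2). Keep in mind that the loops discard the result of each comparison, so the JIT compiler may optimise part of the work away. The equals() figures should therefore be treated with some caution.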
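The loops in the listing throw away the result of every comparison, which gives the JIT compiler room to skip work. Below is a minimal sketch of a loop that actually uses the result. The class name ConsumedResultBenchmark and the method countEqual are hypothetical and are not the code that produced the numbers above. Even this version is only a rough check, since the compiler may still hoist the loop-invariant comparison out of the loop. For rigorous measurements a dedicated harness such as JMH is the usual choice.

```java
public class ConsumedResultBenchmark {

    // Calls a.equals(b) on every iteration and counts the matches,
    // so the outcome of the comparison is used and returned to the caller.
    static long countEqual(String a, String b, long iterations) {
        long matches = 0;
        for (long i = 0; i < iterations; i++) {
            if (a.equals(b)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String s1 = "test";
        String s3 = s1 + "";            // dynamically created copy, as in the listing above
        long iterations = 100_000_000L;

        long start = System.nanoTime();
        long matches = countEqual(s1, s3, iterations);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Printing the match count keeps the result observable.
        System.out.println("s1.equals(s3): " + elapsedMs + " ms (" + matches + " matches)");
    }
}
```

Running it once with the short strings and once with the 248-character ones is a simple way to see whether the length of the arguments really makes no difference on your JVM.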

Saturday, October 6, 2012

SvcUtil generates incorrect code for multidimensional arrays

There may be situations when you need to create a .NET client for some web service. The easiest way is to get the WSDL file and generate the client with the SvcUtil.exe tool. In most cases everything goes fine and you can quickly finish your work. But there is a special case in which SvcUtil and its predecessor, wsdl.exe, will fail.

Let's assume that the schema contains the following element:
<xs:element name="Description" minOccurs="0" maxOccurs="unbounded">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="Text" minOccurs="0" maxOccurs="unbounded" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>
SvcUtil tries to be smart and generates:
private string[][] text;

[System.Xml.Serialization.XmlArrayAttribute(Order = 1)]
[System.Xml.Serialization.XmlArrayItemAttribute("Text", typeof(string), IsNullable = false)]
public string[][] Description {
    get {
        return this.text;
    }
    set {
        this.text = value;
    }
}
OK, nearly great. But when you try to use this code you get a cast exception, because a string cannot be cast to string[]. The typeof parameter of XmlArrayItemAttribute is wrong! The property is an array of arrays, so the correct item type is string[].

A quick change is enough to make the code work properly. But if you are the author of the service and want to spare your users this step, there is one simple solution: move the occurrence constraints from the Text element to the sequence node, that is:
<xs:element name="Description" minOccurs="0" maxOccurs="unbounded">
    <xs:complexType>
        <xs:sequence minOccurs="0" maxOccurs="unbounded">
            <xs:element name="Text" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
</xs:element>
After this, SvcUtil will generate a separate class for the Description element and everything will work out of the box.