If you code in Java, you have inevitably used the String.split() and String.replace*() methods.
And why wouldn’t you? They are much more convenient than using the Java regular expressions API, where you need to create a Pattern object, and possibly a Matcher, and then call methods on those.
However, all convenience comes at a price!
The Evil Inside
In this case, the String.split() and String.replace*() methods (with the sole exception of String.replace(char, char)) internally use the regular expression APIs themselves, which can result in performance issues for your application.
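Here is the gist of String.split(): according to its Javadoc, str.split(regex, limit) yields the same result as Pattern.compile(regex).split(str, limit).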
Notice that each call to String.split() creates and compiles a new Pattern object. The same is true for the String.replace*() methods. Compiling a pattern on every call can cause performance issues in your program if you call the split() or replace*() methods in a tight loop.
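To make the delegation concrete, here is a minimal sketch of the equivalences that the Javadoc for String.split(regex, limit) and String.replaceAll(regex, replacement) describes. The class name, helper names, and sample input are hypothetical, and this is not the JDK source itself, only code that should behave the same way:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitEquivalence {
    // Documented equivalent of input.split(regex, limit):
    // a fresh Pattern is compiled on every call.
    static String[] splitViaPattern(String input, String regex, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    // Documented equivalent of input.replaceAll(regex, replacement):
    // again, one compilation per call.
    static String replaceAllViaPattern(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String input = "a1b22c333d"; // hypothetical sample input
        System.out.println(Arrays.equals(
                input.split("\\d+", 0), splitViaPattern(input, "\\d+", 0)));
        System.out.println(
                input.replaceAll("\\d+", "-").equals(replaceAllViaPattern(input, "\\d+", "-")));
    }
}
```

Both comparisons should hold, which is the point: the convenience methods do the same work as the explicit Pattern calls, just without giving you a chance to reuse the compiled pattern.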
I tried a very simple test case to see how much the performance is affected.
The first case used String.split() a million times.
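A loop of that kind might look roughly like the sketch below. The class name, the sample input string, and the timing approach are assumptions made for illustration, not the exact test that produced the numbers in this post:

```java
class StringSplitLoop {
    // Calls String.split() with a regex string on every iteration
    // and sums the token counts so the work cannot be optimized away.
    static long countTokens(String input, int iterations) {
        long tokens = 0;
        for (int i = 0; i < iterations; i++) {
            tokens += input.split(" ").length;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "the quick brown fox"; // hypothetical sample input
        long start = System.nanoTime();
        long tokens = countTokens(input, 1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(tokens + " tokens in " + elapsedMs + " ms");
    }
}
```

Timings from a naive loop like this depend heavily on the JDK version and JIT warm-up. Some JDK versions also short-circuit trivial single-character separators, so treat any absolute numbers as indicative only.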
In the second case, I just changed the loop to use a precompiled Pattern object.
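As a sketch of that variant, the pattern can be compiled once into a constant and reused inside the loop. The class name, the SPACE constant, and the sample input are again hypothetical:

```java
import java.util.regex.Pattern;

class PrecompiledSplitLoop {
    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern SPACE = Pattern.compile(" ");

    // Reuses the precompiled Pattern on every iteration instead of
    // passing the regex string to String.split().
    static long countTokens(String input, int iterations) {
        long tokens = 0;
        for (int i = 0; i < iterations; i++) {
            tokens += SPACE.split(input).length;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "the quick brown fox"; // hypothetical sample input
        long start = System.nanoTime();
        long tokens = countTokens(input, 1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(tokens + " tokens in " + elapsedMs + " ms");
    }
}
```

The only change is where compilation happens: once when the class is loaded, rather than on each pass through the loop.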
Here are the average results of 6 test runs:
Time taken with String.split(): 1600 ms
Time taken with precompiled Pattern object: 1195 ms
Note that I used an extremely simple regular expression here, consisting of just a single space character, and the String.split() version still took about 34% longer (put the other way, precompiling the pattern cut the run time by roughly 25%).
A longer, more complex expression would take longer to compile and thus make the loop containing the split() method even slower compared to its counterpart.
Lesson learned: It is good to know the internals of the APIs you use. Sometimes the convenience comes at the price of a hidden evil which may come to bite you when you are not looking.