How Not To Remove Multiple Spaces From Strings

by bob on July 26, 2008

I encounter this code very often when the intent is to remove multiple spaces from strings and reduce them to single spaces:
someString = someString.Replace("  "," ");
No, no, no. This makes a single pass over the string: each pair of spaces it finds is replaced with one, but the result is never rescanned, so any run of three or more spaces still leaves multiple spaces behind. In other words, consider the following string, where spaces are shown as periods for clarity:

FOO..BAR..FOO

The above code will reduce this to FOO.BAR.FOO as expected. But suppose we have this:

FOO...BAR...FOO

Now we get FOO..BAR..FOO. Uh-oh.
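
To see the single-pass behavior for yourself, here is a minimal sketch. The class and method names (SinglePassDemo, SinglePass, Make) are made up for illustration, and the inputs are built with new string(' ', n) so the number of spaces is unambiguous:

```csharp
using System;

public static class SinglePassDemo
{
    // The problematic one-liner from above, wrapped in a method.
    public static string SinglePass(string s) => s.Replace("  ", " ");

    // Builds "FOO<n spaces>BAR" so the space count is explicit.
    public static string Make(int spaces) => "FOO" + new string(' ', spaces) + "BAR";

    public static void Main()
    {
        for (int n = 2; n <= 4; n++)
        {
            // Show spaces as periods, as in the examples above.
            string result = SinglePass(Make(n)).Replace(' ', '.');
            Console.WriteLine(n + " spaces in: " + result);
        }
        // 2 spaces in: FOO.BAR
        // 3 spaces in: FOO..BAR
        // 4 spaces in: FOO..BAR
    }
}
```

Only the two-space case comes out clean; three and four spaces both leave a double space behind.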

A simple, foolproof way is to keep replacing until no double spaces remain:
while (someString.IndexOf("  ",StringComparison.Ordinal) > -1) {
  someString = someString.Replace("  "," ");
}

The StringComparison argument is optional and not relevant to this discussion, but I throw it in there to remind you that if you’re not dealing with culturally sensitive strings, Ordinal or OrdinalIgnoreCase comparisons are faster. And since you’re only searching for spaces, an Ordinal comparison makes sense in this case, even if the rest of the string may be culturally sensitive.
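
If you need this in more than one place, the loop above can be wrapped in a small helper. This is only a sketch: the names SpaceCollapser and CollapseSpaces are hypothetical, and it assumes the input is not null.

```csharp
using System;

public static class SpaceCollapser
{
    // Repeats the two-spaces-to-one replacement until no double space is left.
    // Assumes s is not null.
    public static string CollapseSpaces(string s)
    {
        while (s.IndexOf("  ", StringComparison.Ordinal) > -1)
        {
            s = s.Replace("  ", " ");
        }
        return s;
    }

    public static void Main()
    {
        // Five spaces, then three spaces.
        string input = "FOO" + new string(' ', 5) + "BAR" + new string(' ', 3) + "FOO";

        // Show spaces as periods, as in the examples above.
        Console.WriteLine(CollapseSpaces(input).Replace(' ', '.'));  // FOO.BAR.FOO
    }
}
```

Each pass roughly halves the length of a run, so even long runs of spaces need only a handful of iterations.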


Rory Fitzpatrick July 27, 2008 at 5:21 am

A useful tip, thanks for sharing. But would the following regex not work as well, as it can match 2 or more spaces?
[\s]{2,}

I guess from a performance perspective a simple string comparison and replace might be more effective.
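
For reference, here is roughly what the regex approach could look like in C#, using the static Regex.Replace method. The class and method names are invented for this sketch. Note that \s matches any whitespace (tabs and newlines included), so the pattern above does more than collapse spaces; a literal space followed by {2,} restricts it to spaces only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCollapseDemo
{
    // Collapses runs of two or more space characters only.
    public static string CollapseSpaces(string s) =>
        Regex.Replace(s, " {2,}", " ");

    // The pattern from the comment: runs of two or more whitespace
    // characters of any kind (spaces, tabs, newlines) become one space.
    public static string CollapseWhitespace(string s) =>
        Regex.Replace(s, @"[\s]{2,}", " ");

    public static void Main()
    {
        string spaces = "FOO" + new string(' ', 3) + "BAR";
        Console.WriteLine(CollapseSpaces(spaces).Replace(' ', '.'));      // FOO.BAR

        string mixed = "FOO \t\nBAR";
        Console.WriteLine(CollapseWhitespace(mixed).Replace(' ', '.'));   // FOO.BAR

        // The spaces-only pattern leaves a pair of tabs alone.
        Console.WriteLine(CollapseSpaces("A\t\tB") == "A\t\tB");          // True
    }
}
```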

Bob replies: I use regex all the time, but tend to avoid regex when a simple comparison will do. Not only for performance reasons, but for clarity. But the main motivation for this post was running across this logic for the umpteenth time. I know the intent, but it’s one of those sneaky things that, depending on the data being fed to it, is apt to work most of the time and then occasionally fail. That is the worst class of bugs.
