Regular expression to remove unwanted characters from the String


I have a requirement to remove unwanted characters from a String in Java. For example, the input String is

Income ......................4,456
liability........................56,445.99

I want the output as

Income 4,456
liability 56,445.99

What is the best way to write this in Java? I am parsing large documents, so it should be optimized for performance.


There are 3 answers

German (BEST ANSWER)

You can do this replacement with a single line of code:

System.out.println("asdfadf ..........34,4234.34".replaceAll("[ ]*\\.{2,}"," "));
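
Since the question mentions large documents, here is a minimal sketch of the same replacement with the pattern compiled once and reused. String.replaceAll compiles its regex on every call, so hoisting the Pattern into a constant avoids that repeated work. The class name DotLeaderCleaner and the method clean are hypothetical names chosen for illustration, and the character class [ ] is written as a plain space.

```java
import java.util.regex.Pattern;

class DotLeaderCleaner {
    // Compiled once and reused for every line (hypothetical helper, not from the answer).
    // Matches optional spaces followed by a run of two or more dots.
    private static final Pattern DOT_LEADER = Pattern.compile(" *\\.{2,}");

    // Replaces each run of leader dots (and any spaces before it) with one space.
    static String clean(String line) {
        return DOT_LEADER.matcher(line).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(clean("Income ......................4,456"));
        System.out.println(clean("liability........................56,445.99"));
    }
}
```

Because the pattern requires at least two consecutive dots, the single decimal point in 56,445.99 is left alone.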

Fady Saad

One way to do that is:

String result = yourString.replaceAll("[-+.^:,]","");

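That replaces every occurrence of the characters -, +, ., ^, : and , with nothing. Note that it also removes the comma and the decimal point inside the numbers, so 56,445.99 becomes 5644599.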

Tim Biegeleisen

For this particular example, I might use the following replacement:

String input = "Income ......................4,456";
input = input.replaceAll("(\\w+)\\s*\\.+(.*)", "$1 $2");
System.out.println(input);

Here is an explanation of the pattern being used:

(\\w+)   match AND capture one or more word characters
\\s*     match zero or more whitespace characters
\\.+     match one or more literal dots
(.*)     match AND capture the rest of the line

The two parenthesized subpatterns are known as capture groups. The regex engine remembers what each one matched and makes the results available, in order, as $1 and $2 for use in the replacement string.

Output:

Income 4,456
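
To make the capture groups more concrete, here is a small sketch that reads them through Matcher.group instead of $1 and $2. The class name CaptureGroupDemo and the method rewrite are hypothetical. One difference from the replaceAll call above is an assumption of this sketch: it uses matches(), which requires the whole line to fit the pattern, and returns any other line unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {
    // Same pattern as in the answer, compiled once (hypothetical helper class).
    private static final Pattern LINE = Pattern.compile("(\\w+)\\s*\\.+(.*)");

    static String rewrite(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return line; // whole line does not fit the pattern: leave it as-is
        }
        // group(1) is the label, group(2) is everything after the dots
        return m.group(1) + " " + m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("Income ......................4,456"));
        System.out.println(rewrite("liability........................56,445.99"));
    }
}
```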
