How to use Regular Expression in Java – Tutorial and Importance

What is Regular Expression in Java

A Regular Expression or regEx in Java is a special string of characters that helps you match or find other strings or sets of strings by using a special syntax held in a pattern.

They can be used to find text and data, change it, or do other things with it.

Does Java have Regular Expressions?

Java has no dedicated syntax for regular expressions, but its standard library supports them through the classes in the java.util.regex package, which we can import.

Two Types of Regular Expression

In programming generally, two types of regular expression syntax are commonly distinguished, following the POSIX standard:

  • Basic regular expression
  • Extended regular expression

What are the Applications of Regular Expressions in Java?

The Regular Expressions In Java can be used for all kinds of text search and text replace operations.

Importance of Regular Expression in Java

Regular expressions in Java are an important tool for finding patterns in text and for validating input such as form fields.

The rest of this tutorial looks at their properties and how they can be used, with some basic examples to make the topic easier to understand.

The java.util.regex package in Java lets you use regular expressions to match patterns.

The regular expression syntax in Java is fairly easy to learn and is very similar to the one used in the Perl programming language.

Perl is an open-source, cross-platform programming language that is used a lot in both business and personal computing.

In the late 20th century and early 21st century, Perl was a favorite among Web developers because it could process text in many different ways and solve problems in many different ways.

Description Of The java.util.regex Package

The java.util.regex package provides classes that use regular expressions to match character sequences against patterns.

The java.util.regex package is primarily made up of the following three classes listed in the table below:

Class                    Description
Pattern                  A Pattern object is a compiled representation of a regular expression. The Pattern class has no public constructors. To create a pattern, you call one of its public static compile() methods, which return a Pattern object. The first argument to these methods is a regular expression.
Matcher                  A Matcher object is the engine that interprets the pattern and performs match operations on an input string. Like the Pattern class, Matcher has no public constructors. You obtain a Matcher object by calling the matcher() method on a Pattern object.
PatternSyntaxException   A PatternSyntaxException is an unchecked exception that is thrown when a regular expression pattern has a syntax error.
The three main classes of java.util.regex
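
As a rough sketch of how Pattern and Matcher work together, the snippet below compiles a regex once and then walks through every match in an input string. The class name RegexWorkflowDemo, the helper allMatches, and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWorkflowDemo {

   // Collects every substring of input that matches regex, in order.
   // (Hypothetical helper, not part of java.util.regex.)
   static List<String> allMatches(String regex, String input) {
      Pattern p = Pattern.compile(regex);   // compile the regex once
      Matcher m = p.matcher(input);         // bind it to one input string
      List<String> found = new ArrayList<>();
      while (m.find()) {                    // advance to the next match
         found.add(m.group());              // text of the current match
      }
      return found;
   }

   public static void main(String[] args) {
      // prints [12, 3, 407]
      System.out.println(allMatches("\\d+", "Room 12, floor 3, desk 407"));
   }
}
```

Note that the backslash in \d has to be doubled inside a Java string literal.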

Capturing Groups In Java

The Capturing Groups In Java are a way to treat several characters as a single unit.

They are made by putting the characters to be grouped inside a set of parentheses.

For example, the regular expression (dog) creates a single group with the letters “d”, “o”, and “g”.

Capturing Groups In Java are numbered by counting the opening parentheses from left to right.

In the expression ((A)(B(C))), for example, there are four such groups, numbered in this order:

  • Group 1: ((A)(B(C)))
  • Group 2: (A)
  • Group 3: (B(C))
  • Group 4: (C)
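
The numbering can be checked with a small sketch like the one below, which matches the expression against the input "ABC" and prints each group. The class GroupNumberingDemo and the helper groupsOf are illustrative names only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {

   // Returns groups 0..groupCount() when regex matches the whole input,
   // or an empty array when it does not. (Hypothetical helper.)
   static String[] groupsOf(String regex, String input) {
      Matcher m = Pattern.compile(regex).matcher(input);
      if (!m.matches()) {
         return new String[0];
      }
      String[] groups = new String[m.groupCount() + 1];
      for (int i = 0; i < groups.length; i++) {
         groups[i] = m.group(i);
      }
      return groups;
   }

   public static void main(String[] args) {
      String[] groups = groupsOf("((A)(B(C)))", "ABC");
      for (int i = 0; i < groups.length; i++) {
         System.out.println("group " + i + ": " + groups[i]);
      }
      // group 0: ABC
      // group 1: ABC
      // group 2: A
      // group 3: BC
      // group 4: C
   }
}
```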

What is an Example Of Regular Expression in Java

The table below shows some examples of Regular Expressions In Java.

Regex                 Matches any string that
hello                 contains {hello}
gray|grey             contains {gray, grey}
gr(a|e)y              contains {gray, grey}
gr[ae]y               contains {gray, grey}
b[aeiou]bble          contains {babble, bebble, bibble, bobble, bubble}
[b-chm-pP]at|ot       contains {bat, cat, hat, mat, nat, oat, pat, Pat, ot}
colou?r               contains {color, colour}
rege(x(es)?|xps?)     contains {regex, regexes, regexp, regexps}
go*gle                contains {ggle, gogle, google, gooogle, goooogle, …}
go+gle                contains {gogle, google, gooogle, goooogle, …}
g(oog)+le             contains {google, googoogle, googoogoogle, googoogoogoogle, …}
z{3}                  contains {zzz}
z{3,6}                contains {zzz, zzzz, zzzzz, zzzzzz}
z{3,}                 contains {zzz, zzzz, zzzzz, …}
[Bb]rainf\*\*k        contains {Brainf**k, brainf**k}
\d                    contains {0,1,2,3,4,5,6,7,8,9}
\d{5}(-\d{4})?        contains a United States zip code
1\d{10}               contains an 11-digit string starting with a 1
[2-9]|[12]\d|3[0-6]   contains an integer in the range 2..36 inclusive
Hello\nworld          contains Hello followed by a newline followed by world
mi.....ft             contains a nine-character (sub)string beginning with mi and ending with ft (Note: in Java the dot matches any character except a line terminator, unless the DOTALL flag is set.) Each dot is allowed to match a different character, so both microsoft and minecraft will match.
\d+(\.\d\d)?          contains a non-negative integer, optionally followed by a decimal point and exactly two digits.
[^i*&2@]              contains any character other than an i, asterisk, ampersand, 2, or at-sign.
//[^\r\n]*[\r\n]      contains a Java or C# slash-slash comment
^dog                  begins with “dog”
dog$                  ends with “dog”
^dog$                 is exactly “dog”
Examples Of Regular Expressions In Java
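
A few of these rows can be tried out with a sketch like the following. It assumes "contains" corresponds to Matcher.find() and "is exactly" to Pattern.matches(); the class RegexTableDemo and its two helpers are hypothetical names, and the sample inputs are invented.

```java
import java.util.regex.Pattern;

class RegexTableDemo {

   // True if some substring of input matches regex ("contains").
   static boolean contains(String regex, String input) {
      return Pattern.compile(regex).matcher(input).find();
   }

   // True only if the whole input matches regex ("is exactly").
   static boolean isExactly(String regex, String input) {
      return Pattern.matches(regex, input);
   }

   public static void main(String[] args) {
      System.out.println(contains("colou?r", "my favourite colour"));          // true
      System.out.println(contains("gr[ae]y", "a grey sky"));                   // true
      System.out.println(contains("\\d{5}(-\\d{4})?", "Send to 90210-1234"));  // true
      System.out.println(contains("z{3,6}", "zzzzzzz"));                       // true
      System.out.println(isExactly("z{3,6}", "zzzzzzz"));                      // false (7 z's)
      System.out.println(contains("dog$", "hotdog"));                          // true
      System.out.println(contains("^dog$", "hotdog"));                         // false
   }
}
```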

How Do You Find Out How Many Groups Are in an Expression in Java?

To find out how many groups are in the expression, call the groupCount method on a matcher object.

The groupCount method returns an int that shows how many capturing groups are in the matcher’s pattern.

There is also a special group called group 0 that always represents the entire expression.

This group isn’t counted in the total given by groupCount.
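
A minimal sketch of groupCount in use is shown below. The class GroupCountDemo and the helper countGroups are illustrative; the matcher is created on an empty string only because groupCount depends on the pattern, not on the input.

```java
import java.util.regex.Pattern;

class GroupCountDemo {

   // Number of capturing groups in regex; group 0 is not included.
   static int countGroups(String regex) {
      return Pattern.compile(regex).matcher("").groupCount();
   }

   public static void main(String[] args) {
      System.out.println(countGroups("((A)(B(C)))"));   // 4
      System.out.println(countGroups("(?:a|b)+(c)"));   // 1, (?:...) does not capture
      System.out.println(countGroups("dog"));           // 0
   }
}
```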

Example Program in Java for Capturing Groups

The following example shows how to find a digit string in a given alphanumeric string.

// program created by Glenn 
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   public static void main( String args[] ) {
      // String to be scanned to find the pattern.
      String line = "This order was placed for PIES5G13000! OK?";
      String pattern = "(.*)(\\d+)(.*)";

      // Create a Pattern object
      Pattern r = Pattern.compile(pattern);

      // Now create matcher object.
      Matcher m = r.matcher(line);
      if (m.find( )) {
         System.out.println("Found value: " + m.group(0) );
         System.out.println("Found value: " + m.group(1) );
         System.out.println("Found value: " + m.group(2) );
      }else {
         System.out.println("NO MATCH");
      }
   }
}

Output:

Found value: This order was placed for PIES5G13000! OK?
Found value: This order was placed for PIES5G1300
Found value: 0
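
Group 2 holds only the final 0 because the first (.*) is greedy: it takes as much of the line as it can and leaves a single digit for (\d+). As a sketch of the difference, the snippet below compares that pattern with a variant whose first group is reluctant, (.*?), which stops at the first digit instead. The class GreedyLazyDemo and the helper digitsGroup are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyLazyDemo {

   // Returns the text captured by group 2, or null if nothing matches.
   static String digitsGroup(String regex, String input) {
      Matcher m = Pattern.compile(regex).matcher(input);
      return m.find() ? m.group(2) : null;
   }

   public static void main(String[] args) {
      String line = "This order was placed for PIES5G13000! OK?";
      System.out.println(digitsGroup("(.*)(\\d+)(.*)", line));    // 0 (greedy first group)
      System.out.println(digitsGroup("(.*?)(\\d+)(.*)", line));   // 5 (reluctant first group)
   }
}
```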

To try the Java code in this lesson, compile and run it in your own code editor or IDE.

If you would rather run the code online, we also have a free online Java compiler.

You can test the above example here! ➡Java Online Compiler 

Regular Expression Syntax In Java

The table below lists the regular expression syntax available in Java.

Subexpression   Matches
^               Matches the beginning of the input, or the beginning of each line when the MULTILINE flag is set.
$               Matches the end of the input, or the end of each line when the MULTILINE flag is set.
.               Matches any single character except a line terminator. With the DOTALL flag, (?s), it matches line terminators as well.
[…]             Matches any single character in brackets.
[^…]            Matches any single character not in brackets.
\A              Matches the beginning of the entire input.
\z              Matches the end of the entire input.
\Z              Matches the end of the input, or just before a final line terminator if there is one.
re*             Matches 0 or more occurrences of the preceding expression.
re+             Matches 1 or more occurrences of the preceding expression.
re?             Matches 0 or 1 occurrence of the preceding expression.
re{n}           Matches exactly n occurrences of the preceding expression.
re{n,}          Matches n or more occurrences of the preceding expression.
re{n,m}         Matches at least n and at most m occurrences of the preceding expression.
a|b             Matches either a or b.
(re)            Groups regular expressions and remembers the matched text.
(?:re)          Groups regular expressions without remembering the matched text.
(?>re)          Matches the independent pattern without backtracking.
\w              Matches a word character.
\W              Matches a non-word character.
\s              Matches a whitespace character. Equivalent to [ \t\n\x0B\f\r].
\S              Matches a non-whitespace character.
\d              Matches a digit. Equivalent to [0-9].
\D              Matches a non-digit.
\G              Matches the point where the last match finished.
\n              Back-reference to capture group number n, where n is a digit (for example \1).
\b              Matches a word boundary.
\B              Matches a non-word boundary.
\n, \t, etc.    Matches newlines, carriage returns, tabs, etc.
\Q              Escape (quote) all characters up to \E.
\E              Ends quoting begun with \Q.
Regular Expression Syntax In Java
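
Some of these constructs depend on flags passed to Pattern.compile. The sketch below tries the dot with and without Pattern.DOTALL, the ^ anchor with and without Pattern.MULTILINE, a \1 back-reference, and \Q...\E quoting. The class SyntaxFlagsDemo, the helper found, and the sample strings are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class SyntaxFlagsDemo {

   // True if regex, compiled with the given flags, is found anywhere in input.
   static boolean found(String regex, int flags, String input) {
      return Pattern.compile(regex, flags).matcher(input).find();
   }

   public static void main(String[] args) {
      String twoLines = "first\nsecond";

      // '.' skips line terminators unless DOTALL is set.
      System.out.println(found("first.second", 0, twoLines));                // false
      System.out.println(found("first.second", Pattern.DOTALL, twoLines));   // true

      // '^' anchors to the start of the input unless MULTILINE is set.
      System.out.println(found("^second", 0, twoLines));                     // false
      System.out.println(found("^second", Pattern.MULTILINE, twoLines));     // true

      // \1 refers back to whatever group 1 captured (a repeated word here).
      System.out.println(found("\\b(\\w+) \\1\\b", 0, "it is is fine"));     // true

      // \Q...\E makes '+' literal; unquoted, '+' is a quantifier.
      System.out.println(found("\\Q1+1\\E", 0, "1+1=2"));                    // true
      System.out.println(found("1+1", 0, "1+1=2"));                          // false
   }
}
```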

Methods of The Matcher Class in Java

The following are useful instance methods of Matcher Class In Java.

Index Methods in Java

The Index Methods In Java give useful index values that show where in the input string the match was found.

The table below lists the index methods.

#   Method                        Description
1   public int start()            Returns the start index of the previous match.
2   public int start(int group)   Returns the start index of the subsequence captured by the given group during the previous match operation.
3   public int end()              Returns the offset after the last character matched.
4   public int end(int group)     Returns the offset after the last character of the subsequence captured by the given group during the previous match operation.
Index Methods In Java
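
As a small sketch of the group-based variants, the snippet below reports where each group of a match starts and ends. The class GroupIndexDemo, the helper span, the regex, and the sample input are all invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexDemo {

   // Returns {start, end} of the given group in the first match, or null
   // when the regex is not found in the input.
   static int[] span(String regex, String input, int group) {
      Matcher m = Pattern.compile(regex).matcher(input);
      if (!m.find()) {
         return null;
      }
      return new int[] { m.start(group), m.end(group) };
   }

   public static void main(String[] args) {
      String regex = "(\\w+)@(\\w+)\\.com";
      String input = "mail bob@example.com now";
      for (int g = 0; g <= 2; g++) {
         int[] s = span(regex, input, g);
         System.out.println("group " + g + ": " + s[0] + ".." + s[1]
               + " -> " + input.substring(s[0], s[1]));
      }
      // group 0: 5..20 -> bob@example.com
      // group 1: 5..8 -> bob
      // group 2: 9..16 -> example
   }
}
```

Because end is the offset after the last matched character, the pair can be passed directly to String.substring.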

Study Methods in Java

The study methods review the input string and return a boolean indicating whether or not the pattern was found.

The table below lists the study methods.

#   Method                           Description
1   public boolean lookingAt()       Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
2   public boolean find()            Attempts to find the next subsequence of the input sequence that matches the pattern.
3   public boolean find(int start)   Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
4   public boolean matches()         Attempts to match the entire region against the pattern.
Study Methods In Java
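
The find(int start) variant can be sketched as follows: it restarts the search from a given index and reports where the next match begins. The class FindFromDemo and the helper indexOfMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromDemo {

   // Start index of the first match at or after 'from', or -1 if none.
   static int indexOfMatch(String regex, String input, int from) {
      Matcher m = Pattern.compile(regex).matcher(input);
      return m.find(from) ? m.start() : -1;
   }

   public static void main(String[] args) {
      String input = "cat bat cat";
      System.out.println(indexOfMatch("cat", input, 0));   // 0
      System.out.println(indexOfMatch("cat", input, 1));   // 8
      System.out.println(indexOfMatch("cat", input, 9));   // -1
   }
}
```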

Replacement Methods In Java

The replacement methods are useful for replacing text in an input string.

The table below lists the replacement methods.

#   Method                                                                  Description
1   public Matcher appendReplacement(StringBuffer sb, String replacement)   Implements a non-terminal append-and-replace step.
2   public StringBuffer appendTail(StringBuffer sb)                         Implements a terminal append-and-replace step.
3   public String replaceAll(String replacement)                            Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
4   public String replaceFirst(String replacement)                          Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
5   public static String quoteReplacement(String s)                         Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class.
Replacement Methods In Java
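
In a replacement string, a dollar sign followed by a digit refers to a captured group, which is why quoteReplacement exists. The sketch below shows both sides: a literal "$5.00" inserted safely, and $1/$2 used on purpose to swap two words. The class QuoteReplacementDemo, its helpers, and the PRICE placeholder are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {

   // Replaces the placeholder PRICE with newPrice taken literally.
   // Without quoting, a value like "$5.00" would be read as a group reference.
   static String fillPrice(String input, String newPrice) {
      return Pattern.compile("PRICE").matcher(input)
            .replaceAll(Matcher.quoteReplacement(newPrice));
   }

   // Swaps the two words around a hyphen using group references $1 and $2.
   static String swap(String input) {
      return Pattern.compile("(\\w+)-(\\w+)").matcher(input).replaceAll("$2-$1");
   }

   public static void main(String[] args) {
      System.out.println(fillPrice("Total: PRICE", "$5.00"));   // Total: $5.00
      System.out.println(swap("left-right"));                   // right-left
   }
}
```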

Start and End Methods In Java

Here is an example that counts the number of times the word “horse” appears in the input string.

Example:

// program by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static final String REGEX = "\\bhorse\\b";
   private static final String INPUT = "horse horse horse horsie horrse";

   public static void main( String args[] ) {
      Pattern p = Pattern.compile(REGEX);
      Matcher m = p.matcher(INPUT);   // get a matcher object
      int count = 0;

      while(m.find()) {
         count++;
         System.out.println("Match number "+count);
         System.out.println("start(): "+m.start());
         System.out.println("end(): "+m.end());
      }
   }
}

Output:

Match number 1
start(): 0
end(): 5
Match number 2
start(): 6
end(): 11
Match number 3
start(): 12
end(): 17

You can see that this example uses word boundaries to make sure that the letters “h” “o” “r” “s” “e” are not just parts of a longer word.

It also tells you where in the input string the match happened, which is helpful.

The start method returns the start index of the previous match or, when it is given a group number, the start index of the subsequence captured by that group.

The end method returns the index of the last character matched, plus one.

Matches and lookingAt Methods in Java

The matches and lookingAt methods both try to match an input sequence with a pattern.

The difference is that matches requires the entire input sequence to match, while lookingAt only requires a match at the beginning.

Both methods always start from the beginning of the input string. Here is an example that shows how it works:

Example:

// program created bY Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static final String REGEX = "Goo";
   private static final String INPUT = "Gooooooooooooooooo";
   private static Pattern pattern;
   private static Matcher matcher;

   public static void main( String args[] ) {
      pattern = Pattern.compile(REGEX);
      matcher = pattern.matcher(INPUT);

      System.out.println("Current REGEX is: "+REGEX);
      System.out.println("Current INPUT is: "+INPUT);

      System.out.println("lookingAt(): "+matcher.lookingAt());
      System.out.println("matches(): "+matcher.matches());
   }
}

Output:

Current REGEX is: Goo
Current INPUT is: Gooooooooooooooooo
lookingAt(): true
matches(): false

replaceFirst and replaceAll methods in Java

The replaceFirst and replaceAll methods in Java change the text that matches a given regular expression.

As their names suggest, replaceFirst replaces the first match of the pattern, and replaceAll replaces every match.

The following are examples and explanations of the functionality.

Example:

// program by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static String REGEX = "dog";
   private static String INPUT = "The dog says meow. " + "All dogs say meow.";
   private static String REPLACE = "cat";

   public static void main(String[] args) {
      Pattern p = Pattern.compile(REGEX);
      
      // get a matcher object
      Matcher m = p.matcher(INPUT); 
      INPUT = m.replaceAll(REPLACE);
      System.out.println(INPUT);
   }
}

Output:

The cat says meow. All cats say meow.
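
A minimal sketch contrasting replaceFirst with replaceAll on the same sentence is shown below. The class ReplaceFirstDemo and the helpers first and all are illustrative names, not part of the Matcher API.

```java
import java.util.regex.Pattern;

class ReplaceFirstDemo {

   // Replaces only the first match of regex in input.
   static String first(String regex, String input, String replacement) {
      return Pattern.compile(regex).matcher(input).replaceFirst(replacement);
   }

   // Replaces every match of regex in input.
   static String all(String regex, String input, String replacement) {
      return Pattern.compile(regex).matcher(input).replaceAll(replacement);
   }

   public static void main(String[] args) {
      String input = "The dog says meow. All dogs say meow.";
      System.out.println(first("dog", input, "cat"));   // The cat says meow. All dogs say meow.
      System.out.println(all("dog", input, "cat"));     // The cat says meow. All cats say meow.
   }
}
```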

appendReplacement and appendTail Methods In Java

The Matcher class also has the appendReplacement and appendTail methods in Java for replacing text.

  • appendReplacement Method In Java – Performs a non-terminal append-and-replace step. It first appends to the string buffer the input text between the end of the previous match and the start of the current one, then appends the given replacement string in place of the matched text.

  • appendTail Method In Java – Performs the terminal append-and-replace step. It appends the rest of the input, after the last match, to the given StringBuffer and returns that buffer. It is meant to be called after one or more calls to appendReplacement. Java 9 and later also provide overloads of both methods that take a StringBuilder.

The following are examples and explanations of the functionality.

Example:

// program by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static String REGEX = "a*c";
   private static String INPUT = "aacgooaacgooacgooc";
   private static String REPLACE = "-";
   public static void main(String[] args) {

      Pattern p = Pattern.compile(REGEX);
      
      // get a matcher object
      Matcher m = p.matcher(INPUT);
      StringBuffer sb = new StringBuffer();
      while(m.find()) {
         m.appendReplacement(sb, REPLACE);
      }
      m.appendTail(sb);
      System.out.println(sb.toString());
   }
}

Output:

-goo-goo-goo-
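
The same loop is useful when the replacement has to be computed from each match, which replaceAll with a fixed string cannot do. As a sketch, the snippet below doubles every number found in a sentence; the class AppendReplacementDemo and the helper doubleNumbers are hypothetical, and the numbers are assumed to be small enough to fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {

   // Rebuilds input with every run of digits replaced by twice its value.
   static String doubleNumbers(String input) {
      Matcher m = Pattern.compile("\\d+").matcher(input);
      StringBuffer sb = new StringBuffer();
      while (m.find()) {
         int doubled = Integer.parseInt(m.group()) * 2;
         // Copies the text before this match, then the computed replacement.
         m.appendReplacement(sb, Integer.toString(doubled));
      }
      m.appendTail(sb);   // copies whatever follows the last match
      return sb.toString();
   }

   public static void main(String[] args) {
      System.out.println(doubleNumbers("3 apples and 10 pears"));   // 6 apples and 20 pears
   }
}
```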

PatternSyntaxException Class Methods in Java

A PatternSyntaxException in Java is an unchecked exception that happens when there is a syntax error in a regular expression pattern.

The PatternSyntaxException class offers the following ways to figure out what went wrong.

The table below lists the methods of the PatternSyntaxException class.

#   Method                           Description
1   public String getDescription()   Retrieves the description of the error.
2   public int getIndex()            Retrieves the error index.
3   public String getPattern()       Retrieves the erroneous regular expression pattern.
4   public String getMessage()       Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern.
PatternSyntaxException Class Methods In Java
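
These methods are typically used inside a catch block around Pattern.compile. The sketch below is one possible way to report a bad pattern; the class PatternErrorDemo and the helper describe are invented names, and the exact wording returned by getDescription may differ between Java versions.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PatternErrorDemo {

   // Returns "OK" for a valid regex, otherwise a one-line error summary.
   static String describe(String regex) {
      try {
         Pattern.compile(regex);
         return "OK";
      } catch (PatternSyntaxException e) {
         return e.getDescription() + " at index " + e.getIndex()
               + " in pattern " + e.getPattern();
      }
   }

   public static void main(String[] args) {
      System.out.println(describe("a*c"));    // OK
      System.out.println(describe("(abc"));   // an error summary for the unclosed group
   }
}
```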

Conclusion

In conclusion, we have discussed the Regular Expressions (regex) in Java which are special strings used for finding and manipulating text patterns.

Java has no dedicated regex syntax, but its standard library provides the java.util.regex package for this purpose.

Regular expressions are crucial for tasks like text search, replace operations, and form validation in Java.

We have covered the basics, types of regex, and their applications, and provided examples.

What’s Next

The next section talks about Methods In Java programming. At the end of the session, you’ll know what methods are all about in Java.

