What Is a Regular Expression in Java
A regular expression, or regex, in Java is a string of characters that describes a search pattern; you use it to match or find other strings or sets of strings.
Regular expressions can be used to find text and data, change it, or otherwise process it.
Does Java have Regular Expressions?
Yes. Java has no special literal syntax for regular expressions, but regex support is part of the standard library: import the classes you need from the java.util.regex package.
Two Types of Regular Expression
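The POSIX standard defines the two types of regular expression listed below; Java's own syntax, covered in the rest of this lesson, is closer to Perl's than to either of them.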
- Basic regular expression
- Extended regular expression
What are the Applications of Regular Expressions in Java?
Regular expressions in Java can be used for all kinds of text search and text replace operations.
Importance of Regular Expression in Java
Regular expressions in Java are an important tool for finding patterns and for validating input such as form fields.
This lesson looks at their properties, how they can be used, and what they cannot do.
The sections below explain regular expressions in Java with some basic examples so that the topic is easy to follow.
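As a first taste of the form validation use case mentioned above, here is a minimal sketch. It assumes a US ZIP code field and reuses the pattern \d{5}(-\d{4})? from the examples table in this lesson; the class name ZipCodeCheck and the method isValidZip are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

public class ZipCodeCheck {
    // Compile once and reuse. Backslashes are doubled inside a Java string literal.
    private static final Pattern ZIP = Pattern.compile("\\d{5}(-\\d{4})?");

    // Hypothetical helper: true only if the WHOLE input is a ZIP code,
    // because matches() requires the entire string to match.
    public static boolean isValidZip(String input) {
        return ZIP.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidZip("90210"));      // true
        System.out.println(isValidZip("90210-1234")); // true
        System.out.println(isValidZip("9021"));       // false
        System.out.println(isValidZip("90210 USA"));  // false
    }
}
```

Using matches() rather than find() is a deliberate choice for validation: find() would accept any input that merely contains five digits somewhere.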
The java.util.regex package in Java lets you use regular expressions to match patterns.
The syntax of Java regular expressions is easy to pick up and is very similar to the regular expression syntax of the Perl programming language.
Perl is an open-source, cross-platform programming language that is used a lot in both business and personal computing.
In the late 20th century and early 21st century, Perl was a favorite among Web developers because it could process text in many different ways and solve problems in many different ways.
Description of the java.util.regex Package
The java.util.regex package contains the classes that match character sequences against patterns specified by regular expressions.
The java.util.regex package is primarily made up of the following three classes listed in the table below:
Classes | Description |
---|---|
Pattern Class | A Pattern object is a compiled representation of a regular expression. The Pattern class has no public constructors. To create a pattern, call one of its public static compile() methods, which return a Pattern object. These methods take a regular expression as their first argument. |
Matcher Class | A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher has no public constructors. You obtain a Matcher object by calling the matcher() method on a Pattern object. |
PatternSyntaxException | A PatternSyntaxException is an unchecked exception that is thrown when a regular expression pattern has a syntax error. |
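The way the classes in the table fit together can be sketched as follows. This is a minimal illustration, not the only way to use the API; the class name RegexBasics and the helper firstMatch are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexBasics {
    // Hypothetical helper: returns the first substring of input that matches
    // regex, or null if there is no match.
    public static String firstMatch(String regex, String input) {
        // Step 1: no public constructor, so compile() creates the Pattern.
        Pattern pattern = Pattern.compile(regex);
        // Step 2: the Pattern creates a Matcher bound to one input string.
        Matcher matcher = pattern.matcher(input);
        // Step 3: the Matcher performs the actual search.
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("gr[ae]y", "a grey cat")); // grey
        System.out.println(firstMatch("gr[ae]y", "a green cat")); // null
    }
}
```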
Capturing Groups In Java
Capturing groups in Java are a way to treat several characters as a single unit.
They are made by putting the characters to be grouped inside a set of parentheses.
For example, the regular expression (dog) creates a single group with the letters “d”, “o”, and “g”.
Capturing groups are numbered by counting their opening parentheses from left to right.
In the expression ((A)(B(C))), for example, there are four such groups:
- Group 1: ((A)(B(C)))
- Group 2: (A)
- Group 3: (B(C))
- Group 4: (C)
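The numbering rule for ((A)(B(C))) can be checked with a short sketch. The input string "ABC", the class name GroupNumbering, and the helper groups are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumbering {
    // Hypothetical helper: returns groups 0..groupCount() when the whole input
    // matches, or an empty array when it does not.
    public static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i); // group(0) is the entire match
        }
        return result;
    }

    public static void main(String[] args) {
        String[] g = groups("((A)(B(C)))", "ABC");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": " + g[i]);
        }
        // Prints:
        // group 0: ABC
        // group 1: ABC
        // group 2: A
        // group 3: BC
        // group 4: C
    }
}
```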
What is an Example Of Regular Expression in Java
The table below shows some examples of Regular Expressions In Java.
Regex | Matches any string that |
---|---|
hello | contains {hello} |
gray|grey | contains {gray, grey} |
gr(a|e)y | contains {gray, grey} |
gr[ae]y | contains {gray, grey} |
b[aeiou]bble | contains {babble, bebble, bibble, bobble, bubble} |
[b-chm-pP]at|ot | contains {bat, cat, hat, mat, nat, oat, pat, Pat, ot} |
colou?r | contains {color, colour} |
rege(x(es)?|xps?) | contains {regex, regexes, regexp, regexps} |
go*gle | contains {ggle, gogle, google, gooogle, goooogle, …} |
go+gle | contains {gogle, google, gooogle, goooogle, …} |
g(oog)+le | contains {google, googoogle, googoogoogle, googoogoogoogle, …} |
z{3} | contains {zzz} |
z{3,6} | contains {zzz, zzzz, zzzzz, zzzzzz} |
z{3,} | contains {zzz, zzzz, zzzzz, …} |
[Bb]rainf\*\*k | contains {Brainf**k, brainf**k} |
\d | contains {0,1,2,3,4,5,6,7,8,9} |
\d{5}(-\d{4})? | contains a United States zip code |
1\d{10} | contains an 11-digit string starting with a 1 |
[2-9]|[12]\d|3[0-6] | contains an integer in the range 2..36 inclusive |
Hello\nworld | contains Hello followed by a newline followed by world |
mi.....ft | contains a nine-character (sub)string beginning with mi and ending with ft (Note: depending on context, the dot stands either for “any character at all” or “any character except a newline”.) Each dot is allowed to match a different character, so both microsoft and minecraft will match. |
\d+(\.\d\d)? | contains a non-negative integer or a decimal number with exactly two digits after the decimal point. |
[^i*&2@] | contains any character other than an i, asterisk, ampersand, 2, or at-sign. |
//[^\r\n]*[\r\n] | contains a Java or C# slash-slash comment |
^dog | begins with “dog” |
dog$ | ends with “dog” |
^dog$ | is exactly “dog” |
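A few rows of the table can be tried out with the sketch below. It uses Matcher.find(), which corresponds to the "contains" wording of the table; the class name TableExamples, the helper contains, and the sample inputs are illustrative.

```java
import java.util.regex.Pattern;

public class TableExamples {
    // Hypothetical helper: true if some substring of input matches regex.
    public static boolean contains(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(contains("colou?r", "my favourite colour")); // true
        System.out.println(contains("go*gle", "ggle"));                 // true: zero o's allowed
        System.out.println(contains("go+gle", "ggle"));                 // false: needs at least one o
        System.out.println(contains("z{3,6}", "zz"));                   // false: only two z's
        System.out.println(contains("dog$", "hotdog"));                 // true: ends with dog
        System.out.println(contains("^dog$", "hotdog"));                // false: not exactly dog
    }
}
```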
How Do You Find Out How Many Groups Are in an Expression in Java?
To find out how many groups are in the expression, call the groupCount method on a matcher object.
The groupCount method returns an int that shows how many capturing groups are in the matcher’s pattern.
There is also a special group called group 0 that always represents the entire expression.
This group isn’t counted in the total given by groupCount.
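A minimal sketch of groupCount follows. The patterns, the class name GroupCountDemo, and the helper countGroups are chosen for illustration. Note that groupCount depends only on the pattern, so no match has to be attempted first.

```java
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Hypothetical helper: number of capturing groups in regex.
    // The empty input is irrelevant because groupCount() only inspects the pattern.
    public static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(countGroups("(\\d+)-(\\d+)")); // 2
        System.out.println(countGroups("((A)(B(C)))"));   // 4
        System.out.println(countGroups("(?:a|b)c"));      // 0: (?:...) does not capture
        System.out.println(countGroups("dog"));           // 0: group 0 is not counted
    }
}
```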
Example Program in Java for Capturing Groups
The following example shows how to find a digit string in a given alphanumeric string.
// program created by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

    public static void main(String[] args) {
        // String to be scanned to find the pattern.
        String line = "This order was placed for PIES5G13000! OK?";
        String pattern = "(.*)(\\d+)(.*)";

        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);

        // Now create matcher object.
        Matcher m = r.matcher(line);
        if (m.find()) {
            System.out.println("Found value: " + m.group(0));
            System.out.println("Found value: " + m.group(1));
            System.out.println("Found value: " + m.group(2));
        } else {
            System.out.println("NO MATCH");
        }
    }
}
Output:
Found value: This order was placed for PIES5G13000! OK?
Found value: This order was placed for PIES5G1300
Found value: 0
Group 2 contains only the final 0 because the first (.*) is greedy: it consumes as much of the input as it can and leaves (\d+) the single digit it needs to match.
To test the Java code in this lesson, run it in your own code editor or IDE.
If you would rather run the code online, we also have a free online Java compiler.
You can test the above example here! ➡Java Online Compiler
Regular Expression Syntax In Java
The table below lists the regular expression syntax available in Java.
Subexpression | Matches |
---|---|
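^ | Matches the beginning of the input. With the MULTILINE flag, it also matches the beginning of each line. |
$ | Matches the end of the input, or just before a final line terminator. With the MULTILINE flag, it also matches the end of each line. |
. | Matches any single character except a line terminator. With the DOTALL flag, it matches line terminators as well. |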
[…] | Matches any single character in brackets. |
[^…] | Matches any single character not in brackets. |
\A | Matches the beginning of the entire input. |
\z | Matches the end of the entire input. |
\Z | Matches the end of the entire input, but before the final line terminator if there is one. |
re* | Matches 0 or more occurrences of the preceding expression. |
re+ | Matches 1 or more occurrences of the preceding expression. |
re? | Matches 0 or 1 occurrence of the preceding expression. |
re{n} | Matches exactly n occurrences of the preceding expression. |
re{n,} | Matches n or more occurrences of the preceding expression. |
re{n,m} | Matches at least n and at most m occurrences of the preceding expression. |
a|b | Matches either a or b. |
(re) | Groups regular expressions and remembers the matched text. |
(?:re) | Groups regular expressions without remembering the matched text. |
(?>re) | Matches the independent pattern without backtracking. |
\w | Matches the word characters. |
\W | Matches the nonword characters. |
\s | Matches a whitespace character. Equivalent to [ \t\n\x0B\f\r]. |
\S | Matches the nonwhitespace. |
\d | Matches the digits. Equivalent to [0-9]. |
\D | Matches the nondigits. |
\G | Matches the point where the last match finished. |
\1, \2, etc. | Back-reference to capturing group number 1, 2, and so on. |
\b | Matches a word boundary. Java does not support the backspace meaning that \b has inside brackets in some other regex flavors. |
\B | Matches the nonword boundaries. |
\n, \t, etc. | Matches newlines, carriage returns, tabs, etc. |
\Q | Escape (quote) all characters up to \E. |
\E | Ends quoting begun with \Q. |
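Several entries in the table depend on flags or are easy to misread, so here is a minimal sketch that exercises a few of them. In java.util.regex the relevant flags are Pattern.MULTILINE and Pattern.DOTALL; the class name RegexFlagsDemo, the helper found, and the sample inputs are illustrative.

```java
import java.util.regex.Pattern;

public class RegexFlagsDemo {
    // Hypothetical helper: true if regex, compiled with the given flags,
    // matches some substring of input. Pass 0 for "no flags".
    public static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String twoLines = "cat\ndog";

        // ^ and $ anchor to the whole input unless MULTILINE is set.
        System.out.println(found("^dog$", 0, twoLines));                 // false
        System.out.println(found("^dog$", Pattern.MULTILINE, twoLines)); // true

        // The dot skips line terminators unless DOTALL is set.
        System.out.println(found("cat.dog", 0, twoLines));               // false
        System.out.println(found("cat.dog", Pattern.DOTALL, twoLines));  // true

        // \1 refers back to whatever group 1 captured: a doubled letter.
        System.out.println(found("(\\w)\\1", 0, "hello"));               // true ("ll")

        // \Q...\E quotes metacharacters so that + is taken literally.
        System.out.println(found("\\Q1+1\\E", 0, "1+1=2"));              // true
        System.out.println(found("1+1", 0, "1+1=2"));                    // false: means "one or more 1s, then 1"
    }
}
```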
Methods of The Matcher Class in Java
The following are useful instance methods of Matcher Class In Java.
Index Methods in Java
Index methods return index values that show exactly where in the input string the match was found.
The table below lists the index methods.
# | Method & Description |
---|---|
1 | public int start(): Returns the start index of the previous match. |
2 | public int start(int group): Returns the start index of the subsequence captured by the given group during the previous match operation. |
3 | public int end(): Returns the offset after the last character matched. |
4 | public int end(int group): Returns the offset after the last character of the subsequence captured by the given group during the previous match operation. |
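The group variants of start and end can be sketched as follows. The pattern, the input string, the class name GroupIndexDemo, and the helper groupSpan are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupIndexDemo {
    // Hypothetical helper: {start, end} of the given group in the first match,
    // or null if the pattern is not found.
    public static int[] groupSpan(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(group), m.end(group) };
    }

    public static void main(String[] args) {
        String regex = "(\\w+)@(\\w+)";
        String input = "mail bob@example now";
        for (int group = 0; group <= 2; group++) {
            int[] span = groupSpan(regex, input, group);
            // end is exclusive, so substring(start, end) is the captured text
            System.out.println("group " + group + ": [" + span[0] + ", " + span[1] + ") "
                    + input.substring(span[0], span[1]));
        }
        // Prints:
        // group 0: [5, 16) bob@example
        // group 1: [5, 8) bob
        // group 2: [9, 16) example
    }
}
```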
Study Methods in Java
Study methods review the input string and return a boolean that indicates whether or not the pattern was found.
The table below lists the study methods.
# | Method & Description |
---|---|
1 | public boolean lookingAt(): Attempts to match the input sequence, starting at the beginning of the region, against the pattern. |
2 | public boolean find(): Attempts to find the next subsequence of the input sequence that matches the pattern. |
3 | public boolean find(int start): Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index. |
4 | public boolean matches(): Attempts to match the entire region against the pattern. |
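Of the study methods, find(int start) is the least obvious, so here is a minimal sketch of it. The class name FindFromDemo, the helper findFrom, and the sample input are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindFromDemo {
    // Hypothetical helper: index where the first match at or after 'from'
    // begins, or -1 if there is none. 'from' must be within 0..input.length().
    public static int findFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // find(int) resets the matcher and starts searching at index 'from'
        return m.find(from) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "horse horse";
        System.out.println(findFrom("horse", input, 0)); // 0
        System.out.println(findFrom("horse", input, 1)); // 6: skips the first horse
        System.out.println(findFrom("horse", input, 7)); // -1: only "orse" is left
    }
}
```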
Replacement Methods In Java
Replacement methods are useful for replacing text in an input string.
The table below lists the replacement methods.
# | Method & Description |
---|---|
1 | public Matcher appendReplacement(StringBuffer sb, String replacement): Implements a non-terminal append-and-replace step. |
2 | public StringBuffer appendTail(StringBuffer sb): Implements a terminal append-and-replace step. |
3 | public String replaceAll(String replacement): Replaces every subsequence of the input sequence that matches the pattern with the given replacement string. |
4 | public String replaceFirst(String replacement): Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string. |
5 | public static String quoteReplacement(String s): Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. |
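The reason quoteReplacement exists is that a replacement string is not plain text: a dollar sign followed by a digit refers to a capturing group. The sketch below shows both sides of that behavior; the class name ReplacementDemo and the helpers swapWords and replaceLiteral are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplacementDemo {
    // Hypothetical helper: uses $1 and $2 in the replacement to swap two words.
    public static String swapWords(String input) {
        return Pattern.compile("(\\w+) (\\w+)").matcher(input).replaceAll("$2 $1");
    }

    // Hypothetical helper: inserts 'literal' exactly as written, even if it
    // contains $ or \ characters, by quoting it first.
    public static String replaceLiteral(String regex, String input, String literal) {
        return Pattern.compile(regex).matcher(input)
                .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(swapWords("hello world"));                         // world hello
        System.out.println(replaceLiteral("PRICE", "Cost: PRICE", "$5.00")); // Cost: $5.00
    }
}
```

Without quoteReplacement, the replacement "$5.00" would be read as a reference to group 5, which this pattern does not have, and the call would fail with an exception.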
Start and End Methods In Java
Here is an example that counts the number of times the word “horse” appears in the input string.
Example:
// program by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

    private static final String REGEX = "\\bhorse\\b";
    private static final String INPUT = "horse horse horse horsie horrse";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(INPUT); // get a matcher object
        int count = 0;

        while (m.find()) {
            count++;
            System.out.println("Match number " + count);
            System.out.println("start(): " + m.start());
            System.out.println("end(): " + m.end());
        }
    }
}
Output:
Match number 1
start(): 0
end(): 5
Match number 2
start(): 6
end(): 11
Match number 3
start(): 12
end(): 17
You can see that this example uses word boundaries to make sure that the letters “h” “o” “r” “s” “e” are not just parts of a longer word.
It also tells you where in the input string the match happened, which is helpful.
The start method returns the start index of the previous match.
The end method returns the index of the last character matched, plus one.
Matches and lookingAt Methods in Java
The matches and lookingAt methods both try to match an input sequence with a pattern.
The difference is that matches requires the entire input sequence to match, while lookingAt does not.
Both methods always start from the beginning of the input string. Here is an example that shows how it works:
Example:
// program created by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

    private static final String REGEX = "Goo";
    private static final String INPUT = "Gooooooooooooooooo";
    private static Pattern pattern;
    private static Matcher matcher;

    public static void main(String[] args) {
        pattern = Pattern.compile(REGEX);
        matcher = pattern.matcher(INPUT);

        System.out.println("Current REGEX is: " + REGEX);
        System.out.println("Current INPUT is: " + INPUT);

        System.out.println("lookingAt(): " + matcher.lookingAt());
        System.out.println("matches(): " + matcher.matches());
    }
}
Output:
Current REGEX is: Goo
Current INPUT is: Gooooooooooooooooo
lookingAt(): true
matches(): false
replaceFirst and replaceAll methods in Java
The replaceFirst and replaceAll methods in Java change the text that matches a given regular expression.
As their names suggest, replaceFirst replaces the first occurrence of a match, and replaceAll replaces every occurrence.
The following example shows replaceAll in action.
Example:
// program by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

    // The pattern to search for; every match is replaced with REPLACE.
    private static String REGEX = "dog";
    private static String INPUT = "The dog says meow. " + "All dogs say meow.";
    private static String REPLACE = "cat";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);

        // get a matcher object
        Matcher m = p.matcher(INPUT);
        INPUT = m.replaceAll(REPLACE);
        System.out.println(INPUT);
    }
}
Output:
The cat says meow. All cats say meow.
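For comparison, here is a minimal sketch that applies replaceFirst and replaceAll to the same sentence. It assumes the pattern dog and the replacement cat; the class name ReplaceFirstDemo and its two helpers are illustrative.

```java
import java.util.regex.Pattern;

public class ReplaceFirstDemo {
    // Hypothetical helper: replaces only the first match.
    public static String first(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceFirst(replacement);
    }

    // Hypothetical helper: replaces every match.
    public static String all(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String input = "The dog says meow. All dogs say meow.";
        System.out.println(first("dog", input, "cat")); // The cat says meow. All dogs say meow.
        System.out.println(all("dog", input, "cat"));   // The cat says meow. All cats say meow.
    }
}
```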
appendReplacement and appendTail Methods In Java
The Matcher class also provides the appendReplacement and appendTail methods for replacing text.
- appendReplacement Method In Java – Performs one append-and-replace step. First, it appends to the string buffer the input text between the end of the previous match and the start of the current match. Then, it appends the given replacement string.
- appendTail Method In Java – Performs the final append step. It appends the rest of the input, from the end of the last match onward, to the given string buffer. Call it after one or more appendReplacement calls to copy the remainder of the input. Since Java 9, both methods also accept a StringBuilder.
The following example shows how they work together.
Example:
// program by Glenn
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

    private static String REGEX = "a*c";
    private static String INPUT = "aacgooaacgooacgooc";
    private static String REPLACE = "-";

    public static void main(String[] args) {
        Pattern p = Pattern.compile(REGEX);

        // get a matcher object
        Matcher m = p.matcher(INPUT);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, REPLACE);
        }
        m.appendTail(sb);
        System.out.println(sb.toString());
    }
}
Output:
-goo-goo-goo-
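One situation where the appendReplacement loop is useful is when the replacement has to be computed from each match, which a fixed replacement string cannot express. The sketch below upper-cases every match as an illustration; the class name AppendReplacementDemo and the helper upperCaseMatches are made-up names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementDemo {
    // Hypothetical helper: returns input with every match of regex upper-cased.
    public static String upperCaseMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Compute the replacement from the current match. quoteReplacement
            // keeps any $ or \ in the matched text from being interpreted.
            String replacement = m.group().toUpperCase();
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseMatches("\\bhorse\\b", "horse and horsie")); // HORSE and horsie
    }
}
```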
PatternSyntaxException Class Methods in Java
A PatternSyntaxException in Java is an unchecked exception that happens when there is a syntax error in a regular expression pattern.
The PatternSyntaxException class offers the following methods to help you figure out what went wrong.
The table below lists the PatternSyntaxException methods.
# | Method & Description |
---|---|
1 | public String getDescription(): Retrieves the description of the error. |
2 | public int getIndex(): Retrieves the error index. |
3 | public String getPattern(): Retrieves the erroneous regular expression pattern. |
4 | public String getMessage(): Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern. |
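These methods are typically used in a catch block around Pattern.compile. The following is a minimal sketch under that assumption; the class name PatternErrorDemo and the helper describe are illustrative, and the exact wording of the description depends on the Java version, so no output is shown for it.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternErrorDemo {
    // Hypothetical helper: returns null if regex compiles, otherwise a
    // one-line summary built from the exception's accessor methods.
    public static String describe(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex()
                    + " in " + e.getPattern();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("a*c"));  // null: the pattern is valid
        System.out.println(describe("[a-z")); // a summary of the unclosed bracket error
    }
}
```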
Conclusion
In conclusion, we have discussed regular expressions (regex) in Java, which are special strings used for finding and manipulating text patterns.
Java has no dedicated regex syntax of its own, but its standard library provides the java.util.regex package for this purpose.
Regular expressions are crucial for tasks like text search, replace operations, and form validation in Java.
We have covered the basics, types of regex, and their applications, and provided examples.
What’s Next
The next section talks about Methods In Java programming. At the end of the session, you’ll know what methods are all about in Java.