Regular Expression PDF
DURGASOFT, # 202, 2nd Floor, HUDA Maitrivanam, Ameerpet, Hyderabad - 500038,
040 – 64 51 27 86, 80 96 96 96 96, 9246212143 | www.durgasoft.com
Core Java with SCJP/ OCJP Notes By Durga Sir Regular Expression
Regular Expression
Agenda
1. Introduction.
2. The main application areas of Regular Expression
3. Pattern class
4. Matcher class
5. Important methods of Matcher class
6. Character classes
7. Predefined character classes
8. Quantifiers
9. Pattern class split() method
10. String class split() method
11. StringTokenizer
12. Requirements:
o Write a regular expression to represent all valid identifiers in Java language
o Write a regular expression to represent all mobile numbers
o Write a regular expression to represent all Mail Ids
o Write a program to extract all valid mobile numbers from a file
o Write a program to extract all Mail Ids from a file
o Write a program to display all .txt file names present in a specific folder (E:\scjp)
Introduction
A Regular Expression is an expression which represents a group of Strings according to a
particular pattern.
Example:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        int count=0;
        Pattern p=Pattern.compile("ab");      // the regular expression
        Matcher m=p.matcher("abbbabbaba");    // the target string
        while(m.find())                       // move to the next match, if any
        {
            count++;
            System.out.println(m.start()+"------"+m.end()+"------"+m.group());
        }
        System.out.println("The no of occurrences: "+count);
    }
}
Output:
0------2------ab
4------6------ab
7------9------ab
The no of occurrences: 3
Pattern class:
A Pattern object represents "compiled version of Regular Expression".
We can create a Pattern object by using compile() method of Pattern class.
Matcher:
A Matcher object can be used to match character sequences against a Regular
Expression.
We can create a Matcher object by using matcher() method of Pattern class.
Important methods of Matcher class:
1. boolean find();
Attempts to find the next match; returns true if one is available, otherwise returns
false.
2. int start();
Returns the start index of the match.
3. int end();
Returns the offset after the last character matched, that is, the index of the last
matched character plus one.
4. String group();
Returns the text matched by the previous match.
Note: The Pattern and Matcher classes are available in the java.util.regex package and
were introduced in Java 1.4.
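The relationship between start(), end() and group() can be checked directly: because end() is one past the last matched character, the matched text is always the substring from start() to end(). A minimal sketch follows; the class name MatcherMethodsDemo and the helper method are illustrative and not part of these notes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo
{
    // Returns true if at least one match exists and, for every match,
    // group() equals target.substring(start(), end()).
    static boolean groupIsSubstring(String regex, String target)
    {
        Matcher m = Pattern.compile(regex).matcher(target);
        boolean found = false;
        while (m.find())
        {
            found = true;
            // end() is exclusive, so it can be passed to substring() as-is
            String slice = target.substring(m.start(), m.end());
            if (!slice.equals(m.group()))
            {
                return false;
            }
        }
        return found;
    }

    public static void main(String[] args)
    {
        System.out.println(groupIsSubstring("ab", "abbbabbaba"));
    }
}
```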
Character classes:
1. [abc] ---------------- Either 'a' or 'b' or 'c'
2. [^abc] --------------- Any character except 'a', 'b' and 'c'
3. [a-z] ---------------- Any lower case alphabet symbol
4. [A-Z] ---------------- Any upper case alphabet symbol
5. [a-zA-Z] ------------- Any alphabet symbol
6. [0-9] ---------------- Any digit from 0 to 9
7. [a-zA-Z0-9] ---------- Any alphanumeric character
8. [^a-zA-Z0-9] --------- Any character other than an alphanumeric character (special character)
Example:
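import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // "x" is a placeholder: substitute one of the character classes
        // listed above, for example "[abc]" or "[^a-zA-Z0-9]".
        Pattern p=Pattern.compile("x");
        Matcher m=p.matcher("a1b7@z#");
        while(m.find())
        {
            System.out.println(m.start()+"-------"+m.group());
        }
    }
}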
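To see what each character class selects from the target string "a1b7@z#", the classes can be tried one after another. The following is a sketch only; the class name CharacterClassDemo and the helper findAll are assumptions introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClassDemo
{
    // Collects "startIndex-------matchedText" for every match of regex in target.
    static List<String> findAll(String regex, String target)
    {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(target);
        while (m.find())
        {
            result.add(m.start() + "-------" + m.group());
        }
        return result;
    }

    public static void main(String[] args)
    {
        String target = "a1b7@z#";
        String[] classes = {"[abc]", "[^abc]", "[a-z]", "[0-9]", "[a-zA-Z0-9]", "[^a-zA-Z0-9]"};
        for (String regex : classes)
        {
            System.out.println(regex + " : " + findAll(regex, target));
        }
    }
}
```

For this target, [abc] matches 'a' at index 0 and 'b' at index 2, while [^a-zA-Z0-9] matches only '@' at index 4 and '#' at index 6.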
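Predefined character classes:
\s --------------------- Any whitespace character (space, tab, newline etc.)
\d --------------------- Any digit from 0 to 9 [0-9]
\w --------------------- Any word character [a-zA-Z0-9_]
.  --------------------- Any character, including special characters (by default, line terminators are excluded)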
Example:
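import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // "x" is a placeholder: substitute one of the predefined character
        // classes listed above, for example "\\s", "\\d", "\\w" or ".".
        Pattern p=Pattern.compile("x");
        Matcher m=p.matcher("a1b7 @z#");
        while(m.find())
        {
            System.out.println(m.start()+"-------"+m.group());
        }
    }
}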
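The predefined character classes can be compared on the target string "a1b7 @z#" in the same way. Note that inside a Java string literal the backslash itself must be escaped, so \s is written "\\s". This is a sketch under that assumption; the class name PredefinedClassDemo and the helper matchStarts are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredefinedClassDemo
{
    // Returns the start index of every match of regex in target.
    static List<Integer> matchStarts(String regex, String target)
    {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(target);
        while (m.find())
        {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args)
    {
        String target = "a1b7 @z#";
        // "\\s" in Java source is the two-character regex \s
        String[] classes = {"\\s", "\\d", "\\w", "."};
        for (String regex : classes)
        {
            System.out.println(regex + " : " + matchStarts(regex, target));
        }
    }
}
```

Here \s matches only the space at index 4, \d matches at indexes 1 and 3, \w matches at indexes 0, 1, 2, 3 and 6, and . matches at every index from 0 to 7.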
Quantifiers:
Quantifiers can be used to specify the number of characters to match.
a ---------------------- Exactly one 'a'
a+ --------------------- At least one 'a'
a* --------------------- Any number of a's, including zero
a? --------------------- At most one 'a' (zero or one)
Example:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // "x" is a placeholder: substitute one of the quantified
        // expressions listed above: "a", "a+", "a*" or "a?".
        Pattern p=Pattern.compile("x");
        Matcher m=p.matcher("abaabaaab");
        while(m.find())
        {
            System.out.println(m.start()+"-------"+m.group());
        }
    }
}
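The effect of each quantifier on the target string "abaabaaab" can be observed by running all four expressions side by side. A minimal sketch, assuming an illustrative class QuantifierDemo with a helper findAll (neither is part of these notes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo
{
    // Collects "startIndex-------matchedText" for every match of regex in target.
    static List<String> findAll(String regex, String target)
    {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(target);
        while (m.find())
        {
            result.add(m.start() + "-------" + m.group());
        }
        return result;
    }

    public static void main(String[] args)
    {
        String target = "abaabaaab";
        for (String regex : new String[] {"a", "a+", "a*", "a?"})
        {
            System.out.println(regex + " : " + findAll(regex, target));
        }
    }
}
```

With a+ the matches are "a" at index 0, "aa" at index 2 and "aaa" at index 5. Because a* and a? accept zero occurrences, they also report empty matches, for example at index 1 (where a 'b' is present) and at index 9 (the end of the input).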
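Pattern class split() method:
The split() method of the Pattern class splits the target string around matches of the
regular expression.
Example 1: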
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        Pattern p=Pattern.compile("\\s");
        String[] s=p.split("ashok software solutions");
        for(String s1:s)
        {
            System.out.println(s1);//ashok
                                   //software
                                   //solutions
        }
    }
}
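String class split() method:
The String class also has a split() method, which takes the regular expression as its
argument.
Example:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        String s="www.saijobs.com";
        String[] s1=s.split("\\.");   // "\\." matches a literal dot
        for(String s2:s1)
        {
            System.out.println(s2);//www
                                   //saijobs
                                   //com
        }
    }
}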
Note: The split() method of the String class takes the regular expression as its argument,
whereas the split() method of the Pattern class takes the target string as its argument.
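The two split() variants differ only in which object holds the regular expression and which is passed as the argument; for the same inputs they produce the same pieces. A small sketch to illustrate this (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitComparisonDemo
{
    // Pattern holds the regular expression; the target string is the argument.
    static String[] splitWithPattern(String regex, String target)
    {
        return Pattern.compile(regex).split(target);
    }

    // String holds the target; the regular expression is the argument.
    static String[] splitWithString(String regex, String target)
    {
        return target.split(regex);
    }

    public static void main(String[] args)
    {
        String target = "www.saijobs.com";
        System.out.println(Arrays.toString(splitWithPattern("\\.", target)));
        System.out.println(Arrays.toString(splitWithString("\\.", target)));
    }
}
```

Both lines print [www, saijobs, com]. When the same expression is used many times, compiling the Pattern once and reusing it avoids recompiling the expression on every call.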
StringTokenizer:
Example 1:
import java.util.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        StringTokenizer st=new StringTokenizer("sai software solutions");
        while(st.hasMoreTokens())
        {
            System.out.println(st.nextToken());//sai
                                               //software
                                               //solutions
        }
    }
}
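The default delimiters for StringTokenizer are the whitespace characters (space, tab,
newline, carriage return and form feed). The delimiters are plain characters, not a
regular expression.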
Example 2:
import java.util.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        StringTokenizer st=new StringTokenizer("1,99,988",",");
        while(st.hasMoreTokens())
        {
            System.out.println(st.nextToken());//1
                                               //99
                                               //988
        }
    }
}
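The second constructor argument of StringTokenizer is a set of delimiter characters, each of which separates tokens on its own. Unlike split(), StringTokenizer never returns empty tokens. The following sketch illustrates both points; the class name TokenizerDemo, the helper tokens and the sample inputs are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo
{
    // Returns all tokens of target, treating every character in
    // delimiters as a separator.
    static List<String> tokens(String target, String delimiters)
    {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(target, delimiters);
        while (st.hasMoreTokens())
        {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args)
    {
        System.out.println(tokens("10:20,30", ":,"));            // [10, 20, 30]
        System.out.println(tokens("1,,99", ","));                // [1, 99]
        System.out.println(Arrays.asList("1,,99".split(",")));   // [1, , 99]
    }
}
```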
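Requirement:
Write a regular expression to represent all valid identifiers in Java language, using the
following rules (these rules are assumed for this exercise; they differ from the actual
Java identifier rules):
1. The allowed characters are a to z, A to Z, 0 to 9, - and #.
2. The 1st character should be an alphabet symbol only.
3. The length of the identifier should be at least 2.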
Program:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // One letter followed by one or more allowed characters,
        // so the total length is at least 2.
        Pattern p=Pattern.compile("[a-zA-Z][a-zA-Z0-9-#]+");
        // (or) Pattern p=Pattern.compile("[a-zA-Z][a-zA-Z0-9-#][a-zA-Z0-9-#]*");
        Matcher m=p.matcher(args[0]);
        // valid only if the first match covers the whole argument
        if(m.find()&&m.group().equals(args[0]))
        {
            System.out.println("valid identifier");
        }
        else
        {
            System.out.println("invalid identifier");
        }
    }
}
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo ashok
valid identifier
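The same check can be packaged as a reusable method. This sketch swaps the find()/group() comparison for Matcher.matches(), which succeeds only if the entire input matches the pattern. The class name IdentifierCheck and the sample inputs are illustrative.

```java
import java.util.regex.Pattern;

class IdentifierCheck
{
    // The hyphen is placed last in the character class so that it is
    // unambiguously a literal '-'.
    private static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9#-]+");

    // True if the whole string satisfies the three rules of the exercise.
    static boolean isValidIdentifier(String s)
    {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args)
    {
        for (String s : new String[] {"ashok", "a-#9", "a", "9abc"})
        {
            System.out.println(s + " : " + (isValidIdentifier(s) ? "valid identifier" : "invalid identifier"));
        }
    }
}
```

"ashok" and "a-#9" are reported as valid; "a" fails the minimum length rule and "9abc" fails the first-character rule.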
Requirement:
Write a regular expression to represent all mobile numbers.
Program:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // 10 digits; the first digit must be 7, 8 or 9
        Pattern p=Pattern.compile("[7-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]");
        //Pattern p=Pattern.compile("[7-9][0-9]{9}");
        Matcher m=p.matcher(args[0]);
        if(m.find()&&m.group().equals(args[0]))
        {
            System.out.println("valid number");
        }
        else
        {
            System.out.println("invalid number");
        }
    }
}
Analysis:
10-digit mobile number:
[7-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] (or)
[7-9][0-9]{9}
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo 9989123456
valid number
With the pattern (0|91)?[7-9][0-9]{9}, the same program also accepts an 11-digit number
with a leading 0 and a 12-digit number with a leading 91:
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo 9989123456
valid number
E:\scjp>java RegularExpressionDemo 09989123456
valid number
E:\scjp>java RegularExpressionDemo 919989123456
valid number
E:\scjp>java RegularExpressionDemo 69989123456
invalid number
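The optional prefix can be tried against the four inputs shown in the output above. This is a sketch, assuming the pattern (0|91)?[7-9][0-9]{9} and using Matcher.matches() to require that the whole input matches; the class name MobileNumberCheck is illustrative.

```java
import java.util.regex.Pattern;

class MobileNumberCheck
{
    // Optional prefix 0 or 91, then a 10-digit number starting with 7, 8 or 9.
    private static final Pattern MOBILE = Pattern.compile("(0|91)?[7-9][0-9]{9}");

    static boolean isValidMobile(String s)
    {
        return MOBILE.matcher(s).matches();
    }

    public static void main(String[] args)
    {
        String[] inputs = {"9989123456", "09989123456", "919989123456", "69989123456"};
        for (String s : inputs)
        {
            System.out.println(s + " : " + (isValidMobile(s) ? "valid number" : "invalid number"));
        }
    }
}
```

The first three inputs are valid. The last one is invalid because 6 is neither an allowed prefix nor an allowed first digit.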
Requirement:
Write a regular expression to represent all Mail Ids.
Program:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // A letter, then letters, digits, '-' or '.', then '@', then a domain
        // name followed by one or more ".letters" parts.
        Pattern p=Pattern.compile("[a-zA-Z][a-zA-Z0-9-.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");
        Matcher m=p.matcher(args[0]);
        if(m.find()&&m.group().equals(args[0]))
        {
            System.out.println("valid mail id");
        }
        else
        {
            System.out.println("invalid mail id");
        }
    }
}
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo [email protected]
valid mail id
E:\scjp>java RegularExpressionDemo [email protected]
invalid mail id
E:\scjp>java RegularExpressionDemo [email protected]
invalid mail id
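The mail id expression can be exercised with a few sample addresses. The addresses below are hypothetical and chosen only to hit each part of the pattern; the class name MailIdCheck is also an assumption. As a sketch, it uses Matcher.matches() so that the whole input has to match.

```java
import java.util.regex.Pattern;

class MailIdCheck
{
    // Hyphen placed last in the character class so it is a literal '-'.
    private static final Pattern MAIL =
        Pattern.compile("[a-zA-Z][a-zA-Z0-9.-]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");

    static boolean isValidMailId(String s)
    {
        return MAIL.matcher(s).matches();
    }

    public static void main(String[] args)
    {
        // Hypothetical sample addresses
        String[] inputs = {"user1@example.com", "user.name@mail.example.org",
                           "1user@example.com", "user@example"};
        for (String s : inputs)
        {
            System.out.println(s + " : " + (isValidMailId(s) ? "valid mail id" : "invalid mail id"));
        }
    }
}
```

The first two addresses are valid. "1user@example.com" is rejected because it starts with a digit, and "user@example" because no ".letters" part follows the domain name. This expression is a teaching simplification and rejects many addresses that real mail systems accept.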
Requirement:
Write a program to extract all valid mobile numbers from a file.
Program:
import java.util.regex.*;
import java.io.*;
class RegularExpressionDemo
{
    public static void main(String[] args)throws IOException
    {
        Pattern p=Pattern.compile("(0|91)?[7-9][0-9]{9}");
        // try-with-resources flushes and closes both files, even on error
        try(PrintWriter out=new PrintWriter("output.txt");
            BufferedReader br=new BufferedReader(new FileReader("input.txt")))
        {
            String line=br.readLine();
            while(line!=null)
            {
                Matcher m=p.matcher(line);
                while(m.find())        // a line may contain several numbers
                {
                    out.println(m.group());
                }
                line=br.readLine();
            }
        }
    }
}
Requirement:
Write a program to extract all Mail Ids from a file.
Note: In the above program, replace the mobile number regular expression with the mail id
regular expression.
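Since only the regular expression changes between the two extraction programs, the matching loop can be written once and given the expression as a parameter. The sketch below works on a list of lines held in memory rather than on a file, so it can be tried without input.txt; the class name, the helper extractAll and the sample lines are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractDemo
{
    // Returns every match of regex found in any of the given lines, in order.
    static List<String> extractAll(String regex, List<String> lines)
    {
        Pattern p = Pattern.compile(regex);   // compile once, reuse per line
        List<String> found = new ArrayList<>();
        for (String line : lines)
        {
            Matcher m = p.matcher(line);
            while (m.find())
            {
                found.add(m.group());
            }
        }
        return found;
    }

    public static void main(String[] args)
    {
        // Hypothetical input lines standing in for the contents of a file
        List<String> lines = Arrays.asList(
            "call 9989123456 or 919876543210",
            "write to user1@example.com");
        System.out.println(extractAll("(0|91)?[7-9][0-9]{9}", lines));
        System.out.println(extractAll("[a-zA-Z][a-zA-Z0-9.-]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+", lines));
    }
}
```

The first call returns the two mobile numbers and the second returns the single mail id. Note that find() also matches inside longer digit runs, so a stricter expression may be needed for real data.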
Requirement:
Write a program to display all .txt file names present in E:\scjp folder.
Program:
import java.util.regex.*;
import java.io.*;
class RegularExpressionDemo
{
    public static void main(String[] args)throws IOException
    {
        int count=0;
        // one or more name characters followed by ".txt"
        Pattern p=Pattern.compile("[a-zA-Z0-9-$.]+[.]txt");
        File f=new File("E:\\scjp");
        String[] s=f.list();    // null if the folder cannot be listed
        if(s==null)
        {
            System.out.println("cannot list the folder");
            return;
        }
        for(String s1:s)
        {
            Matcher m=p.matcher(s1);
            if(m.find()&&m.group().equals(s1))
            {
                count++;
                System.out.println(s1);
            }
        }
        System.out.println(count);
    }
}
Output:
input.txt
output.txt
outut.txt
3
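The file name filter can be tried without access to the E:\scjp folder by applying the same expression to an array of names. A minimal sketch, assuming hypothetical file names and an illustrative class TxtFilterDemo:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TxtFilterDemo
{
    // Hyphen placed last in the character class so it is a literal '-'.
    private static final Pattern TXT = Pattern.compile("[a-zA-Z0-9$.-]+[.]txt");

    // Returns the names that consist of allowed characters and end in ".txt".
    static List<String> txtNames(String[] names)
    {
        List<String> result = new ArrayList<>();
        for (String name : names)
        {
            if (TXT.matcher(name).matches())   // the whole name must match
            {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Hypothetical directory listing
        String[] names = {"input.txt", "notes.doc", "output.txt", "a.txt.bak", "outut.txt"};
        List<String> matched = txtNames(names);
        for (String name : matched)
        {
            System.out.println(name);
        }
        System.out.println(matched.size());
    }
}
```

In a real program the array would come from File.list(), as in the program above.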
[k-z][0369][a-zA-Z0-9#$]*