Regular Expression-1
Agenda
1. Introduction.
2. Main application areas of Regular Expressions
3. Pattern class
4. Matcher class
5. Important methods of Matcher class
6. Character classes
7. Predefined character classes
8. Quantifiers
9. Pattern class split() method
10. String class split() method
11. StringTokenizer
12. Requirements:
Write a regular expression to represent all valid identifiers in the Java language
Write a regular expression to represent all mobile numbers
Write a regular expression to represent all mail IDs
Write a program to extract all valid mobile numbers from a file
Write a program to extract all mail IDs from a file
Write a program to display all .txt file names present in a specific folder (E:\scjp)
Introduction
A Regular Expression is an expression that represents a group of strings according to a particular
pattern.
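Example: The regular expression [0-9]+ represents every string made up of one or more digits, such as "7", "42" and "2024".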
Pattern class:
A Pattern object represents "compiled version of Regular Expression".
We can create a Pattern object by using compile() method of Pattern class.
Note: Refer to the API documentation for more information about the Pattern class.
Matcher:
A Matcher object can be used to match character sequences against a Regular Expression.
We can create a Matcher object by using matcher() method of Pattern class.
public Matcher matcher(CharSequence input);
Matcher m=p.matcher("abbbabbaba");
Important methods of Matcher class:
1. boolean find();
It attempts to find the next match and returns true if one is available; otherwise it returns false.
2. int start();
Returns the start index of the match.
3. int end();
Returns the offset after the last character matched, that is, the index of the last matched character plus one.
4. String group();
Returns the substring that was matched.
Note: The Pattern and Matcher classes are available in the java.util.regex package and were introduced in
Java 1.4.
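The four Matcher methods can be seen working together in a minimal sketch. The pattern "ab" is an assumption chosen for illustration; the target string is the one passed to matcher() earlier. The class name MatcherMethodsSketch and the helper describeMatches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsSketch
{
    // Collects "start---end---group" for every match of regex in target.
    static List<String> describeMatches(String regex, String target)
    {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(target);
        List<String> result = new ArrayList<>();
        while (m.find())
        {
            // end() is the index just after the last matched character
            result.add(m.start() + "---" + m.end() + "---" + m.group());
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> matches = describeMatches("ab", "abbbabbaba");
        for (String s : matches)
        {
            System.out.println(s);
        }
        System.out.println("The number of occurrences: " + matches.size());
    }
}
```

Each call to find() resumes after the previous match, so the loop visits the matches from left to right without overlap.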
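Character classes:
[abc]------------------either a or b or c
[^abc]-----------------any character except a, b and c
[a-z]------------------any lower case alphabet symbol
[A-Z]------------------any upper case alphabet symbol
[a-zA-Z]---------------any alphabet symbol
[0-9]------------------any digit from 0 to 9
[a-zA-Z0-9]------------any alphanumeric character
[^a-zA-Z0-9]-----------any character except an alphanumeric character (special character)
Predefined character classes:
\s---------------------whitespace character
\d---------------------any digit from 0 to 9 [0-9]
\w---------------------any word character [a-zA-Z0-9_]
. ---------------------any character, including special characters (line terminators are excluded by default)
\S---------------------any character except whitespace
\D---------------------any character except a digit
\W---------------------any character except a word character (special character)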
Example:
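import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // "x" stands for the pattern under test: substitute a character class
        // such as [abc], [^abc], [a-z] or \\d
        Pattern p=Pattern.compile("x");
        Matcher m=p.matcher("a1b7 @z#");
        while(m.find())
        {
            // print the start index and the matched text of each match
            System.out.println(m.start()+"-------"+m.group());
        }
    }
}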
Output:
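As a minimal sketch of how different character classes behave on the target string "a1b7 @z#", the following program reports the start index of every match for several patterns. The class name CharacterClassSketch, the helper matchStarts and the particular list of patterns are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClassSketch
{
    // Returns the start indices of all matches of regex in target, separated by commas.
    static String matchStarts(String regex, String target)
    {
        Matcher m = Pattern.compile(regex).matcher(target);
        StringBuilder sb = new StringBuilder();
        while (m.find())
        {
            if (sb.length() > 0)
            {
                sb.append(',');
            }
            sb.append(m.start());
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        String target = "a1b7 @z#";
        String[] regexes = { "[abc]", "[^abc]", "[a-z]", "[0-9]", "\\s", "\\d", "\\w", "." };
        for (String regex : regexes)
        {
            System.out.println(regex + " matches at: " + matchStarts(regex, target));
        }
    }
}
```

In Java source the backslash itself has to be escaped, which is why \d is written as "\\d" inside the string literal.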
Quantifiers:
Example:
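import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // "x" stands for the pattern under test: substitute a quantified
        // pattern such as a, a+, a* or a?
        Pattern p=Pattern.compile("x");
        Matcher m=p.matcher("abaabaaab");
        while(m.find())
        {
            // print the start index and the matched text of each match
            System.out.println(m.start()+"-------"+m.group());
        }
    }
}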
Output:
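Quantifiers specify how many times the preceding element may occur: a matches exactly one a, a+ at least one, a* any number including zero, and a? at most one. The sketch below applies these to the target "abaabaaab"; the class name QuantifierSketch and the helper matches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierSketch
{
    // Collects "start-group" for every match of regex in target.
    // A zero-length match shows up as an entry with nothing after the hyphen.
    static List<String> matches(String regex, String target)
    {
        Matcher m = Pattern.compile(regex).matcher(target);
        List<String> result = new ArrayList<>();
        while (m.find())
        {
            result.add(m.start() + "-" + m.group());
        }
        return result;
    }

    public static void main(String[] args)
    {
        String target = "abaabaaab";
        for (String regex : new String[] { "a", "a+", "a*", "a?" })
        {
            System.out.println(regex + " : " + matches(regex, target));
        }
    }
}
```

Because a* and a? can match the empty string, they also report zero-length matches at positions where no a is present, including the position at the very end of the target.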
Pattern class split() method:
Pattern class contains split() method to split the given string against a regular expression.
Example 1:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        Pattern p=Pattern.compile("\\s");
        String[] s=p.split("ashok software solutions");
        for(String s1:s)
        {
            System.out.println(s1);//ashok
                                   //software
                                   //solutions
        }
    }
}
Example 2:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        Pattern p=Pattern.compile("\\."); //(or)[.]
        String[] s=p.split("www.dugrajobs.com");
        for(String s1:s)
        {
            System.out.println(s1);//www
                                   //dugrajobs
                                   //com
        }
    }
}
String class also contains split() method to split the given string against a regular expression.
Example:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        String s="www.saijobs.com";
        String[] s1=s.split("\\.");
        for(String s2:s1)
        {
            System.out.println(s2);//www
                                   //saijobs
                                   //com
        }
    }
}
Note: The String class split() method takes the regular expression as its argument, whereas the Pattern class
split() method takes the target string as its argument.
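The difference in argument order can be made concrete with a small sketch that performs the same split both ways. The class name SplitComparisonSketch and the two helper methods are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitComparisonSketch
{
    // Pattern.split: the regular expression is compiled first, the target is the argument.
    static String[] splitWithPattern(String regex, String target)
    {
        return Pattern.compile(regex).split(target);
    }

    // String.split: the target is the receiver, the regular expression is the argument.
    static String[] splitWithString(String regex, String target)
    {
        return target.split(regex);
    }

    public static void main(String[] args)
    {
        System.out.println(Arrays.toString(splitWithPattern("\\.", "www.saijobs.com")));
        System.out.println(Arrays.toString(splitWithString("\\.", "www.saijobs.com")));
    }
}
```

Both calls produce the same array; compiling the Pattern once is mainly worthwhile when the same regular expression is applied to many target strings.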
StringTokenizer:
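StringTokenizer is a class in the java.util package that breaks a string into tokens. By default it splits on whitespace; a second constructor argument supplies a set of delimiter characters, which are treated literally and not as a regular expression. The following is a minimal sketch that reuses target strings from the split() examples; the class name StringTokenizerSketch and its helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class StringTokenizerSketch
{
    // Splits on the default delimiters (space, tab, newline, carriage return, form feed).
    static List<String> tokens(String target)
    {
        return collect(new StringTokenizer(target));
    }

    // Splits on any character contained in delimiters; the characters are literal, not a regex.
    static List<String> tokens(String target, String delimiters)
    {
        return collect(new StringTokenizer(target, delimiters));
    }

    private static List<String> collect(StringTokenizer st)
    {
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens())
        {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args)
    {
        System.out.println(tokens("ashok software solutions"));
        System.out.println(tokens("www.saijobs.com", "."));
    }
}
```

Note that the delimiter "." needs no escaping here, unlike in split(), because StringTokenizer does not interpret it as a regular expression.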
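Requirement:
Write a regular expression to represent all valid identifiers, where a valid identifier is defined by the following rules.
Rules:
1. The allowed characters are a to z, A to Z, 0 to 9, - and #.
2. The first character must be an alphabet symbol.
3. The length of the identifier must be at least 2.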
Program:
import java.util.regex.*;
class RegularExpressionDemo
{
    public static void main(String[] args)
    {
        // equivalent alternative: "[a-zA-Z][a-zA-Z0-9-#][a-zA-Z0-9-#]*"
        Pattern p=Pattern.compile("[a-zA-Z][a-zA-Z0-9-#]+");
        Matcher m=p.matcher(args[0]);
        // valid only if the first match covers the whole argument
        if(m.find()&&m.group().equals(args[0]))
        {
            System.out.println("valid identifier");
        }
        else
        {
            System.out.println("invalid identifier");
        }
    }
}
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo ashok
valid identifier
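Matcher.matches() succeeds only when the entire input matches the pattern, so it can stand in for the combination of find() and an equals() comparison. The sketch below applies it to the identifier rules; the class name IdentifierSketch and the method isValidIdentifier are hypothetical, and the hyphen is placed last inside the character class as an assumption of style, so that it cannot be read as part of a range.

```java
import java.util.regex.Pattern;

class IdentifierSketch
{
    // First character a letter, then at least one more allowed character (letter, digit, # or -).
    private static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9#-]+");

    static boolean isValidIdentifier(String candidate)
    {
        // matches() requires the whole candidate to match, not just a part of it
        return IDENTIFIER.matcher(candidate).matches();
    }

    public static void main(String[] args)
    {
        for (String candidate : new String[] { "ashok", "a", "9ab", "a-#", "ab_c" })
        {
            System.out.println(candidate + " : " + (isValidIdentifier(candidate) ? "valid identifier" : "invalid identifier"));
        }
    }
}
```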
Requirement:
Write a regular expression to represent all mobile numbers.
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo 9989123456
Valid number
Output:
E:\scjp>javac RegularExpressionDemo.java
E:\scjp>java RegularExpressionDemo 9989123456
Valid number
E:\scjp>java RegularExpressionDemo 09989123456
Valid number
E:\scjp>java RegularExpressionDemo 919989123456
Valid number
E:\scjp>java RegularExpressionDemo 69989123456
Invalid number
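The sample runs are consistent with a pattern that accepts ten digits starting with 7, 8 or 9, optionally preceded by 0 or 91. The following minimal sketch validates a number with the regular expression (0|91)?[7-9][0-9]{9}, the same expression used in the file-extraction program later in these notes. The class name MobileNumberSketch and the method isValidNumber are hypothetical.

```java
import java.util.regex.Pattern;

class MobileNumberSketch
{
    // Optional prefix 0 or 91, then a digit from 7 to 9, then exactly nine more digits.
    private static final Pattern MOBILE = Pattern.compile("(0|91)?[7-9][0-9]{9}");

    static boolean isValidNumber(String candidate)
    {
        // the whole candidate must match, so extra leading or trailing digits are rejected
        return MOBILE.matcher(candidate).matches();
    }

    public static void main(String[] args)
    {
        String[] candidates = { "9989123456", "09989123456", "919989123456", "69989123456" };
        for (String candidate : candidates)
        {
            System.out.println(candidate + " : " + (isValidNumber(candidate) ? "Valid number" : "Invalid number"));
        }
    }
}
```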
Requirement:
Write a program to extract all valid mobile numbers from a file.
Program:
import java.util.regex.*;
import java.io.*;
class RegularExpressionDemo
{
    public static void main(String[] args)throws IOException
    {
        // optional 0 or 91 prefix, then ten digits starting with 7, 8 or 9
        Pattern p=Pattern.compile("(0|91)?[7-9][0-9]{9}");
        // try-with-resources flushes and closes both files automatically
        try(PrintWriter out=new PrintWriter("output.txt");
            BufferedReader br=new BufferedReader(new FileReader("input.txt")))
        {
            String line=br.readLine();
            while(line!=null)
            {
                Matcher m=p.matcher(line);
                // write every number found on the current line
                while(m.find())
                {
                    out.println(m.group());
                }
                line=br.readLine();
            }
        }
    }
}
Requirement:
Write a program to extract all mail IDs from a file.
Note: In the above program, replace the mobile number regular expression with the mail ID regular
expression.
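A minimal sketch of the extraction step is given below. The mail ID regular expression in it is an assumption: a simplified pattern chosen for illustration, not a complete description of valid mail addresses. The class name MailIdExtractorSketch, the method extract and the sample addresses are hypothetical, and the lines are passed in as a list to keep the sketch independent of any file; in the file-based program they would come from readLine().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MailIdExtractorSketch
{
    // Assumed, simplified mail ID pattern: local part, '@', domain name, one or more ".suffix" parts.
    private static final Pattern MAIL_ID =
        Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.-]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");

    // Returns every mail ID found in the given lines, in order of appearance.
    static List<String> extract(List<String> lines)
    {
        List<String> result = new ArrayList<>();
        for (String line : lines)
        {
            Matcher m = MAIL_ID.matcher(line);
            while (m.find())
            {
                result.add(m.group());
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        // hypothetical sample input
        List<String> lines = Arrays.asList(
            "contact: ashok@example.com, sales@example.co.in",
            "no mail here");
        for (String mailId : extract(lines))
        {
            System.out.println(mailId);
        }
    }
}
```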
Requirement:
Write a program to display all .txt file names present in E:\scjp folder.
Program:
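import java.util.regex.*;
import java.io.*;
class RegularExpressionDemo
{
    public static void main(String[] args)throws IOException
    {
        int count=0;
        // one or more allowed name characters followed by the literal suffix .txt
        Pattern p=Pattern.compile("[a-zA-Z0-9-$.]+[.]txt");
        File f=new File("E:\\scjp");
        // names of all files and folders directly inside E:\scjp
        String[] s=f.list();
        for(String s1:s)
        {
            Matcher m=p.matcher(s1);
            // accept the name only if the match covers it completely
            if(m.find()&&m.group().equals(s1))
            {
                count++;
                System.out.println(s1);
            }
        }
        // total number of .txt files found
        System.out.println(count);
    }
}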
Output:
input.txt
output.txt
outut.txt
3
The first character should be a lower case alphabet symbol from k to z, and the second character should be a
digit divisible by 3.
[k-z][0369][a-zA-Z0-9#$]*
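This rule and its regular expression can be checked with a short sketch. The class name RestrictedIdentifierSketch, the method isValid and the sample inputs are hypothetical; the regular expression is the one given for the rule.

```java
import java.util.regex.Pattern;

class RestrictedIdentifierSketch
{
    // First character k to z, second character one of 0, 3, 6, 9,
    // then any number of letters, digits, # or $.
    private static final Pattern RULE = Pattern.compile("[k-z][0369][a-zA-Z0-9#$]*");

    static boolean isValid(String candidate)
    {
        return RULE.matcher(candidate).matches();
    }

    public static void main(String[] args)
    {
        for (String candidate : new String[] { "k3", "z9abc#", "m0$", "a3", "k1", "k" })
        {
            System.out.println(candidate + " : " + (isValid(candidate) ? "valid" : "invalid"));
        }
    }
}
```

Since the trailing part uses the * quantifier, a two-character string such as k3 already satisfies the rule.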