Pattern matching in Java

Pattern matchers can be used with String, StringBuffer, StringBuilder, and any other CharSequence.

A Matcher can find every occurrence of a pattern in a String, not just the first one.
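As a rough illustration of the point above, the sketch below runs the same search over a String, a StringBuffer and a StringBuilder. The class name CharSequenceDemo and the helper countOccurrences are made up for this example; Pattern.compile, Pattern.matcher (which accepts any CharSequence) and Matcher.find are the standard java.util.regex calls. The find() loop itself is explained in the steps later in this article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharSequenceDemo {

    // Hypothetical helper: number of occurrences of regex in any CharSequence.
    static int countOccurrences(CharSequence input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "<h2>One</h2><h2>Two</h2>";

        // The same helper works for all three types.
        System.out.println(countOccurrences(text, "<h2>"));
        System.out.println(countOccurrences(new StringBuffer(text), "<h2>"));
        System.out.println(countOccurrences(new StringBuilder(text), "<h2>"));
    }
}
```

Each call should report the same count, since the matcher only sees the characters and not the concrete type.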

Let us create a StringBuilder and then use a pattern matcher to find multiple occurrences of h2 in it:

  StringBuilder htmlText = new StringBuilder("<h1>My heading </h1>");
  htmlText.append("<h2> Sub-heading</h2>");
  htmlText.append("<p>This is a paragraph of text </p>");
  htmlText.append("<p>This is second paragraph of text</p>");
  htmlText.append("<h2>Summary</h2>");
  htmlText.append("<p>This is the summary</p>");

Just like String.matches, matcher.matches() has to match the entire String. So to match the whole of htmlText, the regular expression must also allow anything before and after h2:

   // String h2Pattern = ".*<h2>.*";
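The difference between a whole-input pattern and a partial one can be checked with a minimal sketch like the one below. The class name WholeMatchDemo and the helper matchesWhole are invented for illustration. Note that "." does not match line terminators by default, so this sketch assumes the input contains no line breaks, as is the case for htmlText.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeMatchDemo {

    // Hypothetical helper: true only if the regex matches the ENTIRE input.
    static boolean matchesWhole(CharSequence input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.matches();
    }

    public static void main(String[] args) {
        StringBuilder htmlText = new StringBuilder("<h1>My heading </h1>");
        htmlText.append("<h2> Sub-heading</h2>");
        htmlText.append("<p>This is a paragraph of text </p>");

        // ".*" on both sides lets the pattern cover the whole input.
        System.out.println(matchesWhole(htmlText, ".*<h2>.*")); // true

        // "<h2>" alone is only a small part of the input.
        System.out.println(matchesWhole(htmlText, "<h2>"));     // false
    }
}
```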

But if we only want to find occurrences of a certain pattern in a String, we can use that pattern on its own, because we are not matching the entire String. So to find the occurrences of h2 we can use the regular expression below:

  String h2Pattern = "<h2>";

Below are the steps to find occurrences of a regular expression using the Pattern and Matcher classes from java.util.regex:


  /* Create an instance of the Pattern class */
  Pattern pattern = Pattern.compile(h2Pattern);

  /* Or, if we want the pattern to be case insensitive,
     that is, to look for both upper and lower case characters */
  // Pattern pattern1 = Pattern.compile(h2Pattern, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

  /* Create an instance of Matcher */
  Matcher matcher = pattern.matcher(htmlText);

  /* Call the matches method to look for the pattern */
  // This will print false since we are not matching the entire String
  System.out.println("The result is " + matcher.matches());

  /* Please note that we have to reset the matcher once we have used it,
     i.e. once we have called matcher.matches() as above. Calling
     matcher.reset() makes the next search start again from the
     beginning of the String */
  matcher.reset();
  int counter = 0;
  while (matcher.find()) {
      counter++;
      System.out.println("Occurrence " + counter + " : " + matcher.start() + " to " + matcher.end());
  }
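Put together, the steps above can be sketched as one small runnable class. The names H2OccurrenceFinder, sampleHtml and findStarts are illustrative only and not part of any library; the sample text is the htmlText built earlier. The flags parameter is an assumption added here so the case-insensitive variant can be tried with the same helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class H2OccurrenceFinder {

    // Builds the same sample text used earlier in this article.
    static StringBuilder sampleHtml() {
        StringBuilder htmlText = new StringBuilder("<h1>My heading </h1>");
        htmlText.append("<h2> Sub-heading</h2>");
        htmlText.append("<p>This is a paragraph of text </p>");
        htmlText.append("<p>This is second paragraph of text</p>");
        htmlText.append("<h2>Summary</h2>");
        htmlText.append("<p>This is the summary</p>");
        return htmlText;
    }

    // Hypothetical helper: start index of every occurrence of the regex.
    // Pass 0 for no flags, or e.g. Pattern.CASE_INSENSITIVE.
    static List<Integer> findStarts(CharSequence input, String regex, int flags) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        StringBuilder htmlText = sampleHtml();
        Matcher matcher = Pattern.compile("<h2>").matcher(htmlText);

        // "<h2>" does not cover the whole text, so this prints false.
        System.out.println("The result is " + matcher.matches());

        // Start searching again from the beginning of the text.
        matcher.reset();
        int counter = 0;
        while (matcher.find()) {
            counter++;
            System.out.println("Occurrence " + counter + " : "
                    + matcher.start() + " to " + matcher.end());
        }

        // An upper-case pattern only finds the tags when case is ignored.
        System.out.println(findStarts(htmlText, "<H2>", 0));
        System.out.println(findStarts(htmlText, "<H2>", Pattern.CASE_INSENSITIVE));
    }
}
```

Returning the start indexes as a list, rather than printing inside the loop, is just one possible design; it makes the result easy to reuse or check.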

Important things to consider

If we use the regular expression

  String h2Pattern = ".*<h2>.*";

matcher.matches() will return true, because the pattern matches the entire String. However, matcher.start() and matcher.end() will then give the start and end of the entire String and not just of the <h2> tag, because ".*<h2>.*" represents the entire String and not just the h2 tag. If we want to print only the indexes where <h2> occurs, we have to use the regular expression below:

  String h2Pattern = "<h2>";