How to validate complete HTML using Java


I want to validate HTML tags and content using Java. The validation should make sure that HTML tags are closed properly and that there are no mistakes in how the tags themselves are written, e.g.

<div id="dividvalue'></div> 

or

<span id\="spanidval" ,></span> 

I need to catch these kinds of errors. While googling, I found this regular expression:

<(\"[^\"]*\"|'[^']*'|[^'\">])*> 

But it won't check whether the HTML tags are closed. How can I add this?

My sample code is attached below. Please help me.

    package com.test;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class HtmlValidator {

        // Matches a single tag such as <div id="x">; it does not check that tags are closed.
        private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

        // Compiled once here; previously the pattern was set in a method that was
        // never called, so validate() failed with a NullPointerException.
        private static final Pattern PATTERN = Pattern.compile(HTML_TAG_PATTERN);

        public static boolean validate(final String tag) {
            Matcher matcher = PATTERN.matcher(tag);
            return matcher.matches();
        }

        public static void main(String[] args) {
            String htmlstr = "<div> <p id=/'bb'>this first paragraph. first paragraph. </p> <span id='spanid'>yes spab</span></div>";

            System.out.println("htmlstr :- " + htmlstr);

            // Print the result instead of discarding it.
            System.out.println("valid :- " + validate(htmlstr));
        }
    }

If you need to parse HTML using pure Java, there are many open-source options available. However, I recommend using the W3C validator instead to validate the syntax, since its definition is more up to date on correct usage. Good luck with your project.
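If an external parser or the W3C service is not an option, a rough stand-in is to run the markup through the XML parser that ships with the JDK. This is only a minimal sketch under some assumptions: the input is XHTML-style markup with a single root element, no DOCTYPE, and no HTML-only constructs such as an unclosed `<br>` or `&nbsp;`. It checks well-formedness (tags closed and nested correctly, attribute quotes matched), not full HTML validity. The class name `WellFormedCheck` and the method `isWellFormed` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original post.
final class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML.
    // Assumption: XHTML-style input with one root element and no DOCTYPE.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors rather than printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Unclosed tag, mismatched quote, illegal attribute name, etc.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for in-memory input with a default parser.
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Correctly nested and quoted markup.
        System.out.println(isWellFormed("<div><p id='bb'>text</p></div>"));     // true
        // The two broken examples from the question.
        System.out.println(isWellFormed("<div id=\"dividvalue'></div>"));       // false
        System.out.println(isWellFormed("<span id\\=\"spanidval\" ,></span>")); // false
        // A tag that is never closed.
        System.out.println(isWellFormed("<div><p>text</div>"));                 // false
    }
}
```

Because real-world HTML is often valid without being well-formed XML, this check will reject some legitimate pages; a dedicated HTML parser or the W3C validator remains the more reliable choice for those.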

