How to validate complete HTML using Java
I want to validate HTML tags and their contents using Java. The validation should make sure that HTML tags are closed properly and that there is no mistake in the tag syntax, e.g.
<div id="dividvalue'></div>
or
<span id\="spanidval" ,></span>
I need to validate these kinds of things. While googling I found this regular expression:
<(\"[^\"]*\"|'[^']*'|[^'\">])*>
But it won't validate whether the HTML tags are closed or not. How can I add this?
My sample code is attached below. Please help me.
package com.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlValidator {

    // Matches a single tag: '<', then quoted strings or non-quote characters, then '>'.
    private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

    // Compiled once at class load so validate() never sees a null pattern.
    private static final Pattern PATTERN = Pattern.compile(HTML_TAG_PATTERN);

    public static boolean validate(final String tag) {
        Matcher matcher = PATTERN.matcher(tag);
        // matches() requires the whole input to be one tag.
        return matcher.matches();
    }

    public static void main(String[] args) {
        String htmlStr = "<div> <p id=/'bb'>this first paragraph. first paragraph. </p> <span id='spanid'>yes spab</span></div>";
        System.out.println("htmlStr :- " + htmlStr);
        System.out.println("valid :- " + validate(htmlStr));
    }
}
If you need to parse HTML using pure Java, there are many open source options available. However, I recommend instead using the W3C validator to check the syntax, since its definition is more up to date on correct usage. Good luck with your project.
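For the pure-Java route, here is a minimal sketch that uses only the XML parser bundled with the JDK. It assumes the markup is XHTML-style (a single root element, every tag closed, every attribute quoted), so it checks well-formedness rather than full HTML validity; ordinary HTML with unclosed tags such as <br> would be rejected. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: treats the input as XML and reports whether it parses.
final class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations so no external DTD is ever fetched.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors, which we catch below.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Unclosed tags, mismatched quotes, illegal attribute names, and so on.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The two malformed examples from the question, plus a correct fragment.
        System.out.println(isWellFormed("<div id=\"dividvalue'></div>"));
        System.out.println(isWellFormed("<span id\\=\"spanidval\" ,></span>"));
        System.out.println(isWellFormed("<div><p id='bb'>text</p><span id='spanid'>yes</span></div>"));
    }
}
```

Unlike the regular expression, the parser tracks nesting, so it reports a tag that is opened but never closed. For real-world HTML that is not XHTML-style, a dedicated HTML parser or the W3C validator is the safer choice.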