java - Why does HtmlUnit always show the host page no matter what URL I type in (crawlable GWT app)?


Here is the full code:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.net.URLDecoder;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebClientOptions;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;

    public class CrawlServlet implements Filter {

        // Rebuilds the full request URL, including the query string if present.
        public static String getFullURL(HttpServletRequest request) {
            StringBuffer requestURL = request.getRequestURL();
            String queryString = request.getQueryString();

            if (queryString == null) {
                return requestURL.toString();
            } else {
                return requestURL.append('?').append(queryString).toString();
            }
        }

        @Override
        public void destroy() {
            // TODO Auto-generated method stub
        }

        @Override
        public void doFilter(ServletRequest request, ServletResponse response,
                FilterChain chain) throws IOException, ServletException {

            HttpServletRequest httpRequest = (HttpServletRequest) request;
            String fullURLQueryString = getFullURL(httpRequest);
            System.out.println(fullURLQueryString + " wrong");

            if ((fullURLQueryString != null) && (fullURLQueryString.contains("_escaped_fragment_"))) {
                // remember to unescape any %XX characters
                fullURLQueryString = URLDecoder.decode(fullURLQueryString, "UTF-8");
                // rewrite the URL back to the original #! version
                String url_with_hash_fragment = fullURLQueryString.replace("?_escaped_fragment_=", "#!");

                final WebClient webClient = new WebClient();
                WebClientOptions options = webClient.getOptions();
                options.setCssEnabled(false);
                options.setThrowExceptionOnScriptError(false);
                options.setThrowExceptionOnFailingStatusCode(false);
                options.setJavaScriptEnabled(false);
                HtmlPage page = webClient.getPage(url_with_hash_fragment);

                // important! Give the headless browser enough time to execute JavaScript.
                // The exact time to wait may depend on your application.
                webClient.waitForBackgroundJavaScript(20000);

                // return the snapshot
                //String originalHtml = page.getWebResponse().getContentAsString();
                //System.out.println(originalHtml + " +++++++++");
                System.out.println(page.asXml() + " +++++++++");

                PrintWriter out = response.getWriter();
                out.println(page.asXml());
                //out.println(originalHtml);
            } else {
                try {
                    // not an _escaped_fragment_ URL, so move up the chain of servlet (filters)
                    chain.doFilter(request, response);
                } catch (ServletException e) {
                    System.err.println("Servlet exception caught: " + e);
                    e.printStackTrace();
                }
            }
        }

        @Override
        public void init(FilterConfig arg0) throws ServletException {
            // TODO Auto-generated method stub
        }
    }

After I opened the URL "http://127.0.0.1:8888/myproject.html?gwt.codesvr=127.0.0.1:9997?_escaped_fragment_=article", it showed only the host page HTML, like this:

    <html>
      <head>
        <meta name="fragment" content="!">
        <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
        <!-- -->
        <!-- Consider inlining CSS to reduce the number of requested files -->
        <!-- -->
        <link type="text/css" rel="stylesheet" href="myproject.css"/>
        <!-- -->
        <!-- Any title is fine -->
        <!-- -->
        <title>myproject</title>
        <!-- -->
        <!-- This script loads your compiled module. -->
        <!-- If you add any GWT meta tags, they must -->
        <!-- be added before this line. -->
        <!-- -->
        <script type="text/javascript" language="javascript" ></script>
        <!-- -->
        <!-- The body can have arbitrary html, or -->
        <!-- you can leave the body empty if you want -->
        <!-- to create a completely dynamic UI. -->
        <!-- -->
      </head>
      <body>
        <div id="loading"> loading <br/> <img src="../images/loading.gif"/> </div>
        <!-- OPTIONAL: include this if you want history support -->
        <iframe src="javascript:''" id="__gwt_historyframe" tabindex="-1" style="position: absolute; width: 0;height: 0; border:0;"></iframe>
        <!-- RECOMMENDED if your web app will not function without JavaScript enabled -->
        <noscript>
          <div style="width: 22em; position: absolute; left: 50%; margin-left: -11em; color: red; background-color: white; border: 1px solid red; padding: 4px; font-family: sans-serif;">
            Your web browser must have JavaScript enabled in order for this application to display correctly.
          </div>
        </noscript>
      </body>
    </html>

On the other hand, "http://127.0.0.1:8888/myproject.html?gwt.codesvr=127.0.0.1:9997#!article" works fine and shows the article without any problem.

I also compiled the whole project and ran it under Tomcat 7, and I have the same problem: it only shows the HTML of the host page.

Note: the article page is a nested presenter embedded inside the header presenter. I don't think that is the main reason, though, because it didn't show the header page either.

First, instead of ?_escaped_fragment_=article, perhaps try &_escaped_fragment_=article. You already have a ? in front of gwt.codesvr, and two ? characters in one URL may mess up the URL parameter parsing.
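To see why the second ? matters, here is a minimal sketch of query-string splitting. The class QueryParamDemo and its parseQuery helper are made up for illustration; they only approximate how a servlet container separates parameters (split on &, then on the first =, no percent-decoding).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryParamDemo {

    // Hypothetical helper: splits a raw query string into name/value pairs
    // on '&' and on the first '=' of each pair. No percent-decoding is done.
    public static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // Query strings taken from the two URL variants discussed above.
        String twoQuestionMarks = "gwt.codesvr=127.0.0.1:9997?_escaped_fragment_=article";
        String withAmpersand = "gwt.codesvr=127.0.0.1:9997&_escaped_fragment_=article";

        System.out.println(parseQuery(twoQuestionMarks));
        System.out.println(parseQuery(withAmpersand));
    }
}
```

With the second ?, the whole tail ends up inside the value of gwt.codesvr and no parameter named _escaped_fragment_ exists. With &, the two parameters are separate.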

Second, you need to make sure the filter handles the case of having the gwt.codesvr parameter. It looks like your filter assumes _escaped_fragment_ is the first parameter, i.e., that it starts with ?. I believe the example here works either way.
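One way to make the rewrite independent of parameter order is sketched below. This is not the code from the linked example; EscapedFragmentUrls and toHashBangUrl are hypothetical names. It assumes the incoming URL uses & between parameters, keeps every other parameter (such as gwt.codesvr) in the query string, and moves the _escaped_fragment_ value behind #!.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EscapedFragmentUrls {

    private static final String PARAM = "_escaped_fragment_=";

    // Hypothetical helper: converts ...?a=b&_escaped_fragment_=x into ...?a=b#!x.
    // Returns the URL unchanged when it carries no _escaped_fragment_ parameter.
    public static String toHashBangUrl(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url;
        }
        String base = url.substring(0, q);
        String query = url.substring(q + 1);

        StringBuilder kept = new StringBuilder();
        String fragment = null;
        for (String pair : query.split("&")) {
            if (pair.startsWith(PARAM)) {
                // Unescape %XX characters in the fragment value only.
                fragment = decode(pair.substring(PARAM.length()));
            } else if (!pair.isEmpty()) {
                if (kept.length() > 0) {
                    kept.append('&');
                }
                kept.append(pair);
            }
        }
        if (fragment == null) {
            return url;
        }
        StringBuilder out = new StringBuilder(base);
        if (kept.length() > 0) {
            out.append('?').append(kept);
        }
        return out.append("#!").append(fragment).toString();
    }

    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

In the filter, the result of toHashBangUrl(getFullURL(httpRequest)) would take the place of url_with_hash_fragment. Decoding only the fragment value, rather than the whole URL, leaves the other parameters exactly as the browser sent them.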

