Python regex invalid syntax


I am testing code from the current 2600 magazine: a wordlist generator based on a bunch of Google searches. The line with the invalid syntax is:

    results.extend(re.findall("<a href="/%201d([^/%201d]*)/%201d">class=(?:1|s)",data.read())) 

I am new to regex. I did some research on the basics of re, and it seemed easy, but I still didn't understand /%201d. I searched for it and found that it is the hex of a character code. I am still stuck on making it work. Here is the rest of the code; the line I'm having a problem with is line 36.

This is the function:

    import re, sys, os, urllib

    ### custom useragent ###
    class AppURLopener(urllib.FancyURLopener):
        version = "mozilla/5.0(compatable;msie 9.0; windows nt 6.1; trident/5.0)"

    urllib._urlopener = AppURLopener()
    uopen   = urllib.urlopen
    uencode = urllib.urlencode

    def google(query, numget=10, verbose=0):
        numget = int(numget)
        start = 0
        results = []

        if verbose == 2:
            print("[+]getting " + str(numget) + " results")

        while len(results) < numget:
            print("[+]" + str(len(results)) + " so far...")
            data = uopen("https://www.google.com/search?q="+query+"&star="+str(start))

            if data.code != 200:
                print("error " + str(data.code))
                break

            results.extend(re.findall("<a href="/%201d([^/%201d]*)/%201d">class=(?:1|s)",data.read()))
            print(data.read())
            start += 10

        if verbose == 2:
            print("[+] got " + str(numget) + " results")

        return results[:numget]

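First, you need to escape the " characters inside the string literal, starting with the one in <a href=". Otherwise the first inner " ends the string early and Python reports invalid syntax: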

"<a href=\"/%201d([^/%201d]*)/%201d\">class=(?:1|s)" 

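As a rough sketch of the point above (Python 3; the sample HTML string is made up for illustration and is not real Google output), the pattern can be written either with escaped inner quotes or with single quotes as the delimiter. The unescaped original is rejected by the parser before any regex code runs.

```python
import re

# Option 1: escape the inner double quotes.
escaped = "<a href=\"/%201d([^/%201d]*)/%201d\">class=(?:1|s)"

# Option 2: delimit the literal with single quotes, so no escaping is needed.
single_quoted = '<a href="/%201d([^/%201d]*)/%201d">class=(?:1|s)'

# Both literals produce the same pattern string.
same_pattern = escaped == single_quoted

# The original, unescaped line fails at parse time with a SyntaxError.
broken_source = 'p = "<a href="/%201d([^/%201d]*)/%201d">class=(?:1|s)"'
try:
    compile(broken_source, "<example>", "exec")
    broken_compiles = True
except SyntaxError:
    broken_compiles = False

# Hypothetical input, only to show that the fixed pattern is a valid regex.
sample_html = '<a href="/%201dfoo/%201d">class=1'
found = re.findall(escaped, sample_html)
```

Single quotes are usually the more readable choice when the pattern itself contains many double quotes.
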
Second, %20 encodes a single space in URLs, so %201d corresponds to " 1d" (a space followed by the characters 1d).
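
The decoding can be checked with the standard library. This is a minimal sketch in Python 3, where `unquote` lives in `urllib.parse`; the test strings are invented for illustration. It also shows that `%` has no special meaning in a regex, so the pattern only matches the literal text %201d and not the decoded form.

```python
import re
from urllib.parse import unquote

# Percent-decoding: %20 becomes a space, and the trailing "1d" is left as is.
decoded = unquote("%201d")

# In a regex, % is an ordinary character, so "%201d" matches only the
# literal characters %201d, not a space followed by "1d".
matches_literal = re.search("%201d", "/%201d") is not None
matches_decoded = re.search("%201d", "/ 1d") is not None
```

Here `decoded` is the three-character string " 1d", `matches_literal` is true, and `matches_decoded` is false.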

