python - How to use glob to read limited set of files with numeric names?
I have JSON files named with numbers from 50 to 20000 (e.g. 50.json, 51.json, 52.json ... 19999.json, 20000.json) within the same directory. I want to read only the files numbered from 15000 to 18000.
To do so I'm using glob, as shown below, but it generates an empty list every time I try to filter out the numbers. I've tried my best to follow this link (https://docs.python.org/2/library/glob.html), but I'm not sure what I'm doing wrong.
>>> import glob
>>> directory = "/users/chris/dropbox"
>>> read_files = glob.glob(directory+"/[15000-18000].*")
>>> print read_files
[]
Also, what if I wanted the files with any number greater than 18000?
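You are using the glob syntax incorrectly; the [..] sequence works per character. The following glob would match your files instead (it covers 15000 through 18999, so anything above 18000 still has to be filtered out):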
'1[5-8][0-9][0-9][0-9].*'
Under the covers, glob uses fnmatch, which translates the pattern to a regular expression. Your pattern translates to:
>>> import fnmatch
>>> fnmatch.translate('[15000-18000].*')
'[15000-18000]\\..*\\Z(?ms)'
which matches exactly 1 character before the '.': a 0, 1, 5 or 8. Nothing else.
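To see the per-character behaviour without touching the filesystem, the two patterns can be checked against a few names with fnmatch.fnmatch. This is a small sketch in Python 3; the sample file names are made up for illustration.

```python
import fnmatch

# The pattern from the question: a single character class, then ".", then anything.
bad_pattern = '[15000-18000].*'
# One character class per digit, as suggested above.
good_pattern = '1[5-8][0-9][0-9][0-9].*'

# Hypothetical file names, chosen to show what each pattern accepts.
names = ['1.json', '5.json', '8.json', '0.json', '7.json',
         '15000.json', '18999.json', '14999.json']

bad_matches = [n for n in names if fnmatch.fnmatch(n, bad_pattern)]
good_matches = [n for n in names if fnmatch.fnmatch(n, good_pattern)]

print(bad_matches)   # only single-character names built from 0, 1, 5 or 8
print(good_matches)  # five-digit names from 15000 through 18999
```

The bracketed class is read as the characters 1, 5, 0, the range 0-1, and 8, so a name like 15000.json never matches it.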
glob patterns are quite limited; matching numeric ranges is not easy with them. You'd have to create separate globs for ranges, for example (glob('1[8-9][0-9][0-9][0-9]') + glob('2[0-9][0-9][0-9][0-9]'), etc.).
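As a rough illustration of combining several globs, the sketch below covers the range 15000 to 18000 exactly with two patterns. It is written for Python 3; the helper name glob_union, the pattern list and the temporary demo directory are assumptions made for this example, not part of the original answer.

```python
import glob
import os
import tempfile


def glob_union(directory, patterns):
    """Return the sorted union of glob matches for several patterns."""
    matches = set()
    for pattern in patterns:
        # glob.escape protects any special characters in the directory path itself.
        matches.update(glob.glob(os.path.join(glob.escape(directory), pattern)))
    return sorted(matches)


# 15000-17999 plus 18000 itself covers the range 15000-18000 exactly.
PATTERNS_15000_TO_18000 = ['1[5-7][0-9][0-9][0-9].json', '18000.json']

# Demo against a throwaway directory holding a few empty sample files.
with tempfile.TemporaryDirectory() as demo_dir:
    for number in (50, 14999, 15000, 17999, 18000, 18001, 20000):
        open(os.path.join(demo_dir, '%d.json' % number), 'w').close()
    found = [os.path.basename(path)
             for path in glob_union(demo_dir, PATTERNS_15000_TO_18000)]

print(found)
```

Every boundary that does not fall on a round digit needs another pattern, which is why this approach gets unwieldy quickly.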
Do your own filtering instead:
import os

directory = "/users/chris/dropbox"
for filename in os.listdir(directory):
    basename, ext = os.path.splitext(filename)
    if ext != '.json':
        continue
    try:
        number = int(basename)
    except ValueError:
        continue  # not numeric
    # adjust the bounds to the range you need, e.g. 15000 <= number <= 18000
    if 18000 <= number <= 19000:
        # process file
        filename = os.path.join(directory, filename)
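The same filtering can be wrapped in a function so that both cases from the question (15000 to 18000, and everything above 18000) use one code path. This is only a sketch in Python 3; the function name numbered_files, its parameters and the temporary demo directory are hypothetical and chosen for illustration.

```python
import os
import tempfile


def numbered_files(directory, low=None, high=None, ext='.json'):
    """Return paths of files named <number><ext> with low <= number <= high.

    A bound of None leaves that side of the range open.
    Results are ordered by the numeric value of the file name.
    """
    selected = []
    for filename in os.listdir(directory):
        basename, file_ext = os.path.splitext(filename)
        if file_ext != ext:
            continue
        try:
            number = int(basename)
        except ValueError:
            continue  # not a numeric file name
        if low is not None and number < low:
            continue
        if high is not None and number > high:
            continue
        selected.append((number, os.path.join(directory, filename)))
    return [path for number, path in sorted(selected)]


# Demo against a throwaway directory holding a few empty sample files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('50.json', '15000.json', '16500.json', '18000.json',
                 '18001.json', '20000.json', 'notes.json', '17000.txt'):
        open(os.path.join(demo_dir, name), 'w').close()
    in_range = [os.path.basename(p)
                for p in numbered_files(demo_dir, low=15000, high=18000)]
    above = [os.path.basename(p)
             for p in numbered_files(demo_dir, low=18001)]

print(in_range)
print(above)
```

Because the comparison is done on integers rather than on characters, the bounds can be any numbers and no pattern has to be rewritten when the range changes.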