python - Reading mixed type binary data with Numpy (characters, floats and integers)
I am trying to read binary files with mixed types (of varying data structures) into NumPy arrays.
The data is organized in a *.dat file and a *.dict file (plain text that contains the data dictionary). An example of a data dictionary would be the following:
"name" "s" "50"
"project id" "i" "4"
"amount" "f" "8"
My idea is to have a class that I instantiate and then load the data by calling:
f = data_bin()
f.load("profit.bin")
This code works flawlessly whenever I have a mix of integers and floats, but as soon as I throw a string field in the middle, it throws this error:
"TypeError: float argument required, not numpy.string_"
The class I wrote is below.
As a side note, I need the data in NumPy (for performance and for compatibility with existing code), but I can live with going to Python lists first and from there to NumPy.
I would appreciate any help with this!
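import numpy as np


class data_bin:
    def __init__(self):
        self.datafile = "no file loaded yet"
        self.dictfile = "no file loaded yet"
        self.dictionary = None
        self.data = None

    def load(self, file):
        self.datafile = file
        # The dictionary file shares the data file's base name, with a "dict" extension.
        self.dictfile = file[0:len(file) - 3] + "dict"
        self.builds_dt()
        self.loads_data()

    def builds_dt(self):
        # Build a list of (field name, type code + length) pairs to use as the dtype.
        dt = []
        with open(self.dictfile, 'r') as w:
            # Skip the first two lines of the dictionary file.
            w.readline()
            w.readline()
            q = w.readline()
            while len(q) > 0:
                a = q.rstrip().split(',')
                field_name = a[0]
                field_type = a[1]
                field_length = a[2]
                dt.append((field_name, field_type + field_length))
                q = w.readline()
        self.dictionary = dt

    def loads_data(self):
        # Read the whole binary file as records of the dtype built above.
        with open(self.datafile, 'rb') as f:
            self.data = np.fromfile(f, dtype=self.dictionary)

    def info(self):
        print("binary source: %s" % self.datafile)
        print(" data types: %s" % self.dictionary)
        try:
            print(" number of records: %s" % self.data.shape[0])
        except AttributeError:
            # self.data is still None, so nothing has been loaded.
            print(" no valid data loaded")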
If you want to store different types of data in one NumPy array, you can. For example:
myarr = np.array([1, 2.5, 'hello', {'a':7}], dtype='O')
--> array([1, 2.5, 'hello', {'a': 7}], dtype=object)
This creates a NumPy array of objects. Since everything in Python is an object, this works. Unfortunately, you lose a lot of the reason for having a NumPy array in the first place. If you need to perform calculations on the data, I suggest separating the values by data type and working from there (e.g., parse based on the 2nd column, or use np.where combined with np.take, a recarray, or np.loadtxt). Otherwise, I suggest sticking with Python lists or similar.
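As one way to picture the "separate by data type" suggestion, here is a minimal sketch of a structured dtype. It assumes the three fields from the example dictionary (a 50-byte string, a 4-byte integer and an 8-byte float). The sample records and the file name sample.dat are made up for illustration and are not from the original data. Note that NumPy's type code for a fixed-length byte string is an uppercase S.

```python
import os
import tempfile

import numpy as np

# Field layout assumed from the example dictionary: a 50-byte string,
# a 4-byte integer and an 8-byte float per record.
record_dtype = np.dtype([
    ("name", "S50"),
    ("project id", "i4"),
    ("amount", "f8"),
])

# Two illustrative records (made-up values), written out as raw binary.
records = np.array(
    [(b"alpha", 1, 10.5), (b"beta", 2, 20.25)],
    dtype=record_dtype,
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.dat")  # hypothetical file name
    records.tofile(path)
    # Read the records back using the same structured dtype.
    loaded = np.fromfile(path, dtype=record_dtype)

# Each field can be pulled out as its own homogeneous column.
names = loaded["name"]
total_amount = loaded["amount"].sum()
print(record_dtype.itemsize, names, total_amount)
```

With this layout each field keeps a native type, so numeric columns such as loaded["amount"] can be used in vectorized calculations while the string column stays alongside them in the same record.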
Having said that, a number of functions still work on object arrays:
e.g.,
myarr = np.append(myarr, ('what?', 5.2))
--> array([1, 2.5, 'hello', {'a': 7}, 'what?', '5.2'], dtype=object)
myarr.reshape(2,3)
--> array([[1, 2.5, 'hello'], [{'a': 7}, 'what?', '5.2']], dtype=object)
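One detail in the output above: the float 5.2 ends up as the string '5.2'. The sketch below shows why that happens and one possible way around it, assuming you want the appended values to keep their Python types. The variable names coerced, extra and preserved are only illustrative.

```python
import numpy as np

myarr = np.array([1, 2.5, "hello", {"a": 7}], dtype=object)

# np.append converts the tuple to an array before joining; a str mixed
# with a float becomes a string array, so 5.2 is stored as text.
coerced = np.append(myarr, ("what?", 5.2))

# Wrapping the new values in an object array first keeps their types.
extra = np.array(["what?", 5.2], dtype=object)
preserved = np.append(myarr, extra)

print(type(coerced[-1]).__name__, type(preserved[-1]).__name__)
```

Either result can still be reshaped with reshape(2, 3), since both arrays hold six objects.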
If I've missed anything regarding what you're going for, let me know.