Python - Issues with a regular expression
I wrote a script that opens multiple text files and checks whether a line matches a specific pattern. It then compares the line against country_pattern (which contains country names) and, for now, prints the country.
(Later I will try to create a method that moves each text file to a folder based on its country.)
Basically, each text file contains a line like this:
25/02/2015|11:06:21|mys|mys14_frc6-7_my1_aa1_wp|mms1|wxp2632|ashraf|true|120|0|false|
As you can see, the example contains the country name "mys".
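For reference, the fields of such a line can also be pulled apart without a regular expression. This is only a minimal sketch, assuming the country code is always the third "|"-separated field; the helper name `country_of` is hypothetical and not part of the original script.

```python
# Sample record copied from the question.
line = '25/02/2015|11:06:21|mys|mys14_frc6-7_my1_aa1_wp|mms1|wxp2632|ashraf|true|120|0|false|'


def country_of(record):
    """Return the third '|'-separated field, or None if the line is too short.

    Assumption: the country code always sits in the third field.
    """
    parts = record.strip().split('|')
    return parts[2] if len(parts) > 2 else None


country = country_of(line)
```

Here `country` ends up holding the string "mys", which can then be looked up in the list of country codes.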
import os
import string
import re
import sys
import glob
import fileinput

country_pattern = ('mys','idn','zaf', 'tha','twn','sgp', 'nwz', 'aus','alb','aut','bel', 'bgr', 'bih',
                   'che','cze', 'deu', 'dnk', 'esp','est','srb','mdk','mne','bih', 'bih','mne','fin',
                   'fra', 'gbr','grc', 'hrv', 'hun', 'irl', 'ita', 'lie', 'ltu', 'lux', 'lva', 'mda',
                   'smr','cyp','nld','nor','pol','prt','rou','scg', 'svk','svn','swe','tur','bra',
                   'can','usa','mex','chl','arg','rus')

pattern = r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\s+)|(\s+)|(\s+|(\s+)|(\s+)|(\s+)|(\d+)|(\d+)|(\s+)|'

src = raw_input("enter source disk location: ")
src = os.path.dirname(src)  # returns the directory part of the path
for dir, _, _ in os.walk(src):  # walks through many folders
    file_path = glob.glob(os.path.join(dir, "*.txt"))  # look for the mdi files
    print(file_path)
    for file in file_path:
        f = open(file, 'r')
        object_name = f.readlines()
        f.close()
        for line_name_tmp in object_name:
            line_name = line_name_tmp.replace('\n', '')
            if line_name == '':
                line_name.split()
                continue
            else:
                try:
                    re.search(pattern, line_name)
                except:
                    print line_name
                    pass
                searchobj = re.search(pattern, line_name)
                m = searchobj.group(1)
                if m in country_pattern:
                    print "searchobj.group(1) : ", searchobj.group(1)
                else:
                    print 'did not find any'
Unfortunately, I get this error:
  file "<string>", line 254, in run_nodebug
  file "c:\users\kostrzew\desktop\reports\mdiadmin.py", line 43, in <module>
    searchobj = re.search(pattern, line_name) #
  file "c:\python27\lib\re.py", line 142, in search
    return _compile(pattern, flags).search(string)
  file "c:\python27\lib\re.py", line 245, in _compile
    raise error, v # invalid expression
sre_constants.error: unbalanced parenthesis
I don't know how to solve this error. Did I miss something in the pattern?
You've omitted one closing parenthesis in your regex. The third (\s+ group is never closed:
pattern = r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\s+)|(\s+)|(\s+|(\s+)|(\s+)|(\s+)|(\d+)|(\d+)|(\s+)|'
                                                                ^
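The fix can be checked by compiling both versions of the pattern. The sketch below is written so that it also runs on Python 3, while the question uses Python 2.7. The names `broken`, `balanced`, `compiles` and `fields` are hypothetical. One further point, which is an assumption about the asker's intent rather than part of the answer above: an unescaped "|" means alternation in a regex, so even the balanced pattern returns the day ("25") in group(1), not the country. The last pattern escapes the separators as "\|" to capture the third field instead.

```python
import re

# Pattern from the question: the third "(\s+" group is never closed.
broken = r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\s+)|(\s+)|(\s+|(\s+)|(\s+)|(\s+)|(\d+)|(\d+)|(\s+)|'
# Same pattern with the missing ")" added.
balanced = r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\s+)|(\s+)|(\s+)|(\s+)|(\s+)|(\s+)|(\d+)|(\d+)|(\s+)|'


def compiles(regex):
    """Return True if `regex` is a valid regular expression."""
    try:
        re.compile(regex)
        return True
    except re.error:
        return False


line = '25/02/2015|11:06:21|mys|mys14_frc6-7_my1_aa1_wp|mms1|wxp2632|ashraf|true|120|0|false|'

# With unescaped "|", the first alternative matches the date, so group(1) is the day.
day = re.search(balanced, line).group(1)

# Assumed intent: treat "|" as a literal separator and capture the third field.
fields = re.compile(r'(\d+)/(\d+)/(\d+)\|(\d+):(\d+):(\d+)\|([^|]+)\|')
match = fields.match(line)
country = match.group(7) if match else None
```

With the sample line, `compiles(broken)` is False, `compiles(balanced)` is True, and `country` is "mys", which is the value to test against the list of country codes.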