regex - How to use Python re.IGNORECASE
I had an earlier question about returning the original sentence irrespective of the title or case. I have been working on it and came up with this:
import re
from nltk import tokenize
from nltk.tokenize import sent_tokenize

def foo():
    txt = "risk factors breast cancer have been characterized. breast cancer 100 times more frequent in women in men.\
factors associated increased exposure estrogen have been elucidated including menarche, late menopause, later age\
@ first pregnancy, or nulliparity. use of hormone replacement therapy has been confirmed risk factor, although limited \
combined use of estrogen , progesterone, demonstrated in whi (2). analysis showed risk of breast cancer among women using \
estrogen , progesterone increased 24% compared placebo. separate arm of whi randomized women prior hysterectomy \
conjugated equine estrogen (cee) versus placebo, , in study, use of cee not associated increased risk of breast cancer (3).\
unlike hormone replacement therapy, there no evidence oral contraceptive (ocp) use increases risk. large population-based case-control study \
examining risk of breast cancer among women used or using ocps included on 9,000 women aged 35 64 \
(half of whom had breast cancer) (4). reported relative risk 1.0 (95% ci, 0.8 1.3) among women using ocps , 0.9 \
(95% ci, 0.8 1.0) among prior users. in addition, neither race nor family history associated greater risk of breast cancer among ocp users."
    words = txt
    corpus = " ".join(words)
    sentences1 = sent_tokenize(corpus)
    a = [" ".join([sentences1[i-1],j]) for i,j in enumerate(sentences1) if re.match('risk',i,re.I) in word_tokenize(j)]
    for i in a:
        print i,'\n','\n'

foo()
However, I keep getting this error:
TypeError: expected string or buffer
How can I make it return the sentence irrespective of case using IGNORECASE? Kind regards
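A likely cause of the error, judging from the code above: enumerate() yields (index, sentence) pairs, so i is an integer, and re.match() expects a string as its second argument. Note also that re.match() only matches at the start of the string, whereas re.search() scans the whole sentence. Below is a minimal sketch of one way to do the case-insensitive lookup in Python 3. The function name sentences_mentioning, the sample text, and the naive regex sentence splitter (used in place of nltk's sent_tokenize so the example runs without extra downloads) are all assumptions made for illustration, not part of the original code.

```python
import re

def sentences_mentioning(text, word):
    """Return the sentences of `text` that contain `word`, ignoring case."""
    # Naive splitter (an assumption): break after '.', '!' or '?' followed
    # by whitespace. nltk's sent_tokenize could be substituted here.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    # \b keeps the match to whole words; re.IGNORECASE (alias re.I)
    # makes 'risk' match 'Risk', 'RISK', and so on.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
    # re.search scans the whole sentence; the sentence string itself,
    # not its index, is what gets passed to the regex.
    return [s for s in sentences if pattern.search(s)]

# Hypothetical sample text for demonstration only.
sample = ("Risk factors for breast cancer have been characterized. "
          "Breast cancer is more frequent in women. "
          "There is no evidence that OCP use increases risk.")

matches = sentences_mentioning(sample, "risk")
for sentence in matches:
    print(sentence)
```

The sentences are returned exactly as they appear in the input, so the original capitalization is preserved even though the comparison ignores case.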