numpy - Python count and probability
I have the following data:
name   item
peter  apple
peter  apple
ben    banana
peter  banana
I want to print the frequency of what Peter eats:
apple   2
banana  1
This is my code:
u, count = np.unique(data['item'], return_counts=True)
process = u[np.where(data['name']= 'peter')[0]]
process2 = dict(Counter(process))
print "item\frequency"
for k, v in process2.items():
    print '{0:.0f}\t{1}'.format(k,v)
But I got an error. I also want to calculate the probability that Peter eats an apple next time, but I don't have any idea how. Any suggestions?
The error you are getting is, as the other answer indicates, because you cannot use data['name'] = 'peter' as a function parameter. What you intended to use is np.where(data['name'] == 'peter').
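A minimal sketch of the corrected NumPy approach, assuming the two columns are available as plain arrays. The arrays `names` and `items` are hypothetical stand-ins built from the table in the question. The mask is applied to the items before counting, so only Peter's rows are included:

```python
import numpy as np

# Hypothetical stand-ins for the question's two columns.
names = np.array(['peter', 'peter', 'ben', 'peter'])
items = np.array(['apple', 'apple', 'banana', 'banana'])

# `==` builds a boolean mask; a single `=` inside a call is a syntax error.
mask = names == 'peter'

# Count the unique items among Peter's rows only.
u, counts = np.unique(items[mask], return_counts=True)
item_counts = {str(k): int(v) for k, v in zip(u, counts)}

print("Item\tFrequency")
for item, n in item_counts.items():
    print('{0}\t{1}'.format(item, n))
```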
But given that you are using pandas, I am guessing data is a pandas DataFrame. In that case, what you want can be achieved using DataFrame.groupby. Example -
data[data['name']=='peter'].groupby('item').count()
Demo -
In [7]: data[data['name']=='peter'].groupby('item').count()
Out[7]:
        name
item
apple      2
banana     1
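For anyone who wants to try this without the original data, here is a self-contained sketch. The DataFrame is rebuilt from the table in the question, so its exact shape is an assumption. It uses groupby(...).size() rather than count(), which returns a Series of row counts per item directly:

```python
import pandas as pd

# DataFrame reconstructed from the question's table (an assumption about the real data).
data = pd.DataFrame({
    'name': ['peter', 'peter', 'ben', 'peter'],
    'item': ['apple', 'apple', 'banana', 'banana'],
})

# Keep only Peter's rows, then count the rows per item.
peter = data[data['name'] == 'peter']
counts = peter.groupby('item').size()

for fruit, n in counts.items():
    print('{0}\t{1}'.format(fruit, n))
```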
If you want it printed in a loop, you can use -
df = data[data['name']=='peter'].groupby('item').count()
# Note: Series.iteritems() was removed in pandas 2.0; use .items() there.
for fruit,count in df['name'].iteritems():
    print('{0}\t{1}'.format(fruit,count))
Demo -
In [24]: df = data[data['name']=='peter'].groupby('item').count()
In [25]: for fruit,count in df['name'].iteritems():
   ....:     print('{0}\t{1}'.format(fruit,count))
   ....:
apple   2
banana  1
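For the probability part of the question, one simple option is to treat Peter's observed relative frequencies as the estimate. This is only a rough sketch under that assumption; it ignores ordering and any smoothing, and the DataFrame is again rebuilt from the question's table:

```python
import pandas as pd

# DataFrame reconstructed from the question's table (an assumption about the real data).
data = pd.DataFrame({
    'name': ['peter', 'peter', 'ben', 'peter'],
    'item': ['apple', 'apple', 'banana', 'banana'],
})

# Items eaten by Peter, as a Series.
peter_items = data.loc[data['name'] == 'peter', 'item']

# normalize=True turns the counts into fractions that sum to 1.
probs = peter_items.value_counts(normalize=True)

# Empirical estimate that Peter's next item is an apple (0.0 if never seen).
p_apple = probs.get('apple', 0.0)
print('P(apple | peter) = {0:.3f}'.format(p_apple))
```

With the four rows above, Peter ate an apple in two of his three rows, so the estimate is 2/3.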
As for the updated issue, the OP is getting the following error -
TypeError: invalid type comparison
The issue occurs in this case because, in the OP's real data, the column has numeric values (float/int), but the OP is comparing those values against a string, hence the error. Example -
In [30]: df
Out[30]:
   0  1
0  1  2

In [31]: df[0]=='asd'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-e7bacd79d320> in <module>()
----> 1 df[0]=='asd'

c:\anaconda3\lib\site-packages\pandas\core\ops.py in wrapper(self, other, axis)
    612
    613             # scalars
--> 614             res = na_op(values, other)
    615             if np.isscalar(res):
    616                 raise TypeError('Could not compare %s type with Series'

c:\anaconda3\lib\site-packages\pandas\core\ops.py in na_op(x, y)
    566             result = getattr(x, name)(y)
    567             if result is NotImplemented:
--> 568                 raise TypeError("invalid type comparison")
    569         except (AttributeError):
    570             result = op(x, y)

TypeError: invalid type comparison
If the column is numeric, you should compare it against numeric values, not a string.
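A short sketch of how one might check the column type before comparing, using a one-row frame like the one in the demo. Converting with astype(str) is just one possible workaround, not something the OP's data necessarily needs. Depending on the pandas version, the mismatched comparison may raise as shown above or quietly return all False, so checking the dtype first is the safer habit:

```python
import pandas as pd

# Small numeric frame similar to the demo (columns 0 and 1).
df = pd.DataFrame([[1, 2]])

# Inspect the column types before deciding what to compare against.
print(df.dtypes)

# Numeric column compared with a numeric value.
numeric_match = df[0] == 1

# If the lookup value really is a string, convert the column explicitly first.
string_match = df[0].astype(str) == '1'

print(numeric_match.tolist(), string_match.tolist())
```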