python - Seaborn: countplot() with frequencies


I have a pandas DataFrame with a column called "axles", which can take an integer value between 3 and 12. I am trying to use seaborn's countplot() to achieve the following plot:

  1. The left y axis shows the frequencies of these values occurring in the data. The axis extends over [0%-100%], with tick marks at every 10%.
  2. The right y axis shows the actual counts; its values correspond to the tick marks determined by the left y axis (marked at every 10%).
  3. The x axis shows the categories for the bar plots [3, 4, 5, 6, 7, 8, 9, 10, 11, 12].
  4. Annotations on top of the bars show the actual percentage of each category.

The following code gives me a plot with the actual counts, but I could not find a way to convert them into frequencies. I can get the frequencies using df.axles.value_counts()/len(df.index), but I am not sure how to plug this information into seaborn's countplot().
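For reference, the frequency calculation mentioned above can be sketched as follows. The sample data is hypothetical (the real DataFrame is not shown), and the reindex step is an assumption about how one might keep all categories from 3 to 12 even when some are empty:

```python
import pandas as pd

# Hypothetical sample data; the real dfwim is not shown in the question.
dfwim = pd.DataFrame({"axles": [3, 3, 5, 5, 5, 6, 9, 12]})

categories = list(range(3, 13))  # axle counts 3..12

# normalize=True returns fractions of the total instead of raw counts.
freq_pct = (
    dfwim["axles"]
    .value_counts(normalize=True)
    .reindex(categories, fill_value=0.0)  # keep empty categories as 0
    * 100
)

print(freq_pct)
```

The result is a Series indexed by axle count, with percentages that sum to 100.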

I also found a workaround for the annotations, but I am not sure if it is the best implementation.

Any help would be appreciated!

Thanks

    plt.figure(figsize=(12,8))
    ax = sns.countplot(x="axles", data=dfwim, order=[3,4,5,6,7,8,9,10,11,12])
    plt.title('distribution of truck configurations')
    plt.xlabel('number of axles')
    plt.ylabel('frequency [%]')

    # annotate each bar with its height (still a raw count at this point)
    for p in ax.patches:
        ax.annotate('%{:.1f}'.format(p.get_height()), (p.get_x()+0.1, p.get_height()+50))


Edit:

I got closer to what I need with the following code, using pandas' bar plot and ditching seaborn. It feels like I'm using too many workarounds, and there has to be an easier way to do it. The issues with this approach:

  • There is no order keyword in pandas' bar plot function as there is in seaborn's countplot(), so I cannot plot all categories from 3-12 as I did with countplot(). I need to have them shown even if there is no data in a category.
  • The secondary y-axis messes up the bars and the annotations for some reason (see the white gridlines drawn over the text and bars).

    plt.figure(figsize=(12,8))
    plt.title('distribution of truck configurations')
    plt.xlabel('number of axles')
    plt.ylabel('frequency [%]')

    # percentage of rows per axle count, plotted as bars
    ax = (dfwim.axles.value_counts()/len(dfwim)*100).sort_index().plot(kind="bar", rot=0)
    ax.set_yticks(np.arange(0, 110, 10))

    # secondary axis with the counts that match the 10% tick marks
    ax2 = ax.twinx()
    ax2.set_yticks(np.arange(0, 110, 10)*len(dfwim)/100)

    for p in ax.patches:
        ax.annotate('{:.2f}%'.format(p.get_height()), (p.get_x()+0.15, p.get_height()+1))


You can do this by making a twinx axes for the frequencies. You can switch the two y axes around so that the frequencies stay on the left and the counts on the right, without having to recalculate the counts axis (here we use tick_left() and tick_right() to move the ticks, and set_label_position to move the axis labels).
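A minimal sketch of just the axis swap, assuming plain matplotlib bars stand in for sns.countplot and using made-up counts. The Agg backend is selected only so the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

# Hypothetical counts per axle category
axles = [3, 4, 5]
counts = [5, 12, 3]
total = sum(counts)

fig, ax = plt.subplots()
ax.bar(axles, counts)          # the bars live on the count axis
ax.set_ylabel("count")

ax2 = ax.twinx()               # second y axis sharing the same x axis
ax2.set_ylabel("frequency [%]")

# Swap sides: counts go to the right, frequencies to the left
ax.yaxis.tick_right()
ax2.yaxis.tick_left()
ax.yaxis.set_label_position("right")
ax2.yaxis.set_label_position("left")

# Align the two scales: the full count range corresponds to 0-100 %
ax.set_ylim(0, total)
ax2.set_ylim(0, 100)
```

Because the bars are still drawn against the count axis, nothing has to be recomputed; only the limits of the two axes need to agree.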

You can then set the ticks using the matplotlib.ticker module, specifically ticker.MultipleLocator and ticker.LinearLocator.
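To illustrate how the two locators line up, here is a small sketch on empty axes. The value of ncount is made up, and the Agg backend is only an assumption so that it runs headless:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

ncount = 5000  # hypothetical total number of rows

fig, ax = plt.subplots()
ax2 = ax.twinx()

ax.set_ylim(0, ncount)   # count axis
ax2.set_ylim(0, 100)     # frequency axis in percent

# 11 evenly spaced ticks across the count range: 0, ncount/10, ..., ncount
ax.yaxis.set_major_locator(ticker.LinearLocator(11))
# one tick at every multiple of 10 on the percent axis
ax2.yaxis.set_major_locator(ticker.MultipleLocator(10))

count_ticks = ax.get_yticks()
# The locator may also report positions just outside the limits,
# so keep only those inside the visible range.
pct_ticks = [t for t in ax2.get_yticks() if 0 <= t <= 100]

print(count_ticks)
print(pct_ticks)
```

With both axes spanning their full range, each 10% tick on the frequency axis sits at the same height as the corresponding count tick.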

As for your annotations, you can get the x and y locations of the corners of each bar with patch.get_bbox().get_points(). This, along with setting the horizontal and vertical alignment correctly, means you don't need to add any arbitrary offsets to the annotation location.
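The annotation placement can be sketched in isolation like this, again with plain matplotlib bars and made-up counts as stand-ins. get_points() returns a 2x2 array of the form [[x0, y0], [x1, y1]], i.e. the lower-left and upper-right corners of the bar:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

# Hypothetical counts per axle category
counts = {3: 2, 4: 6, 5: 2}
ncount = sum(counts.values())

fig, ax = plt.subplots()
ax.bar(list(counts.keys()), list(counts.values()))

labels = []
for p in ax.patches:
    pts = p.get_bbox().get_points()   # [[x0, y0], [x1, y1]]
    x_center = pts[:, 0].mean()       # midpoint between left and right edge
    y_top = pts[1, 1]                 # top edge of the bar
    text = "{:.1f}%".format(100.0 * y_top / ncount)
    # ha/va anchor the text centred directly above the bar, so no offsets are needed
    ax.annotate(text, (x_center, y_top), ha="center", va="bottom")
    labels.append(text)

print(labels)
```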

Finally, you need to turn the grid off for the twinned axis, to prevent its grid lines from showing up on top of the bars (ax2.grid(None)).
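As a small aside, passing None to grid() with no other arguments toggles the current grid state, so it turns the grid off only if it was on (as it is with seaborn's default style). A sketch of the more explicit variant, assuming you always want the twin axis grid off, could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax2 = ax.twinx()

# Mimic a style where grids are on for both axes
ax.grid(True)
ax2.grid(True)

# Explicitly switch the twin axis grid off, regardless of its previous state
ax2.grid(False)

main_grid_visible = [line.get_visible() for line in ax.yaxis.get_gridlines()]
twin_grid_visible = [line.get_visible() for line in ax2.yaxis.get_gridlines()]
```

The grid of the original axis stays in place behind the bars; only the twin's grid lines, which are drawn later and therefore on top, are removed.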

Here is the working script:

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns
    import matplotlib.ticker as ticker

    # some random data
    dfwim = pd.DataFrame({'axles': np.random.normal(8, 2, 5000).astype(int)})
    ncount = len(dfwim)

    plt.figure(figsize=(12,8))
    ax = sns.countplot(x="axles", data=dfwim, order=[3,4,5,6,7,8,9,10,11,12])
    plt.title('distribution of truck configurations')
    plt.xlabel('number of axles')

    # make twin axis
    ax2 = ax.twinx()

    # switch so the count axis is on the right, frequency on the left
    ax2.yaxis.tick_left()
    ax.yaxis.tick_right()

    # also switch the labels over
    ax.yaxis.set_label_position('right')
    ax2.yaxis.set_label_position('left')

    ax2.set_ylabel('frequency [%]')

    for p in ax.patches:
        x = p.get_bbox().get_points()[:,0]
        y = p.get_bbox().get_points()[1,1]
        ax.annotate('{:.1f}%'.format(100.*y/ncount), (x.mean(), y),
                    ha='center', va='bottom')  # set the alignment of the text

    # use a LinearLocator to ensure the correct number of ticks
    ax.yaxis.set_major_locator(ticker.LinearLocator(11))

    # fix the frequency range to 0-100
    ax2.set_ylim(0,100)
    ax.set_ylim(0,ncount)

    # and use a MultipleLocator to ensure a tick spacing of 10
    ax2.yaxis.set_major_locator(ticker.MultipleLocator(10))

    # need to turn the grid on ax2 off, otherwise the gridlines end up on top of the bars
    ax2.grid(None)

    plt.savefig('snscounter.pdf')


