plotbox.plot module
-
plotbox.plot.dist_plot(distObj, saveAs=None, title='Probability Dashboard', logscale=False)
Generates a plot object which is either displayed or saved. The plot includes 4 subplots:
- a probability plot generated with scipy.stats.probplot, comparing the data X against the theoretical distribution
- a scatter plot of the data sample
- the CDF of the distribution
- the PDF of the distribution
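The probability-plot panel pairs each sorted observation with a theoretical quantile. The following is a minimal standard-library sketch of that pairing for a normal reference distribution. The helper name `probplot_points` and the simplified plotting positions (i - 0.5)/n are assumptions made for illustration; scipy.stats.probplot uses a different order-statistic estimate, and this is not plotbox's implementation.

```python
from statistics import NormalDist


def probplot_points(sample, dist=None):
    """Pair sorted data with theoretical quantiles (hypothetical helper)."""
    if dist is None:
        dist = NormalDist()  # assumption: standard normal reference
    ordered = sorted(sample)
    n = len(ordered)
    # Simplified plotting positions (i - 0.5) / n for i = 1..n
    positions = [(i - 0.5) / n for i in range(1, n + 1)]
    theoretical = [dist.inv_cdf(p) for p in positions]
    return list(zip(theoretical, ordered))


points = probplot_points([2.1, -0.4, 0.3, 1.2, -1.5])
for quantile, value in points:
    print(f"{quantile:+.3f}  {value:+.2f}")
```

If the points fall close to a straight line, the sample is consistent with the reference distribution.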
-
plotbox.plot.hist_box(ar, perc=0, val=None, type='sym', annotation=True, distplot=False, trunc=False, bins=30, normed=True, alpha=0.6, color='b', kde=True, legend='', xmax=None)
Plots a histogram highlighting or filtering out the given percentage of data, according to the given type of filtering.
Three types of filtering are available:
- ‘left’: filter out the lower given percentage of data
- ‘right’: filter out the upper given percentage of data
- ‘sym’: both left and right filtering
Parameters: - ar (array-like) – Array of data
- perc (float in [0,100] (defaults to 0)) – Percentage of data to filter
- val (float) – An alternative to perc, gives the threshold value instead of percentage
- type (string (defaults to 'sym')) – Type of filtering
- annotation (bool (defaults to True)) – If True, writes complementary annotations on the plot
- distplot (bool (defaults to False)) – If True, uses seaborn distplot
- trunc (bool (defaults to False)) – If True, filters out the data and only displays the remaining data. Otherwise, only highlights the selected data.
- bins (int (defaults to 30)) – Number of bins to use
- normed (bool (defaults to True)) – If True, histogram is normed
- alpha (float in [0,1] (defaults to 0.6)) – Transparency of the plot (0 is completely transparent, 1 is opaque)
- color (string (defaults to 'b')) – Color of the plot
- kde (bool (defaults to True)) – If using seaborn distplot, whether or not to plot the kernel density estimate of the distribution
- legend (string (defaults to '')) – Legend to use
- xmax (float) – Maximum x-value of the plot. If none is given, chooses the maximum value of the data.
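The three filtering types can be pictured as picking a lower and/or upper threshold from the sorted data. Below is a minimal sketch under stated assumptions: `filter_bounds` is a hypothetical helper, not part of plotbox, and it assumes that 'sym' removes perc percent from each tail, which the documentation above does not specify.

```python
def filter_bounds(data, perc=0.0, kind="sym"):
    """Return (low, high) thresholds; values outside them are filtered out."""
    if not 0 <= perc <= 100:
        raise ValueError("perc must be in [0, 100]")
    if kind not in ("left", "right", "sym"):
        raise ValueError("kind must be 'left', 'right' or 'sym'")
    ordered = sorted(data)
    n = len(ordered)
    k = int(n * perc / 100)  # number of points dropped per filtered tail
    low_idx = k if kind in ("left", "sym") else 0
    high_idx = n - 1 - k if kind in ("right", "sym") else n - 1
    if low_idx > high_idx:
        raise ValueError("filter removes every point")
    return ordered[low_idx], ordered[high_idx]


data = list(range(1, 101))
for kind in ("left", "right", "sym"):
    low, high = filter_bounds(data, perc=10, kind=kind)
    kept = [v for v in data if low <= v <= high]
    print(kind, (low, high), len(kept))
```

With trunc=True the histogram would be drawn from `kept` only; otherwise the thresholds would simply mark the highlighted region.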
-
plotbox.plot.make_readable_ticks(type='x')
Turns unreadable plot ticks (x-ticks or y-ticks) into nice, clean ticks.
Currently works only for floats.
Parameters: type ('x' or 'y' (defaults to 'x')) – Use 'x' to change the x-ticks, 'y' to change the y-ticks
Notes
Must be incorporated as part of a “regular” plot script (see example)
Examples
>>>
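As an illustration of the kind of tick cleanup described above, here is a minimal sketch of a float formatter. The function `readable_tick` and its suffix scheme (k, M, B) are assumptions for illustration only; they do not reproduce what make_readable_ticks does internally.

```python
def readable_tick(value, pos=None):
    """Format a float tick value compactly, e.g. 1500000 -> '1.5M'."""
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "k")):
        if abs(value) >= threshold:
            scaled = f"{value / threshold:.1f}"
            # Drop a trailing ".0" so 2000 becomes "2k" rather than "2.0k"
            return scaled.rstrip("0").rstrip(".") + suffix
    return f"{value:.3g}"  # small values: keep three significant digits


labels = [readable_tick(v) for v in (0.123456, 999, 2000, -45000, 1500000)]
print(labels)
```

The `(value, pos)` signature matches what matplotlib.ticker.FuncFormatter expects, so in a regular plot script a function of this shape could be applied with `ax.xaxis.set_major_formatter(FuncFormatter(readable_tick))`.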
-
plotbox.plot.plotScatter(x, y, data=None, hue=None, bestfit=False, ci=95, alpha=1, size=20, xlab='x', ylab='y', axFontSize=9, title='', saveAs=None, figsize=(9.5, 6), snssize=5, label=None)
Robust scatterplot tool. If given a DataFrame, takes x and y as column names (and uses them as x/y labels); otherwise takes x and y as arrays. Also takes hue (to color the points and facet the best fit), CI, alpha, size, x/y labels, and title. If x is a date array, draws a simple scatterplot.
-
plotbox.plot.prepare_plot(xticks, yticks, figsize=(10.5, 6), hideLabels=False, gridColor='#999999', gridWidth=1.0)
Generates a pretty plot layout.
-
plotbox.plot.save_plot(fig, path, filename=None, show=True, filetype='.png', dpi=270)
Saves plots at high resolution with a clean crop. Requires a matplotlib figure object and a save path.
-
plotbox.plot.scatter_plot(x, y, data=None, hue=None, title='Scatter Plot', xlabel='X-Values', ylabel='Y-Values', alpha=0.3, figsize=(14.25, 9.0), saveAs=None, vlines=None, hlines=None, xlim=(None, None), ylim=(None, None), legend=True, model=None, plotly=False, show=True)
Draws a scatter plot with matplotlib.pyplot.scatter in a seaborn-like functional paradigm.
Parameters: - x, y – Column names in data.
- data (DataFrame) – Long-form (tidy) dataframe with variables in columns and observations in rows.
- hue, col, row – Variable names to facet on the hue, col, or row dimensions (see FacetGrid docs for more information).
- title, xlabel, ylabel – Labels of the scatter plot.
- alpha (float) – opacity of scatter points.
- figsize (tuple (width, height)) – size of the figure
- saveAs (optional) – filename to save figure as
- vlines (list) – list of x points to make vertical lines in the plot
- xlim (tuple (xmin, xmax)) – horizontal boundaries of the plot
- ylim (tuple (ymin, ymax)) – vertical boundaries of the plot
- legend (boolean, optional) – Draw a legend for the data when using a hue variable.
- model (str, optional) – regression on given data.
Notes
This function can be used in 2 different ways:
- Using the arguments to generate titles, legends, etc., and then saving or displaying the plot
- Incorporating the plot in a script and overriding the plotting features this way:
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>>
>>> f = 1000
>>> hue = ['one' for i in range(50*f)] + ['two' for i in range(30*f)] + ['three' for i in range(20*f)]
>>> rp.plotBox.scatter_plot(x=np.random.randn(100*f), y=np.random.randn(100*f),
...                         hue=hue, vlines=0, alpha=.1, hlines=0)
>>> plt.title('My title')
>>> plt.xlabel('X label I want')
>>>
>>> # To change the figure size:
>>> fig = plt.gcf()  # get the figure object
>>> fig.set_size_inches(5, 10)
>>>
>>> plt.show()
Add arguments:
- dropna (boolean, optional) – Drop missing values from the data before plotting.
- add regression:
>>> f, popt, pcov = rp.statBox.regression_model(x, y, model)
>>> xs = np.linspace(0, max(x) + 100, 50)
>>> plt.plot(xs, f(xs, *popt), 'r-', label="Fitted Curve")
-
plotbox.plot.violinOne(X, col=None, subplot=111, alpha=0.2)
Given a sample of data (and optionally a boolean vector), returns a pyplot axis object containing the violin plot of the data.
Parameters: - X (array-like) – Vector
- col (array-like) – Indicates suspension and failure times
- subplot (int) – Subplot value for the axis to return
- alpha (float in [0,1]) – Transparency of the dots
Returns: ax – Violin plot
Return type: pyplot axis
-
plotbox.plot.violinPlot(savefig=None, **kwargs)
Given a set of n data samples with their boolean vectors, returns n subplots with the violin plot of each sample.
Usage: violinPlot(sample1=(X1,Xbool1), sample2=(X2,Xbool2), etc...)
Parameters: - savefig (string (default is None)) – If given, save the figure using savefig as the file name.
- kwargs (tuples) – For each sample, the data sample and the boolean vector must be provided in a tuple.
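The keyword-tuple calling convention above can be sketched as follows. `split_samples` is a hypothetical helper that only shows how named (data, boolean vector) tuples arriving through `**kwargs` might be unpacked and validated; what True and False mean for each point is an assumption left to the caller, and no plotting is done here.

```python
def split_samples(**samples):
    """Split each (values, flags) tuple into flagged and unflagged values."""
    result = {}
    for name, (values, flags) in samples.items():
        if len(values) != len(flags):
            raise ValueError(f"{name}: data and boolean vector differ in length")
        flagged = [v for v, f in zip(values, flags) if f]
        unflagged = [v for v, f in zip(values, flags) if not f]
        result[name] = (flagged, unflagged)
    return result


groups = split_samples(
    sample1=([120, 340, 560], [True, False, True]),
    sample2=([75, 90], [False, False]),
)
print(groups)
```

Each key of the result corresponds to one subplot, in the order the keyword arguments were passed.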
-
plotbox.plot.wblPlots(distList, title=None, labelList=None, saveAs=None, xlabel='Miles to Failure', min_x=None, max_x=None, ylim=(None, None), xlim=(None, None), use_sci=False, show=True)
Produces a Weibull probability plot from a list of dist objects. labelList is a list of strings, the same size and order as distList, that provides the label for each distribution.
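For background, a Weibull probability plot places each failure time t at (ln t, ln(-ln(1 - F))), so that Weibull-distributed data falls on a straight line whose slope is the shape parameter. The following minimal sketch computes those coordinates. It assumes uncensored data and Bernard's median-rank approximation for F; the helpers `weibull_plot_points` and `fit_line` are hypothetical, and the dist objects that wblPlots accepts are not modeled here.

```python
import math


def weibull_plot_points(times):
    """Map failure times to (ln t, ln(-ln(1 - F))) coordinates."""
    ordered = sorted(times)
    n = len(ordered)
    points = []
    for i, t in enumerate(ordered, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's median-rank approximation
        points.append((math.log(t), math.log(-math.log(1.0 - f))))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept through the points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic failure times placed exactly on a Weibull(beta=2, eta=1000) line
beta, eta, n = 2.0, 1000.0, 8
times = [eta * (-math.log(1 - (i - 0.3) / (n + 0.4))) ** (1 / beta)
         for i in range(1, n + 1)]
slope, intercept = fit_line(weibull_plot_points(times))
print(round(slope, 3), round(math.exp(-intercept / slope), 1))
```

The fitted slope estimates the shape parameter, and exp(-intercept / slope) estimates the scale parameter.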