Example: Column, Normal Probability, and Box Plots
Use a column plot, a normal probability plot, and a box plot to study the result of an experiment.
1. Define a data set describing a study of the oxide growth process on a silicon wafer. The matrix Data has two columns: one for the furnace number and the other for the oxide thickness measured in Angstroms.
2. Extract the thickness data into the vector Thick.
3. Call the histogram function to separate the data into twenty bins.
4. Plot the binned data and change the trace type to Column Trace. For each column, you can view the thickness range on the x-axis and the number of experiments on the y-axis.
5. Call the mean and Stdev functions to calculate the mean and the standard deviation of the data. With these statistics, call the dnorm function to calculate the expected result for each bin if the data were normally distributed.
6. Add a y-axis expression for plotting the vector Norm. To view the normal distribution, decrease the size of the histogram by adding a scaling factor of 1000 in the unit placeholder of its y-axis expression.
7. Call the qqplot function to compare the quantiles of Data to those of the normal distribution.
8. Plot the quantiles one against the other. Change the trace style to create a scatter plot: select the cross from the Symbol list, and then select None from the Line Style list.
9. Call the boxplot function to calculate the three quartiles, the minimum and maximum, and the outliers of the data set.
10. Plot the transpose of B and change the trace type to Box Plot Trace to view these statistics in a box plot.
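Outside the worksheet, steps 3 through 7 can be approximated with a short Python sketch. It is not the worksheet's own code: the thick list is synthetic stand-in data, bin_counts is a hypothetical helper, and Stdev is assumed to denote the sample standard deviation. Unlike step 6, which scales the histogram down, the sketch scales the normal density up to an expected count per bin.

```python
import random
from statistics import NormalDist, mean, stdev

# Hypothetical stand-in for the Thick vector: synthetic values,
# not the measurements used in the worksheet.
random.seed(0)
thick = [random.gauss(1000.0, 30.0) for _ in range(160)]


def bin_counts(values, n_bins):
    """Return (edges, counts) for n_bins equal-width bins spanning the data."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return edges, counts


edges, counts = bin_counts(thick, 20)

# Sample mean and sample standard deviation (n - 1 in the denominator).
mu = mean(thick)
sigma = stdev(thick)

# Expected count per bin under a normal model:
# density at the bin midpoint * bin width * number of measurements.
normal = NormalDist(mu, sigma)
bin_width = edges[1] - edges[0]
midpoints = [(a + b) / 2 for a, b in zip(edges, edges[1:])]
expected = [normal.pdf(m) * bin_width * len(thick) for m in midpoints]

for m, c, e in zip(midpoints, counts, expected):
    print(f"{m:8.1f}  observed={c:3d}  expected={e:6.2f}")
```

Printing observed and expected counts side by side gives the same comparison that the overlaid column plot and normal curve give visually.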
The column plot and the normal probability plot show that the normal distribution is a reasonable approximation of the measured thickness. The box plot shows that there is only one outlier, which is relatively close to the rest of the data set.
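The quantities behind the normal probability plot and the box plot in steps 7 through 10 can be sketched in Python as follows. The conventions here are assumptions rather than the documented behavior of qqplot and boxplot: the plotting position (i - 0.5) / n, the 1.5 * IQR rule for outliers, and a minimum and maximum that exclude outliers. The names normal_qq_pairs and box_stats are hypothetical, and the data are synthetic.

```python
import random
from statistics import NormalDist, quantiles

# Hypothetical stand-in for the Thick vector (synthetic values).
random.seed(1)
thick = [random.gauss(1000.0, 30.0) for _ in range(160)]


def normal_qq_pairs(values):
    """Pair each sorted value with a standard normal quantile."""
    n = len(values)
    std_normal = NormalDist()
    # (i - 0.5) / n is one common plotting position; other conventions exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(values)))


def box_stats(values, whisker=1.5):
    """Quartiles, non-outlier extremes, and outliers (assumed 1.5 * IQR rule)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    reach = whisker * (q3 - q1)
    lo_fence, hi_fence = q1 - reach, q3 + reach
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    outliers = sorted(v for v in values if v < lo_fence or v > hi_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "min": min(inside),
        "max": max(inside),
        "outliers": outliers,
    }


pairs = normal_qq_pairs(thick)
stats = box_stats(thick)

# Points close to a straight line suggest approximately normal data.
for theoretical, observed in pairs[:5]:
    print(f"{theoretical:7.3f}  {observed:8.1f}")
print(stats)
```

Plotting the pairs as a scatter plot reproduces the idea of step 8; the dictionary holds the values that a box plot trace would draw.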
11. Call the vlookup function to extract the thickness measurements for each furnace.
12. Call the augment function to merge the vectors F1, F2, F3, and F4 into one matrix where each column contains the results for one of the furnaces.
13. Call the boxplot function to calculate the statistics for each data set.
14. Define a vector of furnace labels.
15. Create a box plot to view the data sets. The matrix in the y-axis expression contains one row per data set. When the data sets do not have the same number of outliers, the shorter rows are padded with NaNs. The plot shows one box plot per data set.
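Steps 11 through 15 amount to grouping the measurements by furnace, computing box plot statistics per group, and padding the rows to a common length. The Python sketch below illustrates that idea under stated assumptions: the data rows are made up, box_row is a hypothetical helper, the 1.5 * IQR outlier rule is assumed, and the row layout (three quartiles, minimum, maximum, then outliers) is illustrative rather than the layout that boxplot returns.

```python
import math
from statistics import quantiles

# Hypothetical (furnace, thickness) rows standing in for the Data matrix.
data = [
    (1, 980.0), (1, 990.0), (1, 1000.0), (1, 1010.0), (1, 1020.0), (1, 1100.0),
    (2, 985.0), (2, 995.0), (2, 1000.0), (2, 1005.0), (2, 1015.0),
    (3, 900.0), (3, 990.0), (3, 1000.0), (3, 1005.0), (3, 1010.0),
    (3, 1015.0), (3, 1110.0),
    (4, 990.0), (4, 1000.0), (4, 1005.0), (4, 1010.0), (4, 1020.0),
]


def box_row(values, whisker=1.5):
    """Return ([q1, median, q3, min, max], outliers) for one data set."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    reach = whisker * (q3 - q1)
    inside = [v for v in values if q1 - reach <= v <= q3 + reach]
    outliers = sorted(v for v in values if v < q1 - reach or v > q3 + reach)
    return [q1, median, q3, min(inside), max(inside)], outliers


# Group the thickness values by furnace number (the role of vlookup in step 11).
by_furnace = {}
for furnace, thickness in data:
    by_furnace.setdefault(furnace, []).append(thickness)

labels = [f"Furnace {k}" for k in sorted(by_furnace)]
results = [box_row(by_furnace[k]) for k in sorted(by_furnace)]

# Pad each outlier list with NaN so that every row has the same length.
n_outliers = max(len(out) for _, out in results)
rows = [s + out + [math.nan] * (n_outliers - len(out)) for s, out in results]

for label, row in zip(labels, rows):
    print(label, row)
```

The NaN padding is what lets data sets with different numbers of outliers share one rectangular matrix.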
The box plots show that the variance between the furnaces is small, even though the thickness measurements within each furnace vary considerably.
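A rough numerical counterpart to this visual comparison is to set the variance of the furnace means against the average variance within each furnace. The Python sketch below uses made-up measurements, not the worksheet's data, and is a quick check rather than a formal analysis of variance.

```python
from statistics import mean, variance

# Hypothetical thickness measurements per furnace (made-up values).
groups = {
    "Furnace 1": [980.0, 990.0, 1000.0, 1010.0, 1020.0],
    "Furnace 2": [985.0, 995.0, 1005.0, 1015.0, 1025.0],
    "Furnace 3": [975.0, 985.0, 995.0, 1005.0, 1015.0],
    "Furnace 4": [982.0, 992.0, 1002.0, 1012.0, 1022.0],
}

# Spread of the furnace means around their common mean.
furnace_means = [mean(values) for values in groups.values()]
between = variance(furnace_means)

# Average spread of the measurements inside each furnace.
within = mean([variance(values) for values in groups.values()])

print(f"between-furnace variance of means: {between:.2f}")
print(f"average within-furnace variance:   {within:.2f}")
```

A between-furnace value much smaller than the within-furnace value matches the picture of overlapping boxes with long whiskers.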