Data visualization toolbox
Hypervariate data consist of associated samples or measurements of more than three quantitative variables. The general task with this type of data is to determine how the variables are related. In some cases the variables are factors (independent) and a response (dependent). In other cases the variables are not functionally related, but their distributions can be related.
As the number of variables in a data set increases, it becomes increasingly likely that some of the factors have no significant effect on the response and that some of the factors are not independent.
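One rough way to screen for factors that are not independent is to compute the correlation of every pair of variables. The sketch below is only an illustration: the helper names `pearson` and `correlation_table` and the numbers in `demo` are made up for this example and are not part of the toolbox or the environmental data. Pearson correlation detects only linear association, so a value near zero does not by itself show that two variables are unrelated.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_table(columns):
    """Map each unordered pair of variable names to its correlation."""
    names = list(columns)
    return {
        (names[i], names[j]): pearson(columns[names[i]], columns[names[j]])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }

# Made-up numbers for illustration only; not the environmental data.
demo = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.0],  # exactly 2 * f1, so fully dependent on f1
    "f3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
table = correlation_table(demo)
for pair, r in table.items():
    print(pair, round(r, 3))
```

A pair with a correlation close to +1 or -1 is a candidate for closer inspection in the corresponding scatterplot panel.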
This chapter includes examples of analyzing several data sets:
(scattermatrix.m)
The same scatterplot matrix which is used for trivariate data also extends to more variables. It provides convenient visualization of the relationships of pairs of variables.
Figure 5.1 Scatterplot matrix of the environmental data. (book 5.1)
Figure 5.1 indicates that ozone concentration generally increases with temperature, decreases with wind speed, and varies nonmonotonically with radiation.
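The structure of a scatterplot matrix can be sketched as a list of panel specifications, one per cell of a k-by-k grid. The function name `scatter_matrix_panels` and the row/column convention used here are assumptions for illustration; the layout used by scattermatrix.m itself is not shown on this page.

```python
from itertools import product

def scatter_matrix_panels(names):
    """List the panels of a k-by-k scatterplot matrix.

    Assumed convention: the panel in row i, column j plots variable j on
    the x axis against variable i on the y axis; diagonal panels carry
    only the variable name.
    """
    panels = []
    for row, col in product(range(len(names)), repeat=2):
        if row == col:
            panels.append({"row": row, "col": col, "label": names[row]})
        else:
            panels.append({"row": row, "col": col, "x": names[col], "y": names[row]})
    return panels

variables = ["ozone", "radiation", "temperature", "wind speed"]
panels = scatter_matrix_panels(variables)
scatter_panels = [p for p in panels if "x" in p]
print(len(panels), "panels,", len(scatter_panels), "of them scatterplots")
```

Each specification can then be handed to whatever plotting routine is available. With four variables, every pair appears twice, once with each variable on the x axis.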
(scattermatrixc.m)
The color scatterplot can also be used in a matrix of panels. This presentation allows us to see the effect of combinations of factors. In Figure 5.2 the combination of temperature and wind speed has a strong effect on ozone. The complex relationship of ozone and radiation appears to be due to interaction with the other factors. Figure 5.2 also reveals a probable outlier at Wind Speed 20 and Solar Radiation 280, which has not previously been noted.
Figure 5.2 Color scatterplot matrix of the environmental data. The ozone concentration is encoded in color. Darker colors are higher concentrations.
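The "darker is higher" encoding can be sketched as a linear map from the response value to a gray level. This is a minimal illustration, not the color map of the original figure: the function name `darkness_colors`, the gray ramp, and the sample values are all assumptions.

```python
def darkness_colors(values):
    """Map each value to a gray RGB triple; larger values get darker grays."""
    lo, hi = min(values), max(values)
    span = hi - lo
    colors = []
    for v in values:
        # Fraction of the way from the smallest to the largest value.
        t = 0.0 if span == 0 else (v - lo) / span
        # Light gray (0.9) for the minimum, black (0.0) for the maximum.
        level = 0.9 * (1.0 - t)
        colors.append((level, level, level))
    return colors

# Hypothetical response values, for illustration only.
response = [10.0, 20.0, 40.0, 30.0]
colors = darkness_colors(response)
for value, color in zip(response, colors):
    print(value, color)
```

Stopping the ramp at light gray rather than white keeps the lowest-valued points visible against a white background.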
Data sets such as this, with three factors and a response, are also nicely visualized using three axis color scatterplots (Figure 5.3). This presentation shows where the data are available in factor space as well as how the response varies with all three factors. A real-time display with interactive rotation is especially helpful.
Figure 5.3 Color three axis scatterplot of cube root ozone concentration.
Sometimes the dimensions of a data set can be reduced by combining variables. A
scatterplot matrix of the iris data revealed that the varieties are well separated by
petal length and width. The relevant panel is shown in Figure 5.4. Note that both color
and symbol differ by variety. This presentation redundancy makes it easier to see the
distinctions.
Figure 5.4 Iris variety by petal length and width.
The relations in Figure 5.4 suggest that petal area would be a good basis for variety
classification. This observation is used in Figure 5.5. Elongation (length/width) is used
as the other axis to separate the points.
Figure 5.5 Iris variety by petal area and elongation. (book 5.21)
Surprisingly, this simple classification based on petal area was not found by any of the many previous numerical analyses of the data.
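The two derived variables can be computed directly from petal length and width. The sketch below makes two assumptions: petal area is approximated as length times width (the page does not give the exact formula), and the measurements in `samples` are illustrative values, one per variety, rather than the full iris data. The name `petal_features` is hypothetical.

```python
def petal_features(length, width):
    """Return (area, elongation) for one petal.

    Area is approximated as length * width (an assumption of this sketch);
    elongation is length / width, as in the text.
    """
    return length * width, length / width

# Illustrative petal measurements in cm, one per variety.
samples = {
    "setosa": (1.4, 0.2),
    "versicolor": (4.7, 1.4),
    "virginica": (6.0, 2.5),
}
features = {name: petal_features(*lw) for name, lw in samples.items()}
for name, (area, elongation) in features.items():
    print(f"{name}: area={area:.2f}, elongation={elongation:.2f}")
```

For these sample values the areas differ by an order of magnitude between varieties, which is the kind of separation that makes area a candidate classification axis.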