I’ve been working on a project recently where I needed to produce an xy scatterplot, and I wanted to color each data point according to a third variable (let’s call it “z”). I developed a quick and easy utility to do this, and it seems to work pretty well! I’m still working out how to add a colorramp to plot() to display the values of “z”, but in the meantime, I’ll show you what I’ve come up with:

First, I create a vector of colors. Below is a nice color scale I use often; it goes from black to blue to purple, orange and finally a deep red:

plotclr <- c(colorRampPalette(c("black", "blue"))(50),
             colorRampPalette(c("blue", "purple", "orange"))(55),
             colorRampPalette(c("orange", "red", "darkred"))(70))

Here’s an image of the ramp (in case you were wondering):

Colorramp produced from above script.
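
If you want to draw the ramp yourself, here is a minimal sketch. It rebuilds plotclr so it runs on its own, and drawing the colors as borderless bars with barplot() is just one convenient option, not necessarily how the image above was made:

```r
# Rebuild the palette from the snippet above so this block is self-contained
plotclr <- c(colorRampPalette(c("black", "blue"))(50),
             colorRampPalette(c("blue", "purple", "orange"))(55),
             colorRampPalette(c("orange", "red", "darkred"))(70))

n <- length(plotclr)  # 50 + 55 + 70 = 175 colors

# One bar per palette entry, no gaps and no borders, so the bars
# read as a continuous strip running from black to dark red
barplot(rep(1, n), col = plotclr, border = NA, space = 0, axes = FALSE)
```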

Next, I scale the third variable, z, to between 0 and 1 (z_scl). Then I multiply z_scl by the length of plotclr (50 + 55 + 70 = 175), round the result to get color_scl, and set any 0 in color_scl to 1 so that I can use color_scl as an index into plotclr:

z_scl <- (z - min(z, na.rm = TRUE)) / (max(z, na.rm = TRUE) - min(z, na.rm = TRUE))
color_scl <- round(z_scl * length(plotclr))
color_scl[color_scl == 0] <- 1
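
The same three steps can be wrapped in a small function if you want to reuse them. This is only a sketch: the name z_to_color_index and its n_colors argument are made up for illustration, and the guard for a constant z is an extra assumption that the lines above do not handle:

```r
# Hypothetical helper: map a numeric vector onto integer indices 1..n_colors
z_to_color_index <- function(z, n_colors) {
  rng <- range(z, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # Assumption: if every value is identical, use the first color
    # instead of dividing by zero
    idx <- rep(1L, length(z))
    idx[is.na(z)] <- NA
    return(idx)
  }
  z_scl <- (z - rng[1]) / (rng[2] - rng[1])  # rescale to [0, 1]
  idx <- round(z_scl * n_colors)             # values in 0..n_colors
  as.integer(pmax(idx, 1))                   # bump 0 up to 1; NA stays NA
}

z <- c(3.2, 0.5, 9.8, NA, 6.1)  # made-up values for illustration
color_scl <- z_to_color_index(z, n_colors = 175)
```

The smallest z always maps to index 1 and the largest to n_colors, so plotclr[color_scl] never reaches outside the palette.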

Now, let’s plot something with it:

### Fake data
base <- seq(0.1, 10, 0.5)
# Note: the second argument of rnorm() is the mean, so the noise has mean 0.25 and sd 1
x <- base + rnorm(length(base), mean = 0.25)
y <- base + rnorm(length(base), mean = 0.25)
z <- base + rnorm(length(base), mean = 0.25)
### Plotting
plot(x, y, type = "n") # create new, empty plot
# Scale z to [0, 1], then turn it into an index into plotclr
z_scl <- (z - min(z, na.rm = TRUE)) / (max(z, na.rm = TRUE) - min(z, na.rm = TRUE))
color_scl <- round(z_scl * length(plotclr))
color_scl[color_scl == 0] <- 1
# Loop to plot each point
for (i in seq_along(x)) {
  points(x[i], y[i], pch = 20, col = plotclr[color_scl[i]], cex = 2.5)
}
### End of Plotting
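
The loop is not strictly needed, because plot() and points() accept a vector of colors. Here is a sketch of the same plot without it; set.seed(1) and the pmax() shortcut are my own additions for a reproducible, compact example:

```r
set.seed(1)  # assumption: fixed seed so the fake data are reproducible

# Same palette as above, rebuilt so this block runs on its own
plotclr <- c(colorRampPalette(c("black", "blue"))(50),
             colorRampPalette(c("blue", "purple", "orange"))(55),
             colorRampPalette(c("orange", "red", "darkred"))(70))

# Fake data, built the same way as in the script above
base <- seq(0.1, 10, 0.5)
x <- base + rnorm(length(base), mean = 0.25)
y <- base + rnorm(length(base), mean = 0.25)
z <- base + rnorm(length(base), mean = 0.25)

# Scale z to [0, 1] and convert to palette indices (0 is bumped up to 1)
z_scl <- (z - min(z)) / (max(z) - min(z))
color_scl <- pmax(round(z_scl * length(plotclr)), 1)

# One color per point, passed to plot() as a vector
point_cols <- plotclr[color_scl]
plot(x, y, pch = 20, cex = 2.5, col = point_cols)
```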

Here’s the final result:

xy scatterplot with the color of each point based on a third variable.

I should mention that with a colorramp, the reader may have some difficulty determining the exact value of the third variable for a given point, but as long as that is OK, I think this is a pretty nifty way of plotting three variables together! Till next time…
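
One way to make the values of z easier to read is to draw the ramp next to the plot using only base graphics. The sketch below is one possible approach, not a finished solution: the layout() split, the margin values, and the z_grid name are all my own assumptions and will probably need tweaking for your data and device size:

```r
set.seed(1)  # assumption: fixed seed so the fake data are reproducible

plotclr <- c(colorRampPalette(c("black", "blue"))(50),
             colorRampPalette(c("blue", "purple", "orange"))(55),
             colorRampPalette(c("orange", "red", "darkred"))(70))

base <- seq(0.1, 10, 0.5)
x <- base + rnorm(length(base), mean = 0.25)
y <- base + rnorm(length(base), mean = 0.25)
z <- base + rnorm(length(base), mean = 0.25)

z_scl <- (z - min(z)) / (max(z) - min(z))
color_scl <- pmax(round(z_scl * length(plotclr)), 1)

old_mar <- par("mar")                            # remember the margins
layout(matrix(1:2, nrow = 1), widths = c(4, 1))  # wide panel + narrow strip

# Left panel: the scatterplot, colored by z
par(mar = c(5, 4, 4, 1))
plot(x, y, pch = 20, cex = 2.5, col = plotclr[color_scl])

# Right panel: the ramp as a vertical strip with z on its axis
par(mar = c(5, 0.5, 4, 3.5))
z_grid <- seq(min(z), max(z), length.out = length(plotclr))
image(x = 1, y = z_grid, z = matrix(z_grid, nrow = 1),
      col = plotclr, axes = FALSE, xlab = "", ylab = "")
axis(4)                            # z values on the right-hand side
mtext("z", side = 4, line = 2.2)
box()

layout(1)           # back to a single panel
par(mar = old_mar)  # restore the original margins
```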

Alex: Hi, I came across this post whilst googling. It’s a shame there isn’t a built-in function to do this. I have found that the function scatterPlot in the package “openair” works quite nicely.

You may also want to think about adding a color bar to your plot. I find the package “plotrix” has the best solution. As an example based on your code:

## Load plotrix, which provides color.legend()
library(plotrix)

## Choose the z-values at which the colour will change
levs <- pretty(z, n = 11)

## Work out which colour level each point is at
nl <- length(levs)
icols <- as.integer(1 + (nl - 1) * (z - levs[1]) / (levs[nl] - levs[1]))
icols <- pmin(icols, nl - 1) # keep a z equal to the top level inside the palette
mypalette <- colorRampPalette(c("black", "red", "yellow"))(nl - 1)

## Set the plotting palette
palette(mypalette)

## Set the plot, adding extra room for color strip
par(mar = c(5, 4, 4, 6))

## Plot the data
plot(x, y, pch = 19, col = icols)
grid()
par(xpd = TRUE)
color.legend(11.5, 0, 12.5, 11, legend = levs, rect.col = mypalette, gradient = "y", align = "rb")
par(xpd = FALSE)

ktakagi (post author): Thanks for the comment! I’ve used plotrix before (I can’t remember what for) and I liked it. I usually try to use “base” functions and libraries as much as possible, but there definitely are some nice packages out there. For example, I’ve been meaning to look into ggplot; I just haven’t had the time.