# C.4: R and shape

- Page ID
- 3613

Analysis of biological shape is a really useful technique. Inspired with highly influential works of D’Arcy Thompson\(^{[1]}\), it takes into account not the linear measurements but the *whole shape* of the object: contours of teeth, bones, leaves, flower petals, and even 3D objects like skulls or beaks.

Naturally, shape is not exactly measurement data, it should be analyzed with special approaches. There are methods based on the analysis of curves (namely, Fourier coefficients) and methods which use *landmarks* and *thin-plate splines* (TPS). The last method allows to visualize aligned shapes with PCA (in so-called tangent space) and plot transformation grids.

In R, several packages capable to perform this statistical analysis of shape, or *geometric morphometry*. Fourier analysis is possible with momocs, and landmark analysis used below with geomorph package:

**Code \(\PageIndex{1}\) (Python):**

library(geomorph) TangentSpace2 <- function(A){ x <- two.d.array(A) pc.res <- prcomp(x) pcdata <- pc.res$x list(array=x, pc.summary=summary(pc.res), pc.scores=pcdata)}

(One additional function was defined to simplify the workflow.)

Data comes out of leaf measures of alder tree. There are two data files: classic morphometric dataset with multiple linear measurements, and geometric morphometric dataset:

**Code \(\PageIndex{2}\) (Python):**

am <- read.table("data/bigaln.txt", sep=";", head=TRUE) ag <- readland.tps("data/bigaln.tps", specID="imageID")

Geometric morphometric data was prepared with separate program, tpsDig\(^{[2]}\). In the field, every leaf was contoured with sharp pencil, and then all images were scanned.

**Figure \(\PageIndex{1}\) **Example of three alder leaf contours with landmark locations.

Next, PNG images were supplied to tpsDig and went through landmark mapping\(^{[3]}\). In total, there were 12 landmarks: top, base, and endpoints of the first (lower) five pairs of primary leaf veins (Figure \(\PageIndex{1}\)). Note that in geometric morphometry, preferable number of cases should be > 11 times bigger then number of variables.

Next step is the *Generalized Procrustes Analysis* (GPA). The name refers to bandit from Greek mythology who made his victims fit his bed either by stretching their limbs or cutting them off (Figure \(\PageIndex{2}\)). GPA aligns all images together:

**Figure \(\PageIndex{2}\) **Theseus and Procrustes (from Attic red-figure neck-amphora, 470–460 BC).

**Code \(\PageIndex{3}\) (Python):**

ag <- readland.tps("data/bigaln.tps", specID="imageID") gpa.ag <- gpagen(ag)

... and next—principal component analysis on GPA results:

**Code \(\PageIndex{4}\) (Python):**

ag <- readland.tps("data/bigaln.tps", specID="imageID") gpa.ag <- gpagen(ag) ta.ag <- TangentSpace2(gpa.ag$coords) screeplot(ta.ag$pc.summary) # importance of principal components pca.ag <- ta.ag$pc.summary$x

(**Check** the PCA screeplot yourself.)

Now we can plot the results (Figure \(\PageIndex{3}\)). For example, let us check if leaves from top branches (high P.1 indices) differ in their shape from leaves of lower branches (small P.1 indices):

**Code \(\PageIndex{5}\) (Python):**

ag <- readland.tps("data/bigaln.tps", specID="imageID") gpa.ag <- gpagen(ag) ta.ag <- TangentSpace2(gpa.ag$coords) pca.ag <- ta.ag$pc.summary$x pca.ag.ids <- as.numeric(gsub(".png", "", row.names(pca.ag))) branch <- cut(am$P.1, 3, labels=c("lower", "middle", "top")) b.code <- as.numeric(Recode(pca.ag.ids, am$PIC, branch, char=FALSE)) # recode.r plot(pca.ag[, 1:2], xlab="PC1", ylab="PC2", pch=19, col=b.code) legend("topright", legend=paste(levels(branch), "branches"),pch=19, col=1:3)

Well, the difference, if even exists, is small.

Now plot *consensus shapes* of top and lower leaves. First, we need *mean shapes* for the whole dataset and separately for lower and top branches, and then *links* to connect landmarks:

**Code \(\PageIndex{6}\) (Python):**

ag <- readland.tps("data/bigaln.tps", specID="imageID") gpa.ag <- gpagen(ag) c.lower <- mshape(gpa.ag$coords[, , b.code == 1]) c.top <- mshape(gpa.ag$coords[, , b.code == 3]) all.mean <- mshape(gpa.ag$coords) ag.links <- matrix(c(1, rep(c(2:7, 12:8), each=2), 1),ncol=2, byrow=TRUE)

Finally, we plot D’Arcy Thompson’s *transformation grids* (Figure C.5.1):

**Code \(\PageIndex{7}\) (Python):**

ag <- readland.tps("data/bigaln.tps", specID="imageID") ag.links <- matrix(c(1, rep(c(2:7, 12:8), each=2), 1),ncol=2, byrow=TRUE) old.par <- par(mfrow=c(1, 2)) GP <- gridPar(grid.col="grey", tar.link.col="darkseagreen", tar.pt.bg=0) plotRefToTarget(c.lower, all.mean, links=ag.links, gridPars=GP) title(main="lower branches", line=-5, cex=0.8) plotRefToTarget(c.top, all.mean, links=ag.links, gridPars=GP) title(main="top branches", line=-5, cex=0.8) par(old.par)

**Figure \(\PageIndex{3}\) **Alder leaves shapes in two-dimensional tangent space made with Procrustes analysis.

Small difference is clearly visible and could be the starting point for the further research.

## References

1. Thompson D. W. 1945. On growth and form. Cambridge, New York. 1140 pp.

2. Rohlf F.J. tpsDig. Department of Ecology and Evolution, State University of New York at Stony Brook. Freely available at http://life.bio.sunysb.edu/morph/

3. Actually, geomorph package is capable to digitize images with digitize2d() function but it works only with JPEG images.