Phylogenetic reconstruction workshop, Tartu, 2021
This project is maintained by Mycology-Microbiology-Center
This is also available as an R Script file: scripts_tree_operations.R
Read a substitution table for tip labels. The first column must contain the same values as the tip labels in the tree; the other column(s) hold your pretty labels.
labels <- read.csv('data/labels_simple.csv')
Combine the tree and the new labels. Similarly, you can plug in any kind of data you may want to plot, e.g. values of morphological traits. %<+% is a ggtree-specific operator that works like a left join in SQL. It returns the ggtree plot object with the new columns attached to its data:
tree_data <- ggtree(bayesTree_rooted) %<+% labels
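To make the left-join comparison concrete, here is a minimal base R sketch that does not use ggtree at all. The tips and labels data frames and their values are made up for illustration, and match() only stands in for the join on the label column. Tips with no entry in the table get NA rather than being dropped.

```r
# Toy stand-in for the tip rows of a tree's data (hypothetical values)
tips <- data.frame(label = c("seq1", "seq2", "seq3"),
                   stringsAsFactors = FALSE)

# Toy substitution table: the first column matches the tip labels
labels <- data.frame(
  label = c("seq1", "seq3"),
  label_pretty = c("Species A", "Species C"),
  stringsAsFactors = FALSE
)

# Left join by hand: keep every tip and look up its pretty label;
# tips missing from the table end up with NA
tips$label_pretty <- labels$label_pretty[match(tips$label, labels$label)]

print(tips)
```

The row for seq2 is kept with a missing label_pretty, which is the behaviour to expect when the substitution table does not cover every tip.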
In the same fashion, add the bootstrap values obtained earlier with the getSupports function. Note that bootstraps are mapped to the tree automatically on the basis of the node field shared between supportsTable and tree_data.
tree_data <- tree_data %<+% supportsTable
As before, separate geoms describe different sets of objects and are connected with a + sign.
geom_tiplab shows labels from the column label_pretty, which came from the labels object.
geom_label2 plots both posteriors and bootstraps, concatenated by the paste function with a slash (/) as separator. Note the ifelse conditions that recognize missing values with is.na and draw '-' for branches that did not appear in one of the trees. subset = !isTip is needed here to prevent geom_label2 from plotting tip labels, which are already drawn by geom_tiplab.
xlim is needed to fix truncation of long labels. You may need to change its second parameter or even drop xlim altogether, depending on your tree. Note that if you plot your tree with additional data that do not fit into the area specified by xlim, these data may be dropped from your plot completely.
tree_data +
  geom_tiplab(aes(label = label_pretty), offset = 0.001) +
  geom_label2(
    aes(
      label = paste(
        ifelse(is.na(round(as.numeric(label), 2)),
               '-', round(as.numeric(label), 2)),
        ifelse(is.na(support),
               '-', support),
        sep = '/'
      ),
      subset = !isTip & label != 'Root'
    ),
    hjust = 1.1,
    vjust = -0.3,
    alpha = 0.8,
    label.size = NA,
    label.padding = unit(0.2, 'mm'),
    size = 2
  ) +
  xlim(0, 0.5)
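The paste/ifelse/is.na construction used for the node labels can be tried on plain vectors, outside of any plot. In this sketch the posterior and support vectors are invented toy values, not results from the workshop trees.

```r
# Toy node annotations (hypothetical values): posteriors arrive as text
# labels, bootstraps as numbers; NA marks a branch absent from one tree
posterior <- c("0.98765", "1", NA)
support   <- c(87, NA, 64)

node_label <- paste(
  # round the posterior to 2 digits, or show '-' when it is missing
  ifelse(is.na(round(as.numeric(posterior), 2)),
         '-', round(as.numeric(posterior), 2)),
  # show the bootstrap, or '-' when it is missing
  ifelse(is.na(support),
         '-', support),
  sep = '/'
)

print(node_label)
```

Each element combines the two supports for one node, with '-' taking the place of whichever value is missing.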
ggsave saves the last plotted graph. If you need to save a particular object, specify plot = object_name in ggsave.
## Create a directory for output files:
dir.create('output')
## Write the result
ggsave(
  filename = 'output/tree_data_lab.pdf',
  device = 'pdf',
  width = 210,
  height = 297,
  units = 'mm'
)
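The sketch below shows the plot = argument in use, so that the saved figure does not depend on which graph was drawn last. It assumes ggplot2 is installed. The object p, the built-in mtcars data and the file name are placeholders for illustration only, and the file goes to a temporary directory rather than to the workshop's output folder.

```r
library(ggplot2)

# Hypothetical plot object standing in for a tree plot
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Write into a temporary directory so the sketch leaves no files behind;
# showWarnings = FALSE keeps dir.create quiet if the directory exists
out_dir <- file.path(tempdir(), 'output')
dir.create(out_dir, showWarnings = FALSE)

# Save this particular object rather than the last plotted graph
out_file <- file.path(out_dir, 'example_plot.pdf')
ggsave(
  filename = out_file,
  plot = p,
  device = 'pdf',
  width = 210,
  height = 297,
  units = 'mm'
)
```

Passing plot = explicitly is mostly useful in scripts, where several plots may be built before any of them is saved.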
The main part is done, unless you want to play more with annotations or visualizations.
Read the table with more information on labels. Here we explicitly ask to read empty strings as NA values (na.strings = ''), so that is.na will recognize them in the plotting code below. This helps to make labels look cleaner, without orphan commas.
labels_extended <- read.csv('data/labels_extended.csv', na.strings = '')
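The effect of na.strings = '' is easy to check on a small inline table. In this sketch the CSV text and its values are invented for illustration. It reads the same text twice, once with the option and once with the read.csv defaults.

```r
# A tiny CSV with empty cells (hypothetical values)
csv_text <- "label,Organism,Country
seq1,Russula sp.,
seq2,,Estonia"

# Empty strings become NA
with_na <- read.csv(text = csv_text, na.strings = '',
                    stringsAsFactors = FALSE)

# Default settings: empty cells in text columns stay as empty strings
default <- read.csv(text = csv_text, stringsAsFactors = FALSE)

print(is.na(with_na$Country))
print(is.na(default$Country))
```

Only the first version lets is.na detect the gaps; with the defaults, the empty Country cell is an ordinary empty string and would still produce a stray comma in a pasted label.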
Add these data.
tree_data <- tree_data %<+% labels_extended
Plot the result. In this case we construct labels from separate fields using paste.
tree_data +
  geom_tiplab(
    aes(label = paste(
      label,
      ifelse(is.na(Organism),
             '', paste(',', Organism)),
      ifelse(is.na(Country),
             '', paste(',', Country)),
      ifelse(is.na(Altitude),
             '', paste(', altitude:', Altitude, 'm')),
      sep = ''
    )),
    geom = 'label',
    label.size = NA,
    label.padding = unit(0, 'mm'),
    offset = 0.001
  ) +
  geom_label2(
    aes(
      label = paste(
        ifelse(is.na(round(as.numeric(label), 2)),
               '-', round(as.numeric(label), 2)),
        ifelse(is.na(support),
               '-', support),
        sep = '/'
      ),
      subset = !isTip & label != 'Root'
    ),
    hjust = 1.1,
    vjust = -0.3,
    alpha = 0.8,
    label.size = NA,
    label.padding = unit(0.2, 'mm'),
    size = 2
  ) +
  xlim(0, 0.5)
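The tip label assembled inside geom_tiplab can be previewed on a plain data frame before plotting. The tips table below is a toy example with made-up accessions, organisms, countries and altitudes. The column names follow the ones used in the plotting code above.

```r
# Toy metadata with gaps (hypothetical values)
tips <- data.frame(
  label    = c("MK123", "MK456", "MK789"),
  Organism = c("Russula sp.", NA, "Lactarius sp."),
  Country  = c("Estonia", "Latvia", NA),
  Altitude = c(NA, 120, NA),
  stringsAsFactors = FALSE
)

# Each optional field contributes ', value' or nothing at all,
# so a missing field never leaves an orphan comma behind
tip_label <- with(tips, paste(
  label,
  ifelse(is.na(Organism),
         '', paste(',', Organism)),
  ifelse(is.na(Country),
         '', paste(',', Country)),
  ifelse(is.na(Altitude),
         '', paste(', altitude:', Altitude, 'm')),
  sep = ''
))

print(tip_label)
```

Because the separator travels with each field instead of being passed as sep, skipping a field also skips its comma.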
Save the result.
ggsave(
  filename = 'output/tree_data_lab_ext.pdf',
  device = 'pdf',
  width = 420,
  height = 297,
  units = 'mm'
)
To highlight clades, ggtree provides several geoms, e.g. geom_hilight and geom_balance (see again https://yulab-smu.top/treedata-book/chapter5.html). With geom_cladelabel we can additionally delineate clades with labelled vertical bars. We will highlight 2 clades and add 1 labelled bar by adding these geometries to the code. Note that the order in which we lay out geoms defines their rendering order: here, for example, the highlights are drawn under the tip labels, so that the labels stay untinted and black.
tree_data +
  geom_label2(
    aes(
      label = paste(
        ifelse(is.na(round(as.numeric(label), 2)),
               '-', round(as.numeric(label), 2)),
        ifelse(is.na(support),
               '-', support),
        sep = '/'
      ),
      subset = !isTip & label != 'Root'
    ),
    hjust = 1.1,
    vjust = -0.3,
    alpha = 0.8,
    label.size = NA,
    label.padding = unit(0.2, 'mm'),
    size = 2
  ) +
  geom_hilight(node = 65, fill = "#1B9E77", alpha = 0.3, extend = 0.22) +
  geom_hilight(node = 56, fill = "#F54748", alpha = 0.3, extend = 0.20) +
  geom_cladelabel(
    56, "SP. NOV.",
    offset = 0.12,
    barsize = 2,
    align = TRUE,
    angle = 90,
    offset.text = 0.008,
    extend = 0.5,
    hjust = 0.5,
    fontsize = 5
  ) +
  geom_tiplab(aes(label = label_pretty), offset = 0.001) +
  xlim(0, 0.5)
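The rule that geoms are rendered in the order they are added comes from ggplot2, so it can be illustrated without a tree. This is a minimal sketch assuming ggplot2 is installed. The data frame d is made up, and inspecting the layers element of the plot object is used here only as a convenient way to show the stored order.

```r
library(ggplot2)

# Toy data (hypothetical values)
d <- data.frame(x = 1:3, y = 1:3, name = c("a", "b", "c"))

p <- ggplot(d, aes(x, y)) +
  geom_tile(fill = "#1B9E77", alpha = 0.3) +  # added first: drawn underneath
  geom_text(aes(label = name))                # added last: drawn on top

# Layers are kept in the order they were added, which is also the
# order in which they are drawn
layer_geoms <- unname(vapply(
  p$layers,
  function(l) class(l$geom)[1],
  character(1)
))

print(layer_geoms)
```

Swapping the two geom calls would draw the tinted tiles over the text, which is the effect the tree code avoids by adding geom_tiplab after the highlights.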
Save the result.
ggsave(
  filename = 'output/tree_data_highlighted.pdf',
  device = 'pdf',
  width = 210,
  height = 297,
  units = 'mm'
)
Example of an output:
That’s all for now, folks!