- Some basic models
- Equivalent models
- Diallels
- Clonal data
- Multiple site, single trait
Some basic models
Let's start with the simplest design normally used in tree breeding programs: randomized complete blocks.
dat <- asreml.read.table('data.csv', sep = ',', header = TRUE)
ped <- asreml.read.table('ped.csv', sep = ',', header = TRUE)
# Fitting a family model (multiple-tree plots)
dbh.1 <- asreml(dbh ~ 1, random = ~ Block + Block:Plot + Mom,
                data = dat)
# Having a look at the variance components
summary(dbh.1)$varcomp
# With single-tree plots, drop the plot term:
dbh.2 <- asreml(dbh ~ 1, random = ~ Block + Mom,
                data = dat)
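As a rough sketch of how the variance components from a family model might be used, the snippet below turns them into an additive variance and a narrow-sense heritability. It assumes open-pollinated families behave as true half-sibs, so that var(Mom) = 1/4 additive variance. The numbers are hypothetical and are not ASReml-R output. Leaving the block variance out of the phenotypic variance is a common convention, not the only choice.

```r
# Hypothetical variance components (not real output from dbh.1)
vc <- c(Block = 1.2, Plot = 0.8, Mom = 1.5, Residual = 20.0)

# Assumption: half-sib families, so var(Mom) = 1/4 additive variance
additive <- 4 * vc[["Mom"]]

# Phenotypic variance within blocks (block variance excluded by convention)
phenotypic <- vc[["Plot"]] + vc[["Mom"]] + vc[["Residual"]]

# Narrow-sense heritability
h2 <- additive / phenotypic
print(round(h2, 3))
```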
Then we can move to fit an animal model (or tree model, or individual tree model, pick a name), for which we need the inverse of the numerator relationship matrix (obtained from the pedigree).
pedinv <- asreml.Ainverse(ped)$ginv
# Fitting an animal model
dbh.3 <- asreml(dbh ~ 1, random = ~ Block + Block:Plot + ped(Tree),
                data = dat, ginverse = list(Tree = pedinv))
# Same thing for basic density
den.1 <- asreml(den ~ 1, random = ~ Block + Block:Plot + ped(Tree),
                data = dat, ginverse = list(Tree = pedinv))
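To make the idea of the numerator relationship matrix more concrete, here is a minimal base R sketch that builds A for a toy pedigree with the tabular method and then inverts it. The pedigree, the column names and the function build_A are hypothetical. asreml.Ainverse computes the inverse directly and far more efficiently, so this is only an illustration of what the matrix contains.

```r
# Hypothetical pedigree: parents listed before offspring, 0 = unknown parent
ped_toy <- data.frame(
  Tree = 1:5,
  Mom  = c(0, 0, 1, 1, 3),
  Dad  = c(0, 0, 0, 2, 4)
)

# Tabular method for the numerator relationship matrix A
build_A <- function(ped) {
  n <- nrow(ped)
  A <- matrix(0, n, n)
  for (i in seq_len(n)) {
    s <- ped$Mom[i]
    d <- ped$Dad[i]
    # Off-diagonals: mean relationship of individual j with the parents of i
    for (j in seq_len(i - 1)) {
      a_s <- if (s > 0) A[j, s] else 0
      a_d <- if (d > 0) A[j, d] else 0
      A[i, j] <- 0.5 * (a_s + a_d)
      A[j, i] <- A[i, j]
    }
    # Diagonal: 1 + inbreeding (half the relationship between the parents)
    f_i <- if (s > 0 && d > 0) 0.5 * A[s, d] else 0
    A[i, i] <- 1 + f_i
  }
  A
}

A_toy <- build_A(ped_toy)
Ainv_toy <- solve(A_toy)  # dense inverse, fine for a toy example only
print(A_toy)
```

Tree 5 comes from mating trees 3 and 4, which share tree 1 as a parent, so its diagonal element is larger than 1.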
Now an incomplete block design, where we have complete replicates and incomplete blocks within each Rep.
# Rep/Block already expands to Rep + Rep:Block, so Rep is written only once
dbh.4 <- asreml(dbh ~ 1, random = ~ Rep + Rep:Block + ped(Tree),
                data = dat, ginverse = list(Tree = pedinv))
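In R formula notation the nesting operator / is shorthand: Rep/Block expands to Rep + Rep:Block, so incomplete blocks nested within replicates can be written either way. The base R check below shows the expansion. It assumes that asreml follows the usual R formula rules for this operator.

```r
# Expand a formula with the nesting operator using base R only
f <- ~ Rep + Rep/Block
term_labels <- attr(terms(f), "term.labels")
print(term_labels)
```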
Disregard what follows: in preparation
Equivalent models
In some situations, e.g. when you are only interested in predicting breeding values for the parents for backwards selection, you may prefer to use models that are equivalent and computationally less demanding (e.g. a family model). For example:
Incomplete block design using OP material
 tree !P
 motherID 200
 rep 5        # replicates
 iblock 20    # incomplete blocks
 plot 1000    # plot codes
 dbh
crctest3ped.txt
crctest3dat.txt !dopart $A
# Uses an individual tree model
# where var(tree) = additive variance
!part 1
dbh ~ mu rep !r rep.iblock plot tree
# Uses a family model
# where var(motherID) = 1/4 additive variance
# (if there is no selfing, etc)
!part 2
dbh ~ mu rep !r rep.iblock plot motherID
…or in the case of controlled pollinated material:
Incomplete block design using CP material
 tree !P
 motherID 20
 fatherID 20
 family 120
 rep 5        # replicates
 iblock 20    # incomplete blocks
 plot 1000    # plot codes
 dbh
crctest4ped.txt
crctest4dat.txt !dopart $A
# Uses an individual tree model
# where var(tree) = additive variance and
# var(family) = 1/4 dominance variance
!part 1
dbh ~ mu rep !r rep.iblock plot tree family
# Uses a family model
!part 2
dbh ~ mu rep !r rep.iblock plot motherID,
         and(fatherID) family
In the previous model, motherID and(fatherID) overlays the design matrices for females and males, so you get only one prediction for each parent even though some parents act as both male and female (which is typical in tree crossing programs). The variance of motherID then represents var(GCA).
If you face problems overlaying the matrices with and(), please read Overlapping Design Matrices.
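The overlay can be pictured with a small base R sketch: both design matrices are built over the same list of parents and then added, so each parent gets a single column whether it was used as female or male. The cross list and the helper incidence are hypothetical, and this only mimics the idea behind and(), not ASReml's internal implementation.

```r
# Hypothetical crosses: P2 and P3 are used as both female and male
motherID <- c("P1", "P1", "P2", "P3")
fatherID <- c("P2", "P3", "P3", "P2")

# Common set of parent levels, shared by both factors
parents <- sort(unique(c(motherID, fatherID)))

# 0/1 incidence matrix with one column per parent level
incidence <- function(ids, levels) {
  Z <- outer(ids, levels, "==") * 1
  colnames(Z) <- levels
  Z
}

Zm <- incidence(motherID, parents)  # design matrix for females
Zf <- incidence(fatherID, parents)  # design matrix for males
Z_gca <- Zm + Zf                    # overlaid matrix: one effect per parent
print(Z_gca)
```

Each row of the overlaid matrix has two non-zero entries, one for each parent of the cross.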
Diallels
The specification of diallels is very straightforward in ASReml and does not require the creation of many additional variables to hold extra factors.
Note: The family code is specified so that the direction of the cross does not matter (e.g., 55x96 = 96x55). In the reciprocals code the direction is important (e.g., 55x96 != 96x55).
Diallel in complete block design
 tree !P
 motherID 20
 fatherID 20
 family 120   # Family code
 recipro 200  # Reciprocals code
 rep 10       # replicates
 plot 1000    # plot codes
 dbh
crctest5ped.txt
crctest5dat.txt !dopart $A
# Uses an individual tree model
!part 1
dbh ~ mu rep !r rep.iblock plot tree family,
         motherID recipro
# Uses a family model
!part 2
dbh ~ mu rep !r plot fatherID and(motherID),
         family motherID recipro
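The family and reciprocal codes described in the note above could be prepared in R before writing the data file. This is only a sketch with made-up parent codes: the family code sorts the two parents so direction is ignored, while the reciprocal code keeps the female x male order.

```r
# Hypothetical parent codes for four crosses
motherID <- c(55, 96, 55, 12)
fatherID <- c(96, 55, 12, 96)

# Family code: direction of the cross does not matter (55x96 = 96x55)
family <- paste(pmin(motherID, fatherID), pmax(motherID, fatherID), sep = "x")

# Reciprocal code: direction of the cross is kept (55x96 != 96x55)
recipro <- paste(motherID, fatherID, sep = "x")

print(data.frame(motherID, fatherID, family, recipro))
```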
Clonal data
Clonal data can be seen as repeated observations of a genotype, so their analysis is related to repeated measurements, although there is no ordering in time. The analysis of clonal data is straightforward in ASReml. In the data file all ramets of the same clone have the same genotype ID, and each genotype appears only once in the pedigree file. If a genotype is repeated in the pedigree file, it will be necessary to include the !repeat keyword after the pedigree file name.
Brian Kennedy (in Animal Model BLUP. Erasmus Intensive Graduate Course. August 20–26 1989. University of Guelph. page 130) showed the mathematics behind using clonal data as repeated measurements, referring to the analysis of embryo splitting. I first ran code like this while working in longitudinal analysis in 1999–2000. However, João Costa e Silva provided me with a very good interpretation of the analyses at the end of 2003. For more details, check: Costa e Silva, J., Borralho, N.M.G. & Potts, B.M. 2004. Additive and non-additive genetic parameters from clonally replicated and seedling progenies of Eucalyptus globulus. Theoretical and Applied Genetics 108: 1113–1119.
Incomplete block design with clonal CP material
 genotypeID !P
 motherID 20
 fatherID 20
 family 120
 rep 5        # replicates
 iblock 20    # incomplete blocks
 dbh
crctest6ped.txt
crctest6dat.txt !dopart $A
# Uses an individual tree model
# where var(genotypeID) = additive variance,
# var(family) = 0.25 dominance variance, and
# var(ide(genotypeID)) = 0.75 dominance + epistasis
!part 1
dbh ~ mu rep !r rep.iblock genotypeID fam ide(genotypeID)
The ide(genotypeID) term of the job creates an additional matrix for genotypeID that ignores the pedigree relationships.
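Given the interpretation in the job comments (var(genotypeID) = additive, var(family) = 0.25 dominance, var(ide(genotypeID)) = 0.75 dominance + epistasis), the genetic components can be recovered by simple arithmetic. The sketch below uses hypothetical estimates and assumes that interpretation holds for your data. A negative epistatic estimate would suggest it does not.

```r
# Hypothetical estimates for the three genetic terms of the clonal model
vc <- c(genotypeID = 6.0, family = 0.5, ide = 3.0)

additive  <- vc[["genotypeID"]]            # var(genotypeID)
dominance <- 4 * vc[["family"]]            # var(family) = 0.25 dominance
epistasis <- vc[["ide"]] - 0.75 * dominance  # remainder of var(ide)

total_genetic <- additive + dominance + epistasis
print(c(additive = additive, dominance = dominance,
        epistasis = epistasis, total = total_genetic))
```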
Multiple site, single trait
The traditional approach in tree breeding to analysing progeny trials across multiple sites was either to assume a single error variance (and then use the approach explained before) or to correct the data by the site-specific error variance and then apply the usual approach to the corrected data. ASReml allows alternative methods: explicitly fitting a site-specific error variance while keeping a single additive genetic variance, or fitting site-specific additive and residual variances. The latter is in fact a Multivariate Analysis approach, where the expression of a trait at each site is considered a different variable. Either alternative requires the specification of Covariance Structures.
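One simple reading of the traditional correction is to divide each observation by the residual standard deviation of its site, so that all sites end up on a common scale. The base R sketch below does that with made-up data. The residual variances (25 and 30) are illustrative values that would normally come from single-site analyses, and the column names are assumptions.

```r
# Hypothetical data from two sites
dat_toy <- data.frame(
  site = c(1, 1, 1, 2, 2, 2),
  dbh  = c(20, 25, 30, 22, 27, 33)
)

# Illustrative site-specific residual variances, e.g. from single-site runs
s2 <- c("1" = 25, "2" = 30)

# Scale each observation by the residual standard deviation of its site
site_sd <- sqrt(unname(s2[as.character(dat_toy$site)]))
dat_toy$dbh_std <- dat_toy$dbh / site_sd

print(dat_toy)
```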
Multiple site single trait analysis
 tree !P
 family 120   # family code
 site 2
 rep 5        # replicates
 iblock 20    # incomplete blocks
 dbh
crctest8ped.txt
crctest8dat.txt !dopart $A
# Uses an individual tree model for site 1
!part 1
!filter site !select 1
dbh ~ mu rep !r rep.iblock tree family
# Uses an individual tree model for site 2
!part 2
!filter site !select 2
dbh ~ mu rep !r rep.iblock tree family
# Both sites, as single trait but different
# error variances
!part 3
dbh ~ mu site site.rep !r site.rep.iblock,
         site.tree site.family
# there are two separate errors, with
# one dimension
2 1 0
# Error site 1
# 1500 observations in site 1, 0 is a
# place holder (see spatial analysis),
# IDEN indicates an identity matrix,
# !S2=25 is the starting value
# for residual variance (obtained from
# running !part 1)
1500 0 IDEN !S2=25
# Error site 2
# Same explanation as before
1300 0 IDEN !S2=30