R packages

The following code uses several packages particularly tidyr and plyr for organizing data tables and manipulating strings of text.

## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.5-6

Data Import, Subsetting, and Reformatting

Prior to assigning functional guild with Funguild, the OTU table must first be reformatted so that it includes the taxonomy assignments from UNITE. To begin I first import the rarefied shared file from mothur.

Data Import

Below ITS data from bulk soils collected from 47 Juglans nigra trees are imported into R. This includes a rarefied OTU table from mothur and the taxonomic classifications for each OTU. Samples were rarefied to 25,000 sequences in mothur prior to import, resulting in the loss of two samples from WA. Taxonomic assignments were made in mothur using the UNITE database.

Reformatting of OTU Table

Prior to running funguild, the OTU table needs to be transposed and the taxonomy strings from UNITE added to meet the input requirements for funguild.

#To add taxonomy information I first transposed the OTU table so that OTU IDs are the row names and sample names are the Column names. 
t.its2.soil.shared.rare.nosingles<-t(its2.soil.shared.rare.nosingles)

#I then create a column called otu that contains the OTU ID. This will be used to merge the OTU table to the taxonomy table.
t.its2.soil.shared.rare.nosingles<-data.frame(otu=labels(t.its2.soil.shared.rare.nosingles)[[1]],t.its2.soil.shared.rare.nosingles)

#I then merge the taxonomy file and the transposed OTU table using the OTU ID column. 
t.its2.soil.otutab.tax<-merge(t.its2.soil.shared.rare.nosingles,its2.soil.tax.sub, by.x="otu",by.y="OTU")

#Then I remove the column titled "size" from the new transposed OTU table with taxonomy information. This column does not really provide any useful information in this context and following funguild assignments the OTU table will be pretty heavily cluttered. 
t.its2.soil.otutab.tax<-t.its2.soil.otutab.tax[,-47]

#I then change the column names of the OTU ID column and the taxonomy column to align with the input requirements for funguild. 
names(t.its2.soil.otutab.tax)[1]<-"OTU_ID"
names(t.its2.soil.otutab.tax)[47]<-"taxonomy"

t.its2.soil.otutab.tax1<-t.its2.soil.otutab.tax[1:3500,]
t.its2.soil.otutab.tax2<-t.its2.soil.otutab.tax[3501:6738,]

#write.table(t.its2.soil.otutab.tax1, "~/Documents/microbiome2017//funguild.soil.input.file.1_FA19.txt",sep="\t", row.names=FALSE)

#write.table(t.its2.soil.otutab.tax2, "~/Documents/microbiome2017//funguild.soil.input.file.2_FA19.txt",sep="\t", row.names=FALSE)

Funguild Table Reformatting

Prior to creating relative abundance plots, the functional guild output needs to be reformatted. First, compound guild strings need to be broken down into individual guilds and OTUs need to be weighted based on the number of guilds to which they were assigned. Then, OTUs are summed by guild and state.

##        otu
## 1 Otu00001
## 2 Otu00003
## 3 Otu00002
## 4 Otu00004
## 5 Otu00007
## 6 Otu00005
##                                                                                Guilds
## 1                                Dung Saprotroph-Undefined Saprotroph-Wood Saprotroph
## 2                                                                Undefined Saprotroph
## 3                                                                     Animal Pathogen
## 4                    Endophyte-Epiphyt-Fungal Parasite-Plant Pathogen-Wood Saprotroph
## 5 Animal Pathogen-Dung Saprotroph-Endophyte-Epiphyte-Plant Saprotroph-Wood Saprotroph
## 6                    Endophyte-Litter Saprotroph-Soil Saprotroph-Undefined Saprotroph
##   IN_130_MCB_26_S_ITS_S70 IN_130_MCB_2_S_ITS_S69 IN_130_MCB_9_S_ITS_S22
## 1                      17                     37                     16
## 2                     575                    676                    990
## 3                     195                    516                    297
## 4                     214                    147                   3994
## 5                      25                     41                     63
## 6                     622                    618                    896
##   IN_132_MCB_16_S_ITS_S23 IN_132_MCB_17_S_ITS_S71 IN_132_MCB_27_S_ITS_S24
## 1                      64                      62                      28
## 2                     881                     555                     624
## 3                     652                     500                     730
## 4                     665                     973                     771
## 5                      98                       0                      14
## 6                     666                     330                     167
##   IN_272_MCB_10_S_ITS_S31 IN_272_MCB_11_S_ITS_S79 IN_272_MCB_24_S_ITS_S72
## 1                      23                    4930                      92
## 2                     756                     773                     862
## 3                     158                     276                     217
## 4                     118                    1055                    1131
## 5                      13                     871                     263
## 6                     191                       0                     712
##   IN_55_MCB_6_S_ITS_S20 IN_55_MCB_7_S_ITS_S68 IN_55_MCB_8_S_ITS_S21
## 1                  3647                    43                    41
## 2                     2                   410                   867
## 3                     0                  1250                   893
## 4                     0                    46                   917
## 5                     0                    34                    89
## 6                   440                   121                  3457
##   IN_WT_MCB_28_S_ITS_S32 IN_WT_MCB_29_S_ITS_S80 IN_WT_MCB_33_S_ITS_S33
## 1                    114                    923                      0
## 2                    515                    460                      2
## 3                    145                   1053                     45
## 4                   1082                    595                     40
## 5                    130                    714                    547
## 6                    463                      0                    999
##   TN_130_A_S_ITS_S35 TN_130_B_S_ITS_S83 TN_130_C_S_ITS_S36 TN_132_A_S_ITS_S84
## 1                 13                 11                  0               2098
## 2                 34                449               2229                437
## 3                163                760               9812                536
## 4                172                194                820                515
## 5                384                953                329                592
## 6               1017                  1                139                  0
##   TN_132_B_S_ITS_S43 TN_132_C_S_ITS_S91 TN_272_A_S_ITS_S44 TN_272_B_S_ITS_S92
## 1               2073                  1                  9                124
## 2               2171                272                456                533
## 3                556                251                361               2038
## 4                727                  0                619                 17
## 5                962                 10               1248                843
## 6                  0                 16                  0                 51
##   TN_272_C_S_ITS_S45 TN_55_A_S_ITS_S81 TN_55_B_S_ITS_S34 TN_55_C_S_ITS_S82
## 1               2344              3557                13              2896
## 2                966              1416               341              1326
## 3                295               188                82               196
## 4                177              1139               235               895
## 5               1966               669                36               826
## 6                  0                 0              1668                 0
##   TN_LS_1_S_ITS_S93 TN_LS_2_S_ITS_S46 TN_LS_3_S_ITS_S94 TN_WT_MB_19_S_ITS_S47
## 1                 3               920                12                   717
## 2               538              2027               629                   533
## 3              1058               370               625                   258
## 4              1596               581               386                  1080
## 5               140               761              2062                  2363
## 6                 0                 4                 0                     0
##   TN_WT_MB_20_S_ITS_S95 TN_WT_MB_21_S_ITS_S48 WA130_BNL_20_S_ITS_S9
## 1                     0                     1                    44
## 2                  2216                    22                     1
## 3                   786                  1168                     0
## 4                   834                     0                     2
## 5                   573                   145                     0
## 6                     0                    64                   506
##   WA132_RN_7_S_ITS_S57 WA132_RN_8_S_ITS_S10 WA272_BNL_17_S_ITS_S59
## 1                  441                  407                     78
## 2                   71                   24                      0
## 3                    2                    7                      0
## 4                    5                   37                    189
## 5                    0                    2                      0
## 6                   57                  307                    242
##   WA272_BNL_18_S_ITS_S12 WA272_RN_10_S_ITS_S11 WA_55_BNL_19_S_ITS_S8
## 1                    708                     8                  1758
## 2                      1                   129                   307
## 3                      1                     7                     1
## 4                     17                    32                   116
## 5                      1                     5                     0
## 6                    728                    78                   161
##   WA_55_RN_1_S_ITS_S7 WA_55_RN_2_S_ITS_S55 WA_WT_BNL_21_S_ITS_S60
## 1                 606                  829                     14
## 2                   3                    5                    276
## 3                   0                    0                      3
## 4                  25                   16                    173
## 5                   0                    0                      4
## 6                 477                   24                    327
##   WA_WT_BNL_22_S_ITS_S19 WA_WT_BNL_23_S_ITS_S67
## 1                   2300                     15
## 2                      1                    597
## 3                      0                    489
## 4                      0                   1376
## 5                      1                     52
## 6                   1068                    324
#Before we move forward lets correct some spelling and formatting errors for some of the database submissions. 

t.otutab.guild.soil$Guilds<-sub("\\--\\b","-",t.otutab.guild.soil$Guilds, ignore.case = FALSE)

t.otutab.guild.soil$Guilds<-sub("\\bEpiphyt\\b","Epiphyte",t.otutab.guild.soil$Guilds, ignore.case = FALSE)

t.otutab.guild.soil$Guilds<-sub("\\bWood Saprotrop\\b","Wood Saprotroph",t.otutab.guild.soil$Guilds, ignore.case = FALSE)

t.otutab.guild.soil$Guilds<-sub("\\bAnimal Pathogen-Endophyte-Fungal Parasite-Plant Pathogen=Wood Saprotroph\\b","Animal Pathogen-Endophyte-Fungal Parasite-Plant Pathogen-Plant Pathogen-Wood Saprotroph",t.otutab.guild.soil$Guilds, ignore.case = FALSE)


#Next I use the stringr package to quantify the number of "-" in the guild names. To weight each value, I divide the total count by the number of dashes plus 1. I add 1 because a name with only 1 dash will have two functional guilds. 
library(stringr)
t.otutab.guild.div.soil<-data.frame(Guilds=t.otutab.guild.soil$Guilds,(t.otutab.guild.soil[,3:47]/(str_count(t.otutab.guild.soil$Guilds,pattern="-")+1)))

#Then using separate_rows, I split the functional guild names by dash and create duplicate columns. 
library(tidyr)
t.otutab.guild.div.sep.soil<-separate_rows(t.otutab.guild.div.soil, Guilds, sep="-")

#Then using plyr I take the sum of the data by functional guild and then by site. 
library(plyr)
t.otutab.guild.div.sep.sum.soil<-as.data.frame(ddply(t.otutab.guild.div.sep.soil, .(Guilds),colwise(sum)))

#I then make the row name the functional guild and change the names so that the period between two parts of a name becomes a space. 
rownames(t.otutab.guild.div.sep.sum.soil)<-make.names(t.otutab.guild.div.sep.sum.soil$Guilds)


#I then remove the column that contains guild names since this is a duplicate of the row names. 
otutab.guild.div.sep.sum.soil<-t(t.otutab.guild.div.sep.sum.soil[,2:46])

site<-c("IN","IN","IN","IN","IN","IN","IN","IN","IN","IN","IN","IN","IN","IN","IN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","TN","WA","WA","WA","WA","WA","WA","WA","WA","WA","WA","WA","WA")

#Then I create a data frame that contains the state information alongside the functional guild assignments and the sum functional guilds by state.  
otutab.guild.div.sep.sum.state.soil<-data.frame(state=site,otutab.guild.div.sep.sum.soil)
otutab.guild.div.sep.sum.statesum.soil<-(ddply(otutab.guild.div.sep.sum.state.soil, .(state),colwise(sum)))

#Then I rename the unassigned column, currently delimited with "X" to Unassigned and substitute . for spaces in the funguild names. 
otutab.guild.div.sep.sum.statesum.soil$Unassigned<-otutab.guild.div.sep.sum.statesum.soil$X+otutab.guild.div.sep.sum.statesum.soil$NULL.

#Then I remove the X and Null columns
otutab.guild.div.sep.sum.statesum.soil<-otutab.guild.div.sep.sum.statesum.soil[,-2]
otutab.guild.div.sep.sum.statesum.soil<-otutab.guild.div.sep.sum.statesum.soil[,-21]

#Finally I remove the periods from the names and add spaces. 
colnames(otutab.guild.div.sep.sum.statesum.soil)<-sub("\\."," ",colnames(otutab.guild.div.sep.sum.statesum.soil))

Relative Abundance Charts

To visualize differences in the relative abundance of different functional guilds between states, I reorganize the OTU table and taxonomy strings to create relative abundance stacked bar charts.

Plant Pathogens

Plotting of plant.pathogen Principal Coordinate Analysis

Then to plot the PCoA, I use the ggplot function. To use ggplot I need to first pull out the site scores for each sample and generate ellipses.

#Next I import the metadata for the particular habitat of interest. In this case I am working with drill shavings. I will use this metadata to subset the larger OTU file above. 
its2.soil.met<-read.table("Mothur_output/soilits2metadata_jnigra17_su19_ajo.txt",header=TRUE,sep="\t",fill=TRUE,strip.white=TRUE)

#Then we extract PCoA site scores (these will be the x and y coordinates for each sample)
plant.pathogen.soil.pcoavec<-as.data.frame(plant.pathogen.soil.pcoa.results$vectors)
plant.pathogen.soil.pcoasitescores<-data.frame(PC1=plant.pathogen.soil.pcoavec$Axis.1, PC2=plant.pathogen.soil.pcoavec$Axis.2)

#Create a new dataframe that includes the site scores from above with metadata from your study. 
plant.pathogen.soil.pcoagraph<-data.frame(plant.pathogen.soil.pcoasitescores,PC1=plant.pathogen.soil.pcoasitescores$PC1, PC2=plant.pathogen.soil.pcoasitescores$PC2, State=its2.soil.met$State, Clone=its2.soil.met$Clone,group=its2.soil.met$Group)

#This is where you make confidence ellipses. I don't know what all the code means. Just know where to plug in my objects. 
plant.pathogen.soil.pcoaellipse<-ordiellipse(plant.pathogen.soil.pcoasitescores,plant.pathogen.soil.pcoagraph$State, display="sites", kind="sd", draw="none")

df_ell.soil.pp <- data.frame()
for(g in levels(plant.pathogen.soil.pcoagraph$State)){
df_ell.soil.pp <- rbind(df_ell.soil.pp, cbind(as.data.frame(with(plant.pathogen.soil.pcoagraph[plant.pathogen.soil.pcoagraph$State==g,],                                                vegan:::veganCovEllipse(plant.pathogen.soil.pcoaellipse[[g]]$cov,plant.pathogen.soil.pcoaellipse[[g]]$center,plant.pathogen.soil.pcoaellipse[[g]]$scale))) ,State=g))}

#Plot PCoA in ggplot
library(ggplot2)
plant.pathogen.soil.pcoa<-ggplot(plant.pathogen.soil.pcoagraph, aes(PC1,PC2, colour=State))+
  #geom_text(aes(label=group))+
  geom_path(data=df_ell.soil.pp, aes(x=PC1, y=PC2, colour=State),size=0.5, linetype=1)+
  geom_point(aes(shape=Clone), size=3.5)+
  xlab("PC1 (21.8%)")+
  ylab("PC2 (14.2%)")+
 theme(axis.title.x=element_text(size=14, face="bold"))+
  theme(axis.title.y=element_text(size=14, face="bold"))+
  theme(axis.text.x=element_text(size=12, face="bold"))+
  theme(axis.text.y=element_text(size=12, face="bold"))+ 
  scale_color_manual(values=c("#006600","#3399FF","#FF9900"))+
  scale_shape_manual(values=c(16,10,8,7,0,2))+
  scale_x_continuous(breaks=c(-0.60,-0.30,0,0.30,0.60),limits=c(-0.60,0.60))+
  scale_y_continuous(breaks=c(-0.60,-0.30,0,0.30,0.60),limits=c(-.60,.60))+
  theme(panel.grid.major=element_blank(),
        panel.grid.minor=element_blank(),
    panel.border=element_rect(colour="black", size=1, fill=NA))+
  theme(panel.grid.major=element_blank(),
       panel.grid.minor=element_blank(),
       panel.background = element_blank(),
  panel.border=element_rect(color="black", size=1, fill=NA))+
  geom_text(data=NULL,aes(x=-0.60,y=0.60,label="Soil: Plant Pathogen"),hjust=0,colour="black")
plant.pathogen.soil.pcoa

Plant pathogens PERMANOVA

To formally test how well state and clone explain differences in drill shaving fungal community composition, I perform a PERMANOVA using the adonis function from vegan.

## 
## Call:
## adonis(formula = plant.pathogen.soil.relabund ~ its2.soil.met$State *      its2.soil.met$Clone, permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                                         Df SumsOfSqs MeanSqs F.Model      R2
## its2.soil.met$State                      2    2.1495 1.07473  6.1848 0.23177
## its2.soil.met$Clone                      4    0.5690 0.14226  0.8187 0.06136
## its2.soil.met$State:its2.soil.met$Clone  8    1.3427 0.16784  0.9659 0.14478
## Residuals                               30    5.2130 0.17377         0.56210
## Total                                   44    9.2742                 1.00000
##                                            Pr(>F)    
## its2.soil.met$State                     9.999e-05 ***
## its2.soil.met$Clone                        0.7839    
## its2.soil.met$State:its2.soil.met$Clone    0.5582    
## Residuals                                            
## Total                                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call:
## adonis(formula = plant.pathogen.soil.relabund ~ its2.soil.met$State +      its2.soil.met$Clone, permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                     Df SumsOfSqs MeanSqs F.Model      R2    Pr(>F)    
## its2.soil.met$State  2    2.1495 1.07473  6.2296 0.23177 9.999e-05 ***
## its2.soil.met$Clone  4    0.5690 0.14226  0.8246 0.06136    0.7737    
## Residuals           38    6.5558 0.17252         0.70688              
## Total               44    9.2742                 1.00000              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call:
## adonis(formula = plant.pathogen.soil.relabund ~ its2.soil.met$State,      permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                     Df SumsOfSqs MeanSqs F.Model      R2    Pr(>F)    
## its2.soil.met$State  2    2.1495 1.07473  6.3354 0.23177 9.999e-05 ***
## Residuals           42    7.1248 0.16964         0.76823              
## Total               44    9.2742                 1.00000              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Plant pathogens Richness

Observed OTU richness is first calculated for each sample. Then due to the unbalanced nature of our study design, we use a type 3 ANOVA.

## Loading required package: carData
## Anova Table (Type III tests)
## 
## Response: (Richness)
##             Sum Sq Df  F value    Pr(>F)    
## (Intercept)  63948  1 146.9393 4.333e-13 ***
## State         1207  2   1.3866    0.2655    
## clone         2300  4   1.3215    0.2846    
## State:clone   1199  8   0.3445    0.9410    
## Residuals    13056 30                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 63948.000 1 146.9393382 0.0000000
State 1206.857 2 1.3865546 0.2654936
clone 2300.400 4 1.3214614 0.2846392
State:clone 1199.478 8 0.3445192 0.9409584
Residuals 13056.000 30 NA NA
## Warning: not plotting observations with leverage one:
##   34

## Warning: not plotting observations with leverage one:
##   34

## Anova Table (Type III tests)
## 
## Response: (Richness)
##             Sum Sq Df  F value    Pr(>F)    
## (Intercept) 108326  1 288.7571 < 2.2e-16 ***
## State         4774  2   6.3626  0.004137 ** 
## clone         2636  4   1.7567  0.157827    
## Residuals    14255 38                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = (Richness) ~ State + clone, data = plant.pathogen.soil.rich.tabmet)
## 
## $State
##             diff       lwr       upr     p adj
## TN-IN  -4.633333 -21.14744 11.880770 0.7740346
## WA-IN -27.466667 -45.76138 -9.171956 0.0021463
## WA-TN -22.833333 -40.43743 -5.229240 0.0084177
## 
## $clone
##               diff       lwr       upr     p adj
## 132-130 -21.912500 -50.61246  6.787460 0.2069306
## 272-130 -12.733333 -40.67932 15.212650 0.6900657
## 55-130  -23.177778 -51.12376  4.768206 0.1445359
## WT-130  -14.583333 -40.95678 11.790110 0.5169666
## 272-132   9.179167 -17.76644 36.124769 0.8645926
## 55-132   -1.265278 -28.21088 25.680324 0.9999228
## WT-132    7.329167 -17.98182 32.640153 0.9199356
## 55-272  -10.444444 -36.58552 15.696629 0.7823071
## WT-272   -1.850000 -26.30274 22.602735 0.9994875
## WT-55     8.594444 -15.85829 33.047180 0.8508616
diff lwr upr p adj
TN-IN -4.633333 -21.14744 11.880770 0.7740346
WA-IN -27.466667 -45.76138 -9.171956 0.0021463
WA-TN -22.833333 -40.43743 -5.229240 0.0084177
132-130 -21.912500 -50.61246 6.787460 0.2069306
272-130 -12.733333 -40.67932 15.212650 0.6900657
55-130 -23.177778 -51.12376 4.768206 0.1445359
WT-130 -14.583333 -40.95678 11.790110 0.5169666
272-132 9.179167 -17.76644 36.124769 0.8645926
55-132 -1.265278 -28.21088 25.680324 0.9999228
WT-132 7.329167 -17.98182 32.640152 0.9199356
55-272 -10.444444 -36.58552 15.696629 0.7823071
WT-272 -1.850000 -26.30274 22.602735 0.9994875
WT-55 8.594444 -15.85829 33.047180 0.8508616

## Anova Table (Type III tests)
## 
## Response: (Richness)
##             Sum Sq Df F value    Pr(>F)    
## (Intercept) 229897  1 571.626 < 2.2e-16 ***
## State         5649  2   7.023  0.002338 ** 
## Residuals    16892 42                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 229896.600 1 571.625912 0.0000000
State 5649.011 2 7.022986 0.0023378
Residuals 16891.567 42 NA NA
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = (Richness) ~ State, data = plant.pathogen.soil.rich.tabmet)
## 
## $State
##             diff       lwr       upr     p adj
## TN-IN  -4.633333 -21.66673 12.400061 0.7872673
## WA-IN -27.466667 -46.33666 -8.596674 0.0028297
## WA-TN -22.833333 -40.99099 -4.675674 0.0106638
diff lwr upr p adj
TN-IN -4.633333 -21.66673 12.400061 0.7872673
WA-IN -27.466667 -46.33666 -8.596674 0.0028297
WA-TN -22.833333 -40.99099 -4.675674 0.0106638

## Loading required package: magrittr
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:tidyr':
## 
##     extract
## 
## Attaching package: 'ggpubr'
## The following object is masked from 'package:ape':
## 
##     rotate
## The following object is masked from 'package:plyr':
## 
##     mutate

Arbuscular Mycorrhizae

Plotting of arbuscular.mycorrhizae Principal Coordinate Analysis

Then to plot the PCoA, I use the ggplot function. To use ggplot I need to first pull out the site scores for each sample and generate ellipses.

#Next I import the metadata for the particular habitat of interest. In this case I am working with drill shavings. I will use this metadata to subset the larger OTU file above. 
its2.soil.met<-read.table("Mothur_output/soilits2metadata_jnigra17_su19_ajo.txt",header=TRUE,sep="\t",fill=TRUE,strip.white=TRUE)

#Then we extract PCoA site scores (these will be the x and y coordinates for each sample)
arbuscular.mycorrhizae.soil.pcoavec<-as.data.frame(arbuscular.mycorrhizae.soil.pcoa.results$vectors)
arbuscular.mycorrhizae.soil.pcoasitescores<-data.frame(PC1=arbuscular.mycorrhizae.soil.pcoavec$Axis.1, PC2=arbuscular.mycorrhizae.soil.pcoavec$Axis.2)

#Create a new dataframe that includes the site scores from above with metadata from your study. 
arbuscular.mycorrhizae.soil.pcoagraph<-data.frame(arbuscular.mycorrhizae.soil.pcoasitescores,PC1=arbuscular.mycorrhizae.soil.pcoasitescores$PC1, PC2=arbuscular.mycorrhizae.soil.pcoasitescores$PC2, State=its2.soil.met$State, Clone=its2.soil.met$Clone,group=its2.soil.met$Group)

#This is where you make confidence ellipses. I don't know what all the code means. Just know where to plug in my objects. 
arbuscular.mycorrhizae.soil.pcoaellipse<-ordiellipse(arbuscular.mycorrhizae.soil.pcoasitescores,arbuscular.mycorrhizae.soil.pcoagraph$State, display="sites", kind="sd", draw="none")

df_ell.am <- data.frame()
for(g in levels(arbuscular.mycorrhizae.soil.pcoagraph$State)){
df_ell.am <- rbind(df_ell.am, cbind(as.data.frame(with(arbuscular.mycorrhizae.soil.pcoagraph[arbuscular.mycorrhizae.soil.pcoagraph$State==g,],                                                vegan:::veganCovEllipse(arbuscular.mycorrhizae.soil.pcoaellipse[[g]]$cov,arbuscular.mycorrhizae.soil.pcoaellipse[[g]]$center,arbuscular.mycorrhizae.soil.pcoaellipse[[g]]$scale))) ,State=g))}

#Plot PCoA in ggplot
library(ggplot2)
arbuscular.mycorrhizae.soil.pcoa<-ggplot(arbuscular.mycorrhizae.soil.pcoagraph, aes(PC1,PC2, colour=State))+
  #geom_text(aes(label=group))+
  geom_path(data=df_ell.am, aes(x=PC1, y=PC2, colour=State),size=0.5, linetype=1)+
  geom_point(aes(shape=Clone), size=3.5)+
  xlab("PC1 (12.7%)")+
  ylab("PC2 (10.0%)")+
 theme(axis.title.x=element_text(size=14, face="bold"))+
  theme(axis.title.y=element_text(size=14, face="bold"))+
  theme(axis.text.x=element_text(size=12, face="bold"))+
  theme(axis.text.y=element_text(size=12, face="bold"))+ 
  scale_color_manual(values=c("#006600","#3399FF","#FF9900"))+
  scale_shape_manual(values=c(16,10,8,7,0,2))+
  scale_x_continuous(breaks=c(-0.60,-0.30,0,0.30,0.60),limits=c(-0.60,0.60))+
  scale_y_continuous(breaks=c(-0.60,-0.30,0,0.30,0.60),limits=c(-.60,.60))+
  theme(panel.grid.major=element_blank(),
        panel.grid.minor=element_blank(),
    panel.border=element_rect(colour="black", size=1, fill=NA))+
  theme(panel.grid.major=element_blank(),
       panel.grid.minor=element_blank(),
       panel.background = element_blank(),
  panel.border=element_rect(color="black", size=1, fill=NA))+
  geom_text(data=NULL,aes(x=-0.60,y=0.60,label="Soil: Arbuscular Mycorrhizae"),hjust=0,colour="black")
arbuscular.mycorrhizae.soil.pcoa

Plant pathogens PERMANOVA

To formally test how well state and clone explain differences in drill shaving fungal community composition, I perform a PERMANOVA using the adonis function from vegan.

## 
## Call:
## adonis(formula = arbuscular.mycorrhizae.soil.relabund ~ its2.soil.met$State *      its2.soil.met$Clone, permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                                         Df SumsOfSqs MeanSqs F.Model      R2
## its2.soil.met$State                      2    2.4884 1.24418  3.6081 0.14540
## its2.soil.met$Clone                      4    1.2631 0.31578  0.9158 0.07381
## its2.soil.met$State:its2.soil.met$Clone  8    3.0174 0.37718  1.0938 0.17632
## Residuals                               30   10.3448 0.34483         0.60447
## Total                                   44   17.1137                 1.00000
##                                            Pr(>F)    
## its2.soil.met$State                     9.999e-05 ***
## its2.soil.met$Clone                        0.7087    
## its2.soil.met$State:its2.soil.met$Clone    0.2002    
## Residuals                                            
## Total                                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call:
## adonis(formula = arbuscular.mycorrhizae.soil.relabund ~ its2.soil.met$State +      its2.soil.met$Clone, permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                     Df SumsOfSqs MeanSqs F.Model      R2    Pr(>F)    
## its2.soil.met$State  2    2.4884 1.24418  3.5383 0.14540 9.999e-05 ***
## its2.soil.met$Clone  4    1.2631 0.31578  0.8980 0.07381     0.742    
## Residuals           38   13.3622 0.35164         0.78079              
## Total               44   17.1137                 1.00000              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call:
## adonis(formula = arbuscular.mycorrhizae.soil.relabund ~ its2.soil.met$State,      permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                     Df SumsOfSqs MeanSqs F.Model     R2    Pr(>F)    
## its2.soil.met$State  2    2.4884 1.24418   3.573 0.1454 9.999e-05 ***
## Residuals           42   14.6253 0.34822         0.8546              
## Total               44   17.1137                 1.0000              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Arbuscular Richness

Observed OTU richness is first calculated for each sample. Then due to the unbalanced nature of our study design, we use a type 3 ANOVA.

## Anova Table (Type III tests)
## 
## Response: (Richness)
##              Sum Sq Df F value    Pr(>F)    
## (Intercept) 10561.3  1 37.0754 1.088e-06 ***
## State         857.5  2  1.5052    0.2383    
## clone        1294.9  4  1.1365    0.3583    
## State:clone  3938.5  8  1.7282    0.1325    
## Residuals    8545.8 30                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 10561.3333 1 37.075378 0.0000011
State 857.5238 2 1.505161 0.2382716
clone 1294.9333 4 1.136460 0.3582601
State:clone 3938.4595 8 1.728237 0.1324945
Residuals 8545.8333 30 NA NA
## Warning: not plotting observations with leverage one:
##   34

## Warning: not plotting observations with leverage one:
##   34

## Anova Table (Type III tests)
## 
## Response: (Richness)
##              Sum Sq Df F value    Pr(>F)    
## (Intercept) 15362.6  1 46.7611 4.074e-08 ***
## State        4134.0  2  6.2916  0.004363 ** 
## clone         759.8  4  0.5781  0.680233    
## Residuals   12484.3 38                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 15362.6091 1 46.7610904 0.0000000
State 4133.9949 2 6.2915780 0.0043629
clone 759.7516 4 0.5781377 0.6802333
Residuals 12484.2928 38 NA NA
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = (Richness) ~ State + clone, data = arbuscular.mycorrhizae.soil.rich.tabmet)
## 
## $State
##             diff       lwr       upr     p adj
## TN-IN  -2.488889 -17.94307 12.965296 0.9186642
## WA-IN -23.933333 -41.05384 -6.812826 0.0043380
## WA-TN -21.444444 -37.91866 -4.970228 0.0081689
## 
## $clone
##                diff       lwr      upr     p adj
## 132-130   2.9130952 -23.94482 29.77102 0.9978893
## 272-130  -6.4878307 -32.64017 19.66450 0.9528813
## 55-130   -5.9322751 -32.08461 20.22006 0.9657002
## WT-130   -7.7341270 -32.41485 16.94660 0.8961570
## 272-132  -9.4009259 -34.61709 15.81524 0.8218660
## 55-132   -8.8453704 -34.06153 16.37079 0.8517573
## WT-132  -10.6472222 -34.33368 13.03924 0.7006212
## 55-272    0.5555556 -23.90771 25.01883 0.9999957
## WT-272   -1.2462963 -24.12959 21.63700 0.9998608
## WT-55    -1.8018519 -24.68515 21.08144 0.9993999
diff lwr upr p adj
TN-IN -2.4888889 -17.94307 12.965296 0.9186642
WA-IN -23.9333333 -41.05384 -6.812826 0.0043380
WA-TN -21.4444444 -37.91866 -4.970229 0.0081689
132-130 2.9130952 -23.94482 29.771015 0.9978893
272-130 -6.4878307 -32.64017 19.664505 0.9528813
55-130 -5.9322751 -32.08461 20.220060 0.9657002
WT-130 -7.7341270 -32.41485 16.946598 0.8961570
272-132 -9.4009259 -34.61709 15.815235 0.8218660
55-132 -8.8453704 -34.06153 16.370791 0.8517573
WT-132 -10.6472222 -34.33368 13.039237 0.7006212
55-272 0.5555556 -23.90771 25.018825 0.9999957
WT-272 -1.2462963 -24.12959 21.636997 0.9998608
WT-55 -1.8018519 -24.68515 21.081442 0.9993999

## Anova Table (Type III tests)
## 
## Response: (Richness)
##             Sum Sq Df  F value    Pr(>F)    
## (Intercept)  35429  1 112.3550 1.916e-13 ***
## State         4536  2   7.1919  0.002061 ** 
## Residuals    13244 42                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 35429.400 1 112.355014 0.0000000
State 4535.733 2 7.191942 0.0020605
Residuals 13244.044 42 NA NA
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = (Richness) ~ State, data = arbuscular.mycorrhizae.soil.rich.tabmet)
## 
## $State
##             diff       lwr       upr     p adj
## TN-IN  -2.488889 -17.57150 12.593723 0.9153984
## WA-IN -23.933333 -40.64220 -7.224463 0.0033227
## WA-TN -21.444444 -37.52256 -5.366327 0.0064784
diff lwr upr p adj
TN-IN -2.488889 -17.57150 12.593723 0.9153984
WA-IN -23.933333 -40.64220 -7.224463 0.0033227
WA-TN -21.444444 -37.52256 -5.366327 0.0064784

Mycoparasite

Plotting of mycoparasite Principal Coordinate Analysis

Then to plot the PCoA, I use the ggplot function. To use ggplot I need to first pull out the site scores for each sample and generate ellipses.

#Next I import the metadata for the particular habitat of interest. In this case I am working with drill shavings. I will use this metadata to subset the larger OTU file above. 
its2.soil.met<-read.table("Mothur_output/soilits2metadata_jnigra17_su19_ajo.txt",header=TRUE,sep="\t",fill=TRUE,strip.white=TRUE)

#Then we extract PCoA site scores (these will be the x and y coordinates for each sample)
mycoparasite.soil.pcoavec<-as.data.frame(mycoparasite.soil.pcoa.results$vectors)
mycoparasite.soil.pcoasitescores<-data.frame(PC1=mycoparasite.soil.pcoavec$Axis.1, PC2=mycoparasite.soil.pcoavec$Axis.2)

#Create a new dataframe that includes the site scores from above with metadata from your study. 
mycoparasite.soil.pcoagraph<-data.frame(mycoparasite.soil.pcoasitescores,PC1=mycoparasite.soil.pcoasitescores$PC1, PC2=mycoparasite.soil.pcoasitescores$PC2, State=its2.soil.met$State, Clone=its2.soil.met$Clone,group=its2.soil.met$Group)

#This is where you make confidence ellipses. I don't know what all the code means. Just know where to plug in my objects. 
mycoparasite.soil.pcoaellipse<-ordiellipse(mycoparasite.soil.pcoasitescores,mycoparasite.soil.pcoagraph$State, display="sites", kind="sd", draw="none")

df_ell.soil.mycop <- data.frame()
for(g in levels(mycoparasite.soil.pcoagraph$State)){
df_ell.soil.mycop <- rbind(df_ell.soil.mycop, cbind(as.data.frame(with(mycoparasite.soil.pcoagraph[mycoparasite.soil.pcoagraph$State==g,],                                                vegan:::veganCovEllipse(mycoparasite.soil.pcoaellipse[[g]]$cov,mycoparasite.soil.pcoaellipse[[g]]$center,mycoparasite.soil.pcoaellipse[[g]]$scale))) ,State=g))}

#Plot PCoA in ggplot
library(ggplot2)
mycoparasite.soil.pcoa<-ggplot(mycoparasite.soil.pcoagraph, aes(PC1,PC2, colour=State))+
  #geom_text(aes(label=group))+
  geom_path(data=df_ell.soil.mycop, aes(x=PC1, y=PC2, colour=State),size=0.5, linetype=1)+
  geom_point(aes(shape=Clone), size=3.5)+
  xlab("PC1 (29.3%)")+
  ylab("PC2 (14.1%)")+
 theme(axis.title.x=element_text(size=14, face="bold"))+
  theme(axis.title.y=element_text(size=14, face="bold"))+
  theme(axis.text.x=element_text(size=12, face="bold"))+
  theme(axis.text.y=element_text(size=12, face="bold"))+ 
  scale_color_manual(values=c("#006600","#3399FF","#FF9900"))+
  scale_shape_manual(values=c(16,10,8,7,0,2))+
  scale_x_continuous(breaks=c(-0.60,-0.30,0,0.30,0.60),limits=c(-0.60,0.60))+
  scale_y_continuous(breaks=c(-0.60,-0.30,0,0.30,0.60),limits=c(-.60,.60))+
  theme(panel.grid.major=element_blank(),
        panel.grid.minor=element_blank(),
    panel.border=element_rect(colour="black", size=1, fill=NA))+
  theme(panel.grid.major=element_blank(),
       panel.grid.minor=element_blank(),
       panel.background = element_blank(),
  panel.border=element_rect(color="black", size=1, fill=NA))+
  geom_text(data=NULL,aes(x=-0.60,y=0.60,label="Soil: Mycoparasite"),hjust=0,colour="black")
mycoparasite.soil.pcoa

Plant pathogens PERMANOVA

To formally test how well state and clone explain differences in drill shaving fungal community composition, I perform a PERMANOVA using the adonis function from vegan.

## 
## Call:
## adonis(formula = mycoparasite.soil.relabund ~ its2.soil.met$State *      its2.soil.met$Clone, permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                                         Df SumsOfSqs MeanSqs F.Model      R2
## its2.soil.met$State                      2    3.2406 1.62030  8.2947 0.27253
## its2.soil.met$Clone                      4    0.8611 0.21527  1.1020 0.07242
## its2.soil.met$State:its2.soil.met$Clone  8    1.9287 0.24108  1.2342 0.16220
## Residuals                               30    5.8603 0.19534         0.49285
## Total                                   44   11.8906                 1.00000
##                                            Pr(>F)    
## its2.soil.met$State                     9.999e-05 ***
## its2.soil.met$Clone                        0.3128    
## its2.soil.met$State:its2.soil.met$Clone    0.1286    
## Residuals                                            
## Total                                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call:
## adonis(formula = mycoparasite.soil.relabund ~ its2.soil.met$State +      its2.soil.met$Clone, permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                     Df SumsOfSqs MeanSqs F.Model      R2    Pr(>F)    
## its2.soil.met$State  2    3.2406 1.62030  7.9050 0.27253 9.999e-05 ***
## its2.soil.met$Clone  4    0.8611 0.21527  1.0502 0.07242    0.3813    
## Residuals           38    7.7889 0.20497         0.65505              
## Total               44   11.8906                 1.00000              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Call:
## adonis(formula = mycoparasite.soil.relabund ~ its2.soil.met$State,      permutations = 10000, method = "bray") 
## 
## Permutation: free
## Number of permutations: 10000
## 
## Terms added sequentially (first to last)
## 
##                     Df SumsOfSqs MeanSqs F.Model      R2    Pr(>F)    
## its2.soil.met$State  2    3.2406 1.62030  7.8674 0.27253 9.999e-05 ***
## Residuals           42    8.6500 0.20595         0.72747              
## Total               44   11.8906                 1.00000              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Mycoparasite Richness

Observed OTU richness is first calculated for each sample. Then due to the unbalanced nature of our study design, we use a type 3 ANOVA.

## Anova Table (Type III tests)
## 
## Response: (Richness)
##              Sum Sq Df F value    Pr(>F)    
## (Intercept) 15408.3  1 57.9987 1.711e-08 ***
## State         670.1  2  1.2612    0.2979    
## clone         473.1  4  0.4452    0.7750    
## State:clone   957.8  8  0.4507    0.8804    
## Residuals    7970.0 30                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 15408.3333 1 57.9987453 0.0000000
State 670.0952 2 1.2611579 0.2979181
clone 473.0667 4 0.4451694 0.7749903
State:clone 957.7958 8 0.4506567 0.8803600
Residuals 7970.0000 30 NA NA
## Warning: not plotting observations with leverage one:
##   34

## Warning: not plotting observations with leverage one:
##   34

## Anova Table (Type III tests)
## 
## Response: (Richness)
##              Sum Sq Df  F value    Pr(>F)    
## (Intercept) 26409.7  1 112.4094 6.554e-13 ***
## State        2551.0  2   5.4289  0.008436 ** 
## clone         814.2  4   0.8664  0.492921    
## Residuals    8927.8 38                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = (Richness) ~ State + clone, data = mycoparasite.soil.rich.tabmet)
## 
## $State
##             diff       lwr       upr     p adj
## TN-IN -11.811111 -24.87993  1.257705 0.0833455
## WA-IN -20.200000 -34.67794 -5.722060 0.0044173
## WA-TN  -8.388889 -22.32029  5.542516 0.3171462
## 
## $clone
##               diff       lwr       upr     p adj
## 132-130 -14.057738 -36.77011  8.654636 0.4042172
## 272-130  -8.213757 -30.32945 13.901940 0.8238620
## 55-130   -6.547090 -28.66279 15.568607 0.9138211
## WT-130  -10.039683 -30.91091 10.831549 0.6455642
## 272-132   5.843981 -15.48004 27.168004 0.9335086
## 55-132    7.510648 -13.81337 28.834670 0.8498930
## WT-132    4.018056 -16.01238 24.048487 0.9780459
## 55-272    1.666667 -19.02067 22.354007 0.9993431
## WT-272   -1.825926 -21.17716 17.525309 0.9987765
## WT-55    -3.492593 -22.84383 15.858642 0.9851546
diff lwr upr p adj
TN-IN -11.811111 -24.87993 1.257705 0.0833455
WA-IN -20.200000 -34.67794 -5.722060 0.0044173
WA-TN -8.388889 -22.32029 5.542516 0.3171462
132-130 -14.057738 -36.77011 8.654636 0.4042172
272-130 -8.213757 -30.32945 13.901940 0.8238620
55-130 -6.547090 -28.66279 15.568607 0.9138211
WT-130 -10.039682 -30.91091 10.831549 0.6455642
272-132 5.843981 -15.48004 27.168004 0.9335086
55-132 7.510648 -13.81337 28.834670 0.8498930
WT-132 4.018056 -16.01238 24.048486 0.9780459
55-272 1.666667 -19.02067 22.354007 0.9993431
WT-272 -1.825926 -21.17716 17.525309 0.9987765
WT-55 -3.492593 -22.84383 15.858642 0.9851546

## Anova Table (Type III tests)
## 
## Response: (Richness)
##             Sum Sq Df  F value    Pr(>F)    
## (Intercept)  54964  1 236.9633 < 2.2e-16 ***
## State         2807  2   6.0507  0.004907 ** 
## Residuals     9742 42                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq Df F value Pr(>F)
(Intercept) 54964.267 1 236.963310 0.0000000
State 2806.967 2 6.050732 0.0049073
Residuals 9742.011 42 NA NA
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = (Richness) ~ State, data = mycoparasite.soil.rich.tabmet)
## 
## $State
##             diff       lwr       upr     p adj
## TN-IN -11.811111 -24.74683  1.124607 0.0796851
## WA-IN -20.200000 -34.53049 -5.869509 0.0038856
## WA-TN  -8.388889 -22.17841  5.400633 0.3116053
diff lwr upr p adj
TN-IN -11.811111 -24.74683 1.124607 0.0796851
WA-IN -20.200000 -34.53049 -5.869509 0.0038856
WA-TN -8.388889 -22.17841 5.400633 0.3116053