1. A genetic network linking genotype and phenotype

In this example, we will use the structural constraint interface to create a genetic network linking a genotype with gene expression traits and a higher-order phenotype. There are 5 nodes in the network: Genotype, Gene1, Gene2, Gene3, and Phenotype. The input data file is available here. To begin, select Learn a network model from data on the BNW homepage and load the data file, which displays the page shown below:


We now have the option of either performing structure learning using the default BNW settings and no additional structural constraints, or modifying the settings and adding structural constraints. Here, we choose the latter by selecting Go to structure learning settings and the BNW structural constraint interface. The BNW structural constraint interface is described in our Help page. As shown in the figure below, the top of the page has several settings for global features of the structure search. We have kept the default settings, except that we have changed the Number of networks to include in model averaging to 1000. As this is a small network with only 5 nodes, including many high scoring networks had little effect on the estimated time required to perform structure learning: the estimated run time increased from 12 to 13 seconds. We could also increase the Maximum number of parents for any node to any value without significantly changing the run time for a network of this size. For larger networks with more than approximately 10 nodes, changing these settings can have a major impact on estimated run times.
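
To give a feel for why network size matters so much, the sketch below counts the number of possible directed acyclic graphs (DAGs) on n labeled nodes using Robinson's recurrence, and the number of candidate parent sets for a single node under a maximum-parents limit. It is only an illustration of how the search space grows; it makes no claim about how BNW searches that space or estimates run times, and the function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


def count_parent_sets(n_nodes: int, max_parents: int) -> int:
    """Candidate parent sets for one node when at most max_parents are allowed."""
    others = n_nodes - 1  # a node cannot be its own parent
    return sum(comb(others, j) for j in range(min(max_parents, others) + 1))


# Compare a 5-node network with larger ones.
for n in (5, 10, 15):
    print(n, count_dags(n), count_parent_sets(n, 3))
```

With 5 nodes there are 29281 possible DAGs, which is small enough that generous settings cost little; the count grows super-exponentially as nodes are added.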


In the next section of the structural constraint interface, we assign nodes to tiers, which can be used to focus network searches on biologically meaningful networks. Our dataset contains a genotype, three gene expression traits, and a higher order phenotype. Instead of considering all possible network models for this dataset, we may want to focus on models relevant to a question such as: How does variation in genotype and gene expression explain the variation observed in the phenotype? To address this question, we assign the network nodes to three tiers: Tier1 contains the genotype, Tier2 contains the gene expression traits, and Tier3 contains the phenotype.


The third section of the BNW structural constraint interface allows users to specify the interactions that are allowed within and between tiers. By default, within tier interactions (i.e., nodes in TierX can be parents or children of other nodes in TierX) are allowed, but users may want to prevent within tier interactions in some cases. For example, it may be advantageous to prevent interactions within a tier that contains a set of demographic variables (e.g., age, sex, and race), as these variables are not likely to be causal factors that influence other variables in the tier. In this example, within tier interactions are only relevant to Tier2, the only tier with more than one node, and we have no prior knowledge indicating that between gene interactions should not be allowed, so we will keep the default setting and allow within tier interactions.

The default settings for between tier interactions allow nodes within a tier to be the parents of all nodes in lower ranking tiers. Here, the Genotype node in Tier1 can be a parent of the gene nodes in Tier2 and the Phenotype node in Tier3, the gene nodes in Tier2 can be parents of the Tier3 Phenotype node, and the Tier3 Phenotype node cannot be a parent of nodes in any other tier. Therefore, by default, the Genotype node can be the direct parent of the Phenotype node. We may want to allow this interaction, as the genotype may influence the phenotype through genes or other factors that are not explicitly included as variables in the network. If users do not want to allow this direct Genotype-Phenotype interaction, they can uncheck the Tier3 box under the question Which tiers contain nodes that can be the children of this tier? for Tier1. In this case, we have kept the default settings, which are shown below.
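
The within tier and between tier rules described above can be sketched in a few lines of Python. This is a minimal illustration of the rules as stated in this tutorial, not BNW's own code; the function `allowed_edges` and its parameters are hypothetical names introduced here.

```python
from itertools import permutations

# Tier assignment from this example (Tier1 is the highest-ranking tier).
tiers = {
    "Genotype": 1,
    "Gene1": 2, "Gene2": 2, "Gene3": 2,
    "Phenotype": 3,
}


def allowed_edges(tiers, within_tier=None, blocked_tier_pairs=()):
    """Return the set of (parent, child) edges permitted by the tier rules.

    within_tier: dict mapping tier -> bool; tiers not listed default to True.
    blocked_tier_pairs: (parent_tier, child_tier) pairs to disallow.
    """
    within_tier = within_tier or {}
    blocked = set(blocked_tier_pairs)
    edges = set()
    for parent, child in permutations(tiers, 2):
        tp, tc = tiers[parent], tiers[child]
        if tp == tc:
            ok = within_tier.get(tp, True)
        else:
            # Only higher-ranking tiers may be parents of lower-ranking tiers.
            ok = tp < tc and (tp, tc) not in blocked
        if ok:
            edges.add((parent, child))
    return edges


default_edges = allowed_edges(tiers)
# Equivalent of unchecking the Tier3 box for Tier1.
no_direct = allowed_edges(tiers, blocked_tier_pairs=[(1, 3)])

print(len(default_edges), len(no_direct))
```

Under the default rules, 13 of the 20 possible directed edges among the 5 nodes remain candidates; blocking Tier1 to Tier3 removes only the Genotype to Phenotype edge.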


In this example, we will not specify any additional constraints in the fourth section of the structural constraint interface, so we can click Perform Bayesian network modeling in the upper left corner of the page. The figure below shows the model averaged network of the 1000 highest scoring networks; this network is available here. Genotype directly influences two of the genes (Gene1 and Gene3), and two of the genes (Gene2 and Gene3) directly influence the Phenotype. In this case, although we did not prevent the Genotype from directly influencing the Phenotype, the highest scoring networks did not include this directed edge. The right side of the figure shows the predictions of the network with Genotype=1 used as evidence. If Genotype is known to be in state 1, the values of all other variables in the network are expected to decrease compared to the distributions learned using all phenotypes. A more complete description of using BNW to make predictions with network models can be found in a separate tutorial.
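
As a rough illustration of what averaging over high scoring networks can mean, the sketch below weights each network by the exponential of its log score and sums those weights per edge. This tutorial does not describe BNW's actual averaging procedure, so the weighting scheme is an assumption, and the three networks and their scores are made-up toy values rather than BNW output.

```python
import math
from collections import defaultdict


def edge_weights(scored_networks):
    """Average edges over networks, weighting each by exp(log_score).

    scored_networks: list of (log_score, set of (parent, child) edges).
    Returns a dict mapping each edge to its weight in [0, 1].
    """
    best = max(score for score, _ in scored_networks)
    # Subtract the best score before exponentiating for numerical stability.
    weights = [math.exp(score - best) for score, _ in scored_networks]
    total = sum(weights)
    averaged = defaultdict(float)
    for weight, (_, edges) in zip(weights, scored_networks):
        for edge in edges:
            averaged[edge] += weight / total
    return dict(averaged)


# Hypothetical toy input: made-up networks with made-up log scores.
toy = [
    (-100.0, {("Genotype", "Gene1"), ("Gene2", "Phenotype")}),
    (-100.0, {("Genotype", "Gene1"), ("Gene3", "Phenotype")}),
    (-150.0, {("Genotype", "Phenotype")}),
]
avg = edge_weights(toy)
for edge, weight in sorted(avg.items()):
    print(edge, round(weight, 3))
```

In this toy case, an edge that appears only in a much lower scoring network receives a weight near zero, which is one way an allowed edge such as Genotype to Phenotype can be absent from an averaged network.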


2. A genetic network with multiple genotypes and cis- and trans-regulated genes

In this example, we will add restrictions to a network containing 8 nodes: 2 genotype nodes (Geno1 and Geno2), 3 cis-regulated gene expression traits (cisGene1, cisGene2, and cisGene3), 2 trans-regulated gene expression traits (transGene1 and transGene2), and a phenotype (Pheno). We have four tiers of nodes (genotypes, cis-regulated genes, trans-regulated genes, and phenotype), so we have selected 4 from the dropdown menu at the top of the page and assigned the nodes to the correct tiers. We could make a more complex system of tiers that would allow us to specify which genes are regulated by which genotypes (for example, Geno1 regulates cisGene1 and transGene1, while Geno2 regulates cisGene2, cisGene3, and transGene2), but, for this example, we will use a simpler system of 4 tiers.

We have made one change to the default settings in the Define interactions allowed between tiers section. For Tier1, which contains the genotypes, we have selected "No" for the question "Are within tier interactions allowed?", as it does not make biological sense for variation in one genotype to cause variation in another genotype in this example.

Assume that a known regulatory relationship between cisGene1 and transGene1 has been established from previous experiments. We can require that this relationship be included in the network by adding it to the list of required edges in the Specify additional constraints section.
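
Before entering a required edge, it can help to confirm that it does not conflict with the tier settings. The sketch below encodes the four tiers and the Tier1 within tier restriction from this example and checks a required edge against them. The direction cisGene1 to transGene1 is an assumption (the text does not state it, but it is the only direction the tier ordering permits), the helper `edge_permitted` is a hypothetical name, and the text format BNW expects for the edge list is not shown here.

```python
# Tier assignment for the second example (Tier1 is the highest-ranking tier).
tiers = {
    "Geno1": 1, "Geno2": 1,
    "cisGene1": 2, "cisGene2": 2, "cisGene3": 2,
    "transGene1": 3, "transGene2": 3,
    "Pheno": 4,
}

# Within tier interactions are set to "No" for Tier1 only.
within_tier_allowed = {1: False, 2: True, 3: True, 4: True}


def edge_permitted(parent, child):
    """True if the tier settings allow parent -> child."""
    tp, tc = tiers[parent], tiers[child]
    if tp == tc:
        return parent != child and within_tier_allowed[tp]
    # Between tiers, only higher-ranking tiers may be parents.
    return tp < tc


# Assumed direction of the known regulatory relationship.
required_edges = [("cisGene1", "transGene1")]

# Required edges that the tier settings would forbid.
conflicts = [edge for edge in required_edges if not edge_permitted(*edge)]
print("conflicting required edges:", conflicts)
```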