DAG = create_empty_DAG(7)
DAG = bnlearn::set.arc(DAG, 'U1', 'U2')
DAG = bnlearn::set.arc(DAG, 'U1', 'U3')
DAG = bnlearn::set.arc(DAG, 'U1', 'U4')
DAG = bnlearn::set.arc(DAG, 'U2', 'U4')
DAG = bnlearn::set.arc(DAG, 'U3', 'U4')
DAG = bnlearn::set.arc(DAG, 'U4', 'U6')
DAG = bnlearn::set.arc(DAG, 'U5', 'U6')
DAG = bnlearn::set.arc(DAG, 'U4', 'U7')
DAG = bnlearn::set.arc(DAG, 'U6', 'U7')DAG |> bnlearn::as.igraph() |>
igraph::plot.igraph(size = 20, label.cex = 2
# , layout = igraph::layout_as_tree
)order_hash = r2r::hashmap()
order_hash[['U4']] = c("U2", "U1", "U3")
order_hash[['U6']] = c("U4", "U5")
order_hash[['U7']] = c("U4", "U6")
complete_and_check_orders(DAG, order_hash)
fam = matrix(c(0, 1, 1, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0),
byrow = TRUE, ncol = 7)
tau = 0.2 * fam
my_PCBN = new_PCBN(
DAG, order_hash,
copula_mat = list(tau = tau, fam = fam))
N = 100
mydata = PCBN_sim(my_PCBN, N = N)We prepare the environment for storing the results.
We start by fitting the copula of \((U_1, U_2)\).
BiCopCondFit(data = mydata, DAG = DAG, v = "U1", w = "U2",
cond_set = c(), familyset = 1, order_hash = order_hash, e = e,
method = "mle")
#> Estimating the copula of U1 and U2
#> Bivariate copula: Gaussian (par = 0.27, tau = 0.17)This is stored in e$copula_hash. To access it, we can
do:
copula_key = e$keychain[[list(margins = c("U1", "U2"), cond = character(0))]]
e$copula_hash[[copula_key]]
#> Bivariate copula: Gaussian (par = 0.27, tau = 0.17)There is a level of indirection, because the copula_key
actually stores the whole computation tree. Therefore, if the
statistician decides to use a different PCBN for the same structure,
some estimated parts can be reused if the computation path is the
same.
Let’s see what is actually the copula_key.
The computation trees of these copulas can be found as before.
e$keychain[[list(margins = c("U2", "U4"), cond = character(0))]] |>
data.tree::FromListSimple() |>
print()
#> levelName
#> 1 U2, U4
#> 2 ¦--U2
#> 3 °--U4e$keychain[[list(margins = c("U1", "U4"), cond = c("U2"))]] |>
data.tree::FromListSimple() |>
print()
#> levelName
#> 1 U1, U4 | U2
#> 2 ¦--U1 | U2
#> 3 ¦ °--U1, U2
#> 4 ¦ ¦--U1
#> 5 ¦ °--U2
#> 6 °--U4 | U2
#> 7 °--U2, U4
#> 8 ¦--U2
#> 9 °--U4e$keychain[[list(margins = c("U3", "U4"), cond = c("U1", "U2"))]] |>
data.tree::FromListSimple() |>
print()
#> levelName
#> 1 U3, U4 | U1, U2
#> 2 ¦--U3 | U1
#> 3 ¦ °--U1, U3
#> 4 ¦ ¦--U1
#> 5 ¦ °--U3
#> 6 °--U4 | U1, U2
#> 7 °--U1, U4 | U2
#> 8 ¦--U1 | U2
#> 9 ¦ °--U1, U2
#> 10 ¦ ¦--U1
#> 11 ¦ °--U2
#> 12 °--U4 | U2
#> 13 °--U2, U4
#> 14 ¦--U2
#> 15 °--U4In the same way, we obtain the computation tree of the margin \(U_4 \, | \, U_1, U_2\) by:
e$keychain[[list(margin = c("U4"), cond = c("U1", "U2"))]] |>
data.tree::FromListSimple() |>
print()
#> levelName
#> 1 U4 | U1, U2
#> 2 °--U1, U4 | U2
#> 3 ¦--U1 | U2
#> 4 ¦ °--U1, U2
#> 5 ¦ ¦--U1
#> 6 ¦ °--U2
#> 7 °--U4 | U2
#> 8 °--U2, U4
#> 9 ¦--U2
#> 10 °--U4You can remark that this is a sub-tree of the computation tree of the copula of \((U_3, U_4) \, | \, U_1, U_2\). This is coherent, because we needed the margin \(U_4 \, | \, U_1, U_2\) to estimate this copula. Nevertheless, note that
This is because the margin \(U_3 \, | \, U_1, U_2\) is actually the same as \(U_3 \, | \, U_1\) by conditional independence. This can be seen using this function:
The conditional marginal pseudo-observations can be found in
e$margin_hash. For example, the conditional
pseudo-observations \(\hat U_{i, \,
3|1}\), \(i=1, \dots, n\), can
be obtained by:
e$margin_hash[[ e$keychain[[list(margin = c("U3"), cond = c("U1"))]] ]] |>
head()
#> [1] 0.7016929 0.3503202 0.2715592 0.9891115 0.1999063 0.1960950These conditional margins are internally computed by the function
ComputeCondMargin.