Estimating the skeleton of a directed acyclic graph (DAG) is important for understanding the underlying DAG, and causal effects can be assessed from the skeleton when the DAG itself is not identifiable. We propose a novel method, PenPC, that estimates the skeleton of a high-dimensional DAG in two steps. We first estimate the non-zero entries of a concentration matrix using penalized regression, and then resolve the differences between the concentration matrix and the skeleton by evaluating a set of conditional independence hypotheses. In extensive simulations and real data studies, PenPC has significantly higher sensitivity and specificity than the PC algorithm. We also study the asymptotic properties of PenPC in high-dimensional settings (the number of vertices p grows at either a polynomial or an exponential rate in the sample size n) for two classes of graphs: traditional random graph models, where the number of connections of each vertex is limited, and scale-free DAGs, where one vertex may be connected to a large number of neighbors.
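To make the two-step idea concrete, here is a minimal Python sketch. It is not the PenPC implementation. It assumes a plain lasso as a stand-in for the penalized regression of step 1, and a simplified Fisher z-test on partial correlations for the conditional independence hypotheses of step 2. The function names (`lasso_cd`, `candidate_edges`, `prune_edges`), the tuning values `lam`, `alpha`, and `max_cond`, and the choice to condition on subsets of the union of both endpoints' neighbors are all illustrative assumptions.

```python
import itertools
import math
from statistics import NormalDist

import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent lasso: minimise (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()  # residual y - Xb (b starts at zero)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]  # add back coordinate j's contribution
            rho = X[:, j] @ r / n
            # Soft-thresholding update for coordinate j.
            b[j] = math.copysign(max(abs(rho) - lam, 0.0), rho) / col_sq[j]
            r -= X[:, j] * b[j]
    return b


def candidate_edges(X, lam):
    """Step 1 (stand-in): lasso neighbourhood selection, combined by an OR rule."""
    p = X.shape[1]
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise columns
    edges = set()
    for i in range(p):
        others = [k for k in range(p) if k != i]
        coef = lasso_cd(Z[:, others], Z[:, i], lam)
        for k, c in zip(others, coef):
            if c != 0.0:
                edges.add((min(i, k), max(i, k)))
    return edges


def is_dependent(corr, n, i, j, cond, alpha):
    """Fisher z-test of the partial correlation of i and j given cond."""
    idx = [i, j] + list(cond)
    prec = np.linalg.pinv(corr[np.ix_(idx, idx)])
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep atanh finite
    z = math.sqrt(n - len(cond) - 3) * abs(math.atanh(r))
    return z > NormalDist().inv_cdf(1 - alpha / 2)


def prune_edges(X, edges, alpha=0.01, max_cond=2):
    """Step 2 (simplified): drop an edge if some conditioning set separates it."""
    n, p = X.shape
    corr = np.corrcoef(X, rowvar=False)
    nbrs = {v: set() for v in range(p)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    kept = set()
    for i, j in sorted(edges):
        # Assumption: conditioning sets come from the step-1 neighbours of i or j.
        pool = sorted((nbrs[i] | nbrs[j]) - {i, j})
        separated = any(
            not is_dependent(corr, n, i, j, cond, alpha)
            for size in range(min(max_cond, len(pool)) + 1)
            for cond in itertools.combinations(pool, size)
        )
        if not separated:
            kept.add((i, j))
    return kept
```

A collider such as X0 → X2 ← X1 shows why the second step is needed. The concentration matrix has a non-zero entry between the two parents X0 and X1 even though they are not adjacent in the skeleton, so step 1 proposes that edge. X0 and X1 are marginally independent, so a test with an empty conditioning set in step 2 can remove it. Calling `prune_edges(X, candidate_edges(X, lam=0.1))` on data simulated from such a graph would be one way to try this out.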