Introduction to `caugi`
# dev version from GitHub
install.packages("pak",
repos =
sprintf(
"https://r-lib.github.io/p/pak/stable/%s/%s/%s",
.Platform$pkgType,
R.Version()$os,
R.Version()$arch
)
)
pak::pak("frederikfabriciusbjerre/caugi")
# ... or wait for the first CRAN release
# install.packages("caugi")
What is caugi?
caugi (pronounced “corgi”) stands for Causal
Graph Interface.
caugi is a small R interface to a fast Rust backend for
causal graphs. You define graphs with readable infix operators, query
relations, and keep R code nice and tidy, while Rust handles
performance.
The basic object: caugi_graph
A caugi_graph is the bread and butter of
caugi. It is easy to create, query, and modify.
You can create simple graphs and assign them one of several predefined
graph classes. Currently, only "Unknown",
"DAG", and "PDAG" are supported. We plan to support
several other causal graph types in future releases, such as
"PAG", "CPDAG", "MAG",
"SWIG", and "ADMG".
# a tiny DAG
cg <- caugi_graph(
A %-->% B + C,
B %-->% D,
C %-->% D,
class = "DAG", # optional, guarantees acyclicity by construction
simple = TRUE, # default
build = TRUE # build now; otherwise built lazily on first query
)
Edge operators
- %-->% directed
- %---% undirected
- %<->% bidirected
- %o->% partially directed
- %o--% partially undirected
- %o-o% partial
You can register more types with register_caugi_edge(),
if you find that you need a more expressive set of edges. For example,
if you want to represent a directed edge in the reverse direction, you
can do so like this:
register_caugi_edge(
glyph = "<--",
tail_mark = "arrow",
head_mark = "tail",
class = "directed",
symmetric = FALSE,
flags = c("TRAVERSABLE_WHEN_CONDITIONED")
)
caugi_graph(A %-->% B, B %<--% C, class = "PDAG")
#> # A tibble: 3 × 1
#> name
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> # A tibble: 2 × 3
#> from edge to
#> <chr> <chr> <chr>
#> 1 A --> B
#> 2 B <-- C
# reset the registry to default with original edges
reset_caugi_registry()
We expect this feature to need further polishing in future releases, and we would love your input if you use it!
(Lazy) building and the Rust backend
caugi graphs are represented in a compact Compressed
Sparse Row (CSR) format in Rust. caugi follows a front-loading
philosophy: because the graph is stored in CSR format, mutating it is
computationally expensive compared to other graph storage schemes, but
querying it is very fast. In addition to the graph itself,
caugi stores extra information about node relations, so that many
relational queries, such as nb(cg, A) or ch(cg, B), are fast
look-ups.
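To make the trade-off concrete, here is a minimal base R sketch of a CSR-style layout for the tiny DAG from earlier. It is an illustration only and not caugi's Rust implementation; the helpers to_csr() and csr_children() are hypothetical names introduced here.

```r
# Toy DAG from the earlier example: A -> B, A -> C, B -> D, C -> D.
nodes <- c("A", "B", "C", "D")
edges <- data.frame(
  from = c("A", "A", "B", "C"),
  to   = c("B", "C", "D", "D")
)

# Hypothetical helper: build a CSR structure for outgoing edges.
to_csr <- function(nodes, edges) {
  from_id <- match(edges$from, nodes)
  to_id <- match(edges$to, nodes)
  ord <- order(from_id, to_id)
  counts <- tabulate(from_id, nbins = length(nodes))
  list(
    nodes = nodes,
    # children of node i sit at positions row_ptr[i] .. row_ptr[i + 1] - 1
    row_ptr = c(1L, 1L + cumsum(counts)),
    col_idx = to_id[ord]
  )
}

# Hypothetical helper: the children of a node are one contiguous slice.
csr_children <- function(csr, node) {
  i <- match(node, csr$nodes)
  start <- csr$row_ptr[i]
  end <- csr$row_ptr[i + 1L] - 1L
  if (end < start) {
    return(character(0))
  }
  csr$nodes[csr$col_idx[start:end]]
}

csr <- to_csr(nodes, edges)
csr_children(csr, "A")
csr_children(csr, "D")
```

A query only reads two entries of row_ptr and a slice of col_idx. Inserting a single edge, by contrast, shifts entries in both vectors, which is why a CSR layout favors building once and querying often.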
To offset the cost of mutations, caugi graphs
are built lazily. When you mutate the graph, for example
by adding edges to it, the edits are stored in R, but not yet in Rust.
The next time you query the graph, it is rebuilt in
Rust, and the query is executed on the newly built graph. You can
also call build(cg) to force building the graph
in Rust at any time.
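The lazy-build pattern can be sketched in a few lines of base R. This is a toy model under stated assumptions, not caugi's internals: the "backend" is just a named list, and new_lazy_graph(), lazy_add_edge(), lazy_build(), and lazy_children() are hypothetical names used only for illustration.

```r
# Hypothetical lazy graph: edits are queued in R, and the "backend"
# (here a plain adjacency list) is rebuilt only when a query needs it.
new_lazy_graph <- function() {
  g <- new.env()
  g$edges <- data.frame(from = character(0), to = character(0))
  g$built <- NULL   # stands in for the structure held by the backend
  g$n_builds <- 0L  # counts rebuilds, for illustration only
  g
}

lazy_add_edge <- function(g, from, to) {
  g$edges <- rbind(g$edges, data.frame(from = from, to = to))
  g$built <- NULL   # invalidate the backend; nothing is rebuilt yet
  invisible(g)
}

lazy_build <- function(g) {
  if (is.null(g$built)) {
    g$built <- split(g$edges$to, g$edges$from)
    g$n_builds <- g$n_builds + 1L
  }
  invisible(g)
}

lazy_children <- function(g, node) {
  lazy_build(g)  # a query triggers the build if edits are pending
  kids <- g$built[[node]]
  if (is.null(kids)) character(0) else kids
}

g <- new_lazy_graph()
lazy_add_edge(g, "A", "B")
lazy_add_edge(g, "A", "C")
lazy_add_edge(g, "B", "D")
g$n_builds             # no build has happened yet
lazy_children(g, "A")
lazy_children(g, "B")
g$n_builds             # both queries shared a single build
```

Batching edits this way means three mutations cost one rebuild instead of three, which is the motivation for deferring the build until a query arrives.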
Nodes, grouping, and +
Use + to fan out from one side, and c(...)
or parentheses to group.
caugi_graph(
X %-->% Y + Z, # X → Y and X → Z
c(Y, Z) %-->% W, # Y → W and Z → W
(A + B) %---% (C + D) # all undirected pairs across groups
)
#> # A tibble: 8 × 1
#> name
#> <chr>
#> 1 X
#> 2 Y
#> 3 Z
#> 4 A
#> 5 B
#> 6 W
#> 7 C
#> 8 D
#> # A tibble: 8 × 3
#> from edge to
#> <chr> <chr> <chr>
#> 1 A --- C
#> 2 A --- D
#> 3 B --- C
#> 4 B --- D
#> 5 X --> Y
#> 6 X --> Z
#> 7 Y --> W
#> 8 Z --> W
Unconnected symbols also declare nodes:
caugi_graph(A, B, C) # declares three isolated nodes
#> # A tibble: 3 × 1
#> name
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> # A tibble: 0 × 3
#> # ℹ 3 variables: from <chr>, edge <chr>, to <chr>
Queries
Relations
You can query relations such as parents, children, and neighbors. Nodes can be given as symbols, character strings, or indices.
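As a small sketch, the two query functions named earlier, ch() and nb(), can be called with either a symbol or a character string. This assumes the caugi package is installed and that a character string can be passed in the same position as a symbol; the block is skipped otherwise, and no output is shown because it depends on the installed version.

```r
# Assumes the caugi package is installed; skipped otherwise.
if (requireNamespace("caugi", quietly = TRUE)) {
  library(caugi)

  cg <- caugi_graph(
    A %-->% B + C,
    B %-->% D,
    C %-->% D,
    class = "DAG"
  )

  ch_symbol <- ch(cg, B)    # children of B, node given as a symbol
  ch_string <- ch(cg, "B")  # same query, node given as a character string
  nb_a <- nb(cg, A)         # neighbors of A

  print(ch_symbol)
  print(ch_string)
  print(nb_a)
}
```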
Properties
You can check graph properties with the following functions.
is_acyclic(cg)
#> [1] TRUE
is_dag(cg)
#> [1] FALSE
is_pdag(cg)
#> [1] TRUE
Modifying graphs
You can add, remove, or set edges and nodes. Changes are applied on
the next query or when you call build(). We provide both a
non-standard evaluation interface with infix operators:
cg <- caugi_graph(
A %-->% B + C,
B %-->% D
)
cg <- add_edges(cg, E %-->% F)
or you can use standard evaluation:
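A standard-evaluation call might look like the sketch below. The argument names from, edge, and to are an assumption here, chosen to mirror the columns of the edge table that caugi prints; check ?add_edges for the actual signature before relying on them. The block is skipped when caugi is not installed.

```r
# Assumes the caugi package is installed; skipped otherwise.
if (requireNamespace("caugi", quietly = TRUE)) {
  library(caugi)

  cg <- caugi_graph(
    A %-->% B + C,
    B %-->% D
  )

  # Assumed argument names (from, edge, to), passed as character vectors.
  cg <- add_edges(cg, from = "E", edge = "-->", to = "F")
  print(cg)
}
```

Passing plain character vectors is convenient when edges are generated programmatically, for example from a data frame of variable pairs.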
Besides add_edges(), you can also use
remove_edges(), set_edges(),
add_nodes(), remove_nodes(),
set_nodes(), and subgraph().
S7 and Safety
caugi uses the S7 object system to ensure mutation
safety. It is deliberately hard to mutate
graphs without using the provided functions. This ensures that
graphs are always valid, and that you do not accidentally create
situations where queries are invalid or the graph itself is invalid.
The S7 object system is not widely used yet, but we found that the S3 object system was too lenient, and it was hard for us to ensure that graphs and queries were always valid. We found that S4 was too heavy and cumbersome to work with.
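To illustrate the kind of guarantee S7 offers, here is a minimal sketch of a class with a validator. The class toy_graph is hypothetical and is not caugi's actual class definition; it only shows how S7 re-validates an object when a property is assigned directly. The block assumes the S7 package is installed and is skipped otherwise.

```r
# Assumes the S7 package is installed; skipped otherwise.
if (requireNamespace("S7", quietly = TRUE)) {
  library(S7)

  # Hypothetical class for illustration, not caugi's definition.
  toy_graph <- new_class(
    "toy_graph",
    properties = list(
      nodes = class_character,
      from = class_character,
      to = class_character
    ),
    # A validator returns NULL when the object is valid,
    # or a character string describing the problem.
    validator = function(self) {
      if (length(self@from) != length(self@to)) {
        "@from and @to must have the same length"
      } else if (!all(c(self@from, self@to) %in% self@nodes)) {
        "every edge endpoint must be a declared node"
      }
    }
  )

  g <- toy_graph(nodes = c("A", "B"), from = "A", to = "B")

  # Direct assignment that breaks an invariant is rejected.
  result <- tryCatch(
    {
      g@to <- "Z"
      "accepted"
    },
    error = function(e) "rejected"
  )
  print(result)
}
```

With S3, the equivalent assignment to a list element would succeed silently and leave the object in an inconsistent state, which is the leniency the paragraph above refers to.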
Future work
We plan on adding the following features to caugi in the
future:
- Graph classes: PAG, CPDAG, MAG, SWIG, ADMG,
- Adjustment sets,
- Structural Hamming Distance,
- Adjustment Identification Distance,
- d- and m-separation queries,
- …and much more!
Session info
sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] caugi_0.0.0.9000
#>
#> loaded via a namespace (and not attached):
#> [1] vctrs_0.6.5 cli_3.6.5 knitr_1.50 rlang_1.1.6
#> [5] xfun_0.53 generics_0.1.4 textshaping_1.0.4 S7_0.2.0
#> [9] jsonlite_2.0.0 glue_1.8.0 htmltools_0.5.8.1 ragg_1.5.0
#> [13] sass_0.4.10 rmarkdown_2.30 tibble_3.3.0 evaluate_1.0.5
#> [17] jquerylib_0.1.4 fastmap_1.2.0 yaml_2.3.10 lifecycle_1.0.4
#> [21] compiler_4.5.1 dplyr_1.1.4 fs_1.6.6 pkgconfig_2.0.3
#> [25] htmlwidgets_1.6.4 systemfonts_1.3.1 digest_0.6.37 R6_2.6.1
#> [29] utf8_1.2.6 tidyselect_1.2.1 pillar_1.11.1 magrittr_2.0.4
#> [33] bslib_0.9.0 tools_4.5.1 pkgdown_2.1.3 cachem_1.1.0
#> [37] desc_1.4.3