# dev version from GitHub
install.packages("pak",
  repos =
    sprintf(
      "https://r-lib.github.io/p/pak/stable/%s/%s/%s",
      .Platform$pkgType,
      R.Version()$os,
      R.Version()$arch
    )
)
pak::pak("frederikfabriciusbjerre/caugi")

# ... or wait for the first CRAN release
# install.packages("caugi")
# load the package
library(caugi)

What is caugi?

caugi (pronounced “corgi”) stands for Causal Graph Interface.

caugi is a small R interface to a fast Rust backend for causal graphs. You define graphs with readable infix operators, query relations, and keep R code nice and tidy, while Rust handles performance.

The basic object: caugi_graph

A caugi_graph is the bread and butter of caugi. It is easy to create, query, and modify.

You can create simple graphs and, optionally, assign them a predefined graph class. Currently, the supported classes are "Unknown", "DAG", and "PDAG". We plan to support several other causal graph types in future releases, such as "PAG", "CPDAG", "MAG", "SWIG", and "ADMG".

# a tiny DAG
cg <- caugi_graph(
  A %-->% B + C,
  B %-->% D,
  C %-->% D,
  class = "DAG", # optional, guarantees acyclicity by construction
  simple = TRUE, # default
  build = TRUE # build now; otherwise built lazily on first query
)

Edge operators

  • %-->% directed
  • %---% undirected
  • %<->% bidirected
  • %o->% partially directed
  • %o--% partially undirected
  • %o-o% partial

If you need a more expressive set of edges, you can register additional types with register_caugi_edge(). For example, to represent a directed edge in the reverse direction:

register_caugi_edge(
  glyph = "<--",
  tail_mark = "arrow",
  head_mark = "tail",
  class = "directed",
  symmetric = FALSE,
  flags = c("TRAVERSABLE_WHEN_CONDITIONED")
)

caugi_graph(A %-->% B, B %<--% C, class = "PDAG")
#> # A tibble: 3 × 1
#>   name 
#>   <chr>
#> 1 A    
#> 2 B    
#> 3 C    
#> # A tibble: 2 × 3
#>   from  edge  to   
#>   <chr> <chr> <chr>
#> 1 A     -->   B    
#> 2 B     <--   C

# reset the registry to default with original edges
reset_caugi_registry()

We expect this feature to need further polishing in future releases, and we would love your input if you use it!

(Lazy) building and the Rust backend

caugi graphs are represented in a compact Compressed Sparse Row (CSR) format in Rust. caugi works with a front-loading philosophy. Because the graph is stored in CSR format, mutating it is computationally expensive compared to other graph storage schemes, but querying is very fast. In addition to the graph itself, caugi stores extra information about node relations, which allows $\mathcal{O}(1)$ look-up time for many relational queries, such as nb(cg, A) or ch(cg, B).
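To make the trade-off concrete, here is a minimal base-R sketch of a CSR-style layout. It is not caugi's Rust implementation: the helpers build_csr() and csr_children() are hypothetical names introduced only for illustration, and nodes are assumed to be integer ids 1..n.

```r
# Hypothetical helper: build a CSR-style structure from an edge list.
# `from` and `to` are parallel integer vectors of node ids in 1..n.
build_csr <- function(n, from, to) {
  ord <- order(from, to)               # group edges by source node
  counts <- tabulate(from, nbins = n)  # out-degree of each node
  list(
    row_ptr = c(0L, cumsum(counts)),   # node i owns col_idx[(row_ptr[i] + 1):row_ptr[i + 1]]
    col_idx = to[ord]                  # edge targets, grouped by source
  )
}

# Children of node i: a single slice of col_idx, with no scan of the edge list.
csr_children <- function(csr, i) {
  lo <- csr$row_ptr[i]
  hi <- csr$row_ptr[i + 1L]
  if (hi == lo) integer(0) else csr$col_idx[(lo + 1L):hi]
}

# The tiny DAG from above: A -> B, A -> C, B -> D, C -> D (A = 1, ..., D = 4)
from <- c(1L, 1L, 2L, 3L)
to   <- c(2L, 3L, 4L, 4L)
csr  <- build_csr(4L, from, to)

csr_children(csr, 1L)  # children of A: nodes 2 and 3 (B and C)

# Adding one edge (A -> D) means recomputing both vectors from scratch,
# which is why mutation is the expensive operation in this layout.
csr2 <- build_csr(4L, c(from, 1L), c(to, 4L))
```

The query touches only two entries of row_ptr and one contiguous slice of col_idx, while any edit shifts every later offset. That asymmetry is the reason for front-loading the build.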

To offset the cost of mutations, caugi graphs are built lazily. When you mutate the graph, for example by adding edges, the edits are stored in R but not yet in Rust. The next time you query the graph, it is rebuilt in Rust and the query runs on the newly built graph. You can also call build(cg) to force a rebuild in Rust at any time.
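The lazy-building pattern can be sketched in plain R. This illustrates the general idea only, not caugi's internals: the functions new_lazy_graph(), lazy_add_edge(), lazy_build(), and lazy_children() are hypothetical, and a named list stands in for the Rust backend.

```r
# Hypothetical lazy graph: edits accumulate in R; the index is rebuilt on demand.
new_lazy_graph <- function() {
  state <- new.env()
  state$edges <- data.frame(from = character(), to = character())
  state$index <- NULL     # stand-in for the built backend representation
  state$built <- FALSE
  state$n_builds <- 0L    # counts how often the index was rebuilt
  state
}

lazy_add_edge <- function(g, from, to) {
  g$edges <- rbind(g$edges, data.frame(from = from, to = to))
  g$built <- FALSE        # mark the index as stale; do not rebuild yet
  invisible(g)
}

lazy_build <- function(g) {
  if (!g$built) {
    g$index <- split(g$edges$to, g$edges$from)  # children grouped by parent
    g$built <- TRUE
    g$n_builds <- g$n_builds + 1L
  }
  invisible(g)
}

lazy_children <- function(g, node) {
  lazy_build(g)           # a query rebuilds only if edits are pending
  kids <- g$index[[node]]
  if (is.null(kids)) character(0) else kids
}

g <- new_lazy_graph()
lazy_add_edge(g, "A", "B")
lazy_add_edge(g, "A", "C")
lazy_add_edge(g, "B", "D")
g$n_builds             # 0: three edits, nothing built yet
lazy_children(g, "A")  # the first query builds the index once
lazy_children(g, "B")  # the second query reuses it
g$n_builds             # 1
```

Batching edits this way means a sequence of k mutations costs one rebuild rather than k.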

Nodes, grouping, and +

Use + to fan out from one side, and c(...) or parentheses to group.

caugi_graph(
  X %-->% Y + Z, # X → Y and X → Z
  c(Y, Z) %-->% W, # Y → W and Z → W
  (A + B) %---% (C + D) # all undirected pairs across groups
)
#> # A tibble: 8 × 1
#>   name 
#>   <chr>
#> 1 X    
#> 2 Y    
#> 3 Z    
#> 4 A    
#> 5 B    
#> 6 W    
#> 7 C    
#> 8 D    
#> # A tibble: 8 × 3
#>   from  edge  to   
#>   <chr> <chr> <chr>
#> 1 A     ---   C    
#> 2 A     ---   D    
#> 3 B     ---   C    
#> 4 B     ---   D    
#> 5 X     -->   Y    
#> 6 X     -->   Z    
#> 7 Y     -->   W    
#> 8 Z     -->   W

Unconnected symbols also declare nodes:

caugi_graph(A, B, C) # declares three isolated nodes
#> # A tibble: 3 × 1
#>   name 
#>   <chr>
#> 1 A    
#> 2 B    
#> 3 C    
#> # A tibble: 0 × 3
#> # ℹ 3 variables: from <chr>, edge <chr>, to <chr>

Queries

Relations

You can query relations such as parents, children, and neighbors. Nodes can be given as symbols, character strings, or indices.

cg <- caugi_graph(
  A %-->% B + C,
  B %-->% D,
  C %-->% D,
  E %---% F,
  class = "PDAG"
)
parents(cg, "D")
#> [1] "B" "C"
children(cg, index = 3)
#> [1] "D"
neighbors(cg, c("B", "C"))
#> $B
#> [1] "A" "D"
#> 
#> $C
#> [1] "A" "D"

Properties

You can check graph properties with the following functions.

is_acyclic(cg)
#> [1] TRUE
is_dag(cg)
#> [1] FALSE
is_pdag(cg)
#> [1] TRUE

Modifying graphs

You can add, remove, or set edges and nodes. Changes are applied on the next query or when you call build(). You can either use the non-standard evaluation interface with infix operators:

cg <- caugi_graph(
  A %-->% B + C,
  B %-->% D
)
cg <- add_edges(cg, E %-->% F)

or you can use standard evaluation:

cg <- add_edges(cg,
  from = c("D", "G"),
  edge = c("-->", "o->"),
  to = c("E", "H")
)

Besides add_edges(), you can also use remove_edges(), set_edges(), add_nodes(), remove_nodes(), set_nodes(), and subgraph().

S7 and safety

caugi uses the S7 object system to ensure mutation safety. It is deliberately hard to mutate a graph without going through the provided functions. This keeps graphs valid at all times and prevents you from accidentally creating a state in which a query, or the graph itself, is invalid.

The S7 object system is not widely used yet, but we found that the S3 object system was too lenient, and it was hard for us to ensure that graphs and queries were always valid. We found that S4 was too heavy and cumbersome to work with.
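As an illustration of the kind of safety S7 offers, here is a minimal sketch using the S7 package directly. The edge_list class is hypothetical and unrelated to caugi's own classes; it only shows how typed properties and a validator reject invalid assignments.

```r
library(S7)

# Hypothetical class (not part of caugi): an edge list that forbids self-loops.
edge_list <- new_class(
  "edge_list",
  properties = list(
    from = class_character,
    to   = class_character
  ),
  validator = function(self) {
    # Return a message string when invalid, NULL (implicitly) when valid.
    if (length(self@from) != length(self@to)) {
      "@from and @to must have the same length"
    } else if (any(self@from == self@to)) {
      "self-loops are not allowed"
    }
  }
)

el <- edge_list(from = c("A", "B"), to = c("B", "C"))

# A value of the wrong type is rejected by the property's declared class ...
bad_type <- tryCatch(
  { el@from <- 1:2; "accepted" },
  error = function(e) "rejected"
)

# ... and a value of the right type is still rejected if the validator fails
# (here, A -> A would be a self-loop).
bad_loop <- tryCatch(
  { el@to <- c("A", "C"); "accepted" },
  error = function(e) "rejected"
)

c(bad_type, bad_loop)  # both assignments are rejected; `el` is unchanged
```

With S3, both assignments would succeed silently, because an S3 object is an ordinary list or vector with a class attribute and no built-in validation.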

Future work

We plan on adding the following features to caugi in the future:

  • Graph classes: PAG, CPDAG, MAG, SWIG, ADMG,
  • Adjustment sets,
  • Structural Hamming Distance,
  • Adjustment Identification Distance,
  • d- and m-separation queries,
  • …and much more!

Session info

sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] caugi_0.0.0.9000
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.5       cli_3.6.5         knitr_1.50        rlang_1.1.6      
#>  [5] xfun_0.53         generics_0.1.4    textshaping_1.0.4 S7_0.2.0         
#>  [9] jsonlite_2.0.0    glue_1.8.0        htmltools_0.5.8.1 ragg_1.5.0       
#> [13] sass_0.4.10       rmarkdown_2.30    tibble_3.3.0      evaluate_1.0.5   
#> [17] jquerylib_0.1.4   fastmap_1.2.0     yaml_2.3.10       lifecycle_1.0.4  
#> [21] compiler_4.5.1    dplyr_1.1.4       fs_1.6.6          pkgconfig_2.0.3  
#> [25] htmlwidgets_1.6.4 systemfonts_1.3.1 digest_0.6.37     R6_2.6.1         
#> [29] utf8_1.2.6        tidyselect_1.2.1  pillar_1.11.1     magrittr_2.0.4   
#> [33] bslib_0.9.0       tools_4.5.1       pkgdown_2.1.3     cachem_1.1.0     
#> [37] desc_1.4.3