# load the package
library(caugi)

In this vignette, we walk you through how to create a caugi_graph, query it, modify it, and compare it to other caugi graphs.

The caugi object

You can create a caugi graph object using the caugi_graph() function along with infix operators to define edges. Let’s create a directed acyclic graph (DAG) with 4 nodes and 4 edges.

cg <- caugi_graph(
  A %-->% B %-->% C + D,
  A %-->% C,
  class = "DAG"
)
cg
#> # A tibble: 4 × 1
#>   name 
#>   <chr>
#> 1 A    
#> 2 B    
#> 3 C    
#> 4 D    
#> # A tibble: 4 × 3
#>   from  edge  to   
#>   <chr> <chr> <chr>
#> 1 A     -->   B    
#> 2 A     -->   C    
#> 3 B     -->   C    
#> 4 B     -->   D

The call above may look unusual at first. To clarify, A %-->% B creates the edge --> from A to B. The syntax B %-->% C + D is equivalent to B %-->% C and B %-->% D. Notice that the graph prints two tibbles: the first is equivalent to cg@nodes and the second to cg@edges. Besides these, the caugi object holds other properties. Let’s check them.
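To make the `+` shorthand described above concrete, here is a sketch that writes every edge of the same graph as its own argument. It only uses calls shown in this vignette; the name `cg_explicit` is ours, and we assume (without having verified it) that the printed node and edge tables match those of `cg`.

```r
library(caugi)

# Same DAG as above, with the `B %-->% C + D` shorthand expanded by hand.
cg_explicit <- caugi_graph(
  A %-->% B,
  B %-->% C,
  B %-->% D,
  A %-->% C,
  class = "DAG"
)

# Inspect the edge table; it should list the same four edges as cg@edges.
cg_explicit@edges
```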

Properties

ptr

cg@ptr
#> <pointer: 0x55bac1ee49f0>

This is the pointer to the Rust object that caugi utilizes for performance.

simple

cg@simple
#> [1] TRUE

This indicates whether the graph is simple or not. Let’s try to create a non-simple graph:

caugi_graph(A %-->% B, B %-->% A)
#> Error in graph_builder_add_edges(b, as.integer(unname(id[edges$from])), : parallel edges not allowed in simple graphs

This cannot be done unless you initialize the graph with simple = FALSE. Note that, currently, every supported graph class requires simple = TRUE, with the exception of the class UNKNOWN.
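As a sketch of what that could look like, the call below passes simple = FALSE together with the UNKNOWN class. We assume here that `simple` is an argument of caugi_graph() and that the class is spelled "UNKNOWN" in the `class` argument, as the text above suggests; please check ?caugi_graph for the exact interface.

```r
library(caugi)

# Assumption: `simple` and `class = "UNKNOWN"` are accepted as written here.
cg_non_simple <- caugi_graph(
  A %-->% B,
  B %-->% A,
  simple = FALSE,
  class = "UNKNOWN"
)

# If the call succeeds, the simple property should now be FALSE.
cg_non_simple@simple
```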

graph_class

cg@graph_class
#> [1] "DAG"

This is the graph’s class. As you can see here, it is a DAG.

Querying the caugi

We can query the caugi graph object with the built-in queries provided in the package. Let’s try to find the descendants of all the parents of the node C:

descendants(cg, parents(cg, "C"))
#> $A
#> [1] "B" "C" "D"
#> 
#> $B
#> [1] "C" "D"

First, note that the output is a named list of character vectors. Why? Because the parents of C are A and B:

parents(cg, "C")
#> [1] "A" "B"

So, for each parent of C, the list has one named element that holds the descendants of that parent node.
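To illustrate what this query computes, here is a minimal base R sketch that reproduces the result from the edge table alone. It is not how caugi implements the query (that happens in the Rust backend); the helper `reach()` and the variable names are hypothetical.

```r
# Edge list copied from the printed graph above.
edges <- data.frame(
  from = c("A", "A", "B", "B"),
  to   = c("B", "C", "C", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all nodes reachable from `node` along directed edges.
reach <- function(edges, node) {
  seen <- character(0)
  frontier <- node
  while (length(frontier) > 0) {
    # Children of the current frontier that have not been visited yet.
    nxt <- unique(edges$to[edges$from %in% frontier])
    frontier <- setdiff(nxt, seen)
    seen <- c(seen, frontier)
  }
  sort(seen)
}

# Parents of C are the sources of edges pointing into C.
pa_C <- sort(unique(edges$from[edges$to == "C"]))

# One list element per parent, named after that parent.
result <- setNames(lapply(pa_C, reach, edges = edges), pa_C)
result
```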

Modifying the caugi

Let’s try to modify the graph from before, so we get a new DAG.

cg_modified <- cg |>
  remove_edges(A %-->% B, B %-->% C + D) |>
  add_edges(B %-->% A, D %-->% C)
cg_modified
#> # A tibble: 4 × 1
#>   name 
#>   <chr>
#> 1 A    
#> 2 B    
#> 3 C    
#> 4 D    
#> # A tibble: 3 × 3
#>   from  edge  to   
#>   <chr> <chr> <chr>
#> 1 A     -->   C    
#> 2 B     -->   A    
#> 3 D     -->   C

Would you like to add nodes? Then use add_nodes().

Graph metrics

Now that we have two different graphs, we can use different metrics to measure the difference between the two graphs. Here, we use the adjustment identification distance (AID) and the structural Hamming distance (SHD):

aid(cg, cg_modified)
#> [1] 0.5833333
shd(cg, cg_modified, normalized = TRUE)
#> [1] 0.6666667
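To give some intuition for the normalized SHD value, the base R sketch below counts the node pairs whose edge differs between the two graphs and divides by the number of node pairs. This is one common definition of SHD and it reproduces the value above for these two graphs, but it is an assumption that caugi uses exactly this definition; `adjacency()` and `shd_sketch()` are hypothetical helpers.

```r
nodes <- c("A", "B", "C", "D")

# Build a 0/1 adjacency matrix where m[i, j] == 1 means i --> j.
adjacency <- function(from, to, nodes) {
  m <- matrix(0L, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  m[cbind(from, to)] <- 1L
  m
}

# Edges copied from the printed output of cg and cg_modified.
g1 <- adjacency(c("A", "A", "B", "B"), c("B", "C", "C", "D"), nodes)
g2 <- adjacency(c("A", "B", "D"), c("C", "A", "C"), nodes)

# Count unordered node pairs whose edge (presence or direction) differs.
shd_sketch <- function(a, b, normalized = FALSE) {
  upper <- upper.tri(a)
  differs <- (a != b) | (t(a) != t(b))
  d <- sum(differs[upper])
  if (normalized) d / sum(upper) else d
}

shd_count <- shd_sketch(g1, g2)
shd_value <- shd_sketch(g1, g2, normalized = TRUE)
shd_value
```

Here four of the six node pairs differ (A and B, B and C, B and D, C and D), which gives 4 / 6.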

There you go!

You have now created a graph, inspected it, modified it, and measured the difference between the two graphs – both structurally and interventionally.

For further reading, we recommend the vignettes vignette("package_use") and vignette("performance") to see how to use caugi in your own packages, and to see how caugi performs compared to other graph packages in R. For the interested reader, we also recommend vignette("motivation"), which goes deeper into the motivation behind caugi and what we aspire to do with caugi.

Appendix

Advanced properties

built

cg@built
#> [1] TRUE

This indicates whether the graph has been “built” or not on the Rust side. This is important, as the Rust object may not agree with the R object if built = FALSE.

name_index_map

cg@name_index_map

The name_index_map is a hashmap that takes node names as keys and returns zero-based indices. It is used to convert node names to indices for Rust calls, as the Rust backend uses zero-based indices.
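The idea of such a mapping can be sketched in base R with a named integer vector. This is only an illustration of the name-to-index conversion; the actual name_index_map in caugi is a different data structure, and the variable names below are hypothetical.

```r
nodes <- c("A", "B", "C", "D")

# Named integer vector: node name -> zero-based index.
name_to_index <- setNames(seq_along(nodes) - 1L, nodes)

# Look up the zero-based indices for a set of node names.
idx <- unname(name_to_index[c("B", "D")])
idx
```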

.state

cg@.state
#> <environment: 0x55babfd8d330>

This is the internal state of the caugi graph object. It ensures that the caugi object can be modified in R: it stores the modifications you make to the graph in R, so that the graph does not have to be rebuilt in Rust each time. The most important takeaway is that you should avoid modifying the state directly. Instead, use the verbs, such as add_edges() and remove_edges().