Get started
In this vignette, we will walk you through how to create a
caugi_graph, query it, modify it, and then compare it to
other caugi graphs.
The caugi object
You can create a caugi graph object using the
caugi_graph() function along with infix operators to define
edges. Let’s create a directed acyclic graph (DAG) with 4 nodes and 4
edges.
cg <- caugi_graph(
  A %-->% B %-->% C + D,
  A %-->% C,
  class = "DAG"
)
cg
#> # A tibble: 4 × 1
#> name
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> 4 D
#> # A tibble: 4 × 3
#> from edge to
#> <chr> <chr> <chr>
#> 1 A --> B
#> 2 A --> C
#> 3 B --> C
#> 4 B --> D
The call above may look unusual at first. To
clarify, A %-->% B creates the edge -->
from A to B. The syntax
B %-->% C + D is equivalent to B %-->% C
and B %-->% D. Notice that the graph prints two
tibbles. The first is equivalent to cg@nodes
and the second to cg@edges. The caugi object
also holds other properties, which we inspect next.
Properties
ptr
cg@ptr
#> <pointer: 0x55bac1ee49f0>
This is the pointer to the Rust object that caugi
uses for performance.
simple
cg@simple
#> [1] TRUE
This indicates whether the graph is simple or not. Let’s try to create a non-simple graph:
caugi_graph(A %-->% B, B %-->% A)
#> Error in graph_builder_add_edges(b, as.integer(unname(id[edges$from])), : parallel edges not allowed in simple graphs
This cannot be done unless you initialize the graph with
simple = FALSE. Note that, currently, every supported
graph class requires simple = TRUE, except the class
UNKNOWN.
Querying the caugi
We can query the caugi graph object with the built-in
queries provided in the package. Let’s try to find the descendants of
all the parents of the node C:
descendants(cg, parents(cg, "C"))
#> $A
#> [1] "B" "C" "D"
#>
#> $B
#> [1] "C" "D"
First, note that the output is a named list of character vectors. Why?
Because the parents of C are A and B:
parents(cg, "C")
#> [1] "A" "B"
So for each parent of C, the list contains a named element
holding the descendants of that parent node.
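To make the shape of that result concrete, here is a minimal base R sketch that reproduces it from the edge list printed earlier. The helpers children_of() and descendants_of() are hypothetical names introduced for illustration; they are not part of caugi and say nothing about how caugi computes descendants internally.

```r
# Edge list copied from the printed `cg` graph above.
edges <- data.frame(
  from = c("A", "A", "B", "B"),
  to   = c("B", "C", "C", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of caugi): children of one node.
children_of <- function(edges, node) {
  edges$to[edges$from == node]
}

# Hypothetical helper (not part of caugi): all nodes reachable from `node`
# along directed edges, found with a breadth-first search.
descendants_of <- function(edges, node) {
  visited <- character(0)
  queue <- children_of(edges, node)
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (!(current %in% visited)) {
      visited <- c(visited, current)
      queue <- c(queue, children_of(edges, current))
    }
  }
  sort(visited)
}

# Parents of C, then one descendants vector per parent.
parents_of_c <- sort(edges$from[edges$to == "C"])
result <- lapply(
  setNames(parents_of_c, parents_of_c),
  function(p) descendants_of(edges, p)
)
result
```

The list has one element per parent (A and B), mirroring the output of descendants(cg, parents(cg, "C")) shown above.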
Modifying the caugi
Let’s try to modify the graph from before, so we get a new DAG.
cg_modified <- cg |>
  remove_edges(A %-->% B, B %-->% C + D) |>
  add_edges(B %-->% A, D %-->% C)
cg_modified
#> # A tibble: 4 × 1
#> name
#> <chr>
#> 1 A
#> 2 B
#> 3 C
#> 4 D
#> # A tibble: 3 × 3
#> from edge to
#> <chr> <chr> <chr>
#> 1 A --> C
#> 2 B --> A
#> 3 D --> C
Would you like to add nodes? Then use add_nodes().
Graph metrics
Now that we have two different graphs, we can use different metrics to measure how much they differ. Here, we use the adjustment identification distance (AID) and the structural Hamming distance (SHD).
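As a rough illustration of what a structural Hamming distance counts, the base R sketch below compares the two edge lists printed above, one node pair at a time. The helpers adjacency() and shd_sketch() are hypothetical and not part of caugi. The sketch also assumes that a reversed edge counts as a single difference and that the result is not normalized, so caugi's own metric may follow a different convention.

```r
# Edge lists copied from the printed `cg` and `cg_modified` graphs above.
edges_cg <- data.frame(
  from = c("A", "A", "B", "B"),
  to   = c("B", "C", "C", "D"),
  stringsAsFactors = FALSE
)
edges_modified <- data.frame(
  from = c("A", "B", "D"),
  to   = c("C", "A", "C"),
  stringsAsFactors = FALSE
)
nodes <- c("A", "B", "C", "D")

# Hypothetical helper (not part of caugi): directed adjacency matrix,
# with adj[i, j] = 1 when there is an edge i --> j.
adjacency <- function(edges, nodes) {
  adj <- matrix(0L, length(nodes), length(nodes),
                dimnames = list(nodes, nodes))
  adj[cbind(edges$from, edges$to)] <- 1L
  adj
}

# Hypothetical helper (not part of caugi): count unordered node pairs whose
# edge status differs. A reversed edge counts once (an assumption).
shd_sketch <- function(edges_a, edges_b, nodes) {
  a <- adjacency(edges_a, nodes)
  b <- adjacency(edges_b, nodes)
  differs <- (a != b) | (t(a) != t(b))
  sum(differs[upper.tri(differs)])
}

distance <- shd_sketch(edges_cg, edges_modified, nodes)
distance
```

Under these assumptions the two graphs differ on four node pairs: A and B (reversed), B and C (removed), B and D (removed), and C and D (added).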
There you go!
You have now created a graph, inspected it, modified it, and measured the difference between the two graphs – both structurally and interventionally.
For further reading, we recommend the vignettes
vignette("package_use") and
vignette("performance") to see how to use
caugi in your own packages, and to see how
caugi performs compared to other graph packages in R. For
the interested reader, we also recommend
vignette("motivation"), which goes deeper into the
motivation behind caugi and what we aspire to do with
caugi.
Appendix
Advanced properties
built
cg@built
#> [1] TRUE
This indicates whether the graph has been “built” or not on the Rust
side. This is important, as the Rust object may not agree with the R
object if built = FALSE.
name_index_map
cg@name_index_map
The name_index_map is a hashmap that takes node names as
keys and returns zero-based indices. It is used to convert node
names to indices for Rust calls, as the Rust backend uses
zero-based indices.
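The idea of a name-to-index lookup can be sketched in base R with an environment acting as the hashmap. This is an illustration only: index_map and to_zero_based() are hypothetical names, and the sketch does not claim to reflect how caugi stores or exposes its map.

```r
# Node names in the order printed for `cg` above.
node_names <- c("A", "B", "C", "D")

# An environment serves as a hashmap from node name to zero-based index.
# Illustration only, not caugi's internal data structure.
index_map <- new.env(hash = TRUE)
for (i in seq_along(node_names)) {
  assign(node_names[i], i - 1L, envir = index_map)
}

# Hypothetical helper: translate node names to zero-based indices.
to_zero_based <- function(node_ids, map) {
  vapply(
    node_ids,
    function(n) get(n, envir = map, inherits = FALSE),
    integer(1),
    USE.NAMES = FALSE
  )
}

idx <- to_zero_based(c("A", "C"), index_map)
idx
```

R vectors are one-based, so the subtraction of one happens once when the map is built, and every later lookup returns an index a zero-based backend can use directly.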
.state
cg@.state
#> <environment: 0x55babfd8d330>
This is the internal state of the caugi graph object. It
allows the caugi object to be modified in R: it records the
modifications you make to the graph in R without having to
rebuild the graph in Rust each time. The most important takeaway
is that you should avoid modifying the state directly. Instead,
use the verbs, such as add_edges() and remove_edges().