Introduction to GRAND

Introduction

Welcome

Thank you for your interest in the grand package! This vignette illustrates how to use this package to describe a network following the Guidelines for Reporting About Network Data (GRAND).

The grand package can be cited as:

Neal, Z. P. (2023). grand: An R package for using the Guidelines for Reporting About Network Data. GitHub. https://github.com/zpneal/grand/

If you have questions about the grand package, please contact the maintainer Zachary Neal by email ([email protected]) or via Mastodon (@[email protected]). Please report bugs in the grand package at https://github.com/zpneal/grand/issues.

What is GRAND?

Networks can represent a wide range of social and natural phenomena at many different scales, and so network data is often quite diverse. Additionally, methods for analyzing networks have evolved over several decades across multiple disciplines. As a result, different researchers, from different disciplinary backgrounds, studying networks representing different things, often describe their network data in very different (and sometimes incomplete) ways.

The Guidelines for Reporting About Network Data (GRAND) are an attempt to establish some basic reporting standards that can help facilitate consistent and complete description of networks in research publications, presentations, and data repositories. GRAND aims to be neutral with respect to discipline, method, and content, and therefore focuses only on a limited number of fundamental characteristics that are relevant for all networks: What does it represent? When and where did it come from? How is it measured?

Loading the package

The grand package can be loaded in the usual way:

library(grand)
#> +-------+  grand v0.9.0
#> | GRAND |  Cite: Neal, Z. P., (2023). grand: An R package for using the Guidelines for
#> | ~~~~~ |        Reporting About Network Data. GitHub. https://github.com/zpneal/grand/
#> | ~~~~~ |
#> | ~~~~~ |  Help: type vignette("grand"); email [email protected]; github zpneal/grand
#> +-------+  Beta: type devtools::install_github("zpneal/grand", ref = "devel")

Upon successful loading, a startup message displays the version number, the citation, ways to get help, and how to install the beta version.

Package overview

The package offers three basic functions:

  • grand() interactively queries the user about a network, and saves the responses as graph attributes.

  • grand.text() writes a uniform narrative description of the network.

  • grand.table() plots a uniform tabular description of the network, in the style of a US Nutrition Label.

Because the goal of GRAND is to bring consistency and uniformity to the description of network data, these functions offer relatively few options. They are designed to provide a minimal uniform description that is suitable for any network and any context, which users can supplement with additional network-specific and context-specific details.

Adding GRAND attributes

Interactively

The grand() function applies GRAND to a network stored as an igraph object. It offers an interactive mode that guides the user through GRAND by asking a series of questions, and a non-interactive mode that allows the user to specify GRAND attributes directly. This section illustrates grand() with the example airport data, a weighted and directed network of passenger air traffic in the United States in 2019, which can be loaded using:

data(airport)

To interactively add GRAND attributes, use:

airport <- grand(airport)
This graph already contains a GRAND attribute. Do you want to overwrite (Y/N)?
1: Y
What is the name of this network (enter NA if unnamed)?
1: US Air Traffic Network
What DOI is associated with this network (enter NA if unnamed)?
1: 10.1371/journal.pone.0269137
How were these data collected or generated? 

1: Survey
2: Interview
3: Sensor
4: Observation
5: Archival
6: Simulation
7: Other

Selection: 5
In what year were these data collected?
1: 2019
This network contains 382 nodes. What type of entity do these represent (e.g., people)?
1: Airports

This code block illustrates the first several questions, with appropriate responses, as they would appear in the interactive mode.

The first set of interactive questions asks about the data as a whole:

  • name - What is the name of the network? This should usually be specified ending with the word “network” or “data” (e.g. “Florentine Families Network” or “Airline Traffic Data”).

  • doi - What is the DOI associated with the network? This could be a DOI for the data itself (e.g., if it is available online), or could be the DOI for a manuscript describing the data.

  • mode - How were these data collected or generated? Choose one of the available options (Survey, Interview, Sensor, Observation, Archival, or Simulation) or choose Other to enter something else.

  • year - In what year were the data collected?

The second set of interactive questions asks about the nodes or vertices:

  • vertex1 (and in bipartite graphs, vertex2) - What type of entity do the nodes/vertices represent? This should be specified as a plural noun (e.g., “People”).

  • vertex1.total (and in bipartite graphs, vertex2.total) - Networks often have an externally-defined boundary that determines which nodes/vertices should be included, even if some are missing from the network. If the network has a boundary: How many entities are included in the network’s boundary? This is used to compute rates of missingness (e.g., a classroom contained 20 children, but only 18 provided network data, giving 10% node missingness).
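
The missingness arithmetic in the classroom example can be sketched in a few lines of base R. The helper name node_missingness() is hypothetical and is not part of the grand package; it only illustrates the calculation described above, assuming missingness is the share of boundary entities that do not appear in the network.

```r
# Hypothetical helper (not part of grand): the share of entities inside the
# network's boundary that are absent from the observed network.
node_missingness <- function(observed, total) {
  stopifnot(total > 0, observed >= 0, observed <= total)
  1 - observed / total
}

# Classroom example from the text: 20 children in the boundary, 18 observed.
classroom <- node_missingness(observed = 18, total = 20)
print(round(classroom * 100))  # percentage of missing nodes

# Airport example: all 382 airports in the boundary are present as nodes.
airports <- node_missingness(observed = 382, total = 382)
print(airports)
```

When the observed count equals the boundary total, as in the airport data, the rate is zero, which corresponds to the "no node missingness" wording in the narrative output.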

The third set of interactive questions asks about the edges:

  • edge.pos (and in signed graphs, edge.neg) - What type of relationship do the edges represent? This should be specified as a plural noun (e.g., “Friendships”).

  • weight - What do the edge weights represent? There are four default options: Frequency (how often), Intensity (how strong), Multiplexity (how many), or Valence (positive or negative). Choosing Other prompts the user to enter another option.

  • measure - How are the edge weights measured? There are four default options: Continuous, Count, Ordinal, or Categorical. Choosing Other prompts the user to enter another option.

The final interactive question asks about relevant topological characteristics. Some topological characteristics are reported by default (depending on the type of network); however, it is possible to request that additional topological characteristics also be reported. The available topological characteristics include:

  • clustering coefficient - Computed using transitivity(G, type = "localaverage")

  • degree centralization - Computed using centr_degree(G)$centralization

  • degree distribution - Computed using fit_power_law(degree(G), implementation = "plfit")

  • density - Computed using edge_density(G)

  • diameter - Computed using diameter(G)

  • efficiency - Computed using global_efficiency(G)

  • mean degree - Computed using mean(degree(G))

  • modularity - Computed from a partition generated by cluster_leiden(G, objective_function = "modularity")

  • number of communities - Computed from a partition generated by cluster_leiden(G, objective_function = "modularity")

  • number of components - Computed using count_components(G)

  • transitivity - Computed using transitivity(G, type = "global")

  • structural balance - Computed using the triangle index
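
To see what several of these igraph calls return, the sketch below applies them to a toy network (Zachary's karate club, which ships with igraph) rather than to one of the grand example datasets. The object names G, partition, and topology are assumptions made for illustration, and the way the values are collected into a list is not how grand stores them internally. The degree distribution fit and structural balance are omitted.

```r
library(igraph)

# Toy stand-in network: Zachary's karate club (34 nodes, 78 undirected edges).
G <- make_graph("Zachary")

# cluster_leiden() is randomized, so fix the seed for repeatability.
set.seed(1)
partition <- cluster_leiden(G, objective_function = "modularity")

# Collect the characteristics using the calls listed above.
topology <- list(
  clustering_coefficient = transitivity(G, type = "localaverage"),
  degree_centralization  = centr_degree(G)$centralization,
  density                = edge_density(G),
  diameter               = diameter(G),
  efficiency             = global_efficiency(G),
  mean_degree            = mean(degree(G)),
  modularity             = modularity(G, membership(partition)),
  n_communities          = length(partition),
  n_components           = count_components(G),
  transitivity           = transitivity(G, type = "global")
)
str(topology)
```

Note that the clustering coefficient (an average of local values) and transitivity (a single global ratio) are distinct quantities and will generally differ on the same network.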

Not Interactively

It is also possible to add GRAND attributes directly, using:

airport <- grand(airport, interactive = FALSE, #Apply GRAND non-interactively
                 name = "US Air Traffic Network",
                 doi = "10.1371/journal.pone.0269137",
                 vertex1 = "Airports",
                 vertex1.total = 382,
                 edge.pos = "Routes",
                 weight = "Passengers",
                 measure = "Count",
                 mode = "Archival",
                 year = "2019",
                 topology = c("clustering coefficient", "mean path length", "degree distribution"))
#> This graph was created by an old(er) igraph version.
#> ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
#> For now we convert it on the fly...

Using the non-interactive mode requires knowing which parameters to specify given the type of network, and which values to supply for each parameter. For example, here we specify the vertex1 parameter but not the vertex2 parameter because this is a unipartite network with only one type of node. Similarly, we supply "Archival" as the value for the mode parameter because this parameter records the mode of data collection. The interactive mode handles these choices automatically; however, the non-interactive mode offers greater flexibility and makes it possible to add GRAND attributes when running R scripts.
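
Before choosing parameters for a non-interactive call, it can help to check what type of network an igraph object is. The sketch below uses standard igraph predicates on two toy graphs. The helper describe_type() is hypothetical, and this vignette does not state how grand() itself detects the network type, so treat the check as an aid for the user rather than a description of the package's internals.

```r
library(igraph)

# Toy unipartite network: a ring of five nodes with no "type" vertex attribute.
unipartite <- make_ring(5)

# Toy bipartite network: two nodes of one type and three of the other.
bipartite <- make_bipartite_graph(
  types = c(FALSE, FALSE, TRUE, TRUE, TRUE),
  edges = c(1, 3, 1, 4, 2, 4, 2, 5)
)

# Hypothetical helper: summarize the properties that affect which
# parameters are relevant (e.g., vertex2 only matters for bipartite graphs).
describe_type <- function(G) {
  c(bipartite = is_bipartite(G),
    directed  = is_directed(G),
    weighted  = is_weighted(G))
}

describe_type(unipartite)
describe_type(bipartite)
```

Here only the second graph is reported as bipartite, which suggests that vertex2 and vertex2.total would be relevant for it but not for the first.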

Generating a GRAND narrative

The grand.text() function writes a complete and uniform narrative description of the network. For example:

grand.text(airport)
#> [1] "The US Air Traffic Network is a directed and weighted network that represents 382 airports connected by 16095 routes. All airports included in the network's boundary are represented as nodes (i.e., no node missingness). The edges are weighted by passengers, which was measured on a count scale. These data were collected in 2019 using archival methods. The network's clustering coefficient is 0.711. The network's mean path length is 2.093. Fitting a power law to this network's degree distribution implies that k^-10.091 for k >= 146. This network is described in 10.1371/journal.pone.0269137."

Generating a GRAND table

The grand.table() function plots a complete and uniform tabular description of the network, in the style of a US Nutrition Label. For example:

grand.table(airport)

A table can be exported for use in a paper or presentation using a graphics device. For example:

pdf("grand.pdf", width = 3.5, height = 4)
grand.table(airport)
dev.off()

More examples

A bipartite network

The example cosponsor data is a bipartite network representing US Senators’ (co-)sponsorship of Senate Bills during the 116th session (2019-2020). It can be loaded, and narrative and tabular summaries can be obtained, using:

data(cosponsor)
grand.text(cosponsor)
#> This graph was created by an old(er) igraph version.
#> ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
#> For now we convert it on the fly...
#> [1] "The US Senate Co-Sponsorship Network is a undirected and unweighted network that represents 102 senators and 5086 bills connected by 35166 sponsorships. All senators and senators included in the network's boundary are represented as nodes (i.e., no node missingness). These data were collected in 2021 using archival methods. The network's mean degree is 13.557. This network is described in 10.2478/connections-2019.026 and is available from https://osf.io/kjgrz/."
grand.table(cosponsor)

A signed network

The example senate data is a signed network representing US Senators’ alliances and antagonisms during the 116th session (2019-2020). It can be loaded, and narrative and tabular summaries can be obtained, using:

data(senate)
grand.text(senate)
#> This graph was created by an old(er) igraph version.
#> ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
#> For now we convert it on the fly...
#> [1] "The US Senate Network is a undirected and signed network that represents 102 senators connected by 1539 alliances and 2339 antagonisms. All senators included in the network's boundary are represented as nodes (i.e., no node missingness). These data were collected in 2021 using backbone methods. The network's degree of balance is 0.94. This network is described in 10.2478/connections-2019.026 and is available from https://osf.io/kjgrz/."
grand.table(senate)

Utility functions

The grand package includes a couple of exported utility functions that extend user input functions in base R. These functions are used by grand() to interactively query the user about the supplied igraph object.

scan2()

The scan2() function is an extension of scan() that allows the specification of a required input format using the type parameter. Currently four input types are allowed: character, numeric, integer, or a vector of allowable responses. For example:

#Requiring an integer input
scan2(prompt = "Enter an integer", type = "integer")
Enter an integer
1: q
Please enter an integer.
1: 2.5
Please enter an integer.
1: 1
[1] 1

#Requiring an input from a vector of possibilities
scan2(prompt = "Do you like this function (Y/N)?", type = c("Y", "N", "y", "n"))
Do you like this function (Y/N)?
1: 3
Please enter one of these options: Y N y n
1: yes
Please enter one of these options: Y N y n
1: Y
[1] "Y"
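
The kind of check behind these prompts can be sketched in base R without any interactive input. The function is_valid_response() below is hypothetical and is not the code used inside scan2(). It assumes each response arrives as a character string, and that a type argument with more than one element is a vector of allowable responses.

```r
# Hypothetical validator (not part of grand): tests one raw response against
# a required type, mirroring the four input types described above.
is_valid_response <- function(response, type) {
  # A vector with several elements is treated as the set of allowed answers.
  if (length(type) > 1) {
    return(response %in% type)
  }
  # Non-numeric strings become NA; the coercion warning is suppressed.
  number <- suppressWarnings(as.numeric(response))
  switch(type,
    character = TRUE,
    numeric   = !is.na(number),
    integer   = !is.na(number) && number == round(number),
    stop("unknown type: ", type)
  )
}

# The same responses as in the two transcripts above.
integer_checks <- vapply(c("q", "2.5", "1"), is_valid_response,
                         logical(1), type = "integer")
choice_checks  <- vapply(c("3", "yes", "Y"), is_valid_response,
                         logical(1), type = c("Y", "N", "y", "n"))
print(integer_checks)
print(choice_checks)
```

In both cases only the last response passes, which matches the point at which the transcripts stop re-prompting and return a value.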