Supported Formats

SocNetV supports the following network data formats:

  • GraphML (.graphml or .xml),
  • GML (.gml or .xml)
  • GraphViz (.dot),
  • Adjacency matrix (.sm, .adj or .csv )
  • Pajek (.net, .paj or .pajek),
  • UCINET's Data Language (.dl)
  • edge list (.lst or .list)
  • Weighted Lists (.wlst or .wlist)

The default network data format for saving new networks is GraphML.

Warning
If you create a new network and press Ctrl+S to save it, then by default SocNetV will try to save it in GraphML format.

You can load GraphML files by clicking on menu File > Load or by specifying them explicitly at the command line.

For other network data formats, you have to import the files using File -> Import.

Please note that Pajek and UCINET support is not complete. Read below for more.

GraphML files

Each GraphML document is written in a special form of XML and defines a graph. For instance the code below, contains 11 nodes and 12 edges:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<graph id="G" edgedefault="undirected">
<node id="n0"/>
<node id="n1"/>
<node id="n2"/>
<node id="n3"/>
<node id="n4"/>
<node id="n5"/>
<node id="n6"/>
<node id="n7"/>
<node id="n8"/>
<node id="n9"/>
<node id="n10"/>
<edge source="n0" target="n2"/>
<edge source="n1" target="n2"/>
<edge source="n2" target="n3"/>
<edge source="n3" target="n5"/>
<edge source="n3" target="n4"/>
<edge source="n4" target="n6"/>
<edge source="n6" target="n5"/>
<edge source="n5" target="n7"/>
<edge source="n6" target="n8"/>
<edge source="n8" target="n7"/>
<edge source="n8" target="n9"/>
<edge source="n8" target="n10"/>
</graph>
</graphml>

All GraphML files consist of a graphml element and a variety of subelements: graph, node, edge, keys. SocNetV understands all of them.

Nodes are defined by the <node id="n1" /> where id is a unique node identification string. This id is used in edge declaration, below.

Edges are defined by the <edge source="n1" target="n1" />where source and target are equal to existing node ids.

Graph Modeling Language (GML) files

Graph Modeling Language (GML) is a hierarchical ASCII-based file format for describing graphs. It has been also named Graph Meta Language.

SocNetv supports this set of commands:

  • graph
  • comment
  • directed
  • node
  • edge
  • id
  • label (inside "node")
  • label (inside "edge")
  • source
  • target

GraphViz (DOT) files

The so-called "DOT format" is the file format of the GraphViz layout application. SocNetV supports GraphViz formatted files, although not the whole specification of the format. Directed and undirected graphs are supported. Some node properties, such as color and shape, and edge properties (weight and color) are also supported. The features supported by SocNetV are displayed in the following example:

digraph mydot {
node [color=red, shape=box];
a -> b -> c ->d
node [color=pink, shape=circle];
d->e->a->f->j->k->l->o
[weight=1, color=black];
}

Nodes are defined by the "node" declaration. In this you can define the color and the shape of the nodes that will follow. Each link is denoted by an "->" for directed graphs (digraphs) and a "-" for undirected graphs (graphs) between nodes' labels. For instance, "a -> b" means a directed edge from a to b. Moreover, links can have weights and colours.

Warning
Custom attributes, subgraphs and other advance GraphViz features are not supported.
The data should be formatted in separate lines, as in the example. At the moment, SocNetV does not support reading files where all data is on the same line (i.e. digraph sample { 1 -> 2; 3 -> 4; } )

Adjacency Matrix files

The adjacency sociomatrix format is a very easy one.

It describes one-mode networks and contains a simple matrix NxN, where N is the amount of nodes. Each (i,j) element is a number.

If (i,j)=0 then nodes i and j are not connected. If (i,j)=x where x a non-zero number then there will be an arc from node i to node j. Again, negative weights are allowed. Those are depicted as dashed lines when the network is visualised on the canvas.

This is an example of an adjacency sociomatrix formatted network.

0,1,1,2
1,0,2,1
0,0,0,1
1,0,0,0

The delimeter between columns might be a comma or a space.

Two-mode Sociomatrix files

Unlike one-mode networks which describe direct links between actors of the same type, networks can be two-mode as well. Two-mode networks describe either two sets of actors or a set of actors and a set of associated events.

In the first case, which usually is called dyadic two-mode network, there are two sets of actors. The sociomatrix codifies the relations between actors in the first set and actors in the second set.

In the second case, which usually is called affiliation network, there is a set of actors and a set of events or organizations. The sociomatrix measures the attendance or affiliations of the actors (first mode) with a particular event or organization (second mode).

Two-mode networks are described by affiliation network matrices, where \( A(i,j) \) codes the events/organizations each actor is affiliated.

A two-mode sociomatrix is a matrix NxM, where N is the amount of nodes and M is the amount of events. Each (i,j) element can be 0 or 1.

If \( A(i,j)=1 \) then actor \( i \) is affiliated with event \( j \).

This is an example of an two-mode sociomatrix formatted network.

0 0 1 1 0 0 0 0 1
0 0 1 0 1 0 1 0 0
0 0 1 0 0 0 0 0 0
0 1 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0
0 1 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0
0 0 0 1 0 0 1 0 0
1 0 0 1 0 0 0 1 0
0 0 1 0 0 0 0 0 1
0 1 1 0 0 0 0 0 1
0 0 0 1 0 0 1 0 0
0 0 1 1 1 0 0 0 1

Pajek-like formatted files

SocNetV supports importing Pajek project file. These file usually have a .net or .paj extension.

Please note that we support a limited subset of the features of this format.

To be more precise, SocNetV understands the following Pajek tags:

  • *Network
  • *Vertices
  • *Edges
  • *Arcs
  • *Arcslist
  • *Matrix.

Also, SocNetV will import files with multiple relations inside (multiple *Matrix tags).

SocNetV does not support vectors (vec), permutations or event-based files or vectors, though.

Warning
Pajek files can be much more complicated than the ones recognised by SocNetV, so import those files with caution.

Here is an example of the Pajek-like form that SocNetV understands. The numbers to the left are just indicating line numbers.

Please note, SocNetV expects quotes around node labels in pajek formatted files.

1) *Network
2) *Vertices 6
3) 1 "pe0" ic LightGreen 0.5 0.5 box
4) 2 "pe1" ic LightYellow 0.8473 0.4981 ellipse
5) 3 "pe2" ic LightYellow 0.6112 0.8387 triangle
6) 4 "pe3" ic LightYellow 0.201 0.7205 diamond
7) 5 "pe4" ic LightYellow 0.2216 0.2977 ellipse
8) 6 "pe5" ic LightYellow 0.612 0.1552 circle
9) *Arcs
10) 1 2 1 c black
11) 1 3 -1 c red
12) 2 4 1 c black
13) 3 5 1 c black
14) *Edges
15) 6 4 1 c black
16) 5 6 1 c yellow

Let 's analyse this a little bit:

The first line *Network declares that this is a Pajek network.

The second line (*Vertices 6) declares the number of vertices of the network and identifies that the following lines describe node properties.

Each one of the following 6 lines (3-8) construct one node. Each node's line has 7 columns-properties:

  • Column 1 denotes the node's number.
  • Column 2 denotes the node's label. The label must be wrapped in quotes.
  • Column 3 indicates that the next column carries the colour of the node's shape.
  • Column 4 denotes the colour of the node's shape.
  • Column 5 denotes the proportional X coordinate of the specific node on the canvas.
  • Column 6 denotes the proportional Y coordinate of the specific node on the canvas.
  • Column 7 denotes the node's shape.

Line 9 (*Arcs) identifies that the following lines will describe arcs from an node to another. Each one of the lines 10-13 construct one arc. For instance, Line 10 constructs an arc from node 1 to node 2 with weight 1 and black colour.

Line 14 identifies that the following lines will describe edges (double arcs) between nodes. Each one of the lines construct one edge. For instance, Line 10 constructs an arc from node 1 to node 2 with weight 1 and black color.

Note that it is legal to have mixed columns in Pajek-like network file. For instance you can have an node's specification line like this:

4 "label" 0.201 0.7205 ic LightYellow diamond.

Also, it is not necessary to declare X and Y coordinates or colors and shapes. In that case SocNetV will use the defaults, that is red diamonds scattered randomly across the canvas. Nevertheless, the first two columns must be valid node numbers and labels.

Note also that weights might be negative as in line 11. Negative weights are depicted as dashed lines on the canvas.

Colour names are not arbitrarily created. Valid colour names for nodes and arcs/edges are those specified in the X11 file: /usr/X11R6/lib/X11/rgb.txt, i.e. red, gray, violet, navy, green, etc. You can change colours of all network elements from inside SocNetV.

SocNetV also supports Pajek files which declare edges/arcs in matrices, like this:

*Vertices 11
1 "minister1" 0.2912 0.2004 ellipse
2 "pminister" 0.4875 0.0153 diamond
3 "minister2" 0.3537 0.3416 ellipse
3 "minister2" 0.3537 0.3416 ellipse
4 "minister3" 0.4225 0.5477 ellipse
5 "minister4" 0.4538 0.1603 ellipse
6 "minister5" 0.4900 0.3836 ellipse
7 "minister6" 0.6212 0.5038 ellipse
8 "minister7" 0.6450 0.2023 ellipse
9 "advisor1" 0.6488 0.6031 box
10 "advisor2" 0.3212 0.5515 box
11 "advisor3" 0.7188 0.4218 box
0 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 0 0 1 1 0 0 0
0 1 0 1 0 1 1 1 0 0 0
0 1 0 1 1 0 1 1 0 0 0
0 0 0 1 0 0 0 1 1 0 1
0 1 0 1 0 0 1 0 0 0 1
0 0 0 1 0 0 1 1 0 0 1
1 0 1 1 1 0 0 0 0 0 0
0 0 0 0 0 1 0 1 1 0 0
Definition: matrix.h:127

In the above example, the *Matrix tag replaces *Arcs or *Edges. An ordinary adjacency matrix follows describing all links.

SocNetV v1.5 and later supports multiple *Matrix declarations in the same Pajek file. It will parse each adjacency matrix and load it as a new relation of the same network.

Below is an example pajek file with 4 relations/matrices and 5 actors:

*Network "4 possible graphs of N=5"
*Vertices 5
1 "1" ic red 0.221583 0.644042 circle
2 "2" ic red 0.233094 0.351433 circle
3 "3" ic red 0.696403 0.328808 circle
4 "4" ic red 0.471942 0.197587 circle
5 "5" ic red 0.726619 0.644042 circle
*Matrix :1
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
*Matrix :2
0 0 0 0 0
0 0 0 1 0
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
*Matrix :3
0 1 0 0 0
1 0 0 1 0
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
*Matrix :4
0 0 0 0 1
0 0 0 1 0
0 0 0 0 0
0 1 0 0 0
1 0 0 0 0

Another possibility, is the *ArcsList tag. When SocNetV finds that tag in a Pajek file, it expects each node to declare list of its link to other nodes. Here is an example:

*Vertices 9
1
2
3
4
5
6
7
8
9
*Arcslist
2 1 3 9
1 3 4 5
3 1 4 7
4 1 2 3
5 1 3 4
7 2 8 9

For instance, the first line after *Arcslist means: "node 2 is connected to nodes 1, 3 and 9". It is very simple.

UCINET's DL files

UCINET's DL format is one of the easiest to understand. As of version 1.4, SocNetV support "full matrix" and "edgelist1" modes.

Each UCINET file starts with the "DL" mark; then the amount N of nodes is declared and the format (i.e. if a diagonal is present or not).

Afterwards, after the "LABELS:" mark we read the labels of each node line by line. That is, if N was 100 then we expect to read 100 rows with arbitrary strings-labels.

In the end, a DL file declares network data ("DATA") which describes the edges. If the declared format was "full matrix", SocNetV expects to read an adjacency matrix. If the declared format was edgelist1, SocNetV tries to read a list of edges of the form:

source target weight

i.e.

1 2 1
1 3 1
1 4 1
2 1 1
2 3 1
2 4 1
...

For instance the UCINET-formatted file below, contains 4 nodes and 7 arcs/edges:

DL
N=4
FORMAT = FULLMATRIX DIAGONAL PRESENT
LABELS:
On the normalization and visualization of author co-citation data:Salton's cosine versus the Jaccard index
Caveats for the use of citation indicators in research and journalevaluations
Should co-occurrence data be normalized? A rejoinder
Home on the range - What and where is the middle in science andtechnology studies?
DATA:
0 0 0.158114 0
0.201234 0 1 0
1 0 0 0
0.1 1 1 0

Simple Edgelist

The "edgelist" format is the most simple network file supported by SocNetV as it maps what edges/arcs are present in the network.

SocNetV supports two kinds of edgelist files: unvalued and valued (see "Weighted edge list" below).

In an unvalued edgelist file, each line declares one or more directed edges by vertex numbers which are separated by a space(s) or tab(s).

The first element of the row is always considered to be the source node number while the subsequent elements are the numbers of the nodes outlinked by the source node.

No other node or edge attributes are supported.

For example the line

1 2

declares one edge (the directed edge from node 1 to node 2), while the line

1 2 3 4 5

declares these four arcs:

1 -> 2
1 -> 3
1 -> 4
1 -> 5

The simple edgelist format is suitable for unvalued graphs or digraphs of dichotomous relations.

Here's a more complicated example. This is a network of 37 nodes. The first row declares that node 18 has outbound edges to nodes 8, 10, 23 and 21:

18 8 10 23 21
19 11 21
29 5 9 10
23 8 9 18 11
4 7 6 8 20 5 21
5 4 29 20 7 6 8 9 26 21
6 5 7 4 20 21 8
7 4 6 5 8 20 21
9 5 8 23 29 20 21 11 10
8 18 23 4 5 6 7 21 24 26 25 9 10 37 20
10 18 29 8 11 9 20 25 26
11 19 23 9 10 25 21 36
20 4 5 6 7 8 9 10
24 8 26
26 5 8 24 10
21 19 4 5 6 7 8 9 11 18
36 37 11
37 8 36
25 10 11 8

Weighted edge list

The "Weighted list" format another kind of edgelist format supported by SocNetV. It is suitable for both unvalued and valued graphs or digraphs - no colors or positions are supported though.

Files of this format consist of rows with three numbers: a pair of nodes along with a weight.

Each row of the file declares an arc (outbound edge) from a source node to a target node. The first number of the row is always considered to be the source node, the second element is the target node and the third is the weight of the arc from source to target.

For instance, in the example below, the first row declares that node 1 has an outbound edge to node 2, of weight 4 . The second row declares that node 1 is out-linked to 3 and the weight of that arc is 2.

1 2 4
1 3 2
1 6 2
1 8 2
1 10 2
1 11 2
1 13 2
1 14 2
1 18 2