One part of my development on Hypergraph which I recently started again after a long break is to replace my own graph implementation with JGraphT. JGrapht is a nice framework to create graph structures in Java and it offers various algorithms to process graphs. For the hyperbolic graph visualisation I do with Hypergraph, this is a very good alternative to my self-written “graph library”. It also includes the possibility to serialise graphs using for example the GraphViz dot format or the GraphML XML format. However, it doesn’t offer any possibility to read these graph formats.
The following is a proposal how to create readers for JGraphT. Please feel free to leave a comment, an additional proposal etc.
The GraphReader interface
The parsing of a file and the creation of the graph should be done by a instances of a common GraphReader interface. This interface could be:
public interface GraphReader<V, E> {
public void setComponentAttributeProvider(ComponentAttributeProvider<?> attributeProvider);
public Graph<V, E> read(String filename,
VertexFactory<V> vertexFactory, EdgeFactory<V, E> edgeFactory);
public Graph<V, E> read(InputStream inputStream,
VertexFactory<V> vertexFactory, EdgeFactory<V, E> edgeFactory);
public Graph<V, E> read(URL url,
VertexFactory<V> vertexFactory, EdgeFactory<V, E> edgeFactory);
}
The ComponentAttributeProvider can be provided before creating the graph; in this case, attributes like weight, color or other information provided in the file will be stored in the ComponentAttributeProvider. This is not mandatory – if not available, the attributes are simply ignored. For a discussion about storing the attributes, see below.
The read methods return a Graph instance; the actual type of the graph depends on the data in the file. For example the graph format can define a graph as directed/undirected (for example “digraph” in the dot format); the created graph may hence be a DirectedGraph or an UndirectedGraph.
Questions:
- Which read methods should be implemented? Are the three methods I mentioned above enough?
- Which parameters should the read methods have? Shold we offer read methods without an EdgeFactory, which default to ClassBasedEdgeFactory?
- How should exceptions (IOException etc) be handled? This includes parsing errors (wrong format).
Generic Reader
There should be a generic class which hides the complexity of choosing the correct parser. When this generic reader is used, the only information passed to the reader might be a file name or even only an input stream. The reader should derive information about the format by analysing for example the file name, the mime type or the format of the actual data, based on some heuristics.
Some example for heuristics:
- if the file name is given and ends with “.xml” or “.dot” => the file is an xml or dot file and the corresponding reader is used
- if the mime type is available, use xml reader for the xml mime type
- if the first bytes “look like” a dot file (i.e. the file starts according to the dot grammar [strict] (graph | digraph) ), use the dot reader
Example how to use such a generic reader:
GraphReader<V, E> reader = new GenericReader<V, E>();
ComponentAttributeProvider attributeProvider = new DefaultComponentAttributeProvider();
reader.setAttributeProvider(attributeProvider);
Graph<V, E> graph = reader.read(“some_file.dot”, vertexFactory);
In this case, the read(..) methods would assume that the file has the dot format because of the file name. It would instantiate a corresponding reader (say DOTReader) and delegate the actual parsing to this reader.
Storing attributes on vertices and edges
Most graph formats include attributes for vertices and edges, for example a weight or color information. There must be a way to store these attributes either on the graph elements (i.e. vertices or edges) themselves or by maintaining a map for each element which stores the attribute keys and corresponding values. I prefer the second approach for two reasons:
- There exists already a ComponentAttributeProvider which has been introduced to support attributes when dot files are created.
- Storing the attributes on the elements would imply that the elements implement some interface, probably an interface which defines something like “Object getAttribute(String key)”. This would contradict the philosophy of JgraphT since the JgraphT code makes no assumption on the vertices and edges at all.
Hence the readers should store any attributes using ComponentAttributeProvider (probably in a slightly enhanced version) and should make no assumptions on the elements. Even then, there are different ways to work with the attributes:
- Variant 1:
- The vertex and edge classes don’t know about the attributes; to access the attributes, the program has to access the ComponentAttributeProvider.
- Variant 2:
- The elements are instances of a special class which “knows” about the attribute provider. The vertex/edge class has accessors to for example a label (i.e. public String getLabel()) and the implementation of this method delegates to the attribute provider. In particular, the elements don’t store the actual attribute values.
- Variant 3:
- The elements store the actual attribute values and offer appropriate accessors. The attribute provider knows the type of the element and when getAttributes() is called, the attribute provider creates a map and initialises it with the attributes read from the vertex.
Both variant 2 and 3 assume that there is a special vertex or edge class; this is possible because the Reader accepts a VertexFactory and EdgeFactory.
Tags: Hypergraph, JGraphT
Hi Jens,
This is John Sichi (perfecthash over at the jgrapht project).
I’m glad you are working on this; it’s a fairly frequent request from JGraphT users.
One suggestion I have is to take an optional Graph object as input (in addition to returning it as output). This is what we do for the GraphGenerator interface. The reason is that it allows the caller to choose the desired concrete graph implementation, which can be very useful when dealing with custom graph classes. If appropriate, the reader can verify that the graph is empty to start with (but in some cases a reader could also be used to augment an existing graph, potentially attaching to existing vertices by reference). And in your example where the graph type is declared in the file, then the reader can verify that the input graph (if supplied) matches. If a graph is passed, then no edge factory is required since the graph already supplies its own.
We can edit the comments on ComponentAttributeProvider to indicate that when used with a reader, the returned map is read/write (rather than read-only as for a writer).
JVS
Hi, I am looking for a open source java lib for attributed graph representation. Can I have Edge and Vertex Attribution in JGraphT ? If yes, is there any place where i can find some sample codes for it. Thanks
Hi Muhammad,
Such questions are better asked in the mailing lists or sourceforge trackers of the JGraphT project (see http://jgrapht.org). A quick answer, though: in the latest trunk there is org.jgrapht.ext.ComponentAttributeProvider which could be used; another (and possibly preferred) solution is to use special classes for edges an vertices which store the attributes.