Archive for the ‘Hypergraph’ Category

Proposal for graph readers in JGraphT

Thursday, July 8th, 2010

One part of my development on Hypergraph which I recently started again after a long break is to replace my own graph implementation with JGraphT. JGrapht is a nice framework to create graph structures in Java and it offers various algorithms to process graphs. For the hyperbolic graph visualisation I do with Hypergraph, this is a very good alternative to my self-written “graph library”. It also includes the possibility to serialise graphs using for example the GraphViz dot format or the GraphML XML format. However, it doesn’t offer any possibility to read these graph formats.

The following is a proposal how to create readers for JGraphT. Please feel free to leave a comment, an additional proposal etc.

The GraphReader interface

The parsing of a file and the creation of the graph should be done by a instances of a common GraphReader interface. This interface could be:

public interface GraphReader<V, E> {
    public void setComponentAttributeProvider(ComponentAttributeProvider<?> attributeProvider);

    public Graph<V, E> read(String filename,
        VertexFactory<V> vertexFactory, EdgeFactory<V, E> edgeFactory);

    public Graph<V, E> read(InputStream inputStream,
        VertexFactory<V> vertexFactory, EdgeFactory<V, E> edgeFactory);

    public Graph<V, E> read(URL url,
        VertexFactory<V> vertexFactory, EdgeFactory<V, E> edgeFactory);

}

The ComponentAttributeProvider can be provided before creating the graph; in this case, attributes like weight, color or other information provided in the file will be stored in the ComponentAttributeProvider. This is not mandatory – if not available, the attributes are simply ignored. For a discussion about storing the attributes, see below.

The read methods return a Graph instance; the actual type of the graph depends on the data in the file. For example the graph format can define a graph as directed/undirected (for example “digraph” in the dot format); the created graph may hence be a DirectedGraph or an UndirectedGraph.

Questions:

  1. Which read methods should be implemented? Are the three methods I mentioned above enough?
  2. Which parameters should the read methods have? Shold we offer read methods without an EdgeFactory, which default to ClassBasedEdgeFactory?
  3. How should exceptions (IOException etc) be handled? This includes parsing errors (wrong format).

Generic Reader

There should be a generic class which hides the complexity of choosing the correct parser. When this generic reader is used, the only information passed to the reader might be a file name or even only an input stream. The reader should derive information about the format by analysing for example the file name, the mime type or the format of the actual data, based on some heuristics.

Some example for heuristics:

  1. if the file name is given and ends with “.xml” or “.dot” => the file is an xml or dot file and the corresponding reader is used
  2. if the mime type is available, use xml reader for the xml mime type
  3. if the first bytes “look like” a dot file (i.e. the file starts according to the dot grammar [strict] (graph | digraph) ), use the dot reader

Example how to use such a generic reader:

    GraphReader<V, E> reader = new GenericReader<V, E>();
    ComponentAttributeProvider attributeProvider = new DefaultComponentAttributeProvider();
    reader.setAttributeProvider(attributeProvider);
    Graph<V, E> graph = reader.read(“some_file.dot”, vertexFactory);

In this case, the read(..) methods would assume that the file has the dot format because of the file name. It would instantiate a corresponding reader (say DOTReader) and delegate the actual parsing to this reader.

Storing attributes on vertices and edges

Most graph formats include attributes for vertices and edges, for example a weight or color information. There must be a way to store these attributes either on the graph elements (i.e. vertices or edges) themselves or by maintaining a map for each element which stores the attribute keys and corresponding values. I prefer the second approach for two reasons:

  1. There exists already a ComponentAttributeProvider which has been introduced to support attributes when dot files are created.
  2. Storing the attributes on the elements would imply that the elements implement some interface, probably an interface which defines something like “Object getAttribute(String key)”. This would contradict the philosophy of JgraphT since the JgraphT code makes no assumption on the vertices and edges at all.

Hence the readers should store any attributes using ComponentAttributeProvider (probably in a slightly enhanced version) and should make no assumptions on the elements. Even then, there are different ways to work with the attributes:

Variant 1:
The vertex and edge classes don’t know about the attributes; to access the attributes, the program has to access the ComponentAttributeProvider.
Variant 2:
The elements are instances of a special class which “knows” about the attribute provider. The vertex/edge class has accessors to for example a label (i.e. public String getLabel()) and the implementation of this method delegates to the attribute provider. In particular, the elements don’t store the actual attribute values.
Variant 3:
The elements store the actual attribute values and offer appropriate accessors. The attribute provider knows the type of the element and when getAttributes() is called, the attribute provider creates a map and initialises it with the attributes read from the vertex.

Both variant 2 and 3 assume that there is a special vertex or edge class; this is possible because the Reader accepts a VertexFactory and EdgeFactory.

Reanimating Hypergraph

Monday, July 5th, 2010

After a break of several years, I decided to continue the Hypergraph project. For those of you who don’t know Hypergraph: this is an open source Java framework which provides ways to model hyperbolic geometry and visualises graphs and trees using hyperbolic geometry. There has been some research on hyperbolic trees which shows that especially hierarchical data can be view very well using hyperbolic trees.

Every few weeks I received a mail asking for the current status of the project, whether it’s completely dead or whether certain enhancements can be made. There were also various (legitimate) complaints about the missing documentation of the project.

Hence I thought I should continue to work on the project at least if time allows it. Here I would like to give an overview about the next development steps and the reasons for this. I’m afraid, I can’t give a time estimation as I’m not working on this project on a regular basis.

Well, the next steps will be:

  1. Finish the refactoring to become independent of the UI (support at least AWT/Sing and SWT)
  2. Get rid of the graph data structure and algorithms; replace it by JgraphT
  3. Tidy up the existing graph layout algorithms
  4. Tidy up the example applications and add new examples
  5. Enhance the Javadoc
  6. Relaunch the website, including a better documentation

As you can see, I start with the parts which are more fun to me and end up with the (usually boring, but necessary) documentation.

Independence of the UI framework

The existing Hypergraph code consists of three parts: packages related to the graph structure, a package for the hyperbolic geometry and a package for the visualisation of the hyperbolic graphs and hyperbolic trees in a JPanel. At first sight, this sounds like a reasonable architecture, but if you look at the code in detail, you will find that this is not the case, here some examples:

  • The colour attribute for edges and vertices are stored using java.awt.Colour. If you use this graph in frameworks which don’t use java.awt.Colour, this causes a problem.
  • The projector which is responsible for projecting the hyperbolic plane to the screen coordinates depends on JComponent
  • The animator which is responsible for smoothly translating the hyperbolic plane uses javax.swing.Timer.

If the code shall be used by different UI frameworks, these dependencies have to be resolved, which is the first aim. Large parts are already done and committed to the SWT branch, but there is still something to do.

Replacement of the Graph API by JGraphT

When I started in 2002/2003, I decided to write my own graph API. By now, there are some established projects written in Java which offer the functionality I need and I decided to use JGraphT.

The advantage is that I don’t need to worry about the (correct) implementation of any algorithms or data structure – there is no need to reinvent the wheel.

There are disadvantes though: I have to rewrite any code that depend on the existing graph data structure, which is quite a lot by the nature of the project. Also there doesn’t seem to be any reader for GraphML or GraphXML files, so need to implement them.

Tidying up the layout algorithms

The most striking issue with the hyperbolic layout algorithms is the poor quality of the code. I have to confess that I can’t read the code at all…

But there are some other deficiencies: the layout algorithms are graph layout algorithms. Well, that sounds tautologic, but for example the force directed layout is actually a generic multi dimensional scaling algorithm which could also be applied to other data than graphs. It will be interesting to see whether hyperbolic geometry can also be used for other visualisation besides hyperbolic trees and graphs.

This will happen more or less at the same time as the introduction of JGraphT as this is related.

Tidying up the example applications

Similar to the layout algorithms, the code quality of the applications is quite poor – the programs are a mixture of examples and proof of concept. Even the main applet to show hyperbolic trees should be tidied up.

I haven’t thought yet about which examples and which actual applications I should include in the distribution. If you have any proposals or if you would like to have an example for a specific concept in Hypergraph, please add a comment.

Enhancement of  the Javadoc and relaunching the website

The need to enhance the Javadoc should be obvious to anybody who has worked with the code. The same applies to the website: it’s not complete at all.

Currently, it’s difficult or at least not comfortable to add new content to the website due to the way the html files are generated and deployed.

As I have also spent some time working with CouchDB, I might decide to use CouchDB as a storage and web server and create some mixture of blog and wiki for the new website.

Conclusion

As you can see, there is a lot to do and I hope to provide a new release soon which covers at least the first two issues. I will also try to write about how the issues are resolved, i.e. describe the new architecture of Hypergraph.

As a last remark, let me mention that I’m not very glad about the project’s name Hypergraph. When I created the project, I thought this is a nice combination of hyperbolic geometry and graphs. However, I completely ignored that hypergraph is actually a fixed notion in graph theory which can lead to confusion. I wonder whether there is some other name which hints at the main focus of the project: data visualisation using hyperbolic geometry. If you have any ideas, please add a comment.