Wednesday, June 8, 2011

Building a reader for NeXML

Here is what I have been working on over the last few days. The code lives in two packages:
  1. org.nescent.phylogeoref.nexml
  2. org.nescent.phylogeoref.nexml.utility
The first class is NeXMLReader.
Its centerpiece is the method parseNetwork(File networkFile).

Here is a detailed explanation of it.
        Document document = DocumentFactory.parse(networkFile);
        List<TreeBlock> treeList = document.getTreeBlockList();


You parse the File object networkFile, which points to a NeXML file, and then extract its tree blocks as a list of TreeBlock objects.
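To give a feel for what a tree block looks like inside the file, here is a minimal sketch that uses only the JDK's DOM parser rather than the NeXML Java API. The class name TreeBlockSketch, the method treeIds and the sample document are my own assumptions for illustration, and the sample is a stripped-down fragment, not a schema-valid NeXML file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of phylogeoref or the NeXML Java API.
class TreeBlockSketch {

    /** Returns the ids of all <tree> and <network> children of every <trees> block. */
    static List<String> treeIds(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getLocalName() works
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> ids = new ArrayList<>();
            // "*" matches any namespace, so no namespace URI is hard-coded here.
            NodeList blocks = doc.getElementsByTagNameNS("*", "trees");
            for (int i = 0; i < blocks.getLength(); i++) {
                NodeList children = blocks.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (!(child instanceof Element)) {
                        continue; // skip whitespace text nodes and comments
                    }
                    String name = child.getLocalName();
                    if ("tree".equals(name) || "network".equals(name)) {
                        ids.add(((Element) child).getAttribute("id"));
                    }
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // Simplified fragment for illustration only.
        String sample = "<nexml xmlns=\"http://www.nexml.org/2009\">"
                + "<trees id=\"tb1\"><tree id=\"t1\"/><network id=\"n1\"/></trees>"
                + "</nexml>";
        System.out.println(treeIds(sample));
    }
}
```

The NeXML Java API does this walking for you and hands back typed objects, which is why the reader itself only needs the two lines shown earlier.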

The next thing we do is ask the engine to construct a Phylogeny object from a network object.

           phylogenies[index] = engine.constructPhylogenyFromNetwork(network);

The NeXMLEngine class does all the computational work of constructing a Phylogeny object from a Network object.
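The general idea of turning a network's edge list into a rooted tree can be sketched as follows. This is not the actual NeXMLEngine code: Edge, TreeNode and build are hypothetical stand-ins I made up for this example, and the sketch assumes every edge is directed from parent to child and that the network is really a tree (exactly one root, at most one parent per node).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the real NeXMLEngine.
class NetworkToTreeSketch {

    /** Stand-in for a directed network edge (parent -> child). */
    static final class Edge {
        final String source;
        final String target;
        Edge(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    /** Stand-in for a phylogeny node. */
    static final class TreeNode {
        final String name;
        final List<TreeNode> children = new ArrayList<>();
        TreeNode(String name) {
            this.name = name;
        }

        /** Serialises the subtree in Newick style, without the trailing semicolon. */
        String toNewick() {
            if (children.isEmpty()) {
                return name;
            }
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(children.get(i).toNewick());
            }
            return sb.append(')').append(name).toString();
        }
    }

    /** Links the nodes along the edges and returns the single node without a parent. */
    static TreeNode build(List<Edge> edges) {
        Map<String, TreeNode> nodes = new LinkedHashMap<>();
        Set<String> hasParent = new HashSet<>();
        for (Edge e : edges) {
            TreeNode parent = nodes.computeIfAbsent(e.source, TreeNode::new);
            TreeNode child = nodes.computeIfAbsent(e.target, TreeNode::new);
            if (!hasParent.add(e.target)) {
                throw new IllegalArgumentException(e.target + " has more than one parent");
            }
            parent.children.add(child);
        }
        TreeNode root = null;
        for (TreeNode node : nodes.values()) {
            if (!hasParent.contains(node.name)) {
                if (root != null) {
                    throw new IllegalArgumentException("more than one root");
                }
                root = node;
            }
        }
        if (root == null) {
            throw new IllegalArgumentException("no root found");
        }
        return root;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
                new Edge("root", "a"), new Edge("root", "b"),
                new Edge("a", "c"), new Edge("a", "d"));
        System.out.println(build(edges).toNewick()); // prints ((c,d)a,b)root
    }
}
```

A real network can contain reticulations (nodes with more than one parent); the sketch simply rejects those, which is only reasonable for the very simple files discussed here.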

The PhyloUtility class provides commonly used utility methods. All of them are static and documented, so feel free to have a look.

Last is the PhylogenyFactory class, which is simply a factory for new Phylogeny objects.

So it is now possible to read very simple NeXML files and construct the corresponding Phylogeny objects. However, the metadata associated with the nodes is not yet attached to them. This will be a bit of a challenge, as I'll have to work out how to extract this information from the document. The NeXML schema is also under active development at the moment.

The code is up on GitHub. There is a utility main method in the NeXMLReader class; run it to see the reader in action. Some sample files to choose from are in samples. A reminder again that I am running this on a Windows machine.
