Data Request Broker
2-3-release

Data Request Broker - DRB API®
2-3-release

This document is the API specification for the Data Request Broker - DRB API®.


Packages
fr.gael.drb: DRB main package containing the main interfaces, default and abstract classes.
fr.gael.drb.impl: Package containing the implementations.
fr.gael.drb.impl.file: Implementation of DRB for the support of Files and Directories of the local File System.
fr.gael.drb.impl.ftp: Implementation of DRB for the support of Files and Directories on FTP.
fr.gael.drb.impl.jar: Implementation of DRB for the support of Jar files.
fr.gael.drb.impl.sds: Implementation of DRB for the support of binary files, hereafter called Structured Data Sources (SDS).
fr.gael.drb.impl.tar: Implementation of DRB for the support of Tar files.
fr.gael.drb.impl.xml: Implementation of DRB for the support of XML documents.
fr.gael.drb.impl.zip: Implementation of DRB for the support of ZIP files.
fr.gael.drb.query: An implementation of XQuery 1.0 over the DRB API®.
fr.gael.drb.value: Envelope classes and operations for primitive values of DRB.
fr.gael.drb.xsd: Implementation of DRB for the support of XSD XML schema files.

 


The Data Request Broker - DRB API® is an application programming interface for reading, writing and processing heterogeneous data. DRB API® is an abstraction layer that helps developers write applications independently of the way data are encoded. DRB API® is based on a unified data model that makes the handling of supported data formats more complete and easier. DRB API® can be compared to APIs such as DOM, JDBC or ODBC, but it is not limited to XML documents or to databases.

Programmed in the Java language and embedding W3C standards, DRB API® provides a portable solution that remains powerful even on large data collections. Initially developed and proven for accessing satellite imagery for the European Space Agency, DRB API® is the result of more than 15 years of expertise and know-how in data access programming. DRB API® thus provides a complete set of tools for solving simple and complex data access issues with a minimum of engineering effort.

Architecture

DRB API® is composed of a main interface which handles data bound to a unified Data Model. This main interface comprises the fr.gael.drb and fr.gael.drb.value packages, which contain the classes and interfaces required, respectively, to build the tree data structure and to encode the data values.

DRB API® is also composed of several implementations that wrap the information extracted from target data files and convert it according to the unified Data Model. It currently provides implementations defined in fr.gael.drb.impl.file for files and directories within the file system or accessed through the HTTP protocol, fr.gael.drb.impl.ftp for remote files accessed through the FTP protocol, fr.gael.drb.impl.xml for XML documents, fr.gael.drb.impl.sds for ASCII and binary files, and finally fr.gael.drb.impl.jar, fr.gael.drb.impl.tar and fr.gael.drb.impl.zip for JAR, TAR and ZIP files respectively. DRB API® also contains facilities that add extra tools to manipulate data: the XQuery engine defined by fr.gael.drb.query and the XML Schema validator defined by fr.gael.drb.xsd.

DRB Data Model

As previously introduced, DRB API® endeavors to be as independent as possible from the data types it handles. The intent is to keep client applications relevant in the largest possible number of environments and contexts of use, and to avoid constraining them to a set of data types that happen to be in use at a specific period of time or in a specific domain.

To meet this challenge, a key element for client applications is the federation of all processing components around a unified Data Model. Because the Data Model stands as the input/output interface between those processing components, supporting a new data type boils down to configuring or implementing a new binding of the Data Model. The support of a new textual or binary format should neither cripple the client systems nor require reworking their architectural designs. On the contrary, it should let the existing features benefit transparently from the novelty, or at least let them identify, without failure, that they have nothing to do with it.

Another critical aspect of these architectures is that the Data Model carries a very low level of semantics, so that data of very different natures can be bound to it. A typical counter-example would be a Data Model including image domain specifics, to which it would be inconsistent to bind any type that has nothing to do with pixels, samples, color models, etc.

The DRB API® Data Model considers that any data can be bound to a DrbSequence of DrbItems, as shown in the figure below.

The Sequence can be the Empty Sequence if it contains nothing, a Singleton if it contains exactly one DrbItem, or a collection of an unlimited number of DrbItems. The class DrbItem is an abstract concept that denotes a named element that can be specialized to a DrbNode, a DrbAttribute or an atomic Value. These three specializations have different meanings and different types of relationships in the Data Model.
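The relationship between a sequence and its items can be pictured with a small self-contained sketch. All class names below (SketchItem, SketchNode, SketchAttribute, SketchValue, SketchSequence) are hypothetical stand-ins introduced for illustration only; they are not the fr.gael.drb classes and make no claim about their actual signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the Data Model concepts. These are NOT the
// fr.gael.drb classes; they only illustrate how the concepts relate.
abstract class SketchItem {
    private final String name;

    SketchItem(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Tells which specialization of the abstract item this object is.
    abstract String kind();
}

class SketchNode extends SketchItem {
    SketchNode(String name) { super(name); }
    @Override String kind() { return "node"; }
}

class SketchAttribute extends SketchItem {
    SketchAttribute(String name) { super(name); }
    @Override String kind() { return "attribute"; }
}

class SketchValue extends SketchItem {
    private final Object content;

    SketchValue(String name, Object content) {
        super(name);
        this.content = content;
    }

    Object getContent() { return content; }
    @Override String kind() { return "value"; }
}

// An ordered, read-only collection of items: empty, singleton, or larger.
class SketchSequence {
    private final List<SketchItem> items;

    SketchSequence(SketchItem... items) {
        this.items = Collections.unmodifiableList(
            new ArrayList<SketchItem>(Arrays.asList(items)));
    }

    boolean isEmpty() { return items.isEmpty(); }
    boolean isSingleton() { return items.size() == 1; }
    int size() { return items.size(); }
    SketchItem get(int index) { return items.get(index); }
}

class SequenceSketchDemo {
    public static void main(String[] args) {
        SketchSequence empty = new SketchSequence();
        SketchSequence single = new SketchSequence(new SketchNode("file.xml"));
        SketchSequence mixed = new SketchSequence(
            new SketchNode("dir"),
            new SketchAttribute("size"),
            new SketchValue("count", Integer.valueOf(3)));
        // Prints "true true 3"
        System.out.println(empty.isEmpty() + " " + single.isSingleton() + " " + mixed.size());
    }
}
```

The point of the sketch is only that client code can treat every result uniformly as a sequence, whatever the kind of the items it holds.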

The DrbNodes are intended to model the hierarchical structure (tree) of the bound data. In modern windowing environments, the directory list is an excellent example of a tree of DrbNodes. The top of the tree is the root directory or drive, and under that is a list of subdirectories. If the subdirectories contain further subdirectories, they are bound to DrbNodes as well. The actual files found in any directory are the leaves of the tree, also modeled as DrbNodes. Any data that contains parent-child relationships between chunks of information can be modeled as a tree of DrbNodes.

Another common example is an organizational chart. In such a chart, every management position is a DrbNode, with child DrbNodes representing the employees under the manager. The leaves of this organizational chart are the employees who are not in a management position, and its root is the top-level manager. Of course, real organizations don't always adhere to a strict tree structure. This example highlights one limitation of the current DRB API® Data Model, which has been deliberately accepted to keep complexity at an acceptable level with regard to the actual need for cyclic data modeling. Should an explicit need for cyclic modeling arise, the traversal axes of the Data Model should be augmented with another type of DrbNode relationship, which should not impact the previously implemented bindings.

The DrbNode relationships currently available are illustrated in the figure below and summarized in the following table.

Node Traversal Axis: Summary
Parent: The parent Node, if the current Node is not the root Node (cf. DrbSimpleNode.getParent()).
Child: The collection of child Nodes, if any (cf. DrbSimpleNode.getChildren()).
Previous Sibling: The previous sibling, if the current Node is not the first child of its parent Node (cf. DrbSimpleNode.getPreviousSibling()).
Next Sibling: The next sibling, if the current Node is not the last child of its parent Node (cf. DrbSimpleNode.getNextSibling()).

Refer to the other methods of the DrbNode class for further convenient traversal axes.
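The four traversal axes can be illustrated with a minimal tree node. The AxisNode class below is a hypothetical stand-in, not the DRB implementation; its method names simply mirror the ones cited in the table, and returning null when an axis is empty is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node used only to illustrate the traversal axes.
// It is NOT the DRB DrbNode or DrbSimpleNode class.
class AxisNode {
    private final String name;
    private AxisNode parent; // null for the root node
    private final List<AxisNode> children = new ArrayList<AxisNode>();

    AxisNode(String name) {
        this.name = name;
    }

    // Attaches a child at the end of the child list and returns this node.
    AxisNode append(AxisNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    String getName() { return name; }

    // Parent axis: null when this node is the root.
    AxisNode getParent() { return parent; }

    // Child axis: possibly empty, never null.
    List<AxisNode> getChildren() { return children; }

    // Previous sibling axis: null for the root or for a first child.
    AxisNode getPreviousSibling() {
        if (parent == null) {
            return null;
        }
        int index = parent.children.indexOf(this);
        return index > 0 ? parent.children.get(index - 1) : null;
    }

    // Next sibling axis: null for the root or for a last child.
    AxisNode getNextSibling() {
        if (parent == null) {
            return null;
        }
        int index = parent.children.indexOf(this);
        return index < parent.children.size() - 1 ? parent.children.get(index + 1) : null;
    }
}

class AxisDemo {
    public static void main(String[] args) {
        AxisNode first = new AxisNode("bin");
        AxisNode second = new AxisNode("etc");
        AxisNode root = new AxisNode("/").append(first).append(second);
        System.out.println(second.getParent().getName());          // "/"
        System.out.println(second.getPreviousSibling().getName()); // "bin"
        System.out.println(first.getNextSibling().getName());      // "etc"
        System.out.println(root.getChildren().size());             // 2
    }
}
```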

In the DRB API® Data Model, DrbNodes can have attached DrbAttributes that can be considered as Metadata of the DrbNodes. Typical DrbAttributes could be the size of a file modeled as a DrbNode, the XML name attribute of the <person name="Smith"/> XML fragment or the compression ratio of a ZIP file entry.

Because DrbNodes and DrbAttributes are not only symbolic notions about the modeled data, they can be assigned an atomic Value that actually binds a part of the data to a primitive type such as an integer, a floating point number or a string of characters. The atomic Value may also represent an array of primitive types. Please refer to the fr.gael.drb.value package for more information about the values supported in the current version of the software.
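As a rough illustration of nodes carrying attributes and atomic values, the sketch below models the <person name="Smith"/> fragment and a file with a size. The ValuedNode class and its methods are hypothetical, and plain Java objects stand in for the envelope classes of fr.gael.drb.value, whose actual types are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical node with attached attributes and an optional atomic value.
// Plain Java objects are used as values; this is NOT the DRB value model.
class ValuedNode {
    private final String name;
    private final Map<String, Object> attributes = new LinkedHashMap<String, Object>();
    private Object value; // stays null for purely structural nodes

    ValuedNode(String name) {
        this.name = name;
    }

    String getName() { return name; }

    // Attaches (or overwrites) a named attribute and returns this node.
    ValuedNode setAttribute(String attributeName, Object attributeValue) {
        attributes.put(attributeName, attributeValue);
        return this;
    }

    // Returns the attribute value, or null when no such attribute exists.
    Object getAttribute(String attributeName) {
        return attributes.get(attributeName);
    }

    ValuedNode setValue(Object newValue) {
        this.value = newValue;
        return this;
    }

    Object getValue() { return value; }
}

class ValuedNodeDemo {
    public static void main(String[] args) {
        // <person name="Smith"/> : a node with one string attribute.
        ValuedNode person = new ValuedNode("person").setAttribute("name", "Smith");

        // A file node whose size is metadata and whose value is an array
        // of primitive samples.
        ValuedNode file = new ValuedNode("samples.dat")
            .setAttribute("size", Long.valueOf(12L))
            .setValue(new int[] {7, 8, 9});

        System.out.println(person.getAttribute("name"));   // Smith
        System.out.println(file.getAttribute("size"));     // 12
        System.out.println(((int[]) file.getValue()).length); // 3
    }
}
```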

Implementations

The implementations made available in the present distribution are:

DRB Implementation: Summary
File: The File implementation wraps around the local file system to bind directories and files as DrbNodes of the Data Model (cf. fr.gael.drb.impl.file).
XML: The XML implementation is similar to the Document Object Model (DOM) interface specified by the W3C. Generally speaking, the XML markups correspond to DrbNodes of the Data Model, their attributes to DrbAttributes of the Data Model, and the textual content enclosed in the XML markups or in the attribute quotes is bound to atomic fr.gael.drb.DrbValues of the Data Model (cf. fr.gael.drb.impl.xml).
Structured Data Source (SDS): SDS is a critical implementation for accessing scientific or legacy data: it wraps around the content of binary or ASCII files according to an external description of the data structure. This external description is based on XML Schema documents, following a W3C recommendation. SDS makes use of a set of extra markups to complete the description of the binary data representation when this information cannot be derived from the standard XML Schema markups. The SDS descriptors nevertheless remain fully compatible with the standard (cf. fr.gael.drb.impl.sds).
FTP: The FTP implementation wraps around the remote file system exposed through the File Transfer Protocol (FTP). It binds directories and files as DrbNodes of the Data Model. Its behavior is similar to the File implementation described above, but for remote data (cf. fr.gael.drb.impl.ftp).
ZIP: The ZIP implementation wraps around ZIP file archives to bind their entries as DrbNodes of the Data Model. The complete directory structure is expanded instead of modeling a flat list of ZIP file entries (cf. fr.gael.drb.impl.zip).
TAR: Same as the ZIP implementation, for Tape ARchive (TAR) archives (cf. fr.gael.drb.impl.tar).
JAR: Same as the ZIP implementation, for Java ARchive (JAR) archives (cf. fr.gael.drb.impl.jar).
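The idea of expanding a flat list of archive entries into a directory tree, as described for the ZIP, TAR and JAR implementations, can be sketched as follows. The EntryTree class is hypothetical and does not reflect how fr.gael.drb.impl.zip is written. The sketch works on plain entry names of the kind returned by java.util.zip.ZipEntry.getName(), so that it runs without any archive at hand.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical tree built from flat, slash-separated archive entry names.
// This only illustrates the expansion idea; it is NOT the DRB ZIP binding.
class EntryTree {
    final String name;
    final Map<String, EntryTree> children = new LinkedHashMap<String, EntryTree>();

    EntryTree(String name) {
        this.name = name;
    }

    // Inserts one entry path such as "docs/api/index.html", creating the
    // intermediate directory nodes that do not exist yet.
    void insert(String entryPath) {
        EntryTree current = this;
        for (String part : entryPath.split("/")) {
            if (part.isEmpty()) {
                continue; // skips empty segments, e.g. from a leading slash
            }
            EntryTree next = current.children.get(part);
            if (next == null) {
                next = new EntryTree(part);
                current.children.put(part, next);
            }
            current = next;
        }
    }

    // Builds a tree, rooted at an unnamed node, from all entry names.
    static EntryTree fromEntryNames(Iterable<String> entryNames) {
        EntryTree root = new EntryTree("");
        for (String entryName : entryNames) {
            root.insert(entryName);
        }
        return root;
    }

    // Prints the children of this node, indented by depth.
    void print(String indent) {
        for (EntryTree child : children.values()) {
            System.out.println(indent + child.name);
            child.print(indent + "  ");
        }
    }
}

class EntryTreeDemo {
    public static void main(String[] args) {
        EntryTree root = EntryTree.fromEntryNames(Arrays.asList(
            "docs/", "docs/api/index.html", "docs/readme.txt", "LICENSE"));
        root.print("");
    }
}
```

With a real archive, the entry names could be collected by iterating over java.util.zip.ZipFile.entries() and calling getName() on each entry before handing them to fromEntryNames.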

The number of implementations and the amount of representation information available with DRB API® increase steadily from release to release. Today, DRB API® already includes new implementations such as those supporting HDF, NetCDF or VPF files, and additional packages are already available to support thousands of additional types. These are currently focused mainly on Earth Observation and Geographical Information System (GIS) data types, such as those supporting QuickBird, RADARSAT, NITF, DTED, DNG, etc. data products. DRB API® is however not tied to these domains: other data types from different domains will arise in the near future. Please refer to the http://www.gael.fr/drb Web site to learn more about DRB API® or to check whether your data type is already available.

Facilities

DRB API® comes with an XQuery engine that allows selecting and computing data across targeted files. It also provides an XML Schema validation facility that runs over the supported formats.

Refer to fr.gael.drb.query and fr.gael.drb.xsd for the specifications of these packages.
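To give a feel for what "selecting and computing" with path expressions means, the sketch below evaluates two expressions over a small XML document. It deliberately uses the XPath API bundled with the JDK (javax.xml.xpath) rather than the DRB XQuery engine, whose classes and signatures are not reproduced here. The PathQuerySketch class and the sample document are assumptions made for illustration, and XPath 1.0 covers only a small part of what XQuery 1.0 offers.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Illustration only: uses the JDK XPath 1.0 engine, NOT fr.gael.drb.query.
class PathQuerySketch {
    // Hypothetical sample document.
    static final String SAMPLE =
        "<catalog><product size='10'/><product size='32'/></catalog>";

    // Evaluates an expression that yields a number over the given XML text.
    static double evaluateNumber(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Double result = (Double) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
            return result.doubleValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Selecting: how many product elements are there?
        System.out.println(evaluateNumber("count(/catalog/product)", SAMPLE));
        // Computing: what is the total of their size attributes?
        System.out.println(evaluateNumber("sum(/catalog/product/@size)", SAMPLE));
    }
}
```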

From the Command Line

DRB API® is an Application Programming Interface (API): its functions must be called from an application to be operational. DRB API® nevertheless ships with some application modules. For instance, it allows executing queries directly from the command line.

Requirements and Dependencies

DRB API® is written in the Java programming language. It uses packages of the Java 5 Platform or higher provided by Sun Microsystems (http://java.sun.com/j2se/1.5.0/docs/api). DRB API® has only one dependency, as the Java platform has most interfaces and libraries dealing with XML technology built in. The remaining dependency is the logging service API log4j 1.2.8, developed by the Apache Software Foundation (http://logging.apache.org).



Copyright © 2001-2009 GAEL Consultant. All rights reserved. Use is subject to license terms.