In general, data access refers to activities related to storing, retrieving or acting on data housed in a repository.
For TAO, the Data Access feature deals with provisioning the data (either EO products or ancillary data) that meets the criteria required for workflow execution.
Two categories of data sources can be distinguished: local and remote.
Local Data Access
A local data product is simply a descriptor for one or more files (for example, a GeoTIFF data product can be a single raster file, while a Sentinel-2 data product is a structured collection of raster and metadata files). Consequently, a local data source is responsible for providing local data to the processing components of the TAO platform that request it.

Even though the local data source (comprising the local database and file system) is not an external interface, it is important to lay out its organization before describing the external interfaces of the system.
As the accompanying figure shows, the local data source consists of a product database, in which the metadata associated with concrete data products is stored, and a file system local to the platform, in which the data products are physically stored.
The product database stores basic metadata about the EO products, such as acquisition date, geographical footprint, product type, etc.
This metadata allows the users to query the local data source for products satisfying certain criteria, and is created when either products are initially imported or new products are downloaded from remote data sources.
The file system is organized such that an easier distinction can be made between public products (products that are visible/usable by all users) and private products (products visible/usable only by a specific user). This separation further allows the implementation of user quota management.
The visibility of the products is implemented at a logical level (i.e. not by physical operating system rights) in the database.
The file system structure depicted above is visible to (i.e. shared with) all the processing nodes in the TAO cluster. This is necessary in order to allow a uniform way of accessing products by processing components from remote nodes.
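The role of the product database described above can be illustrated with a minimal sketch. It assumes a hypothetical `ProductRecord` type and `query_products` function (neither is part of TAO), an in-memory list in place of a real database, and bounding boxes in place of real footprints. Visibility is decided from an `owner` field in the metadata rather than from file system permissions, mirroring the logical-level visibility described above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# All names here are hypothetical; this is not the TAO schema.
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


@dataclass(frozen=True)
class ProductRecord:
    name: str
    product_type: str
    acquisition_date: date
    footprint: BBox
    owner: Optional[str] = None  # None marks a public product


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_products(
    catalog: List[ProductRecord],
    user: str,
    product_type: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
    area: Optional[BBox] = None,
) -> List[ProductRecord]:
    """Return the products visible to `user` that satisfy the given criteria."""
    results = []
    for product in catalog:
        # Logical visibility: public products, or private ones owned by the user.
        if product.owner is not None and product.owner != user:
            continue
        if product_type is not None and product.product_type != product_type:
            continue
        if start is not None and product.acquisition_date < start:
            continue
        if end is not None and product.acquisition_date > end:
            continue
        if area is not None and not intersects(product.footprint, area):
            continue
        results.append(product)
    return results
```

In a real deployment the same filtering would more plausibly be expressed as a database query; the loop is used here only to keep the example self-contained.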
External Data Access
External data sources represent external repositories that are not under the direct control of the TAO platform (an example of such an external data source is the Copernicus Scientific Data Hub). They may implement different access interfaces and authentication/authorization schemes. This is transparent for the user of the TAO platform.
Ultimately, all required data becomes local, subject to the defined user quotas.
Despite the differences that may exist between repository access methods and the data formats provided, several common features can be identified:
- All data can be expressed in binary form;
- There are common operations (actions) that can be performed on any data source:
  - Authenticate/authorize the user connection to the data source;
  - Query the data source for data satisfying several criteria;
  - Retrieve the data in a binary form that can be interpreted by TAO.
The framework is able to abstract the data formats and expose them in a uniform way. In addition, it keeps the user (or a requesting component) unaware of the original location of the data (e.g. a remote repository). This is accomplished through an interface abstraction with querying capabilities.
This approach is illustrated in the following figure:

The two types of data sources (local and remote) share the same interface, so the location of the data is transparent to the TAO API. It is the implementation of the data source that takes care of properly connecting to the repository and of querying and retrieving the data products. The platform can thus seamlessly access data from local repositories, such as the Sentinel-1 and Sentinel-2 repositories found on the DIAS platforms (CreoDIAS, Mundi and Onda), but also from remote repositories, such as the USGS Landsat repository or the Alaska Satellite Facility.
The data source interface exposes operations (i.e. methods) that allow a client to:
- Connect to the remote repository;
- Authenticate to the remote repository;
- Create a query to be executed against the repository.
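The three operations listed above can be sketched as an abstract interface with interchangeable implementations. This is only an illustration under stated assumptions: the class and method names (`DataSource`, `Query`, `connect`, `authenticate`, `create_query`, `add_parameter`) are hypothetical and are not taken from the TAO API, and the "remote" implementation is an in-memory stub so that the example runs without network access.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List, Mapping


class Query:
    """Collects parameters and delegates execution to its data source."""

    def __init__(self, source: "DataSource") -> None:
        self._source = source
        self._params: Dict[str, Any] = {}

    def add_parameter(self, name: str, value: Any) -> "Query":
        self._params[name] = value
        return self

    def execute(self) -> List[str]:
        return self._source._search(dict(self._params))


class DataSource(ABC):
    """Hypothetical common interface shared by local and remote sources."""

    @abstractmethod
    def connect(self) -> None: ...

    @abstractmethod
    def authenticate(self, user: str, password: str) -> bool: ...

    def create_query(self) -> Query:
        return Query(self)

    @abstractmethod
    def _search(self, params: Mapping[str, Any]) -> List[str]: ...


class LocalDataSource(DataSource):
    def __init__(self, product_names: Iterable[str]) -> None:
        self._names = list(product_names)

    def connect(self) -> None:
        pass  # nothing to connect to in this sketch

    def authenticate(self, user: str, password: str) -> bool:
        return True  # assumption: local access needs no extra credentials

    def _search(self, params: Mapping[str, Any]) -> List[str]:
        prefix = params.get("name", "")
        return [n for n in self._names if n.startswith(prefix)]


class StubRemoteDataSource(DataSource):
    """Stands in for a remote repository; no network calls are made."""

    def __init__(self, product_names: Iterable[str], accounts: Mapping[str, str]) -> None:
        self._names = list(product_names)
        self._accounts = dict(accounts)
        self._connected = False
        self._authenticated = False

    def connect(self) -> None:
        self._connected = True

    def authenticate(self, user: str, password: str) -> bool:
        if not self._connected:
            raise RuntimeError("connect() must be called first")
        self._authenticated = self._accounts.get(user) == password
        return self._authenticated

    def _search(self, params: Mapping[str, Any]) -> List[str]:
        if not self._authenticated:
            raise PermissionError("not authenticated")
        prefix = params.get("name", "")
        return [n for n in self._names if n.startswith(prefix)]


def find_by_name(source: DataSource, user: str, password: str, prefix: str) -> List[str]:
    """Client code is identical regardless of where the data lives."""
    source.connect()
    if not source.authenticate(user, password):
        raise PermissionError("authentication failed")
    return source.create_query().add_parameter("name", prefix).execute()
```

Because `find_by_name` depends only on the abstract `DataSource`, the caller does not need to know whether the products come from a local archive or a remote repository.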
Given the diversity of EO products (and product providers), the parameters of a query may be bound to a specific repository. Nevertheless, there is a small subset of parameters that are supported by different providers, namely:
- The product name or identifier;
- The acquisition date (and time);
- The product footprint.
Besides these parameters, providers may expose additional ones (and even when a parameter conceptually represents the same measure, two providers may use different names for it). To cope with this diversity, the query accepts name-value collections of parameters, which are described for each data source plugin. This plugin-based mechanism permits adding (or describing) new data sources without any impact on (or change to) the object and relational model.
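One way to picture the name-value mechanism is a per-plugin descriptor that maps a common parameter name to the name a given provider expects. The sketch below is an assumption-laden illustration: the plugin keys (`provider_a`, `provider_b`), the provider-side parameter names, and the `build_request` function are all hypothetical and do not reflect the actual TAO plugin descriptors.

```python
from typing import Any, Dict, Mapping

# Hypothetical descriptors: common parameter name -> provider-specific name.
# A new data source is described by adding an entry; no model change is needed.
PLUGIN_PARAMETERS: Dict[str, Dict[str, str]] = {
    "provider_a": {
        "name": "identifier",
        "acquisition_date": "startDate",
        "footprint": "geometry",
        "cloud_cover": "cloudCover",  # supported only by this provider
    },
    "provider_b": {
        "name": "productId",
        "acquisition_date": "sensing_time",
        "footprint": "bbox",
    },
}


def build_request(plugin: str, params: Mapping[str, Any]) -> Dict[str, Any]:
    """Translate a name-value collection into the provider's own notation."""
    descriptor = PLUGIN_PARAMETERS[plugin]
    unknown = sorted(set(params) - set(descriptor))
    if unknown:
        raise ValueError(f"{plugin} does not support: {', '.join(unknown)}")
    return {descriptor[key]: value for key, value in params.items()}
```

For example, `build_request("provider_a", {"name": "X", "cloud_cover": 20})` yields a dictionary keyed by `identifier` and `cloudCover`, whereas the same call for `provider_b` is rejected because that descriptor declares no `cloud_cover` parameter.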
Currently, several data source plugins are available in the TAO platform, namely:
Plugin for | Supported Sensors | Remarks |
---|---|---|
Amazon Web Services | Sentinel-2, Landsat-8 | Supports also local archives |
EO-CAT | ALOS-1, Beijing-1, COSMO-SkyMed, CryoSat-2, ENVISAT, ERS-1, ERS-2, GeoEYE-1, GOCE, ICEYE, IKONOS-2, IRS-1, IRS-P6, JERS-1, KOMPSAT-2, Landsat-1, Landsat-2, Landsat-3, Landsat-4, Landsat-5, Landsat-7, Landsat-8, Metop, NigeriaSAT-1, NOAA POES, OceanSAT-2, PAZ, Planetscope, Pleiades-1, PROBA-1, QuickBird-2, RadarSAT-1, RapidEYE, SeaSAT, SkySAT, SMOS, SPOT 1, SPOT 2, SPOT 3, SPOT 4, SPOT 5, SPOT 6, SPOT 7, TerraSAR-X, UK-DMC-1, WorldView-1, WorldView-2, WorldView-3, WorldView-4 | |
PEPS | Sentinel-1, Sentinel-2 | |
USGS | EO-1, Landsat 1-5, Landsat 4-5, Landsat-7, Landsat-8, Landsat-9, SNPP, ECOSTRESS, MODIS | Supports also local archives |
FedEO | ADEOS-I, ADEOS-II, ALOS, ALOS-1, ALOS-2, AQUA, Beijing-1, COSMO-SkyMed, CryoSat-2, MetOP-A, MetOP-B, DEIMOS-1, ERS-1, ERS-2, ENVISAT, FORMOSAT-2, GCOM-C1, GCOM-W1, GeoEYE-1, GOCE, GOES, GPM, ICEYE, IKONOS, IKONOS-2, IRS, IRS-1, IRS-1D, IRS-P5, IRS-P6, JASON, JERS-1, KANOPUS-V1, KOMPSAT-2, Landsat-1, Landsat-2, Landsat-3, Landsat-4, Landsat-5, Landsat-7, Landsat-8, METEOR-3M, MOS-1, NigeriaSAT-1, NOAA, NOAA POES, OCEANSAT, OCEANSAT-2, PAZ, Planetscope, Pleiades, PROBA-1, QuickBird, QuickBird-2, RadarSAT-1, RapidEYE, SPOT 1-5, SeaSAT, SMOS, TRMM, SkySAT, SPOT, SPOT 1, SPOT 2, SPOT 3, SPOT 4, SPOT 5, SPOT 6, SPOT 7, TERRA, TerraSAR-X, UK-DMC-1, WorldView-1, WorldView-2, WorldView-3, WorldView-4 | Collections limited to ESA archive. Not all products may be retrieved |
Alaska Satellite Facility | Sentinel-1, ALOS, SMAP | |
CreoDIAS | Sentinel-1, Sentinel-2, Sentinel-3, Sentinel-5P, Landsat-8 | Supports also local archives |
CreoDIAS New | Sentinel-1, Sentinel-2, Sentinel-3, Sentinel-5P, Landsat-8 | |
Copernicus DataSpace | Sentinel-1, Sentinel-2, Sentinel-3, Sentinel-5P | |
EarthData | GEDI, IceSAT-2, SMAP | |
LSA Data Center | Sentinel-1, Sentinel-2 | |
THEIA | Sentinel-2, Landsat-8, Pleiades, Venus | |
All data sources can be queried through the TAO web interface, which dynamically adapts to the parameters of the respective data source. The query results can then be selectively downloaded and used as input for further processing.