What is a Data Resource

A data resource is any digital resource that contains recorded observations, derived quantities or interpretations. There are three types of data resources: Catalog, NumericalData and DisplayData. The information needed to describe each type is very similar. At a minimum, a data resource description associates a resource identifier with the resource, provides an overview and conveys how to access the resource. A data resource description can also provide details about the parameters contained in the resource and may include hints as to how to render the data.

The general model of a data resource is a set of one or more files containing data that is obtained from a common source and has a uniform structure. The data resource description applies to the collection as a whole; the individual files (units of data) in the collection are described as Granules. Each Granule is associated with the data resource description and inherits all of its attributes.

For example, suppose there is data from a magnetometer on an observatory called "MagProbe-1". Data was collected for a full year, with the data collected each day stored in a separate file. In this case there would be one NumericalData description for the entire collection and 365 Granule descriptions, one for each file.

A web-based resource description editor is available at http://spase-group.org/tools/editor/editor.jsp

Basic Data Description

Using the "MagProbe-1" example, the basic NumericalData resource description would be:

<Spase>
   <NumericalData>
      <ResourceID>spase://Example/NumericalData/MagProbe-1/Mag/PT1D</ResourceID>
      <ResourceHeader>
         <ResourceName>MagProbe-1 Magnetometer Data</ResourceName>
         <ReleaseDate>2011-03-16T00:00:00</ReleaseDate>
         <Description>Data obtained from the MagProbe-1 Magnetometer for the full year</Description>
         <Contact>
            <PersonID>spase://Example/Person/P.I.Researcher</PersonID>
            <Role>PrincipalInvestigator</Role>
         </Contact>
      </ResourceHeader>
      <AccessInformation>
         <AccessURL>
            <URL>http://magprobe.host.org/magprobe-1/mag/data</URL>
         </AccessURL>
      </AccessInformation>
      <MeasurementType>MagneticField</MeasurementType>
   </NumericalData>
</Spase>
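
A description is only usable if it is well-formed XML and carries the minimum information listed earlier: an identifier, an overview and an access location. The following Python sketch checks a description for those elements. The helper name `missing_elements` and the list `REQUIRED_PATHS` are hypothetical and not part of any SPASE tool, and the sketch assumes the description has no XML namespace, as in the examples on this page. It is not a substitute for validation against the SPASE schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of the minimum elements discussed in this tutorial.
# It is an illustration, not the set of elements the SPASE schema requires.
REQUIRED_PATHS = [
    "NumericalData/ResourceID",
    "NumericalData/ResourceHeader/ResourceName",
    "NumericalData/ResourceHeader/Description",
    "NumericalData/AccessInformation/AccessURL/URL",
]


def missing_elements(xml_text):
    """Return the required paths that are absent or empty in xml_text."""
    # ET.fromstring raises ET.ParseError if the tags are not balanced.
    root = ET.fromstring(xml_text)
    missing = []
    for path in REQUIRED_PATHS:
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            missing.append(path)
    return missing


EXAMPLE = """<Spase>
   <NumericalData>
      <ResourceID>spase://Example/NumericalData/MagProbe-1/Mag/PT1D</ResourceID>
      <ResourceHeader>
         <ResourceName>MagProbe-1 Magnetometer Data</ResourceName>
         <Description>Data obtained from the MagProbe-1 Magnetometer</Description>
      </ResourceHeader>
      <AccessInformation>
         <AccessURL>
            <URL>http://magprobe.host.org/magprobe-1/mag/data</URL>
         </AccessURL>
      </AccessInformation>
   </NumericalData>
</Spase>"""

# An empty list means every element in REQUIRED_PATHS was found.
print(missing_elements(EXAMPLE))
```

A description with an unclosed tag fails at the parsing step, before any element check runs, which makes this a quick first test for hand-written descriptions.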
        	

Basic Granule Description

A Granule description for a file might be:

<Spase>
   <Granule>
      <ResourceID>spase://Example/NumericalData/MagProbe-1/Mag/PT1D/2011001</ResourceID>
      <ReleaseDate>2011-03-16T00:00:00</ReleaseDate>
      <ParentID>spase://Example/NumericalData/MagProbe-1/Mag/PT1D</ParentID>
      <StartDate>2011-01-01T00:00:00</StartDate>
      <StopDate>2011-01-01T24:00:00</StopDate>
      <Source>
         <SourceType>Data</SourceType>
         <URL>http://magprobe.host.org/magprobe-1/mag/data/2011001.dat</URL>
      </Source>
   </Granule>
</Spase>

Note: The ParentID refers to the NumericalData resource description for the collection.
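
Because every Granule in the collection follows the same pattern, the 365 descriptions for the "MagProbe-1" example could be generated rather than written by hand. The Python sketch below is one way to do that. It assumes that each file is named by year and day of year (as in 2011001.dat) and that every Granule shares the same release date; the function names `granule_for` and `granules_for_year` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

PARENT_ID = "spase://Example/NumericalData/MagProbe-1/Mag/PT1D"
BASE_URL = "http://magprobe.host.org/magprobe-1/mag/data"


def granule_for(day, release="2011-03-16T00:00:00"):
    """Build one Granule description for the file covering `day`."""
    # Year plus zero-padded day of year, following the 2011001 pattern
    # of the example (an assumed file naming convention).
    stamp = day.strftime("%Y%j")
    spase = ET.Element("Spase")
    granule = ET.SubElement(spase, "Granule")
    ET.SubElement(granule, "ResourceID").text = f"{PARENT_ID}/{stamp}"
    ET.SubElement(granule, "ReleaseDate").text = release
    ET.SubElement(granule, "ParentID").text = PARENT_ID
    # Start and stop times use the same form as the example Granule.
    ET.SubElement(granule, "StartDate").text = f"{day.isoformat()}T00:00:00"
    ET.SubElement(granule, "StopDate").text = f"{day.isoformat()}T24:00:00"
    source = ET.SubElement(granule, "Source")
    ET.SubElement(source, "SourceType").text = "Data"
    ET.SubElement(source, "URL").text = f"{BASE_URL}/{stamp}.dat"
    return spase


def granules_for_year(year):
    """Yield one Granule description per calendar day of `year`."""
    day = date(year, 1, 1)
    while day.year == year:
        yield granule_for(day)
        day += timedelta(days=1)


granules = list(granules_for_year(2011))
print(len(granules))
print(ET.tostring(granules[0], encoding="unicode"))
```

Each generated Granule carries the same ParentID, which is what ties the individual files back to the single NumericalData description.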

A More Useful Data Description

While the basic data description enables the discovery and retrieval of a data resource, it does not provide any information about the contents of the data granules. This can be provided by describing the parameters stored in the data granules. In SPASE a Parameter is a named data series within the data granules. A Parameter may be scalar or multi-dimensional. The SPASE model of data storage is that the values of a parameter can be retrieved (using access software) when given a name or ParameterKey. This is true for most self-documenting data storage formats. However, data stored as plain text or comma-separated values may not be self-documenting. For this case, SPASE has defined a method of expressing a ParameterKey that allows direct interpretation of the plain text file (see http://spase-group.org/docs/conventions).

A basic Parameter description for a magnetic field vector (x, y, z) with a ParameterKey of "Field" is:

<Parameter>
   <Name>Magnetic Field</Name>
   <ParameterKey>Field</ParameterKey>
   <Field>
      <FieldQuantity>Magnetic</FieldQuantity>
   </Field>
</Parameter>

If each record in the data files consists of a time stamp and a magnetic field vector, then the complete NumericalData resource description for the MagProbe-1 example would be:

<Spase>
   <NumericalData>
      <ResourceID>spase://Example/NumericalData/MagProbe-1/Mag/PT1D</ResourceID>
      <ResourceHeader>
         <ResourceName>MagProbe-1 Magnetometer Data</ResourceName>
         <ReleaseDate>2011-03-16T00:00:00</ReleaseDate>
         <Description>Data obtained from the MagProbe-1 Magnetometer for the full year</Description>
         <Contact>
            <PersonID>spase://Example/Person/P.I.Researcher</PersonID>
            <Role>PrincipalInvestigator</Role>
         </Contact>
      </ResourceHeader>
      <AccessInformation>
         <AccessURL>
            <URL>http://magprobe.host.org/magprobe-1/mag/data</URL>
         </AccessURL>
      </AccessInformation>
      <MeasurementType>MagneticField</MeasurementType>
      <Parameter>
         <Name>Time</Name>
         <ParameterKey>Time</ParameterKey>
         <Support>
            <SupportQuantity>Temporal</SupportQuantity>
         </Support>
      </Parameter>
      <Parameter>
         <Name>Magnetic Field</Name>
         <ParameterKey>Field</ParameterKey>
         <Field>
            <FieldQuantity>Magnetic</FieldQuantity>
         </Field>
      </Parameter>
   </NumericalData>
</Spase>
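
Since a ParameterKey is the name by which access software retrieves values, a client reading a description would typically build a lookup from ParameterKey to the parameter's details. The Python sketch below shows one possible lookup using only the Parameter elements from the MagProbe-1 example. The function name `parameters_by_key` is hypothetical, and the sketch again assumes a description without an XML namespace.

```python
import xml.etree.ElementTree as ET

# Only the Parameter elements of the MagProbe-1 description are kept here.
DESCRIPTION = """<Spase>
   <NumericalData>
      <Parameter>
         <Name>Time</Name>
         <ParameterKey>Time</ParameterKey>
         <Support>
            <SupportQuantity>Temporal</SupportQuantity>
         </Support>
      </Parameter>
      <Parameter>
         <Name>Magnetic Field</Name>
         <ParameterKey>Field</ParameterKey>
         <Field>
            <FieldQuantity>Magnetic</FieldQuantity>
         </Field>
      </Parameter>
   </NumericalData>
</Spase>"""


def parameters_by_key(xml_text):
    """Map each ParameterKey in the description to its parameter Name."""
    root = ET.fromstring(xml_text)
    index = {}
    for param in root.iter("Parameter"):
        key = param.findtext("ParameterKey")
        index[key] = param.findtext("Name")
    return index


index = parameters_by_key(DESCRIPTION)
print(index["Field"])
```

Note that the lookup is case-sensitive, so the key used by access software has to match the ParameterKey in the description exactly.
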
Much more detail can be provided in a resource description. See the Data Model reference for a complete list.
