THEMIS high resolution image data consist of grayscale frames with 256x256 16-bit pixels. They are typically acquired at a 3-second cadence, with all frames from each minute (usually 20) placed in a single PGM file. These PGM files are compressed using gzip to reduce storage requirements and to provide integrity checking. The central PostgreSQL database in Calgary provides a catalog of file and frame metadata.
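To make the file layout concrete, here is a minimal Python sketch that splits the contents of one file into frames. It assumes the frames are stored back to back as binary ("P5") PGM images with big-endian 16-bit samples, as in the Netpbm format; the function names are invented for this example and do not come from any THEMIS software.

```python
import gzip
import io
import struct


def _next_token(stream):
    """Return the next whitespace-delimited header token, skipping '#' comments."""
    token = b""
    while True:
        ch = stream.read(1)
        if not ch:
            return token              # end of data
        if ch == b"#" and not token:
            stream.readline()         # discard the rest of a comment line
        elif ch.isspace():
            if token:
                return token          # the delimiter itself is consumed
        else:
            token += ch


def read_frames(pgm_bytes):
    """Split concatenated binary PGM data into (width, height, pixels) tuples."""
    stream = io.BytesIO(pgm_bytes)
    frames = []
    while True:
        magic = _next_token(stream)
        if not magic:
            break                     # no more frames
        if magic != b"P5":
            raise ValueError("not a binary PGM frame: %r" % magic)
        width = int(_next_token(stream))
        height = int(_next_token(stream))
        maxval = int(_next_token(stream))
        count = width * height
        size = 2 if maxval > 255 else 1   # bytes per sample
        raster = stream.read(count * size)
        if len(raster) != count * size:
            raise ValueError("truncated frame")
        fmt = ">%dH" % count if size == 2 else "%dB" % count
        frames.append((width, height, struct.unpack(fmt, raster)))
    return frames


def read_gzipped_pgm(path):
    """Decompress a .pgm.gz file and return its frames."""
    with gzip.open(path, "rb") as handle:
        return read_frames(handle.read())
```

Reading the compressed stream to its end also lets the gzip module check the CRC-32 stored in the file trailer, which is the integrity check mentioned above.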

A set of RESTful web services will be implemented to provide access to the high resolution ("stream0") image data. These will be used as the basis for a THEMIS data browser but are also intended to meet the needs of GAIA. This page is intended to formalize and document the interface details.

An earlier implementation of RESTish data access is described at the bottom of this page.


  1. What is the "best" way of selecting sub-frames within a file? The existing interface uses a dot, eg. id/99.3 selects the 4th frame (index 3, zero-based) from the file with database ID 99. The potential for conflict with file name extensions is obvious. Other possibilities are
    1. "glob" eg. 99[3] -same as ImageMagick
    2. number eg. 99#3 -like URL fragment
    3. cgi eg. 99?frame

Comment: The #name URL fragment would normally be appended after the URL. However, a web browser may not be able to distinguish two URLs that differ only in the fragment, since in normal HTML terms they would be the same resource. Using a question mark would probably stop web caches from caching the image. Perhaps the image number can be appended as /n and retrieved by the script using path info. -- SteveMarple - 16 Feb 2007
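The trade-off between these selectors can be illustrated with a short Python sketch. The first part shows how a standard URL parser treats the `#` form: the frame number ends up in the fragment, which a browser does not send to the server. The second part parses the path-info alternative; the `/<id>/<frame>` layout and the name `split_selector` are assumptions made for this example, not an agreed interface.

```python
from urllib.parse import urlsplit


def split_selector(path_info):
    """Parse hypothetical path info such as '/99/3' into (file_id, frame)."""
    parts = [p for p in path_info.split("/") if p]
    if not parts or len(parts) > 2:
        raise ValueError("expected /<id> or /<id>/<frame>")
    file_id = int(parts[0])
    frame = int(parts[1]) if len(parts) == 2 else None  # None: no sub-frame given
    return file_id, frame


# With '#', everything after it is the fragment and never reaches the server.
url = urlsplit("http://themis-data/rest/stream0/http-img/dbid/99#3.png")
print(url.path)      # /rest/stream0/http-img/dbid/99
print(url.fragment)  # 3.png
```

With path info the frame number stays in the part of the URL that the server and any web cache see, so each frame has its own cacheable address.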

http-img (AKA data)

This "channel" returns a single image using HTTP. The image is determined by using one of the available unique identifiers (see below) to select a data file. Sub-frame selection is optional. No search or wild-card matching will be provided. Image format defaults to JPEG to minimize network traffic, but can be overridden. Additional processing and color mapping capabilities may be available.


Each file is assigned an identifier (ID) that is guaranteed to be unique within a particular instantiation of the database.

eg. <img src="http://themis-data/rest/stream0/http-img/dbid/99#3.png"> 

Comment: How many images will be stored? Will you need to use bigserial?


An MD5 checksum of each uncompressed file is stored in the database. This is primarily intended as a compact record to confirm file integrity. However, it can also be used as a universal identifier that is statistically unique. The primary drawback is that the 128-bit checksum must(?) be stored in PostgreSQL as a char(32) and searching is slower than using an int4.
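As an illustration, the checksum of the uncompressed contents can be computed with Python's standard library. This is only a sketch; the function names are invented here. It also shows the two representations of the same 128-bit value: 32 hexadecimal characters (what a char(32) column holds) and 16 raw bytes.

```python
import gzip
import hashlib


def md5_of_stream(handle, chunk_size=1 << 20):
    """Return an MD5 hash object for everything readable from a binary stream."""
    digest = hashlib.md5()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest


def md5_of_gzip_file(path):
    """Checksum the uncompressed contents of a .gz file."""
    with gzip.open(path, "rb") as handle:
        return md5_of_stream(handle)
```

Calling `hexdigest()` on the result gives the 32-character form used in the URLs on this page, while `digest()` gives the 16-byte binary form.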

eg. <img src="http://themis-data/rest/stream0/http-img/md5/bdd3cd1f940d7dbbc3b0dc402f6c3df9#3.png"> 


This is primarily intended for debugging, as it requires an intimate knowledge of file naming conventions.
eg. <img src="http://themis-data/rest/stream0/http-img/file/20050221_0232_ekat_themis01_full_1000ms.pgm.gz#3.png"> 


Join site UID, device UID, and ISO time with underscores. Very slow on the database side but nicer for humans.
eg. <img src="http://themis-data/rest/stream0/http-img/uuid/ekat_themis01_20050221T0232#3.png"> 
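A minimal Python sketch of building and splitting such an identifier follows. It assumes the time part is the compact ISO form with minute resolution seen in the example (`20050221T0232`) and that the device UID and time contain no underscores; `make_uuid` and `parse_uuid` are names made up for this illustration.

```python
from datetime import datetime

TIME_FORMAT = "%Y%m%dT%H%M"  # assumed from the example identifier


def make_uuid(site_uid, device_uid, start_time):
    """Join site UID, device UID, and ISO time with underscores."""
    return "_".join([site_uid, device_uid, start_time.strftime(TIME_FORMAT)])


def parse_uuid(uuid):
    """Split an identifier back into (site_uid, device_uid, datetime)."""
    site_uid, device_uid, stamp = uuid.rsplit("_", 2)
    return site_uid, device_uid, datetime.strptime(stamp, TIME_FORMAT)


print(make_uuid("ekat", "themis01", datetime(2005, 2, 21, 2, 32)))
# ekat_themis01_20050221T0232
```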

img-meta (AKA info)

Same as http-img, but returns image metadata in one of the following formats:
  • flat text (ASCII)
  • HTML (eg. unordered list)
  • XML
  • JSON
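The four output formats could be produced from a single metadata record along these lines. This is a sketch only: the record keys in the usage line are placeholders rather than the actual metadata fields, and the XML branch assumes every key is a valid XML element name.

```python
import json
from html import escape
from xml.etree import ElementTree as ET


def render_metadata(meta, fmt="text"):
    """Render a flat metadata dict as text, HTML, XML, or JSON."""
    if fmt == "text":
        return "\n".join("%s=%s" % (k, v) for k, v in meta.items())
    if fmt == "html":
        items = "".join(
            "<li>%s: %s</li>" % (escape(str(k)), escape(str(v)))
            for k, v in meta.items()
        )
        return "<ul>%s</ul>" % items
    if fmt == "xml":
        root = ET.Element("image")          # element name is an assumption
        for k, v in meta.items():
            ET.SubElement(root, k).text = str(v)
        return ET.tostring(root, encoding="unicode")
    if fmt == "json":
        return json.dumps(meta)
    raise ValueError("unsupported format: %r" % fmt)


print(render_metadata({"width": 256, "height": 256}, "html"))
```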

sql-list (AKA list)

Allow users to query the database for images matching time and site (location) constraints. Return zero or more matches as URLs to either http-img or img-meta in one of the following formats:
  • flat text
  • HTML
  • XML
  • JSON
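A rough Python sketch of the query step is shown below. It uses an in-memory SQLite database purely as a stand-in for PostgreSQL, and the table and column names (`files`, `site`, `start_time`) are hypothetical simplifications rather than the real schema. The point is the shape of the operation: a parameterized query on site and time range, with each match turned into an http-img URL.

```python
import sqlite3

BASE_URL = "http://themis-data/rest/stream0/http-img/dbid/"


def list_image_urls(conn, site, start, end):
    """Return http-img URLs for files at `site` with start <= time < end."""
    rows = conn.execute(
        "SELECT id FROM files "
        "WHERE site = ? AND start_time >= ? AND start_time < ? "
        "ORDER BY start_time",
        (site, start, end),                 # parameters are never pasted into SQL
    )
    return [BASE_URL + str(file_id) for (file_id,) in rows]


# Stand-in data; ISO-formatted strings compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, site TEXT, start_time TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [(98, "ekat", "2005-02-21T02:31"), (99, "ekat", "2005-02-21T02:32")],
)
print(list_image_urls(conn, "ekat", "2005-02-21T02:32", "2005-02-21T02:33"))
```

An empty list is a valid result, matching the "zero or more matches" wording above.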

Comment: The PHP_Element class GAIA is using supports all but flat text. Also support Matlab output. -- SteveMarple - 16 Feb 2007

SQL excerpts




  -- core values required for complete "registration"
  id         int4 PRIMARY KEY DEFAULT nextval('file_seq'),
  path       varchar(40) NOT NULL CONSTRAINT valid_path CHECK 
                   (path ~* '^[0-9]{4}/[0-9]{2}/[0-9]{2}/.+_.+/ut[0-9]{2}$'),
  name        varchar(64) NOT NULL CONSTRAINT valid_name CHECK
                   (name ~* '^[0-9]{8}_[0-9]{2,6}_.+_.+_.+\.(pgm|pnm)(\.gz)?$') UNIQUE,
  mtime           int4, --y2038 bug
  --timestamp without time zone NOT NULL,   --last modified
  -- remaining values may be NULL which means "not done yet"
  nbytes          int4 CHECK (nbytes>=0),   --number of bytes in uncompressed file
  nbytes_packed   int4 CHECK (nbytes_packed>=0),  --number of bytes after compression (null if not compressed)
  md5sum    char(32) CONSTRAINT valid_md5sum CHECK 
                   (md5sum ~* '^[0-9a-f]{32}$'), --should be UNIQUE but don't enforce here
--  nframes    int2 CHECK (nframes>=0),      --number of image frames in the file
--  mode_id   int2 REFERENCES modes(id)  



  file_id   int4 REFERENCES files(id),      --4 bytes
  mode_id   int2 REFERENCES modes(id),      --2 bytes
  timestamp   timestamp without time zone NOT NULL,   --8 bytes date/time with 0.01s resolution

  offset   int2[],               --2 bytes * nframes
  duration    int2[] CHECK(duration>=0),       --2 bytes * nframes

--  imager_id   int2 REFERENCES imagers(id),      --2 bytes, could get from mode_id
--  site_id   int2 REFERENCES sites(id),      --2 bytes, could get from imager_id
--  UNIQUE(file_id,number)   --how much does this slow things down?
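The CHECK constraints on path, name, and md5sum can be tried outside the database with Python's `re` module, which accepts the same syntax for these particular patterns. In this sketch the name pattern groups the extension as `(pgm|pnm)`, which is what the constraint appears to intend; the directory path used in the example is a guess that merely satisfies the pattern.

```python
import re

# re.IGNORECASE mirrors the case-insensitive ~* operator.
PATH_RE = re.compile(r"^[0-9]{4}/[0-9]{2}/[0-9]{2}/.+_.+/ut[0-9]{2}$", re.IGNORECASE)
NAME_RE = re.compile(r"^[0-9]{8}_[0-9]{2,6}_.+_.+_.+\.(pgm|pnm)(\.gz)?$", re.IGNORECASE)
MD5_RE = re.compile(r"^[0-9a-f]{32}$", re.IGNORECASE)


def is_valid_file_record(path, name, md5sum=None):
    """Apply the same checks as the constraints; md5sum may be None (NULL)."""
    if not PATH_RE.match(path) or not NAME_RE.match(name):
        return False
    return md5sum is None or bool(MD5_RE.match(md5sum))


print(is_valid_file_record(
    "2005/02/21/ekat_themis01/ut02",                      # hypothetical path
    "20050221_0232_ekat_themis01_full_1000ms.pgm.gz",
    "bdd3cd1f940d7dbbc3b0dc402f6c3df9",
))
```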

Proof of concept

My first attempt at providing web-based access to high-resolution THEMIS data was implemented last spring. Time requirements (on themis-data) are roughly:

  • 15ms to initialize the CGI script and access the database
  • 30ms to unzip the data file, extract a frame, convert to JPEG, and write to stdout


-- BrianJackel - 16 Feb 2007

Topic revision: r2 - 2007-02-16 - 19:09:09 - SteveMarple