New database entries


Edit the correct database, otherwise your changes will be lost! Changes must be made to the master database. This document assumes that the master database is held at the node. Ensure you edit the latest database on the master node; the command psql -l can be used to list the databases. It is recommended that you use pgadmin3 to edit the database.

Most database tables have an auto-generated primary key for their first column. The column name is normally the singular of the table's name with _id appended, e.g., for the stations table this is station_id. Do not enter anything in the first column; PostgreSQL will automatically generate the next unique key. Note that tables commonly reference other tables with a foreign key; you will be required to enter the correct value for foreign keys. If you are using pgadmin3 to edit the database, primary key columns are indicated by the string "[PK]" below the column name. You will need to refresh the table (F5) in order to see a newly created primary key, and you will not be able to edit a newly created row until you refresh the display.

So that foreign keys already exist when you need to enter them, it is suggested that you enter data in the order below. This example assumes you are adding a complete new project, with new stations and contacts. If this is not the case, refer to existing database entries where possible.

If you are unsure about the meaning, content or validity of data for a column, consult the column's comments field. In pgadmin3 this can be found from the columns subtree of the table in question: right-click the column name and select properties.

Add institutes

Contacts are linked to one institute (university, research centre, company, etc.); projects and channels can be linked to multiple institutes. Reuse an existing institute if possible, otherwise create one in the institute table. Note that the _ascii versions of the columns should use only standard ASCII characters; the non-ASCII columns can use all characters from the Unicode character set.

Add contacts

Each contact has an associated email address (required), URL (optional) and institute (via institute_id, a foreign key). If the contact is not already listed, create a new one; institute_id is the primary key from the institutes table. If for some reason you need a contact with multiple email addresses (avoid if at all possible) then a separate entry is needed for each address.

The email address should be entered with ROT-13 encryption. This cypher has the convenient property that encoding and decoding are the same action. If you cannot do this, enter the address as it should appear, then view the contacts page and copy the email address you see (which has then undergone ROT-13 encryption/decryption) back into the contacts table.

The purpose of the ROT-13 encryption/decryption is to make it harder for spammers to harvest email addresses. Correct display in the GAIA web pages always requires JavaScript.
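For editors who prefer to encode the address before entering it, the following is a minimal Python sketch using the standard library's rot_13 codec. The email address is a made-up example and the helper name rot13 is only illustrative; it is not part of GAIA.

```python
import codecs


def rot13(text: str) -> str:
    """Apply ROT-13. Only ASCII letters are rotated, so '@', '.' and digits are unchanged."""
    return codecs.encode(text, "rot_13")


encoded = rot13("user@example.org")  # hypothetical address
print(encoded)         # value to enter in the contacts table
print(rot13(encoded))  # applying ROT-13 again restores the original
```

Because encoding and decoding are the same operation, the same helper can be used to check an existing entry.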

Add project

Add the project name and abbreviation. Note that abbreviations must be unique and are offered on a first-come, first-served basis. Do not edit any abbreviations for established entries - it may break other services. Abbreviations must be entered in upper case without spaces.
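The format rule for abbreviations can be checked before entering a value. The sketch below is an illustration only: the function name is hypothetical, and it assumes that digits and punctuation are acceptable as long as no lower-case letters or whitespace are present. Uniqueness still has to be checked against the existing rows in the projects table.

```python
def is_valid_abbreviation(abbreviation: str) -> bool:
    """Return True if the abbreviation is non-empty, upper case and free of whitespace."""
    if not abbreviation:
        return False
    has_whitespace = any(ch.isspace() for ch in abbreviation)
    # Comparing with upper() rejects any lower-case letter.
    return abbreviation == abbreviation.upper() and not has_whitespace


for candidate in ("ABC", "abc", "A BC"):
    print(candidate, is_valid_abbreviation(candidate))
```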

Add ancillary data for project (contacts, institutes, PIs and URLs)

Ancillary data associated with the project are entered into the tables project_contacts, project_institutes, project_pis, and project_urls. Each project can then be linked with none, one or many contacts, institutes, PIs and URLs. The order in which each is displayed is given by the priority column, 1 being the first to be displayed. If there is only one item it is not necessary to enter a value in the priority column. In each of these tables project_id is a foreign key and must be entered; select the correct project_id value from the projects table.

At this point check that your project details are correct by viewing the projects page. Correct any mistakes. Stations and data channels will be added next.

Tip: add # and the abbreviation in lower case to the projects page URL and you will be taken directly to your project's information.

Add node

If the data is to be hosted on a new node, add the details now. If the node is located in a new country you may need to create a new entry in the node_icons table and to upload a new flag. Important: ensure that the icons you upload can be freely reused.

Enter the approximate latitude and longitude for the node. GAIA doesn't use geolocation data to find the location of the user; instead it takes the location of the node that the user has chosen. Distributed datasets are accessed in order of closeness to the selected node.
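To illustrate why only approximate coordinates are needed, the sketch below orders nodes by distance from the selected node. It assumes great-circle (haversine) distance on a spherical Earth; how GAIA actually measures closeness is not stated here, and the node names and coordinates are invented.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def nodes_by_closeness(selected, nodes):
    """Sort (name, latitude, longitude) tuples by distance from the selected node."""
    _, lat, lon = selected
    return sorted(nodes, key=lambda n: great_circle_km(lat, lon, n[1], n[2]))


# Hypothetical nodes: (name, latitude, longitude)
selected = ("node_a", 54.0, -3.0)
others = [("node_c", 51.0, -114.0), ("node_b", 69.0, 19.0)]
print([name for name, _, _ in nodes_by_closeness(selected, others)])
```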

The hostname should take the form and a matching entry will need to be created in the GAIA domain. Use of extra dots (e.g., calgary.canada) will break the notion of what the correct domain is for JavaScript pages.

Check channel_type

Check that the channel types you wish to add are already in the channel_types table. If not, you will need to add a new entry. The name column is a descriptive name and may contain spaces. Use a space between a value and its unit (e.g., 170 nm). is_greyscale is a flag indicating whether the images are greyscale, to which colour palettes may be applied, or are already colour. The description column contains an additional description. ref_name is the name by which the channel type is internally referenced, such as in URLs; it cannot contain spaces (use underscores if necessary), dots may be used, and it should be written in lower case letters. Finally, the instrument_id field defines the type of instrument which recorded the data. If necessary create a new entry in the instrument table. Please follow the existing style.
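The ref_name rules can be expressed as a simple pattern. This is a sketch under the assumption that digits are allowed alongside lower-case letters, underscores and dots; the function name and sample values are hypothetical, so compare against existing entries before relying on it.

```python
import re

# Lower-case letters, digits, underscores and dots only; no spaces.
REF_NAME_PATTERN = re.compile(r"[a-z0-9_.]+")


def is_valid_ref_name(ref_name: str) -> bool:
    """Return True if ref_name contains only the characters permitted above."""
    return REF_NAME_PATTERN.fullmatch(ref_name) is not None


for candidate in ("asc_630nm", "ASC 630 nm", "rio.wide"):
    print(candidate, is_valid_ref_name(candidate))
```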

Add channel

The first column of the channels table is a primary key and should be left blank - the next sequential number will be assigned automatically. Enter the start and end dates in ISO format (YYYY-MM-DD). If the instrument is still operational use 2020-01-01 as the end date. The sync field is a URL to the base of the dataset which other nodes can access to synchronise their copy of the summary images. (TODO: add a reference to where sync is explained.) channel_type_id indicates the type of data which is recorded. Choose this item carefully; it defines the type of instrument which recorded the data and whether the images are greyscale or not. Do not confuse similar channel types from different instruments, e.g., 630 nm images from an all-sky camera with 630 nm images from a meridian-scanning photometer.

The orientation of the images from a dataset is indicated by orientation_id; see ImageOrientation and select the correct entry from the orientation table.

If a dataset has summary data available then enter the path (relative to the project directory) as a strftime format specifier in summary_data. You may also need to set the data_look_name and data_look_value fields. See the column comment for summary_data for more information.

If summary data is not available leave the summary_data, data_look_name and data_look_value columns empty (NULL).
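As an illustration of how a strftime format specifier in summary_data expands to a path for a given day, here is a short Python sketch. The directory layout and file name are invented for the example; the column comment for summary_data remains the authoritative description.

```python
from datetime import date

# Hypothetical summary_data value, relative to the project directory.
summary_data = "summary/%Y/%m/%Y%m%d.cdf"


def summary_path(format_specifier: str, day: date) -> str:
    """Expand the strftime directives in the specifier for the given day."""
    return day.strftime(format_specifier)


print(summary_path(summary_data, date(2010, 6, 29)))
# -> summary/2010/06/20100629.cdf
```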

The format of the summary images is given in image_format. Valid choices are png and jpg. The preferred choice is png for greyscale images as the user is able to apply a colour palette. See also SummaryPlotRequirements.

zipped_thumbnails is a flag indicating whether the thumbnails are contained inside a ZIP file. Normally they should be, in order to make substantial savings in disk space.

attribution is a place where an acknowledgement to funding agencies, inclusion of NSF grant number etc, can be made.

has_keograms and has_thumbnails are flags indicating whether the dataset has keograms and thumbnails, respectively. Most datasets should have both. Meridian-scanning photometers have keograms but not images.

ref_frame_id indicates the frame of reference for the data. Ground-based instruments normally use 1 (earth). Data from spacecraft may use other reference frames.

default_palette_id indicates the default palette used for your dataset.

Some datasets (widebeam riometer data, for example) do not have keograms, but summary plots are generated on demand from CDF summary data files. data_variable_name indicates the name of the variable inside the CDF file.

Some datasets have lower cadence than the standard 1 minute used by GAIA. If this is the case the cadence column can be set to the appropriate value. Note that the cadence must be a multiple of 1 minute and must exactly divide into 1 day.
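The two cadence constraints can be checked with integer arithmetic. The sketch below assumes the cadence is expressed in whole seconds; the units actually used by the cadence column are not stated here, so check the column comment first. The function name is illustrative.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60


def is_valid_cadence(cadence_s: int) -> bool:
    """Return True if the cadence is a multiple of 1 minute and divides 1 day exactly."""
    if cadence_s <= 0:
        return False
    is_whole_minutes = cadence_s % SECONDS_PER_MINUTE == 0
    divides_day = SECONDS_PER_DAY % cadence_s == 0
    return is_whole_minutes and divides_day


for cadence in (60, 90, 300, 420):
    print(cadence, is_valid_cadence(cadence))
```

For example, 5 minutes (300 s) is acceptable, whereas 7 minutes (420 s) is a multiple of 1 minute but does not divide evenly into a day.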

For data channels which are operational, transfer_delay is the nominal delay in transferring data. The availability of any data within the transfer_delay is unknown; GAIA assumes it may be available. For older data the data_availability is checked to ascertain if the data is present.
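The decision described above can be sketched as follows. This is an illustration and not GAIA's implementation: the function name is hypothetical, recorded_available stands in for whatever the data_availability lookup returns, and treating a sample exactly transfer_delay old as "older" is an assumption.

```python
from datetime import datetime, timedelta


def may_be_available(sample_time: datetime, now: datetime,
                     transfer_delay: timedelta, recorded_available: bool) -> bool:
    """Return True if data for sample_time should be treated as possibly available."""
    if now - sample_time < transfer_delay:
        # Too recent to know: assume the data may be available.
        return True
    # Older data: rely on the recorded availability.
    return recorded_available


now = datetime(2010, 6, 29, 12, 0)
delay = timedelta(hours=2)
print(may_be_available(now - timedelta(hours=1), now, delay, False))
print(may_be_available(now - timedelta(hours=3), now, delay, False))
```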

sparse_data_max_cadence. Experimental. Please ignore.

Check the data entries

Check the data entries by revisiting the projects page.

Topic revision: r2 - 2010-06-29 - 10:42:56 - SteveMarple