Datasets

Gaia Sky supports the loading and visualization of datasets of different nature.

Preparing datasets

Gaia Sky supports the loading of data in VOTable, CSV, ASCII, etc. using the STIL library. To ensure the datasets are loaded correctly, some preparation might be needed in the form of UCDs and/or column names and units. The following sections describe the expected UCDs and column names for the different data types and units.

Object IDs

Columns with the UCD meta.id are recognized as generic identifiers. Otherwise, the actual matching is done by column name. The following are recognized:

id – generic ID
hip – HIP number
source_id – Gaia source ID

Object names

Names are taken from the columns name, proper, proper_name, common_name and designation.

Since 3.0.2

Also, the regular expression "(name|NAME|refname|REFNAME)((_|-)[\\w\\d]+)?" is matched against column names to find names. This matches anything which starts with name or NAME or refname plus an optional suffix starting with a hyphen or an underscore.

The loader supports multiple names in a single value. The connecting character used is |, so that if multiple names are to be loaded, they must be in a column with one of the above names and the format name-1|name-2|...|name-n.

Positions

For the positional data, Gaia Sky will look for spherical and cartesian coordinates. In the case of spherical coordinates, the following are UCDs supported:

Equatorial: pos.eq.ra, pos.eq.dec
Galactic: pos.galactic.lon, pos.galactic.lat
Ecliptic: pos.ecliptic.lon, pos.ecliptic.lat

If UCDs are not possible (i.e. CSV format), the sky positions should be given in the equatorial system and have the following column names and units:

Right ascension: ra, right_ascension, alpha in degrees
Declination: dec, de, declination, delta in degrees

To work out the distance, it looks for the UCDs pos.parallax and pos.distance. If either of those are found, they are used. If no UCDs are to be found, the column names plx, parallax, pllx and par are accepted. If there are no parallaxes, the default parallax of 0.04 mas is used.

With respect to cartesian coordinates, it recognizes the UCDs pos.cartesian.x|y|z, and they are interpreted in the equatorial system by default.

Proper motions and radial velocities

Proper motions are supported using only the UCDs pm.eq.ra and pm.eq.dec. Otherwise, the following column names are checked, assuming the units to be in mas/yr.

RA: pmra, pmalpha, pm_ra
DEC: pmdec, pmdelta, pm_dec, pm_de

Radial velocities are supported through the UCD dopplerVeloc and through the column names radvel and radial_velocity.

Magnitudes

Magnitudes are supported using the phot.mag or phot.mag;stat.mean UCDs. Otherwise, they are discovered using the column names mag, bmag, gmag, phot_g_mean_mag. If no magnitudes are found, the default value of 15 is used.

Apparent magnitudes are converted to absolute magnitudes with:

\[M = m - 5 log_{10}(d_{pc}) + 5\]

where \(M\) is the absolute magnitude and \(m\) is the apparent magnitude. \(d_{pc}\) is the distance to the star in parsecs.

The absolute magnitude is then converted to a pseudo-size with an algorithm that converts first to a luminosity, and then adjusts the size with an experimental calibration.

Colors

Colors are discovered using the phot.color UCD. If not present, the column names b_v, v_i, bp_rp, bp_g and g_rp are used, if present. If no color is discovered at all, the default value of 0.656 is used as the color index.

The conversion from color index to RGB is done by converting the XP (BP-RP) color index to \(T_{eff}\), and then the \(T_{eff}\) to RGB, using the xp_to_teff() and teff_to_rgb() methods implemented here.

Variability

Variable stars are loaded if light curves (magnitude vs time) and periods are found in the column list. The magnitude list, time list and period are looked up using their column names:

Magnitude list: A list of [mag] is expected under g_transit_mag, g_mag_list, g_mag_series
Time list: A list of Julian dates (offset from J2010, i.e. \(t=JD-2455197.5\)) under g_transit_time, time_list, time_series
Period: A period in Julian days under pf, period

Only variable stars with a period will be loaded. The rest will be skipped.

Other columns

All the columns which do not fit in the abovementioned categories are loaded as extra attributes. These attributes can be used for filtering and color mapping the dataset.

Right now, additional physical quantities (mass, flux, effective temperature (\(T_{eff}\)), radius, etc.) fall into the ‘other columns’ category and are also loaded as extra attributes.

Loading datasets

Datasets can be loaded into Gaia Sky by three different means:

Via SAMP (see this section).
Via scripting (see this section).
Directly using the UI. See the next paragraph.

Since 2.2.5

In order to load a dataset, click on the folder icon in the controls window or press ctrl + o and choose the file you want to load. Supported formats are .csv (Comma-Separated Values), .vot (VOTable) and FITS (as of 3.0.2). Once the dataset has been chosen, a new window is presented to the user asking the type of the dataset and some extra options associated with it. This window is also presented when loading a dataset via SAMP.

Hint

As of version 3.0.2, Gaia Sky supports interactive loading of FITS files.

Datasets can be star datasets, particle datasets or star cluster datasets, depending on whether the new dataset contains stars (with magnitudes, colors, proper motions and whatnot), just particles (only 2D or 3D positions and extra attributes), or cluster properties like the visual radius.

Please, see the Preparing datasets for more information on how to prepare the datasets for Gaia Sky.

Star datasets

Star datasets are expected to contain some attributes of stars, like magnitudes, color indices, proper motions, etc., and use the regular star shaders to render the stars. When selecting star datasets, there are a couple of settings available:

Dataset name – the name of the dataset.
Magnitude scale factor – subtractive scaling factor to apply to the magnitude of all stars (appmag = appmag - factor).
Label color – the color of the labels of the stars in the dataset.
Fade in – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade in the whole dataset. The dataset will not be visible if the camera distance from the Sun is smaller than the lower limit, and it will be fully visible if the camera distance from the Sun is larger than the upper limit. The opacity is interpolated between 0 and 1 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.
Fade out – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade out the whole dataset. The dataset will not be visible if the camera distance from the Sun is larger than the upper limit, and it will be fully visible if the camera distance from the Sun is smaller than the lower limit. The opacity is interpolated between 1 and 0 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.

_images/ds-stars.jpg — Star dataset loading options

Particle datasets

Particle datasets only require positions to be present, and use generic shaders to render the particles. Some parameters can be tweaked at load time to control the appearance and visibility of the particles:

Dataset name – the name of the dataset.
Profile decay – a power that controls the radial profile of the actual particles, as in (1-d)^pow, where d is the distance from the center to the edge of the particle, in [0,1].
Particle color – the color of the particles. Can be modified with the particle color noise.
Particle color noise – a value in [0,1] that controls the amount of noise to apply to the particle colors in order to get slightly different colors for each particle in the dataset.
Label color – color of the label of this dataset. Particles themselves do not have individual labels.
Particle size – the size of the particles, in pixels.
Component type – the component type to apply to the particles to control their visibility. Make sure that the chosen component type is enabled in the Visibility pane.
Fade in – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade in the whole dataset. The dataset will not be visible if the camera distance from the Sun is smaller than the lower limit, and it will be fully visible if the camera distance from the Sun is larger than the upper limit. The opacity is interpolated between 0 and 1 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.
Fade out – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade out the whole dataset. The dataset will not be visible if the camera distance from the Sun is larger than the upper limit, and it will be fully visible if the camera distance from the Sun is smaller than the lower limit. The opacity is interpolated between 1 and 0 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.

_images/ds-particles.jpg — Particles dataset

Star cluster datasets

Star cluster catalogs can also be loaded directly from the UI as of Gaia Sky 2.2.6. The loader also uses STIL to load CSV or VOTable files. In CSV mode the units are fixed, otherwise they are read from the VOTable, if it has them. The order of the columns is not important. The required columns are the following:

name, proper, proper_name, common_name, designation – one or more name strings, separated by ‘|’.
ra, alpha, right_ascension – right ascension in degrees.
dec, delta, de, declination – declination in degrees.
dist, distance – distance to the cluster in parsecs, or
pllx, parallax – parallax in mas, if distance is not provided.
rcluster, radius – the radius of the cluster in degrees.

Optional columns, which default to zero, include:

pmra, mualpha, pm_ra – proper motion in right ascension, in mas/yr.
pmdec, mudelta, pm_dec – proper motion in declination, in mas/yr.
rv, radvel, radial_velocity – radial velocity in km/s.

Star cluster datasets require positions and radii to be present, and use wireframe spheres to render the clusters. The parameters that can be tweaked at load time are:

Dataset name – the name of the dataset.
Particle color – the color of the clusters and their labels.
Label color – color of the label of this dataset. Particles themselves do not have individual labels.
Component type – the component type to apply to the particles to control their visibility. Make sure that the chosen component type is enabled in the Visibility pane.
Fade in – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade in the whole dataset. The dataset will not be visible if the camera distance from the Sun is smaller than the lower limit, and it will be fully visible if the camera distance from the Sun is larger than the upper limit. The opacity is interpolated between 0 and 1 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.
Fade out – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade out the whole dataset. The dataset will not be visible if the camera distance from the Sun is larger than the upper limit, and it will be fully visible if the camera distance from the Sun is smaller than the lower limit. The opacity is interpolated between 1 and 0 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.

_images/ds-clusters.jpg — Star clusters dataset

Variable star datasets

Variable stars are represented in Gaia Sky by displaying the changing magnitude visually in the scene when time is on. These datasets are expected to contain a time series (magnitudes vs times) and a period. Only variable stars with a period are loaded, the rest are discarded.

See the STIL data provider section for more information on how to prepare variable star datasets for Gaia Sky.

Dataset name – the name of the dataset.
Magnitude scale factor – subtractive scaling factor to apply to the magnitude of all stars (appmag = appmag - factor).
Label color – the color of the labels of the stars in the dataset.
Fade in – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade in the whole dataset. The dataset will not be visible if the camera distance from the Sun is smaller than the lower limit, and it will be fully visible if the camera distance from the Sun is larger than the upper limit. The opacity is interpolated between 0 and 1 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.
Fade out – these are two distances from the Sun, in parsecs, that will be used as interpolation limits to fade out the whole dataset. The dataset will not be visible if the camera distance from the Sun is larger than the upper limit, and it will be fully visible if the camera distance from the Sun is smaller than the lower limit. The opacity is interpolated between 1 and 0 if the camera distance from the Sun is larger than the lower limit and smaller than the upper limit.

The process by which light curves are loaded and used in Gaia Sky is a bit involved and outlined below:

First, we check that time series (magnitudes v times) and periods are actually present in the file.
Then, NaN values are removed from the light curve data points.
We fold the time series into a phase diagram using the period and sort the result accordingly with the phase for each data point.
Due to a GPU memory trade-off (the time series data must be sent to the GPU for each star, and all stars must have the same in-memory size in the GPU), we have a limitation of 20 data points per star. If the number of incoming data points is larger than 20, we re-sample the phase diagram.
Finally, the magnitudes are converted to pseudo-sizes for easier representation, and passed on to the model.

Managing datasets

_images/dataset-controls.png — Dataset entry in the control panel

All datasets currently loaded are displayed in the Datasets pane of the control panel. A few actions can be carried out directly from that pane:

– toggle the visibility of the dataset
– open the dataset preferences window
– delete (unload) dataset
– highlight the dataset using the current color and particle size. The color can be changed by clicking on the icon just above of this icon (blue square in the image), and the particle size factor can be adjusted in the dataset preferences window.

Dataset highlight color

Above the highlight-s-off icon is the color icon. Use it to define the highlight color for the dataset. The color can either be a plain color or a color map.

A plain color can be chosen using the color picker dialog that appears when clicking on the “Plain color” radio button.

_images/colorpicker-color.png — The plain color picker dialog

The “Color map” radio button displays the screen shown below. From there, you can choose the color map type, as well as the attribute to use for the mapping and the maximum and minimum mapping values.

The available attributes depend on the dataset type and loading method. Particle datasets have coordinate attributes (right ascension, declination, ecliptic longitude and latitude, galactic longitude and latitude) and distance distance. Star datasets have, additionally, apparent and absolute magnitudes, proper motions (in alpha and delta) and radial velocity. For all datasets loaded from VOTable either directly or through SAMP, all the numeric attributes are also available

_images/colorpicker-cmap.png — The color map dialog

Dataset preferences

Clicking on the preferences icon , a few extra settings can be adjusted:

Size increase factor - scale factor to apply to the particles when the dataset is highlighted.
Make all particles visible - raises the minimum opacity to a non-zero value when the dataset is highlighted.
Filters - allows for the creation of arbitrary selection filters by setting conditions (rules) on the particle attributes. Several rules can be defined, but only one type of logical operator (AND, OR) is possible. The available attributes depend on the dataset and are covered in the dataset color settings