OPENSLIDE-FORMATS(3) File Formats OPENSLIDE-FORMATS(3)

openslide-formats - Reference supported formats

openslide.background-color

The background color of the slide, given as an RGB hex triplet. This property is not always present.

openslide.bounds-height

The height of the rectangle bounding the non-empty region of the slide. This property is not always present.

openslide.bounds-width

The width of the rectangle bounding the non-empty region of the slide. This property is not always present.

openslide.bounds-x

The X coordinate of the rectangle bounding the non-empty region of the slide. This property is not always present.

openslide.bounds-y

The Y coordinate of the rectangle bounding the non-empty region of the slide. This property is not always present.

openslide.comment

A free-form text comment.

openslide.mpp-x

Microns per pixel in the X dimension of level 0. May not be present or accurate.

openslide.mpp-y

Microns per pixel in the Y dimension of level 0. May not be present or accurate.

openslide.objective-power

Magnification power of the objective. Often inaccurate; sometimes missing.

openslide.quickhash-1

A non-cryptographic hash of a subset of the slide data. It can be used to uniquely identify a particular virtual slide, but cannot be used to detect file corruption or modification.

openslide.vendor

The name of the vendor backend.

tiff.Artist

The contents of the TIFF Artist tag.

tiff.Copyright

The contents of the TIFF Copyright tag.

tiff.DateTime

The contents of the TIFF DateTime tag.

tiff.DocumentName

The contents of the TIFF DocumentName tag.

tiff.HostComputer

The contents of the TIFF HostComputer tag.

tiff.ImageDescription

The contents of the TIFF ImageDescription tag.

tiff.Make

The contents of the TIFF Make tag.

tiff.Model

The contents of the TIFF Model tag.

tiff.ResolutionUnit

The contents of the TIFF ResolutionUnit tag.

tiff.Software

The contents of the TIFF Software tag.

tiff.XPosition

The contents of the TIFF XPosition tag.

tiff.XResolution

The contents of the TIFF XResolution tag.

tiff.YPosition

The contents of the TIFF YPosition tag.

tiff.YResolution

The contents of the TIFF YResolution tag.

A list of vendor-specific properties can be found on the pages for each vendor format[1].

Format

single-file pyramidal tiled TIFF, with non-standard metadata and compression

File extensions

.svs, .tif

OpenSlide vendor backend

aperio

http://www.aperio.com/documents/api/Aperio_Digital_Slides_and_Third-party_data_interchange.pdf

Aperio slides are stored in single-file TIFF format. OpenSlide will detect a file as Aperio if:

1. The file is TIFF.

2. The initial image is tiled.

3. The ImageDescription tag starts with Aperio.

Relevant TIFF tags

Tag Description
ImageDescription Stores some important key-value pairs and other information, see below
Compression May be 33003 or 33005, which represent specific kinds of JPEG 2000 compression, see below

Extra data stored in ImageDescription

For tiled images, the ImageDescription tag contains some dimensional downsample information as well as what look like offsets. Additionally, vertical line-delimited key-value pairs are stored, in at least the full-resolution image. A key-value pair is equals-delimited. These key-values are stored as properties starting with “aperio.”. Currently, OpenSlide does not use any of the information present in these key-value fields.
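As a rough illustration of the key-value layout described above, the following Python sketch splits an ImageDescription string on vertical lines and equals signs. The function name parse_aperio_description and the sample string are made up for this example, and trimming whitespace around keys and values is an assumption rather than documented behavior.

```python
def parse_aperio_description(description):
    """Turn the key-value fields of an ImageDescription into "aperio." properties."""
    properties = {}
    # The first "|"-separated field holds the free-form header
    # (dimensions, offsets), so only the remaining fields are examined.
    for field in description.split("|")[1:]:
        key, separator, value = field.partition("=")
        if separator:  # skip fields that are not key=value pairs
            properties["aperio." + key.strip()] = value.strip()
    return properties


# Hypothetical sample, not taken from a real slide.
sample = "Aperio (hypothetical header) 240x240|AppMag = 20|MPP = 0.4990"
print(parse_aperio_description(sample))
```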

For stripped images, the ImageDescription tag may contain a name, followed by a carriage return. This is used for naming the associated images. The second image in the file does not have a name, though it is an associated image.

TIFF Image Directory Organization

http://www.aperio.com/documents/api/Aperio_Digital_Slides_and_Third-party_data_interchange.pdf page 14:

The first image in an SVS file is always the baseline image (full resolution). This image is always tiled, usually with a tile size of 240x240 pixels. The second image is always a thumbnail, typically with dimensions of about 1024x768 pixels. Unlike the other slide images, the thumbnail image is always stripped. Following the thumbnail there may be one or more intermediate “pyramid” images. These are always compressed with the same type of compression as the baseline image, and have a tiled organization with the same tile size.

Optionally at the end of an SVS file there may be a slide label image, which is a low resolution picture taken of the slide’s label, and/or a macro camera image, which is a low resolution picture taken of the entire slide. The label and macro images are always stripped.

Some Aperio files use compression type 33003 or 33005. Images using this compression need to be decoded as a JPEG 2000 codestream. For 33003: YCbCr format, possibly with a chroma subsampling of 4:2:2. For 33005: RGB format. Note that the TIFF file may not encode the colorspace or subsampling parameters in the PhotometricInterpretation field, nor the YCbCrSubsampling field, even though the TIFF standard seems to require this. The correct subsampling can be found in the JPEG 2000 codestream.

Associated Images

thumbnail

the second image in the file

label

optional, the name “label” is given in ImageDescription

macro

optional, the name “macro” is given in ImageDescription

Known Properties

All key-value data encoded in the ImageDescription TIFF field is represented as properties prefixed with “aperio.”.

openslide.mpp-x

normalized aperio.MPP

openslide.mpp-y

normalized aperio.MPP

openslide.objective-power

normalized aperio.AppMag

Test Data

http://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/

Format

multi-file JPEG/NGR with proprietary metadata and index file formats, and single-file TIFF-like format with proprietary metadata

File extensions

.vms, .vmu, .ndpi

OpenSlide vendor backend

hamamatsu

OpenSlide will detect a file as Hamamatsu if:

1. The file given is an INI-style text file.

2. It has a [Virtual Microscope Specimen] (VMS) or [Uncompressed Virtual Microscope Specimen] (VMU) group.

3. If VMS, there are at least 1 row and 1 column of JPEG images (NoJpegColumns and NoJpegRows).

or if:

1. The file has a TIFF directory structure.

2. TIFF tag 65420 is present.
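The INI-based conditions for VMS and VMU can be sketched in Python with the standard configparser module. This is only an approximation of the checks listed above: the function name classify_hamamatsu_index is hypothetical, the sample index text is invented, and the TIFF-based NDPI check is not covered.

```python
import configparser

VMS_GROUP = "Virtual Microscope Specimen"
VMU_GROUP = "Uncompressed Virtual Microscope Specimen"


def classify_hamamatsu_index(text):
    """Return "VMS", "VMU", or None for the text of a candidate index file."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names case-sensitive
    try:
        parser.read_string(text)
    except configparser.Error:
        return None  # not an INI-style file
    if parser.has_section(VMS_GROUP):
        group = parser[VMS_GROUP]
        try:
            columns = int(group.get("NoJpegColumns", "0"))
            rows = int(group.get("NoJpegRows", "0"))
        except ValueError:
            return None
        return "VMS" if columns >= 1 and rows >= 1 else None
    if parser.has_section(VMU_GROUP):
        return "VMU"
    return None


# Invented example content, for illustration only.
sample_vms = """[Virtual Microscope Specimen]
NoLayers=1
NoJpegColumns=2
NoJpegRows=1
ImageFile=slide.jpg
ImageFile(1,0)=slide(1,0).jpg
"""
print(classify_hamamatsu_index(sample_vms))
```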

The Hamamatsu format has three variants. VMS and VMU consist of an index file, 2 or more image files, and (in the case of VMS) an “optimisation” file. NDPI consists of a single TIFF-like file with some custom TIFF tags. VMS and NDPI contain JPEG images; VMU contains NGR images (a custom uncompressed format).

Multiple focal planes are ignored; only focal plane 0 is read.

JPEG does not allow for images larger than 65535 pixels on a side. In VMS, multiple JPEG files are used to encode large images. To avoid having many files, VMS uses close to maximum size (65K by 65K) JPEG files. NDPI, instead, stuffs large levels into a single JPEG and sets the overflowed width/height fields to 0.

Unfortunately, JPEG provides very poor support for random-access decoding of parts of a file. To get around this, JPEG restart markers are placed at regular intervals, and these offsets are specified in the optimisation file (in VMS) or in a TIFF tag (in NDPI). With restart markers identified, OpenSlide can treat JPEG as a tiled format, where the height is the height of an MCU row, and the width is the number of MCUs per row divided by the restart marker interval times the width of an MCU. (This often leads to oddly-shaped and inefficient tiles of 4096x8, for example.)

Unfortunately, the VMS optimisation file does not give the location of every restart marker, only the ones found at the beginning of an MCU row. It also seems that the file ends early, and does not give the location of the restart marker at the last MCU row of the last image file.

Thus, the optimisation file can only be taken as a hint, and cannot be trusted. The entire set of JPEG files must be scanned for restart markers in order to facilitate random access. OpenSlide does this lazily as needed, and also in a background thread that runs only when OpenSlide is otherwise idle.

The VMS map file is a lower-resolution version of the other images, and can be used to make a 2-level JPEG pyramid. JPEG also allows for lower-resolution decoding, so further pyramid levels are synthesized from each JPEG file.

The .vms file is the main index file for the VMS format. It is a Windows INI-style key-value pair file, with sections. Only keys in the Virtual Microscope Specimen group are read by OpenSlide.

Here are known keys from the file:

Key Description
NoLayers Number of layers, currently must be 1 to be accepted
NoJpegColumns Number of JPEG files across, given in ImageFile attributes
NoJpegRows Number of JPEG files down, given in ImageFile attributes
ImageFile Semantically equivalent to ImageFile(0,0,0), though not specified that way. The image in position (0,0,0) of the set of images
ImageFile(x,y) Semantically equivalent to ImageFile(0,x,y), though not specified that way. The image in position (0,x,y) of the set of images
ImageFile(z,x,y) Where x and y are non-negative integers. Both x and y cannot be 0. z is a positive integer. These are the images that make up the virtual slide, as a concatenation of JPEG images. x and y specify the location of each JPEG, z specifies the focal plane
MapFile A lower-resolution version of all the ImageFiles
OptimisationFile File specifying some of the restart marker offsets in each ImageFile
AuthCode Unknown
SourceLens Apparently the objective power
PhysicalWidth Width of the main image in nm
PhysicalHeight Height of the main image in nm
LayerSpacing Unknown
MacroImage Image file for the “macro” associated image
PhysicalMacroWidth Width of the macro image in nm
PhysicalMacroHeight Height of the macro image in nm, sometimes with a trailing semicolon
XOffsetFromSlideCentre Distance in X from the center of the entire slide (i.e., the macro image) to the center of the main image, in nm
YOffsetFromSlideCentre Distance in Y from the center of the entire slide to the center of the main image, in nm

The .vmu file is the main index file for the VMU format. Only keys in the Uncompressed Virtual Microscope Specimen group are read by OpenSlide.

Here are known keys from the file:

Key Description
NoLayers (see VMS above)
ImageFile (see VMS above)
ImageFile(x,y) (see VMS above)
ImageFile(z,x,y) (see VMS above)
MapFile (see VMS above)
MapScale Seems to be the downsample factor of the map
AuthCode (see VMS above)
SourceLens (see VMS above)
PixelWidth Width of the image in pixels
PixelHeight Height of the image in pixels
PhysicalWidth (see VMS above)
PhysicalHeight (see VMS above)
LayerSpacing (see VMS above)
LayerOffset Unknown
MacroImage (see VMS above)
PhysicalMacroWidth (see VMS above)
PhysicalMacroHeight (see VMS above)
XOffsetFromSlideCentre (see VMS above)
YOffsetFromSlideCentre (see VMS above)
Reference Unknown
BitsPerPixel Bits per pixel, currently expected to be 36
PixelOrder Currently expected to be RGB
Creator String describing the software creating this image
IlluminationMode Unknown
ExposureMultiplier Unknown, possibly the multiplier used to scale to 15 bits?
GainRed Unknown
GainGreen Unknown
GainBlue Unknown
FocalPlaneTolerance Unknown
NMP Unknown
MacroIllumination Unknown
FocusOffset Unknown
RefocusInterval Unknown
CubeName Unknown
HardwareModel Name of the hardware
HardwareSerial Serial number of the hardware
NoFocusPoints Unknown
FocusPoint0X Unknown
FocusPoint0Y Unknown
FocusPoint0Z Unknown
FocusPoint1X Unknown
FocusPoint1Y Unknown
FocusPoint1Z Unknown
FocusPoint2X Unknown
FocusPoint2Y Unknown
FocusPoint2Z Unknown
FocusPoint3X Unknown
FocusPoint3Y Unknown
FocusPoint3Z Unknown
NoBlobPoints Unknown
BlobPoint0Blob Unknown
BlobPoint0FocusPoint Unknown
BlobPoint1Blob Unknown
BlobPoint1FocusPoint Unknown
BlobPoint2Blob Unknown
BlobPoint2FocusPoint Unknown
BlobPoint3Blob Unknown
BlobPoint3FocusPoint Unknown
BlobMapWidth Unknown
BlobMapHeight Unknown

NDPI uses a TIFF-like structure, but libtiff cannot read the headers of an NDPI file. This is because NDPI specifies the RowsPerStrip as the height of the image; when libtiff multiplies this out to compute the strip size, the result typically overflows and libtiff refuses to open the file. Also, the TIFF tags are not stored in sorted order.

NDPI stores an image pyramid in TIFF directory entries. In some files, the lower-resolution pyramid levels contain no restart markers. The macro image, and sometimes an active-region map, seems to come last.

JPEG files in NDPI are not necessarily valid. If ImageWidth or ImageHeight exceeds the JPEG limit of 65535, then the width or height as stored in the JPEG file is 0. libjpeg will refuse to read the header of such a file, so the JPEG data stream must be altered when fed into libjpeg.

NDPI is based on the classic TIFF format, which does not support files larger than 4 GB. However, NDPI files can be larger than 4 GB. NDPI generally handles this by overflowing the corresponding TIFF fields, requiring the reader to guess the high-order bits. This affects TIFF Value Offsets with pointers to out-of-line values, as well as the value of the StripOffsets field. Some TIFF fields (e.g. Software) have the same Value Offset in every directory; for these, no concatenation of high-order bits is necessary. For the others (primarily field 65426) it seems reasonable to select high-order bits which place the value at the largest offset below the directory itself, since the TIFF directory is positioned after the data it points to. NDPI always stores next-directory offsets (in the TIFF header and at the end of each directory) as 64-bit quantities, even though TIFF specifies them as 32 bits; this is possible because the TIFF standard places them at the end of their parent data structures.
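The heuristic of choosing the largest offset below the directory can be sketched as follows. This is an illustrative Python version written from the description above, not code taken from OpenSlide; the function name guess_ndpi_offset is hypothetical.

```python
FOUR_GB = 1 << 32


def guess_ndpi_offset(directory_offset, low32):
    """Guess a full file offset from its low 32 bits.

    The high-order bits are chosen so that the result is the largest
    candidate lying below the directory that refers to it.
    """
    # Start with the directory's own high-order bits.
    candidate = (directory_offset & ~(FOUR_GB - 1)) | (low32 & (FOUR_GB - 1))
    # If that lands at or past the directory, step back by 4 GB,
    # unless doing so would make the offset negative.
    if candidate >= directory_offset and candidate >= FOUR_GB:
        candidate -= FOUR_GB
    return candidate


directory = 5 * FOUR_GB + 100
print(guess_ndpi_offset(directory, 50))
print(guess_ndpi_offset(directory, 200))
```

In the first call the value fits in the same 4 GB window as the directory; in the second it would land past the directory, so the previous window is used instead.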

It is not clear whether NDPI can support individual directories larger than 4 GB. Such files would require additional inferences for the StripByteCounts field, for Value Offsets that are identical across directories, and for the optimisation entries.

Here are the observed TIFF tags:

Tag Description
ImageWidth Width of the image
ImageHeight Height of the image
Make “Hamamatsu”
Model “NanoZoomer” or “C9600-12”, etc
XResolution Seemingly correct X resolution, when interpreted with ResolutionUnit
YResolution Seemingly correct Y resolution, when interpreted with ResolutionUnit
ResolutionUnit Seemingly correct resolution unit
Software “NDP.scan”, sometimes with a version number
StripOffsets The offset of the JPEG file for this layer
StripByteCounts The length of the JPEG file for this layer
65420 Always exists, always 1. File format version?
65421 SourceLens, correctly downsampled for each entry. -1 for macro image, -2 for a map of non-empty regions.
65422 XOffsetFromSlideCentre
65423 YOffsetFromSlideCentre
65424 Seemingly the Z offset from the center focal plane (in nm?)
65425 Unknown, always 0?
65426 Optimisation entries, as above
65427 Reference
65428 Unknown, AuthCode?
65430 Unknown, have seen 0.0
65433 Unknown, I have seen 1500 in this tag
65439 Unknown, perhaps some polygon ROI?
65440 Unknown, I have seen this: <0 0 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0 8 1 9 1 10 1 11 1 12 1 13 1 14 1 15 1 16 1 17>
65441 Unknown, always 0?
65442 Scanner serial number
65443 Unknown, have seen 0 or 16
65444 Unknown, always 80?
65445 Unknown, have seen 0, 2, 10
65446 Unknown, always 0?
65449 ASCII metadata block, key=value pairs, not always present
65455 Unknown, have seen 13
65456 Unknown, have seen 101
65457 Unknown, always 0?
65458 Unknown, always 0?

The optimisation file contains a list of little-endian values (32 bits wide, or possibly 64 or 320), each giving the file offset of the start of an MCU row. Each offset starts at a 40-byte alignment, and the last row (of the entire file, not of each image) seems to be missing. The offsets are all packed into 1 file, even with multiple images. The order of images is left-to-right, top-to-bottom.

The VMS map file is a standard JPEG file. Its restart markers (if any) are not included in the optimisation file. The VMU map file is in NGR format. This file can be used to provide a lower-resolution view of the slide.

These files are given by the VMS/VMU ImageFile keys. They are assumed to have a height which is a multiple of the MCU height. They are assumed to have a width which is a multiple of MCUs per row divided by the restart interval.

For VMS, these files are in JPEG, for VMU they are in NGR format.

The NGR file contains uncompressed 16-bit RGB data, with a small header. The files we have encountered start with GN, two more bytes, and then width, height, and column width in little endian 32-bit format. The column width must divide evenly into the width. Column width is important, since NGR files are generated in columns, where the first column comes first in the file, followed by subsequent columns. Columns are painted left-to-right.

At offset 24 is another 32-bit integer which gives the offset in the file to the start of the image data. The image data we have encountered is in 16-bit little endian format.
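A minimal sketch of reading the header fields described above, using Python's struct module. The function name parse_ngr_header is hypothetical, and treating the integers as unsigned is an assumption; the text above only says they are 32-bit little-endian values.

```python
import struct


def parse_ngr_header(data):
    """Extract the known fields from the first 28 bytes of an NGR file."""
    if len(data) < 28:
        raise ValueError("NGR header is truncated")
    if data[:2] != b"GN":
        raise ValueError("missing GN signature")
    # Bytes 2-3 are skipped; width, height and column width follow at offset 4.
    width, height, column_width = struct.unpack_from("<III", data, 4)
    # Offset 24 holds the position of the image data within the file.
    (data_offset,) = struct.unpack_from("<I", data, 24)
    if column_width == 0 or width % column_width != 0:
        raise ValueError("column width does not divide the image width")
    return {
        "width": width,
        "height": height,
        "column_width": column_width,
        "columns": width // column_width,
        "data_offset": data_offset,
    }
```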

Associated Images

macro

the image file given by the MacroImage value in the VMS/VMU file, or SourceLens of -1 in NDPI

Known Properties

All key-value data stored in the VMS/VMU file, and known tags from the NDPI file, are encoded as properties prefixed with “hamamatsu.”.

openslide.mpp-x

For VMS, calculated as hamamatsu.PhysicalWidth/(1000*openslide.level[0].width). For NDPI, calculated as 10000/tiff.XResolution, if tiff.ResolutionUnit is centimeter.

openslide.mpp-y

For VMS, calculated as hamamatsu.PhysicalHeight/(1000*openslide.level[0].height). For NDPI, calculated as 10000/tiff.YResolution, if tiff.ResolutionUnit is centimeter.

openslide.objective-power

normalized hamamatsu.SourceLens
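The two microns-per-pixel formulas above can be written out as a short Python sketch. The function names vms_mpp and ndpi_mpp and the numbers in the example are hypothetical; only the arithmetic comes from the property descriptions.

```python
def vms_mpp(physical_nm, level0_pixels):
    """VMS: physical size in nm divided by (1000 * pixel count) gives microns per pixel."""
    return physical_nm / (1000 * level0_pixels)


def ndpi_mpp(resolution, resolution_unit):
    """NDPI: 10000 / resolution, valid only when the unit is centimeter."""
    if resolution_unit != "centimeter" or resolution <= 0:
        return None  # the property cannot be derived
    return 10000 / resolution


# Invented values: a 22 mm wide scan that is 100000 pixels across,
# and a resolution of 40000 pixels per centimeter.
print(vms_mpp(22_000_000, 100_000))
print(ndpi_mpp(40000, "centimeter"))
```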

Test Data

NDPI format

http://openslide.cs.cmu.edu/download/openslide-testdata/Hamamatsu/

VMS format

http://openslide.cs.cmu.edu/download/openslide-testdata/Hamamatsu-vms/

Format

single-file pyramidal tiled BigTIFF with non-standard metadata

File extensions

.scn

OpenSlide vendor backend

leica

Leica slides are stored in single-file BigTIFF format. OpenSlide will detect a file as Leica if:

1. The file is TIFF.

2. The initial image is tiled.

3. The ImageDescription tag contains valid XML in either of these namespaces:

   http://www.leica-microsystems.com/scn/2010/03/10

   http://www.leica-microsystems.com/scn/2010/10/01

To open Leica files, OpenSlide must be built with libtiff 4 or above.

Relevant TIFF tags

Tag Description
ImageDescription Stores an XML document containing various metadata

File Organization

The ImageDescription tag of the first TIFF directory contains an XML document that defines the structure of the slide.

Leica slides are structured as a collection of images, each of which has multiple dimensions (pyramid levels). The collection has a size, and images have a size and position, measured in nanometers. Each dimension has a size in pixels, an optional focal plane number, and a TIFF directory containing the image data. Fluorescence images have different dimensions (and thus different TIFF directories) for each channel. OpenSlide currently rejects fluorescence images and ignores focal planes other than plane 0.

Brightfield slides have at least two images: a low-resolution macro image and one or more main images corresponding to regions of the macro image. The macro image has a position of (0, 0) and a size matching the size of the collection. Fluorescence slides can have two macro images: one brightfield and one fluorescence.

The slide provides enough information to composite the various images, including the macro image, into a single pyramid. However, there are some complications:

1. The resolution of the macro image is generally not related to the resolution of the main images by a power of two.

2. Downsampled dimensions are generally downsampled from the next larger dimension by a factor of 4, but main images can be scanned with distinct objectives that may differ by only a factor of 2.

Thus, in general, the images in a collection cannot be rendered into a unified pyramid without scaling the original pixel data. OpenSlide does not attempt to do this. Instead, OpenSlide omits the macro image from the pyramid and refuses to open slides whose main images have inconsistent resolutions.

Associated Images

macro

the highest-resolution dimension of the macro image

Known Properties

leica.aperture

the numericalAperture of the main image

leica.barcode

the barcode text. (For slides in the 2010/10/01 namespace, OpenSlide 3.4.0 and earlier report this property as a Base64-encoded string; OpenSlide 3.4.1 and later report it in plain text. For slides in the 2010/03/10 namespace, OpenSlide reports the barcode as it is stored in the XML, since we do not know whether those barcodes are Base64-encoded. If you have a 2010/03/10 slide with a barcode, please comment in this bug[2] or contact the OpenSlide mailing list.)

leica.creation-date

the creationDate of the main image

leica.device-model

the device model of the main image

leica.device-version

the device version of the main image

leica.illumination-source

the illuminationSource of the main image

leica.objective

the objective of the main image

openslide.mpp-x

calculated as 10000/tiff.XResolution, if tiff.ResolutionUnit is centimeter

openslide.mpp-y

calculated as 10000/tiff.YResolution, if tiff.ResolutionUnit is centimeter

openslide.objective-power

normalized leica.objective

Test Data

http://openslide.cs.cmu.edu/download/openslide-testdata/Leica/

Format

multi-file with very complicated proprietary metadata and indexes

File extensions

.mrxs

OpenSlide vendor backend

mirax

OpenSlide will detect a file as MIRAX if:

1. The file is not a TIFF.

2. The filename ends with .mrxs.

3. A directory exists in the same location as the file, with the same name as the file minus the extension.

4. A file named Slidedat.ini exists in the directory.
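The four conditions above can be approximated in Python as shown below. This is a sketch only: the function name looks_like_mirax is hypothetical, and the "not a TIFF" test is reduced to comparing the first four bytes against the classic TIFF and BigTIFF signatures, which is a simplification.

```python
from pathlib import Path

# Classic TIFF and BigTIFF signatures, little- and big-endian.
TIFF_MAGICS = (b"II*\x00", b"MM\x00*", b"II+\x00", b"MM\x00+")


def looks_like_mirax(filename):
    """Apply the MIRAX detection conditions to a path on disk."""
    path = Path(filename)
    if not path.name.endswith(".mrxs"):
        return False
    try:
        with path.open("rb") as handle:
            magic = handle.read(4)
    except OSError:
        return False
    if magic in TIFF_MAGICS:
        return False
    # The data directory has the same name as the file, minus the extension.
    data_dir = path.with_suffix("")
    return data_dir.is_dir() and (data_dir / "Slidedat.ini").is_file()
```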

MIRAX can store slides in JPEG, PNG, or BMP formats. Because JPEG does not allow for large images, and JPEG and PNG provide very poor support for random-access decoding of part of an image, multiple images are needed to encode a slide. To avoid having many individual files, MIRAX packs these images into a small number of data files. The index file provides offsets into the data files for each required piece of data.

The camera on MIRAX scanners takes overlapping photos and records the position of each one. Each photo is then split into multiple images which do not overlap. Overlaps only occur between images that come from different photos.

To generate level n + 1, each image from level n is downsampled by 2 and then concatenated into a new image, 4 old images per new image (2 x 2). This process is repeated for each level, irrespective of image overlaps. Therefore, at sufficiently high levels, a single image can contain one or more embedded overlaps of non-integral width.

The index file starts with a five-character ASCII version string, followed by the SLIDE_ID from the slidedat file. The rest of the file consists of 32-bit little-endian integers (unaligned), which can be data values or pointers to byte offsets within the index file.

The first two integers point to offset tables for the hierarchical and nonhierarchical roots, respectively. These tables contain one record for each VAL in the HIERARCHICAL slidedat section. For example, the record for NONHIER_1_VAL_2 would be stored at nonhier_root + 4 * (NONHIER_0_COUNT + 2).
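The offset arithmetic in the example above generalizes to any record: skip the records of all earlier groups, then index by VAL number. A small Python sketch, with a hypothetical function name and made-up numbers:

```python
def record_offset(root, counts, group, value):
    """Byte offset of the record for group `group`, value `value`.

    `root` is the byte offset of the offset table, and `counts` lists the
    COUNT of each group in order (for example NONHIER_0_COUNT,
    NONHIER_1_COUNT, ...). Each record is one 32-bit integer, hence the
    factor of 4.
    """
    return root + 4 * (sum(counts[:group]) + value)


# NONHIER_1_VAL_2 with NONHIER_0_COUNT = 3 and an offset table at byte 1000.
print(record_offset(1000, [3, 5], 1, 2))
```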

Each record is a pointer to a linked list of data pages. The first two values in a data page are the number of data items in the page and a pointer to the next page. The first page always has 0 data items, and the last page has a 0 next pointer.

There is one hierarchical record for each zoom level. The record contains data items consisting of an image index, offset and length within a file, and a file number. The file number can be converted to a data file name via the DATAFILE slidedat section. The image index is equal to image_y * GENERAL.IMAGENUMBER_X + image_x. Image coordinates which are not multiples of the zoom level’s downsample factor are omitted.
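The page traversal and the image index formula can be sketched in Python as follows. The function names are hypothetical, the integers are assumed to be signed, and the sketch assumes a well-formed index (it does not guard against cyclic page pointers).

```python
import struct


def read_record_items(index, record_offset, ints_per_item=4):
    """Follow a record's linked list of data pages and collect its items.

    Hierarchical items hold four integers: image index, offset, length,
    and file number.
    """
    (page,) = struct.unpack_from("<i", index, record_offset)
    items = []
    while page != 0:
        # Each page starts with its item count and a pointer to the next page.
        count, next_page = struct.unpack_from("<ii", index, page)
        for i in range(count):
            position = page + 8 + 4 * ints_per_item * i
            items.append(struct.unpack_from("<%di" % ints_per_item, index, position))
        page = next_page  # 0 marks the last page
    return items


def decode_image_index(image_index, images_across):
    """Invert image_index = image_y * IMAGENUMBER_X + image_x."""
    image_y, image_x = divmod(image_index, images_across)
    return image_x, image_y
```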

Nonhierarchical records refer to associated images and additional metadata. Nonhierarchical data items consist of three zero values followed by an offset, length, and file number as in hierarchical records.

A data file begins with a header containing a five-character ASCII version string, the SLIDE_ID from the slidedat file, the file number encoded into three ASCII characters, and 256 bytes of padding. (In newer slides, the SLIDE_ID and file number are encoded as UTF-16LE, so the second half of each value is truncated away.) The remainder of the file contains packed data referenced by the index file.

The slide position file is referenced by the VIMSLIDE_POSITION_BUFFER.default nonhierarchical section. It contains one entry for each camera position (not each image position) in row-major order. Each entry is nine bytes: a flag byte, the X pixel coordinate of the photo (4 bytes, little-endian, may be negative), and the Y coordinate (4 bytes, little-endian, may be negative). In slides with CURRENT_SLIDE_VERSION ≥ 1.9, the flag byte is 1 if the slide file contains images for this camera position, 0 otherwise. In older slides, the flag byte is always 0.
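The nine-byte entry layout maps directly onto a struct format string. The following Python sketch decodes an uncompressed position buffer; the function name parse_slide_positions is hypothetical.

```python
import struct

# One flag byte followed by two signed 32-bit little-endian coordinates.
POSITION_ENTRY = struct.Struct("<Bii")


def parse_slide_positions(data):
    """Return a list of (flag, x, y) tuples, one per camera position."""
    if len(data) % POSITION_ENTRY.size != 0:
        raise ValueError("position buffer is not a multiple of 9 bytes")
    return list(POSITION_ENTRY.iter_unpack(data))
```

The "<" prefix disables alignment padding, so each entry occupies exactly nine bytes.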

In slides with CURRENT_SLIDE_VERSION ≥ 2.2, the slide position file is compressed with DEFLATE and referenced by the StitchingIntensityLayer.StitchingIntensityLevel nonhierarchical section.

Associated Images

thumbnail

the image named “ScanDataLayer_SlidePreview” in Slidedat.ini (optional)

label

the image named “ScanDataLayer_SlideBarcode” in Slidedat.ini (optional)

macro

the image named “ScanDataLayer_SlideThumbnail” in Slidedat.ini (optional)

Known Properties

All key-value data stored in the Slidedat.ini file are encoded as properties prefixed with “mirax.”.

openslide.mpp-x

normalized MICROMETER_PER_PIXEL_X from the Slidedat section corresponding to level 0 (typically mirax.LAYER_0_LEVEL_0_SECTION.MICROMETER_PER_PIXEL_X)

openslide.mpp-y

normalized MICROMETER_PER_PIXEL_Y from the Slidedat section corresponding to level 0 (typically mirax.LAYER_0_LEVEL_0_SECTION.MICROMETER_PER_PIXEL_Y)

openslide.objective-power

normalized mirax.GENERAL.OBJECTIVE_MAGNIFICATION

Introduction to MIRAX/MRXS[3]. Note that our terminology has changed since that document was written; where it says “tile”, substitute “image”, and where it says “subtile”, substitute “tile”.

Test Data

http://openslide.cs.cmu.edu/download/openslide-testdata/Mirax/

Format

single-file pyramidal tiled TIFF or BigTIFF with non-standard metadata

File extensions

.tiff

OpenSlide vendor backend

philips

Philips TIFF files are stored in single-file TIFF or BigTIFF format. OpenSlide will detect a file as Philips if:

1. The file is TIFF.

2. The TIFF Software tag starts with Philips.

3. The ImageDescription tag contains valid XML.

4. The root element of the XML is DataObject and has an ObjectType attribute with a value of DPUfsImport.
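Conditions 2 through 4 can be sketched in Python with the standard XML parser. The function name is_philips_description is hypothetical, and the caller is assumed to have already read the Software and ImageDescription tags from a TIFF file.

```python
import xml.etree.ElementTree as ET


def is_philips_description(software, image_description):
    """Check the Software tag and the root element of the ImageDescription XML."""
    if not software.startswith("Philips"):
        return False
    try:
        root = ET.fromstring(image_description)
    except ET.ParseError:
        return False  # not valid XML
    return root.tag == "DataObject" and root.get("ObjectType") == "DPUfsImport"


# Invented minimal document, for illustration only.
sample_xml = '<DataObject ObjectType="DPUfsImport"></DataObject>'
print(is_philips_description("Philips (hypothetical version string)", sample_xml))
```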

To open BigTIFF files, OpenSlide must be built with libtiff 4 or above.

File Organization

Philips TIFF is an export format. The native Philips format, iSyntax, is a custom multi-file format not currently supported by OpenSlide.

The ImageDescription tag of the first TIFF directory contains an XML document with a hierarchical structure containing key-value pairs. The keys are based on DICOM tags.

The level dimensions given in the TIFF ImageWidth and ImageLength fields, and also in the ImageDescription XML, are merely the TIFF tile size multiplied by the number of tiles in each dimension. Thus, they include the size of the padding in the right-most column and bottom-most row of tiles. Each level typically uses the same tile size but requires a different amount of padding, so the aspect ratios of the levels are inconsistent and the level dimensions are not proportional to the level downsamples. Correct downsamples can be calculated from the levels’ pixel spacings in the XML metadata.

Slides with multiple regions of interest are structured as a single image pyramid enclosing all regions. Slides may omit pixel data for TIFF tiles not in an ROI; this is represented as a TileOffset of 0 and a TileByteCount of 0. When such tiles are downsampled into a tile that does contain pixel data, their contents are rendered as white pixels.

Label and macro images are stored as Base64-encoded JPEGs in the ImageDescription XML. Some slides also store these images as stripped TIFF directories whose ImageDescriptions start with Label and Macro, respectively.

Relevant TIFF tags

Tag Description
ImageDescription Stores an XML document containing various metadata and associated image data
Software Starts with Philips

Associated Images

label

the TIFF directory with an ImageDescription starting with Label, or the image data in the DPScannedImage with a PIM_DP_IMAGE_TYPE of LABELIMAGE

macro

the TIFF directory with an ImageDescription starting with Macro, or the image data in the DPScannedImage with a PIM_DP_IMAGE_TYPE of MACROIMAGE

Known Properties

All key-value data encoded in the DPUfsImport object, in the first DPScannedImage object with a PIM_DP_IMAGE_TYPE of WSI, and in that object’s PixelDataRepresentation objects is represented as properties prefixed with “philips.”.

openslide.mpp-x

calculated as 1000 * philips.DICOM_PIXEL_SPACING[1]

openslide.mpp-y

calculated as 1000 * philips.DICOM_PIXEL_SPACING[0]
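Note the index order in these two formulas: element 1 of the pixel spacing feeds the X value and element 0 feeds the Y value. A small Python sketch, assuming the spacing has already been parsed into a pair of numbers (the function name and example values are hypothetical):

```python
def philips_mpp(pixel_spacing):
    """Return (mpp_x, mpp_y) from a two-element pixel spacing.

    Element 0 is used for Y and element 1 for X, matching the
    property descriptions above.
    """
    mpp_x = 1000 * pixel_spacing[1]
    mpp_y = 1000 * pixel_spacing[0]
    return mpp_x, mpp_y


# Invented spacing values.
print(philips_mpp((0.00025, 0.0005)))
```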

Test Data

No public data available. Contact the mailing list[4] if you have some.

Format

SQLite database containing pyramid tiles and metadata

File extensions

.svslide

OpenSlide vendor backend

sakura

OpenSlide will detect a file as Sakura if:

1. The file is a SQLite database.

2. The DataManagerSQLiteConfigXPO table contains exactly one row, and its TableName field refers to a unique table.

3. The unique table contains a row with id = "++MagicBytes" and data = "SVGigaPixelImage".
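These checks can be sketched with Python's sqlite3 module. The function name is_sakura is hypothetical, and the sketch takes an already-open connection; a real implementation would also need to confirm that the file is a SQLite database before querying it.

```python
import sqlite3


def is_sakura(connection):
    """Apply the Sakura detection conditions to an open SQLite connection."""
    try:
        rows = connection.execute(
            "SELECT TableName FROM DataManagerSQLiteConfigXPO"
        ).fetchall()
        if len(rows) != 1 or not isinstance(rows[0][0], str):
            return False
        # Quote the table name, since it comes from the file itself.
        table = '"' + rows[0][0].replace('"', '""') + '"'
        row = connection.execute(
            "SELECT data FROM " + table + " WHERE id = ?", ("++MagicBytes",)
        ).fetchone()
    except sqlite3.Error:
        return False  # missing table or column
    if row is None:
        return False
    data = row[0]
    if isinstance(data, str):
        data = data.encode("ascii", "replace")
    return data == b"SVGigaPixelImage"
```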

File Organization

Sakura slides are SQLite 3 database files written by the eXpress Persistent Objects ORM. Tables contain slide metadata, associated images, and JPEG tiles. Tiles are addressed as (focal plane, downsample, level-0 X coordinate, level-0 Y coordinate, color channel), with separate grayscale JPEGs for each color channel. Despite the generality of the address format, tiles appear to be organized in a regular grid, with power-of-two level downsamples and without overlapping tiles. The structure of the file allows scans to be sparse, but it is not clear if this is actually done.

Some irrelevant tables and columns have been omitted from the summary below.

DataManagerSQLiteConfigXPO

Useful only to get a reference to the unique table. OpenSlide requires this table to contain exactly one row.

Column Type Description
TableName text Name of the unique table, described below

SVSlideDataXPO

High-level metadata about a slide. OpenSlide assumes this table will contain exactly one row.

Column Type Description
OID integer Primary key
m_labelScan integer Foreign key to label associated image in SVScannedImageDataXPO
m_overviewScan integer Foreign key to macro associated image in SVScannedImageDataXPO
SlideId text UUID
Date text File creation date?
Description text Descriptive text?
Creator text Author?
DiagnosisCode text Unknown, have seen “0”
HRScanCount integer Presumably the number of corresponding rows in SVHRScanDataXPO
Keywords text Descriptive text?
TotalDataSizeBytes integer Presumably the sum of TotalDataSizeBytes in corresponding SVHRScanDataXPO rows

SVHRScanDataXPO

A single high-resolution scan of a slide from SVSlideDataXPO. OpenSlide assumes this table will contain exactly one row.

Column Type Description
OID integer Primary key
ParentSlide integer Foreign key to SVSlideDataXPO
ScanId text UUID
Date text Scan date?
Description text Descriptive text?
Name text Scan name?
PosOnSlideMm blob 16 bytes of binary
ResolutionMmPerPix real Millimeters per pixel
NominalLensMagnification real Objective power
ThumbnailImage blob thumbnail associated image data
TotalDataSizeBytes integer Same as TOTAL_SIZE blob in unique table
FocussingMethod integer Unknown; have seen “1”
FocusStack blob 8 bytes of binary per focal plane; the center focal plane apparently has all zeroes

SVScannedImageDataXPO

Contains associated images other than the thumbnail.

Column              Type     Description
OID                 integer  Primary key
Id                  text     UUID
PosOnSlideMm        blob     16 bytes of binary
ScanCenterPosMm     blob     16 bytes of binary
ResolutionMmPerPix  real     Millimeters per pixel
Image               blob     JPEG image data
ThumbnailImage      blob     Low-resolution JPEG thumbnail

tile

This table is most naturally used to map tile coordinates to tile IDs, but is not suitable for individual lookups because it has no useful indexes. In addition, some Sakura slides don’t have it. OpenSlide ignores the table and constructs tile IDs directly from tile coordinates.

Column        Type     Description
TILEID        text     Foreign key to unique table
PYRAMIDLEVEL  integer  Downsample of the pyramid level
COLUMNINDEX   integer  Level-0 X coordinate of the top-left corner of the tile
ROWINDEX      integer  Level-0 Y coordinate of the top-left corner of the tile
COLORINDEX    integer  0 for red, 1 for green, 2 for blue

Unique table

This is the table named by DataManagerSQLiteConfigXPO.TableName. It contains named blobs including the JPEG tile data.

Column  Type     Description
id      text     Primary key
size    integer  Length of data field
data    blob     Data item

This table stores a variety of blob types.

id                  Description
++MagicBytes        SVGigaPixelImage
++VersionBytes      Format version, e.g. 1.0.0
Header              See below
TOTAL_SIZE          The data field is empty. The size field is the sum of all other size fields except ++MagicBytes and ++VersionBytes.
T;2048|4096;4;2;0   Image tile with downsample 4, X coordinate 2048, Y coordinate 4096, channel 2 (blue), focal plane 0
T;2048|4096;4;2;0#  MD5 hash of the T;2048|4096;4;2;0 image tile
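
Because OpenSlide constructs tile IDs directly from tile coordinates, the ID layout can be sketched as a pair of helper functions. The field order below is inferred solely from the example ID in the table (X, Y, downsample, channel, focal plane). The names TileAddress, tile_id, hash_id and parse_tile_id are hypothetical and do not come from OpenSlide.

```python
from typing import NamedTuple


class TileAddress(NamedTuple):
    x: int            # level-0 X coordinate of the tile
    y: int            # level-0 Y coordinate of the tile
    downsample: int   # downsample of the pyramid level
    channel: int      # 0 for red, 1 for green, 2 for blue
    focal_plane: int


def tile_id(addr: TileAddress) -> str:
    """Build the id of the JPEG blob for one tile and color channel."""
    return (f"T;{addr.x}|{addr.y};{addr.downsample};"
            f"{addr.channel};{addr.focal_plane}")


def hash_id(addr: TileAddress) -> str:
    """Build the id of the blob holding the tile's MD5 hash."""
    return tile_id(addr) + "#"


def parse_tile_id(text: str) -> TileAddress:
    """Inverse of tile_id(); raises ValueError for anything else."""
    prefix, xy, downsample, channel, focal_plane = text.split(";")
    if prefix != "T":
        raise ValueError(f"not a tile id: {text!r}")
    x, y = xy.split("|")
    return TileAddress(int(x), int(y), int(downsample),
                       int(channel), int(focal_plane))
```

A lookup in the unique table would then use the returned string as the value of the id column.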

Header blob

The Header blob is a small binary structure containing little-endian integers as follows:

Offset  Size  Description
0       4     Tile size in pixels
4       4     Image width in pixels
8       4     Image height in pixels
12      4     Unknown; have seen “8” (bits per channel?)
16      4     Number of focal planes
20      4     Unknown; have seen “3” (number of channels?)
24      4     Unknown; have seen “1”
28      2     Unknown; have seen “256”
30      4     Unknown; have seen “1”
34      4     Unknown; have seen “2”
38      4     Unknown; have seen “3”
42      4     Unknown; have seen “4”
46      4     Unknown; have seen “5”
50      4     Unknown; have seen “6”
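
The layout above maps onto a single packed little-endian record of 54 bytes. The sketch below decodes it with Python's struct module. Treating the integers as unsigned is an assumption, since the table does not state signedness, and the names HEADER_STRUCT and parse_header are hypothetical.

```python
import struct

# Seven 4-byte integers (offsets 0-27), one 2-byte integer (offset 28),
# then six 4-byte integers (offsets 30-53); "<" means little-endian with
# no alignment padding. Unsigned types are an assumption.
HEADER_STRUCT = struct.Struct("<7IH6I")


def parse_header(blob: bytes) -> dict:
    """Decode the documented fields of a Header blob."""
    if len(blob) < HEADER_STRUCT.size:
        raise ValueError("Header blob is too short")
    fields = HEADER_STRUCT.unpack_from(blob, 0)
    return {
        "tile_size": fields[0],
        "image_width": fields[1],
        "image_height": fields[2],
        "focal_planes": fields[4],
        # Fields whose meaning is unknown, kept in file order.
        "unknown": fields[3:4] + fields[5:],
    }
```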

Associated Images

label

SVScannedImageDataXPO.Image corresponding to SVSlideDataXPO.m_labelScan

macro

SVScannedImageDataXPO.Image corresponding to SVSlideDataXPO.m_overviewScan

thumbnail

SVHRScanDataXPO.ThumbnailImage
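
The three associated images can be fetched with simple queries, sketched here in Python. The sketch assumes that m_labelScan and m_overviewScan refer to the OID column of SVScannedImageDataXPO, which the tables above suggest but do not state outright, and that the slide and scan tables each hold a single row. The function name read_associated_images is hypothetical.

```python
import sqlite3


def read_associated_images(conn: sqlite3.Connection) -> dict:
    """Return a mapping from associated image name to JPEG bytes."""
    images = {}
    # Column names come from this fixed tuple, never from user input.
    for name, column in (("label", "m_labelScan"),
                         ("macro", "m_overviewScan")):
        row = conn.execute(
            "SELECT img.Image FROM SVSlideDataXPO AS slide "
            "JOIN SVScannedImageDataXPO AS img "
            f"ON img.OID = slide.{column}"
        ).fetchone()
        if row is not None and row[0] is not None:
            images[name] = row[0]
    row = conn.execute(
        "SELECT ThumbnailImage FROM SVHRScanDataXPO"
    ).fetchone()
    if row is not None and row[0] is not None:
        images["thumbnail"] = row[0]
    return images
```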

Known Properties

sakura.Creator

SVSlideDataXPO.Creator

sakura.Date

SVSlideDataXPO.Date

sakura.Description

SVSlideDataXPO.Description

sakura.DiagnosisCode

SVSlideDataXPO.DiagnosisCode

sakura.FocussingMethod

SVHRScanDataXPO.FocussingMethod

sakura.Keywords

SVSlideDataXPO.Keywords

sakura.NominalLensMagnification

SVHRScanDataXPO.NominalLensMagnification

sakura.ResolutionMmPerPix

SVHRScanDataXPO.ResolutionMmPerPix

sakura.ScanId

SVHRScanDataXPO.ScanId

sakura.SlideId

SVSlideDataXPO.SlideId

openslide.mpp-x

calculated as 1000 * sakura.ResolutionMmPerPix

openslide.mpp-y

calculated as 1000 * sakura.ResolutionMmPerPix

openslide.objective-power

normalized sakura.NominalLensMagnification
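
The resolution conversion listed above is a change of units from millimeters to microns. A minimal Python sketch follows, assuming properties arrive as a mapping of strings. The function name mpp_from_sakura is hypothetical, and the sketch returns floats rather than guessing at OpenSlide's string formatting.

```python
def mpp_from_sakura(props: dict) -> dict:
    """Derive microns per pixel from sakura.ResolutionMmPerPix."""
    mm_per_pix = float(props["sakura.ResolutionMmPerPix"])
    mpp = 1000.0 * mm_per_pix  # 1 mm = 1000 microns
    # The same value is used for both axes, as in the list above.
    return {"openslide.mpp-x": mpp, "openslide.mpp-y": mpp}
```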

Test Data

No public data available. Contact the mailing list[4] if you have some.

Format

single-file pyramidal tiled TIFF, with non-standard metadata and overlaps; additional files contain more metadata and detailed overlap info

File extensions

.tif

OpenSlide vendor backend

trestle

Trestle slides are stored in single-file TIFF format. OpenSlide will detect a file as Trestle if:

1. The file is TIFF.

2. The TIFF Software tag starts with MedScan.

3. The ImageDescription tag is present.

4. All images are tiled.

Relevant TIFF tags

Tag                       Description
ImageDescription          Stores some important key-value pairs, see below
Software                  Starts with “MedScan”
XResolution, YResolution  Seems to store microns-per-pixel (MPP), which may or may not take into account the correct objective power. Note that this is inverted from standard TIFF, which stores pixels-per-unit, not units-per-pixel.

Extra data stored in ImageDescription

The ImageDescription tag contains semicolon-delimited key-value pairs, each of which is equals-delimited. OpenSlide uses the OverlapsXY and Background Color keys from the ImageDescription to read the slide. All key-value pairs, whether used or not, are stored as properties starting with “trestle.”.

Key               Description
Background Color  Hex-encoded background color info, assumed to be in the format RRGGBB.
White Balance     Hex-encoded white balance
Objective Power   Reported objective power, often incorrect.
JPEG Quality      The JPEG quality value.
OverlapsXY        Overlaps, see below.
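
Splitting the ImageDescription into properties can be sketched in a few lines of Python. The function name parse_image_description is hypothetical. Stripping whitespace around keys is an assumption, and values are kept verbatim because leading spaces can be significant.

```python
def parse_image_description(description: str) -> dict:
    """Split "key=value;key=value" text into trestle.* properties."""
    props = {}
    for pair in description.split(";"):
        key, sep, value = pair.partition("=")
        if not sep:
            continue  # empty trailing piece, or text without "="
        props["trestle." + key.strip()] = value
    return props
```

For a made-up description such as "Background Color=E6E6E6;OverlapsXY= 64 64;", this yields the keys trestle.Background Color and trestle.OverlapsXY.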

TIFF Image Directory Organization

The first image in the TIFF file is the full-resolution image. The subsequent images are assumed to be decreasingly sized reduced-resolution images.

The OverlapsXY pseudo-field encodes a list of tile overlap values as ASCII.

Example: “ 64 64 32 32 16 16” (note the initial space).

These values represent the standard overlaps between adjacent tiles in X and Y, in pixels. This example encodes three levels' worth of overlaps. Overlaps for any further levels are assumed to be 0.
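
Decoding the OverlapsXY value into per-level overlaps might look like the following Python sketch. It assumes each level contributes an X value followed by a Y value, as the key name suggests. The function name parse_overlaps and its level_count parameter are hypothetical.

```python
def parse_overlaps(value: str, level_count: int) -> list:
    """Return one (x_overlap, y_overlap) pair per pyramid level."""
    # split() with no argument also discards the leading space.
    numbers = [int(token) for token in value.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of overlap values")
    pairs = list(zip(numbers[0::2], numbers[1::2]))
    # Levels without an explicit entry have zero overlap.
    pairs += [(0, 0)] * max(0, level_count - len(pairs))
    return pairs[:level_count]
```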

Individual tile overlaps may differ from the standard overlaps. These individual overlaps are recorded in .tif-Nb files adjacent to the .tif file, where N is the level number. OpenSlide does not read these files, though they have been partially decoded; see issue 21[5] for details.

Associated Images

macro

the image with a filename extension of “.Full” (optional)

Known Properties

All data encoded in the ImageDescription TIFF field is represented as properties prefixed with “trestle.”.

openslide.mpp-x

copy of tiff.XResolution (note that this is a totally non-standard use of this TIFF tag)

openslide.mpp-y

copy of tiff.YResolution (note that this is a totally non-standard use of this TIFF tag)

openslide.objective-power

normalized trestle.Objective Power

Test Data

http://openslide.cs.cmu.edu/download/openslide-testdata/Trestle/

Format

single-file pyramidal tiled BigTIFF with non-standard metadata and overlaps

File extensions

.bif, .tif

OpenSlide vendor backend

ventana

Ventana slides are stored in single-file BigTIFF format. OpenSlide will detect a file as Ventana if:

1. The file is TIFF.

2. The XMP tag contains valid XML.

3. The XML contains an iScan element, either as the root element or as a child of a Metadata root element.
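
Criteria 2 and 3 can be sketched in Python with the standard xml.etree.ElementTree module, given the XMP tag contents as a string. Reading the tag from the TIFF file is out of scope here. The function name find_iscan is hypothetical, and the sketch assumes the element names carry no XML namespace.

```python
import xml.etree.ElementTree as ET


def find_iscan(xmp: str):
    """Return the iScan element, or None if the XMP does not qualify."""
    try:
        root = ET.fromstring(xmp)
    except ET.ParseError:
        return None  # criterion 2 fails: not valid XML
    if root.tag == "iScan":
        return root
    if root.tag == "Metadata":
        return root.find("iScan")  # direct children only
    return None
```

Callers should compare the result with None rather than test its truthiness, because an element without children evaluates as false.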

To open Ventana files, OpenSlide must be built with libtiff 4 or above.

Associated Images

macro

the TIFF directory whose ImageDescription is Label Image or Label_Image

thumbnail

the TIFF directory whose ImageDescription is Thumbnail

Known Properties

All XML attributes in the iScan element are represented as properties prefixed with “ventana.”.
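
Given an already parsed iScan element, the attribute-to-property mapping is a direct copy with a prefix, as in this Python sketch. The function name ventana_properties is hypothetical, and the attribute values in the usage note are made up.

```python
import xml.etree.ElementTree as ET


def ventana_properties(iscan: ET.Element) -> dict:
    """Expose every attribute of the iScan element as ventana.<name>."""
    return {"ventana." + name: value
            for name, value in iscan.attrib.items()}
```

For example, ventana_properties(ET.fromstring('<iScan Magnification="20"/>')) returns a mapping with the single key ventana.Magnification.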

openslide.mpp-x

normalized ventana.ScanRes

openslide.mpp-y

normalized ventana.ScanRes

openslide.objective-power

normalized ventana.Magnification

Test Data

http://openslide.cs.cmu.edu/download/openslide-testdata/Ventana/

Format

single-file pyramidal tiled TIFF

File extensions

.tif

OpenSlide vendor backend

generic-tiff

OpenSlide will detect a file as generic TIFF if:

1. No other detections succeed.

2. The file is TIFF.

3. The initial image is tiled.

TIFF Image Directory Organization

The first image in the TIFF file is the full-resolution image. Any other tiled images in the file with the “reduced resolution” bit set are assumed to be reduced-resolution versions of the original.
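
The level selection rule can be sketched in Python over a list of directory summaries. The “reduced resolution” bit is taken here to be bit 0 of the TIFF NewSubfileType tag (FILETYPE_REDUCEDIMAGE in libtiff). The Directory class and select_levels function are hypothetical stand-ins for whatever a TIFF reader provides.

```python
from dataclasses import dataclass

REDUCED_IMAGE = 0x1  # bit 0 of the TIFF NewSubfileType tag


@dataclass
class Directory:
    index: int         # position of the directory in the file
    tiled: bool        # whether the image is stored as tiles
    subfile_type: int  # value of NewSubfileType (0 if absent)


def select_levels(directories: list) -> list:
    """Pick the directories that form the slide's pyramid levels."""
    if not directories or not directories[0].tiled:
        return []  # the initial image must be tiled
    levels = [directories[0]]  # full-resolution image
    for directory in directories[1:]:
        if directory.tiled and directory.subfile_type & REDUCED_IMAGE:
            levels.append(directory)
    return levels
```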

Associated Images

None.

Known Properties

Many TIFF tags are encoded as properties starting with “tiff.”.

Test Data

http://openslide.cs.cmu.edu/download/openslide-testdata/Generic-TIFF/

The Carnegie Mellon School of Computer Science.

This manual page was written by Mathieu Malaterre <malat@debian.org> for the Debian GNU/Linux system (but may be used by others).

1.
pages for each vendor format
http://openslide.org/formats/
2.
this bug
https://github.com/openslide/openslide/issues/155
3.
Introduction to MIRAX/MRXS
https://lists.andrew.cmu.edu/pipermail/openslide-users/2012-July/000373.html
4.
mailing list
https://lists.andrew.cmu.edu/mailman/listinfo/openslide-users/
5.
issue 21
https://github.com/openslide/openslide/issues/21#issuecomment-23615583
07/29/2016 OpenSlide 3.4.1+dfsg