ASTM 3D Image Data Format Requirements
E57.04.01 Data Format
Requirements Taskgroup
Version: 0.16 (July 2009)
Please email comments to:
Gene Roe, ASTM E57.04 Interoperability Subcommittee Chair
This document is placed in the public domain, and may be
copied and transmitted freely.
Table of Contents
Scope
E57.04 Subcommittee scope
Document scope
Summary
Intended Use
Exchange Usage
Archival Usage
Guiding Principles
Exchange Implications
Non Goals
General Data Requirements
Phased implementation
3D Imaging Systems Support
Extensibility
Random Access vs. Sequential Access
Transformation parameters
Internationalization
File size constraints
Scans Per File
Files per Scan
3D Image Structure
Self-description
Units
Resolution
Little Endian
Version compatibility
Coordinate System
Picture Data
Data Filtering
Computer Readable Format Description
Licensing
Reliability
Error detection
Recovery
Test Plan
Encryption
Authentication
Performance
Speed Performance
File Size
Compression (Phase 1)
Compression (Phase 2)
Memory use
Bibliography
Other formats
Reference documents
Other documents and links
To develop and promulgate open, standard data exchange
mechanisms for 3D imaging system derived data in order to promote its widest
possible use.
This document defines the requirements for the ASTM 3D image
file format. This document is not a design. It describes the required
properties of an acceptable design.
Requirements for a standard 3D image data format for the 3D
metrology industry are proposed. The format will allow hardware and software
vendors to reliably exchange 3D images with low software development costs.
Adoption costs are lowered by a proposed reference software implementation of a
format reader and writer, as well as by a proposed translation utility from
existing formats.
The members of the E57.04 subcommittee plan to create an
open source project where a reference implementation of the format will be made
available to the public. Permission will be granted to use the software royalty
free for any purpose, including commercial applications. The copyright will be
held by the individuals involved in the open source project as a group.
There is a wide range of devices in the 3D metrology
industry. Many sectors are immature. Not all sectors are currently
represented on the E57 Interoperability subcommittee. To meet the desire to
move quickly, a phased approach is proposed. A core format with minimal
features, but with sufficient “hooks” for future extensions will be designed
and implemented initially. The organized extensibility of the format is a key
feature of the requirements. Hopefully the format will grow and adapt as
vendors offer new (and open) extensions to the core implementation.
3D imaging systems can produce data at a prodigious rate.
Some fast scanners produce more than a billion points per hour. The format
must handle large files in a space and time efficient manner. Appropriate data
compression is required. With large datasets, error detection, error
correction, and error recovery become important.
Two categories of usage have been identified for consideration
by the E57.04 committee: data exchange, and archiving. These categories have
different (but partially overlapping) data and performance requirements.
Unidirectional data transfer is desired between two software
applications: the writer and the reader.
In general, the two software applications are written by
different vendors and run by different users at different times on different
computers with different operating systems.
Typically, due to the large amount of data, both writer and
reader will store the data internally in a proprietary (disk-based) database
format.
Typically, the data will be transferred only a single time
between writer and reader, and will be stored in the transfer file format.
Archival requirements will not be satisfied in the first
phase of the interoperability standard, but the requirements should be
considered during the initial design to increase the likelihood that they can
be accommodated by extension of the standard rather than replacement.
There are different levels of archival storage goals, ranging from the
simplest, ensuring that the data remains useful over time (it could
still be just points), to more advanced goals such as archiving all information
related to the scanning project and maintaining audit logs of changes.
Archival Usage may have several requirements that make it
different than Exchange Usage, in particular:
1. The archival store will need to be more robust over time (possibly
decades), which means that error detection will need to be more robust, and
error correction may be of interest.
2. Ideally the archival store would be completely self documenting. In
exchange usage supplementary documents describing the format are acceptable,
but an archival store may be found without supplementary documents and the
data should still be accessible.
3. The archival store may need features such as tamper detection and digital
signing that are not necessary in an exchange format.
4. The archival store may need to include more extensive meta-data as well as
other types of data related to the work product (such as field notes, digital
images, etc.).
The following principles were used in constructing this user
requirements document and should also guide in the design and implementation of
the format:
1. Reliable interoperability – transfer from any vendor to any vendor, well tested
2. Open – vendor neutral, well documented, utilizing open source and IP, international
3. Low barrier for adoption – low development cost for adopters
4. Minimalist core design – low complexity, easy consensus, short development time
5. Organized extensions – add fields in future, without breaking core functionality
The guiding principles used by E57.04 in the selection of
data for exchange usage are:
1. An exchange file must contain all spatially oriented data and related
attributes that were gathered by sensors or have been calculated by software
and are needed for downstream processing by other software (for example
registration transformations). Both oriented scan data and oriented image data
should be included because they are useful for downstream processing. No other
processed data, such as modeled objects, should be included in the exchange
file.
2. Additional data relating to the scanning project that is not needed by
downstream software should not be included in the exchange file. This means
that project data needed for documentation, liability, or archival reasons
must be retained in the data store of the original software that created the
data. The exchange format will attempt to store enough back-tracing
information to allow one to get back to the original data set and review any
non-spatial data that is needed.
The committee decided on the above approach after much debate. The concern
was that including items that “might be useful later” is a slippery slope
that would lead to an explosion of attributes in the file, while the
minimalist approach that gives useful data to downstream software seems well
bounded.
In addition to the above decisions on the overall approach
to what to store, the committee has assumed that:
1. Fully corrected data will be stored in the exchange file, including the
effects of all vendor specific calibration and environment correction at the
time of scanning (temperature, pressure, relative humidity, etc.). In the case
of images, the image will be the corrected image if calibration data exists
for the camera.
2. Fixed position scanners will store their data in scanner local coordinates
(either Cartesian or polar forms are acceptable) and specify a transformation
to the desired global coordinate system. The scanner local coordinate system
will have the scanner at the origin.
3. Moving scanners will store their data in final global coordinates (all
transformations applied) with the exception that a global offset can be
removed and applied subsequently to all data in order to make the data stream
more compact.
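As an illustration of the global offset mentioned in item 3, the following Python sketch subtracts a common offset from global coordinates before storage and adds it back on reading. The function names and the sample coordinates are hypothetical and are not part of the requirements.

```python
def remove_offset(points, offset):
    """Subtract a common global offset so the stored values stay small."""
    ox, oy, oz = offset
    return [(x - ox, y - oy, z - oz) for x, y, z in points]


def restore_offset(points, offset):
    """Re-apply the global offset to recover the global coordinates."""
    ox, oy, oz = offset
    return [(x + ox, y + oy, z + oz) for x, y, z in points]


# Hypothetical global coordinates in meters (large easting/northing values).
global_points = [
    (512345.125, 4198765.5, 101.25),
    (512346.0, 4198766.75, 101.5),
]
offset = (512345.0, 4198765.0, 100.0)

stored = remove_offset(global_points, offset)
recovered = restore_offset(stored, offset)
print(stored)
```

The stored values need far fewer significant digits than the global ones, which is what makes the data stream more compact.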
Examples of data that will be included in the exchange file:
· Scan data
· Oriented image data
· Registration information
· Accuracy specifications (device error model) and enough orientation
information to be able to determine the error cloud associated with each
point (the position of the scanning device, for example)
Examples of data that will not be included in the exchange
file (go back to original data to determine):
· Temperature, humidity or other environmental data associated with a scan
· The name of the person who took the scan or performed the registration
· Vendor specific data, like calibration parameters or battery voltage
· Meshing information relating to the scan data
· Modeled objects created from the scan data
· Other information associated with the scanning project
The following paragraphs discuss issues that are explicitly
not addressed by the core format or any of its anticipated extensions.
1. High Performance – In neither the exchange nor the archival case will high
performance usage be a requirement. The proposed formats are not intended to
replace or augment vendor specific high performance storage formats; they are
only intended to describe data for exchange or archival purposes. The creation
of a high performance data store is specifically considered out of scope by
the E57.04 committee.
2. Instrument control – The format will not address controlling or sending
commands to a 3D imaging instrument.
3. Per-point random access – The large datasets will require compression,
which makes random access at a fine granularity infeasible, although coarse
access granularity will be possible.
4. Work process – The format does not attempt to contain all the necessary
information to document that proper best practices were followed in the
capturing of the data.
5. Encryption – The format will not support encryption of the file.
6. Digital Signing – The format will not support digital signing of sections
of the file.
7. Human Readable – The format is not required to be human readable, although
software to convert the format into a human readable form is desirable for
development and testing.
The format implementation shall be split into two phases.
Features in the first phase shall be chosen to maximize the chance of industry
acceptance of the format. The phase 2 features need not be implemented
initially, but the initial design must detail how they will work.
The format shall be flexible enough to support a wide
variety of 3D imaging systems: TOF, phase measuring, triangulation, flash LIDAR
(1D + 2D FPA), various scan paths (spherical, linear, spiral, lissajous,
arbitrary).
The format shall support fixed position and dynamic
scanning.
The format shall support data from different instrument
types merged into a single file.
The format shall provide a framework for novel extensions.
The format shall allow extensions to be offered for
standardization without significant modifications.
The format shall be sufficiently general to encode the
following potential extensions:
- Additional attributes of any encoded object: points,
scans, poses, files, transforms.
- New image patterns: spirals, hexagonal grids, dynamic
scanning.
- New encoded objects or groupings of points.
- New color (spectral dependency) encodings.
- New 3D image representations: meshes, splines, curves,
scalar volumetric data, vector volumetric data, uncertainty fields.
The format shall be extensible without a central
coordinator, without risk of name collisions.
Extensions to the format shall be able to utilize all base
functionality: reliability, compression, and self-description.
A reader of the format shall be able to reliably skip over
unknown fields or extensions.
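One common way to make unknown fields skippable is to prefix every record with a tag and a byte length. The Python sketch below assumes such a tag-length-value layout purely for illustration; the record header, the tag numbers, and the function names are hypothetical and are not taken from the requirements.

```python
import io
import struct

# Hypothetical tags that this reader understands.
KNOWN_TAGS = {1: "points", 2: "pose"}

# Hypothetical record header: 2-byte tag + 4-byte payload length, little endian.
HEADER = struct.Struct("<HI")


def write_record(buf, tag, payload):
    """Write one record as header followed by payload bytes."""
    buf.write(HEADER.pack(tag, len(payload)))
    buf.write(payload)


def read_known_records(buf):
    """Read all records, skipping payloads whose tag is not recognized."""
    records = []
    while True:
        header = buf.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of stream
        tag, length = HEADER.unpack(header)
        if tag in KNOWN_TAGS:
            records.append((KNOWN_TAGS[tag], buf.read(length)))
        else:
            buf.seek(length, io.SEEK_CUR)  # skip the unknown extension
    return records


stream = io.BytesIO()
write_record(stream, 1, b"abc")
write_record(stream, 99, b"future extension data")  # unknown to this reader
write_record(stream, 2, b"xy")
stream.seek(0)

records = read_known_records(stream)
print(records)
```

Because the length is stored outside the payload, the reader never needs to understand an extension in order to step over it.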
The format shall be capable of encoding several
representations in a file of the same object, for example, a new 3D image
representation and a backward compatible point cloud representation.
Extensions shall be required to be recorded in the file and
tracked with extension format version numbers in the same manner as the base
format.
The format must support sequential reading of data.
The format need not support random access to individual points.
The format should allow random access to larger collections of data, including:
- Header information associated with the file, each scan,
and each chunk, if available
- If multiple scans are present then the reader should be
able to access each scan directly without reading the others
- If multiple chunks are used to store a single scan then
the reader should be able to access each chunk directly without reading
the others
Comment: allowing random access to these more coarse levels
is a trade-off between accessibility of individual data elements and the desire
for a compact and efficient representation.
Each scan may have a rigid body transformation. The
transformation shall be stored as a translation 3-vector and a unit quaternion
in double precision floating point. If no transformation is specified for a
pose then the associated data is untransformed. If the underlying data are
stored in a form other than Cartesian coordinates then there needs to be a
standard mapping to a Cartesian frame so that the rigid body transformation can
be applied in an unambiguous way.
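A minimal Python sketch of applying such a transformation to a Cartesian point follows. It assumes the quaternion is ordered (w, x, y, z) and that the rotation is applied before the translation; neither convention is stated in this document, and the function names are hypothetical.

```python
import math


def quat_rotate(q, v):
    """Rotate 3-vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + (q_vec x t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def apply_pose(q, t, p):
    """Apply rotation q, then translation t, to point p (assumed order)."""
    rx, ry, rz = quat_rotate(q, p)
    return (rx + t[0], ry + t[1], rz + t[2])


# Example: rotate 90 degrees about the Z-axis, then translate by (10, 0, 0).
half_angle = math.pi / 4
q = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
transformed = apply_pose(q, (10.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(transformed)  # approximately (10, 1, 0)
```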
All character strings shall be encoded in UTF-8 as
documented in The Unicode Standard, Version 5.0, §3.9–§3.10 (2006).
The UTF-8 strings may be compressed.
The file format shall support a single file up to 2^61 bytes
in size. It is up to the user to use the appropriate operating system
when dealing with file sizes over 2^32 bytes. The typical file size will
only be as large as necessary to store the given scans and it will grow larger
when new scans are added.
The format shall support multiple scans per file, but it is
required that each scan be associated with a rigid body transformation that
brings all transformed scan data into a single coordinate system for the file.
Each scan in a file may be made with a different instrument.
The file format shall not directly support the
spreading of a single scan over multiple files. However, a scan could be
split into different sections and stored into different files. The user
would implement the split and merge using other products suited to compression
and splitting.
The stored data shall preserve point adjacency information
from the time of scanning.
In the event the points were gathered sequentially in time
they shall be stored in the same order.
In the event there is no temporal ordering but there is
gridding information, such as rows or columns of data, the gridding information
shall be preserved in this format.
In the event a data sample is missing (no return for
example) this format must preserve the information that a data point is missing.
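The sketch below illustrates one way gridded data could keep its row and column structure while recording missing samples. The in-memory representation (a list of rows, with None marking "no return") is an assumption made for the example, not a proposed encoding.

```python
# Hypothetical 2 x 3 grid of range values in meters; None marks "no return".
grid = [
    [2.50, None, 2.75],
    [2.60, 2.65, None],
]


def valid_samples(grid):
    """Return (row, column, value) for every sample that has a return."""
    return [
        (r, c, value)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value is not None
    ]


def missing_cells(grid):
    """Return (row, column) for every cell where the sample is missing."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value is None
    ]


print(valid_samples(grid))
print(missing_cells(grid))
```

Keeping the empty cells in place, rather than dropping them, is what preserves the adjacency of the remaining points.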
Self-description
The file format shall have a self describing data structure
such that each data type is adequately described i.e. name of the field, type
of data, resolution, bit width, upper and lower limits, scalar, alignment.
Physical quantities are represented using the International
System of Units (SI). In particular, coordinate and length information are
specified in meters and planar angles are specified in radians. For
performance reasons, raw data may be supplied in integers along with the
conversion factor to SI units.
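As a small illustration of integer raw data with a conversion factor, the Python sketch below converts hypothetical integer range counts to meters. The scale value of 0.001 (one count per millimeter) and the function name are assumptions made for the example.

```python
def to_si(raw_counts, scale):
    """Convert integer raw values to SI units using a conversion factor."""
    return [count * scale for count in raw_counts]


# Hypothetical raw ranges stored as integers, one count per millimeter.
raw_ranges = [12345, 20000, 0]
scale_to_meters = 0.001

ranges_m = to_si(raw_ranges, scale_to_meters)
print(ranges_m)  # approximately [12.345, 20.0, 0.0] meters
```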
The resolution of the data shall be of adequate size as to
support the full resolution of the 3D imaging device’s native data without
significant loss. Each data type can be sized differently based on the
precision of the underlying data.
The file format shall be stored using the little endian byte
order. Big endian computers will have to convert/swap the data.
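The Python sketch below shows what the byte order requirement means in practice, using the standard struct module. Stating the "<" prefix explicitly gives the same bytes on every host, so a big endian machine does the swap implicitly. The values are arbitrary examples.

```python
import struct
import sys

# Explicit little endian packing is independent of the host byte order.
packed_int = struct.pack("<I", 0x01020304)
print(packed_int.hex())  # 04030201: least significant byte first

# A double precision value written little endian and read back.
packed_double = struct.pack("<d", 1.5)
(value,) = struct.unpack("<d", packed_double)

# The big endian encoding of the same value is the byte-reversed sequence.
big_endian = struct.pack(">d", 1.5)

print(sys.byteorder)  # byte order of the machine running this sketch
```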
The format version is to be
encoded in the file. Consideration for compatibility will be given during
development of initial and subsequent versions to the extent practical. It is
anticipated that subsequent versions will share significant commonality with
the initial version in order to minimize the efforts in creating file
converters. However, it is recognized that advances in technology, hardware,
data structures, etc. may be utilized in developing future versions and such
advances may preclude compatibility among versions.
Cartesian and Spherical coordinates are supported. In both
cases, the values refer to the local (scanner) frame of reference.
For the Cartesian option, data is stored in an (X, Y, Z)
ordered triplet, where X, Y and Z have their standard Cartesian meaning as a
right-handed coordinate system. For Spherical, data is stored in the ordered
triplet (R, A, E), where
R = range (non-negative)
A = azimuth angle (in radians, (-π, +π])
E = elevation angle (in radians, [-π/2, +π/2]).
Note that this triplet forms a right-handed system. The
conversion from Spherical to Cartesian is accomplished through the formula:
X = R cos (E) cos (A)
Y = R cos (E) sin (A)
Z = R sin (E).
Conversely, in non-degenerate cases, the Cartesian
coordinates can be converted to Spherical via
R = sqrt(X^2 + Y^2 + Z^2)
A = atan2(Y, X) (cf. http://en.wikipedia.org/wiki/Atan2)
E = asin(Z/R).
In the degenerate case, the following conventions are
observed:
If X = Y = 0, then A = 0;
If X = Y = Z = 0, then both A = 0
and E = 0.
The elevation is measured with respect to the XY-plane, with
positive elevations towards the positive Z-axis. Elevation angle is chosen
preferentially over zenith (or nadir) angle so that the XY-plane is defined by 0°
(instead of π/2). Doing so mitigates floating-point discrepancies for
horizontal points, e.g., Z = R sin(0°) is preferred to Z = R cos(π/2) because
the latter could be subject to truncation error.
The azimuth is measured as the counterclockwise rotation of
the positive X-axis about the positive Z-axis. This definition of azimuth
follows typical engineering usage. Note that this differs from the traditional
use in navigation or surveying.
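The conversions and degenerate-case conventions of this section can be sketched in Python as follows. The function names are hypothetical. The clamp before asin and the mapping of an azimuth of -π to +π are defensive additions for floating-point edge cases, not requirements taken from this document.

```python
import math


def spherical_to_cartesian(r, a, e):
    """Convert (R, A, E) to (X, Y, Z); angles in radians."""
    return (
        r * math.cos(e) * math.cos(a),
        r * math.cos(e) * math.sin(a),
        r * math.sin(e),
    )


def cartesian_to_spherical(x, y, z):
    """Convert (X, Y, Z) to (R, A, E) using the stated conventions."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return (0.0, 0.0, 0.0)  # X = Y = Z = 0: A = 0 and E = 0
    if x == 0.0 and y == 0.0:
        a = 0.0  # X = Y = 0: A = 0
    else:
        a = math.atan2(y, x)
        if a == -math.pi:
            a = math.pi  # keep A inside (-pi, +pi]
    # Clamp guards against Z/R drifting just past 1 through rounding.
    e = math.asin(max(-1.0, min(1.0, z / r)))
    return (r, a, e)


r, a, e = cartesian_to_spherical(1.0, 2.0, 3.0)
round_trip = spherical_to_cartesian(r, a, e)
print(round_trip)  # approximately (1, 2, 3)
```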
In Phase 1, the format shall support multiple spatially
oriented 2D picture images. The picture images shall be corrected if camera
calibration information is available, and shall be stored in an embedded file
with an opaque format (like a file attachment to an email). The name of the format
used shall be stored in a string, enabling a reader to extract the picture
image data to a separate file and invoke the appropriate decoder. Picture
image decoders supported must include at least one lossless format and one
compressed but lossy format. In addition to the image data, image type
information, and image size, the orientation of the camera shall be given by:
(1) location of the apex of the view frustum, (2) orientation of the view (y
axis in view direction, z axis in camera up direction), and (3) the horizontal
and vertical angular field of view.
The vendor is not required to write raw data into the file.
The vendor may adjust, correct, or filter the data to
produce the best and most accurate representation of the objects measured.
The data written shall be the same quality as the data
written in the vendor's proprietary formats.
LAS Compatibility
The ASPRS LAS v1.1 format will be supported by an extension
of the core E57 format.
The extension shall be a superset of all fields (mandatory, conditional, and
optional) in LAS, as documented in:
· LAS Specification Version 1.1, March 07, 2005
The extension shall be co-developed with the core format to help assure the
core design generality.
The extension shall enable lossless conversion from LAS
format to the extended E57 format.
Software may be produced by the E57 subcommittee to convert
existing LAS files to the E57 format.
Such software would also enable a near zero cost solution
for writing vendors that can currently write in LAS format to support the E57
format immediately.
A format design that incorporates a computer readable
description of the format in an external file is desirable.
The computer readable description would document mandatory
and optional fields with their names, types, value restrictions, default
values, and default compression algorithms.
Such a description would enable automated validation of
format syntax.
Since the format written will be configurable by the writer
(e.g. writer may choose number of bits in X coordinate field), the external
format description does not eliminate the need for the data file to
self-describe what layout was used in a particular data file.
Said in a different way, the external format description
declares what the layout could be, and the internal layout description declares
what the layout of the file actually is.
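The distinction between the external description and the internal layout can be sketched in Python as a validation step. Everything in the example is hypothetical: the field names, the permitted bit widths, and the dictionary representation are assumptions chosen only to show the idea.

```python
# Hypothetical external description: what a layout is allowed to contain.
FORMAT_DESCRIPTION = {
    "x": {"mandatory": True, "min_bits": 1, "max_bits": 64},
    "y": {"mandatory": True, "min_bits": 1, "max_bits": 64},
    "z": {"mandatory": True, "min_bits": 1, "max_bits": 64},
    "intensity": {"mandatory": False, "min_bits": 1, "max_bits": 16},
}


def validate_layout(layout, description=FORMAT_DESCRIPTION):
    """Return a list of problems found in a file's internal layout."""
    problems = []
    for name, rule in description.items():
        if rule["mandatory"] and name not in layout:
            problems.append(f"missing mandatory field: {name}")
    for name, bits in layout.items():
        rule = description.get(name)
        if rule is None:
            # A real validator would consult extension descriptions here.
            problems.append(f"field not in description: {name}")
        elif not rule["min_bits"] <= bits <= rule["max_bits"]:
            problems.append(f"bit width out of range for {name}: {bits}")
    return problems


# Internal layout actually chosen by one writer (field name -> bit width).
file_layout = {"x": 18, "y": 18, "z": 14, "intensity": 12}
print(validate_layout(file_layout))  # empty list: the layout is permitted
```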
The proposed format description functionality would be
similar to an XML schema.
The core format would have a standardized computer readable
description file.
Each format extension would have a separate computer
readable description file.
In the archival usage case, the external format description
files could be optionally attached to each data file, to protect against lost
description files decades later.
Any license with ASTM to implement, sell, or use products
based on this data format shall be royalty-free.
A reasonable fee shall be required to purchase a copy of the
data format specification.
The use of all code and data distributed by ASTM in
connection with this format shall be royalty-free.
It is recommended that all designs, software, and data
associated with this format avoid use of licensed Intellectual Property (IP).
If licensed IP is used, it shall be licensed under Fair,
Reasonable, and Non-Discriminatory (FRAND) terms.
To verify file integrity, the file shall be divided into
segments and an Error Detection Code (EDC) shall be computed for each segment
and stored in the file.
The segments shall be arranged so that the data in a segment
can be discarded, without the complete loss of the file, should an
unrecoverable data error occur.
The EDC computation is not optional.
The EDC detection capabilities shall be the same or better
than a 32 bit cyclic redundancy check (CRC) using the gzip polynomial.
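A minimal Python sketch of per-segment error detection follows. It uses zlib.crc32, which computes the CRC-32 used by gzip. The segment size of 1024 bytes and the function names are assumptions made for the example; the requirements do not fix a segment size.

```python
import zlib

SEGMENT_SIZE = 1024  # hypothetical segment size in bytes


def segment_checksums(data, segment_size=SEGMENT_SIZE):
    """Compute one CRC-32 per fixed-size segment of the data."""
    return [
        zlib.crc32(data[i:i + segment_size])
        for i in range(0, len(data), segment_size)
    ]


def find_bad_segments(data, stored_checksums, segment_size=SEGMENT_SIZE):
    """Return indices of segments whose CRC no longer matches."""
    current = segment_checksums(data, segment_size)
    return [i for i, (a, b) in enumerate(zip(current, stored_checksums)) if a != b]


original = bytes(range(256)) * 16          # 4096 bytes -> 4 segments
checksums = segment_checksums(original)

corrupted = bytearray(original)
corrupted[2500] ^= 0xFF                    # flip bits inside segment 2
bad = find_bad_segments(bytes(corrupted), checksums)
print(bad)  # [2]
```

Only the damaged segment is flagged, so a reader can discard it and keep the rest of the file.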
To enable recovery from corruption at any point in the file
without complete data loss, some portions of the file shall be duplicated.
EDC implementation shall be time efficient when reading a
small portion of the file.
A design proposal must have a plan or procedure to ensure
interoperability between all readers and writers.
A test suite shall be created of 3D image files to verify readers.
A 3D image file comparison program shall be created to
verify writers.
An error recovery test suite shall be created in Phase 2.
The format will not support encryption.
The format will not directly support
authentication, tamper detection, or digital signatures.
A message digest code can be computed after the file is
written, by separate software.
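As a sketch of what such separate software might do, the following Python example computes a digest over a finished file with the standard hashlib module. SHA-256 is chosen here only as an example; this document does not name a digest algorithm, and the function name is hypothetical.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a finished 3D image file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example file contents")
    example_path = tmp.name

digest = file_digest(example_path)
os.remove(example_path)
print(digest)
```

Reading in chunks keeps memory use constant regardless of file size, which matters for the large files this format targets.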
Straightforward implementations of the format shall be
faster than reading/writing an ASCII file with equivalent data.
File size shall be no larger than the equivalent ASCII file.
Only simple lossless compression will be supported in Phase
1.
Compression shall be optional, controlled by the writer.
A compressed file shall still be required to meet the
recoverability criteria (single error cannot cause complete file loss).
Compression algorithms in Phase 1 may be variable bit rate –
the number of bits used to encode a field may be data dependent (a consequence
of all lossless compression algorithms).
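The data-dependent size of losslessly compressed fields can be seen in the short Python sketch below. It uses zlib purely as a stand-in, since this document does not select a compression algorithm, and the function names are hypothetical.

```python
import os
import zlib


def compress_field(raw):
    """Losslessly compress the bytes of one field (zlib as a stand-in)."""
    return zlib.compress(raw)


def decompress_field(blob):
    """Recover the exact original bytes of the field."""
    return zlib.decompress(blob)


constant_field = bytes(1000)       # highly regular data
random_field = os.urandom(1000)    # data with no exploitable structure

for name, raw in (("constant", constant_field), ("random", random_field)):
    blob = compress_field(raw)
    assert decompress_field(blob) == raw  # lossless round trip
    print(name, len(raw), "->", len(blob), "bytes")
```

The regular field shrinks to a few bytes while the random field does not shrink at all, so the number of bits per value depends on the data.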
Uncompressed fields shall be encoded efficiently, with as
few unused bits in the file as possible.
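One way to avoid unused bits is to pack fixed-width values back to back rather than rounding each value up to a whole number of bytes. The Python sketch below assumes 12-bit values and a little endian bit order; both choices and the function names are hypothetical.

```python
def pack_bits(values, bits):
    """Pack non-negative integers of a fixed bit width back to back."""
    accumulator = 0
    for index, value in enumerate(values):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
        accumulator |= value << (index * bits)
    byte_count = (len(values) * bits + 7) // 8
    return accumulator.to_bytes(byte_count, "little")


def unpack_bits(data, bits, count):
    """Recover `count` integers of width `bits` from packed bytes."""
    accumulator = int.from_bytes(data, "little")
    mask = (1 << bits) - 1
    return [(accumulator >> (index * bits)) & mask for index in range(count)]


samples = [0, 1, 4095, 2048]          # four 12-bit values
packed = pack_bits(samples, 12)
print(len(packed))  # 6 bytes, versus 8 if each value used 16 bits
```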
Software complexity will be low.
Unused fields will not be required to be stored in file.
The goals of future compression schemes will be determined
at a later date.
It is desirable that the data writer shall be able to choose
format options that allow the file format to be written by a micro-controller
with a limited amount of memory.
LAS 2.0 Draft, August 17, 2007
LAS 1.1, March 7, 2005
LAS 1.0, May 9, 2003
AN IMPLEMENTATION OF THE ASPRS LAS STANDARD – includes list of other existing
laser data file formats.
ECMA-363 Universal 3D File Format – primarily addressing animation and rendering.
Hierarchical Data Format 5 – format for storing and managing scientific data.
X3D – ISO approved virtual reality description language (successor to VRML)
based on XML.
GML – geography and mapping language based on XML.
Unicode 5.0.0 – the current Unicode standard.
ASPRS LAS committee
libLAS.org – A C++ library for reading and writing LAS 1.x data.
XML – a text-based extensible data format standard.
XML Binary Characterization Working Group – analysis of requirements for binary XML.
Efficient XML Interchange (EXI) Working Group – draft of binary XML standard.
OGC – Open Geospatial Consortium, web based GIS standards organization.
Developing professional guidance: laser scanning in archaeology and
architecture – analysis of format requirement for archaeology field,
recommends LAS.
during implementation.