Geometry Coding and VRML
Presented by
Scott Musler
Multimedia Communications and Visualization Laboratory
Department of Computer Engineering and Computer Science
University of Missouri - Columbia
OVERVIEW
- Virtual-Reality Modeling Language (VRML) is becoming the standard file format for storage/transmission of 3D worlds
- The present discussion is concerned with using the so-called Topological Surgery (TS) method to compress 3D models in a binary VRML format
- The binary VRML proposal is currently being evaluated by the Compressed Binary Format Working Group of the VRML Consortium
- Polygonal models, especially those defined by triangles, are the primary 3D representation in industry, entertainment, manufacturing, and architecture
- They consist of V vertices and F triangles, with one or more property arrays (normals, colors and textures)
- Yet, the current ASCII representation of VRML is hampered, due to limited bandwidth for transmission, and large file sizes
- The TS scheme achieves compression ratios of up to 50:1
- In the TS scheme, there is no loss of polygon connectivity, and visually imperceptible changes in model appearance due to slight losses in geometry position information
VRML
Eight features of VRML are
- Scene graph
- Event processing
- Behaviors
- Encapsulation and reuse
- Distributed content
- Extensibility
- Interactivity
- Animation
VRML
Scene Graph
- Basic building block is the node
- Nodes have fields, which serve as attributes and define the state of the node
- 54 different types of nodes in three classes (grouping, child, and attribute nodes)
- Grouping nodes can have child/grouping nodes as children
- The parent/child relationship forms a scene graph
- The parent/child relationship also forms a sequence of transforms that define the position and viewing of the scene
Even Processing
- Nodes can send/receive messages from/to other nodes, and are termed events
- These are triggered by user input events, and the connection between a receiving/transmitting node is a route
VRML
Behavior
- The user may have the world respond to user input using custom logic (such as the opening of a door), and these user input events are supported by script nodes
- Script nodes are user specified commands that determine which objects are affected, and what to do in the case of an input event
Encapsulation and Reuse
- Allows new nodes to be formed (prototypes) in terms of existing nodes
- This helps in the formation of new nodes and enables enhancibility
VRML
Distributed Content
- Allows, via the Inline node, authors to create worlds composed of multiple VRML worlds, which may also be composed of multiple VRML worlds
- An example could be a car made up of: Car.wrl -> Engine.wrl -> Piston.wrl
- It also supports MPEG, GIF/JPEG, MIDI file formats for associating input events
Extensibility
- Allows authors to create "new" nodes, due to its support of external prototypes
Interactivity
- Has sensor nodes, such as environment sensors, pointing-device select sensors, and pointing-device-drag sensors
VRML
Animation
- Interpolator nodes receive input events (set_fraction) which trigger output events (value_changed)
VRML and Topological Surgery
Overview of Proposal
- VRML was designed to be minimal, yet complete
- Yet, even minimal VRML files are quite large (> 100 kB).
- Proposal combines a binary encoding scheme with the TS scheme
- Requires no modification to existing VRML specification
- It directly translates the ASCII format to a binary form, and supports 2 compression mechanisms - field compression and topologically assisted compression
Field Compression
- Provides for the compression of several multivalue and multivector fields by quantization of coordinates
VRML and Topological Surgery
Topologically Assisted Compression
- Enables compression of certain VRML nodes that contain geometry, and acts on both connectivity and property fields
- Several VRML nodes are already designed to contain geometry (such as the Sphere node, which only requires a radius), but are limited to primitive types
- The workhorse of the geometry nodes is the IndexFaceSet (IFS)
- Any polygonal shape can be described with an IFS, in addition to normals, colors and texture coordinates
- VRML polygonal models consist of a list of faces with vertex references
- One compression scheme is to simply switch to binary encoding rather than ASCII representation - but still have redundancy (due to multiple references of the same vertex position)
- Michael Deering proposed generating generalized triangle strips, but with a limited "width" of 16, which is Hamiltonian path of triangles - an NP-complete problem
- Gabriel Taubin and Jarek Rossignac subsequently devised the TS scheme
Topological Surgery
Background
- Polygonal meshes are the most common way to represent 3D objects in today’s world
- The TS scheme is limited to triangles, which will be focused on
- For a triangular polygonal mesh, usually there are twice as many triangles as vertices
- For 3D meshes, there are 3 vertices per triangle, each with an x-, y-, and z-position (so »
6 references total per vertex (2 x-, y- and z-references))
- If restricted to 1024 (210) vertices and 10 bits, one still must use 60 bits per vertex
- Michael Deering proposed a method (using generalized triangle strips of width no greater than 16, and quantization of vertex positions based on a single ancestor) to obtain 11 bits/vertex
- TS method obtains 2-3 times better compression ratios and, as a by-product, generates long triangle strips
- The objective is to find a good algorithm to generate these triangle strips, for a compressed binary storage format
Topological Surgery
Overview
- One must construct a structure for good quantization of vertex positions
- This is achieved by constructing a vertex tree (VTREE) and quantizating vertex positions using a linear predictor to get vertex corrections (VCOR)
- This computation is considered NP-complete, due to its similarity to the Hamiltonian path (an optimal tree using each vertex once and only once, with no cycles) problem
- Also, one must construct a structure for triangle reconstruction
- This is achieved by constructing a triangle tree (TTREE) and the marching pattern
- In order to construct VTREE and TTREE, one must have a method to facilitate the building of these structures
- To do this, one layers the triangles of the mesh (see Figure 1)
- A vertex is chosen as the start vertex and all triangles incident to it are the first layer
- Subsequently, all triangles incident to triangles of layer N compose triangle layer N+1
- This breaks the mesh up into cut edges (edges that will be part of VTREE) and marching edges (those not of VTREE)

Figure 1: Layering of the fandisk model

Figure 2: The vertex tree of the fandisk model
Topological Surgery
Vertex Tree
- The layering of the mesh helps determine the vertex tree, because layering divides the mesh up into circular rings (for the most part)
- Those edges which make up the layer boundaries are cut edges
- Cut edges amount to the path that the vertex tree takes (for the most part - see Figure 2)
- The "rings" are traversed in a counter-clockwise fashion until all vertices have been used
- This constitutes a tree structure for the compression of vertex positions
- Each run in the vertex tree has 3 values associated with it (run length, last bit, leaf bit)
- Run length is the length of the current run
- Last bit indicates that this is the last run from a current (branching) vertex
- Leaf bit indicates that the current run ends in a leaf vertex
Topological Surgery
Vertex Tree
- Once each run is accounted for, the vertex positions are quantized using a linear predictor (a method to predict vertex positions based on its ancestors)
- The actual vertex positions are not compressed
- Using a linear predictor, the errors between predicted vertex positions (based upon 2, 3, or 4 ancestors) and actual positions are encoded instead of actual positions
Topological Surgery
Vertex Corrections
- Vertex positions comprised of a 3-tuple (x, y, z position)
- Generally each are 32-bit floating point values (much too large)
- First reduce the accuracy of the positions to 8-16 bits
- Then compute a predicted position for each vertex, and then encode the error (as it is of fewer bits) between actual and predicted position using a coding technique (such as Huffman)
- The linear predictor has the form
vn = Î
(vn) + P (l
, vn-1, …, vn-K)
Î
(vn) is vertex position correction for that vertex
P is vertex position correction function
The v parameters are the K ancestors (2, 3 or 4) of the vertex in question
- These corrections are termed the vertex compressed position corrections (VCOR)
- Then the bounding loop is constructed (a pre-order traversal of the VTREE of size (2V-2)) to help with the triangle tree and determination of the marching pattern
- The bounding loop "forms" the path the triangle tree takes
Topological Surgery
Triangle Tree
- The bounding loop provides a pathway for traversal of the triangles
- Edges that are not part of the vertex tree are the marching edges
- The marching edges are the edges that connect triangles within a run
- One "walks" around the mesh in a counter-clockwise fashion, picking up triangles and adding them to the triangle tree, using the bounding loop as a guide (ie: right or left moves along the bounding loop)
- Adjacent triangles must share 2 vertices, so the two triangles differ by 1 vertex reference
- If the differing vertex is "right" along the bounding loop, it is a right move (see Figure 3); otherwise it is a left move
- Each right move is stored as a 1 and left moves are stored as 0’s
- These right/left constitute the marching pattern for each triangle run
- Each run in the triangle tree has 2 entries (run length, leaf bit)
- Run length is the length of the run, and leaf bit indicates if the current run ends in a leaf triangle
Topological Surgery
Triangle Tree
- Each run is considered a triangle strip
- Each run is stored as (run length, leaf bit, marching pattern (size (run length – 1))
- Branching triangles
connect 3 runs in the triangle tree, and the rest of the triangles only connect 1 run in the triangle tree
- The run length is also encoded (possibly using Huffman), while the leaf bit and marching pattern are not
Topological Surgery
Compression
- Construct the vertex spanning tree and bounding loop
- Determine vertex position errors by using the linear predictor (the user specifies the number of ancestors to use, and the number of bits to represent the vertices)
- Determine the triangle tree
- Compute the marching pattern
- Encode the vertex spanning tree, triangle spanning tree, vertex position errors, the marching pattern, and any fixed-size information (such as the size of these structures, and accuracy of the vertex positions, the linear prediction parameters, etc.)
Topological Surgery
Decompression
- Obtain size of structures, accuracy information, number of ancestors to use in linear predictor, and linear predictor parameters
- Obtain and reconstruct vertex tree structure
- Reconstruct vertex positions
- Construct the bounding loop
- Obtain and reconstruct triangle tree structure
- Reconstruct the triangles, using the bounding loop, TTREE, and marching pattern
Topological Surgery
Normals, textures, and colors
- Normals, colors and textures are compressed in the same manner as the vertex positions
- They are first quantized to the desired accuracy (generally using the same accuracy as the vertex positions, i.e. 8-16 bits)
- They are encoded in the same order that the vertex positions were encoded, and also may use the linear prediction "method"
VRML and Topological Surgery (con’t)
- TS is restricted to triangles
- If a VRML file has a general polygon, must transform the polygon to be a triangle.
- The topology of an IFS is stored in its coordIndex field, which contains a sequence of indexes and ‘"-1"’s
- The -1’s partition the sequence into a sub-sequence, which consist of a polygonal face composed of 3 or more vertices
- Then triangulate the non-triangular polygons (by adding edges)
- These additional edges are non-polygonal edges (other edges are polygonal edges – those which actually belong to the original polygon)
- The vertex tree is only composed of polygonal edges, to ensure non-polygonal edges are marching edges
- To recover the actual polygon, first reconstruct all triangles, and then remove the non-polygonal edges
- Careful encoding ensures that the first 2 vertices for a given polygon will be connected
- From there, left and right moves happen, reconstructing the polygon, until the last 2 (marked) vertices are found
- Then the 1st 2 vertices are connected, and the last 2, to form the original polygon
VRML and Topological Surgery
Geometry and Properties
- For geometry and property encoding, 3 types are supported
- Can attach properties to faces, vertices, or corners (a face/vertex "set")
- One can explicitly encode the property value or specify an index into a palette of values
- Plus, 4 types of property values are floating point, integers, colors (an n-tuple) and normal values (a 3D vector)
- Ordering of the data is encoded for vertices due to a depth-first traversal of the VTREE
- Faces and corner data is ordered as used in the output coordIndex field during recovery
VRML and Topological Surgery
Compression Specifics
- Floating points are compressed to 8-12 bits, using a predictor/corrector model
- Integers are likewise compressed in this fashion, unless log (max. integer value) is smaller
- Colors are integerized and compressed like integers
- Normals are attached to faces, indicated by TTREE
- For the predictor/corrector model, the default is 2 ancestors, but they found 4-5 ancestors to be best
- More than 4-5 ancestors actually increased the compressed file size
REFERENCES
G. Taubin, W. Horn, F. Lazarus, J. Rossignac, "Geometry Coding and VRML", Proceedings of the IEEE", June 1998, pp. 1228-1243.