The Ricoh Theta SC and Ricoh Theta SC2 store orientation information in the video files they produce. Extracting this information is helpful when editing, since it lets you do a first-pass zenith correction and get the footage somewhat stable.
1. Overview
The interesting metadata lives in the user data atom of the moov atom. Of the atoms I found, I could only understand two well enough to put the data to any use: the RDTH and RDT5 atoms. For completeness I've included the other atoms I ran across, but as you'll see, most of their data is simply marked as "unknown".
2. Navigating the MP4 File
An MP4 file consists of atoms, each atom being a section of the file with a four-byte type identifier. Atoms can contain other atoms, so when I write moov.udta I mean the atom named udta inside the top-level moov atom.
In the type specifications below I'll use the conventional [u]intX notation to specify unsigned (uintX) or signed (intX) integer values with X bits. Since Ricoh uses both byte orders, I'll also add be or le for big-endian and little-endian layouts. A uint32le is therefore an unsigned 32-bit integer stored in little-endian format.
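As a minimal sketch of what these type names mean in practice, the helpers below decode a few of them from a byte buffer. The function names are hypothetical and only serve as illustration; they assume the caller has already made sure enough bytes are available.

```cpp
#include <cstdint>

// uint32be: most significant byte first.
std::uint32_t readU32BE(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// uint32le: least significant byte first.
std::uint32_t readU32LE(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// int16be: assemble the unsigned bit pattern first, then reinterpret it
// as a two's complement signed value.
std::int16_t readI16BE(const std::uint8_t* p) {
    const std::uint16_t bits =
        static_cast<std::uint16_t>((p[0] << 8) | p[1]);
    return static_cast<std::int16_t>(bits);
}
```

Assembling the value from individual bytes, rather than casting the buffer pointer, keeps the result independent of the byte order of the machine running the code.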
The atom header is a simple structure, starting with a 32-bit size field that denotes the number of bytes in the atom, including the header. Then follows the atom name as another 32-bit quantity. If the size field equals 1, the actual size of the atom follows the name field as a 64-bit unsigned integer, largeSize. If size is zero, the atom extends to the end of the file. For any other value of size, the largeSize field is absent. Finally, if the name field equals "uuid" (0x75756964), a 16-byte user type field follows. Otherwise this field is absent.
struct MP4Atom {
    uint32be size;
    uint32be name;
    if (size == 1) {
        uint64be largeSize;
    }
    if (name == "uuid") {
        uint8 usertype[16];
    }
}
To navigate the MP4 file you start by reading the header of the first atom. If it is the one you were looking for, you can treat the bytes in the atom as another sequence of atoms. If not, skip ahead size - header size bytes to the next atom.
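The navigation described above can be sketched roughly as follows, assuming the whole file has been read into memory. AtomSpan, readBE, findAtom and findMetadataAtom are hypothetical names made up for this example. Treating a size of zero as "extends to the end of the range being searched" is also an assumption for nested atoms.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Byte range of an atom's payload (everything after its header).
struct AtomSpan {
    std::size_t offset;
    std::size_t size;
};

// Read an n-byte big-endian unsigned integer.
std::uint64_t readBE(const std::uint8_t* p, int n) {
    std::uint64_t v = 0;
    for (int i = 0; i < n; ++i) v = (v << 8) | p[i];
    return v;
}

// Search data[begin, end) for the first atom called `name` (4 characters).
// Requires end <= data.size(). On success, stores the payload range in
// `out` and returns true.
bool findAtom(const std::vector<std::uint8_t>& data, std::size_t begin,
              std::size_t end, const char* name, AtomSpan& out) {
    std::size_t pos = begin;
    while (pos + 8 <= end) {
        std::uint64_t size = readBE(&data[pos], 4);
        std::size_t header = 8;
        if (size == 1) {  // 64-bit largeSize follows the name
            if (pos + 16 > end) return false;
            size = readBE(&data[pos + 8], 8);
            header = 16;
        } else if (size == 0) {  // assumption: runs to the end of the range
            size = end - pos;
        }
        const std::uint8_t* type = &data[pos + 4];
        if (std::memcmp(type, "uuid", 4) == 0) header += 16;
        if (size < header || size > end - pos) return false;  // corrupt data
        if (std::memcmp(type, name, 4) == 0) {
            out.offset = pos + header;
            out.size = static_cast<std::size_t>(size) - header;
            return true;
        }
        pos += static_cast<std::size_t>(size);  // skip to the next atom
    }
    return false;
}

// Locate moov.udta.<name>, for example "RDTH" or "RDT5".
bool findMetadataAtom(const std::vector<std::uint8_t>& file,
                      const char* name, AtomSpan& out) {
    AtomSpan moov = {0, 0};
    AtomSpan udta = {0, 0};
    return findAtom(file, 0, file.size(), "moov", moov) &&
           findAtom(file, moov.offset, moov.offset + moov.size, "udta", udta) &&
           findAtom(file, udta.offset, udta.offset + udta.size, name, out);
}
```

Because the search always works on a byte range, descending into a container atom is just another call to findAtom with the payload range of the parent.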
3. Metadata Atoms
All atoms described here are in the user data (udta) atom of the moov atom.
3.1. Theta SC and Theta SC2
The SC and SC2 share some atoms, but not all. The Theta SC stores the orientation information in an RDT5 atom as a gravity vector, and the SC2 in an RDTH atom as a quaternion specifying the camera's orientation.
3.2. moov.udta.RDTH
The RDTH atom can be found in the moov.udta atom and stores the camera orientation as a quaternion.
struct RDTH {
    /**
     * Number of entries. Corresponds to
     * the number of frames in the video.
     */
    uint32le entries;
    uint16le unknown1; // was 30
    uint16le unknown2; // was 24
    Orientation orientations[entries];
}
struct Orientation {
    /**
     * A monotonically increasing
     * sequence. Some kind of timer?
     */
    uint32le unknown1;
    /**
     * Always zero.
     */
    uint32le unknown2;
    // The following fields make up a quaternion
    /**
     * real (scalar) part, rotation amount
     */
    float32le r;
    /**
     * i, pitch axis
     */
    float32le i;
    /**
     * j, roll axis
     */
    float32le j;
    /**
     * k, yaw axis
     */
    float32le k;
}
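A minimal sketch of reading this layout, assuming the payload of the RDTH atom (the bytes after the atom header) is already in memory. The names Quaternion, readU32LE, readF32LE and parseRDTH are hypothetical. The sizes used are derived from the structs above: 8 bytes of header fields, then 24 bytes per entry.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Quaternion {
    float r, i, j, k;
};

std::uint32_t readU32LE(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Assumes float is a 32-bit IEEE 754 single-precision value.
float readF32LE(const std::uint8_t* p) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits");
    const std::uint32_t bits = readU32LE(p);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Parse the payload of an RDTH atom into one quaternion per entry.
// Stops early if the payload is shorter than the entry count claims.
std::vector<Quaternion> parseRDTH(const std::uint8_t* payload,
                                  std::size_t size) {
    const std::size_t kHeader = 8;   // entries + unknown1 + unknown2
    const std::size_t kEntry = 24;   // 2 x uint32le + 4 x float32le
    std::vector<Quaternion> out;
    if (size < kHeader) return out;
    const std::uint32_t entries = readU32LE(payload);
    for (std::uint32_t n = 0; n < entries; ++n) {
        const std::size_t off = kHeader + static_cast<std::size_t>(n) * kEntry;
        if (off + kEntry > size) break;  // truncated data
        const std::uint8_t* e = payload + off;
        // Skip the two unknown uint32le fields at offsets 0 and 4.
        out.push_back({readF32LE(e + 8), readF32LE(e + 12),
                       readF32LE(e + 16), readF32LE(e + 20)});
    }
    return out;
}
```

The bounds check inside the loop guards against a file whose entry count does not match the amount of data actually present.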
3.2.1. The float32le Type
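This is a 32-bit IEEE 754 single-precision floating-point number stored in little-endian format. Read the four bytes as a little-endian 32-bit integer, then reinterpret its bits as a float. In Java you can use Float.intBitsToFloat(int i). In C++, copy the bits with std::memcpy. A union is a common idiom for this, but reading a union member other than the one last written is undefined behavior in C++ (it is allowed in C):
    #include <cstdint>
    #include <cstring>

    float readFloat32LE() {
        // Assumes float is a 32-bit IEEE 754 single-precision value.
        static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
        std::uint32_t i = readUInt32LE(); // Read a 32-bit unsigned int in little-endian
        float f;
        std::memcpy(&f, &i, sizeof f); // Reinterpret the bits without changing them
        return f;
    }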
3.2.2. Example Data
Orientation data for a movie where the camera is rotated 360 degrees around an axis going from back to front. (A barrel roll.)
3.3. moov.udta.RDT5
The RDT5 atom can be found in the moov.udta atom and stores the camera orientation as a gravity vector (and some more information whose meaning is unknown to me).
For some reason the number of entries is exactly twice the number of video frames in the file. I don't know why.
struct RDT5 {
    /**
     * Number of gravity vector entries.
     */
    uint32be entries;
    uint8 unknown[40];
    GravityVector gravityVectors[entries];
}
struct GravityVector {
    /**
     * x- left, x+ right
     */
    int16be x;
    /**
     * y- down, y+ up
     */
    int16be y;
    /**
     * z- back, z+ front
     */
    int16be z;
    uint8 unknown[6];
}
The vector components are stored as signed 16-bit integers. To normalize the vector, simply do a float divide by 16384.
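Put together, reading and scaling the gravity vectors could look like the sketch below. It assumes the payload of the RDT5 atom is already in memory. Gravity, readU32BE, readI16BE and parseRDT5 are hypothetical names, and the sizes follow from the structs above: a 44-byte header, then 12 bytes per entry.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Gravity {
    float x, y, z;
};

std::uint32_t readU32BE(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

std::int16_t readI16BE(const std::uint8_t* p) {
    const std::uint16_t bits =
        static_cast<std::uint16_t>((p[0] << 8) | p[1]);
    return static_cast<std::int16_t>(bits);
}

// Parse the payload of an RDT5 atom into scaled gravity vectors.
// Stops early if the payload is shorter than the entry count claims.
std::vector<Gravity> parseRDT5(const std::uint8_t* payload, std::size_t size) {
    const std::size_t kHeader = 44;  // uint32be entries + 40 unknown bytes
    const std::size_t kEntry = 12;   // 3 x int16be + 6 unknown bytes
    const float kScale = 16384.0f;   // divisor given in the text
    std::vector<Gravity> out;
    if (size < kHeader) return out;
    const std::uint32_t entries = readU32BE(payload);
    for (std::uint32_t n = 0; n < entries; ++n) {
        const std::size_t off = kHeader + static_cast<std::size_t>(n) * kEntry;
        if (off + kEntry > size) break;  // truncated data
        const std::uint8_t* e = payload + off;
        out.push_back({readI16BE(e) / kScale,
                       readI16BE(e + 2) / kScale,
                       readI16BE(e + 4) / kScale});
    }
    return out;
}
```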
3.3.1. Example Data
Orientation data for a movie where the camera is rotated 360 degrees around an axis going from back to front. (A barrel roll.)
As you can see, the data is quite noisy. Smoothing it using a size 16 box blur on the raw data worked well for me.
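For illustration, a box blur over one component of the data might be sketched as below. The function name boxBlur is hypothetical, and both the window alignment (roughly centered on each sample) and the edge handling (the window is simply cut off at the ends) are assumptions, since only the window size of 16 is stated above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Replace every sample with the mean of the samples in a window of
// `window` samples around it. Near the ends the window is clipped.
std::vector<float> boxBlur(const std::vector<float>& in, std::size_t window) {
    const std::size_t n = in.size();
    if (n == 0 || window == 0) return in;

    // prefix[i] holds the sum of in[0..i), so any window sum is a
    // difference of two prefix values.
    std::vector<double> prefix(n + 1, 0.0);
    for (std::size_t i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + in[i];

    const std::size_t half = window / 2;
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t lo = i > half ? i - half : 0;
        const std::size_t hi = std::min(n, i + window - half);
        out[i] = static_cast<float>((prefix[hi] - prefix[lo]) / (hi - lo));
    }
    return out;
}
```

To smooth the gravity vectors, run the x, y and z components through the function separately, for example boxBlur(xs, 16).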
3.4. moov.udta.RDTD
Appears to be a sequence of 3-vectors encoded as three 16-bit values that correspond to camera orientation. However, all vector components are always positive. The layout is as follows:
struct RDTD {
    uint32le entries;
    uint16le unknown1;
    uint16le unknown2;
    RDTDEntry rdtdEntries[entries];
}
struct RDTDEntry {
    /**
     * A monotonically increasing
     * sequence. Some kind of timer?
     */
    uint32le unknown1;
    /**
     * Always zero.
     */
    uint32le unknown2;
    /**
     * Magnitude of x-component of
     * gravity vector?
     */
    uint16le x;
    /**
     * Magnitude of y-component of
     * gravity vector?
     */
    uint16le y;
    /**
     * Magnitude of z-component of
     * gravity vector?
     */
    uint16le z;
    /**
     * Zero unless at the end of data
     * when it is 65535. Indicates
     * final frame?
     */
    uint16le unknown3;
}
3.4.1. Example Data
Orientation data for a movie where the camera is rotated 360 degrees around an axis going from back to front. (A barrel roll.) This is from the same movie as the example data for RDTH.
3.5. moov.udta.RDTG
A monotonically increasing sequence.
struct RDTG {
    uint32le entries;
    uint16le unknown1;
    uint16le unknown2;
    RDTGEntry rdtgEntries[entries];
}
struct RDTGEntry {
    /**
     * A monotonically increasing
     * sequence. Some kind of timer?
     */
    uint32le unknown1;
    /**
     * Always zero.
     */
    uint32le unknown2;
}
3.6. moov.udta.@mod
Camera model. Example data: RICOH THETA SC
3.7. moov.udta.@swr
Software revision. Example data: RICOH THETA SC Ver 1.01
3.8. moov.udta.@day
Time of capture. Example data: 2021-02-25T19:40:20+02:00
3.9. moov.udta.@mak
Camera make. Example data: RICOH
3.10. moov.udta.@xyz
GPS data. Example data: +59.384407+017.968946+60CRSWGS_84/
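The example string looks like an ISO 6709-style position: signed decimal numbers written back to back, followed by a coordinate reference system tag. Under the assumption that the numbers are latitude, longitude and altitude in that order, a rough sketch for pulling them out could look like this. GeoPoint and parseXyz are hypothetical names, and std::strtod depends on the current C locale using "." as the decimal separator.

```cpp
#include <cstdlib>
#include <string>

struct GeoPoint {
    double latitude;
    double longitude;
    double altitude;
    int fields;  // how many of the three values were actually present
};

// Read up to three signed decimal numbers from the start of the string.
// Anything after them (such as the CRS tag) is ignored.
GeoPoint parseXyz(const std::string& s) {
    GeoPoint g = {0.0, 0.0, 0.0, 0};
    double* slots[3] = {&g.latitude, &g.longitude, &g.altitude};
    const char* p = s.c_str();
    while (g.fields < 3 && (*p == '+' || *p == '-')) {
        char* end = nullptr;
        const double v = std::strtod(p, &end);
        if (end == p) break;  // a sign with no number after it
        *slots[g.fields++] = v;
        p = end;
    }
    return g;
}
```

Checking the fields member before using the result makes it possible to tell a missing altitude apart from an altitude of zero.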