Underlying Calibration Technique

Can anyone tell me the name of the underlying calibration technique in Depthkit Studio? Or if it doesn’t have a name, can anyone give me a high level idea of what Depthkit is doing when sampling during the calibration process?

Hi @BryanDunphy

I can describe the calibration process for you. At a high level, we are calculating a rigid transformation for each pair of sensors such that they are brought into alignment. We do this for every pair of sensors that has seen the same ArUco marker within the same sampling period. This results in a graph of sensor linkages, and we then choose the path through that graph that visits every sensor node with the lowest total error.
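The "graph of sensor linkages" idea can be sketched as a shortest-path search: each pairwise calibration is an edge weighted by its alignment error, and each sensor is linked back to a reference sensor through the chain of links with the lowest accumulated error. This is a minimal illustrative sketch (Dijkstra's algorithm over the link graph), not Depthkit's actual implementation; the function and data-structure names are hypothetical.

```python
import heapq

def best_sensor_paths(links, root):
    """Given pairwise calibration links as {(sensor_a, sensor_b): error},
    find for every sensor the lowest-total-error chain of links back to
    `root`, using Dijkstra's algorithm. Returns (total_error, parent)
    dicts; following `parent` pointers recovers each chain of links.
    Hypothetical sketch, not Depthkit's actual solver."""
    # Build an undirected adjacency list from the pairwise links.
    graph = {}
    for (a, b), err in links.items():
        graph.setdefault(a, []).append((b, err))
        graph.setdefault(b, []).append((a, err))

    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, err in graph.get(node, []):
            candidate = d + err
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, parent
```

For example, with links A-B (error 0.1), B-C (error 0.2), and A-C (error 0.5), the search links C to the root A through B (total error 0.3) rather than through the noisier direct A-C link.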

What Depthkit is doing while sampling is collecting the 3D positions of the detected ArUco markers as seen by each sensor. These 3D positions are the input to calculating the rigid transformation that brings the set of 3D points seen by one camera as close as possible to alignment with the same set of 3D points as seen by the other camera. Naturally, not every point will align perfectly, and this results in an error metric, which we use both to filter out outlier points and to choose the best sensor links to follow for the overall solution.
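The standard way to compute a rigid transformation between two matched 3D point sets is the Kabsch (orthogonal Procrustes) method; the post doesn't name Depthkit's exact solver, so treat this as a generic sketch of that technique rather than Depthkit's code:

```python
import numpy as np

def rigid_transform(src, dst):
    """Estimate the rotation R and translation t that best map the
    points in `src` onto the matched points in `dst` in the least-squares
    sense (Kabsch / orthogonal Procrustes). `src` and `dst` are (N, 3)
    arrays of corresponding 3D marker positions from two sensors."""
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the SVD solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t

def alignment_error(src, dst, R, t):
    """RMS distance between the transformed src points and dst points;
    the residual per point is what outlier filtering could threshold on."""
    residuals = (R @ src.T).T + t - dst
    return float(np.sqrt((residuals ** 2).sum(axis=1).mean()))
```

With noise-free, correctly matched points the residual error is essentially zero; real sensor data leaves a nonzero residual, which is the kind of error metric described above.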

In addition to 3D positions, we also record the incidence angle and how much spatial variation was detected over the course of the sampling period. This additional data is used to filter out erroneous samples prior to calculating the transform.
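That pre-filtering step amounts to discarding any marker sample that was seen at too oblique an angle or that jittered too much during the sampling window. A minimal sketch, with illustrative thresholds and field names that are assumptions, not Depthkit's actual defaults:

```python
def filter_samples(samples, max_angle_deg=50.0, max_variation_m=0.005):
    """Keep only marker samples whose incidence angle and positional
    jitter are within bounds. Each sample is a dict with hypothetical
    keys "incidence_deg" (angle between the marker normal and the ray
    to the sensor) and "spatial_variation_m" (position spread over the
    sampling period, in meters). Thresholds are illustrative only."""
    return [s for s in samples
            if s["incidence_deg"] <= max_angle_deg
            and s["spatial_variation_m"] <= max_variation_m]
```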

One thing to note about the marker detection is that it is done using the infrared camera on the Azure Kinect, which has a resolution of 1024x1024 and a very wide-angle lens. Markers that are too small will either not be detected at all, because their details cannot be resolved, or will show much higher spatial variation, as sensor noise causes the detected points to jitter slightly. Using physically larger ArUco markers improves the accuracy of the detected 3D points: they are detected more reliably, and spatial variation has less of an impact on the final averaged position.
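You can get a rough feel for how marker size and distance trade off against that 1024x1024 resolution with a simple pinhole estimate. The 120-degree field of view below matches the Azure Kinect's wide-FOV depth mode, but treat both the FOV and the pinhole model as simplifying assumptions (the real lens has significant distortion):

```python
import math

def marker_pixel_span(marker_size_m, distance_m,
                      fov_deg=120.0, resolution_px=1024):
    """Rough pinhole estimate of how many pixels a marker of the given
    physical size spans in the IR image at the given distance.
    fov_deg=120 approximates the Azure Kinect wide-FOV depth mode;
    this ignores lens distortion, so it is only an order-of-magnitude guide."""
    # Effective focal length in pixels for a pinhole camera.
    focal_px = (resolution_px / 2) / math.tan(math.radians(fov_deg / 2))
    return marker_size_m * focal_px / distance_m
```

Under these assumptions, a 10 cm marker at 2 m spans only around 15 pixels, which helps explain why small markers either fail to decode or produce jittery positions, while doubling the marker size doubles the pixels available for detection.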

Finally, we also take into account how much physical space has been sampled to get an idea of how trustworthy the results are. For example, it is possible to generate a calibration with very low error, but where the points used to generate it only exist in a small volume of space. We penalize calibrations that do not fill a large volume, and this is reflected in the Quality metric, which takes the sampled volume into account. In general, sampling a large amount of space is the best way to increase the Quality, as we have higher confidence that the calibration is representative of the capture volume.
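One simple way to express that volume penalty is to normalize the bounding-box volume of the sampled points against a target capture volume. Depthkit's actual Quality metric is not public; this is just a toy illustration of the idea, and the 8 m³ target is an arbitrary placeholder:

```python
import numpy as np

def coverage_quality(points, target_volume_m3=8.0):
    """Toy coverage score in [0, 1]: the axis-aligned bounding-box
    volume of the sampled 3D points, normalized against a target
    capture volume and clamped at 1. A low-error calibration computed
    from points confined to a small region still scores poorly here.
    Illustrative only; not Depthkit's real Quality metric."""
    points = np.asarray(points, dtype=float)
    extents = points.max(axis=0) - points.min(axis=0)
    volume = float(np.prod(extents))
    return min(volume / target_volume_m3, 1.0)
```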


Hi @Tim, thanks very much for your great reply. This is exactly what I was looking for and has given me a much better understanding of what we are doing. Would there be any advantage of arranging the sensors in such a way that each pair are at 90 degree angles? Or perhaps this would not be relevant in this case because, if I understand the process above, the algorithm automatically chooses sensor pairs based on the error score (and other filtering metrics mentioned above)?

@BryanDunphy 90-degree angles are suitable (and necessary for minimal, full-body coverage configurations like our 5-sensor example). If the chart is aimed properly during a sample linking two sensors which are 90 degrees apart, then that sample will remain valid IF the Angle of Incidence (AoI) filter is set above 45-50 degrees (or just above the larger of the two angles from the normal of the chart to the axis from the chart to each sensor). If the chart favors one or the other of the sensors in a sample, the other sensor will see the chart at too oblique an angle, and the sample will be filtered out by the AoI filter.
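The geometry above is easy to check numerically: the AoI for each sensor is the angle between the chart's normal and the direction from the chart to that sensor, and a sample linking two sensors survives only if both angles are under the threshold. A minimal sketch of that check (function names and the 50-degree default are assumptions for illustration):

```python
import numpy as np

def incidence_angle_deg(chart_normal, chart_pos, sensor_pos):
    """Angle (degrees) between the chart's normal and the direction
    from the chart to the sensor."""
    n = np.asarray(chart_normal, dtype=float)
    to_sensor = np.asarray(sensor_pos, dtype=float) - np.asarray(chart_pos, dtype=float)
    cos_a = np.dot(n, to_sensor) / (np.linalg.norm(n) * np.linalg.norm(to_sensor))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

def sample_passes_aoi(chart_normal, chart_pos, sensor_a, sensor_b,
                      threshold_deg=50.0):
    """A sample linking two sensors is kept only if BOTH sensors see
    the chart within the AoI threshold (illustrative of the filter
    described above; the default threshold is an assumption)."""
    return (incidence_angle_deg(chart_normal, chart_pos, sensor_a) <= threshold_deg
            and incidence_angle_deg(chart_normal, chart_pos, sensor_b) <= threshold_deg)
```

For two sensors 90 degrees apart, aiming the chart along the bisector puts both incidence angles at 45 degrees, which passes a 50-degree threshold; tilting the chart to face one sensor directly pushes the other sensor's angle to 90 degrees, and the sample is dropped.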


Hi @CoryAllen, thanks so much for your response. That is very clear and helpful for getting a thorough understanding of the AoI filter.
