Recommended image sequence conversion and FFmpeg settings

We are using a four-Kinect setup, with one of the sensors paired with a cinema camera. We are encoding the sequence as described in the documentation here.
Are there recommended settings we should follow for this particular setup, such as the number of rows? Same question for the FFmpeg settings.


@GlennWustlich, the formatting and encoding options for your asset are largely determined by the way it will be rendered/played back.

  • The multi-row formatting process is only necessary if you are trying to fit your asset within the constraints of video codecs and graphics processing pipelines. One common target resolution is 4096x4096, so if, for example, your 4-sensor asset is 6400x1400, you can reformat it into two rows to make it 3200x2800 (half the width, double the height). In general, you want to choose the number of rows that gives the resulting asset a similar width and height. However, if you are creating geometry sequences for use in 3rd-party pipelines/software, you can keep the image sequences full size in a single row.
  • If you are encoding the image sequences to video, the platform you are targeting determines parameters like scaling and bitrate. Check out our documentation on platform support for more information.
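
To illustrate the row arithmetic from the first point, here is a rough shell sketch that tries each row count and keeps the one whose width and height are closest. The function name `choose_rows` is hypothetical, and it assumes every row holds the same number of sensors, so only row counts that divide the sensor count evenly are tried. It does not check the result against any target resolution limit.

```shell
# Hypothetical helper: pick the row count that makes the packed frame
# closest to square.
# Assumption: each row holds the same number of sensors, so only row
# counts that divide the sensor count evenly are considered.
choose_rows() {
  width=$1; height=$2; sensors=$3
  best_rows=1; best_diff=-1; rows=1
  while [ "$rows" -le "$sensors" ]; do
    if [ $(( sensors % rows )) -eq 0 ]; then
      w=$(( width / rows ))     # packed width shrinks with more rows
      h=$(( height * rows ))    # packed height grows with more rows
      if [ "$w" -gt "$h" ]; then diff=$(( w - h )); else diff=$(( h - w )); fi
      if [ "$best_diff" -lt 0 ] || [ "$diff" -lt "$best_diff" ]; then
        best_diff=$diff
        best_rows=$rows
      fi
    fi
    rows=$(( rows + 1 ))
  done
  # Output: rows, packed width, packed height
  echo "$best_rows $(( width / best_rows )) $(( height * best_rows ))"
}

choose_rows 6400 1400 4   # prints: 2 3200 2800
```

For the 6400x1400 four-sensor example above, this lands on two rows at 3200x2800, matching the hand calculation.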

Hi Cory, thanks for the reply.
Unfortunately, we did not manage to get good results with it. I have sent you an email with one of our captures. It would be great if you could have a look! We really need this problem tackled, as we cannot continue with our production in the current state.


@GlennWustlich It looks like the asset you emailed has Refinement enabled, but no Refinement masks applied. Refinement masks are the most effective way to clean up your asset and get rid of extra geometry (like the floor, but also random bits of geometry rendered around your subject).

Correct, I am aware of that. The example is just to showcase the difference in export between the out-of-the-box CPP video from Depthkit and the image sequence export + ffmpeg conversion. Both use the same settings in Depthkit.

The example has no masks because Depthkit would otherwise not export the video because of the resolution, as written in the documentation.

In the video you can see that the difference in quality between the two is huge.

Any idea what is causing this? Could it be a setting in the Python script or something in the ffmpeg conversion settings?

Thank you!

@GlennWustlich Ah, I see now what you’re comparing. If you run ffprobe on the resulting ffmpeg-encoded video, what does it report? Specifically, with regard to the color space and color matrix metadata (as explained here)?
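
If it helps, the relevant fields can be pulled out directly instead of reading them from the full stream summary. This is only a sketch: the wrapper name `probe_color_metadata` is made up for illustration, while the ffprobe options themselves are standard.

```shell
# Hypothetical wrapper: print only the pixel format and color metadata
# of the first video stream in the given file.
probe_color_metadata() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=pix_fmt,color_range,color_space,color_transfer,color_primaries \
    -of default=noprint_wrappers=1 \
    "$1"
}

# Usage (file name is a placeholder):
#   probe_color_metadata your_video.mp4
```

Keep in mind this only shows how the stream is tagged, which is not necessarily how the pixel data was actually converted.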

Hi Cory, I ran it through ffprobe and see this:

Stream #0:0[0x1]: Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, bt709, progressive), 3984x5760, 4755 kb/s, 30 fps, 30 tbr, 15360 tbn (default)

@GlennWustlich Do you have the exact ffmpeg command you used to encode this video?

Hi Tim,
This is what I used:

ffmpeg -r 30 -f image2 -start_number 144 -i 180_David_TurnAround03_06_13_01_46_Export_03_06_14_23_46_%06d.png -c:v libx264 -x264-params mvrange=511 -c:a aac -b:a 320k -shortest -colorspace bt709 -color_primaries bt709 -color_trc bt709 -color_range pc -b:v 5M -pix_fmt yuv420p TurnMPEG.mp4

Thanks Glenn,

This looks correct at first glance. Can you also specify the system configuration of the machine you’re encoding on, as well as the ffmpeg version?

Sure thing! That is:
AMD Ryzen 9 5950X 16-Core,
64 GB RAM
RTX 3090

ffmpeg version:

Let me know if you need more info. Thanks!

Hey @GlennWustlich,

Thanks for the additional info. I’ve taken a closer look at the command you’re running, and I was able to reproduce the issue on my end. The command you’ve provided does not specify the color space and color range settings within the video filter configuration. While it is setting the metadata appropriately, the underlying video data is not correct. Please add the following option to your ffmpeg command to configure the video scaler appropriately:

-vf "scale=out_color_matrix=bt709:out_range=full"
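
For reference, here is roughly how your command would look with that option added. Everything else is copied unchanged from what you posted; the function wrapper and its name `encode_full_range_bt709` are only there for illustration, and the input pattern and output name are passed in as arguments.

```shell
# Hypothetical wrapper around the command from this thread, with the
# scale filter added so the RGB-to-YUV conversion itself uses the BT.709
# matrix and full range, rather than only tagging the output that way.
#   $1 = input image sequence pattern (e.g. name_%06d.png)
#   $2 = output file
encode_full_range_bt709() {
  ffmpeg -r 30 -f image2 -start_number 144 \
    -i "$1" \
    -vf "scale=out_color_matrix=bt709:out_range=full" \
    -c:v libx264 -x264-params mvrange=511 \
    -c:a aac -b:a 320k -shortest \
    -colorspace bt709 -color_primaries bt709 -color_trc bt709 -color_range pc \
    -b:v 5M -pix_fmt yuv420p \
    "$2"
}

# Usage:
#   encode_full_range_bt709 '180_David_TurnAround03_06_13_01_46_Export_03_06_14_23_46_%06d.png' TurnMPEG.mp4
```
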

Let me know if this works for you.