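VFR handling

AviSynth, variable frame rate (VFR) video and hybrid video

Video can be divided into two types according to frame rate: constant frame rate (CFR) video and variable frame rate (VFR) video. In CFR video every frame has the same duration; in VFR video the frame durations vary. Many video editing programs (VirtualDub and AviSynth, for example) assume that video is CFR, because AVI does not support VFR. For a number of reasons this will not change in the near future. Although the AVI container does not support VFR, several other containers (MKV, MP4 and WMV/ASF, for example) do.

Hybrid video is usually defined as video that mixes pulldown and non-pulldown material (the pulldown can be field-based, standard 3:2 pulldown, or full-frame). It does not matter whether the pulldown is hard (the repeated fields/frames are physically encoded in the stream) or soft (flags in the stream tell the player which fields/frames to repeat during playback), so hybrid video can be either CFR or VFR. Put simply, hybrid video is video that mixes different base framerates (such as the 8, 12 and 16 fps common in animation), where the base framerate is the rate before pulldown. For this kind of video the final frame rate has to be decided according to the content.

Variable frame rate and hybrid video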

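It is important to realize that video is normally CFR. Hybrid video converted to VFR, however, is a very common example of VFR content. Hybrid video is video that mixes different base framerates (such as the 8, 12 and 16 fps common in animation). A more common example is video that is partly interlaced/progressive NTSC (29.97 fps) and partly FILM (23.976 fps telecined to 29.97 fps). With soft pulldown, the NTSC part (also called the video part) plays back at 29.97 fps, and the telecined part is brought from 23.976 fps up to 29.97 fps by repeating fields during playback. With hard pulldown, no fields are added during playback and everything plays back at 29.97 fps. Other examples of hybrid video are modern TV anime, science-fiction series such as Stargate SG-1, Star Trek: TNG and Babylon 5, and many documentary DVDs.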
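To make the telecine step concrete, here is a minimal Python sketch of standard 3:2 pulldown. It is only an illustration: the function name `telecine_32` and the letter labels for frames are made up for this example, and field parity (top/bottom order) is ignored.

```python
def telecine_32(film_frames):
    """Apply 3:2 pulldown: every 4 film frames become 5 video frames.

    Film frames alternately contribute 2 and 3 fields; consecutive field
    pairs are then woven into video frames, written as (first, second) field.
    """
    fields = []
    for i, frame in enumerate(film_frames):
        fields.extend([frame] * (2 if i % 2 == 0 else 3))
    # Pair consecutive fields; a trailing unpaired field is dropped.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


video = telecine_32(["A", "B", "C", "D"])
print(video)
# [('A', 'A'), ('B', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'D')]
```

Four film frames turn into five video frames, which is the 23.976 to 29.97 fps ratio. The two frames built from different film frames (B/C and C/D) are the ones an IVTC filter has to undo.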

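The TIVTC package is designed to process hybrid video losslessly, whereas Decomb converts the video to CFR by blending.

How to recognize VFR video (mkv/mp4)

Here are some ways to determine whether an mkv/mp4 file is VFR:

mpeg-2: DGIndex reports the Film/Video percentage, which tells you how much of the content is soft pulldown. It cannot detect hard pulldown, and it cannot identify the content accurately when soft and hard pulldown are mixed.

mkv: use mkv2vfr or mkvtoolnix to extract a timecodes.txt file.

mp4: use mp4dump (from the MPEG4 tools in the MPEG4ip package). Open a command prompt and type (using relative paths):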

mp4dump -verbose=2 holly_xvid.mp4 > log.txt

Open the log file and you will see something like the following (look at the stts section to find the duration of each frame):

type stts
     version = 0 (0x00)
     flags = 0 (0x000000)
     entryCount = 41 (0x00000029)
      sampleCount = 3 (0x00000003)
      sampleDelta = 1000 (0x000003e8)
      sampleCount[1] = 1 (0x00000001)
      sampleDelta[1] = 2000 (0x000007d0)
      sampleCount[2] = 3 (0x00000003)
      sampleDelta[2] = 1000 (0x000003e8)
      sampleCount[3] = 1 (0x00000001)
      sampleDelta[3] = 2000 (0x000007d0)
      etc ...

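sampleDelta gives the duration of the frames and sampleCount gives the number of frames with that duration. From the example above we get: 3 frames with a duration of 1000; 1 frame with a duration of 2000; 3 frames with a duration of 1000; 1 frame with a duration of 2000; and so on. The unit of the duration is not seconds but "ticks"; you can convert ticks to time using the "timescale". The timescale is stored in the video track (make sure you are looking at the right track, because every track has its own timescale). Look for output like this: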

   type mdia
   type mdhd
   ...
   timeScale = 24976 (0x00006190)
   duration = 208000 (0x00032c80)
   language = 21956 (0x55c4)
   reserved = <2 bytes> 00 00

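In this example the timescale is 24976. Most frames have a duration of 1000 ticks: 1000/24976 ≈ 0.04, so each of the first three frames is displayed for about 0.04 seconds, which corresponds to roughly 25 fps (1/25 = 0.04; the exact rate is 24976/1000 = 24.976 fps). The next frame has a duration of 2000 ticks: 2000/24976 ≈ 0.08, so it is displayed for about 0.08 seconds, which corresponds to roughly 12.5 fps (1/12.5 = 0.08), and so on. The log above therefore shows that this video is hybrid video.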
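The same arithmetic can be done for every stts run at once. The following Python sketch is only a convenience for checking a log by hand; the function name `stts_runs_to_fps` is made up for this example and the numbers are copied from the mp4dump output above.

```python
from fractions import Fraction


def stts_runs_to_fps(entries, timescale):
    """Convert (sampleCount, sampleDelta) runs into (count, seconds, fps) tuples."""
    result = []
    for count, delta in entries:
        duration = Fraction(delta, timescale)  # seconds per frame
        result.append((count, float(duration), float(1 / duration)))
    return result


# Values copied from the stts and mdhd sections of the log.
entries = [(3, 1000), (1, 2000), (3, 1000), (1, 2000)]
timescale = 24976

rows = stts_runs_to_fps(entries, timescale)
for count, seconds, fps in rows:
    print(f"{count} frame(s): {seconds:.5f} s each = {fps:.3f} fps")

# More than one distinct sampleDelta means the frame durations vary.
is_vfr = len({delta for _, delta in entries}) > 1
print("variable frame rate:", is_vfr)
```

If every run had the same sampleDelta, `is_vfr` would be False and the track would be plain CFR.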

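Opening MPEG-2 hybrid video in AviSynth and re-encoding

Assuming you have hybrid video, there are several ways to encode it. The first is to convert it to cfr video (23.976 fps or 29.97 fps). The second is to encode it to a 120 fps avi with dropped frames (the duplicate frames are stored as null frames, which are dropped on playback). The third is to create true vfr using the mkv or mp4 container.

encoding to cfr (23.976 fps or 29.970 fps)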

If we choose the video rate, the video sequences will be OK, but the FILM sequences will not be decimated and will appear slightly jumpy (due to the duplicated frames). On the other hand, if we choose the FILM rate, the FILM sequences will be OK, but the video sequences will be decimated and will appear jumpy (due to the "missing" frames). Additionally, when encoding to 29.97 fps, you will get lower quality for the same file size, because of the 25% greater number of frames. It's a tough decision which to choose. If the clip is mostly FILM you might choose 23.976 fps, and if the clip is mostly video you might choose 29.97 fps. The source is also a factor. If the majority of the video portions are fairly static "talking heads", for example, you might be able to decimate them to 23.976 fps without any obvious stutter on playback.

When you create your d2v project file you will see whether the clip is mostly video (NTSC) or FILM (in the information box). However, many of these hybrids are encoded entirely as NTSC, with the film portions being "hard telecined" (the already telecined extra fields having also been encoded) so you'll have to examine the source carefully to determine what you have, and how you wish to treat it.

The AviSynth plugins Decomb and TIVTC provide two special decimation modes to better handle hybrid clips by blending. This will eat bitrate quickly, but it appears very smooth. Here is a typical script to enable this mode of operation:

Telecide(order=0, guide=1)
Decimate(mode=X) # tweak "threshold" for film/video detection

or

TFM(mode=1)
TDecimate(mode=0,hybrid=X) # tweak "vidThresh" for film/video detection

There are 2 factors that enable Decimate to treat the film and nonfilm portions appropriately. First, when Telecide declares guide=1, it is able to pass information to Decimate about which frames are derived from film and which from video. For this mechanism to work, Decimate must immediately follow Telecide. Clearly, the better job you do with pattern locking in Telecide (by tweaking parameters as required), the better job Decimate can do.

The second factor is the threshold. If a cycle of frames is seen that does not have a duplicate, then the cycle is treated as video. The threshold determines what percentage of frame difference is considered to be a duplicate. Note that threshold=0 disables the second factor.
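As a rough illustration of the second factor, the Python sketch below labels one cycle of frames as film or video. It is not Decomb's code: the function name `classify_cycle`, the per-frame difference metric, and the choice to return None when threshold is 0 are all assumptions made for this example.

```python
def classify_cycle(diffs, threshold=1.0):
    """Label one cycle of frames as 'film' or 'video'.

    diffs holds, for each frame in the cycle, a hypothetical percentage
    difference from the previous frame. A frame whose difference is below
    the threshold counts as a duplicate; a cycle without any duplicate is
    treated as video. threshold=0 disables the test (returns None).
    """
    if threshold == 0:
        return None  # no decision from this factor
    return "film" if min(diffs) < threshold else "video"


print(classify_cycle([12.0, 0.3, 9.5, 11.1, 10.2]))  # one near-duplicate
print(classify_cycle([12.0, 8.4, 9.5, 11.1, 10.2]))  # no duplicate
```

Raising the threshold makes more cycles count as film; lowering it makes more cycles count as video.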

Make sure to get the field order correct - DVDs are generally order=1, and captured video is generally order=0. The included DecombTutorial.html explains how to determine the field order.

Mostly Film Clips (mode=3)

When the clip is mostly film, we want to decimate the film portions normally so they will be smooth. For the nonfilm portions, we want to reduce their frame rate by blend decimating each cycle of frames from 5 frames to 4 frames. Video sequences so rendered appear smoother than when they are decimated as film. Set Decimate to mode=3, or TDecimate to hybrid=1 for this behavior.

Another IVTC was developed specifically to handle hybrid material without blended frames: SmartDecimate. While you do get "clean" frames as a result, it also may play with slightly more stutter than does Decomb's result. A typical script might go:

B = TDeint(mode=1) # or KernelBob(order=1)
SmartDecimate(24, 60, B)

In order to keep the result as smooth playing as possible, it will insert the "Smart Bobbed" frames from time to time.

Mostly Video Clips (mode=1)

When the clip is mostly video, we want to avoid decimating the video portions in order to keep playback as smooth as possible. For the film portions, we want to leave them at the video rate but change the duplicated frames into frame blends so it is not so obvious. Set Decimate to mode=1, or TDecimate to hybrid=3 for this behavior.

In this case you may also consider leaving it interlaced and encoding as such, especially if you'll be watching on a TV later.

encoding to cfr - 120 fps

For this you'll need TIVTC and avi_tc. Start by creating a decimated avi with timecodes.txt, but skip the muxing. Then open tc-gui's tc2cfr tab and add your files or use this command line:

tc2cfr 120000/1001 c:\video.avi c:\timecodes.txt c:\video-120.avi

Then mux with your audio. This works because tc2cfr fills the extra space between the real frames with drop frames, creating a smooth 120 fps avi.
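The reason 120000/1001 fps works as the container rate is that both source rates divide it evenly. The Python sketch below computes how many 120 fps slots each source frame occupies (one real frame plus the drop frames that follow it). The function name `slots_per_frame` is made up for this example; it illustrates the idea only and is not how tc2cfr is implemented.

```python
from fractions import Fraction

TARGET = Fraction(120000, 1001)  # 119.88 fps container rate


def slots_per_frame(source_fps, target_fps=TARGET):
    """Number of target-rate slots taken by one source frame."""
    ratio = target_fps / source_fps
    if ratio.denominator != 1:
        raise ValueError("source rate does not divide the target rate evenly")
    return ratio.numerator


film = slots_per_frame(Fraction(24000, 1001))   # 5: 1 real frame + 4 drop frames
video = slots_per_frame(Fraction(30000, 1001))  # 4: 1 real frame + 3 drop frames
print(film, video)
```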

encoding to vfr (mkv)

First download mkvtoolnix. We will use this to mux our video into the MKV container WITH a timecode adjustment file. Make sure that you have the latest version (1.6.0 as of this writing), as older ones read timecodes incorrectly.

There are several AviSynth plugins that you can use to generate the VFR video and the required timecode file. An example is given below using the Decomb521VFR plugin. An alternative is the TDecimate plugin contained in the TIVTC package. See their respective documentation to learn more about tweaking them.

The DeDup plugin removes duplicate frames but does not change the framerate (leaving jerky video if not decimated first), so it won't be included. It can still be used after either method by using their timecodes as input to DeDup.

Decomb521VFR:

Add this to your script:

Decomb521VFR_Telecide(order=1, guide=0)
Decomb521VFR_Decimate(mode=4, threshold=1.0, progress=true, \
             timecodes="timecodes.txt", vfrstats="stats.txt")

Open this script in VirtualDub; it will create the timecodes and stats files. Then encode. It will seem to freeze at first, because it examines every frame on the first load.

TIVTC:

This is a 2-pass mode. Add this to your script:

TFM(mode=1, output="tfm.txt")
TDecimate(mode=4, output="stats.txt")

Open this and play through it in VirtualDub. Then close it, comment those lines out (or start a second script) and add:

TFM(mode=1)
TDecimate(mode=5, hybrid=2, dupthresh=1.0, input="stats.txt", \
          tfmin="tfm.txt", mkvout="timecodes.txt")

Load and encode.
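The timecode file is plain text, so it is easy to check before muxing. The Python sketch below assumes the file uses the Matroska v1 timecode layout (an "Assume" line with the default rate, then "first,last,fps" ranges with inclusive frame numbers); whether your plugin version writes exactly this layout is an assumption, and the sample contents and function names are made up for this example.

```python
SAMPLE = """# timecode format v1
Assume 29.970030
0,3,23.976024
"""


def parse_timecodes_v1(text):
    """Return (default_fps, [(first_frame, last_frame, fps), ...])."""
    default_fps = None
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.lower().startswith("assume"):
            default_fps = float(line.split()[1])
        else:
            first, last, fps = line.split(",")
            ranges.append((int(first), int(last), float(fps)))
    return default_fps, ranges


def frame_rate_at(frame, default_fps, ranges):
    """Rate of one frame: its range's rate, else the default rate."""
    for first, last, fps in ranges:
        if first <= frame <= last:
            return fps
    return default_fps


def total_duration(frame_count, default_fps, ranges):
    """Playing time in seconds of the first frame_count frames."""
    return sum(1.0 / frame_rate_at(n, default_fps, ranges) for n in range(frame_count))


default_fps, ranges = parse_timecodes_v1(SAMPLE)
print(total_duration(8, default_fps, ranges))  # 4 film frames + 4 video frames
```

Comparing the computed duration with the length of the audio is a quick sanity check before muxing.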

framerate:

If you're encoding to a specific size using a bitrate calculator, vfr decimation will mess up the calculations. To make them work again add these to your script:

Before decimation:

oldcount = framecount # this line must be before decimation
oldfps = framerate

End of script:

averagefps = (float(framecount)*float(oldfps))/float(oldcount)
AssumeFPS(averagefps)
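The formula simply picks the rate at which the decimated clip lasts as long as the original. A small Python check of the arithmetic, with made-up frame counts and a hypothetical function name `average_fps`:

```python
from fractions import Fraction


def average_fps(old_count, old_fps, new_count):
    """Average rate that keeps the clip duration unchanged after decimation."""
    return Fraction(new_count) * Fraction(old_fps) / Fraction(old_count)


# Hypothetical clip: 3000 frames at 29.97 fps, 2400 frames left after decimation.
old_fps = Fraction(30000, 1001)
avg = average_fps(old_count=3000, old_fps=old_fps, new_count=2400)
print(float(avg))

# The duration is the same before and after.
print(float(3000 / old_fps), float(2400 / avg))
```

In this example every cycle was decimated, so the average comes out at exactly 24000/1001 fps; a real hybrid clip will land somewhere between the film and video rates.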

muxing:

Now mux to MKV:

Open mmg.exe (mkvmerge gui)
Add your video stream file
Add your audio stream file
Click on the imported video track
Browse for the "timecodes.txt" timecode file
Click on the audio track
If your audio already needs a delay, set one
Start muxing

To play it you need a Matroska splitter. For AVC you will need Haali's Splitter, but for ASP you can use it or Gabest's Splitter.

encoding to vfr (mp4)

If you create a 120 fps avi with drop frames, the mp4 muxed from it will have them removed, along with any n-vops the encoder creates, leaving vfr. A more laborious way is to encode multiple cfr avi files (some with 23.976 fps film and some with 29.97 fps video) and join them directly into one vfr mp4 file with mp4box and the -cat option.

A third, much easier, method is to encode using the MKV method and then process the video with tc2mp4; more details on tc2mp4 can be found on the Doom9 forums.

summary of the methods

To sum up the advantages and disadvantages of the methods above: when encoding to 23.976 or 29.97 fps the clip will be cfr (which editors like AviSynth and VirtualDub need), but it may look jumpy on playback due to duplicated or missing frames. That can be avoided with blending, but encoders handle blended frames less efficiently. When encoding to 120 fps using drop frames, the clip is cfr, not jumpy on playback, and very compatible with editing tools. Encoding to mkv using true vfr (using timecodes) neither loses nor duplicates frames, but it is not nearly as broadly supported as AVI.

Opening non MPEG-2 hybrid video in AviSynth and re-encoding

It is possible to open vfr video in AviSynth without losing sync by using DirectShowSource. The most common formats that support hybrid video (vfr) are mkv, mp4, wmv, and rmvb, and the methods below work for all of them; however, if the source is mkv, you can also use mkv2vfr and AviSource.

opening non-avi vfr content in AviSynth

The best way to get all frames while keeping sync and timing is to convert to a common framerate, such as 120 fps for 24/30 (or rather 119.88). Always use convertfps=true, which adds frames like ChangeFPS, or your audio will go out of sync.

DirectShowSource("F:\Hybrid\vfr.mp4", fps=119.88, convertfps=true)
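The common framerate is the smallest rate that is a whole multiple of every rate in the source. A short Python sketch of that calculation (the function name `common_rate` is made up for this example):

```python
from fractions import Fraction
from math import gcd


def common_rate(*rates):
    """Smallest rate that is an integer multiple of every given rate."""
    result = Fraction(rates[0])
    for rate in rates[1:]:
        rate = Fraction(rate)
        # For fractions in lowest terms: lcm(a/b, c/d) = lcm(a, c) / gcd(b, d)
        num = result.numerator * rate.numerator // gcd(result.numerator, rate.numerator)
        den = gcd(result.denominator, rate.denominator)
        result = Fraction(num, den)
    return result


rate = common_rate(Fraction(24000, 1001), Fraction(30000, 1001))
print(rate, float(rate))  # 120000/1001, about 119.88
```

With nominal 24 and 30 fps the same function gives 120, which is why the method is usually called "120 fps" even though the real rate is 119.88.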

You can also open it as 30p, which then has to be re-decimated but has fewer frames to deal with, or as 24p, which breaks any 30p sections:

Re-encoding to 23.976 or 29.97 fps:

# fps=29.97 or fps=23.976
DirectShowSource("F:\Hybrid\vfr.mkv", \
                 fps=29.97, convertfps=true)

or

DirectShowSource("F:\Hybrid\vfr_startrek.mkv", \
                 fps=119.88, convertfps=true)
FDecimate(29.97) # or FDecimate(23.976)

Another way is to find out the average framerate (by dividing the total number of frames by the duration in seconds) and use this rate in DirectShowSource. Depending on the duration of a frame, frames will be added or dropped to keep sync, and it's almost guaranteed to stutter. DirectShowSource will not telecine.
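To see why frames get added or dropped, the Python sketch below maps each frame of a constant-rate output to the source frame that is on screen at that moment. It models the general idea only and is not DirectShowSource's actual algorithm; the function name `resample_to_cfr` and the tiny test clip (nominal 24 and 30 fps for easy arithmetic) are made up for this example.

```python
from bisect import bisect_right
from fractions import Fraction


def resample_to_cfr(durations, target_fps):
    """For each constant-rate output frame, return the index of the source frame shown then."""
    starts = []
    t = Fraction(0)
    for d in durations:
        starts.append(t)  # time at which each source frame begins
        t += d
    total = t
    step = 1 / Fraction(target_fps)
    out = []
    k = 0
    while k * step < total:
        out.append(bisect_right(starts, k * step) - 1)
        k += 1
    return out


# Hypothetical clip: 4 frames at 24 fps followed by 5 frames at 30 fps.
durations = [Fraction(1, 24)] * 4 + [Fraction(1, 30)] * 5

average = len(durations) / sum(durations)  # frames divided by seconds
print(average)                             # 27

print(resample_to_cfr(durations, 30))  # [0, 0, 1, 2, 3, 4, 5, 6, 7, 8]
print(resample_to_cfr(durations, 24))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

At 30 fps source frame 0 is shown twice; at 24 fps the last source frame is never shown. Either way the duration, and therefore the audio sync, is preserved, at the cost of stutter.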

re-encoding 120 fps video

The easiest way to convert vfr sources back into vfr in AviSynth is by using DeDup:

1st pass:

DupMC(log="stats.txt")

2nd pass:

DeDup(threshold=.1, maxcopies=4, maxdrops=4, dec=true, \
      log="stats.txt", times="timecodes.txt")

TIVTC can also do this:

1st pass:

TFM(mode=0,pp=0)
TDecimate(mode=4, output="stats.txt")

2nd pass:

TFM(mode=0,pp=0)
TDecimate(mode=6, hybrid=2, input="stats.txt", mkvout="timecodes.txt")

Once you've encoded your file, mux back to mkv or 120 fps avi.

This will chop out all the duplicate frames that DirectShowSource inserts, while keeping framecount and timing nearly identical. Do not use the timecode file from the input video, though; use the new one, since the two may not be identical. (Of course you can play with the parameters if you want to use more of the functionality of DeDup.)

converting vfr to cfr avi for AviSynth

You can avoid analysing and decimating by using special tools to get a minimal constant-rate avi to feed AviSynth. After processing and re-encoding, use tc2cfr or mmg on the output with the original timecodes to regain vfr and full sync. (If you perform any kind of decimation or frame-rate change you'll have to edit the timecode file yourself, although DeDup does have a timesin parameter.)

avi

avi_tc will create a timecode and normal video, if the avi uses drop frames and not n-vops or fully encoded frames. It also requires that no audio or secondary tracks are present. To use it, open tc-gui and add your file, or use the following command line:

cfr2tc c:\video-120.avi c:\video.avi c:\timecodes.txt 1

mkv

mkv2vfr extracts all video frames from Matroska to a normal AVI file and a timecode file. This will only work if the mkv is in vfw-mode. The command-line to use it is:

mkv2vfr.exe input.mkv output.avi timecodes.txt

encoding to MPEG-2 vfr video

http://forum.doom9.org/showthread.php?t=93691

I haven't looked at it yet, so I can't give any comments or hints.

Audio synchronization

Several methods for encoding your video (at 23.976 fps, 29.97 fps or vfr) were discussed above. You might wonder why your audio stays in sync regardless of the method you used to encode your video. Prior to encoding, the video and audio have the same duration, so they start out in sync. The following two situations might occur:

You change the framerate of the stream by speeding it up or slowing it down (as is often done in PAL-FILM conversions). This implies that the duration of the video stream will change, and hence the audio stream will get out of sync.

You change the framerate of the stream by adding or removing frames. This implies that the duration of the video stream will remain the same, and hence the audio stream will stay in sync.

If you encode the video stream at 23.976 or 29.97 fps (both cfr) by using

Decimate(mode=3, threshold=1.0) or 
Decimate(mode=1, threshold=1.0)

frames will be removed or added, and thus your audio stream will be in sync. By a similar reasoning the vfr encoding will be in sync.

Finally, suppose you open vfr video in AviSynth with DirectShowSource. Compare the following

# fps=29.97 or fps=23.976
DirectShowSource("F:\Hybrid\vfr_startrek.mkv", fps=29.97)

and

# fps=29.97 or fps=23.976
DirectShowSource("F:\Hybrid\vfr_startrek.mkv", fps=29.97, convertfps=true)

The former will be out of sync since the 24p sections are sped up, and the latter will be in sync since frames are added to convert it to cfr.
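The size of the drift is easy to estimate. The Python sketch below compares the true duration of a hybrid clip with the duration you get when every frame is simply played at one assumed rate, which is what happens without convertfps=true. The clip layout and function names are made up for this example.

```python
from fractions import Fraction

FILM = Fraction(24000, 1001)
NTSC = Fraction(30000, 1001)


def true_duration(sections):
    """Duration in seconds of a clip given as (frame_count, fps) sections."""
    return sum(Fraction(count) / fps for count, fps in sections)


def duration_without_convertfps(sections, assumed_fps):
    """Every source frame is kept and simply played at assumed_fps."""
    return Fraction(sum(count for count, _ in sections)) / assumed_fps


# Hypothetical clip: 100.1 s of film followed by 100.1 s of video.
sections = [(2400, FILM), (3000, NTSC)]

real = true_duration(sections)
played = duration_without_convertfps(sections, NTSC)
print(float(real), float(played), float(real - played))
```

In this example the video ends about 20 seconds before the audio does. With convertfps=true, frames are duplicated so that the video keeps its true duration.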

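References

Original source: AviSynth VFR.
Essential reading: Force Film, IVTC, and Deinterlacing and more (written by the good people at Doom9).
How to create 120 fps video.
Documentation for Decomb521VFR.
About the Decomb521VFR1.0 mod for automated Matroska VFR.
Mkvextract GUI by DarkDudae.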

Besides all people who contributed to the tools mentioned in this guide, the author of this tutorial (Wilbert) would like to thank bond, manono, tritical and foxyshadis for their useful suggestions and corrections of this tutorial.

Chinese translation: btcdtc