Tip: read/write the exact count of samples with libvorbis

Post by fsoft » Fri May 23, 2014 11:15 am

I use FFmpeg version 2.2:
Code:
libavutil      52. 66.100 / 52. 66.100
libavcodec     55. 52.102 / 55. 52.102
libavformat    55. 33.100 / 55. 33.100
libavdevice    55. 10.100 / 55. 10.100
libavfilter     4.  2.100 /  4.  2.100
libswscale      2.  5.102 /  2.  5.102
libswresample   0. 18.100 /  0. 18.100
libpostproc    52.  3.100 / 52.  3.100

My libvorbis is version 1.3.4 and my libogg is version 1.3.1.

I create audio files as Ogg Vorbis. Strangely, the files never contain the exact number of samples; the count always seems to be a multiple of 64 samples. For example, I created a file with exactly 44100 samples of a sine signal. The result was a file with 44160 samples. The additional 60 samples were silence.

The first cause is the encoder definition. Its capabilities are set to CODEC_CAP_DELAY. In that case (when neither CODEC_CAP_SMALL_LAST_FRAME nor CODEC_CAP_VARIABLE_FRAME_SIZE is set), libav always uses a constant frame size. When the last frame has fewer samples than that, libav pads the frame with silence.
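The padding rule can be sketched in a few lines of plain C. This is only an illustration and not libav's actual code: the SKETCH_CAP_* constants are hypothetical stand-ins for the real capability flags, and the frame size of 64 is assumed from the multiple-of-64 observation above.

```c
/* Hypothetical stand-ins for the libavcodec capability flags named in the
   post; the numeric values are chosen for this sketch only. */
enum {
    SKETCH_CAP_DELAY               = 1 << 0,
    SKETCH_CAP_SMALL_LAST_FRAME    = 1 << 1,
    SKETCH_CAP_VARIABLE_FRAME_SIZE = 1 << 2
};

/* Total number of samples handed to the encoder for nb_samples of input. */
static long samples_after_padding(long nb_samples, int frame_size, int caps)
{
    long remainder;

    /* With either flag set, a short last frame is accepted as-is. */
    if (caps & (SKETCH_CAP_SMALL_LAST_FRAME | SKETCH_CAP_VARIABLE_FRAME_SIZE))
        return nb_samples;

    remainder = nb_samples % frame_size;
    if (remainder == 0)
        return nb_samples;      /* already a whole number of frames */

    /* Otherwise the last frame is filled up with silence. */
    return nb_samples + (frame_size - remainder);
}
```

With only SKETCH_CAP_DELAY set, samples_after_padding(44100, 64, SKETCH_CAP_DELAY) gives 44160, matching the 60 samples of silence seen in the file; with SKETCH_CAP_SMALL_LAST_FRAME added it stays at 44100.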

The solution is to also set CODEC_CAP_SMALL_LAST_FRAME; libvorbis really can work with a variable frame size. But then we hit the next problem: libvorbis generates data for additional samples. In my sine-wave example it looks like a damped oscillation.

We solve that with the Ogg format. Let's have a look at the Ogg documentation. Every Ogg page has 8 bytes at offset 0x6, called the 'absolute granule position':
... Note that the 'position' data specifies a 'sample' number (eg, in a CD quality sample is four octets, 16 bits for left and 16 bits for right; in video it would likely be the frame number. It is up to the specific codec in use to define the semantic meaning of the granule position value). The position specified is the total samples encoded after including all packets finished on this page (packets begun on this page but continuing on to the next page do not count). The rationale here is that the position specified in the frame header of the last page tells how long the data coded by the bitstream is. A truncated stream will still return the proper number of samples that can be decoded fully.
A special value of '-1' (in two's complement) indicates that no packets finish on this page.
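To check what a file actually contains at that position, the field can be read directly from the page header. The following is a minimal sketch, assuming the buffer starts at a page boundary (the "OggS" capture pattern) and that the field is stored little-endian, as in the Ogg framing specification; the function and macro names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

#define OGG_GRANULE_OFFSET 6u    /* byte offset 0x6, as given in the post */
#define OGG_GRANULE_NONE   (-1)  /* "no packets finish on this page" */

/* Reads the 64-bit little-endian granule position from the start of an Ogg
   page. Returns 0 on success, -1 if the buffer is too short or does not
   begin with the "OggS" capture pattern. */
static int ogg_read_granule(const uint8_t *page, size_t len, int64_t *granule)
{
    uint64_t v = 0;
    int i;

    if (len < OGG_GRANULE_OFFSET + 8u || memcmp(page, "OggS", 4) != 0)
        return -1;

    /* Assemble the value from the most significant byte (last) downwards. */
    for (i = 7; i >= 0; i--)
        v = (v << 8) | page[OGG_GRANULE_OFFSET + (unsigned)i];

    /* Copy the bit pattern so that all-ones becomes -1. */
    memcpy(granule, &v, sizeof v);
    return 0;
}
```

For a Vorbis stream the value read from the last page would then be expected to equal the total number of samples, 44100 in the example above.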

Unfortunately, libav stores the pts, a time value (in microseconds), at this position. My workaround for this behavior is to recalculate the sample number from the pts and set the pts and the duration in the AVPacket to it.
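The conversion itself is simple arithmetic: samples = pts * sample_rate / 1000000. Here is a self-contained sketch of it without any libav calls. The names are hypothetical, and rounding to the nearest sample is an assumption made only for this example.

```c
#include <stdint.h>

#define SKETCH_TIME_BASE 1000000  /* microseconds per second, like AV_TIME_BASE */

/* Converts a non-negative timestamp in microseconds to a sample number,
   rounding to the nearest sample (the rounding mode is an assumption).
   The product pts_us * sample_rate must fit into 64 bits. */
static int64_t us_to_samples(int64_t pts_us, int sample_rate)
{
    return (pts_us * sample_rate + SKETCH_TIME_BASE / 2) / SKETCH_TIME_BASE;
}
```

For example, a pts of 1000000 microseconds at 44100 Hz maps to sample number 44100.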

After creating the codec we change the capabilities:

Code:
   if (codec.id == AVCodecID.AV_CODEC_ID_VORBIS)
      codec.capabilities |= CODEC_CAP_SMALL_LAST_FRAME;

For every encoded AVPacket we recalculate the sample number:

Code:
   if (CodecContext.codec_id == AVCodecID.AV_CODEC_ID_VORBIS) {
      if (avPacket.pts != AV_NOPTS_VALUE) {
         // convert the pts from AV_TIME_BASE units (microseconds) to a sample number
         avPacket.dts =
         avPacket.pts = av_rescale_rnd(avPacket.pts, CodecContext.sample_rate, AV_TIME_BASE, ?);
      }
   }

That's all.

By the way, I found the same or a similar problem in the decoder: it delivered too many samples. There the solution is easy. We know the sample count of the full track, so we can simply discard any samples beyond it.
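Discarding the surplus on the decoder side could look roughly like this. It is a sketch with hypothetical names, assuming the caller keeps a running count of samples already delivered and knows the track's total sample count, however that was obtained.

```c
#include <stdint.h>

/* Returns how many of the frame_samples just decoded should be kept, given
   that `delivered` samples were already output and the whole track holds
   total_samples. */
static int samples_to_keep(int64_t total_samples, int64_t delivered,
                           int frame_samples)
{
    int64_t remaining = total_samples - delivered;

    if (remaining <= 0)
        return 0;                   /* track is complete, drop the frame */
    if (remaining < frame_samples)
        return (int)remaining;      /* keep only the start of this frame */
    return frame_samples;           /* frame lies fully inside the track */
}
```

With a track of 44100 samples, a 64-sample frame arriving after 44096 delivered samples would be cut down to 4 samples.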

Of course, it would be nice if libav's behavior for libvorbis were changed in the future.