
Introduction to Voice Generation Time (VGT)

Voice generation time (VGT) is the total length of the speech generated across all sub-blocks. You can think of it as your Studio credits.

VGT is consumed whenever you render a newly created sub-block, or re-render an existing sub-block after modifying its text, using any of the three generate (play button) options available in the Studio (block, sub-block, or project level).

However, changing the voice actor, style, pitch, speed, pauses, emphasis, pronunciation, punctuation, or volume of already generated speech, without changing the text, does not consume any voice generation time.

The voice generation time consumed to convert text to speech equals the length of the audio generated for that text. When text is added or modified in a sub-block, speech is generated again for the entire sub-block.
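
The rule above can be expressed as a small check. This is only an illustrative sketch, not part of the Studio: the function name `consumes_vgt` and the field names (such as "text" and "voice_actor") are hypothetical labels chosen for this example.

```python
# Settings that, per this article, can be changed without consuming VGT
# as long as the text stays the same. The identifiers are hypothetical.
FREE_SETTINGS = {
    "voice_actor", "style", "pitch", "speed", "pause",
    "emphasis", "pronunciation", "punctuation", "volume",
}


def consumes_vgt(changed_fields, is_new_sub_block=False):
    """Return True if rendering after these changes would consume VGT."""
    if is_new_sub_block:
        # A newly created sub-block always consumes VGT when rendered.
        return True
    # For an existing sub-block, only a text change consumes VGT.
    return "text" in changed_fields


print(consumes_vgt({"pitch", "speed"}))            # False
print(consumes_vgt({"text", "volume"}))            # True
print(consumes_vgt(set(), is_new_sub_block=True))  # True
```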

Example:
Suppose you have 10 minutes of VGT in your account and enter a 500-word script into the Studio. If the generated audio is 4 minutes 34 seconds long, then 4 minutes 34 seconds are deducted from your total VGT. If you then go back and edit any of the text in the script (adding or deleting content) and render again, only the audio duration of the edited sub-block is deducted.
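
The arithmetic in this example can be worked through as follows. This is a minimal sketch: the helper names are hypothetical, and the 20-second length of the edited sub-block is an assumed figure that does not come from the example.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '4:34' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds):
    """Format a number of seconds as 'minutes:seconds'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


balance = to_seconds("10:00")

# First render: the whole script produces 4:34 of audio.
balance -= to_seconds("4:34")
print(to_mmss(balance))  # 5:26

# Later edit: only the edited sub-block is re-rendered.
# Assumption for illustration: that sub-block produces 0:20 of audio.
balance -= to_seconds("0:20")
print(to_mmss(balance))  # 5:06
```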

Estimating Voice Generation Requirement
A 1,000-word English script consumes approximately 6 minutes of VGT, not counting re-renders after text changes. (This is an approximate value and will vary based on the script, the voice selected, and the speed of the voiceover.) Based on this, you can estimate how much voice generation time your project might need and choose the right plan.
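
A rough estimate based on this figure might look like the sketch below. The function name and the `revision_share` parameter (the fraction of the script you expect to re-render after text changes) are assumptions made for illustration, and the result is only as accurate as the approximate 6-minute rate.

```python
# Approximate rate quoted in this article; actual usage varies with the
# script, the voice selected, and the speed of the voiceover.
MINUTES_PER_1000_WORDS = 6.0


def estimate_vgt_minutes(word_count, revision_share=0.0):
    """Estimate VGT in minutes for a script of `word_count` words.

    `revision_share` is a hypothetical allowance for re-renders,
    e.g. 0.2 if you expect to re-render about 20% of the script.
    """
    base = word_count * MINUTES_PER_1000_WORDS / 1000
    return base * (1 + revision_share)


# A 2,500-word script with no text changes after the first render.
print(round(estimate_vgt_minutes(2500), 1))

# The same script with an assumed 20% allowance for re-renders.
print(round(estimate_vgt_minutes(2500, revision_share=0.2), 1))
```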

How to track your VGT

You can easily keep track of your Voice Generation Time within any project you’re working on.

The voice generation time tracker at the top of your project page will display the total number of minutes used out of your workspace’s allotted time.

When you hover over the tracker, it will display the following additional information:

  • Voice Generation Time Consumption | Speech Used
  • Transcription Minutes Consumption | Transcription Used
  • Project Size
  • Total Workspace Storage Consumption | Total Space Used

Using Your Voice Generation Time (VGT) Efficiently

  1. Upload Large Scripts in Smaller Parts

    Generating a voiceover of 1,000 words takes approximately 6 minutes of VGT. Uploading large scripts in smaller chunks helps you track VGT usage more accurately and prevents your entire quota from being consumed at once.

  2. Split Your Script into Blocks and Sub-blocks

    Splitting your script into blocks and further into sub-blocks will minimize your VGT consumption while making text changes. 
    For example, if you paste a 1,000-word script into a single block and change a single word, voice generation time is consumed for the entire block (about 6 minutes) because it must be re-rendered completely. However, if you split the block into sub-blocks, voice generation time is consumed only for the sub-block where the text was changed.
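
The saving from splitting can be illustrated with a short comparison. This is a sketch under stated assumptions: it uses the approximate rate of 6 minutes per 1,000 words, and the even split into ten 100-word sub-blocks and the function names are hypothetical.

```python
# Approximate rate quoted in this article.
MINUTES_PER_1000_WORDS = 6.0


def render_cost_minutes(word_count):
    """Approximate VGT needed to render `word_count` words."""
    return word_count * MINUTES_PER_1000_WORDS / 1000


def edit_cost_minutes(sub_block_word_counts, edited_index):
    """VGT consumed when the text of one sub-block is edited.

    Only the edited sub-block is re-rendered, so only its length counts.
    """
    return render_cost_minutes(sub_block_word_counts[edited_index])


single_block = [1000]        # whole script in one unsplit block
split_script = [100] * 10    # assumed split into ten 100-word sub-blocks

print(edit_cost_minutes(single_block, 0))  # 6.0
print(edit_cost_minutes(split_script, 3))  # 0.6
```

In this sketch, changing one word costs about 6 minutes in the unsplit script but about 0.6 minutes once the script is split, because only the edited sub-block is rendered again.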