Voice Sync channel node

See also: Phoneme, Voice Split

The Voice Sync CHOP detects phonemes in an audio channel given some audio phoneme samples and produces lip control channels.

To set up a lip-sync system based on phoneme detection, you need to model a mouth shape for each viseme (a group of visually similar phonemes) and create a “phoneme library” of sample phonemes.

Once these steps are complete, the Voice Sync CHOP matches the phoneme library to the audio voice track in input 1. The Voice Sync CHOP outputs a channel for each phoneme (or viseme). Each channel contains the occurrences of the phoneme in the voice track (as 1 when detected, 0 otherwise) over its full length.

Parameters

Voice Sync

Quality

Allows you to trade off between performance and accuracy.

Silence Level

Voice Detection will only be performed on the input audio if its average volume for a given segment is higher than this value. It should be adjusted so that background noise is not interpreted as voice.
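The gating described above can be sketched as follows. This is only an illustration, assuming that "average volume" means the mean absolute amplitude of a segment; the measure the CHOP actually uses is not documented here, and segment_is_voiced is a hypothetical name.

```python
def segment_is_voiced(samples, silence_level):
    """Return True if the segment's average volume exceeds silence_level.

    Assumption: average volume = mean absolute amplitude of the segment.
    """
    if not samples:
        return False
    average_volume = sum(abs(s) for s in samples) / len(samples)
    return average_volume > silence_level


background_noise = [0.01, -0.02, 0.015, -0.01]
speech = [0.4, -0.5, 0.45, -0.35]

print(segment_is_voiced(background_noise, 0.05))  # False
print(segment_is_voiced(speech, 0.05))            # True
```

Raising the threshold until background noise alone returns False is the adjustment the parameter description refers to.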

Silence Chan Name

Creates a channel which acts as a 'silence phoneme'.

Vocal Range (Hz)

The approximate vocal range of the voice actor. The defaults are for an average male speaker. This parameter is only used by Voiced Phonemes.

Max Pitch Shift

The maximum number of octaves that a phoneme in the voice track can differ in pitch from the sample phoneme.
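Since one octave corresponds to a doubling of frequency, a pitch difference in octaves is the base-2 logarithm of the frequency ratio. The check below is a sketch under that definition only; within_pitch_shift is a hypothetical helper, and how the CHOP estimates pitch internally is not covered here.

```python
import math


def within_pitch_shift(track_hz, sample_hz, max_octaves):
    """Return True if the two pitches differ by at most max_octaves octaves."""
    octaves_apart = abs(math.log2(track_hz / sample_hz))
    return octaves_apart <= max_octaves


print(within_pitch_shift(220.0, 110.0, 1.0))  # True: exactly one octave up
print(within_pitch_shift(330.0, 110.0, 1.0))  # False: about 1.58 octaves up
```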

Noise Filter (Hz)

Controls the size of the noise filter used to smooth out non-voiced phonemes.

Setup Phoneme Library

This button builds the phoneme library from the second and third inputs. It is only used in time slice mode. Before starting real-time syncing, initialize the phoneme library by clicking this button.

Output

Convert Scores into On/Off States

Converts the score for each viseme into a binary on/off state channel rather than outputting the raw scores.

Minimum Score

The viseme with the highest score is chosen to be the match. However, if its score is not above the minimum score, then no visemes will be chosen. This parameter helps to eliminate poor matches.
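The selection rule can be sketched as below: pick the highest-scoring viseme, and switch it on only if its score is above the minimum. The function name and the dictionary layout are assumptions made for illustration; they are not part of the CHOP.

```python
def scores_to_states(scores, minimum_score):
    """Map {viseme: score} to {viseme: 0 or 1}, switching on at most one viseme."""
    states = {name: 0 for name in scores}
    if not scores:
        return states
    best = max(scores, key=scores.get)
    # The best match counts only if it is strictly above the minimum score.
    if scores[best] > minimum_score:
        states[best] = 1
    return states


print(scores_to_states({"aa": 0.8, "oo": 0.3, "mm": 0.1}, 0.5))
# {'aa': 1, 'oo': 0, 'mm': 0}
print(scores_to_states({"aa": 0.4, "oo": 0.3, "mm": 0.1}, 0.5))
# {'aa': 0, 'oo': 0, 'mm': 0}
```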

Voiced Bias

Voiced and Non-Voiced phonemes are detected using different methods. If you find that either the Voiced or Non-Voiced phonemes are dominating the output, you can shift the bias to balance them. Zero bias doesn’t favor either method, +1 completely favors Voiced phonemes and -1 completely favors Non-Voiced phonemes. Normally values should remain close to zero.

Sample Rate

The sample rate of the output channels. This also partially determines how many matches are done on the input audio.

Super Sample

How many comparisons are done per output sample. A higher value increases matching accuracy but significantly slows cooking.
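As a rough cost estimate, the number of comparisons grows with the product of Sample Rate and Super Sample. The arithmetic below is a sketch that assumes every output sample is matched; it ignores the Quality setting and any segments skipped as silence, and both function names are hypothetical.

```python
def comparisons_per_second(sample_rate, super_sample):
    """Comparisons per second of audio: output samples/second times comparisons/sample."""
    return sample_rate * super_sample


def total_comparisons(duration_seconds, sample_rate, super_sample):
    """Total comparisons for a voice track of the given length."""
    output_samples = round(duration_seconds * sample_rate)
    return output_samples * super_sample


print(total_comparisons(10, 30, 1))  # 300
print(total_comparisons(10, 30, 4))  # 1200
```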

Common

Some of these parameters may not be available on all CHOP nodes.

Scope

To determine which channels get affected, some CHOPs have a scope string. Patterns can be used in the scope, for example * (match all), and ? (match single character).

The following are examples of possible channel name matching options:

chan2

Matches a single channel name.

chan3 tx ty tz

Matches four channel names, separated by spaces.

chan*

Matches each channel that starts with chan.

*foot*

Matches each channel that has foot in it.

t?

The ? matches a single character. t? matches two-character channels starting with t.

r[xyz]

Matches channels rx, ry and rz.

blend[3-7:2]

Matches number ranges giving blend3, blend5, and blend7.

blend[2-3,5,13]

Matches channels blend2, blend3, blend5, blend13.

t[xyz]

[xyz] matches any one of the three characters, giving channels tx, ty and tz.
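The pattern rules listed above can be approximated by translating each scope pattern into a regular expression. This is a sketch for experimenting with the examples, not Houdini's own matcher: match_scope and its helpers are hypothetical, and it assumes a bracket containing only digits, commas, hyphens, and colons is a number range while any other bracket is a character set.

```python
import re


def _bracket_to_regex(body):
    """Translate the text inside [...] into a regex fragment."""
    if re.fullmatch(r"[\d,\-:]+", body):
        # Number ranges such as 3-7:2 (start-end:step) or 2-3,5,13.
        numbers = []
        for part in body.split(","):
            span, _, step = part.partition(":")
            start, _, end = span.partition("-")
            end = end or start
            numbers.extend(range(int(start), int(end) + 1, int(step or 1)))
        return "(?:" + "|".join(str(n) for n in numbers) + ")"
    # Character set such as xyz: match any single listed character.
    return "[" + re.escape(body) + "]"


def scope_to_regex(pattern):
    """Compile one scope pattern into a regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            out.append(".*")
        elif ch == "?":
            out.append(".")
        elif ch == "[" and "]" in pattern[i:]:
            close = pattern.index("]", i)
            out.append(_bracket_to_regex(pattern[i + 1:close]))
            i = close
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))


def match_scope(scope, channel_names):
    """Return the channel names matched by a space-separated scope string."""
    regexes = [scope_to_regex(p) for p in scope.split()]
    return [n for n in channel_names if any(r.fullmatch(n) for r in regexes)]


names = ["tx", "ty", "tz", "rx", "chan1", "chan2",
         "blend3", "blend4", "blend5", "blend7", "blend13", "leftfoot_r"]
print(match_scope("t?", names))               # ['tx', 'ty', 'tz']
print(match_scope("blend[3-7:2]", names))     # ['blend3', 'blend5', 'blend7']
print(match_scope("blend[2-3,5,13]", names))  # ['blend3', 'blend5', 'blend13']
```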

Sample Rate Match

The Sample Rate Match Options handle cases where multiple input CHOPs’ sample rates are different.

Resample At First Input’s Rate

Use rate of first input to resample others.

Resample At Maximum Rate

Resample to highest sample rate.

Resample At Minimum Rate

Resample to the lowest sample rate.

Error if Rates Differ

Does not accept conflicting sample rates.
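The four options amount to different ways of choosing one target rate from the input rates. The sketch below illustrates that choice only; the mode strings "first", "max", "min", and "error" are hypothetical labels, not the parameter's actual token values.

```python
def resolve_sample_rate(input_rates, mode):
    """Pick the rate that all inputs are resampled to, or raise if rates conflict."""
    if not input_rates:
        raise ValueError("at least one input rate is required")
    if mode == "first":
        return input_rates[0]
    if mode == "max":
        return max(input_rates)
    if mode == "min":
        return min(input_rates)
    if mode == "error":
        if len(set(input_rates)) > 1:
            raise ValueError("input sample rates differ: %r" % (input_rates,))
        return input_rates[0]
    raise ValueError("unknown mode: %r" % (mode,))


rates = [30.0, 60.0, 24.0]
print(resolve_sample_rate(rates, "first"))  # 30.0
print(resolve_sample_rate(rates, "max"))    # 60.0
print(resolve_sample_rate(rates, "min"))    # 24.0
```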

Units

The units in which time parameters are specified.

For example, you can specify the amount of time a lag should last for in seconds (default), frames (at the Houdini FPS), or samples (in the CHOP’s sample rate).

Note

When you change the Units parameter, it does not convert the existing parameters to the new units.
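The relationship between the three units can be sketched as a conversion to seconds. The helper below is hypothetical, and the default frame rate and sample rate are placeholder values chosen for the example.

```python
def to_seconds(value, units, fps=24.0, sample_rate=44100.0):
    """Convert a time value expressed in the given units to seconds."""
    if units == "seconds":
        return value
    if units == "frames":
        return value / fps
    if units == "samples":
        return value / sample_rate
    raise ValueError("unknown units: %r" % (units,))


# The same half-second duration in each unit.
print(to_seconds(0.5, "seconds"))                        # 0.5
print(to_seconds(12, "frames", fps=24.0))                # 0.5
print(to_seconds(22050, "samples", sample_rate=44100.0)) # 0.5
```

Because existing values are not converted, a parameter set to 0.5 in seconds still reads 0.5 after switching to frames, which at 24 FPS is roughly 0.02 seconds rather than half a second.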

Time Slice

Time Slicing is a feature which boosts cooking performance and reduces memory usage. Traditionally, CHOPs calculate the channel over its entire frame range. If the channel does not need to be evaluated at every frame, then cooking the entire range of the channel is unnecessary. It is more efficient to calculate only the fraction of the channel that is needed. This fraction is known as a Time Slice.
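The saving can be illustrated by comparing how many samples each approach evaluates. This is a conceptual sketch only: it assumes the slice covers the samples between the previous cook time and the current time, and both functions are hypothetical rather than part of any Houdini API.

```python
import math


def cook_full_range(evaluate, start, end, sample_rate):
    """Evaluate every sample from start to end (inclusive), in seconds."""
    count = int(round((end - start) * sample_rate)) + 1
    return [evaluate(start + i / sample_rate) for i in range(count)]


def cook_time_slice(evaluate, last_time, current_time, sample_rate):
    """Evaluate only the samples after last_time up to and including current_time."""
    first = math.floor(last_time * sample_rate) + 1
    last = math.floor(current_time * sample_rate)
    return [evaluate(i / sample_rate) for i in range(first, last + 1)]


channel = lambda t: math.sin(t)

print(len(cook_full_range(channel, 0, 10, 10)))     # 101 samples
print(len(cook_time_slice(channel, 1.0, 1.5, 10)))  # 5 samples
```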

Unload

Causes the memory consumed by a CHOP to be released after it is cooked and the data passed to the next CHOP.

Export Prefix

The Export prefix is prepended to CHOP channel names to determine where to export to.

For example, if the CHOP channel was named geo1:tx, and the prefix was /obj, the channel would be exported to /obj/geo1/tx.

Note

You can leave the Export Prefix blank, but then your CHOP track names need to be absolute paths, such as obj:geo1:tx.
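The mapping from prefix and channel name to an export destination can be sketched as a string operation. export_path is a hypothetical helper, and treating a blank prefix as "join the colon-separated parts from the root" is an assumption based on the example in the note.

```python
def export_path(prefix, channel_name):
    """Build the export destination from an Export Prefix and a CHOP channel name."""
    parts = channel_name.split(":")
    return prefix.rstrip("/") + "/" + "/".join(parts)


print(export_path("/obj", "geo1:tx"))  # /obj/geo1/tx
print(export_path("", "obj:geo1:tx"))  # /obj/geo1/tx
```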

Graph Color

Every CHOP has this option. Each CHOP gets a default color assigned for display in the Graph port, but you can override the color in the Common page under Graph Color. There are 36 RGB color combinations in the Palette.

Graph Color Step

When the graph displays the animation curves and a CHOP has two or more channels, this defines the difference in color from one channel to the next, giving a rainbow spectrum of colors.
