edu.ucsb.nmsl.autocap
Class NaiveCaptionEstimator

java.lang.Object
  extended by edu.ucsb.nmsl.autocap.NaiveCaptionEstimator
All Implemented Interfaces:
CaptionEstimator

public class NaiveCaptionEstimator
extends java.lang.Object
implements CaptionEstimator

This class estimates time-stamps for captions that still have none after the Recognition and Alignment phases of the AutoCap process. Estimating un-time-stamped captions matters because every caption must have a time-stamp in order to create a usable caption file. These estimations assume that recognized segments of the media file exist both before and after each un-time-stamped segment, except for the first and last captions.

The estimation technique used in the NaiveCaptionEstimator is very simple. It uses the average speaking rate of the speaker over the entire video. The assumption is that the speaker speaks at a consistent rate throughout the video, so that un-time-stamped captions can be estimated from this single metric. The naive technique begins at the first word of an un-time-stamped caption and counts the number of words to the nearest time-stamped caption chunk. The search counts forward and backward from the starting word simultaneously until it reaches a time-stamped caption chunk. The time-stamp of the nearest recognized word is recorded along with its distance, in words, from the start word. The distance is multiplied by the global speaking rate, and the result is added to the recorded time-stamp when the recognized word precedes the start word, or subtracted from it when the recognized word follows. The resulting value is recorded as the time-stamp of the un-time-stamped caption. This process is repeated for each un-time-stamped caption in a given run of AutoCap.
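
The search-and-offset step described above can be sketched roughly as follows. This is an illustrative sketch only, not the AutoCap source: the class name NaiveEstimateSketch, the method estimate, and the representation of a transcript as an array of per-word time-stamps (with Double.NaN marking an unrecognized word) are all hypothetical. It also assumes that the speaking rate is expressed in seconds per word, so that multiplying it by a word count yields a time offset, and that a tie between an earlier and a later recognized word is resolved in favor of the earlier one.

```java
/** Hypothetical sketch of the naive estimation step; not the AutoCap implementation. */
final class NaiveEstimateSketch {

    /**
     * Estimates a time-stamp for the word at index start.
     *
     * @param wordTimes      per-word time-stamps in seconds; Double.NaN marks a word
     *                       that was not recognized (assumed representation)
     * @param start          index of the first word of the un-time-stamped caption
     * @param secondsPerWord global speaking rate, assumed to be in seconds per word
     * @return the estimated time-stamp, or Double.NaN if no word has a time-stamp
     */
    static double estimate(double[] wordTimes, int start, double secondsPerWord) {
        // Widen the search one word at a time in both directions at once.
        for (int distance = 1; distance < wordTimes.length; distance++) {
            int backward = start - distance;
            int forward = start + distance;

            // A recognized word before the start word: the caption begins later,
            // so the offset is added. Ties go to the earlier word (an assumption).
            if (backward >= 0 && !Double.isNaN(wordTimes[backward])) {
                return wordTimes[backward] + distance * secondsPerWord;
            }
            // A recognized word after the start word: the caption begins earlier,
            // so the offset is subtracted.
            if (forward < wordTimes.length && !Double.isNaN(wordTimes[forward])) {
                return wordTimes[forward] - distance * secondsPerWord;
            }
        }
        return Double.NaN; // nothing in the transcript was recognized
    }
}
```

For example, with time-stamps known only for the first and last of five words, the second word is estimated forward from the first word and the fourth word is estimated backward from the last word.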

As its name implies, this technique is quite naive and very inaccurate. The NaiveCaptionEstimator is included only for research purposes because another technique, the inter-coverage caption estimation technique, performs much better and creates much more accurate time-stamps than this technique.

Version:
1.0
Author:
Allan Knight
See Also:
InterCoverageCaptionEstimator

Field Summary
protected  java.util.Vector covered
          Vector that holds all the captions that have at least some of their words recognized by the speech recognition phase, but not necessarily the first word.
protected  double SpeakingRate
          Global speaking rate of speaker recorded during the recognition phase of the AutoCap process.
 
Constructor Summary
NaiveCaptionEstimator(double r)
          This constructor creates an instance of the NaiveCaptionEstimator with the given speaking rate.
 
Method Summary
protected  void collectStatistics(Transcript t)
          This method is called at the end of the estimation process in order to collect statistics about how well the speech recognition system performed while transcribing the input media.
 Transcript completeTranscriptTimes(Transcript t)
          This method performs the caption estimation for a given transcript.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SpeakingRate

protected double SpeakingRate
Global speaking rate of speaker recorded during the recognition phase of the AutoCap process.


covered

protected java.util.Vector covered
Vector that holds all the captions that have at least some of their words recognized by the speech recognition phase, but not necessarily the first word. This vector is used for collecting statistics for research purposes only and is not necessary for the normal operation of AutoCap.

Constructor Detail

NaiveCaptionEstimator

public NaiveCaptionEstimator(double r)
This constructor creates an instance of the NaiveCaptionEstimator with the given speaking rate. All estimations are made based on this global speaking rate.

Parameters:
r - The global speaking rate of the speaker throughout the video.

Method Detail

completeTranscriptTimes

public Transcript completeTranscriptTimes(Transcript t)

This method performs the caption estimation for a given transcript. Each caption of the transcript is examined; if the first word of the caption has a time-stamp, no estimation is performed. Otherwise, the time-stamp for the caption is estimated as described in the class documentation.

Specified by:
completeTranscriptTimes in interface CaptionEstimator
Parameters:
t - The Transcript for which the estimation technique will be applied.
Returns:
A Transcript with all captions time-stamped.
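
The per-caption decision described for this method might look roughly like the sketch below. The Transcript API is not shown in this documentation, so the sketch does not use it: the class CompleteTimesSketch, the method completeTimes, and the array-based inputs are hypothetical stand-ins, with Double.NaN assumed to mark a word that has no time-stamp. The estimator argument stands for whatever estimation technique is in use and is passed in as a function from a word index to a time.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch of the per-caption loop; not the AutoCap implementation. */
final class CompleteTimesSketch {

    /**
     * Produces one time-stamp per caption.
     *
     * @param wordTimes     per-word time-stamps in seconds; Double.NaN marks a word
     *                      without a time-stamp (assumed representation)
     * @param captionStarts index of the first word of each caption
     * @param estimator     estimation technique, mapping a word index to a time
     */
    static double[] completeTimes(double[] wordTimes, int[] captionStarts,
                                  IntToDoubleFunction estimator) {
        double[] captionTimes = new double[captionStarts.length];
        for (int c = 0; c < captionStarts.length; c++) {
            int firstWord = captionStarts[c];
            double recognized = wordTimes[firstWord];
            // Keep the recognized time-stamp when the first word has one;
            // estimate only when it does not.
            captionTimes[c] = Double.isNaN(recognized)
                    ? estimator.applyAsDouble(firstWord)
                    : recognized;
        }
        return captionTimes;
    }
}
```

Passing the estimator as a parameter keeps the skip-or-estimate logic separate from the technique itself, which mirrors how the CaptionEstimator interface allows different estimators to be substituted.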

collectStatistics

protected void collectStatistics(Transcript t)
This method is called at the end of the estimation process in order to collect statistics about how well the speech recognition system performed while transcribing the input media. This method is not necessary for the normal operation of AutoCap.

Parameters:
t - The Transcript of the transcribed input media for which statistics are being collected.