Mailing List Archive

Closed captioning and commercial detection
Grant Taylor writes:

>And one could straightforwardly weed out duplicate episodes without
>being limited by the details in the tv guide data.
>
>CC's might also be another source for commercial boundary information;
>often one or the other isn't captioned. Although I suspect the
>presence or absence of CC data may not be so clear...
>
>
These days, many national advertisers caption their commercials, but if
you pay attention to the style of captions used, it may be possible to
distinguish captioned commercials from captioned programming. Many
commercials use the "paint-on" style of captions, as opposed to the
"roll-up" or "pop-on" styles used in programming. A change in caption
style may therefore indicate a commercial or a switch from one program
to the next.
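
As a rough sketch of the style-change idea (not tested against real
captures): the caption style is selected by line 21 control codes, so a
decoder can watch for those codes and note when the style switches. The
code values below are the channel 1 mode-setting codes as I understand
them, with the parity bit stripped; please check them against the spec.
The (seconds, byte1, byte2) input format and the style_changes name are
made up for illustration.

```python
# Line 21 mode-setting control codes (channel 1, parity bit stripped).
# Values are stated from memory of the line 21 spec; verify before relying on them.
STYLE_CODES = {
    (0x14, 0x20): "pop-on",    # RCL, Resume Caption Loading
    (0x14, 0x25): "roll-up",   # RU2, roll-up with 2 rows
    (0x14, 0x26): "roll-up",   # RU3, roll-up with 3 rows
    (0x14, 0x27): "roll-up",   # RU4, roll-up with 4 rows
    (0x14, 0x29): "paint-on",  # RDC, Resume Direct Captioning
}


def style_changes(pairs):
    """Return (seconds, old_style, new_style) for each caption style switch.

    `pairs` is a hypothetical decoder output: an iterable of
    (seconds, byte1, byte2) tuples, one per line 21 byte pair.
    """
    changes = []
    current = None
    for seconds, b1, b2 in pairs:
        # Mask off the parity bit in case the decoder left it in place.
        style = STYLE_CODES.get((b1 & 0x7F, b2 & 0x7F))
        if style is None:
            continue  # ordinary text or some other control code
        if current is not None and style != current:
            changes.append((seconds, current, style))
        current = style
    return changes
```

Each reported switch is only a candidate boundary; it would still need to
be combined with other hints, since some programs mix caption styles too.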

And of course one could always examine the /content/ of the captions to
detect commercials. You could use one of those new "Bayesian email spam
filtering" algorithms to detect commercials :-)
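
To make the joke a little more concrete, here is a minimal sketch of a
naive Bayes classifier over caption text, in the spirit of the spam
filters. Everything here is an assumption for illustration: the
CaptionClassifier name, the two labels, and the add-one smoothing are my
own choices, and a real filter would need far more training text than a
few sentences.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the caption text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class CaptionClassifier:
    """Tiny naive Bayes classifier with add-one smoothing (illustrative only)."""

    LABELS = ("commercial", "program")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, label, text):
        """Record one hand-labelled block of caption text."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, label, text):
        """Log of (smoothed prior) times (smoothed word likelihoods)."""
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set().union(*self.word_counts.values()))
        docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (docs + len(self.LABELS)))
        for word in tokenize(text):
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((counts[word] + 1) / (total + vocab + 1))
        return score

    def classify(self, text):
        """Return the label with the higher log score."""
        return max(self.LABELS, key=lambda label: self.log_score(label, text))
```

You would train it on caption text from segments you have marked by hand,
then call classify() on each new block of captions as it is decoded.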

For more detailed information on closed captioning technology, check out
this site:

http://www.robson.org/gary/writing/nv-line21.html

Regards,

Dan Schwarz