State v. Cary

Decision Date: 09 February 1968
Citation: 99 N.J.Super. 323, 239 A.2d 680
Parties: The STATE of New Jersey v. Paul Gordon CARY, Defendant.
Court: New Jersey Superior Court

Michael Diamond, Asst. Prosecutor, for the State (Leo Kaplowitz, Union County Prosecutor).

Oscar F. Laurie, Chatham, for defendant (Howard Schwartz, Union, on the brief).

BARGER, J.S.C.

This cause has been remanded for a pretrial hearing to determine whether the voiceprint technique in voice identification and the equipment producing the print have sufficient scientific acceptance whereby they produce uniform and reasonably reliable results and will contribute materially to the ascertainment of the truth, and, as such, are admissible as evidence. State v. Cary, 49 N.J. 343, 352, 230 A.2d 384 (1967).

Spectrogram voice identification is a recently developed technique that allegedly is able to identify a person from graphic representations of his voice. If reliable as an identification tool it will have enormous potential as a forensic aid.

The sound spectrograph with which we are here concerned was first developed at Bell Laboratories in this State about 1941. The instrument produces a permanent spectrogram, which is a graphic display of complex signals. The spectrograph is a basic research instrument used in many laboratories for research studies of sound, music and speech.

Voiceprint identification is the method by which a person can be identified from a spectrographic examination of his taped voice, the spectrograph being capable of reproducing graphic impressions from tapes of human utterances. Specifically, ten frequently used cue words are normally involved: the, to, and, me, on, is, you, I, it, a. When the source material from which the voice is to be identified is contextual, these specific cue words are excerpted and compared with previously recorded voiceprints of or containing the same cue words.

The basic principle underlying the use of the voiceprint method is that whenever a sound is uttered, an energy output is required to transform it into an intelligible word. This energy output is electronically recorded on a sound spectrograph from the tape recording in the one-tenth of a second that it takes to utter the sound, and is thereafter transferred by the spectrograph into a 'contour' or 'bar' print. The print is a visual representation of the utterance. The voiceprint can then be used for comparison and identification purposes.

There are two basic types of voiceprints: (a) 'bar' and (b) 'contour.' Both types may be the result of a person uttering a cue word or other words as taped. The 'bar' voiceprint shows the resonance bars of the person's voice. The pattern of the bars determines what word is being said. In addition, the voiceprint has three dimensions: Time, plotted from left to right (the beginning of the word is at the left and the end is at the right); Frequency, plotted along the vertical axis (lower pitch appears at the bottom and higher pitch toward the top); and Loudness, ascertained by examining the blackness of the printing (darker lines in the bar represent greater intensity of sound at each frequency for a particular time).
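By way of illustration only, the three dimensions described above (time, frequency and intensity) can be sketched in a few lines of Python. This is a modern digital approximation under stated assumptions, not the analog instrument or the procedure described in the testimony: the function names `spectrogram` and `tone`, the 8,000 Hz sample rate, the 64-sample frame and the synthetic test tones are all hypothetical choices made for the example.

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of magnitudes per time frame.

    Rows correspond to time (left to right on a print), columns to
    frequency bins (bottom to top), and each value to intensity
    (the darkness of the mark on a 'bar' print).
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Taper the frame with a Hann window to reduce edge effects.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_len - 1)))
            for n, s in enumerate(frame)
        ]
        mags = []
        # Plain discrete Fourier transform, one bin at a time.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(windowed)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames

def tone(freq_hz, sample_rate, n_samples):
    """Synthetic sine wave standing in for a recorded utterance (assumption)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

SAMPLE_RATE = 8000   # assumed sampling rate in Hz
FRAME_LEN = 64       # assumed analysis frame length in samples

# A 1,000 Hz tone followed by a 2,000 Hz tone.
signal = tone(1000, SAMPLE_RATE, 256) + tone(2000, SAMPLE_RATE, 256)
spec = spectrogram(signal, FRAME_LEN, 32)

bin_hz = SAMPLE_RATE / FRAME_LEN  # width of one frequency bin in Hz
peak_first = max(range(len(spec[0])), key=lambda k: spec[0][k]) * bin_hz
peak_last = max(range(len(spec[-1])), key=lambda k: spec[-1][k]) * bin_hz
```

In this sketch the strongest frequency in the earliest frame is 1,000 Hz and in the latest frame 2,000 Hz, which is the kind of change over time that a printed spectrogram displays as a shift in the position of its darkest marks.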

The 'contour' voiceprint is identical with the 'bar' print with regard to time and frequency measurements. The level of loudness, however, differs somewhat from that of the 'bar' print. The various contours or 'peaks' indicate the changes in intensity of sound at each frequency for a particular time. It has been suggested by an expert in the field of voiceprinting that it is easier to detect patterns of the 'bar' voiceprint, but that the 'contour' voiceprint is more easily analyzed and is also more easily reproduced in print.

The inquiry in this area is whether identification by a voiceprint has the claimed validity and, if so, to what degree? In other words, having several voiceprints, among which there are two made by the same person, can a trained individual, by reading, comparing and analyzing the spectrogram, determine with a high degree of certainty which of those voiceprints are of the same person; or having a known print to be compared with an unknown print, can it be reliably determined that it is or is not the voice of the same person; or having a pre-identified print that is to be compared with an unidentified print, can it be reliably determined that the two prints are or are not voiceprints of the same person's voice? Can the utterances of two or more persons produce the same print? Is the technique generally recognized by the scientific community involved?

It is contended that the voiceprint technique (spectrograph producing a spectrogram) is not affected by either the physiological or emotional conditions of the speaker. The emotional element is often present when the polygraph is employed and is one of the major complaints of the technique. It is argued, therefore, that the voiceprint technique is based primarily upon fixed and constant existing physiological mechanisms such as the vocal cavities and articulators. The major cavities affecting speech are the throat, nasals and the two oral cavities formed by positioning the tongue. The articulators include the lips, teeth, tongue, soft palate and jaw muscles. It is contended that one starts to form a speech pattern and uniqueness in infancy. Whereas the subject's body may undergo certain changes in responding to certain questions using the polygraph, which changes are capable of affecting the graphic recording and interpretation, it is claimed that the voiceprint technique is not so affected and the speaker has no ability to limit the efficiency of the process. The particular mechanical process which generates the impulses is not capable of change even though the person's emotional mood may change. In other words, as opposed to the polygraph results, the spectrograph voiceprint, it is contended, will be accurate regardless of any act or emotion on the part of the speaker.

It is said that the reason the method is efficient and reliable in producing the spectrogram is because it does not check sound or pitch of the voice over which the person speaking or sought to be identified may have some control, but rather merely records the impulses which are created by the aforementioned vocal cavities and articulators of the speaker. These impulses retain their characteristics even if the voice itself is impaired, i.e., by laryngitis or head cold. Another apparent advantage of the voiceprint technique is that a person with reasonable technical skill can in a short time be trained to compare and interpret the spectrogram, as with the comparison and interpretation of fingerprints.

It is contended that even a voice mimic or impersonator will not be able to prevent proper identification when the spectrograph is employed; that basic identifiable features in a person's voiceprint will not be altered by disguising the voice either by whispering, holding the nose, or muffling the voice. It is further contended that 'each voice is uniquely different enough to make it identifiable with the same accuracy that fingerprint identification enjoys.'

Lawrence G. Kersta testified for the State. He is an electrical engineer and physicist, and for many years was employed by the Bell Telephone Laboratories, retiring in 1966. He established the Voiceprint Laboratories at Somerville, N.J., and is referred to as the innovator of the technique. Initially, he was engaged in research which was concerned with a faster means of transmitting communication circuit information and in research relating to the coding of speech for various types of speech coding systems in the field of spectrography. During the course of his research he observed that spectrograms indicated a similarity when those of the same person's voice were compared. About 1960 he commenced conducting various experiments and tests in his field of voice identification, initially using colleagues known to him in the laboratories, and later about 16,000 spectrograms of the speech of about 123 subject people were made as a controlled speech population. The spectrograms were made from magnetic tape recordings and screened for observable speech characteristics so that they could not be easily identified without the use of the spectrogram. The ten cue words mentioned were used, being the words most frequently used in English conversation and on the telephone. Thereafter, certain high school girls were used as panelists, after comparison and identification training comprising about 40 hours. The training consisted principally of being taught to recognize speech characteristics and similarities from spectrograms. In all, about 12 were used, and the panelists were able to identify better than 97% of the speakers.

Kersta has written many papers and articles on the technique. As the result of his research and experience in the field he is...

28 cases
  • State v. Harvey
    • United States
    • New Jersey Supreme Court
    • July 30, 1997
    ...42 N.J. at 171, 199 A.2d 809. Such a high standard is justified because freedom--indeed life--is at stake. State v. Cary, 99 N.J.Super. 323, 333, 239 A.2d 680 (Law Div.1968), aff'd, 56 N.J. 16, 264 A.2d 209 (1970). Scientific evidence is admissible only if the analysis used has "a sufficien......
  • Windmere, Inc. v. International Ins. Co.
    • United States
    • New Jersey Supreme Court
    • March 19, 1987
    ...v. Lykus, 367 Mass. 191, 327 N.E.2d 671 (1975), D'Arc v. D'Arc, supra, 157 N.J.Super. 553, 385 A.2d 278, and State v. Cary, 99 N.J.Super. 323, 239 A.2d 680 (Law Div.1968), the only witnesses who testified were experts affiliated with the development of the device at the Michigan State Unive......
  • Reed v. State
    • United States
    • Maryland Court of Appeals
    • September 6, 1978
    ...v. Lykus, 367 Mass. 191, 327 N.E.2d 671, 678 (1975); People v. Tobey, 401 Mich. 141, 257 N.W.2d 537 (1977); State v. Cary, 99 N.J.Super. 323, 239 A.2d 680, 685 (1968), Aff'd, 56 N.J. 16, 264 A.2d 209 (1970); D'Arc v. D'Arc, 157 N.J.Super. 553, 385 A.2d 278 (1978); People v. Rogers, 86 Misc.......
  • Reed v. State, 655
    • United States
    • Court of Special Appeals of Maryland
    • April 7, 1977
    ...case, harmless. The Superior Court of New Jersey, Law Division, also declined, in 1968, to allow evidence concerning spectrograms in State v. Cary, supra. That holding was affirmed in 1970, by that State's Supreme Court, State v. Cary, supra. Two years later, however, the latter court order......