Short bursts of computer‐generated Gaussian noise were rated by observers for the presence or absence of a 500‐Hz signal tone burst. For each observer, a multiple regression analysis found the linear combination of the energies in narrow bands around the tone frequency that best predicted that observer's total ratings. The estimates of the regression coefficients provide graphs of the “frequency responses” of the observers. Most of the reliable variance in the total ratings was accounted for by the regression analysis in terms of energy in narrow bands. Differences among observers are explained in terms of the differential weighting by observers of features labeled “tone presence,” “pitch,” and “loudness.”
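The regression step described above can be sketched as follows. Everything here is a synthetic illustration: the trial count, the number of narrow bands, and the peaked "observer weighting" are assumptions for the sketch, not the study's data or procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the experiment: per-trial energies in narrow
# bands around the tone frequency, plus a rating for each trial.
n_trials, n_bands = 200, 9
band_energy = rng.exponential(1.0, size=(n_trials, n_bands))

# Hypothetical observer weighting, peaked at the signal band (index 4).
true_weights = np.exp(-0.5 * ((np.arange(n_bands) - 4) / 1.5) ** 2)
rating = band_energy @ true_weights + rng.normal(0.0, 0.5, n_trials)

# Least-squares regression of ratings on band energies (with intercept);
# the fitted coefficients trace out the observer's "frequency response".
X = np.column_stack([np.ones(n_trials), band_energy])
coef, *_ = np.linalg.lstsq(X, rating, rcond=None)
freq_response = coef[1:]
peak_band = int(np.argmax(freq_response))
```

With enough trials, the fitted coefficients recover the shape of the weighting, which is what lets the coefficients be read as a frequency response.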

The attenuation characteristics of each of three commercial audiometric headsets of the noise‐barrier type—(1) Aural Research (AR‐100) “Auraldomes,” (2) Madsen (type ME‐70) “Noise‐Excluding Headset,” and (3) Rudmose (RA‐125) “Otocups”—were measured, as were the attenuation characteristics of a standard audiometric headset (Telephonics TDH‐39 earphone with an MX‐41/AR cushion). The threshold‐shift method was used, employing a pure‐tone sound field in an anechoic room. The resulting attenuation data for each of the three noise‐barrier headsets were compared statistically to those for each of the others and to the data for the standard headset. The data were contrasted with those supplied by the manufacturers. In terms of their attenuation capabilities for the octave bands pertinent to audiometry, the Rudmose Otocup (RA‐125) is ranked first, the Auraldome (AR‐100) second, and the Madsen (ME‐70) headset third. It is suggested that attenuation data alone should not determine the acceptability of any such device. The potential user should, in view of the attenuation data yielded by this study, consider such use only after measurement of octave‐band noise levels in the environment in question.

The effect of electrical stimulation of the crossed olivocochlear bundle (COCB) on cochlear potentials was studied in anesthetized and immobilized guinea pigs. Sound‐evoked responses and the endocochlear potential were measured simultaneously in the basal turn. The time course and the magnitude of the slow negative potential in the scala media produced by COCB stimulation can be modified by parameters of the COCB stimuli. Augmentation of the cochlear microphonic and inhibition of the whole‐nerve action potential were found to be dependent on the slow negative potential in scala media and on parameters of the acoustic stimulus. The summating potential became more positive with COCB stimulation. The role of this positive summating potential in the inhibition of the action potential is discussed.

The recovery from impulse‐noise‐induced temporary threshold shift was systematically traced for individual rhesus monkeys and men. In addition to the well‐known logarithmic recovery, three other types of recovery were seen (diphasic, plateau, and rebound). A descriptive model is developed for the classification of these recovery functions. The model postulates the existence of two types of temporary threshold shift, process M and process S, both of which may be seen after impulse‐noise exposure.

Perception of auditory patterns based on an intensity difference was investigated in 20 experienced normal‐hearing subjects under binaural and monaural listening conditions. Patterns were made up of either three white‐noise bursts or three 1000‐Hz tone bursts which were temporally spaced. Bursts within each pattern differed only in intensity and were either loud (L) or soft (S), i.e., each pattern included one of one intensity and two of the other. The six possible patterns were SLS, LSL, LLS, SSL, LSS, and SLL. The loud bursts remained at a constant intensity and the soft bursts were attenuated by either 9, 7, 5, or 3 dB. Patterns were presented at 50 dB sensation level. Tone‐burst patterns were easier to perceive and resulted in a larger number of correct responses than noise‐burst patterns. However, there was no significant difference between tone‐burst patterns and noise‐burst patterns in the percentage of errors that were pattern reversals. Symmetrical patterns were reversed more frequently than asymmetrical patterns. Auditory pattern reversals are compared to figure‐ground reversal and simultaneous contrast phenomena in vision and are discussed in relation to sensory inhibition.

When weak signals are presented in a background of continuous noise, the process of detection and the process of discriminating a change in duration appear to be very similar. Two experimental techniques were used to investigate duration discrimination. The procedure in which the difference in duration between signals, ΔT, was varied given a fixed signal‐to‐noise ratio gave different results than the procedure in which signal amplitude was varied given a fixed ΔT. Although there were marked individual differences, the data from all subjects roughly supported this general conclusion.

The response bias in YES‐NO detection with gated noise and simultaneously gated signal plus noise was found to show both sequential and probability contrast. The sequential dependencies showed that the more recent a signal event, the more the response bias shifts away from YES. Similarly, the more probable the presentation of a signal, the more the response bias shifts away from YES. The response bias in detection with continuous noise usually shows the opposite effect—response assimilation. The probability of a YES response increases with either greater signal probability or with signal recency. It is suggested that the response‐bias learning which has been postulated to occur in detection experiments depends on the stability of the judgmental frame of reference provided by the continuous noise. When this basis is removed, as in the present study, the response pattern parallels that usually observed in signal recognition studies for which responses are assumed to depend on the memory of the previous presentations. It is concluded that the response pattern, assimilation or contrast, depends more on the stability of the frame of reference than on the type of psychophysical task.

This paper continues the presentation of data describing detection performance in the monaural detection with contralateral cue (MDCC) situation. Subjects detected a monaural sinusoid burst masked by continuous white noise, either with or without an unmasked contralateral cue. The cue, presented in both intervals of each two‐interval forced‐choice trial, was a sinusoid of the same frequency and duration as the signal, but its phase and intensity were experimentally varied. In three preliminary experiments it was shown that: (1) when the cue nominally matched the signal in phase and intensity, it improved detection for frequencies below about 1200 Hz and was detrimental at frequencies above about 1400 Hz; (2) at 500 Hz, cue phase strongly affected detection performance, some phases resulting in performance much worse than without the cue; (3) at 500 Hz, the effect of cue intensity was small for the cue phase giving best performance (good phase), but increasing the cue intensity was detrimental to performance with a bad phase. With very loud cues, regardless of phase, performance declined with increasing cue intensity. The main experiment was a factorial study to examine the interactions of frequency, cue phase, and cue intensity. Phase was again found to be important at frequencies below about 1200 Hz, and to be more important the louder the cue. Worst performance at midfrequencies was found for the phase representing a cue lead of about 700 μsec. The phase and the related interaural time difference giving best performance were functions of both cue intensity and of cue frequency. No theoretical interpretation is attempted.

Most comfortable loudness (MCL) levels for pure tones, broad‐ and narrow‐band noise, and connected speech were studied in three independent experiments using Békésy audiometers and young normal‐hearing males and females. Differences in MCL were explored as a function of attenuation rate, sex, frequency of the pure‐tone and narrow‐band stimuli, interrupted versus continuous pure‐tone stimuli, instructional set, session, and a modified Békésy operation which allowed the subject to hold intensity constant over time versus standard Békésy operation. There were no significant sex, set, session, or operation differences. In all three experiments, a 2.5‐dB/sec attenuation rate produced higher MCLs than a 1.25‐dB/sec rate. In general, a 500‐Hz tone or narrow‐band noise centered at 500 Hz was tracked at the highest sound‐pressure levels (SPLs), while broad‐band noise was consistently tracked at the lowest levels. Regardless of frequency or attenuation rate, continuous pure tones were tracked at higher SPLs than interrupted pure‐tone stimuli. Although intersubject variability was relatively high, the majority of test‐retest differences in each experiment were 10 dB or less. Over‐all MCLs in decibels SPL re 0.0002 μbar were 49.3 for speech, 49.4 for noise, and 51.7 for pure tones.

The signal, a 135‐Hz‐wide band of noise centered at 250 Hz, was partially masked by continuous wide‐band noise. The interaural correlation of both signal and masker was varied between +1.00 and zero. With a correlated masker (N0), detection is about 14 dB better with an uncorrelated signal (SU) than with a correlated signal. The masking‐level difference (MLD) diminishes as the correlation of the signal is increased to unity and as the correlation of the masker is decreased to zero. The results imply that sizable MLDs are obtained under those conditions where the addition of the signal to the noise results in a decrease in the correlation between the stimulus events at the two ears.
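Partially correlated noise of the kind varied in this study can be constructed by mixing a common component with an independent one; the sketch below shows the standard construction, not the authors' apparatus, and the sample count and seed are arbitrary.

```python
import numpy as np

def interaurally_correlated_noise(n, rho, rng):
    """Two Gaussian noise channels whose expected correlation is rho."""
    common = rng.standard_normal(n)
    indep = rng.standard_normal(n)
    left = common
    # Mixing weights are chosen so that E[left * right] = rho while
    # both channels keep unit variance.
    right = rho * common + np.sqrt(1.0 - rho ** 2) * indep
    return left, right

rng = np.random.default_rng(1)
left, right = interaurally_correlated_noise(100_000, 0.5, rng)
measured = np.corrcoef(left, right)[0, 1]
```

Setting `rho` to 1.0 reproduces the correlated (N0) masker, and 0.0 the uncorrelated one, with intermediate values covering the range explored in the experiment.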

First‐order intermodulation components in cochlear‐microphonic potentials were measured with the differential electrode technique from all four turns of the guinea pig cochlea. Measurements were made with six pairs of primary frequencies and a wide range of primary signal intensities. The spatial patterns of the first‐order difference tones were compared with those of the primaries, and with pure tones whose frequency was the same as that of the difference tone. The results indicated that at low and moderate primary intensity levels the distortion component was localized in the cochlea somewhat apical from the region of maximum excitation by the higher‐frequency primary. With increasing stimulus intensity, a general shift of the distribution pattern was observed, accompanied by the development of a second region of maximal difference‐tone activity. This second region, where the difference tone became more prominent as the intensity was increased, corresponded to the location of maximal microphonic elicited by a pure tone whose frequency was the same as that of the difference tone. These results further confirm that distortion of the cochlear microphonic is a two‐stage process.

A review of previous studies of speech loudness shows great variability in the derived psychophysical functions relating loudness to speech power [i.e., sound‐pressure level (SPL)] and other physical and psychological measures, such as subglottal pressure (SGP) and vocal effort. This paper argues that, because of the complex nature of speech production and perception, traditional scaling procedures that yield exponential relations between loudness and some other measure should be replaced by multidimensional techniques. Two experiments on speech‐loudness scaling using partial correlation analysis demonstrate that loudness judgments depend upon both acoustic cues, as measured by peak SPL, and vocal effort cues, as measured by peak SGP in one experiment and subjective effort in the other. The relative dependencies upon these cues are different for different listeners and are at least somewhat resistant to distortion via signal attenuation and/or masking with white noise. These differences between listeners might be exploited in the search for relationships between speaking and listening.

In order to determine if speech spectrograms can be used to identify human beings, two questions must be studied: (1) does the formant structure of phonemes uttered by a certain speaker change over a long interval of time, and (2) can the formant structure be changed by disguise, or is it even possible to imitate the formant structure of another speaker? Spectrograms of utterances produced by seven speakers and recorded over periods of up to 29 years showed that the frequency position of formants and pitch of voiced sounds shift to lower frequencies with increasing age of test persons. Speech spectrograms of texts spoken in a normal and a disguised voice revealed strong variations in formant structure. Speech spectrograms of utterances of well‐known people have been compared with those of imitators. The imitators succeeded in varying the formant structure and fundamental frequency of their voices, but they were not able to adapt these parameters to match or even be similar to those of imitated persons.

A noise whose amplitude envelope followed closely that of a concomitant speech signal was generated by multiplying white noise and the amplitude envelope of the speech, permitting the signal‐to‐noise (S/N) ratio to be specified on a short‐time nonvarying basis. The spectrum of the amplitude envelope for continuous speech was studied, and the distributions of the vowel and consonant levels in articulation test materials were determined. Articulation functions in such noise and in continuous white noise were generated. Within the range of S/N ratios studied, the gains of the functions for vowels and consonants were 4% and 2.5% per decibel, respectively, in both types of noises. The results clearly depict the operational differences between conventional and envelope‐noise S/N‐ratio specification and suggest that use of the envelope‐noise masker may eliminate some of the problems associated with current methods.
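The masker construction described above, white noise multiplied by the speech amplitude envelope, can be sketched as follows. The "speech" here is a synthetic amplitude‐modulated tone and the envelope follower is a simple rectify‐and‐smooth stage; both are assumptions for illustration, not the study's signal chain.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs  # one second of samples

# Synthetic stand-in for speech: a tone whose amplitude fluctuates
# at a roughly syllabic rate (4 Hz).
speech = (1.0 + np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)

# Amplitude envelope: full-wave rectification followed by a 20-ms
# moving-average low-pass (a crude but serviceable envelope follower).
win_len = fs // 50
win = np.ones(win_len) / win_len
envelope = np.convolve(np.abs(speech), win, mode="same")

# Envelope-shaped masker: white noise multiplied by the speech envelope,
# so the short-time S/N ratio is fixed up to a chosen gain.
rng = np.random.default_rng(0)
masker = envelope * rng.standard_normal(speech.size)
```

Because the masker's level tracks the speech level moment by moment, scaling `masker` by a single gain sets the short‐time S/N ratio everywhere at once, which is the property the abstract exploits.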

Echograms of the dorsal surface of the tongue in the midsagittal plane and in coronal oblique planes are presented. Comparisons are made among tongue height measurements obtained from the transverse and longitudinal echograms and from lateral head x rays. It is concluded that ultrasonic scanning can provide information on the shape of the dorsal surface of the tongue during sustained utterances and in the articulatory rest position.

Numerical ratings of dissimilarity among 12 American English vowels were analyzed by means of the Shepard‐Kruskal procedure for nonmetric multidimensional scaling. The computer output was a geometric configuration of points representing the vowels whose interpoint distances related (approximately) monotonically to the dissimilarity data—the greater the judged dissimilarity between a pair of vowels, the greater the distance between the associated points. Using a procedure developed by Schönemann and Carroll, the axes from the MD‐SCAL analysis were rotated, translated, and dilated to match a target “configuration” based on certain phonetic features. The dimensions from a three‐dimensional analysis were interpreted as “tongue height,” “tongue advancement,” and “retroflexion” (multiple R = 0.70). When the vowel /ɜ/ was eliminated from the analysis, however, only the first two of these dimensions were needed (multiple R = 0.73).
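The rotate‐translate‐dilate matching step can be sketched as an orthogonal Procrustes fit; this is a generic least‐squares version of that idea, assuming a synthetic 12‐point, 3‐D configuration rather than the vowel data, and it is not a reimplementation of the Schönemann‐Carroll program.

```python
import numpy as np

def procrustes_match(config, target):
    """Rotate/reflect, dilate, and translate `config` to best fit
    `target` in the least-squares sense."""
    c = config - config.mean(axis=0)
    tgt = target - target.mean(axis=0)
    # Optimal orthogonal transform from the SVD of the cross-product
    # matrix; the singular values also give the optimal dilation.
    u, s, vt = np.linalg.svd(tgt.T @ c)
    rotation = (u @ vt).T
    scale = s.sum() / (c ** 2).sum()
    return scale * (c @ rotation) + target.mean(axis=0)

# Sanity check: a rotated, dilated, shifted copy of a configuration
# should be recovered exactly by the matching transform.
rng = np.random.default_rng(0)
pts = rng.standard_normal((12, 3))          # e.g., 12 points in 3-D
angle = 0.7
rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle),  np.cos(angle), 0.0],
                [0.0, 0.0, 1.0]])
target = 2.0 * pts @ rot + np.array([1.0, -3.0, 0.5])
fitted = procrustes_match(pts, target)
```

In the study's setting, `target` would be the feature‐based configuration and `config` the MD‐SCAL output; the quality of the resulting fit is what the reported multiple R summarizes.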

A method is described for determining the shape of the vocal tract from a measurement of the acoustic impulse response at the lips. Under the assumptions that the wave motion is planar and that the losses are negligible, the method gives the area function uniquely. It is further shown that a one‐point measurement cannot provide any information about losses. A preliminary experiment is described which verifies the feasibility of the method.

This study investigated binaural sensory processing when continuous nonspeech tonal stimuli were dichotically presented. The experimental task consisted of auditory (frequency) pursuit tracking, and consequently there was no influence of selective attention or competition on the listening task. The target sound was externally controlled, whereas the cursor sound was generated and controlled by the transduced movements of either the subject's tongue or hand. A significant laterality effect was found only when the source of motor control over the acoustic signal was the speech‐related movements of the tongue. The theoretical implications of these results for a feedback regulatory theory of speech production are discussed.

A form of preprocessed speech known to be highly intelligible to normal listeners was heard by a group of hearing‐impaired subjects. The preprocessing technique involves high‐pass filtering (cutoff 1100 Hz, slope 12 dB/oct) and infinite amplitude clipping. The subjects heard both unmodified and filtered/clipped word lists at 40‐, 30‐, and 20‐dB sensation levels. Discrimination scores for 13 out of 17 cases were significantly higher at 20‐ and 30‐dB sensation levels for filtered/clipped speech than for unmodified speech.

Absolute and frequency‐difference thresholds were determined by the conditioned‐suppression technique. The results show that the average frequency range of audibility at +50 dB sound‐pressure level extends from 86 Hz to 46.5 kHz, with a best frequency near 8 kHz. Individual differences in sensitivity are related to body weight and, probably, age. The average frequency‐difference limen is 3.5% from 125 Hz to 42 kHz. Compared to other mammals, the auditory capacities of the guinea pig are within one standard deviation of the mammalian mean on each of six dimensions: high‐frequency and low‐frequency cutoff, lowest intensity, best frequency, area of the audible field, and frequency discrimination.