Comparisons of Auditory Impressions and Auditory Imagery Associated with Onomatopoeic Representation for Environmental Sounds
© Masayuki Takada et al. 2010
Received: 6 January 2010
Accepted: 29 July 2010
Published: 11 August 2010
Humans represent sounds to others and receive information about sounds from others using onomatopoeia. Such representation is useful for obtaining and reporting the acoustic features and impressions of actual sounds without having to hear or emit them. But how accurately can we obtain such sound information from onomatopoeic representations? To examine the validity and applicability of using verbal representations to obtain sound information, experiments were carried out in which the participants evaluated auditory imagery associated with onomatopoeic representations created by listeners of various environmental sounds. Results of comparisons of impressions between real sounds and onomatopoeic stimuli showed that impressions of sharpness and brightness for both real sounds and onomatopoeic stimuli were similar, as were emotional impressions such as "pleasantness" for real sounds and major (typical) onomatopoeic stimuli. Furthermore, recognition of the sound source from onomatopoeic stimuli affected the emotional impression similarity between real sounds and onomatopoeia.
Sounds infinite in variety surround us throughout our lives. When we describe sounds to others in our daily lives, onomatopoeic representations related to the actual acoustic properties of the sounds they represent are often used. Moreover, because the acoustic properties of sounds induce auditory impressions in listeners, onomatopoeic representations and the auditory impressions associated with actual sounds may be related.
In previous studies, relationships between the temporal and spectral acoustic properties of sounds and onomatopoeic features have been discussed [1–4]. We have also conducted psychoacoustical experiments to confirm the validity of using onomatopoeic representations to identify the acoustic properties of operating sounds emitted from office equipment and audio signals emitted from domestic electronic appliances [5, 6]. We found relationships between the onomatopoeic features and subjective impressions, such as the product imagery and functional imagery evoked by machine operation sounds and audio signals. Furthermore, in a separate previous study, we investigated the validity of using onomatopoeic representations to identify the acoustic properties and auditory impressions of various kinds of environmental sounds.
Knowing more about the relationship between the onomatopoeic features and auditory impressions of sounds is useful because such knowledge allows one to more accurately obtain or describe the auditory imagery of sounds without actually hearing or emitting them. Indeed, one previous study attempted a practical application of such knowledge by investigating the acoustic properties and auditory imagery of tinnitus using the onomatopoeic representations of patients. Moreover, future applications may include situations in which electronic home appliances such as vacuum cleaners and hair dryers break down and customers describe the mechanical problems they are experiencing to customer service representatives using onomatopoeic representations. Engineers who listen to or read accounts of such complaints may be able to obtain more accurate information about the problems from the onomatopoeic representations and better analyze their causes. Wake and Asahi conducted psychoacoustical experiments to clarify how people communicate sound information to others. Participants were presented with sound stimuli and asked to freely describe the presented sounds to others. Results showed that verbal descriptions, including onomatopoeia, mental impressions expressed through adjectives, sound sources, and situations, were frequently used in the descriptions. Such information may be applicable to sound design. Indeed, related research has already been presented in a workshop on sound sketching, although the focus was on vocal sketching only.
In practical situations in which people communicate sound information to others using onomatopoeic representation, it is necessary that the receivers of onomatopoeic representations (e.g., engineers in the above-mentioned case) be able to identify the acoustic properties and auditory impressions of the sounds that onomatopoeic representations represent. The present paper examines this issue. Experiments were carried out in which participants evaluated the auditory imagery associated with onomatopoeic representations. The auditory imagery of onomatopoeic representations was compared with the auditory impressions for their corresponding actual sound stimuli, which were obtained in our previous study.
Furthermore, one of the most primitive human behaviors related to sounds is the identification of sound sources. Gygi et al. reported that the important factors affecting the identification of environmental sounds involve spectral information, especially the frequency contents around 1–2 kHz, and temporal information such as envelope and periodicity. If we do indeed recognize events related to everyday sounds using acoustic cues [13–15], then it may be possible to also recognize sound sources from onomatopoeic features instead of acoustic cues. Moreover, such recognition of the source may affect the auditory imagery evoked by onomatopoeia. Although Fujisawa et al. examined the auditory imagery evoked by simple onomatopoeia with two morae such as /don/ and /pan/ ("mora" is a standard unit of rhythm in Japanese speech), sound source recognition was not discussed in their study. In the present paper, therefore, we took sound source recognition into consideration while comparing the auditory imagery of onomatopoeic representations to the auditory impressions induced by their corresponding real sounds.
"Major" and "minor" onomatopoeic representations for each sound source.
"Major (1)" and "minor (2)" onomatopoeic representations
whizzing sound (similar to the motion of a whip)
idling sound of a diesel engine: (1) /burorororo/ [bɯɽoɽoɽoɽo], (2) /karakarakarakarakarakorokorokorokorokoro/ [kaɽakaɽakaɽakaɽakaɽakoɽokoɽokoɽokoɽokoɽo]
sound of water dripping: (1) /potyaN/ [potʃan], (2) /pikori/ [pikoɽi]
bark of a dog (barking once): (1) /waN/ [wan], (2) /wauQ/ [waɯʔ]
ring of a telephone: (1) /pirororororo/ [piɽoɽoɽoɽoɽo], (2) /piriririririririri/ [piɽiɽiɽiɽiɽiɽiɽiɽi]
(1) /kurururu/ [kɯɽɯɽɯɽɯ], (2) /fororoo/ [Φoɽoɽoː]
vehicle starter sound: (1) /bururuuN/ [bɯɽɯɽɯːn], (2) /tyeQ baQ aaN/ [tʃeʔ bɑʔ aan]
hand clap (clapping once): (1) /paN/ [pan], (2) /tsuiN/ [tsɯin]
(1) /puu/ [pɯː], (2) /faaQ/ [Φaːʔ]
sound of a flowing stream: (1) /zyorororo/ [dʑoɽoɽoɽo], (2) /tyupotyupoyan/ [tʃɯpotʃɯpojan]
sound of a noisy construction site (mainly the machinery noise of a jackhammer): (1) /gagagagagagagagagagaga/ [ɡaŋaŋaŋaŋaŋaŋaŋaŋaŋaŋa]
sound of fireworks: (1) /patsuQ/ [patsɯʔ], (2) /putiiiN/ [pɯtʃiːn]
(1) /puiQ/ [pɯiʔ], (2) /poi/ [poi]
knock (knocking on a hard material like a door, twice): (1) /koNkoN/ [koŋkon], (2) /taQtoQ/ [tattoʔ]
chirping of an insect (like a cricket)
twittering of a sparrow: (1) /piyo/ [pijo], (2) /tyui/ [tʃɯi]
harmonic complex tone: (1) /pii/ [piː], (2) /piiQ/ [piːʔ]
sound like a wooden gong (sounding once): (1) /pokaQ/ [pokaʔ], (2) /NkaQ/ [nkaʔ]
sound of a trumpet: (1) /puuuuuuN/ [pɯːn], (2) /waaN/ [waːn]
sound of a stone mill: (1) /gorogorogoro/ [ɡoɽoŋoɽoŋoɽo], (2) /gaiaiai/ [ɡaiaiai]
siren (similar to the sound generated by an ambulance): (1) /uuuu/ [ɯː], (2) /uwaaaaa/ [ɯwaː]
shutter sound of a camera: (1) /kasyaa/ [kaʃaː], (2) /syagiiN/ [ʃaɡiːn]
(1) /zaa/ [dzaː], (2) /suuuuuu/ [ssssss]
sound of a temple bell: (1) /goon/ [ɡoːn], (2) /gaaaaaaaaaaN/ [ɡaːn]
thunderclap (relatively nearby): (1) /baaN/ [baːn], (2) /bababooNbaboonbooN/ [bababoːn baboːn boːn]
bell of a microwave oven (to signal the end of operation): (1) /tiiN/ [tʃiːn], (2) /kiNQ/ [kinʔ]
sound of a passing train: (1) /gataNgotoN/ [ɡataŋŋoton], (2) /gararatataNtataN/ [ɡaɽaɽatatantatan]
typing sound (four keystrokes): (1) /katakoto/ [katakoto], (2) /tamutamu/ [tamɯtamɯ]
beach sound (sound of the surf): (1) /zazaaN/ [dzadzaːn], (2) /syapapukupusyaapaaN/ [ʃapapɯkɯpɯʃaːpaːn]
sound of wind blowing (similar to the sound of a draft): (2) /haaaououou ohaaa ouohaaao/ [haːoɯoɯoɯ ohaː oɯohaːo]
sound of wooden clappers (beating once): (1) /taN/ [tan], (2) /kiQ/ [kiʔ]
sound of someone slurping noodles: (1) /zuzuu/ [dzɯdzzz], (2) /tyurororo/ [tʃɯɽoɽoɽo]
sound of a wind chime (of small size and made of iron): (1) /riN/ [ɽin], (2) /kiriiN/ [kiɽiːn]
sound of a waterfall: (1) /goo/ [ɡoː], (2) /zaaaaa/ [dzaː]
footsteps (someone walking a few steps): (1) /katsukotsu/ [katsɯkotsɯ], (2) /kotoQ kotoQ/ [kotoʔ kotoʔ]
For each sound stimulus, 8 onomatopoeic representations were classified into 2 groups based on the similarities of 24 phonetic parameters, consisting of 7 places of articulation (labiodental, bilabial, alveolar, postalveolar, palatal, velar, and glottal), 6 manners of articulation (plosive, fricative, nasal, affricate, approximant, and flap), the 5 Japanese vowels (/a/, /i/, /u/, /e/, /o/), voiced and voiceless consonants, syllabic nasals, geminate obstruents, palatalized consonants, and long vowels. The classification used a hierarchical cluster analysis with the Ward method, in which Euclidean distance was employed as the measure of similarity. For the two groups obtained from cluster analysis, two onomatopoeic representations were selected for each sound. One was selected from the larger group (described as the "major" representation) and the other from the smaller group (the "minor" representation). A major onomatopoeic representation is regarded as one frequently used by many listeners of the sound, that is, a "typical" onomatopoeia, whereas a minor onomatopoeic representation is regarded as a unique representation that a listener of the sound is relatively unlikely to use to describe it. In selecting the major onomatopoeic stimuli, a Japanese onomatopoeia dictionary was also referenced. Consequently, 72 onomatopoeic representations were used as stimuli, as shown in Table 1; the expressions are written in both Japanese and the International Phonetic Alphabet. In the experiments, however, the onomatopoeic stimuli were presented to participants using Japanese katakana, which is a Japanese syllabary used to write words. Almost all Japanese are able to correctly pronounce onomatopoeic representations written in Japanese katakana.
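The grouping step described above can be illustrated with a minimal sketch of agglomerative clustering under Ward's criterion, stopped when two clusters remain. This is not the authors' implementation: the function names, the four-column feature coding, and the toy values are all assumptions made for illustration (the study used 24 phonetic parameters per representation).

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_two_groups(vectors):
    """Merge clusters by Ward's criterion until two remain.

    Returns two lists of indices into `vectors`, larger cluster first.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 2:
        # Pick the pair whose merge increases the error sum of squares least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [vectors[k] for k in clusters[ij[0]]],
                [vectors[k] for k in clusters[ij[1]]],
            ),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters, key=len, reverse=True)


# Hypothetical toy data: 8 descriptions of one sound, each coded with
# 4 made-up phonetic counts (not values from the study).
descriptions = [
    [2, 0, 1, 0],
    [2, 0, 1, 1],
    [2, 1, 1, 0],
    [1, 0, 1, 0],
    [2, 0, 2, 0],
    [0, 3, 0, 2],
    [0, 3, 1, 2],
    [0, 2, 0, 3],
]
major_group, minor_group = ward_two_groups(descriptions)
```

With this toy data, the first five descriptions end up in the larger group, from which a "major" representation would be drawn, and the last three in the smaller group, from which a "minor" representation would be drawn.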
Onomatopoeic sounds uttered by listeners of sounds might more accurately preserve acoustic information such as pitch (the fundamental frequency of a vocal sound) and sound level compared to written onomatopoeic representations. Accordingly, onomatopoeic sounds (including vocal sketching) may be advantageous as data in terms of the extraction of fine acoustic information. However, written onomatopoeia also preserve a certain amount of acoustic information. Furthermore, in Japan, onomatopoeia are not only often vocalized but also frequently used in printed matter, such as product instruction manuals in which audio signals that indicate mechanical problems are described in words. In such practical applications, there may also be cases where written onomatopoeic representations are used in the communication between customer service representatives and the users of products such as vacuum cleaners and hair dryers. Therefore, in the present study, we used written onomatopoeic stimuli rather than onomatopoeic sounds.
Table 2: Factor loading of each adjective scale for each factor.
Pair of adjectives
desirous of hearing/not desirous of hearing
Participants were also requested to provide answers by free description to questions asking about the sound sources or the phenomena that created the sounds associated with the onomatopoeic stimuli.
3.1. Analysis of Subjective Ratings
The obtained rating scores were averaged across participants for each scale and for each onomatopoeic stimulus. To compare impressions between actual sound stimuli and onomatopoeic representations, factor analysis was applied to the averaged scores for onomatopoeic representations together with those for the sound stimuli (i.e., the rating results of auditory impressions) obtained in our previous experiments.
By taking into account the factors for which the eigenvalues were more than 1, a three-factor solution was obtained. The first, second, and third factors accounted for 45.5%, 24.6%, and 9.76%, respectively, of the total variance in the data. Finally, the factor loadings for each factor on each scale were obtained using a varimax algorithm, as shown in Table 2.
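As a rough illustration of the retention rule mentioned above, the sketch below keeps the factors whose eigenvalues exceed 1 and reports each one's share of the total variance (eigenvalue divided by the sum of all eigenvalues). The eigenvalues are hypothetical, chosen only so that three factors are retained; they are not the values from this study, and the shares reported in the paper may have been computed differently (for example, after rotation).

```python
def retained_factors(eigenvalues, cutoff=1.0):
    """Return (rank, eigenvalue, percent of variance) for eigenvalues > cutoff."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    return [
        (rank, ev, 100.0 * ev / total)
        for rank, ev in enumerate(ordered, start=1)
        if ev > cutoff
    ]


# Hypothetical eigenvalues of a correlation matrix for six rating scales.
hypothetical_eigenvalues = [2.7, 1.5, 1.1, 0.4, 0.2, 0.1]
kept = retained_factors(hypothetical_eigenvalues)

for rank, ev, share in kept:
    print(f"factor {rank}: eigenvalue {ev:.2f}, {share:.1f}% of variance")
```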
The first factor is interpreted as the emotion factor because adjective pairs such as "tasteful/tasteless" and "pleasant/unpleasant" have high loadings for this factor. The second factor is interpreted as the clearness factor because adjective pairs such as "muddy/clear" and "bright/dark" have high factor loadings. The third factor is interpreted as the powerfulness factor because the adjective pairs "strong/weak," "modest/loud," and "powerful/powerless" have high factor loadings.
3.2. Analysis of Free Description Answers of Sound Source Recognition Questions
The percentage of correct answers averaged across all "major" onomatopoeic stimuli was 64.3%. On the other hand, the same percentage for "minor" onomatopoeic stimuli was 24.3%. Major onomatopoeic stimuli seemed to allow participants to better recall the corresponding sound sources. These results suggest that sound source information might be communicated by major onomatopoeic stimuli more correctly than by minor stimuli.
4.1. Comparison between Onomatopoeic Representations and Real Sound Stimuli Factor Scores
From Figure 1(a), sound stimuli such as "owl hooting (no. 6)," "vehicle horn (no. 9)," "sound of a flowing stream (no. 11)," "sound of a noisy construction site (no. 12)," and "sound of a wind chime (no. 34)" displayed highly positive or negative emotion factor scores (e.g., inducing strong impressions of tastefulness or tastelessness and pleasantness or unpleasantness). However, the factor scores for the onomatopoeic representations of the same sound stimuli were not as positively or negatively high. On the other hand, the factor scores for the "major" onomatopoeic representations of stimuli such as "sound of water dripping (no. 3)," "sound of a temple bell (no. 25)," and "beach sound (no. 30)" were nearly equal to those of the corresponding real sound stimuli.
According to Table 3, for the emotion factor, the factor scores for the real sound stimuli were closer to those for the major onomatopoeic representations than to those for the minor onomatopoeic representations. The correlation coefficient of the emotion factor scores between the real sound stimuli and the major onomatopoeic stimuli was statistically significant, whereas the emotion factor scores of the minor onomatopoeic stimuli were not correlated with those of their real sounds.
As shown in Figure 1(b), for the clearness factor, the factor scores for the major and minor onomatopoeic representations were close to those for the real sound stimuli as a whole. Table 3 also shows that the averaged factor score differences between the real sound stimuli and both the major and minor onomatopoeia were the smallest for the clearness factor. Furthermore, the correlation coefficients of the clearness factor scores between the real sound stimuli and the major or minor onomatopoeic stimuli were both statistically significant. The impressions of muddiness (or clearness) and brightness (or darkness) for the onomatopoeic representations were similar to those for the corresponding real sound stimuli.
For the powerfulness factor, factor scores for the major and minor onomatopoeia were different from those for the corresponding sound stimuli as a whole, as shown in Figure 1(c) and Table 3. Moreover, no correlation of the powerfulness factor scores between the real sound stimuli and the onomatopoeic stimuli was found.
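The two comparisons used in this section, the averaged absolute difference between factor scores and their correlation, can be sketched as below. The Pearson coefficient is an assumption (the text only says "correlation coefficient"), and the factor scores are made-up illustrative values, not data from the experiments.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def mean_abs_difference(xs, ys):
    """Average of |x - y| over paired scores."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical factor scores for five sounds and their onomatopoeia.
real_sound_scores = [1.0, -0.5, 0.2, -1.2, 0.8]
major_onoma_scores = [0.8, -0.4, 0.1, -0.9, 0.6]

r = pearson_r(real_sound_scores, major_onoma_scores)
avg_diff = mean_abs_difference(real_sound_scores, major_onoma_scores)
```

A small averaged difference together with a high correlation would correspond to the pattern reported for the clearness factor, while a large difference with no correlation would correspond to the powerfulness factor.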
These results suggest that the receiver of onomatopoeic representations can guess auditory impressions of muddiness, brightness, and sharpness (or clearness, darkness, and dullness) for real sounds relatively accurately from the onomatopoeic representations. Conversely, it seems difficult for listeners of sounds to convey impressions of strength and powerfulness using onomatopoeic representations.
In the present paper, while onomatopoeic stimuli with highly positive clearness factor scores included the Japanese vowel /o/ (e.g., the major onomatopoeic stimuli nos. 2 and 21), those with highly negative clearness factor scores included vowel /i/ (e.g., the major and minor onomatopoeic stimuli nos. 27 and 34). According to our previous study, the Japanese vowel /i/ was frequently used to represent sounds with spectral centroids at approximately 5 kHz, inducing impressions of sharpness and brightness. Conversely, vowel /o/ was frequently used to represent sounds with spectral centroids at approximately 1.5 kHz, inducing impressions of dullness and darkness. From a spectral analysis of the five Japanese vowels produced by male speakers, the spectral centroids of vowels /i/ and /o/ were actually the highest and lowest, respectively, of all the five vowels. Thus, it can be said that these vowels are at least useful in communicating information about the rough spectral characteristics of sounds.
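For readers unfamiliar with the measure, the spectral centroid is the magnitude-weighted mean frequency of a spectrum. The sketch below uses that common definition; whether the study weighted by amplitude or by power is not stated here, so this is an assumption, and the two spectra are invented toy examples.

```python
def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of a spectrum, in Hz."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


# Hypothetical three-bin spectra (not measured data).
dull_centroid = spectral_centroid([500, 1500, 2500], [1.0, 2.0, 1.0])     # 1500.0
bright_centroid = spectral_centroid([4000, 5000, 6000], [1.0, 2.0, 1.0])  # 5000.0
```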
As mentioned above, the emotion factor scores of the real sound stimuli and the major onomatopoeic stimuli differed relatively little and were significantly correlated. Participants could recognize the sound source or the phenomenon creating the sound more accurately from the major onomatopoeic stimuli, as shown in Figure 2.
Preis et al. have pointed out that sound source recognition influences differences in annoyance ratings between bus recordings and "bus-like" noises, which were generated from white noise to have spectral and temporal characteristics similar to those of original bus sounds. Similarly, in the case of the present paper, good recognition of sound sources may be the reason why the emotional impressions of the major onomatopoeic stimuli were similar to those for the real sound stimuli.
In our previous study, we found that the powerfulness impressions of sounds were significantly correlated with the number of voiced consonants. However, as shown in Figure 1(c), the auditory imagery of onomatopoeic stimuli containing voiced consonants (i.e., nos. 26 and 35) was different from the auditory impressions evoked by real sounds. Thus, we can conclude that it is difficult to communicate the powerfulness impression of sounds by voiced consonants alone.
4.2. Effects of Sound Source Recognition on the Differences between the Impressions Associated with Onomatopoeic Representations and Those for Real Sounds
As mentioned in the previous section regarding the emotion factor, there is the possibility that differences in impressions between real sound stimuli and onomatopoeic representations may be influenced by sound source recognition. That is, impressions of onomatopoeic representations may be similar to those for real sound stimuli when the sound source can be correctly recognized from the onomatopoeic representations. To investigate this point for each of the three factors, the absolute differences between the factor scores for the onomatopoeic representations and those for the corresponding sound stimuli were averaged for each of two groups of onomatopoeic representations: one group composed of onomatopoeic stimuli for which more than 50% of the participants correctly answered the sound source question, and another group composed of those for which less than 50% of the participants correctly answered the sound source question (see Figure 2). These two groups comprised 30 and 42 representations, respectively, from the 72 total onomatopoeic representations.
Absolute differences between factor scores for onomatopoeic representations and those for real sound stimuli, averaged for each of the two groups of onomatopoeic representations: those for which more than 50% of participants had correct sound source identifications, and those for which less than 50% of participants had correct identifications (standard deviations shown in parentheses).
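The grouping and averaging described above can be sketched as follows. The function name, the tuple layout, and all numbers are hypothetical; the sample standard deviation is used here as an assumption, since the text does not say which form was reported.

```python
from statistics import mean, stdev


def summarize_by_recognition(items, threshold=0.5):
    """Mean and SD of |onomatopoeia score - sound score| per recognition group.

    `items` holds (correct-answer rate, onomatopoeia score, real-sound score).
    Each group needs at least two items for the standard deviation.
    """
    groups = {"above": [], "below": []}
    for rate, onoma_score, sound_score in items:
        diff = abs(onoma_score - sound_score)
        if rate > threshold:
            groups["above"].append(diff)
        elif rate < threshold:
            groups["below"].append(diff)
        # A rate of exactly 50% belongs to neither group as described.
    return {name: (mean(d), stdev(d)) for name, d in groups.items()}


# Hypothetical stimuli: recognition rate and factor scores on one factor.
stimuli = [
    (0.9, 0.5, 0.7),
    (0.8, -0.2, 0.1),
    (0.2, 1.0, -0.5),
    (0.1, -1.0, 0.0),
]
summary = summarize_by_recognition(stimuli)
```

Running the same summary once per factor (emotion, clearness, powerfulness) would give the layout of means with standard deviations in parentheses that the caption describes.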
The auditory imagery of sounds evoked by "major" and "minor" onomatopoeic stimuli was measured using the semantic differential method. From a comparison of impressions made by real sounds and their onomatopoeic stimuli counterparts, the clearness impressions for both sounds and major and minor onomatopoeic stimuli were found to be similar, as were the emotional impressions for the real sounds and the major onomatopoeic stimuli. Furthermore, the recognition of a sound source from an onomatopoeic stimulus was found to influence the similarity between the emotional impressions evoked by such onomatopoeic representations and their corresponding real sound stimuli, although this effect was not found for the factors of clearness and powerfulness. These results revealed that it was relatively easy to communicate information about impressions of clearness, including the muddiness, brightness, and sharpness of sounds, to others using onomatopoeic representations. These impressions were mainly related to the spectral characteristics of the sounds. These results also indicate that we can communicate emotional impressions through onomatopoeic representations, enabling listeners to imagine the sound source correctly. Onomatopoeia can therefore be used as a method of obtaining or describing information about the spectral characteristics of sound sources in addition to the auditory imagery they evoke.
The authors would like to thank all of the participants for their participation in the experiments. This paper was supported by a Grant-in-Aid for Scientific Research (no. 15300074) from the Ministry of Education, Culture, Sports, Science, and Technology.
- Tanaka K, Matsubara K, Sato T: Onomatopoeia expression for strange noise of machines. Journal of the Acoustical Society of Japan 1997, 53(6):477-482.
- Iwamiya S, Nakagawa M: Classification of audio signals using onomatopoeia. Soundscape 2000, 2:23-30.
- Hiyane K, Sawabe N, Iio J: Study of spectrum structure of short-time sounds and its onomatopoeia expression. Technical Report of IEICE 1998, (SP97-125):65-72.
- Sato T, Ohno M, Tanaka K: Extraction of physical characteristics from onomatopoeia: Relationship between actual sounds, uttered sounds and their corresponding onomatopoeia. Proceedings of the Forum Acusticum, 2005, Budapest, Hungary 1763-1768.
- Takada M, Tanaka K, Iwamiya S, Kawahara K, Takanashi A, Mori A: Onomatopoeic features of sounds emitted from laser printers and copy machines and their contribution to product image. Proceedings of 17th International Congress on Acoustics, 2001 3C.16.01.
- Yamauchi K, Takada M, Iwamiya S: Functional imagery and onomatopoeic representation of auditory signals. Journal of the Acoustical Society of Japan 2003, 59(4):192-202.
- Takada M, Tanaka K, Iwamiya S: Relationships between auditory impressions and onomatopoeic features for environmental sounds. Acoustical Science and Technology 2006, 27(2):67-79. 10.1250/ast.27.67
- Shiraishi K, Sakata T, Sueta T, et al.: Multivariate analysis using quantification theory to evaluate acoustic characteristics of the onomatopoeic expression of tinnitus. Audiology Japan 2004, 47:168-174. 10.4295/audiology.47.168
- Wake SH, Asahi T: Sound retrieval with intuitive verbal descriptions. IEICE Transactions on Information and Systems 2001, E84(11):1568-1576.
- Sonic Interaction Design: Sketching Sonic Interaction Design. Proceedings of the SID Workshop, 2008 http://www.cost-sid.org/wiki/HolonWorkshop
- Guski R: Psychological methods for evaluating sound quality and assessing acoustic information. Acta Acustica United with Acustica 1997, 83(5):765-774.
- Gygi B, Kidd GR, Watson CS: Spectral-temporal factors in the identification of environmental sounds. Journal of the Acoustical Society of America 2004, 115(3):1252-1265. 10.1121/1.1635840
- Warren WH, Verbrugge RR: Auditory perception of breaking and bouncing events: A case study in ecological acoustics. Journal of Experimental Psychology: Human Perception and Performance 1984, 10(5):704-712.
- Ballas JA: Common factors in the identification of an assortment of brief every day sounds. Journal of Experimental Psychology: Human Perception and Performance 1993, 19(2):250-267.
- Rosenblum LD: Perceiving articulatory events: Lessons for an ecological psychoacoustics. In Ecological Psychoacoustics. Edited by: Neuhoff JG. Elsevier Academic Press, San Diego, Calif, USA; 2004:219-248.
- Fujisawa N, Obata F, Takada M, Iwamiya S: Impression of auditory imagery associated with Japanese 2-mora onomatopoeic representation. Journal of the Acoustical Society of Japan 2006, 62(11):774-783.
- International Phonetic Association: Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press, Cambridge, UK; 1999.
- Asano T: The Dictionary of Onomatopoeia. Kadokawa Books, Tokyo, Japan; 1978.
- Osgood CE, Suci GJ, Tannenbaum PH: The Measurement of Meaning. University of Illinois Press, Chicago, USA; 1957.
- Gaver WW: What in the world do we hear? An ecological approach to auditory event perception. Ecological Psychology 1993, 5(1):1-29. 10.1207/s15326969eco0501_1
- Preis A, Hafke H, Kaczmarek T: Influence of sound source recognition on annoyance judgment. Noise Control Engineering Journal 2008, 56(4):288-299. 10.3397/1.2949893
- von Bismarck G: Timbre of steady sounds: A factorial investigation of its verbal attributes. Acustica 1974, 30:146-159.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.