Academy of Education Journal
Vol. 15, No. 1, January 2024, Page: 895-901
ISSN: 1907-2341 (Print), ISSN: 2685-4031 (Online)
Afore Tahir Harefa (Selected models of speech.)
Selected models of speech perception used by fifth-semester
students of the English Education Study Program,
Universitas Nias, during English lectures
Afore Tahir Harefa, Mercy Dwiyul Halawa, Delan Septiani Hulu, Enjel Tris Jelita Telaumbanua
English Education Study Program, Faculty of Teacher Training and Education, Universitas Nias, Indonesia
septianidela[email protected];
telaumbanuaenjeltrisjelit[email protected]
Email Corresponding:
Article history:
Accepted: 24 December 2023
Revised: 27 January 2024
Approved: 25 March 2024
Available online: 12 April 2024
Understanding speech (speech perception) is an important ability that
English language students must master. Several theoretical models of speech
perception attempt to explain the cognitive and linguistic processes
underlying this ability, including 1) the Motor Theory of Speech Perception,
2) Analysis-by-Synthesis, 3) the Cohort Model, 4) the Fuzzy Logical Model, and
5) the TRACE Model. This study aims to find out which speech perception model
students use in lectures, especially in the process of teaching and learning
English. The research is classified as qualitative research. The participants
were drawn from the population of 5th semester students of the English
education study program at Nias University; a random sample of 7 students
from class A was used. The results show that 4 (four) students are classified
as using the Cohort model, 2 (two) students as using the Motor Theory model,
and 1 (one) student as using the Fuzzy Logical Model. We can therefore
conclude that most students use the Cohort model and that they often use this
model in everyday English learning activities.
Speech Perception
©2024, Afore Tahir Harefa, Mercy Dwiyul Halawa, Delan Septiani Hulu, Enjel Tris Jelita Telaumbanua
This is an open access article under CC BY-SA license
1. Introduction
Understanding speech (speech perception) is an important ability that English language
students must master. Several theoretical models of speech comprehension attempt to explain
the cognitive and linguistic processes underlying this ability. These models range from
perceptual processing, auditory analytical processing, and auditory memory processing to
interactive bottom-up and top-down processing (Huettig & Brouwer, 2015). However, these
models do not yet comprehensively describe students' speech understanding during lectures in
the English language learning context. The ability to understand and process native speech
accurately is necessary for academic success among students learning English in a foreign
language context (Graham, 2017). Understanding English lectures, however, raises various
difficulties that span the perceptual, linguistic, cognitive, and pedagogical dimensions of
the listening process (Matthee & Unger, 2021; Vandergrift & Goh, 2012). Speech perception
barriers arise from complex interactions between lecture features, individual listener
capacities, and learning conditions (Siegel, 2016).
Recent research shows that contextual factors such as the content of the lecture material,
the lecturer's presentation style, classroom acoustics, and differences in individual student
characteristics also influence speech understanding (Wagner, 2016). In addition, the coping
strategies students use when they have difficulty understanding speech during lectures need to
be taken into account. For example, students usually ask the lecturer to repeat an utterance,
or note down utterances to identify key words (Vandergrift & Goh, 2012). Thus, a
comprehensive model of English language learners' understanding of utterances during lectures
needs to consider the various factors above. This study aims to find out which speech
perception model students use in lectures, especially in the process of teaching and
learning English. The models considered here aim to describe the interaction between the
processes of phoneme recognition and word recognition. They are as follows:
Motor Theory of Speech Perception
The main thesis of the motor theory is that, at some point in the speech perception
process, speech signals are interpreted by reference to motor speech movements. This theory
directly links the processes of speech production with speech perception by stating that we
perceive speech in terms of how we produce speech sounds. The theory was advanced by
Liberman and his colleagues at Haskins Laboratories (Liberman, Cooper, Shankweiler, &
Studdert-Kennedy, 1967; Liberman, 1970). It was developed to deal with the absence of
invariance between the acoustic signal and its phonemic representation.
Analysis-by-Synthesis Model
The basic assumptions of the analysis-by-synthesis model proposed by Stevens (1960)
and Stevens and Halle (1967) are similar to the motor theory in that speech perception and
production are closely tied. This model assumes that we make use of an abstract distinctive
features matrix in a system of matching that is crucial to the speech perception process. The
major claim of the theory is that listeners perceive (analyze) speech by implicitly generating
(synthesizing) speech from what they have heard and then comparing the "synthesized" speech
with the auditory stimulus. According to this model, the perceptual process begins with
analysis of auditory features of the speech signal to yield a description in terms of auditory
patterns. A hypothesis concerning the representation of the utterance in terms of distinctive
features is constructed. In cases where phonetic features are not strongly influenced by context
and thus contain an invariant attribute, the auditory patterns are tentatively decoded into
phonemes. When no invariant attributes identify a phonetic feature, additional processing is
required.
Cohort Model
This model of word recognition was developed by Marslen-Wilson and his colleagues
(Marslen-Wilson & Welsh, 1978; Marslen-Wilson, 1987) and consists of two stages. In the first
stage of word recognition, the acoustic-phonetic information at the beginning of a target word
activates all words in memory that resemble it. For example, if the word is drive, then words
beginning with /d/ are activated (dive, drink, date, dunk, and so on). These activated words
make up the "cohort." The activation of the cohort words is achieved on the basis of the
acoustic information in the target word and is not influenced by other levels of analysis. The
second stage of word recognition begins once a cohort structure is activated. In this second
stage, all possible sources of information may influence the selection of the target word from
the cohort. These interactive sources of information work toward eliminating words that don't
resemble the target word. For example, further acoustic-phonetic information may eliminate some
of the cohort words (date and dunk), and higher-level sources of information may eliminate
other members of the cohort that might not fit with the available semantic or syntactic information
(such as drink). Finally, word recognition is achieved when a single candidate
remains in the cohort.
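The two stages described above can be sketched in a few lines of Python. This is only an illustrative sketch, not an implementation taken from Marslen-Wilson's work: the toy lexicon, the use of letters as stand-ins for phonemes, and the names `initial_cohort`, `narrow_cohort`, `recognize`, and `fits_context` are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional, Set

# Toy lexicon; letters stand in for phonemes (an illustrative simplification).
LEXICON = ["drive", "dive", "drink", "date", "dunk", "cat"]


def initial_cohort(onset: str, lexicon: Iterable[str]) -> Set[str]:
    """Stage 1: activate every word whose beginning matches the heard onset."""
    return {word for word in lexicon if word.startswith(onset)}


def narrow_cohort(cohort: Set[str], heard_so_far: str,
                  fits_context: Callable[[str], bool]) -> Set[str]:
    """Stage 2: keep only words consistent with further input and with the context."""
    return {w for w in cohort if w.startswith(heard_so_far) and fits_context(w)}


def recognize(target: str, lexicon: Iterable[str],
              fits_context: Callable[[str], bool] = lambda word: True) -> Optional[str]:
    """Return the single remaining candidate, or None if the cohort never narrows to one."""
    cohort = initial_cohort(target[:1], lexicon)
    for end in range(2, len(target) + 1):
        if len(cohort) <= 1:
            break
        cohort = narrow_cohort(cohort, target[:end], fits_context)
    return next(iter(cohort)) if len(cohort) == 1 else None


print(sorted(initial_cohort("d", LEXICON)))  # the cohort activated by the onset
print(recognize("drive", LEXICON))           # the candidate left after elimination
```

Passing a `fits_context` function that rejects some words (for instance, one that rules out drink) mimics the higher-level semantic or syntactic information that removes candidates before the acoustic input alone would.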
Fuzzy Logical Model
Speech perception, according to this model, is a prime example of pattern recognition
(Massaro, 1987, 1989; Massaro & Oden, 1980). The model assumes three operations in speech
perception: feature evaluation, feature integration, and decision. The model makes use of the
idea of prototypes, which are summary descriptions of the perceptual units of language and
contain a conjunction of various distinctive features. The features of the prototype correspond
to the ideal values that a token should have if it is a member of that category. Continuously fed
feature information is evaluated, integrated, and matched against prototype descriptions in
memory, and an identification decision is made on the basis of the relative goodness of match
of the stimulus information with the relevant prototype descriptions.
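The three operations named above (feature evaluation, feature integration, and decision) can be illustrated with a small Python sketch. The prototypes, the feature names, and the evaluation rule (one minus the distance from the ideal value) are assumptions chosen for illustration; the multiplicative integration and the relative goodness-of-match decision follow the general description of the model, but this is not Massaro's published implementation.

```python
from math import prod
from typing import Dict

Features = Dict[str, float]  # feature name -> value between 0 and 1

# Hypothetical prototypes: the ideal feature values for two categories.
PROTOTYPES: Dict[str, Features] = {
    "b": {"voicing": 1.0, "labial": 1.0},
    "p": {"voicing": 0.0, "labial": 1.0},
}


def evaluate(stimulus: Features, prototype: Features) -> Features:
    """Feature evaluation: degree (0..1) to which each feature matches its ideal value."""
    return {name: 1.0 - abs(stimulus[name] - ideal) for name, ideal in prototype.items()}


def integrate(matches: Features) -> float:
    """Feature integration: combine the feature matches multiplicatively."""
    return prod(matches.values())


def decide(stimulus: Features, prototypes: Dict[str, Features]) -> Dict[str, float]:
    """Decision: goodness of match of each prototype relative to all prototypes."""
    goodness = {label: integrate(evaluate(stimulus, proto))
                for label, proto in prototypes.items()}
    total = sum(goodness.values())
    if total == 0:
        return {label: 0.0 for label in goodness}
    return {label: value / total for label, value in goodness.items()}


# A token that is mostly voiced and clearly labial.
probabilities = decide({"voicing": 0.8, "labial": 0.9}, PROTOTYPES)
print(max(probabilities, key=probabilities.get))
```

Because the decision is relative, an ambiguous token does not simply fail to match; it receives graded support for each category, which is the sense in which the model is "fuzzy".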
TRACE Model
This is a neural network model developed by Elman and McClelland (1984, 1986). It
states that processing occurs through excitatory and inhibitory connections among numerous
processing units called nodes. Phonetic or distinctive features, phonemes, and words constitute
nodes that represent different levels of processing. Each node has a resting level, a threshold,
and an activation level that signifies the degree to which the input is consistent with the unit
that the node represents.
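The notion of nodes with a resting level, a threshold, and an activation level, linked by excitatory and inhibitory connections, can be made concrete with a deliberately simplified Python sketch. The update rule, the weights, the decay value, and the three-node network below are all assumptions for illustration; they are not the equations or parameters of the published TRACE model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Node:
    """A processing unit with a resting level, a threshold, and an activation level."""
    rest: float = 0.0
    threshold: float = 0.2
    activation: float = 0.0

    def output(self) -> float:
        """A node influences its neighbours only above its threshold."""
        return max(0.0, self.activation - self.threshold)


def step(nodes: Dict[str, Node], weights: Dict[Tuple[str, str], float],
         external: Dict[str, float], decay: float = 0.1) -> None:
    """One synchronous update; positive weights excite, negative weights inhibit."""
    outputs = {name: node.output() for name, node in nodes.items()}
    for name, node in nodes.items():
        net = external.get(name, 0.0) + sum(
            weight * outputs[source]
            for (source, target), weight in weights.items() if target == name)
        updated = node.activation + net - decay * (node.activation - node.rest)
        node.activation = min(1.0, max(-1.0, updated))  # keep activation bounded


nodes = {"/d/": Node(), "drive": Node(), "cat": Node()}
weights = {
    ("/d/", "drive"): 0.5,    # phoneme node excites a word that contains it
    ("drive", "cat"): -0.5,   # word nodes inhibit each other
    ("cat", "drive"): -0.5,
}
for _ in range(4):
    step(nodes, weights, external={"/d/": 0.5})  # acoustic support for /d/
print({name: round(node.activation, 3) for name, node in nodes.items()})
```

In this toy network, input consistent with /d/ raises the activation of the word node it supports, and that word node in turn pushes its competitor below its resting level.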
In particular, the model for understanding speech during lectures also needs to be
adapted to the stages of development of students' English language skills (Mora & Valls‐Ferrer,
2012). For example, students at the final level can be expected to understand speech better
than students at the first level. Likewise, students majoring in English usually understand
English speech better than students in other majors. Therefore, model development must
consider the diversity of students' background experiences and English language proficiency.
2. Method
The participants were drawn from the population of 5th semester students of the English
education study program at Nias University. A random sample of 7 students from class A was
used. This choice serves the aim of finding out which model these students predominantly use
in speech perception. This research is classified as qualitative research. Qualitative
research is based on inductive reasoning and on objective, participatory observation of
social phenomena. Problematic social phenomena may concern the past, the present, and even the
future, and relate to social studies subjects such as economics, culture, law, history, and
the humanities (Suyitno in Islamuddin et al., 2023). According to Walidin et al. in Fadli
(2021), qualitative research is a procedure for understanding human or societal aspects of
phenomena by developing a comprehensive and complex picture that can be expressed verbally,
that reports detailed perspectives obtained from informants, and that is carried out in an
authentic environment. Qualitative research thus emphasizes how people interpret their
experiences in order to understand the social reality they live in.
Data were collected, analyzed, and interpreted using several methods: interviews, tests
using audio recordings, and observation. An open-ended questionnaire was also used. Because
this research requires information from English students in lectures, the main research
instrument is a test. It is a speech perception test that examines students' perception of
the sounds they hear, and it is available in two drafts, draft A and draft B. The interviews
serve as a data collection method in which the researcher and the subject under
investigation engage in a process of asking and responding, as described by Abdusama et al. in
Lase (2023). The structured test lasts 5 to 10 minutes and is conducted in
English as the main language.
3. Result and Discussion
a. Test
Speech Perception Test
Instructions: Please listen to the following words, then repeat and write them after me.
This test is designed to measure your ability to identify individual speech sounds. The words in
the test are all very similar, except for the initial consonant sound. If you are able to accurately
identify all of the words in the test, then it suggests that you have good speech perception.
Another example of a speech perception test is the Benchmark Sentence Test. This test consists
of 50 sentences that are spoken by a male talker in quiet conditions. The listener is asked to
repeat each sentence back to the examiner. The test is scored on the number of words that are
correctly repeated.
Speech perception tests can be used to assess the speech perception skills of children and
adults. They can also be used to monitor the progress of individuals who are receiving speech
Here are some other examples of speech perception tests:
Sentence repetition test
In this test, the listener is presented with a sentence and asked to repeat it back to the speaker.
The sentences are typically spoken in noise to make it more challenging.
Speaker: "The green cat chased the red mouse."
Listener: "The green cat chased the red mouse."
The listener's responses are scored to determine their accuracy. A high score on a sentence
repetition test indicates that the listener has good speech perception skills.
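Word-level scoring of a sentence repetition item can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring procedure used in this study: the function names are hypothetical, word order is ignored, and the listener response in the example is invented for demonstration.

```python
import re
from collections import Counter
from typing import List, Tuple


def tokenize(sentence: str) -> List[str]:
    """Lower-case the sentence and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())


def score_repetition(target: str, response: str) -> Tuple[int, int]:
    """Return (words correctly repeated, words in the target), ignoring word order."""
    target_counts = Counter(tokenize(target))
    response_counts = Counter(tokenize(response))
    # Multiset intersection: a repeated word such as "the" counts once per occurrence.
    correct = sum((target_counts & response_counts).values())
    return correct, sum(target_counts.values())


TARGET = "The green cat chased the red mouse."
# Hypothetical listener response, for illustration only.
correct, total = score_repetition(TARGET, "The green cat chased the mouse")
print(f"{correct}/{total} words correct")
```

A stricter scheme could also require the words to appear in the correct order; the order-free count is used here only because it is the simplest reading of "number of words correctly repeated".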
These are just two examples of speech perception tests. There are many other types of speech
perception tests that can be used to assess different aspects of speech perception.
Speech perception tests are used by audiologists and speech-language pathologists to
diagnose and assess speech perception problems in children and adults. They are also used by
researchers to study speech perception and to develop new speech recognition technologies.
b. Test Results
First student
Test 1: bat, cat, hat, march, put, red, share, fair
Test 2: the green cat chees red knows
Second student
Test 1: bad, can’t, hell, mouth, head, great, sad, fant
Test 2: the green cat changes the rednols
Third student
Test 1: hat, can, hand, ment, pant, blaund, sand, ahand
Test 2: degreen can to nouns
Fourth student
Test 1: bat, cat, hat, met, pet, rat, sad, vet
Test 2: The green karcis to rain house
Fifth student
Test 1: but, cat, hat, mad, pet, sad, vat
Test 2: The green the rain mos
Sixth student
Test 1: bat, ket, het, mad, pet, set, hat
Test 2: the degree mous
Seventh student
Test 1: bet, ket, het, med, pet, raw, sent, pat
Test 2: the green cat cis the green mous
c. Interview Result
(What is your reason for answering/writing answers to tests 1 and 2 like that?)
First student
“I write what I hear based on sounds that I know, for example bats, cats and others. I
rarely hear the words I heard earlier so I have difficulty getting the meaning and have
difficulty writing the correct words.”
Second student
“I can't confirm whether the answer I wrote was correct or not, because I heard the
voice but didn't see the person speaking, so it's hard for me to get the
correct meaning of the word. Usually, I see people speaking English directly so that I
don't misinterpret the word.”
Third student
“The answer I wrote is an answer that I am sure is correct based on what I heard, but I
am also sure that not everything I wrote is correct.”
Fourth student
“The sentence I heard was ‘the green ticket to rain house’. I am only sure of the words
‘the’, ‘green’ and ‘ticket’ because the pronunciation is clear and easy for me to understand,
while I am not sure about the next word, such as ‘house’ or ‘mouse’, so I just wrote ‘house’.”
Fifth student
“I hesitated to write my answer because I listened without seeing the person speaking
directly. Usually, if I hear someone speaking English, I pay attention to the way he/she
speaks so that I can capture the real meaning. In English there are many words that
sound almost the same, so sometimes to distinguish them I look directly at the person.”
Sixth student
“I wrote these words based on what I heard, but to get the true meaning, I remembered
the words that I had memorized. Some of the words I heard I had never heard
before, so I couldn't get the meaning.”
Seventh student
“I wrote it again based on the words I heard but I had difficulty determining the correct
words or sentences because the recording was too fast. So based on the vocabulary I
heard, that's how I answered/rewrote the sounds from the test.”
The test and interview results of the first, fourth, fifth and seventh students point in the
same direction. They said that they answered the questions based on words they had mastered
before. When they heard words played from the audio, they recalled words they had heard
before and wrote down their answers on that basis. The Cohort model (Marslen-Wilson & Welsh,
1978; Marslen-Wilson, 1987) is a two-stage word recognition model developed by Marslen-Wilson
and colleagues, in which the onset of a target word activates all words in memory that
resemble it. We can conclude that these students use this model because they rely on their
memory of known words to get the meaning of the words they hear.
The third student said that he perceived each word based on what he heard and what made
sense to him. When a word did not make sense to him, he was not sure about the answer he
wrote down.
According to the Fuzzy Logical Model, speech perception is a prime example of pattern
recognition (Massaro, 1987, 1989; Massaro & Oden, 1980). The model makes use of the idea of
prototypes, which are summary descriptions of the perceptual units of language and contain a
conjunction of various distinctive features. The features of the prototype correspond to the
ideal values that a token should have if it is a member of that category. We can conclude that
he used the Fuzzy Logical Model to interpret the meaning of sentences and words in English.
The second and sixth students said that they can usually grasp the meaning of an English
word from the way it is pronounced, and that they understand better when they see the
person speaking directly. Liberman, Cooper, Shankweiler, and Studdert-Kennedy (1967) and
Liberman (1970) put forward the Motor Theory, whose main thesis is that, at some point in the
speech perception process, speech signals are interpreted by reference to motor speech
movements. It can therefore be said that these students use this model in speech perception.
4. Conclusion
The results show that 4 (four) students are classified as using the Cohort model, 2 (two)
students as using the Motor Theory model, and 1 (one) student as using the Fuzzy Logical
Model. We can therefore conclude that most students use the Cohort model and that they often
use this model in everyday English learning activities.
The model for understanding speech during lectures also needs to be adapted to the
developmental stages of students' English language ability. Model development should
therefore consider the diversity of students' English language experience and achievement.
5. References
Huettig, F., & Brouwer, S. (2015). Delayed anticipatory spoken language processing in
adults with dyslexia: Evidence from eye‐tracking. Dyslexia, 21(2), 97-122.
Mora, J. C., & Valls‐Ferrer, M. (2012). Oral fluency, accuracy, and complexity in formal
instruction and study abroad learning contexts. TESOL Quarterly, 46(4), 610-641.
Vandergrift, L., & Goh, C. (2012). Teaching and learning second language listening:
Metacognition in action. New York: Routledge.
Samuel, A. G. (2011). Speech perception. Annual Review of Psychology, 62, 49-72.
Werker, J. F., & Hensch, T. K. (2015). Critical periods in speech perception: New
directions. Annual Review of Psychology, 66, 173-196.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech
perception. Cognitive Psychology, 18(1), 1-86.
McBride‐Chang, C. (1996). Models of speech perception and phonological processing in
reading. Child Development, 67(4), 1836-1856.
McQueen, J. M. (2005). Speech perception. The Handbook of Cognition, 255-275.
Bianco, R., & Chait, M. (2023). No Link Between Speech-in-Noise Perception and Auditory
Sensory Memory: Evidence From a Large Cohort of Older and Younger
Listeners. Trends in Hearing, 27, 23312165231190688.
Kelly, G. J. (2023). Qualitative research as culture and practice. Handbook of Research on
Science Education, 60-86.
Leeming, D. (2018). The use of theory in qualitative research. Journal of Human
Lactation, 34(4), 668-673.
Aspers, P., & Corte, U. (2019). What is qualitative in qualitative research. Qualitative
Sociology, 42, 139-160.
Le Grange, L. (2018). What is (post) qualitative research? South African Journal of Higher
Education, 32(5), 1-14.
Gleason, J. B. (1998). Psycholinguistics (2nd ed.). Fort Worth: Harcourt Brace College
Publishers.