Academy of Education Journal

Vol. 15, No. 1, January 2024, Page: 895-901

ISSN: 1907-2341 (Print), ISSN: 2685-4031 (Online)

895

Afore Tahir Harefa et.al (Selected models of speech….)

Selected models of speech perception used by 5th-semester students of the English Education Study Program, Universitas Nias, during English lectures

Afore Tahir Harefa (a,1), Mercy Dwiyul Halawa (b,2), Delan Septiani Hulu (c,3), Enjel Tris Jelita Telaumbanua (d,4)

(a,b,c,d) English Education Study Program, Faculty of Teacher Training and Education, Universitas Nias, Indonesia

aforetahirharefa@gmail.com;

mercyhalawa26@gmail.com;

septianidela[email protected];

telaumbanuaenjeltrisjelit[email protected]

Email Corresponding: aforetahirharefa@gmail.com

ARTICLE INFO

ABSTRACT

Article history:

Accepted: 24 December 2023

Revised: 27 January 2024

Approved: 25 March 2024

Available online: 12 April 2024

Speech perception, the ability to understand speech, is an important skill that students of English must master. Several theoretical models attempt to explain the cognitive and linguistic processes underlying this ability, including 1) the Motor Theory of Speech Perception, 2) Analysis-by-Synthesis, 3) the Cohort Model, 4) the Fuzzy Logical Model, and 5) the TRACE Model. This study aims to find out which speech perception model students use in lectures, especially in the process of teaching and learning English. The research is qualitative. The participants were drawn from the 5th-semester students of the English Education Study Program at Universitas Nias; random sampling was used, and the sample consisted of 7 students from class A. The results show that 4 (four) students were classified as using the Cohort model, 2 (two) as using the Motor Theory model, and 1 (one) as using the Fuzzy Logical Model. We conclude that most students use the Cohort model, which they often apply in their daily English learning activities.

Keywords:

Speech Perception

Models

This is an open access article under CC BY-SA license

1. Introduction

Speech perception, the ability to understand speech, is an important skill that students of English must master. Several theoretical models of speech comprehension attempt to explain the cognitive and linguistic processes underlying this ability. These range from models of perceptual processing, auditory analytical processing, and auditory memory processing to models of bottom-up and top-down interactive processing (Huettig & Brouwer, 2015). However, these models do not yet comprehensively describe students' speech understanding during lectures in the English language learning context. The ability to understand and process native speech accurately is necessary for academic success among students learning English in a foreign language context (Graham, 2017). Understanding English lectures nevertheless raises difficulties that span the perceptual, linguistic, cognitive, and pedagogical dimensions of the listening process (Matthee & Unger, 2021; Vandergrift & Goh, 2012). Speech perception barriers arise from complex interactions between lecture features, individual listener capacities, and learning conditions (Siegel, 2016).


Recent research shows that contextual factors such as the content of the lecture material, the lecturer's presentation style, classroom acoustics, and differences in individual student characteristics also influence speech understanding (Wagner, 2016). In addition, the strategies students use to cope with difficulties in understanding speech during lectures need to be taken into account. For example, students commonly ask the lecturer to repeat an utterance, or note down utterances to identify key words (Vandergrift & Goh, 2012). A comprehensive model of how English language learners understand utterances during lectures therefore needs to consider the factors above. This study aims to find out which speech perception model students use in lectures, especially in the process of teaching and learning English. The models considered, which aim to describe the interaction between phoneme recognition and word recognition, are the following:

Motor Theory of Speech Perception

The main thesis of the motor theory is that, at some point in the speech perception process, speech signals are interpreted by reference to motor speech movements. This theory directly links the processes of speech production with speech perception by stating that we perceive speech in terms of how we produce speech sounds. The theory was advanced by Liberman and his colleagues at Haskins Laboratories (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Liberman, 1970). It was developed to deal with the absence of invariance between the acoustic signal and its phonemic representation.

Analysis-by-Synthesis

The basic assumptions of the analysis-by-synthesis model proposed by Stevens (1960) and Stevens and Halle (1967) are similar to those of the motor theory in that speech perception and production are closely tied. This model assumes that we make use of an abstract distinctive features matrix in a system of matching that is crucial to the speech perception process. The major claim of the theory is that listeners perceive (analyze) speech by implicitly generating (synthesizing) speech from what they have heard and then comparing the "synthesized" speech with the auditory stimulus. According to this model, the perceptual process begins with analysis of auditory features of the speech signal to yield a description in terms of auditory patterns. A hypothesis concerning the representation of the utterance in terms of distinctive features is then constructed. In cases where phonetic features are not strongly influenced by context and thus contain an invariant attribute, the auditory patterns are tentatively decoded into phonemes. When no invariant attributes identify a phonetic feature, additional processing is required.

Cohort Model

This model of word recognition was developed by Marslen-Wilson and his colleagues (Marslen-Wilson & Welsh, 1978; Marslen-Wilson, 1987) and consists of two stages. In the first stage of word recognition, the acoustic-phonetic information at the beginning of a target word activates all words in memory that resemble it. For example, if the word is drive, then words beginning with /d/ are activated (dive, drink, date, dunk, and so on). These activated words make up the "cohort." The activation of the cohort words is achieved on the basis of the acoustic information in the target word and is not influenced by other levels of analysis. The second stage of word recognition begins once a cohort structure is activated. In this second stage, all possible sources of information may influence the selection of the target word from the cohort. These interactive sources of information work toward eliminating words that don't resemble the target word. For example, further acoustic-phonetic information may eliminate some of the cohort words (date and dunk), and higher-level sources of information may appear and eliminate other members of the cohort that might not fit with the available semantic or syntactic information (drink). Finally, word recognition is achieved when a single candidate remains in the cohort.
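The two stages of the Cohort model can be illustrated with a minimal Python sketch. It is only an illustration under simplifying assumptions: spelling stands in for the acoustic-phonetic signal, the small lexicon is taken from the drive example, and the function names and the `fits_context` check are hypothetical stand-ins for semantic or syntactic information.

```python
def initial_cohort(lexicon, onset):
    """Stage 1: activate every word whose beginning matches the heard onset."""
    return [word for word in lexicon if word.startswith(onset)]


def narrow_cohort(cohort, heard_so_far, fits_context=lambda word: True):
    """Stage 2: drop candidates that conflict with further input or with context."""
    return [
        word for word in cohort
        if word.startswith(heard_so_far) and fits_context(word)
    ]


# Hypothetical mini-lexicon; letters are used instead of phonemes for simplicity.
lexicon = ["drive", "dive", "drink", "date", "dunk", "cat"]

cohort = initial_cohort(lexicon, "d")        # all words beginning with "d"
after_more_input = narrow_cohort(cohort, "dr")  # further input removes non-matching words
# Assumed context: the sentence calls for something other than "drink".
recognized = narrow_cohort(after_more_input, "dr", fits_context=lambda w: w != "drink")

print(cohort)
print(after_more_input)
print(recognized)
```

Recognition is complete when the list holds a single candidate. Because letters replace sounds here, the order in which candidates drop out differs slightly from the phonetic account in the text.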

Fuzzy Logical Model

Speech perception, according to this model, is a prime example of pattern recognition (Massaro, 1987, 1989; Massaro & Oden, 1980). The model assumes three operations in speech perception: feature evaluation, feature integration, and decision. The model makes use of the idea of prototypes, which are summary descriptions of the perceptual units of language and contain a conjunction of various distinctive features. The features of the prototype correspond to the ideal values that a token should have if it is a member of that category. Continuously fed feature information is evaluated, integrated, and matched against prototype descriptions in memory, and an identification decision is made on the basis of the relative goodness of match of the stimulus information with the relevant prototype descriptions.
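The three operations (feature evaluation, feature integration, and decision) can be sketched as follows. This is a hedged illustration, not the authors' procedure: the fuzzy truth values are invented, the prototype labels "ba" and "da" are hypothetical, and multiplying the feature values is an assumed integration rule; only the decision by relative goodness of match is taken directly from the description above.

```python
from math import prod


def flmp_decision(support):
    """Return the relative goodness of match for each prototype.

    support maps a prototype name to a list of fuzzy truth values (0..1),
    one per feature, as produced by the feature evaluation step.
    """
    # Integration: combine the feature values of each prototype (assumed: product).
    goodness = {name: prod(values) for name, values in support.items()}
    # Decision: goodness of each prototype relative to all prototypes.
    total = sum(goodness.values())
    return {name: g / total for name, g in goodness.items()}


# Hypothetical evaluation of two features against two prototypes.
support = {"ba": [0.9, 0.8], "da": [0.1, 0.2]}
probabilities = flmp_decision(support)
best = max(probabilities, key=probabilities.get)
print(probabilities, best)
```

The prototype with the highest relative goodness is the one the listener is predicted to report.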

TRACE Model

This is a neural network model developed by Elman and McClelland (1984, 1986). It states that processing occurs through excitatory and inhibitory connections among numerous processing units called nodes. Phonetic or distinctive features, phonemes, and words constitute nodes that represent different levels of processing. Each node has a resting level, a threshold, and an activation level that signifies the degree to which the input is consistent with the unit that the node represents.
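A toy sketch can make the idea of nodes with a resting level, a threshold, and an activation level more concrete. It is not the published TRACE implementation: the update rule, the decay constant, the connection weights, and the three example nodes are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    resting: float = -0.1     # level the node decays back to
    threshold: float = 0.0    # activation must exceed this to influence others
    activation: float = -0.1  # current degree of match with the input


def step(nodes, weights, external, decay=0.1):
    """One synchronous update; weights[(src, dst)] > 0 excites, < 0 inhibits."""
    # Only the part of the activation above threshold is passed on.
    outputs = {n.name: max(n.activation - n.threshold, 0.0) for n in nodes}
    for n in nodes:
        net = external.get(n.name, 0.0) + sum(
            w * outputs[src] for (src, dst), w in weights.items() if dst == n.name
        )
        n.activation += net - decay * (n.activation - n.resting)
        n.activation = min(max(n.activation, -1.0), 1.0)  # keep within bounds


# Hypothetical network: one phoneme node and two competing word nodes.
nodes = [Node("b"), Node("bat"), Node("cat")]
weights = {("b", "bat"): 0.5, ("bat", "cat"): -0.5, ("cat", "bat"): -0.5}

for _ in range(5):
    step(nodes, weights, external={"b": 0.3})  # the input keeps supporting /b/

by_name = {n.name: n for n in nodes}
print({n.name: round(n.activation, 3) for n in nodes})
```

In this sketch the word node consistent with the input rises above its threshold, while its competitor is pushed below its resting level through the inhibitory connection.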

In particular, the model for understanding speech during lectures also needs to be

adapted to the stages of development of students' English language skills (Mora & Valls‐Ferrer,

2012). For example, students' understanding at the final level will be better than at the first

level. Also, students majoring in English usually have a higher understanding of speech than

other majors. Therefore, the development model must consider the diversity of students'

background experiences and English language proficiency achievements.

2. Method

The participants were drawn from the population of 5th-semester students of the English Education Study Program at Universitas Nias. Random sampling was used, and the sample consisted of 7 students from class A. This design serves the aim of finding out which model these students predominantly use in speech perception. The research is qualitative. Qualitative research is based on inductive reasoning and on objective and participatory observation of social phenomena. The social phenomena studied may concern the past, the present, and even the future, and relate to economics, culture, law, history, the humanities, and other social studies subjects (Suyitno in Islamuddin et al., 2023). According to Walidin et al. in Fadli (2021), qualitative research is a type of research method, a procedure for understanding humanitarian or societal aspects of phenomena by developing a comprehensive and complex picture that can be expressed verbally, that provides a comprehensive perspective on the information obtained from informants, and that is carried out in an authentic environment. Qualitative research thus emphasizes how people interpret and make sense of their experiences in order to understand the social reality they live in.

Data were collected, analyzed, and interpreted using several methods: interviews, tests using audio recordings, and observation. An open-ended questionnaire was also used. Because this research requires information from English students in lectures, the main research instrument is a test: a speech perception test that assesses students' perception based on the sounds they hear. The test is available in two drafts, draft A and draft B. They function as a data collection method in which the researcher and the subject under investigation engage in a process of asking and responding (Abdusama et al. in Lase, 2023). The structured test lasts 5 to 10 minutes and is conducted in English as the main language.

3. Result and Discussion

a. Test

Speech Perception Test

Instructions: Please listen to the following words and repeat and write them after me.

Words:

bat

cat

hat

mat

pat

rat

sat

vat

This test is designed to measure your ability to identify individual speech sounds. The words in

the test are all very similar, except for the initial consonant sound. If you are able to accurately

identify all of the words in the test, then it suggests that you have good speech perception

skills.

Another example of a speech perception test is the Benchmark Sentence Test. This test consists

of 50 sentences that are spoken by a male talker in quiet conditions. The listener is asked to

repeat each sentence back to the examiner. The test is scored on the number of words that are

correctly repeated.

Speech perception tests can be used to assess the speech perception skills of children and

adults. They can also be used to monitor the progress of individuals who are receiving speech

therapy.

Here are some other examples of speech perception tests:

Sentence repetition test

In this test, the listener is presented with a sentence and asked to repeat it back to the speaker.

The sentences are typically spoken in noise to make it more challenging.

Example:

Speaker: "The green cat chased the red mouse."

Listener: "The green cat chased the red mouse."

The listener's responses are scored to determine their accuracy. A high score on a sentence

repetition test indicates that the listener has good speech perception skills.

These are just two examples of speech perception tests. There are many other types of speech

perception tests that can be used to assess different aspects of speech perception.

Speech perception tests are used by audiologists and speech-language pathologists to diagnose and assess speech perception problems in children and adults. They are also used by researchers to study speech perception and to develop new speech recognition technologies.


b. Test Results

• First student

Test 1 : bat, cat, hat, march, put, red, share, fair

Test 2 : the green cat chees red knows

• Second student

Test 1 : bad, can’t, hell, mouth, head, great, sad, fant

Test 2: the green cat changes the rednols

• Third student

Test 1 : hat, can, hand, ment, pant, blaund, sand, ahand

Test 2 : degreen can to nouns

• Fourth student

Test 1 : bat, cat, hat, met, pet, rat, sad, vet

Test 2 : The green karcis to rain house

• Fifth student

Test 1 : but, cat, hat, mad, pet, sad, vat

Test 2 : The green the rain mos

• Sixth student

Test 1 : bat, ket, het, mad, pet, set, hat

Test 2 : the degree mous

• Seventh student

Test 1 : bet, ket, het, med, pet, raw, sent, pat

Test 2 : the green cat cis the green mous
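The article does not report numerical scores for these responses. Purely as an illustration of how the word test (Test 1) could be scored, the sketch below counts the responses that match the target word in the same list position. The position-by-position rule and the function name `score_words` are assumptions, not the authors' scoring method; the target list and the two response lists are copied from the test and the results above.

```python
TARGETS = ["bat", "cat", "hat", "mat", "pat", "rat", "sat", "vat"]


def score_words(targets, responses):
    """Count responses that match the target word in the same position."""
    return sum(
        target == response.strip().lower()
        for target, response in zip(targets, responses)
    )


# Responses of the first and fourth students, as listed above.
first_student = ["bat", "cat", "hat", "march", "put", "red", "share", "fair"]
fourth_student = ["bat", "cat", "hat", "met", "pet", "rat", "sad", "vet"]

first_score = score_words(TARGETS, first_student)
fourth_score = score_words(TARGETS, fourth_student)
print(first_score, "of", len(TARGETS))   # 3 of 8
print(fourth_score, "of", len(TARGETS))  # 4 of 8
```

A position-based rule penalizes a student who skips a word, as the fifth and sixth students did, so an alignment-based rule may be preferable for incomplete lists.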

c. Interview Result

(What is your reason for answering/writing answers to tests 1 and 2 like that?)

• First student

“I write what I hear based on sounds that I know, for example bats, cats and others. I

rarely hear the words I heard earlier so I have difficulty getting the meaning and have

difficulty writing the correct words.”

• Second student

“I can't confirm whether the answer I wrote was correct or not, because I heard the voice but didn't see the person speaking. So it's hard for me to get the correct meaning of the word. Usually, I see people speaking English directly so that I don't misinterpret the word.”

• Third student

“The answer I wrote is an answer that I am sure is correct based on what I heard, but I

am also sure that not everything I wrote is correct.”

• Fourth student

“The sentence I heard the green ticket to rain house, I am only sure of the words the,

green, ticket because the pronunciation is clear and easy for me to understand while the

next word, I am not sure about such as house or mouse but I just write house.”

• Fifth student

“I hesitated to write my answer because I listened without seeing the person speaking directly. Usually, if I hear someone speaking English, I pay attention to the way he/she speaks so that I can capture the real meaning. In English there are many words that sound almost the same, so sometimes to distinguish them I look directly at the person speaking.”


• Sixth student

“I wrote these words based on what I heard, but to get the true meaning, I remembered the words that I had memorized. Some of the words I heard I had never heard before, so I couldn't get the meaning.”

• Seventh student

“I wrote it again based on the words I heard, but I had difficulty determining the correct words or sentences because the recording was too fast. So based on the vocabulary I heard, that's how I answered/rewrote the sounds from the test.”

Discussion

The first, fourth, fifth, and seventh students gave similar accounts in their test and interview results. They said that they answered the questions based on words they had already mastered: when they heard the words played from the audio, they recalled words they had heard before and wrote down their answers on that basis. This corresponds to the Cohort model of word recognition developed by Marslen-Wilson and colleagues (Marslen-Wilson & Welsh, 1978; Marslen-Wilson, 1987), in which the onset of a target word activates all words in memory that resemble it. We can conclude that these students use this model because they rely on their memory to get the meaning of the words they hear.

The third student said that he perceived each word based on what he heard and on what made sense to him; when a word did not make sense to him, he was not sure about the answer he wrote down.

According to the Fuzzy Logical Model, speech perception is a prime example of pattern recognition (Massaro, 1987, 1989; Massaro & Oden, 1980). The model makes use of the idea of prototypes, which are summary descriptions of the perceptual units of language and contain a conjunction of various distinctive features. The features of the prototype correspond to the ideal values that a token should have if it is a member of that category. We can conclude that he used the Fuzzy Logical Model to interpret the meaning of sentences and words in English.

The second and sixth students said that they can usually grasp the meaning of an English word from the way it is pronounced, and that they understand better when they see the person speaking directly. Liberman and colleagues (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Liberman, 1970) put forward the Motor Theory, whose main thesis is that, at some point in the speech perception process, speech signals are interpreted by reference to motor speech movements. It can therefore be said that these students use this model in speech perception.

4. Conclusion

The results show that 4 (four) students were classified as using the Cohort model, 2 (two) students as using the Motor Theory model, and 1 (one) student as using the Fuzzy Logical Model. We can conclude that most students use the Cohort model, which they often apply in their daily English learning activities.
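The tally behind this conclusion can be reproduced with a short sketch. The classification of each student follows the Discussion section; the dictionary layout and variable names are only illustrative.

```python
from collections import Counter

# Model assigned to each student in the Discussion section.
classification = {
    "first": "Cohort",
    "second": "Motor Theory",
    "third": "Fuzzy Logical",
    "fourth": "Cohort",
    "fifth": "Cohort",
    "sixth": "Motor Theory",
    "seventh": "Cohort",
}

counts = Counter(classification.values())
dominant_model, dominant_count = counts.most_common(1)[0]
share = dominant_count / len(classification)
print(counts)
print(f"{dominant_model}: {dominant_count} of {len(classification)} ({share:.0%})")
```

With 7 participants, the dominant model accounts for 4 students, a little over half of the sample.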

The model for understanding speech during lectures also needs to be adapted to the developmental stages of students' English language ability. A model developed for this purpose should therefore consider the diversity of students' backgrounds in English language experience and achievement.

5. References

Huettig, F., & Brouwer, S. (2015). Delayed anticipatory spoken language processing in

adults with dyslexia—evidence from eye‐tracking. Dyslexia, 21(2), 97-122.


Mora, J. C., & Valls‐Ferrer, M. (2012). Oral fluency, accuracy, and complexity in formal instruction and study abroad learning contexts. TESOL Quarterly, 46(4), 610-641.

Vandergrift, L., & Goh, C. (2012). Teaching and learning second language listening: Metacognition in action. New York.

Samuel, A. G. (2011). Speech perception. Annual Review of Psychology, 62, 49-72.

Werker, J. F., & Hensch, T. K. (2015). Critical periods in speech perception: New directions. Annual Review of Psychology, 66, 173-196.

McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18(1), 1-86.

McBride‐Chang, C. (1996). Models of speech perception and phonological processing in reading. Child Development, 67(4), 1836-1856.

McQueen, J. M. (2005). Speech perception. The Handbook of Cognition, 255-275.

Bianco, R., & Chait, M. (2023). No link between speech-in-noise perception and auditory sensory memory: Evidence from a large cohort of older and younger listeners. Trends in Hearing, 27, 23312165231190688.

Kelly, G. J. (2023). Qualitative research as culture and practice. Handbook of Research on Science Education, 60-86.

Leeming, D. (2018). The use of theory in qualitative research. Journal of Human Lactation, 34(4), 668-673.

Aspers, P., & Corte, U. (2019). What is qualitative in qualitative research. Qualitative Sociology, 42, 139-160.

Le Grange, L. (2018). What is (post) qualitative research? South African Journal of Higher Education, 32(5), 1-14.

Gleason, J. B. (1998). Psycholinguistics (2nd ed.). Fort Worth: Harcourt Brace College Publishers.