Sunday, March 9, 2008

Wednesday, March 5, 2008

lectures - 3

Tico, I have put up only the 3rd lecture, because we copied the 2nd one today and it is the same. There should also be a couple of pictures here, but if you really need them, I will give them to you on a USB stick...

kisses :*, see you!




Lecture 3

SPEECH PRODUCTION

Man possesses, in common with many other animals, the ability to produce sounds by using certain of his body’s mechanisms. The human being differs from other animals in that he has been able to organize the range of sounds which he can emit into a highly efficient system of communication. Nevertheless, like other animals, when he speaks man makes use of organs whose primary physiological function is unconnected with vocal communication, in particular, those situated in the respiratory and digestive tracts.

Vocal Organs

The first prerequisite for the production of speech is to have a source of air to be used during speech. This stage during which the speech organs are supplied with air is referred to as initiation.

The organs which supply air for the production of speech are the lungs. The lungs are situated in the chest, within the rib-cage. They consist of soft spongy material and are roughly cone-shaped, with the base resting on an elastic membrane called the diaphragm (pronounced /daI@fr&m/), and they stretch up to the base of the neck. The lungs are connected to the windpipe, or trachea, which is a tube-shaped organ through which the air eventually comes to the oral cavity. With the downward movement of the diaphragm and the outward movement of the rib-cage muscles the lungs expand and are filled with air. This part of the breathing process is called inhalation. For breathing out, or exhalation, the lung volume is reduced, causing the air to flow out.

Once expelled from the lungs, the air passes through the trachea and comes into the larynx. The larynx is a box made of cartilage and muscle, with the forward portion protruding in the neck (known as ‘Adam’s apple’ in males). It is particularly important because it contains two bands of elastic tissue called the vocal cords (or vocal folds), which can be brought together or parted. Their movement is the result of the activity of muscles which move the arytenoid cartilages, attached to the vocal cords. Since the vocal cords are elastic, they can take various positions which affect the airstream in different ways (cf. §Modes of Phonation). The space between the vocal cords is known as the glottis.

Having passed through the larynx, the airstream is subject to further modification in the upper cavities. It first reaches the pharynx, or pharyngeal cavity, which stretches from the top of the larynx to the region of the soft palate. This cavity is an important resonator in which the sound is modified.

Depending on the position of the velum (or soft palate), the air goes further either into the oral or the nasal cavity.

The activity of the organs in the oral cavity, or the mouth, has traditionally been the focus of articulatory phonetics. This is partially justified, since the shape of the mouth and its organs ultimately determines the quality of speech sounds. The organs used for the production of speech situated in the mouth are commonly subsumed under the name vocal tract.

The oral cavity is bounded by the lips in the front, the palate in the upper part and the pharyngeal wall in the rear. Some of the speech organs are relatively fixed, whereas the others are movable. Going from the front part of the vocal tract, the fixed organs are the two rows of teeth (the upper and lower teeth), the hard palate and generally the upper jaw. The remaining organs are movable: the lips, the tongue and the soft palate with its pendant called the uvula (see Fig. 1). The lower jaw is highly movable, its movement controlling the gap between the upper and lower teeth and the position of the lips.

Some of the organs mentioned are further divided into smaller parts, because the articulation of a number of speech sounds depends on the exact part of the speech organ active during its articulation.

The palate, also known as the roof of the mouth, is a large area stretching from behind the upper teeth and ending in the uvula. The part immediately behind the teeth is called the alveolar ridge; it is the rough surface which can be felt behind the upper teeth. Behind it is the arch which forms the hard palate (often referred to simply as ‘palate’, in the narrow sense). Further back is the soft palate, or the velum, which can be raised or lowered. At its extremity is the uvula, also active in the production of speech sounds in some languages (e.g. the French sound /r/ is uvular).

Of the movable parts the lips are the frontmost boundary of the mouth. They consist of the upper and lower lip. They can assume various positions, from being totally shut to being held apart in various ways. The position of the lips greatly influences the quality of some speech sounds.

Of all movable speech organs, the tongue is by far the most flexible, and is capable of assuming a great variety of positions in the production of both vowels and consonants. The tongue is a complex muscular structure which, for the purpose of studying speech sounds, is divided into several parts. The frontmost extremity of the tongue is called the tip of the tongue and the area around it, facing the teeth when the tongue is at rest, is called the blade of the tongue. While the tongue is at rest, with the tip lying behind the lower teeth, the largest area facing the hard palate is called the front of the tongue, and the part that faces the velum is called the back of the tongue. Further behind it is the root of the tongue, active in the articulation of speech sounds of some languages, but not in English or Serbian.

If, on the other hand, the soft palate is lowered, it enables the airstream to go into the nose, or the nasal cavity. This cavity is important for the production of some consonants in English (nasal consonants /m, n, N/), but in some languages it is also significant in the articulation of some vowels (e.g. in French, Portuguese, etc.). More on the role of the nasal cavity is to be found in the section Resonating Cavities.

Figure 1 represents a schematic image of the organs of speech above the trachea.


Figure 1. Organs of speech.

Speech Initiation: Airstream Mechanisms

The term ‘Airstream Mechanisms’ refers to the sources of energy for generating speech sounds, using airflow and pressure in the vocal tract. We can distinguish three basic mechanisms, namely, lung airflow, glottalic airflow and velaric airflow.

Lung airflow is the basic source for speech production. Sounds produced with the air stream coming from the lungs are called pulmonic sounds. In principle, air flowing either into or out of the lungs during the respiratory cycle may be used in generating speech sounds, and the nature of the sound produced will depend on what is happening in the vocal tract above the trachea. The two mechanisms (outward and inward lung air) are often referred to as egressive pulmonic airstream and ingressive pulmonic airstream respectively. Egressive lung airflow is the normal mode for speech, since it is easier to control. An egressive pulmonic mechanism is the norm in all languages. No language in the world seems to use ingressive lung airflow as a distinctive feature of particular speech sounds during normal articulation.

The glottalic (also called pharyngeal) airflow mechanism uses the air above the glottis. Once the lung air has passed above the glottis, the glottis may be closed, and sounds may then be produced by moving the larynx up and down. A glottalic sound exists in English in the pronunciation of some speakers, for example, at the end of the word sit. This sound is called the glottal stop /?/. It can be produced by taking a breath and holding it (thus shutting the glottis), then uttering /p/, /t/ or /k/ without opening the glottis, using only the air compressed by raising the larynx.

Velaric (or oral) airflow is generated entirely within the oral cavity, by raising the back of the tongue to make firm contact with the soft palate. Air in front of this closure is then manipulated. Sounds produced in this way are called clicks. Clicks are normally ingressive sounds. Clicks can be labial (reminiscent of a light kiss), dental (the tut-tut or tsk-tsk sound, usually expressing disapproval), alveolar (like urging a horse), etc. Clicks are found as systematic sounds in very few of the world’s languages (basically some African languages).

The stage of speech production at which the airstream is supplied for further manipulation in the oral tract is called initiation.

Modes of Phonation

The term phonation refers principally to vocal cord vibration but can also be taken to include all the means by which the larynx functions as a source of sound, not all of which involve vibration of the folds in a strict sense.

Vocal cords can be held far enough apart to allow non-turbulent (free) flow of air through the glottis. In that case voiceless sounds are produced. The same position is taken for normal breathing. In the case of whisper, there has to be far greater constriction, and this is achieved by pressing the vocal cords against each other, and leaving only a small gap between the arytenoids (pyramidal cartilages at the back of the larynx). Voice refers to normal vocal cord vibration occurring along all or most of the length of the glottis. In this case the vocal cords are close together, but take a very loose position, and are thus capable of vibrating. In the typical speaking voice of a man, this opening and closing action is likely to be repeated between 100 and 150 times in a second (100 – 150 Hz), and in the case of a woman’s voice this frequency of vibration might well be between 200 and 325 Hz. This difference occurs due to the size of the glottis (11-16 mm in females, 17-22 mm in males). The vocal cords can also be tightly closed, with the lung air pent up behind them. This is the position taken in the pronunciation of the glottal stop. In the case of /h/ the vocal cords come close enough together to produce friction.
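As a small worked illustration of the figures above, the frequency of vocal cord vibration can be converted into the duration of one opening-and-closing cycle. This is only a sketch: the names `cycle_duration_ms` and `VOICE_RANGES_HZ` are illustrative choices, not part of the lecture, and the frequencies are simply the ones quoted in the text.

```python
def cycle_duration_ms(frequency_hz: float) -> float:
    """Return the duration of one vibration cycle in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / frequency_hz


# Ranges quoted in the text for typical speaking voices (in Hz).
VOICE_RANGES_HZ = {"man": (100, 150), "woman": (200, 325)}

for speaker, (low, high) in VOICE_RANGES_HZ.items():
    # A higher frequency means a shorter cycle, so `high` gives the shorter duration.
    shortest = cycle_duration_ms(high)
    longest = cycle_duration_ms(low)
    print(f"{speaker}: {shortest:.1f} to {longest:.1f} ms per cycle")
```

At 100 Hz each cycle of opening and closing lasts 10 ms, while at 200 Hz it lasts 5 ms.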


Figure 2. Some positions of the vocal cords.

The Resonating Cavities

The airstream, having passed through the larynx, is now subject to further modification according to the shape assumed by the upper cavities of the pharynx and mouth, and according to whether the nasal cavity is brought into use or not.

The pharynx: The pharyngeal cavity extends from the top of the trachea and oesophagus to the region at the rear of the soft palate. It is divided into three sections: laryngopharynx, oropharynx and nasopharynx. The activity of the muscles which affect the shape and volume of this resonator greatly influences the quality of speech sounds.

The escape of air from the pharynx may be effected in one of three ways:

(i) The soft palate may be lowered, as in normal breathing, in which case the air may escape through the nose and the mouth. This is the case of the production of nasal vowels, as in the French words ‘un bon vin blanc’. This quality is achieved through the function of the nasopharyngeal cavity. Nasal airflow does not necessarily have to go through the nose.

(ii) The soft palate may be lowered so that there is a nasal outlet for the airstream, but a complete obstruction is simultaneously made at some point in the mouth, with the result that, although air enters all parts of the oral cavity, no oral escape is possible. This is how nasal consonants are produced.

(iii) The soft palate is held in its raised position, so that air escapes solely through the mouth. This is the case of the production of most speech sounds, which are called oral sounds.
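The three cases above can be summarized as a small decision function. This is a minimal sketch: the function name `classify_escape`, its two boolean parameters and the returned labels are assumptions made for illustration, and the function models only the position of the soft palate and the presence of a complete oral closure, nothing else.

```python
def classify_escape(velum_lowered: bool, oral_closure: bool) -> str:
    """Classify a sound by the route the air takes out of the pharynx."""
    if velum_lowered and oral_closure:
        # Case (ii): nasal outlet open, mouth completely blocked.
        return "nasal consonant"
    if velum_lowered:
        # Case (i): air escapes through both the nose and the mouth.
        return "nasalized sound"
    # Case (iii): soft palate raised, so the nasal cavity is shut off.
    # (Assumption: any oral closure is ignored here, since the text
    # describes case (iii) only in terms of the raised soft palate.)
    return "oral sound"


print(classify_escape(velum_lowered=True, oral_closure=False))
print(classify_escape(velum_lowered=True, oral_closure=True))
print(classify_escape(velum_lowered=False, oral_closure=False))
```

The order of the checks matters: the more specific case (ii) has to be tested before case (i), because both involve a lowered soft palate.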

Tuesday, March 4, 2008

phonetics - lecture 1

Lecture 1.

THE SPEECH PROCESS

When a person wants to convey a message, he/she can use a variety of means:

Visual: writing, sign language, waving flags, flashing mirrors

Audible: fog-horn, Morse code, drum, or simply spoken form, by word of mouth

A vast majority of communication is performed by speaking; it is by far the most frequent means of communication.

PHONETICS is concerned with the human noises by which ‘the message’ is actualized, or given audible shape: the nature of those noises, their combinations and their function in relation to the message.

In order to determine the domain of phonetics, we have to cast light upon the very process of speech, its elements and stages.

The SPEECH PROCESS is defined as the activity of human organisms by which the sounds of the language are produced, transmitted through the air and received. In essence, speech is COMMUNICATION, i.e. the exchange of information by means of auditory sensory stimulation.

This definition implies that there are two participants in the process: the SPEAKER and the LISTENER (or the HEARER).

The following simple model can clarify the domain of phonetics in the communication process:

C – Creative Function

F – Forwarding Function

H – Hearing Function

NP – Nervous Pathways

VO – Vocal Organs

The SPEAKER produces sounds, the LISTENER/HEARER receives them, and the AIR CHANNEL is the medium through which sounds are transmitted.

SPEAKER – On the part of the speaker, the first function set in motion is the CREATIVE FUNCTION, which takes place in the brain. It is central, as it is through this function that the message is conceived and formed. Stored in the brain is a profound knowledge of the way in which the language operates. This knowledge is manifold and derives from our experience both as speakers and as listeners from earliest childhood: we know the permissible grammatical patterns and vocabulary items, we know what the voices of a man, a woman or a child sound like, we have at least some knowledge of dialects other than our own, and we know the general probabilities of one word or expression following another.

There are three distinguishable phases of the creative function:

1) a need for communication arises, in response to some outside event or to some inner thought process;

2) decision on the medium to be used, so the speaker decides whether to convey the message in an audible or visual form;

3) decision on the form the message will take. E.g. Have another cup? Would you like another cup? Pass your cup. Have some more.

This was the PSYCHOLOGICAL STAGE of the speech process on the part of the speaker.

FORWARDING FUNCTION: the part of the brain concerned with controlling muscular movement sends out patterned instructions in the form of nervous impulses along the nervous pathways which connect the brain to the muscles of the organs responsible for speech sounds: the lungs, larynx, tongue, etc. These instructions call upon the muscles concerned to perform various delicate combinations and sequences of movement, which result in the ‘right’ sounds being emitted in the ‘right’ order. This is the NEUROLOGICAL STAGE.

This neurological activity is then transformed into MUSCULAR ACTIVITY: the lungs are contracted, the vocal cords vibrate, the tongue moves, the jaws go up and down… This is an extremely accurately controlled action – probably the most elaborate muscular skill any one of us will ever perform. This stage of the speech process is called the PHYSIOLOGICAL STAGE.

As a result of the movement performed by the speech organs, air is set in motion – the muscular movement has been transformed into SOUND WAVES – waves of varying air pressure spread out in every direction to eventually impinge upon the ear of the listener. This is called the ACOUSTIC STAGE.

Once the sound waves reach the ear of the listener, he/she performs the second part of the speech process. The organ through which the listener receives speech sounds is the EAR. It is a complex organ, consisting of the outer and inner ear. The ear-drum is sensitive to the air pressure patterns and is made to move in and out in accordance with the movements of the air. This is again termed the PHYSIOLOGICAL STAGE of the speech process, but this time on the part of the listener.

A final transformation takes place in the inner ear – where this organic movement of the ear-drum is again transformed into neurological activity, which results in nerve impulses being sent along the nervous pathways connected to the listener’s brain (NEUROLOGICAL STAGE).

The listener’s brain may also be thought of as having two functions: a hearing function and a creative function.

HEARING FUNCTION. The impulses coming from the ear are accepted as sound sequences of constantly changing quality and characteristic length, pitch and loudness. The listener HEARS the message, but does not yet understand it (like listening to a foreign language). To understand the message, the listener must interpret the sounds in the light of the stored knowledge in his brain; he not only hears the sounds, but recognizes them and matches them up with what he knows to be possible in the language at various levels, and finally selects the most likely meaning in the given circumstances, which again is the creative function. This activity of the listener is again referred to as the PSYCHOLOGICAL STAGE of the speech process.

The process of matching starts with the sounds themselves. If one hears a sound or a combination of sounds which does not exist in the language, one rules it out. E.g. /T/ does not exist in Serbian, nor does the word-initial sequence /stv/ in English, so we replace these with what is most probable in the given circumstances.
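The stages named so far are spread over several paragraphs, so it may help to see them listed in order. The following is only a sketch of the speech chain as a data structure; the names `SPEECH_CHAIN` and `stages_for` are illustrative and do not come from the lecture.

```python
# Each entry is (participant, stage), in the order the message travels.
SPEECH_CHAIN = [
    ("speaker", "psychological"),   # creative function: the message is formed
    ("speaker", "neurological"),    # forwarding function: nerve impulses sent out
    ("speaker", "physiological"),   # muscular activity of the vocal organs
    ("air channel", "acoustic"),    # sound waves travel through the air
    ("listener", "physiological"),  # the ear-drum moves with the air pressure
    ("listener", "neurological"),   # the inner ear sends impulses to the brain
    ("listener", "psychological"),  # hearing and creative functions
]


def stages_for(participant: str) -> list[str]:
    """Return the stages that belong to one participant, in order."""
    return [stage for who, stage in SPEECH_CHAIN if who == participant]


print(stages_for("speaker"))
print(stages_for("listener"))
```

Listed this way, the listener's stages are the speaker's stages in reverse order, with the acoustic stage in between.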

PERCEPTION – the problem is how a continuous flowing signal is converted into a series of individual units that we call the sounds of a language.

This is not simple, as the acoustic characteristics do not translate directly or perfectly into linguistic units (physical units vs. mental units).

When we listen, we do not only identify and segment relevant sounds in the incoming signal. Language as communication requires us to identify speech as words, phrases, sentences and discourse, and ultimately as messages. We decode messages for the information they convey.

We therefore analyze perception as consisting of the following stages:

1) Auditory stage (based directly on the physical input) – the initial point at which we take in the raw signal with all its acoustic properties

2) Phonetic stage – abstraction of the concrete acoustic signal – ignoring the difference between two different speech signals and identifying them as one speech sound/phoneme (e.g. /g/ in /gi/ or /gu/). This kind of generalization about the sound category (a phoneme) is called CATEGORICAL PERCEPTION.

3) Phonological stage – where we rule out what we know to be impossible from the phonological standpoint. So if, for example, we hear someone pronounce the word [stvi:t] in English, we rule it out because we know that the initial sequence does not comply with the phonological rules of English, and we interpret it as the sequence of phonemes /stri:t/ instead.
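The repair described in the phonological stage can be sketched as a toy procedure. Everything here is an assumption made for illustration: the function name `repair_onset`, the small set of three-consonant onsets taken to be permissible in English, the three-word lexicon, and the simplification that each symbol of the transcription is one character. It is not a model of how listeners actually do this.

```python
# Assumed inventory of word-initial three-consonant clusters in English.
LEGAL_ONSETS = {"spl", "spr", "spj", "str", "stj", "skl", "skr", "skw", "skj"}
# Consonant symbols in the one-character transcription used in these notes.
CONSONANTS = set("ptkbdgfvszmnlrjwhTDSZN")
# Hypothetical stored vocabulary of the listener.
TOY_LEXICON = {"stri:t", "skri:n", "spli:n"}


def repair_onset(heard: str):
    """Replace an impossible three-consonant onset with a possible one.

    Returns the heard form unchanged if it needs no repair, the repaired
    word if exactly this toy procedure finds one, and None otherwise.
    """
    onset, rest = heard[:3], heard[3:]
    if len(onset) < 3 or not all(symbol in CONSONANTS for symbol in onset):
        return heard  # does not start with a three-consonant cluster
    if onset in LEGAL_ONSETS:
        return heard  # the cluster is permissible, nothing to rule out
    for candidate in sorted(LEGAL_ONSETS):
        # Consider only clusters that differ from the heard one in one sound.
        differences = sum(a != b for a, b in zip(candidate, onset))
        if differences == 1 and candidate + rest in TOY_LEXICON:
            return candidate + rest  # most probable word in the circumstances
    return None


print(repair_onset("stvi:t"))
```

For the example in the text, both /str/ and /stj/ differ from /stv/ in a single sound, and it is the lexicon that decides in favour of /stri:t/.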

4) Lexical, syntactic and semantic stage – top-down and bottom-up models of perception

At this stage we can rule out things like:

(lexical nonsense) Accidents carry out honey between the house.

(syntactic/grammatical) Men is on strike. Man are on strike.

(semantic/pragmatic) My wives have just told me…