Tag Archives: sounds


Who “r” you?

My latest article for the BBC is on “r” – that sound we make in many different ways, and sometimes not at all, depending on who we are and where we’re from. It has a very interesting history, and not just in English!

What a single sound says about you


Different sounds that we think are the same sound (but others don’t)

My latest article for The Week is on sound distinctions that other languages make but we don’t. Some of these are things that even linguistics students don’t notice until they’re pointed out. It includes a video!

The subtle sounds that English speakers have trouble catching


You can get far by acting immature

That article I wrote for TheWeek.com about teenage noises, and its accompanying video, have grown still longer legs. They’ve been reposted and featured on several sites, including PopSci.com and even in a column on Australia’s Crikey.com.au. The Huffington Post presented the video with a write-up.

And today listeners of National Public Radio’s Weekend Edition Saturday heard Scott Simon interview me about it – listen to it on their website. The segment is 3 minutes long, which means I still have 12 minutes of fame coming to me. I hope it’s not for something humiliating.

Annoying teenage noises

My latest article for TheWeek.com looks at annoying noises that callow adolescents make. I give a detailed phonological analysis of each of them – and I reproduce all of them in a video.

A linguistic dissection of 7 annoying teenage sounds

8 odd sounds from other languages you could never make except you probably already have

My latest article for TheWeek.com has been posted today:

8 bizarre sounds you’ve probably made without knowing it
And their prevalence in several foreign languages


(Please note that I don’t make up the captions for the photos. Where it says “an uvular trill,” I would recommend reading “a uvular trill.”)

Watch a video of me reading it and making the sounds:

A Word Taster’s Companion: The world speaks in harmony

Today: the third installment of my how-to guide for word tasting, A Word Taster’s Companion.

The world speaks in harmony

It’s our ability to parse the flow of sound into separate sounds that makes language work. We have a conceptual understanding of the different sounds we make – ideal sounds, targets that we aim for and come variously close to when we actually speak. When the sounds are strung together, we still think of them as independent units. It’s like handwriting: the letters may flow together so you can’t say exactly where one ends and the next one starts, but you can see the different letters.

Now, when we hear someone talking, how do we know what different movements their mouth is making, what targets they’re shooting for? It’s all to do with the harmonics.

When you make a vocalization, your vocal cords are vibrating at a certain frequency – which, if you’re singing, is the note you’re singing – but they’re also echoing in your vocal tract at various frequencies that are multiples of the base frequency (two, three, four or more waves for every one of the base frequency). If you sing an A at 440 Hertz (vibrations per second), there are also echoes of that at, for instance, 880 Hertz and 1760 Hertz, among others.
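If you like to see the arithmetic, here is a minimal sketch in Python. The function name "harmonics" is just an illustrative choice; all it does is multiply the base frequency by whole numbers, which is the relationship described above.

```python
def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics of a base frequency.

    The first harmonic is the base frequency itself; each later one
    is the next whole-number multiple of it.
    """
    return [fundamental_hz * n for n in range(1, count + 1)]


# An A at 440 Hertz and its next three harmonics.
print(harmonics(440, 4))  # [440, 880, 1320, 1760]
```

The 880 and 1760 mentioned above are the second and fourth entries in that list.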

Now, which harmonics sound louder and which sound quieter will be determined by the shape of the resonating space in your mouth. There’s a resonating space at the back of your mouth, from your larynx to the top of your tongue, and the higher your tongue is, the longer that space and the lower the frequency of the harmonics that stand out. There’s also a space between the front of your mouth and the closest point your tongue comes to your palate, and the smaller that space is, the higher the resonance. The stand-out harmonics those spaces engender are called formants: the one at the back is the first formant, and the one at the front is the second formant. (There are third and fourth formants that play smaller roles.)

Thus, [u] – “oo” as in “boot” – is heard as it is because it has lower harmonics coming out in both formants: the back of the tongue is high, making a big space between it and the larynx, and it’s also far back, making a big space between it and the front of the mouth. On the other hand, [æ] – “a” as in “cat” – is heard as it is because both formants are higher; the tongue is low and towards the front. And [i] – “ee” as in “beet” – has a low first formant and a higher second formant. The second formant is always at least a little higher than the first, even when saying the low back vowel [a], as in “bother.”

We also recognize consonants this way. If they’re consonants that stop the flow of air, we recognize them by what the tongue is doing immediately before and after. If they let just a little air through, we also get the sound of the air as it hisses or buzzes. I’ll go into close-up details of the vowels and consonants in coming chapters.

So we hear these sounds, and we have a sense of where in the mouth they’re coming from, and we also have an idea of what sound could come next in any given word – by the time you’re a couple of sounds into a word, the possibilities are narrowed down quite a bit. We can also hear the effect of the tongue moving and changing the shape of the resonating space in the mouth. And we have learned a repertory of different sounds that we recognize as distinct speech sounds (I won’t say “letters”; those are what we write to represent the sounds). The actual sounds won’t always be exactly identical, but as long as they’re close enough to a target, an identifiable known speech sound, they will be identified as it, especially if the sounds around it lead us to expect it.

These target sounds – sounds that we recognize as separate speech sounds – are called phonemes. If you meet someone who speaks another language who can’t manage to differentiate “bit” from “beat,” that’s because their native language doesn’t have a distinction between those two vowel sounds, so they’re not used to making the distinction when speaking. They may even believe they can’t. They might have a heck of a hard time telling them apart when listening, too, because they both land close enough to the same target in the set of sounds they’re used to. It’s the same with English speakers hearing and making sounds from some other languages: we may not be able to tell apart sounds that, to the language’s native speakers, are obviously different. After all, learning language is also a process of unlearning: in order to have separate sounds, you not only have to treat similar sounds as completely different; you also have to forget that some sounds are different because you need to treat them as the same in order for your language to make sense.
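The idea of landing "close enough to the same target" can be sketched as a nearest-target lookup. This is a rough illustration only: the formant numbers are rounded values assumed for the example, not measurements of any speaker, and the names (WITH_BIT_BEAT, WITHOUT_BIT_BEAT, nearest_target) are made up for the sketch. Real listeners surely do something subtler than measuring straight-line distance in Hertz.

```python
import math

# Assumed (first formant, second formant) targets in Hertz.
# Rounded, illustrative values; not data from a real speaker.
WITH_BIT_BEAT = {
    "i": (300, 2300),  # "beat"
    "ɪ": (400, 2000),  # "bit"
    "u": (300, 900),   # "boot"
    "æ": (700, 1700),  # "cat"
}

# A hypothetical language with no separate target for the "bit" vowel.
WITHOUT_BIT_BEAT = {k: v for k, v in WITH_BIT_BEAT.items() if k != "ɪ"}


def nearest_target(sound, targets):
    """Return the name of the target closest to the sound actually heard."""
    return min(targets, key=lambda name: math.dist(sound, targets[name]))


# Two slightly off-target pronunciations (also assumed values).
beat_token = (290, 2250)
bit_token = (410, 1950)

for label, targets in (("with", WITH_BIT_BEAT), ("without", WITHOUT_BIT_BEAT)):
    print(label, nearest_target(beat_token, targets), nearest_target(bit_token, targets))
```

With both targets available, the two tokens come out as different phonemes. Take the "bit" target away and both land on the same one, which is the situation of a speaker whose native language doesn't make the distinction.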

Next: Horseshoes, hand grenades… and phonemes

A Word Taster’s Companion: What makes a word

Today: the second installment of my how-to guide for word tasting, A Word Taster’s Companion.

What makes a word

Let us start by looking at the parts of words. Take a word. In fact, let’s start with “start.” Here’s a simple question: what is this word, “start,” made of?

Did someone say five letters?


No, words are not made of letters.

That’s right: one of the first things just about anyone knows about words is the first thing they’re going to have to unlearn.

Tell me, what did you do first, when you were a very small child: write or speak?

You almost certainly learned to speak a few years before you learned to write. You knew the sounds long before you knew the symbols used to represent them on paper.

But aren’t those sounds letters?

They sure aren’t. Letters came along to represent sounds many thousands of years after humans started speaking. And anyone who can write English knows that the same letter is often used to represent several different sounds – for instance, the letter a in “fat,” “make,” and “above” – and the same sound can be represented by different letters – “hay,” “hey,” “weigh.”

Words are made up of quite a few different things, actually – and we’ll get to them all by the time I’m done with you – but on the most basic level of expressive form, words are made up of sounds (unless you are deaf and speak sign language).

And those sounds are made by the physical movements of your vocal tract. (If you speak sign language, they’re made up of movements of your hands and other body parts.) So when you say a word, you feel it. And when you hear a word, you know what it feels like.

So feel it. Feel this word: “Start.” Say it.

What do you feel your tongue doing? First the tip is up near the front of your mouth, behind the teeth and ahead of the ridge (that ridge is called the alveolar ridge). It’s letting some air through, making a hissing noise. Your voice is not activated: you could only whisper, not sing, while saying [s].

Then your tongue closes off the airflow. For a moment no air gets out of your mouth, because your nose is closed too (by means of a flap at the back of your mouth). Then you release it, and the tongue drops down and sits flat on the bottom of your mouth, and your voice starts up: [a].

Then, if you’re among those who say the [r], the tongue humps up like a cat stretching. It makes a narrower passage between itself and the roof of your mouth (your palate).

Finally, the tip of the tongue touches again and blocks the airflow as the voice stops – but you may find that even before the tongue gets all the way there the airflow has stopped; many people will make this stop using the closing point in the throat, the glottis, which is what you use to stop the air when you swallow or hold your breath.

So there you have it. One continuous movement of the tongue, with the voice engaged just in the middle. A continuous flow of physical movement and a continuous flow of sound. But we hear it as five sounds, because we have learned to divide the sound stream we hear into those sounds.

Next: The world speaks in harmony

The madder matter of t’s and d’s

One of the most common “have you ever noticed” things people like to make mention of in English pronunciation – especially North American English pronunciation – is how, in many words, such as matter and betting, “we say ‘t’ as ‘d’.”

I put that in quotes because that’s what people say.

It’s not really true.

Actually, we say them both as a third sound. It just happens that this third sound, to our ears, sounds more like [d] than like [t]. (By the way: I’m using the linguistic standard of putting a sound in brackets, [t], if it’s the sound we’re actually making, and between slashes, /t/, if it’s the sound we believe ourselves to be making whether or not we actually are making exactly that sound. So “hit it” will always be /hɪt ɪt/ but not always [hɪt ɪt].)

Here, I’ll prove that we don’t say it as [d]. Say the following, slowly and carefully, perhaps as though you’re speaking to someone who is hard of hearing:

I’m not kidding about the reckless betting.

No problem making /t/ and /d/ different there, right?

Now say it quickly, as quickly as you reasonably can, maybe two or three times in a row.

Those d’s and t’s seem to be pretty much the same sound now, right? All d’s, perhaps?

No, not all d’s. Say this slowly and carefully, perhaps as to someone who is having a hard time hearing you:

I’m not kidding about the reckless bedding.

Before, when you said “reckless betting” quickly, there was no problem with a hearer knowing you were talking about gambling. But when you say the [d] clearly, that’s out the window; you’re now talking about crazy quilts and sheets. You can’t say “bedding” clearly and be taken for saying “betting” under normal circumstances.

We tend to think that we’re saying it as [d] because most of us don’t have a letter to associate with what we are saying it as. But I’ll tell you what we’re really saying it as: a thing linguists call a tap. The tongue just taps the alveolar ridge without really stopping the airflow. We sometimes make a flap, which is when the tongue taps on the way past rather than bouncing off. A tap is like in “better” (said quickly and casually); a flap is like in “editing” (said quickly and casually). The International Phonetic Alphabet symbol for a tap or a flap is [ɾ].

Does that look like a partly-formed r? As well it might. Some speakers – particularly those with accents we might think of as “proper” British – will use it for /r/ in the middle of words, as in “very horrifying.” North Americans, who aren’t used to saying /r/ that way, often represent this as a d as in “veddy British.” But it’s not [d]. It’s [ɾ].

Here’s how sounds work in language: Every language has a set of sounds that are considered to be distinctive – swap in a different one and you have a different word (or a non-existent word). These distinctive sounds are called phonemes. Do not confuse these with the letters of the alphabet. For instance, c is a letter that can stand for the phoneme /k/ as in can, /s/ as in ice, or even /tʃ/ in some loan words such as ciao. On the other hand, /k/ is a phoneme that can be represented by c as in can, k as in kill, ck as in kick, ch as in school, q as in question, even que as in unique.

But a sound that is considered to be distinctive may have several different ways of being produced, depending on where it shows up. We just happen to hear them all as versions of the same sound and thus interpret them all as the same sound by habit without generally noticing that there is any difference. Take /t/, for instance. Say the following words:

ting sting matting mattress mat mitten

Each one has a different version of that /t/. Linguists call these different versions phones (as if that word didn’t have enough meanings already). The system of phones is phonetics, while the system of phonemes is phonemics. (Phonics is not a word linguists use.)

Put your hand in front of your mouth and say “ting sting.” You might feel an extra puff of breath on “ting.” If you say “pill spill” you will feel much more of a puff on “pill.” We put those puffs on voiceless stops (/k, t, p/) when they’re at the very beginning of a syllable – but not if there’s /s/ before them at the start of the syllable. Those puffs are called aspiration.

That’s two of the six different variations on /t/ – what linguists call allophones of /t/. I’m sure you can hear the different allophones in “matting” (with the tap) and “mattress” (where the /tr/ together sounds like “ch” plus “r”). Now how about “mat”? The difference with that one is that we don’t release /t/ when there isn’t another vowel or liquid after it – we just hold it closed. Usually we just close our throat (glottal stop) and sometimes we don’t even entirely touch the tongue to the roof of the mouth. If you have /n/ after it, as in “mitten,” just the nasal passage releases, unless you’re speaking carefully or formally.
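To keep the six straight, here they are as a small lookup table in Python. The table name and the wording of the descriptions are illustrative choices; the words and the contexts are the ones discussed above.

```python
# One phoneme, /t/, and six ways it comes out, keyed by the example word.
T_ALLOPHONES = {
    "ting": "aspirated: at the very start of a syllable, with a puff of breath",
    "sting": "unaspirated: after /s/ at the start of a syllable, no puff",
    "matting": "tap: between vowels in quick, casual speech",
    "mattress": "before /r/: the two together sound like 'ch' plus 'r'",
    "mat": "unreleased: held closed, usually with a glottal stop",
    "mitten": "before /n/: only the nasal passage releases",
}

for word, description in T_ALLOPHONES.items():
    print(f"{word:9} /t/ -> {description}")
```

Six keys, six different sounds, and one phoneme as far as an English speaker's ear is concerned.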

All of these are thought of as /t/. All of them are heard as /t/. But they really do differ. In some languages some of them are treated as distinct sounds. You know how speakers of some languages can’t say “beat” and “bit” differently? That’s because those two vowels are allophones – different phone realizations – of the same phoneme in those languages. Well, we’re like that with things like the difference between aspirated and unaspirated stops.

Why do we do this? Economy of effort. A /t/ is a voiceless alveolar stop. We don’t always retain all those characteristics of voice (voiceless), place (alveolar), and manner (stop); we’ll stick with whichever is sufficient to make the sound recognizable while not having to make too much effort to say it, and sometimes we’ll add a little more distinction where needed. So at the start of a word, we add that puff of air to make it clearer that it’s not /d/. We don’t need to do that after /s/ because we never say /sd/ at the start of a word. In the middle of a word like matter, we just keep the place and a similar manner, but we don’t stick too closely to the voicelessness or the hard stop. At the end, as in “mat,” or before a nasal, as in “mitten,” we reduce it to a different stop (glottal) that takes less effort to say. That’s also what some people (notably some British people) do when they use a glottal stop between two vowels, as though “matter” were “ma’er” (or “ma’ah”). The quality of being a voiceless stop is enough; the other two voiceless stops (/k, p/) don’t reduce to a glottal stop in English.

So those are the allophones of /t/. What you need to know is that sometimes two different phonemes have, in some contexts, the same phone as an allophone. Most “short” vowels in English reduce to a neutral unstressed vowel [ə], for instance. The case in point today is [ɾ], which can be a version of /t/ or /d/ (or, in some kinds of English, /r/).
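Here is a minimal sketch of two phonemes sharing one phone. The function "realize" and its "careful" flag are made up for illustration, and it covers only one context: /t/ or /d/ between vowels, as in "betting" and "bedding."

```python
def realize(phoneme, careful):
    """Return the phone for /t/ or /d/ between vowels.

    Spoken carefully, the two stay distinct. Spoken quickly and
    casually, both come out as the tap.
    """
    if phoneme in ("t", "d") and not careful:
        return "ɾ"
    return phoneme


for careful in (True, False):
    style = "careful" if careful else "quick"
    print(style, realize("t", careful), realize("d", careful))
```

The careful line shows two different phones; the quick line shows the same one twice, and it is neither [t] nor [d].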

We think of voice as the difference between /t/ and /d/. But they’re stops – how do you voice a consonant when your air flow is stopped? You don’t, really. You know the difference between /t/ and /d/ mainly by how the sounds before and after behave. Say this:

mad mat

In “mad” your voice keeps going right up until you say the [d], but in “mat” you cut off a moment sooner. You also say the vowel a bit shorter in “mat.”

Now say this:

The madder matter

The difference is very subtle, isn’t it? But you may say the [æ] before the /d/ a little longer than before the /t/, and you may cut the voice out just a little for the /t/ version. It’s not really enough to be sure about when you’re listening, but there may be that small effect of the sound you’re thinking about when you say it.

On the other hand, you might really say them both the same way.

It just happens that that way will not be with [d]. It will be with [ɾ].