Toilet-paper-roll words

There are some words that have two pronunciations, but most people prefer one or the other, sometimes quite vehemently. It’s sort of like whether you have your toilet paper roll over the top or down the back. These words – and where the pronunciation difference comes from – are the subject of my latest article for TheWeek.com:

Aunt, adult, pajamas: Why can’t we agree how to pronounce common words?


A Word Taster’s Companion: Syllables 2: Breaking words

Syllables 2: Breaking words

OK, the words I talked about in “Syllables 1: The basic bits” are all one syllable, so they’re not that hard. When we get to more than one syllable, now, that’s where things get interesting. Try this word – a very appropriate one: breaking. It’s made of break plus ing. But how do you say it?

Slow it down. Now sing it on two notes. Now put a space between those notes, just a slight gap. Now speed it up, keeping the gap.

If you’re now singing brea, king, brea, king, it probably sounds quite normal and feels easy enough to do. If you’re now singing break, ing, break, ing, it more likely sounds unnatural and feels more difficult to do.

But why would the [k] go and attach itself to the suffix when it belongs to the root word? Because it’s just easier to do it that way. Consonants tend to prefer onsets over codas, given the chance. Oh, there are many things that can keep a consonant on the end of a syllable rather than migrating to the beginning of the next one. I won’t be so tedious as to make a long list of them here; much better if you just explore syllables yourself and see how they really break, and try to sort out why they break where they do. But be aware that there are many places where what you may have always thought was the syllable break actually isn’t.
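If you like to see a tendency written out as a rule, here is a minimal sketch of the "onsets first" idea in Python. It is a toy under loud assumptions: the vowel set and the list of allowed onsets are made up for this one example, the function name `syllabify` is hypothetical, and real English syllabification has many more constraints than this.

```python
# Toy onset-first syllabifier. The vowel set and the list of allowed onsets
# are illustrative assumptions, not a full description of English.
VOWELS = {"eɪ", "ɪ", "æ", "ə"}
LEGAL_ONSETS = {(), ("k",), ("t",), ("b", "r")}


def syllabify(segments):
    """Split a list of sounds so consonants join the next syllable when they can."""
    peaks = [i for i, seg in enumerate(segments) if seg in VOWELS]
    boundaries = []
    for left, right in zip(peaks, peaks[1:]):
        # Try the longest consonant run first; the empty onset always matches.
        for start in range(left + 1, right + 1):
            if tuple(segments[start:right]) in LEGAL_ONSETS:
                boundaries.append(start)
                break
    starts = [0] + boundaries
    ends = boundaries + [len(segments)]
    return [segments[s:e] for s, e in zip(starts, ends)]


breaking = ["b", "r", "eɪ", "k", "ɪ", "ŋ"]
print(" . ".join("".join(syl) for syl in syllabify(breaking)))  # breɪ . kɪŋ
```

The [k] lands in the second syllable simply because it is allowed to start one. Take it out of the allowed onsets and it stays behind as a coda, which is roughly what the "many things that can keep a consonant on the end of a syllable" amount to.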

“But we hyphenate it between break and ing!” Yes, we do. In English, we don’t always put hyphens at the actual syllable boundaries. We also take into consideration the parts the word is made of (morphemes – I’ll get to those) and the relation between the spelling and the pronunciation. Breaking is made of break and ing, and even though we actually put the /k/ at the start of the second syllable we still think of it as being at the end of the first one. But also, we don’t know how brea- should be pronounced until we see the next letter: Brea…thing? Brea…ding? Brea…king? So we hyphenate it as break-ing, because those are the constituent parts and because if you see brea- at the end of one line it may be a surprise to see king on the next.

We run into another problem in English because of how we think about vowels. English has tended to have “long” vowels in open syllables – syllables without codas – and more notably has a strong tendency to have “short” vowels only in closed syllables – syllables with codas. A word such as break shows that we can have a “long” vowel in a closed syllable (but usually it will be indicated with multiple written vowels, often with a “silent e” after the final consonant, showing us that the final consonant was originally the onset of another syllable). But whereas we can have open/closed pairs with “long” vowels – bray/break, be/beat, buy/bite, bow/boat, boo/boot, cue/cute – just try to find an open match for bit, bet, or book (bat has bah, though open syllables with [æ] are uncommon; hut has huh, but most places you hear that vowel sound are unstressed; there are many words with [ɑ] in open syllables – it’s an exception).

So “short” vowels generally need to be in closed syllables. But! As already observed, consonants tend to shift from coda to onset when they can. Look at latter and later. In later, dividing it is easy; la-ter. But in latter? Don’t even bother thinking the syllable splits where we hyphenate it, lat-ter. There’s no long (or double) [t] in there – nothing like you hear in hot toddy or cat-tail. No, this is a case where we think of the /t/ as being at the end of one syllable even though it’s attracted to the start of the next syllable – since there’s no onset on the next syllable, and it’s in the middle of the word, there’s a natural tendency to shift.

So does that mean, then, that latter really divides la-tter? Well, some people say so. Some intro linguistics professors will tell you straight out that, for instance, Christmas breaks phonetically as Chri-stmas (as a rule we don’t say the t, so the [s] is naturally pulled to the onset because it can go before the [m]). But say it slowly and forcefully. Are you sure the [s] is all the way with the next syllable? When you say latter, does it seem as though the /t/ – which is usually said by North Americans not as a [t] but as an alveolar flap, making it identical or very similar to ladder (the [æ] may be slightly longer in ladder) – is as much with the first syllable as with the second? Some linguists think that’s not an unreasonable way of looking at it. They call this ambisyllabicity: it goes with both syllables. Not everyone agrees that it exists. But this is an important thing to know about linguistics: although it seems very scientific, with all its technical terms and structures and codifications and so on, in fact there’s lots of disagreement about all sorts of things, even basic issues such as phonemes. You learn things in one linguistics course and are told they’re wrong in the next. Eventually you get far enough that you can start making up your own mind and disagreeing too. See? Language is a sport not just for those who use it but for those who study it, too.

Next: The rhythm method

A Word Taster’s Companion: Lovely, lyrical liquids

Lovely, lyrical liquids

Liquid. Say rarely rural, really, Larry. Oh, come now, you can do it! Why would such flowing sounds cause any trouble?

And they are flowing sounds. English has two phonemes of the type called liquids: /r/ and /l/. Mind you, they do each have more than one allophone.

Liquids are consonants that involve contact or near-contact of the tongue with the palate but allow ample air to pass around – more than for a fricative. They produce no buzz or hiss. You could actually almost drink some kind of liquid (water, beer, wine) while holding your tongue in the position to say a liquid, but the swallowing would cause you to say a nasal instead.

Liquids are lovely, lush, lyrical. You can sing them, though your voice teacher will probably tell you to sing vowels instead. The singability of these sounds means that they can be syllables, and often are. Say burble, turtle, gurgle. If you’re like most Canadians and Americans, your first syllable of each word has not a vowel per se but simply a sustained /r/ with the tongue not really moving during it. And while you may or may not slip in a little vowel – a short schwa – before the /l/ in burble and gurgle, you almost certainly don’t in turtle, where the tongue can maintain the tip contact and simply release the sides to go from the /t/ to the /l/, making the peak of the syllable not a vowel but the liquid /l/.

So why aren’t liquids considered vowels? In the case of /l/, the tongue tip touches the roof of the mouth, so that rules it out. But in both cases, even though they can be peaks of syllables (meaning you can use them where you would use vowels in words like turtle and gurgle), they don’t behave like vowels anywhere else. We use them where we use consonants, at the beginnings and ends of syllables. And liquids aren’t the only consonants that can be the peaks of syllables: we also do it with nasals (like the syllabic /n/ we usually say in button).

We don’t have unvoiced versions of these sounds in English, though they do exist in other languages – Welsh, for instance. Well, let me make a small correction here: we don’t have separate unvoiced phoneme pairs for these. But we do sometimes say unvoiced versions of them, thanks mainly to our habit of aspirating voiceless stops at the beginnings of stressed syllables (remember that from “Stop! What are you doing”? Don’t make me explain it all over again!). If the stop is followed by a liquid rather than a vowel, the aspiration makes at least part of the liquid voiceless. Say play and pray. If you pay attention, you will find that your voice doesn’t really start up again until the ay part – the /l/ and /r/ are said mostly or entirely voiceless.

There are some other allophones of liquids as well. You’re probably used to saying /r/ with your tongue humped up in the middle but not touching (except at the sides). But you’re surely also used to hearing trilled versions, from languages such as Spanish and Italian but also from Scots English and some other kinds. Trills are actually not liquids. They’re functionally similar to liquids and tend to be used in the same ways and places, but the difference between a trill and a liquid is like the difference between dribbling a basketball (trill) and just picking it up and running with it (liquid). Except that you don’t get whistled out for saying a liquid.

A further effect of this is that /r/ can be said in some dialects as a tap – rather than a trill, which is multiple bounces, you say just one bounce. This is why some British accents can sound to North Americans like they’re saying “veddy” when they say very: we only use a tap for /d/ and /t/, not for /r/. But it goes both ways: a Yorkshire accent can sound to someone from southern England like it’s saying “gor any” rather than “got any” because they, like North Americans, tap /t/ in that position, while in the standard southern British accent only a /r/ would tap in that position.

To add to the fun, in English we have two kinds of /l/, a “bright” one and a “dark” one. The difference is that the “bright” /l/, which is used at the beginnings of syllables, has the tip up but the back fairly low, whereas the “dark” /l/ has the back well up, and sometimes the tip doesn’t quite touch, especially if it’s before another consonant. Compare la la la la la with all all all all all. And then say elk elk elk elk elk and see how the /l/ is reduced to something almost like /u/ without lip-rounding.

Oh, and speaking of lip-rounding, you will notice, if you observe for a bit, that we normally round our lips to some extent when saying /r/. This makes the sound more distinctive. Stand in front of a mirror and watch yourself say ring. Say it slowly and clearly. Maybe take a little cell phone video of yourself doing it. You will see that your lips are rounded. Now say wring. When you say that, you think of your lips being rounded. And they are. But they’re just as rounded when you say ring.

Liquids are certainly mellifluous sounds, though holding them too long can have a sort of “low class” sound to them, just due to established norms in speaking English. Say “haaaaaaard” and then say “harrrrrrrd.” The latter sounds like what? A pirate, perhaps? Now say “faaaaaaaall” and then say “falllllllllll.” Does it sound like a dog, or like someone’s choking you? The effect is much less, however, at the starts of words: “rrrrrright” and “llllllllush” probably sound simply emphatic, and perhaps even a little upper-class – again, due entirely to association with who is heard to say them when.

What do liquids feel like to say? Well, in theory you can sustain them indefinitely, but in practice you will find that your tongue tires out sooner than you might expect because it’s being held in a tensed position. This is especially true of the “dark” /l/, which can feel a bit like choking. Ultimately, in their fluidity, liquids are rather like the fish in the stream of your speech. They’re slick and smooth and wet, and as lovely as they are to look at you probably won’t enjoy holding them for all that long.

Liquids are called approximants by linguists. But they’re not the only approximants out there. Along with these consonants that can behave like vowels, there are sounds that are vowels behaving like consonants…

A Word Taster’s Companion: Ah, frick it

Ah, frick it

Affricate. I do like this word, affricate, though it actually doesn’t contain the sound it names. “Affricate” is not “African” said with a cold and laryngitis, nor is it an expression of dismay or frustration (“I forget!” “Ah, frick it!”). Well, some affricates may be expressions of dismay – [ts] gets used for this at times – but it’s not essential to their nature. An affricate is a stop that releases to a fricative: a single gesture of the tongue, thought of by the speaker as a single sound, but made of two parts: the tongue moves, making a sort of breaking sound. It’s a consonant equivalent of a diphthong. Judge for yourself: Say “judge” and listen to the consonants in the word – is there more to them than in “dud” or “shush”?

We don’t have a lot of affricates in English. If you look at the consonant list in “Sushi thief!” you’ll see a reason why: an affricate requires a stop and a fricative in the same place, and we don’t have that many pairs like that. Actually, we have even fewer than we could. Our only affricate phonemes in English are /tʃ/ and /dʒ/: “ch” and “j.”

We may occasionally say the available other stop-fricative combinations – [ts] and [dz] – and sometimes we may even say them so they’re not across syllable boundaries (as what’s up sometimes becomes ’tsup, for instance). But we don’t think of them as single sounds. In fact, many people will have a resistance to saying them where we can say /tʃ/ and /dʒ/, or will even think they can’t say them because we don’t start syllables with a stop followed by a fricative. Many English speakers have problems saying something like “tsump” and “dzump” – or tsar, or tsunami. But we have no problem saying “chump” and “jump,” or “char” (or “chunami,” if that were a word), even though they’re also a stop plus a fricative in a very similar place in the mouth. This is because we see them – and perform them – as one gesture. You’re saying char, not tshar. It’s the difference between courtship and core chip, for instance. To show in phonetic transcriptions that they’re a single phoneme, sometimes a joining line is written under the two letters. But that’s not supported by many character sets, so you don’t see it all the time.

We also say affricates as versions of stops. For instance, say choo-choo train. You may have noticed that you make the t as the same sound as the ch. You’ll find the same thing, but voiced, in juju drain. In many places where [t] and [d] release with the tongue flexing towards the palate – nature, gradual, dread – the gesture results in affrication: as you release the stop you make a fricative on the way to the next sound. So our target phoneme is /t/ or /d/ and we have it in mind to say that sound and we hear it as a version of that sound, but it actually comes out as [tʃ] or [dʒ].

But those aren’t quite the only affricates we have as allophones. Say cute. Now say it with emphasis, especially on the start – draw it out: Cute! Notice how the hump of your tongue is actually fairly far forward in your mouth when you say the [k]? And how air escapes past it as it releases to the vowel? Congratulations. You’ve just made an affricate that most Anglophones can hardly even conceive of existing – even though they make it: a voiceless palatal affricate. (The International Phonetic Alphabet way of writing it is [cç].)

It’s the further progress of that movement, by the way, that led Latin c, originally [k] in all positions, to become [tʃ] before [e] and [i], as it is in Italian and as one hears it in church music. It’s very easy to move [cç] forward just a little more to [tʃ]. (The process was a little different with [sk]: it dropped the stop as it softened up and it became [ʃ] without passing through [stʃ] – which is why excelsis is “ek-shell-cease” and not “ex-chell-cease,” and prosciutto is “pro-shoot-toe” and not “pros-choo-toe.”) That movement, from [k] to [tʃ], is also one way English came to have these affricates; cheap, for instance, is related to words and roots in other Germanic languages that start with [k] – German kauf, for instance.

It also goes in the other direction: the “y” sound as in yes and yellow – written as [j] in IPA – can be made so narrow that it touches the palate and makes an affricate. You can hear this in some dialects of Spanish: llave, [jave], has moved to [dʒave] in some South American versions, and the same accent can cause its speakers to pronounce English with the same effect: for instance, your sounding like jor. This same process is in fact a way that Latin words with j, which was really i in Latin, came to be said with [dʒ] in English.

What do affricates feel and sound like to say? [tʃ] can have a kind of mechanical or metallic crispness, which shows up in chug, cha-ching, and similar words. It sounds like bells, small change, machines… That effect is softened when you add voice, but there can still be a certain sturdiness, as for instance in Jack and jug. I’d say this also draws on the effect produced by a sense of jutting jaw and meeting teeth, which can be a movement you make when you say these sounds. On the other hand, the crispness of the release and the involvement of the most delicate of our stops, [t] and [d], can make these seem light and pretty in the right context, for instance Chelsea and Jennifer.

Consider the different sound effects between guy and chap, or coffee and java. Try swapping in affricates for stops, or vice versa: choffee? Gava, dava? Does it make it feel sturdier or more delicate, or something else entirely? One thing’s sure: that extra little break does add a little more richness to the flavour.

Next: Lovely, lyrical liquids

A Word Taster’s Companion: The nose knows

The nose knows

Nasal. In phonetics, nasal is also a manner, not a place. Yes, your nose is a place, but you don’t put your tongue in your nose to say nasal consonants. Nasal consonants are made in the same set of places as stops. The difference is that when you say a nasal, your nose is open – more exactly, the velum is lowered, allowing air to pass through the nose.

Try this: say “nnnnnn.” Now say “nnnnnnd.” What happened at the end there? Your velum raised. All of a sudden air couldn’t get out because the passage through your nose was blocked off. That’s the difference between a nasal consonant, which allows air to bypass the mouth through the nose, and an oral consonant – all consonants involve the mouth, of course, but in phonetics oral means not nasal.

And this is why nasals tend to become voiced stops when you have a congested nose. Say mind your manners with your nose pinched shut and you will sound like bide your badders. And pinching your nose shut produces the same effect as raising your velum. You could do that instead to say stops: say “ann” and pinch your nose at the end and you have “and.” But if you had to pinch your nose every time you said /b/, /d/, or /g/ – or /p/, /t/, or /k/ – it would be a problem.

So say pat, bad, man. Voiceless, voiced, nasal: /p/, /b/, /m/. One place, three manners.

Same with the tongue tip: tat, dad, nan.

Now say cat, gad, ngan.

What was that last one? Well, if you can say the voiceless and voiced stop at the back of the mouth, you can certainly say the nasal there. So /k/, /g/, /ŋ/. (I love that symbol, ŋ – it looks like an elephant, doesn’t it?) And no, there’s no [g] in it. We just write it ng because centuries ago we didn’t have a separate phoneme for that sound, [ŋ]; it was just what we did with [n] before [k] or [g] (it still is that too). And then we dropped the [g] in many places so that ng changed from [ŋg] to just [ŋ]. Yes, that’s right, when you say doing you have already dropped the [g], even if you say it “properly.” If you say it like doin’, you haven’t dropped the [g]; there is no [g] to drop any more. You’ve just moved the velar nasal forward to be an alveolar nasal. (And, by the way, doin’ was considered the correct way to say it for a long time, but it was changed back to “the way it’s spelled” in the 18th and 19th centuries.)
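For those who like things in a grid, the three places and three manners above, along with the stuffed-nose effect, can be sketched in a few lines of Python. This is only an illustration: the place labels and the names `GRID` and `stuffed_nose` are my own hypothetical choices, and the sounds are handed in as a ready-made list rather than worked out from spelling.

```python
# Place-by-manner grid for the stops and nasals discussed above.
# The place labels are informal, chosen just for this sketch.
GRID = {
    "lips":        {"voiceless": "p", "voiced": "b", "nasal": "m"},
    "tongue tip":  {"voiceless": "t", "voiced": "d", "nasal": "n"},
    "tongue back": {"voiceless": "k", "voiced": "g", "nasal": "ŋ"},
}

# Block the nose and each nasal becomes the voiced stop made at the same place.
DENASALIZE = {row["nasal"]: row["voiced"] for row in GRID.values()}


def stuffed_nose(sounds):
    """Replace every nasal with its same-place voiced stop; leave the rest alone."""
    return [DENASALIZE.get(sound, sound) for sound in sounds]


manners = ["m", "æ", "n", "ə", "r", "z"]
print("".join(stuffed_nose(manners)))  # bædərz, that is, "badders"
```

Notice that [ŋ] gets the same treatment as the other two: pinch your nose on sing and you are heading for sig.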

The reason we don’t start words with [ŋ] is that it was originally always before a [g] or [k], and it only came to be independent where we dropped the [g]. Some other languages allow it, but many Anglophones believe they can’t say it at the start of a word, so names like Ngaio (a Maori female name, best known from Ngaio Marsh, an author of detective fiction) and Nguyen (a very common Vietnamese family name) tend to be modified in English pronunciation.

Pity. Get a good handle on this sound. If you want to be a really good word taster, and taste the really good words, you have to be willing to make all the sounds your mouth is capable of making, in all the places your mouth is capable of making them, even in locations in words that you wouldn’t normally have them.

What do nasals feel like to say? Well, they’re singable, and they can have a warm and comforting nature. What do you say when you think of good food, for instance? Mmmmm. But they can also be used for hesitation, because they’re consonants you can hold on for a long time without getting to the point. It’s luck – or is it? – that no starts with a nasal, so we can say nnnnnnnnooo as we cagily consider a questionable option. And the velar [ŋ], held by itself, is as likely to express frustration or resistance, perhaps because it’s well suited to saying with teeth clenched. Say it expressively and observe the wide-mouth grimace you probably make… and your hands clenching into fists.

But hold [ŋ] at the end of a word and it has some of that [g] or [k] grip, but much softer. As with the stops, you’ll typically find [n] to feel the lightest, and [m] to feel the warmest. But as always… it varies.

Next: Sushi thief!

A Word Taster’s Companion: Stop! What are you doing?

Stop! What are you doing?

Stop. No, that’s not an order, that’s a manner. If no air can get through the mouth or nose at all until you release the consonant, that consonant is a stop. All the consonants in decapitate are stops, for instance. Our English stops are voiceless /p/, /t/, /k/ and “voiced” /b/, /d/, /g/. Why did I just stick scare quotes on “voiced”? Because you don’t really keep your voice going during the time your mouth is stopped up. Not usually, anyway. Try holding a /b/, /d/, or /g/ while making a voiced sound. Sounds like you’re stifling a sneeze – or something worse. No, the usual difference is actually in how close before the stop the voice stops and how soon after releasing the stop the voice starts again. (Linguists call this voice offset time and voice onset time.) We also tell the stops apart by how long the vowel is before them, as I mentioned in “The vowel circle.” The differences are small, but they’re enough to notice.

Now, let’s get some exercise.

Say picket, kaput, tip-top; doggèd, bagged; debit, batted. Pay attention to your tongue as you say them. Emphasize them. Get a feel for the sound.

If you’ve read “Horseshoes, hand grenades… and phonemes,” you know about the aspiration on the first sounds of picket, kaput, and tip-top. (If you haven’t read it, why not? Give yourself one demerit point and go back and read it. Honestly, how do you expect to be an expert if you skip things?) I’m talking about the difference between the /p/ in spit and the /p/ in pit. Also between the voiceless stops in still and skill and the ones in till and kill. Put your hand in front of your mouth while you say them if you want to refresh your memory. Don’t do it in public; people might think you’re checking your breath. Actually, you are, but not that way.

OK, now say a picket, a picket, a picket, a picket, a picket, a picket, a picket… Come on, faster!

Now say gotta be, gotta be, gotta be, gotta be, gotta be, gotta be, gotta be… come on, pick it up!

You may have noticed something in picket a and gotta. Most North American English speakers will, in relaxed speech, turn [t] and [d] between vowels into a tap or flap of the tongue – so the dd in madder and the tt in matter tend to be indistinguishable much of the time (thank goodness for context). The IPA symbol for this sound is [ɾ]. The voice never actually cuts out on a tap, which is why people often think it’s just changing the [t] to a [d] – the tap is more like a [d], but it’s not one; it’s as much like a quick British “r,” which is why the symbol is the shape it is, [ɾ] (and why some North Americans think some Brits say “veddy” for very). But you may nonetheless say madder slightly differently from matter. This will be a subtle difference in the voicing length on the [æ], as I’ve mentioned: a vowel is shorter before a voiceless stop. But the difference can often be too subtle to be reliable.
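The tapping habit described above can be roughed out as a one-line rule: a /t/ or /d/ with a vowel on either side comes out as [ɾ]. Here is a hedged Python sketch of just that. The vowel set is an assumption of mine (including treating the "er" of matter as a single vowel-like sound, written "ər"), the function name `flap` is hypothetical, and the rule deliberately ignores stress and the subtle vowel-length difference, so it overapplies compared with real speech.

```python
# Toy version of North American tapping: /t/ or /d/ between vowels becomes [ɾ].
# The vowel set is an illustrative assumption; "ər" stands in for the
# r-coloured vowel at the end of "matter".
VOWELS = {"æ", "ə", "ɪ", "ɑ", "ər"}


def flap(segments):
    """Return a copy of the sounds with intervocalic t/d replaced by a tap."""
    out = list(segments)
    for i in range(1, len(segments) - 1):
        between_vowels = segments[i - 1] in VOWELS and segments[i + 1] in VOWELS
        if segments[i] in {"t", "d"} and between_vowels:
            out[i] = "ɾ"
    return out


matter = ["m", "æ", "t", "ər"]
madder = ["m", "æ", "d", "ər"]
print(flap(matter) == flap(madder))  # True: both come out as m æ ɾ ər
```

A /t/ at the start or end of a word is left alone, since it has no vowel on one side, which is why tip-top keeps its crisp stops.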

What do stops feel like to say? They’re percussive, but the exact quality varies according to place and voicing. Listen to them as you say them: [p] is lower in tone than [k], which is lower than [t]. This is because of the size and shape of the resonating cavities when you release the stops. This makes [t] the lightest and most fragile-seeming of the bunch. That’s helped by its being on the tip of the tongue, which feels less substantial than the back of the tongue, which kicks with [k], or the lips, which pop with [p]. But the tip is also the most agile part.

Add voicing now – in other words, reduce the voice onset time after release. They’re [b], [d], [g]. They’re blunter, stickier. But they still have the same kind of differentiations as their voiceless counterparts.

But it’s not as though there’s some absolute intrinsic taste to each of them. It varies from word to word, and from speaker to speaker. Say them all several times and decide for yourself how they seem to you: pat kid bag, tap dig back, top dog buck, put big cod… Yes, part of it is in how they play with other sounds. And the meanings and other associations of the words, of course. Oh, we’ll get to that!

Next: The nose knows

A Word Taster’s Companion: The consonant line

The consonant line

If vowels are the blood of words, consonants are the bones. And while vowels are in a circle in the mouth, consonants are in a line, because they’re made by contact – or very close constriction – between the tongue and the palate, or the lips with or without the teeth.

Start by getting just a basic sense of what your tongue is doing. Move the tip of your tongue slowly from “th” (as in “thin”) to “s” to “sh,” then back forward. Now do the same but with voice: “th” as in “this,” “z,” “zh.” And back.

Now let’s go just a little crazier: saying “l” (as in “let”), make the same range of movement with your tongue tip. Does it tickle? Oh good.

What you’re doing when you do that is running your tongue tip between the back of your teeth and the back of your alveolar ridge – alveolar comes from the Latin for “little hollow,” after the sockets your teeth sit in. Behind it is the hard palate. Keep curling your tongue farther back if you can and you’ll get to the soft palate, also known as the velum. This is where, with the back of your tongue, you say the final sounds in long, log, and lock. All the consonants in English are articulated somewhere in the line between there and the teeth and lips. (OK, except for [h]. And also the glottal stop, but that’s not a separate phoneme.) Some other languages go farther back.

Consonants may be linear, but they have several ways they can be made, so there are more of them. Linguists classify them by voice, place, and manner. The manner – the type of movement made – is what really makes them interesting. All good word tasters must mind their manners, and in the next six sections I will tell you the manners to mind.

First: Stop! What are you doing?

The madder matter of t’s and d’s

One of the most common “have you ever noticed” things people like to make mention of in English pronunciation – especially North American English pronunciation – is how, in many words, such as matter and betting, “we say ‘t’ as ‘d’.”

I put that in quotes because that’s what people say.

It’s not really true.

Actually, we say them both as a third sound. It just happens that this third sound, to our ears, sounds more like [d] than like [t]. (By the way: I’m using the linguistic standard of putting a sound in brackets, [t], if it’s the sound we’re actually making, and between slashes, /t/, if it’s the sound we believe ourselves to be making whether or not we actually are making exactly that sound. So “hit it” will always be /hɪt ɪt/ but not always [hɪt ɪt].)

Here, I’ll prove that we don’t say it as [d]. Say the following, slowly and carefully, perhaps as though you’re speaking to someone who is hard of hearing:

I’m not kidding about the reckless betting.

No problem making /t/ and /d/ different there, right?

Now say it quickly, as quickly as you reasonably can, maybe two or three times in a row.

Those d’s and t’s seem to be pretty much the same sound now, right? All d’s, perhaps?

No, not all d’s. Say this slowly and carefully, perhaps as to someone who is having a hard time hearing you:

I’m not kidding about the reckless bedding.

Before, when you said “reckless betting” quickly, there was no problem with a hearer knowing you were talking about gambling. But when you say the [d] clearly, that’s out the window; you’re now talking about crazy quilts and sheets. You can’t say “bedding” clearly and be taken for saying “betting” under normal circumstances.

We tend to think that we’re saying it as [d] because most of us don’t have a letter to associate with what we are saying it as. But I’ll tell you what we’re really saying it as: a thing linguists call a tap. The tongue just taps the alveolar ridge without really stopping the airflow. We sometimes make a flap, which is when the tongue taps on the way past rather than bouncing off. A tap is like in “better” (said quickly and casually); a flap is like in “editing” (said quickly and casually). The International Phonetic Alphabet symbol for a tap or a flap is [ɾ].

Does that look like a partly-formed r? As well it might. Some speakers – particularly those with accents we might think of as “proper” British – will use it for /r/ in the middle of words, as in “very horrifying.” North Americans, who aren’t used to saying /r/ that way, often represent this as a d as in “veddy British.” But it’s not [d]. It’s [ɾ].

Here’s how sounds work in language: Every language has a set of sounds that are considered to be distinctive – swap in a different one and you have a different word (or a non-existent word). These distinctive sounds are called phonemes. Do not confuse these with the letters of the alphabet. For instance, c is a letter that can stand for the phoneme /k/ as in can, /s/ as in ice, or even /tʃ/ in some loan words such as ciao. On the other hand, /k/ is a phoneme that can be represented by c as in can, k as in kill, ck as in kick, ch as in school, q as in question, even que as in unique.

But a sound that is considered to be distinctive may have several different ways of being produced, depending on where it shows up. We just happen to hear them all as versions of the same sound and thus interpret them all as the same sound by habit without generally noticing that there is any difference. Take /t/, for instance. Say the following words:

ting sting matting mattress mat mitten

Each one has a different version of that /t/. Linguists call these different versions phones (as if that word didn’t have enough meanings already). The system of phones is phonetics, while the system of phonemes is phonemics. (Phonics is not a word linguists use.)

Put your hand in front of your mouth and say “ting sting.” You might feel an extra puff of breath on “ting.” If you say “pill spill” you will feel much more of a puff on “pill.” We put those puffs on voiceless stops (/k, t, p/) when they’re at the very beginning of a syllable – but not if there’s /s/ before them at the start of the syllable. Those puffs are called aspiration.

That’s two of the six different variations on /t/ – what linguists call allophones of /t/. I’m sure you can hear the different allophones in “matting” (with the tap) and “mattress” (where the /tr/ together sounds like “ch” plus “r”). Now how about “mat”? The difference with that one is that we don’t release /t/ when there isn’t another vowel or liquid after it – we just hold it closed. Usually we just close our throat (glottal stop) and sometimes we don’t even entirely touch the tongue to the roof of the mouth. If you have /n/ after it, as in “mitten,” just the nasal passage releases, unless you’re speaking carefully or formally.

All of these are thought of as /t/. All of them are heard as /t/. But they really do differ. In some languages some of them are treated as distinct sounds. You know how speakers of some languages can’t say “beat” and “bit” differently? That’s because those two vowels are allophones – different phone realizations – of the same phoneme in those languages. Well, we’re like that with things like the difference between aspirated and unaspirated stops.

Why do we do this? Economy of effort. A /t/ is a voiceless alveolar stop. We don’t always retain all those characteristics of voice (voiceless), place (alveolar), and manner (stop); we’ll stick with whichever is sufficient to make the sound recognizable while not having to make too much effort to say it, and sometimes we’ll add a little more distinction where needed. So at the start of a word, we add that puff of air to make it clearer that it’s not /d/. We don’t need to do that after /s/ because we never say /sd/ at the start of a word. In the middle of a word like matter, we just keep the place and a similar manner, but we don’t stick too closely to the voicelessness or the hard stop. At the end, as in “mat,” or before a nasal, as in “mitten,” we reduce it to a different stop (glottal) that takes less effort to say. That’s also what some people (notably some British people) do when they use a glottal stop between two vowels, as though “matter” were “ma’er” (or “ma’ah”). The quality of being a voiceless stop is enough; the other two voiceless stops (/k, p/) don’t reduce to a glottal stop in English.
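The rules of thumb above can be sketched as a little decision procedure. This is a rough Python toy, not a real phonological model: the function name `t_allophone`, the plain-English labels, and the vowel list are all assumptions made for the illustration. The `next_stressed` flag is an extra refinement that the text does not spell out (the tap shows up before an unstressed vowel, which is why “attack” keeps its puff of air).

```python
# A rough set of vowel symbols, just enough for the examples here.
VOWELS = set("aeiouæɪɛəʌɑɔʊ")


def t_allophone(prev, nxt, next_stressed=False):
    """Guess which version of /t/ turns up between two neighbouring sounds.

    prev and nxt are single sound symbols, or None at the edge of a word.
    """
    if prev == "s":
        return "plain"               # "sting": no puff after /s/
    if nxt is None:
        return "unreleased/glottal"  # "mat": held closed at the end
    if nxt == "n":
        return "nasal release"       # "mitten": released through the nose
    if nxt == "r":
        return "affricated"          # "mattress": like "ch" plus "r"
    if prev in VOWELS and nxt in VOWELS and not next_stressed:
        return "tap"                 # "matting", "matter"
    return "aspirated"               # "ting": start of the syllable


examples = {
    "ting": (None, "ɪ"),
    "sting": ("s", "ɪ"),
    "matting": ("æ", "ɪ"),
    "mattress": ("æ", "r"),
    "mat": ("æ", None),
    "mitten": ("ɪ", "n"),
}
for word, (prev, nxt) in examples.items():
    print(word, "->", t_allophone(prev, nxt))
```

Each branch keeps only as much of “voiceless alveolar stop” as that position needs, which is the economy of effort at work.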

So those are the allophones of /t/. What you need to know is that sometimes two different phonemes have, in some contexts, the same phone as an allophone. Most “short” vowels in English reduce to a neutral unstressed vowel [ə], for instance. The case in point today is [ɾ], which can be a version of /t/ or /d/ (or, in some kinds of English, /r/).

We think of voice as the difference between /t/ and /d/. But they’re stops – how do you voice a consonant when your air flow is stopped? You don’t, really. You know the difference between /t/ and /d/ mainly by how the sounds before and after behave. Say this:

mad mat

In “mad” your voice keeps going right up until you say the [d], but in “mat” you cut off a moment sooner. You also say the vowel a bit shorter.

Now say this:

The madder matter

The difference is very subtle, isn’t it? But you may say the [æ] before the /d/ a little longer than before the /t/, and you may cut the voice out just a little for the /t/ version. It’s not really enough for a listener to be sure which one you said, but the sound you have in mind may leave that small trace when you say it.

On the other hand, you might really say them both the same way.

It just happens that that way will not be with [d]. It will be with [ɾ].


If there’s one thing in the English language that can be a cause of stress, it’s stress. Our words do not have a consistent pattern of stress, and sometimes it can’t even be readily guessed by looking at them: there isn’t anything within the form that shows where it goes, and so you need to have deeper knowledge of the word – or simply to have learned it by rote.

And sometimes you haven’t. Jim Taylor, in suggesting a look at this topic, mentioned a friend who pronounced cadaver as “CADaver”. Of course, only a cad would aver that a person who made such an error was somehow inferior; we can go much of our lives without hearing some words, and so it is not uncommon to be left to our own guesses. The result is that people, as is sometimes remarked, get the emPHAsis on the wrong sylLABle.

Ah, emphasis. It’s a way of saying which way the “oomph” faces. Actually, emphasis originally referred to putting more in a word than its denotation – that is to say, using it to imply something extra. From that it came to refer to intensity of expression, and so it is also sometimes used to mean what is also called stress or accent in words. Which is suitable, because if you have a foreign accent, you’re likely to find that the emphasis is, as mentioned above, a cause of stress.

But, now, why isn’t it “emPHAsis”? After all, it’s “emPHAtic”. Well, for this pair in particular, we need to go back to the source: Greek. Emphasis has come to us unaltered from Greek, and in Greek the stress is on the first syllable. (The emphasis, one may say, although in Classical Greek the accent may have been more a question of pitch than of emphasis.) The word comes from em “in” and phainein “show” (we see the same root in, for instance, epiphany), and the rules of Greek morphology move the stress to the first syllable in this case. On the other hand, in the derived Greek word emphatikos, the added suffix puts the stress on the last syllable – a syllable which we have dropped in English emphatic, and we’ve put the stress, as we generally do in -ic words, on the second-last syllable (the penult, as it’s called).

So what this word has within it that guides its stress is its Greek origin. And origins turn out to have a lot to do with English word stress. English has gotten its wordstock from a lot of different languages, and different languages have different rules about stress – and we sometimes, though not always, keep the stress when we borrow the word.

Some languages, like Greek, have stress that shifts and that is not really consistent from word to word – on the other hand, Greek also has accent markers to show which syllable gets the stress (though in modern Greek they’re not always used). Some languages have stress that is contingent strongly on vowel length – Latin has this characteristic. Some languages have stress and vowel length independent of each other: Finnish and Hungarian both always put the stress on the first syllable regardless of which vowels are long or short. Some languages have consistent stress without contrastive vowel length: Polish, for instance, always has stress on the penult, but doesn’t really have a long-short vowel contrast.

English, for its part, does have some predictable patterns, and to some extent they relate to what we call “long” and “short” vowels – though in Modern English the distinction is one of quality rather than of quantity (long vowels in English actually were, more than half a millennium ago, just extended versions of the short ones, but that all changed during what’s called the Great Vowel Shift). But much of that actually relates to the other languages (especially Latin) that we got the words from. In truth, Old English (which is what was spoken in England from roughly the 7th to the 11th century AD) put the stress as a rule on the first syllable of the root. (If there was a prefix tacked on the beginning, it wouldn’t receive stress.)

What this also means is that when we add a suffix that came down from Old English, it will probably not affect the stress of the word. Suffixes such as -dom, -ful, -hood, -man, -ness, -ward, and -wise don’t draw stress to them – though -wise with its “long” vowel gets a secondary stress. On the other hand, suffixes from French, such as -ee, -eer, -aire, -elle, -esque, -ese, and -ette, tend to grab the stress.
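For anyone who likes a lookup table, the two suffix lists above can be sketched in Python like this. It is a deliberately naive toy: the function name `suffix_effect` is made up, and it only checks how a spelling ends, so it will happily mistake any word that merely looks like it has one of these endings (“these” for an -ese word, say).

```python
# Suffix lists exactly as given in the paragraph above.
NEUTRAL = ("dom", "ful", "hood", "man", "ness", "ward", "wise")
GRABBING = ("ee", "eer", "aire", "elle", "esque", "ese", "ette")


def suffix_effect(word):
    """Return (suffix, effect) for the longest listed ending, or None."""
    candidates = [(s, "neutral") for s in NEUTRAL]
    candidates += [(s, "takes stress") for s in GRABBING]
    matches = [
        (s, effect)
        for s, effect in candidates
        if word.endswith(s) and len(word) > len(s)
    ]
    if not matches:
        return None
    suffix, effect = max(matches, key=lambda match: len(match[0]))
    return "-" + suffix, effect


for word in ["kingdom", "clockwise", "kitchenette", "engineer", "table"]:
    print(word, "->", suffix_effect(word))
```

Note that “neutral” here only means the suffix leaves the main stress where it was; -wise still picks up its secondary stress.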

Meanwhile, words we get from Latin – such as cadaver – tend to get stress on syllables that in Latin had a long vowel (in cadaver, the second /a/ was long: /ka da: ver/). Words that come from Greek sometimes follow Greek stress patterns, and sometimes just get the accent on the antepenult (the third-last syllable) in that grand old English habit: Socrates was said in Greek with the stress on the penult, but in English we shifted it to the antepenult. Words we got from French, if we present them as French words not much changed (more typical with more recent borrowings), will get stress on the final syllable, but there are many words in English that came from French long enough ago that the stress has changed. And of course those words usually are derived from Latin words, which will add further influence.

So how do you know what syllable an English word is stressed on – which syllable gets the emphasis, as it were? Dude, look it up.

But, ah, if you’re on a desert island with no internet and there’s a maniac who’s going to hurt you if you don’t know the pronunciation of this or that word (in real life there are many more such maniacs than there are desert islands to put them on), start by identifying the bits of the word. You can ignore all the inflectional suffixes – things like -ing and -ed and -es and so on – but of course you will need to take note of any suffixes that tend to draw the stress (including -ic and -ity, as in -icity and -idity, which draw it to the syllable before them).

If the root has two syllables, the stress is probably on the first – unless it’s a loan word or one of those verbs that pair with adjectives and nouns (insult, perfect, torment, escort, etc.).

If the root has more than two syllables, look at the penult of the root. If it has a “long” vowel or the vowel is followed by two or more consonant sounds before the next vowel, it’s probably stressed. If not, the stress probably goes on the antepenult.

Unless it doesn’t, of course.
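For the desert island, the rule of thumb for roots can be sketched in a few lines of Python. Everything here is a hedged toy: the `Syllable` class and `guess_stress` function are invented for the illustration, you have to split the root into syllables and judge the vowels yourself, and “agenda” is an example of my own rather than one from the discussion above. Treating the two-syllable exceptions as second-syllable stress is also this sketch's assumption.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    text: str
    long_vowel: bool = False   # a "long" vowel, judged by ear
    consonants_after: int = 0  # consonant sounds before the next vowel


def guess_stress(root, is_exception=False):
    """Return the index of the syllable that probably carries the stress.

    root is the list of syllables left after stripping inflectional suffixes.
    is_exception marks a two-syllable loan word or a verb like "insult".
    """
    if len(root) == 1:
        return 0
    if len(root) == 2:
        # Assumption: if it is not on the first syllable, it is on the second.
        return 1 if is_exception else 0
    penult = len(root) - 2
    heavy = root[penult].long_vowel or root[penult].consonants_after >= 2
    return penult if heavy else penult - 1


emphasis = [
    Syllable("em", consonants_after=2),
    Syllable("pha", consonants_after=1),
    Syllable("sis"),
]
agenda = [
    Syllable("a", consonants_after=1),
    Syllable("gen", consonants_after=2),
    Syllable("da"),
]
insult = [Syllable("in", consonants_after=2), Syllable("sult")]

print(emphasis[guess_stress(emphasis)].text)                # antepenult
print(agenda[guess_stress(agenda)].text)                    # penult
print(insult[guess_stress(insult)].text)                    # the noun
print(insult[guess_stress(insult, is_exception=True)].text) # the verb
```

It returns a guess and nothing more, so the advice to look it up still stands.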