Category Archives: writing

BIBBO

Back in high school in Banff, in the early 1980s, I had a math and computer science teacher named John P. Stutz. He was quite the character (still is; between then and now, he has been mayor of Banff, and now he performs weddings for a living). One of his lessons I remember best is the classic computer science axiom: GIGO – garbage in, garbage out. In other words, no matter how good a program is, if you feed it garbage data, you will get garbage results. The fact that it’s a well-designed program on a good computer doesn’t automatically transmute rubbish into gold.

Back then, machine learning and artificial intelligence were still in early days. The sophistication I now get from my iPhone – or Google, for that matter – would have blown my head clean off back then. Stutz’s too, probably. But these days, you can seriously propose and produce things that would have been pie in the sky at the time: say, having a computer read job applications and automatically filter candidates, or getting writing advice from a computer application that had learned from a large number of published articles, or having software use security cameras to scan faces and flag people whose appearance matched faces with a higher-than-average likelihood of interaction with police. And when you train that kind of thing, GIGO matters. But something else matters even more: BIBBO.

BIBBO? That stands for bias in, bigger bias out.

Why not just BIBO? Well, here’s the thing. Garbage is garbage, but bias is scalable, and repeated bias compounds. When computers learn biased things from biased data and then put those biased things in the real world, that has real-world effects that feed back and increase the bias in the model. And also, if bias is going in, it’s because that bias can be found in the real world, and the machine’s biased output will confirm and strengthen that real-world bias. Not just the machine but the people who use the machine will have bigger bias.

Consider the job application filtering program I mentioned. The machine learning will look and see that certain kinds of applicants have been more likely to get hired, and will filter those kinds of applicants in and other applicants out. Seems fine? Not if the hiring choices in the original set were influenced by factors such as race, gender, religion, prestige address, prestige school etc. For most jobs, none of these factors have a direct bearing on ability to do the job. But the machine will see a certain kind of name and address and so on and downgrade the applicant on the basis of other similar people not getting hired previously. And then as such applicants are hired less and less, the machine’s bias is reinforced – as is the bias of those doing the hiring.

Consider the writing advice idea. An application that has looked at thousands of academic journal articles will have identified various stylistic features of academic writing. What it won’t know is that many of those features are actually functionally bad: they obscure key details and bury essential points in circumlocution and uncommon terms. Many editors are trying to work to undo these practices, but it’s an uphill struggle, as academic authors often think that if text sounds too clear it’s not erudite enough, and if it takes into account the author’s specific role and position it’s not objective. Throw in software that reinforces these habits and you get academic authors having these prejudices reinforced and being told to do more of what the editors want them to do less of.

And consider the security cameras. If the data from the cameras is used to advise police on who to do a stop-and-check with, and each stop-and-check is counted as an interaction with the police, then obviously it produces a feedback effect. If in one month people with red hair happen to have interactions with police at twice the rate of people with brown hair – statistical anomalies do happen – and that data is fed back into the system, in the next month people with red hair may be more likely to be stopped and checked, which will increase their interaction statistics. And even if the system only counts actual arrests, not interactions, anything that increases the likelihood of someone being stopped by police also increases the likelihood of their being arrested, assuming they have the same likelihood as anyone else of happening to be doing or possessing something illegal, and the same likelihood of not responding well to being stopped by police for no obvious reason. And then the system becomes more biased, and so may the police officers – and perhaps society in general.
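Here’s a rough sketch of that red-hair loop in Python. Everything in it is an assumption made up for illustration, not taken from any real product: the two groups, the numbers (per 1,000 people in each group), and above all the rule that the system flags any group whose running total of recorded interactions is above the average and sends a fixed number of extra stops its way. Both groups are assumed to behave identically; the only difference is the one-off anomaly in the first month.

```python
def simulate(months, extra_stops, background=10, first_month=(20, 10)):
    """Return the red/brown ratio of recorded interactions, month by month.

    All numbers are hypothetical, per 1,000 people in each group.
    """
    # Month 0: a one-off statistical anomaly; red is recorded at twice the rate.
    totals = {"red": first_month[0], "brown": first_month[1]}
    ratios = [totals["red"] / totals["brown"]]

    for _ in range(months):
        # Assumed rule: flag any group whose running total is above the average.
        average = sum(totals.values()) / len(totals)
        flagged = [group for group, total in totals.items() if total > average]

        # Both groups behave identically, so background interactions are equal.
        for group in totals:
            totals[group] += background

        # Each camera-prompted stop is itself logged as an interaction.
        for group in flagged:
            totals[group] += extra_stops

        ratios.append(totals["red"] / totals["brown"])
    return ratios


with_feedback = simulate(months=6, extra_stops=20)
without_feedback = simulate(months=6, extra_stops=0)
print([round(r, 2) for r in with_feedback])
print([round(r, 2) for r in without_feedback])
```

Under these assumptions, with the feedback switched on the recorded ratio climbs from 2 toward 3, while with it switched off the same anomaly fades toward 1. A different flagging rule would give different numbers; the point is only that logging the system’s own stops as evidence keeps a fluke alive and growing.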

These aren’t made-up examples, either. They’re taken from the real world, from products and applications promoted by software companies. I’m not naming names just because it’s tiring (and occasionally expensive) to deal with angry techbros.

There are ways to correct for these biases, of course. You can work on evening out the training data; you can correct for biases in the data and the output. Above all, though, you need to know what to watch out for, and how to deal with it. You need to know BIBBO. Because if there’s bias in the system, bias is the system.

It will naturally help if you yourself, as a designer of the machine learning, do not also have uninspected and uncorrected biases. A problem we face today with many of these applications is the idea that if it’s large amounts of real-world data processed by sophisticated programs, then it is objective and not subject to human biases. This is false, of course – it’s programmed by humans and the data is taken from humans in a society with its own biases – but there are people in the field who do not seem to see that it is false, because they have an ideology that science (including math and computer science) is hard and strong and intelligent and objective, while things that study humans – sociology, philosophy, etc. – are soft and weak and wishy-washy and tendentious. They come through education with this bias, and they use it to filter the information they get, and they design computer applications with that bias. And so you get these things that reinforce bias. All because they thought they could avoid bias by avoiding inspecting bias. But BIBBO.

Global English?

This article originally appeared on the blog of ACES: The Society for Editing.

English is not one language and never has been. Even Old English had different dialects. Global English is a family of varieties, mostly mutually comprehensible but loaded with traps and surprises. And even when you can easily understand English from another part of the world, you will most likely recognize that it’s from somewhere you aren’t… and you’ll eventually get confused by something.

All of that shouldn’t be a surprise to anyone, but some people seem to think it’s possible to produce a neutral, non-regional, truly global English. I will grant that it’s possible to produce an English that seems at least slightly foreign to anyone anywhere – the famous “mid-Atlantic” English you hear in some movies is a spoken version – but it is not possible to produce a variety of English that is taken as unremarkably local by every English speaker everywhere. There are several reasons for this.

Pronunciation

The most obvious difference is in pronunciation. Get someone from Kalgoorlie, Western Australia, someone from Tuscaloosa, Alabama, and someone from Newcastle upon Tyne, England, to have a pleasant chat and see if they can understand each other at all. 

Pronunciation is less of an issue when dealing with the written word – you probably won’t have a person from Buffalo writing “hot” and a person from Toronto thinking it’s “hat,” as you may when it’s spoken. But text is, in fundamental ways, a representation of the spoken word, and it often relies on reference to the spoken word. 

Not just jokes but advertisements and catchphrases rely on rhymes and wordplays that are particular to just some varieties of English – “caught” and “court” sounding the same, or “quarter” and “border” rhyming for instance. These differences also help ensure the impossibility of English spelling reform: you can’t make a phonetic spelling of one variety of English that won’t be incomprehensible to users of many other varieties.

Spelling

Not that English spelling is the same everywhere of course. Canadians are used to American-style spellings but can be very patriotic about colour and centre in some contexts; if a Canadian book expects a largely American audience, however, you can count on those Canadian spellings to alienate them. And on the other hand, if you just go with British-style spellings in Canada, you’ll soon realise it doesn’t always suit. And there are more striking differences, such as gaol versus jail, oestrogen versus estrogen, and arse versus ass – though that last case is arguably a difference of which word is used, not just which spelling.

Same thing, different word

There are many, many things that have different names in different countries. It’s well known that British cars have boots and bonnets instead of trunks and hoods and that a British lorry is an American truck (of a specific kind); it’s generally famous that what Americans call a barbecue Australians call a barbie. Fewer people will know that South Africans call the same thing a braai, or that instead of saying bro or buddy they say boet (which sounds like “boot”) – while in India, they say yaar.

For that matter, there are regional differences even in America, some of them quite celebrated. Is a Pepsi a pop, a soda, or a Coke (used in defiance of trademarks)? Do children on playgrounds ride see-saws or teeter-totters? Such regional differences – which don’t always divide on the same lines – are what linguists call isoglosses, and maps showing the isoglosses are some of linguists’ favorite things.

Same word, different thing

Americans occasionally run up against the fact that pants and fanny mean less publicly acceptable things in British English, and Americans are likely to know that in England and Australia mate refers to a friend rather than a romantic partner.

They’re less likely to know that hotel can mean a restaurant in India; that South Africans call a traffic light a robot; that in India you don’t graduate, you pass out; that tea can be a full meal in England; that a torchlight in Nigeria is a torch in England and a flashlight in the US; that I understand you in the US is I hear you in Nigeria; or that South Africans say shame when they are shown a cute baby or told of happy news such as an engagement.

Americans may not even know what someone from a different part of the US means by boulevard (a grassy strip between sidewalk and street or a wide avenue with a green strip in the middle?).

Turns of phrase

The lexical differences also extend to idiomatic turns of phrase. Where an American might write Main Street on Friday is different from a suburb on the weekend, a Brit would have The High Street on Friday is different to a suburb at the weekend.

A person from England might say I’ll knock you up to mean I’ll drop by and might tell you to keep your chin up by saying Keep your pecker up, but if the hearer is from North America, the results could be… awkward.

Some differences are points of pride: New Yorkers make waiting on line rather than waiting in line a kind of local shibboleth, and for New Zealanders, a phrase like Kiwi as (as in This food is Kiwi as) is, well, as Kiwi as… as what? They expect you to fill in the blank.

Grammatical niceties

There is also the matter of things that are correct usage in one variety but terrible errors in another. I dreamed I dove into a lake may be fine in the US, but I dreamt I dived into a lake is necessary in England. I casted my vote yesterday is terrible in some countries but absolutely correct in Nigeria. I’ll call you when I reach is normal in India rather than I’ll call you when I arrive.

Cultural references

Words and grammar aren’t the only things that vary from place to place though. English-speaking culture is obviously far from uniform, and some baseline assumptions just don’t work the moment you cross a border. Food is different, and passing references can quickly be opaque: not everywhere has food trucks or pretzel carts or chaiwallahs; not everyone can order poutine or grinders or bangers.

And while any Canadian will know what another Canadian means by toque and parka, most other people in the world won’t.

Americanizing and Canadianizing texts is a large and expensive business, and the spellings are the least of the issue. I remember one time a Canadian colleague working on a converted document discovered a number of instances of underprovinciald in it; it turned out that someone had done a replace-all from state to provincial without checking. But when a guide to a health care topic starts talking about insurance, no amount of word replacement will fix the disparity between the US and Canada – or, really, between the US and anywhere else.
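The underprovinciald mishap is easy to reproduce. Here’s a minimal sketch in Python; the sample sentence and the two function names are mine, invented for illustration. The first function does a blind replace-all, and the second matches only whole words, using the standard `re` module’s `\b` word boundary.

```python
import re


def naive_replace(text, old, new):
    # Replaces every occurrence, including ones buried inside longer words.
    return text.replace(old, new)


def whole_word_replace(text, old, new):
    # \b marks a word boundary, so "state" inside "understated" is left alone.
    # re.escape guards against special characters in the search term, and the
    # lambda keeps backslashes in the replacement from being interpreted.
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, lambda match: new, text)


sample = "The state health plan is understated."
print(naive_replace(sample, "state", "provincial"))
print(whole_word_replace(sample, "state", "provincial"))
```

Even the whole-word version is no substitute for reading the result: it will still cheerfully change state where it’s a verb, and it knows nothing about capitals or plurals.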

Houses and other buildings can be different, including what’s called the first floor (ground floor in the US and Canada, the floor above ground in most of the rest of the world).

There are also regional differences. In Canada, for instance, if you talk about a condo in Ontario, you probably mean a high-rise apartment; in Alberta, a condo is more likely to mean a townhouse, possibly a vacation property. What you mean by the word bungalow can vary quite a bit depending on where you are in the US. And in some cities, a duplex is typically side-by-side residences with one common wall, while in others, it’s a house with one residence on the upper level and the other on the lower – meaning that a reference to the people in the other half banging on the wall may be confusing.

Global varieties

How many kinds of English are there? Hmm, get a book of paint colors from a hardware store and tell me how many kinds of white, or blue, or black there are. Get another book and count again. English has national standard varieties, regional varieties within countries, local variants, socially divided varieties (often people from the same social group in different cities will sound more like each other than like people from other social groups in their respective cities). 

And don’t forget that the status of English is not the same in every country where it’s spoken – it’s the historical main language in some, the language of a colonizing class in others, and a lingua franca in still others. 

But in every country where texts are published in English, someone needs to make sure that that English doesn’t seem strange. And that someone may be you. The one thing you can be sure of is that while one variety of English may be comprehensible to speakers of another, it may alienate them – and may give rise to significant misunderstandings.

No exceptions?

Do I see a hand in the back? …Yes? …Labels on boxes? And short warnings and things like that? Yes, it’s true that you can produce some short passages that look local to anyone anywhere. But that’s not a global variety of English; it’s a snippet, and many other similar snippets will not seem so universal. 

It’s like going up to a rail ticket office in a European country and knowing enough of the local language to buy a ticket without their noticing that you’re not a native speaker: it doesn’t mean you’re fluent. You couldn’t carry on a conversation without being smoked out. You sure couldn’t write an article – let alone a book – that would be smoothly idiomatic. 

The same is true with using English from one part of the world in another part of the world. Oh, they’ll understand you, probably. But they’ll know you’re not from there, and there will be extra friction and effort in the communication and comprehension. You may not realise it, but the little differences to what you’re expecting colour your reception. And editing means understanding, appreciating, and working with these subtleties.

In effect, localizing English is like translating from one language into another, just subtler. You should only localize into a variety you have native fluency in – if you try to adapt a text into the English of a country you’re not from, you will eventually make an embarrassing mistake. But you also need to know the variety you’re converting from well enough to understand the local points of usage and cultural assumptions, so you don’t think a Canadian’s toque is a chef’s hat, believe that a South African at a robot is watching an android, or wonder what the big deal is about jumping out a first-floor window.

Which, in my view, seems like an excellent excuse to do some international traveling… when you can.

Some advice for a would-be author

Occasionally a friend or family member will be talking with someone who wants to publish their writing and will suggest they ask me for advice. I’ve just sent off an email to one such person, and I think other people might also benefit from the advice (modified a bit to be more general). Now, bear in mind, this is about magazine publishing, and while I’ve had quite a few articles published, I’m not a magazine assigning editor and haven’t ever been one, so some of this advice is second-hand and will benefit from further insight from people who have to field queries, pitches, and submissions regularly. But it’s a start.

The thing I would suggest doing first, when it’s possible, is seeing what magazines your favourite bookstores and newsstands carry in the subject area you have in mind. Then have a look at those magazines and see which ones are publishing pieces of about the length and kind of topic you’re interested in writing. (No matter how interesting a piece is, if a magazine doesn’t have a way to fit it into their lineup of content, they won’t be able to use it.) 

If a magazine looks like it publishes articles of the sort you’re writing, look at its website; there will usually be information on how to submit, or they may want you to email first just describing yourself and the article and asking if they’d be interested in seeing it. (If they say they don’t accept unsolicited submissions, don’t bother them. They won’t make a special exception for you.) They always want to know what you’ve already published and where. If you have a blog, you can direct them to that and they’ll be able to see what kinds of things you’ve written; also, if you have some idea of how many people read your blog, and the number is large enough, you can mention it to them. Magazines tend to favour authors who bring readers with them!

Above all, when emailing editors, be friendly and polite, but also concise and to the point—the easier it is to answer an email, the sooner they will probably answer it. So just lead off with a short statement about the article (“Would [magazine] be interested in an article about [X]?”), then describe yourself and your blog and anything relevant you’ve had published elsewhere, and give a bit more detail on the article and why it would be well suited to their magazine (it never hurts to say nice things about the magazine too!), and thank them for their time and say you look forward to hearing from them.

If you feel that your article needs editing before submitting, one thing to bear in mind is that you may have friends who will happily give you advice and tell you things you need to do to it and so on, but unless they have reasonable experience in publishing, their advice may not actually be good advice. Above all, don’t worry too much about tiny points of grammar – although those are the things friends often like to pick on first, the truth is that they’re the easiest things for the magazine to fix, and if you focus too much on them it can often be to the detriment of the larger items such as good structure and storytelling. (Also, many of the “grammatical errors” that many people pick on aren’t errors, and many of their “corrections” make everything worse!) On the other hand, if you pay a professional editor to edit it for you before submitting, you may get good results, but it may cost you more than the magazine will pay you. Some magazines, though, if they see you have a good story that just needs a little structural work, may work with you on it. It really depends on the publication and editor.

Good luck!

Words that glitter and splash

I was to have been presenting on this at the ACES conference in Salt Lake City this year, but, for pandemic reasons, that was cancelled. So the nice people of ACES asked me if I would be interested in contributing an article to their website on the topic, with a limit of 3000 words. I was happy to do so… and managed to keep it just under the limit! I’m presenting it here as well. This is a longer read than my usual, but on the other hand it’s much shorter than my master’s thesis.

The stocking-stuffer every writer needs

Last Christmas, I gave you my 12 Gifts for Writers, first as serialized blog posts and then as a PDF ebook (it’s also available as an audiobook if you sponsor me on Patreon). This year, I’ve made a print version of it for all of you who like to hold real paper things in your own hands. And I’ve made a few tiny revisions in it (nothing big, but still…).

It’s 44 glorious pages in trade paperback, and all for the low price of 50¢ per gift – in other words, $6 per book (plus shipping and handling). (You want free gifts? Get the ebook.) Buy it now for the writer in your life. In fact, since everyone’s a writer, buy lots of copies so you can give one to everyone you know who wants to write things that people will buy.

Order it from Lulu.com.

About the serial comma

People have opinions about the serial comma (also called the Oxford comma). Sometimes very strong opinions. So I sat down with my lunch, some Cheerios, and a Martini to tell you the truth.

How to write gleefully

This article was first published on The Editors’ Weekly, the blog of Editors Canada.

There are times when you want to make your prose more lively – if not flagrantly flippant then at least glancingly gleeful. Your words could land with a thump or splash or flit by with a twirl, but they must be sprightly. You want to write like a child. Well, no, not like a child – children aren’t very good writers; their sense of sentence structure is a bit squishy and scrawny – but like a child would write if a child had the skill of an adult. You want to be extra expressive.

Novel medical treatments

To go with my presentation “Translating medicalese into everyday English,” here’s the article that I wrote for The Editors’ Weekly, the blog of Editors Canada.

People with serious health problems are often subject to novel treatments. But that shouldn’t mean being treated like they’re in a novel.

Translating medicalese into everyday English

I’ve spent nearly 20 years of my life helping people communicate healthcare information clearly and effectively to ordinary readers (among other things – I’m not a one-trick pony!). This year at the Editors Canada conference I gave a one-hour presentation sharing some of the important things I’ve learned.

Here’s the handout: harbeck.ca/James/Harbeck_Medicalese_Handout.pdf

And here’s the article I wrote for the Editors Canada blog to go with it: Novel medical treatments

If you work for a company that communicates healthcare information to ordinary people, I can come do a seminar for you with exercises – get in touch with me via jamesharbeck.com/contact/.

Here’s the presentation – all 56 minutes and 23 seconds of it:

When to Use Bad English

Here’s my presentation at the 2019 ACES conference in Providence on when and how to use “bad” English (not just swearwords but nonstandard grammar and other things some people look down on).