Counting Words

counting III (cc) — Martin Fisch via Creative Commons

Word counts: a phrase that strikes utter apathy in the hearts of people everywhere. Well, most people. If you’re a writer or editor, you probably care (at least a little) about word counts. They are a rough measure of the size of a piece of writing, and in shorter works (journal articles, short fiction, etc.) they can be a measure of effort for use in paying writers. Typically book-length work is paid based on unit sales and/or other complicated algorithms, so it matters less how many words something is once it reaches that scope. Now, determining what lengths qualify as “novel” versus, say, “novella” is a whole other discussion, but let’s focus on the fact that word counts are used to determine relative size and value for works that tend to be collected or anthologized.

General Problems

Obviously the main problem with using length to determine value is that it doesn’t reflect quality or amount of effort. If I can slap together a 5,000-word story with no research and a couple of revisions in the span of a workday (which is not outside the realm of possibility) and sell it at five cents per word, that’s $250 for a day’s work. Broken down over eight hours, it’s $31.25 per hour. Respectable. On the other hand, if it takes me the equivalent of two weeks’ worth of time to produce that same story (whether through many more revision cycles, or research, or just the common laborious nature of writing), so that I spend a pay period’s worth of time (80 hours) on getting that $250 paycheck, suddenly my pay rate becomes roughly $3.13 per hour, which is well below minimum wage.
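The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `hourly_rate` is made up for this example, and the 8-hour and 80-hour figures assume a standard workday and a two-week pay period. It uses `Decimal` so that 3.125 rounds up to 3.13 the way it would on paper.

```python
from decimal import Decimal, ROUND_HALF_UP

def hourly_rate(word_count, rate_per_word, hours):
    """Effective hourly pay for a piece sold at a flat per-word rate.

    rate_per_word is passed as a string (e.g. "0.05") to keep the
    decimal arithmetic exact.
    """
    total = Decimal(word_count) * Decimal(rate_per_word)
    return (total / Decimal(hours)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )

# Same 5,000-word story at five cents per word, two different time costs.
one_workday = hourly_rate(5000, "0.05", 8)   # assumed 8-hour day
two_weeks = hourly_rate(5000, "0.05", 80)    # assumed 80-hour pay period

print(f"One workday: ${one_workday} per hour")
print(f"Two weeks:   ${two_weeks} per hour")
```

The sale price never changes between the two calls; only the hours do, which is the whole problem with paying by length.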

Of course, this is art. By definition, its worth is difficult to quantify. The assumption is that editorial evaluation works to measure inherent value independent of expended effort. Unlike, say, the work I do for my day job where time spent is sometimes as important as the final result (often because the results are binary “works/doesn’t work” propositions), the only way to even begin to gauge what one short story is worth versus another is by how many resources are required to include it. If a writer could generate lengthy, sales-ready prose with the same rapidity that I can, say, restore a server to life after it fails, you might find someone who could earn as much at writing short fiction as I earn as an engineer. There’s a specific reason why I maintain a day job.

But it’s also true that the number of words in a finished draft has zero bearing on the number of words expended to reach that draft. One of my short stories, for example, has consistently hovered between 2,100 and 3,500 words. But it’s also been rewritten from scratch at least twice and revised significantly five additional times. If I had to guess, I’d say the number of words written on it to date is about 20,000. It’s the nature of the game, but the point is that only convention leads writers to accept that their work’s value is tied to its finished length. After all, graphic art is also difficult to quantify, but it isn’t priced based exclusively on the size of the picture. Usually the artist sets the price based on materials plus time to complete it. Granted, paintings and such are more immutable than written works (editors can still shape the final product), but the point is that length as a factor in determining a piece’s value is, at best, peculiar and, at worst, significantly devalues an author’s effort.

Technical Problems

Let’s step aside for a moment and discuss the technical limitations of word counts. Because if we’re going to use the number of words as a measure to weigh the value of a piece of writing, we have to determine how that number is evaluated. And the disappointing truth is, it’s not as cut and dried as it seems.

I mean, on the surface it should be easy to figure out how many words something is. It’s tedious, but any human can count words with a high degree of accuracy. The trick is, no human wants to sit there and count thousands of words. So we rely on machines to do it. But when it comes to counting words, there is a context. For example, I use Unix command lines a lot, so I tend to default to a utility called “wc” which can scan a file and tell me how many words it has in it. But wc counts runs of characters separated by whitespace (spaces, tabs, new lines), so hyphenated words always count as one word instead of two. Also, asides using em dashes—like this one—are commonly typed with no space around the punctuation. That means even if you want to argue that hyphenated words should count as a single word, the word before each dash and the word after it get fused into a single entity.
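To make the whitespace behavior concrete, here is a small Python sketch. It is an approximation, not a reimplementation of wc: `str.split()` breaks on any Unicode whitespace, which may differ from wc in edge cases. The function names and the sample sentence are made up for illustration.

```python
import re

def count_whitespace(text):
    """Count whitespace-delimited tokens, roughly what `wc -w` reports."""
    return len(text.split())

def count_splitting_dashes(text):
    """Also treat hyphens, en dashes, and em dashes as word separators."""
    tokens = re.split(r"[\s\-\u2013\u2014]+", text)
    return len([t for t in tokens if t])  # drop any empty leftovers

sample = "Asides using em dashes—like this one—are a well-known trap."

print(count_whitespace(sample))        # 9
print(count_splitting_dashes(sample))  # 12
```

Neither answer is obviously "right": the second counter fixes the em dash problem but now insists that "well-known" is two words, which is exactly the kind of decision every word counter has to make.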

There are lots of other examples of programming decisions determining how many words a word counting program reports. To illustrate this, I checked a handful of places for word count results. Basically I used the top results from a Google search of “word counter” plus the wc utility, Word, and my own personal word processor of choice, Scrivener. The test string I used was this:

This is test 10 to—attempt to, anyway—determine to-pay amount…for “accuracy” (or ‘accuracy’) on ironsoap.com.

Here are the results:

Method	Result
wc	14
http://www.wordcountertool.com/	14
http://www.wordcounttool.com/	17
https://wordcounter.net/	14
http://wordcounttools.com/	16
MS Word	16
Scrivener	18

A manual count of the words comes up with 18. Only Scrivener matched my eyeball count. Now, a spread of 4 words out of 18 may not seem like a lot, especially since the test sentence was designed to trip up these algorithms. But that’s also an incredibly small sample size, and it illustrates that the margin of error can be as much as 22%.
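For the curious, here is one set of counting rules that happens to reproduce the manual count of 18 on the test string. It is a sketch under assumed rules (letters and digits form words, and an internal period or apostrophe keeps a word together). It is not a claim about how Scrivener or any of the other tools actually count.

```python
import re

TEST = ("This is test 10 to—attempt to, anyway—determine to-pay "
        "amount…for “accuracy” (or ‘accuracy’) on ironsoap.com.")

# Assumed rule: runs of letters/digits, optionally joined by an internal
# period or apostrophe, so "ironsoap.com" stays whole while "to-pay",
# em dash asides, and "amount…for" are split apart.
WORD = re.compile(r"[^\W_]+(?:[.'’][^\W_]+)*")

def count_words(text):
    return len(WORD.findall(text))

# Counts reported in the table above.
reported = {
    "wc": 14, "wordcountertool.com": 14, "wordcounttool.com": 17,
    "wordcounter.net": 14, "wordcounttools.com": 16,
    "MS Word": 16, "Scrivener": 18,
}
manual = 18
spread = max(reported.values()) - min(reported.values())

print(len(TEST.split()))          # whitespace-only count: 14
print(count_words(TEST))          # rule-based count: 18
print(f"{spread / manual:.1%}")   # spread relative to manual count: 22.2%
```

Changing a single character in the `WORD` pattern (say, allowing hyphens to join words) shifts the total, which is the point: the "count" is a product of the rules, not a property of the text.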

I tried again with a different text:

Method	Result
wc	5898
http://www.wordcountertool.com/	5898
http://www.wordcounttool.com/	6110
https://wordcounter.net/	5898
http://wordcounttools.com/	5912
MS Word	5969
Scrivener	5991

This one is probably more representative of a real text, and the margin of error is much more tolerable here, about 3.5%. But if you convert it to monetary terms, the difference between the high count (6,110) and the low (5,898) at five cents per word is the difference between $305.50 and $294.90. Now, sure, ten and a half bucks isn’t going to make a huge difference one way or the other as a variance on a flat rate. But if the story took eight hours to write, the hourly rate is $38.19 per hour versus $36.86 per hour. That’s the same difference of roughly 3.5%. If you contextualized that as a salary, it would be analogous to a pretty decent raise! Only the raise in this case has nothing to do with performance (remember, this is the exact same text) and everything to do with how the words are counted.
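The payment spread can be tabulated for every counter in the second test. This is a minimal sketch assuming the same five cents per word and eight hours of writing used above; the helper names are invented for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_WORD = Decimal("0.05")  # five cents per word
HOURS = 8                        # assumed time spent writing

# Word counts reported for the longer text in the table above.
counts = {
    "wc": 5898, "wordcountertool.com": 5898, "wordcounttool.com": 6110,
    "wordcounter.net": 5898, "wordcounttools.com": 5912,
    "MS Word": 5969, "Scrivener": 5991,
}

def to_cents(amount):
    """Round a Decimal dollar amount to the nearest cent."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

def payment(words):
    return to_cents(words * RATE_PER_WORD)

def hourly(words):
    return to_cents(words * RATE_PER_WORD / HOURS)

for name, words in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{name:22} {words:5}  ${payment(words):>7}  ${hourly(words)}/hr")

high, low = max(counts.values()), min(counts.values())
print(f"Gap between high and low: ${payment(high) - payment(low)}")
```

Same manuscript, same rate, seven tools: the only variable in the loop is which counter the editor happened to open.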

A Thought Exercise

So I started accepting submissions for 200 CCs. This has placed me squarely in the editor-in-chief’s chair, and suddenly I’m on the other side of this issue as well. As a writer, I admit, I hadn’t done much thinking about the variances between word counts. I’m ecstatic any time anyone wants to give me any amount for my writing, so I’d basically written off the difference as a rounding margin and determined to be happy with whatever amount I earned from writing sales. But as an editor it does matter to me, because not only do I want to be fair to each individual writer and to the contributing authors as a whole, but I also want to pay as much as I can possibly afford to each contributor.

This is not entirely altruistic of me. It also benefits me to pay as much as I can. For one thing, the more I pay the better the quality of writing I can feature. I’ve already gotten a lot of writing that is frankly well above what I’m able to afford. I’m getting these babies at bargain prices. But unlike consumer shopping, that doesn’t make me happy, that makes me uncomfortable. I want the 200 CC writers to receive fair compensation for their work. Being able to pay more would also mean better word-of-mouth among writers’ groups, and that means more (and better) submissions to choose from.

At the very least I want to pay the correct amount, and that means using the word counting algorithm that counts closest to the truth. But it’s even more complicated than that, because my particular endeavor calls for a specific range of words: at or near 200. I grant a little leeway on either side, but what happens when someone’s Word document counts 190 words but I count it at only 175? According to my guidelines, that’s too short. I’m not a stickler and I have a minimum pay amount that covers both of those, but the point remains: word count is a weird yardstick, and that’s even before you realize it’s very imprecise as well.

But what else are we going to do?

Admittedly, I haven’t thought through the ramifications of all this, but follow along with me for a minute. What if we, as writers, set prices that we sent along with our submissions? What if we acted like graphic artists and took into account our professional level, our raw materials, and our time, and offered each submission with a price we would be happy to receive for acceptance? For example, I clearly cannot command full professional rates at this point (last year’s ongoing streak of 75+ consecutive rejections from pro-level markets confirms this), but what if I offered my fictional 5,000-word story from above, the one that took me eight hours to write, at $200? That would be $25 per hour. It also works out to $0.04/word, which is a mid-to-upper-level semi-pro rate, and that’s probably fair for what I can readily command at this point. One story I have that’s probably pretty close to that, both in terms of word count and time spent, actually feels pretty good at about that level.

negotiation — Georgie Pauwels via Creative Commons

Now, this might be a pretty big shift in the way not only writers’ works but also publishing markets are perceived. Maybe it means markets are no longer “pro” or “semi-pro” but the writers themselves are. Markets are free to publish stories that are both within their editorial criteria and affordable to their budgets. Unknown writers might sell to prestigious publications primarily based on their affordability. Big name authors might be able to lend name credence to upstart publications by lowering rates. Heavily researched articles might command higher rates due to their labor-intensive nature. Easy-to-write novellas might find more publication opportunities because they aren’t (necessarily) worth more than agonized-over flash pieces.

Maybe it wouldn’t make a difference. Maybe introducing bidding-war behaviors into the submissions process would cause a decrease in the average selling price of works as authors lowered their prices in an attempt to be more attractive to potential publishers. Maybe editors don’t want the hassle of trying to make value judgements based on proposed cost or having to enter into negotiations on price (as well as rights terms, potentially).

Whether this idea is bad or good, what I don’t care for in any case is the shrugging assumption that the long-standing practice of gauging written work’s value by its length is the best or only way to do it.

ironSoap.com

The Writing of Paul A. Hamilton
