Nerd-Author Fun

I’ve spent a few days goofing off from writing. Well, kind of… it was writing-related.

I wrote a Java program that can load and process my novel. Having done that loading work, I’ll be able to add useful tools in the future, but for now I’ve just done some basic word frequency analysis. Sounds like some nerd fun? It was.

First, technical stuff and then some results:

Technical stuff

Loading it into the program turned out to be more difficult than I expected. Part of the difficulty was how I defined things on the page. When I was younger I’d have told you that anywhere there is a gap between blocks of text, that is a paragraph. In my mind, at least, the concept of a paragraph is stretched out of shape by the frequent carriage returns of dialogue.

Is this a paragraph? Two? Three? I’m so confused…

I’m sure there’s probably a technical term (which I’m happy to be told), but I didn’t want to research it. So, I solved the problem like any fiction author: I just made words up.

Henceforth, for all time until I find a better name, they shall be known as minor blocks (green) and major blocks (blue). The term paragraph may now be discontinued.


(I suspect I’m already in the process of changing my mind…)
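For the curious, here is a rough Java sketch of how text could be split into the two kinds of block. It is not the code from my program. Since the colour-coded picture can't be reproduced in code, it rests on a loudly stated assumption: a major block is text separated from its neighbours by a blank line, and a minor block is a single line within it. The names BlockSplitter, majorBlocks and minorBlocks are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class and method names are invented for illustration.
class BlockSplitter {

    // ASSUMPTION: a major block is text separated from its neighbours
    // by one or more blank lines.
    static List<String> majorBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        for (String chunk : text.trim().split("\\R\\s*\\R")) {
            if (!chunk.trim().isEmpty()) {
                blocks.add(chunk);
            }
        }
        return blocks;
    }

    // ASSUMPTION: a minor block is one non-empty line inside a major block.
    static List<String> minorBlocks(String majorBlock) {
        List<String> lines = new ArrayList<>();
        for (String line : majorBlock.split("\\R")) {
            if (!line.trim().isEmpty()) {
                lines.add(line.trim());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "\"Is this a paragraph?\"\n\"Two? Three?\"\n\nI'm so confused.";
        for (String major : majorBlocks(sample)) {
            System.out.println("Major block with "
                    + minorBlocks(major).size() + " minor block(s)");
        }
    }
}
```

If the definitions change (and they might), only the two split patterns should need to move.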


Before you peruse the results, you might wonder what possible good a function like this might be. (Admittedly, at the moment there is too much information.) The tool could be used in the following ways:

  1. There are some words that are so peculiar or powerful that they should only be used once in a story. This tool will help locate those words. For example: gruesome (0), or horror (4). Wow, there’s a lot of cry (10) / crying (5) going on. I really need to check that… Point proven.
  2. There are also some words that mean nothing and should be replaced with more descriptive terms, like interesting (3).
  3. It could help expose word-use problems. For example, when my characters want to swear they say “frak”. If I find a “frack” or a “fak” then I know I’ve made a mistake.
  4. Nerdy pleasure (hey, it’s valid for me)
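Uses 1 to 3 could be sketched roughly as below. This is an illustration rather than my actual code: the class WordChecks and its methods are invented names, and the counts for frak, frack and fak are made-up sample data (the others are the numbers quoted above). It looks up a watch list of words and flags anything one typo away from an intended word, using plain Levenshtein edit distance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; class and method names are invented for illustration.
class WordChecks {

    // Classic dynamic-programming Levenshtein distance between two words.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Words exactly one edit away from the intended word (likely typos).
    static List<String> nearMisses(Map<String, Integer> counts, String intended) {
        List<String> suspects = new ArrayList<>();
        for (String word : counts.keySet()) {
            if (editDistance(word, intended) == 1) {
                suspects.add(word);
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("horror", 4);
        counts.put("cry", 10);
        counts.put("crying", 5);
        counts.put("interesting", 3);
        counts.put("frak", 12);  // invented sample count
        counts.put("frack", 1);  // invented sample count
        counts.put("fak", 1);    // invented sample count

        // Watch list: words that should appear rarely, if at all.
        for (String word : new String[] {"gruesome", "horror", "interesting"}) {
            System.out.println(word + " (" + counts.getOrDefault(word, 0) + ")");
        }
        System.out.println("Possible typos of frak: " + nearMisses(counts, "frak"));
    }
}
```

A check like this will also flag legitimate words that happen to be one letter away (freak, for instance), so the output is a list of suspects to eyeball rather than a verdict.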

When considering these results please note the following caveats:

  • Not all bugs have been ironed out; give me a 5% margin for error.
  • Contractions are counted as words in their own right (so “don’t” and “do not” are tallied separately)
  • There are no exclusions yet (“a”, “is”, etc. are included)
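To make the last two caveats concrete, here is a minimal sketch of a word counter that keeps contractions whole and accepts an optional exclusion list. It is not my real code: WordCounter and count are invented names, and lower-casing every word before counting is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; class and method names are invented for illustration.
class WordCounter {

    // Letters, optionally joined by a straight or curly apostrophe,
    // so a contraction such as "don't" stays a single token.
    private static final Pattern WORD =
            Pattern.compile("\\p{L}+(?:['\\u2019]\\p{L}+)*");

    // Counts each word, skipping anything in the exclusion set.
    // ASSUMPTION: words are lower-cased before counting.
    static Map<String, Integer> count(String text, Set<String> exclusions) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = WORD.matcher(text.toLowerCase(Locale.ROOT));
        while (matcher.find()) {
            String word = matcher.group();
            if (!exclusions.contains(word)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "Don't go. I do not want a fight, and the Regent is a fool.";

        // No exclusions: every word is counted, as in the results here.
        System.out.println(count(sample, Collections.<String>emptySet()));

        // With exclusions: common filler words are dropped.
        Set<String> exclusions = new HashSet<>(Arrays.asList("a", "is", "the", "and"));
        System.out.println(count(sample, exclusions));
    }
}
```

Passing an empty set reproduces the "no exclusions yet" behaviour; a real exclusion list would simply be a longer set.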

For a novel slightly over 86K words, I was surprised by the results.

  • 8,443 unique words
  • The top 10 most frequent words account for 18,624 words (the, to, and, a, of, he, you, was, his, I).
  • Most frequent words per first letter: Unsurprisingly mostly character names. (A = and; B = be; C = could; D = Danyel; E = even; F = for; G = get; H = he; I = I; J = Jessica; K = Keeshar; L = like; M = Menas; N = not; O = of; P = people; Q = Queen; R = Regent; S = said; T = the; U = up; V = very; W = was; X = Xu; Y = you; Z = Zekkari).
  • Everything above 15 characters long was a processing error 🙂

[Chart: Words starting with letter]

[Chart: Length of words]
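Once the counts exist, summaries like the ones above are only a few lines each. The sketch below is an illustration, not my program: FrequencyStats and its methods are invented names, and the counts in main are made-up sample data, not figures from the novel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; class and method names are invented for illustration.
class FrequencyStats {

    // The n most frequent words, highest count first; ties broken alphabetically.
    static List<String> topWords(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        int limit = Math.max(0, Math.min(n, entries.size()));
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries.subList(0, limit)) {
            top.add(entry.getKey());
        }
        return top;
    }

    // The most frequent word for each starting letter.
    static Map<Character, String> topPerLetter(Map<String, Integer> counts) {
        Map<Character, String> best = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getKey().isEmpty()) {
                continue;
            }
            char letter = Character.toUpperCase(entry.getKey().charAt(0));
            String current = best.get(letter);
            if (current == null || entry.getValue() > counts.get(current)) {
                best.put(letter, entry.getKey());
            }
        }
        return best;
    }

    // Words longer than maxLength: candidates for processing errors.
    static List<String> longerThan(Map<String, Integer> counts, int maxLength) {
        List<String> longWords = new ArrayList<>();
        for (String word : counts.keySet()) {
            if (word.length() > maxLength) {
                longWords.add(word);
            }
        }
        return longWords;
    }

    public static void main(String[] args) {
        // Invented sample counts, not the novel's real numbers.
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("the", 50);
        counts.put("to", 30);
        counts.put("he", 20);
        counts.put("his", 10);
        counts.put("Danyel", 8);
        counts.put("said.\"Nobody", 1); // too short to be flagged below
        counts.put("unpunctuatedrunonwords", 1);

        System.out.println("Top 3: " + topWords(counts, 3));
        System.out.println("Per letter: " + topPerLetter(counts));
        System.out.println("Over 15 characters: " + longerThan(counts, 15));
    }
}
```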


Nobody talked me down…

I’ve had a mini break from writing (dangerous, I know). But it’s been a time of enjoyment and productivity (albeit in other areas), so I don’t regret it.

Firstly, since no one talked me down, I’ve been doing some coding in Java. It’s not a writing program yet, but the framework to support one (at about 75% completion, to pull a number from the sky). And while I’m making up numbers, let’s also say it’s a thousand percent under budget. (Speaking of budgets – the Australian budget is out tonight and here’s an excellent article on the immorality of spending the next generation’s money). But I digress…

For my framework I’ve gone with what’s called an internal frame application because it allows maximum flexibility to the user. You can stretch the application over multiple monitors and position and size any number of internal windows to your preferences.

[Image: Writing Framework 1]

Each window can then have any number of panels added to it. (For example a writing panel, a character attributes panel, a todo panel…)
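For anyone wondering what that looks like in code, here is a minimal sketch. It assumes Swing's JDesktopPane and JInternalFrame, which is the usual way to build this kind of application in Java; the class WritingDesktop, its methods and the window contents are all invented for the example and are not my actual framework.

```java
import java.awt.BorderLayout;
import java.awt.GraphicsEnvironment;
import java.awt.GridLayout;
import javax.swing.BorderFactory;
import javax.swing.JComponent;
import javax.swing.JDesktopPane;
import javax.swing.JFrame;
import javax.swing.JInternalFrame;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;
import javax.swing.WindowConstants;

// Hypothetical sketch; class and method names are invented for illustration.
class WritingDesktop {

    // An internal window holding any number of panels, stacked vertically.
    static JInternalFrame createWindow(String title, JComponent... panels) {
        // Flags: resizable, closable, maximizable, iconifiable.
        JInternalFrame window = new JInternalFrame(title, true, true, true, true);
        JPanel content = new JPanel(new GridLayout(0, 1));
        for (JComponent panel : panels) {
            content.add(panel);
        }
        window.setContentPane(content);
        return window;
    }

    // A panel with a titled border around a scrollable text area.
    static JComponent textPanel(String title) {
        JPanel panel = new JPanel(new BorderLayout());
        panel.setBorder(BorderFactory.createTitledBorder(title));
        panel.add(new JScrollPane(new JTextArea()), BorderLayout.CENTER);
        return panel;
    }

    // The desktop area that the internal windows live in.
    static JDesktopPane buildDesktop() {
        JDesktopPane desktop = new JDesktopPane();

        JInternalFrame writing = createWindow("Writing", textPanel("Chapter"));
        writing.setBounds(20, 20, 400, 320);

        JInternalFrame notes = createWindow("Notes",
                textPanel("Character attributes"), textPanel("Todo"));
        notes.setBounds(440, 20, 280, 320);

        desktop.add(writing);
        desktop.add(notes);
        writing.setVisible(true);
        notes.setVisible(true);
        return desktop;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the window.");
            return;
        }
        SwingUtilities.invokeLater(() -> {
            JFrame main = new JFrame("Writing framework sketch");
            main.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
            main.setContentPane(buildDesktop());
            main.setSize(800, 450);
            main.setVisible(true);
        });
    }
}
```

Each internal window can be dragged, resized, maximised or minimised inside the main frame, and adding another panel to a window is one more argument to createWindow.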

On other matters, I’ve also been enjoying more time in the kitchen, having fun preparing a few more meals. (This gives me enjoyment and my beautiful wife a break: wins all round).

But now that I have some feedback from my beta readers, it’s time to get back to writing and Vengeance Will Come. In my next few posts I plan to write about how I work through those beta reader comments.

Talk me Off the Ledge

Other than writing, my other hobby is Java programming. I’ve dabbled a bit in the last year or so but have, for the sake of my writing, managed to suppress the desire to do it almost entirely.

Writing is very time intensive. Programming is very time intensive. Doing one means cannibalizing the other.

And yet part of me really wants to develop a writing program. It would be customised to exactly how I want it, with the features I want (hypothetically). In full disclosure, I’ve started this project multiple times, and generally abandoned it due to how slow progress is.

I apologise for the poor-res pictures, but the originals aren’t close at hand.


On the other hand, developing it would take hundreds of hours, and realistically, it would probably only ever be a B-grade application. I’m a competent programmer, but my knowledge, if a blade, is a bit rusty.

Lately though, I’ve been getting distracted from writing by other things – so would a distraction like this be a bad thing? Probably…