A new book by Joe Clark about Canadian spelling
The raw data behind Organizing Our Marvellous Neighbours
Here you may download the raw data I produced for the book, wherever it exists in computer-readable format.
- Spellchecker test sentences: About 250 sentences with highlighted words in British, American, and/or Canadian spellings. Not every combination is represented, and a few of the words elsewhere in those sentences are troublesome for some spellcheckers. Run these through your spellcheckers to evaluate false positives and negatives. Still to come: A table explaining the actual and expected outcomes from spellchecking vs. the stated spellings in dictionaries from all three countries. (Added 2008.10.06)
- Test results from those sentences, available as an Excel file only. (Added 2008.10.11)
- Results from literary award-winners (Excel worksheet only). (Added 2008.10.12)
- Full bibliography and sources. (Added 2009.02.13)
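If you run the test sentences through a spellchecker, scoring it comes down to comparing what the checker flagged against what it should have flagged. Here is a minimal sketch of that tally in Python. The names `Tally` and `tally_results`, the `(word, should_be_flagged)` input format, and the four sample words are all assumptions made up for illustration. None of it comes from the actual data files, which have their own layout.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    true_positives: int = 0   # wrong for the target variety, and flagged
    false_positives: int = 0  # acceptable spelling, but flagged anyway
    false_negatives: int = 0  # wrong for the target variety, but not flagged
    true_negatives: int = 0   # acceptable spelling, and not flagged


def tally_results(cases, flagged_by_checker):
    """Compare expected flags with what a spellchecker actually flagged.

    cases: iterable of (word, should_be_flagged) pairs (hypothetical format).
    flagged_by_checker: set of words the spellchecker marked as errors.
    """
    tally = Tally()
    for word, should_be_flagged in cases:
        was_flagged = word in flagged_by_checker
        if was_flagged and should_be_flagged:
            tally.true_positives += 1
        elif was_flagged:
            tally.false_positives += 1
        elif should_be_flagged:
            tally.false_negatives += 1
        else:
            tally.true_negatives += 1
    return tally


# Invented example: a checker set to Canadian English.
cases = [("colour", False), ("color", True), ("organize", False), ("tyre", True)]
flagged = {"color", "organize"}  # what the hypothetical checker marked
result = tally_results(cases, flagged)
print(result)
```

In this invented run the checker wrongly flags "organize" (a false positive) and misses "tyre" (a false negative), so each of the four counters ends up at 1.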
I have the data in a combination of HTML, text, and Excel files, all of which need to be cleaned up to be comprehensible. This will take a while. It was more important to use the data to write the book than to have absolutely everything ready to go all at once on launch day. Nor will everything be published in one gulp. Data will be made available as I update it.
If you’re in an enormous rush, ask for what you want and I’ll send you what I have as-is.
Pages use austere formatting so you can copy and paste more easily.
Special warning to tiresome open-source fanatics
Microsoft Office files are now a published format and I will have no compunctions whatsoever about providing Excel files as raw data. You know perfectly well that open-source software like OpenOffice can read such files. If yours can’t, buy a new computer.
The published data will, moreover, have the same copyright protection as the book itself (copyright © Joe Clark 2008; all rights reserved, and I mean that). Like the book and like everything else I have ever created or will create, the data will not be licensed and are not licensed under Creative Commons. If you legitimately believe that copyright law prevents you from using the data in some useful way, make your case to me and let’s see if we can swing a deal.