On writing a technical book

Win-Vector Blog 2013-12-20

I have been doing a lot of writing lately (the book, clients, blog, status updates, and the occasional tweet). This has made me acutely aware of how different many of these writing tasks tend to be.

My primary writing is technical analyses and solutions. When it is going really well, the pattern is this:

1. A few days of client meetings.
2. A week of online and pencil-and-paper research.
3. A couple of days to write and revise a 5 to 10 page custom paper in LaTeX.
4. Present the work to the client, with an opportunity to explain what is unclear.

For blog posts:

1. Keep folders of ideas.
2. Let a new idea cut ahead of the others and start work.
3. Around 10 pages get written in a manic burst.
4. Print and mark up for readability.
5. Post to the blog.
6. Add a few post-publication ideas, revisions, and edits.

And I wrote a thesis of over 100 pages, so a book should be easy.

What I didn’t anticipate is that book chapters are very different beasts. When I write a chapter or a portion of a chapter for our book, I must at least:

1. Coordinate with the book’s first author (Dr. Nina Zumel).
2. Work out what can be done given what the previous chapters have in fact covered, what we expected to teach in this chapter, and what future chapters (maintained in an outline) need from this chapter.
3. Set a schedule of drafts and revisions with our managing editor.
4. Write the first draft of the chapter: about 30 to 40 pages, usually with over 10 diagrams, taking around 3 to 4 weeks. This step itself includes many rounds of printing, editing, and correction.
5. Submit the chapter to the production editor. The editor easily puts in 4 days of work critiquing the chapter, often producing 10 pages of substantial written critiques. In addition to keeping the plan for the book in mind, the editor alternates between pretending to know less about the material than the authors (simulating readers of various levels of expertise and interest) and pretending to know more than the authors (simulating reviewers and experts).
6. Revise the chapter to address the editor’s concerns and re-submit to the editor (usually about 3 days).
7. Once the chapter is stable, it gets included in one of the “book to date” external reviews. There are 3 such reviews, and each one involves about 15 volunteers reading and submitting written comments on the material.
8. Revise to respond to external reviews (often as short as a day per chapter).
9. A professional copy-editor then works through the chapter. In addition to correcting errors, the copy-editor makes sure the book maintains a consistent style. The copy-editor’s work easily takes 3 or more person-days.
10. Merge copy-editor changes (thankfully using a change management system) and make more corrections (just under 1 day).
11. Submit the book to a final technical reviewer to ensure claims make sense and example code works as advertised.
12. The book chapter then goes into pre-production, with more professional control of formatting and figures.
13. Various checks of galleys and production issues.

This is ignoring the work we put into designing our lessons, researching solutions, curating data examples, working example analyses, and maintaining additional publication tools. It also ignores that all the steps above happened after we thought we had already decided what book we were going to write. A recent example: for just two small sections in one of our appendices I ended up needing to write a blog article (so it would be available as a reference) and generate a new synthetic dataset. And of the two authors, I am the lazy one. The first author works even harder and has planned, designed, and written at least half of the material.

Counting appendices, our book has 13 substantial chapters that have to go through this whole process. (Note: a heads up to our Manning Early Access Edition subscribers: preview chapters have only been through steps 1 through 6, though you will of course get access to the finished work when available.)

A large part of the additional work in producing a good book is getting the overall quality far above what is possible in informal articles. This makes things harder on the author and easier on the reader. As the projected size of the audience goes up, it (unfortunately) makes more and more ethical sense to move suffering from the reader (i.e. make reading easier by having better structure and fewer mistakes) and onto the authors (insist on better structure, fewer mistakes, many more revisions, and much more criticism). Something that will save thousands of readers 5 seconds of confusion represents hours of wasted human potential, so it is worth fixing even if it takes the author a few hours of their life. We can’t and don’t catch everything, but we always think in terms of trade-offs that heavily favor readers over the authors and editors.
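The arithmetic behind that trade-off can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the reader count and author hours in the usage lines are made-up assumptions, not figures from the book.

```python
def reader_hours_saved(readers: int, seconds_saved_each: float) -> float:
    """Total reader time saved by a fix, in hours (3600 seconds per hour)."""
    return readers * seconds_saved_each / 3600.0


def fix_is_worth_it(readers: int, seconds_saved_each: float, author_hours: float) -> bool:
    """True when the total reader time saved exceeds the author time spent."""
    return reader_hours_saved(readers, seconds_saved_each) > author_hours


# Hypothetical numbers: 3,000 readers each spared 5 seconds of confusion,
# weighed against 3 hours of author effort to make the fix.
saved = reader_hours_saved(3000, 5)
print(f"reader hours saved: {saved:.2f}")
print("worth fixing:", fix_is_worth_it(3000, 5, 3.0))
```

With a small audience the same comparison goes the other way, which is why the effort that makes sense for a book would be excessive for an informal article.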

Another large part of producing a good book is widening the audience. Can the book be read by an expert (is it correct enough and interesting enough to be of benefit to them)? Can the book be read by a non-expert (does it simplify and establish enough to be understood)? Are the examples enriching enough to pay back the time a reader may invest in working through them (are we not wasting the time of dedicated readers)? Can the book be simply read or even skimmed (are we actually explaining things, or holding the reader hostage to our examples)? Each piece of writing in the book must shine for at least one audience and not be a hindrance to any of the other audiences (and a lot is deleted when it fails this test).

The effort and process have been very rewarding and are resulting in a fantastic book in Practical Data Science with R. The book is wrapping up nicely (all but one appendix complete and over half the material already past detailed copy-editing) and I can’t wait to have an actual print copy in my hands.