
Novelists' Writing
-- Some Preliminary Statistical Comments


     As I write this, only two small groups of writing samples have been analyzed and presented on this web site. The samples from the six novelists are the second group; the first was the ten samples of fourth graders' writing. Statisticians, of course, would scorn such small samples, but then, statisticians don't present all their data -- they simply ask that we accept their conclusions. Because the sample sizes discussed here are small, we cannot come to any conclusions, but there are some things that should be noted.

Words per Main Clause 
and Total Subordinate Clauses per Main Clause

     The first thing to note is that the results obtained here come close to those of Kellogg Hunt, Roy O'Donnell, and Walter Loban. Our ten fourth graders averaged 7.7 words per main clause; Loban's third graders averaged 7.6; his fourth graders, 8.02. O'Donnell's third graders averaged 7.67; Hunt's fourth graders, 8.51. When we take into consideration the differences in definition, we can probably be comfortable that we are in the ballpark in describing fourth graders' writing. The same is true of the novelists. Hunt's average for professional writers was 20.3 words; our six novelists averaged 21.2.
     What is interesting, however, and what Hunt's study may have obscured, is the variety in main clause length among professional writers. Many people have pointed out that differences in mode (narrative, descriptive, expository) affect main clause length. The differences, however, are much more complicated than that. Both Austen (12.3 words/mc) and Twain (7.6 words/mc) present dialogue, but there is still a substantial difference in their main clause length. The passage from Dickens (13.9 words/mc) presents an overview, but the passages from Hawthorne (43.1 words/mc), James (33.2 words/mc) and Tolstoi (17.1 words/mc) suggest a difference in clause length that goes beyond mere differences in mode.
     I make these observations simply to suggest that too much was made of Hunt's original conclusions. It was felt that there was a big "gap" between the writing of twelfth graders (14.4 words/mc) and the "professionals." Sentence combining, as a "writing" activity, was developed to bridge this "gap." But if we look at the statistics of our individual novelists, we have to ask if that gap even exists. (As usual, I am arguing against any instruction that forces students' writing in one direction or another without giving students the ability to make such judgments for themselves.)
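     The arithmetic behind these comparisons is easy to check. What follows is a minimal sketch, not part of the original study: it assumes that the novelists' average is a simple unweighted mean of the six per-passage figures quoted above (an assumption that happens to reproduce the 21.2 reported), and the variable names are purely illustrative.

```python
from statistics import mean

# Words per main clause for each passage, as quoted in the text above.
words_per_mc = {
    "Austen": 12.3,
    "Twain": 7.6,
    "Dickens": 13.9,
    "Hawthorne": 43.1,
    "James": 33.2,
    "Tolstoi": 17.1,
}

# Hunt's figure for twelfth graders, also quoted above.
TWELFTH_GRADE_AVG = 14.4

# Assumption: the group average is an unweighted mean of the six passages.
novelist_avg = round(mean(words_per_mc.values()), 1)

# Which passages fall below the twelfth-grade figure?
below_twelfth = sorted(
    name for name, words in words_per_mc.items() if words < TWELFTH_GRADE_AVG
)

shortest = min(words_per_mc, key=words_per_mc.get)
longest = max(words_per_mc, key=words_per_mc.get)

print(f"Average of the six passages: {novelist_avg} words/mc")
print(f"Shortest: {shortest} ({words_per_mc[shortest]}), "
      f"longest: {longest} ({words_per_mc[longest]})")
print(f"Below the twelfth-grade figure: {', '.join(below_twelfth)}")
```

     On these figures, three of the six passages (Austen, Dickens, and Twain) fall below the twelfth-grade average, which is why the "gap" looks so doubtful once the novelists are considered individually.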
     As with main clauses, our samples come close to Loban's and Hunt's for total subordinate clauses per main clause. Loban's fourth graders wrote 19 subordinate clauses for every 100 main clauses; Hunt's wrote 29; ours wrote 22. Hunt's professionals wrote 74; our six novelists averaged 72. Although our sample size is small, in other words, it gains support from the research of Hunt, O'Donnell, and Loban.

Sentence Openers

     Twice during the last year I have seen people say that they were taught not to begin a sentence with a prepositional phrase. Then, of course, there is the infamous injunction against beginning a sentence with "but." I simply want to note, therefore, that 12% of our novelists' sentences begin with a prepositional phrase, another 3.9% begin with "But," and 3.8% begin with a subordinate clause.
     Since I am using this essay to comment on the fourth graders' writing, we should note that they began 2.7% of their sentences with prepositional phrases, 2.8% with subordinate clauses, and 2.3% with "But." If one spends some time looking at the other differences between the writing of the fourth graders and that of the novelists, I think one will conclude that the fourth graders did a pretty good job in this area, and that raises the question of why so many teachers spend time trying to teach students to vary the openings of their sentences. Perhaps they do so because they don't know how to teach students to analyze the structure of their own sentences (or the system won't let them). Doing so, of course, would give the students the ability to discuss and make their own judgments on such stylistic questions.

Fragments, Comma-Splices, and Run-ons

     The obsession of some English teachers with teaching grammar primarily as a means of avoiding errors raises the question of fragments, etc. Interestingly, 7% of the novelists' main clauses were fragments; 10% (primarily but not exclusively because of Dickens' passage) were preceded by comma-splices. No run-ons were noted. This compares to 5% fragments, 2% comma-splices, and 4% run-ons for the fourth graders. There are, of course, good and bad fragments, comma-splices (and even run-ons), but here I simply want to note that the statistics thus far suggest that perhaps too much is being made of the problem of fragments, etc. This is especially true because most teachers have either no idea, or bad ideas, about how to remediate the supposed problem.

Passive Verbs

     Lately I have even heard of teachers who want to ban the passive voice entirely! I don't know on what authority they want to base this dictum, but if we look at how writers write, thirteen percent of our six novelists' finite verbs were in passive voice.

Embedding Level of Subordinate Clauses

     Although I am looking forward to adding more samples of adults' writing, our six novelists give us a tentative benchmark for looking at the syntactic maturity of students. A primary difference appears in the embedding level of subordinate clauses. For every hundred main clauses, the novelists embedded 20.9 subordinate clauses at level 2 (i.e., within another subordinate clause). This compares to 1.8 for the fourth graders. Perhaps an even more important difference is that the 2nd level embedded clauses of the novelists averaged 13.6 words in length, compared to 3.7 words for the fourth graders. The novelists also embedded 6.1 subordinate clauses at level three (i.e., within a second level embedding) for every 100 main clauses. None of the fourth graders used a level three embedding. This tentatively supports a main argument of the KISS Approach to grammar -- the difference between students' writing and that of accomplished adults is not a matter of types of constructions used, but rather a question of the embedding of constructions within constructions. The current approaches to grammar that simply teach students types of constructions (clauses, phrases, etc.) are useless because what students need is to be able to untangle the more complicated embeddings in their own writing.

Length of Verbal Phrases

     The following table summarizes the number and length of verbal phrases used by the novelists and by fourth graders:

                                  Novelists    4th Graders
Infinitives / 100 main clauses       17.3          7.2
      words / Infinitive              8.5          3.3
Gerunds / 100 main clauses            7.3          1.0
      words / Gerund                  3.2          0.8
Gerundives / 100 main clauses        18.2          1.0
      words / Gerundive              10.6          1.0

The table shows that the fourth graders used all three types of phrases, but it also supports Hunt's claim that the gerundive is a "late-blooming" construction. When (and if) I have the time, I hope to review the fourth graders' passages to show that most of the gerundives in them are actually what O'Donnell called "formulas" (strings of words which children learn as a phrase). 
     This contention is supported by average phrase length. Note that the gerundives of fourth graders average one word in length, compared to 10.6 words for the novelists. Here again, the difference is the embedding of one construction within another. Consider, for example, the following sentence from Hawthorne:
\-\{Before this ugly edifice,} and {between it and the wheel-track} {of the street,} was a grass-plot, much overgrown*GiveR30 {with burdock, pig-weed, apple-peru, and such unsightly vegetation,} [RAJFwhich evidently found something*INFDE03 congenial {in the soil} [RAJFthat had so early borne the black flower {of civilised society,} a prison.#App02]]
"Overgrown," modified by "much," begins a 30-word gerundive phrase that continues to the end of the sentence. (One way of looking at this is to say that if "overgrown" is taken out of the sentence, everything from "much" to the end must also be taken out.) Thus "overgrown" is modified by the prepositional phrase "with ... vegetation," and the objects within that phrase are modified by the "which" clause. The "that" clause, a second level clause embedding, modifies "soil" in the first level "which" clause, and the sentence ends with "prison" as an appositive to "black flower." Such embedding within embeddings is simply beyond the syntactic maturity level of fourth graders.

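     The embedding level of a coded sentence like the one from Hawthorne can be counted mechanically. The sketch below is only an illustration, not the procedure used for this study. It assumes that square brackets mark subordinate clauses and that braces mark prepositional phrases, which is how the coded sample appears to use them; the function name is hypothetical.

```python
def max_clause_depth(coded: str) -> int:
    """Return the deepest nesting of square-bracketed clauses.

    Assumption: "[" opens a subordinate clause and "]" closes it.
    Braces (prepositional phrases) and the inline codes are ignored.
    """
    depth = 0
    deepest = 0
    for ch in coded:
        if ch == "[":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "]":
            depth -= 1
    return deepest


# The coded Hawthorne sentence discussed above.
hawthorne = (
    "{Before this ugly edifice,} and {between it and the wheel-track} "
    "{of the street,} was a grass-plot, much overgrown*GiveR30 "
    "{with burdock, pig-weed, apple-peru, and such unsightly vegetation,} "
    "[RAJFwhich evidently found something*INFDE03 congenial {in the soil} "
    "[RAJFthat had so early borne the black flower {of civilised society,} "
    "a prison.#App02]]"
)

print(max_clause_depth(hawthorne))
```

     For the Hawthorne sentence the function reports a depth of 2, which corresponds to the "that" clause embedded within the "which" clause.
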
***

     The preceding statements are, of course, largely tentative. As time allows, I will be adding more samples to the study.