What Counts, How, and So What?
Main Clauses
Last Revised 5/18/99
If there is interest in it, I
will go back and pull out more of the relevant information on defining
main clauses from the studies by Hunt, O'Donnell, Loban, Mellon, O'Hare,
and Bateman/Zidonis. The picture, especially that presented in the latter
three studies, won't be pretty. I consider my own definition of a main
clause to be closest to Hunt's "T-Unit." In essence, his study proved that
counting words per main clause is the most effective, basic way to measure
syntactic maturity. He offered, however, no reason for this, and no way of
explaining the various errors that he must have found in students' writing.
O'Hare's study, the one most
widely acclaimed as proving that the study of grammar is useless, is particularly
questionable. O'Hare writes:
This study was interested in
the students' writing ability and not at all in their spelling, punctuation,
or handwriting talents. In order to eliminate the possible effects of these
extraneous factors on the evaluators' judgments, the thirty pairs of compositions
were typewritten so that spelling and punctuation could be corrected.
The corrections were made by a secretary at the University School. While
fully aware that discourse can be punctuated in different ways that could
possibly affect meaning, this researcher was satisfied that no bias was
introduced because all the punctuation and spelling changes were made by
one person who was never aware of the group to which a particular composition
belonged. (Sentence Combining, 51-52, emphasis added.)
In itself, this passage invalidates O'Hare's entire study.
How can he possibly claim to be improving students' writing when most of the
significant errors in that writing are eliminated from consideration?
If you are not familiar with O'Hare's study, his entire concept of "improvement"
is based on longer main clauses -- the more words per main clause, the
better the writing!
O'Hare's "corrections" raise
still other questions. Since the secretary corrected all the writing, the
researchers were faced with no fragments, no comma-splices, no run-ons.
O'Hare, therefore, did not have to face the question of how fragments affect
the count. (See below). More importantly, the primary difference between
the control and experimental groups in O'Hare's study is that the experimental
group "was exposed to the sentence-combining practice" (35). If we look
at this from the students' perspective, the students in the experimental
group were asked to combine sentences to make them longer; the control
group was not. Surely the experimental group got the message that longer
is better, a message to which they responded in their writing. But as early
as 1965, Hunt wrote: "As more nonclausal structures are packed into a clause
the likelihood of stylistic faults occurring increases apace. The greater
the congestion the greater the hazard" (152). O'Hare eliminated the real
problem he faced by simply having most of the errors corrected before the
passages were analysed!
The psychological model underlying the KISS approach provides, I believe,
a much better set of
reasons for what counts. No corrections were made to students' writing.
Fragments, comma-splices, and run-ons were marked, as were errors in subject/verb
agreement. The basic idea of the model is that the reader's (and writer's)
brain chunks words together in short-term memory. Every word (except interjections)
is chunked to another word or construction until everything is eventually
chunked to a main subject / verb / complement. At the end of a main S /
V / C pattern, the content of short-term memory is dumped to long-term,
and STM is cleared for the next sentence.
If the preceding hypothesis
is correct, it explains some of students' major errors. For one thing,
as I will try to show in the section on errors, comma-splices and run-ons
result from writers sensing a connection between two main clauses, but
not understanding how to punctuate it. Thus, in their own processing, they
dump to LTM, but they signal this dump with a comma. Or, not knowing what
to do, they simply leave out all punctuation altogether and end up with
a run-on. To explore this idea, in the analyzed texts comma-splices and
run-ons have been counted as separate main clauses.
In the analyzed texts, the beginning of a main clause is marked by \-\;
a comma-splice, by \,\; a run-on, by \R\; and a fragment, by \F\.
To distinguish main clause length from sentence length, the beginning of
a main clause that functions as a compound is marked by \C\.
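As a rough illustration of how marks like these could be turned into counts, here is a minimal Python sketch. It assumes that the marks appear literally in the text exactly as listed above and that each one sits at the start of its main clause. The sample passage, the function names, and the whitespace-based word count are all invented for the example; they are not the procedure used in these studies.

```python
import re

# Assumption: each mark (\-\ \,\ \R\ \F\ \C\) appears literally in the text,
# immediately before the main clause it introduces.
MARK = re.compile(r"\\([-,RFC])\\")

def clause_lengths(marked_text):
    """Return a (mark, word_count) pair for every marked main clause."""
    parts = MARK.split(marked_text)
    # parts = [text before first mark, mark 1, clause 1, mark 2, clause 2, ...]
    marks = parts[1::2]
    clauses = parts[2::2]
    # Words are counted naively as whitespace-separated tokens.
    return [(m, len(c.split())) for m, c in zip(marks, clauses)]

def words_per_main_clause(marked_text):
    """Mean words per main clause; comma-splices, run-ons, and fragments
    each count as a separate main clause."""
    counts = [n for _, n in clause_lengths(marked_text)]
    return sum(counts) / len(counts)

# Hypothetical marked passage, made up for this example.
sample = r"\-\The dog barked, \,\the cat ran away. \F\Because it was scared."

print(clause_lengths(sample))
print(words_per_main_clause(sample))
```

Because every mark opens a new segment, the comma-splice and the fragment in the sample are each counted as main clauses of their own.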
As Hunt, O'Donnell, and Loban
showed, main-clause length increases naturally with age. The older, or
more experienced, we become, the more words we, as both readers AND WRITERS,
can juggle in STM. Most fragments occur, I would suggest, because the complexity
of the ideas in the writer's head exceeds that writer's ability to juggle
words and constructions in STM. The result is that the writer gets part
of the main clause on paper, becomes confused, sticks in a period and capital
letter, and then writes the rest of the main clause as a separate sentence.
Many teachers will recognize the probable validity of this hypothesis simply
because the advice often given to students to fix fragments is -- "Combine
them with the preceding or following sentence."
But if this hypothesis is correct,
it has implications for what should be counted. If words per main clause
is a basic measure of syntactic maturity, and if fragments result from
the writer's exceeding that maturity, then fragments should be counted
as separate main clauses. And that is what I have done in these studies.
Words per Main Clause
Introduction
Conducted in the 1960's and 70's,
the studies of Loban, Hunt, and O'Donnell demonstrated that a writer's
average number of words per main clause naturally increases with age. The
following table is a compilation of their studies:
Average Number of Words per Main Clause by Grade Level
Grade Level          | Loban's Study | Hunt's Study | O'Donnell's Study
---------------------|---------------|--------------|------------------
3                    | 7.60          |              | 7.67
4                    | 8.02          | 8.51         |
5                    | 8.76          |              | 9.34
6                    | 9.04          |              |
7                    | 8.94          |              | 9.99
8                    | 10.37         | 11.34        |
9                    | 10.05         |              |
10                   | 11.79         |              |
11                   | 10.69         |              |
12                   | 13.27         | 14.4         |
Professional Writers |               | 20.3         |
Loban's data taken from Language Development: Kindergarten through Grade
Twelve. Urbana, IL: NCTE, 1976. 32. Hunt's and O'Donnell's data taken from
the summary in Frank O'Hare, Sentence Combining. Urbana, IL: NCTE, 1971. 22.
The differences in the studies (such as O'Donnell's showing
9.99 words/main clause for 7th grade students and Loban's showing 8.94)
should raise questions, but there is little doubt that the average number
of words per main clause increases with age. Because a reader's brain dumps
to long-term memory at the end of main clauses, the clearing of STM creates
a rhythm to the text. Even if readers cannot identify main clauses, they
must surely sense this rhythm.
Theoretical Considerations
Is Longer More Mature?
Mellon, Bateman, Zidonis,
especially O'Hare, and many others have assumed that more words per main
clause is a reflection of "better" writing. (An increase in words per main
clause is, after all, the primary "proof" offered in their studies.) Many
teachers have questioned the assumption that longer is better. (Is it an
American fallacy? Or perhaps a male fallacy?) Little has been done, however,
to challenge the assumption directly, probably because of a lack of a theoretical
framework and a method for doing so.
Stephen Jay Gould's Full House: The Spread of Excellence from Plato to
Darwin (NY: Harmony Books, 1996) may provide both a theory and a method.
Gould's primary purpose
in the book is to disprove the theory of evolution as progress toward more
complex organisms. (For those who might be interested -- in setting up
his argument, he devotes a large part of the book to explaining the disappearance
of the 0.400 batting average.) Although Gould's concern is biology, his
discussion of progress toward more complex organisms may be very comparable
to the question of progress (improvement) toward more complex (longer)
main clauses.
Gould's primary argument is
that, in biology, we have focussed on the more complex and generally ignored
the "full house" of all organisms -- which includes many, many more simple
organisms than it does complex. Rather than try (inadequately) to summarize
Gould's argument, I will attempt to apply his concepts to the question
of main clause length and natural syntactic development.
Although we usually think of
children's first "sentences" as consisting of two words, there are single
word sentences: "Think!" Now suppose we want to make a graph of the number
of sentences (or main clauses) of different lengths in written texts. The
left side of our graph has what Gould would call a "wall" at the number
one -- there are no sentences that consist of fewer than one word. The right
"wall" of our graph -- as James Joyce among others has taught us -- is
fairly wide open. Theoretically, sentences (or main clauses) could be thousands
of words long.
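A graph of this kind can be sketched in a few lines of Python. The word counts below are hypothetical, chosen only to show the shape of such a graph (a hard left wall at one word and a long, sparse right tail); they are not drawn from any of the analyzed texts.

```python
from collections import Counter

# Hypothetical word counts, one per main clause; not taken from any study.
lengths = [1, 4, 7, 7, 8, 9, 9, 9, 12, 15, 21, 40]

def histogram(lengths):
    """Return one text row per clause length, from 1 up to the longest."""
    counts = Counter(lengths)
    rows = []
    # The range starts at 1, the left "wall": no clause has fewer words.
    for n in range(1, max(lengths) + 1):
        rows.append(f"{n:>3} {'#' * counts[n]}".rstrip())
    return rows

for row in histogram(lengths):
    print(row)
```

The printed rows stop at the longest clause in the sample, but nothing in principle stops the right side from running on for thousands of rows.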
In his biological argument,
Gould argues for a left wall of single-celled organisms plus random variation.
With life originating -- at least to the extent that we can track it --
at the left wall, random variation can only result in greater complexity.
At this point, natural language development clearly differs from Gould's
biological model. Gould points out that the world is still overwhelmingly
full of single-celled organisms. Very rare, however, is the adult who speaks
or writes only in one- or two-word sentences. Clearly there is a natural
tendency to increase main-clause length beyond the minimal. And clearly
this tendency is good -- to a certain extent. If longer is simply better,
then all of us should be writing like James Joyce, and none of us should
be writing like Hemingway. At what point (10 words per main clause? 15
words per main clause? 20 words per main clause? 25 words per main clause?)
does longer stop becoming better and become worse?
The question is
very complicated, especially because some constructions which make writing
better (appositives and gerundives) also decrease the number of
words per main clause. Nevertheless, the question is approachable through
statistical research. Almost every semester, for example, I have my students
analyze a passage of their own writing and calculate the average number
of words per main clause. We then average these averages and almost invariably
end up with a number between fifteen and sixteen. And, based on our theory
of how the human brain processes language, we discuss the advantages
of being near the average and the disadvantages of being at either tail
of the distribution. (If the average in a passage is too low, the writing
may sound too simple and immature; if it is too high, it may tire the
reader, or even be incomprehensible.)
Fortunately or unfortunately,
Gould has also shown me that my -- and others' -- calculations of averages
in this context may be misleading. In calculating averages, I have always
used the "mean" -- add all the values and then divide by the number of
cases. As far as I can remember -- it's something that I probably should
check, but then there are a lot of things I should do -- Mellon, O'Hare,
etc. also all used the "mean." We are probably, however, dealing with what
Gould calls a "skewed distribution" -- there are probably a lot more one-,
two-, three-, four-, etc. word main clauses than there are thirty-one-,
thirty-two-, thirty-three-, etc. According to Gould, "The mean is a terrible
measure for any vernacular notion of 'average' or 'central tendency' in
... a highly skewed distribution, because the introduction of just one
Bill Gates will pull the mean way to the right." (159) The "median" (half-way
point) may thus be better than the "mean"; and Gould argues for the mode
("most common value" or "the peak value of the bell curve itself" 159).
At times like this, statistics
give me a headache, but if we ever tell students that their sentences are
too short, or too long, we should have the responsibility of at least trying
to understand what we are talking about. I intend to revise this section
as I attempt to apply Gould's ideas, but for now an imaginary example will
help me see if I understand:
Suzzie (If you love Suzzie....) writes an essay
comprised of ten main clauses with the following word-counts -- 9, 10,
11, 11, 12, 12, 14, 15, 16, 33.
The "mean" word-count is 14.3; the median, 12; and the
mode, 11.5 (11 and 12 each occur twice, so strictly there are two modes;
11.5 splits the difference). This may not seem like a big difference, but
according to Loban's
study (See above.) it is the difference
between pre-tenth graders and high school graduates.
It looks as if I need to recompute and reconsider some statistics.
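The figures for Suzzie's imaginary essay can be checked with Python's standard statistics module. This is only a sketch for checking the arithmetic; the variable names are made up for the example. Note that multimode reports every most-common value rather than a single number, and that the last line drops the 33-word clause to show how far one long clause pulls the mean.

```python
from statistics import mean, median, multimode

# Word counts of the ten main clauses in the imaginary essay above.
suzzie_counts = [9, 10, 11, 11, 12, 12, 14, 15, 16, 33]

avg = mean(suzzie_counts)        # sum of the counts divided by ten
mid = median(suzzie_counts)      # half-way point of the sorted counts
modes = multimode(suzzie_counts) # every value that occurs most often

# The same mean with the single longest clause left out.
avg_without_longest = mean(sorted(suzzie_counts)[:-1])

print("mean:", avg)
print("median:", mid)
print("modes:", modes)
print("mean without the 33-word clause:", round(avg_without_longest, 2))
```

With the 33-word clause removed, the mean falls from 14.3 to about 12.2, close to the median, which is the kind of pull toward the right that Gould describes.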