
1986 Study of Seventh Grade Writing 
Fragments, Comma-splices and Run-ons
in Seventh Graders' Writing

     A common complaint of middle school teachers is that their students' writing includes numerous fragments, comma-splices, and run-ons. The teachers' concerns have prompted a number of studies of the errors in students' writing, most of which have concluded that the errors are the result of "syntactic growth." One of the problems, however, is that "syntactic growth" is still a very vague concept, and, to my knowledge, none of the researchers who studied errors ever gave, or referred to, a detailed theory of how that growth occurs. Equally important, perhaps, is the fact that conventional means of publication (books and journal articles) do not allow the researcher to include all the original data (the students' writing), so that others can review it and draw their own conclusions. Although this little study has numerous flaws of its own, it at least partially avoids these two because it is based on the KISS theory of natural syntactic development, and because transcripts of the students' writing are here for anyone to examine.
      Perhaps the major weakness of this study is that I had to transcribe the students' writing myself. Looking back, it probably would not have been that expensive to pay someone "neutral" to transcribe them for me, but this is not a "comparative" study, pitting one means of instruction (mine) against another. I therefore had no reason to skew the transcripts -- I was simply looking at the syntax of seventh grade writers to see what I would find. I might note that the transcripts were made in 1986, and, with one exception, I did not look back at the originals while doing the analysis. The conclusions that I reached in the process of doing the analysis surprised me, so I doubt that I skewed the transcripts in that direction when I originally made them. There are, of course, many other flaws with this study. It is widely agreed, for example, that errors are "mode" dependent, but this study does not consider the mode(s) of the students' discourse. It would also be very interesting to know what the students were taught about grammar, long- and short-term, before they wrote these papers. In that regard, I should note that none of the students' teachers were aware of the KISS Approach to grammar. I do not know, however, which textbooks and/or methods the students used, to what extent, etc.

     The 31 papers presented here support the complaint of many middle school teachers, but I suggest that they also support an idea, new at least for most schools, of what to do about these errors. Unfortunately, far too many schools are still using (and even going back to) the traditional drill-and-kill exercises that have long been proven to be ineffective. Other schools, misled by Frank O'Hare's claim to be "improving" students' writing, have turned to sentence-combining exercises. Not having read his study, they do not know that O'Hare corrected fragments, comma-splices and run-ons before the students' papers were analyzed or evaluated. Sentence combining has its place among instructional methods, but it may well lead to more, rather than fewer, errors. Before explaining my own suggestions for handling this problem, I would like to briefly explain my own expectations about what I would find. Then we need to look at the data, for the data are what the suggestions are based on.
     Since I have only taught college students, my expectations were based on the research of Hunt, O'Donnell, and Loban (as usual) and on several other studies I had read. Primarily because my favorite researchers have convincingly demonstrated that subordinate clauses begin to blossom in seventh grade, I expected that most of the errors would involve subordinate clauses. I was wrong. The table below suggests that only 27 of the identified 148 fragments, comma-splices, and run-ons can best be attributed to students' problems with subordinate clauses. That is only 18% of the total. A much larger proportion of the errors (48%) appears between clauses that express either an amplification of the first idea, or a contrast to it. 

Performance Errors, or Competence Errors?

     Before looking in more detail at the results of this study, we need to consider what was counted, and why. The first distinction that we need to make is between performance errors and competence errors. This distinction is extremely important because it affects what we, as teachers, should do about the errors. Performance errors may result from a student's being tired, bored, or sloppy, but there is nothing that we can teach the student, at least about grammar, that will help the student avoid such errors. In most cases, for example, a reader has two signs that she has come to the end of a sentence -- an ending punctuation mark, and a capital letter. In many of the essays analyzed here, however, one of these two signs is often missing. The presence of the other sign, however (whether it be the ending mark or the capital letter), clearly suggests that the student perceived the end of the sentence. The error, therefore, is, in all probability, a performance error. With one type of exception, I have not counted these cases as errors. The exceptions are all cases in which the second clause must begin with a capital letter for a different reason -- "I," "Bob," "Mary." In these cases, the reader gets no signal of the end of a main clause, no signal to dump to long-term memory. I have, therefore, counted these as errors, but we should remember that many of them may also be performance errors.
     I should also note, perhaps, that I have not considered errors in usage or in subject/verb agreement. The rules of usage are not systematic, and errors in usage thus need to be treated individually. As for errors in subject/verb agreement, the entire corpus contains less than one such error per 100 finite verbs. The KISS Approach, simply because it forces students to consider which subject goes with which verb, helps students learn to avoid such errors, but because there were so few of these errors in these samples, I decided against exploring their causes. That left me with 148 identified fragments, comma-splices, and run-ons.
     The table at the bottom of this page includes links to each of the identified errors. The links lead to the statistical analysis page for each essay, and within those pages, a link next to each identified error leads to a usually short categorization and explanation/discussion of the error. Although in some cases, the discussions suggest what should be done about the error, most of the discussions concern the probable causes of the error. (If we don't know what causes it, how can we help students fix it?) For the purposes of this study, I have divided the errors into six categories, but we need to remember that many of the "errors" may be the result of the writer's indecision about how to handle the logical relationship between the ideas expressed in the two "clauses." In writing the explanations, for example, I frequently found myself saying "The student may have sensed an amplification relationship here, or he may have vaguely sensed cause/effect." One of the things I want to suggest is that many of the errors appear in places where the students had (and probably  vaguely perceived) multiple choices. Our students are probably smarter than we usually think they are.

Garbles, Fragments, and Interjections

       In calculating their statistics, Hunt and O'Donnell simply eliminated most garbles and fragments, and they were not always clear about how they handled interjections. The researchers who followed them generally followed this procedure -- if they addressed the question at all. (See "Definitions of the 'T-Unit,'" and use "Find" to search for "garble," "fragment," and "interjection.") This practice is comparable to trying to explain why a horse runs so fast and discarding all references to its legs. It suggests that none of these researchers had a very clear idea of what they were studying.
      Although we are studying the words on paper, those words reflect what was going on in the students' heads, and that is really our primary interest. Although garbles may reflect poor handwriting, they also represent words that were in the writer's head, and the very fact that they are garbled suggests that the writer was experiencing some sort of confusion. In the 31 papers examined here, I found very few garbles, but I was always able to determine the number of words in each. I therefore approximated those "words," as closely as I could, and left them in. 
      Also contrary to the practice of most of the researchers, fragments were counted as separate main clauses. Discarding them, or attaching them to the main clause to which they belong, as most of the researchers did, seriously distorts the picture that we need of the writer's syntactic processing. The psycholinguistic model suggests that readers -- and writers -- basically process a main clause in working memory (STM). The punctuation at the end of the main clause reflects a clearing of STM for the next main clause. In a statistical analysis, a writer's average number of words per main clause should  therefore reflect his or her performance (not competence) in processing words through his or her STM. A fragment, in other words, is a sign that the writer was not able to handle the "entire sentence" in STM. Overwhelmed, the writer puts down a closing punctuation mark and then continues the rest of the sentence as a separate unit. Counting fragments as separate main clauses thus makes the "count" a truer picture of what was going on in the writer's head. Note also that it results in a lower average number of words per main clause.
      Interjections that are punctuated as separate sentences also pose a problem. Three such fragments were identified in this set of papers, all in essay # 31. If they are counted as separate main clauses, the average number of words per main clause for this essay is 7.0. If they are discarded, the average jumps to 7.5. For people not familiar with this type of research, the difference may not seem significant, but for the research community, it is. I decided, however, to treat these interjections as separate main clauses because discarding anything distorts the picture. I could have simply attached them to the preceding or following main clause (which would have resulted in an average of 7.7), but I felt that the student's punctuation probably reflected what was going on in her head. There is a difference between "GOSH! I hope me and him do go together, ..." and "GOSH, I hope me and him do go together, ...." (Note that the whole idea of providing transcripts of the originals is to let others examine and discuss technical questions such as this.)
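
      The three treatments of punctuated interjections discussed above can be compared with a short calculation. What follows is only a sketch: the function name and the toy word counts are hypothetical, chosen for illustration, and are not the counts from essay #31 (those can be taken from the transcript). It assumes that "discarding" removes both the interjection's words and its clause, and that "attaching" keeps the words but adds no clause.

```python
def words_per_main_clause(clause_lengths, interjection_lengths, treatment):
    """Average words per main clause under one treatment of interjections.

    clause_lengths: word counts of the ordinary main clauses.
    interjection_lengths: word counts of interjections punctuated as sentences.
    treatment: "separate", "discard", or "attach".
    """
    clause_words = sum(clause_lengths)
    interjection_words = sum(interjection_lengths)
    clauses = len(clause_lengths)
    if treatment == "separate":
        # Each interjection is counted as its own main clause.
        return (clause_words + interjection_words) / (clauses + len(interjection_lengths))
    if treatment == "discard":
        # Interjections are dropped: neither their words nor their clauses count.
        return clause_words / clauses
    if treatment == "attach":
        # Interjection words are kept but folded into neighboring main clauses.
        return (clause_words + interjection_words) / clauses
    raise ValueError(f"unknown treatment: {treatment!r}")


# Hypothetical toy data, NOT taken from essay #31.
toy_clauses = [8, 7, 9, 6]
toy_interjections = [1, 1]

for treatment in ("separate", "discard", "attach"):
    average = words_per_main_clause(toy_clauses, toy_interjections, treatment)
    print(f"{treatment:>8}: {average:.1f} words per main clause")
```

      With short interjections, as in this toy data, counting them separately gives the lowest average and attaching them gives the highest, which is the same ordering as the three figures reported for essay #31.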

The Performance "Errors"

     My readers can, of course, follow the links and decide for themselves if they agree with my categorizations, but a few words of explanation about the six categories may help.

Acceptable

     The program I use to do the analysis and make the calculations does not distinguish between acceptable and unacceptable fragments and comma-splices. Closer inspection, however, revealed four of the 22 fragments (18%) and one of the 45 splices (2%) to be acceptable (at least to me). Discussions of some of the others may suggest that they too may be acceptable, but that I had a reason for putting them in a different category.

Afterthoughts

     Elsewhere I have attempted to explain language as a stream of meaning. The question is complex. Dostoevsky claims that he could only get about ten percent of what was in his head down on paper, and I think I understand what he meant. We actually need to think in terms of at least two streams -- the stream that is in the writer's head, and the stream that appears as spoken or written words. If the stream in the writer's head is full, the writer's problem is to select ideas from that stream and embed them in words and sentences. This is probably what Dostoevsky had in mind. On the other hand, if the stream in the writer's head is shallow, then the writer probably spends a significant amount of time searching for something to say. In between these two situations are those in which the mind has probably formulated a sentence, i.e., embedded it in syntax, but while it is being written, the mind perceives an additional relevant idea. In some cases, these ideas are written as fragments, afterthoughts. In this set of papers, the clearest example of this is from paper #15:

The gerbils live in Mrs. Stewart's classroom. I like to observe them. Mrs. Stewart buys their food and feeds them. And water provides water for them.
This student was, in all probability, capable of writing "Mrs. Stewart buys their food, feeds them, and provides water for them." The error probably occurred because the writer thought of water after the previous sentence was already mentally formulated. It is actually a performance error in editing. (See also, "Length," below.)

Amplification / Contrast

      This is the category that surprised me. As noted previously, 48% of all the identified errors fall into this category. More specifically,  71% of all the comma-splices, and 48% of all the run-ons fall into this category. Put still differently, 71 of the 126 identified comma-splices and run-ons, or 56%, fall into this group. I could have separated this into two categories, and they are so identified within the individual discussions, but I originally made it one because it seems to me that they are related. In almost every case, these errors involve compound sentences that could probably best be punctuated with a colon or dash (for amplification) or with a semicolon (for contrast). It seems to me that the use of these punctuation marks for these purposes should be taught together, so I kept the two categories as one. 
     Readers can, of course, check the individual cases to see if they agree with my classification. I expect some disagreement, but I would note that amplification or contrast is also suggested as a possible or contributing cause for errors which I have put into other categories. Thus while some readers may want to subtract from the totals in this category, other readers may want to add to it. If this study does truly reflect the writing of seventh graders, attentive readers should already see its implications. Instead of attempting to force variety, length, and advanced constructions onto seventh graders, shouldn't we be helping them to understand the syntactic structure of their own sentences and teaching them how to use a colon, dash and semicolon to combine main clauses?

Careless / Other

     Because all I had to work with were the students' papers, I found that I could not always determine a probable or possible cause for an error. Some of these errors are probably careless, but in other cases I am probably just too slow-witted to see the possible cause. A better methodology would have enabled me to discuss the papers, and thus the errors, with the students who wrote them, but I did not have this opportunity.

Length

     In this set of essays, this category, like "afterthoughts," includes only fragments. I have distinguished the two because afterthoughts suggest a slow stream of meaning in the writer's head, and because, when the fragment is attached to the preceding or following main clause, it results in a main clause that appears to be well within the writer's level of competence, i.e., close to or below the average number of words per main clause in the essay. Simply put, the fragments I have attributed to length appear to result from the mental stream running fast and furious. The writer is trying to get the ideas down on paper, but his or her STM is not yet capable (competent) of juggling all the words. Thus the writer throws in a period, which results in a fragment. In the discussion of the individual cases, I usually show that attaching the fragment to the preceding or following main clause would result in a main clause that is two, three, or four times the writer's average in length, and often longer than the longest "correct" main clause in the paper. Students can, of course, be taught how to go back and edit out these errors, but these errors are much more complex than afterthoughts, and thus more difficult for students to deal with.
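
     The comparison that separates "afterthought" fragments from "length" fragments can be sketched as a ratio: the length of the main clause that would result from attaching the fragment, divided by the writer's average words per main clause. The categorizations in this study were made by judgment, case by case, not by formula, so the function names and both thresholds below are assumptions for illustration only. The value 2.0 echoes "two, three, or four times the writer's average"; the value 1.25 is a guess at what "close to or below the average" might mean.

```python
def merged_length_ratio(fragment_words, neighbor_words, average_words):
    """Length of the fragment joined to its neighboring main clause,
    relative to the writer's average words per main clause."""
    return (fragment_words + neighbor_words) / average_words


def suggest_category(ratio, long_threshold=2.0, near_threshold=1.25):
    """Suggest a category from the ratio; both thresholds are hypothetical."""
    if ratio >= long_threshold:
        return "length"        # merged clause far exceeds the writer's average
    if ratio <= near_threshold:
        return "afterthought"  # merged clause is close to or below the average
    return "unclear"           # needs a human reading of the passage


# Hypothetical word counts, not taken from any of the 31 papers.
short_case = merged_length_ratio(fragment_words=5, neighbor_words=6, average_words=10)
long_case = merged_length_ratio(fragment_words=12, neighbor_words=14, average_words=9)

print(suggest_category(short_case))
print(suggest_category(long_case))
```

     A ratio of this kind can only flag candidates. Deciding whether the writer was adding a late idea or was overwhelmed by a long clause still requires reading the passage.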
     Length is, of course, a contributing factor to many of the errors that I have placed in other categories, and it can even affect very short main clauses. Miller's theory of a seven-slot working memory may be an oversimplification, but we need to remember that a writer has to keep track of multiple systems in addition to the intended meaning -- syntax, pronoun reference, rules of usage, spelling, etc. Stopping to think of how a word is spelled puts a burden on working memory. The longer the writer has to think about the spelling, the more likely it is that aspects of syntax, etc. within that main clause will be lost or mixed up. And the more words there are in a main clause, the more chances there are for an STM crash to occur. Hunt saw this as early as 1969, which is probably why most of the researchers automatically eliminated errors from consideration. It is also the reason for my concern about those teachers who attempt to force length, variety, and advanced constructions into their students' writing. Our students are not our private plants which we can force to bloom in February.

Subordination

      Some of the errors, but fewer than I expected, were clearly the result of problems with subordinate clauses. Most of the fragments are simply detached subordinate clauses. The splices and run-ons in this group, however, present a different kind of problem. My favorite researchers (again) have convincingly demonstrated that subordinate clauses begin to develop, for the average student, in seventh grade. I also agree with John Mellon and others who argue that syntactic "growth" largely follows cognitive growth. This study itself suggests that seventh graders are still thinking largely in terms of amplification and contrast. Consider, therefore, the following sequence from paper 29:

He has got a great batting average, he is hitting a lot of homeruns. They have won the district, the whole team has been hitting good.
I attributed the first comma-splice to amplification. Although he did not know the best way to punctuate it, the writer probably saw hitting home runs as an expansion or amplification of "batting average." In the second sentence, the topic shifts to the team, but here I attributed the splice primarily to subordination because of the implicit cause/effect relationship -- they won because the whole team had been hitting well. The writer, however, may well have viewed this as an amplification relationship, similar to that in the preceding sentence, or, even more likely, he may have been caught between the two. Whereas the first splice might be attributed to our failure to teach the use of the dash, colon, and semicolon when and as well as we should, the second is more likely to be simply a reflection of the process of both cognitive and syntactic growth.

Implications/Suggestions

      One study of thirty-one papers can result in only tentative recommendations, especially when the papers were transcribed by the researcher. The study, however, is easy enough to replicate. The data are here, and the following suggestions are supported by the data, by theory and, as noted above, by other research. 

1. The importance of analyzing students' writing

       If this little study suggests nothing else, it should suggest the importance of teachers analyzing the syntax of their own students' writing. The published research covers students of different grade levels, of different degrees of ability within those grade levels, and of different socioeconomic groups. Published research may also have been manipulated, not for the benefit of the students, but for the benefit of the researchers. (See Mellon.) The direct analysis of students' writing  by their teachers is thus much more relevant than is any published research, including mine.
      Some teachers are intimidated by the thought of performing such an analysis -- most of our college courses in grammar have not prepared them to do so. But learning how to do so is not that difficult. The most important findings of this study (the frequent use of amplification and contrast by the seventh graders, usually expressed in main clauses and often resulting in comma-splices and run-ons) required only the ability to identify main and subordinate clauses.

2. The usefulness of the KISS Approach (& Curriculum)

      While writing this essay, I was also reading Daiker, Donald A., Andrew Kerek, & Max Morenberg, eds. Sentence Combining and the Teaching of Writing: Selected Papers from the Miami University Conference, Oxford, Ohio, October 27 & 28, 1978. The Departments of English, University of Akron and the University of Central Arkansas, 1979.  If you read that book, which is a celebration of sentence-combining (or my notes on it), you will see that many advocates of sentence combining advocate teaching a limited number of grammatical constructions.  Indeed, the method proposed by Jeannette Harris and Lil Brannon is very close to the KISS Approach. The only two fundamental differences are that they were working with advanced writers and within a very limited span of time. They concluded that "Sentence analysis gives the students an objective means of looking at their own writing, and sentence combining gives them the means of improving it." (174)
     If the seventh graders whose papers are analyzed here had been working within the ideal KISS Curriculum, they would have come to seventh grade able to identify the prepositional phrases and S/V/C patterns in their own writing. That knowledge would make it much easier for them to learn to identify clauses. Within the ideal curriculum, they would be spending time in seventh, eighth, and ninth grades learning to identify, manipulate -- and better punctuate -- both main and subordinate clauses. Spread over the course of three years, that instruction would not require a great deal of class time, and the KISS Instructional Matrices provide suggestions for integrating that instruction with reading, style, logic, literature, etc.

3. The uselessness of teaching terms in simplistic or complex forms of grammar

      Public pressure to return to the teaching of grammar has led to an explosion of textbooks that are very similar to those books whose approach and methods have been demonstrated to be ineffective. They will continue to be ineffective. Would it have helped the students whose writing is presented here, if they had been taught to identify subjects, various types of verbs, and a whole host of other grammatical constructions within the simplistic sentences presented in most, if not all, grammar workbooks? Perhaps one of the most interesting findings of this little study is that, if we consider only the longest main clause produced by each student, these seventh graders averaged 21.1 words per main clause. This is above the average main-clause length (20.3) that Hunt found in the writing of professionals. The two numbers, of course, are not directly comparable, but they do show that the average main-clause length of professional writers is within the range of competence of seventh graders. Seventh graders, in other words, are already writing sentences that are much more complex than those in the workbooks. (John Mellon made this point in his famous study, way back in 1969.)
       Teaching students with such books not only wastes their time, it is an insult to their intelligence. It is like trying to teach children to run by having them memorize and  identify leg muscles in simplified anatomical drawings. First, it will do them no good, and second, they will not remember it. Many people have even suggested, and I concur, that the use of such books may even be harmful in that it encourages the students to write simplistic, safe sentences, thereby delaying, rather than supporting natural syntactic development. We need to remember that most of these books are pushed by the major publishers who are more interested in selling big, expensive books than they are in helping students. (See also "Save Money! Burn the Grammar Textbooks!")

4. Teaching seventh graders how to use dashes, colons, and semicolons

     If you did not find the preceding three suggestions convincing, I hope that this study at least made the point that perhaps more time should be spent helping seventh graders handle the punctuation of main clauses. There is much that we still need to learn about natural syntactic development, but, to my knowledge, there is no research that demonstrates that forcing seventh graders to use gerundives, appositives, or other advanced constructions has any long-term positive effect.  Theoretically, such instruction may even be harmful because it forces students to see as "correct" constructions which are, in Vygotsky's term, beyond their "zone of proximal development." Put more simply, it may tell students that they are supposed to do what we tell them to, and that we don't care whether or not they really understand. Teachers who claim that some of their students do understand may, in effect, be saying that they want to work with the best of their students, and to h___ with the rest of the class. Advanced constructions will come in their time. Let's not fool with Mother Nature.



      The links below lead directly to each sentence within the analyzed texts. Within the text, you will find links to my explanation/discussion of each error.
 
Acceptable
    Fragments: #13 #20 #21 #22 (Total = 4)
    Splices: #16 (Total = 1)
    Run-ons: none

Afterthought
    Fragments: #1 #2 #7 #15 #19 (Total = 5)
    Splices: none
    Run-ons: none

Amplification / Contrast
    Fragments: none
    Splices: #1 #2 #3 #4 #5 #6 #7 #8 #12 #13 #14 #15 #17 #19 #20 #21 #22 #23 #24 #25 #26 #29 #30 #32 #33 #34 #35 #36 #37 #39 #41 #42 (Total = 32)
    Run-ons: #2 #4 #6 #7 #9 #14 #16 #17 #18 #19 #20 #21 #22 #25 #28 #32 #33 #35 #38 #39 #41 #43 #48 #50 #53 #54 #56 #57 #58 #64 #65 #68 #69 #71 #73 #74 #78 #79 #81 (Total = 39)

Careless / Other
    Fragments: #8 #9 #10 #11 #16 (Total = 5)
    Splices: #11 #18 #27 #38 #45 (Total = 5)
    Run-ons: #1 #5 #10 #11 #15 #23 #24 #26 #27 #29 #30 #37 #40 #44 #45 #46 #47 #55 #59 #60 #61 #62 #67 #70 #76 #77 (Total = 26)

Length
    Fragments: #3 #4 #5 #18 (Total = 4)
    Splices: none
    Run-ons: none

Subordination
    Fragments: #6 #12 #14 #17 (Total = 4)
    Splices: #9 #10 #28 #31 #40 #43 #44 (Total = 7)
    Run-ons: #3 #8 #12 #13 #31 #34 #36 #42 #49 #51 #52 #63 #66 #72 #75 #80 (Total = 16)
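
      For readers who want to check the percentages cited in the essay against the table, the totals above can be tallied in a few lines. This is a minimal sketch: the variable and function names are my own, the counts are the "Total" figures from the table, and cells with no listed errors are assumed to be zero.

```python
# Totals per category and error type, copied from the table above.
# Cells with no listed errors are assumed to be zero.
counts = {
    "Acceptable":               {"fragments": 4, "splices": 1,  "run-ons": 0},
    "Afterthought":             {"fragments": 5, "splices": 0,  "run-ons": 0},
    "Amplification / Contrast": {"fragments": 0, "splices": 32, "run-ons": 39},
    "Careless / Other":         {"fragments": 5, "splices": 5,  "run-ons": 26},
    "Length":                   {"fragments": 4, "splices": 0,  "run-ons": 0},
    "Subordination":            {"fragments": 4, "splices": 7,  "run-ons": 16},
}


def percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Totals by error type (columns) and by category (rows).
column_totals = {
    kind: sum(row[kind] for row in counts.values())
    for kind in ("fragments", "splices", "run-ons")
}
row_totals = {category: sum(row.values()) for category, row in counts.items()}
grand_total = sum(column_totals.values())

amplification = counts["Amplification / Contrast"]
splices_and_run_ons = column_totals["splices"] + column_totals["run-ons"]

print("Errors by type:", column_totals, "total:", grand_total)
print("Subordination share of all errors:",
      percent(row_totals["Subordination"], grand_total), "%")
print("Amplification / Contrast share of all errors:",
      percent(row_totals["Amplification / Contrast"], grand_total), "%")
print("Amplification / Contrast share of splices:",
      percent(amplification["splices"], column_totals["splices"]), "%")
print("Amplification / Contrast share of run-ons:",
      percent(amplification["run-ons"], column_totals["run-ons"]), "%")
print("Amplification / Contrast share of splices and run-ons together:",
      percent(amplification["splices"] + amplification["run-ons"], splices_and_run_ons), "%")
```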