Trends in Two-Way Immersion Education

Center for Applied Linguistics


Language and Literacy Outcomes


Along with academic achievement, language and literacy outcomes of TWI students are two areas of great interest to those in the field, and there has been a fair amount of research dedicated to these topics. To date, only one large-scale, quantitative study of bilingualism and biliteracy development in TWI programs has been conducted, through the Center for Research on Education, Diversity, & Excellence (CREDE) and the Center for Applied Linguistics (CAL) (Howard, Christian, & Genesee, 2003). Most of the remaining research has been qualitative, with each study focusing on a relatively small number of students in a single TWI program. Cumulatively, these studies indicate that, on average, both native English speakers and English language learners in TWI programs achieve the goal of developing bilingualism and biliteracy. The English language learners, however, tend to develop more balanced abilities in the two languages than the native English speakers. In addition, these studies point to the need for research on effective instructional strategies for promoting the language and literacy development of students in the minority language, given that the two interventions described in this section were not effective in attaining this goal.


Oral Language Development


Howard, Christian, and Genesee (2003) investigated the Spanish and English oral language development of 131 native Spanish speakers (NSS) and 118 native English speakers (NES) in 11 TWI programs across the United States. Using a modified version of the SOPA (Student Oral Proficiency Assessment), they conducted English and Spanish oral proficiency assessments with these students at the end of third and fifth grades. The average oral English proficiency of both groups of students was quite high in both third grade and fifth grade, with average scores in the mid to high 4 range on a scale of 0 to 5. This indicates advanced skills on the part of both native English speakers and native Spanish speakers. In addition, standard deviations for both groups dropped to extremely low and equivalent levels, indicating that the very high mean scores of both groups in fifth grade were reflective of most individual scores as well. In Spanish, both groups of students showed progress from third grade to fifth grade. Native English speakers showed more growth than native Spanish speakers, which was possible because their initial score at the end of third grade was much lower than that of native Spanish speakers. By the end of fifth grade, the mean scores of both groups were in the advanced range, although the mean score of the NSS was still higher than that of the NES. The standard deviations of both groups also decreased over time, but the standard deviations of the native English speakers were always much higher than those of the native Spanish speakers, indicating much more variability in Spanish language proficiency among native English speakers than among native Spanish speakers. Finally, as a group, the native Spanish speakers experienced a subtle shift from slight dominance in Spanish in third grade to comparable scores in English and Spanish by the end of fifth grade, while the native English speakers were always clearly dominant in English.


Based on classroom observations and testing in a 50/50 TWI program in Virginia, Howard and Christian (1997) studied the oral and written development of elementary students in English and Spanish. In English, all NES students entered as fluent English speakers and remained that way, so there was no evidence of the TWI program causing delay or interference. The NSS students also developed strong English oral skills: all NSS third graders were rated as fluent according to the LAS-O, and no significant differences were found in oral English proficiency between NES and NSS students. In Spanish, development was strong but not quite as strong as in English. Eighty-eight percent of NSS tested as fluent in Spanish in first grade, compared with 100% of NES testing fluent in English in first grade. This may be attributed to the fact that most of the NSS had lived all or most of their lives in the U.S. and had therefore always been exposed to English. In second grade and above, 100% of the NSS tested fluent in Spanish. About 20% of NES were rated fluent in Spanish in grades 1-2, and about 50% were rated fluent in grades 4-5. Overall, NSS tended to be more balanced bilinguals than NES.


A locally developed, interview-format native language assessment instrument was used to compare NES and NSS students in the Amigos program in Massachusetts with NSS controls in grades 1-3 (Cazabon, Lambert & Hall, 1993). In all grades, NES Amigos scored higher in English than NSS Amigos and NSS controls, while NSS controls scored highest in Spanish each year, followed by NSS Amigos and NES Amigos (although the difference between the two NSS groups in third grade Spanish was negligible). Overall, in English, there was no significant group effect for NSS Amigos vs. NSS controls, but there were significant effects for grade and for the group-by-grade interaction. There was a significant grade effect and a significant group effect for NES Amigos vs. NSS Amigos, but no significant group-by-grade interaction. In Spanish, for NSS Amigos vs. NSS controls, there were significant effects for group, grade, and the group-by-grade interaction. For NSS Amigos vs. NES Amigos, there were significant group and grade effects, but no group-by-grade interaction.


Two studies (Montague & Meza-Zaragosa, 1999; Stein, 1997) used an intervention model to examine the outcome of specific curriculum approaches in TWI programs. Stein (1997) studied the effect of Focus on Form (FonF) in a two-way immersion program in Virginia. Because TWI programs emphasize learning language through content, explicit language instruction is generally discouraged. The consequence of such an approach is that students gain reasonable proficiency in their second language but often lack the grammatical accuracy of native speakers. This is most frequent in the case of native English speakers learning a minority language. In this study, Stein analyzed the effect of FonF in the form of implicit, incidental negative feedback in content classes. This feedback was given in relation to subject-verb agreement and noun-adjective agreement in Spanish. Four groups participated in the study: two experimental classes of fourth graders and two comparison classes of fifth graders. The former were given feedback and instruction on such agreement, whereas the latter were not. The results showed that non-native Spanish speakers scored significantly lower in agreement knowledge than native speakers, demonstrating the need for instruction in this area, according to the author. However, the study also found no significant effect of instruction through implicit feedback when the experimental and comparison classes were compared. Stein states that this lack of an effect could be due to the subtlety of such feedback (students often do not realize they are being corrected); the inconsistency of the feedback, as the teacher did not respond to every error; the limited opportunities for feedback, which depends on student production; and the short time (6 weeks) allotted to test the effect of this FonF model.


Montague and Meza-Zaragosa’s (1999) study examined the role of teacher expectations in minority language production. Participants were 45 pre-literate 4- and 5-year-old children in a 50/50 program, most of whom had been enrolled since age 3. Over the school year, the Spanish classroom teacher modified her level of elicitation during Language Experience Approach lessons. At the beginning of the year, the teacher did not specifically ask children to use Spanish, and the students generally used their stronger language. During the intervention phase, Spanish elicitation prompts were given, and NES students showed a drop in interest and participation, although the responses of NSS students in Spanish increased. Production increased during the post-intervention phase, as did all students’ metalinguistic awareness, but it did not return to its baseline level.


Writing Development


Howard, Christian, and Genesee (2003) investigated the English and Spanish writing development of 344 native English speakers and native Spanish speakers in 11 Spanish/English two-way immersion programs across the United States. Nine waves of writing data in each language were collected over a three-year period, from the beginning of third grade through the end of fifth grade. An analytic rubric was used to score these writing samples on composition, grammar, and mechanics. On average, the native Spanish speakers (NSS) and native English speakers (NES) had remarkably similar trends in English and Spanish writing. At all time points, the mean scores of the native speakers were always higher than the mean scores of second language speakers (such that native English speakers had higher mean scores in English and native Spanish speakers had higher mean scores in Spanish), but the shapes of the trajectories of mean performance for the two groups in the two languages were comparable. Moreover, there was a tremendous amount of overlap in scores across the two groups. While the mean scores of native speakers were consistently higher than the mean scores of second language speakers, there were many second language speakers who scored higher than their native language peers, and vice versa. In other words, many native Spanish speakers scored higher than native English speakers in English, and many native English speakers scored higher than native Spanish speakers in Spanish. The mean English writing ability of native English speakers was always clearly higher than their mean Spanish writing ability. For native Spanish speakers, however, the situation was much different, as their mean scores in English and Spanish were virtually identical at all time points.


In a more detailed analysis of the same dataset, Howard (2003) used an individual growth modeling framework to estimate average growth trajectories in each language, as well as to assess the predictive power of native language and home language use on average final status (end of fifth grade performance) and average rate of change. Three major findings emerged from this study:


  1. Writing development in both English and Spanish slowed over time, with faster growth in third grade and slower growth over fourth and fifth grades.

  2. Both native language and home language use were significant predictors of English writing development, with native language related to both final status and rate of change, and home language use related only to final status. After controlling for gender, free/reduced lunch eligibility, and participation in special education, being a native English speaker and speaking more English at home were associated with higher average final status in English writing, although the gap between the native language groups diminished over time.

  3. Home language use was a significant predictor of Spanish writing final status. After controlling for gender, personal problems, participation in special education, and free/reduced lunch eligibility, speaking more Spanish at home was associated with higher average final status in Spanish writing at the end of fifth grade. There was also a significant interaction between home language and free/reduced lunch on the rate of change of Spanish writing development, with students who were eligible for free/reduced lunch having varying rates of change in relation to home language use, and students who were not eligible for free/reduced lunch having the same rate of change regardless of home language use.
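
The two outcomes named above, final status and rate of change, can be illustrated with a minimal sketch. It is not the model used by Howard (2003): it assumes a straight-line trajectory for a single student, fitted by ordinary least squares, and so ignores the slowing of growth reported in the first finding as well as all predictors. Time is centered on the last wave so that the intercept is the fitted end-of-study score. The function name, the scores, and the evenly spaced nine waves are hypothetical and invented purely for illustration.

```python
from statistics import mean


def fit_growth(waves, scores):
    """Fit a straight line to one student's scores by ordinary least squares.

    Time is centered on the final wave, so the intercept is the fitted
    score at the last wave ("final status") and the slope is the change
    per wave ("rate of change"). Returns (final_status, rate_of_change).
    """
    last = waves[-1]
    t = [w - last for w in waves]          # final wave becomes time 0
    t_bar, y_bar = mean(t), mean(scores)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, scores))
    rate = sxy / sxx
    final_status = y_bar - rate * t_bar    # intercept at time 0
    return final_status, rate


# Hypothetical student: nine waves, scores rising by 0.25 per wave.
waves = list(range(1, 10))
scores = [2.0 + 0.25 * w for w in waves]

final_status, rate = fit_growth(waves, scores)
print(f"final status: {final_status:.2f}, rate of change: {rate:.2f}")
```

In a full individual growth model, each student's final status and rate of change would in turn be related to predictors such as native language and home language use; that second level is omitted here.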


Serrano and Howard (2003) investigated English influence on the fifth grade Spanish writing ability of 55 native Spanish speakers in three 90/10 TWI programs. Serrano and Howard found that many samples demonstrated evidence of English influence, but that this influence was not extensive. In other words, most students exhibited a small amount of English influence in their Spanish writing. English influence was noted in three domains: 1) mechanics, 2) vocabulary, and 3) syntax. Influence in mechanics most frequently had to do with spelling or capitalization. In vocabulary, three types of influence were found: 1) direct incorporation of English words, 2) modifications of English words to reflect Spanish morphology and phonology, and 3) applying an English meaning to a similar Spanish word. Finally, at the level of grammar, three types of influence were also found: 1) direct translations of idioms; 2) word order transfer, where English word order was applied in Spanish; and 3) the use of English syntactic constructions in Spanish. English influence was found to be most common in vocabulary, followed by grammar and then mechanics.


Howard and Christian (1997) analyzed Spanish and English writing samples of four NES and four NSS TWI students in the upper elementary grades. They found that, in general, writing in both languages showed reasonable sophistication in all four domains, particularly with organization. The Spanish essays were usually comparable to the English essays with regard to organization and topic development, but they showed more mechanical errors and more linguistic/grammatical errors, usually regarding word order, word choice, and agreement. There was no code switching in the English essays and only a few instances in the Spanish ones, though all were flagged with quotation marks. The English writing samples of NES and NSS were generally comparable, especially in the upper grades (5-6). The Spanish samples of NSS tended to be more sophisticated in terms of vocabulary and grammar than those of their NES peers. However, NSS did make some grammatical mistakes in Spanish, generally at a higher frequency than in their English writing.


In a study using daily journal writing to examine emerging biliteracy in a TWI first grade, Kuhlman, Bastian, Bartolomé, and Barrios (1993) studied 16 Mexican American NSS and NES. The program was whole-language oriented and separated students by native language for language arts in the morning, with everyone together for content instruction in Spanish in the afternoon. Students wrote in their journals for 10 minutes every day after lunch, in their language of choice. Once a week, researchers observed the writing process and tape-recorded students reading their journal entries aloud. The authors found a general developmental trend, from 1) squiggles/drawings to 2) alphabet letters, 3) lists, and 4) sentences, but not all students passed through all stages or in the same order. There were no differences in patterns for NSS and NES, although NSS tended to start at a different stage (letters and numbers) than NES (lists or sentences). The researchers suggested that this might be due to the kindergarten curriculum, which emphasized oral English over Spanish writing for NSS. There was very little evidence of spontaneous second-language writing. Social interaction among children during journal writing appeared to play an important role: more advanced students helped students who were at earlier stages, and native language speakers provided second-language writing encouragement to their peers in the other language group.


A qualitative study of the biliteracy development of NES and NSS in a Spanish/English TWI program in the Northeast illuminates the connection between the first, or native, language (L1) and the second language (L2) in a curriculum that employs a process writing approach (Gort, 2001). In relation to strategic code-switching, it was found that developing bilingual writers used their full linguistic repertoire when writing in both the first and second languages. For the most part, students used both languages while creating the texts, but the final product was monolingual. More specifically, Spanish-dominant children used English and Spanish when writing in both languages, but English-dominant children used both languages only when writing in Spanish. Code-switching facility depended on several factors: the child’s language dominance, bilingual development, the linguistic context, and the language proficiency of the interlocutors. Regarding positive literacy transfer, the students applied skills learned in one language to writing in the other language. For mature literacy processes (skills that are maintained once learned), both Spanish-dominant and English-dominant children transferred patterns from their first language to their second language. As for immature processes (skills that are developmental and temporary), for both groups of students these skills appeared first in L1, then in both L1 and L2, and then in L2 only before disappearing. Again, transfer was contingent upon degree of biliteracy. Concerning interliteracy, it was found that developing bilingual writers inappropriately applied language-specific elements of one language, such as literacy and print conventions, to the other. For both NES and NSS, these errors appeared in L1 writing first, then temporarily in both L1 and L2, and then again in L1 only.


Ha (2001) analyzed the writing ability of native English speakers and native Korean speakers (NKS) in grades 1-5 in a Korean/English 50/50 program. Although the data were examined cross-sectionally, the author found that both NES and NKS showed progress in writing ability in both languages, although Korean writing did not seem to show the same leaps at each consecutive grade level that English did. For both groups of students at all grade levels, Korean writing ability was lower than English writing ability, and the gap widened at each consecutive grade level. NKS tended to be more balanced bilinguals, showing higher writing ability in Korean than NES. Students did not show signs of L2 interference in L1 writing, and there did not seem to be a delay in writing ability in either language.


Reading Development


The majority of research on the development of reading ability among TWI students has used standardized academic achievement measures as indicators of reading ability. As a result, those studies were included in the previous section on academic achievement and are not repeated here.


As part of the large-scale study discussed above, Howard, Christian, and Genesee (2003) looked at the English and Spanish reading performance of 344 TWI students in 11 programs across the United States. Cloze measures of English reading comprehension were collected at the beginning of third grade and the end of fifth grade, while a cloze measure of Spanish reading comprehension was collected only at the beginning of third grade. In English, the native Spanish speakers made slightly more mean progress than the native English speakers, but this was likely due, at least in part, to the fact that their mean scores in third grade were lower than those of the native English speakers. At both points, the mean scores of the native Spanish speakers were lower than those of the native English speakers, although the gap narrowed over the three-year period. In Spanish, there was once again a native language effect, where the average scores of native Spanish speakers in third grade were significantly higher than those of native English speakers. Comparing across languages at the beginning of third grade, the native English speakers had a slightly higher mean score on the English reading assessment than on the Spanish reading assessment, and the opposite was true for the native Spanish speakers. In other words, both groups had slightly higher mean scores in their native language than in their second language.


A study of 156 second, third, and fourth grade students in a Korean/English TWI program (Bae & Bachman, 1998) demonstrated that listening and reading skills in Korean were related for both native and non-native Korean speakers. Using a latent variable approach (structural equation modeling), the authors concluded that for both groups of students, the two language comprehension variables were factorially distinct, with a high correlation between listening and reading. Additionally, there were different amounts of variation in listening vs. reading across the two groups. There was more variation in listening ability among non-native Korean speakers because native Korean speakers were fluent and all scored at the top of the range. In contrast, all of the non-native Korean speakers had limited reading ability in Korean and therefore scored toward the bottom of the scale in reading, so there was more variation in reading ability among native Korean speakers.




Several important findings can be drawn from the research on language and literacy development in TWI programs:


  1. There seems to be a native language effect, such that native speakers generally perform higher than second language speakers in terms of both oral and written language proficiency.

  2. Not surprisingly, there seem to be slightly different patterns for NES and language minority students, with NES always showing a clear dominance in and preference for English, and language-minority students demonstrating more balanced bilingualism. Sometimes the language-minority students tend to perform slightly higher in their native language, and other times slightly higher in English. In general, however, their performance on language and literacy measures across languages is much more similar than that of their NES peers.

  3. There is some evidence for transfer of skills across languages, with some studies reporting similar writing processes and products across the two languages. This is not the case in the writing study conducted in a Korean/English program, however, and may point to differences in the amount of potential cross-linguistic transfer and/or interference that may occur depending on the similarities or differences in orthographies in the two languages of instruction.

  4. Both the Korean reading study in this section and the reading studies presented on academic achievement point to inter-relationships between language and literacy skills within and across languages.

  5. From a methodological standpoint, because many of these studies involved relatively small numbers of students in a single TWI program, generalizability is limited. Additional research that looks at language and literacy development of TWI students on a larger scale is needed for a more comprehensive understanding of the developmental trajectories of oral language, reading, and writing in two languages. Further research is also needed to learn more about the developmental processes that children go through as they become bilingual and biliterate, and the instructional contexts that may impact that development, such as initial literacy instruction or the amount of instruction provided through the minority language. Finally, intervention studies, perhaps of longer duration than those reported here, are needed to learn more about effective instructional strategies for promoting oral and written language development in two languages.

