The ability to comprehend and produce language and music is unique to humans (Patel, 2010). Since both domains share structural and auditory complexity, strong parallels between music and language have already been proposed alongside some differences, with an emphasis on possible applications of music to educational settings (Strait & Kraus, 2011). An increasing number of studies have put forward the possible benefits of music training on non-musical learning domains such as verbal intelligence, executive function, creativity, arithmetic processing and linguistic abilities (Gibson, Folley, & Park, 2009; Hoch & Tillman, 2012; Moreno et al., 2009, 2011).

There are firm scientific grounds for a link between musical abilities and first (L1) and/or second language (L2) proficiency. Firstly, musical and speech sounds are segmented and processed similarly by the auditory system (François, Chobert, Besson, & Schön, 2012; Schön et al., 2010). Secondly, these components of language and music may be compounded into larger meaningful units in a structured hierarchical manner—grammar and harmony or rhythm (Patel, 2010; Sloboda, 1985). This shared syntactic integration resource hypothesis (Patel, 2003) challenges the domain-specificity approach and argues that, although representations of music and language components may be stored in different brain regions, a common neural network is used to interpret and structure music and speech sounds (Schön et al., 2010). Therefore, musical practice and expertise fine-tune the auditory system for both music and speech processing and, in turn, strengthen related neural and cognitive mechanisms (Kraus, Strait, & Parbery-Clark, 2012).

In particular, the behavioural and neurological effects of musical expertise on native language proficiency have been extensively researched (Parbery-Clark, Tierney, Strait, & Kraus, 2012). However, less is known about how music training or expertise could impact upon L2 learning. To identify possible benefits of musicianship, Slevc and Miyake (2006) conducted a pioneering study which provided clear evidence for the transfer of musical skills to L2 receptive and productive phonology, as distinct from L2 syntax or lexicon. By concluding that musical expertise may assist only L2 sound structure acquisition, this influential study set the direction for subsequent research, which largely examined the phonological aspects in L2 learning.

With respect to shared syntax and meaning processing in music and language, research suggests that the neurophysiological mechanisms responsible for syntax are enhanced, and develop earlier, due to musical training (Jentschke & Koelsch, 2009; Slevc, Rosenberg, & Patel, 2009). In addition, speech recognition and production depend on, and operate simultaneously with, conceptually-driven processes, such as awareness of the lexical, syntactic and semantic aspects of language (Rapp & Goldrick, 2000). Research suggests that two shared cognitive processes may underlie linguistic and musical syntax: working memory or executive functioning, and implicit learning (i.e., the ability to uncover an abstract structure from noisy input; François & Schön, 2011; Moreno et al., 2011). Although there is ongoing research on the syntactic links between music and language (Slevc, 2012), no studies to date have examined how music, through these shared resources, may influence the learning of L2 syntax and grammar. Because studies of L2 structure and meaning are lacking, this article discusses the available research on the potential phonological and reading benefits of musical experience for L2 learning.

Previous review articles have concentrated on the effects of musical experience on auditory and speech processing skills (Besson, Chobert, & Marie, 2011; Kraus & Chandrasekaran, 2010; Strait & Kraus, 2011) or on the general link between music and language (Asaridou & McQueen, 2013; Slevc, 2012). One recent review did examine the music – L1 and L2 relationship with an emphasis on brain processes (Chobert & Besson, 2013). The present article, in contrast, aims to present an overview of, and critically reflect on, the impact of music education and/or expertise on certain L2 learning skills, by examining:

  1. Which areas of L2 skills are affected and to what extent,
  2. What type or nature of music training is sufficient for the transfer effects,
  3. The relationship between L1 and L2 proficiency and musicality,
  4. The evidence for the role of working memory as one of the explanatory factors in the music-L2 transfer effects.


The article employs a method of research literature analysis to examine the possible benefits of music education and expertise on L2 skills.

A systematic search for peer-reviewed articles was carried out in the EBSCOhost (PsycARTICLES, PsycINFO), Scopus and Web of Science databases, using the keyword “music” in combination with “second language”, “L2”, and/or “foreign language”.

Inclusion criteria were the recency of the article (i.e., not older than 7 years) and the impact factor of the journal (higher than 1.5), as listed in the Thomson Reuters Journal Citation Reports. An initial search in November 2012, combined with an updated search in January 2014, yielded 56 hits, 13 of which were included in the present article’s analysis. Relatively few of these 56 hits addressed the chosen topic (i.e., music’s effect on L2); of those that did, few were excluded in practice. I chose articles that concentrated on the psychological aspects of the music-L2 relationship, thus following an experimental psychology perspective and methodology and using quantitative statistical analysis of data. Articles of a purely descriptive nature were excluded. Nonetheless, the chosen papers were selected to represent as many linguistic subdomains and as wide a range of musical activities as possible.
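The screening logic described above can be illustrated with a short sketch. It is not the procedure actually used for this review, which was carried out by hand in the databases; the `Article` record, its field names, and the choice of 2014 as the reference year for recency are assumptions made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical record type: the field names are illustrative assumptions.
@dataclass
class Article:
    title: str
    year: int
    journal_impact_factor: float
    quantitative: bool  # experimental study with quantitative statistical analysis

SEARCH_YEAR = 2014        # assumed reference year (the updated search)
MAX_AGE_YEARS = 7         # recency criterion: not older than 7 years
MIN_IMPACT_FACTOR = 1.5   # impact factor must be higher than this value
L2_TERMS = ("second language", "l2", "foreign language")


def matches_query(text: str) -> bool:
    """True if the text contains 'music' plus at least one L2 keyword."""
    lowered = text.lower()
    return "music" in lowered and any(term in lowered for term in L2_TERMS)


def meets_inclusion_criteria(article: Article, search_year: int = SEARCH_YEAR) -> bool:
    """Apply the recency, impact factor and study-type criteria."""
    recent = search_year - article.year <= MAX_AGE_YEARS
    high_impact = article.journal_impact_factor > MIN_IMPACT_FACTOR
    return recent and high_impact and article.quantitative


def screen(articles: list[Article]) -> list[Article]:
    """Keep only the articles that satisfy every inclusion criterion."""
    return [a for a in articles if meets_inclusion_criteria(a)]
```

Because the impact factor criterion is strict ("higher than 1.5"), a journal with an impact factor of exactly 1.5 would be excluded under this reading.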

Analysis of Evidence for the Music-L2 Transfer Effects

A more detailed overview of the methods and findings of the relevant articles can be found in Tables 1, 2, 3, and 4. The arguments are guided by analysis of the relevant articles and by critical reflection, and are organized thematically as follows.

Table 1

Summaries of articles that examined phonological perception of L2: tonal variation.

Fields per article: Sample (size, age, musical experience); Native language - L2; Measures; Findings.

Delogu et al.
  • Sample: N=90 adults (controls, musicians, Mandarin Chinese experts); N=20 children, 6–8 years (high and low musical ability groups)
  • Native language - L2: Italian - Mandarin Chinese
  • Measures: Musical ability; L2 receptive phonology; Lexical tone discrimination; Years of musical training.
  • Findings: Adult musicians performed better than musically naive subjects and equivalently to Chinese experts in the lexical tone discrimination task. Children with a higher musical ability showed higher tonal performance. The discrimination of segmental variations (consonant or vowel changes) was not enhanced in musically able subjects.

Lee & Hung
  • Sample: N=72 adults (musicians and non-musicians)
  • Native language - L2: English – Mandarin Chinese
  • Measures: Processing of linguistic and musical pitch.
  • Findings: Musicians more accurate at identifying correct syllables among intact or modified syllables in pitch height or pitch contour.

Marie et al.
  • Sample: N=22 adults (musicians and non-musicians)
  • Native language - L2: French – Mandarin Chinese
  • Measures: Lexical tone and segmental processing.
  • Findings: Musicians were faster and more accurate at detecting both tonal and segmental variations than non-musicians. Musical expertise improved the perception and categorization of segmental and tonal linguistic contrasts.

Marques et al.
  • Sample: N=22 adults (musicians and non-musicians)
  • Native language - L2: French - Portuguese
  • Measures: Lexical tone processing.
  • Findings: Musicians were faster and more accurate than non-musicians in detecting prosodic pitch violations.

Martínez-Montes et al.
  • Sample: N=52 adults (musicians and non-musicians)
  • Native language - L2: Spanish – Mandarin Chinese
  • Measures: Syllabic pitch processing.
  • Findings: Musicians showed larger mismatch negativities (MMNs) to pitch contour deviations in both harmonic sounds and L2 syllables.

Note: (a) Also studied segmental processing. L2 = second language.

Table 2

Summaries of articles that examined phonological perception of L2: phoneme duration & language segmentation.

Fields per article: Sample (size, age, musical experience); Native language - L2; Measures; Findings.

Herrera et al.
  • Sample: N=97 pre-school children (mean age 4.5 years); musical experience: none
  • Native language - L2: Spanish-Spanish and Tamazight-Spanish
  • Measures: Intelligence; Spanish vocabulary; Phonological awareness; Naming speed; Verbal short-term memory (STM).
  • Applied intervention: 2 eight-week training periods of phonological training with/without music over 2 years.
  • Findings: Early phonological and musical intervention improved both naming speed and phonological awareness (two predictors of reading readiness) in native speakers and L2 learners. L2 learners who received training with music developed naming speed skills and phonological awareness of the ending of words more rapidly than Spanish children in the control group.

François et al.
  • Sample: N=24 children (mean age = 8); musical experience: none
  • Native language - L2: French – artificial language
  • Measures: Verbal Comprehension; Perceptual Reasoning; Working Memory and Attention; Speech segmentation; ERPs.
  • Applied intervention: 2-year training in music or painting.
  • Findings: Children with musical training improved their speech segmentation abilities while children in the painting group did not. The difference in the electrophysiological responses for familiar and unfamiliar words was greater in the music group than in the painting group.

Sadakata & Sekiyama
  • Sample: N=53 (Dutch) and N=54 (Japanese) adults; musicians (excluding singers) and non-musicians
  • Native language - L2: Japanese-Dutch and Dutch-Japanese
  • Measures: L2 receptive phonology (the timing and quality of Japanese consonants, and the quality of Dutch vowels).
  • Findings: Musical expertise benefited discrimination of speech materials (indicating automatic neural encoding processes) more than identification (application of categories to an incoming sound) in both L1 and L2. Musicians showed the greatest enhancement in the identification of temporal aspects of speech (duration of consonants and vowels).

Note: L1 = first language, L2 = second language.

Table 3

Summaries of articles that examined phonological production of L2.

Fields per article: Sample (size, age, musical experience); Native language - L2; Measures; Findings.

Posedel et al.
  • Sample: N=45 adults; years of musical training as a continuous variable
  • Native language - L2: English-Spanish
  • Measures: Working memory (Operation span); Pitch perception; L2 phonological production; Start and duration of music and L2 training.
  • Findings: Correlation between the length of musical training and both pitch perception and working memory. Pitch perception was the only significant predictor of Spanish pronunciation quality.

Milovanov et al.
  • Sample: N=40 children (10–12 years); musically trained or untrained
  • Native language - L2: Finnish-English
  • Measures: Musical aptitude.
  • Findings: Children with advanced English pronunciation abilities had better musical skills than those who showed less accurate English pronunciation skills. Children with good linguistic skills showed more pronounced sound-change evoked activation with the music stimuli.

Milovanov et al.
  • Sample: N=46 adults; non-musicians, choir members, English philology students
  • Native language - L2: Finnish-English
  • Measures: Musical aptitude; Phonemic listening discrimination.
  • Findings: All the participants performed equally well in the phonemic listening discrimination task. The participants with higher musical aptitude were able to pronounce English better than the participants with less musical aptitude.

Note: L2 = second language.

Table 4

Summaries of articles that examined perception of L2 written text.

Fields per article: Sample (size, age, musical experience); Native language - L2; Measures; Findings.

Strait et al.
  • Sample: N=42 children (mean age = 10.4); musical experience: none
  • Native language - L2: English
  • Measures: Oral and silent reading; Musical aptitude; Attention/Auditory Working Memory (AWM); Human auditory brainstem responses.
  • Findings: The music aptitude, by means of shared neural (brainstem) and cognitive (AWM/Attention) mechanisms, accounted for 38% of variance in children’s reading abilities. The effect of musical aptitude largely came from children’s rhythmic skills.

Swaminathan & Gopinath
  • Sample: N=76 primary school children; musically trained and untrained
  • Native language - L2: Indian languages - English
  • Measures: Verbal intelligence, Wechsler’s Intelligence Scale for Children (WISC); English reading proficiency.
  • Findings: Children with music training (Western or Indian) perform better than untrained children in L2 comprehension and vocabulary, but not in reading proficiency.

Note: L2 = second language.

Enhanced Phonological Processing and Production and L2 Comprehension

In order to define the possible benefits of musical ability on L2 learning, it is necessary to explain how diverse linguistic subdomains can be affected by this cross-domain relationship. As auditory qualities of language and music are more tangible than semantic ones, a considerable amount of research has been done on the effect of musical training on pitch and duration perception in speech.

Enhanced processing of tonal variations. Firstly, it has been found that musically able adults detect and identify foreign language lexical tone variation better than adults without a musical background. Analysis of the research literature indicates that tonal fluctuation processing has been studied most extensively (see Table 1). For example, Marques, Moreno, Castro, and Besson (2007) demonstrated that musicians more easily detect small prosodic pitch deviations, whereas Delogu, Lampis, and Belardinelli (2010) indicated that both musically able children and adults are better at discriminating lexical tones, but not phoneme duration variation. Lee and Hung (2008) extended the findings on enhanced discrimination abilities and showed that musicians are also more accurate at identifying pitch height and pitch contour in speech. This advantage of musical experience persists at the neurophysiological level: studies using event-related potentials (ERPs) reveal that musicians categorize tonal changes more easily and confidently (Marie, Delogu, Lampis, Belardinelli, & Besson, 2011). Musicians are also faster to categorize prosodic variation (Marques et al., 2007), and more sensitive to foreign syllabic tone change (Martínez-Montes et al., 2013).

Enhanced processing of utterance duration. Secondly, musical experience has a beneficial impact on the processing of phoneme duration and overall language segmentation (see Table 2). Marie et al. (2011), in contrast to Delogu et al. (2010), showed that musical training enhanced discrimination and categorization of both segmental and tonal contrasts. In a cross-linguistic study, Sadakata and Sekiyama (2011) compared musicians and non-musicians and found that musical expertise benefited discrimination of speech materials more than identification in both L1 and L2, and that the greatest effect of musicality on identification processes was seen in the temporal aspects of speech (duration of consonants and vowels). Therefore, musicianship enhanced automatic neural encoding to a greater extent than the application of categories to an incoming sound. Importantly, the enhancement effect interacted with linguistic background: for example, Dutch musicians did not outperform Japanese musicians and non-musicians on sensitivity to certain specific Japanese language material. The distinctiveness of perceptual cues depended on the first language, implying that some languages themselves enhance phonological perception more than others. Moreover, François et al. (2012) experimentally assigned children to 2 years of music or painting training and noted that only musical training improved speech segmentation abilities in an artificial language, as demonstrated by behavioural and electrophysiological responses. Herrera, Lorenzo, Defior, Fernandez-Smith, and Costa-Giomi (2011) used a similar paradigm and showed that a 2-year phonological and musical intervention improved both naming speed and phonological awareness of real language words. Therefore, research suggests that music training may enhance phonological perception (tonal and timing) at both early processing and categorical perception levels.

Enhanced pronunciation. Thirdly, musical training has a positive impact on L2 phonological production abilities (see Table 3), and this is most likely mediated by the effects of musical training length on pitch perception ability, as suggested by Posedel, Emery, Souza, and Fountain (2011). In their study, pitch perception was the only significant predictor of Spanish pronunciation quality. Perhaps due to the smaller sample, the length of musical education was not significantly correlated with pitch perception, L2 pronunciation or working memory (WM) within the group of musically trained individuals. Milovanov, Huotilainen, Välimäki, Esquef, and Tervaniemi (2008) studied children and approached the relationship between music ability and L2 skills from the opposite direction: Finnish children with advanced English pronunciation abilities had higher musical aptitude. These findings were supported in a study with adult participants; however, musical aptitude conferred no additional advantage in phonemic discrimination (Milovanov, Pietilä, Tervaniemi, & Esquef, 2010).

Enhanced L2 comprehension. Fourthly, the perception of a written text, that is, reading, can also be enhanced by musical training, although there have been fewer research articles published on this topic (see Table 4). Herrera et al. (2011) demonstrated that 2 years of phonological and music training improve reading readiness in native speakers and L2 learners. Strait, Hornickel, and Kraus (2011) examined older children and showed that musical aptitude (and particularly rhythmic abilities) is related to reading skills. Furthermore, Swaminathan and Gopinath (2013) studied primary-school children speaking various Indian languages, and found that those who were trained in Western or Indian music performed better in English L2 comprehension and vocabulary, and this advantage was not due to familiarity with English music tradition.

Thematic overview. When comparing the above four sub-themes, a certain number of observations can be made. To start with, Sadakata and Sekiyama’s (2011) results are in line with Marie et al.’s (2011) findings from event-related potentials, which measured responses to the tonal processing of the Mandarin language. Although Japanese has simpler tone systems, both studies found enhanced perceptual processing and categorization of important L2 contrasts. The choice to study two very different languages (Dutch and Japanese) was an advance, because intonation, spectral and timing features are prominent in music and in these languages.

The Strait et al. (2011) findings can potentially explain how identification of L2 categories occurs in Sadakata and Sekiyama’s (2011) study. Although identification (application of categorical information to auditory stimulus) is a perceptually and cognitively challenging task, and the learning of L2 sound categories was very short, musicians might have been faster than non-musicians to acquire foreign categories as an acoustic regularity. Since word meaning in the Japanese and Dutch languages changes with the temporal and spectral qualities of sound, respectively, these languages require joining together the temporal/spectral and syllabic material. Delogu et al. (2010) and Marie et al. (2011) also suggest that the positive effect of music expertise on enhanced perceptual processing might arise because musicians rely on both left-lateralized segmental cue processing and right-lateralized tonal cue processing. Therefore, weaker brain lateralization and enhanced subcortical processing, induced by musical activities, could play a role in learning L2 that is very different from the native one.

Next, the Posedel et al. (2011) finding that the length of musical training within the musicians group was not correlated with pitch perception or Spanish productive phonology skills contradicts most previous findings, including Sadakata and Sekiyama’s (2011) results, in which the duration of training positively affected the accuracy of identification performance. This discrepancy could be explained by differences between the participants in the two studies: Posedel et al. (2011) only accounted for the length of musical training, whereas Sadakata and Sekiyama (2011) defined their musician sample as experts with current active involvement in musical activities. Posedel et al.’s (2011) finding that only pitch perception, and not the years of music training, influenced pronunciation is consistent with Milovanov et al.’s (2008, 2010) findings that children and adults with advanced English pronunciation abilities have higher musical aptitude. Thus, it is not surprising that the years of training per se might not directly translate to better musical and phonological skills, whereas pitch perception abilities, which are a robust component of musical aptitude, may have a more direct effect on L2 productive phonology.

The phonological awareness of L2 seems to be positively affected not only by musical expertise, but also by training (e.g., Herrera et al., 2011) and innate predisposition to musical expertise (e.g., Strait et al., 2011). Pre-school children who received musical training had enhanced phonological awareness in a distinctly different language from their mother-tongue (Tamazight), similarly to adult musicians in Sadakata and Sekiyama’s (2011) study. Strait et al. (2011) not only eliminated the effects of musical training, but showed that even musical aptitude could predict speech processing abilities by the enhanced perception of rhythmic regularities, which adds to Sadakata and Sekiyama’s (2011) emphasis on spectral and temporal aspects.

Conceptualisation of Terms—What Is Musical Training or Expertise?

The relationship between musical training and expertise. Only 2 of the selected studies assigned participants to music training, whilst the majority assumed that the quality of participants’ music training was sufficient to result in musical expertise that would initiate L2 skills enhancement. Herrera et al. (2011) carried out a 2-year follow-up study on pre-school children. Participants received phonological training either with music (based on children’s rhymes and songs) or without music. The phonological training greatly improved both naming speed and phonological awareness, which are two strong predictors of reading readiness. Tamazight children who received training with music developed their L2 (i.e., Spanish) naming speed skills and phonological awareness of word endings more rapidly than Spanish children in the control group. François et al. (2012) also carried out 2 years of music training and showed that speech segmentation abilities in an artificial L2 improved in children who underwent music training compared to art training. These studies thus provide evidence that music training nurtures brain plasticity and facilitates L2 phonological and verbal comprehension skills, whilst controlling for a potential antecedent inclination towards music as a cause of these changes.

Musical expertise and its possible confounds. It follows that when broad and complex issues like music or language are being investigated, the terminology and validity of the measures that studies use are of critical importance. Herrera et al. (2011), Sadakata and Sekiyama (2011), and Strait et al. (2011) tested for pre-existing differences between the groups, such as general cognitive abilities and relevant confounding factors. However, Posedel et al. (2011) treated participants as a homogeneous sample and did not account for intelligence and L2 vocabulary knowledge, as Herrera et al.’s (2011) study did. Furthermore, Posedel et al. (2011) used the amount of formal Spanish training as the single measure of pre-existing L2 competency, despite the fact that informal exposure to Spanish culture, or overall individual differences in the Spanish proficiency gained after a particular amount of formal training, could affect phonological production. The Posedel et al. (2011) study could have benefited from inspecting subjective self-ratings of L2 expertise (as was done in Sadakata and Sekiyama’s, 2011, study), since it is known that the length of exposure to L2 improves phonological abilities (Slevc & Miyake, 2006). Swaminathan and Gopinath (2013), on the other hand, provide evidence for the music-L2 link using participants with non-Western music training, which not only extends the transfer effects beyond Western Classical music training but also aims to control for exposure to the L2 culture.

Vocal versus instrumental musical expertise. It is also important to note that the majority of studies examined the effects of instrumental expertise, explicitly excluding singing as a type of musical training without explaining the reasons behind it. Yet, the Herrera et al. (2011) study was based on spoken and singing training that engaged children in various activities, but did not teach music per se and still attained remarkable L2 improvements. Similarly, Ludke, Ferreira and Overy (2013) found that the “listen-and-sing” learning method can facilitate verbatim memory for L2 phrases after a brief 15-min learning period, but musical ability and training do not facilitate this effect. In contrast, Moreno et al. (2009) and François et al. (2012) used formal musical training which consisted of several well-established methodologies (i.e., Kodály and Orff) and found effects very similar to those in Herrera et al.’s (2011) study: improved reading skill, pitch discrimination and speech segmentation. Whereas the aforementioned music education methodologies include the active use of one’s voice, Schön et al. (2008) showed that mere listening to a vocalization rather than a speech fragment evoked strong language learning facilitation. However, it is understandable that some researchers eliminate musical expertise gained from singing, since singers usually know how to play at least one instrument, and this is practiced in formal musical education and is naturally relevant. The observation that singing may involve instrumental expertise implies that singing might provide additional L2 enhancement; therefore, it may be hard to disentangle the effects of each.

Indeed, very diverse definitions of musical experience and methods have been used, but similar positive results for music-L2 transfer were achieved in the aforementioned research.

Effects of Music on First and Second Language Proficiency

Musical activities seem to give an advantage to the aforementioned linguistic subdomains of one’s first and second languages. Herrera et al. (2011) found that phonological training with music especially benefited Tamazight-speaking (L1) children, who, after training, were similar to Spanish control children in the naming speed task. Thus, although native language speakers were generally better at reading readiness measurements, the effect was stronger for those children who learned Spanish as their second language. Similar advantages of musicianship on L1 and L2 were found in Sadakata and Sekiyama’s (2011) study—for example, Dutch and Japanese musicians outperformed non-musicians in the identification of Japanese stop contrasts. With regard to discrimination performance, musicians demonstrated shorter reaction time (RT) and greater accuracy in L1 and L2. Furthermore, Japanese musicians did not show an overall advantage on L1 materials, but did exceed Japanese non-musicians on several L2 speech contrasts. Hence, both studies imply that the effect of musical activities is stronger on one’s L2 perhaps due to the ceiling effect. Considering the aforementioned link between L1 and L2, Strait et al.’s (2011) results may apply to L2 learning as well, although the study exclusively examined reading abilities in children’s native-language.

Despite the similar findings with regard to L1 and L2 in the aforementioned studies, the learning of both languages occurred concurrently in Herrera et al.’s (2011) study and sequentially in Sadakata and Sekiyama’s (2011) study. Hence, children’s L1 and L2 abilities were affected simultaneously, while adults’ L2 abilities were based on their first language. Studies show that L2 proficiency is significantly influenced by L1 proficiency (Lee et al., 2007), and this L1-L2 link is in line with Sadakata and Sekiyama’s (2011) study which argues that L2 proficiency should not be explained in terms of musical expertise alone.

The Role of Working Memory in the Music-Language Transfer Effects

Working memory (WM) is associated with L2 reading, speaking, vocabulary and listening abilities (Kormos & Sáfár, 2008). However, among the selected papers only a few articles took into account WM in their analysis of music-L2 transfer (François et al., 2012; Herrera et al., 2011; Posedel et al., 2011; Slevc & Miyake, 2006). Herrera et al. (2011) indicate that verbal WM and naming speed are strong mutual predictors of reading and are crucial at the initial stages of reading acquisition. A phonological recoding strategy is said to develop concurrently; it decomposes the written word into sound components and keeps them in WM while the word’s meaning and pronunciation are retrieved from long-term memory (LTM). Musical training especially accelerated children’s L2 naming speed, that is, the recall of labels from LTM.

In contrast, Posedel et al. (2011) assumed that WM ability could mediate the relationship between musical training and L2 productive phonology; however, they did not further hypothesize that WM would predict Spanish phonological production. Thus, the inclusion of WM in the analysis was based on unsubstantiated grounds, and resulted in the finding that WM was not a significant mediator between musical training and phonological production. This was not surprising, since a direct link between working memory and pronunciation quality is unlikely, and other variables might be involved in this relationship. Posedel et al. (2011) nonetheless suggested that auditory WM could predict the receptive phonology and syntax of the L2. Receptive phonology, unlike the syntactic aspects of L2, was studied in the Herrera et al. (2011) and Strait et al. (2011) papers with auditory working memory as an important factor in the music-language link.

Musical abilities were found to improve auditory WM (AWM; Strait, Kraus, Parbery-Clark, & Ashley, 2010). Strait et al. (2011) further revealed that AWM/Attention is the driving mechanism of children’s reading abilities. This adds to previous findings since this relationship is achieved by mere musical aptitude, without any training. Most importantly, Strait et al. (2011) argued that the relation between AWM/Attention and the amount of repetitive brainstem enhancement was accounted for by their shared relationships with music aptitude. They studied children aged between 8 and 13 years who were divided into good and poor readers. The researchers measured oral and silent reading speed, auditory working memory, musical aptitude (both tonal and rhythmic), and human auditory brainstem responses (ABRs). Music aptitude, by means of shared neural (brainstem) and cognitive (AWM/Attention) mechanisms, accounted for 38% of variance in children’s reading abilities, and no other pre-tested factors accounted for this relationship. The effect of musical aptitude largely came from children’s rhythmic skills. In addition, good readers and musically inclined children had a bigger subcortical enhancement of regular speech harmonics and better auditory WM, indicating stronger top-down functions. Musicians tend to have well-developed rhythmic ability owing to extensive learning of melody patterns and lyrics; this translates into better detection of speech regularities, and both in turn play an important role in improving AWM and attention (Kraus et al., 2012). This influence is reciprocal: the enhanced cognitive abilities provoke functional brain plasticity that reinforces subcortical processing of sound and speech.

AWM/Attention was also suggested as an explanatory mechanism for the enhancement effects in many of the selected papers. However, there is some contradictory evidence regarding the role of attention. For example, Marie et al. (2011) noted that the facilitation effects in their study cannot be accounted for by attention, since musical experience did not benefit all lexical pitch and segmental processing aspects automatically.


Summary of the main findings

It is evident from the aforementioned studies that musical training and aptitude, as well as the resulting musical expertise, positively alter aspects of L2 proficiency. The studies demonstrated that not only are pitch perception, subcortical processing of acoustical regularities and WM the possible mediators between musical experience and language skills, but also that musical training may benefit reading acquisition and phonological awareness of timing and pitch of L2 speech sounds in both acoustical and categorical analysis of speech.

Music facilitation of L2 phonological perception and production has been the most extensively studied area. More recent studies elaborated and confirmed Slevc and Miyake’s (2006) findings of superior L2 receptive and productive phonology among musicians, and this collection of results is consistent with the Speech Learning Model, which states that phonological production and perception of L2 segments are related (Flege, 1995). Furthermore, the L2 facilitation was linked to musical expertise, musical training and musical aptitude, and this relation is in accordance with the assumptions of the “OPERA” hypothesis: higher acoustic encoding precision in musicians should benefit various aspects of phonological awareness (cf. Patel, 2012). The theoretical basis for the facilitation of reading skills by music experience is, to date, less clear, but it may be best explained by increased listening sensitivity, WM and attention, or general intelligence (Strait et al., 2011; Swaminathan & Gopinath, 2013). Nonetheless, all the articles support Patel’s (2003) hypothesis that networks are at least partially shared between the music and language domains.

Practical implications for education

The reviewed findings have practical implications for music teaching and language learning. Collectively, they suggest that music training may play a significant role in human language development by aiding a number of perceptual and cognitive processes required for language acquisition. Thus, integrating music training into L2 programmes or encouraging concurrent music and language training can potentially improve foreign language pronunciation, receptive phonology and reading skills. Since music training can facilitate both the tonal and timing aspects of L2 phonology, the implicated facilitation may apply to different language groups, including tonal languages (Delogu et al., 2010). Because tonal contrasts are especially hard to distinguish without prior experience, music could provide a tool for familiarization with musical and lexical tones. In an increasingly globalized world, it is useful to have ‘perceptual fundamentals’, that is, sensitivity to key acoustic parameters such as pitch or duration, which would make it easier to learn languages from other language families.


Some conceptual and methodological issues of these studies and the research area need careful consideration. Posedel et al. (2011) examined phonological production and attempted to identify mediating factors between music and L2. However, the test for pitch perception in this study (Wing Measures of Musical Talents; Wing, 1968) was used in other studies as a musical intelligence/ability test (Delogu et al., 2010; Slevc & Miyake, 2006). Furthermore, Slevc and Miyake (2006) included a melody production task (an extension of the melody-judgement test), whereas the Posedel et al. (2011) version had no productive aspect even though they measured L2 phonological production. Wing’s (1968) test measures musical pitch, but not the perception of changes of pitch in speech (i.e., tonal variation). Thus, it would be more reasonable to expect musical pitch discrimination skills to link with pitch perception in speech, which would then relate to L2 phonological production. Alternatively, as mentioned above, a productive pitch test (for example, production of the note A, used as a standard for tuning) could have been used to assess the mediating role more precisely.

Moreover, the intensity and type of participants’ musical training were not accounted for in the Posedel et al. (2011), Marques et al. (2007), and Sadakata and Sekiyama (2011) studies (cf. Woody & Lehmann, 2010), and neither was absolute pitch (AP) in any of the studies. AP ability is the extreme of musical expertise and, owing to the exceptionally enhanced acoustical awareness and brain connectivity associated with it, may benefit language skills even more than ordinary musical expertise (cf. Loui, Li, Hohmann, & Schlaug, 2011). AP might be especially relevant for studies that used tonal languages, such as Mandarin Chinese, and tone identification tasks; but, since the discussed articles did not explicitly study this possibility, the role of AP in tone identification remains inconclusive (Lee & Hung, 2008).

In addition, Posedel et al. (2011) and Sadakata and Sekiyama (2011) looked at ‘late’ L2 learners, some of whom were experts in the music domain, whereas Herrera et al. (2011), François et al. (2012), Milovanov et al. (2008), Delogu et al. (2010), and Swaminathan and Gopinath (2013) considered young children whose brains were still developing and for whom music was introduced while L1 and L2 skills were still emerging. The latter studies have an advantage in the participants’ age, since there might be a sensitive period, at around seven years of age, beyond which music-induced structural alterations in the brain are less salient (Habib & Besson, 2009). Accordingly, because the adult musicians’ brains would already have undergone structural and functional changes, it is difficult to determine whether it was early musical training or expertise gained over long practice that positively affected L2 learning (Lee & Hung, 2008; Martinez-Montes et al., 2013; Posedel et al., 2011; Sadakata & Sekiyama, 2011). While the studies examined the potential benefits of musical training or expertise on L2 proficiency separately, the relation between short-term musical lessons, musical expertise, and the effects of both on L2 remains unclear. To date, few attempts have been made to link the effects of the amount and the continuity of musical training (Delogu et al., 2010); therefore, more longitudinal studies looking at various age groups of musicians and non-musicians learning an L2 would help to elucidate the potential causal link. Since most of the articles (except for the ones that experimentally manipulated training) used a quasi-experimental design, it is possible that other factors underlie music-L2 transfer. For example, individual differences such as motivation to excel in learning, predisposition to music or language, openness to experience, and conscientiousness may explain why musicians are better at certain L2 skills. Therefore, only studies using longitudinal designs could reliably demonstrate that the positive transfer is a result of music training.

Suggestions for Future Research

There are many promising directions that could further enhance this research area. Most importantly, the research on the music-L2 link has been inclined towards phonology. Future studies should aim at examining other shared aspects between music and language; namely, structure and meaning (Slevc, 2012).

Moreover, none of the studies looked at the differences between individual and group musical activities; that is, between singing or playing solo and performing in a choir or an orchestra (Kraus et al., 2012). Since playing an instrument and singing in a group require greater response control and selective attention, we would expect musicians with cooperative musical experience to have better executive function, which in turn has been related to bilingual abilities (Bialystok & DePape, 2009).

Furthermore, if L1 proficiency impacts L2, the question arises as to what influence different types of languages have on L2 abilities in combination with musical activities. For example, tonal languages enhance tonal aspects of speech perception and are even related to greater structural brain plasticity, irrespective of whether the language is learned as L1 or L2 (Crinion et al., 2009). Similarly, languages with specific grammatical structures, such as Indo-European languages that retain a complicated declension system (e.g., Lithuanian, which has seven cases) or three genders (e.g., German), might potentially enhance some other L2 subdomains like syntax or grammar. In quantity languages, such as Hungarian and Japanese, the duration of vowels changes the word meaning; thus, speaking such a language natively might enhance the perception of phoneme durations in L2. For instance, Roncaglia-Denissen, Schmidt-Kassow, Heine, Vuust, and Kotz (2013) found that if the learner’s native language has distinct rhythmic properties (e.g., stress position) compared to the L2, mastering them may enhance musical rhythm perception. However, the interaction of language-specific (L1) characteristics and musical activities on L2 has not yet been thoroughly examined. As research has largely concentrated on English or tonal languages, it would also be of interest to examine the influence of musical expertise on quantity languages as L2.

Likewise, motivational and situational factors, such as reasons for learning an L2, should be studied, as it may be that highly motivated individuals without musical experience can be as good as musicians at learning languages. Since music expertise and training are usually acquired in childhood, future research could test whether there is a critical period for music training and to what extent such training would be beneficial for adults learning an L2. In addition, although Strait et al. (2011) examined native language learning, the proposed model is consistent with the findings of other papers and could be potentially useful in L2 acquisition and bilingual settings. Therefore, in order to ascertain the educational implications of musical aptitude and training via WM, the study of Strait et al. (2011) should be replicated with an L2 learned simultaneously with L1 or after L1 is acquired. Similarly, the identification of foreign speech categories sheds more light on natural speech processing than discrimination does, and it would be interesting to see the effects of background sounds on L2 identification performance in musicians and non-musicians (cf. Thompson, Schellenberg, & Letnic, 2012). In short, the interactive effects of musical skills, variations in language and music systems, and external variables need to be further examined.


The discussed articles provide evidence for enhanced auditory and cognitive abilities in musically trained individuals, which contribute to the phonological and reading aspects of L2 acquisition. However, the application of findings from native-language studies to L2 is still in its early stages, and the influence of musical activities on other language subdomains like syntax or pragmatics is not yet understood. There is a shortage of research on the wide array of L2 learning aspects other than phonology, and more studies are needed to fully define the scope of benefits that musical activities exert on children’s and adults’ ability to acquire additional languages.