
Why Phone Calls in a Second Language Feel Unbearable

The Expat Speak Team

Language Learning Designers

March 1, 2026 · 14 min read
phone anxiety, speaking anxiety, L2 anxiety, cognitive load, pre-task planning

The Phone Rings. Your Stomach Drops.

You can handle a face-to-face conversation in German. Maybe not elegantly, but you get through it. Then the phone rings — a doctor's office, a landlord, the Krankenkasse — and your mind goes blank before you even answer.

This is not a character flaw. It is a predictable response to a communication medium that strips away the very compensatory mechanisms L2 speakers depend on. Phone calls in a second language are harder. The research confirms what your nervous system already knows.

FLA and Self-Efficacy Correlation: r = −.70. One of the strongest relationships in the foreign language anxiety literature (Zhou et al., 2023; N=26,589).

Phone Anxiety (Millennials): 81% report pre-call apprehension (BankMyCell survey; industry data, not peer-reviewed).

A note on the evidence

This article synthesizes research from applied linguistics, communication psychology, and clinical anxiety studies. Where evidence is indirect or contested, we flag it explicitly. The field has surprisingly few studies that directly compare phone and face-to-face L2 communication — a gap we address honestly.

The Phone Strips the Channels You Need Most

When you listen to your second language, your eyes do work your ears cannot. Eye-tracking studies consistently show that L2 speakers look significantly more at the speaker's mouth than native speakers do — and lower-proficiency listeners fixate even more. The phone removes precisely this compensation channel.

Birulés et al. (2020) found that even highly proficient L2 speakers attended significantly more to the talker's mouth when processing their L2 than when processing their native language. Grüter et al. (2023) replicated and extended this: lower-proficiency listeners showed a gradient effect, fixating even more on the mouth. Phone calls therefore eliminate a visual channel that L2 speakers actively exploit.

Cue Type | Face-to-Face | Phone Call | Impact on L2
Lip-reading / Visual speech | Available | Absent | Critical for phoneme disambiguation when auditory processing is uncertain
Facial expressions | Available | Absent | Signals comprehension, confusion, patience; all invisible by phone
Gestures / Deictic pointing | Available | Absent | Non-native listeners show greater reliance on gestural cues (Drijvers et al., 2019)
Gaze / Turn-taking signals | Available | Absent | Visual transition signals prevent overlap and awkward silences
Shared physical context | Available | Absent | Referential grounding through pointing and shared environment

L2 speakers rely more heavily on visual cues because their auditory processing is less automatic. Phone calls strip all of them simultaneously.

A contested area

The assumption that video is always better is not universally supported. Some studies find non-significant differences between video and audio-only conditions (Batty, 2015; Kamiya, 2025). A 2023 meta-analysis (Sánchez et al., 2023) finds that the effect is moderated by task type and learner level: video helps most when the task is difficult and visual information is content-relevant.

Non-native listeners show greater reliance on gestural cues than native listeners, with distinct oscillatory dynamics during audiovisual speech processing. Removing this channel concentrates all processing demands on the auditory channel.

Drijvers et al. (2019), NeuroImage

Why Everything Feels Faster and Harder

Cognitive Load Theory explains why phone calls feel overwhelming. Working memory has limited capacity distributed across partially independent visual/spatial and auditory/verbal channels. L2 listening already imposes high intrinsic load from unfamiliar phonology, syntax, and vocabulary. When visual cues are available, they provide a compensatory channel. On the phone, all processing concentrates on the auditory channel.

The result is a resource squeeze: your brain needs more processing power for the same input. When comprehension difficulty rises, the same objective speech rate feels subjectively faster because you cannot segment and predict as effectively. The literature consistently identifies speech rate as a major L2 listening difficulty — and on the phone, this is amplified.

Vocabulary retrieval as primary anxiety trigger: ~18% of learners cite 'not remembering vocabulary although they knew it' as their principal worry (Bárkányi & Brash, 2025).

FL anxiety ↔ Achievement correlation: r = −.34 to −.39. Three independent meta-analyses converge on this negative relationship (Teimouri et al., 2019; Zhang, 2019; Botes et al., 2020).
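
One way to put the correlations quoted in this article on a common scale is to square them: r squared estimates the share of variance two measures have in common. The sketch below is illustrative arithmetic only. The helper name `shared_variance` is hypothetical, and a squared correlation says nothing about which variable causes which.

```python
def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance two measures share."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlations quoted in this article (values from the cited meta-analyses).
correlations = {
    "FLA and self-efficacy": -0.70,
    "FLA and achievement (weaker estimate)": -0.34,
    "FLA and achievement (stronger estimate)": -0.39,
}

for label, r in correlations.items():
    print(f"{label}: r = {r:+.2f}, shared variance = {shared_variance(r):.0%}")
```

Read this way, anxiety and self-efficacy share roughly half of their variance, while anxiety and achievement share about 12 to 15 percent.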

Why listening feels harder on the phone

Non-native speech perception research finds that if adverse-condition listening is hard, foreign-language listening is 'doubly so' — you face both imperfect signals and imperfect linguistic knowledge. Meanwhile, non-native speaker pairs engage in significantly more repair than native pairs (Varonis & Gass, 1985), yet phone calls make repair harder because you cannot use visual clarification strategies.

Repair — the process of fixing misunderstandings — becomes a double bind on the phone. Breakdowns are more frequent because you cannot lip-read or use gestures. Yet the available repair strategies are more limited and more face-threatening. All repair must be accomplished verbally. Studies of L2 phone conversations find a pattern of repair avoidance: service personnel accept candidate understandings rather than initiating potentially face-threatening repair sequences.

When You Cannot Predict What Comes Next

Perceived control is a transdiagnostic anxiety vulnerability — it cuts across social phobia, generalized anxiety, panic, and OCD. A meta-analysis of 51 studies (N=11,218) found a large negative association between perceived control and anxiety. L2 phone calls embody reduced perceived control: you cannot predict the interlocutor's vocabulary, speaking rate, accent, or topic shifts.

Script theory explains why unpredictable calls feel worse. Mental representations of stereotyped event sequences — the 'restaurant script,' the 'phone call script' — enable prediction and reduce cognitive processing demands. When you lack well-developed target-language phone scripts, the predictability advantage is lost. Mak (2011) found that 'speaking without preparation' was the most anxiety-provoking factor among 313 Chinese ESL students.

Call context | Predictability (Low) | Predictability (High)
Known interlocutor, familiar topic | 15% | 85%
Unknown caller, predictable purpose | 35% | 65%
Unknown caller, unfamiliar topic | 65% | 35%
Authority figure, high stakes | 80% | 20%

Conceptual visualization of predictability gradients in phone call contexts. Higher predictability correlates with lower anxiety. Based on script theory and control research.

Cultural phone conventions add unpredictability

Cross-cultural pragmatics research reveals variation in phone conventions that compounds L2 anxiety. Germans predominantly use surname-only self-identification; Australians use informal 'Hello.' Iranian L2 German speakers may transfer extended 'how are you' sequences that German speakers misinterpret as topic introduction. You are not just managing language — you are managing unknown cultural scripts.

Perceived control functions as a transdiagnostic vulnerability factor across social phobia, generalized anxiety, panic, and OCD. The inverse of Langer's illusion of control — perceived uncontrollability — drives emotional disorders.

Gallagher et al. (2014), Cognitive Therapy and Research

Avoidance Feels Like Relief But Functions as a Trap

Mowrer's two-factor theory explains the reinforcement cycle precisely. Factor one: classical conditioning pairs the phone with aversive experiences (embarrassment, incomprehension), creating a conditioned fear response. Factor two: avoiding the call produces immediate anxiety relief (negative reinforcement), strengthening avoidance behavior.

Critically, low-cost avoidance behaviors are resistant to fear extinction — even after the fear response objectively decreases, if the avoidance option remains available, people revert to it. Phone call avoidance is precisely this type of low-cost avoidance: texting, emailing, or asking a partner to call are easy substitutes that feel like reasonable alternatives.

Avoidance predicts anxiety (18-month follow-up): b = 1.38. Higher avoidance at baseline predicted higher anxiety 18 months later (Van Uijen et al., 2017; N=221).

Three meta-analyses converge: r = −.34 to −.39. Consistent negative correlation between foreign language anxiety and achievement (Teimouri et al., 2019; Zhang, 2019; Botes et al., 2020).

The dependency cycle undermines the practice that builds proficiency. Swain's Output Hypothesis holds that L2 speakers need to produce 'pushed output' to notice gaps in their knowledge and test hypotheses. When you rely on intermediaries for phone calls, you forfeit exactly the demanding output practice that drives productive skill development. Each avoided call prevents experiences that would recalibrate threat predictions: misunderstanding is survivable. Repair works. You can ask for repetition.

Real-world consequences

Language barriers cause documented harm in healthcare settings. Divi et al. (2007) analyzed 1,083 adverse event reports and found that approximately 50% of adverse events in limited-English-proficiency patients resulted in physical harm versus approximately 30% for English speakers. While phone avoidance specifically has not been isolated in this research, the broader pattern is clear: language barriers impede access to services.

What Works: Evidence-Based Strategies

The research points to several strategies with varying levels of evidence support. Pre-task planning has the strongest evidence base among the strategies applicable to phone anxiety. Task repetition reliably builds fluency. Graduated exposure follows established anxiety-treatment principles, though direct trials in L2 phone contexts are absent.

Strategy | Evidence Quality | Key Finding | Application
Pre-task planning | Strong | r = .807 for fluency (Wu & Ellis, 2023) | One minute of planning significantly improves accuracy; prepare key phrases before calling
Task repetition | Strong | d = 0.67 for complexity (Abdi Tabari et al., 2025); largest gains in first three performances (Lambert et al., 2017) | Repeat the same call type until anxiety drops, then vary one element
Script preparation | Moderate | No direct L2 phone trials; supported by planning research | Prepare opening, purpose statement, comprehension check, and closing routines
Graduated exposure | Moderate | Proven for general FLA; no phone-specific L2 trials | Progression: recorded messages → scripted calls → semi-scripted → spontaneous
Role-play simulation | Moderate | d = 1.29 improvement in some studies; methodological limitations | Practice with supportive interlocutors before real calls

Evidence quality ratings: Strong = meta-analytic support or multiple replications; Moderate = promising but limited or indirect evidence.
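
The graduated-exposure progression in the table can be treated as a simple ladder: stay on one rung until calls at that level feel manageable, then move up one step. The sketch below is one hypothetical way to track that. The 0 to 10 self-rated anxiety scale, the threshold of 4, and the requirement of three manageable calls in a row are illustrative assumptions, not values from the cited research.

```python
from dataclasses import dataclass, field

# Rungs taken from the graduated-exposure row of the strategy table.
LADDER = ["recorded messages", "scripted calls", "semi-scripted calls", "spontaneous calls"]


@dataclass
class ExposureLadder:
    # Assumed values: adjust these to your own experience.
    calm_threshold: int = 4      # self-rated anxiety (0-10) counted as "manageable"
    calm_calls_needed: int = 3   # consecutive manageable calls before moving up
    rung: int = 0
    ratings: list = field(default_factory=list)

    def log_call(self, anxiety: int) -> str:
        """Record one call's self-rated anxiety and return the rung to practise next."""
        if not 0 <= anxiety <= 10:
            raise ValueError("anxiety rating must be between 0 and 10")
        self.ratings.append(anxiety)
        recent = self.ratings[-self.calm_calls_needed:]
        ready = (
            len(recent) == self.calm_calls_needed
            and all(r <= self.calm_threshold for r in recent)
        )
        if ready and self.rung < len(LADDER) - 1:
            self.rung += 1
            self.ratings.clear()  # start counting afresh on the new rung
        return LADDER[self.rung]


ladder = ExposureLadder()
for rating in [7, 4, 3, 2]:
    print(ladder.log_call(rating))
```

In this run the first three calls keep the learner on recorded messages, and the fourth call completes a streak of three manageable ratings, so the next practice target becomes scripted calls.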

Pre-task planning is the best-evidenced intervention applicable to L2 phone anxiety. Even one minute of planning significantly improves accuracy. For phone calls specifically, the practical implication is clear: before calling, pre-formulate key phrases, anticipate vocabulary needs, and reduce the cognitive load of real-time production. Mak (2011) found that 'speaking without preparation' was the most anxiety-provoking factor — planning directly addresses this.

Pre-task planning showed a very large effect on fluency: r = .807, partial eta-squared = .763. Even brief planning time significantly improves L2 oral production.

Wu and Ellis (2023), Language Learning Journal

A Pre-Call Routine That Works

You do not need to eliminate anxiety. You need to function despite it. Here is a practical routine that operationalizes the research on pre-task planning, script preparation, and graduated exposure. Use it before high-stakes calls.

Step | Time | Action | Research Basis
1 | 2 min | Write down the exact purpose of the call in one sentence | Goal clarity reduces cognitive load; Mak (2011): unprepared speaking is top anxiety trigger
2 | 3 min | Prepare micro-scripts: opening, purpose statement, two repair phrases, closing | Script theory: predictability reduces anxiety; Levelt's model: sentence frames ease formulation
3 | 2 min | Pre-activate vocabulary: list 5–10 key terms; say each aloud once | Bárkányi & Brash (2025): vocabulary retrieval is primary online anxiety trigger
4 | 1 min | Prepare environmental control: quiet space, documents ready, note paper | Environmental optimization reduces distraction pressure
5 | 2 min | Rehearse the opening aloud three times; record and listen once | Task repetition: largest gains in first three performances (Lambert et al., 2017)

Ten-minute pre-call routine synthesizing planning research, script theory, and task repetition findings.
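
If it helps to keep the routine on hand, it can be stored as data and printed as a checklist before each call. This is a minimal sketch: the step texts and durations are copied from the routine above, while the names `ROUTINE`, `total_minutes`, and `format_checklist` are hypothetical and chosen only for illustration.

```python
# The five steps and their durations in minutes, copied from the routine table.
ROUTINE = [
    (2, "Write down the exact purpose of the call in one sentence"),
    (3, "Prepare micro-scripts: opening, purpose statement, two repair phrases, closing"),
    (2, "Pre-activate vocabulary: list 5-10 key terms; say each aloud once"),
    (1, "Prepare environmental control: quiet space, documents ready, note paper"),
    (2, "Rehearse the opening aloud three times; record and listen once"),
]


def total_minutes(routine) -> int:
    """Sum the planned minutes across all steps."""
    return sum(minutes for minutes, _ in routine)


def format_checklist(routine) -> str:
    """Render the routine as a numbered checklist with each step's start minute."""
    lines = []
    elapsed = 0
    for number, (minutes, action) in enumerate(routine, start=1):
        lines.append(f"[ ] {number}. (minute {elapsed}, {minutes} min) {action}")
        elapsed += minutes
    return "\n".join(lines)


print(format_checklist(ROUTINE))
print(f"Total: {total_minutes(ROUTINE)} minutes")
```

The step durations add up to ten minutes, matching the caption above. Keeping the steps in a list makes it easy to shorten or reorder them for lower-stakes calls.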

Repair phrases to pre-script

The highest-leverage preparation is having repair language ready. Try these: Entschuldigung, ich habe das nicht verstanden. Könnten Sie das wiederholen? (Sorry, I didn't catch that. Could you repeat?) / Lassen Sie mich kurz zusammenfassen, um sicherzugehen... (Let me summarize briefly to make sure I understand...)

The goal is not a perfect call. The goal is a functional call. Broken German that gets the appointment scheduled moves the story forward. The research on willingness to communicate (MacIntyre et al., 1998) shows that some learners with high proficiency refuse to speak, while others with minimal knowledge communicate whenever possible. Be the second type.

What We Still Do Not Know

The most striking finding of this review is how little direct research exists at the intersection of L2 anxiety and telephone communication specifically. The field needs:

  • A validated telephone-specific L2 anxiety scale. Neither the PRCA-24 nor the FLCAS contains phone-specific items. No instrument captures medium-specific concerns: unpredictability of caller identity, inability to prepare environmental context, compensatory hypervigilance to paralinguistic cues.
  • Experimental studies comparing L2 performance across communication modalities. No peer-reviewed study directly compares phone versus face-to-face L2 comprehension or production with matched tasks. All evidence uses video-versus-audio-only as proxy.
  • Intervention trials targeting L2 phone-call anxiety specifically. Graduated exposure and systematic desensitization are well-established for general anxiety, but no treatment study applies them specifically to L2 phone contexts.
  • German- and French-specific research. Despite the prominence of these languages, direct empirical comparison of phone call anxiety in German or French L2 learners appears absent from the published literature.

On contradictory findings

Chen and Chew (2021) found lower anxiety in audio-only synchronous voice chat compared to face-to-face — a finding that appears to contradict widespread reports of phone anxiety. The resolution: context matters critically. Their study used known interlocutors (classmates), structured educational tasks, familiar technology, and low stakes. These 'classroom safety' features transformed audio-only from an anxiety source into an anxiety-reducing environment. Authentic phone calls with unknown callers, unpredictable purposes, and institutional stakes likely reverse this effect.

You Handled It

The theoretical case for why L2 phone calls provoke disproportionate anxiety is robust. Visual cue removal increases processing demands. Low perceived control amplifies anxiety. Low self-efficacy intensifies the fear. Avoidance reinforces the cycle through negative reinforcement while depriving you of the output practice essential for development.

Yet the mechanisms operate at multiple levels simultaneously — and that means multiple points of intervention. Pre-task planning addresses unpredictability. Script preparation addresses the lack of target-language phone scripts. Task repetition builds procedural knowledge. Graduated exposure reduces conditioned fear through controlled experience.

You will not eliminate the anxiety. But you can make the call anyway. The research is clear: the practice you gain from functioning despite anxiety is what eventually reduces the anxiety itself. Not the other way around.

Speakers need to produce 'pushed output' to notice gaps in their knowledge, test hypotheses, and engage in metalinguistic reflection. Avoiding the phone forfeits exactly the demanding output practice that drives productive skill development.

Swain (1995), Output Hypothesis

References (Selected)

This article synthesizes findings from eye-tracking research (visual cue dependence), meta-analyses (anxiety-control relationships, FLA-self-efficacy), cognitive load theory, and task-based language teaching (planning, repetition). Links go to publisher pages (usually DOI).

  1. Birulés J, Bosch L, Pons F, Lewkowicz DJ (2020) Attention to the mouth across auditory and visual contexts in monolingual and bilingual infants and adults. Language, Cognition and Neuroscience.
    Eye-tracking: L2 speakers attend significantly more to the talker's mouth when processing L2 vs. L1.
  2. Grüter T, Pons F, Parlato-Oliveira E, Hiroshima K, Lee K, Fourlinnie I (2023) Visual attention to the mouth during L2 listening. Studies in Second Language Acquisition.
    Lower-proficiency L2 listeners fixate even more on the mouth — a gradient effect.
  3. Sueyoshi A, Hardison DM (2005) The role of gestures and facial cues in second language listening comprehension. Language Learning.
    N=42 ESL learners: significantly better comprehension with visual cues at both proficiency levels.
  4. Kwon SK, Yu G (2024) The effect of viewing visual cues in a listening comprehension test on L2 learners' test-taking process and performance: An eye-tracking study. Language Testing.
    N=57 Korean EFL learners with eye-tracking: examines how L2 listeners use visual cues during video-based listening tests.
  5. Batty AO (2015) A comparison of video- and audio-mediated listening tests with many-facet Rasch modeling. Language Testing.
    N=200+: Small, non-significant differences between video and audio-only using many-facet Rasch modeling.
  6. Kamiya N (2025) The limited effects of visual and audio modalities on second language listening comprehension. Language Teaching Research.
    N=52: Limited effects of watching gestures and lip movement on L2 listening comprehension.
  7. Gallagher MW, Bentley KH, Barlow DH (2014) Perceived control and vulnerability to anxiety disorders. Cognitive Therapy and Research.
    Meta-analysis of 51 studies (N=11,218): large negative association between perceived control and anxiety.
  8. Zhou J, Chiu MM, Dong Z, Zhou B (2023) The relationship between foreign language anxiety and self-efficacy: A meta-analysis. Current Psychology.
    Meta-analysis of 37 studies (N=26,589): r = −.70 between FLA and self-efficacy.
  9. Kim JS, Oh HJ (2023) Telephone anxiety and digital communication preferences. Communication Research Reports.
    N=520: L2 status amplifies the relationship between digital technology use and telephone anxiety.
  10. Vervliet B, Indekeu E (2015) Low-cost avoidance behaviors are resistant to fear extinction. Frontiers in Behavioral Neuroscience.
    Phone call avoidance is low-cost avoidance — resistant to extinction even after fear decreases.
  11. Van Uijen SL, van der Linden D, Schmeets PMJ, Cremers HR, Emmerik REA (2017) Avoidance behavior predicts general anxiety 18 months later. PLOS ONE.
    N=221: Avoidance at baseline predicted higher anxiety at 18-month follow-up (b = 1.377, p < .001).
  12. Wu X, Ellis R (2023) The effects of pre-task planning on L2 oral production. Language Learning Journal.
    N=43: Very large effect on fluency — r = .807, ηp² = .763.
  13. Lambert C, Kormos J, Minn D (2017) Task repetition and L2 speech production. Studies in Second Language Acquisition.
    N=32: Largest fluency gains across first three performances, continued through fifth.
  14. Abdi Tabari M, Zhuang J, Farahanynia M (2025) Task repetition effects on L2 performance: A meta-analysis. System.
    Meta-analysis: medium effect on syntactic complexity (d = 0.67), positive effects on accuracy and fluency.
  15. Chen Y, Chew SY (2021) Speaking performance and anxiety levels in face-to-face and synchronous voice chat. Computer Assisted Language Learning.
    N=40 Chinese EFL learners: lower anxiety in audio-only — but context (classroom safety) matters critically.
  16. Lindberg E, McDonough K, Trofimovich P (2022) Physiological anxiety in L2 conversation. Studies in Second Language Acquisition.
    N=60 with GSR monitoring: physiological arousal correlates with negative self-perceptions of fluency.
  17. Bárkányi Z, Brash B (2025) Foreign language speaking anxiety, mental health, and online learning. Language Teaching (Cambridge Core).
    Systematic review: vocabulary retrieval is the primary online anxiety trigger; ~18% of learners cite it.
  18. Divi C, Koss RG, Schmaltz SP, Loeb JM (2007) Language proficiency and adverse events in US hospitals. International Journal for Quality in Health Care.
    N=1,083 adverse event reports: ~50% of LEP adverse events resulted in physical harm vs. ~30% for English speakers.
  19. MacIntyre PD, Dörnyei Z, Clément R, Noels KA (1998) Conceptualizing willingness to communicate in a L2. The Modern Language Journal.
    Willingness to communicate: some learners with high proficiency refuse to speak; others with minimal knowledge communicate whenever possible.
  20. Swain M (1995) Three functions of output in second language learning. In: Cook G, Seidlhofer B (eds) Principle and Practice in Applied Linguistics.
    Output hypothesis: speakers need 'pushed output' to notice gaps, test hypotheses, and engage in metalinguistic reflection.
  21. Teimouri Y, Goetze J, Plonsky L (2019) Second language anxiety and achievement: A meta-analysis. Studies in Second Language Acquisition.
    Meta-analysis: k=97, N=19,933, r = −.36 between anxiety and achievement.
  22. Sánchez L, Choi Y, Oh S, et al. (2023) Does modality matter? A meta-analysis of video-based L2 listening. System.
    Effects of video on L2 listening are contingent — moderated by task type, learner level, visual information type.
  23. Drijvers L, Van Der Plas M, Özyürek A, Jensen O (2019) Native and non-native listeners show differential neural responses to multimodal speech. NeuroImage.
    Non-native listeners show greater reliance on gestural cues with distinct oscillatory dynamics.
  24. Mak B (2011) An exploration of speaking-in-class anxiety with Chinese ESL learners. System.
    N=313: 'Speaking without preparation' was the most anxiety-provoking factor.
  25. Varonis EM, Gass SM (1985) Non-native/non-native conversations: A model for negotiation of meaning. Applied Linguistics.
    NNS-NNS pairs engage in significantly more repair than NS-NS pairs — repair is more necessary but harder by phone.