

The Palgrave Handbook of Applied Linguistics Research Methodology

Aek Phakiti • Peter De Costa • Luke Plonsky • Sue Starfield
Editors


Editors

Aek Phakiti
Sydney School of Education and Social Work
University of Sydney
Sydney, NSW, Australia

Peter De Costa
Department of Linguistics, Germanic, Slavic, Asian and African Languages
Michigan State University
East Lansing, MI, USA

Luke Plonsky
Applied Linguistics
Northern Arizona University
Flagstaff, AZ, USA

Sue Starfield
School of Education
UNSW Sydney
Sydney, NSW, Australia

ISBN 978-1-137-59899-8
ISBN 978-1-137-59900-1 (eBook)

Library of Congress Control Number: 2018955931

© The Editor(s) (if applicable) and The Author(s) 2018

The author(s) has/have asserted their right(s) to be identified as the author(s) of this work in accordance with the Copyright, Designs and Patents Act 1988.

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover illustration: © Aek Phakiti

This Palgrave Macmillan imprint is published by the registered company Springer Nature Limited. The registered company address is: The Campus, 4 Crinan Street, London, N1 9XW, United Kingdom


Preface

Applied linguistics is a broad, evolving interdisciplinary field of study, which examines language use with relevance to real-world problems across a range of social contexts using a diverse set of methodologies. This Handbook aims to provide a comprehensive yet accessible treatment of basic and more advanced research methodologies in applied linguistics as well as to offer a state-of-the-art view of various substantive domains within the field. The Handbook covers a range of research approaches, presents current perspectives, and addresses important considerations in different research methods, such as designing and implementing research instruments and techniques and analyzing different types of applied linguistics data. Innovations, challenges, and trends in applied linguistics research are addressed throughout the Handbook. This Handbook has brought together a range of authors in various areas of research into one volume. The authors work with a variety of languages in a host of research contexts, ensuring both breadth and depth. As the Handbook editors, we have curated themes and ideas that are aligned with the current research climate as well as areas that help applied linguists better understand social and educational phenomena and the nature of language, language learning, and language use.

Readership

As we anticipate that many readers of this Handbook may be junior scholars seeking guidance on research methods, and taking into account the many options and pathways on offer, we have striven to ensure that the Handbook provides an up-to-date entry point into both approaches that have stood the test of time and approaches that may be less well known but offer interesting possibilities and perspectives. This Handbook is suitable for use by advanced undergraduate and postgraduate students as well as beginning and well-established applied linguists who would like both a broad and in-depth understanding of contemporary applied linguistics research methods and topics. Specifically, this Handbook can be used in applied linguistics, second language studies, and TESOL graduate programs around the world. Libraries, university departments, and organizations dealing with applied linguistics issues will also find this Handbook to be an invaluable resource.

Comments or Suggestions

The editors would be grateful to hear comments and suggestions regarding this Handbook. Please contact Aek Phakiti at [emailprotected], Peter De Costa at [emailprotected], Luke Plonsky at lukeplonsky@, or Sue Starfield at [emailprotected].

Aek Phakiti, Sydney, NSW, Australia
Peter De Costa, East Lansing, MI, USA
Luke Plonsky, Flagstaff, AZ, USA
Sue Starfield, Sydney, NSW, Australia


We wish to express our heartfelt thanks to the contributors of this Handbook who worked hard to produce great chapters and promptly responded to our requests and comments on earlier drafts. You are all truly amazing. We would also like to thank the many researchers, authors, and methodologists who published research articles, book chapters, and books not only in applied linguistics but across various disciplines. Their contributions have helped us deepen our understanding of numerous issues and methods relevant to applied linguistics. Next, we are very grateful to Palgrave for their kind support throughout the completion of this Handbook. We would like to thank our colleagues and friends at Georgetown University, Michigan State University, Northern Arizona University, the University of New South Wales, and the University of Sydney who discussed with us essential ideas and issues to be included in this Handbook and read and commented on several chapter drafts, in particular: Janette Bobis, Jesse Egbert, Mia Jun, Amy Kim, Wendy Li, Alison Mackey, Guy Middleton, Lourdes Ortega, Brian Paltridge, Jack C. Richards, Fran Waugh, and Yiran Xu. Finally, we would like to thank our partners who supported us throughout the process of putting together this Handbook, including weekend Skype time when our schedules, sometimes across four distinct time zones, would align.



Contents

Part I Research Approaches and Methodology


1 Applied Linguistics Research: Current Issues, Methods, and Trends
   Aek Phakiti, Peter De Costa, Luke Plonsky, and Sue Starfield
2 Habits of Mind: How Do We Know What We Know?
   Richard F. Young
3 Quantitative Methodology
   Luke K. Fryer, Jenifer Larson-Hall, and Jeffrey Stewart
4 Qualitative Methodology
   Shim Lew, Anna Her Yang, and Linda Harklau
5 Mixed Methodology
   Alison Mackey and Lara Bryfonski
6 Traditional Literature Review and Research Synthesis
   Shaofeng Li and Hong Wang
7 Research Replication
   Rebekha Abbuhl
8 Ethical Applied Linguistics Research
   Scott Sterling and Peter De Costa
9 Writing a Research Proposal
   Sue Starfield
10 Writing a Research Article
   Betty Samraj

Part II Research Instruments, Techniques, and Data Sources


11 Interviews and Focus Groups
   Matthew T. Prior
12 Observation and Fieldnotes
   Fiona Copland
13 Online Questionnaires
   Jean-Marc Dewaele
14 Psycholinguistic Methods
   Sarah Grey and Kaitlyn M. Tagarelli
15 SLA Elicitation Tasks
   Susan Gass
16 Introspective Verbal Reports: Think-Alouds and Stimulated Recall
   Melissa A. Bowles
17 Corpus Research Methods for Language Teaching and Learning
   Magali Paquot
18 Digital Discourses Research and Methods
   Christoph A. Hafner


Part III Data Analysis



19 Correlation and Simple Linear Regression in Applied Linguistics
   Reza Norouzian and Luke Plonsky
20 Exploratory Factor Analysis
   Aek Phakiti
21 Confirmatory Factor Analysis and Structural Equation Modeling
   Aek Phakiti
22 Analyzing Group Differences
   Luke Wander Amoroso
23 Statistics for Categorical, Nonparametric, and Distribution-Free Data
   Jesse Egbert and Geoffrey T. LaFlair
24 Reliability Analysis of Instruments and Data Coding
   Kirby C. Grabowski and Saerhim Oh
25 Analyzing Spoken and Written Discourse: A Role for Natural Language Processing Tools
   Scott A. Crossley and Kristopher Kyle
26 Narrative Analysis
   Phil Benson
27 Interaction Analysis
   Elizabeth R. Miller
28 Multimodal Analysis
   Jesse Pirini, Tui Matelau-Doherty, and Sigrid Norris

Part IV Selected Research Topics and Areas in Applied Linguistics


29 Instructed Second Language Acquisition
   Shawn Loewen
30 Bilingualism and Multilingualism
   Tej K. Bhatia
31 Forensic Linguistics
   Samuel Larner
32 World Englishes
   Peter De Costa, Jeffrey Maloney, and Dustin Crowther
33 Heritage, Community, and Indigenous Languages
   Shereen Bhalla and Terrence G. Wiley
34 Translation and Interpreting
   Claudia V. Angelelli
35 Identity
   Ron Darvin
36 Gesture Research
   Gale Stam and Kimberly Buescher
37 Language Policy and Planning
   David Cassels Johnson and Crissa Stephens
38 Second Language Pragmatics
   Soo Jung Youn
39 Language Testing and Assessment
   April Ginther and Kyle McIntosh
40 Linguistic Landscape
   David Malinowski
41 Researching Academic Literacies
   David Bloome, Gilcinei T. Carvalho, and Sanghee Ryu

Index

Notes on Contributors

Rebekha Abbuhl is Associate Professor of Linguistics at California State University Long Beach, where she teaches courses in language acquisition, research methods, and pedagogy. Her research interests include second language writing and the role of feedback in the development of foreign language proficiency.

Luke Wander Amoroso is Assistant Professor of Linguistics at Truman State University. His research interests are in the validity and reliability of L2 tests, speaking proficiency, and second language acquisition (SLA) research methodology. He tries to keep his features in order and works as a language testing consultant with the United States Department of Justice. He works with ESL and EFL teachers in the United States and China to incorporate insights from SLA interaction research into ESL/EFL teaching methods.

Claudia V. Angelelli is Chair of Multilingualism and Communication at Heriot-Watt University, UK, and Professor Emerita of Spanish Linguistics at San Diego State University, US. Her research lies at the intersection of sociolinguistics, applied linguistics, and translation and interpreting studies. She has authored Medical Interpreting and Cross-cultural Communication (2004) and Revisiting the Role of the Interpreter (2004) and co-edited Researching Translation and Interpreting (2015) and Testing and Assessment in Translation and Interpreting Studies (2009). Her work has appeared in The Annual Review of Applied Linguistics, The Critical Link, Cuadernos de ALDEEU, International Journal of the Sociology of Language (IJSL), Interpreting, Meta, MonTI, The Translator, Translation and Interpreting Studies (TIS), and numerous edited volumes.

Phil Benson is Professor of Applied Linguistics at Macquarie University, Sydney, Australia. His research interests are in autonomy and out-of-class language learning, study abroad, and multilingualism. He has a strong preference for qualitative research and has published on both qualitative research methods and narrative inquiry. He is especially interested in oral history as an approach to research on the long-term language learning experiences of multilingual individuals. He is co-author of Second Language Identity in Narratives of Study Abroad (Palgrave Macmillan, 2012) and Narrative Inquiry in Language Teaching and Learning Research (2013).

Shereen Bhalla is a research associate and the facilitator of the Language Policy Research Network and serves as Manager of Online Education at the Center for Applied Linguistics (CAL). At CAL, Bhalla conducts research, co-authors papers, and regularly presents at national and international conferences on issues regarding language policy, heritage language learning, and English as an international language. She has experience teaching and working with pre-service and in-service teachers in the areas of culturally responsive teaching, second language acquisition, writing development and oral communication. She received her PhD in Culture, Literacy and Language from the University of Texas at San Antonio.

Tej K. Bhatia is Professor of Linguistics and Director of South Asian languages at Syracuse University, Syracuse, New York. He has been Director of the Linguistic Studies Program and Acting Director of Cognitive Sciences. He is also a Faculty Fellow at the Forensic & National Security Sciences Institute. He is Editor-in-Chief of Brill Research Perspectives on Multilingualism and Second Language Acquisition. His publications include five handbooks with William C. Ritchie: Handbook of Bilingualism and Multilingualism (2013), A New Handbook of Second Language Acquisition (2009), The Handbook of Bilingualism (2006), Handbook of Child Language Acquisition (1999), and Handbook of Second Language Acquisition (1996).

David Bloome is College of Education and Human Ecology (EHE) Distinguished Professor of Teaching and Learning at The Ohio State University. David’s research focuses on how people use spoken and written language for learning, teaching, social relationships, constructing knowledge, and shared histories. He is former president of the National Council of Teachers of English and of the National Conference on Research in Language and Literacy; former co-editor of Reading Research Quarterly; and founding editor of Linguistics and Education. David was inducted into the Reading Hall of Fame in 2008 and in 2015 he received the John J. Gumperz Lifetime Achievement Award.

Melissa A. Bowles is an associate professor in the Department of Spanish and Portuguese and Director of the Second Language Acquisition and Teacher Education PhD concentration at the University of Illinois at Urbana-Champaign. Her main research interests are classroom second and heritage language acquisition and the ways in which instruction differentially affects the two learner groups. She routinely uses verbal reports in her research and has written about them extensively, most notably in The Think-Aloud Controversy in Second Language Research (2010).

Lara Bryfonski is a doctoral candidate in applied linguistics at Georgetown University. Her research focuses primarily on interaction and corrective feedback in second language acquisition, as well as task-based language teaching and learning. Lara is also a licensed English as a Second Language (ESL) teacher and has taught ESL in a variety of contexts in the U.S. and abroad.

Kimberly Buescher is Assistant Professor of Applied Linguistics at the University of Massachusetts, Boston. Her research interests include L2 learning and teaching, L2 literacy, students’ and teachers’ use of gesture, French prepositions, and teacher education preparation. Her dissertation “Developing Second Language Narrative Literacy Using Concept-Based Instruction and a Division-of-Labor Pedagogy” examined the extent to which concept-based instruction and a division-of-labor pedagogy promoted the development of intermediate learners’ narrative literacy abilities in French. She has published book chapters on the learning and teaching of French prepositions and the internalization of talk, gesture, and concepts in the L2 classroom.

Gilcinei T. Carvalho is an associate professor at Universidade Federal de Minas Gerais, Brazil. He is a member of the Knowledge and Social Inclusion Graduate Program and a researcher at the Center for Literacy Studies, in the School of Education. He explores sociolinguistic approaches in the study of acquisition and the development of written language, including academic literacies. He is co-editor of Jornal Letra A.

Fiona Copland is Professor of TESOL at the University of Stirling, Scotland. She has taught English and trained teachers in a number of different countries. Her research interests include post-observation feedback in pre-service teacher education, teaching English to young learners and ethics in qualitative research. Fiona has written a book on Linguistic Ethnography: Collecting, Analysing and Presenting Data (2015, SAGE) with Angela Creese, as well as edited a collection entitled Linguistic Ethnography: Interdisciplinary Explorations (2015, Palgrave) with Julia Snell and Sara Shaw.

Scott A. Crossley is Associate Professor of Applied Linguistics at Georgia State University. Scott’s primary research focus is on natural language processing and the application of computational tools and machine learning algorithms in language learning, writing, and text comprehensibility. His main interest area is the development and use of natural language processing tools in assessing writing quality and text difficulty. He is also interested in the development of second language learner lexicons and the potential to examine lexical growth and lexical proficiency using computational algorithms.

Dustin Crowther is a visiting Assistant Professor of English at Oklahoma State University, and holds a PhD in Second Language Studies from Michigan State University. He previously completed his MA in Applied Linguistics at Concordia University in Montréal, Canada. His research interests include second language pronunciation, the promotion of mutual intelligibility in multilinguistic and multicultural contact, World Englishes, and research methodologies. His research has been published in a wide range of journals, including Studies in Second Language Acquisition, The Modern Language Journal, and TESOL Quarterly.

Ron Darvin is a Vanier Scholar at the Department of Language and Literacy Education of the University of British Columbia. Together with Bonny Norton, he received the 2016 TESOL Award for Distinguished Research for their article “Identity and a model of investment in applied linguistics” that appeared in the Annual Review of Applied Linguistics. Ron has also published in TESOL Quarterly, Journal of Language, Identity, and Education, and The Routledge Handbook of Language and Identity.

Peter De Costa is an associate professor in the Department of Linguistics and Languages at Michigan State University. His primary areas of research are identity and ideology in second language acquisition. He is the author of The Power of Identity and Ideology in Language Learning (Springer, 2016). He also edited Ethics in Applied Linguistics Research (2016). His work has appeared in AILA Review, Applied Linguistics Review, International Journal of Applied Linguistics, Language Learning, Language Policy, Language Teaching, Linguistics and Education, Research in the Teaching of English, System, TESOL Quarterly, and The Modern Language Journal. He recently guest edited special journal issues on scalar approaches to language learning and teaching (Linguistics and Education, 2016, with Suresh Canagarajah), teacher identity (The Modern Language Journal, 2017, with Bonny Norton), study abroad research methodologies (System, 2017, with Hima Rawal and Irina Zaykovskaya), and World Englishes and second language acquisition (World Englishes, 2018, with Kingsley Bolton). He is the co-editor of TESOL Quarterly.

Jean-Marc Dewaele is Professor of Applied Linguistics and Multilingualism at Birkbeck, University of London. He is interested in individual differences in foreign language acquisition and use. He won the Equality and Diversity Research Award from the British Association for Counselling and Psychotherapy (2013) and the Robert Gardner Award for Excellence in Second Language and Bilingualism Research (2016) from the International Association of Language and Social Psychology. He authored Emotions in Multiple Languages (second edition published in 2013 by Palgrave).

Jesse Egbert is an assistant professor in the Applied Linguistics program at Northern Arizona University. He specializes in corpus-based research on register variation, particularly academic writing and online language, and methodological issues in quantitative linguistic research. His research has been published in journals such as Journal of English Linguistics, International Journal of Corpus Linguistics, Corpus Linguistics and Linguistic Theory, and Applied Linguistics (2018, Routledge). His books include an edited volume titled Triangulating Methodological Approaches in Corpus Linguistic Research (2018, Routledge) and a book titled Register Variation Online (2018, Cambridge).

Luke K. Fryer is an associate professor and head of faculty and research postgraduate student teaching and learning programs at the University of Hong Kong. His main area of research is the role of non-cognitive factors like interest within teaching and learning. His work on interest, related motivations, and learning strategies has been published widely in journals such as British Journal of Educational Psychology, Internet and Higher Education, and Computers and Education. His statistical analyses focus on longitudinal structural equation modeling and person-centered analyses.

Susan Gass is University Distinguished Professor at Michigan State University. She has published widely in the field of second language acquisition. She serves as co-editor of Studies in Second Language Acquisition. She has lectured in many parts of the world, including South America, North America, Asia, Africa, and Australia. From 2002 to 2008, she was the President of the International Association of Applied Linguistics and prior to that she was the President of the American Association for Applied Linguistics. She is the recipient of numerous awards and serves as the Director of the Second Language Studies Program and the English Language Center, both at Michigan State University.

April Ginther is an associate professor in the Department of English at Purdue University, where she directs two language support programs. She has been an invited speaker and workshop provider at institutions and conferences around the world, presenting on her primary scholarly pursuits: the development and validation of second language proficiency assessments, the measurement of second language fluency, and the use and interpretation of language proficiency test scores by diverse groups of stakeholders. She recently stepped down as co-editor of Language Testing.

Kirby C. Grabowski is adjunct assistant professor in the Applied Linguistics and TESOL Program at Teachers College, Columbia University, where she teaches courses on second language assessment, performance assessment, generalizability theory, pragmatics assessment, research methods, linguistics, and L2 pedagogy. She is on the Editorial Advisory Board of Language Assessment Quarterly and formerly served on the Executive Board for ILTA as Member-at-Large. She was a Spaan Fellow for the ELI at the University of Michigan, and she received the 2011 Jacqueline Ross TOEFL Dissertation Award for outstanding doctoral dissertation in second/foreign language testing from Educational Testing Service.

Sarah Grey is Assistant Professor of Linguistics and Spanish at Fordham University in New York City, United States of America. She uses psycholinguistic approaches and ERPs to study adult second language acquisition and bilingualism, and her work has appeared in The Modern Language Journal, Studies in Second Language Acquisition, and the Journal of Neurolinguistics. She received her PhD in Applied Spanish Linguistics from Georgetown University and prior to joining Fordham University she worked as a postdoctoral research fellow in Psychology and the Center for Language Science at Pennsylvania State University.

Christoph A. Hafner is an associate professor in the Department of English, City University of Hong Kong. He has published widely in the areas of English for specific purposes, digital literacies, and language learning and technology. He is co-author (with Rodney H. Jones) of Understanding Digital Literacies: A Practical Introduction (Routledge, 2012).

Linda Harklau is a professor in the TESOL and World Language Education and Linguistics Program at the University of Georgia. Her research examines language learning and academic achievement of immigrant youth in high school and college, schooling structure and educational policy, and teacher education. A recipient of the TESOL Distinguished Research Award, she also teaches and publishes on the subject of qualitative methods, particularly longitudinal case study and ethnography.

David Cassels Johnson is Associate Professor of Education at the University of Iowa. He holds a PhD (with distinction) in Educational Linguistics from the University of Pennsylvania. His research, teaching, and service focus on how language policies impact educational opportunities for linguistically diverse students, in both bilingual education and English language education programs. He is the author of Language Policy (2013, Palgrave Macmillan) and co-editor of Research Methods in Language Policy and Planning: A Practical Guide (2015, Wiley-Blackwell, with Francis M. Hult).

Kristopher Kyle is an assistant professor in the Department of Second Language Studies at the University of Hawai’i. His research interests include second language writing and speaking, language assessment, and second language acquisition. He is especially interested in applying natural language processing (NLP) and corpora to the exploration of these areas.

Geoffrey T. LaFlair is an assistant professor in the Department of Second Language Studies at the University of Hawai’i at Mānoa. He conducts research on large- and small-scale language assessments and quantitative research methods in the field of second language studies. His research has been published in Language Testing, Applied Linguistics, and The Modern Language Journal.

Samuel Larner is a lecturer in Linguistics at Manchester Metropolitan University, UK. His PhD thesis, completed in 2012, explored the socio- and psycholinguistic theory of formulaic sequences and their use by authors when writing short personal narratives, with the goal of identifying individual authorial consistency and distinctiveness for authorship purposes. He has published several journal articles, book chapters, and a monograph, focussing mainly on methods of forensic authorship attribution. In addition to teaching and researching forensic linguistics, Samuel undertakes consultancy in authorship analysis.

Jenifer Larson-Hall is an associate professor in the English Department at the University of Kitakyushu in Japan. Her research interests lie mainly in second language acquisition but she believes statistics substantially affects conclusions that are drawn in the field and has published a variety of articles and books geared toward applied researchers in second language acquisition. Her most recent book is A Guide to Doing Statistics in Second Language Research using SPSS and R (2016, Routledge). Her 2017 article in The Modern Language Journal, “Moving Beyond the Bar Plot and Line Graph to Create Informative and Attractive Graphics”, argues for the importance of data-accountable graphics.

  Notes on Contributors 


ShimLew  is a doctoral candidate in the Department of Language and Literacy at the University of Georgia. Her area of research is in teacher education for English learners, particularly developing content-area teachers’ hybrid professional development as content and language teachers and integrating disciplinary literacy instruction into K-12 STEM classrooms. Shaofeng Li  is an associate professor in Foreign/Second Language Education at Florida State University where he teaches courses in second language acquisition and language pedagogy and supervises masters and PhD students. His main research interests include task-based language teaching and learning, form-­focused instruction, individual learner differences (especially language aptitude and working memory), and research methods. Shawn Loewen is an associate professor in Second Language Studies in the Department of Linguistics & Germanic, Slavic, Asian and African Languages at Michigan State University. His research interests include instructed second language acquisition, particularly as it pertains to learner interaction. He is also interested in research methodology and the development of statistical knowledge. He teaches a quantitative analysis class, as well as classes on second language acquisition. In addition to journal articles, he has authored Introduction to Instructed Second Language Acquisition (2015) and co-authored, with Luke Plonsky, An A–Z of Applied Linguistics Research Methods (2016, Palgrave). His co-edited volume (with Masatoshi Sato) The Routledge Handbook of Instructed Second Language Acquisition appeared in 2017. AlisonMackey  is Professor of Linguistics at Georgetown University. She is interested in interaction-­driven second language (L2) learning, L2 research methodology and the applications of interaction through task-based language teaching, as well as second language dialects and identities. 
She is the editor of the Annual Review of Applied Linguistics, published by Cambridge University Press, an official journal of the American Association for Applied Linguistics. DavidMalinowski  is a language technology and research specialist with the Center for Language Study at Yale University. With a background in language and literacy education, multimodal communication, and technology-enhanced learning, he conducts research and supports pedagogical innovation on such technology-­related topics as internet-mediated intercultural language learning (telecollaboration) and course-sharing with videoconferencing. At the same time, he maintains a significant interest in linguistic landscape, seeking to find productive intersections between urban sociolinguistics and place-based language learning. David holds a masters in TESOL from San Francisco State University and a PhD in Education from UC Berkeley. JeffreyMaloney  is Assistant Professor of English at Northeastern State University. He holds a PhD in Second Language Studies from Michigan State University and an MA in Applied Linguistics from Ohio University. His research interests include lan-


Notes on Contributors

guage teacher training with technology, computer-assisted language learning, and language teacher and learner identity. Tui Matelau-Doherty  is a PhD candidate at Auckland University of Technology in New Zealand. Her research uses Multimodal (Inter)action Analysis to explore the relationship between creative practice and ethnic identity. Her masters research examined the ethnic identity co-constructed within tertiary education environments by Māori female students. The findings of this research were published in Interactions, Texts and Images: A Reader in Multimodality (2014, De Gruyter). In addition, poems she wrote as part of her data collection were published in the journal Multimodal Communication. KyleMcIntosh  is an assistant professor in the Department of English and Writing at The University of Tampa, where he works primarily in the academic writing and TESOL certificate programs. His research focuses on English for Academic Purposes, intercultural rhetoric, and writing assessment. With Carolina Pelaez-Morales and Tony Silva, he co-edited the volume Graduate Studies in Second Language Writing (2015). Elizabeth R. Miller  is Associate Professor of Applied Linguistics in the English Department at the University of North Carolina at Charlotte. Her research involves adult immigrant learners of English in the U.S. and focuses on issues related to language ideologies and learners’ agency and identity. Her work has appeared in a number of journals, and two of her recent publications include The Language of Adult Immigrants: Agency in the Making (2014) and the co-edited volume Theorizing and Analyzing Agency in Second Language Learning: Interdisciplinary Approaches (2015). RezaNorouzian  is a PhD candidate in the English as a Second Language program at Texas A&M University. In addition to his doctoral studies, Reza has also obtained a Graduate Certificate in Advanced Research Methods from Texas A&M University. 
Reza’s research interests include instructed second language acquisition and advanced research methods. Reza has published in a number of journals including Second Language Research and Issues in Applied Linguistics. Reza is a contributor to StackExchange (data science, statistics, and programming forum).

Sigrid Norris is Professor of Multimodal (Inter)action and Director of the AUT Multimodal Research Centre at Auckland University of Technology in New Zealand. Born in Feudingen, Germany, she received her BA in Russian Language and Literature from George Washington University, and later received an MS and was conferred her PhD in Linguistics by Georgetown University in the United States. She is the founder of the theoretical/methodological framework Multimodal (Inter)action Analysis, has edited and authored numerous academic books, journal articles and book chapters, written two poetry books, and is the editor of the international journal Multimodal Communication.

Saerhim Oh is Senior Test Development Manager at Assessment Technology and Engineering at Pearson. Her research interests include linguistic tools in second language writing assessment, feedback in second language writing, speech recognition in second language speaking assessment, and English Language Learner assessment. She received her doctorate degree in Applied Linguistics from Teachers College, Columbia University. She was the 2017 Robert Lado Memorial Award recipient in recognition of the best graduate student paper presentation at the annual meeting of Language Testing Research Colloquium (LTRC).

Magali Paquot is a permanent Fonds de la Recherche Scientifique (F.R.S.-FNRS) research associate at the Centre for English Corpus Linguistics, Université catholique de Louvain. She is co-editor-in-chief of the International Journal of Learner Corpus Research and a founding member of the Learner Corpus Research Association. Her research interests include corpus linguistics, learner corpus research, vocabulary, phraseology, second language acquisition, linguistic complexity, crosslinguistic influence, English for Academic Purposes, pedagogical lexicography and electronic lexicography.

Aek Phakiti is an associate professor in TESOL at the University of Sydney. His research focuses on language testing and assessment, second language acquisition, and research methods in language learning. He is the author of Strategic Competence and EFL Reading Test Performance (2007), Experimental Research Methods in Language Learning (2014), Language Testing and Assessment: From Theory to Practice (Bloomsbury, forthcoming), and, with Carsten Roever, of Quantitative Methods for Second Language Research: A Problem-Solving Approach (2018). With Brian Paltridge, he edited the Continuum Companion to Research Methods in Applied Linguistics (2010) and Research Methods in Applied Linguistics: A Practical Resource (2015). He is Associate Editor of Language Assessment Quarterly. He was Vice President of ALTAANZ (Association for Language Testing and Assessment of Australia and New Zealand, 2015–2017).
Jesse Pirini is a lecturer in the School of Management at the Victoria Business School, Victoria University of Wellington. Jesse received his PhD at the Auckland University of Technology, studying knowledge communication, agency and intersubjectivity in high school tutoring. Jesse develops multimodal theory and methodology. He works with a wide range of data sources, including family interaction, high school tutoring, augmented reality and video conferencing. Along with academic journal articles and chapters, Jesse is also the author of a practical workbook for training tutors and he supports community-based peer tutoring programmes.

Luke Plonsky is Associate Professor of Applied Linguistics at Northern Arizona University, where he teaches courses in research methods and second language acquisition. Recent and forthcoming publications in these and other areas can be found in journals such as Annual Review of Applied Linguistics, Applied Linguistics, Language Learning, The Modern Language Journal, and Studies in Second Language Acquisition, as well as in edited volumes published by Cambridge University Press, Wiley Blackwell, De Gruyter, and others. He is also Associate Editor of Studies in Second Language Acquisition, Managing Editor of Foreign Language Annals, and Co-Director of IRIS (



Matthew T. Prior is Associate Professor of Applied Linguistics/Linguistics/TESOL in the Department of English at Arizona State University, where he teaches courses in qualitative methods, discourse analysis, sociolinguistics, TESOL, and second language acquisition. His interests include narrative, discursive-constructionist approaches to identity, and social-psychological dimensions of multilingualism. He is author of Emotion and Discourse in L2 Narrative Research (2016) and co-editor of the volume Emotion in Multilingual Interaction (2016).

Sanghee Ryu is a research professor in the research center of Korean Language and Literature Education at Korea University, South Korea. Ryu’s research focuses on the use of discourse analysis and formative-design experiments to explore and improve the teaching and learning of argumentative writing with an emphasis on underlying definitions of rationality. Ryu has taught pre-service teacher education courses on teaching reading and teaching writing at The Ohio State University. She teaches graduate courses on research methodology at Korea University.

Betty Samraj is Professor of Linguistics at San Diego State University. Her main research interests are in academic writing in different disciplines (including interdisciplinary fields) and genre analysis. She has conducted analyses of several different genres such as research article introductions, abstracts, masters theses, graduate student research papers, manuscript reviews, personal statements and, most recently, suicide notes. She teaches teacher preparation courses such as English for Specific Purposes and Teaching ESL Reading and Writing in a masters program in applied linguistics.

Gale Stam is Professor of Psychology at National Louis University in Chicago, Illinois. Her research interests include language, culture, and cognition; gesture; and L1 and L2 acquisition. She has published articles on changes in thinking for speaking, the importance of looking at gesture in L2 acquisition, gesture and lexical retrieval in an L2 setting, and language teachers’ gestures. She serves on the editorial board of the journals Gesture and Language and Sociocultural Theory and has co-edited two volumes: Gesture: Second Language Acquisition and Classroom Research (2008) and Integrating Gestures: The Interdisciplinary Nature of Gesture (2011).

Sue Starfield is a professor in the School of Education at UNSW Sydney. With Brian Paltridge, she is co-author of Thesis and Dissertation Writing in a Second Language: A Handbook for Supervisors (2007) and of Getting published in academic journals: Negotiating the publication process (2016) and co-editor of the Handbook of English for Specific Purposes (2013). She co-authored Ethnographic Perspectives on Academic Writing with Brian Paltridge and Christine Tardy (2016). With Brian Paltridge, she is co-editor of two new book series: Routledge Introductions to English for Specific Purposes and Routledge Research in English for Specific Purposes. Her research interests include tertiary academic literacies, advanced academic writing, postgraduate pedagogy, ethnographic methodologies, identity in academic writing, and access and equity in higher education.



Crissa Stephens is a doctoral candidate at the University of Iowa. Her work uses a critical sociocultural lens to examine how language policies interact with social identity development and opportunity in education. Her teaching and activism in the US and abroad help to inspire her approach, and her recent publications utilize ethnographic and discourse-analytic methods to explore language policy and educational equity in local contexts.

Scott Sterling is Assistant Professor of TESOL and Linguistics in the Department of Languages, Literatures, and Linguistics at Indiana State University. His recent work investigates the level of training, current beliefs and practices that the field of applied linguistics has towards research ethics. His main area of focus is meta-research, particularly research ethics, and he has published work related to these topics in various journals and edited volumes in linguistics. He completed his PhD at Michigan State University in 2015 with a dissertation that focused on the complexity and comprehensibility of consent forms used in ESL research.

Jeffrey Stewart is Director of Educational Measurement and a lecturer at Kyushu Sangyo University in Japan. He has published articles in numerous journals such as TESOL Quarterly and Language Assessment Quarterly regarding vocabulary acquisition and testing using a number of advanced statistical modeling tools, most specifically item response theory.

Kaitlyn M. Tagarelli works as a postdoctoral fellow in Psychology and Neuroscience at Dalhousie University in Halifax, Canada. She received her PhD in Applied Linguistics from Georgetown University and her research uses behavioral, Event-related Potential (ERP), and Functional magnetic resonance imaging (fMRI) techniques to examine the neural and cognitive mechanisms involved in language learning and processing. Dr. Tagarelli is particularly interested in the brain structures and memory systems underlying language learning, and how individual differences and learning conditions interact with learning processes and outcomes. Her work has appeared in edited volumes and Studies in Second Language Acquisition.

Hong Wang is a subject librarian and information specialist at the University of Auckland. She has a masters degree in library and information science, a bachelor’s degree in foreign language education, and an associate degree in computer science. She has extensive experience in lecturing on information literacy, and she has also taught ESL and Chinese in various instructional settings in China and the U.S.

Terrence G. Wiley is Professor Emeritus at Arizona State University and immediate past President of the Center for Applied Linguistics, specializing in language education and policy. His recent works include Handbook of Heritage, Community, and Native American Languages: Research, Policy, and Practice (co-editor, 2014) and Review of Research in Education, 2014, 38(1). Wiley co-founded the Journal of Language, Identity and Education and the International Multilingual Research Journal. He is organizer of the International Language Policy Research Network of Association



Internationale de la Linguistique Appliquée and recipient of the American Association for Applied Linguistics Distinguished Scholarship and Service Award (2014).

Anna Her Yang is a doctoral student in the Department of Language and Literacy Education at the University of Georgia. She is also the project coordinator of a five-year National Professional Development grant. Her research interest primarily focuses on the pedagogical experiences of mainstreamed ESOL (content-area) teachers of English learners.

Soo Jung Youn is Assistant Professor of English at Northern Arizona University, USA. Her academic interests include L2 pragmatic assessment, task-based language teaching, quantitative research methods, and conversation analysis. In particular, her research focuses on assessing L2 learners’ ability to accomplish various pragmatic actions in interaction by investigating a wide range of interactional features indicative of a varying degree of pragmatic competence using mixed methods. Her studies have recently been published in Language Testing, System, and Applied Linguistics Review.

Richard F. Young is Emeritus Professor of English at the University of Wisconsin-Madison and Chutian Professor in the School of Foreign Languages at Central China Normal University. His research focuses on the relationship between language and social context and has resulted in four books: Discursive Practice in Language Learning and Teaching (2009), Language and Interaction (2008), Talking and Testing, and Variation in Interlanguage Morphology (1998), as well as over 70 articles.

List of Figures

Fig. 1.1  The five key stages of empirical research
Fig. 2.1  “Practicing speaking” in Spanish (Hall, 2004, p. 76)
Fig. 2.2  Representing embodied cognition (Goodwin, 2003, Fig. 2.9, p. 35)
Fig. 3.1  Beeswarm plot of interest in interacting with Chatbot (Data 1) and human partner (Data 2)
Fig. 3.2  Diagram of longitudinal Chatbot experiment design
Fig. 3.3  Combination interaction/boxplots of the longitudinal Chatbot data
Fig. 3.4  A parallel plot showing interest in human versus Chatbot interlocutors over three testing times
Fig. 3.5  Hypothesized model of interest in task and course
Fig. 3.6  Final model of interest in task and course
Fig. 9.1  The four questions framework
Fig. 9.2  Visual prompt for a literature review
Fig. 9.3  How is my study contributing?
Fig. 12.1  Draft 1 of fieldnotes
Fig. 12.2  Coded fieldnotes
Fig. 12.3  Screenshot of Transana programme used to collate fieldnotes and recordings (Hall, personal data, 2015)
Fig. 14.1  Sample visual world. Note: In this example, “cat” is the target, “caterpillar” is an onset competitor, “bat” is a rhyme competitor, and “hedgehog” is an unrelated distractor. Images are from the Multipic database (Duñabeitia et al., 2017)
Fig. 14.2  Sample data from mouse-tracking language experiment. Note: The black line represents a competitor trajectory; the gray line represents a target trajectory. Images are from the Multipic database (Duñabeitia et al., 2017)
Fig. 14.3  Sample ERP waves and scalp topography maps of the standard ERP correlate of semantic processing (N400). Note: Each tick mark on y-axis represents 100 ms; x-axis represents voltage in microvolts, ±3 μV; negative is plotted up. The black line represents brain activity to correct items, such as plane in example 2a. The blue line represents brain activity to a semantic anomaly, such as cactus in example 2b. The topographic scalp maps show the distribution of activity in the anomaly minus correct conditions with a calibration scale of ±4 μV. From data reported in Grey and Van Hell (2017)
Fig. 14.4  Examples of (a) semantic priming using lexical decision, (b) masked semantic priming, and (c) syntactic priming using a picture description task. Note: Drawing credit: Kyle Brimacombe
Fig. 14.5  Artificial linguistic systems in language learning paradigms (based on Morgan-Short et al., 2010; Saffran et al., 1996; Tagarelli, 2014). Note: Drawing credit: Kyle Brimacombe
Fig. 17.1  Grammar and Beyond 4, “Avoid Common Mistakes” box (p. 75)
Fig. 17.2  “Be careful note” on the overuse of modal auxiliaries (MEDAL2, p. 17)
Fig. 19.1  Scatterplots of four samples of students’ scores
Fig. 19.2  Scatterplots indicating small, medium, and large r in L2 research
Fig. 19.3  Crosshatched area representing an r² of 0.25 (25%)
Fig. 19.4  Representation of Pearson’s r as a non-directional measure
Fig. 19.5  Representation of regression as a directional measure
Fig. 19.6  Scatterplot for predicting OLA from LR(years)
Fig. 19.7  Menu for selecting simple regression analysis in SPSS
Fig. 19.8  Selections for running regression analysis in SPSS
Fig. 19.9  Statistics for running regression in SPSS
Fig. 19.10  ANOVA partitioning of total sum of squares (SOS) in OLA (R² = 50.1%)
Fig. 19.11  Scatterplot for LR(years) predicting OLA with the regression line
Fig. 19.12  Factor shown as the commonly shared area among standardized variables
Fig. 20.1  EFA versus PCA
Fig. 20.2  12 essential steps in EFA
Fig. 20.3  Screenshot of the strategy use in lectures data
Fig. 20.4  Descriptive statistics options in SPSS
Fig. 20.5  EFA in SPSS
Fig. 20.6  Factor analysis menu
Fig. 20.7  SPSS Descriptives dialog box
Fig. 20.8  SPSS extraction dialog box
Fig. 20.9  SPSS extraction dialog box

Fig. 20.10  Scree plot (PCA)
Fig. 20.11  Creating a parallel analysis syntax
Fig. 20.12  Customising a parallel analysis syntax in SPSS
Fig. 20.13  Extracting factors using the principal axes factoring method with the fixed factor number = 5
Fig. 20.14  Scree plot (PAF)
Fig. 20.15  Rotation dialog box (direct Oblimin method)
Fig. 20.16  Rotation dialog box (Varimax method)
Fig. 20.17  Options dialog box
Fig. 20.18  Creating a factor score
Fig. 20.19  Factor scores in the SPSS data sheet
Fig. 20.20  Creating a composite score for comprehending strategies
Fig. 21.1  A third-order factor CFA model
Fig. 21.2  CFA model of reading performance (Standardised solution; N = 651)
Fig. 21.3  CFA model of reading performance (Unstandardised solution; N = 651)
Fig. 21.4  A hypothesised SEM model of the influences of trait cognitive and metacognitive processing on reading performance (N = 651)
Fig. 21.5  Eight essential steps in CFA or SEM
Fig. 21.6  Open a data file in EQS
Fig. 21.7  EQS spreadsheet
Fig. 21.8  EQS diagram drawing tool
Fig. 21.9  EQS diagram drawing canvas
Fig. 21.10  Factor structure specification
Fig. 21.11  A hypothesised CFA of comprehending strategies
Fig. 21.12  EQS model specifications
Fig. 21.13  EQS model specifications
Fig. 21.14  Analysis in EQS
Fig. 21.15  Distribution of standardised residuals
Fig. 21.16  Parameter estimates options
Fig. 21.17  First-order CFA for comprehending strategy use (unedited version)
Fig. 21.18  Revised first-order CFA model for comprehending strategy use
Fig. 21.19  Distribution of standardised residuals (revised model)
Fig. 21.20  Revised CFA model of comprehending strategy use
Fig. 22.1  Histogram of vocabulary test scores for Teaching Method 2
Fig. 23.1  Bar plot displaying normed frequency of said in news and other CORE registers
Fig. 23.2  Resampled mean differences based on the data from Donaldson (2011)
Fig. 25.1  TAALES GUI
Fig. 25.2  TAALES .csv file for analysis
Fig. 25.3  WEKA explorer
Fig. 25.4  File selection in WEKA
Fig. 25.5  Histogram for normally distributed TAALES index
Fig. 25.6  Histogram for non-normally distributed TAALES data
Fig. 25.7  Selection of model in WEKA
Fig. 25.8  Selection of cross-validation type in WEKA
Fig. 25.9  Initial linear regression model reported in WEKA with suppression effects
Fig. 25.10  Final linear regression model
Fig. 33.1  Heritage language dissertations between 2011 and 2016 by country
Fig. 33.2  Language(s) studied by articles in the Heritage Language Journal and dissertations. Note: Some of the articles published in the Heritage Language Journal and dissertations published in the ProQuest Database contain the examination of more than one heritage language

List of Tables

Table 5.1  Common types of mixed methods designs
Table 6.1  A comparison between traditional reviews and research syntheses
Table 9.1  Thesis proposals: structure and purpose (based on Paltridge & Starfield, 2007, p. 61)
Table 10.1  Moves in empirical research article introductions
Table 10.2  Dimensions to consider when constructing a research article
Table 10.3  Discovering norms for use of metadiscoursal features
Table 14.1  Common language-related ERP effects
Table 19.1  Two variables showing a perfectly positive Pearson’s r
Table 19.2  Two variables showing a perfectly negative Pearson’s r
Table 19.3  Imperfect r due to differences in ordering
Table 19.4  Imperfect r due to differences in scores’ shapes
Table 19.5  Imperfect r due to differences in scores’ shapes and ordering
Table 19.6  Data for predicting OLA from LR (N = 10)
Table 19.7  SPSS output of model summary from simple regression analysis
Table 19.8  ANOVA output table for simple regression analysis in SPSS
Table 19.9  Output for regression coefficients in SPSS
Table 19.10  Result of prediction of OLA from LRyears for our ten participants
Table 19.11  Coefficients table with modified scale for predictor variable
Table 20.1  Descriptive statistics of items one to five
Table 20.2  Cronbach’s alpha coefficient of the questionnaire
Table 20.3  KMO and Bartlett’s test based on 37 items
Table 20.4  Communalities (initial and extracted)
Table 20.5  Total variance explained

Table 20.6  A comparison of eigenvalues from the dataset with those from the parallel analysis
Table 20.7  KMO and Bartlett’s test based on 25 items
Table 20.8  Initial and extraction values based on the PAF method
Table 20.9  Total variance explained (five-factor extraction)
Table 20.10  Factor correlation matrix
Table 20.11  Rotated factor matrix based on the Varimax method (based on 25 items)
Table 20.12  Rotated factor matrix based on the Varimax method (based on 24 items)
Table 20.13  Cronbach’s alpha coefficient of each factor (based on 24 items)
Table 20.14  Rotated factor matrix based on the Varimax method (based on 23 items; final model)
Table 20.15  Rotated factor matrix based on the Varimax method (based on 23 items; final model)
Table 20.16  Correlations between factor scores and composite scores (N=275, **=p
Table 21.1 Table 21.2 Table 21.3 Table 21.4 Table 21.5 Table 21.6 Table 22.1 Table 22.2 Table 22.3 Table 22.4 Table 22.5 Table 23.1 Table 23.2 Table 23.3 Table 25.1 Table 33.1 Table 33.2 Table 38.1

[…] 0.90). It is generally held that a Cronbach’s alpha of over 0.70 is acceptable (Devellis, 2012). Now let us take a look at the results of a robust paired t-test on the two groups (it is a percentile bootstrapped 20% trimmed means paired-samples t-test; R code similar to this procedure can be found on page 307 in Larson-Hall, 2015). We consider confidence intervals to be more informative and appropriate than p-values, so we will report those first. This test returns a 95% confidence interval which gives a range where 95% of the time we would expect to see the possible difference in means between the groups fall if we repeatedly tested these groups. The 95% confidence interval is [−0.01, 0.21], meaning that the difference between the groups might be as much as 0.21 of a point, or it might be zero (since the CI passes through zero). The range for the confidence interval is fairly narrow, so we can be confident that the actual difference between groups is small and there is basically no difference between groups.
Another way of thinking about this confidence interval is that it shows the size of the differences between groups, which at the most might be only 0.21 points out of 6 possible points, so the effect size of the difference between groups is quite small. If we want to look at the results the more traditional way, with p-values, the robust analysis returns a p-value of p = 0.10, which is larger than our alpha level of α = 0.05. On this approach, we would fail to reject the null hypothesis that the groups are equal. In both cases we would also like to generate a standardized effect size, Cohen’s d. Mizumoto’s website, which generated Fig. 3.1, also calculates Cohen’s d, giving d = 0.1, 95% CI [−0.02, 0.23]. This effect size means that the difference between the groups amounts to 0.1 of a standard deviation, which is quite small. All of our results (confidence intervals with effect sizes contained in the size of the original measurement, p-values, and standardized effect sizes) point to the fact that the differences between groups are not very strong and that we should consider the groups as equivalent.
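
As a rough illustration of the kind of analysis just described, the sketch below computes a percentile bootstrap confidence interval for the 20% trimmed mean of paired differences, together with a Cohen's d. It is not the R procedure referred to in Larson-Hall (2015): the ratings are invented, the function names (`trimmed_mean`, `bootstrap_ci_paired`, `cohens_d`) are hypothetical, the bootstrap is applied to the paired differences (only one of several ways to set up a robust paired comparison), and d is standardized by the average of the two variances, which is one convention among several.

```python
import random
import statistics


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    ordered = sorted(values)
    cut = int(proportion * len(ordered))  # how many scores to drop from each tail
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def bootstrap_ci_paired(first, second, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI for the trimmed mean of the paired differences."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [a - b for a, b in zip(first, second)]
    estimates = []
    for _ in range(n_boot):
        # Resample whole pairs (here: their differences) with replacement.
        resample = [rng.choice(diffs) for _ in diffs]
        estimates.append(trimmed_mean(resample))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[int((1 - tail) * n_boot) - 1]
    return lower, upper


def cohens_d(first, second):
    """Mean difference divided by the root of the averaged variances."""
    pooled_sd = ((statistics.variance(first) + statistics.variance(second)) / 2) ** 0.5
    return (statistics.mean(first) - statistics.mean(second)) / pooled_sd


# Invented interest ratings (1-6 scale) for the same 12 students in both conditions.
human = [4, 5, 3, 4, 6, 4, 3, 5, 4, 4, 5, 3]
chatbot = [4, 4, 3, 4, 5, 4, 3, 5, 4, 3, 5, 3]

lower, upper = bootstrap_ci_paired(human, chatbot)
d = cohens_d(human, chatbot)
print(f"95% CI for the trimmed mean difference: [{lower:.2f}, {upper:.2f}]")
print(f"Cohen's d: {d:.2f}")
```

An interval that includes zero, as in the chapter's result, would be read the same way as in the text: the data are compatible with no difference between the two conditions.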


L. K. Fryer et al.

This is good news for our English program—we have just found out that our students are as happy chatting with an artificial intelligence on their computer notebooks as chatting with other students. Because of this, we might decide to have Chatbot homework so that students can continue to practice “speaking” outside of the classroom, but without the hassle of making a scheduled time to talk to a real person. Our experiment shows that the technology is definitely useful and could possibly be an important factor to help improve our students’ interest in English and motivation to continue. Since our school is planning to make a significant investment in the tablet technology, we’re quite happy to see these results.

Longitudinal Analysis

Our comparison of students’ interest in the Chatbot versus human raters was encouraging. However, it may not be safe to conclude from the results that this will always hold true, even with the examined population. Until now, our study can be considered to be what is known as cross-sectional, comparing our experimental condition (students attempting the Chatbot activities) with a control condition (students speaking with one another, as usual), which is used as a reference point for comparison. Many studies in our field can be described this way; frequently, we only measure the effect of our experimental condition once in a single study, and leave it at that. A limitation of such studies is that since the dependent variable is only measured once, we receive only what could be considered a “snapshot” of our treatment’s effect on learners. Until now we have only measured students’ engagement in the Chatbot activity at just a single point in time. Of course, what we would really like to know is whether the Chatbot holds students’ interest as much as a human partner does in the long run. In such cases, it may be worthwhile to measure student interest in our treatment not once, but multiple times throughout the semester. Such a study can be considered to be longitudinal. We submit that longitudinal studies are almost always to be preferred over cross-sectional studies, and Ortega and Iberri-Shea (2005) and Ortega and Byrnes (2008) concur. Of course, one must actually design this kind of study ahead of time, and this is indeed what we did. Prior to the beginning of the course, students’ interest in English in general was also estimated with a short survey (five items); subsequently, all students used the Chatbot to reduce the novelty factor.
Three weeks later, students were randomly divided into either (1) the Chatbot first, human second, or (2) the human first, Chatbot second group; half of the students tried a conversation that began with structures used in the course content with a human partner and the second half of the class tried the same conversation with the Chatbot, employing the tablet and voice to speech software (this is labeled as T-1 in Fig. 3.2). After doing the task with their first partner, students then switched and did the task with the other type of partner. Immediately after the tasks with each type of partner at each time period (T-1, T-2, and T-3) were finished, students completed a short five-item survey measuring interest in the task (the dependent variable measured). Finally, after all the tasks were completed, the students took a survey of interest in the course, which they had also taken 11 weeks earlier, one week after T-1.

Fig. 3.2  Diagram of longitudinal Chatbot experiment design

We created a control condition by having a counterbalanced design where each participant could act as their own control. Having a control group is essential if one wants to make any statements about the effect of a variable or condition. In addition, testing the same person multiple times results in a reduction of the error in the statistical component, thus strengthening the results of the study by increasing power and decreasing the number of participants needed. Another point to make about this illustrative study is that technology makes gathering data, even longitudinally and from large numbers of students as this study does, much less difficult than it has been previously. Technology is changing the playing field of both teaching and research and has major implications for applied linguistics (Duff, 2010). Tablets and clickers make a micro-analysis of the type we are discussing here infinitely more possible,



without interfering with valuable class time. We hope researchers will consider the value of gathering large amounts of data whatever their question may be: student interest, pronunciation accuracy, understanding of grammatical structures, and so on. In looking at the data, first let us consider the mean scores and standard deviations. In all cases, the number of participants was 121. For Chatbots, the interest at Time 1 (Week 7) was X̄ = 3.8, SD = 1.2; at Time 2 (Week 10) it was X̄ = 3.1, SD = 1.4; and at Time 3 (Week 13) interest was X̄ = 2.8, SD = 1.5. Thus we see the interest declining over time, with the standard deviation getting larger each time. For humans, the interest at Time 1 was X̄ = 3.9, SD = 1.0; Time 2 was X̄ = 3.8, SD = 1.1; Time 3 was X̄ = 3.8, SD = 1.2. The interest for a human interlocutor stayed essentially the same every time, with similar standard deviations. Figure 3.3 shows an interaction boxplot, giving interaction plots as well as boxplots for the data in two different configurations.

Fig. 3.3  Combination interaction/boxplots of the longitudinal Chatbot data

  Quantitative Methodology 


The graphics in Fig. 3.3 illustrate the picture of interest declining in the Chatbot over time, while enthusiasm for human interlocutors remained fairly constant throughout the length of the semester, although at all testing times the standard deviations were fairly wide and the range covered the entire scale, indicating a large array of opinions among the students.
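
The pattern in the reported means can be made concrete with a little arithmetic. The sketch below uses only the means given in the text and computes the change from the first to the last testing time for each interlocutor, plus the human-minus-Chatbot gap at each time; the variable and function names are ours and purely illustrative.

```python
# Mean interest reported in the text, by interlocutor and testing time.
chatbot_means = {"T-1": 3.8, "T-2": 3.1, "T-3": 2.8}
human_means = {"T-1": 3.9, "T-2": 3.8, "T-3": 3.8}


def change(means, start="T-1", end="T-3"):
    """Change in mean interest between two testing times, rounded to 2 places."""
    return round(means[end] - means[start], 2)


chatbot_change = change(chatbot_means)  # drop in interest for the Chatbot
human_change = change(human_means)      # drop in interest for human partners

# Gap between human and Chatbot interest at each testing time.
gap_by_time = {t: round(human_means[t] - chatbot_means[t], 2) for t in chatbot_means}

print("Chatbot change T-1 to T-3:", chatbot_change)
print("Human change T-1 to T-3:", human_change)
print("Human minus Chatbot gap:", gap_by_time)
```

On these means, Chatbot interest falls by a full point while human interest falls by a tenth of a point, so the gap widens from 0.1 at T-1 to 1.0 at T-3. This widening gap is what an interlocutor-by-time interaction would capture.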

Repeated-Measures ANOVA

Although the paired-samples t-test we performed in the first step of the analysis was useful in comparing one condition to the other at one time period, for comparing the scores of the same groups of students over multiple time periods, we need to look to a repeated-measures (RM) ANOVA which can take into account the lack of independence between measurements at more than two points. Right away, we can see from the descriptive statistics and Fig. 3.3 that two of the assumptions behind the statistical test are violated; the data are not normally distributed because boxplots indicate outliers, and standard deviations are also different when summarizing over humans versus Chatbots, with the Chatbots having a wider standard deviation (and thus variance). We proceeded with a least-squares RM ANOVA anyway; violating assumptions means it will be more difficult to find a statistical result and results in a loss of power. In this case, however, even with the violations of assumptions we were able to find a statistical two-way interaction between interlocutor and testing time, F(2, 223) = 25.3, p < 0.001, generalized eta squared = 0.02 (because sphericity was violated the Huynh-Feldt correction was applied to this result). This means that the participants did not rate their interest equally at all time periods, and this interest was affected by whether they were talking to humans or Chatbots. A formal statistical analysis would investigate the exact differences in the intersection of time and interlocutor by conducting further RM ANOVAs for testing time at the value of interlocutor (Chatbot or human), or, conversely, depending on our interests, a paired-samples t-test between Chatbots or humans could be conducted at each value of the testing time. We will skip the formal results of such an analysis in this paper, but it is clear from the interaction plots in Fig. 3.3 that the students lost interest in the Chatbots over time.
The initial paired-samples t-test showed that there was no difference between Chatbot and human interlocutors at the first testing, but testing for later times showed that students became less interested in talking with Chatbots over time. Figure3.4 gives a parallel plot showing individual results for this same data. Although individuals differed widely, the trends of higher interest scores for humans are clear as well.
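As a minimal sketch of the follow-up comparison described above, the paired-samples t statistic can be computed at each testing time from the differences between each student's two ratings. The ratings below are invented for illustration and are not the study's data. The function name `paired_t` is likewise an assumption, and a p-value would still have to be obtained from a t distribution with n − 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference (stdev uses the n - 1 denominator).
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


# Hypothetical interest ratings from the same six students at three testing times.
human = {1: [4, 5, 3, 4, 5, 4], 2: [4, 5, 4, 4, 5, 3], 3: [5, 4, 4, 4, 5, 4]}
chatbot = {1: [4, 4, 3, 5, 5, 4], 2: [3, 4, 2, 4, 4, 3], 3: [2, 3, 2, 3, 4, 2]}

for time in sorted(human):
    t, df = paired_t(human[time], chatbot[time])
    print(f"Time {time}: t({df}) = {t:.2f}")
```

Running several such tests inflates the chance of a false positive, so in practice some adjustment for multiple comparisons would normally accompany this kind of follow-up.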


L. K. Fryer et al.

Fig. 3.4  A parallel plot showing interest in human versus Chatbot interlocutors over three testing times

We want to point out that doing only a one-time analysis would have led us to a completely erroneous conclusion about the effectiveness of Chatbots. By testing at multiple times, we strengthen both our statistical and logical case and realize that the first testing time did not show a true or complete picture of students’ long-term interest in interlocutors.

Model-Based Analysis

To summarize up to this point, cross-sectional evidence suggested no significant difference in interest between Chatbots and humans, but longitudinal analysis indicated a small yet statistical decrease in interest over time for the Chatbot speaking task but not for the human speaking tasks. These results signal the difference between these two fundamental research designs. By employing the same research design but a different analytical framework, another layer of questions might be asked.

Regression analysis is an analytical method which is particularly useful for estimating the predictive connections between variables over time. While the test of such predictive relationships can be conducted with cross-sectional data, Tracz (1992) suggests that time-ordered (repeated-measures) data are a minimum research design component for any such estimated links to indicate potential causality (X causes Y). Tracz in fact sets out three research design criteria which must be met for causal implications to be drawn: (1) temporal sequencing of variables (X precedes Y), (2) a relationship among variables (r_xy > 0), and (3) control (X is controlled for when predicting Y with Z). The first condition can be met relatively easily through design, and the second condition must be present naturally. The third condition can be perceived as a continuum from high control (rigid experimental design) to low control (accounting for prior variance on pre-measures).

In the current study, Criteria 1 and 2 were addressed by the use of a longitudinal research design and by focusing on strongly related variables. The study’s design aimed to control (Criterion 3) future interest in two ways. First, prior variance was controlled for by including prior measurements of all variables. Second, initial condition (Chatbot or human) was controlled for by random assignment to the treatment groups. The research design described (presented in Fig. 3.2) lends itself to indicating predictive relationships through regression analysis. Basic regression analysis is the prediction of one variable by another variable (see Norouzian & Plonsky, Chap. 19). Multiple regression or hierarchical linear modeling can include multiple predictor variables, thereby potentially explaining more variance in the dependent variable.
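The idea of controlling for prior variance can be sketched as a regression in which later interest is predicted from the earlier measurement of interest together with the treatment condition. The following is a minimal ordinary least-squares sketch under stated assumptions: the variable names (`prior_interest`, `condition`, `later_interest`) and the data are hypothetical, and the small normal-equations solver stands in for whatever statistical software a researcher would actually use.

```python
def ols(predictors, outcome):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, outcome))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical data for eight students: interest at Time 1,
# condition (0 = human, 1 = Chatbot), and interest at Time 2.
prior_interest = [2, 3, 4, 5, 3, 4, 5, 6]
condition = [0, 0, 0, 0, 1, 1, 1, 1]
later_interest = [2.4, 3.1, 3.9, 4.6, 2.0, 2.7, 3.2, 4.1]

b0, b_prior, b_condition = ols(list(zip(prior_interest, condition)), later_interest)
print(f"intercept = {b0:.2f}, prior interest = {b_prior:.2f}, condition = {b_condition:.2f}")
```

The coefficient for `condition` is then the estimated difference in later interest between the two groups after prior interest has been accounted for, which is the sense of "control" at the low end of the continuum described above.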
If the regression analysis seeks to explain or predict multiple outcomes (dependent variables), then a model-based approach is necessary. There are two related model-based analytical approaches which might be utilized to test the predictive relationships between multiple independent and dependent variables. The first employs observed variables (mean-based) and is typically referred to as path analysis. The second approach employs latent variables (variables based on multiple indicators rather than mean scores) and is generally referred to as structural equation modeling (SEM; see also Phakiti, Chap. 21). It should be noted that path analysis forms the underpinnings of SEM. Both analytical approaches allow for a full model test, which means that the regressions are not calculated separately or sequentially but are instead all estimated simultaneously. This simultaneous analysis enables researchers to understand more completely the individual contribution of both different predictor variables and the same predictor over multiple time points. This approach also provides model test statistics or fit indices based on the chi-square for the model. These fit indices allow for the comparison of different models and a general understanding of how well the model fits the data (Kline, 2011).

In most cases, deciding between a mean-based and a latent variable-based approach comes down to sample size. A latent-based approach provides the benefit of addressing the measurement error inherent in mean scores and offers greater degrees of freedom, but it demands a substantially larger sample size (e.g., > 200, 500, 1000; for an in-depth discussion see Fan, Thompson, & Wang, 1999). The measurement error-free latent variables afforded by this approach result in more accurate estimates of prediction and have been suggested as essential for reciprocal modeling over time (Pedhazur, 1982). The additional degrees of freedom allow latent-based models to test a greater number of regression relationships while still providing a measure of model fit (see Loehlin, 1998).

We resolved to utilize SEM. To begin with, confirmatory factor analysis (CFA) of all variables was undertaken to assess their convergent and divergent validity. CFA is a type of SEM that assesses only the latent indicator-based variables (measurement models) for fit to the data. It is an essential first step for assessing whether your variables are being appropriately estimated (in other words, whether the indicator items load reasonably; convergent validity) and then whether the correlations between your modeled variables are reasonable. For SEM, high correlation between modeled variables is the chief concern. Correlations greater than 0.90 suggest insufficient divergent validity and will likely result in non-convergence of the analyses.
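The 0.90 benchmark for divergent validity can be illustrated with a simple screening step. The sketch below is only a stand-in under stated assumptions: it computes Pearson correlations between observed scale scores, whereas the check described above concerns correlations between latent variables estimated in a CFA. The variable names and scores are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_high_correlations(scores, threshold=0.90):
    """Return (name_a, name_b, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for a, b in combinations(scores, 2):
        r = pearson(scores[a], scores[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical scale scores for six students (not the study's data).
scores = {
    "task_interest": [3.0, 4.0, 2.5, 4.5, 3.5, 2.0],
    "course_interest": [3.2, 4.1, 2.4, 4.6, 3.4, 2.1],
    "domain_interest": [4.0, 3.0, 3.5, 4.5, 2.5, 3.0],
}

for a, b, r in flag_high_correlations(scores):
    print(f"{a} and {b} may not be distinct constructs: r = {r:.2f}")
```

A flagged pair would prompt a closer look at whether the two scales are measuring the same construct before any longitudinal model is estimated.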
Acceptable fit and reasonable inter-correlations from the confirmatory factor analysis suggest proceeding with the longitudinal analysis. Figure 3.5 presents the hypothesized model to be tested. The lines in Fig. 3.5 represent the regressions tested, while the circles represent the latent (multiple indicator) variables modeled. Analysis of the hypothesized model resulted in acceptable fit to the data set. The model regression outcomes are presented in Fig. 3.6. Based on Peterson and Brown’s (2005) recommendations for regression benchmarks, in line with Hattie’s (2009) guidelines for educational effect sizes, all modeled predictive effects except one were large. The strong predictive connections were expected given the theoretical consistency of the variables and the relatively short time between the data points.

The nonsignificant link between Chatbot tasks and students’ interest in the course is precisely what this analytical approach is designed to look for. After accounting for prior course interest and students’ interest in the human speaking task, the Chatbot task failed to contribute to students’ future interest in the course. This is an important finding because technology like Chatbots is generally brought into the class to stimulate interest in the language under study. Also, as noted, course interest is an essential mediator between interest in classroom tasks and interest in the broader domain under study. This finding demonstrates that, in addition to interest in Chatbot conversation declining over time, the Chatbot task also fails to enhance students’ long-term interest.

Fig. 3.5  Hypothesized model of interest in task and course

Fig. 3.6  Final model of interest in task and course
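Judgments of acceptable model fit rest on indices derived from the model chi-square. The chapter does not state which indices were used, so the sketch below simply illustrates one widely reported index, the root mean square error of approximation (RMSEA), computed from the chi-square, its degrees of freedom, and the sample size. The input values are hypothetical, and some software divides by N rather than N − 1.

```python
import math


def rmsea(chi_square, df, n):
    """Root mean square error of approximation from a model chi-square."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Misfit beyond what the degrees of freedom alone would predict, floored at zero.
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n - 1)))


# Hypothetical values, not taken from the study.
print(f"RMSEA = {rmsea(chi_square=100.0, df=50, n=201):.3f}")
```

Values closer to zero indicate better fit, and the index rewards parsimony because the excess chi-square is scaled by the degrees of freedom.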

Summary

In this chapter, we illustrated the steps of the quantitative approach to research methodology by walking the reader through three different research questions concerning the same research topic. We have noted that there is never only one way to look at any given data set, but have shown that progressively more sophisticated analysis can come closer to answering the questions that researchers actually hold. In this case, looking at the big picture of how Chatbots affect not just current interest in the task but interest in the course overall is probably the most informative and relevant of our three questions. We urge researchers to pursue knowledge of more sophisticated statistical techniques that are able to address these bigger picture questions (cf. Brown, 2015).

We would also suggest that SLA researchers utilize theoretical frameworks and analytical methods that best suit their research questions. It is common practice in many areas of SLA (not just motivation) to continue using long-standing methods and questionnaires which may not be suited to address some teachers’ and students’ needs. The broader fields of education and psychology can offer substantial alternatives to current approaches, which may in some cases lack strong theoretical and empirical support. SLA researchers would do well to explore all possible avenues, including those outside SLA. One possible reason for the reluctance of many SLA practitioners to embrace practices in neighboring fields could be an enduring belief that SLA differs entirely from other social and educational sciences. While there may be substantial differences for some aspects of language acquisition (e.g., fields other than linguistics may have little to say about phonemic acquisition), a considerable portion of what is often called acquisition is in fact a form of classroom learning. We believe classroom language learning research can be enriched by the broader fields of education, psychology, and, most specifically, educational psychology.

We believe the quest to improve our practices in quantitative methodology must embrace many different facets, but that the results will be better answers to our questions and further scientific progress.

Resources for Further Reading

Brown, J. D. (2004). Research methods for applied linguistics: Scope, characteristics, and standards. In A. Davies & C. Elder (Eds.), The handbook of applied linguistics (pp. 476–501). Oxford: Blackwell Publishing Ltd.
This chapter contains a thorough overview of types, topics, and purpose of research in the field of applied linguistics. Brown has laid out the steps of quantitative research so clearly that we did not see any point in repeating his excellent points and thus took a somewhat different approach for this chapter.

Brown, T. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press.
This book is an important resource toward fully understanding confirmatory factor analysis, which is an essential building block for structural equation modeling and advanced applications.

Chatterjee, S., & Hadi, A. S. (2015). Regression analysis by example. New York: John Wiley & Sons.
This book is an extremely practical guide to understanding regression, which is made easy to use and understandable through an example-based structure.

Kline, R. B. (2011). Principles and practices of structural equation modeling (3rd ed.). New York: Guilford Press.
This book is quickly becoming the most commonly referenced introductory textbook for structural equation modeling. It is a great introductory text but also a useful resource for researchers with basic skills seeking a better understanding or to learn intermediate level applications.



Larson-Hall, J. (2016). A guide to doing research in second language acquisition with SPSS and R (2nd ed.). New York: Routledge.
This book walks the reader through the basics of statistical analyses including descriptive statistics, understanding variables, and foundational statistical methods such as t-tests, regression, and ANOVA. The SPSS and R programs are used in parallel.

Loehlin, J. C. (1998). Latent variable models: An introduction to factor, path, and structural analysis. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
This book provides a powerful tutorial on path analysis, which is the underlying statistical framework that structural equation modeling is built upon. Path analysis is poorly understood by many applied researchers but is essential to understanding what is possible with SEM.

Phakiti, A. (2014). Experimental research methods in language learning. London: Bloomsbury Academic.
This book walks the reader quite thoroughly through types of quantitative research, validity and reliability, techniques and instruments, and basic statistical concepts.

References

Blom, E., & Unsworth, S. (Eds.). (2010). Experimental methods in language acquisition research. Amsterdam: John Benjamins.
Brown, J. D. (2004). Research methods for applied linguistics: Scope, characteristics, and standards. In A. Davies & C. Elder (Eds.), The handbook of applied linguistics (pp. 476–501). Oxford: Blackwell Publishing Ltd.
Brown, J. D. (2014). Mixed methods research for TESOL. Edinburgh: Edinburgh University Press.
Brown, J. D. (2015). Why bother learning advanced quantitative methods in L2 research. In L. Plonsky (Ed.), Advancing quantitative methods in second language research (pp. 9–20). New York: Routledge.
Brown, J. D., & Rodgers, T. S. (2003). Doing second language research. Oxford: Oxford University Press.
Brown, T. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press.
Chatterjee, S., & Hadi, A. S. (2015). Regression analysis by example. New York: John Wiley & Sons.
Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. New York: Routledge.
Devellis, R. F. (2012). Scale development: Theory and application (3rd ed.). Thousand Oaks, CA: Sage.
Dewey, J. (1913). Interest and effort in education. Boston: Houghton Mifflin Company.
Duff, P. (2010). Research approaches in applied linguistics. In R. B. Kaplan (Ed.), Oxford handbook of applied linguistics (2nd ed., pp. 45–59). New York: Oxford University Press.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 56–83.
Fryer, L. K., Ainley, M., & Thompson, A. (2016). Modelling the links between students’ interest in a domain, the tasks they experience and their interest in a course: Isn’t interest what university is all about? Learning and Individual Differences, 50, 57–165.
Fryer, L. K., Ainley, M., Thompson, A., Gibson, A., & Sherlock, Z. (2017). Stimulating and sustaining interest in a language course: An experimental comparison of AI and Human task partners. Computers in Human Behavior. https://
Fryer, L. K., & Carpenter, R. (2006). Bots as language learning tools. Language Learning & Technology, 10(3), 8–14.
Fryer, L. K., & Nakao, K. (2009). Online English practice for Japanese University students: Assessing chatbots. Paper presented at the Japan Association for Language Teaching national conference, Tokyo. Permanent Online Location:
Hattie, J. C. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. London and New York: Routledge and Taylor & Francis.
Hidi, S., & Renninger, K. A. (2006). The four-phase model of interest development. Educational Psychologist, 41(2), 111–127.
Hill, J., Ford, W. R., & Farreras, I. G. (2015). Real conversations with artificial intelligence: A comparison between human–human online conversations and human–chatbot conversations. Computers in Human Behavior, 49, 245–250.
Hudson, T. (2015). Presenting quantitative data visually. In L. Plonsky (Ed.), Advancing quantitative methods in second language research (pp. 78–105). New York: Routledge.
Hudson, T., & Llosa, L. (2015). Design issues and inference in experimental L2 research. Language Learning, 65(S1), 76–96.
Jang, E. E., Wagner, M., & Park, G. (2014). Mixed methods research in language testing and assessment. Annual Review of Applied Linguistics, 34, 123–153.
Jia, J., & Chen, W. (2008). Motivate the learners to practice English through playing with Chatbot CSIEC. In Z. Pan, X. Zhang, A. Rhalib, W. Woo, & Y. Li (Eds.), Technologies for E-learning and digital entertainment (pp. 180–191). New York: Springer.
Kline, R. B. (2011). Principles and practices of structural equation modeling (3rd ed.). New York: Guilford Press.
Larson-Hall, J. (2015). A guide to doing statistics in second language research using SPSS and R (2nd ed.). New York: Routledge.
Larson-Hall, J., & Herrington, R. (2009). Improving data analysis in second language acquisition by utilizing modern developments in applied statistics. Applied Linguistics, 31(3), 368–390.
Larson-Hall, J., & Plonsky, L. (2015). Reporting and interpreting quantitative research findings: What gets reported and recommendations for the field. Language Learning, 65(S1), 127–159.
Loehlin, J. C. (1998). Latent variable models: An introduction to factor, path, and structural analysis. Mahwah, NJ: Lawrence Erlbaum Associates Publishers.
Loewen, S., & Gass, S. M. (2009). Research timeline: The use of statistics in L2 acquisition research. Language Teaching, 42(2), 181–196.
Mackey, A., & Gass, S. M. (Eds.). (2011). Research methods in second language acquisition: A practical guide. Oxford: Wiley-Blackwell.
Norris, J. M. (2015). Statistical significance testing in second language research: Basic problems and suggestions for reform. Language Learning, 65(S1), 97–126.
Norris, J. M., Ross, S. J., & Schoonen, R. (2015). Improving second language quantitative research. Language Learning, 65(S1), 1–8.
Ortega, L., & Byrnes, H. (2008). The longitudinal study of advanced L2 capacities: An introduction. In L. Ortega & H. Byrnes (Eds.), The longitudinal study of advanced L2 capacities (pp. 3–20). New York: Routledge.
Ortega, L., & Iberri-Shea, G. (2005). Longitudinal research in second language acquisition: Recent trends. Annual Review of Applied Linguistics, 25, 26–45.
Paltridge, B., & Phakiti, A. (Eds.). (2015). Research methods in applied linguistics: A practical resource (2nd ed.). London: Bloomsbury Academic.
Pedhazur, E. J. (1982). Multiple regression in behavioral research (2nd ed.). New York: Holt, Rinehart and Winston.
Peterson, R. A., & Brown, S. P. (2005). On the use of beta coefficients in meta-analysis. Journal of Applied Psychology, 90(1), 175–181.
Phakiti, A. (2014). Experimental research methods in language learning. London: Bloomsbury Academic.
Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35(4), 655–687.
Plonsky, L., & Oswald, F. L. (2014). Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912.
Porte, G. K. (2010). Appraising research in second language learning: A practical approach to critical analysis of quantitative research (2nd ed.). Philadelphia: John Benjamins.
Roever, C., & Phakiti, A. (2018). Quantitative methods for second language research: A problem-solving approach. London and New York: Routledge.
Schiefele, U. (1991). Interest, learning, and motivation. Educational Psychologist, 26(3&4), 299–323.
Tobias, S. (1995). Interest and metacognitive word knowledge. Journal of Educational Psychology, 87(3), 399–405.
Tracz, S. M. (1992). The interpretation of beta weights in path analysis. Multiple Linear Regression Viewpoints, 19, 7–15.
Tufte, E. (2001). The visual display of quantitative information. Cheshire, CT: Graphics Press.
Tukey, J. W. (1960). A survey of sampling from contaminated distributions. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H. B. Mann (Eds.), Contributions to probability and statistics: Essays in honour of Harold Hotelling (pp. 448–485). Stanford, CA: Stanford University Press.
Walker, A., & White, G. (2013). Oxford handbooks for language teachers: Technology enhanced language learning. Oxford: Oxford University Press.
Wilcox, R. (2001). Fundamentals of modern statistical methods: Substantially improving power and accuracy. New York: Springer-Verlag.

4 Qualitative Methodology
Shim Lew, Anna Her Yang, and Linda Harklau

Introduction

This chapter provides an overview of qualitative research (QR) in applied linguistics, with a particular focus on recent research and developments in the field over the past six years. The chapter briefly reviews the historical development of qualitative methods in applied linguistics. It explores the philosophical and methodological premises of qualitative methods in the field and relates them to broader developments in QR across the social sciences, particularly in regard to issues of validity and quality of QR. The chapter then surveys current trends and issues based upon a review of over 100 QR journal articles in applied linguistics since 2011. It characterizes the varieties of qualitative approaches taken, theoretical frameworks used, and types of data collection and analytical methods employed. It points out challenges and controversies and concludes with limitations and future directions in QR in applied linguistics.

S. Lew (*)
Teacher Education and Educational Leadership, University of West Florida, Pensacola, FL, USA

A. H. Yang • L. Harklau
TESOL & World Language Education Program & Linguistics Program, University of Georgia, Athens, GA, USA

© The Author(s) 2018
A. Phakiti et al. (eds.), The Palgrave Handbook of Applied Linguistics Research Methodology,



S. Lew et al.

Historical Development

QR in applied linguistics has been defined as “research that relies mainly on the reduction of data to words (codes, labels, categorization systems, narratives, etc.) and interpretative argument” (Benson, 2013, p. 1). Compared to quantitative research, qualitative findings generally depend on particular ways of collecting, analyzing, and interpreting data in specific contexts (e.g., Creswell & Poth, 2017; Denzin & Lincoln, 2017). QR can be differentiated from quantitative research primarily by researchers’ contrasting views of social construction and mental models about the use of numbers (e.g., Creswell & Poth, 2017). Maxwell (2010) suggests that while quantitative research uses variables and correlations to conceptualize the world, QR uses events and processes instead. Moreover, while quantitative approaches seek regularities and causal relations by evidencing a regular association between a change in one entity and a change in another, qualitative researchers tend to view causality as fundamentally bound up with the causal mechanisms and processes of change over time that are actually involved in particular events (e.g., Denzin & Lincoln, 2017). Richards (2009) argues that in recent years applied linguistics scholars have increasingly gone beyond such dichotomous views of quantitative and qualitative research, moving toward pragmatic approaches that focus more on practical and contextual issues than on conceptual debates.

Benson (2013) observes that applied linguistics adopted QR methods considerably later than most other social sciences. It was not until the early 1990s that major journals (e.g., Applied Linguistics, Language Learning, Studies in Second Language Acquisition, and TESOL Quarterly) began publishing articles utilizing qualitative methods, and it took a decade more for qualitative methods to appear in research manuals in the field (e.g., Croker & Heigham, 2009; Richards, 2003). TESOL Quarterly’s 2003 publication of QR guidelines (Chapelle & Duff, 2003) represented a further landmark in the field. Since then, QR has appeared regularly in applied linguistics research and has achieved mainstream status (see, e.g., the most recent TESOL Quarterly QR guidelines; Mahboob et al., 2016). Nevertheless, our review concurs with previous work (e.g., Benson, Chik, Gao, Huang, & Wang, 2009; Richards, 2009) suggesting that QR appears less frequently than quantitative research in most applied linguistics journals. Our analysis also finds significant differences in QR publication trends among journals in the past six years, with some (e.g., International Review of Applied Linguistics) still publishing very little while the percentage has gradually increased in others (e.g., The Modern Language Journal, Applied Linguistics, and TESOL Quarterly).

  Qualitative Methodology 


Philosophical and Methodological Premises

Given the status of QR as a standard, if somewhat underrepresented, research approach in the field of applied linguistics, it is important to unpack the varying philosophical and methodological premises behind varying forms of QR.

Epistemology and Ontology Underlying QR in Applied Linguistics

Every research methodology is associated with a theoretical perspective that embodies “a certain way of understanding what is (ontology) as well as a certain way of understanding what it means to know (epistemology)” (Crotty, 2009, p. 10). These theoretical perspectives on the nature of being and knowing tend to remain tacit in most studies using quantitative methodologies, since such studies tend to take a realist view, seeking knowledge of an objective reality existing outside of human perception (Sealey, 2013), and an objectivist epistemological perspective assuming that “truth and meaning reside in their objects independently of any consciousness” (Crotty, 2009, p. 42). Epistemological and ontological stances play a more prominent role in QR, however, because of widely varying theoretical perspectives on what kinds of knowledge researchers believe can be produced by their research and how they believe readers should interpret their research findings and outcomes.

Qualitative researchers in applied linguistics may, like quantitatively oriented colleagues, hold realist ontological views. Nevertheless, within the realm of the social sciences, realist philosophers and methodologists hold differing epistemological stances on the extent to which researchers have access to that reality, acknowledging the partiality and fallibility of human knowledge mediated through semiotic representations (e.g., Sealey, 2013). Qualitative researchers, particularly those aligning with post-structuralism, might also hold antirealist ontological views seeking “to obtain knowledge of entities that are conceived as not ‘given’, that is, not independent of human action or of embeddedness in human culture” (Crookes, 2013, p. 1).

These varying positions have implications for the research methodology one chooses and the claims one makes. For example, even within the single qualitative methodology of grounded theory, some researchers take a realist ontology and positivist epistemology, focusing on “a method of discovery,” “a method of verification,” “an objective external reality,” “a passive, neutral observer,” “categories as emergent from the data,” and “a direct and, often, narrow empiricism” (Strauss & Corbin, 1998). If, on the other hand, researchers hold an antirealist ontology with constructivist epistemology, they may use grounded theory highlighting “the flexibility of the method,” “a multiple, processual, and constructed” social reality, and “researchers’ reflexivity” about their “position, privileges, perspective, and interaction” (Charmaz, 2014, p. 13).

Validity and Quality of QR

While the notion of standards for QR in applied linguistics has been critiqued as potentially simplistic or limiting, issues of the quality and rigor of QR continue to be debated in applied linguistics and across the social sciences (Richards, 2009). Unlike objectivist-oriented research, in which validity lies in eliminating or controlling personal biases, many feel that qualitative methods inevitably involve subjectivity, since researchers interact with social worlds with a host of assumptions about human knowledge and realities (e.g., Crotty, 2009). Therefore, validity lies in the manner in which researchers manage their subjectivity and in detailed justification of the appropriateness of researchers’ choices of research design for a particular social setting and set of researcher-subject relations, in order to establish transparency and accountability (Holliday, 2013).

Accordingly, instead of validity per se, qualitative researchers may use other terms. For example, in a classic essay, Lincoln and Guba (1985) proposed alternate criteria. Credibility refers to whether a researcher has taken the necessary steps to ensure that their interpretations are trustworthy, such as employing constant comparison, searching for negative evidence, and using member validation. This is related to transparency, that is, whether research reports contain enough information about methods to help readers evaluate the trustworthiness of those methods. Transferability refers to providing a “thick” or rich enough description of the research so that readers can assess the applicability of the study to their own situations. Dependability of the study might be addressed by explicitly noting issues of researcher subjectivity, particularly as it concerns methodology. Finally, researchers might seek confirmability by showing as much data to readers as possible in order to demonstrate the relationship between data and claims. Likewise, Tracy (2010) proposes eight key markers to gauge QR quality: (1) worthiness of topic, (2) rich rigor, (3) sincerity, (4) credibility, (5) resonance, (6) significance of contribution, (7) ethics, and (8) meaningful coherence.

Validity in qualitative methodology also often entails researcher reflexivity. Depending upon their approach, researchers might critically reflect on their personal and theoretical biases, consider their presence in the research, view themselves as tools for inspecting the entire research process (Schwandt, 2001), and acknowledge their voice as part of the text (Starfield, 2013). In all, within the common premise in QR that the researcher is an instrument, there are varying epistemological and ontological stances holding differing views about the nature of what researchers are doing: whether they are documenting reality; whether they even can do so; if they can, whose reality it is; what kinds of knowledge they aim to attain; what kinds of knowledge are possible; and how they know that those kinds of knowledge are adequate and legitimate. These differing beliefs are important since they influence the selection and use of methodology and methods, whether and how validity is considered, and the extent to which researcher reflexivity is considered.

Current Trends and Issues

In line with recent reviews (e.g., Harklau, 2011), we found that English is still the most commonly studied target language in applied linguistics research. More research is needed regarding other target languages and cultures. For example, a small but growing body of QR on heritage language learning has focused primarily on Spanish (e.g., Burke, 2012) and Korean (e.g., Choi & Yi, 2012) in the US context. French is often the focus of QR studies in Canada (e.g., Macintyre, Burns, & Jessome, 2011). Some QR studies involving multiple target languages have taken place in the context of international schools and college foreign language classes (e.g., Jonsson, 2013).

Our review of recent work shows a continuing focus on language learners at the college level (e.g., Bidabadi & Yamat, 2014) and English as a second/foreign language teachers and teacher candidates (e.g., Cheng, 2016; Nuske, 2015). Secondary school learners and teachers (e.g., Brooks, 2015) and college instructors (e.g., Kayi-Aydar, 2011) have also been frequent foci. Studies of young learners and their teachers have been relatively rare in recent work. Also present but uncommon in recent work are QR studies on adult learners (e.g., Judge, 2012) and teacher-learner, child-parent, or immigrant family interactions (e.g., de Fina, 2012).

The US remains the most frequently studied national context for QR in applied linguistics. Other Anglophone countries including the UK, Canada, Australia, New Zealand, and Ireland are also well represented. Less frequently the focus has been on European, East Asian, or South Asian countries. Studies situated in Central and South American, African, and Middle Eastern contexts have been few. Likewise, research spanning multiple national contexts, which usually examines telecollaborative and online language learning and study abroad, has been relatively rare.

Teaching and learning English as a second/foreign language or other foreign languages continues to be the dominant topic of QR in applied linguistics (e.g., Kim, 2015). Teacher education, professional development, and teachers’ experience of and reflection on their teaching or learning (e.g., Shirvan, Rahmani, & Sorayyaee, 2016) have also been frequent topics in recent work. Other current topics of QR in applied linguistics include identity and agency in learners, teachers, and immigrants (e.g., Cheng, 2016; Kim & Duff, 2012); language ideologies in multilingual contexts (e.g., Rudwick & Parmegiani, 2013); heritage language learning and teaching (e.g., Choi & Yi, 2012); community and service learning (e.g., Leeman, Rabin, & Román-Mendoza, 2011); refugee experiences (e.g., Warriner, 2013); multilingualism and multi-literacies (e.g., Dorner & Layton, 2014; Takeuchi, 2015); language policies (e.g., Nero, 2015; Tamim, 2014); text/speech analysis (e.g., Britt, 2011); sign language and learners with disabilities (e.g., Cramér-Wolrath, 2015); parent involvement and perceptions of education (e.g., Kavanagh & Hickey, 2013); technology, online gaming, and telecollaboration (e.g., Shin, 2014; Whyte, 2011; Zhang, 2013); linguistic landscape analysis (e.g., Zabrodskaja, 2014); media discourse analysis (e.g., Chamberlin-Quinlisk, 2012); and research ethics (e.g., Perry, 2011). In all, QR in applied linguistics has thus far had an uneven and limited reach, leaving much potential for future exploration.

Overview of Research Designs/Types

Major types of QR in applied linguistics can be classified in varying ways. However, most scholars (e.g., Benson et al., 2009; Harklau, 2011) identify two main points of departure or approaches for QR methodologies. One is the analysis of “sociocultural and ecological contexts of language learning and teaching” (Harklau, 2011, p. 178) or “the people, situations, and social processes involved in language learning and teaching” (Benson et al., 2009, p. 84). The other is the analysis of “spoken and written texts” (Benson et al., 2009, p. 84), or “the construction of social realities through discourse” (Harklau, 2011, p. 178). The former approach relies primarily on analysis of participant observation and interviews, while the latter relies primarily on the analysis of audio or video recordings and texts. The former tends to include case study, ethnography, longitudinal studies, think-aloud studies, narrative, self-study, stimulated recall, action research, diary study, and phenomenology, while the latter tends to prioritize discourse analytic methods, analysis of classroom interaction, conversation analysis (CA), corpus study, genre analysis, and systemic functional analysis (Benson et al., 2009).

Varieties of Qualitative Approaches

Several methodologies tend to predominate in recent QR in applied linguistics.

Case Study

While most quantitative approaches seek large samples, a case study explores “a ‘how’ and ‘why’ question” about “a contemporary set of events” (a case) “over which a researcher has little or no control” (Yin, 2014, p. 14). In applied linguistics, the case usually has been “a person (e.g., a teacher, learner, speaker, writer, or interlocutor) or a small number of individuals on their own or in a group (e.g., a family, a class, a work team, or a community of practice)” (Duff, 2014, p. 233). Qualitative case studies in applied linguistics are often longitudinal (Hood, 2009) since they usually document the process, rather than the outcome, of language learning and teaching, the interaction between individuals and contexts mediated by various factors over time (Harklau, 2008), and the development of learner/teacher identity and cognition. Case studies are often used to explore “previously ill-understood populations, situations, and processes” (Duff, 2014, p. 242). Recent examples include linguistic minority students, heritage language learners, indigenous language learners, and transnational language learners and sojourners. Case studies have not only contributed significantly to theories and models in those areas of applied linguistics but have also often influenced educational policies and practices (Duff, 2014). Case study is also one of the oldest and most common forms of inquiry in applied linguistics.

Recent case study work in applied linguistics may or may not be further specified as ethnographic, multiple or multi-case, collective, naturalistic, or narrative (e.g., Brooks, 2015; Wyatt, 2013). Many studies align their methods with well-known case study texts including Merriam (2009), Stake (2000), Duff (2008), Richards (2011), and Yin (2014). Case study is frequently employed in current applied linguistics scholarship to document the implementation of specific instructional tools, programs, or policies. Other recent case study foci have included challenges of language learners and teachers in particular contexts; interactions among teachers/learners, family members, or multiple parties in particular settings; learners’ or teachers’ identity formation or emotions; learning processes and changes in learners’ use of linguistic concepts; and individual reactions to experiences such as study abroad or teacher reflection.

Ethnography

Ethnography, originating in anthropology and sociology, refers to a variety of research approaches distinguished by “[involving] some degree of direct participation and observation” and “[constituting] a radically distinctive way of understanding social activity in situ” (Atkinson, 2015, pp. 3–4). While ethnography is usually associated with an emic approach in terms of “developing empathy with, and giving voice to, participants” (Markee, 2013, p. 3), in applied linguistics it varies in whether it focuses on “the traditional anthropological emically-oriented what of participant understandings and experiences” or on “the interactional sociolinguist’s etically-oriented how participants structure social realities through interaction” (Harklau, 2005, p. 189).

Studies in the field may be explicitly identified as ethnography (e.g., Dorner, 2011; Jonsson, 2013; Malsbary, 2014) or ethnographic case study (e.g., Huang, 2014; Marianne, 2011). Alternatively, researchers may note the use of ethnographic data collection and analysis methods including observation, field notes, microethnography, and in-depth interviews (De Costa, 2016). As with the majority of QR methods, data sets are likely to contain interviews, documents or other textual and photographic archival data, and observation and field notes. Some studies may combine ethnography with other methodological strands of QR such as grounded theory and/or critical discourse analysis. Widely cited authorities for this methodology include Glaser and Strauss (1967), Spradley (1980), Strauss and Corbin (1998), Erickson (1992), LeCompte and Schensul (1999), Denzin and Lincoln (2003), Fetterman (1998), DeWalt and DeWalt (2002), Patton (2014), and Emerson, Fretz, and Shaw (2011).

Much of the current work using ethnography takes place in multilingual educational settings (e.g., Hornberger & Link, 2012) and investigates issues including the experience of learners or teachers in particular contexts, the development of particular competence, dual language education, multilingual literacies, linguistic landscapes, and language ideologies (see Blommaert, 2013; McCarty, 2014).

Conversation Analysis

CA, as developed by Harvey Sacks, arose out of the tradition of ethnomethodology to examine language as social action, such as turn-taking (Sacks, Schegloff, & Jefferson, 1974). CA aims to account for the practices and preference organization of turn-taking, repair, and conversational sequencing during naturalistic or instructed talk-in-interaction (e.g., Hutchby & Wooffitt, 2008). Over the years, CA has broadened greatly to become an influential methodology for examining social interaction and its sequential organization across the social sciences, including applied linguistics (e.g., Markee, 2000; Seedhouse, 2004). Scholars see CA as both informing and informed by applied linguistics. For example, they point out that their research offers significant additions to “classic CA’s entrenched monolingualism” (Kasper & Wagner, 2014, p. 200) through studies of practices of multilingual interaction and multimodality.

Studies in this tradition may be self-identified as CA (e.g., Britt, 2011; Dings, 2014), or they may be distinguished by their use of CA transcription conventions and the systems developed by Sacks et al. (1974). Ethnomethodology, founded by Garfinkel (1967), is often cited as well since CA is derived from it. Topics of work using CA approaches vary from specific aspects of interaction such as alignment activity, membership categories, epistemic change, or code-switching to broader aspects of language such as pronunciation, humor, gesture, or narrative construction. Even broader are CA’s investigations of language learner development and interactional competence over time. Some recent work uses CA as part of a mixed methods design that also incorporates quantitative analysis of discourse features (e.g., Siegel, 2015; Taguchi, 2014).

Grounded Theory

Originating in the sociology of medicine in the mid-1960s (see Glaser & Strauss, 1967), grounded theory has become a widespread QR methodology across the applied social sciences. It offers “a set of general principles, guidelines, strategies, and heuristic devices” (Charmaz, 2014, p. 3) “for collecting and analyzing qualitative data” to “construct a theory ‘grounded’ in [researchers’] data” (p. 1). Grounded theorists share common strategies including open-ended, inductive inquiry; “[conducting] data collection and analysis simultaneously in an iterative process”; “[analyzing] actions and processes rather than themes and structure”; “[using] comparative methods”; “[developing] inductive abstract analytic categories”; “[engaging] in theoretical sampling”; and “[emphasizing] theory construction rather than description or application of current theories” (p. 26). Applied linguistics scholarship using grounded theory (e.g., Choi & Yi, 2012; Dorner & Layton, 2014; Valmori & De Costa, 2016) can often be distinguished by the use or mention of associated techniques or tenets such as open/thematic, axial, and selective coding, or cyclical process. Key methodological texts in this area include Glaser and Strauss (1967), Charmaz (2014), and Strauss and Corbin (1998).

Narrative Inquiry

Originating in a philosophical perspective that sees story as human beings’ most fundamental means of making sense of the world (see Connelly & Clandinin, 2006), narrative inquiry can refer to “research in which narratives, or stories, play a significant role” (Benson, 2014, p. 155). In applied linguistics, studies using narrative inquiry may go by a number of different names including life histories, language learning histories, language learning experience, diary studies, memoirs, language biographies, autobiographies, and autoethnography (Barkhuizen, 2014). Frequently used data sources for this method include “autobiographical records or reflection, published memoirs, written language learning histories, or interviews” (p. 156).

Narrative inquiry is most commonly used to document language learners’ and teachers’ development, practices, identity, agency, beliefs, emotion, positionality, and motivation (e.g., Baynham, 2011; Canagarajah, 2012; Casanave, 2012; Liu & Xu, 2011). It has been applied in a variety of contexts including informal and out-of-class learning, foreign language classrooms, study abroad, and refugee experiences. Ideally, narrative inquiry can capture what Benson (2014) calls “language learning careers or learners’ retrospective conceptions of how their language learning has developed over the longer term” (p. 165). Narrative inquiries in applied linguistics highlight that learning a language means acquiring and developing new identities, not only learning its forms and functions. Narrative research has been a major methodology for exploring language teachers’ and teacher candidates’ perceptions and experiences (e.g., Andrew, 2011; Zheng & Borg, 2014). The work of Connelly and Clandinin (2006) has been highly influential in this school of inquiry and is frequently cited, along with Kramp (2004), Benson (2014), and Lieblich, Tuval-Mashiach, and Zilber (1998).

Critical Discourse Analysis (CDA)

CDA seeks to discover ways in which “social structures of inequality are produced in and through language and discourse” (Lin, 2014, p. 214). In CDA, language is seen as “intrinsically ideological” (p. 215). Theorists posit that it “plays a key (albeit often invisible) role in naturalizing, normalizing, and thus masking, producing, and reproducing inequalities in society” (p. 215). CDA frameworks seek to theorize, to varying degrees, the relationship between microstructures of language and macrostructure of society, as mediated by social practices. CDA has often been used in applied linguistics to analyze public and media discourse and educational discourse. Studies typically do not use CDA as the sole methodology but in combination with other methodologies such as case study (e.g., Dorner & Layton, 2014). Frequently cited figures in CDA include Fairclough (2003) and Rogers (2011).

Action Research

Action research in applied linguistics is not distinguishable from other research approaches by its methodological features per se. Rather, what makes it distinctive is that research is conducted by teacher-researchers as a form of self-critical inquiry about their educational practice (Cochran-Smith & Lytle, 1999). Action research is situated in local educational practices, and its quality and rigor have been called into question. Nevertheless, Burns (2009) argues that action research develops reflective teachers who are committed to thinking professionally and, as a result, can identify conditions in their own classrooms that do not coincide with the theories presented to them in their teacher training. Thus, the research findings they discover are “far more attractive” and “immediate to [their] teaching situations” (p. 7). Action research benefits not only teacher-researchers, by empowering them as agents rather than receivers of theory, research, and policy, but also the profession and field of language teaching (Burns, 2009). Recent published work, for example, has explored teachers’ experience with professional development, documented their implementation of new teaching methods and curricula, and investigated learners’ development (e.g., Burke, 2012; Calvert & Sheen, 2015; Castellano, Mynard, & Rubesch, 2011). Methodological work frequently cited in this area includes Coghlan and Brannick (2010) and Burns (2009).

Other Qualitative Methodologies

A number of studies in applied linguistics do not specify a particular methodology but instead use the term “qualitative” more generically. Other studies are not explicitly qualitative but nonetheless use data gathering and reporting methods associated with qualitative methodologies such as interviews, discourse analysis, or inductive thematic or content analysis. Moreover, a growing number of studies combine multiple QR methodologies. Since various qualitative methodologies often share similar methodological features and a common purpose of reaching a holistic understanding of a particular phenomenon, this can be a pragmatic and productive combination. Nevertheless, underspecifying or combining QR methodologies should be approached with caution, since there are important differences and potential points of conflict in the epistemological and ontological assumptions underlying various methodologies.

Theoretical Frameworks

Some view QR in the social sciences as part of a move from personal issues toward public and social issues, such as social policies and social justice (Denzin & Lincoln, 2017). While QR in applied linguistics is often applied with no “clear sociopolitical agenda” (Lazaraton, 2003, p. 3), Benson (2013) notes that applied linguists who adopt sociocultural perspectives and theoretical frameworks “such as sociocultural theory, communities of practice, social realism, and ecological approaches” tend to gravitate toward qualitative methods (p. 3). A review of current QR in the field indicates that sociocultural theory and critical theory are two of the most frequent self-identified theoretical frameworks. Other widely used theoretical frameworks include language socialization (e.g., Kim & Duff, 2012), Deleuzian and Foucaultian poststructuralism (e.g., Waterhouse, 2012), constructivism (e.g., Wyatt, 2013), activity theory (e.g., Zhu & Mitchell, 2012), systemic functional linguistics (e.g., Dafouz & Hibler, 2013), Bakhtinian dialogism (e.g., Huang, 2014), and positioning theory (e.g., Pinnow & Chval, 2015). Nonetheless, it is important to note that many studies adopt particular constructs from theories without adopting associated theoretical frameworks wholesale.

Data Collection Methods

Commonly used methods for collecting data in QR in applied linguistics include interviews, observations, questionnaires, audio and video recordings of interaction, and collection of textual artifacts (Benson, 2013; Harklau, 2011). Each may be used alone or in combination (Benson, 2013).

Our review found that over two-thirds of studies used interviews. Interviews without any further specification or “semi-structured” interviews were most commonly mentioned. Interviews were also variously described as “cognitive,” “qualitative,” “open-ended,” “diary-based,” “informal,” “in-depth,” “extended,” “lengthy,” or “ethnographic,” or as “video-stimulated recall interviews” or “interview conversations.” Ten percent of these studies relied exclusively on interviews. Most studies used face-to-face interviews; phone interviews or other types of interviews were rare. Few studies utilized focus group interviews or conversations.

Observations were another common method, used in one-third of the studies we reviewed. The majority of studies used observation in combination with interviews. “Observation” without further specification, “participant observation,” and “classroom observation” were the most common descriptions used. “Ethnographic” and “video” observations were also used. Some studies mentioned accompanying field notes with observation, while others did not. Very few studies mentioned observation protocols or rubrics.

Approximately one quarter of the QR studies we reviewed used questionnaires and a few used surveys. Surveys frequently contained at least some open-ended questions. Audio and/or video recordings were also a common method, used in about ten percent of the studies reviewed. Recordings encompassed activities including interaction, reading, feedback sessions, group discussion, presentations, classes, think-aloud and stimulated recall, and segments of speech from various speakers. The duration of recordings was usually not specified.

We also found a myriad of written artifact data sources used in QR in applied linguistics. These included journals or reflections of language teachers or student teachers, language learner diaries/blogs/journals, oral and written language production samples, presentations and writing projects, essays, online interactions, dialogs, screen capture of text chat, email discussion, logs, emails, self-evaluation reports, evaluations of instruction, narratives, and drawings. Furthermore, data also included samples of classroom curriculum and instructional materials including worksheets, lesson plans, course syllabi and unit outlines, textbooks, newspaper and magazine articles, and web pages. School documents and websites were another type of written artifact, as were public documents including newspaper articles, public comments, websites, reports, and policies. Linguistic landscape studies drew on multimedia artifacts including photos and multilingual street signs. Lastly, researchers noted collecting their own texts as well, such as research logs/notes, research journals, analytical memos, and annotations of interviews.


Data Analysis Methods

Methods of data analysis and data reduction in recent QR reports in applied linguistics are often underspecified. As in Benson’s (2013) review, we found that only a handful of the studies we reviewed gave full accounts of the analysis process. The majority did not describe data analysis methods and procedures in detail, referring instead to established analytical methods. Most frequently these included techniques from grounded theory (e.g., open, axial, and selective coding and the constant comparative method), phenomenology, ethnography, qualitative content analysis, and ethically generated typological analysis. Researchers often used general descriptive terms such as interactive, iterative, inductive, and recursive. An increasing number indicated using software such as QSR NVivo, Transana, and ATLAS.ti.

A lack of specificity in describing data analysis has been a common weakness across the social sciences, with qualitative methodologists calling data analysis “the ‘black hole’ of qualitative research” (Lather, 1991, p. 149; St. Pierre & Jackson, 2014). They suggest that post-coding data analysis “occurs everywhere and all the time” (St. Pierre & Jackson, 2014, p. 717) rather than linearly and sequentially from data collection to analysis and to representation. They call for increased training for qualitative researchers in data analysis as thinking with theory (Jackson & Mazzei, 2012) rather than as a simple mechanical coding process that “continues to be mired in positivism” (St. Pierre & Jackson, 2014, p. 717). On the other hand, applied linguistics as a field may be at something of a methodological advantage over other branches of the social sciences when transcribing and analyzing verbatim texts because of its rigorous analysis of language and text (Benson, 2013).

Gaps, Challenges, and Controversial Issues

Despite the growth of QR in applied linguistics, there continue to be gaps, challenges, and controversies. For one thing, the predominance of studies of English acquisition in university settings in Western nations continues. While studies of adolescents have increased, QR studies of other populations including young learners, those who struggle with fragmented schooling, those with learning disabilities, and LGBT individuals remain rare. Studies of language teachers continue to focus primarily on pre-service educators rather than those already in the classroom. In spite of growing theoretical interest in topics such as community linguistic diversity and superdiversity, multilingualism, and language in the professions, studies on settings outside of formal education continue to be relatively rare and are greatly needed. Moreover, while QR in applied linguistics has made incursions internationally, QR methodologies tend not to address multilingual and multicultural scholarship and rely almost exclusively on Western philosophical and research traditions. The logistics of doing QR in cultural and sociopolitical contexts outside of mainstream Western contexts are not often considered (Flick, 2014), and few studies consider the ethics of doing QR in a particular sociocultural context (but see De Costa, 2014).

Limitations and Future Directions

While this review offers a portrait of the field, because of the sheer diversity of QR approaches taken as well as the tendency for applied linguistics researchers to underspecify QR methodologies (Benson et al., 2009), no single review can be exhaustive. While we have shown that QR methods have become a routine and well-accepted mode of inquiry in contemporary applied linguistics research, appearing regularly in peer-reviewed journals, books, and book chapters, we have also noted that QR remains somewhat limited in both topic coverage and publication venues. Applied linguistics is not always about second language acquisition (SLA) or multilingualism in Western contexts. Despite the diversity and prosperity of QR in the field, there are still significant gaps and unexamined languages, populations, and regions. Turning to these areas could in turn lead future applied linguistics researchers to develop innovative QR methods. Moreover, there remain clear differences among journals in how receptive they are to publishing QR reports, and thus a great deal of work needs to be done to promote understanding of the value of QR among publishers as well as policy makers and research funders in the field, who still place a higher value on quantitative research methods. Finally, this review has noted a lack of specificity in descriptions of methodology and data analysis. In order to ensure greater understanding and acceptance of QR in applied linguistics, research reports need to provide fuller accounts of research design.

Resources for Further Reading

Benson, P. (2013). Qualitative methods: Overview. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1–10). Chichester: Wiley-Blackwell.
This chapter provides a brief overview of qualitative methods in applied linguistics encompassing historical background, qualitative research in applied linguistics journals, approaches to qualitative research, data collection methods, data analysis methods, issues of quality, research areas, frameworks, and themes.

De Costa, P. I., Valmori, L., & Choi, I. (2017). Qualitative research methods. In S. Loewen & M. Sato (Eds.), The Routledge handbook of instructed second language acquisition (pp. 522–540). New York: Routledge.
This chapter discusses current issues in qualitative research methods in contemporary instructed second language acquisition (ISLA) research. By drawing on sample studies, it explores five issues: understanding and establishing rigor in qualitative research, articulating the discourse analytic approach, exploring researcher reflexivity and ethics, mixing methods, and unconventional blending of theories with different methodologies.

Harklau, L. (2011). Approaches and methods in recent qualitative research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 175–189). New York: Routledge.
This chapter offers a profile of recent trends in qualitative research on second language teaching and learning since 2003 and discusses methods of data collection, methodological frameworks, and future directions.

Richards, K. (2009). Trends in qualitative research in language teaching since 2000. Language Teaching, 42(2), 147–180.
This article provides a review of the developments in qualitative research in language teaching since the year 2000 with a particular focus on its contributions to the field and emerging issues.

References

Andrew, M. (2011). “Like a newborn baby”: Using journals to record changing identities beyond the classroom. TESL Canada Journal, 29(1), 57–76.
Atkinson, P. (2015). For ethnography. Los Angeles, CA: SAGE.
Barkhuizen, G. (2014). Narrative research in language teaching and learning. Language Teaching, 47(4), 450–466.
Baynham, M. (2011). Stance, positioning, and alignment in narratives of professional experience. Language in Society, 40, 63–74.
Benson, P. (2013). Qualitative methods: Overview. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1–10). Chichester: Wiley-Blackwell.
Benson, P. (2014). Narrative inquiry in applied linguistics research. Annual Review of Applied Linguistics, 34, 154–170.
Benson, P., Chik, A., Gao, X., Huang, J., & Wang, W. (2009). Qualitative research in language teaching and learning journals, 1997–2006. Modern Language Journal, 93(1), 79–90.
Bidabadi, F. S., & Yamat, H. (2014). Strategies employed by Iranian EFL freshman university students in extensive listening: A qualitative research. International Journal of Qualitative Studies in Education, 27(1), 23–41.
Blommaert, J. (2013). Ethnography, superdiversity and linguistic landscapes: Chronicles of complexity. Bristol: Multilingual Matters.
Britt, E. (2011). “Can the church say amen”: Strategic uses of black preaching style at the State of the Black Union. Language in Society, 40, 211–233.
Brooks, M. D. (2015). Notes and talk: An examination of a long-term English learner reading-to-learn in a high school biology classroom. Language and Education, 30(3), 235–251.
Burke, B. M. (2012). Experiential professional development: Improving world language pedagogy inside Spanish classrooms. Hispania, 95(4), 714–733.
Burns, A. (2009). Doing action research in English language teaching: A guide for practitioners. New York, NY: Routledge.
Calvert, M., & Sheen, Y. (2015). Task-based language learning and teaching: An action-research study. Language Teaching Research, 19(2), 226–244.
Canagarajah, A. S. (2012). Teacher development in a global profession: An autoethnography. TESOL Quarterly, 46(2), 258–279.
Casanave, C. P. (2012). Diary of a dabbler: Ecological influences on an EFL teacher’s efforts to study Japanese informally. TESOL Quarterly, 46(4), 642–670.
Castellano, J., Mynard, J., & Rubesch, T. (2011). Student technology use in a self-access center. Language Learning & Technology, 15(3), 12–27.
Chamberlin-Quinlisk, C. (2012). Critical media analysis in teacher education: Exploring language-learners’ identity through mediated images of a non-native speaker of English. TESL Canada Journal, 29(2), 42–57.
Chapelle, C. A., & Duff, P. A. (2003). Some guidelines for conducting quantitative and qualitative research in TESOL. TESOL Quarterly, 37(1), 147–178.
Charmaz, K. (2014). Constructing grounded theory. Thousand Oaks, CA: SAGE.
Cheng, X. (2016). A narrative inquiry of identity formation of EFL university teachers. Journal of Education and Training Studies, 4(5), 1–7.
Choi, J., & Yi, Y. (2012). The use and role of pop culture in heritage language learning: A study of advanced learners of Korean. Foreign Language Annals, 45(1), 110–129.
Cochran-Smith, M., & Lytle, S. L. (1999). The teacher research movement: A decade later. Educational Researcher, 28(7), 15–25.
Coghlan, D., & Brannick, T. (2010). Doing action research in your own organization (3rd ed.). London: SAGE.
Connelly, F. M., & Clandinin, D. J. (2006). Narrative inquiry. In J. L. Green, G. Camilli, & P. Elmore (Eds.), Handbook of complementary methods in education research (3rd ed., pp. 477–487). Mahwah, NJ: Lawrence Erlbaum.
Cramér-Wolrath, E. (2015). Mediating native Swedish sign language: First language in gestural modality interactions at storytime. Sign Language Studies, 15(3), 266–295.
Creswell, J. W., & Poth, C. N. (2017). Qualitative inquiry and research design: Choosing among five approaches (4th ed.). Thousand Oaks, CA: SAGE.
Croker, R. A., & Heigham, J. (2009). Qualitative research in applied linguistics: A practical introduction. New York: Palgrave Macmillan.
Crookes, G. (2013). Epistemology and ontology. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1–8). Chichester: Wiley-Blackwell.
Crotty, M. (2009). The foundations of social research: Meaning and perspective in the research process. Thousand Oaks, CA: SAGE.
Dafouz, E., & Hibler, A. (2013). ‘Zip your lips’ or ‘Keep quiet’: Main teachers’ and language assistants’ classroom discourse in CLIL settings. Modern Language Journal, 97(3), 655–669.
De Costa, P. I. (2014). Making ethical decisions in an ethnographic study. TESOL Quarterly, 48(2), 413–422.
De Costa, P. I. (2016). The power of identity and ideology in language learning: Designer immigrants learning English in Singapore. Dordrecht: Springer.
De Fina, A. (2012). Family interaction and engagement with the heritage language: A case study. Multilingual, 31(4), 349–379.
Denzin, N. K., & Lincoln, Y. S. (2003). The landscape of qualitative research. Thousand Oaks, CA: SAGE.
Denzin, N. K., & Lincoln, Y. S. (2017). The SAGE handbook of qualitative research (5th ed.). Thousand Oaks, CA: SAGE.
DeWalt, K. M., & DeWalt, B. R. (2002). Participant observation: A guide for fieldworkers. Walnut Creek, CA: AltaMira Press.
Dings, A. (2014). Interactional competence and the development of alignment activity. Modern Language Journal, 98(3), 742–756.
Dorner, L. M. (2011). Contested communities in a debate over dual-language education: The import of “Public” values on public policies. Educational Policy, 25(4), 577–613.
Dorner, L. M., & Layton, A. (2014). “¿Cómo se dice?” Children’s multilingual discourses (or interacting, representing, and being) in a first-grade Spanish immersion classroom. Linguistics and Education, 25, 24–39.
Duff, P. A. (2008). Case study research in applied linguistics. New York, NY: Lawrence Erlbaum.
Duff, P. A. (2014). Case study research on language learning and use. Annual Review of Applied Linguistics, 34, 233–255.
Emerson, R. M., Fretz, R. I., & Shaw, L. L. (2011). Writing ethnographic fieldnotes (2nd ed.). Chicago: University of Chicago Press.
Erickson, F. (1992). Ethnographic microanalysis of interaction. In M. D. LeCompte, W. L. Millroy, & J. Preissle (Eds.), The handbook of qualitative research in education (pp. 202–225). New York: Academic Press.
Fairclough, N. (2003). Analysing discourse: Textual analysis for social research. New York: Routledge.
Fetterman, D. (1998). Ethnography: Step by step (2nd ed.). Thousand Oaks, CA: SAGE.
Flick, U. (2014). Challenges for qualitative inquiry as a global endeavor: Introduction to the special issue introduction. Qualitative Inquiry, 20(9), 1059–1063.
Garfinkel, H. (1967). Studies in ethnomethodology. Englewood Cliffs, NJ: Prentice-Hall.
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. New York: Aldine de Gruyter.
Harklau, L. (2005). Ethnography and ethnographic research on second language teaching and learning. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 179–194). Mahwah, NJ: Lawrence Erlbaum Associates.
Harklau, L. (2008). Developing qualitative longitudinal case studies of advanced language learners. In L. Ortega & H. Byrnes (Eds.), The longitudinal study of advanced L2 capacities (pp. 23–35). New York: Routledge.
Harklau, L. (2011). Approaches and methods in recent qualitative research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 175–189). New York: Routledge.
Holliday, A. R. (2013). Validity in qualitative research. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 1–7). Chichester: Wiley-Blackwell.
Hood, M. (2009). Case study. In R. A. Croker & J. Heigham (Eds.), Qualitative research in applied linguistics: A practical introduction (pp. 66–90).
New York: Palgrave Macmillan. Hornberger, N., & Link, H. (2012). Translanguaging and transnational literacies in multilingual classrooms: A biliteracy lens. International Journal of Bilingual Education and Bilingualism, 15(3), 261–278. Huang, I.-C. (2014). Contextualizing teacher identity of non-native-English speakers in U.S. secondary ESL classrooms: A Bakhtinian perspective. Linguistics and Education, 25, 119–128. Hutchby, I., & Wooffitt, R. (2008). Conversation analysis (2nd ed.). Cambridge: Polity. Jackson, A.Y., & Mazzei, L.A. (2012). Thinking with theory in qualitative research: Viewing data across multiple perspectives. London: Routledge. Jonsson, C. (2013). Translanguaging and multilingual literacies: Diary-based case studies of adolescents in an international school. International Journal of the Sociology of Language, 224, 85–117.


S. Lew et al.

Judge, J.W. (2012). Use of language learning strategies by Spanish adults for Business English. International Journal of English Studies, 12(1), 37–54. Kasper, G., & Wagner, J.(2014). Conversation analysis in applied linguistics. Annual Review of Applied Linguistics, 34, 171–212. Kavanagh, L., & Hickey, T.M. (2013). ‘You’re looking at this different language and it freezes you out straight away’: Identifying challenges to parental involvement among immersion parents. Language and Education, 27(5), 432–450. Kayi-Aydar, H. (2011). Re-exploring the knowledge base of language teaching: Four ESL Teachers’ classroom practices and perspectives. TESL Reporter, 44(1–2), 25–41. Kim, H.J. (2015). A qualitative analysis of rater behavior on an L2 speaking assessment. Language Assessment Quarterly, 12(3), 239–261. Kim, J., & Duff, P.A. (2012). The language socialization and identity negotiations of generation 1.5 Korean-Canadian university students. TESL Canada Journal, 29, 81. Kramp, M.K. (2004). Exploring life and experience through narrative inquiry. In K. deMarrais & S.D. Lapan (Eds.), Foundations for research: Methods of inquiry in education and the social sciences (pp.103–121). Mahwah, NJ: Erlbaum. Lather, P. (1991). Getting smart: Feminist research and pedagogy with/in the postmodern. NewYork: Routledge. Lazaraton, A. (2003). Evaluative criteria for qualitative research in applied linguistics: Whose criteria and whose research? Modern Language Journal, 87(1), 1–12. LeCompte, M. D., & Schensul, J. J. (1999). Designing & conducting ethnographic research. Walnut Creek, CA: AltaMira. Leeman, J., Rabin, L., & Román-Mendoza, E. (2011). Identity and activism in heritage language education. Modern Language Journal, 95(4), 481–495. Lieblich, A., Tuval-Mashiach, R., & Zilber, T. (1998). Narrative research: Reading, analysis and interpretation. Thousand Oaks, CA: Sage. Lin, A. (2014). Critical discourse analysis in applied linguistics: A methodological review. 
Annual Review of Applied Linguistics, 34, 213–232. Lincoln, Y.S., & Guba, E.G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage. Liu, Y., & Xu, Y. (2011). Inclusion or exclusion? A narrative inquiry of a language teacher’s identity experience in the ‘new work order’ of competing pedagogies. Teaching and Teacher Education, 27(3), 589–597. Macintyre, P.D., Burns, C., & Jessome, A. (2011). Ambivalence about communicating in a second language: A qualitative study of French immersion students’ willingness to communicate. Modern Language Journal, 95(1), 81–96. Mahboob, A., Paltridge, B., Phakiti, A., Wagner, E., Starfield, S., Burns, A., … De Costa, P. I. (2016). TESOL Quarterly research guidelines. TESOL Quarterly, 50(1), 42–65. Malsbary, C.B. (2014). “It’s not just learning English, it’s learning other cultures”: Belonging, power, and possibility in an immigrant contact zone. International Journal of Qualitative Studies in Education, 27(10), 1312–1336.

  Qualitative Methodology 


Marianne. (2011). Reading books in class: What “just to read a book” can mean. Reading in a Foreign Language, 23(1), 17–41. Markee, N. (2000). Conversation analysis. Mahwah, NJ: Erlbaum. Markee, N. (2013). Emic and ethic in qualitative research. In C.A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp.1–4). Chichester: Wiley-Blackwell. Maxwell, J. A. (2010). Using numbers in qualitative research. Qualitative Inquiry, 16(6), 475–482. McCarty, T.L. (Ed.). (2014). Ethnography and language policy. NewYork: Routledge. Merriam, S.B. (2009). Qualitative research: A guide to design and implementation. San Francisco: Jossey-Bass. Nero, S. (2015). Language, identity, and insider/outsider positionality in Caribbean Creole English research. Applied Linguistics Review, 6(3), 341–368. Nuske, K. (2015). Transformation and stasis: Two case studies of critical teacher education in TESOL. Critical Inquiry in Language Studies, 12(4), 283–312. Patton, M.Q. (2014). Qualitative research and evaluation methods (4th ed.). Thousand Oaks, CA: SAGE. Perry, K.H. (2011). Ethics, vulnerability, and speakers of other languages: How university IRBs (do not) speak to research involving refugee participants. Qualitative Inquiry, 17(10), 899–912. Pinnow, R.J., & Chval, K.B. (2015). “How much You wanna bet?”: Examining the role of positioning in the development of L2 learner interactional competencies in the content classroom. Linguistics and Education, 30, 1–11. Richards, K. (2003). Qualitative inquiry in TESOL. NewYork: Palgrave Macmillan. Richards, K. (2009). Trends in qualitative research in language teaching since 2000. Language Teaching, 42(2), 147–180. Richards, K. (2011). Case studies. In E.Hinkel (Ed.), Handbook of research in second language teaching and learning (Vol. 2, pp.207–221). London: Routledge. Rogers, R. (Ed.). (2011). An introduction to critical discourse analysis in education. NewYork: Routledge. Rudwick, S., & Parmegiani, A. (2013). 
Divided loyalties: Zulu vis-à-vis English at the University of KwaZulu-Natal. Language Matters, 44(3), 89–107. Sacks, H., Schegloff, E.A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50, 696–735. Schwandt, T. (2001). Dictionary of qualitative inquiry. Thousand Oaks, CA: Sage. Sealey, A. (2013). Realism. In C.A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp.1–6). Chichester: Wiley-Blackwell. Seedhouse, P. (2004). The interactional architecture of the language classroom: A conversation analysis perspective. Malden, MA: Blackwell. Shin, D.S. (2014). Web 2.0 tools and academic literacy development in a US urban school: A case study of a second-grade English language learner. Language and Education, 28(1), 68–85.


S. Lew et al.

Shirvan, M., Rahmani, S., & Sorayyaee, L. (2016). On the exploration of the ecology of English language teachers’ personal styles in Iran. Asian-Pacific Journal of Second & Foreign Language Education, 1(1), 12–28. Siegel, A. (2015). Social epistemics for analyzing longitudinal language learner development. International Journal of Applied Linguistics, 25(1), 83–104. Spradley, J.P. (1980). Participant observation. NewYork: Holt, Rinehart and Winston. St. Pierre, E. A., & Jackson, A. Y. (2014). Qualitative data analysis after coding. Qualitative Inquiry, 20(6), 715–719. Stake, R.E. (2000). Case studies. In N.K. Denzin & Y.S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp.435–454). Thousand Oaks, CA: SAGE. Starfield, S. (2013). Researcher reflexivity. In C.A. Chapelle (Ed.), The encyclopedia of applied linguistics (pp.1–7). Chichester: Wiley-Blackwell. Strauss, A.L., & Corbin, J.M. (1998). Basics of qualitative research: Techniques and procedures for developing grounded theory. Thousand Oaks, CA: SAGE. Taguchi, N. (2014). Development of interactional competence in Japanese as a second language: Use of incomplete sentences as interactional resources. Modern Language Journal, 98(2), 518–535. Takeuchi, M. (2015). The situated multiliteracies approach to classroom participation: English language learners’ participation in classroom mathematics practices. Journal of Language, Identity & Education, 14(3), 159–178. Tamim, T. (2014). The politics of languages in education: Issues of access, social participation and inequality in the multilingual context of Pakistan. British Educational Research Journal, 40(2), 280–299. Tracy, S. (2010). Qualitative quality: Eight “Big-Tent” criteria for excellent qualitative research. Qualitative Inquiry, 16(10), 837–851. Valmori, L., & De Costa, P.I. (2016). How do foreign language teachers maintain their proficiency?: A grounded theory approach. System, 57(1), 98–108. Warriner, D.S. (2013). 
“It’s better life here than there”: Elasticity and ambivalence in narratives of personal experience. International Multilingual Research Journal, 7(1), 15–32. Waterhouse, M. (2012). ‘We don’t believe media anymore’: Mapping critical literacies in an adult immigrant language classroom. Discourse: Studies in the Cultural Politics of Education, 33(1), 129–146. Whyte, S. (2011). Learning to teach with videoconferencing in primary foreign language classrooms. ReCALL, 23(3), 271–293. Wyatt, M. (2013). Overcoming low self-efficacy beliefs in teaching English to young learners. International Journal of Qualitative Studies in Education, 26(2), 238–255. Yin, R.K. (2014). Case study research: Design and methods (5th ed.). Thousand Oaks, CA: Sage. Zabrodskaja, A. (2014). Tallinn: Monolingual from above and multilingual from below. International Journal of the Sociology of Language, 2014(228), 105–130.

  Qualitative Methodology 


Zhang, H. (2013). Pedagogical challenges of spoken English learning in the Second Life virtual world: A case study. British Journal of Educational Technology, 44(2), 243–254. Zheng, X., & Borg, S. (2014). Task-based learning and teaching in China: Secondary school teachers’ beliefs and practices. Language Teaching Research, 18(2), 205–221. Zhu, W., & Mitchell, D.A. (2012). Participation in peer response as activity: An examination of peer response stances from an activity theory perspective. TESOL Quarterly, 46(2), 362–386.

5 Mixed Methodology
Alison Mackey and Lara Bryfonski

A. Mackey • L. Bryfonski, Georgetown University, Washington, DC, USA

Introduction

The field of applied linguistics research has recently benefited from a surge of methodological innovations, accompanied by significant discussions about the productive combination of methods (Dörnyei, 2007; Hashemi & Babaii, 2013; Mackey, 2015; Moeller, Creswell, & Saville, 2015). While the field of applied linguistics has unquestionably advanced through both tightly controlled laboratory studies (e.g., Johnson & Newport, 1989) and deep ethnographic work (e.g., Cekaite, 2007), it is clear that neither experimental methods nor qualitative research alone can account for many of the questions we explore in our field. The addition of qualitative methods to traditionally quantitative domains can provide quantitative research with a more authentic lens through which to view language processes in context. We argue, like several others (e.g., Hashemi & Babaii, 2013), that applied linguistics research that approaches questions of language with a range of methods in mind is important as we advance our understanding of how languages are learned, taught, and used in a variety of contexts and domains.

Mixed methods research (MMR), also known as multi-method or multi-methodology, employs aspects of both quantitative and qualitative methods and designs to better understand a given phenomenon. In the current chapter, we aim to explain how to utilize quantitative and qualitative data to complement one another to shed light on important questions in our field. We first define and provide a historical overview of mixed methods research in the field of applied linguistics. Next, we describe a variety of designs of mixed methods research along with practical advice on three main stages of mixed methods research: planning, implementation, and analysis. The third section explores some challenging ideas in this innovative methodological domain. We finish with an examination of limitations along with some proposals for future directions in mixed methodology research.

Definitions

Mixed methods research, also known as multi-method, combined methods, mixed research, or triangulation (Mackey & Gass, 2015), is a strategy of inquiry that allows the researcher to explore a research question from multiple angles, potentially avoiding the limitations inherent in using one approach, quantitative or qualitative, independently. Quantitative methodology is traditionally characterized by carefully controlled experimental design and random assignment, while qualitative methodology is characterized by grounded theory, case studies, and detailed description. Quantitative research aims for reliable and replicable design with outcomes that can be generalized across a population. Qualitative research focuses on processes rather than outcomes in order to understand a problem deeply and thoroughly (Mackey & Gass, 2015). Researchers implementing mixed methods approaches generally acknowledge that there can be biases inherent in utilizing only one of these methods and attempt to address these biases by combining both strategies.

Mixed methods research has been a prevalent methodology in the sciences for many years (see, e.g., Creswell, 2015), yet it has only been clearly defined as mixed methods in the applied linguistics literature in the past decade (as noted by Hashemi & Babaii, 2013, among others). Over this time, mixed methods research has been defined and described in many ways. Mixed methodology is sometimes referred to as the integration of both quantitative and qualitative methods within a single study. In applied linguistics research, mixed methods are most commonly described as the “triangulation” of multiple sources or methods (Mackey & Gass, 2015); however, some researchers (e.g., Creswell, 2003) define triangulation as a specific subtype of mixed methods design (see the types of mixed methods studies below).
Mixed methodology has also been defined as “the concept of mixing different methods” and “collecting and analyzing both forms of data in a single study” (Creswell, 2003, p. 15). According to Tashakkori and Creswell (2007), mixed methodology can be described as “research in which the investigator collects and analyzes data, integrates the findings, and draws inferences using both qualitative and quantitative approaches in a single study or program of inquiry” (p. 4).

Current Issues

The Qualitative-Quantitative Dichotomy

The idea of a dichotomy between quantitative and qualitative research methods has been routinely debated in the field, often dividing researchers across epistemological stances as well as methodological approaches (Riazi, 2016; Trochim, 2006). More recently, however, the field has been adopting and adapting mixed methodologies as one of the approaches for research into how languages are learned (King & Mackey, 2016; Riazi & Candlin, 2014), and entire volumes have been dedicated to mixed methods approaches in applied linguistics research (Riazi, 2017). Recent issues of the journal Studies in Second Language Acquisition (35(3), 2014), on bridging approaches to SLA research, and the 100th anniversary edition of The Modern Language Journal (100(S1), 2016), which included an introduction to “layered” approaches to theories and methods of SLA, have addressed the issue of methodology in applied linguistics research. In the introduction to the 100th anniversary issue, King and Mackey (2016) argue that “taking such a cross-field, collaborative perspective is essential for us to fully and adequately address pressing, unresolved problems on both academic and practical fronts” (p. 210). They cite mixed methods approaches as one means by which to address challenges facing applied linguistics researchers and language practitioners, and they argue that researchers should not only blend methodologies but also layer epistemological perspectives. In this way, researchers can utilize multiple methods and stances to take into account both the cognitive and the social factors present in a variety of language processes.
Introspective measures such as think-aloud and stimulated recall protocols (see Gass & Mackey, 2016), when used in combination with in-depth interviews and naturalistic observations and supplemented by quantitative data from new technologies such as eye trackers (see Smith, 2012), are one of many ways to layer perspectives and methods. Alternatively, quantitative data can be utilized to supplement qualitative data, for example, when little is known about a given research area prior to investigation. This is common in classroom action research (see Loewen & Philp, 2012, for examples of classroom action research), where an instructor first observes students prior to collecting quantitative data such as exam scores.

Why Choose a Mixed Methodology Design?

Mixed methods designs enable researchers to investigate the same phenomenon from multiple angles. When deciding whether or not to implement a mixed methods design, a researcher must first consider the research question(s) and then ask what kinds of data collection, analyses, and interpretations will best enable them to answer those questions. There is frequently no single “right” answer to the question of which method to use, and mixing methods allows the researcher the freedom not to choose. Depending on the type of design chosen, the blending of multiple methods can enable researchers to explore and validate under-researched phenomena, shed light on difficult-to-interpret findings, address a problem from multiple theoretical perspectives, or simply conceptualize a problem more deeply and thoroughly.

Types of Mixed Methods Studies

Although mixed methods designs are typically categorized as either concurrent or sequential depending on the order in which data collection takes place (Creswell, 2003), there is no universally accepted way of designing and implementing a mixed methods study. Many varieties of mixed methods utilize multiple ways of incorporating quantitative and qualitative data. In mixed methods designs, the results from one type of inquiry (more qualitative or more quantitative or primarily descriptive) can be used to inform or understand the results from another method of inquiry. Methods might be integrated at every stage of the design process or might follow each other in a predetermined sequence (see Table 5.1 for an overview). In any case, it is helpful for applied linguistics researchers to familiarize themselves with the wide range of possible mixed methodology designs before selecting the type that will be most appropriate for their research project since, clearly, some design types lend themselves more readily to applied linguistics research questions than others. In this section, we define and describe several common designs utilized in mixed methods research, using examples from studies in the field of applied linguistics to illustrate each design type.

Table 5.1  Common types of mixed methods designs

Concurrent designs: Quantitative and qualitative data collected simultaneously. Collective interpretation of findings
• Triangulation: All data collected in one phase and weighted equally in analysis
• Concurrent embedded: All data collected in one phase, but not weighted equally in analysis

Sequential designs: Quantitative and qualitative data collected via multiple phases of research implementation
• Explanatory: Quantitative data are collected first. Qualitative data are collected as a follow-up
• Exploratory: Qualitative data are collected first. Quantitative data are collected as a follow-up
• Sequential embedded: Quantitative and qualitative data are collected in multiple iterations to inform and follow up on one another

Concurrent Designs

In concurrent designs, the researcher collects both quantitative and qualitative data at the same time and uses the data collectively as a means to interpret the findings of the investigation (Creswell, 2003). Concurrent designs can be further broken down into triangulation designs and embedded concurrent designs. Triangulation designs were one of the first strategies utilized in mixed methodology research (Creswell, 2003; Jick, 1979). In this type of design, all the data are collected in one phase and receive equal weight during analysis. This allows the researcher to “compare and contrast quantitative statistical results with qualitative findings or validate or explain quantitative results with qualitative data” (Creswell & Plano Clark, 2011, p. 62). For example, a questionnaire that aims to measure a learner’s contact with the target language might include both quantitative Likert-type questions asking how many hours per week they speak the target language and open-ended questions that elicit more qualitative data, such as the domains or tasks in which the learner interacts in the language (e.g., “Who do you use the language with outside of class?”). Since both types of data are collected within the same survey (i.e., concurrently), the results are triangulated from both qualitative and quantitative data (see Sample Study 5.1 for an example).



Sample Study 5.1

Lee, E. (2016). Reducing international graduate students’ language anxiety through oral pronunciation corrections. System, 56(1), 78–95.

Research Background
This mixed methodology study investigates the link between the type of oral corrective feedback used and language anxiety level.

Research Problems
The research addressed two main questions: What patterns of corrective feedback and learner repair occur in advanced-level adult ESL classrooms? How does oral corrective feedback affect students’ anxiety about speaking English?

Research Method
• Type of research: Concurrent triangulation
• Setting and participants: Sixty advanced ESL university students from a variety of language backgrounds participating in an international teaching assistant training program took part in the study.
• Instruments/techniques: Both quantitative data obtained from a pre- and post-survey and classroom observations as well as qualitative data from follow-up interviews and open-ended survey questions were collected.
• Data analysis: Classroom instruction was recorded for one month and corrective feedback moves were coded by type and by the presence or absence of learner repairs. Pre- and post-surveys collected data on learners’ self-perceptions of anxiety, attitude, motivation, and self-confidence. Post-instruction interviews asked follow-up questions about their affective stances toward corrective feedback.

Key Results
Findings from this study drew upon data collected by both quantitative and qualitative means. Quantitatively, learners’ reports of anxiety decreased from pre- to posttest. Qualitative data revealed that learners’ reactions varied based on the types of feedback they received. Specifically, clarification requests were associated with increases in students’ language learning anxiety.

Comments
The concurrent triangulation of quantitative and qualitative methods shed light on how corrective feedback can affect learners’ emotional states even at advanced levels of acquisition.

Concurrent embedded designs are similar to triangulated designs in that both qualitative and quantitative data are collected simultaneously. However, one type of data is embedded, generally meaning that it plays a secondary role to the other data, rather than both being weighted equally. In most research of this type, the focus of the design is quantitative, with qualitative data collection added to support or better interpret the quantitative findings (Creswell & Plano Clark, 2011). Embedded designs are primarily utilized when the researcher is interested in measuring the impact of an intervention as well as understanding the experience of the intervention (Mackey & Gass, 2015). For example, an SLA researcher interested in the effects of task type on learners’ noticing of corrective feedback provided by their instructor might have students fill out uptake sheets while performing various tasks. These uptake sheets (see Mackey, 2006, for an example) elicit quantitative data on the number of times the learner noticed the provision of feedback, but also more qualitative data on what stage of the task they were in, what kinds of feedback they were provided, or even, introspectively, how the feedback affected their emotional state. In this example, the primary data would be a comparison of the quantitative data with task type; however, these data would be supported and strengthened when combined with the qualitative data giving insights into the learners’ responses to the intervention.

Sequential Designs

In contrast to concurrent designs, sequential designs involve the collection of data in multiple phases or in a set sequence. Creswell and Plano Clark (2011) break down sequential designs into three categories: explanatory, exploratory, and sequential embedded.

Explanatory Design

Explanatory designs focus on quantitative data but use qualitative follow-up data to explain quantitative results. Researchers might use an explanatory design to better understand quantitative results, especially if those results were not expected. For example, if the outcomes of a task-based intervention showed no changes in the fluency or complexity of learners’ speech in the target language, a stimulated recall interview conducted after the intervention might elucidate some of the reasons why no changes occurred. Learners might, for instance, not have understood the task instructions or might not have been motivated to complete the task. Without exploring what goes on during a treatment, researchers may lose some critical context. Qualitative data collection can fill this gap (see Sample Study 5.2 for an example of qualitative analyses used to follow up on quantitative findings). Sometimes researchers will follow up with qualitative data collection on selected participants whose results are statistical outliers to better understand why those learners did very well or very poorly in comparison to the other participants. However, in all cases of explanatory designs, the results from the quantitative phase are typically emphasized over the supplementary qualitative data (Mackey & Gass, 2015).
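The selection of outlying participants for a qualitative follow-up can be sketched in a few lines of Python. This is a minimal illustration only, assuming that "outlier" means a score more than two sample standard deviations from the group mean; the participant IDs, the scores, the function name `select_outliers`, and the threshold are all hypothetical and do not come from any study cited in this chapter.

```python
from statistics import mean, stdev

def select_outliers(scores, threshold=2.0):
    """Return the IDs of participants whose score lies more than
    `threshold` sample standard deviations from the group mean."""
    values = list(scores.values())
    group_mean, group_sd = mean(values), stdev(values)
    if group_sd == 0:
        return []  # every score is identical, so no one stands out
    return sorted(
        pid for pid, score in scores.items()
        if abs(score - group_mean) / group_sd > threshold
    )

# Hypothetical posttest fluency scores from the quantitative phase
posttest = {
    "P01": 52, "P02": 55, "P03": 49, "P04": 51, "P05": 53,
    "P06": 50, "P07": 54, "P08": 48, "P09": 90, "P10": 52,
}

# Participants to invite to a stimulated recall interview
follow_up = select_outliers(posttest)
print(follow_up)
```

In practice the cutoff is a judgment call: with small samples a z-score can never become very large, so a researcher might prefer to inspect the score distribution directly or to select the highest and lowest scorers by rank.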



Sample Study 5.2

Demmen, J., Semino, E., Demjen, Z., Koller, V., Hardie, A., Rayson, P., & Payne, S. (2015). A computer-assisted study of the use of violence metaphors for cancer and end of life by patients, family carers and health professionals. International Journal of Corpus Linguistics, 20(2), 205–231.

Research Background
The authors of this study utilize a combination of corpus linguistics techniques and qualitative thematic analyses in their in-depth investigation of how metaphors of violence are utilized by patients, healthcare professionals, and family carers in discussing cancer and end of life.

Research Gaps
The study uses corpus linguistics in order to shed light on the use of violence metaphors in communication about illness and healthcare in a more rigorous and systematic manner than was the case in previous research.

Research Method
• Type of research: Sequential explanatory
• Setting and participants: The study drew on a 1.5-million-word corpus constructed from semi-structured interviews with hospice healthcare professionals, patients diagnosed with terminal cancer, and unpaid family carers looking after family members with terminal cancer. The corpus was primarily made up of data from online forums where healthcare professionals, patients, and family members participated.
• Instruments/techniques: Data were compiled into a corpus for further analysis.
• Data analysis: First, a manual analysis of a sample from the corpus was used to identify and tag all metaphorical expressions utilized to refer to cancer by participants. The results were used to inform a computer-aided quantitative analysis of the whole corpus to examine the distribution of violence metaphors by stakeholders and by context (interview or online forum) and to create lists of the most frequently used violence metaphors by the various stakeholders. The resulting lists were then used in a qualitative exploration of the different ways in which members from the stakeholder groups used violence metaphors, as well as what aspects of the stakeholders’ experiences were highlighted through the use of violence metaphors. This qualitative analysis, in addition to the quantitative descriptive corpus analysis, provided a deeper understanding of when and why these violence metaphors were used in the data collected.

Key Results
Findings indicated that patients, carers, and professionals utilize a wider array of violence metaphors than previously identified. Results also showed how metaphor use varied between contexts of interaction (online forums vs. interviews) and by stakeholder groups.

Comments
This study provides a unique example of how multiple methods (quantitative followed by qualitative) as well as multiple linguistic analysis techniques (corpus linguistics and qualitative thematic analysis) can shed light on complex questions in applied linguistics.
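The quantitative step described in Sample Study 5.2, counting tagged metaphors by stakeholder group and context, can be illustrated with a short Python sketch. It is not the procedure or software used by Demmen et al.; the records, the metaphor lemmas, and the function name `frequency_lists` are invented for illustration, and the sketch assumes the metaphorical expressions have already been identified and tagged by hand.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged records: (stakeholder group, context, metaphor lemma)
tagged = [
    ("patient", "forum", "fight"),
    ("patient", "forum", "battle"),
    ("patient", "interview", "fight"),
    ("carer", "forum", "fight"),
    ("carer", "forum", "struggle"),
    ("professional", "interview", "battle"),
    ("patient", "forum", "fight"),
]

def frequency_lists(records):
    """Count metaphor lemmas separately for each (group, context) pair."""
    counts = defaultdict(Counter)
    for group, context, lemma in records:
        counts[(group, context)][lemma] += 1
    return counts

counts = frequency_lists(tagged)

# Most frequent metaphor used by patients in the online forum data
top_patient_forum = counts[("patient", "forum")].most_common(1)
print(top_patient_forum)
```

Frequency lists of this kind only indicate where to look; the qualitative phase still has to examine the tagged expressions in context to interpret how each group uses them.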

  Mixed Methodology 


Exploratory Design

Exploratory designs are the other side of the coin to explanatory designs. Here, qualitative data are collected first in order to guide subsequent collection of quantitative data. This type of mixed methodology allows researchers to pose follow-up questions whose answers might generalize the findings from the qualitative data collection. Researchers might use an exploratory design to investigate an under-explored phenomenon and define and describe variables before embarking on a quantitative design. Classroom action research (see Mackey & Gass, 2015) might utilize an exploratory design. For example, a classroom language teacher might notice that their students seem to be more engaged during different aspects of a lesson. The instructor might then decide to collect some qualitative data on learners’ perceptions about their own levels of engagement during different tasks. A helpful quantitative follow-up might then investigate whether learners perform better on the tasks in which they report being more engaged. This two-phase model allows the researcher or language teacher to develop an understanding of the research question through thematic analyses or other qualitative means and then connect those analyses to quantitative findings. An example of an exploratory design in a classroom context is illustrated in Sample Study 5.3.

Sample Study 5.3
Mroz, A. (2015). The development of second language critical thinking in a virtual language learning environment: A process-oriented mixed-method study. CALICO Journal, 32(3), 528–553.

Research Background
The researcher in this study utilized a mixed methods design to investigate the emergence of L2 strategies in collaborative discourse produced by university French learners interacting in a computer-mediated setting.

Research Gaps
The research design was motivated by a call for more qualitative data informing the results from CALL (computer-assisted language learning) studies and included a perspective layered by both sociocultural theory and ecological perspectives.

Research Method
• Type of research: The author defines the study as an embedded design utilizing a data transformation procedure. The design of the study is primarily qualitative with an embedded quantitative element.
• Setting and participants: Two undergraduate intermediate French classes consisting of 27 students. Five students volunteered to act as case-study students.


A. Mackey and L. Bryfonski

• Instruments/techniques: Qualitative data included bi-weekly observations of French classes, a focus group with five case-study students, and retrospective perception data. Quantitative data were elicited from computer-generated logs of students’ interactions in an online Second Life community.
• Data analysis: The author integrated the findings from both quantitative and qualitative data to examine student perceptions of the impact of interaction in the online community on their critical thinking skills in their second language.

Key Results
Results indicated an increase in the use of higher-order critical thinking abilities during interaction in the computer-mediated setting. Qualitative themes shed light on the students’ reactions to the collaborative learning environment and the use of technology for language learning.

Comments
This article clearly outlines its design and rationale for the use of mixed methodology. The integration of quantitative data into stages of a primarily qualitative, case-study design makes this a clear example of a sequential exploratory design.

Sequential Embedded Design

In the sequential embedded design, quantitative and qualitative data are collected in multiple phases to both inform and provide insights into potential explanations for one another. As in concurrent embedded designs, one data type (again, in the applied linguistics field, this has typically been quantitative data) is primary. Qualitative data collected before an intervention can aid in participant selection and instrument verification or help to shape the subsequent intervention. Qualitative data collected after an intervention can help to illuminate patterns in quantitative intervention data. The integration of both quantitative and qualitative methods at each stage of a research design is shown in Sample Study 5.4.

Sample Study 5.4
Préfontaine, Y., & Kormos, J. (2015). The relationship between task difficulty and second language fluency in French: A mixed methods approach. Modern Language Journal, 99(1), 96–112.

Research Background
The authors of this study utilized a mixed methods approach to shed light on the relationship between task difficulty and fluency.



Research Problem
The researchers asked “What quantitative and qualitative differences exist in the perceived difficulty of narrative tasks in French as an L2?” (p. 99) as well as several other questions regarding the variation in L2 fluency across the tasks and the relationship between perceived difficulty and fluency performance.

Research Method
• Type of methodology: Sequential embedded.
• Setting and participants: Forty adult learners of French studying in an immersion context participated in one of three task conditions.
• Instruments/techniques: The three conditions all included a narrative task that varied in terms of the demands it placed on learners’ creativity, content knowledge, and linguistic resources.
• Data analysis: Task difficulty was measured both quantitatively with questionnaire data and qualitatively with retrospective interview data. Fluency was measured quantitatively using a variety of fluency measures analyzed in Praat.

Key Results
The use of these multiple methods allowed researchers to capture a holistic account of the phenomenon under consideration; quantitative questionnaire data demonstrated a link between lexical retrieval difficulty and fluency difficulty that was related to perceived task difficulty, while qualitative interview data explained how the learners evaluated the difficulty of the tasks and also which task features affected their fluency. Quantitative fluency data showed that articulation rate and average pause time were related to task difficulty.

Comments
The seamless integration of both quantitative and qualitative methods is clear from the researchers’ research question stated above, the hallmark of an embedded mixed methods design. Since the authors collected data in several phases, this is considered a sequential embedded method.

New Trends in Mixed Methods Designs

In addition to the mixed methodology models described above, several novel models have been successfully applied to research in the field of applied linguistics. For example, Hashemi and Babaii (2013) examined 205 SLA research articles via a content-based analytic approach to identify current trends in mixed methodology SLA research. They found that, in general, concurrent designs are more prevalent than sequential designs and that while many researchers used mixed methodology designs, they did not achieve “high degrees of integration at various stages of the study as a quality standard for mixed research” (p. 828). Another finding was the existence of mixed methods research designs that did not fit the typology proposed by Creswell, Plano



Clark, and Garrett (2008). For example, Mackey (2006) utilized a primarily quantitative design with qualitative elements embedded and with follow-up qualitative data collection. Specifically, a pre-/post-test design was utilized to measure noticing and learning of new forms, with a qualitative learning journal method embedded. Stimulated recall interviews and follow-up questionnaires represented explanatory qualitative data. This study, therefore, included both concurrent and sequential features. Another example presented by Hashemi and Babaii (2013) was a needs analysis of Egyptian engineering students conducted by Pritchard and Nasr (2004). The aim of this study was to use the results of the needs analysis to design a reading improvement program for the engineering students. The study began as an exploratory study with qualitative data collected on student needs. The researchers then followed this qualitative information with a quantitative establishment of the validity and reliability of the development materials. The final phase involved embedded quantitative and qualitative data collection to obtain feedback from teachers and students.

Doing Mixed Methods in Applied Linguistics Studies

After selecting a mixed methods strategy that suits the research questions under investigation, there are several considerations that should be attended to during the planning, implementation, and analysis stages of the mixed methods research.

Planning

When in the planning stages of the study design, the researcher should be able to answer and justify several decisions including:
1. Will the design give equal weight to both quantitative and qualitative findings or theoretical perspectives?
2. Will the same participants contribute to all phases of the project?
3. Is there a clear justification for the design choice (i.e., concurrent or embedded) stemming directly from the research questions?

Mackey and Gass (2015) suggest sketching a visual diagram or representation of the research design as a useful strategy for visualizing each stage of the project. Within the sketch the research team can specify procedures and



expected outcomes for each phase of the study. This can be useful both for the researchers as they implement the study as well as for the readers of the final report of the project if the design is incorporated into the write-up.

Implementation Implementation is the stage where the design is applied and the data are collected. The order in which data are collected will, of course, depend on whether or not the researcher has chosen a concurrent or sequential design. The issue of multiple sources should also be considered, alongside multiple methods. For example, in task-based language teaching studies, researchers are urged to not only collect data on language program needs from teachers and administrators but also from students themselves. Decisions about methods and sources can have bearing on the ultimate validity and reliability of the study outcomes (as further discussed in the section on challenges, below).

Analysis

The analytic techniques and methods chosen depend heavily on the research questions and the mixed methods strategy used to investigate them. When conducting analyses and reporting results and outcomes, researchers should verify that they can answer the following questions: (a) How will my analyses complement or justify one another? and (b) Will the analyses be integrated or kept separate? In a mixed methods research design, data will likely be collected from multiple sources as well as via multiple data collection techniques. For example, a researcher might triangulate data from interviews of both teachers and students as well as from surveys or questionnaires. In order to best represent the wealth of data that is obtained in mixed methods studies, researchers might need to consolidate data, meaning refine, combine, or boil data down to a more concise format. They may also need to transform the data, which means to modify data so that they can be compared across sources. While we often take it for granted that quantitative data can be represented visually through the use of line plots, box plots, and histograms, qualitative results can also be represented visually. Descriptive matrices (see Miles, Huberman, & Saldana, 2014) or data visualization techniques made possible by qualitative data software like QSR International’s NVivo software (see Richards, 1999) can help readers better understand and compare the results from qualitative stages to make sense of the overall findings.
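As a rough illustration of the consolidation and transformation steps described above, the following Python sketch boils hypothetical coded interview segments down to theme counts per participant, converts those counts to proportions, and places them beside questionnaire scores in a simple descriptive matrix. All participant IDs, theme labels, scores, and function names are invented for illustration; this is one possible way to quantitize qualitative codes, not a procedure prescribed by the sources cited here.

```python
from collections import Counter, defaultdict

# Hypothetical coded interview segments: (participant ID, theme code).
coded_segments = [
    ("T01", "engagement"), ("T01", "anxiety"), ("T01", "engagement"),
    ("S01", "engagement"), ("S01", "feedback"),
    ("S02", "anxiety"), ("S02", "anxiety"), ("S02", "feedback"),
    ("S02", "engagement"),
]

# Hypothetical questionnaire means (1-5 scale) for the same participants.
questionnaire = {"T01": 4.2, "S01": 3.8, "S02": 2.5}


def consolidate(segments):
    """Consolidate: reduce coded segments to theme counts per participant."""
    counts = defaultdict(Counter)
    for participant, theme in segments:
        counts[participant][theme] += 1
    return counts


def transform(counts, theme):
    """Transform: express one theme as a proportion of each participant's
    coded segments, so it can sit beside numeric questionnaire data."""
    return {p: c[theme] / sum(c.values()) for p, c in counts.items()}


def build_matrix(segments, scores, theme):
    """Build a small descriptive matrix: (participant, score, proportion)."""
    proportions = transform(consolidate(segments), theme)
    return [(p, scores[p], round(proportions[p], 2)) for p in sorted(scores)]


matrix = build_matrix(coded_segments, questionnaire, "engagement")
for row in matrix:
    print(row)
```

Proportions rather than raw counts are used here because participants may contribute different amounts of interview data; whether that choice is appropriate depends on the research questions and on how the qualitative codes were assigned.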



Challenges in Mixed Methods Research

Pros and Cons to Mixed Methods Designs

An obvious advantage to mixed methods designs is the way that triangulation of sources helps to shed greater light on a problem. However, occasionally, there are challenges to integrating data from both types of approach into one report. This is particularly the case when the findings turn out to be contradictory. For example, a pretest-posttest explanatory design may show a significant advantage for one type of treatment, let’s say an instructional intervention. In order to provide more details and better understand the result, the researchers build into the design a provision to randomly select some of the participants for focal case studies. If the original experimental group consisted of twenty-five participants (with a similar number in the control group), three participants might be selected as focal case-study participants. However, it is possible that none, or not all, of them follow the same patterns as in the group data, so the data would seem to be contradictory. This situation illustrates how helpful it can be to mix methods, by showing that quantification and aggregation of data can often obscure interesting patterns at the individual level. To avoid a confusing report, researchers need to dig deeply into the case studies and use them to explain that improvements are taking place in the group as a whole but that this is not true for every individual in the group. Case-study data can help explain why, to some extent. Conversely, a study that provides an in-depth ethnographic investigation of a select group may seek to expand the study to include quantitative findings of group trends, to promote generalizability to a wider population. Here, too, case-study findings may contradict findings from quantitative results.
These issues must be thoughtfully and carefully described and explained in any research report so that research consumers can understand the nuances to the results from each element in the mixed methods study. These examples illustrate the importance of making initial decisions about which type of data are used to explain or complement the other. As Abbuhl and Mackey (2015) noted, “Some researchers claim that the two approaches are epistemologically and ontologically incompatible, making split methods little more than unprincipled methodological opportunism” (p.8). The idea that quantitative and qualitative methods are inherently incompatible is not the position we take, although it is important to guard against it by designing research carefully, and not falling into the trap of presenting a primarily



quantitative study with a token case study at the end or of presenting a statistical analysis of some descriptive data at the beginning of an ethnography. Neither of these approaches would qualify as genuinely mixed methods, but rather as primarily one or the other, while borrowing a technique from a different paradigm. It is important for a true mixed methods study to be designed as such from the outset, recognizing the need for both approaches to fully address the research questions or problem area. While there is no “ideal” mixed methodology study since all research must stem from a particular problem, we can say that a sequential embedded design type using both quantitative and qualitative data collection at all stages of the process is an important goal. An example would be a tightly controlled experiment in which introspective and interview methods are carefully intertwined.

Limitations of Mixed Methods Designs

Mixed methods studies, while an important and useful approach to investigating questions in our field, are, as is the case for all research, not without limitations. Mixed methods research typically requires more resources in the sense of time, both from the participants and from the researchers. Essentially, researchers need the best of both worlds for a mixed methods study to work: the extensive experimental methods, with careful development of procedures and materials and the controlling and counterbalancing of variables, together with the intensive, integrated, and grounded approach to individual participation and understanding, and the time investment, that come with qualitative research. It is also the case that domain expertise needs to be considered. Many researchers are trained in one paradigm or the other and, indeed, naturally gravitate to one or the other in terms of their preference and comfort level for data collection and analysis. Team approaches are helpful in this regard, as are further training and education. Even when we consider the final outcomes, primarily quantitative research is usually written up in journal articles or book chapters, and qualitative research is more often written up in lengthier publications, such as monographs. This disparity in outcomes also needs to be considered. Having said all this, it seems logical that using multiple sources and data types to address questions in the field and not being wedded to one paradigm or another will advance knowledge, and the recent increase in interest in mixed methods research supports this.



Resources for Further Reading

Brown, J. D. (2014). Mixed methods research for TESOL. Edinburgh: Edinburgh University Press.
This textbook provides a useful and practical overview of the basics of research methodology with a focus on combining quantitative and qualitative methods judiciously in TESOL research projects. Chapters provide depth on the key stages in the development of a mixed methods study including: planning a project, gathering and analyzing data, interpreting results, and reporting results. The book focuses on the domains of classroom research, action research, conversation, and discourse analysis as well as survey-based research and language program evaluations.

Creswell, J. W., & Plano Clark, V. L. (2011). Designing and conducting mixed methods research (2nd ed.). Los Angeles, CA: SAGE.
For those interested in examining methodology from a broader social science-oriented perspective, this textbook provides a step-by-step look at all aspects of the mixed methodology research process. This text is especially useful for the many diagrams of the various types of mixed methods designs described in the current chapter and provides greater depth for each of the main mixed methods designs.

Ivankova, N. V., & Greer, J. L. (2015). Mixed methods research and analysis. In B. Paltridge & A. Phakiti (Eds.), Research methods in applied linguistics: A practical resource. London: Bloomsbury.
This practical book chapter defines and describes mixed methods research in a manner that is accessible to both undergraduate and postgraduate students studying applied linguistics. The chapter provides an overview of when and why mixed methods research is utilized, philosophical assumptions and issues associated with mixed methods research, as well as a breakdown of the steps of implementing a mixed methods study.

Journal of Mixed Methods Research (JMMR).
This peer-reviewed journal focuses on mixed methods research in fields of social, behavioral, health, and human sciences from an international perspective. The journal is supported and contributed to by leading mixed methodology researchers including John Creswell, Abbas Tashakkori, Charles Teddlie, and Michael Patton among many others.

King, K. A., & Mackey, A. (2016). Research methodology in second language studies: Trends, concerns, and new directions. Modern Language Journal, 100(S1), 209–227.
This introduction to the 100th anniversary edition of The Modern Language Journal offers an in-depth discussion of the importance of mixing and layering methods and perspectives in order to further the field of applied linguistics. The article summarizes some new advances in methodology and points to future directions for second language and applied linguistics researchers.

Mackey, A., & Gass, S. M. (2015). Second language research: Methodology and design (2nd ed.). London: Routledge.
The second edition of this handbook for second language acquisition researchers includes practical advice for quantitative, qualitative, and mixed methods studies. Chapter 9 on mixed methods provides overviews of the main types of mixed methods designs used in second language acquisition research and additionally provides activities and discussion questions that can be used to further deepen understandings about mixed methods research.

Riazi, A. M. (2017). Mixed methods research in language teaching and learning. London: Equinox publication Company.
This recent publication is the first in the field to bring together the current body of mixed methodology research in language teaching and learning. The book takes a practical approach in order to promote the use of mixed methods by applied linguistics researchers. The challenges of designing and implementing mixed methods research are also addressed. This is a useful resource for doctoral students, postgraduates, or others developing research proposals that integrate quantitative and qualitative methods.



References

Abbuhl, R., & Mackey, A. (2015). Second language acquisition research methods. In K. King & Y. Lai (Eds.), Encyclopedia of language and education: Vol. 10. Research methods in language and education (3rd ed.). Dordrecht: Springer.
Cekaite, A. (2007). A child’s development of interactional competence in a Swedish L2 classroom. Modern Language Journal, 91(1), 45–62.
Creswell, J. W. (2003). Research design: Qualitative, quantitative and mixed methods approaches. Thousand Oaks: SAGE.
Creswell, J. W. (2015). A concise introduction to mixed methods research. Los Angeles, CA: SAGE.
Creswell, J. W., & Plano Clark, V. L. (2011). Designing and conducting mixed methods research (2nd ed.). Los Angeles, CA: SAGE.
Creswell, J. W., Plano Clark, V. L., & Garrett, A. L. (2008). Methodological issues in conducting mixed methods research designs. In M. M. Bergman (Ed.), Advances in mixed methods research (pp. 66–83). London: SAGE.
Demmen, J., Semino, E., Demjen, Z., Koller, V., Hardie, A., Rayson, P., & Payne, S. (2015). A computer-assisted study of the use of violence metaphors for cancer and end of life by patients, family carers and health professionals. International Journal of Corpus Linguistics, 20(2), 205–231.
Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. Oxford: Oxford University Press.
Gass, S. M., & Mackey, A. (2016). Stimulated recall methodology in second language research (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Hashemi, M., & Babaii, E. (2013). Mixed methods research: Toward new research designs in applied linguistics. Modern Language Journal, 97(4), 828–852.
Jick, T. (1979). Mixing qualitative and quantitative methods: Triangulation in action. In J. Van Maanen (Ed.), Qualitative methodology (pp. 135–148). Beverly Hills: SAGE.
Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21(1), 60–99.
King, K. A., & Mackey, A. (2016). Research methodology in second language studies: Trends, concerns, and new directions. Modern Language Journal, 100(S1), 209–227.
Lee, E. (2016). Reducing international graduate students’ language anxiety through oral pronunciation corrections. System, 56(1), 78–95.
Loewen, S., & Philp, J. (2012). Instructed second language acquisition. In A. Mackey & S. M. Gass (Eds.), Research methods in second language acquisition: A practical guide (1st ed., pp. 55–73). Oxford: Blackwell Publishing Ltd.
Mackey, A. (2006). Feedback, noticing and instructed second language learning. Applied Linguistics, 27(3), 405–430.
Mackey, A. (2015). Methodological practice and progression in second language research. AILA Review, 27, 80–97.
Mackey, A., & Gass, S. M. (2015). Second language research: Methodology and design (2nd ed.). London: Routledge.
Miles, M. B., Huberman, A. M., & Saldana, J. (2014). Qualitative data analysis: A methods sourcebook. Thousand Oaks, CA: SAGE.
Moeller, A. J., Creswell, J. W., & Saville, N. (Eds.). (2015). Language assessment and mixed methods. Cambridge: University of Cambridge Press.
Mroz, A. (2015). The development of second language critical thinking in a virtual language learning environment: A process-oriented mixed-method study. CALICO Journal, 32(3), 528–553.
Préfontaine, Y., & Kormos, J. (2015). The relationship between task difficulty and second language fluency in French: A mixed methods approach. Modern Language Journal, 99(1), 96–112.
Pritchard, R. M. O., & Nasr, A. (2004). Improving reading performance among Egyptian engineering students: Principles and practice. English for Specific Purposes, 23(4), 425–445.
Riazi, A. M. (2016). Innovative Mixed-methods Research: Moving Beyond Design Technicalities to Epistemological and Methodological Realizations. Applied Linguistics, 37(1), 33–49.
Riazi, A. M., & Candlin, C. N. (2014). Mixed-methods research in language teaching and learning: Opportunities, issues and challenges. Language Teaching, 47(2), 135–173.
Richards, L. (1999). Using NVivo in qualitative research. Thousand Oaks, CA: SAGE.
Smith, B. (2012). Eye tracking as a measure of noticing: A study of explicit recasts in SCMC. Language Learning & Technology, 16(3), 53–81.
Tashakkori, A., & Creswell, J. W. (2007). Editorial: The new era of mixed methods. Journal of Mixed Methods Research, 1(1), 3–7.
Trochim, W. M. (2006). The research methods knowledge base (2nd ed.). Retrieved from

6 Traditional Literature Review and Research Synthesis

Shaofeng Li and Hong Wang

Introduction A literature review is a retrospective account of previous research on a certain topic, and it may achieve various purposes. For researchers, a literature review may serve to contextualize and inform further research. Specifically, prior to carrying out a new study, the researcher needs to find a niche by identifying what has been done on the topic under investigation and to mine through existing methodology with a view to developing instruments and materials that can best answer his/her own research questions. Also, through a literature review, a researcher may draw on existing evidence to verify a theory or build a new theory. For practitioners and policy makers, the conclusions reached in a literature review based on aggregated research findings may serve as a basis for decision-making in terms of how to meet the needs of different stakeholders. Such integration of research and practice is

S. Li (*)
Foreign and Second Language Education Program, School of Teacher Education, Florida State University, Tallahassee, Florida, USA
Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies, Guangzhou, China

H. Wang
University of Auckland Library and Learning Services, The University of Auckland, Auckland, New Zealand

© The Author(s) 2018. A. Phakiti et al. (eds.), The Palgrave Handbook of Applied Linguistics Research Methodology



S. Li and H. Wang

called evidence-based practice (Bronson & Davis, 2012), which is of particular importance in a heavily practice-oriented discipline such as applied linguistics. Because of the importance of literature review, there has been a call to treat it as a research method in its own right (Cooper, 2016). In fact, in the literature on literature review, a distinction has been made between traditional literature reviews, such as the type that appears in the literature review section of a journal article, and more systematic approaches to reviewing previous research, such as meta-analysis, which is conducted following a set of well-defined procedures and protocols. Although there has been much discussion about the differences between traditional reviews and systematic reviews, systematic comparisons have been rare, and the terminology relating to the two approaches and to literature review as a genre has been ambiguous and confusing. To clarify any potential ambiguities, and for the purposes of this chapter, the term literature review is used to refer to the genre as a whole as well as to traditional literature reviews. Research synthesis is reserved for systematic reviews such as meta-analysis. Proponents of research synthesis have generally been negative about traditional reviews, criticizing their unscientific nature. However, we argue that both traditional reviews and research syntheses have merits, that they serve different purposes, and that they do not have to be mutually exclusive. The following sections discuss the procedures and best practices of each of the two approaches and conclude by making a comparison between the two approaches and proposing ways to integrate them.

Traditional Literature Review

Although anyone getting an academic degree or pursuing an academic career may have to write a literature review at some point, there has been surprisingly little information on how to do it. A quick look at research methods textbooks in applied linguistics shows that none of them includes detailed information on how to conduct a literature review. In a study by Zaporozhetz (1987), Ph.D. advisors (N = 33) ranked the literature review section the lowest in terms of the amount of help they provided to their students; they reported spending the most time on the supervision of the methods chapter. The lack of interest in guiding students on how to do a literature review is probably because (1) it is considered an easy and transparent process, not a skill that needs to be trained, and (2) there are a myriad of ways of writing a literature review, which makes it challenging to provide general guidance. However, it can be argued that (1) doing a literature review is not a naturally acquired skill, and (2) despite the variety of styles and approaches, there are some common principles and procedures one could follow in order to write a successful review.



Jesson, Matheson, and Lacey (2011) defined a traditional literature review as “a written appraisal of what is already known … with no prescribed methodology” (p. 10). This definition suggests that a literature review is not a mere description of previous research; rather, it provides an evaluation of the research. The definition also distinguishes a traditional review from a research synthesis, which is carried out by following a set of well-defined procedures (see Plonsky & Oswald, 2015). Traditional reviews include the introductory sections of reports of empirical studies as well as freestanding reviews such as those published in Language Teaching. The main purpose of a literature review for an empirical study is to set the stage for a new study. The purposes of freestanding reviews, by contrast, are more diverse, such as providing a state-of-the-art review of the research on a certain instructional treatment, clarifying the myths and central issues of a substantive domain, proposing a new research agenda, and summarizing the methods previous researchers have used to measure a certain construct. Because of the diversity of topics and purposes of freestanding reviews, there are no fixed formats to follow. The focus of this section, consequently, will be on the literature reviews for empirical studies, which, of course, overlap with freestanding reviews in many ways. Before going into further detail, it is necessary to emphasize that what we are discussing here is how to do, not just how to write, a literature review; writing a literature review is the final step of the entire process of doing a literature review. The purpose of doing a literature review is threefold: (1) to contextualize the study to be conducted, (2) to inform the study design, and (3) to help the researcher interpret the results in the discussion section.
Specifically, when contextualizing the current study, the researcher needs to identify what is known about the topic by discussing the related theories, research, and practices. The researcher also needs to identify what is unknown about the topic, explain how it is informed by, and deviates from, previous studies, and convince the reader of the significance of the current study. An equally important purpose of doing a literature review is to draw on the methodology of existing research to answer the research questions of the current study. Finally, doing a literature review enables the researcher to refer back to the theories and research expounded in the review section when discussing how the findings of this study may contribute to the existing body of knowledge. The content of a literature review, or the scholarship to be evaluated, is of three types: conceptual, empirical, and practical. Conceptual knowledge concerns theories, including arguments, statements, claims, and terminology. Empirical knowledge refers to the findings of empirical studies as well as the methodological aspects of the studies. Practical knowledge can be divided into two types. One refers to the knowledge contributed by practitioners including (1) the findings of action research, such as those reported in articles published in the ELT Journal or in the practitioners’ research section of Language Teaching


S. Li and H. Wang

Research; (2) guidelines and principles for effective practice, such as the information from teacher guides; and (3) opinions, debates, and discussions on public forums such as the Internet. The other type of practical knowledge pertains to policies and instructions formulated by government agencies to guide the practice of the domain in which the research is situated. These three types of knowledge correspond to three aspects of a research topic: theory (conceptual), research (empirical), and practice (practical). Although a literature review need not be all-inclusive, it should minimally include the theories and research relating to the research topic. However, given the applied nature of our field, the review of the literature would appear incomplete if the practical dimension were left out.

Stages

The process of doing a literature review can be divided into six stages, which are elaborated in the following sections.

Stage 1: Defining the Problem

When doing a literature review, the first step is to define the research problem or formulate research questions for the study. Although research questions usually appear at the end of a literature review, in practice a researcher must have them at the beginning of the process. The research questions serve as a flagship guiding the literature review as well as other parts of an empirical study such as study design. Therefore, if the researcher is uncertain about where to start during a literature search, or even how to organize the literature review, the best way is to consider what questions the study seeks to answer and then find information on what theorists, researchers, and practitioners have said about the questions. Although the research questions may be fine-tuned as the review process unfolds, they serve as starting points leading the researcher towards the destination and guiding the literature search as well as later stages of the process.

Stage 2: Searching for the Literature

After formulating the research questions, the next step is to search for the literature to be included in the review. The most common search strategy is to use electronic databases, including (1) domain-general databases (e.g., Google Scholar), (2) domain-specific databases in applied linguistics (e.g., LLBA), (3)
databases from neighbouring disciplines (e.g., PsycINFO in psychology), and (4) databases for Ph.D./M.A. dissertations and theses (e.g., ProQuest Dissertations & Theses). One emerging and powerful source of information is public academic forums such as Academia, which are not only venues for academic communication between researchers but also large repositories of information. Another commonly used strategy is ancestry chasing, that is, mining the reference sections of primary studies and review articles to find relevant items (the term primary is used to distinguish the studies synthesized in a literature review from the review itself).

Stage 3: Selecting Studies

Although a traditional literature review usually does not report how the studies included in the review are selected, the researcher must make decisions, albeit “behind the scenes,” on which retrieved studies actually go into the review. Unlike a research synthesis, which must be inclusive, a traditional review is selective. Although the selection criteria are idiosyncratic, some general principles should be adhered to. The first principle is that the selected studies must be representative. Representativeness has two dimensions: influential and diverse. Influential studies are milestone or seminal studies on the topic under investigation, which are frequently cited and/or are published in prestigious venues such as journals with high impact factors. Diverse means that the included studies must represent different perspectives, disparate findings, and varied contexts or populations. Critics of traditional reviews argue that the authors may include only studies that support a certain theory, show certain results, or are carried out with a certain methodology. Therefore, it is important to include studies that represent different theories and trends to reduce the likelihood of bias and arbitrariness. The second principle concerns relevance. Retrieved studies may be relevant to the research questions to varying degrees. Given the usually limited space for a literature review, it is important to include the most relevant research. The third consideration is study quality, that is, the reviewed studies should have high internal and external validity.

Stage 4: Reading the Literature

Two principles should be kept in mind when reading the literature. The first is to read carefully and understand thoroughly. A common problem in doing a literature review is piecemeal reading (Booth, Colomb, & Williams, 1995),
which may cause incomplete or inaccurate understanding. There is no shortcut to a successful review, and familiarity with the literature is the only key. It is advisable to read key articles several times, read alternative explanations in the case of a difficult or complicated theory, and read all hallmark studies. The second principle is to read actively and critically instead of passively and mechanically. While reading, one should not assume that published research is perfect. Instead, consider whether the study in question was conducted using valid methods, whether the results are due to idiosyncratic rather than principled methods, how the study is similar to, and different from, other studies, whether the interpretations are warranted, and how the study informs one’s own study in terms of what one can draw on, what improvements one wants to make, or whether one will reorient the focus of one’s own study based on what has been learned from this study.

Stage 5: Organizing the Data

Information derived from the retrieved studies constitutes data for the literature review and should be organized in two ways: discretely and synthetically. In a discrete organization, the details about each individual study are recorded in a table or spreadsheet, and this type of information can be labelled study notes. Study notes can be arranged alphabetically according to the authors’ family names, and the notes about a primary study should include a brief summary of the study, followed by detailed information about the methods, results, and interpretations of the results. As to the literature review of a retrieved study, it would be useful to observe what theories or models the author refers to, how he/she defines the constructs, and how he/she summarizes and critiques previous research. However, the author’s comments and interpretations of other studies might be biased and inaccurate, and therefore it is always advisable to read the original articles in case of any uncertainty. Finally, the study notes table should have a section for any comments the reviewer may have on any aspect of the study that merits further attention. A synthetic organization involves extracting themes, patterns, or trends that have emerged from individual studies. Synthetic notes can be organized after the study notes about each individual study are in place, but it is easier and more efficient to work on both at the same time. Organizing synthetic notes entails categorizing the information from the primary studies and identifying and reflecting on the commonalities and disparities between them. How the information should be categorized depends on what has
emerged from the studies and whether categorizing the studies in a certain way leads to an interesting point or a convincing argument. Categorization can be based on study findings. For example, some studies may have found a certain instructional treatment to be effective, while others may not. It would then be necessary to divide them into two categories and ascertain what characteristics each group of studies share that lead to their respective and conflicting findings. In a similar vein, categorization can be done methodologically based on learner population, study context, measures of independent and dependent variables, and so on. Furthermore, methodological information collated from different studies can be the target of synthesis if one purpose of the review is to summarize the methodology of the primary studies. Finally, in addition to empirical knowledge, the synthetic notes should include sections for theoretical and practical knowledge so that all information needed for the review converges in one venue.

Stage 6: Writing Up the Review

Components of a literature review. A literature review typically consists of three components: an introduction, the body of the review, and research questions. The length of the introduction ranges from one or two paragraphs (for a journal article) to a chapter (for a Ph.D. thesis), but the purpose is the same: to give “the reader a sense of what was done and why” (APA, 2001, p. 16). This section of the literature review should provide a succinct overview of the topic, spell out the key issues surrounding the topic, state the significance of the study, identify the gaps in knowledge, explain the aims of the study, and inform the reader of the structure of the literature review. The body of the literature review should contextualize the current study by summarizing the theory, research, and practice of the research topic and justifying the significance of the study. The body of the review leads towards the research questions, and therefore before the research questions are introduced, it is necessary to summarize previous findings and controversies and show the links between previous research and the current study.

Structuring a literature review. There is no fixed format for organizing a literature review, but given the separation between the three types of knowledge (conceptual, empirical, and practical), the macro structure may consist of three major parts dealing with the theories, research, and practices relating to the focus of the current study. For an empirical study, the bulk of the review should be empirical knowledge, namely, the findings and methods of empirical studies.
There are two ways to present the information contributed by empirical studies: thematic and anthological. Thematic presentation is based on the themes emerging from the primary studies (from the synthetic notes), and the flow of information proceeds through arguments. An argument is “the logical presentation of evidence that leads to and justifies a conclusion” (Machi & McEvoy, 2012, p. 65). An argument has three components: claim, evidence, and warrant (Booth et al., 1995). A claim should be substantive and contestable in order to arouse readers’ interest and contribute to the existing body of knowledge. The evidence cited to support the claim needs to be:

1. accurate, that is, incorrect evidence should not be used
2. precise, namely, be specific, avoid being vague, and hedge or use qualifiers if absolute precision is impossible
3. sufficient, meaning there should be enough evidence for the validity of the claim
4. authoritative, that is, evidence should be robust and influential
5. perspicuous, which means evidence should be clear and easy to understand

The warrant of an argument is the reasoning linking the evidence and the claim; it is about the logical connections between the evidence and the claim, not the evidence or claim per se. Therefore, if there are no logical links between the claim and the evidence, then the argument is not warranted, even though the evidence is sound. In a nutshell, if one elects to present the information about empirical studies thematically, the literature review would be built around different arguments in which claims or themes are reported with supporting evidence from multiple sources or studies. Anthological presentation means that the review is organized as a collection of individual studies, reporting details on the methods and results of each study, similar to study notes.
Although this practice is prevalent in the field of applied linguistics, it is less effective than thematic presentation because the primary objective of a literature review is to critique, synthesize, and show readers what to make of previous findings rather than create an annotated bibliography. This is not to say that we should root out detailed descriptions of individual studies; rather, study details are necessary when they are important for making an argument or when hallmark or seminal studies are reported due to the special place they occupy in a literature review. However, because of the critical nature of traditional reviews, study details, when reported, should be accompanied by comments and critique.

The critical nature of a literature review. As Imel (2011, p. 146) pointed out, a literature review “provides a new perspective on the topic … [and is] more than
the sum of its parts.” Key to developing a new perspective is a critical assessment of the findings and methods of the relevant body of research. The following strategies can be utilized to make a literature review a critical piece of writing:

• Identify the differences and similarities between primary studies in terms of their findings and methods.
• Include all representative findings, not only those that are “cherry-picked” to support your own position. If there are disparities, it is better to explain them than to ignore them. For opposing evidence that is uninterpretable, it is better to state that “certain studies support one conclusion and others support another” (APA, 2001, pp. 16–17), rather than provide an unconvincing interpretation.
• Challenge existing theories and research. If you disagree with an argument or claim on reasonable grounds, do not hide your position; if there is clear evidence showing the limitations of a previous study, point them out; propose new or alternative interpretations for previous findings. However, do not stigmatize previous research even though you have found loopholes and limitations.
• Discuss the significance of previous studies. If you have discovered merits of a view, finding, or method, it is important to make them known to the reader. Thus, being critical entails demonstrating not only the weaknesses of previous research but also its strengths and contributions.
• Evaluate and clarify controversies or opposing theoretical positions and discuss how they have influenced the research and practice of the substantive domain.
• Propose new directions, methods, or theories, which may complement, but not necessarily contradict or supersede, existing research.
• Use linguistic devices to show the relationships between ideas and describe the authors’ stances. For example, cohesive expressions, such as “similarly,” “in contrast,” “however,” “therefore,” and so on, are effective in making evident how information is related. Appropriate reporting verbs should be used to accurately capture the authors’ positions and stances, such as “contend,” “argue,” “state,” “observe,” “assert,” “report,” and so on.

Research Synthesis

Research synthesis grew out of the dissatisfaction with traditional reviews, which are deemed unscientific and subjective. Cooper (2016) argued that like an empirical study, a research synthesis must meet rigorous methodological
standards in order for the conclusions to be trustworthy. Despite the different labels for research synthesis, experts seem to converge on the point that a research synthesis is comprehensive in coverage and transparent in reporting, and its purpose is to reach conclusions based on study findings, which may be used to guide practice and policy-making. In applied linguistics, two types of research synthesis have emerged: methodological and substantive. Methodological synthesis provides a survey of one or more methodological aspects of the primary research with a view to evaluating whether current practices meet certain criteria and what improvements can be made. Plonsky and associates (e.g., Plonsky & Gass, 2011) have published a series of methodological syntheses assessing the study quality of the primary research in second language acquisition. For example, Plonsky (2013) synthesized 606 empirical studies published between 1990 and 2010 in two major journals, Language Learning and Studies in Second Language Acquisition, coding the studies for designs, statistical analyses, and reporting practices. The results showed a number of strengths and flaws, and the author proposed strategies to resolve the identified issues. Substantive syntheses seek to aggregate the results of primary studies and reach conclusions about whether an instructional treatment is effective or a certain relationship exists or how frequently a certain phenomenon occurs. Depending on the way the data is analysed, substantive syntheses can be further divided into three types: thematic synthesis, vote-counting, and meta-analysis. In a thematic synthesis, study findings are reported as themes and categories. A thematic synthesis may appear similar to a traditional review, but it is not, because it has all the characteristics of a research synthesis. In particular, it seeks to reach conclusions based on the totality of research rather than critique some selected studies.
However, freestanding state-of-the-art reviews, which fall into the traditional review paradigm, have the potential of being converted to thematic syntheses if they follow more rigorous procedures, transparently report the review methods, and narrow their foci. An example of a thematic synthesis is Dixon et al.’s (2012) synthesis of 72 empirical studies on the optimal conditions of second language acquisition. The synthesis sought to answer five research questions from four perspectives based on transparent study selection criteria. The article reported the information about each included study in a 20-page table, and the study findings were described in a narrative style. The second type of substantive synthesis is called vote-counting, in which study findings are analysed by tallying the number of significant and nonsignificant p-values reported in primary studies. An alternative, which is based on similar principles, is to average the p-values generated by the primary
studies. An example of a vote-counting synthesis is Ellis (2002), who aggregated the results of 11 studies examining the effects of form-focused instruction on the acquisition of implicit knowledge. Ellis conducted the synthesis by counting the number of studies that reported significant or nonsignificant effects, followed by a discussion of the results. Strictly speaking, thematic synthesis, which is discussed above, is one type of vote-counting because the conclusions are based on whether primary studies reported significant findings, even though a thematic synthesis does not overtly count the number of significant p-values. The third type of substantive synthesis is meta-analysis, where effect sizes extracted from each primary study are aggregated to obtain a mean effect size as a proxy of the population effect (e.g., Li, 2010). Meta-analysis is a common type of research synthesis, and in fact, research synthesis has, either implicitly or explicitly, been equated with meta-analysis by many experts in this field (e.g., Cooper, 2016). Since Glass (1976) coined the term “meta-analysis,” it has become the most preferred method of research synthesis in various fields such as psychology and medicine. Unlike the dearth of guidelines on how to conduct a traditional literature review, there has been an abundance of publications including journal articles, book chapters, and books, which provide systematic instructions on how to carry out a meta-analysis (see Li, Shintani, & Ellis, 2012). Given that it is the most favoured and common method of research synthesis, the focus of this section is on meta-analysis.
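Before turning to meta-analysis, the vote-counting procedure described above can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `vote_count`, the .05 threshold, and the p-values are hypothetical and are not taken from Ellis (2002) or any other study.

```python
from statistics import mean


def vote_count(p_values, alpha=0.05):
    """Tally significant and nonsignificant results.

    alpha is an assumed significance threshold, not one prescribed by the chapter.
    """
    significant = sum(1 for p in p_values if p < alpha)
    return {"significant": significant,
            "nonsignificant": len(p_values) - significant}


# Hypothetical p-values from five primary studies (invented for illustration).
example_p_values = [0.01, 0.20, 0.04, 0.30, 0.03]

tally = vote_count(example_p_values)
# The alternative mentioned above: averaging the p-values.
average_p = mean(example_p_values)

print(tally)
print(round(average_p, 3))
```

Note that neither the tally nor the averaged p-value says anything about how large the effect is, which is the limitation that motivates the use of effect sizes.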

Meta-Analysis

Meta-analysis is a statistical procedure aiming to (1) aggregate the quantitative results of a set of primary studies conducted to answer the same research question(s) and (2) identify factors that moderate the effects across studies. Thus, a meta-analysis seeks to obtain a numeric index of the effects of a certain treatment or the strength of a relationship existing in the population; it also investigates whether the variation of the effects or relationship can be explained by systematic substantive and/or methodological features of the included studies. In a meta-analysis, the “participants” are the included studies, and the unit of analysis is the effect size contributed by each primary study or calculated based on available information. The variables for analysis are those investigated by the primary researchers or created by the meta-analyst on the basis of the features of the studies (e.g., participant demographics, research context, treatment length, etc.). The effect size, which takes different forms depending on the nature of the construct or the study design, is the building
block of a meta-analysis. Importantly, the effect size is a standardized index that makes it possible to compare the results of different studies. Basing the analyses on effect sizes overcomes the limitations of the dichotomizing p-value in null hypothesis significance testing (NHST), which represents the presence or absence of a significant effect but tells nothing about the size of the effect. NHST is sensitive to sample size. For example, a significant p-value may result from a large sample even though the effect is small, and a nonsignificant p-value may be associated with a large effect but a small sample. Sun, Pan, and Wang (2010) reported that in 11% of the articles published in selected journals in educational psychology, there was a discrepancy between effect sizes and the results of NHST: medium to large effect sizes were associated with nonsignificant p-values, while small effect sizes were accompanied by significant p-values. Because the effect size contains information about the magnitude of an effect and is exempt from the influence of sample size, its utility has gone beyond meta-analysis. For example, many applied linguistics journals have made it a requirement to include effect sizes in manuscripts reporting empirical studies. Meta-analysis is considered to be superior to other methods, such as vote-counting, and to a large extent the superiority lies in the usefulness of effect size (Li et al., 2012). First, by aggregating effect sizes across primary studies, meta-analysis provides information about the size of the overall effect, whereas vote-counting can only tell us whether an effect is present. Second, in meta-analysis, effect sizes can be weighted in proportion to sample sizes, while in vote-counting all data points carry the same weight. Third, meta-analysis can provide a precise estimate of an effect or relationship, whereas vote-counting methods can only demonstrate what the majority of the studies show about the construct. Fourth, meta-analysis follows rigorous statistical procedures. The results are likely more robust and credible and can guide practice and policy-making.
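The sensitivity of NHST to sample size can be illustrated with a small sketch. It assumes two independent groups of equal size and uses a normal (z) approximation to the t test, so the p-values are approximate; the function name `approx_two_tailed_p` and all numbers are hypothetical.

```python
import math


def approx_two_tailed_p(d, n_per_group):
    """Approximate two-tailed p-value for a standardized mean difference d.

    Assumes two independent groups of equal size and a normal (z)
    approximation to the t test, so the result is only approximate.
    """
    z = d * math.sqrt(n_per_group / 2)
    return math.erfc(abs(z) / math.sqrt(2))


# The same (hypothetical) effect size observed with two different sample sizes.
same_effect = 0.3
p_small_sample = approx_two_tailed_p(same_effect, n_per_group=20)
p_large_sample = approx_two_tailed_p(same_effect, n_per_group=200)

print(f"d = {same_effect}, n = 20 per group:  p is about {p_small_sample:.3f}")
print(f"d = {same_effect}, n = 200 per group: p is about {p_large_sample:.4f}")
```

Under these assumptions the identical effect is nonsignificant at the .05 level in the small sample and significant in the large one, whereas the effect size itself does not change.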

How to Do Meta-Analysis?

An easy way to understand meta-analysis is to construe it as an empirical study, which involves problem specification, data collection, data analysis, and research report writing. In the following, we briefly outline five major stages involved in conducting a meta-analysis along with some of the issues and choices encountered at each stage. For a detailed tutorial on conducting a meta-analysis in applied linguistics, see Li et al. (2012), Ortega (2015), and Plonsky and Oswald (2015).

Stage 1: Identifying a Topic

As with a primary study, a successful meta-analysis starts with clear, well-defined research questions. However, unlike a primary study where the research questions are based on hypotheses and theories or gaps in previous research, the research questions for a meta-analysis are primarily those examined by the primary studies. Nevertheless, the meta-analyst still has to delineate the research domain based on a thorough understanding of the theories and the research that has been carried out. For instance, a meta-analysis on language aptitude (Li, 2015) requires an unequivocal theoretical and operational definition of aptitude. In educational psychology, aptitude refers to any person trait that affects learning outcomes, including cognitive (e.g., analytic ability) as well as affective variables (motivation, anxiety, etc.). However, in most aptitude research in applied linguistics, aptitude has been investigated as a cognitive construct. Therefore, defining the research topic is no easy matter, and the decisions at this preliminary stage have a direct impact on the scope of the synthesis and how it is implemented at later stages.

Stage 2: Collecting Data

Data collection for a meta-analysis involves searching for and selecting studies for inclusion in the synthesis. Unlike a traditional review, which usually does not report how the reviewed studies are identified and sieved, a meta-analysis must report the details on the strategies, databases, and keywords used during the search, as well as the criteria applied to include/exclude studies. One judgement call at this stage concerns whether to include unpublished studies. Evidence shows that studies that report statistically significant results are more likely to be published or submitted for publication (Lipsey & Wilson, 1993). Therefore, excluding unpublished studies may lead to biased results, and this phenomenon is called publication bias. One solution, of course, is to include unpublished studies. However, there have been objections to this recommendation on the grounds that unpublished studies are difficult to secure, that they lack internal and external validity, and so on. Another solution is to explore the extent to which publication/availability bias is present in the current dataset, such as by plotting the calculated effect sizes to see whether studies with small effects are missing, followed by a trim-and-fill analysis to see how the mean effect size would change if the missing values were added, or by calculating a fail-safe N statistic to probe whether the obtained results would be easily nullified with the addition of a small number of studies.
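As a rough illustration of the fail-safe N idea, the sketch below uses Rosenthal's formulation, in which the Z scores of the k included studies are summed and compared with a one-tailed critical value of 1.645. The choice of this particular formulation is an assumption made here (the text above does not specify one), and the Z scores are invented for illustration.

```python
def fail_safe_n(z_scores, z_crit=1.645):
    """Rosenthal-style fail-safe N.

    Estimates how many additional studies averaging a null result (Z = 0)
    would be needed to bring the combined result down to the critical value.
    z_crit = 1.645 corresponds to a one-tailed alpha of .05 (an assumption).
    """
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_crit ** 2) - k


# Hypothetical Z scores from four primary studies.
example_z_scores = [2.0, 2.5, 1.8, 3.0]
n_fs = fail_safe_n(example_z_scores)

print(f"Fail-safe N is about {n_fs:.1f} additional null-result studies")
```

A small fail-safe N relative to the number of included studies would suggest that the obtained results could be nullified easily; a large value would suggest the opposite.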


Stage 3: Coding the Data

Data coding involves reading the selected studies, extracting effect sizes from the studies, and coding independent and moderating variables. To begin, we would like to point out that most introductory texts ignore the reading stage and emphasize the technical aspects of meta-analysis. However, it is important to recognize that meta-analysis is fundamentally a statistical tool that helps us solve problems, not an end in itself. Ultimately, the contribution of a meta-analysis lies in the findings and insights it generates, not the sophisticated nature of the statistical procedure. Therefore, while every effort should be made to ensure statistical rigour, careful reading of the study reports is critical to a thorough understanding of the substantive domain, meaningful coding of the data, and accurate interpretation of the results. While reading, the meta-analyst should keep notes about each individual study recording the main findings and methodological details, which can be checked at any stage of the meta-analysis. While the meta-analyst should read the whole article for each study, effect size extraction relates mainly to the results section of the study report. There are three main types of effect sizes: d, r, and OR (odds ratio), which represent the standardized mean difference between two groups, the correlation between two variables, and the odds of an event occurring in one condition relative to another, respectively. In a meta-analysis, the effect sizes from the primary studies serve as the dependent variable, and the independent variables are of two types: study-generated and synthesis-generated (Cooper, 2016). Study-generated independent variables are those investigated in primary studies, and these variables should be recorded intact. Synthesis-generated independent variables are not directly investigated in primary studies, that is, they are not manipulated as variables by primary researchers. Rather, they are created by the meta-analyst a posteriori based on the characteristics of the primary research. Synthesis-generated variables also include what Lipsey and Wilson (2001) call study descriptors, which refer to the methodological aspects of the primary research such as participants’ age, research setting, and so on.
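The three effect size types mentioned above can be computed from the descriptive statistics usually found in a results section. The following is a minimal sketch using standard textbook formulas (Cohen's d with a pooled standard deviation, Pearson's r, and the odds ratio from a 2 × 2 table); the function names, argument names, and numbers are hypothetical and serve only as illustration.

```python
import math


def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def odds_ratio(events_1, non_events_1, events_2, non_events_2):
    """Odds of the event in condition 1 relative to condition 2."""
    return (events_1 / non_events_1) / (events_2 / non_events_2)


# Hypothetical data: a treatment group and a control group of 30 learners each.
d = cohens_d(mean1=75, mean2=70, sd1=10, sd2=10, n1=30, n2=30)
# Hypothetical paired scores for four learners.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# Hypothetical counts: 20 of 30 learners succeed in condition 1, 10 of 30 in condition 2.
oratio = odds_ratio(events_1=20, non_events_1=10, events_2=10, non_events_2=20)

print(d, r, oratio)
```

In practice, primary studies often report only part of this information, in which case effect sizes have to be derived from other reported statistics; the tutorials cited earlier cover such conversions.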

Stage 4: Analysing the Data

In most meta-analyses, data analysis follows a two-step procedure: aggregation of all effect sizes that produces an estimate of the population effect, followed by moderator analysis exploring whether the variation of effect sizes is
due to systematic differences between subgroups of studies formed by treatment type, research setting, and so on. Effect size aggregation must be theoretically meaningful. For example, in Li’s (2015) meta-analysis on language aptitude, there were two types of studies based on different theoretical frameworks and conducted via two distinguishable methodologies. The effect sizes were aggregated separately because combining the two study types was theoretically unsound, albeit statistically feasible. In a meta-analysis, several issues need to be attended to for each analysis. One is to assign different weights to the included studies in proportion to their sample sizes such that large-sample studies carry more weight because they provide more accurate estimates of the population effect. The second is to make sure that one study contributes only one effect size for each aggregation, to prevent effect size inflation, Type I errors, and the violation of the “independence of data points” assumption of inferential statistics. In the event that a study contributes several effect sizes, it is advisable to either pick one or average them, depending on which choice suits the research question for that analysis. The third is to calculate a confidence interval for each mean effect size, which is the range within which the population effect is likely to fall. A confidence interval that does not include zero means the effect size is significant, and a narrow interval represents a robust effect. Fourth, for each mean effect size, it is necessary to report a measure of variability (either standard error or standard deviation) that shows the distribution of the aggregated effect sizes. Finally, a homogeneity test, called the Qw test (the subscript “w” stands for “within-group”), should be performed for each group of effect sizes to assess the distribution of the effect sizes.
Moderator analysis, which is alternatively called subgroup analysis, is often conducted through the Qb test (the subscript “b” stands for “between-group”), which is similar to one-way ANOVA, with the only difference being that the Qb test incorporates study weights in the calculation of coefficients and standard errors. A significant Qb value indicates significant differences between the subgroups of effect sizes, and post hoc pairwise Qb tests are then needed to locate the source of significance. In applied linguistics, a common practice is to use the confidence interval as a test of significance: lack of an overlap between two confidence intervals indicates that the mean effect sizes are significantly different. However, it must be pointed out that the converse is not true: an overlap does not mean the absence of a significant difference (see Cumming, 2012, for further details). Therefore, the confidence interval is at best a conservative, if not unreliable, test of statistical significance.
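The point about overlapping confidence intervals can be made concrete with a small Python sketch. It assumes a fixed-effect model with inverse-variance weights and uses invented effect sizes for two hypothetical subgroups; the function names are likewise illustrative. With these numbers the two 95% confidence intervals overlap, yet the Qb statistic exceeds 3.841, the chi-square critical value for one degree of freedom at the .05 level.

```python
import math


def group_summary(studies):
    """Return (weighted mean, total weight) for a list of (effect, variance)."""
    weights = [1.0 / v for _, v in studies]
    total_weight = sum(weights)
    mean = sum(w * e for w, (e, _) in zip(weights, studies)) / total_weight
    return mean, total_weight


def q_between(groups):
    """Qb: weighted squared deviations of subgroup means from the grand mean.

    groups: dict mapping a subgroup label to a list of (effect, variance).
    Returns (Qb, degrees of freedom), where df = number of subgroups - 1.
    """
    summaries = [group_summary(studies) for studies in groups.values()]
    grand_weight = sum(w for _, w in summaries)
    grand_mean = sum(m * w for m, w in summaries) / grand_weight
    qb = sum(w * (m - grand_mean) ** 2 for m, w in summaries)
    return qb, len(summaries) - 1


def ci95(mean, total_weight):
    """Normal-approximation 95% confidence interval for a weighted mean."""
    se = math.sqrt(1.0 / total_weight)
    return mean - 1.96 * se, mean + 1.96 * se


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented data for two hypothetical subgroups of studies.
groups = {
    "subgroup_a": [(0.1, 0.04), (0.2, 0.04), (0.4, 0.04), (0.5, 0.04)],
    "subgroup_b": [(0.4, 0.04), (0.5, 0.04), (0.7, 0.04), (0.8, 0.04)],
}
qb, df = q_between(groups)
ci_a = ci95(*group_summary(groups["subgroup_a"]))
ci_b = ci95(*group_summary(groups["subgroup_b"]))
print("Qb =", round(qb, 3), "df =", df)
print("CIs overlap:", intervals_overlap(ci_a, ci_b))
print("Qb exceeds 3.841:", qb > 3.841)
```

Relying on non-overlap alone would have missed this difference, which is why the overlap check is described as conservative.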


S. Li and H. Wang

Stage 5: Writing Up the Research Report

Similar to the report for an empirical study, a meta-analytic report includes the following sections: introduction, methods, results, discussion, and conclusion. The introduction should contextualize the meta-analysis by:

1. elaborating the relevant theories
2. defining the research problem and scope
3. identifying the central issues and controversies
4. explaining the rationale for examining the variables

The methods section reports information about search strategies, selection criteria, the coding protocol, and statistical procedures. Transparent reporting is a defining feature that distinguishes meta-analysis from traditional literature reviews; transparent reporting of methods and analytic procedures makes it possible for other researchers to replicate a meta-analysis to verify the findings.

The results section should include a summary of the methodological aspects of the synthesized studies to provide an overall picture of how studies in this domain have been conducted, followed by the results of the meta-analysis. Valentine, Pigott, and Rothstein (2010) recommended creating a table providing a description of the characteristics and results of each study including participant information, the treatment, and the calculated effect sizes. For the meta-analytic results, each mean effect size should be accompanied by the number of effect sizes (k), the confidence interval, a measure of dispersion (standard error or standard deviation), the results of a homogeneity test, and the p-value for the effect size. Results of a moderator analysis should include the number of effect sizes for each group/condition, the Qb value, and the related p-value.

In the discussion section, the meta-analyst may interpret the results “internally” with reference to the methods of the primary studies and “externally” with reference to theories and other reviews such as state-of-the-art traditional reviews and meta-analyses on this and similar topics. The results can also be discussed in terms of how they can be used to guide practice and policy-making as well as future research.

Traditional Review and Research Synthesis: Comparison and Integration

Given that traditional reviews and research syntheses have been discussed as two separate approaches in the literature, there seems to be a need to make a direct comparison between them. The purpose of the comparison is not only



to distinguish them but also to stimulate thoughts on how to accurately understand what the two approaches can achieve and how to overcome their limitations and maximize their strengths. Before comparing the two methods of review, it is helpful to provide a taxonomy of literature review as a genre, which, according to Cooper (2016), can be classified on six dimensions: focus, goal, perspective, coverage, organization, and audience. The focus of a review refers to the type of information to be included, which may take the form of theories, empirical findings, research methods, and policies/practices. The goals to accomplish may include summarizing existing knowledge, evaluating the validity of certain aspects of the primary research, or identifying issues central to the field. Perspective refers to whether a reviewer adopts a neutral position or is predisposed to a certain standpoint. Coverage concerns whether a review is based on all available studies or a selected set of studies. In terms of organization, a review can be structured based on the historical development of the research, the themes that emerged in the literature, or the methods utilized by subgroups of studies. The audience of a review may vary between researchers, practitioners, the general public, and so on. It is noteworthy that the options for each dimension can be applied jointly when identifying the characteristics of a review or making decisions on what type of review one plans to carry out. For example, the focus of a review can be on both theory and research findings. Similarly, a review can be structured historically, but the research within a certain period can be synthesized thematically. While Cooper’s scheme does not directly distinguish traditional reviews and research syntheses, it helps us understand the differences between the two approaches, which are distinguishable along the following lines (see Table 6.1).
First, traditional reviews are based on selected studies while research syntheses are based on all available studies. Certainly, research syntheses may also be selective, following certain screening criteria, but in general they tend to be more inclusive than traditional reviews. Second, the studies included in a traditional review are vetted via the reviewer’s expertise or authority; in a research synthesis, however, the primary studies are often assessed based on criteria for study quality. Third, traditional reviews do not follow protocols or procedures, whereas research synthesis follows rigorous methodology. Fourth, one important feature of research synthesis is its transparency in reporting the review procedure or process, which is absent in traditional reviews. A corollary is that a research synthesis is replicable, but a traditional review is not. Fifth, traditional reviews do not often have research questions and often seek to provide overarching descriptions of previous research; research syntheses, in contrast, aim to answer clearly defined research questions.

Table 6.1  A comparison between traditional reviews and research syntheses

Focus
  Traditional review: Theory, research, and practice
  Research synthesis: Research

Purpose
  Traditional review: To justify further research; identify central issues and research gaps; discuss state of the art; critique previous research
  Research synthesis: To answer one or more research questions by collating evidence from previous research; resolve controversies; guide practice

  Traditional review: Selective: most relevant and representative; no selection criteria
  Research synthesis: Inclusive: all relevant studies; based on systematic search and justified selection criteria

Methodology
  Traditional review: No prescribed methodology
  Research synthesis: With transparent methodology

  Traditional review: Organized by themes and patterns
  Research synthesis: Analogous to the template of a research report, including an introduction, methods, results, discussion, and conclusion

  Traditional review: Narrative, critical, and interpretive
  Research synthesis: Descriptive, inductive, and quantitative

  Traditional review: Flexible; appropriate when the purpose is to critique rather than aggregate research findings
  Research synthesis: Objective, transparent, and replicable; results may inform practice and policy-making

  Traditional review: Subjective; based on idiosyncratic methods; not replicable
  Research synthesis: Subject to the quality of available studies; comparing oranges and apples; time-consuming

The above arguments and observations seem to suggest that traditional reviews are subjective and may lead to false claims and that research syntheses are more objective and generate robust results that can guide practice and policy-making. However, it is arbitrary and inaccurate to discount traditional reviews as valueless, and their utility is justifiable on the following grounds. First, a traditional review is often part of a journal article, a master’s thesis, or a Ph.D. dissertation, and the primary purpose is to contextualize a new study and draw on existing research methods to answer the research questions of the current study. In this context, it seems less important to include all available studies and reach conclusions based on the totality of the research. Second, traditional reviews are flexible and can be used to synthesize knowledge other than research findings such as theories, practices, and policies. Third, traditional reviews are often critical. Thus, if the purpose of a review is to critique certain aspects of previous research such as the instruments used to measure a certain construct rather than collate information to show the effectiveness of a certain treatment or the presence of a certain relationship, then the traditional approach is more appropriate. Finally, traditional reviews are appropriate when (1) it is premature to conduct a research synthesis because of the lack of research, and (2) aggregation of evidence is not meaningful because the primary studies are carried out using heterogeneous methods.



So what do we make of the status quo of literature review as a field of research in relation to the two different review methods? First of all, the differences between the two methods are suggestive rather than conclusive, and they stand in a continuum rather than a dichotomy. For example, although traditional reviews are more likely to be critical, those adopting a more systematic approach may also critique certain aspects of the research based on their aggregated results. Therefore, it is better to consider the differences in terms of overall emphasis and orientation rather than treat them as polarized disparities. Second, the differences are based on what has been observed about two broad types of review. They are a posteriori and descriptive, not stipulated or prescriptive; in other words, they do not have to differ the way they are assumed to be different. For example, traditional reviews have been criticized for being subjective, but there is no reason why they cannot become more objective by following more rigorous procedures, such as conducting more thorough literature searches, including more studies that represent different perspectives, and so on. Finally, and importantly, we should find ways to integrate traditional literature reviews and more systematic approaches such as meta-analysis, and such initiatives have already taken place in our field. For example, Carpenter (2008) included a meta-analysis of the predictive validity of the Modern Language Aptitude Test in her Ph.D. dissertation when reviewing the literature on language aptitude. In this case, a small-scale meta-analysis is embedded in a traditional literature review to explore an important issue, which constitutes an assimilative approach where a meta-analysis plays a supplementary role. Another way is to adopt a more balanced approach where the two review methods are utilized to synthesize (1) different types of knowledge or (2) studies conducted with different methods.
In the case of (1), a good example is Xu, Maeda, Lv, and Jinther (2015), who used more traditional methods to synthesize the theoretical and methodological aspects and meta-analysis to aggregate the findings of the primary research. In the case of (2), an example is Li (2017), who conducted a comprehensive review of the research on teachers’ and learners’ beliefs on corrective feedback. The author meta-analysed the results of the studies conducted using similar methods (e.g., studies using a five-point Likert scale) but described the themes and patterns demonstrated by studies that deviated from the majority, such as those using a four- or three-point scale and those that used qualitative methods (e.g., observations, interviews, or diaries). Therefore, rather than make an a priori decision to carry out a certain type of review, we may customize our methodology based on the available data, using or integrating different approaches so as to provide a comprehensive, impartial view of the research domain.



Resources for Further Reading

Borenstein, M., Hedges, L., Higgins, J., & Rothstein, H. (2009). Introduction to meta-analysis. Chichester: John Wiley & Sons.

This book constitutes a comprehensive, practical guide for both novices and experts on how to carry out a meta-analysis. The book also discusses different practices and judgement calls one may face at each step, draws the reader’s attention to pitfalls, and makes recommendations on how to resolve issues and overcome challenges.

Galvan, J. (2014). Writing literature reviews: A guide for students of the social and behavioral sciences. Glendale, CA: Pyrczak Publishing.

Galvan’s book is an extremely useful source of information for students who have no experience in conducting a literature review and for teachers who want to teach students how to do a literature review. Like this chapter, Galvan dissects literature review into a multi-step process rather than treating it as a product that only concerns the write-up or the discourse features of a review. The book provides straightforward, accessible guidelines as well as concrete, real-world examples illustrating how to apply the guidelines.

References

American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: American Psychological Association.
Booth, W., Colomb, G., & Williams, J. (1995). The craft of research. Chicago: The University of Chicago Press.
Bronson, D., & Davis, T. (2012). Finding and evaluating evidence: Systematic reviews and evidence-based practice. New York: Oxford University Press.
Carpenter, H. S. (2008). A behavioural and electrophysiological investigation of different aptitudes for L2 grammar in learners equated for proficiency level. Ph.D. dissertation, Georgetown University.
Cooper, H. (2016). Research synthesis and meta-analysis: A step-by-step approach (5th ed.). Thousand Oaks, CA: SAGE.
Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. New York: Routledge.
Dixon, L., Zhao, J., Shin, J., Wu, S., Su, J., Burgess-Brigham, R., & Snow, C. (2012). What we know about second language acquisition: A synthesis from four perspectives. Review of Educational Research, 82(1), 5–60.
Ellis, R. (2002). Does form-focused instruction affect the acquisition of implicit knowledge? A review of the research. Studies in Second Language Acquisition, 24(2), 223–236.
Glass, G. (1976). Primary, secondary, and meta-analysis of research. Education Researcher, 5, 3–8.
Imel, S. (2011). Writing a literature review. In T. Rocco & T. Hatcher (Eds.), The handbook of scholarly writing and publishing (pp. 145–160). San Francisco, CA: Jossey-Bass.
Jesson, J., Matheson, L., & Lacey, F. (2011). Doing your literature review: Traditional and systematic approaches. Los Angeles, CA: SAGE.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60(2), 309–365.
Li, S. (2015). The associations between language aptitude and second language grammar acquisition: A meta-analytic review of five decades of research. Applied Linguistics, 36(3), 385–408.
Li, S. (2017). Teacher and learner beliefs about corrective feedback. In H. Nassaji & E. Kartchava (Eds.), Corrective feedback in second language teaching and learning (pp. 143–157). New York, NY: Routledge.
Li, S., Shintani, N., & Ellis, R. (2012). Doing meta-analysis in SLA: Practices, choices, and standards. Contemporary Foreign Language Studies, 384(12), 1–17.
Lipsey, M., & Wilson, D. (1993). The efficacy of psychological, educational, and behavioural treatment: Confirmation from meta-analysis. American Psychologist, 48(12), 1181–1209.
Lipsey, M., & Wilson, D. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Machi, L., & McEvoy, B. (2012). The literature review: Six steps to success. Thousand Oaks, CA: Corwin.
Ortega, L. (2015). Research synthesis. In B. Paltridge & A. Phakiti (Eds.), Research methods in applied linguistics: A practical resource (pp. 225–244). London: Bloomsbury.
Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35, 655–687.
Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: The case of interaction research. Language Learning, 61(2), 325–366.
Plonsky, L., & Oswald, F. L. (2015). Meta-analyzing second language research. In L. Plonsky (Ed.), Advancing quantitative methods in second language research (pp. 106–128). New York, NY: Routledge.
Sun, S., Pan, W., & Wang, L. (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102(4), 989–1004.
Valentine, J., Pigott, T., & Rothstein, H. (2010). How many studies do you need? A primer on statistical power for meta-analysis. Journal of Educational and Behavioral Statistics, 3(2), 215–247.
Xu, Y., Maeda, Y., Lv, J., & Jinther, A. (2015). Elicited imitation as a measure of second language proficiency: A narrative review and meta-analysis. Language Testing, 33(4), 497–528.
Zaporozhetz, L. (1987). The dissertation literature review: How faculty advisors prepare their doctoral candidates. Ph.D. dissertation, The University of Oregon.

7 Research Replication

Rebekha Abbuhl

Introduction

This chapter traces the history of replication research in the field of applied linguistics, culminating in a discussion of current views of replication research as a means of evaluating the internal and external validity of a study, providing better explanations of phenomena of interest, and ultimately, driving both theory and pedagogy forward. The chapter discusses different types of replication studies (e.g., exact, approximate, conceptual) with recent examples from the field. Controversies surrounding the use of this type of research, including whether replications of qualitative studies are possible or desirable, are discussed. The author concludes with current recommendations to facilitate replication research in applied linguistics.

History

Replication is so engrained in the physical and natural sciences that it rarely provokes discussion: it is taken as self-evident that repeating experiments is needed to confirm and test the generalizability of results (Hendrik, 1990).

R. Abbuhl, Department of Linguistics, California State University at Long Beach, Long Beach, CA, USA




However, the treatment of replications in applied linguistics (and in the social sciences in general) has been more contentious. As a relatively young field, applied linguistics has prioritized innovation and originality for most of its short history. This emphasis has spurred tremendous growth and an ever-expanding research agenda, but at the same time, it is at least partially responsible for the widespread view that replications are the poor cousins of original research (Porte, 2012). And despite early attempts to argue to the contrary (e.g., Polio & Gass, 1997; Santos, 1989; Valdman, 1993), replications have remained scarce in applied linguistics. Polio (2012b), for example, examined studies conducted between 1990 and 2009 in six major journals in the field and identified only 24 replication studies explicitly labeled as such. Similarly, Marsden and Mackey (2013) [as reported in Mackey & Marsden, 2016] found that in the database Language and Linguistic Behavior Abstracts, there were only 40 such studies dated between 1973 and 2013.

A series of developments has helped to shift that view in applied linguistics. In 2007, Language Teaching became the first journal in the field to have a strand specifically devoted to replications; a subsequent symposium organized by the editor of that journal, Graeme Porte, was held in 2009, covering various issues related to using, interpreting, and teaching replications. In 2012, the first book-length treatment of replications in the field was published (Porte, 2012), and since then renewed attention has been given to this topic (e.g., Basturkmen, 2014; Bikowski & Schulze, 2015; Bitchener & Knoch, 2015; Casanave, 2012; Chun, 2012; Gass & Valmori, 2015; King & Mackey, 2016; Markee, 2017; Matsuda, 2012; Mu & Matsuda, 2016; Plonsky, 2012, 2015; Polio, 2012a, 2012b; Porte, 2012, 2013; Porte & Richards, 2012; Sasaki, 2012; Schmitt, Cobb, Horst, & Schmitt, 2017; Smith & Schulze, 2013; Webb, 2015; Willans & Leung, 2016).

Current Issues

There are many factors behind the recent surge of interest in replications. One, of course, is the increasing maturity of applied linguistics as a field: discovery, for a young area of inquiry, is paramount, but for more mature fields, confirmation of results is equally important (Sakaluk, 2016). The surge of interest may also be related to the field’s growing sophistication with respect to its use and interpretation of statistics. Traditionally, a statistically significant finding was considered tantamount to a generalizable finding.



However, as a number of researchers have pointed out, this is not the case; external replications are needed to provide information about the generalizability of any given result (e.g., Nassaji, 2012). Another reason for the increasing attention to replication is the current replicability crisis facing the sciences in general and the social sciences, such as psychology, in particular. Controversial claims have been made that most published research results are false (e.g., Ioannidis, 2005; see also Pashler & Harris, 2012). In addition, there have been highly publicized cases of fraud (see e.g., King & Mackey, 2016; Pashler & Wagenmakers, 2012, for discussions) and questionable research practices (see e.g., John, Loewenstein, & Prelec, 2012; Simmons, Nelson, & Simonsohn, 2011, for overviews). These questionable research practices include not mentioning all dependent measures, selectively reporting statistically significant results (and omitting those that are not), misrepresenting p-values (e.g., reporting that a p-value greater than a particular criterion, say 0.05, was actually less than that), claiming that an unexpected finding had actually been predicted from the beginning, stopping data collection when the hoped-for result had been obtained, excluding data in order to obtain the hoped-for result, and falsifying data (John et al., 2012). In John et al.’s (2012) survey of over 2000 psychologists, the majority admitted to engaging in one or more of these practices, leading John et al. to conclude that “these practices may constitute the de facto scientific norm” (p. 524). It has long been noted that the social sciences have a “file drawer” problem (Rosenthal, 1979, p. 638), in that statistically significant results are far more likely to be published than those that are not.
This may help explain the prevalence of these questionable research practices: if publication is more likely when the results are statistically significant, and if career considerations (e.g., retention, tenure, promotion) hinge on publications, then it is little wonder that such practices continue. As these practices inflate Type I errors (i.e., false positives, or erroneously rejecting the null hypothesis), researchers in many fields have called for the use of replications as a safeguard mechanism (e.g., Burman, Reed, & Alm, 2010, for public finance; Benson & Borrego, 2015, for engineering education; McNeeley & Warner, 2015, for criminology; Mezias & Regnier, 2007, for management, among many others). Efforts across a range of fields have also been made to encourage more replications, as we will discuss below. Nevertheless, controversy has continued, in part because of the misunderstandings surrounding the definition and interpretation of replications. We turn to the issue of definition in the next section.
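How one of the practices listed above (stopping data collection once the hoped-for result appears) inflates Type I errors can be illustrated with a short simulation. The Python sketch below is an illustration only: it assumes a true null effect, normally distributed scores with a known standard deviation of 1, a two-sided z-test, and a hypothetical researcher who checks the result after every 10 participants up to 100. None of these settings come from the studies cited.

```python
import random
from statistics import NormalDist


def p_value(sample):
    """Two-sided z-test of 'mean = 0', assuming a known SD of 1."""
    n = len(sample)
    z = (sum(sample) / n) * n ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_experiments=2000, max_n=100, step=10, alpha=0.05, seed=1):
    """Compare false positive rates with and without optional stopping.

    Returns (rate when testing once at max_n,
             rate when testing after every `step` participants and
             stopping at the first p < alpha).
    """
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_experiments):
        # The null hypothesis is true: scores are drawn from N(0, 1).
        data = [rng.gauss(0.0, 1.0) for _ in range(max_n)]
        if p_value(data) < alpha:
            fixed_hits += 1
        looks = range(step, max_n + 1, step)
        if any(p_value(data[:n]) < alpha for n in looks):
            peeking_hits += 1
    return fixed_hits / n_experiments, peeking_hits / n_experiments


fixed_rate, peeking_rate = simulate()
print("False positive rate, single test at n = 100:", fixed_rate)
print("False positive rate, stopping at first p < .05:", peeking_rate)
```

Because the final look is the same test as the single fixed-sample test, the optional-stopping rate can never be lower than the fixed-sample rate in this design; in practice it is noticeably higher.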



Overview of Design

A tripartite distinction is typically made between exact, approximate, and conceptual replications (Abbuhl, 2012a, 2012b; Language Teaching Review Panel, 2008; Porte, 2012; Porte & Richards, 2012; see, however, Polio, 2012b, for a slightly different classification). It should be understood from the outset, however, that these three types exist along a continuum, from the most faithful to the original study (exact) to the least (conceptual). Each of these will be discussed in turn below.

The least common (but best known) type of replication in applied linguistics is the exact replication (also known as direct or literal). This involves identifying a methodologically sound study and repeating the study as exactly as possible (i.e., the same tasks, setting, and type of participant) in order to confirm the original findings. It is recognized that some small changes will be inevitable (e.g., the gap in time) even though every effort is made to be as faithful to the original study as possible (Earp & Trafimow, 2015).

In the field of applied linguistics and in the social sciences in general, approximate replications are much more common than exact replications (for recent examples, see Booth, 2013; Johnson & Nicodemus, 2015). In this type of replication, the original study’s methodology is adhered to in most respects, but one or two of the non-major variables (such as the context, participant first language [L1], or task) are changed in order to determine the generalizability of the original results. In other words, there is a deliberate change in order to determine whether the results of the original study hold for a new population, setting, language, and so on. Sample Study 7.1 provides an example of an approximate replication.

Sample Study 7.1

Johnson, M., & Nicodemus, C. (2015). Testing a threshold: An approximate replication of Johnson, Mercado & Acevedo 2012. Language Teaching, 49(2), 251–274.
Research Background

One question in the second language (L2) writing literature concerns the effect of pre-task planning on written production. The original study (Johnson, Mercado, & Acevedo, 2012) sought to examine the effect of pre-task planning on the fluency and complexity of L2 writing. Johnson et al. (2012) found that although pre-task planning had a small effect on writing fluency, it did not significantly affect grammatical or lexical complexity.

Research Problems/Gaps

As their results differed from those found in previous studies, Johnson et al. (2012) suggested that the characteristics of their participants (e.g., their relatively low proficiency in the L2) may have been partially responsible. In particular, they hypothesized that writers who have not reached a certain level of L2 proficiency may not benefit from pre-task planning as their cognitive resources may already be stretched by composing in the L2.

Research Method

• To test this hypothesis, Johnson and Nicodemus (2015) conducted an approximate replication of the original study, changing the participants to a group of L1 writers.
• The researchers divided the 90 L1 writers into a control group with no pre-task planning and three groups with different types of pre-task planning.
• The writing produced by the students was analyzed in the same manner as the original study, with the fluency, grammatical complexity, and lexical complexity calculated.

Key Results

Johnson and Nicodemus (2015) found that pre-task planning had no effect on the written fluency, grammatical complexity, or lexical complexity of their L1 writers. This result led the researchers to conclude that Johnson et al.’s (2012) hypothesis (that writers need to have a certain level of proficiency in the language in order to benefit from pre-task planning) could not be supported.

Comments

This study provides evidence that approximate replications have a role to play beyond determining the generalizability of an original study. Had the results of Johnson and Nicodemus’ (2015) replication differed markedly from those in the original (viz., if there had been a large effect for pre-task planning for the L1 writers), this would have supported the original study’s explanation for their results (viz., that their L2 participants had not reached a threshold level of proficiency to benefit from the planning). The consistency of results between the replication and original allowed Johnson and Nicodemus (2015) to refute the original study’s explanation of their results.

At the least faithful end of the continuum is the conceptual replication. Here, a new research design (which may involve changing the independent and/or dependent variables, operationalizations of phenomena, and participant population) is used to investigate the same general idea or hypothesis as the original study. The purpose of such a replication may be to test the generalizability of relationships to new sets of variables within a larger model, or alternatively, to determine to what extent the findings of the original study were artifacts of its own methodology. Recent examples of this type of replication include Frankenberg-Garcia (2014), Leow (2015), Lim and Godfroid (2015), Rott and Gavin (2015), and Yoon and Polio (2017). An example of a conceptual replication can be found in Sample Study 7.2.



Sample Study 7.2

Yoon, H.-J., & Polio, C. (2017). The linguistic development of students of English as a second language in two written genres. TESOL Quarterly, 51(2), 275–301.

Research Background

The original study, Lu (2011), used 14 different measures of syntactic complexity to examine a corpus of writing produced by English as a second language (ESL) writers at four different levels of proficiency. The corpus contained both timed and untimed essays, 16 different topics, and different genres (argumentative and narrative). One of the main findings was that there was greater syntactic complexity in the argumentative essays than in the narrative essays.

Research Problems/Gaps

Yoon and Polio (2017) sought to determine whether differences observed in Lu (2011) were most likely due to developmental reasons or to genre factors.

Research Method

• In a conceptual replication of Lu (2011), Yoon and Polio (2017) examined a range of writing development measures, including syntactic complexity, syntactic accuracy, lexical complexity, and fluency on two different topics.
• They recruited ESL students (N = 37, at the top two levels of a university program) and a group of native speakers (N = 46) to see which differences could be attributed to genre and which to proficiency.
• A longitudinal component was added to examine changes over time.

Key Results

Yoon and Polio (2017) found interactions between genre and time in the L2 essays. For example, in the narrative essays, there was a significant increase in two of the syntactic complexity measures; however, no such increase was seen in the argumentative essays. Comparing the genres, the researchers found that several syntactic complexity measures were greater in the argumentative writing than in the narrative writing. Differences between the genres were also seen in the lexical complexity and lexical diversity measures (but not in fluency or in accuracy). Similar patterns were also seen in the native speaker writings.

Comments

Yoon and Polio’s supportive conceptual replication provided evidence that Lu’s (2011) results were not simply artifacts of the writing development measures used or the proficiency level of the participants. Given the scarcity of research on genre effects in the field of L2 writing, Yoon and Polio’s results highlight the importance of addressing this variable when investigating writing development.



Challenges

This section presents several challenges in replication research, such as interpretations of research findings, qualitative research replications, bias issues, and reporting and data sharing issues.

Interpretation of Results

One of the most common concerns regarding replication research centers on the interpretation of results. Replications are typically classified as “successes” or “failures” depending on whether or not they report the same findings as the original study. In the case of a “successful” exact replication, the findings of the original study are confirmed and researchers can have more confidence in the internal and external validity of the original study (e.g., Language Teaching Review Panel, 2008). However, it has also been maintained that such findings lack novelty and that the researcher is merely (and unimaginatively) verifying the results of the original study (Hendrik, 1990). As Bronstein (1990) puts it, in such a case “you’ve just demonstrated something that we already knew” (p. 73). If the exact replication “fails,” on the other hand, researchers may be tempted to conclude that the findings of the original study are false and need to be overturned. Recent researchers have pointed out that this is not the case: replication “failures” may be due to many reasons, including undetected moderators (i.e., subtle and currently unknown factors that affect the results), human error, low statistical power in the replication, and/or other factors (e.g., Makel & Plucker, 2015; Maxwell, Lau, & Howard, 2015; Open Science Collaboration, 2012). Nevertheless, this uncertainty contributes to the view that unsuccessful replications are “uninterpretable” (Bronstein, 1990, p. 74). However, it needs to be kept in mind that just as no single original study is proof of a particular phenomenon (it is only evidence for it), no single replication will either prove or falsify the original study. Replications should be seen as informative rather than as conclusive (Earp & Trafimow, 2015). For example, if an exact replication obtains the same findings as the original study, it should be seen as additional evidence that the original study is valid. If it does not reach the same findings, it should be seen as evidence that more research is necessary (Crandall & Sherman, 2016), not as a failure. In this light, the terms success and failure should be replaced with supportive and non-supportive.



Moving on to the next type of replication, a supportive approximate replication should be seen as providing evidence that the results of the original study can be generalized, for example, to a new language, population, or context. A non-supportive approximate replication may suggest that the original results cannot be generalized, but more importantly, it should serve as a call for further research on the topic. Similarly, a supportive conceptual replication can provide evidence that a particular finding is not an artifact of the original study’s methodology; a non-supportive conceptual replication should again serve as a springboard for more research. Were the differences in methodologies responsible for the non-supportive results? Were there undetected moderators, and if so, what are they? Addressing these questions, as Crandall and Sherman (2016) note, can provide opportunities for theoretical advances.

Qualitative Research Replications

Another controversy surrounding replications is whether they are possible or even desirable for qualitative studies (e.g., Casanave, 2012; Matsuda, 2012; Sasaki, 2012). Traditionally, the argument against qualitative replications was that the verification of results in this type of inquiry was a reductionist and fruitless endeavor: the particularities of the researcher, the participant(s), and the setting would make it highly unlikely that another researcher could study the same participant (or even the same type of participant) in the same setting using the same methodological tools and come up with the same results. In addition, as Schofield (2002) notes, “[t]he goal is not to produce a standardized set of results that any other careful researcher in the same situation or studying the same issue would have produced. Rather it is to produce a coherent and illuminating description of and perspective on a situation that is based on and consistent with detailed study of that situation” (p. 174, italics in original). However, much of this concern over the verification of qualitative findings (and hence the antipathy to replications among some qualitative researchers) seems to stem from conflating replication with exact replication (Porte & Richards, 2012). Due to the interpretive and situated nature of qualitative inquiry, researchers’ concerns over exact replications are not without merit. This does not mean, however, that all qualitative replications should be abandoned or even that qualitative exact replications are impossible. For example, let’s say that an instrumental case study was conducted on a small group of English as a second language (ESL) writers to determine their perceptions of written corrective feedback as they progressed through university. In contrast to an intrinsic case study (where the primary focus is to explore and understand the case at hand, without an attempt to generalize across cases), in an instrumental case study, understanding the phenomena being studied is more important than the cases themselves (Stake, 2005). Let’s further say that the researcher made full use of sound qualitative research practices (e.g., rich description, triangulation, audit trails, coding checks, member checking, long-term observation). With such a study, readers can make determinations as to whether any of the information provided applies to their own local context. However, the study may also raise some interesting questions: for example, if another researcher used the same sound methodology that the original researcher employed, what perceptions would she or he uncover in a slightly different population, say English as a foreign language writers? Would the context impact the students’ perceptions? A supportive approximate replication here would suggest that context plays a minimal role in student attitudes toward written corrective feedback; a non-supportive approximate replication, rather than being a failure, would uncover both theoretically and pedagogically interesting information: in particular, that context does play a role. In a similar vein, a conceptual replication of our ESL writer study could involve changing aspects of the methodology, for example, replacing researcher-guided interviews with peer-guided ones or supplementing the interviews with diary data. A supportive conceptual replication here would suggest that the methodology does not exert undue influence over the students’ responses. A non-supportive conceptual replication, rather than undermining the original study or proving it false, would suggest that methodological changes can influence student responses. But what about exact or nearly exact replications in qualitative research?
If a qualitative researcher were to take this approach, she or he could investigate the same type of participant, in the same setting, using the same methodology (all the while recognizing that the particularities of the researcher and participant could affect the results). A supportive exact replication here would provide evidence that these particularities are not major factors when studying the phenomenon at hand; a non-supportive exact replication would provide the equally valuable evidence that those factors do likely play an important role. Future studies could then work to determine how much of a role they play and under what conditions they exert an influence. When we abandon using the terms success and failure to describe replications, and when we recognize that replicating some types of qualitative studies (e.g., instrumental case studies) could offer valuable information to the field, some of the current debate over qualitative replications may be dispelled. As Porte and Richards (2012) note, this kind of reconceptualization could help reconfigure “transferability issues in [qualitative research], moving away from questions about whether findings might be transferable and toward the issues of transferability itself: In what respects cases might be transferable, for what reasons they might be transferable, etc.” (p. 289). It is an approach that a number of qualitative researchers have advocated (e.g., Golden, 1995; Markee, 2017; Schofield, 2002), and one that may open up additional areas of inquiry.

Bias Issues

Another challenge concerning replication studies is the bias issue. As Makel and Plucker (2015) note, “there is … a catch-22: Replication is good for the field but not necessarily for the individual researcher” (p. 158). Researchers have long noted that replications are a low-prestige endeavor, commonly associated with a lack of creativity and a confrontational personality (e.g., Bronstein, 1990; Earp & Trafimow, 2015; Easley, Madden, & Dunn, 2000; Hubbard & Armstrong, 1994; Makel & Plucker, 2015; Makel, Plucker, & Hegarty, 2012; Mezias & Regnier, 2007; Neuliep & Crandall, 1990). Real or perceived editorial biases, as well as concerns over external funding and career advancement, may further dissuade researchers from conducting replications (Bronstein, 1990; Earp & Trafimow, 2015; Neuliep & Crandall, 1990). Recent events in psychology (where the replication author was attacked via social media and threatened by the original author) also lend support to the view that replications continue to be a risky endeavor (see LeBel, 2015, for a discussion). There are signs that the situation may be improving (Mu and Matsuda’s [2016] survey of second language writing researchers did not uncover biases toward replications), but it would be premature at this point to declare that original and replication studies are on equal footing.

Reporting and Data Sharing Issues

Another issue that impacts the field’s ability to conduct and interpret replications concerns reporting practices. Over the years, numerous calls have been made for authors to report full methodological details in their studies so as to facilitate replication (e.g., Polio & Gass, 1997). Researchers interested in conducting exact replications, of course, need these details to make the replication as faithful to the original as possible, but even researchers doing approximate or conceptual replications require this information in order to interpret any differences in findings between their own studies and the original. In the past, full methodological information was often excluded due to page restrictions imposed by journals, but with the advent of online supplementary materials and databases, such restrictions are gradually proving to be less of a barrier. Recent researchers have also advised authors to be more transparent with respect to data analysis. Quantitative researchers have been urged to be more consistent in their reporting of descriptive statistics (means, standard deviations), confidence intervals, effect size, and statistical power (e.g., Larson-Hall & Plonsky, 2015; Liu & Brown, 2015; Plonsky, 2013); qualitative researchers have been encouraged to provide more information on their coding processes and analytical decisions (e.g., Moravcsik, 2014; Richards, 2009). Such practices will help not only in the interpretation of results but also in the comparison of studies, which is central to both replications and meta-analyses. Related to these reporting concerns is the issue of data sharing. Previous studies have provided evidence that authors may be reluctant to share their raw data (or, in cases of storage failure, unable to) (e.g., Plonsky, Egbert, & LaFlair, 2015). Sharing data (when not prohibited by privacy concerns) has been encouraged by a number of bodies, including the American Psychological Association and the Linguistic Society of America.

Future Directions

As noted by Reis and Lee (2016), “pointing to the importance of replication is a little bit like arguing that we should help the elderly and babies: everyone agrees in principle but how to make it happen can often become vexing” (p. 2). Thankfully, in recent years great strides have been made in applied linguistics and in other fields to facilitate replication research. With respect to making full methodological information available, a number of recent developments have occurred, from journals explicitly stating that “Methods sections must be detailed enough to allow the replication of research” (from the journal Language Learning) to the use of supplementary online materials. The latter can range from additional files posted to the publishing journal’s website (as in the case of Language Learning, for example) to large-scale repositories containing materials used in published research (e.g., the Instruments for Research into Second Languages [IRIS], Marsden, Mackey, & Plonsky, 2016). A number of journals (e.g., Language Teaching and Studies in Second Language Acquisition) have also actively called for replication studies. Language Teaching has taken the additional step of publishing articles that provide recommendations as to which articles in the field should be replicated, along with recommendations as to how to carry out those replications (e.g., Bitchener & Knoch, 2015; Gass & Valmori, 2015; Schmitt et al., 2017). There have been interesting developments outside the field of applied linguistics as well. For example, in psychology, researchers have suggested that a “replication norm” be established, encouraging researchers “to independently replicate important findings in their own research areas in proportion to the number of original studies they themselves publish per year” (LeBel, 2015, p. 1). Psychologists have also created websites that allow researchers to upload and view unpublished replications in their field. Actively encouraging replications (whether through a replication norm or otherwise), as well as developing methods of making such studies publicly available, will help remove some of the real and perceived barriers to replication research. Another suggestion from the field of psychology is to “explore small and confirm big” (Sakaluk, 2016, p. 47). According to Sakaluk, in the initial stages of exploration, small-scale studies would help develop hypotheses about particular phenomena. In the confirmatory stages, researchers would then seek to replicate the findings from the first stage using large samples (thus, if a non-significant finding was found, researchers could rule out low statistical power as being responsible). If individual researchers do not have access to sufficient participants for the confirming stage, a many labs approach (e.g., Klein et al., 2014; Open Science Collaboration, 2012; Schweinsberg et al., 2016) could be taken. Here, multiple independent labs pre-register their hypotheses, materials, and data analysis plans (so as to minimize the questionable research practices discussed above). The study would then be carried out, and the results of the separate replications aggregated in a meta-analysis.
The beginnings of such a project in applied linguistics are discussed in Marsden et al. (2016). Pre-registered protocols could also be used to minimize the file drawer problem. For example, LeBel (2015) noted that in several journals in psychology, the results of pre-registered studies are published regardless of the results. Replications can help researchers verify study results, examine questions of generalizability, and in the process, better understand phenomena of interest. Perhaps most importantly, though, accepting replications, and in particular multiple replications, as a normal part of the research process can help bring home the point that no single study has the ability to conclusively separate the spurious from the real.
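
To make the aggregation step of a many labs design more concrete, the following is a minimal sketch of how effect sizes from several independent replications might be pooled. It assumes a simple fixed-effect (inverse-variance) model, which is only one of several options; actual many labs projects often use random-effects models instead. The function name, the effect sizes, and the sampling variances are all hypothetical and chosen purely for illustration.

```python
import math


def pool_fixed_effect(effects, variances):
    """Pool per-lab effect sizes with inverse-variance (fixed-effect) weights.

    effects   -- one effect size per lab (e.g., a standardized mean difference)
    variances -- the sampling variance of each effect size
    Returns the pooled effect, its standard error, and a 95% confidence interval.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("Need exactly one variance per effect size.")
    if any(v <= 0 for v in variances):
        raise ValueError("Sampling variances must be positive.")

    # More precise labs (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return pooled, se, ci


# Hypothetical results from four labs replicating the same study.
lab_effects = [0.30, 0.10, 0.25, 0.15]
lab_variances = [0.04, 0.02, 0.05, 0.01]

pooled, se, ci = pool_fixed_effect(lab_effects, lab_variances)
print(f"Pooled effect: {pooled:.3f} (SE = {se:.3f}, "
      f"95% CI [{ci[0]:.3f}, {ci[1]:.3f}])")
```

Because the pooled standard error shrinks as labs are added, a combined estimate of this kind is less vulnerable to the low statistical power that can make a single non-supportive replication hard to interpret.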



Resources for Further Reading

Larson-Hall, J., & Plonsky, L. (2015). Reporting and interpreting quantitative research findings: What gets reported and recommendations for the field. Language Learning, 65(S1), 127–159.
The authors provide recommendations concerning data reporting that will facilitate study interpretation, meta-analyses, and replications.

Mackey, A., & Marsden, E. (2016). Advancing methodology and practice: The IRIS repository of instruments for research into second languages. New York: Routledge.
Recognizing the importance of data sharing and transparency in the field, Mackey and Marsden present a collection of articles based on the Instruments for Research into Second Languages (IRIS), a publicly available repository for materials used in second language research.

Markee, N. (2017). Are replication studies possible in qualitative second/foreign language classroom research? A call for comparative re-production research. Language Teaching, 50(3), 367–383.
In this article, Markee presents a strategy for replicating qualitative studies in second language classroom research (the comparative re-production method, a common practice in ethnomethodological conversation analysis).

Porte, G. (Ed.). (2012). Replication research in applied linguistics. Cambridge: Cambridge University Press.
This edited collection of articles is the first (and to date, only) book-length treatment of replications in the field of applied linguistics. The collection covers a variety of topics, including the history of replications in applied linguistics, the relationship between statistical interpretations and replications, teaching replications at the graduate level, and writing them up. Two examples of replication studies are also provided.



References

Abbuhl, R. (2012a). Practical methods for teaching replication to applied linguistics studies. In G. Porte (Ed.), Replication research in applied linguistics (pp. 135–150). Cambridge: Cambridge University Press.
Abbuhl, R. (2012b). Why, when, and how to replicate research. In A. Mackey & S. Gass (Eds.), Research methods in second language acquisition (pp. 296–312). Oxford: Wiley-Blackwell.
Basturkmen, H. (2014). Replication research in comparative genre analysis in English for academic purposes. Language Teaching, 47(3), 377–386.
Benson, L., & Borrego, M. (2015). The role of replication in engineering education research. Journal of Engineering Education, 104(4), 388–392.
Bikowski, D., & Schulze, M. (2015). Replication and evaluation in CALL. CALICO Journal, 32(2), i–v.
Bitchener, J., & Knoch, U. (2015). Written corrective feedback studies: Approximate replication of Bitchener & Knoch (2010a) and Van Beuningen, De Jong & Kuiken (2012). Language Teaching, 48(3), 405–414.
Booth, P. (2013). Vocabulary knowledge in relation to memory and analysis: An approximate replication of Milton’s (2007) study on lexical profiles and learning style. Language Teaching, 46(3), 335–354.
Bronstein, R. (1990). Publication politics, experimenter bias and the replication process in social science research. Journal of Social Behavior and Personality, 5(4), 71–81.
Burman, L., Reed, W., & Alm, J. (2010). A call for replication studies. Public Finance Review, 38(6), 787–793.
Casanave, C. (2012). Heading in the wrong direction? A response to Porte and Richards. Journal of Second Language Writing, 21(3), 296–297.
Chun, D. (2012). Replication studies in CALL research. CALICO Journal, 29(4), 591–600.
Crandall, C., & Sherman, J. (2016). On the scientific superiority of conceptual replications for scientific progress. Journal of Experimental Social Psychology, 66, 93–99.
Earp, B., & Trafimow, D. (2015). Replication, falsification, and the crisis of confidence in social psychology. Frontiers in Psychology, 6(621), 1–11.
Easley, R., Madden, C., & Dunn, M. (2000). Conducting market science: The role of replication in the research process. Journal of Business Research, 48(1), 83–92.
Frankenberg-Garcia, A. (2014). The use of corpus examples for language comprehension and production. ReCALL, 26(2), 128–146.
Gass, S., & Valmori, L. (2015). Replication in interaction and working memory research: Révész (2012) and Goo (2012). Language Teaching, 48(4), 545–555.
Golden, M. (1995). Replication and non-quantitative research. PS: Political Science and Politics, 28(3), 481–483.



Hendrik, C. (1990). Replications, strict replications, and conceptual replications: Are they important? Journal of Social Behavior and Personality, 5(4), 41–49.
Hubbard, R., & Armstrong, J. (1994). Replications and extensions in marketing: Rarely published but quite contrary. International Journal of Research in Marketing, 11(3), 233–248.
Ioannidis, J. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.
John, L., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532.
Johnson, M., Mercado, L., & Acevedo, A. (2012). The effect of planning sub-processes on L2 writing fluency, grammatical complexity, and lexical complexity. Journal of Second Language Writing, 21(3), 264–282.
Johnson, M., & Nicodemus, C. (2015). Testing a threshold: An approximate replication of Johnson, Mercado & Acevedo 2012. Language Teaching, 49(2), 251–274.
King, K., & Mackey, A. (2016). Research methodology in second language studies: Trends, concerns, and new directions. Modern Language Journal, 100(s1), 209–227.
Klein, R. A., Ratliff, R. A., Vianello, M., Adams, R. A., Jr., Bahník, S., Bernstein, M. J., … Nosek, B. A. (2014). Investigating variation in replicability: A “Many Labs” replication project. Social Psychology, 45(3), 142–152.
Language Teaching Review Panel. (2008). Replication studies in language learning and teaching: Questions and answers. Language Teaching, 41(1), 1–14.
Larson-Hall, J., & Plonsky, L. (2015). Reporting and interpreting quantitative research findings: What gets reported and recommendations for the field. Language Learning, 65(S1), 127–159.
LeBel, E. (2015). A new replication norm for psychology. Collabra, 1(1), 1–13.
Leow, R. (2015). The roles of attention and (un)awareness in SLA: Conceptual replication of N. C. Ellis & Sagarra (2010a) and Leung & Williams (2012). Language Teaching, 48(1), 117–129.
Lim, H., & Godfroid, A. (2015). Automatization in second language sentence processing: A partial, conceptual replication of Hulstijn, Van Gelderen, and Schoonen’s 2009 study. Applied Psycholinguistics, 36(5), 1247–1282.
Liu, Q., & Brown, D. (2015). Methodological synthesis of research on the effectiveness of corrective feedback in L2 writing. Journal of Second Language Writing, 30, 66–81.
Lu, X. (2011). A corpus-based evaluation of syntactic complexity measures as indices of college-level ESL writers’ language development. TESOL Quarterly, 45(1), 36–62.
Mackey, A., & Marsden, E. (2016). Advancing methodology and practice: The IRIS repository of instruments for research into second languages. New York: Routledge.



Makel, M., & Plucker, J. (2015). An introduction to replication research in gifted education: Shiny and new is not the same as useful. Gifted Child Quarterly, 59(3), 157–164.
Makel, M., Plucker, J., & Hegarty, B. (2012). Replications in psychology research: How often do they really occur? Perspectives on Psychological Science, 7(6), 532–542.
Markee, N. (2017). Are replication studies possible in qualitative second/foreign language classroom research? A call for comparative re-production research. Language Teaching, 50(3), 367–383.
Marsden, E., & Mackey, A. (2013, June). IRIS and replication. Paper presented at the 9th International Symposium on Bilingualism, Singapore.
Marsden, E., Mackey, A., & Plonsky, L. (2016). The IRIS repository: Advancing research practice and methodology. In A. Mackey & E. Marsden (Eds.), Advancing methodology and practice: The IRIS repository for research into second languages (pp. 1–21). New York: Routledge.
Matsuda, P. (2012). On the nature of second language writing: Replication in a postmodern field. Journal of Second Language Writing, 21(3), 300–302.
Maxwell, S., Lau, M., & Howard, G. (2015). Is psychology suffering from a replication crisis? What does “failure to replicate” really mean? American Psychologist, 70(6), 487–498.
McNeeley, S., & Warner, J. (2015). Replication in criminology: A necessary practice. European Journal of Criminology, 12(5), 581–597.
Mezias, S., & Regnier, M. (2007). Walking the walk as well as talking the talk: Replication and the normal science paradigm in strategic management research. Strategic Organization, 5(3), 283–296.
Moravcsik, A. (2014). Transparency: The revolution in qualitative research. PS: Political Science & Politics, 47(1), 48–53.
Mu, C., & Matsuda, P. (2016). Replication in L2 writing research: Journal of Second Language Writing authors’ perceptions. TESOL Quarterly, 50(1), 201–219.
Nassaji, H. (2012). Significance tests and generalizability of research results: A case for replication. In G. Porte (Ed.), Replication research in applied linguistics (pp. 92–115). Cambridge: Cambridge University Press.
Neuliep, J., & Crandall, R. (1990). Editorial bias against replication research. Journal of Social Behavior and Personality, 5(4), 85–90.
Open Science Collaboration. (2012). An open, large-scale, collaborative effort to estimate the reproducibility of psychological science. Perspectives on Psychological Science, 7(6), 657–660.
Pashler, H., & Harris, C. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7(6), 531–536.
Pashler, H., & Wagenmakers, E.-J. (2012). Editors’ introduction to the special section on replicability in psychological science: A crisis of confidence? Perspectives on Psychological Science, 7(6), 528–530.



Plonsky, L. (2012). Replication, meta-analysis, and generalizability. In G. Porte (Ed.), Replication research in applied linguistics (pp. 116–132). Cambridge: Cambridge University Press.
Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35(4), 655–687.
Plonsky, L. (2015). Quantitative considerations for improving replicability in CALL and applied linguistics. CALICO Journal, 32(2), 232–244.
Plonsky, L., Egbert, J., & LaFlair, G. (2015). Bootstrapping in applied linguistics: Assessing its potential using shared data. Applied Linguistics, 36(5), 591–610.
Polio, C. (2012a). No paradigm wars please! Journal of Second Language Writing, 21(3), 294–295.
Polio, C. (2012b). Replication in published applied linguistics research: An historical perspective. In G. Porte (Ed.), Replication research in applied linguistics (pp. 47–91). Cambridge: Cambridge University Press.
Polio, C., & Gass, S. (1997). Replication and reporting: A commentary. Studies in Second Language Acquisition, 19(4), 499–508.
Porte, G. (2012). Replication research in applied linguistics. Cambridge: Cambridge University Press.
Porte, G. (2013). Who needs replication research? CALICO Journal, 30(1), 10–15.
Porte, G., & Richards, K. (2012). Focus article: Replication in second language writing research. Journal of Second Language Writing, 21(3), 284–293.
Reis, H., & Lee, K. (2016). Promise, peril, and perspective: Addressing concerns about reproducibility in social-personality psychology. Journal of Experimental Social Psychology, 66, 148–152.
Richards, K. (2009). Trends in qualitative research in language teaching since 2000. Language Teaching, 42(2), 147–180.
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641.
Rott, S., & Gavin, B. (2015). Comprehending and learning from Internet sources: A conceptual replication study of Goldman, Braasch, Wiley, Greasser and Brodowinska (2012). CALICO Journal, 32(2), 323–354.
Sakaluk, J. (2016). Exploring small, confirming big: An alternative system to the new statistics for advancing cumulative and replicable psychological research. Journal of Experimental Social Psychology, 66, 47–54.
Santos, T. (1989). Replication in applied linguistics research. TESOL Quarterly, 23(4), 699–702.
Sasaki, M. (2012). An alternative approach to replication studies in second language writing: An ecological perspective. Journal of Second Language Writing, 21(3), 303–305.
Schmitt, N., Cobb, T., Horst, M., & Schmitt, D. (2017). How much vocabulary is needed to use English? Replication of van Zeeland & Schmitt (2012), Nation (2006) and Cobb (2007). Language Teaching, 50(2), 212–226.



Schofield, J. (2002). Increasing the generalizability of qualitative research. In A. Huberman & M. Miles (Eds.), The qualitative researcher’s companion (pp. 171–203). Thousand Oaks, CA: SAGE.
Schweinsberg, M., Madan, N., Vianello, M., Amy Sommer, S., Jordan, J., Tierney, W., … Uhlmann, E. L. (2016). The pipeline project: Pre-publication independent replications of a single laboratory’s research pipeline. Journal of Experimental Social Psychology, 66, 55–67.
Simmons, J., Nelson, L., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366.
Smith, B., & Schulze, M. (2013). Thirty years of the CALICO Journal: Replicate, replicate, replicate. CALICO Journal, 30(1), i–iv.
Stake, R. (2005). Qualitative case studies. In N. Denzin & Y. Lincoln (Eds.), Handbook of qualitative research (pp. 443–466). Thousand Oaks, CA: SAGE.
Valdman, A. (1993). Replication study. Studies in Second Language Acquisition, 15(4), 505.
Webb, S. (2015). Learning vocabulary through meaning-focused input: Replication of Elley (1989) and Liu & Nation (1985). Language Teaching, 49(1), 129–140.
Willans, F., & Leung, C. (2016). Empirical foundations for medium of instruction policies: Approximate replications of Afolayan (1976) and Siegel (1997b). Language Teaching, 49(4), 549–563.
Yoon, H.-J., & Polio, C. (2017). The linguistic development of students of English as a second language in two written genres. TESOL Quarterly, 51(2), 275–301.

8 Ethical Applied Linguistics Research

Scott Sterling and Peter De Costa

Introduction

Conducting research ethically is paramount for the continuing success of any research field. Increasingly, journals stipulate that ethical requirements be met before a paper is submitted, with some journals asking contributing authors directly to produce evidence of Institutional Review Board (IRB) approval from their respective institutions. At the same time, however, it is important to note that research ethics takes on a different role when the data being collected and analyzed comes from human beings. Research ethics has been defined in a number of ways, including stated definitions such as that offered by the British Economic and Social Research Council (ESRC, 2015, p. 43): “the moral principles guiding research, from its inception through to completion and publication of results and beyond” (p. 4). Admittedly, there are many definitions and viewpoints for potential ethical frameworks, too many to be listed here. Given space limitations, we offer two such viewpoints below, as put forward by Pimple (2002) and Emanuel, Wendler, and Grady (2013).

S. Sterling
Department of Languages, Literatures, and Linguistics, Indiana State University, IN, USA

P. De Costa
Department of Linguistics and Germanic, Slavic, Asian and African Languages, Michigan State University, East Lansing, MI, USA




Pimple (2002) breaks down ethical research into three components:

1. Truth in reporting and representing data
2. Fairness in citing and using the work of others
3. Wisdom to only conduct meaningful and useful research

In this view, whether research is ethical depends not on whether it follows a set guideline or checklist but on the nature of the research and the researcher. Is the researcher being truthful in the reporting of his or her information, or is there potential fabrication of data or hiding of non-significant results? In terms of qualitative data, are participants being fairly represented, or has the author cherry-picked data to make a point? Fairness revolves around properly citing work and not plagiarizing. Additionally, it could extend to fairness in conducting group-based research projects and ensuring that all authors are fairly attributed. Finally, wisdom to conduct meaningful research concerns the need for research to be useful to society or science and for researchers not to bring about undue and unnecessary harm.

An additional conceptualization of ethical research can be seen in Emanuel et al. (2013), who draw on a more positivist view of research as compared with Pimple. These authors suggest that ethical research needs to have:

1. value to science or society
2. scientific validity
3. reliable independent review
4. respect for persons
5. balanced risk-to-benefit ratio
6. fair participant selection
7. truly informed consent

The seven items listed by Emanuel et al. (2013) can be broken into two large themes: (1) value of research (Items 1, 2, and 3) and (2) ethically conducting research (Items 4, 5, 6, and 7). Similar to Pimple (2002), Emanuel et al. discuss the need for research to be useful or have value. For applied linguistics research, that might mean pedagogical implications, but it could also entail furthering our knowledge about the nature of language.
A potential project that wouldn’t have value would be one that investigates whether it is possible to learn a second language as an adult. There is a plethora of research that already shows that adult second language acquisition (SLA) is possible, and so this project would not add any real value or knowledge to the field. In fact, a study like this would take away resources that could be spent on more valuable projects, such as looking at how adult SLA is accomplished or at a particularly interesting feature. Scientific validity is an issue that has been discussed heavily in the last 15 years (Norris & Ortega, 2000; Plonsky & Gass, 2011). The overall goal of this research trend is to point out various methodological and statistical issues the field faces. Ensuring that studies are conducted in methodologically rigorous ways might not seem like a question of ethics, but how the field uses its resources and whether the results it produces are accurate are crucial matters. Results from linguistics studies can be used by governments or other policy makers to make decisions that affect the lives of millions; basing these policies on a poorly conducted study or misinterpreted data could spell disaster. Finally, an area that has seen little review in the field is just how research is reviewed during the peer review stage.

Another large area of research ethics discussed in Emanuel et al. (2013) that impacts the field of applied linguistics relates to how the field conducts research. A fair selection of participants can be difficult to achieve in applied linguistics. Truly blinded and randomized sampling procedures are usually not available, as participants are often selected due to their linguistic background. However, those who conduct research in the classroom are often confronted with issues such as balancing the benefits from being in an experimental group as opposed to a control group. A balanced risk-to-benefit ratio is generally taken to mean that any risks taken on by a participant are balanced by the benefit. Asking a student to contribute to a research project that takes away from their instructional time is a burden, one that might be too large if all the benefits go to other stakeholders such as the researcher, institutions, or future students. Ensuring actual informed consent can be difficult (see Sterling, 2017). Consent documents are difficult to read and often not even read by participants, and information is often purposefully withheld in order not to bias the data.

These definitions provided by Pimple (2002) and Emanuel et al. (2013) highlight the fact that research ethics are consistently negotiated and do not stop after research publications. Both Pimple (2002) and Emanuel et al. (2013) note that research should be conducted only if it is deemed to be useful or worthwhile, a sentiment made clear in Ortega (2005a), who emphasized the social utility of applied linguistics research. Additionally, all of the definitions of research ethics imply that ethical research goes beyond simply doing what is regulated by governmental or institutional bodies and that a large burden of conducting ethical research falls on the researcher. Building on developments in ethical practices observed in applied linguistics research, this chapter focuses on three main points revolving around the agency that researchers are required to exercise when conducting ethical research. First, issues related to the general lack of ethics training received by applied linguists are highlighted. Second, we focus on IRBs in terms of how these organizations potentially assist and restrict research, as well as the reliance that many researchers place on the IRB. Finally, we will offer advice in the form of questions that applied linguistics researchers should ask themselves when conducting research.

Historical Development

Applied linguistics is an international and methodologically diverse field. As such, opinions on what constitutes best practice will always be situated within the context of the research, the home institution or funding body, and an individual researcher’s moral and ethical framework. It can be difficult to know whether, or which, major ethical issues have occurred in applied linguistics due to a general lack of publications on the topic. Little has been published explicitly on research ethics when compared to the overall number of methodologically oriented publications produced every year, though we are seeing that change as research methodology and meta-research are becoming more prevalent in the field. That said, the issue of ethical practices in applied linguistics has been circulating within the field’s broader discourse for more than four decades. For one, it was foregrounded in TESOL’s “Guidelines for Ethical Research in ESL,” which appeared in a 1980 issue of the TESOL Quarterly. The primary focus of this document and that of many other texts that came out in the 1980s (e.g., Brown, 1988) and the early 1990s (e.g., Hatch & Lazaraton, 1991) was the logistical aspects of conducting research. Simply put, the earlier literature sought to highlight the formal procedures associated with setting up a research project. This phenomenon was underscored by Dufon (1993), who pointed out that much of the methodology literature at the time emphasized “theoretical and methodological approaches, statistics, validity, replication, and so forth” (p.158). By the mid-2000s, however, research methods books in applied linguistics started to address ethics more explicitly, due in part to the increase in IRB involvement in the research process (Duff, 2008).

This deepening interest in carrying out ethical research was exemplified in the coverage of ethics in Mackey and Gass (2005), McKay (2006), Dörnyei (2007), and Phakiti (2014). Ethics has since been addressed more prominently in book chapters (e.g., De Costa, 2015; Sterling, Winke, & Gass, 2016), journal articles (e.g., De Costa, 2014; Mahboob et al., 2016), recent methodology volumes (e.g., De Costa, 2016a; Mackey & Gass, 2016; Paltridge & Phakiti, 2015), and special issues of several journals such as The Modern Language Journal (Ortega, 2005b), TESL Canada Journal (Kouritzin, 2011), and Diaspora, Indigenous and Migrant Education (Ngo, Bigelow, & Lee, 2014).



Current Core Issues

As applied linguists who are interested in research ethics, we are constantly having discussions with our colleagues and students about research ethics and are often asked to provide advice for various situations. While anecdotal evidence is often best avoided, it is difficult to find much actual empirical evidence on the general beliefs of researchers in the field. One major issue that we often encounter in our conversations is an overall feeling of apathy toward the topic of research ethics. This apathy can be seen in comments such as “who cares” or “if the IRB says it is fine, then it is fine.” Apathy can also be seen in the lack of education in research ethics in PhD training programs (see Sterling et al., 2016, for a discussion of ethics training in applied linguistics graduate programs). This lack of training, along with a general creep of federal regulations into research ethics (Haggerty, 2004), has led to consequences, namely, an overreliance on IRBs as the gold standard for research ethics and research ethics training. However, IRBs are designed to ensure that federally mandated ethical practices are followed when human beings are involved in a research project. While helpful, IRBs are not supposed to be used as a model of ethical correctness. Thus, field-specific ethical training and the development of linguistic ethical research are needed for research within applied linguistics.

Among the few studies investigating applied linguistics researchers’ views on research ethics, Sterling et al. (2016) and Sterling and Gass (2017) found that applied linguists place a high degree of faith in the IRB to make ethical decisions for them. Replacing one’s agency in making decisions on complex ethical issues and relying instead on a nebulous ethical agency could result in situations in which a university’s interests are put above those of our individual participants. In Sterling et al. (2016) and Sterling and Gass (2017), participants were asked to read ethically challenging scenarios that were designed to fall into an ethical “gray zone.” All scenarios contained issues that, while not always purely unethical, should have been perceived as problematic. The participants in the study, who were all applied linguistics researchers, largely rated each scenario as having low to no ethical conflict. In other words, even though the fictional situations contained multiple areas of questionable practices, the majority of participants did not perceive them as ethically challenging. In open comments, participants indicated that the situations were ethical because an IRB had approved of them, highlighting the fact that applied linguistics researchers have started to place more responsibility on the IRB than on their own ethical judgments. This loss of agency might result in scholars who are not properly trained to handle the complex realities of research, which will almost always contain gray areas where no right/wrong decision can be made. Finding no ethical issues present in the various situations, and calling on the fact that the IRB signed off on their imaginary study, further indicates that researchers are not always clear about their ethical duties or the IRB’s.

Another topic that comes up in conversation is an overreliance on “general rules of thumb” toward research ethics that are not substantiated with empirical evidence. Issues such as paying student-participants compared to using extra credit, participant understanding of research, length of consent form, participant motivation to be in research, and many others are passed from researcher to researcher, probably in informal contexts as suggested by Sterling et al. (2016). These rules of thumb might be provided during informal conversations in the hallway, through meetings with a mentor, or even passed on through less trained colleagues such as other graduate students or language instructors not specifically trained in research ethics. They might also be passed along in more formal classroom settings. While these rules of thumb might contain useful knowledge, there is currently limited data available on how well the rules actually work. A clear example can be found in the length of consent forms. Often the general dictum is that shorter is better. However, Sterling (2017) shows that consent forms written for ESL studies tend to be around two pages long and fluctuate widely in their vocabulary and reading comprehension levels. In fact, writing short consent forms might require researchers to use more complex language and jargon. Thus, the general advice of keeping consent forms short might actually lead to more difficult forms and thus less ethical practices.
There is a general lack of strong empirical evidence for many of the ethical practices that are field specific. Often, papers on research ethics are taken from the experiences of the author or are written as position papers. Empirical investigations into whether or not policies or guidelines are actually “best practice” are rarely, if ever, conducted. This lack of direct evidence for best practices means that mentors are often left using untested rules of thumb when conducting research or passing on critical ethical information to future researchers. Devising future training materials might be challenging due to the dearth of information currently available about best practices within the field, making this area of investigation not only ripe for future researchers but of paramount importance for the advancement of the field.



One of the main issues with research ethics in applied linguistics research is the minimal amount of education and training dedicated to the topic. When surveyed, many in the field indicated that research ethics was isolated to a single day/unit of a research methods class and was mainly covered as part of IRB certification or during the IRB application itself (Sterling et al., 2016). Two recent studies (Sterling et al., 2016; Sterling & Gass, 2017) have found similar trends. The authors found a divide between the topics covered that were considered to be procedural ethics (Guillemin & Gillam, 2004), that is, items covered during IRB training, and those that might be considered academic integrity items. Procedural ethics were covered more extensively in formal graduate school training, either during IRB training or in research methods courses. By contrast, the academic integrity items (mentorship, authorship, collaboration, and peer review) were only discussed informally, if at all.

Taken together, the issues from this section highlight the fact that research ethics in applied linguistics has seen limited empirical research into best practices. We have also discussed the trend of applied linguistics researchers shifting their ethical responsibilities from themselves onto mandatory ethical review boards. A fear going forward for the applied linguistics research community is that future scholars will not receive quality training on research ethics.

Challenges and Controversial Issues

In this section, we discuss five different macro-topics that are likely to be played out in research conducted in applied linguistics. Our questions do not have any definite solutions, but seek to raise awareness about the challenges and controversies surrounding the conduct of ethical applied linguistics research. We use the following questions to highlight several facts. First, research is full of pitfalls that are not easily avoidable. At some level, almost any decision will have a negative outcome for someone, even if that negative outcome is negligible. The second reason we selected these questions is to underscore that applied linguistics research ethics need to be considered from the inception of a project to publication and beyond. Finally, these questions highlight the fact that IRBs should not be completely entrusted with ethical decisions. There are many issues that go beyond basic federal and institutional mandates that scholars should consider. In the best-case scenario, a university IRB is staffed with dedicated people trained in law, research methodology, and ethics. Though highly qualified, such individuals cannot be expected to understand the intricacies, complexities, and ethical considerations involved in all domains of research involving human subjects. As a consequence, it is up to the researcher to constantly consider these issues. The five questions that are discussed in this section are:

1. Who defines what a language community is and who speaks for them?
2. How do we balance a participant’s right to confidentiality with his/her desire to be known for participating in research?
3. How do we ensure informed consent has been collected, given that cultures have differences in opinions on research ethics and the terminology used in the consent process?
4. What right do participants have to their data once a project has been completed? Can researchers continue to use the data they collected for various analyses, or does every new analysis require a renewal of consent?
5. What ethical role does a researcher have after the research paper is published? Are researchers responsible for how their data is appropriated by governments or teachers, and would they need to withdraw a paper if it contains data that has been found to be inaccurate at a later date?

Question 1: Who Defines What a Language Community Is and Who Speaks for Them?

The first question involves a topic that is often discussed in linguistic field method textbooks but one that is alarmingly difficult to answer. On the surface, this is an easy question. I study language X, so my research community consists of people Y. To gain access to language X, I would need to ask someone from that language group. However, this type of rationalization may unravel very quickly. Not everyone who speaks language X is part of community Y, and not all people who live in or around community Y are members of that community. More often than not, language communities in applied linguistics research are actually imaginary constructs created by researchers. For example, even though a researcher might consider “Spanish 101 students” a community, it is quite unlikely that students taking that course feel a strong sense of connection to their classmates. In fact, many language students probably feel closer affiliations to other people from their home country or region, regardless of language, than they do to the language learning community as a whole.

The question of group identity becomes even more complicated when we view the outcomes of research. Taking the example of English language learners (ELL), who benefits from ELL research? It is unlikely that the ELL participants in the study will gain anything from the research project, as they will graduate or move out of ELL courses before the paper sees publication. One line of rationalization often used when considering the benefits of research is that a study could impact future cohorts of students, which might include siblings or children of current participants, but that is difficult to argue for in terms of community membership, especially with potentially imaginary groups. In the end, it is likely that participants are being asked to absorb all the risks and will see none of the benefits, while others will gain all the benefits at no risk to themselves.

Another common complaint by applied linguists relates to blanket IRB policies that do not appear to work well in applied linguistics research. Typically this complaint surfaces in comments related to linguistic research not being inherently risky, consent being difficult to obtain in some cultures, or confusion over how to treat non-native English speakers. However, this type of complaint is unfair, as IRBs are typically not staffed by linguists. If you find yourself in a situation where a general policy will not work with your intended data collection, the best suggestion is to have an in-person meeting with your IRB. As with many faceless organizations, it is easy to vilify the IRB as an entity that only wants to make your life more difficult and is guarding the interests of your institution. However, one also needs to remember that many IRB members are volunteer faculty who truly care about protecting people. The worst-case scenario from having a discussion with your IRB is that they will not change their minds and you will find yourself in the same situation where you started.
In the best-case scenario, both sides learn from each other, your project is deemed in compliance with the required guidelines, and future applied linguists from your institution working in a similar context will profit from your experience by having an easier time getting the IRB (assuming that its members do not change) to understand their difficult situations. Sample Study 8.1 provides an illustration of this very issue.

Sample Study 8.1
Duff, P.A., & Abdi, K. (2016). Negotiating ethical research engagements in multilingual ethnographic studies in education: A narrative from the field. In P.I. De Costa (ed.), Ethics in applied linguistics research: Language researcher narratives (pp.121–141). New York: Routledge.

Research Background
This chapter examines the ethical dilemmas encountered by the second author (Abdi) as she prepared to embark on a two-year ethnographic multiple-case study of transnational children moving between Canada and China.

Research Problems
Taking as its starting point the unexpected challenges that emerge while conducting ethnographic research, this chapter provides helpful insights into the behind-the-scenes interaction that often goes on in order to gain approval from the ethics board of a university. Specifically, the chapter details the complexities surrounding interactions between researchers, the departmental authorities who need to sign off on applications to carry out research, and the staff from the office of research services that oversees ethical review applications.

Research Method
Even though the study described in this chapter is based on ethnography, the chapter takes on the form of a narrative in that it describes the unexpected tensions that occur when carrying out field research that involves internationally mobile child participants.

Key Results
The central tension encountered by Abdi was having to consider the demands of her ethical review board, which enforced protocols that favored a predetermined and static study design. Consequently, she had to grapple with a protracted and time-consuming amendment process and with having to educate the board about the nuances involved in conducting research in China.

Comments
The chapter underscores how, given the evolving nature of ethnography in particular and research in general, some level of procedural flexibility is needed in order for researchers to ensure that ethical research practices are implemented.

Question 2: How Do We Balance a Participant’s Right to Confidentiality with His/Her Desire to Be Known for Participating in Research?

In the United States, confidentiality is promised to most participants when they agree to take part in research, and it is often a stipulated IRB requirement. Requiring confidentiality is often not a problem for many large-scale quantitative research projects, as large volumes of data are collected and all traces of participant individuality are removed. However, with many qualitative projects, this is often not the case. Generally, the fewer participants in a research project, the more difficult it is to conceal their identity. If a researcher is collecting data on one of the last four speakers of a language, then there is a 25 percent chance of guessing who the participant is. Keeping data confidential is important for IRBs, but in some projects, this requirement may contradict what the participant wants. Some participants want it to be known that they participated in the research, or that they contributed to the preservation of a language. If a participant wants to be unanonymous, should the researcher comply with this demand? If an adult participant wants to exercise her agency and claim the fact that she participated in the research, should the researcher object?

An example of this problem occurred in research that Sterling was conducting. He initially wanted to offer his participants the chance to select their own pseudonym for the project. However, in his earlier work, several participants had opted to use their real names for a variety of reasons, the main ones being that they had nothing to hide and that they trusted Scott. However, participants in this particular study did not have any real indication of what would be asked of them at the onset of the study, nor did they know how they would be represented in the final write-up. Additionally, Scott later discovered that participants, who had all signed a written consent form and verbally agreed that they fully understood the research study, had no idea that he was not a teacher in their ELL program or that he was a linguist. He also learned that they were not even sure what he would do with the data. In the end, Scott opted to use a numbering system for identification instead, as he concluded that the participants were not adequately informed to make the decision to revoke their confidentiality.

A counterpoint to this argument might be the use of member checks, which occur when a researcher asks participants to read a manuscript prior to publication to ensure that the data accurately represent the participant. Even if a participant agreed to be part of a research project at the beginning and wanted to expose his/her identity, there would still be a chance before publication for the participant to reconsider this decision.
While a member-checks strategy is highly recommended, this counterpoint is an excellent place to illustrate the complexities surrounding applied linguistics research; it shows that not all research is the same and that providing a “simple solution” will not work in all cases. In Sterling’s study discussed earlier, the data were collected through focus groups. By the time the data were analyzed and written up, the vast majority of the participants had not only left the language program in which the data were collected but had also graduated from the university and presumably left the country as well. Tracking down 40–50 former students in various countries around the world, who will likely have forgotten that they even took part in a one-hour focus group interview years earlier, is a daunting task. Additionally, member checks assume that participants are not only literate but also literate enough to comprehend a scientific document. The current trend in academic publishing would also require that the participant be able to read English at a high level, or that the researcher translate the document into the L1 of the participant, an additional burden on a likely already time-stressed situation.



Ultimately, in many ways, this question of confidentiality will be left to the researcher to decide. Again, offering untested rules of thumb will likely only lead to practices that do not work well in many cases. Your opinions on the agency of participants, the voice that participants want to adopt, and other issues must be entered into any calculation. One suggestion, however, is to judge the level of seriousness that participants put into answering the question of using their names. Do your participants strongly press for exposing their true identity, or are they indifferent? Returning to Scott’s example, his participants appeared to be extremely indifferent to which name was used, with many saying something along the lines of “just use my name or whatever.” It would be prudent to explain all the possible negative consequences that could surface if their identity becomes known, even if they seem extremely unlikely. In the end, you are responsible for the safety of all your participants both during a research project and after it has ended.

Question 3: How Do We Ensure Informed Consent Has Been Collected, Given That Cultures Have Differences in Opinions on Research Ethics and the Terminology Used in the Consent Process?

It is tempting to think that simply responding affirmatively to the question, “Do you agree to be part of my research?”, should be sufficient evidence that someone has agreed to take part in a research project. Yet what consent is, and what it looks like, changes between studies and cultures. The basic idea behind informed consent, that a participant understands and voluntarily agrees to be part of a research project, is an idea that many researchers feel comfortable with. However, informed consent is not a universal idea that transcends all cultures. In some cultures people are suspicious when asked to sign legal-like documents, and in others a respected person such as a teacher can grant consent for an entire group to be part of a research project. While cultural differences can be problematic for obtaining informed consent, it should also be noted that consent documents likely make little sense to anyone not familiar with research, even if the participant and researcher share a culture. In experimental research, the true nature of the project is often hidden at the onset of data collection to avoid biasing the data pool, meaning that people have to agree to take part in research but are not actually told what kinds of things are expected of them. In many instances, the full objective is disclosed after a study, but again there are many instances when disclosing the focus of a study might affect the results if participants talk with each other. Qualitative research encounters similar issues, in that it is not always possible to draw a clear line between when research is being conducted and when a casual conversation is taking place. As mentioned, notions such as the benefits to community or the need for confidentiality might be concepts that make sense to the research community; however, these concerns probably have very little real value to, and their impact may not be fully realized by, someone outside the academy. For an illustration, see Sample Study 8.2.

Sample Study 8.2 Lee, E. (2011). Ethical issues in addressing inequity in/through ESL research. TESL Canada Journal, 28(31), 31–52. Research Background Research ethics obligations stretch far beyond the data collection and retention process and at times these responsibilities will be split between various stakeholders in a research project. An example of unforeseen problems with implications for different groups can be found in a report by Ena Lee (2011) who reports a predicament she found herself in during the course of an ethnographic case study of an ESL program. Her stated goal for the project was to observe how the ESL program mixed culture and language instruction, and how that mixture affected the identities of those in the program. Research Problems Unlike traditional views of protecting participants from harm or reducing risks, Lee’s article dealt with the long-term needs of protecting identity, confidentiality, and the dangers associated with the loss of those features. While applied linguistics tends to be a low-risk field, that doesn’t mean participants face no dangers. Instead, problems can arise at any time before, during, and after a research study. Research Method This study takes a narrative, anecdotal perspective and provides the reader with a detail account of how a single author struggled to navigate a difficult ethical issue. Key Results During Lee’s original study, she noticed that one of her instructor-­participants was being racialized by the other staff members in the program. Lee wanted to publish data about this incident but recognized that her instructor-participant would be easy to identify in any report and thus at risk of losing her job. Of course, Lee had received consent to conduct this study, but the participants’ treatment at the language center would not have been discussed then. Lee was left feeling of guilt at knowing that an unfair practice was taking place and not


S. Sterling and P. De Costa

being in a position to do anything about it. As an ethnographic researcher, Lee is expected to be objective and not place value judgments on a situation. While one might feel that treating someone differently because of their race is wrong, a researcher is supposed to observe and report back to a larger community. In the end, Lee decided not to publish this portion of the data until the participant had moved to a new job and her career would no longer be in jeopardy if she were exposed.

Comments
Lee’s story is interesting as it shows the unexpected consequences that can arise during a research study. It also highlights the ongoing need to think ethically and responsibly, a need that goes beyond the end of data collection. Finally, it should be mentioned that no part of this story is likely to surface in IRB training and that Lee was forced to take control over her own ethical understanding in order to make a decision.

Hence, if participants are not able to know what is expected of them in a research project, can we ever really claim that participants are informed? How do we address assumptions that are not explicitly stated in the documents? As applied linguists we understand that data will be presented at conferences, that graduate student assistants might be asked to code data, or that a different researcher might be invited to look at the data for a reliability check. But do our participants share this understanding? The answer to this question remains unclear, but it seems unlikely.

The advent of the internet has created new public spaces for people to post their thoughts in either textual or audio formats. These public forums afford applied linguistics researchers the opportunity to investigate daily language usage by people around the world, but does this mean that we can analyze this information without consent from the author? (For further discussion of conducting ethical online research, see Gao & Tao, 2016.) Most people would say “yes,” because this communication is in the form of texts that exist in the public sphere; however, these texts are also generally in the form of social media posts intended for friends and family members. For instance, is it ethical to analyze how one’s aunt discusses political issues of gender and identity on Facebook? What about using a website that hacks people’s phones and uploads personal text messages that were never intended to be read by others? For many, the last example might seem unethical, but is using someone’s hacked personal communication different from analyzing the letter correspondence or diary of someone from history? Again, we find ourselves running into a gamut of questions that are difficult to answer. The simple question of “do you agree to take part in this research” is alarmingly oversimplified. In short, consent in research is a much larger topic than we can discuss in this chapter but nevertheless requires further consideration.

  Ethical Applied Linguistics Research 


Question 4: What Right Do Participants Have to Their Data Once a Project Has Been Completed? Can Researchers Continue to Use the Data They Collected for Various Analyses, or Does Every New Analysis Require a Renewal of Consent?

Through the use of digital corpus linguistic tools, we have the ability to sift through thousands of hours of recorded data and look for similarities between when a person uses the past tense and when they use the past perfect in English. Yet, unless data were specifically collected as part of a corpus, should researchers be able to repurpose the information? Do participants have any say as to what happens to their data after they have been collected?

Reusing data is a fairly common research practice. Both authors have been in situations where we were studying a particular piece of data and noticed that it contained interesting elements that we had never really considered at the onset of the project. Scott collected data in an ELL class where his students were asked to have a discussion with four strangers. What he never expected was to see a high frequency of humor used to navigate difficult topics such as terrorism, homosexuality, and harmful stereotypes. After reviewing the transcripts, it was clear that this data would be interesting to investigate, and Scott collected retroactive consent from the participants. This was easy enough to do since the course was still in progress. But what would his options have been had he waited a few more weeks and his ability to track down the participants lessened?

In Peter’s case, it wasn’t his original intention to visit the dormitory in which the female participants in his school-based ethnography stayed.
Initially, his plan was to conduct observations within the premises of the school, but after learning about the cramped conditions in which they lived, and in order to better understand the sociolinguistic realities of the scholarship students with whom he worked, he decided to visit their dormitories to learn about the out-of-school dynamics that also influenced their language learning outcomes. Fortunately, he was successful in applying for permission to conduct his closely monitored visit, which in turn shed new light on their learning experience (De Costa, 2016b).

As noted, after participants have finished their part of a research project, they typically scatter and can be extremely difficult to track down. Thus, gaining retroactive consent to use data for future projects is problematic. Many readers might be wondering if it is possible to simply file an IRB application to use the data as already existing data. The answer is usually “yes.” However, applying for data as already existing is akin to telling the IRB that risks will be



minimal to the participants. While reusing data is not itself an ethical violation, the problem here is that we would be equating gaining IRB approval with absolute ethicality. We might also be violating participants’ trust by exposing their data in ways they had never agreed to. Put simply, gaining permission to collect data is only the first necessary step in data collection. As highlighted in this section, there will never be a perfect answer for when it is acceptable to reuse data and when new consent should be collected. It is probably fairly common practice to reuse data when the data collected and the new analyses are closely aligned. Using interview data that were originally intended to focus on learner written identity to investigate how learners use various tenses when writing a story is probably an acceptable practice to most. Using pieces of audio from the interview data just described as part of another research project, one where you are attempting to elicit opinions from strangers on accents from native speakers of the language, decidedly goes beyond the consent form signed.

Question 5: What Ethical Role Does a Researcher Have After the Research Paper Is Published? Are Researchers Responsible for How Their Data Is Appropriated by Governments or Teachers, and Would They Need to Withdraw a Paper If It Contains Data That Has Been Found to Be Inaccurate at a Later Date?

For many people, publication is the final stage of a research project. Once a paper has been submitted, the data needs to be safeguarded, but some applied linguists would argue that this is the end of the ethical contract. Yet, we also need to consider what role researchers play in the continuing use of their study. Shohamy (2004) describes how she and her colleagues published data that were later used by the Israeli government to limit the number of Ethiopian students allowed into Israeli schools. This was never the intention of the researchers, but their actions had negative ramifications for Ethiopian students as the findings were used to “limit the number of students from Ethiopia in Israeli schools and to transfer others to ‘more successful’ schools, a discriminatory, unethical policy that our research was seen to legitimize” (p. 730). Are the researchers responsible? Do they need to make public statements or defend their wrongly cited work? As can be seen, there are no clear answers to this type of question.

Another issue that has surfaced in other fields is the question of whether papers that contain inaccurate information should be retracted from published



sources. Imagine researchers finding correlations between language and specific genes believed to control some aspect of it, but the gene in question is later found to have nothing to do with language at all and the original data are shown to be anomalous. Should the authors retract their paper, or do we allow the publication to stand since it was published in good faith and with the best understanding of the phenomena at the time? Or does a question like this open the door for papers to be pulled due to a lack of support for a particular theory or methodological assumption?

Another issue to consider in publication is the confidentiality that we can actually guarantee to our participants. Ironically, participants in a study have the least amount of confidentiality from those with the most influence on their lives. For example, a study might be focused on investigating the amount of grammar knowledge that Spanish students gain over the course of a single year. If the students do particularly poorly on the task, the results could be interpreted as being a fault of the instruction. Most people who read this article will have no idea who the instructor is. However, anyone involved in the project is likely to remember the researcher’s name and could probably piece together enough information to figure out the identity of the instructor. While unlikely, the information found in the study could have ramifications for the career of the teacher.

In sum, the various issues brought up in this section highlight the fact that researchers have a moral obligation that stretches beyond just safeguarding data after publication. What exactly this obligation is must be decided upon by the individual researcher, who has to wrestle with the dilemma of conducting her research in an ethical manner.

Resources for Further Reading

De Costa, P. I. (Ed.). (2016). Ethics in applied linguistics research: Language researcher narratives. New York: Routledge.

This edited volume contains chapters from various scholars in the field, with each chapter focusing on the observations and advice of scholars in applied linguistics. Chapter topics range from education in research ethics to the ethical use of technology in data collection. The works in this volume are useful to researchers at any phase of their career and will generate discussion in graduate-level classes.



Mackey, A., & Gass, S. (2016). Second language research: Methodology and design (2nd ed.). New York: Routledge.

Chapter 2 in this book provides helpful information about codes of ethics and the procedures that need to be carried out to ensure that informed consent is obtained and participant protection is observed. This updated edition also details ethical concerns surrounding online data collection.

Mahboob, A., Paltridge, B., Phakiti, A., Wagner, E., Starfield, S., Burns, A., … De Costa, P. I. (2016). TESOL quarterly research guidelines. TESOL Quarterly, 50(1), 42–65.

This article gives an overview of the TESOL Quarterly research guidelines, one of the few explicitly stated sets of guidelines for applied linguistics research. Areas of focus include the review process, a brief introduction to common methodologies, and a section centered on research ethics. The research ethics section discusses the macro-/microethical divide as well as other areas covered in the current chapter.

Ortega, L. (2005a). For what and for whom is our research? The ethical as transformative lens in instructed SLA. Modern Language Journal, 89(3), 427–443.

This article can be found in a special issue of The Modern Language Journal focusing on ethics, methodology, and epistemology (also edited by Ortega). Ortega argues that applied linguistics research should take a more social perspective, on three grounds: (1) social utility should be used to judge research, (2) research is never conducted in a vacuum and thus always has a value, and (3) epistemological diversity is a benefit to a field.

Note

1. We will use the term IRB or Institutional Review Board throughout this chapter. An IRB is the American version of the ethical review boards that are common in many countries. The general goal of an IRB is to ensure that governmental requirements of ethical behavior are being followed. Additionally, IRBs provide support for technical ethical issues such as ensuring that data is securely kept and that participants are being treated fairly.



References

Brown, J. D. (1988). Understanding research in second language learning. Cambridge: Cambridge University Press.
De Costa, P. I. (2014). Making ethical decisions in an ethnographic study. TESOL Quarterly, 48(3), 413–422.
De Costa, P. I. (2015). Ethics in applied linguistics research. In B. Paltridge & A. Phakiti (Eds.), Research methods in applied linguistics: A practical resource (pp. 245–257). London: Bloomsbury.
De Costa, P. I. (Ed.). (2016a). Ethics in applied linguistics research: Language researcher narratives. New York: Routledge.
De Costa, P. I. (2016b). The power of identity and ideology in language learning: Designer immigrants learning English in Singapore. Dordrecht: Springer.
Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. Oxford: Oxford University Press.
Duff, P. A. (2008). Case study research in applied linguistics. New York: Lawrence Erlbaum Associates.
Dufon, M. A. (1993). Ethics in TESOL research. TESOL Quarterly, 27(1), 157–160.
Emanuel, E. J., Wendler, D., & Grady, C. (2013). What makes clinical research ethical? The Journal of the American Medical Association, 283(20), 2701–2711.
Gao, X., & Tao, J. (2016). Ethical challenges in conducting text-based online applied linguistics research. In P. I. De Costa (Ed.), Ethics in applied linguistics research: Language researcher narratives (pp. 181–194). New York: Routledge.
Guillemin, M., & Gillam, L. (2004). Ethics, reflexivity, and “ethically important moments” in research. Qualitative Inquiry, 10(2), 261–280.
Haggerty, K. D. (2004). Ethics creep: Governing social science research in the name of ethics. Qualitative Sociology, 27(4), 391–414.
Hatch, E., & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. Boston, MA: Heinle and Heinle.
Kouritzin, S. (Ed.). (2011). Ethics in cross-cultural, cross-linguistic research [Special issue]. TESL Canada Journal, 28(2).
Lee, E. (2011). Ethical issues in addressing inequity in/through ESL research. TESL Canada Journal, 28(31), 31–52.
Mackey, A., & Gass, S. (2005). Second language research: Methodology and design. Mahwah, NJ: Lawrence Erlbaum Associates.
Mahboob, A., Paltridge, B., Phakiti, A., Wagner, E., Starfield, S., Burns, A., … De Costa, P. I. (2016). TESOL quarterly research guidelines. TESOL Quarterly, 50(1), 42–65.
McKay, S. L. (2006). Researching second language classrooms. Mahwah, NJ: Lawrence Erlbaum Associates.
Ngo, B., Bigelow, M., & Lee, S. (Eds.). (2014). Introduction: What does it mean to do ethical and engaged research with immigrant communities? Special issue: Research with immigrant communities. Diaspora, Indigenous and Migrant Education, 8(1), 1–6.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50(3), 417–528.
Ortega, L. (2005a). For what and for whom is our research? The ethical as transformative lens in instructed SLA. Modern Language Journal, 89(3), 427–443.
Ortega, L. (Ed.). (2005b). Methodology, epistemology, and ethics in instructed SLA research [Special issue]. Modern Language Journal, 89(3), 315–488.
Paltridge, B., & Phakiti, A. (Eds.). (2015). Research methods in applied linguistics: A practical resource. London: Bloomsbury.
Phakiti, A. (2014). Experimental research methods in language learning. London: Bloomsbury Publishing.
Pimple, K. D. (2002). Six domains of research ethics: A heuristic framework for the responsible conduct of research. Science and Engineering Ethics, 8(2), 191–205.
Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: The case of interaction research. Language Learning, 61(2), 325–366.
Shohamy, E. (2004). Reflections on research guidelines, categories, and responsibility. TESOL Quarterly, 38(4), 728–731.
Social Research Council. (2015). ESRC Framework for Research Ethics. Retrieved from­frameworkfor-research-ethics-2015/
Sterling, S. (2017). Investigating the complexity of consent forms in ESL research. [Manuscript in preparation].
Sterling, S., & Gass, S. (2017). Perspectives from the field on ethical behavior in applied linguistics and second language research. [Manuscript in preparation].
Sterling, S., Winke, P., & Gass, S. (2016). Training in research ethics among applied linguistics and SLA researchers. In P. I. De Costa (Ed.), Ethics in applied linguistics research: Language researcher narratives (pp. 15–37). New York: Routledge.

9 Writing a Research Proposal

Sue Starfield

Introduction

One of the earliest documents that doctoral scholars have to write is a research proposal in which they provide a rationale and motivation for the research study they plan to undertake. It is a cognitively challenging activity that “demands thinking logically through the entire project from beginning to end” (Rogers, Zawacki, & Baker, 2016, p. 58). As Paltridge (1997) tells us, even though your research focus may change and develop over time, the proposal is “a key first step in the research process” (p. 62). Throughout your future career as a researcher, there will be opportunities to write different kinds of research proposals such as, for example, grant funding proposals. In some contexts, doctoral students write dissertation grant proposals (Cheng, 2014) seeking funding for their doctoral study. While there are many similarities between doctoral research proposals and grant funding proposals, they are written for different purposes, the main one being the funding component of the grant proposal. In this chapter, however, the focus will be on the research proposal doctoral students in most contexts are expected to write and submit for feedback and approval prior to being allowed to commence their study.1

S. Starfield (*)
School of Education, UNSW Sydney, Sydney, NSW, Australia
© The Author(s) 2018
A. Phakiti et al. (eds.), The Palgrave Handbook of Applied Linguistics Research Methodology,




In a post on her academic writing blog, Pat Thomson argues that a research proposal is a bid by the author to gain acceptance into a community: “to demonstrate that the researcher has the capacity to produce disciplinary knowledge.” She writes that “In order to do so, the proposal writer must show familiarity with the ‘right’ language, knowledge production practices, existing debates and taken for granted ‘truths’ of the relevant scholarly community” ( Clearly, becoming an “insider” to disciplinary ways of thinking and writing is part of the process of becoming a doctoral scholar and developing a research proposal is a stage in the process.

At the same time, however, there are a number of components of a research proposal that are common across all disciplines. In fact, there is already much advice available to students on how to put together a research proposal both online and in manuals and handbooks (e.g., Creswell, 2014; Paltridge & Phakiti, 2015; Paltridge & Starfield, 2007; Punch, 2012; Tracy, 2013). While completed doctoral theses and dissertations are widely available online on sites such as ProQuest Dissertations & Theses Global (http://www., proposals, on the other hand, typically have only a small number of readers, and it is quite difficult to gain access to examples of successful research proposals. Punch (2012) contains examples of research proposals from several disciplines that use qualitative, quantitative and mixed methods approaches. He also provides a useful list of other manuals that contain examples of research proposals (pp. 138–140).
There is also now at least one online open access database containing sample dissertation research proposals from a range of disciplinary fields and adopting different methodologies ( Although none of these proposals is from an applied linguistics field, and proposal formats will differ due to disciplinary and methodological differences, it is certainly worthwhile to look at a number of research proposals as you set out to conceptualise your own study and draft your proposal.

The core constituents of a research proposal typically listed in much of the standard advice given to commencing doctoral students are (see, e.g., Paltridge, 1997):

• The aims of the research project
• The scope of the research project
• The significance and originality of the research project in relation to previous work in the field
• The research approach or method to be adopted
• A provisional chapter plan
• A provisional timetable for future progress



While these fairly abstract notions provide a starting point for conceptualising both the research and the proposal, they fail to signal what Pat Thomson (cited above) is putting forward in her blogpost: that the work of the proposal is to gain acceptance of the arguments we are making for why our study is needed. We are trying to persuade our readers (probably not more than two to three people in the case of a student’s research proposal) that we have a topic worthy of study that is relevant and of interest to members of our field. In the remainder of this chapter, I want to provide you with some writing and thinking tools that, based on my experience, can help develop your ability to put the case for your applied linguistics research project.

Tools for Developing a Research Proposal

In a course I taught for a number of years for PhD students in the Faculty of Arts and Social Sciences at my university on developing a research proposal, we began with an activity that directly addressed the work of the proposal using the four questions framework (see also Starfield & Paltridge, 2014), which asks students to briefly respond (in writing) to the prompts illustrated in Fig. 9.1. These four questions lie at the heart of the research proposal and through attempting to answer them, students can begin to generate both their

1. What is the question I am trying to answer?

2. Why is it worth answering?

3. How have other people tried to answer it?

4. How am I going to go about answering it?

Fig. 9.1  The four questions framework



thinking and their writing. The students who took my course found that returning to these four questions as they drafted and redrafted their research proposal was very helpful in conceptualising the focus of their study. Of course, over six months their initial question might evolve or change substantially, but using the framework allowed them to keep focussed on the core purposes of a research proposal.

“What is the question I am trying to answer” refers to the aims of the project and the potential research questions, which need to be clearly articulated. Using the first person “I” encourages ownership of the project, which is important as students transition from an undergraduate identity to one in which they have a much greater degree of agency in defining and designing the parameters of their study.

“Why is it worth answering” refers to the “significance and the originality” of the proposed research, or what is also called the “so what” or “why bother” question. In other words, what will your study contribute to the field you are researching? Over the period of developing your research proposal, you should become able to more clearly articulate how your study will contribute. There are of course many ways in which we can articulate such a contribution (see Fig. 9.3 below), and it may take quite a bit of drafting, discussion and redrafting before the nature of this contribution takes shape.

The question that follows, “how have other people tried to answer it,” reminds us, as indicated in the list above of the core constituents of a proposal, that claims to significance and originality have their origin in previous work in the field and have to be argued for.

The fourth and final question gets us thinking about the methodology we want to adopt, that is, our investigative approach and why it is the most appropriate approach to investigating our specific question.
So, rather than just providing a set of headings such as aims, literature review or methodology, the four questions framework asks you to consider the functions of the different components of the proposal or, as I said earlier, the work the proposal is trying to do. In the following sections, I address each of the four questions in more detail.

What Is the Question I Am Trying to Answer?

Research questions do not emerge fully formed from anyone’s head: they are the outcome of much thinking, writing and going back and forth to the literature. Looking at successful research proposals or completed dissertations can thus be a bit misleading as they do not convey the many iterations of the



questions that would have led to the set of neatly posed questions in the final proposal. Continuing to use the four questions framework as you progress in your thinking and reading can help you get closer to the final version. You might also want to try the writing prompts suggested by Rowena Murray (2002, p. 104) to kick-start writing about the context/background to your study:

• My research question is …. (50 words)
• Researchers who have looked at this subject are …. (50 words)
• They argue that …. (25 words)
• Researcher A argues that …. (25 words)
• Researcher B argues that …. (25 words)
• Debate centres on the issue of …. (25 words)
• There is still work to be done on …. (25 words)
• My research is closest to that of Researcher A in that …. (50 words)
• My contribution will be …. (50 words)

Writing prompts can be very helpful ways of generating thinking and writing, especially when clear word limits are provided. Working through the prompts leads you to your contribution: they encourage you to think about how an understanding of the debates in the field, and of what still needs to be done, can help frame your study’s contribution, and about how all four questions are seamlessly related. As Murray points out, working through the prompts helps to establish the focus and direction of your thesis.

Paltridge and Starfield (2007, p. 59) suggest a number of ways to refine a research question. These include:

• Read broadly and widely to find a subject about which you are passionate. Immerse yourself in the literature, use your library, read the abstracts of other recent theses and dissertations, check dissertations on the web.
• Narrow your focus to a single question: be disciplined and not overambitious.
• Be prepared to change or modify your question if necessary.
• Be able to answer the question “Why am I doing this project?” (and not a different one).
• Read up-to-date materials: ensure that your idea is achievable and no one else has done or is doing it.
• Consult other students who are further down the track, especially those who have the same advisor as you.
• Discuss your ideas with your advisor and lots of other people.



• Work through the implications of your research question: consider existing materials and ideas on which it is based, check the logic, spell out methods to be used.
• Condense your research question(s) into two sentences, write them down (with pride) and post them next to your computer, where you can see them daily. Change the question(s) if needed.
• Ask yourself: What will we know at the end that we did not already know?

How Have Others Tried to Answer It?

Another activity I used encouraged students to conceptualise their “research space” (Feak & Swales, 2011) using a visualisation activity. Students are given a handout with a simple Venn diagram of three intersecting circles (see Fig. 9.2) and asked to group the key literatures or authors in one of the three circles. The intersection of the three circles is where the topic of the thesis is located, what Swales (2004) calls their research “niche.” This activity is best done on hard copy in my experience. Physically writing in the circles promotes a different kind of thinking than the more linear writing or typing on a page and often encourages students to see connections not apparent earlier (see also Starfield & Paltridge, 2014, pp. 112–115). It is vitally important that you come to understand that the literature review section is not simply a listing or summary of everything you have read or an attempt to convince the reader that the writer is knowledgeable about the

Fig. 9.2  Visual prompt for a literature review (three intersecting circles labelled “Literature 1: key authors,” “Literature 2: key authors,” and “Literature 3: key authors,” with “Your topic” at their intersection)



work of others (see also Li & Wang, Chap. 6). As Rudestam and Newton (2007) point out, you are using the literature to develop both a coherent argument and a conceptual framework for your study so that your reader should be able to conclude: “Of course, this is the exact study that needs to be done at this time to move knowledge in this field a little further along” (pp. 63–64).

In the early stages of doing your literature review, it may be that what you are writing does look like a list or summary: a collection of annotated bibliographies put together as you read widely in your topic area. This is entirely to be expected and I encourage my students to develop a notemaking system, either in paper format or probably more usefully as a soft copy file, to make brief notes on whichever article, book chapter or book they are reading. Noting down the author’s theoretical perspectives, main arguments and evidence for these as well as the article’s relationship to your own work will all be helpful at this initial stage and later on. This preliminary writing is the beginning of our writing and actively engages our thinking too, as the activity causes us not just to passively consume the texts we’re reading but to engage with the thinking of the authors of these texts and locate their writing within broader theoretical frames. Many researchers now use EndNote and other similar software for these same tasks. From early on in your research career, it is a good idea to develop a method of annotated reading and notetaking that works for you as you will be doing a lot of reading.

As your literature review develops over time, it must move beyond the inventory/summary stage into the argument that supports the bid your proposal is making.
Rudestam and Newton (2007) suggest several quite simple but not immediately obvious ways in which the literature review can begin to sound more like your “take” on the topic so that your own “voice” begins to emerge:

• Try to avoid beginning your sentences with “Jones said …” or “Smith found …”: this shifts the focus of your review from your own argument to the work of others.
• Try to “develop a theme and then cite work of relevant authors” (p. 65) to support your arguments or to provide examples or counterexamples of your point.
• Try to limit excessive quoting. This can also lessen your authority and control.
• Try to avoid reporting everything. Be selective: “build an argument not a library” (p. 66).



Why Is It Worth Answering?

The second question in the four questions framework asks you to think about why your research is worth doing. Of course it is, but what we do not always succeed in doing is persuading others that it is. Sometimes this is because the idea of doing research that is significant and original seems overwhelming and impossible. “How am I, a PhD student, going to produce this kind of research?” is a not uncommon thought. The now quite well-known comment by an experienced thesis examiner, “it’s a PhD not a Nobel Prize” (Mullins & Kiley, 2002, p. 386), acknowledges the anxiety experienced by many doctoral students that their contribution may not be judged to have a sufficient degree of originality, that is, be worthy of a Nobel prize!

Rowena Murray helpfully provides a list of ways in which we can think about originality. It is not, she argues, simply saying something no one has said before. What her list does is provide us with ways of thinking about where the originality of our work might lie and how we can best articulate this. What is important, and what doctoral scholars in the early stages of their studies sometimes find difficult, is to begin to think about the ways in which their work might demonstrate an original or significant contribution. My students find that the list below, from Murray (2002, p. 59), gets them thinking about this dimension of the proposal and, in addition, provides some tools for having a conversation with their supervisor.

• You say something no one has said before.
• You do empirical work that has not been done before.
• You synthesise things that have not been put together before.
• You make a new interpretation of someone else’s material/ideas.
• You do something in this country that has only been done elsewhere.
• You take an existing technique and apply it to a new area.
• You work across disciplines, using different methodologies.
• You look at topics that people in your discipline have not looked at.
• You test existing knowledge in an original way.
• You add to knowledge in a way that has not been done before.
• You write down a new piece of information for the first time.
• You give a good exposition of someone else’s idea.
• You continue an original piece of work.

The activity in Fig. 9.3 is set up to help you think about the contribution of your study. One of the criteria thesis examiners are asked to consider is the extent to which the study makes a significant contribution to the field. It’s important to think about the specific ways in which your study may be contributing. Certainly, your proposal should identify these and, as stated above, locate your contribution in relation to previous work in the field. The first prompt you are asked to respond to focuses on the significance of your study, while in the quadrants beneath you are asked to consider the nature of your contribution and jot down some reasons that support your claims. Your study does not have to contribute across all four domains: it may only be contributing in one of the four, or its significant contribution may lie elsewhere. For example, if your work is in education or Teaching English to Speakers of Other Languages (TESOL), you may see your work as contributing to pedagogy rather than to any of the four quadrants. Alternatively, as an applied linguist you may see your contribution as being more within the realm of methodology or research design. Use the worksheet in Fig. 9.3 to brainstorm ideas about your study’s significance and contribution as you work through versions of your proposal draft.

Why is it worth answering? Why bother?
My topic/question is significant because ....
My topic/question is contributing ..... [select one or more of the boxes below and explain briefly in what ways your study will contribute to your field]
To practice/policy

Fig. 9.3  How is my study contributing?

How Am I Going to Go About Answering It?

This question relates to the investigative approach you will use to answer your research question. As with the question about how others have answered it, the response to this question develops into an argument for why the approach you have chosen to adopt is the best approach for answering your questions. The methodology and methods section of your proposal needs to put forward a logical justification for your choice of research paradigm and research methods that will assure your supervisor and panel members that your proposed study is both viable and feasible within the time available. Chapters 3, 4 and 5 of this volume discuss the three main research paradigms used by applied linguists: quantitative, qualitative and mixed methods. As Richard Young argues in Chap. 2, a student’s choice of paradigm or methodology is often shaped by the “habits of mind” that draw them to a particular graduate school or advisor or that they develop in graduate school. Your choice of investigative approach, that is, how to answer your research question(s), will be shaped to some extent by these habits of mind but also by the nature of your topic, by the conversations you have with your peers as well as by the prevailing zeitgeist. Christine Casanave (2014, p. 59) advises doctoral students not to “embark on a methodological approach that you are likely to hate” and also recommends seeking out an “adviser who is philosophically compatible with you” and “who can guide your development.” It is important to think through these issues of methodological choice as you are committing to a number of years, if not a lifetime, of working within the particular paradigm you choose to adopt for your dissertation.
An important distinction that we point out in our book (Paltridge & Starfield, 2007) is between methodology and methods; although the two terms are often used as if they were interchangeable, in my view they refer to quite different aspects of the research process. Methodology refers to the research paradigms alluded to above, that is, the epistemologies or ways of knowing that shape the kinds of knowledge our investigative approach can help us uncover (see also Phakiti & Paltridge, 2015, for further discussion of research paradigms in applied linguistics research). For example, within quantitative research, psycholinguistically oriented studies into the processes underlying language learning, such as those described in Chaps. 14 and 15 of this volume, are likely to lead to experimental studies of how students learn that seek to measure that learning.



More qualitative methodologies may lead to studies that involve trying to understand the perceptions and identities of those being studied through methods such as interviews and/or observation (e.g., Chaps. 11 and 12 by Matthew Prior and Fiona Copland, respectively). Using a mixed methods approach could involve bringing together the strengths of quantitative and qualitative approaches to adopt methods such as a large-scale questionnaire/survey followed by interviews with participants who have completed the questionnaire to gather more in-depth information.

Your “Two-Pager”

When you have worked through the activities and writing prompts in the chapter, I recommend you try to write a “two-pager” as suggested by Punch (2012, p. 80). The instructions are quite simple:

• Write no more than two pages using single spacing.
• Describe as clearly and directly as possible what your proposed research is trying to find out and how it will do it.

Focus more on:

• What am I trying to find out?
• How am I going to do it?

and less on context, background and literature.

Your two-pager is a work-in-progress document, but it is an important first step in drafting the proposal. I always asked my students to work on a two-pager about halfway through the semester, and they would then swap with one another in class and provide peer feedback on each other’s drafts. If you can find a peer to do this activity with, I think you will notice the benefit.

So What Will My Research Proposal Look Like?

Up until now, I have been advising you on how to think and write about the different constituents of your proposal, that is, how to conceptualise it. In this section, I want you to give it a more formally recognisable shape and form as you get ready to submit it to your advisor for review. Table 9.1 lays out the most commonly used section headings for thesis proposals and summarises the purpose of each section.



Table 9.1  Thesis proposals: structure and purpose (based on Paltridge & Starfield, 2007, p. 61)

Section | Purpose
Title | To summarise, in a few words, what the research will be about
Summary | To provide an overview of the study which you will expand on in more detail in the text which follows
Overall purpose | To present a clear and concise statement of the overall purpose of the research
Relevant background literature | To demonstrate the relationship between the proposed study and what has already been done in the particular area; that is, to indicate the “gap” that the study will fill
Research question/s | To locate the study within a research paradigm. To provide an explicit statement of what the study will investigate
Definitions of terms | To provide the meaning of the key terms that have been used in the research question/s
Research methodology | To give an illustration of the steps the project will go through in order to carry out the research
Anticipated problems and limitations | To show awareness of the limitations of the study, what problems may be met in carrying it out and how they will be dealt with
Significance of the research | To say why the study is worth carrying out
Resources required/budget | To say what resources the research will require and what other costs may be anticipated in carrying out the study
Ethics | To provide a statement as to how participants will be advised of the overall nature of the study and how informed consent will be obtained from them
Proposed table of contents | To give an overview of the scale and anticipated organisation of the thesis or dissertation
Timetable | To give a working plan for carrying out, and completing, the study
References | To provide detailed references and bibliographic support for the proposal
Appendix | To provide examples of materials that might be used, or adapted, in the study

Criteria for Assessing Research Proposals

Once your proposal is ready to be submitted to your advisors or panel, you may be wondering how they will evaluate it. Cadman (2002) surveyed and interviewed thesis supervisors (advisors), asking them to prioritise the particular features they expected to see in a research proposal. If you have worked through this chapter and the suggested activities, her findings may not surprise you. The supervisors indicated that they gave most value to:

• the logic of the student’s argument
• a well-focussed research question, set of research objectives or hypothesis
• the width and depth of the student’s reading
• the feasibility of the student’s project
• a critical approach to the literature
• justification of the project through the literature
• understanding of current issues on the student’s topic
• matching of methodology and methods to the research questions

Now read through your proposal, imagining that you are the supervisor of your thesis or a member of your panel. Review your proposal using the criteria listed above and make any changes you think are needed.

Conclusion

Embarking on a doctoral thesis or dissertation is a major life event that is, I believe, in the majority of cases, a life-transforming one. Having said that, it is a huge investment on many levels and should not be engaged in without serious consideration. While this chapter has focussed on writing a research proposal, I would like to recommend an extremely thoughtful book that every prospective doctoral student should read long before getting to the proposal drafting stage. Aptly titled Before the Dissertation (Casanave, 2014), it explores (and explodes) many of the myths and realities of the doctoral journey honestly and clearly.

Resources for Further Reading

Casanave, C. (2014). Before the dissertation: A textual mentor for doctoral students at early stages of a research project. Ann Arbor: University of Michigan Press.
This wise and useful book is essential reading for those either contemplating or in the early stages of doctoral study.

Creswell, J. (2014). Research design (4th ed.). Thousand Oaks, CA: SAGE.
A key text for students designing a research project, Creswell covers all the angles.

Punch, K. (2012). Developing effective research proposals (2nd ed.). London: SAGE.
Shorter and sharper than Creswell, this book is a thorough beginner’s guide to developing a research proposal, written in a highly accessible style.

Note

1. In the North American context, PhD submissions are known as dissertations, while in countries with British higher education traditions, they are referred to as theses. In this chapter, I use them interchangeably to refer to the written submission of a doctoral candidate for examination.

References

Cadman, K. (2002). English for academic possibilities: The research proposal as a contested site. Journal of English for Academic Purposes, 1, 85–104.
Casanave, C. (2014). Before the dissertation: A textual mentor for doctoral students at early stages of a research project. Ann Arbor: University of Michigan Press.
Cheng, Y.-H. (2014). Dissertation grant proposals as “writing games”: An exploratory study of two L2 graduate students’ experiences. English for Specific Purposes, 36, 74–84.
Creswell, J. (2014). Research design (4th ed.). Thousand Oaks, CA: SAGE.
Feak, C. B., & Swales, J. M. (2011). Creating contexts: Writing introductions across genres. Ann Arbor, MI: University of Michigan Press.
Mullins, G., & Kiley, M. (2002). It’s a PhD, not a Nobel prize. Studies in Higher Education, 27, 369–386.
Murray, R. (2002). How to write a thesis. Maidenhead: Open University Press.
Paltridge, B. (1997). Thesis and dissertation writing: Preparing ESL students for research. English for Specific Purposes, 16, 61–70.
Paltridge, B., & Starfield, S. (2007). Thesis and dissertation writing in a second language: A handbook for supervisors. London: Routledge.
Paltridge, B., & Phakiti, A. (2015). Developing a research project. In B. Paltridge & A. Phakiti (Eds.), Research methods in applied linguistics: A practical resource (pp. 260–278). London: Bloomsbury.
Phakiti, A., & Paltridge, B. (2015). Approaches and methods in applied linguistics research. In B. Paltridge & A. Phakiti (Eds.), Research methods in applied linguistics: A practical resource (pp. 5–25). London: Bloomsbury.
Punch, K. (2012). Developing effective research proposals (2nd ed.). London: SAGE.
Rogers, P. M., Zawacki, T. M., & Baker, S. E. (2016). Uncovering challenges and pedagogical complications in dissertation writing and supervisory practices: A multimethod study of doctoral students and advisors. In S. Simpson, N. A. Caplan, & M. Cox (Eds.), Supporting graduate student writers (pp. 52–77). Ann Arbor, MI: University of Michigan Press.
Rudestam, K., & Newton, R. (2007). Surviving your dissertation. Thousand Oaks, CA: SAGE.
Starfield, S., & Paltridge, B. (2014). Generic support for developing a research proposal. In S. Carter & D. Laurs (Eds.), Developing generic support for doctoral students (pp. 112–115). London: Routledge.
Swales, J. M. (2004). Research genres: Explorations and applications. Cambridge: Cambridge University Press.
Tracy, S. J. (2013). Qualitative research methods: Collecting evidence, crafting analysis, & communicating impact. Chichester: Wiley-Blackwell.

10 Writing a Research Article

Betty Samraj

B. Samraj, Linguistics and Asian/Middle Eastern Languages, San Diego State University, San Diego, CA, USA

Introduction

As the “principal site for knowledge-making” (Hyland, 2009a, p. 67), the research article has been the focus of numerous discourse studies with two major goals: one, to understand the epistemologies of different disciplines and, the other, to inform the teaching of writing. The prominence granted to the discourse organization of the research article in Swales’ (1990) monograph Genre Analysis: English in Academic and Research Settings inspired a plethora of studies on the research article, the structure of its main sections in terms of functional moves and constituent steps, and rhetorical and linguistic features characterizing this genre. Comparisons of this genre across disciplines and languages have contributed to our understanding of disciplinary and cultural values in academic discourse. In addition, researchers have also compared parts of the genre to one another (e.g., results, discussions, and conclusions by Yang and Allison (2003), and abstracts and introductions by Samraj (2005)) and the research article to other genres such as textbooks (e.g., Kuhi & Behnam, 2011) in an effort to increase our understanding of this prestigious genre.

At the same time, researchers have also noted “the growing dominance of English as the global medium of academic publications” (Lillis & Curry, 2010, p. 1) and have pointed out that “the reward systems within which scholars work increasingly … foreground English-medium publications” (Lillis & Curry, 2010, p. 48). The need for scholars to publish in English-medium journals, especially those who might be considered “off-network” (Belcher, 2007, p. 2), makes English for research publication an urgent issue and has been a stated motivation for studies focusing on this prestige genre.

Since another recent handbook chapter has provided an overview of research on research articles in general (see Samraj, 2016, in the Routledge Handbook of English for Academic Purposes), the current chapter will limit its attention to studies on research articles from applied linguistics. In this chapter, I will discuss the findings from genre analyses of applied linguistics research articles, first, attending to analyses of the overall organization of the research article. Following a short review of studies on the macro-structure of applied linguistics research articles, I will consider the organization of the conventional main sections: abstracts, introductions, methods, results, discussions, and conclusions. Second, I will discuss studies reporting on the use of rhetorical features, such as metadiscourse and academic criticism, that have been analyzed in applied linguistics research articles. As I discuss these findings, I will point to ways students can apply these findings to their own writing of research articles. The final section of this chapter will provide further ways for novice writers to draw on research findings to shape their own writing in applied linguistics as well as implications for EAP instruction.

Macro-structure of Research Articles

Most studies on the organization of research articles have focused on the type and order of functional moves and steps in particular sections, such as the introduction and discussion section. A move is defined as a “discoursal or rhetorical unit that performs a coherent communicative function” in written or spoken discourse (Swales, 2004, pp. 228–229), and moves can be realized by one or more steps. Generally, research articles have been assumed to have an Introduction-Method-Results-Discussion (IMRD) structure, although some studies have questioned this assumption (Lin & Evans, 2012). The most comprehensive study of macro-organization and section headings in applied linguistics articles is one by Ruiying and Allison (2004), where they distinguish between primary and secondary applied linguistics research articles. Their analysis of research articles reporting primary research points to variability in the macro-organization of applied linguistics articles, although all contain the three sections, Introduction, Method, and Results. This variability arises from the presence of nonconventional headings such as experimental design in place of the conventional method and the use of content headings such as “L2 reading strategies” when reporting results. Ruiying and Allison (2004) also note the presence of additional sections, such as theoretical basis and literature review between the conventional introduction and methods sections and the section pedagogic implications close to the end of the research article. As shown by a later study discussed below (Lin, 2014), the presence of macro-organizations other than the conventional IMRD structure can have an impact on the structure of a conventional section, such as the introduction. Given this, writers should not immediately assume an IMRD structure when writing a research article but consider possible variations that are used by published writers in applied linguistics articles.

Ruiying and Allison (2004) postulate a macro-structure of introduction, argumentation, and conclusion for secondary research articles, which report critical reviews and syntheses of research. They further distinguish three kinds of organization for the argumentation section of such secondary research articles according to their overall purpose: “theory-oriented, pedagogy-oriented, and (pedagogic) application-oriented” argumentations (Ruiying & Allison, 2004, p. 275). For example, the pedagogy-oriented argumentation has a problem-solution or demand-supply pattern in contrast to a point-by-point pattern in the argumentation of a theory-oriented article. Writers preparing review articles then should consider the purpose of their syntheses and determine the structure of the argumentation based on this purpose. See also Li and Wang (Chap. 6) on traditional literature reviews and research syntheses.

Abstracts

Research article abstracts, long known to be more than an objective summary of the research article, are said to have become more of a stand-alone genre (Gillaerts & Van de Velde, 2010). Santos’s (1996) early study of abstracts from applied linguistics articles postulated five moves to account for this part-genre: (1) situating the research, (2) presenting the research, (3) describing the methodology, (4) summarizing the results, and (5) discussing the research, with moves two and three being obligatory. Santos found that moves one and five, where persuasive work is conducted in situating and justifying the research in terms of previous research and in connecting the results to research or the real world, were the least frequent in the abstracts in comparison to other moves. However, Hyland’s (2000) cross-disciplinary study using a similar framework found the frequencies of these moves in social science abstracts (including those from applied linguistics) to be much higher than their frequencies in abstracts from the hard sciences, indicating that these moves that connect the study reported to previous research and generalize the findings are important in the discipline of applied linguistics, even if they are less frequent than the other moves in abstracts. Pho (2008), in a comparison of abstracts from applied linguistics and educational technology, employing Santos’ (1996) framework, also revealed that the first move, situating the research, and last move, discussing the research, although less frequent than the other three abstract moves, were more frequent in applied linguistics than in the educational technology abstracts, providing further support for the importance of these moves in applied linguistics academic writing. Melander, Swales, and Fredrickson (1997) found that linguistics abstracts in English produced by Swedes were more likely to exclude introductions and conclusions (somewhat similar in function to moves one and five in Santos’ (1996) framework) than those produced by American English writers.

The analyses of abstracts in applied linguistics research articles discussed here indicate that although contextualizing the study and discussing research results may not be obligatory rhetorical functions of abstracts, they play a more important role in applied linguistics abstracts than in abstracts from some other disciplines. Authors of applied linguistics research articles should therefore consider including these functional moves, which perform more rhetorical work than some of the other more common moves in abstracts.

Introductions

Introductions in linguistics research articles have been well-studied, also with a focus on cross-disciplinary and cross-linguistic variation. Swales’ (1990, 2004) Create A Research Space (CARS) framework is frequently used in these studies. This framework includes three rhetorical moves of establishing a territory, establishing a niche, and presenting the present study. A simple version of the CARS model given in Feak and Swales’ (2011, p. 55) volume, Creating Contexts: Writing Introductions Across Genres, is given in Table 10.1 below.

Table 10.1  Moves in empirical research article introductions

Move 1: Establishing a research territory
(a) showing that the general research area is important, central, interesting, problematic, or relevant in some way (optional)
(b) introducing and reviewing items of previous research in the area (obligatory)
Move 2: Establishing a niche (citations to previous literature possible)
(a) indicating a gap in the previous research
(b) extending previous knowledge in some way
Move 3: Presenting the present work (citations to previous literature possible)
(a) outlining purposes or stating the nature of the present research (obligatory)
(b) listing research questions or hypotheses (probable in some fields but rare in others (PISF))
(c) announcing principal findings (PISF)
(d) stating the value of the present research (PISF)
(e) indicating the structure of the research paper (PISF)

An early study of introductions in language studies research articles in Swedish (Fredrickson & Swales, 1994, p. 15) noted the infrequent use of move two, “establish a niche” in the CARS framework. A more recent study (Sheldon, 2011) comparing the structure of introductions in applied linguistics research articles produced in Spanish by native speakers, and in English by native speakers and non-native speakers with Spanish as a first language (L1) produced some complex results. Spanish speakers writing in their L1 were more likely to include this second move than those writing in English as L2 (second language). This finding seems to contradict earlier studies capturing the absence of this move in research article introductions in languages other than English such as Malay research articles in agriculture (Ahmad, 1997), while at the same time indicating that establish a niche might be a difficult rhetorical function to perform in EAL, and that student writers might need more practice and help with producing this important move. Belcher’s (2009) study of the use of explicit gap statements, for example, by stating that research in a particular area has been limited (in contrast to implicit gap statements where academic criticism is hedged), by writers categorized as speakers of English as an international language (EIL) and those considered native English (EL) speakers resulted in the unexpected finding of EIL authors overwhelmingly preferring an explicit gap statement both at the beginning and end of a ten-year period (1996 to 2006) from which the data were gathered. EIL female writers also showed a greater preference for explicit gap statement over their female EL counterparts. Both male and female EL writers grew in their preference for explicit gap statements over this ten-year period. The increasing pressure to publish in competitive journals with low acceptance rates is presented as a possible reason for EIL writers, especially females, adopting what can be construed as a survival strategy (Belcher, 2009, p. 231). The above studies focusing on the establish a niche move in applied linguistics research articles should caution us against simple generalizations about the difficulty that any rhetorical move might pose to EAL (or EIL) writers of research articles. However, the increasing frequency in use of explicit gap statements in applied linguistics research article introductions underscores the need for novice writers to acquire this rhetorical move for publishing success.

Some studies on the structure of introductions have focused on sub-disciplines within applied linguistics. Two sub-disciplines, second language acquisition and second language writing, are the foci of a study by Ozturk (2007), which revealed the variability inherent in an interdisciplinary field such as linguistics. Research article introductions from the journal Studies in Second Language Acquisition display the three-move structure of the CARS model much more frequently than the introductions from the Journal of Second Language Writing, where introductions manifested a variety of rhetorical organizations. Another study focusing on these same subdisciplines (Rodriguez, 2009) noted the more frequent use of centrality claims, that is, statements that assert the importance of the topic being explored, in terms of real-world relevance in introductions from second language writing and a preference for centrality claims foregrounding research vigor and interest in second language acquisition research. Novice writers, then, would benefit from exploring the functional organization of research article introductions and the centrality claims employed in the journals to which they wish to submit their manuscripts because of intradisciplinary variation.

Variations or innovations in introduction structure in applied linguistics have also been identified in research articles where literature reviews are found as sections between introductions and methods, an increasingly common structure in a variety of disciplines (Lin, 2014). The study by Lin (2014) showed that deviations from the conventional IMRD research article structure can have an impact on the rhetorical organization of other part-genres, such as introductions and methods. Two groups of introductions were identified in articles with a subsequent literature review section.
One set included introductions with the regular CARS structure while the other nontraditional or orientation introduction included a two-move structure, where the first move identified key issues and the second move presented the study, similar to the third move in the CARS model. Lin (2014) reported that these orientation introductions did not contain substantial niche establishment although they contained a sub-move where the value of the research issue was explicated. Since the structure of the literature review was not analyzed in such research articles, it is not clear if niche establishment was more prevalent in that part-genre. What a writer may consider though is that the presence of a separate literature review might alter the shape of the introduction and its persuasive strength.



Methods

The methods section in research articles remained relatively unexplored after Swales’ (1990) first monograph focusing on this genre. However, the last decade has seen an increase in interest in this section following Swales’ (2004) discussion of methods as being on a cline with clipped (fast) texts on one end and elaborated (slow) ones on the other, although few studies have been conducted on methods sections in applied linguistics research articles. In one study, Lim (2011a) reports on analyses of particular steps in the move delineating sample procedures found in methods sections of experimental reports from applied linguistics, motivated both by the need to explicate cross-disciplinary variation in organizational structure and his experience with the challenges posed by particular features of methods construction to his L2 writers in Malaysia. Using both qualitative and quantitative analyses, he identified the structure of the steps describing the sample/participants and justifying the sampling procedures and the grammatical features that experienced writers frequently use with these steps. Because of the limited research on methods sections in applied linguistics research articles, novice writers might compare the characteristics given by Swales (2004) for clipped and elaborated texts against published research articles from their sub-discipline of applied linguistics.

Results, Discussions, and Conclusions

The most comprehensive analysis of sections that follow methods in research articles is Yang and Allison’s (2003) qualitative study that identified the linear and hierarchical structure of these sections while explicating the complex ways in which results, discussions, and conclusions interrelate. The two-level analysis in terms of moves and steps captured differences across these sections that are not just due to the presence of unique moves but also differences in frequencies and development of the same moves in different sections. The primary communicative function of the results section, reporting results, is seen in the multiple iterations of the move reporting results and the relative infrequency of the move commenting on results, which is not only obligatory but also more extensively developed in discussion sections. The conclusions section contains three moves, summarizing the study, evaluating the study, and deductions from the research, all of which can also appear in a discussion section. However, the same moves vary in their constituent steps across the two sections. In addition, the value of each move in a section is impacted by the other moves also present. The focus of the discussion sections is the commentary on specific results while the focus with the conclusion is a more general one on the overall results and an evaluation of the study as a whole (Yang & Allison, 2003).

A number of other studies have focused on just one of these sections that follow the methods section (e.g., Peacock, 2002) or even a move in a particular section such as the comments on results move in discussions (Basturkmen, 2009). Following early analyses of the discussion section (e.g., Holmes, 1997; Hopkins & Dudley-Evans, 1988), Peacock (2002) engaged in a multidisciplinary analysis, contrasting native and non-native authors, of what has been identified as a challenging part-genre for novice writers. Language and linguistics was one of the seven disciplines included in this study that identified finding, claim, and reference to previous research as obligatory moves in discussions across disciplines, using a single-level framework of nine moves. Discussion sections from language and linguistics are characterized by more frequent use of the move reference to previous research, greater cycling of moves and less frequent use of recommendations for further research. In a later study, Dujsik (2013) used Peacock’s (2002) framework to analyze discussions from five applied linguistics journals and revealed that most moves in the texts in his corpus occurred with similar frequencies to Peacock’s results. Lim (2010), asserting the need for research on research articles from specific disciplines to prevent overemphasis on some rhetorical features in English for Academic Purposes (EAP) writing instruction, analyzed the move of commenting on results in results sections (not discussion sections as in Basturkmen, 2009), where such comment moves are also found.
Comparing the structure of the move in education and applied linguistics research articles, Lim (2010) revealed that the steps, explaining a finding, evaluating a finding, and comparing a finding with the literature, were all much more prevalent in the applied linguistics articles in contrast to the education articles, leading him to conclude that education results sections were “comment-stripped” (p. 291). In another study on the reporting of results, Lim (2011b) again contrasts education and applied linguistics to explore the steps used in the move paving the way for research findings (labeled in Yang and Allison’s (2003) study as preparatory information) as well as the linguistic structures that characterize these steps. Lim’s (2011b) study identified four specific steps that pave the way to a report of results: (1) indicating the structure of the result section to be presented, (2) providing background information to the results to be reported, (3) reiterating research questions/purposes, and (4) stating location of data. Interestingly, the first three steps were more common in the applied linguistics articles with mean frequencies at least twice as high as those for the education articles. The last step of indicating the location of data in tables and graphs had similar frequencies in both sets of data. This study, like others discussed earlier in this chapter, sheds light on the discoursal preferences exhibited in applied linguistics research articles, which Lim (2011b, p. 743) refers to as “unpredictable complexity” found in the real world of discourse that is quite different from the idealized discourses often held out as the standard in language teaching.

In a study comparing discussion sections in student-produced theses and published research articles in applied linguistics (specifically language teaching), Basturkmen (2009) focused on the construction of argument in one move in discussions, commenting on results. Her fine-grained analysis provides a helpful picture of the construction of elaborate arguments in this move that is part of a common results-comment cycle in discussions. Alternative explanations for results, references to literature in support of explanations, and evaluation of explanations contribute to the complex comments move. Student-produced discussions contained the same steps as those of experienced writers, but the student writers tended to include many more results and compared their results to those in the literature and, importantly, provided far fewer alternative explanations and were less likely than the expert writers to extend their findings to general theory.

Conclusions in research articles, although not foregrounded in the traditional IMRD structure, have received some attention, especially after Yang and Allison (2003) identified the main rhetorical moves in conclusions (mentioned earlier) and specified their relationship to those that constitute the discussion section.
One such study (Amnuai & Wannaruk, 2013) used Yang and Allison’s (2003) move framework to compare the structure of conclusions in applied linguistics research articles published in international journals with those written by Thai writers in English-language journals published by high-ranking government universities in Thailand. This study revealed that 35% of the conclusions in the Thai journals contained just one move. More importantly, only around 20% of the Thai corpus contained the move evaluating the study and 45% contained the move deductions from the research, significantly lower than the frequencies found in the international corpus. As the authors of this study conclude, non-native and inexperienced writers need to realize the importance of evaluating their studies, contextualizing their findings, and generalizing their research findings in the conclusion.

The results discussed so far in this section point to a number of discoursal preferences seen in applied linguistics research articles that novice writers could adopt to produce successful research articles instead of the idealized discourse that might be held up as a standard in language teaching. They could provide background information and reiterate research questions before reporting their results. Other useful strategies to adopt would be commenting on results, moving from results to generalizations about results, and providing alternative explanations for the results being discussed. More than one study has also revealed the importance of intertextual links to previous research in different moves in both the results and discussion sections, indicating that novice writers should embed their own work in previous research in these sections. Providing evaluations of their studies and pointing to implications from their research in their conclusions might also help novice writers produce successful research articles.

Conclusion on Macro-structure of Research Articles

The current discussion of the rhetorical organization of applied linguistics research articles has indicated that even articles reporting empirical findings may exhibit structures other than the conventional IMRD structure. Studies of the rhetorical structure of sections have revealed the importance of certain rhetorical moves in applied linguistics research articles. Connecting the study being reported to previous literature in the field has been shown to be important in the abstracts (introduction or situating-the-study move) and introductions (establish a niche move) in addition to the discussion section. Furthermore, generalizing from results was also shown to be important in applied linguistics abstracts (Santos, 1996). The analysis of move structure has also highlighted the role of commentary in both the results and discussion sections of applied linguistics research articles (Basturkmen, 2009; Lim, 2010). In fact, complex argumentation can be built in discussion sections through the use of alternative explanations for results and their evaluation in the commentary of the results reported (Basturkmen, 2009). Novice writers need to be mindful of these general features of research articles from applied linguistics while also considering intra-disciplinary variation in writing norms in applied linguistics.

Rhetorical Features Characterizing Applied Linguistics Research Articles

In seeking to write a successful research article, the author has to demonstrate membership in the target disciplinary community not only by producing a text that follows the generic structure in terms of moves and steps valued by expert members of the community but also by manifesting the sort of author persona valued in this genre in that target disciplinary community. The sort of author persona constructed in academic writing has been explored in a range of studies that can be broadly construed as studies of metadiscourse (e.g., Hyland, 2005; Lorés Sanz, 2008). As Kuhi and Behnam (2011, p. 98) state, “metadiscourse is a principled way to collect under one heading the diverse range of linguistic devices writers use to explicitly organize texts, engage readers, signal their own presence and signal their attitudes to their material.” The findings from the studies discussed below then show novice writers how to manage their authorial presence and their relationships with their readers when writing a research article.

A comparative study of metadiscourse in a number of academic genres in applied linguistics has revealed the nature of metadiscourse that characterizes research articles (Kuhi & Behnam, 2011). Through a detailed and substantive analysis of a range of metadiscourse features, such as evidentials (explicit references to other sources), hedges (e.g., the modal may), directives (e.g., imperatives), and reader pronouns (e.g., you), Kuhi and Behnam (2011, p. 116) show that “language choices reflect the different purposes of writers, the different assumptions they make about their audiences, and the different kinds of interactions they create with their readers.” Their findings showed that the use of evidentials and hedges to indicate deference to the academic community was high in research articles while the use of directives and reader pronouns, which convey an imposition on the reader, was rare. In contrast, the latter set was common in introductory textbooks, a low prestige academic genre. Interpersonal resources such as self-mention and explicit references to other texts were also more valued in research articles than the other academic genres analyzed.
Hence, acquiring metadiscoursal norms that would allow an author to engage in “a dialogism that is a manifestation of positive politeness and communality” (Kuhi & Behnam, 2011, p. 121) would be essential for novices to be successful in producing research articles.

A subset of the textual realizations of interpersonal meanings has also been examined in sections of linguistics research articles, such as abstracts and discussions. The use of metadiscourse resources in research article abstracts has been the focus of a number of studies. Lorés Sanz (2008) compared author visibility, which conveys authority and originality, in abstracts and other sections of research articles from three areas in linguistics (English for specific purposes, pragmatics, and general linguistics) through an analysis of the use of the first person pronoun. The author’s voice is shown not to be very strong in abstracts but strongest in the results sections of research articles where his/her contributions to the field are foregrounded. Interestingly, author presence was found to be more muted in the discussions and conclusions sections where the use of other linguistic features, such as impersonal active constructions and agentless passive constructions, resulted in the construction of a more objective stance.

Interpersonality in applied linguistics research article abstracts was considered from a diachronic perspective in Gillaerts and Van de Velde’s (2010) study that analyzed interactional metadiscourse (Hyland, 2005), specifically hedges, boosters, and attitude markers, in texts from 1982 to 2007. The study revealed a drop in use of interactional metadiscourse, particularly due to a drop in boosters (e.g., clearly) and attitude markers (e.g., is misleading), prompting the authors to speculate whether this showed the move of applied linguistics as a discipline toward the norms held by the hard sciences. In contrast, the use of hedges remained strong in this period, and the authors, in fact, showed a rise in the use of a combination of hedges, boosters, and attitude markers. The combination of features mitigates author stance in abstracts, which Gillaerts and Van de Velde (2010, p. 137) postulate could be due to the increase in size of the applied linguistics discourse community. The continued relative frequency of hedges in applied linguistics abstracts is in line with Kuhi and Behnam’s (2011) finding about the importance of hedges in applied linguistics research articles.

Applied linguistics research articles in English form a part of the corpus of a multifaceted cross-disciplinary, cross-linguistic study of academic writing called the KIAP project (Fløttum, 2010). Dahl and Fløttum (2011) examined the construction of criticism as part of the KIAP project and focused on research article introductions from linguistics and economics and considered how these criticisms were constructed as authors made new claims within established disciplinary knowledge.
They further analyzed each criticism along three dimensions: whether the author was explicitly visible (writer mediated), whether the criticism was specifically directed at an author (personal/impersonal), and whether the criticism was hedged. The total number of instances of criticism was higher in the linguistics introductions than in the economics introductions, although criticisms were found in a greater number of economics texts. Most criticisms in both disciplines were unhedged and not writer mediated. The linguistics introductions included a larger proportion of criticisms that were author directed, and hence more pointed, than those in the economics texts. Dahl and Fløttum (2011, pp. 273, 278) argue that the main function of the criticism in both disciplines is “showing the uniqueness and originality of the writer’s findings” and that authors’ “new claims often take the form of a posited difference between established and new knowledge.”



Another study that is also part of the KIAP project explored the influence of language and discipline on the construction of author identity and polyphony (or other voices) in research articles (Fløttum, 2010). This study explored the construction of three common author roles (author as researcher, writer, and arguer) in research articles from three disciplines, linguistics, economics, and medicine, in three languages, English, French, and Norwegian. Based mainly on an analysis of the use of the first person pronoun and accompanying verbs, linguistics authors are said to assume all three author roles and to be most “clearly present of the three discipline profiles, as well as the most explicitly argumentative and polemical authors” (Fløttum, 2010, p. 273). In contrast, economics authors are researchers and writers (text guides) and less explicitly argumentative. In another study using the corpus from the KIAP project, Dahl (2004) analyzed the use of specific kinds of metadiscourse in research articles from the same three disciplines, economics, linguistics, and medicine, and languages, Norwegian, French, and English. Rhetorical metatext, which marks the rhetorical acts in the argumentation and was analyzed through the use of verbs that refer to discourse acts, such as discuss and argue, was found to the same extent in the linguistics and economics articles in English. The results point to argumentation being much more a part of the knowledge-construction process in these two disciplines than in medicine, where, according to Dahl (2004, p. 1820), the results are said to “reside outside the texts.” Locational metatext, where the author points to the text itself or its component parts, was used more by economics authors than linguistics authors in both English and Norwegian. However, the linguistics texts contained more locational metatext than the medical research articles.
Dahl (2004) posits two reasons for the greater use of rhetorical and locational metatext in economics and linguistics: the relative youth of these disciplines and the need for results in these two disciplines to be more subjectively interpreted. A relatively lower use of metadiscourse was found in the French texts for all disciplines and seemed to indicate less author presence and responsibility for argumentation structure and signposting in research articles no matter the discipline.

These studies on author presence and metadiscourse in research articles in applied linguistics have yielded several key findings. Research articles in applied linguistics on the whole are characterized by the presence of hedges, evidentials, self-mention, and references to other sources, which enable authors to be deferential toward the disciplinary community and to acknowledge the value of previous research while asserting the author’s own place in the community (Kuhi & Behnam, 2011). While the use of some interactional metadiscourse might have decreased over time, the use of hedging has remained relatively strong in applied linguistics research articles (Gillaerts & Van de Velde, 2010). Although an author’s presence in a research article varies across different sections, authorial presence and explicit argumentation as knowledge-construction are valued in applied linguistics writing (Dahl, 2004; Fløttum, 2010; Lorés Sanz, 2008). In addition, academic criticism is not uncommon (Dahl & Fløttum, 2011).

These findings from analyses of metadiscourse and author presence in applied linguistics research articles reveal specific ways in which metadiscourse is important in applied linguistics research articles in English and clearly point to the need for novices and non-native speakers to focus their attention on the linguistic features that construct an appropriate author persona and writer-reader relationship. A simple activity that novice writers in an academic writing class could engage in would be a comparison of applied linguistics research articles from which these features have been removed and unmodified research articles containing these elements of metadiscourse, author presence, and argumentation. Such an activity can focus the novice writer’s attention on the functions performed by these discoursal features. Junior scholars in applied linguistics could themselves compare sections of the research article such as methods and discussions with those from another discipline for explicit use of criticism, argumentation, hedges, self-mention, and reference to previous research in order to raise their awareness of practices in applied linguistics research articles.
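
The comparison of sections suggested above could be supported by simple frequency counts. The following Python sketch is one illustrative possibility and is not a procedure taken from any of the studies cited: it counts hedges, boosters, and self-mentions per 1,000 words in a plain-text section. The word lists and the names tokenize and marker_rates are assumptions made for this example only. A real analysis would need fuller lists and a manual check of each instance in context, since a form such as may does not always function as a hedge.

```python
import re
from collections import Counter

# Illustrative marker lists (assumptions for this sketch, not lists
# drawn from Hyland, 2005, or from any of the studies discussed).
CATEGORIES = {
    "hedges": {"may", "might", "could", "perhaps", "possibly",
               "suggest", "suggests", "seem", "seems"},
    "boosters": {"clearly", "certainly", "obviously", "definitely"},
    "self_mention": {"i", "we", "my", "our"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def marker_rates(text, per=1000):
    """Return the frequency of each marker category per `per` words."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        # An empty section has no measurable rate; report zeros.
        return {name: 0.0 for name in CATEGORIES}
    counts = Counter(tokens)
    return {
        name: per * sum(counts[word] for word in words) / total
        for name, words in CATEGORIES.items()
    }


if __name__ == "__main__":
    # Hypothetical sentences standing in for two article sections.
    discussion = "We suggest that this pattern may reflect task effects."
    methods = "Participants completed two tasks in a single session."
    print("discussion:", marker_rates(discussion))
    print("methods:", marker_rates(methods))
```

Running the same function over a methods section and a discussion section, or over sections from two disciplines, gives rates that can be set side by side. The numbers are only a starting point for the closer reading of functions that the activity is meant to encourage.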

Implications for Writing a Research Article and Conclusions

Several studies have discussed the growing pressure on EAL writers to publish in English-medium research journals and the challenges they may face (Flowerdew, 2014; Hyland, 2009b). Flowerdew’s (2014) chapter on “English for research publication purposes” provides a helpful overview of these issues, including the need for EAL writers to appropriately interpret manuscript reviewer comments, the power relationships between writer and supervisor and writer and editor in academic publication, and the roles played by literacy brokers, those other than named authors such as editors and translators, in the publication process (Lillis & Curry, 2010).

One of the early stages in the process of being published in English is learning to write a research article in English. Courses in EAP (or English for specific academic purposes (Flowerdew, 2016)) and English for research publication processes have benefitted from the findings from discourse studies of the research article, both of the macro-structure of various sections of the research article and of the rhetorical features that characterize this genre. The results of cross-disciplinary studies or those that focus on the unique features of the research article from a particular discipline have underscored the need for EAP courses that acknowledge disciplinary variation in genres and conventions of academic communities. Many of these findings have been transformed into excellent teaching materials in volumes, such as those by Swales and Feak (2000, 2004) and Feak and Swales (2009, 2011), used to familiarize students with the discourse and linguistic tools needed to attain success for writing in various essential social contexts, thereby acculturating them into a variety of target disciplinary communities.

The results of analyses of the research article from the sub-discipline of applied linguistics have identified not merely linguistic features but also the communicative functions expressed by organizational structures and linguistic choices, such as the use of the first person pronoun. These results can be employed as data in EAP courses for rhetorical consciousness-raising tasks, an important insight from Swales (1990) and a “fundamental feature of his pedagogic approach,” where students are made aware of the linguistic features of a genre and their connection to communicative functions (Flowerdew, 2015, p. 104; Flowerdew, 2016). The growing use of corpora and computational techniques in EAP research has also had an impact on the teaching of EAP, especially the teaching of writing of the research article. Lee and Swales (2006) and Charles (2014), among others, report on the use of student-built corpora of research articles in advanced EAP courses with students from multiple disciplines.
While the results of the studies on applied linguistics research articles can be used in the design of EAP materials and tasks, with or without electronic corpora and computational tools, in what has been labeled as a pragmatist approach, it might serve us well to remember that Swales (1997, p. 381) refers to his approach to teaching academic writing to advanced students as liberation theology because in his EAP course he seeks to free his students from “consuming attention to the ritualistic surfaces of their texts, … from dependence on imitation, on formulas, and on cut-and-paste anthologies of other writers’ fragments” among other things. Bearing this in mind, practitioners should develop EAP tasks and materials that promote discovery-based analysis of relevant data, raise their students’ rhetorical consciousness, and lead to the writing of successful research articles while ensuring the maintenance of some rhetorical diversity (Mauranen, 1993).



Applied linguistics students not attending an EAP course can use research articles from a preferred applied linguistics journal to explore the disciplinary features discussed in this chapter. Keeping in mind the advice of scholars such as Swales (1997) and Mauranen (1993), I present here some suggestions for novice writers based on aspects of applied linguistics research articles reviewed in this chapter. Those seeking to write research articles in applied linguistics might benefit from considering certain dimensions of this genre in this discipline. They could analyze research articles from journals they are targeting as a venue for their own work in order to answer the questions given in Table 10.2, which focus on text structures. As discussed earlier, a number of rhetorical features have also been analyzed in applied linguistics research articles. The same set of research articles from the journal selected as a publication venue can be used by novice writers to explore the questions provided in Table 10.3, which can focus the writer’s attention on a few select rhetorical functions. Seeking to answer the questions given in these two tables would focus writers’ attention on some key dimensions to consider when writing a research article without limiting the students’ options to merely adopting language choices from published texts. Instead, these questions could help a writer maintain some rhetorical diversity while adhering to genre convention in his/her sub-discipline.

Table 10.2  Dimensions to consider when constructing a research article

Overall organization:
1. What should be the main sections of the research article?
2. Should I have a separate literature review between the introduction and methods section?

Abstract:
1. Should I have a preliminary move where I connect my research to previous work?
2. Should I end with a move that states the implications of my study?

Introduction:
1. Should I include all the moves in the CARS model?
2. Should I provide a gap? How explicit should the gap be?

Methods:
1. What features of the clipped and elaborated methods should I include?

Results:
1. Should I focus solely on reporting on results?
2. Should I also comment on results by connecting to previous research and by evaluating or explaining a finding?

Discussion:
1. How should I comment on results?
2. Should I provide alternate explanations for results and evaluate them?
3. Should I draw on previous literature?

Conclusion:
1. After providing a summary of the study, should I evaluate it?
2. Should I provide deductions from the study?

Table 10.3  Discovering norms for use of metadiscoursal features

Intertextual links:
1. Where in the research article and how should I refer to previous literature?
2. Should references to other authors be explicit?

Author presence and strength of claims:
1. Should my authorial role be explicit in various sections?
2. What discourse functions warrant use of the first person?
3. How should I criticize author claims explicitly?
4. How many hedges, boosters, and attitude markers should I use in making claims to establish new knowledge?

Applied linguists have performed a number of studies on the research article from their own field, which can inform pedagogy as discussed above. These studies seem timely given the growing number of EAL graduate students in the field. Given the diversity in foci within applied linguistics (Kaplan, 2010), a greater number of intra-disciplinary studies on the research article can enhance our understanding of academic conventions in this field. Further, research articles employing quantitative and qualitative methodologies can also be compared to add to this understanding of discourse norms in applied linguistics.

Resources for Further Reading

Feak, C., & Swales, J. M. (2009). Telling a research story: Writing a literature review. Ann Arbor, MI: The University of Michigan Press.

This volume focuses on the literature review and includes valuable information on the use of metadiscourse, taking a stance, and choices in citations in literature reviews.

Feak, C., & Swales, J. M. (2011). Creating contexts: Writing introductions across genres. Ann Arbor, MI: The University of Michigan Press.

Introductions from a few academic genres (such as proposals and book reviews) are attended to in this volume, with the most attention paid to research article introductions. The various moves in research article introductions and essential language features are the focus.



Samraj, B. (2016). Research articles. In K. Hyland & P. Shaw (Eds.), The Routledge handbook of English for academic purposes (pp. 403–415). New York, NY: Routledge.

This chapter provides a review of studies on research articles from a variety of disciplines, not just applied linguistics. As such, it captures some of the variation in disciplinary norms manifested in the research article.

Swales, J. M. (2004). Research genres: Explorations and applications. Cambridge: Cambridge University Press.

This volume discusses the nature of a number of research genres including the Ph.D. defense and research talks. Especially relevant is the chapter on the research article, which provides a comprehensive discussion of the standard research article, the review article, and short communications.

Swales, J. M., & Feak, C. (2009). Abstracts and the writing of abstracts. Ann Arbor, MI: The University of Michigan Press.

This volume, like the others listed below, can be used by instructors or independent researcher-users to teach or learn about the structure of a particular academic genre and the linguistic choices that characterize that genre. This volume focuses on different kinds of abstracts (such as conference abstracts) but pays particular attention to the research article abstract. The carefully constructed tasks will develop the user’s rhetorical awareness of the genre and provide practice in producing the genre.

Swales, J. M., & Feak, C. (2011). Navigating academia: Writing supporting genres. Ann Arbor, MI: The University of Michigan Press.

This fourth volume in this series perhaps is the least focused on the writing of a research article. However, it does contain useful information regarding communication surrounding the publication process such as responding to reviewer comments. It also includes some information on the author biostatement that accompanies the research article.

References

Ahmad, U. K. (1997). Research article introductions in Malay: Rhetoric in an emerging research community. In A. Duszak (Ed.), Culture and styles of academic discourse (pp. 273–304). Berlin: Mouton de Gruyter.
Amnuai, W., & Wannaruk, A. (2013). A move-based analysis of the conclusion sections of research articles published in international and Thai journals. 3L; language, linguistics and literature. The Southeast Asian Journal of English Language Studies, 19(2), 53–63.
Basturkmen, H. (2009). Commenting on results in published research articles and masters dissertations in language teaching. Journal of English for Academic Purposes, 8(4), 241–251.
Belcher, D. D. (2007). Seeking acceptance in an English-only research world. Journal of Second Language Writing, 16(1), 1–22.
Belcher, D. D. (2009). How research space is created in a diverse research world. Journal of Second Language Writing, 18(4), 221–234.
Charles, M. (2014). Getting the corpus habit: EAP students’ long-term use of personal corpora. English for Specific Purposes, 35, 30–40.
Dahl, T. (2004). Textual metadiscourse in research articles: A marker of national culture or of academic discipline? Journal of Pragmatics, 36(10), 1807–1825.
Dahl, T., & Fløttum, K. (2011). Wrong or just different? How existing knowledge is staged to promote new claims in English economics and linguistics articles. In F. Salager-Meyer & B. Lewin (Eds.), Crossed words: Criticism in scholarly writing (pp. 259–282). Bern and Berlin: Peter Lang.
Dujsik, D. (2013). A genre analysis of research article discussions in applied linguistics. Language Research, 49(2), 453–477.
Feak, C., & Swales, J. M. (2009). Telling a research story: Writing a literature review. Ann Arbor, MI: University of Michigan Press.
Feak, C. B., & Swales, J. M. (2011). Creating contexts: Writing introductions across genres. Ann Arbor, MI: University of Michigan Press.
Fløttum, K. (2010). Linguistically marked cultural identity in research articles. In G. Garzone & J. Archibald (Eds.), Discourse, identities and roles in specialized communication (pp. 267–280). Bern: Peter Lang.
Flowerdew, J. (2014). English for research publication purposes. In B. Paltridge & S. Starfield (Eds.), The handbook of English for specific purposes (pp. 301–321). Chichester: Wiley-Blackwell.
Flowerdew, J. (2015). John Swales’s approach to pedagogy in genre analysis: A perspective from 25 years on. Journal of English for Academic Purposes, 19, 102–112.
Flowerdew, J. (2016). English for specific academic purposes writing: Making the case. Writing & Pedagogy, 8(1), 5–32.
Fredrickson, K., & Swales, J. (1994). Competition and discourse community: Introductions from Nysvenska studier. In B. Gunnarsson, P. Linell, & B. Nordberg (Eds.), Text and talk in professional contexts. ASLA: Sweden.
Gillaerts, P., & Van de Velde, F. (2010). Interactional metadiscourse in research article abstracts. Journal of English for Academic Purposes, 9(2), 128–139.
Holmes, R. (1997). Genre analysis, and the social sciences: An investigation of the structure of research article discussion sections in three disciplines. English for Specific Purposes, 16(4), 321–337.
Hopkins, A., & Dudley-Evans, T. (1988). A genre-based investigation of the discussion sections in articles and dissertations. English for Specific Purposes, 7(2), 113–121.
Hyland, K. (2000). Disciplinary discourses: Social interactions in academic writing. Harlow: Pearson.
Hyland, K. (2005). Metadiscourse. New York, NY: Continuum.
Hyland, K. (2009a). Academic discourse: English in a global context. London: Bloomsbury Publishing.
Hyland, K. (2009b). English for professional academic purposes: Writing for scholarly publication. In D. Belcher (Ed.), English for specific purposes in theory and practice (pp. 83–105). Ann Arbor, MI: University of Michigan Press.
Kaplan, R. (Ed.). (2010). The Oxford handbook of applied linguistics (2nd ed.). New York, NY: Oxford University Press.
Kuhi, D., & Behnam, B. (2011). Generic variations and metadiscourse use in the writing of applied linguists: A comparative study and preliminary framework. Written Communication, 28(1), 97–141.
Lee, D., & Swales, J. (2006). A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora. English for Specific Purposes, 25(1), 56–75.
Lillis, T., & Curry, M. J. (2010). Academic writing in a global context: The politics and practices of publishing in English. New York: Routledge.
Lim, J. M. H. (2010). Commenting on research results in applied linguistics and education: A comparative genre-based investigation. Journal of English for Academic Purposes, 9(4), 280–294.
Lim, J. M. H. (2011a). Delineating sampling procedures: Significance of analyzing sampling descriptions and their justifications in TESL experimental research reports. Ibérica, 21, 71–92.
Lim, J. M. H. (2011b). ‘Paving the way for research findings’: Writers’ rhetorical choices in education and applied linguistics. Discourse Studies, 13(6), 725–749.
Lin, L. (2014). Innovations in structuring article introductions: The case of applied linguistics. Ibérica, 28, 129–154.
Lin, L., & Evans, S. 
(2012). Structural patterns in empirical research articles: A cross-­ disciplinary study. English for Specific Purposes, 31, 150–160. Lorés Sanz, R. (2008). Genres in contrast: The exploration of writers’ visibility in research articles and research article abstracts. In S.Burgess & P.Mártin-Mártin (Eds.), English as an additional language and research publication and communication (pp.105–123). Bern: Peter Lang. Mauranen, A. (1993). Cultural differences in academic discourse–problems of a linguistic and cultural minority. Afinla-Import, 23(51), 157–174. Melander, B., Swales, J.M., & Fredrickson, K. (1997). Journal abstracts from three academic fields in the United States and Sweden: National or disciplinary proclivities? In A.Duszak (Ed.), Culture and styles of academic discourse (pp.251–272). Berlin: Mouton de Gruyter.

  Writing aResearch Article 


Ozturk, I. (2007). The textual organisation of research article introductions in applied linguistics: Variability within a single discipline. English for Specific Purposes, 26(1), 25–38. Peaco*ck, M. (2002). Communicative moves in the discussion section of research articles. System, 30(4), 479–497. Pho, P.D. (2008). Research article abstracts in applied linguistics and educational technology: A study of linguistic realizations of rhetorical structure and authorial stance. Discourse studies, 10(2), 231–250. Rodriguez, M. (2009). Applied linguistics research articles: A genre study of sub-­ disciplinary variation. Unpublished MA thesis, San Diego State University. Ruiying, Y., & Allison, D. (2004). Research articles in applied linguistics: Structures from a functional perspective. English for Specific Purposes, 23(3), 264–279. Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introductions in two disciplines. English for Specific Purposes, 24(2), 141–156. Samraj, B. (2016). Research articles. In K.Hyland & P.Shaw (Eds.), The Routledge handbook of English for academic purposes (pp. 403–415). New York, NY: Routledge. Santos, M. B. D. (1996). The textual organization of research paper abstracts in applied linguistics. Text, 16(4), 481–499. Sheldon, E. (2011). Rhetorical differences in RA introductions written by English L1 and L2 and Castilian Spanish L1 writers. Journal of English for Academic Purposes, 10(4), 238–251. Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press. Swales, J.M. (1997). English as tyrannosaurus rex. World Englishes, 16(3), 373–382. Swales, J. M. (2004). Research genres: Explorations and applications. Cambridge: Cambridge University Press. Swales, J.M., & Feak, C. (2000). English in today’s research world: A writing guide. Ann Arbor, MI: University of Michigan Press. Swales, J.M., & Feak, C. (2004). 
Academic writing for graduate students: Essential tasks and skills. Ann Arbor, MI: University of Michigan Press. Swales, J.M., & Feak, C. (2009). Abstracts and the writing of abstracts. Ann Arbor, MI: University of Michigan Press. Yang, R., & Allison, D. (2003). Research articles in applied linguistics: Moving from results to conclusions. English for Specific Purposes, 22(4), 365–385.

Part II Research Instruments, Techniques, and Data Sources

There are eight chapters in this part of the Handbook. In each chapter, the authors discuss current and core issues and key research strategies, instruments, and techniques for data gathering. They address challenges and controversial issues, as well as limitations and future directions. The authors also provide examples from sample studies and annotated resources for further reading.

• In Chap. 11 (Interviews and Focus Groups), Matthew Prior discusses the philosophical and methodological background to conducting and evaluating interview and focus group research in applied linguistics. Prior discusses the strengths and limitations of various approaches and deconstructs “commonsense” assumptions associated with interviewing by examining the influence of two prevalent perspectives: “interview as research instrument” and “interviews as social practice.” Prior also gives attention to issues related to rapport, language choice, interculturality, and the “naturalness” of interview data.
• In Chap. 12 (Observation and Fieldnotes), Fiona Copland provides an overview of current approaches to observation with a particular focus on fieldnotes as data. Drawing on a number of recent observational studies in educational and workplace settings, Copland shows how approaches to observation respond to the contextual realities of the research site such as access, relationships, and level of intrusion. The author also discusses the status of fieldnotes as data and how they can be analyzed to develop empirical findings.



• In Chap. 13 (Online Questionnaires), Jean-Marc Dewaele weighs the advantages and disadvantages of data collection through online questionnaires, as opposed to the traditional pen-and-paper method. Online questionnaires allow researchers to reach out to a larger, more diverse, and more motivated pool of potential participants (i.e., not just their own students). The chapter considers these issues and offers practical advice on how to develop questionnaires, how to run them, and how to interpret the findings.
• In Chap. 14 (Psycholinguistic Methods), Sarah Grey and Kaitlyn M. Tagarelli provide insights into the mental representation of language. The authors discuss common psycholinguistic tasks and measures together with their strengths and limitations. They review traditional behavioral measures such as response time as well as newer measures, including eye-tracking and event-related potentials. They also discuss the major experimental paradigms that are employed with psycholinguistic tasks and measures and consider the research conducted with these paradigms.
• In Chap. 15 (SLA Elicitation Tasks), Susan Gass deals with two common elicitation procedures in second language acquisition (judgment tasks and elicited imitation). The chapter begins with a discussion of the uses and misuses of judgment data. The major focus is on the practicalities involved in collecting acceptability/grammaticality judgment data, although mention is also made of other types of judgment data (e.g., truth-value judgments) and an alternative to judgment data, namely, magnitude estimation.
• In Chap. 16 (Introspective Verbal Reports: Think-Alouds and Stimulated Recall), Melissa Bowles provides an overview of the use of verbal reports and synthesizes research that has examined their validity, finding them to be valid, if implemented appropriately. The chapter includes a concise guide to the proper use of verbal reports in language research, from data collection to analysis, concluding with a discussion of the method’s limitations and possible triangulation with other data sources.
• In Chap. 17 (Corpus Research Methods for Language Teaching and Learning), Magali Paquot offers a critical overview of how corpus-based methods and the subsequent descriptions of language use have been used to inform the publication of reference works (most particularly grammars and learner dictionaries), language teaching syllabus design, and the content of teaching materials (e.g., textbooks). The chapter focuses on how corpus-derived frequency information and (genre-based) lexicogrammatical descriptions can be used to help with decisions about what to teach and how to teach it.



• Finally, in Chap. 18 (Digital Discourses Research and Methods), Christoph A. Hafner provides an overview of the unique affordances and constraints of digital tools and how such tools affect, among other things, the kinds of meanings we can make and the kinds of relationships we can have. Hafner examines important kinds of digital texts and interactions, having regard to (1) how the study of such digital discourses relates to key questions in applied linguistics and (2) how particular digital discourses can be studied, considering issues related to the collection and analysis of data.

11 Interviews and Focus Groups

Matthew T. Prior

Introduction

Given the person-centered, experiential focus of much applied linguistics research, there is perhaps no investigative activity more widespread than interviewing. Its appeal can be traced in large part to its immediacy and grounding in common sense. After all, when you want to find out what people know, believe, or feel about language-related matters; when you want to document their personal histories and explore their present circumstances; when you want them to comment on hypothetical scenarios, explain their motivations, and describe their imagined futures—you ask them. Interviewing, so the logic goes, offers a means to obtain information unavailable to direct observation and to understand individuals in their own words. The interviewer’s function, therefore, is to elicit and report these uncovered facts and first-person perspectives. Or is it? In this chapter, I reflect upon this “commonsense logic” by examining some of the philosophical and methodological principles and practices of interview¹ and focus group research in applied linguistics. I begin with a brief background to the history of research interviewing, outlining some of the key issues and discussing various approaches along with their associated strengths and challenges. I provide a sample study and an annotated list of suggested resources for further reading.

M. T. Prior (*) Department of English, Arizona State University, Tempe, AZ, USA
© The Author(s) 2018 A. Phakiti et al. (eds.), The Palgrave Handbook of Applied Linguistics Research Methodology,




Definitional Matters

Despite its ubiquitous presence in the research domain (and beyond), interviewing is neither self-explanatory nor a neutral transaction (i.e., it is not atheoretical, apolitical, or a-ethical): it is a dynamic interaction requiring a responsible and active interviewer and a responsive and willing interviewee. Sometimes the interview is described as methodology, approach, method, technique, or tool—terminology that may evoke very distinct (and discipline-specific) conceptual and procedural lenses. Interviews can be designed as stand-alone studies as well as preliminary, follow-up, or complementary components within qualitative, quantitative, or mixed methods projects. In its diversity, interviewing embodies a “complex, multidimensional collection of assumptions and practices” (Gubrium, Holstein, Marvasti, & McKinney, 2012, p. x). As a professional and creative activity, it is a “craft,” a “skill,” and an “art” (Brinkmann & Kvale, 2015; Rubin & Rubin, 2012; Weiss, 1995) that can only be honed through practical, “real-world” experience. The institutional frame of the interview, with its alternating question-answer format, distinguishes it from ordinary or spontaneous conversation. Thus, a working definition of the research interview would be: a prearranged interaction for the specific purposes of gathering and/or generating information, usually recorded, involving participants whose roles or “situated identities” (Zimmerman, 1998; i.e., “interviewer” and “interviewee”) are predetermined. We must also acknowledge the common asymmetric participation format (e.g., the interviewer usually sets the topic and questions; the interviewee is often restricted in the scope of possible or normative responses).

Historical Development

In recent decades, much has been made of our “interview society” (Atkinson & Silverman, 1997) and its preoccupation with personal inquiry and “generating empirical data about the social world by asking people to talk about their lives” (Holstein & Gubrium, 2004, p. 140). Due to the embeddedness of interviewing and interview-like practices in daily life, most people in the industrialized world are socialized into numerous interview genres (e.g., journalistic interviews, TV talk shows, employment interviews, police questioning, doctor-patient interactions, healthcare surveys, political polls) by the time they reach adulthood (Edley & Litosseliti, 2010).



The quintessential component of the social scientist’s toolkit, the research interview, has been described as an “‘indigenous’ method distinctive to contemporary Western culture” (Atkinson, 2015, p. 95). Although interviewing practices and their concomitant expectations may vary across cultural contexts (Briggs, 1986), the influence of broadcast, print, and digital media in the latter part of the previous century, combined with the modern hyper-connected era, has taken the interview society global (Gubrium & Holstein, 2002). In many countries, foreign language proficiency interviews are a part of educational systems as well as the military, government, and even business (Fulcher & Davidson, 2013). Personal interviews also function as one of the primary gatekeeping procedures for asylum seekers and other migrant groups worldwide (e.g., Maryns, 2012). I will turn now to the “interviewing society” of academic research by considering two influential traditions: survey interviewing and (auto)biographical interviewing.

Two Research Interviewing Traditions

Survey Interviews

The face-to-face or in-person interview is perhaps the oldest and most common form of survey data collection (Kropf, 2004). In the survey interview, interviewees are selected as a representative cross-sectional sample of a target population (e.g., recent migrants, pre-service teachers, motivated students, anxious learners). As with other survey methods, effort is made to control for error by standardizing procedures so respondents receive the same question items or prompts in the same order and manner. This format is advantageous in that it allows the collection of data from a large pool of respondents. Though commonly viewed as a quantitative method, it can incorporate qualitative components, such as extended or open-ended responses (Lavrakas, 2008). See also Currivan (2008) on a modified style of survey interviewing labeled “flexible” or “conversational” interviewing. As many researchers have pointed out, standardized interviews may “pressure respondents into categories that fail to capture their experiences” (Schaeffer, 1991, p. 369). This is a problem of “ecological validity” (i.e., the extent to which the opinions, attitudes, and other responses reflect interviewees in their everyday lives). One way to maximize ecological validity and improve instrument design is to make a regular practice of piloting materials to assess the quality of the questions, protocols, and potential responses (Richards, 2003; Roulston, 2010; Seidman, 2013).



(Auto)Biographical Interviews

A second tradition in interview research that has had particularly strong uptake in applied linguistics is the elicitation of personal histories and first-person narratives (e.g., Barkhuizen, Benson, & Chik, 2014; De Fina & Georgakopoulou, 2012). These are often labeled autobiographical, oral history, or life-story interviews. Storytelling is a basic mode of human interaction; however, interviews to elicit personal stories are relatively recent phenomena. This line of research was led mainly by feminist scholars and historians (Goodwin, 2012) following the “narrative turn” (Chase, 2011) in the mid-1980s.² These interviews tend to be in-depth and may be conducted once or over multiple sessions. In coding and analysis, researchers focus on participants’ life trajectories, storied experiences, and key events, paying special attention to the connections and meanings speakers attach to them (Atkinson, 1998; Wengraf, 2001).

Narrative interview is a term sometimes used synonymously with autobiographical interviews (cf. Wengraf, 2001, on “biographic-narrative” interviewing), but it generally refers to “small-scale” narrative interviews rather than extended life-story interviews (see Flick, 2014, on “episodic interviews”; also Pavlenko, 2007; Prior, 2016; cf. Riemann & Schütze, 1991, for a distinctive “narrative interview” approach). Miller (2014), for example, illustrates narrative interviewing in action in a case study of adult immigrants in the US. Through a close analysis of speakers’ narrative accounts of learning English and navigating their multilingual repertoires, she attends to their discursive construction of agency and sense-making practices within the interview context.

Although the term sociolinguistic interview is occasionally used by researchers in relation to autobiographical interviewing, it is most commonly associated with variationist and other sociolinguistic research seeking to elicit spoken data for analysis. Labov (1972) is well known for using sociolinguistic interviews to collect personal narratives from AAVE speakers for samples of spontaneous, casual (i.e., unmonitored) speech. For consistency with the literature, the term “sociolinguistic interview” is best used in studies where the purpose is linguistic analysis.

The ethnographic interview label is perhaps the most contentious. In cultural anthropology, ethnographic interviews have long been an integral component of field studies (Skinner, 2012; Spradley, 1979), where the aim is to understand and represent an emic or “insider’s” perspective through observation and carefully developed relationships with members of the host community. Though many researchers in applied linguistics and neighboring disciplines often use “ethnographic” as a shorthand description for “qualitative,” others (e.g., Atkinson, 2015; Watson-Gegeo, 1988) contend that a qualitative interview is not ethnographic if it does not involve essential components of ethnographic fieldwork (e.g., extended duration, observation, direct participation). However, see Heyl (2001) for an expanded perspective on ethnographic interviewing informed by feminist, postmodern, and related critical and reflexive perspectives.

Two Core Issues

In an influential paper on the treatment of qualitative interviews in applied linguistics, Talmy (2010) distinguishes between two prevalent perspectives: “interview as research instrument” and “interview as social practice.” Because these get at the conceptual core of interviewing, they deserve careful consideration here and by all interview researchers.

Interview as Research Instrument

The interview as research instrument, or “data collection” perspective, corresponds to what a number of researchers have variously described as “knowledge collection” or “data mining” (Brinkmann & Kvale, 2015, p. 57), “excavation” (Holstein & Gubrium, 2004, p. 141), and “harvesting psychologically and linguistically interesting responses” (Potter, 2004, p. 206). This aligns with the transmission or “conduit metaphor” (Reddy, 1979) model of communication. An implicit assumption here is that interviewees—provided they are willing and linguistically and developmentally competent (though perhaps with some careful scaffolding and other support by the interviewer)—can provide access to their “internal” or psychological worlds and lived experiences when prompted to do so in the interview situation. Because the interview as research instrument perspective supports the researcher’s goal of collecting data, it has tended to be the “default” in interview studies throughout applied linguistics. The standardized survey interview cited earlier offers a classic example. This perspective can also be found in investigations of specific constructs (e.g., attitudes, beliefs, autonomy, motivation, identity) as well as narrative and ethnographic research, where researchers are interested in assembling the in-depth and diverse responses of groups and individuals. A shared practice among the various interview studies conducted in this vein is that researchers frequently present their analyses and findings in the form of (often decontextualized) quotes, narratives, shared themes, and other material that “emerged” from the data (see Talmy, 2010, 2011, for an illustrative critique).



Interview as Social Practice

A contrasting perspective emphasizes the ways in which interviews are “active” (Holstein & Gubrium, 2004) and collaborative accomplishments. This constructionist approach corresponds with what Talmy (2010) refers to as “interview as a social practice” or “data generation.” The key difference with the preceding perspective is that here there is no objective or emergent “truth” or experience waiting to be elicited and collected. To quote Holstein and Gubrium (2004), “respondents are not so much repositories of knowledge—treasuries of information awaiting excavation—as they are constructors of knowledge in association with interviewers” (p. 141). This stance should not be understood as an extreme form of “epistemic relativism” that denies interviewees have actual experiences, perceptions, and so on (cf. Potter & Hepburn, 2008). The fact that interviewees can and do indeed talk about such things when asked or prompted is precisely the point—but what they produce are descriptions (i.e., representations), not objective reports. As discourse and interaction researchers (e.g., Baker, 2002; Potter & Hepburn, 2005, 2012; Prior, 2014, 2016; Roulston, 2010; Talmy & Richards, 2011) have shown, a close inspection of interviews reveals how interactants co-organize their turn-taking patterns, identity categories, power relations, and even the goals and structure of the interview activity itself. When we recognize that interviews are a social practice, we must also acknowledge that the interview products cannot be separated from the processes by which they are generated. This corresponds with Holstein and Gubrium’s (2004) recommendation to attend to both the “whats” and the “hows” of interviews. Richards (2011) offers an instructional model that demonstrates how sensitivity to both aspects can greatly inform interviewer awareness and training in applied linguistics research. 
Regardless of their specific approach to interviewing, researchers have a responsibility to make explicit and reflect on their stances and practices so that they are “purposefully connecting the information [whether “collected” or “generated”] to a conceptual or theoretical base” (Arksey & Knight, 1999, p. 44; see also Talmy, 2010).

Overview of Interview Formats

There is no “typical” or “ideal” interview format. As with all research, the interviewer must select that which best aligns with the research questions and aims. It is equally important that interviewers consider their own personal communicative style (Roulston, 2010). For example, an introverted interviewer may prefer a more structured interview. A naturally voluble interviewer may find it difficult not to talk too much within a more conversational setting. Frequently it is a process of self-discovery. As Weiss (1995, p. viii) notes, interviewing often involves “trial and error—rather a lot of error—and trial again.” A single interview may use one format or a hybrid of formats, and even the most “generic” interview or well-planned interview schedule (i.e., a “guide” or “protocol” with a list of instructions and questions for the interviewee) may need to be modified on the spot to accommodate unexpected circumstances. It is useful, then, to conceptualize interviews as a continuum. At one end are those with a more standardized or rigid schedule. These tend to employ more close-ended questions (e.g., “Where do you use English?”; “On a scale of 0 to 5, how would you rate your motivation to study Spanish?”) that elicit shorter, more restricted interviewee responses (e.g., “I use it at work”; “Three”). At the other end of the continuum are those that take a less structured, more conversational and exploratory approach. These have questions that are more open-ended (e.g., “Can you tell me about the first time you taught English?”) to elicit extended, in-depth responses. Although the research literature distinguishes various categories and subcategories of interviews, in the following section I will focus on four: structured, open-ended, semi-structured, and focus groups.

Structured Interviews

The structured or standardized interview is most clearly exemplified by the survey interview described earlier. The aim of the fixed-delivery and fixed-response format is to increase reliability, reduce bias, and standardize responses for optimal coding and analysis. This rigidity leaves little room for interviewer or interviewee spontaneity (Dörnyei, 2007). Because it inhibits follow-up probes and expansions (unless they are built into the interview schedule), it risks overlooking topics and concerns that could potentially be relevant and informative to the study. It also discourages in-depth responses by placing interviewees into a more passive respondent rather than active participant role (Foley, 2012). However, the structured interview is particularly advantageous when the researcher is faced with time constraints and when seeking a large response sample. It is also a helpful format for novice interviewers (and interviewees) who require more guidance and structure during the interview process. Because the questions and responses are standardized, the data lend themselves to both quantitative and qualitative analysis.



Nevertheless, no matter how rigorously standardized, these interviews are rarely trouble-free; interviewers must still use careful judgment when stimuli are not effective (e.g., misunderstandings occur, repetition or clarification may be necessary, participants may resist or prove unresponsive). When understanding issues arise in interviews with L2 (second or additional language) speakers, Sapsford (2007, p. 127) points out that the interviewer has three options: (1) reject the respondent (due to insufficient language competence), (2) translate the problematic material (thereby altering the interview questions), and (3) collect the data (now likely to be useless because the questions were not fully understood). Another option may be to incorporate an interpreter as a mediator, but this raises other considerations, including privacy and consent, personal comfort, transformation of the message, and cultural appropriateness.

Open Interviews

Open or open-ended interviews fall at the more minimally structured or conversational side of the interview continuum. In this format, the researcher takes up the role of active listener to elicit the interviewee’s in-depth perspective. The researcher often has in mind a general topic or themes and has prepared a few broad or “grand tour” (Spradley, 1979) questions, “but many of the specific questions are formulated as the interview proceeds, in response to what the interviewee says” (Rubin & Rubin, 2012, p. 31). Whereas in the structured interview, interviewees are confined to a narrow set of responses to questions and topics predetermined by the interviewer, here the interviewee helps shape the direction and content of the interview. Many researchers refer to this format as “unstructured,” but this is really a misnomer (cf. Richards, 2009), in that all interviews (and all interactions, for that matter) have an underlying organizational structure. Moreover, the label “unstructured interview” suggests that less work is required of interviewers than if they were conducting a more “structured” or standardized interview. Regardless of how conversation-like an interview may appear on the surface—and no matter how much we, as interviewers, may wish to convince interviewees (and ourselves) that we are just “talking” together—because the interview is both research process and product, it is never just casual conversation. As any experienced interviewer can attest, all interviews require planning and ongoing management. To make an interview “come off” as conversational requires a great deal of communicative, mental, and emotional effort.



Although the open interview is a popular method, it is highly labor intensive and therefore not an ideal choice when the researcher is under time constraints (Arksey & Knight, 1999). These interviews also produce a large amount of data to transcribe, code, and analyze. With so much data to sort, determining where to begin analysis and what to focus on can pose a significant challenge. Open interviews are often used in case studies and for carrying out initial exploratory work before proceeding with a more focused study (Dörnyei, 2007). Some researchers find that because the open interview allows participants to produce responses in their own words, it is especially useful for investigating sensitive topics. The fact that open interviews are largely interviewee-led may also go some way toward addressing the asymmetric power relations that can be found between interviewer and interviewee (Arksey & Knight, 1999).

Semi-structured Interviews

Semi-structured interviews are considered a subtype of the structured interview genre. Dörnyei (2007) points out that most interviews in applied linguistics are of this type, which offers a compromise between the two extremes of structured and open. Nevertheless, the precise meaning of “semi-structured” can vary widely from study to study. Many interviewers choose to incorporate structured questions at the beginning of the interview and then follow up with open-ended questions later to expand upon earlier points. Richards (2009) cautions against this approach, warning that it can establish an undesirable pattern of short interviewee responses. Instead, he advises beginning interviews with an open question. A disadvantage of semi-structured interviews is the difficulty of comparing participants and generalizing findings, given the lack of standardization of procedures (e.g., study participants are all asked different follow-up questions). As with open interviews, novice researchers may mistakenly assume that they can get by with less preparation. Wengraf (2001) warns, “They are semi-structured, but they must be fully planned and prepared. Improvisation requires more training and more mental preparation before each interview than simply delivering lines prepared and rote-learned in advance” (p. 5; emphasis in original). Nevertheless, because they are adaptable to almost any research setting, semi-structured interviews remain a widely used interview format.

Regardless of the specific format, all interviews are labor-intensive activities. A genuine interest in people is essential—as is adequate preparation. Interviewing requires flexibility, patience, active listening, a good memory, and strong interpersonal communication skills to adapt quickly and manage the unpredictability of the interview situation (Brinkmann & Kvale, 2015; Seidman, 2013). Even in open interviews, the interviewer must prepare by determining the interview goals and objectives, recruiting and confirming interviewee participation, setting the meeting place and time, readying and testing the recording equipment, preparing the interview schedule, obtaining informed consent, recording, taking notes, and so forth. The reality is that not all researchers are equally effective in conducting interviews, and not all research participants prove equally willing, able, or consistent interviewees. Miscommunication and resistance are endemic to the activity (due to topic, time of day, length of interview, differential expectations, personality conflict, language and cultural differences, the presence of a recording device, and other factors).

Conjoint and Group Interviews

When contemplating doing research interviews in applied linguistics, those involving one interviewer and one interviewee are typically what come to mind. But conjoint interviews (with two interviewees) and group interviews (with three or more interviewees) are also options. The former are common in bilingualism research, for example, when interviewing a couple or a parent and child to investigate language attitudes and family communication practices. Group interviews can be found in studies when the researcher seeks to elicit a range of responses to produce a more holistic picture of a group’s or community’s (shared and divergent) perspectives on their sociolinguistic experiences, resources, practices, and so on. Interview groups can be formed from pre-existing units (e.g., classroom, family, workplace, team) or recruited based on specific characteristics directly relevant to the study (e.g., gender, social class, language, country of origin, occupation).

Focus Group Interviews

Focus group research did not have a visible presence in the social sciences until the 1980s and 1990s (Brinkmann & Kvale, 2015; Morgan, 1997), when it was taken up in feminist research, communication and media studies, sociology, and social psychology (Wilkinson, 2004). In applied linguistics, studies employing focus groups appeared sporadically (e.g., Dushku, 2000; Hyland, 2002; Myers, 1998), with Ho (2006) being one of
the strongest advocates of this method. The focus group received almost no mention in applied linguistics research methods texts until Dörnyei (2007), who highlighted it as a specialized, group-format interviewing technique.

It is important to emphasize that focus groups are not simply group interviews. The focus group is a type of interview setting, usually semi-structured, but the researcher’s role here is that of moderator or facilitator rather than interviewer in the traditional sense. To put focus group participants at ease, the setting is kept as informal and non-directive as possible. As with other interview formats, the moderator prepares an interview schedule, usually consisting of several prompts (e.g., around a topic, issue, open-ended question) designed to spark discussion. The moderator introduces a prompt and invites the focus group participants to enter into discussion. For example, in Parkinson and Crouch’s (2011) focus group study on language and cultural identity of mother-tongue Zulu students in South Africa, one of the discussion prompts used was: “OK Let me ask you this thing … Everyone knows that CSA [program name] only takes students from disadvantaged school … Carrying that label with you from a disadvantaged school … Does it impact on how people see you?” (p. 91).

In focus group research, the interaction among the participants is both the method and the data (Kitzinger, 1995). The goal is not for the group to reach a consensus or to respond to the moderator one by one, but to generate discussion (even disagreement) among members and to bring out their various viewpoints and experiences (Brinkmann & Kvale, 2015). The moderator is non-directive but active in keeping the discussion flowing, checking and clarifying, making sure that the group is “focused” on the topic, and ensuring all members participate. Similar to recruitment for other interview studies, focus group participants are generally selected based on predetermined criteria.
The size of a focus group varies, usually ranging from 6 to 12 (Dörnyei, 2007; Fern, 2001; Krueger & Casey, 2015). Stewart and Shamdasani (2014) caution that fewer than 8–12 members can lead to an overly narrow discussion; however, smaller groups of 4–8 are increasingly recognized as more effective (Kitzinger, 1995; Liamputtong, 2011), mainly because they are easier to recruit and host and overall more efficient because all participants are more likely to have an opportunity to speak. As with other kinds of interviewers, focus group moderators require training and practice—perhaps even more so—because they must facilitate interaction among multiple participants at once. In addition to the basic preparation for the setting, essential are general interviewing skills, flexibility, self-control
(particularly to avoid monopolizing or over-directing the discussion), cross-cultural and pragmatic competence, and empathy (Morgan, 1997; Stewart & Shamdasani, 2014; Wilkinson, 2004). Time management is crucial. Participants have been recruited for a specific period (usually one to three hours), and the researcher must keep to that schedule (Stewart & Shamdasani, 2014). The ability to keep a discussion going requires a sensitivity to group dynamics and the ability to enable all members to participate. Sometimes in focus groups there is a tendency for some individuals (e.g., confident, louder, aggressive) to dominate the discussion; thus, their opinions may influence or inhibit those of others. It is also possible that personalities will clash, participants will disagree or fight, the discussion will go off topic, and participants will fragment into sub-groups (Litosseliti, 2007).

There are many benefits to using focus groups. Perhaps one of the main advantages is that they allow the moderator to probe and identify questions, issues, or concerns in an in-depth manner over a short period from a large data sample. The focus group interaction can prompt participants to describe their attitudes, priorities, frames of understanding, norms, values, and other things that may otherwise go unarticulated in other research methods (Kitzinger, 1995). Focus group interviews can be very useful for exploratory studies (Brinkmann & Kvale, 2015), especially as part of a larger research project. Some researchers find it useful to conduct multiple focus groups in a single study (Stewart & Shamdasani, 2014). Focus group interviews can also be important equalizers. Unlike some other modes of research, the focus group does not discriminate against people who may have limited reading or writing skills. The group dynamic can also be facilitative for those who may be reluctant or unresponsive when interviewed in one-on-one interview settings (Kitzinger, 1995; Litosseliti, 2007).
Because the focus group involves multiple speakers, oftentimes speaking in overlap, it is essential that the discussion be audio or video recorded. This allows the moderator to focus on facilitating the discussion and not worry about writing everything down. The usual practice is to place an audio recorder in the middle of the table. Participants generally get used to it quickly and ignore it. Video recording allows a much more detailed analysis of speakers and their verbal and nonverbal interaction. Multiple recording devices are recommended (but they should be kept as unobtrusive as possible), both for backup and to ensure all participants’ contributions are captured. If the recording is audio only, then the researcher/moderator should transcribe it as soon as possible to be able to distinguish speakers while the discussion is still fresh in their memory.

Like interviews, focus groups have several limitations and concerns. Because focus groups rely on non-probability sampling (i.e., participants are recruited
from convenience samples by the interviewer or self-selected), the findings will not be generalizable to a larger population. There is also the danger that moderators may bias or manipulate participants’ responses, so they must take care to monitor their own verbal and nonverbal feedback and other contributions. Another consideration is that in academic focus group research, as a matter of reciprocity, researchers generally offer an incentive or some form of compensation for participating (e.g., cash, gift certificate, even a language lesson). However, it can sometimes be awkward for the moderator to dispense the incentive, particularly when trying to establish friendly rapport or if a participant decides to leave the focus group early. It may therefore be helpful to assign another member of the research team to dispense the incentives at the end of the session. In some small-scale studies, such as those frequently conducted by graduate students in applied linguistics, people are often willing to participate for free; but this may raise some ethical questions (Litosseliti, 2007), possibly even of interviewee exploitation (Spradley, 1979).

Sample Study 11.1

Ro, E., & Jung, H. (2016). Teacher belief in a focus group: A respecification study. English Teaching, 1(1), 119–145.

Research Problems/Gaps

Research on teacher cognition and teacher beliefs has focused largely on investigating the internal lives of teachers, but little attention has been given to the ways teachers talk about their beliefs with one another. To address this gap, this study takes a discursive approach that examines the concept of “teacher belief,” not as a stable object sitting within people’s heads, but as something that is co-constructed by participants within interaction.

Research Method

• Type of research: Focus group.
• Setting and participants: US university; three male participants were chosen based on their experience as EAP reading instructors and familiarity with the extensive reading approach.
• Instruments/techniques: One-time video-recorded, semi-structured focus group interview (100 minutes). A two-minute segment was selected for detailed analysis, and a rationale was provided.
• Data analysis: Discourse and interaction analysis (conversation analysis and discursive psychology) of selected sequences (e.g., disagreements, teasing) for insight into participants’ displayed “understandings” and other “psychological” business. Multimodal data were transcribed following modified CA conventions, and analysis included verbal and nonverbal actions. Interview prompts were included in an appendix.
Key Findings

Participants’ professional beliefs were shown to evolve over the focus group session. Following friendly disagreement and teasing sequences, participants’ articulated beliefs became more specific. Participants not only displayed their general beliefs about teaching, they displayed their professional competence and experience as English language teachers and academic reading experts. Findings show how examining speakers’ co-construction of teacher beliefs in focus group interaction helps us understand how “collective thinking” shapes teacher learning and teacher practices.

Comments

This study demonstrates how interviews can be used to collect and generate data for analysis. The approach aligns closely with the interview as social practice perspective, where the data and meanings are collaboratively constructed. In their analysis and in the detailed transcripts, the researchers included the verbal and nonverbal actions of the participants (including the moderator). The published study was further enriched by the inclusion of video clips of participants, detailed transcripts, and a list of focus group prompts. However, more information on recruitment and a rationale for the small focus group size would have strengthened this study.

Challenges and Controversial Issues

The “complex, multidimensional” (Gubrium et al., 2012, p. x) nature of interviewing creates many inherent challenges for interviewers and interviewees. I will now consider a few of these matters in more detail.

Interviews as “Naturalistic” vs. “Contrived”

The matter of interviews as “naturalistic” or “contrived” (i.e., artificial) social occasions remains a point of contention among researchers. Codó (2008) describes the interview as “an authentic communicative situation in which naturally occurring talk is exchanged” (p. 158). Silverman (2014) characterizes interview and focus group studies as “researcher provoked data” (p. 455). Conversation analysts and other interaction researchers have often rejected interviews (but see, e.g., van den Berg, Wetherell, & Houtkoop-Steenstra, 2003), preferring instead to work with naturally occurring data³ that is free of researcher influence (Goodman & Speer, 2016; Potter & Hepburn, 2005). However, Speer (2002) argues, “All data can be natural or contrived depending on what one wants to do with them” (p. 520, emphasis in original). Likewise, as many qualitative researchers (e.g., Holstein & Gubrium, 2004; Silverman, 2014) have pointed out, no data are ever free of the researcher’s
influence, and “all methods have consequences” (Mishler, 1986, p. 120). Thus, although interviews are not “natural” if analyzed as spontaneous conversation, they can be considered “natural” as interviews (Goodman & Speer, 2016; Speer, 2002). The central issue for “naturalness” may not be the data, or even necessarily the method, but how the data are analyzed, contextualized, and represented.

Rapport

Rapport is often cited in the literature as essential for prompting interviewees to talk freely and honestly (e.g., Atkinson, 1998; Brinkmann & Kvale, 2015). Interviewers are therefore advised to be open, sympathetic, and interested listeners (e.g., Rubin & Rubin, 2012; Seidman, 2013). They are also cautioned to guard against potential threats to rapport, such as their outsider status (Horowitz, 1986) or their negative verbal and nonverbal responses. Methods texts frequently gloss rapport as “attentive listening and engagement” (Potter & Hepburn, 2012, p. 566; see also Prior, 2017). Though rapport is often considered a kind of emotional closeness, Spradley (1979) argues it “can exist in the absence of fondness and affection” (p. 78). Some researchers have even induced rapport through “faking” friendship (e.g., Duncombe & Jessop, 2012).

Because of the often intimate nature of in-depth and repeated interviews, there may also be a concern of “over-rapport” or “over-involvement” between interviewer and interviewee. This has been described as a feature of the general “emotionalist” or “romantic impulse” (Atkinson & Silverman, 1997; Gubrium & Holstein, 2002; Talmy, 2010), where the interviewer attaches so-called authenticity to the speaker’s “voice” and personal experience. Because interviewers may, intentionally or not, use rapport to further their research aims, it has serious potential for participant coercion, exploitation, and abuse, especially when deception is part of the research design (e.g., when the researcher’s goals and actual interpersonal investment are masked behind the guise of cultivating “friendship” or “just doing conversation”). Because rapport is as much an ethical matter as it is a practical one, great care must be taken to ensure that it is not abused for the sake of getting data.

Language and Interculturality

In applied linguistics research, interviews often take place between people with different L1s (first languages) and cultural backgrounds. Interactants may therefore be required to use a lingua franca, or shared language, to
communicate (e.g., L1-L2, L2-L2). Consequently, unexpected tensions may arise from, on the one hand, essentialist assumptions concerning “cultural differences” between interviewer and interviewee, and on the other, a hybrid or interculture that emerges in the interview situation. These matters of language and interculturality (part of the mode of communication and interactants’ collaborative identity work, as well as objects of researcher interest) are areas of study within the intercultural pragmatics paradigm (Kecskes, 2012; see also Kasper & Omori, 2010), and they raise many relevant questions surrounding the construction of meaning in interviews. Some researchers (e.g., Chen, 2011; Winchatz, 2006) consider the interviewer’s L2 speaker status advantageous because it may encourage L1 interviewees to do more work to define concepts. Prior (2014) found being a cultural “outsider” helpful in eliciting extended narrative accounts from adult immigrant interviewees, although some unexpectedly took over the interviews, turning them into language lessons to instruct the interviewer. Menard-Warwick (2009) notes that despite her privileged social status and Anglo background, her ability and willingness to speak Spanish (her L2) allowed her to establish rapport with adult Latin American immigrant women interviewees in the US. Pavlenko (2007), in a critical review of L2 autobiographic narrative research, questions the status quo of L2 interviews conducted in the primary language of the interviewer and makes the case that participants should be given an option to use their L1 (which may necessitate using a bilingual interviewer or interpreter; see also Prior, 2016). Code-switching is also an important expressive and symbolic resource in multilingual interviews. Canagarajah’s (2008) study on language shift in the Sri Lankan Tamil diaspora, for example, included fascinating instances of Tamil-English code-switching by the researcher and interview participants.
Although language alternation in the interviews went largely unanalyzed, it nonetheless evoked the richness of interculturality as an analytic topic. Language and interculturality are important issues, especially because they are constructive of and constructed by the interview interaction itself (see, e.g., Briggs, 1986; Pavlenko, 2007; Prior, 2014; Roulston, 2010). Moreover, ensuring a linguistic, and even cultural, match between interviewer and interviewee (or between moderator and focus group participants) may not always be enough to ensure their comfort and full participation (Miller, 2011). Social status or class, ethnicity, gender, age, skin color, and other personal characteristics are also important to consider, since they may index (rightly or wrongly) various norms, beliefs, competences, and other assumptions that can influence how interactants relate to one another (Chiu & Knight, 1999; Rubin & Rubin, 2012), even when they are from the same language background (Kenney & Akita, 2008).
Limitations and Future Directions

I have aimed in this chapter to highlight the dynamic nature and rich potential of interviewing in applied linguistics research. Many of the challenges researchers encounter can be addressed by developing a rigorous foundation in research theory and methodology, reading widely (within and outside the field), cultivating research apprenticeships and partnerships, planning carefully, piloting materials, developing reflective and ethical research practices, and most importantly, gaining practical experience.

But interviewing is not always the right or the only choice for a study. It will not allow the researcher to observe what people do in their daily lives. It cannot help predict the behavior of individuals or groups, nor will it reveal what people really think, believe, or feel (Brinkmann & Kvale, 2015). However, interviews can show how people articulate responses and meanings as they describe their “inner” (i.e., psychological) and “outer” (i.e., social) worlds. As interactions, interviews are rich analytical sources for examining how people do storytelling and use language and other communicative resources in generic as well as creative ways (Kasper & Prior, 2015; Roulston, 2010), but in a bounded context (i.e., with the interviewer or focus group members). Interviewing can also be an important resource for making the researcher aware of potentially researchable issues and concerns that may go unnoticed by other methods.

A growing area where interview researchers are enriching theory and practice is what Rapley (2012) has labeled “social studies of interview studies” (p. 552). Here interviewers return to their data (and the data of others) to examine, for example, research challenges, dilemmas, and “failed” or “uninformative” episodes (e.g., Nairn, Munro, & Smith, 2005; Prior, 2014, 2016; Roulston, 2010, 2011, 2014).
Interviewers can use these studies to “reflect on and help them make sense of their own interviewing practice” (Rapley, 2012, p. 548). Reflection becomes a pedagogical tool (Roulston, 2012) or “therapeutic intervention” (Rapley, 2012, p. 548) for “refining interviewer technique … developing awareness … [and] improving analytical sensitivity” (Richards, 2011, pp. 108–109). These “therapeutic” benefits for the researcher extend also to studies where interviews are used when they should not be (e.g., as stand-ins for direct observation) (S. Talmy, personal communication, January 12, 2017). Another positive outcome may be to sensitize interviewers to the ignored or taken-for-granted intersections of identity in the research process:

… interviewers must understand the social locations that they occupy as researchers—such as race, ethnicity, status, age, nationality, education, gender, language proficiency, and so forth—and how these may both limit and benefit the generation of interview data with research participants. (Roulston, 2012, p. 71)
This echoes Briggs’ (1986) classic work on his “communicative blunders” as an Anglo-American learning to interview Spanish speakers in New Mexico. Roulston (2010) also considers possibilities for a “decolonizing conception of interviewing” (p. 68) that might help counter ways in which indigenous voices often get silenced by non-indigenous researchers (see also Griffin, 2015; Smith, 2012). Close reflection also forces us to recognize the “emotional labor in researcher–respondent interactions and the various ways in which it is incorporated into the research process and writing” (Lillrank, 2012, p. 281; Prior, 2016). By critically questioning and reflecting upon the various “commonsense” assumptions that shape interview and focus group research in applied linguistics, we will be better able to advance theory and method, ultimately leading to renewed insights and more rigorously grounded interviewing practices.

Resources for Further Reading

Briggs, C. L. (1986). Learning how to ask: A sociolinguistic appraisal of the role of the interview in social science research. New York: Cambridge University Press.
Mishler, E. G. (1986). Research interviewing: Context and narrative. Cambridge, MA: Harvard University Press.

These classic works set the standard for conceptualizing the interview as an interactional speech event.

Brinkmann, S., & Kvale, S. (2015). InterViews: Learning the craft of qualitative research interviewing (3rd ed.). London: SAGE.

This text offers a comprehensive and accessible introduction to interview theory and practice, particularly for novice researchers.

Gubrium, J. F., Holstein, J. A., Marvasti, A. B., & McKinney, K. D. (Eds.). (2012). The SAGE handbook of interview research: The complexity of the craft (2nd ed.). London: SAGE.

This is the most comprehensive collection representing the diversity of contemporary interview research written by scholars from across the social sciences.

Potter, J., & Hepburn, A. (2005). Qualitative interviews in psychology: Problems and possibilities. Qualitative Research in Psychology, 2, 281–307.
Potter, J., & Hepburn, A. (2012). Eight challenges for interview researchers. In J. F. Gubrium, J. A. Holstein, A. B. Marvasti, & K. D. McKinney (Eds.), The SAGE handbook of interview research: The complexity of the craft (2nd ed., pp. 555–570). London: SAGE.

Potter and Hepburn’s detailed and provocative critiques of qualitative interview research are regularly cited in the literature and are essential reading for all qualitative researchers.

Richards, K. (2003). Qualitative inquiry in TESOL [CH2: Interviewing]. New York: Palgrave Macmillan.

Written for TESOL professionals and graduate students, this chapter presents a step-by-step guide to conducting qualitative interviews.

Roulston, K. (2010). Reflective interviewing: A guide to theory and practice. London: SAGE.

Along with its thorough overview of the various theoretical and practical aspects of interviewing, this concise guide offers clear and compelling guidance for becoming a more self-aware interview researcher.

Notes

1. I am primarily concerned here with face-to-face interviews. For other modes, see, for example, Gubrium et al. (2012).
2. This is a necessarily abbreviated history. Oral history interviews and anthropological interviews can be traced back much further. See, for example, Malinowski (1922) and Ritchie (2011).
3. See Rapley (2007) for another perspective on “naturally occurring” data.

References

Arksey, H., & Knight, P. (1999). Interviewing for social scientists: An introductory resource with examples. London: SAGE.
Atkinson, P. (2015). For ethnography. London: SAGE.
Atkinson, P., & Silverman, D. (1997). Kundera’s Immortality: The interview society and the invention of the self. Qualitative Inquiry, 3(3), 304–325.
Atkinson, R. (1998). The life story interview. London: SAGE.
Baker, C. D. (2002). Ethnomethodological analyses of interviews. In J. Holstein & J. Gubrium (Eds.), Inside interviewing: New lenses, new concerns (pp. 395–412). London: SAGE.
Barkhuizen, G., Benson, P., & Chik, A. (2014). Narrative inquiry in language teaching and learning research. New York: Routledge.
van den Berg, H., Wetherell, M., & Houtkoop-Steenstra, H. (Eds.). (2003). Analyzing race talk: Multidisciplinary approaches to the interview. Cambridge: Cambridge University Press.
Briggs, C. L. (1986). Learning how to ask: A sociolinguistic appraisal of the role of the interview in social science research. New York: Cambridge University Press.
Brinkmann, S., & Kvale, S. (2015). InterViews: Learning the craft of qualitative research interviewing (3rd ed.). London: SAGE.
Canagarajah, A. S. (2008). Language shift and the family: Questions from the Sri Lankan Tamil diaspora. Journal of Sociolinguistics, 12(2), 143–176.
Chase, S. (2011). Narrative inquiry: Still a field in the making. In N. Denzin & Y. Lincoln (Eds.), The SAGE handbook of qualitative research (4th ed., pp. 421–434). London: SAGE.
Chen, S.-H. (2011). Power relations between the researcher and the researched: An analysis of native and nonnative ethnographic interviews. Field Methods, 23(2), 119–135.
Chiu, L.-F., & Knight, D. (1999). Developing focus group research: Politics, theory, and practice. In J. Kitzinger & R. Barbour (Eds.), Developing focus group research: Politics, theory and practice (pp. 99–112). London: SAGE.
Codó, E. (2008). Interviews and questionnaires. In L. Wei & M. Moyer (Eds.), Blackwell guide to research methods in bilingualism and multilingualism (pp. 158–176). Oxford: Blackwell.
Currivan, D. B. (2008). Conversational interviewing. In P. J. Lavrakas (Ed.), Encyclopedia of survey research methods (p. 152). London: SAGE.
De Fina, A., & Georgakopoulou, A. (2012). Analyzing narrative: Discourse and sociolinguistic perspectives. New York: Cambridge University Press.
Dörnyei, Z. (2007). Research methods in applied linguistics. Oxford: Oxford University Press.
Duncombe, J., & Jessop, J. (2012). Doing rapport and the ethics of “faking friendship”. In T. Miller, M. Birch, M. Mauthner, & J. Jessop (Eds.), Ethics in qualitative research (2nd ed., pp. 107–122). London: SAGE.
Dushku, S. (2000). Conducting individual and focus group interviews in research in Albania. TESOL Quarterly, 34(4), 763–768.
Edley, N., & Litosseliti, L. (2010). Contemplating interviews and focus groups. In L. Litosseliti (Ed.), Research methods in linguistics (pp. 155–179). London: Continuum.
Fern, E. F. (2001). Advanced focus group research. London: SAGE.
Flick, U. (2014). An introduction to qualitative research. London: SAGE.
Foley, L. (2012). Constructing the respondent. In J. F. Gubrium, J. A. Holstein, A. B. Marvasti, & K. D. McKinney (Eds.), The SAGE handbook of interview research: The complexity of the craft (pp. 305–315). London: SAGE.
Fulcher, G., & Davidson, F. (Eds.). (2013). The Routledge handbook of language testing. London: Routledge.
Goodman, S., & Speer, S. (2016). Natural and contrived data. In C. Tileagă & E. Stokoe (Eds.), Discursive psychology: Classic and contemporary issues (pp. 57–69). London: Routledge.
Goodwin, J. (2012). Sage biographical research (Vol. 1: Biographical research: Starting points, debates and approaches). London: SAGE.
Griffin, G. (2015). Cross-cultural interviewing: Feminist experiences and reflections. London: Routledge.
Gubrium, J. F., & Holstein, J. A. (2002). From the individual to the interview society. In J. F. Gubrium & J. A. Holstein (Eds.), Handbook of interview research (pp. 3–32). London: SAGE.
Gubrium, J. F., Holstein, J. A., Marvasti, A. B., & McKinney, K. D. (Eds.). (2012). The SAGE handbook of interview research: The complexity of the craft (2nd ed.). London: SAGE.
Heyl, B. S. (2001). Ethnographic interviewing. In P. Atkinson, A. Coffey, S. Delamont, J. Lofland, & L. Lofland (Eds.), Handbook of ethnography (pp. 369–383). London: SAGE.
Ho, D. (2006). The focus group interview: Rising to the challenge in qualitative research methodology. Australian Review of Applied Linguistics, 29(1), 5.1–5.19.
Holstein, J. A., & Gubrium, J. F. (2004). The active interview. In D. Silverman (Ed.), Qualitative research: Theory, method, and practice (pp. 140–161). London: SAGE.
Horowitz, R. (1986). Remaining an outsider: Membership as a threat to research rapport. Urban Life, 14(4), 409–430.
Hyland, K. (2002). Directives: Argument and engagement in academic writing. Applied Linguistics, 23(2), 215–239.
Kasper, G., & Omori, M. (2010). Language and culture. In N. H. Hornberger & S. L. McKay (Eds.), Sociolinguistics and language education (pp. 455–491). Bristol: Multilingual Matters.
Kasper, G., & Prior, M. T. (2015). Analyzing storytelling in TESOL interview research. TESOL Quarterly, 49(2), 226–255.
Kecskes, I. (2012). Interculturality and intercultural pragmatics. In J. Jackson (Ed.), Routledge handbook of language and intercultural communication (pp. 67–84). London: Routledge.
Kenney, R., & Akita, K. (2008). When West writes East: In search of an ethic for cross-cultural interviewing. Journal of Mass Media Ethics, 23, 280–295.
Kitzinger, J. (1995). Introducing focus groups. British Medical Journal, 311, 299–311.
Kropf, M. E. (2004). Survey methods. In J. G. Greer (Ed.), Public opinion and polling around the world: A historical encyclopedia (Vol. 1, pp. 468–474). Oxford: ABC-CLIO.
Krueger, R. A., & Casey, M. A. (2015). Focus groups: A practical guide for applied research. London: SAGE.
Labov, W. (1972). Language in the inner city: Studies in the Black English Vernacular. Philadelphia: University of Pennsylvania Press.
Lavrakas, P. (Ed.). (2008). Encyclopedia of survey research methods. London: SAGE.
Liamputtong, P. (2011). Focus group methodology: Principle and practice. London: SAGE.
Lillrank, A. (2012). Managing the interviewer self. In J. F. Gubrium, J. A. Holstein, A. B. Marvasti, & K. D. McKinney (Eds.), The SAGE handbook of interview research: The complexity of the craft (2nd ed., pp. 281–294). London: SAGE.
Litosseliti, L. (2007). Using focus groups in research. London: Continuum.
Malinowski, B. (1922). Argonauts of the Western Pacific. London: Routledge.
Maryns, K. (2012). Multilingualism in legal settings. In M. Martin-Jones, A. Blackledge, & A. Creese (Eds.), Routledge handbook of multilingualism (pp. 297–313). London: Routledge.
Menard-Warwick, J. (2009). Gendered identities and immigrant language learning. Bristol: Multilingual Matters.
Miller, E. (2011). Indeterminacy and interview research: Co-constructing ambiguity and clarity in interviews with an adult immigrant learner of English. Applied Linguistics, 32(1), 43–59.
Miller, E. R. (2014). The language of adult immigrants: Agency in the making. Bristol: Multilingual Matters.
Mishler, E. G. (1986). Research interviewing: Context and narrative. Cambridge, MA: Harvard University Press.
Morgan, D. L. (1997). Focus groups as qualitative research. London: SAGE.
Myers, G. (1998). Displaying opinions: Topics and disagreement in focus groups. Language in Society, 27, 85–111.
Nairn, K., Munro, J., & Smith, A. B. (2005). A counter-narrative of a “failed” interview. Qualitative Research, 5(2), 221–244.
Parkinson, J., & Crouch, A. (2011). Education, language, and identity amongst students at a South African university. Journal of Language, Identity, and Education, 10(2), 83–98.
Pavlenko, A. (2007). Autobiographical narratives as data in applied linguistics. Applied Linguistics, 28(2), 163–188.
Potter, J. (2004). Discourse analysis as a way of analysing naturally occurring talk. In D. Silverman (Ed.), Qualitative research: Theory, method and practice (pp. 200–221). London: SAGE.
Potter, J., & Hepburn, A. (2005). Qualitative interviews in psychology: Problems and possibilities. Qualitative Research in Psychology, 2, 281–307.
Potter, J., & Hepburn, A. (2008). Discursive constructionism. In J. A. Holstein & J. F. Gubrium (Eds.), Handbook of constructionist research (pp. 275–293). New York: Guilford.
Potter, J., & Hepburn, A. (2012). Eight challenges for interview researchers. In J. F. Gubrium, J. A. Holstein, A. B. Marvasti, & K. D. McKinney (Eds.), The SAGE
handbook of interview research: The complexity of the craft (2nd ed., pp.555–570). London: SAGE. Prior, M.T. (2014). Re-examining alignment in a ‘failed’ L2 autobiographic research interview. Qualitative Inquiry, 20(4), 495–508. Prior, M.T. (2016). Emotion and discourse in L2 narrative research. Bristol: Multilingual Matters. Prior, M. T. (2017). Accomplishing “rapport” in qualitative research interviews: Empathic moments in interaction. Applied Linguistics Review. https://doi. org/10.1515/applirev-2017-0029. Rapley, T.J. (2007). Doing conversation, discourse, and document. London: SAGE. Rapley, T.J. (2012). The (extra)ordinary practices of qualitative interviewing. In J.F. Gubrium, J.A. Holstein, A.B. Marvasti, & K.D. McKinney (Eds.), The SAGE handbook of interview research: The complexity of the craft (2nd ed., pp.541–554). London: SAGE. Reddy, M.J. (1979). The conduit metaphor—A case of frame conflict in our language about language. In A.Ortony (Ed.), Metaphor and thought (pp.284–310). Cambridge: Cambridge University Press. Richards, K. (2003). Qualitative inquiry in TESOL. NewYork: Palgrave Macmillan. Richards, K. (2009). Interviews. In J.Heigham & R.A. Croker (Eds.), Qualitative research in applied linguistics: A practical introduction (pp.182–199). NewYork: Palgrave Macmillan. Richards, K. (2011). Using micro-analysis in interviewer training: ‘Continuers’ and interviewer positioning. Applied Linguistics, 32(1), 95–112. Riemann, G., & Schütze, E. (1991). “Trajectory” as a basic theoretical concept for analyzing suffering and disorderly social processes. In D.R. Maines (Ed.), Social organization and social process: Essays in honor of Anselm Strauss (pp. 333–357). NewYork: de Gruyter. Ritchie, D.A. (Ed.). (2011). The Oxford handbook of oral history. Oxford: Oxford University press. Roulston, K. (2010). Reflective interviewing: A guide to theory and practice. London: SAGE. Roulston, K. (2011). Working through challenges in doing interview research. 
International Journal of Qualitative Methods, 10(4), 348–366. Roulston, K. (2012). The pedagogy of interviewing. In J.F. Gubrium, J.A. Holstein, A. B. Marvasti, & K. D. McKinney (Eds.), The SAGE handbook of interview research: The complexity of the craft (2nd ed., pp.61–74). London: SAGE. Roulston, K. (2014). Interaction problems in research interviews. Qualitative Research, 14(3), 277–293. Rubin, H.R., & Rubin, I.S. (2012). Qualitative interviewing: The art of hearing data (3rd ed.). London: SAGE. Sapsford, R. (2007). Survey research (2nd ed.). London: SAGE. Schaeffer, N.C. (1991). Conversation with a purpose—Or conversation? Interaction in the standardized interview. In P.P. Biemer, R.M. Groves, L.E. Lyberg, N.A.


M. T. Prior

Mathiowetz, & S.Sudman (Eds.), Measurement errors in surveys (pp.367–391). Hoboken, NJ: John Wiley & Sons. Seidman, I. (2013). Interviewing as qualitative research: A guide for researchers in education and the social sciences (4th ed.). NewYork: Teachers College Press. Silverman, D. (2014). Interpreting qualitative data (5th ed.). London: SAGE. Skinner, J. (2012). The interview: An ethnographic approach. London: Berg. Smith, L.T. (2012). Decolonizing methodologies: Research and indigenous peoples (2nd ed.). London: Zed Books. Speer, S.A. (2002). ‘Natural’ and ‘contrived’ data: A sustainable distinction? Discourse Studies, 4(4), 511–525. Spradley, J. P. (1979). The ethnographic interview. New York: Holt, Rinehart, & Winston. Stewart, D.W., & Shamdasani, P.N. (2014). Focus groups: Theory and practice (3rd ed.). London: SAGE. Talmy, S. (2010). Qualitative interviews in applied linguistics: From research instrument to social practice. ARAL, 30, 128–148. Talmy, S. (2011). The interview as collaborative achievement: Interaction, identity, and ideology in a speech event. Applied Linguistics, 32(1), 35–42. Talmy, S., & Richards, K. (Eds.). (2011). Qualitative interviews in applied linguistics: Discursive perspectives [Special issue]. Applied Linguistics, 32(1), 1–128. Watson-Gegeo, K.A. (1988). Ethnography in ESL: Defining the essentials. TESOL Quarterly, 22(4), 575–592. Weiss, R.S. (1995). Learning from strangers: The art and method of qualitative interview studies. NewYork: The Free Press. Wengraf, T. (2001). Qualitative research interviewing. London: SAGE. Wilkinson, S. (2004). Focus group research. In D. Silverman (Ed.), Qualitative research: Theory, method and practice (pp.177–199). London: SAGE. Winchatz, M. R. (2006). Fieldworker or foreigner? Ethnographic interviewing in nonnative languages. Field Methods, 18, 83–97. Zimmerman, D. H. (1998). Identity, context and interaction. In C. Antaki & S.Widdicombe (Eds.), Identities in talk (pp.87–106). London: SAGE.

12 Observation and Fieldnotes

Fiona Copland

Introduction

This chapter introduces observation and fieldnotes in applied linguistics research. Traditionally, observation and fieldnotes have been central to fully fledged ethnographies and some applied linguistics studies have taken this approach (e.g., De Costa, 2014; Duff, 2002). However, it is fair to say that full ethnographies are rather rare in this research field: “topical” ethnographies, which focus on a particular aspect of a research site (see Shaw, Copland, & Snell, 2015), are more popular (see, e.g., Creese, Blackledge, & Kaur, 2014; King & Bigelow, 2012; Kubanyiova, 2015; Luk & Lin, 2010; Madsen & Karreboek, 2015). Nevertheless, a recent search of the journal Applied Linguistics using the term “fieldnotes” returned only six articles, which rose to eight when “field notes” was the search term, showing that even topical ethnographic studies are not generally reported in one of the leading applied linguistics journals. In contrast, the same search terms returned 38 articles in the Journal of Sociolinguistics, a journal which was first published at least 17 years after Applied Linguistics, suggesting that researchers taking a more qualitative, ethnographic approach are aligning themselves to sociolinguistics rather than the more mainstream applied linguistics, although for many, sociolinguistics remains a strand of this larger field (see, e.g., the statement by the American Association for Applied Linguistics (AAAL), http://www.aaal.org/?page=AboutAAAL).

This observation in turn seems to be confirmed by examining research methods in a recently published issue of Applied Linguistics (June 2016, volume 37, issue 3). Out of six articles, two are experimental studies, two are corpus linguistics studies and two are theoretically oriented studies of language classification. Given this underrepresentation of ethnographically oriented studies in mainstream applied linguistics research, the purpose of this chapter is to outline the methodological warrant for ethnographic studies, and in particular to explain how to carry out observation and how to write and analyse fieldnotes. It is hoped that the discussion will contribute to the spirit of dialogue and mutual engagement between research paradigms that King and Mackey (2016) call for in their introductory article for the centenary edition of The Modern Language Journal and will go some way to ensuring that researchers are ready to respond to current, real-world research problems in applied linguistics that often require a range of research approaches.

F. Copland (*) Faculty of Social Sciences, University of Stirling, Stirling, Scotland
© The Author(s) 2018. A. Phakiti et al. (eds.), The Palgrave Handbook of Applied Linguistics Research Methodology

Current/Core Issues

Applied linguistics is a broad discipline which includes sub-disciplines, such as forensic linguistics, corpus linguistics, second language acquisition (SLA), translation, lexicography and TESOL (teaching English to speakers of other languages) (see Hall, Smith, & Wicaksono, 2011). While not common in all these sub-disciplines, observation has been employed as a data collection tool in translation, SLA and TESOL. Often the researchers complete either a predetermined observation schedule or one designed by the researcher(s). In TESOL, for example, researchers might collect information about the organisation of learning, such as which students contribute to the class, what kinds of questions a teacher asks or what languages participants use during the lesson. Observation schedules, therefore, tend to guide the observer to particular features of the observed space in a structured way. Researchers may also audio or video record interactions with a view to applying an analytic framework post-observation (see Walsh, 2011).

Another approach to observation, both inside the classroom and in other contexts, is through ethnographic study. Ethnography, simply put, is the study of people in their own contexts with a view to seeing the context and the world as they see it (this is called taking an emic perspective). Ethnographers, therefore, generally focus on one (or a small number of) research sites, spend



time in those sites, observing, talking to people (“participants”) and taking part in local practices. This is called “participant observation” and can take a number of different forms (see Copland & Creese, 2015). Traditionally, the participants were not known to researchers and studying them meant travelling far from home, living for extended amounts of time with the people being studied and perhaps learning a new language (see, e.g., Geertz, 1960; Scheper-Hughes, 2001). However, in recent times, the focus in ethnographic studies has been on understanding groups closer to home, and schools, neighbourhoods and institutions have all been researched. Ethnographers have focused on making the familiar strange (rather than the strange familiar, which was the concern of early ethnography), showing how taken-for-granted routines and behaviours result from and contribute to social structures and ideologies in play. Rock (2015), for example, uses an ethnographic approach to uncover how cautions are delivered to suspects in police custody in the UK and how suspects can be disadvantaged by how the caution is delivered in different contexts. She argues as the result of her findings that the wording of the caution should be reconsidered with a view to making it as transparent as possible.

In a number of cases, the researcher belongs (or has belonged) to the group being studied: this is called “native ethnography” or “practitioner ethnography” (Hammersley, 1992). Through living or working in a context, a researcher (or potential researcher) notices features of the research site that he/she finds puzzling, infuriating or interesting and decides to investigate them. Many PhD studies develop from premises such as these.

Fieldnotes are central to ethnography. The researcher not only observes and takes part in local practices; he/she also writes down what he/she sees. These fieldnotes, therefore, become the main data set from which the researcher draws findings.
However, as noted in the Introduction, fieldnotes are not common in much applied linguistics research. This may be because in many applied linguistics sub-disciplines, empirical data seem relatively unassailable, coming as they do in the form of corpora of written texts or recorded conversations, for example. These texts and conversations exist beyond the researcher, who has not brought them into being. Fieldnotes, in contrast, are created by the researcher, often working alone in the field. They record the features that he/she wishes to note, particularly those that seem relevant to the research participants (see below for a detailed discussion). As such, they are clearly subjective. Indeed, if more than one researcher is present at the same event, the fieldnotes may well be very different (Creese, Bhatt, Bhojani, & Martin, 2008). Some applied linguistics researchers, therefore, can struggle with what they might consider the partiality, bias and subjectivity of fieldnotes.



Of course, researchers who work with fieldnotes would reject these criticisms. They argue that all research is subjective and biased and one of the strengths of ethnography is that it acknowledges these realities and focuses on research as interpretation. Fieldnotes provide accounts which record the “partialities and subjectivities that are necessarily part of any interpretative process” (Copland & Creese, 2015, p. 38). Copland and Creese (2015) argue, as a result, that fieldnote data should be open to scrutiny in the same way as other forms of data (concordance lines or transcriptions, for example). In this way, a reader can “formulate his or her own hunches about the perspectives of people who are being studied” (Bryman, 1988, p. 77). In addition, researchers would suggest that fieldnotes are more suitable as a data collection tool for describing the social world than methods such as surveys and even interviews. They allow the researcher to begin to understand and represent the insider’s perspective, providing situated, contextualised accounts of lived realities (see Taylor, 2002).

In the following section, I will outline the processes of writing fieldnotes, before considering current core issues in the area.

Essential Processes/Steps

As suggested above, fieldnotes are made by researchers as part of the observation process. However, before observation can begin, the researcher must instigate two important processes: gaining access to the research site and developing relationships with research participants. If the researcher is known in the site and has already developed relationships with the people in it, gaining access can be straightforward and may involve sending a formal letter to the head of the site and gaining permission from the potential participants (although this situation may not be without ethical issues; see Copland & Creese, 2016, for a discussion of ethical issues in observational research). The process can be more difficult if the researcher is not known at the site. Potential participants may mistrust the researcher and be suspicious of his/her motives. In this case, the researcher may need to negotiate access with the head of the research site, through explaining the research in detail and developing relationships with the participants. Even then, access may not be granted, or may be granted only under certain conditions. Shaw (2015), for example, explains how she gained access to health think tanks through drawing on her experience of working in a well-known think tank in her communications with the executives she wished to interview, and through frequent and repeated contact. In one case, where the executive did not respond, Shaw carefully explained that



the think tank would not feature in the case study and therefore that the think tank’s interests would not be represented. This approach was eventually successful and Shaw gained access. However, all the think tank executives insisted on seeing potential publications before they went to print, which resulted in some negotiation of content. This is quite a concession in any research, although increasingly common when research is funded by commercial companies or those with an overt political or ideological agenda.

The researcher must pay attention at all stages to developing good relationships with the research participants. It is particularly important at the beginning of the observation process so that the participants feel as comfortable as possible with the presence of the researcher. Agar (1996) calls this “early participant observation” and suggests “it is the bedrock on which the rest of your research will rest” (p. 166). It may be best at the beginning of the process not to take notes but to watch and listen in order to concentrate fully on the context without the distraction (for the researcher and the participants) of note taking.

Once the researcher feels comfortable in the site (and the participants feel comfortable having the researcher there), the researcher can begin to make fieldnotes. Fieldnotes are “productions and recordings of the researcher’s noticings with the intent of describing the research participants’ actions” (Creese, 2011, p. 44). Because fieldnotes are not guided by specific questions or structures (in contrast to observation schedules), the researcher is free to choose what to record. Fieldnotes, therefore, are always incomplete records and are always representative, reducing as they do “just-observed events, persons and places to written accounts” (Emerson, Fretz, & Shaw, 2001, p. 353).
Furthermore, they are always evaluative as “whatever we write down, positions us in relation to what we observe in one way or another” (Copland & Creese, 2015, p. 44).

Fieldnotes are accounts constructed over time. When the researcher is observing at the research site, he/she will generally find it difficult to make full notes. Often, it is not possible to make any notes at all, in which case the researcher must rely on memory. In other places, the researcher can note only “jottings” (Emerson et al., 2001, p. 356) or “scratch notes” (Sanjek, 1990, p. 95), words or phrases that remind the researcher of something he/she wishes to write about at a later date. It is usual, therefore, for the researcher to ensure that there is time to write up fieldnotes soon after observation, a process which can be both tiring and frustrating (Punch, 2012).

In Sample Study 12.1, you can see how Angela Creese’s scratchings became considered, formal fieldnotes. This set of fieldnotes was just one of 42 fieldnote documents, with each set consisting of between two and nine pages of



Sample Study 12.1

Creese, A. (2015). Case study one: Reflexivity, voice and representation in linguistic ethnography. In F. Copland & A. Creese with F. Rock & F. Shaw (Eds.), Linguistic ethnography: Collecting, analysing and presenting data (pp. 61–88). London: SAGE.

Research Background
The extract is from a longer case study which considers how a researcher collects, analyses and presents data, particularly in team ethnography. The author is drawing on a study which she carried out with other researchers which focuses on the links between bilingualism and social class.

Research Method
The author uses linguistic ethnography in which she collects both linguistic and ethnographic data, including fieldnotes, recordings and interviews.
• Setting and participants: The setting is a complementary school (i.e., a school which operates outside school times in the UK, often on a Saturday, to teach children the language and culture of their parents). Central participants are the teenage students at the school learning their heritage language, and Hema, the head teacher of the school.
• Instruments/techniques: Fieldnotes.

Key Results
Creese’s first set of fieldnotes comprised rough jottings of first drafts made inside the classroom and added to immediately after class. An example of these can be seen in Fig. 12.1 from the original study. The fieldnotes show that at the time Creese was interested in aspiration and the relationship between social class and aspiration. When producing these first “scratchings,” Creese would not have known if these particular observations would make an important contribution to research findings or if they would be afforded analytical attention.

Fig. 12.1  Draft 1 of fieldnotes



Usually Creese’s first drafts contain more abbreviations, shorthand text, half-finished sentences and jottings of words to remind her of things of interest to write up in more detail at second draft. These fieldnotes are fuller because the class she was observing was holding an examination and Angela was bored and had more time to write and reflect. Below you can see how these scratches became a full typed-up set. Although few changes have been made between the original scratchings and these fully fledged fieldnotes, there are some differences, which you may wish to try to spot.

19 March Fieldnotes (AC)

1pm a break is given. During the break the girls are talking about their papers. The boys are talking about money and time! Hema goes into the next class to make sure all is OK. She keeps it all going. She looks at her phone during the break and texts something.

*Kareena says something about everyone thinks she goes to grammar school. She explains it used to be a boarding school and it’s difficult to get in cos everyone wants to go there. Should be worth listening to this section of recording as they are all talking about “being bright,” “being tutored.” My rough notes say we have to use this project to talk about social class and multilingualism. Also something on varieties of language and perhaps “accommodation theory.” That should keep us going!

New girl is not invited over to join in with the 4 girls who seem to be quite good friends now. She sits on her own. Later they include her.

*Girls are also talking about “a half Sikh and half Muslim girl.” Also talking about a good-looking boy and Dumbledore and when the next Harry Potter film is out.

The 2 boys want to know if they can go next door but Hema says no, break is over. Hema goes over the translation and appears to struggle a bit—either reading it or translating it. I can’t tell. She reads “film” in Panjabi as “filim.”

Comments
In redrafting between rough notes and typed-up draft, Angela is making decisions about what is interesting and relevant and what gets carried forward. The first draft, for example, includes a comment about boarding schools which is also included in the second draft. In contrast, she omitted information about researcher imposition, so that her concern that her presence was disrupting the mock examination being conducted by the teacher was not included. In terms of what is added, further detail about grammar schools is included as Angela was interested in the girls’ interest in different kinds of schools and she was also interested in the grammar school system herself. One of the main differences between the two sets of fieldnotes lies in the phrases “I write in my notes” and “that should keep us going.” The first indicates that Angela would like the meta comment to be noted by others in the team, while the second is a light-hearted comment in which she acknowledges her own position as someone who is going to be involved in the project for some time. Whether these fieldnotes become central or not is not the issue at this point. A single entry about one theme in one set of fieldnotes does not constitute data. Rather, fieldnotes must be analysed as a set so that themes and rich points can be identified and examined.



single-spaced A4 typed-up observations. As you read, consider how the fieldnotes move from scratchings to formal notes and to the presentation of analytical categories.

What to Write in Fieldnotes

Writing fieldnotes can be a difficult task, particularly at the beginning of a project when it is difficult to decide what is relevant. Agar (1996) has useful advice in this regard:

At any given time during your early informal fieldwork, there will be a couple of topics you are focusing on. Centre your informal interviews, conversations, and observations on these topics…. When something interesting appears, note it. But don’t lose the focus of the topics currently under consideration. (p. 162)

Focusing observations on overarching research questions, therefore, can reduce feelings of being overwhelmed by information and detail, a common response to observing in new research sites. Researchers, as Agar recognises, always have a focus or research questions that guide the study. If these are kept in mind in the beginning stages of fieldwork, they can support the researcher in looking. However, the researcher should be open to noting whatever appears interesting at the research site as new foci can emerge as studies develop. More detailed advice about writing fieldnotes is provided by Emerson, Fretz, and Shaw (1995):

First jot down details of what you sense are key components of the observed scenes or interactions…. Second avoid making statements characterising what people do that rely on generalisations (e.g. using opinionated words)… Third jot down concrete sensory details about actions and talk. (p. 32)

This advice urges the researcher to consider the vocabulary used to record fieldnotes, noting that careless language can affect the quality of the fieldnotes, particularly if the researcher uses “opinionated words” (Emerson et al., 1995, p. 32). Their advice, therefore, is to describe neutrally. Concrete sensory details, in addition, will be helpful in taking the researcher back to the site when reading and analysing the notes, as well as providing useful information when writing up the fieldnotes into research findings. Richards (2003), drawing on Spradley (1980), provides a useful set of headings which can guide the observer in deciding what to note. Specifically, he suggests that observers record details of setting, people (including relationships



and interactions), systems (by which he means the procedures that participants follow in the research setting) and behaviour (including events and times). The advantage in systematically following a checklist of this nature is that the fieldnote data can be reliably compared. Furthermore, changes and differences are quickly recognised. Importantly, the list of headings supports the novice researcher in making sense of the research site, which, as noted above, can be overwhelming when first encountered.

Each time a researcher carries out observation, a new set of fieldnotes is created. These fieldnotes become a “corpus” of data (Emerson et al., 2007, p. 353), which is available for analysis (see below). Levon (2013) explains that storing and organising fieldnotes requires some thought and that researchers “should come up with a standardised system for keeping” them (p. 76). Many researchers store their fieldnotes in multiple places, but it is important that these are treated as other research data and kept safely and securely in places which others cannot access.
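For researchers who keep their fieldnotes electronically, the headings and the idea of a standardised storage system can be sketched in a few lines of Python. The sketch below is only one possible arrangement, not a procedure recommended by Richards (2003) or Levon (2013): the class name `FieldnoteEntry`, the file-naming pattern (date, site, observer initials) and the example values are all hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class FieldnoteEntry:
    """One set of fieldnotes, organised under the four headings
    (setting, people, systems, behaviour). Hypothetical structure."""
    site: str              # invented label for the research site
    observed_on: date      # date of the observation
    observer: str          # observer's initials
    setting: str = ""
    people: str = ""       # including relationships and interactions
    systems: str = ""      # procedures participants follow
    behaviour: str = ""    # including events and times

    def filename(self) -> str:
        # Assumed naming convention: date_site_observer.txt
        site_slug = "-".join(self.site.lower().split())
        return f"{self.observed_on.isoformat()}_{site_slug}_{self.observer.lower()}.txt"

    def missing_headings(self) -> List[str]:
        # Headings left blank, as a prompt when writing up after observation.
        headings = ("setting", "people", "systems", "behaviour")
        return [h for h in headings if not getattr(self, h).strip()]


# Invented example entry (not taken from any study).
entry = FieldnoteEntry(
    site="Example school",
    observed_on=date(2024, 1, 15),
    observer="XX",
    setting="Classroom on the first floor; desks in rows.",
    people="One teacher and five students.",
)
print(entry.filename())
print(entry.missing_headings())
```

Keeping the same headings in every entry is what makes later comparison across sets straightforward; the file names themselves should of course not contain anything that identifies participants.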

Reliability/Validity

For many ethnographic researchers, the notions of reliability and validity are not a concern. Starfield (2015) suggests this is because ethnographers do not want to be held accountable to concepts that have emerged from positivist research paradigms. Indeed, it is rare to see a mention of reliability or validity in texts which focus on ethnographic work. Hammersley (1992) suggests that rather than being concerned with validity and reliability, research of all types should focus on validity and relevance (p. 68). Validity is concerned with producing research findings that can be substantiated. Relevance should be concerned with the importance of the topic and whether the findings add to what we already know about the topic.

Validity answers the question, “How do I know that?”. It requires the researcher to provide explicit evidence to support claims. Validity can be achieved in a number of ways. The first is through convincing the reader that the researcher has carried out a “thorough, systematic iterative analysis” (Duff, 2002, p. 983). To do so, the researcher must provide details of the analytical process and examples of decisions made (see Copland & Creese, 2015, for a number of worked examples of this type). Another way is through providing a detailed description of the research site/participants that is recognisable to those who took part in the research. This can be achieved through member validation, that is, sharing findings with research participants (perhaps through vignettes, further illustrated in Sample Study 12.2).



Related to this approach, Emerson et al. (2007) suggest that “by presenting herself as a participant in an event and witnessing insider actions, the ethnographer can convince by showing how she learned about a process” (p. 364). In this case, validity is achieved by the weight and authenticity of empirical evidence. A final way is to use more than one data set to provide a range of perspectives on the issue under consideration. While not originally developed with validity in mind, linguistic ethnography (Rampton, Maybin, & Roberts, 2015), which brings together ethnographic and linguistic data and data analysis, provides this affordance. Linguistic ethnography often conjoins fieldnotes with transcribed data from interactions in natural settings or with transcribed interviews, or with both data sets. Taken together, the data can show how a researcher has arrived at findings (see Snell et al., 2015, for a range of studies that take a linguistic ethnographic approach, and Copland & Creese, 2015, for a full discussion of collecting, analysing and presenting data in linguistic ethnography).

Relevance responds to the question, “so what?”. In an age where research impact is increasingly important, relevance cannot be overlooked. Relevance, of course, is a concern for the research project as a whole rather than for fieldnotes in particular. It forces researchers to consider whether the research project has a valuable purpose, in terms of either changing lives or developing theoretical perspectives. In this context, fieldnotes provide researchers with empirical evidence from which to draw findings to make their cases. Their analyses “produce sensitising concepts and models that allow people to see events in new ways” (Hammersley, 1992, p. 15). In that fieldnotes are generally written in a report or narrative form, it could be argued that they are more accessible to a range of readers than statistical data, ensuring that their relevance (or not) is more immediately apparent.

Analysing Fieldnotes

Writing fieldnotes is the first level of data analysis in that it is an interpretive process (Emerson et al., 1995). As researchers work on their fieldnotes, adding to or refining them, they continue this process of analysis. Formal “observable” analysis begins when the researcher considers the fieldnotes as a data set from which to produce findings. Again, this process requires the researcher to go through a number of stages.

First, the researcher must read the notes a number of times in order to become familiar with them. As Blommaert and Dong (2010) advise, “gradually you will start reading them as a source of ‘data’ which you can group, catalogue and convert into preliminary analysis” (p. 39). Often this preliminary analysis will be in the form of themes which begin to emerge from the data. Becoming familiar also means

  Observation andFieldnotes 


that researchers will be able to locate particular sections easily for further analysis or publication. During the reading stage, the researcher will often write analytical notes on the fieldnotes, either by hand or electronically. In Fig. 12.2, the analytical notes can be seen on the right-hand side.

These analytical notes help the researcher to develop codes, the next stage in the process. Codes usually combine themes that have informed the research questions with those which emerge through reading and rereading the fieldnotes, often emerging from the notes or memos. For example, in Fig. 12.2, which comes from my data, the codes are given on the left-hand side. “FTA” (face-threatening act) came from a research question, which focused on face. However, the code “feedback structure” was developed as I read the fieldnotes and realised the importance of how feedback was organised to what was said (and allowed to be said).

The main purpose of codes is to provide evidence that themes were either important to participants or important to the researcher (see discussion of reflexivity below). For example, the code FTA appeared in every set of fieldnotes I wrote, often on more than one occasion. In discussing findings, I could be confident that FTAs were common in the feedback conferences I observed and that they deserved to be discussed. Empirically, the frequency of codes can be useful for demonstrating to readers not familiar with ethnography that the theme is important; the researcher, for example, might state that the code FTA appeared 38 times and at least three times in each set of fieldnotes.

Another purpose of coding is to distinguish the normal and regular from the unusual and noteworthy. It is often the irregular that alerts the researcher to what is taken for granted in the research site and what acceptable behaviour from the research participants’ perspectives looks like (what Agar, 1996, calls “rich points”).
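Where codes have been recorded electronically, frequency statements of the kind just described (a total count for a code and the minimum number of times it occurs in any one set of fieldnotes) can be checked with a short script. The following Python sketch shows one way this might be done; the set names and the lists of codes are invented for illustration and are not drawn from the study discussed here, and the function names are hypothetical.

```python
from collections import Counter

# Invented example: each set of fieldnotes mapped to the codes applied in it.
coded_sets = {
    "fieldnotes_01": ["FTA", "feedback structure", "FTA", "FTA"],
    "fieldnotes_02": ["FTA", "FTA", "feedback structure", "FTA", "FTA"],
    "fieldnotes_03": ["feedback structure", "FTA", "FTA", "FTA"],
}


def code_frequencies(coded_sets):
    """Return (total count per code, count per code within each set)."""
    per_set = {name: Counter(codes) for name, codes in coded_sets.items()}
    total = Counter()
    for counts in per_set.values():
        total.update(counts)
    return total, per_set


def minimum_per_set(per_set, code):
    """Smallest number of times `code` occurs in any single set
    (0 if the code is absent from at least one set)."""
    return min(counts[code] for counts in per_set.values())


total, per_set = code_frequencies(coded_sets)
print(f"FTA: {total['FTA']} occurrences, "
      f"at least {minimum_per_set(per_set, 'FTA')} in each set")
```

A count of this kind supports, but does not replace, the interpretive work of deciding what a code means in context; a code that appears only once may still mark a rich point.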
Hammersley (1992) argues: “ethnographic work often seems to amount to a celebration of the richness and diversity of human social life, but at the same time seeks to identify generic features” (p.17). Fieldnotes, therefore, provide evidence of both. At this stage, researchers may produce memos. Drawn from grounded theory (e.g., Charmaz, 2000), memos integrate data from different fieldnotes

Fig. 12.2  Coded fieldnotes


F. Copland

(and potentially other data sources such as interviews or recorded interactions) into texts that link “analytic themes and categories” (Emerson et al., 1995, p. 143). According to Charmaz (2000), memoing is a process through which we explore and integrate our codes, linking analyses with empirical data. A memo therefore is the researcher’s written articulation of how theoretical insights are drawn from data and may include sections of fieldnotes to illustrate these insights. Alternatively, or in addition, a researcher may use the observation and fieldnotes to produce vignettes. A vignette is “a focused description of a series of events taken to be representative, typical or emblematic” (Miles & Huberman, 1994, p. 81). Vignettes differ from memos, therefore, in being narrative accounts rather than (only) exploratory or analytical accounts. Their purpose is also different: rather than providing an articulation of the link between data and theory, a vignette attempts to capture something of the experience of being in the research site, including descriptive details of sights and sounds (see Flewitt, 2011). Vignettes, therefore, distil data (rather than including “raw” data such as fieldnotes) into a text which provides readers with an insight into findings of the research. In Sample Study 12.2, the reader can see how Rosie Flewitt achieved this in her vignette of the research participants’ literacy practices.

Sample Study 12.2

Flewitt, R. (2011). Bringing ethnography to a multimodal investigation of early literacy in a digital age. Qualitative Research, 11(3), 293–310.

Research Background

The article explores how children develop literacy skills through interaction with technology, specifically the computer.
Research Question

In the article, Flewitt asks if ethnography and multimodality can be brought together effectively to provide “richly situated insights into the complexities of early literacy development in a digital age, and can inform socially and culturally sensitive theories of literacy as social practice” (Flewitt, 2011, p. 293).

Research Method

• Setting and participants: The setting is a pre-school classroom in which there is a computer. The participants are two pre-school children, Edward and Chrissie, who are 3–4 years old, and another young male child of the same age.
• Instruments/techniques/analysis: The author uses multimodal methods of data collection and analysis in her work. In this article she explains how ethnography and multimodality can combine effectively to provide insights into early literacy learning.



Key Results

Flewitt produces a vignette of Edward’s literacy play which focuses on an episode of semiosis to demonstrate to the reader how interactions between children and between the computer and children result in meaning making. Flewitt explains that the vignette is a “transduction of data collected in diverse media and presented anew in written format” (p. 302). The vignette, therefore, includes references to interviews, fieldnotes and the researcher’s recollections while focusing for the most part on one section of video-recorded activity. Flewitt argues that a vignette cannot reproduce the field but “offers a way to distil observed practices into the portrayal of a scene that captures something of the rich sights, sounds and senses of the ethnographic research experience” (p. 302). The vignette provides a detailed description of how three children negotiate their social and digital literacy interaction. In summary, Edward is on the computer playing a game and then Chrissie comes over to join in, taking the mouse while Edward uses the space bar on the keyboard. After a brief tussle when Chrissie stops the game, the children decide to play a different game which she selects. They play on until a third child joins them and, after a further tussle, continue to play as a threesome. The full vignette provides a great deal more detail and a photograph supports the description.

Comments

The value of the vignette is that it demonstrates effectively and efficiently how social practices, in this case developing literacy, are socially situated. In this vignette, the reader develops an understanding of the interactions that took place and of the atmosphere in which they happened.
As Flewitt writes, “the reader begins to get a sense of how the interaction between the children was negotiated in ensembles of meaning created through multiple modes, primarily through embodied action, particularly pointing and the sensual use of touch, with some use of spoken language while their gaze remained largely fixed on the screen” (p.303). Through this vignette, Flewitt clearly shows how ethnography and multimodality can work together to produce nuanced findings which help us to better understand children’s literacy development.

When to Write Fieldnotes

A challenge for researchers is when to write fieldnotes. If the researcher is a participant, it is almost impossible to write fieldnotes contemporaneously with the actions being observed. There are other reasons why a researcher may want to postpone fieldnote writing until after the site visit. Participants may get anxious about what is being written. In classroom contexts in particular, observers writing notes are often equated with evaluative practices and so can make teachers uncomfortable. One PhD student of mine avoided this situation by writing his fieldnotes in his car, which he parked a kilometre down the road from the school in which he had spent the morning. Scott Jones and Watt (2010) describe how Scott Jones wrote fieldnotes in the toilet. Her reason for doing so was that her research was covert and she



did not want people in the research site to see what she was doing. In applied linguistics research, covert research is generally discouraged but the issue of when to write notes can provide similar challenges to researchers in different contexts. Likewise, overtly writing fieldnotes can attract attention. Scott Jones and Watt (2010) describe how participants in Watt’s research “would often nod to my book as if to say ‘you need to write this down’” (p.165). The public nature of these fieldnotes meant that Watt “developed a strategy of leaving a space after the jotted note to remind her there was more to say on this topic in the privacy of her own space” (2010, p.165). In this way, Watt could both satisfy the participants that she was attending to what was important to them at the same time as leaving herself some space in which to reflect on what she had seen and perhaps provide an interpretation of it. When to write fieldnotes, therefore, is contextually and motivationally contingent and requires sensitivity on the part of the researcher.

What to Include in Fieldnotes

In the section above, guidance is given about how to focus fieldnotes and what kind of observations to make. However, content has become something of a controversial issue. Blommaert and Dong (2010) are clear that fieldnotes should include more than a description of what the researcher sees:

Do not attempt to be Cartesian in your fieldnotes: you can afford yourself to be subjective and impressionistic, emotional or poetic. Use the most appropriate way of expressing what you want to express, do not write for an audience, and do not feel constrained by any external pressure: your fieldnotes are private documents, and you will be the only one to decide what you release from them. (p. 38)

Their argument is twofold: first, fieldnotes are private and the researcher chooses what to make public so that choice belongs to the researcher. The second is that the emotional self is involved in the field and researchers should feel free to draw on it as our emotions are intrinsic to how we witness an event. However, the latter view in particular is not universally shared. Punch (2012) explains how in the field of sociology, fieldnotes and field diaries have traditionally been separate documents, with diaries being reserved for feelings, gripes, emotions and fieldnotes for “observations, descriptions of places, events, people and actions…reflections and analytical thoughts about what the observations may mean: emerging ideas, notes on themes and concepts, links to research questions and the wider literature” (p. 90). Drawing on



Blackman (2007), she suggests that this separation is because “incorporating emotions and personal challenges into discussions of the research process is still not accepted as a scholarly endeavour” (p.87) and therefore should be relegated to a different, less academic document. Indeed, Palmer (2010) advocates separating notes into observational notes, theoretical notes and methodological notes (it is interesting that he does not also suggest “emotional notes”). In applied linguistics and sociolinguistics research, however, the distinction between the observational self and the emotional self has not been as rigid. As Blommaert and Dong (2010) eloquently argue, how we witness tells a story about “an epistemic process: the way in which we try to make new information understandable for ourselves, using our own interpretive frames, concepts and categories” (p.37). Writing how we feel, therefore, can support researchers in developing a reflexive approach in their work. Reflexivity involves the recognition that the researcher has an impact on what is being observed in terms of how participants behave, how the research is constructed and what gets reported (see Copland & Creese, 2015). This is not to say that our notes on feelings necessarily find their way into our research reports. Nevertheless, it is vital that we recognise how they influence what we do and see in order that we can account for our findings. Emotions and feelings, therefore, seem central to developing as a reflexive ethnographer (see Davies, 2008).

Technology and Fieldnotes

In recent years, technology has played an increasingly important role in fieldnotes work. In the research site, tablets can now be used instead of notebooks, with all the electronic benefits of writing and storing at the same time. In terms of analysis, NVivo, ATLAS.ti and other data analysis programmes help researchers to code and sort fieldnotes data (see, e.g., Tusting, 2015). One of my PhD students has been using Transana to bring together classroom recordings, transcriptions of these recordings and fieldnotes. Transana provides a visual representation of all the data from a particular observation and has allowed the researcher to identify links between the data which might otherwise have been missed. Figure 12.3 is a screenshot of one such Transana page. It is likely that, as technology continues to improve and researchers become increasingly adept at using it, much ethnographic data will be stored and analysed using programmes such as these.



Fig. 12.3  Screenshot of Transana programme used to collate fieldnotes and recordings (Hall, personal data, 2015)

Limitations and Future Directions

King and Mackey (2016) argue in their discussion of second language acquisition research that it needs to draw on a range of research approaches in order to answer the questions about language learning that are pertinent in today’s world. My view is that this argument can be extrapolated to applied linguistics research in general. I can think of few articles I have read recently that would not benefit from taking a close-up look at language in its contexts of use, particularly given the multi-layered complexity of many of the issues applied linguists are currently grappling with. While full-blown ethnographies of the kind produced by Duff (2002) are beyond the economic means of many researchers, topic-based ethnographic studies are far less onerous in terms of time and money and therefore within reach. Furthermore, as research funders increasingly request research to be interdisciplinary in nature, it is an opportune time for applied linguistics researchers to consider how an ethnographic element could support them in writing more relevant (in Hammersley’s terms) research questions and in producing findings which tackle complexity and which are nuanced and contextualised. The purpose of this chapter is to persuade readers that such an enterprise is worth some investment in terms of understanding how to carry out observational research and how to write and analyse fieldnotes.



Resources for Further Reading

Copland, F. (2011). Negotiating face in the feedback conference: A linguistic ethnographic approach. Journal of Pragmatics, 43(15), 3832–3843.

In this article, I use fieldnotes to develop the argument that linguistic data alone is not adequate in discussions of face threat. The context is English language teacher training but the linguistic ethnographic approach could be taken by anyone carrying out research in a limited number of research sites.

King, K. A., & Mackey, A. (2016). Research methodologies in second language studies: Trends, concerns and new directions. Modern Language Journal, 100(S1), 209–227.

In this article, King and Mackey explore current trends in applied linguistics research and suggest the qualitative/quantitative divide should be re-examined with a view to researchers developing research designs which are fit for purpose in terms of investigating complex linguistic problems.

Punch, S. (2012). Hidden struggles of fieldwork: Exploring the role and use of field diaries. Emotion, Space and Society, 5, 86–93.

In this article, Sam Punch explores the role of emotion in fieldwork. She describes the traditional approach in sociology to separate fieldnotes from field diaries: the former record observations in the field and the latter emotional responses and other personal responses. She challenges the notion that emotions should not be included in sociological reports and suggests that field diaries should become recognised sources of data.

References

Agar, M. H. (1996). The professional stranger (2nd ed.). Bingley: Emerald House Publishing.
Blackman, S. J. (2007). ‘Hidden ethnography’: Crossing emotional borders in qualitative accounts of young people’s lives. Sociology, 41(4), 699–716.
Blommaert, J., & Dong, J. (2010). Ethnographic fieldwork: A beginner’s guide. Bristol: Multilingual Matters.
Bryman, A. (1988). Quantity and quality in social research. London: Unwin Hyman.



Charmaz, K. (2000). Grounded theory: Objectivist and constructivist methods. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (2nd ed., pp. 509–536). London: SAGE.
Copland, F. (2011). Negotiating face in the feedback conference: A linguistic ethnographic approach. Journal of Pragmatics, 43(15), 3832–3843.
Copland, F., & Creese, A. with Shaw, S., & Rock, F. (2015). Linguistic ethnography: Collecting, analysing and presenting data. London: SAGE.
Copland, F., & Creese, A. (2016). Ethical issues in linguistic ethnography: Balancing the micro and the macro. In P. I. De Costa (Ed.), Ethics in applied linguistics research (pp. 161–178). London: Routledge.
Creese, A. (2011). Making local practices globally relevant in researching multilingual education. In F. M. Hult & K. A. King (Eds.), Educational linguistics in practice: Applying the local globally and the global locally (pp. 41–55). Bristol: Multilingual Matters.
Creese, A. (2015). Case study one: Reflexivity, voice and representation in linguistic ethnography. In F. Copland & A. Creese with F. Rock & S. Shaw (Eds.), Linguistic ethnography: Collecting, analysing and presenting data (pp. 61–88). London: SAGE.
Creese, A., Bhatt, A., Bhojani, N., & Martin, P. (2008). Fieldnotes in team ethnography: Researching complementary schools. Qualitative Research, 8(2), 197–215.
Creese, A., Blackledge, A., & Kaur, J. (2014). The ideal ‘native-speaker’ teacher: Negotiating authenticity and legitimacy in the language classroom. Modern Language Journal, 98(4), 937–951.
Davies, C. A. (2008). Reflexive ethnography: A guide to researching selves and others (2nd ed.). Oxford: Routledge.
De Costa, P. I. (2014). Making ethical decisions in an ethnographic study. TESOL Quarterly, 48(2), 413–422.
Duff, P. (2002). The discursive co-construction of knowledge, identity, and difference: An ethnography of communication in the high school mainstream. Applied Linguistics, 23(3), 289–322.
Emerson, R. M., Fretz, R. I., & Shaw, L. L. (1995). Writing ethnographic fieldnotes. Chicago: University of Chicago Press.
Emerson, R. M., Fretz, R. I., & Shaw, L. L. (2001). Participant observation and fieldnotes. In P. Atkinson, A. Coffey, S. Delamont, J. Lofland, & L. Lofland (Eds.), Handbook of ethnography (pp. 352–368). London: SAGE.
Emerson, R. M., Fretz, R. I., & Shaw, L. L. (2007). Participant observation and fieldnotes. In P. Atkinson, A. Coffey, S. Delamont, J. Lofland, & L. Lofland (Eds.), Handbook of ethnography (pp. 352–368). London: SAGE.
Flewitt, R. (2011). Bringing ethnography to a multimodal investigation of early literacy in a digital age. Qualitative Research, 11(3), 293–310.



Geertz, C. (1960). The religion of Java. New York: Free Press.
Hall, C., Smith, P. H., & Wicaksono, R. (2011). Mapping applied linguistics. London: Routledge.
Hammersley, M. (1992). What’s wrong with ethnography? London: Routledge.
King, K. A., & Bigelow, M. (2012). Acquiring English and literacy while learning to do school: Resistance and accommodation. In P. Vinogradov & M. Bigelow (Eds.), Low education second language and literacy acquisition, proceedings from the 7th symposium (pp. 157–182). Minneapolis: University of Minnesota Press.
King, K. A., & Mackey, A. (2016). Research methodologies in second language studies: Trends, concerns and new directions. Modern Language Journal, 100(S1), 209–227.
Kubanyiova, M. (2015). The role of teachers’ future self guides in creating L2 development opportunities in teacher-led classroom discourse: Reclaiming the relevance of language teacher cognition. Modern Language Journal, 99(3), 565–584.
Levon, E. (2013). Ethnographic fieldwork. In C. Mallinson, B. Childs, & G. Herk (Eds.), Data collection in sociolinguistics: Methods and applications (pp. 69–79). London: Routledge.
Luk, J. C. M., & Lin, A. (2010). Classroom interactions as cross-cultural encounters: Native speakers in EFL lessons. London: Routledge.
Madsen, L., & Karreboek, M. (2015). Hip hop, education and polycentricity. In J. Snell, S. Shaw, & F. Copland (Eds.), Linguistic ethnography: Interdisciplinary explorations (pp. 246–265). London: Palgrave Macmillan.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis (2nd ed.). London: SAGE.
Palmer, C. (2010). Observing with a focus: Fieldnotes and data recording. In J. Scott Jones & S. Watt (Eds.), Ethnography in social science practice (pp. 141–156). London: Routledge.
Punch, S. (2012). Hidden struggles of fieldwork: Exploring the role and use of field diaries. Emotion, Space and Society, 5(2), 86–93.
Rampton, B., Maybin, J., & Roberts, C. (2015). Theory and method in linguistic ethnography. In J. Snell, S. Shaw, & F. Copland (Eds.), Linguistic ethnography: Interdisciplinary explorations (pp. 14–50). London: Palgrave Macmillan.
Richards, K. (2003). Qualitative inquiry in TESOL. Basingstoke: Palgrave Macmillan.
Rock, F. (2015). Case study three: Ethnography and the workplace. In F. Copland & A. Creese (Eds.), Linguistic ethnography: Collecting, analysing and presenting data (pp. 117–142). London: SAGE.
Sanjek, R. (1990). A vocabulary for fieldnotes. In R. Sanjek (Ed.), Fieldnotes: The makings of anthropology (pp. 92–138). Ithaca and London: Cornell University Press.
Scheper-Hughes, N. (2001). Saints, scholars, and schizophrenics (20th anniversary edition). London: University of California Press.



Scott Jones, J., & Watt, S. (2010). Making sense of it all: Analysing ethnographic data. In J. Scott Jones & S. Watt (Eds.), Ethnography in social science practice (pp. 157–172). London: Routledge.
Shaw, S. (2015). Case study four: Ethnography, language and healthcare planning. In F. Copland & A. Creese (Eds.), Linguistic ethnography: Collecting, analysing and presenting data (pp. 143–170). London: SAGE.
Shaw, S., Copland, F., & Snell, J. (2015). An introduction to linguistic ethnography: Interdisciplinary explorations. In J. Snell, S. Shaw, & F. Copland (Eds.), Linguistic ethnography: Interdisciplinary explorations (pp. 1–13). London: Palgrave Macmillan.
Snell, J., Shaw, S., & Copland, F. (2015). Linguistic ethnography: Interdisciplinary explorations. London: Palgrave Macmillan.
Spradley, J. P. (1980). Participant observation. New York: Holt, Rinehart and Winston.
Starfield, S. (2015). Ethnographic research. In B. Paltridge & A. Phakiti (Eds.), Research methods in applied linguistics (pp. 137–152). London: Bloomsbury.
Taylor, S. (2002). Researching the social: An introduction to ethnographic research. In S. Taylor (Ed.), Ethnographic research: A reader (pp. 1–12). London: The Open University and SAGE.
Tusting, K. (2015). Workplace literacies and workplace society. In J. Snell, S. Shaw, & F. Copland (Eds.), Linguistic ethnography: Interdisciplinary explorations (pp. 51–70). London: Palgrave Macmillan.
Walsh, S. (2011). Exploring classroom discourse: Language in action. Abingdon: Routledge.

13 Online Questionnaires

Jean-Marc Dewaele

J.-M. Dewaele, Department of Applied Linguistics and Communication, Birkbeck, University of London, London, UK

Introduction

This chapter considers issues linked to questionnaires and to web-based research, more specifically within second language acquisition (SLA) and multilingualism research. Referring to my own experiences with online questionnaires, I consider the extent to which sampling and the increased heterogeneity of samples can affect findings. I also address the question of sample size and of the various pitfalls that researchers can tumble into. I mention issues of validity, suggest future directions for research and remind readers of some important issues of etiquette, such as gratitude to participants and respect with regard to ethical issues. The present chapter takes Wilson and Dewaele’s (2010) study of the use of web questionnaires in SLA and bilingualism research as a starting point. Dörnyei and Taguchi (2010) explain that because the essence of scientific research “is trying to find answers to questions in a systematic manner, it is no wonder that the questionnaire has become one of the most popular research instruments in the social sciences” (p. 1). They define “questionnaire” as “the self-completed, written questionnaire that respondents fill in by themselves” (p. 3). Questionnaires are employed “as research instruments for measurement purposes to collect valid and reliable data” (p. 3). The authors do lament the




fact that, despite a growing methodological awareness in applied linguistics, “the practice of questionnaire design/use has remained largely uninformed by the principles of survey research accumulated in the social sciences” (p. 2). Applied linguists have been using questionnaires, both paper-based and online versions, for many years. Paper-based questionnaires also have broad applications for other types of research designs including experimental research, case study, action research and ethnographic research. Paper-based questionnaires guarantee a higher response rate (but not necessarily a higher quality of response) when the researcher is on site. Though there are plenty of publications on paper-based questionnaires (Dörnyei & Taguchi, 2010), where online questionnaires might get a passing mention, there is a distinct lack of texts focused more specifically on online questionnaires, a gap that deserves to be filled. The omnipresence of the Internet has greatly facilitated the development of web-based instruments for researchers at all levels, ranging from undergraduate students collecting data for a small research project, to Master’s students collecting information for their dissertation, to PhD students collecting substantial quantities of quantitative and/or qualitative data for their thesis, to established researchers reaching out to the general public in order to collect data that could allow them to answer intriguing research questions.
The danger is that it has become almost too easy to create an online questionnaire, and (budding) researchers may underestimate the amount of work and thought that must go into the research design, the criteria for participation, the formulation of the questions and the items, and the balance between conciseness and completeness, while keeping an eye on the overall length of the questionnaire and countless other things that could make a big difference at the time of the analysis of the feedback (Dörnyei & Taguchi, 2010). The reactions in the field of second language acquisition (SLA) have been sufficiently positive for the use of online questionnaires to become firmly established (Dörnyei, 2007). Dörnyei points to the advantages and disadvantages of web-based research. The most important advantage, which outweighs the disadvantages by far, is the low cost of setting up an online questionnaire. One popular platform, SurveyMonkey, is free, but only ten questions can be included in the questionnaire and there is a cap of 100 responses. For a small monthly fee, it allows up to 1000 responses, and for a slightly higher annual amount the number of responses is unlimited and statistical assistance is included. Google Forms is free but offers fewer services. A growing number of academic institutions have purchased an online authoring tool/survey tool



licence that their academic staff, students and other employees can use for free. An online survey authoring tool such as Lime is common and user friendly, and a URL for the survey can be generated and hosted by an academic institution. Researchers in applied linguistics and psychology have also started using Mechanical Turk to gain immediate (online) access to, and data from, large numbers of participants. Once online questionnaires are up, they run themselves without further input from the researcher, keeping count of the number of participants, until the point where sufficient data have been collected and the harvest can start. This amounts to downloading the data into a spreadsheet. Another advantage, according to Dörnyei (2007), is anonymity, as there is no face-to-face interaction between researcher and participants, and no pressure on the latter to participate, which enhances the level of honesty in responses. Indeed, because there are no social consequences to participation, there is less chance that participants may be tempted to exaggerate or distort their responses to please the authors of the questionnaire. Dörnyei points out that another crucial advantage is the potential access to much larger and more diverse populations all over the world, or, on the other hand, the possibility to reach “small, scattered, or specialised populations” (p. 121). This is particularly important for applied linguists who try to collect data from learners with different language backgrounds engaged in the acquisition of a variety of target languages in different school systems around the world. The diversity of contexts boosts the ecological validity of the data; in other words, the richness of the materials and settings approximates the real world. Patterns between variables that are unique to one context, and that researchers may interpret as being universal, may well be absent in another context.
The multi-context perspective thus protects researchers from prematurely jumping to sweeping conclusions about the phenomena they are interested in. One major limitation of questionnaires, and online questionnaires in particular, which will be highlighted further, is the inevitable self-selection bias. Indeed, in “Internet-based research (…) it is not possible to apply a systematic, purposive sampling strategy, as all participants are self-selected” (Dörnyei, 2007, p.122). Even with paper-based questionnaires, potential participants can decline to fill out the questionnaire or participate without much enthusiasm and leave some questions unanswered or start to answer at random. Several of the studies that I have carried out using online questionnaires were based on data collected from between 1000 and over 2000 participants (see Sample Study 13.1). These numbers would have been unattainable to a lone researcher in the past or to a researcher using paper-based questionnaires. I remember the days when I handed out printed questionnaires to my students,



or to colleagues, and hovered around to make sure they completed the questionnaire and did not promise to hand it to me the following week—which was quite unlikely to occur. It took a lot of time and persuasive effort, and getting 150 properly completed questionnaires was an achievement. Then came the tedious transcription of Likert scale responses onto Excel files and the struggle to decipher bad handwriting on open questions. These days, the data can be imported straight into an Excel file, and while it does typically take a full day to clean up the data, the effort required for preparing the data of 1500 participants for statistical analysis is probably less than 10 per cent of the effort required to do the same thing with the data of 150 printed questionnaires.
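The clean-up step mentioned above can be partly automated. The following is a minimal sketch only, not the procedure used in the studies described here: the column names, the 1–5 scale and the sample rows are all hypothetical, and it assumes the platform export has been saved as comma-separated text.

```python
import csv
import io

# Hypothetical export: one row per participant, three Likert items (1-5).
raw_export = """id,item1,item2,item3
p01,4,5,3
p02,2,,4
p03,5,7,1
p04, 3 ,4,4
"""


def clean_likert(csv_text, items, low=1, high=5):
    """Keep only rows where every item is an integer within [low, high]."""
    kept, dropped = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            scores = [int(row[item].strip()) for item in items]
        except ValueError:  # blank or non-numeric response
            dropped.append(row["id"])
            continue
        if all(low <= s <= high for s in scores):
            kept.append({"id": row["id"], **dict(zip(items, scores))})
        else:
            dropped.append(row["id"])
    return kept, dropped


kept, dropped = clean_likert(raw_export, ["item1", "item2", "item3"])
print(len(kept), "usable rows; dropped:", dropped)
```

Returning the identifiers of dropped rows, rather than silently discarding them, lets the researcher report how many responses were excluded and why.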

Sampling

Types of Sampling

Participant selection has traditionally been a thorny issue among social scientists, well before the advent of online questionnaires (Bass & Firestone, 1980; Ness Evans & Rooney, 2013). There are two main types of sampling strategies and subcategories within these. The first type is probability sampling, which aims to constitute a “representative sample” of the general population. One procedure is random sampling: “whereby a sample is drawn such that each member of the population has an equal probability of being included in that sample” (Ness Evans & Rooney, 2013, p. 126). Ness Evans and Rooney point out that random sampling rarely happens in the social sciences but that it is not really a problem because social scientists “are typically testing theories, not generalizing to entire populations” (p. 127). However, random assignment of participants to groups is crucial as it is “an important assumption of several statistical procedures” (p. 127). Ness Evans and Rooney (2013) urge researchers to be cautious “in generalizing the results to populations that may differ from our sampled population” (p. 132). Dörnyei adds that the generalizability of findings based on data gathered via non-random sampling is “often negligible” (2007, p. 99). This is a common concern, too, for academics who use convenience sampling, a form of non-probability sampling (i.e. collecting data from their own students). These students represent a captive participant pool, who might be gently coerced into participating in order to earn some pocket money or to obtain a credit for their participation in the experiment. These participants are smart, accessible, willing and cheap. Depending on the topic of research, the results are not generalizable to the



whole population. A sample of students is fine for research on students’ attitudes or performance, but it would be inadequate for broader research, for example, on the political views in a certain region or country (Ness Evans & Rooney, 2013). Sampling strategies are thus dictated by practical considerations: “it is in sampling, perhaps more than anywhere else in research, that theory meets the hard realities of time and resources” (Kemper, Stringfield, & Teddlie, 2003, pp.273–274). One specific problem is self-selection bias: only people who are interested in a topic and feel strongly about it, whether positively or negatively, will be willing to spend 20 minutes filling out an online questionnaire on it. The bias does not invalidate the research, but it does require careful interpretation of the results. In Dewaele and MacIntyre (2014, 2016), for example, we collected data from 1746 participants who were learning a foreign language focusing on their enjoyment and anxiety in the foreign language class. A paired t-test showed that the mean scores for foreign language enjoyment were significantly higher than for foreign language anxiety. It would have been easy but erroneous to jump to the conclusion that foreign language learners report experiencing more enjoyment than anxiety in class. Despite the size of the sample, it was very likely that learners who did not like foreign language classes much were less likely to fill out the questionnaire. Hence, our participants were not a representative sample of the whole foreign language learner population (it is unlikely that such a perfectly balanced sample could ever be constituted).
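For readers unfamiliar with the paired t-test mentioned above, the computation can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the published analysis: the six pairs of scores are invented, the function name is hypothetical, and only the t statistic and degrees of freedom are computed (a p-value would require a statistics package or a t table).

```python
import math
import statistics

# Invented scores for six learners (mean of 1-5 Likert items); NOT the study's data.
enjoyment = [4.2, 3.8, 4.5, 3.9, 4.1, 3.6]
anxiety = [2.9, 3.1, 2.5, 3.4, 2.8, 3.0]


def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


t, df = paired_t(enjoyment, anxiety)
print(f"t({df}) = {t:.2f}")
```

The test compares the two scores within each respondent, so it describes only those who chose to respond; a large t value does not address the self-selection problem discussed above.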

The Difficulty of Attaining Balance in Terms of Age, Education Level and Gender

Participants in online questionnaires in the field of SLA and multilingualism are typically older adults: the average age of the 1573 participants in the Bilingualism and Emotion Questionnaire (BEQ) was 34 (Dewaele, 2013; Dewaele & Pavlenko, 2001–2003), and the average age of the 2324 participants who filled out a questionnaire on swearing in English was 32 (Dewaele, 2016). The participants in Dewaele and MacIntyre (2014, 2016) were younger on average (24 years old), partly because of the unexpected recruitment of 11, 12, 13 and 14-year-olds. Gender imbalance is another issue that I have encountered in all the open online questionnaires that I have run. No matter how big the sample, the majority of participants have been female. It is possible that this partly reflects the proportion of female learners engaged in foreign language learning or of


J.-M. Dewaele

learners willing to spend 20 minutes in an exercise of self-reflection about their language use and their emotions when learning and using the foreign language. Statistical analysis of gender differences in the data is possible despite the imbalance. However, I wonder then if the minority of male participants willing to fill out the questionnaire may differ from those male learners who do not participate. In other words, to what extent are the male participants representative of all male learners? Feldman Barrett said that Internet-based samples may be more representative of the general population’s age but pointed out that they tend to over-represent participants with an above-average educational and socio-economic status who can afford to access a computer with an Internet connection (Feldman Barrett, cited in Jaffe, 2005). This was certainly the case in the previously mentioned corpora. In the BEQ, 90 per cent of participants had university degrees, including Bachelors, Masters and doctoral degrees (Dewaele, 2013, p. 43). Similar proportions emerged in Dewaele (2016). I have speculated that the high proportion of participants with university education is probably linked to the fact that filling out these questionnaires requires practice, self-confidence, metalinguistic and metapragmatic awareness of one’s language practices and a genuine interest in the topic. Filling out questionnaires is a particular literacy skill that university students in the social sciences acquire quickly but that can leave lay people dumbfounded. It requires an ability to condense complex experiences, feelings and language choices on highly abstract Likert scales. It also implies an awareness of the convention that answers in these numerical formats are at best an approximation of a complex truth.
A typical comment on the BEQ, which asked about language preference for the communication of emotions with various categories of interlocutor, including “friends,” was that it depended on the specific friend being addressed (Dewaele, 2013). The first exposure to such a questionnaire must be unsettling, comparable to one’s first encounter with a dictionary or a grammar. However, no PhD is needed to be able to reflect on one’s experiences in multiple languages, and a good number of our participants had lower education levels. Interestingly, their responses were generally comparable to those with higher levels of education (Dewaele, 2013). Reflecting on the difficulty of recruiting more participants with lower education levels (Dewaele, 2013), I included a story from my research assistant at the time, Dr Benedetta Bassetti, about her unsuccessful attempt to collect data for the BEQ from working class, bilingual Italians: I finally managed to get hold of a bunch of very uneducated Italian-English bilingual males, and tried to administer your questionnaire. Given the very low levels of literacy, I did it orally and recorded the answers. (…) The first problem

was of course getting the message across, these people have a down-to-earth approach that is simply devastating. ‘This research is about language and emotions.’ ‘About what?’ Language and—it’s about how you express your emotions in Italian and in English, like, y’know, when you’re in love, do you prefer to express this feeling in English or in Italian? You are THICK, of course if I’m in love with an Italian girl I say it in Italian, if I’m in love with an English I say it in English. Ok (sigh!), yes, of course, but which one do you prefer, like, in which language do you feel more comfortable to express this emotion, or in which language is it more meaningful?—the informant didn’t get this abstract idea, and the discussion shifted to whether he preferred an Italian or English girlfriend. The second problem was suspicion. Why do they want to know this? E’ un professore dell’università di Londra, he’s doing some research. Why does a professore want to know my opinion? Well you know, he has to do research to get his salary. —He gets paid to do this?!? (…) Finally, I would say that the level of abstraction was too high. At the question ‘if you had some unpleasant experiences in the past’, he answered ‘I didn’t have any unpleasant experiences’, so I decided to change the question into a concrete image ‘for instance, if your mother dies, would it be easier to talk about this in English or in Italian?’ He looked at me thoughtfully and thoroughly scratched his testicles for two minutes (Italian version of touching wood) (personal communication, 2003) (Dewaele, 2013, pp. 43–44)

This episode clearly shows that these Italian participants would have been unlikely to fill out either a paper-based or an online questionnaire, even if they had had Internet access. Being put under pressure to fill out the paper version of the questionnaire would probably have been equally unsuccessful, as information would have been incomplete. I did try to recruit multilingual participants with lower socio-economic status using a paper version of the BEQ. I asked parents from various ethnic and linguistic backgrounds at my daughter’s primary school to fill out the questionnaires. I managed to collect about 20 questionnaires in this way, but I did meet considerable resistance and unwillingness. There is a serious point about all this: sitting in their comfy chairs, reviewers and examiners can complain about a sample being too highly educated, too imbalanced in terms of age and gender and therefore not representative of the general population. It is only when the researcher has ventured on the ground that the gap between theory and reality becomes clear. Research instruments such as questionnaires with Likert scales and open questions on language choices probably look intimidating, perhaps even threatening, to people who are not used to them. They may bring back unhappy school memories, or they may provoke a fear of looking stupid in the eyes of the researcher, as participants may think that there are “right” and “wrong” answers. Explaining to them


that this is not the case might be insufficient to allay their fear. To conclude, we have to accept that it is unlikely for us to ever have a sample of participants that is representative of the general population. This should not discourage us from doing research, as long as we acknowledge the limitations of the research design and sample, and we remain careful in drawing conclusions from our findings.

Reaching Vulnerable or Closed Niche Groups

To obtain ethics approval for research on human subjects, researchers must typically explain the aim of the research, the design, the target group and the instruments they will use. When the target group is foreign language learners or multilinguals over the age of 16, obtaining ethics approval is relatively easy in my institution. It becomes much harder when the target group is foreign language learners who are younger than 16 and when consent needs to be obtained from the school, from the teachers, from the parents and from the students themselves. This cascade of levels of consent makes the research more time-consuming, and the sample size will tend to be smaller. A different type of difficulty occurs when the target group is psychologically vulnerable, such as current or former (multilingual) psychotherapy patients. Firstly, it is impossible to obtain patient lists from therapists or official mental health organisations because of confidentiality issues. A consequence of this is that very little research has been carried out on larger samples of (former) patients. Most studies investigating these issues use a case study approach, with typically up to five participants. Beverley Costa and I decided that such small samples and the evidence that emerged from these studies would never yield the kind of generalisable result that is needed to influence policy. In this particular project we wanted to attract attention to the fact that monolingualism is the norm in health services and that a patient’s relative fluency in the language of the host country does not mean s/he can be treated as a monolingual. Hence the need to adapt the training of future psychotherapists by introducing notions of individual multilingualism, the meanings of code-switching and its consequences for psychotherapy.
In Costa and Dewaele (2012), we used snowball sampling to distribute an anonymous online questionnaire that attracted feedback from 101 therapists. In Dewaele and Costa (2013) we used the same approach to reach 182 multilingual clients who had experienced therapeutic approaches in different countries. Some communities have their groups and forums where potential participants can be recruited. In Simpson and Dewaele (2019), we focused on self-misgendering (i.e. making a gender error in referring to the self) among multilingual transgender speakers. Transgender people are sparsely distributed and often hidden, which means that the Internet is the ideal place to make contact with them. Being transgender, the first author knew the social-networking sites and forums used by the transgender community (over 30 groups and forums, based in various countries) where the call for participation could be issued. Being a member of the group meant that her request was also met with less suspicion than if an outsider had issued the call. We highlighted the fact that the questionnaire was completely anonymous. A total of 132 people participated.

Sample Size

How Much Is Enough?

Another important issue in questionnaire-based research is sample size. In other words, how many participants are needed to answer the research questions? Barkhuizen (2014) writes that when he is asked this question, his answer is “‘Well, it depends’” (p. 5). Many factors need to be considered when deciding on the number of participants. Some factors are outside the control of the researcher: “the specific requirements of the research design and methods, the availability of the participants, constraints of time and human resources, and organizational structures within research sites, such as class size and timetabling” (p. 7). Other factors are under the control of researchers: “determining the purposes and goals of the study, planning for and monitoring access to participants and research sites, and gauging feasibility in terms of scale of the project, time constraints, and one’s own research knowledge and skills” (p. 7). Barkhuizen also advises researchers to consult published research literature in the same field and discuss the issue with experienced colleagues. Ness Evans and Rooney (2013) take a more statistical point of view on the question “how many is enough?”. They explain that for quantitative research, the number of participants needed “depends on the power of [the] statistic. Parametric statistics such as the t-test and analysis of variance (ANOVA) have a great deal of power, but chi-square, a nonparametric procedure, has relatively low” (p. 134). They explain further that the research design determines the sample size. Larger samples are needed when multiple statistical comparisons are planned; smaller samples are fine for tightly controlled experiments where there is a strong relation “between the manipulated variable and the behaviour”


(p. 135). Larger samples are thus needed in field research, where researchers have less control than their colleagues in the laboratory. Having more participants can compensate for greater variability. However, it is worth noting here that more does not automatically mean better. In very large-scale studies, because the analyses are very highly powered (large N), even small correlations and mean differences are likely to be statistically significant. In other words, there is a danger of getting an inflated view of the relationships in question. To avoid this, researchers have to be sure to pay attention to the actual size of the relationship as measured not by p but by the r or r² value, or by the d value for mean differences (see Plonsky & Oswald, 2014).
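The difference between statistical significance and effect size can be illustrated with a short calculation. The sketch below is a minimal example under stated assumptions: the means and standard deviations are invented, the two groups are assumed to be of equal size, and the p-value comes from a normal approximation to the t statistic rather than from the exact t distribution. The function names are hypothetical.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b):
    """Cohen's d for two groups of equal size, using the pooled SD."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def approx_two_tailed_p(d, n_per_group):
    """Two-tailed p-value from a normal approximation to the t statistic.

    For two equal groups, t = d * sqrt(n / 2). The approximation is
    reasonable for large samples and only a rough guide for small ones.
    """
    t = d * math.sqrt(n_per_group / 2)
    return math.erfc(abs(t) / math.sqrt(2))


# Hypothetical Likert-scale means and SDs for two groups.
d = cohens_d(3.6, 3.5, 1.0, 1.0)
for n in (50, 5000):
    print(f"n per group = {n}: d = {d:.2f}, p = {approx_two_tailed_p(d, n):.4f}")
```

With these invented figures, the same very small difference (d of about 0.1) is nowhere near significance with 50 participants per group but falls well below p = .001 with 5000 per group. The p-value changes with N; the size of the effect does not.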

Sample Study 13.1

Dewaele, J.-M. (2015). British ‘Bollocks’ versus American ‘Jerk’: Do native British English speakers swear more—or differently—compared to American English speakers? Applied Linguistics Review, 6(3), 309–339.

Dewaele, J.-M. (2016). Thirty shades of offensiveness: L1 and LX English users’ understanding, perception and self-reported use of negative emotion-laden words. Journal of Pragmatics, 94, 112–127.

Background

Previous work on swearing in a foreign language has shown that swearwords are perceived to be more powerful, and preferred, in the L1 than in the LX, even among bilinguals who feel equally proficient in both languages.

Method

An online questionnaire attracted 2347 participants (1636 females, 664 males). This included 1159 English first language users and 1165 English foreign language users. They were asked to rate a list of 30 emotion-laden words and expressions ranging in emotional valence from mildly negative to extremely negative. More specifically they were asked about how sure they were about the meaning of the word, the offensiveness and their frequency of use. The 2015 study was a comparison of the same variables for the 414 L1 users of British English and the 556 L1 users of American English.

Research Questions

1. Do LX users of English report being as comfortable in their understanding of the meaning of the 30 words as L1 users of English?
2. Do LX users of English have the same perception of offensiveness of the 30 words as L1 users of English?
3. Do LX users of English report comparable frequencies of use of the 30 words as L1 users of English?
4. What is the effect of having lived in an English-speaking environment on LX users’ understanding of the meaning, perception of offensiveness and reported frequencies of use of the 30 words?
5. What is the effect of context of acquisition on LX users’ understanding of the meaning, perception of offensiveness and reported frequencies of use of the 30 words?
6. What is the relationship between self-reported level of oral proficiency in English and LX users’ understanding of the meaning, perception of offensiveness and reported frequencies of use of the 30 words?

Results

The LX users were found to overestimate the offensiveness of most words. They were significantly less sure about the exact meaning of most words compared to the L1 users and reported a preference for relatively less offensive words, while the L1 users enjoyed using the taboo words. Among the LX users more contact and exposure to English was linked to a better understanding of the meaning of the words, a better calibration of offensiveness and frequency of use. The comparison of British and American L1 users showed that the British participants gave significantly higher offensiveness and frequency of use scores to four words (including “bollocks”) while the American English L1 participants rated a third of words as significantly more offensive (including “jerk”), which they also reported using more frequently.

Comments

These two papers relied on a corpus that was large enough to allow multiple studies, comparing different groups and subgroups. One of the typical questions from reviewers was how reliable self-report is. This is a tricky question because it is impossible to claim that it is entirely reliable. The main defence against this type of critical observation is that since the questionnaire was anonymous there was no reason for participants to lie. Also, the participant is a pretty reliable source of his/her own linguistic behaviour. Everybody has a pretty clear idea about the frequency with which they use certain swearwords. Using a 5-point Likert scale is both vague and precise enough for the purpose of the research. It allows for precise comparisons of scores for different words.

Actual production data are obviously better but other methodological problems arise. Wiring somebody up in order to record daily interactions for a length of time (typically never more than two days) might yield interesting data about swearing if that person happens to find themselves in the right situation, but if no such situation is encountered or if the person swears infrequently, no usable data would be collected. Moreover, it would be impossible to collect data from more than 1000 participants over a sufficiently long period (not even mentioning the ethical issues).

Getting the Ball Rolling

As some of my students have found, putting a questionnaire online is no guarantee of mass participation. Snowball sampling implies that the researcher throws the ball in the direction of fresh snow that will adhere and that multiple little pulls and pushes are needed to keep it rolling. I typically spend about two days contacting students, friends, colleagues asking them nicely to


fill out the questionnaire and to forward the call for participation to their students, colleagues and anyone who fits the selection criteria. Posting the call on Linguist List or professional listservs also helps. It is good to have a wide network of socio-professional contacts and friends. My guess is that I probably send out about 3000 emails when launching a new project. I have learned not to include more than 500 addresses per message in order to avoid the account being blocked for spamming. I do not see calls for participation in studies as spam, but I agree that when it happens too frequently it can be perceived as (friendly) harassment. I often help my students in sending out their calls for participation because I realise that those who receive a call are more likely to participate and forward it when they recognise the name of the sender. As a rule, I also respond positively to calls for participation and forward these to my students. In the academic world, we need to help each other out. I do get the odd angry reaction, with an addressee asking me how I got his or her address. I have generally no idea how the person ended up in my list of email contacts, but having been editor of the International Journal of Bilingual Education and Bilingualism since 2013 has certainly boosted my list of contacts. Once somebody asked for proof of the ethics clearance for the study, which I forwarded to him, as it was a legitimate question. A delicate decision is at what point to withdraw the online questionnaire from the web. This depends firstly on the schedule of the researcher and whether, for example, a date has been set to start analysing the data in order to write a dissertation or to prepare a paper for a conference. The researcher can also decide to analyse the first batch of data while the questionnaire remains active on the web. Google Forms allows the researcher to see how the recruitment is going.
If the number of new participants tails off to fewer than one or two a day, it might be the right time to haul in the data. A fitting metaphor is that of the fisherman/woman hauling in the net, hoping that the catch will be glittering, abundant and rich, and will contain the fish that were targeted. It is important at that point not to be discouraged by a smaller than expected haul. Good research can be carried out on small datasets, but it may need some tweaking of the research questions.
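The rule of thumb about recruitment tailing off can be written down as a simple check. The sketch below is only one possible reading of it: the function name, the seven-day window and the default threshold of two responses a day are assumptions made for illustration, and the daily counts are invented rather than exported from any real questionnaire.

```python
def recruitment_has_tailed_off(daily_counts, window=7, threshold=2.0):
    """Return True when the mean number of new responses per day over the
    last `window` days is below `threshold`.

    `window` and `threshold` are illustrative assumptions; the chapter
    only gives "fewer than one or two a day" as a rule of thumb.
    """
    if len(daily_counts) < window:
        return False  # not enough days observed yet to judge
    recent = daily_counts[-window:]
    return sum(recent) / window < threshold


# Hypothetical numbers of new responses per day since the launch.
counts = [120, 85, 40, 22, 9, 5, 3, 2, 1, 0, 1, 2, 0, 1]
print(recruitment_has_tailed_off(counts))
```

Averaging over several days rather than looking at a single day avoids closing the questionnaire because of one quiet weekend, although the researcher's own schedule will usually weigh as heavily as any such rule.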

Does the Questionnaire Look Attractive and Professional?

When the request to participate is sent out, it is crucial that the questionnaire looks good and professional. This means that it cannot have any spelling mistakes, ungrammatical constructions, ambiguous questions, irritating questions

and too many open questions that require too much effort on the part of the participants. Convincing someone to fill out a questionnaire for free is only going to work if the layout is nice. If it is sloppy, or the first paragraph sounds pretentious, or there are mistakes, people will not bother. This is also why it is important to pilot test the questionnaire with a small group of critically minded participants who can point out what is wrong or ambiguous or boring. I recommend avoiding questions in the style of “Do you do A more than B?” as the results are hard to analyse statistically. It is much better to ask “How often do you do A?” with a Likert scale, followed by “How often do you do B?” with a Likert scale. This will allow the calculation of means and a simple t-test will say whether the difference is statistically significant. Another crucial aspect of an attractive questionnaire is having a neutral, concise, clear and interest-arousing title that accurately reflects the content of the questionnaire but does not give away too much to avoid potential bias in the responses. In other words, the title should not contain words such as “the benefits of…” nor “the dangers of…” which might attract more people who agree with the message and the agenda it betrays or which might influence the responses of participants to please the author of the questionnaire. The same principle applies to the inclusion of a sufficient number of items and questions that reflect different possible views and do not steer participants in a particular direction. It is also a good idea to include a final open question about observations the participant wishes to make about the questionnaire. The observations can include requests for being contacted once the findings are published, suggestions about missing elements in the questionnaires, congratulations or complaints about the research project. 
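The recommendation above to ask “How often do you do A?” and “How often do you do B?” separately, each with a Likert scale, pays off at the analysis stage. The sketch below shows the calculation of a paired-samples t statistic on two such items. The responses are invented, the function name is hypothetical, and treating Likert answers as interval data is a common simplification rather than a settled matter. Instead of computing an exact p-value, the example compares t with the standard two-tailed critical value for 9 degrees of freedom at the .05 level.

```python
import math
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Paired-samples t statistic and degrees of freedom."""
    if len(scores_a) != len(scores_b):
        raise ValueError("each participant needs a score for A and for B")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical 5-point Likert answers from ten participants.
how_often_a = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]
how_often_b = [3, 4, 3, 2, 4, 3, 3, 4, 2, 3]

t, df = paired_t(how_often_a, how_often_b)
T_CRITICAL_DF9 = 2.262  # two-tailed critical t for df = 9, alpha = .05
print(f"t({df}) = {t:.2f}, significant: {abs(t) > T_CRITICAL_DF9}")
```

A single “Do you do A more than B?” item would yield only one categorical answer per participant, with no means to compare and no way of telling how large the difference is.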
Obviously, the title of the paper or book based on the findings can have a clear positive or negative direction, possibly with a question mark at the end. It informs the reader that a certain hypothesis is being tested. I recently co-authored a paper (Dewaele, MacIntyre, Boudreau, & Dewaele, 2016) entitled “Do girls have all the fun? Anxiety and enjoyment in the foreign language classroom.” The question was deliberately provocative and a reference to the famous 1983 Cyndi Lauper song, “Girls just want to have fun.” We felt that the question part of the title was both catchy and appropriate as our female participants reported significantly more enjoyment but also more anxiety in the FL classroom (effect sizes were small though). A final point needs to be made on the current trend of filling out online questionnaires using a mobile phone. A distraught PhD student recently told me that her pilot project participants had only selected the first four boxes on 9-point Likert scales after viewing video fragments. She then discovered that the phone screen gave a restricted view and that her participants had been unable to see the right side of her questionnaire, which was perfectly visible on a laptop.


Limitations and Future Directions

The present chapter has discussed some of the limitations inherent to the use of paper-based and online questionnaires. What has not yet been mentioned is the fact that “it is all too easy to produce unreliable and invalid data by means of an ill-constructed questionnaire” (Dörnyei, 2007, p. 115). My advice is thus to consult a good source like Dörnyei and Taguchi (2010) before attempting the creation of a questionnaire. Questionnaires developed haphazardly on the kitchen table without proper pilot testing will lead to tears and disappointment. Budding authors of questionnaires need to remember to keep questions simple in order to elicit simple short responses so that participants do not spend too long on filling out the questionnaire. As a consequence, the investigation cannot go very deep and “questionnaire surveys usually provide a rather ‘thin’ description of the target phenomena” (p. 115). One possible way to avoid shallowness is a mixed-method approach which involves gathering both quantitative and qualitative data in a single study, allowing the researcher to identify significant trends in the quantitative data, the possible causes of which can be pursued in interviews with a smaller number of participants (see Creswell & Plano Clark, 2010). This approach is particularly useful for a comparison between “typical” participants and outliers. Interviews often reveal totally unexpected reasons for specific behaviour or attitudes. Interviews are also an excellent way to complement questionnaires. Benedetta Bassetti helped me in interviewing 20 multilinguals after they had filled out the BEQ (Dewaele, 2013). Noticing that a participant had indicated that she had never used swearwords in her English L2, Benedetta challenged her in the interview. The participant admitted that she had indeed started using mild English swearwords, but only when having tea with her Chinese friends in London.
This is a nice example of the fact that the questionnaire is a passive and relatively bland receptacle of data; it is good if it can be complemented by the quick thinking of an inquisitive researcher in an interview. The future of online questionnaires is bright because of the continuous growth of social media and technological advances. Questionnaires can include video clips on YouTube about which specific questions can be asked in the questionnaire (see Sample Study 13.2); webcams can catch facial expressions of participants watching videos, which can be analysed with facial recognition software.

Sample Study 13.2

Lorette, P., & Dewaele, J.-M. (2015). Emotion recognition ability in English among L1 and LX users of English. International Journal of Language and Culture, 2(1), 62–86.

Background

To what extent do foreign language users struggle to recognise emotions in that language compared to first language users? Rintell (1984) showed that EFL students performed less well than a control group of native speaker students in recognising emotions in audio recordings. Proficiency and cultural distance were the main independent variables to have an effect.

Method

An online questionnaire was used to test individual differences in the emotion recognition ability (ERA) of 356 first language and 564 foreign language users of English. Participants were shown six short English videos of an English L1 British actress displaying an improvisation of six basic emotions. Participants were asked to identify the emotion portrayed by the actress ( embed/8VcoNbk3HVE). A comparison of the performance of native and foreign language users of English revealed no difference between the two groups despite the foreign language users scoring significantly lower on a lexical decision task (LEXTALE, www.lextale.com), which was integrated in the questionnaire and used as an indicator of English proficiency.

Research Questions

1. Are there differences between L1 and LX users of English in their ability to recognise a (basic) emotion conveyed by an L1 British English speaker?
2. Is proficiency in English linked to the ability to recognise a (basic) emotion conveyed by an L1 British English speaker?
3. Is a participant’s L1 culture linked to their ability to recognise a (basic) emotion conveyed by an L1 British English speaker?

Results

LX users of English can generally recognise basic emotions in English video clips as accurately as L1 users of English despite lower levels of proficiency. LX users’ proficiency in English is related to their ERA in English, but the threshold for the successful recognition in an LX is probably lower than has been assumed so far. English L1 users with high proficiency scores tended to score higher on the ERA, which could hint at variation on some underlying linguistic or psychological dimension. A link exists between cultural distance and ERA as the Asian participants scored significantly lower on ERA than other groups, despite having English proficiency levels comparable to those of Continental European participants.

Comments

We experienced some difficulty in convincing the reviewers that our video clips were appropriate stimuli. We argued that the use of the picture of a face, a common method in emotion recognition research, might be more straightforward but that our spontaneous stimuli were more likely to reflect the messy reality we face every day, where we observe somebody during a stretch of time and form an opinion on the emotions that the speaker experiences based on the weighing of verbal, vocal and non-verbal cues.


Conclusion

Using online questionnaires in applied linguistics and multilingualism research is a great way to obtain rich and abundant data while sitting behind one’s desk. It is crucial to remain aware of the possibilities and the limitations of the data gathered in this way. At some point, the desk will have to be left behind in order to face some of the anonymous participants, and their input has the potential to create major surprises and revelations about possible causes behind the observed phenomena. It is equally important to express gratitude towards the many participants who spent some precious time answering the many questions. This can happen in the acknowledgement at the end of a published paper or book or at the start or end of a conference presentation. Since many of the participants may be friends and colleagues, or their students, it is crucial to acknowledge that without their input, we would be standing there with empty hands. It is equally important to behave ethically, meaning that if we promised anonymity, nobody should recognise themselves or somebody they know in the research. As academics, we have to respect proper etiquette just like actors in their traditional tearful speech at the Oscars.

Resources for Further Reading

Creswell, J., & Plano Clark, V. (2010). Designing and conducting mixed methods research (2nd ed.). Thousand Oaks, CA: Sage.

This is the “bible” for mixed-method research. It offers practical and step-by-step guidance on how to develop a research design, a crucial step that precedes the creation of a questionnaire. It discusses the formulation of research questions, the collection of data and the interpretation of results.

Dörnyei, Z., & Taguchi, N. (2010). Questionnaires in second language research: Construction, administration, and processing (2nd ed.). London: Routledge.

This revised edition is again the prime source of information for anyone wishing to design questionnaires. It is very hands-on and no-nonsense. Like “celebrity” chefs, the authors share their recipes and provide advice on how to create reliable, valid and successful questionnaires.

References

Barkhuizen, G. (2014). Number of participants. Language Teaching Research, 18(1), 5–7.
Bass, A. R., & Firestone, I. J. (1980). Implications of representativeness for generalizability of field and laboratory research findings. American Psychologist, 35(2), 141–150.
Costa, B., & Dewaele, J.-M. (2012). Psychotherapy across languages: Beliefs, attitudes and practices of monolingual and multilingual therapists with their multilingual patients. Language and Psychoanalysis, 1, 18–40; revised version reprinted in (2014) Counselling and Psychotherapy Research, 14(3), 235–244.
Creswell, J., & Plano Clark, V. (2010). Designing and conducting mixed methods research (2nd ed.). Thousand Oaks, CA: SAGE.
Dewaele, J.-M. (2013). Emotions in multiple languages (2nd ed.). Basingstoke: Palgrave Macmillan.
Dewaele, J.-M. (2015). British ‘Bollocks’ versus American ‘Jerk’: Do native British English speakers swear more—or differently—compared to American English speakers? Applied Linguistics Review, 6(3), 309–339.
Dewaele, J.-M. (2016). Thirty shades of offensiveness: L1 and LX English users’ understanding, perception and self-reported use of negative emotion-laden words. Journal of Pragmatics, 94, 112–127.
Dewaele, J.-M., & Costa, B. (2013). Multilingual clients’ experience of psychotherapy. Language and Psychoanalysis, 2(2), 31–50.
Dewaele, J.-M., & MacIntyre, P. (2014). The two faces of Janus? Anxiety and enjoyment in the foreign language classroom. Studies in Second Language Learning and Teaching, 4(2), 237–274.
Dewaele, J.-M., & MacIntyre, P. (2016). Foreign language enjoyment and foreign language classroom anxiety. The right and left feet of FL learning? In P. MacIntyre, T. Gregersen, & S. Mercer (Eds.), Positive psychology in SLA (pp. 215–236). Bristol: Multilingual Matters.
Dewaele, J.-M., MacIntyre, P., Boudreau, C., & Dewaele, L. (2016). Do girls have all the fun? Anxiety and enjoyment in the foreign language classroom. Theory and Practice of Second Language Acquisition, 2(1), 41–63.
Dewaele, J.-M., & Pavlenko, A. (2001–2003). Web questionnaire on bilingualism and emotion. Unpublished manuscript, University of London.
Dörnyei, Z. (2007). Research methods in applied linguistics. Oxford: Oxford University Press.
Dörnyei, Z., & Taguchi, N. (2010). Questionnaires in second language research: Construction, administration, and processing (2nd ed.). London: Routledge.
Jaffe, E. (2005). How random is that. American Psychological Society, 9, 17–30. Retrieved from
Kemper, E. A., Stringfield, S., & Teddlie, C. (2003). Mixed methods sampling strategies in social research. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research (pp. 273–296). Thousand Oaks, CA: SAGE.
Ness Evans, A., & Rooney, B. J. (2013). Methods in psychological research (3rd ed.). New York: Sage Publications.
Plonsky, L., & Oswald, F. L. (2014). How big is ‘big’? Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912.
Simpson, L., & Dewaele, J.-M. (2019). Self-misgendering among multilingual transgender speakers. International Journal of the Sociology of Language.
Wilson, R., & Dewaele, J.-M. (2010). The use of web questionnaires in second language acquisition and bilingualism research. Second Language Research, 26(1), 103–123.

14 Psycholinguistic Methods
Sarah Grey and Kaitlyn M. Tagarelli

S. Grey (*), Department of Modern Languages and Literatures, Fordham University, Bronx, NY, USA
K. M. Tagarelli, Department of Neuroscience, Georgetown University, Washington, DC, USA

Introduction

Psycholinguistics is dedicated to studying how the human mind learns, represents, and processes language (Harley, 2013). In recent decades, overlapping interests in psychology and linguistics have resulted in a surge in knowledge about language processing, and the psycholinguistic methods born out of this research are increasingly of interest to applied linguistics research. Like other language research methods, such as surveys, interviews, or proficiency tests, psycholinguistic methods can uncover information about language learning and use. Crucially, through experimental and theoretical rigor, psycholinguistic methods also reveal the psychological representations and processes underlying language learning and use.

Although research conducted in naturalistic settings, such as foreign language classrooms, provides important perspectives, such work is inherently confounded with extraneous variables. For example, even if one were to assume that all students receive the exact same input during a Spanish class, there is likely a large range in how much and in what ways (e.g., music, television, friends) each student is exposed to Spanish outside the classroom. This lack of empirical control severely limits the precision and clarity with which researchers can measure and interpret language-related cognitive processes, which has consequences for the theories and applications informed by these interpretations.

Psycholinguistic experiments, which are carried out in laboratory settings, employ carefully controlled designs that enable researchers to target factors of interest (e.g., the effects of phonological similarity on word recognition) while limiting, or controlling for, potentially confounding variables (e.g., word frequency). Their capacity for collecting detailed and well-controlled information is perhaps the strongest advantage of using these methods in applied linguistics research. Furthermore, psycholinguistic methods can reveal information about language learning and processing where other methods cannot. As we will discuss in more detail, traditional measures of performance or proficiency are often too coarse to distinguish different levels of language processing, but psycholinguistic measures like reaction time and brain recordings are more sensitive and thus can reveal effects where those coarser measures cannot.

In this chapter, we review psycholinguistic methods. We do not cover the entire methodological corpus of psycholinguistics but instead focus on methods that are particularly relevant for addressing compelling questions in applied linguistics. In the first part of the chapter, we discuss psycholinguistic measures and tasks, and in the second part we describe common experimental paradigms.

Tasks and Measures

Psycholinguistic methods are especially useful for studying the cognitive processes underlying language learning and use, from phonetics and phonology to discourse-level pragmatics. This section reviews common psycholinguistic measures and tasks that are useful for applied linguistics researchers, together with strengths and limitations of these methods. The section begins with behavioral measures (including decision tasks, reaction time, mouse-tracking, and eye-tracking) and ends with a discussion of a popular neurocognitive measure, event-related potentials.

Decision Tasks

We conceptualize decision tasks as experimental tasks that instruct participants to make a decision in response to a stimulus, from which researchers infer some underlying psycholinguistic process or representation. Such tasks are very common in psycholinguistics and can be employed across all language domains. For example, a phoneme discrimination task requires participants to decide whether a pair of speech sounds (e.g., [pa] and [ba]) are the same or different and provides insight into phonological representations. A lexical decision task (LDT) requires participants to decide whether a string of letters or sounds constitutes a real word (cucumber) or not (nutolon). LDTs reveal information about the organization of and access to the mental lexicon via a variety of experimental manipulations that tap domains such as orthography, phonology, morphology, and semantics. At the sentence level, a widely used decision task elicits acceptability judgments, where participants might decide whether a sentence is grammatically well-formed, as in (1), or perhaps whether a sentence makes semantic sense, as in (2).

1. a. The winner of the big trophy has proud parents.
   b. The winner of the big trophy †have proud parents. (Tanner & Van Hell, 2014)
2. a. Kaitlyn traveled across the ocean in a plane to attend the conference.
   b. Kaitlyn traveled across the ocean in a †cactus to attend the conference. (Grey & Van Hell, 2017)

Acceptability judgment tasks are popular in second language acquisition research because researchers can design materials to assess knowledge of specific target structures, such as word order (Tagarelli, Ruiz, Moreno Vega, & Rebuschat, 2016) or grammatical gender agreement (Grey, Cox, Serafini, & Sanz, 2015). Additionally, acceptability judgments that impose a time limit on the decision have been shown to reliably tap implicit linguistic knowledge, while those that are untimed tap explicit knowledge (Ellis, 2005).

A limitation of decision tasks is that they require participants to make an explicit and oftentimes unnatural evaluation about the target content: “Are these two sounds the same or different? Is this sentence correct?” This may fundamentally change the way the given linguistic information is processed, which has consequences regarding the extent to which the inferred psychological representations can be said to underlie naturalistic language processing. However, this limitation does not outweigh the benefits: decision tasks are easy and efficient to administer and most are supported by cross-validation through decades of research. Also, in addition to using decision tasks to gather data on a target variable (e.g., accuracy in LDT), researchers can use them to keep participants attentive during an experiment or to mask the goals of the study while other less explicit measures, such as eye movements, are gathered (see Sample Study 14.1).



Latency Measures

Questions about language processing are often associated with the notions of “time” or “efficiency,” and as such many psycholinguistic tasks include a measure of latency: the time between a stimulus and a response. Perhaps the most common psycholinguistic measure of latency is reaction time (RT): a measure of the time it takes, in milliseconds, to respond to an external stimulus, usually via button press (or by speech onset in naming tasks). In applied linguistics, RTs gained early popularity in studies of L2 automaticity (Segalowitz & Segalowitz, 1993) and are increasingly employed to study L2 development, for example, in study abroad research (Sunderman & Kroll, 2009) and in testing different pedagogical techniques (Stafford, Bowden, & Sanz, 2012). RT in fact has been critical in clarifying the efficacy of different pedagogical techniques, since latency differences can be observed in the absence of accuracy differences (e.g., Robinson, 1997), thereby revealing finer-grained details about language learning than can be captured with accuracy alone (for a review, see Sanz & Grey, 2015).

RTs are used for button press and similar responses, whereas voice onset time (VOT) is a production-based measure of latency that can be used to examine the timing and characteristics of articulation. VOT is the time, in milliseconds, between the onset of vocal fold vibrations and the release of the articulators. VOT exhibits well-identified cross-linguistic patterns (Lisker & Abramson, 1964), making it an attractive metric for studying the psycholinguistic processes underlying articulation in different language groups. For example, VOT has been used to study speech planning in L2 learners (Flege, 1991) and bilingual code-switching (Balukas & Koops, 2015).

A third latency measure is eye-tracking. Eye-tracking data, like VOT, are ecologically appealing because they measure naturally occurring behavior as it unfolds in time, for example, as participants view scenes or read sentences. Eye movements are measured in terms of fixations, saccades (Rayner, 1998), or proportion of looks to regions of interest (ROIs, e.g., a word or picture; Huettig, Rommers, & Meyer, 2011). Fixations refer to the time spent on a location and can be divided into earlier measures, such as first-fixation duration, and later measures, such as total time in an ROI. Saccades refer to quick eye movements from one location to another and provide information on forward movements in reading (forward saccades, which are rightward movements in most languages) as well as returns to a location, or regressive saccades (Rayner, 1998). Such measures have recently been used to study attention in L2 learning (Godfroid, Housen, & Boers, 2013) and the use of subtitles in foreign language films (Bisson, Van Heuven, Conklin, & Tunney, 2012).
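The fixation and saccade measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular eye-tracker or analysis package: the data layout (one `(roi_index, duration_ms)` pair per fixation, in temporal order), the function names and the sample values are all assumptions made for the example, and ROI indices are assumed to increase in reading order.

```python
from typing import List, Tuple

# Hypothetical data layout: one (roi_index, duration_ms) pair per fixation,
# listed in the order the fixations occurred. ROI indices are assumed to
# increase in reading order (e.g., word 0, word 1, ...).
Fixation = Tuple[int, int]


def first_fixation_duration(fixations: List[Fixation], roi: int) -> int:
    """Duration of the first fixation on the ROI (0 if it was never fixated)."""
    for index, duration in fixations:
        if index == roi:
            return duration
    return 0


def total_time(fixations: List[Fixation], roi: int) -> int:
    """Summed duration of every fixation on the ROI, including re-reading."""
    return sum(duration for index, duration in fixations if index == roi)


def count_saccades(fixations: List[Fixation]) -> Tuple[int, int]:
    """Return (forward, regressive) counts of movements between ROIs."""
    forward = regressive = 0
    for (previous, _), (current, _) in zip(fixations, fixations[1:]):
        if current > previous:
            forward += 1
        elif current < previous:
            regressive += 1
    return forward, regressive


# Invented example: the reader returns once from word 2 to word 1.
sample_fixations = [(0, 200), (1, 250), (2, 180), (1, 300), (3, 220)]
print(first_fixation_duration(sample_fixations, 1))  # early measure
print(total_time(sample_fixations, 1))               # later measure
print(count_saccades(sample_fixations))
```

In this toy trial the early and late measures for word 1 differ because the word is fixated a second time after a regressive saccade, which is the kind of contrast the two measures are meant to separate.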



Fig. 14.1 Sample visual world. Note: In this example, “cat” is the target, “caterpillar” is an onset competitor, “bat” is a rhyme competitor, and “hedgehog” is an unrelated distractor. Images are from the Multipic database (Duñabeitia et al., 2017)

Proportion of looks to a target ROI (compared to competitors) is another metric for eye-movement data and is often used to measure language processing in visual world paradigms, where both linguistic and visual information are presented (Fig. 14.1; Sample Study 14.1).

A more recent latency measure is mouse-tracking, which records the trajectory of the computer mouse on the screen by sampling the movement dozens of times per second (Freeman & Ambady, 2010). This technique allows researchers to examine competition and selection between multiple response options with detailed latency information. The continuous trajectory of the mouse as it moves toward a target (see Fig. 14.2) is the core metric, which is often analyzed by calculating its maximum deviation (MD) or area under the curve (AUC). MD refers to the largest perpendicular deviation between the ideal mouse trajectory (straight line to the target) and the observed trajectory, and AUC is the area between the ideal and observed trajectories. Although mouse-tracking methodology is rather new, it is gaining momentum in language research and, unlike many types of psycholinguistic testing software, mouse-tracking software for experimental research is freely available (MouseTracker, Freeman & Ambady, 2010), which makes it especially appealing. Of interest to applied linguistics, mouse-tracking has recently been employed to study pragmatic intent (Roche, Peters, & Dale, 2015), syntactic transfer in bilinguals (Morett & MacWhinney, 2013), and lexical competition in monolinguals and bilinguals (Bartolotti & Marian, 2012).

Fig. 14.2 Sample data from mouse-tracking language experiment. Note: The black line represents a competitor trajectory; the gray line represents a target trajectory. Images are from the Multipic database (Duñabeitia et al., 2017)
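To make the two trajectory metrics concrete, the following Python sketch computes MD and AUC from a list of sampled cursor positions. It is an illustration under stated assumptions rather than a description of how MouseTracker itself computes them: the ideal trajectory is taken to be the straight line from the first to the last sample, coordinates are used as given (no space or time normalization), and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def perpendicular_deviations(trajectory: List[Point]) -> List[float]:
    """Signed perpendicular distance of each sample from the ideal straight
    line joining the first and last samples of the trajectory."""
    (x0, y0), (x1, y1) = trajectory[0], trajectory[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("start and end points coincide")
    return [((x - x0) * (y1 - y0) - (y - y0) * (x1 - x0)) / length
            for x, y in trajectory]


def maximum_deviation(trajectory: List[Point]) -> float:
    """MD: largest absolute perpendicular deviation from the ideal line."""
    return max(abs(d) for d in perpendicular_deviations(trajectory))


def area_under_curve(trajectory: List[Point]) -> float:
    """AUC: area enclosed by the observed path and the ideal line.

    Uses the shoelace formula on the polygon formed by the samples; the
    closing edge from the last sample back to the first is the ideal line.
    If the path crosses the ideal line, areas on opposite sides offset each
    other, so this is a net area.
    """
    twice_area = 0.0
    n = len(trajectory)
    for i in range(n):
        xa, ya = trajectory[i]
        xb, yb = trajectory[(i + 1) % n]
        twice_area += xa * yb - xb * ya
    return abs(twice_area) / 2


# Invented example: the cursor moves straight up, then across to the target.
sample_trajectory = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(maximum_deviation(sample_trajectory))
print(area_under_curve(sample_trajectory))
```

A trajectory that is pulled toward a competitor before settling on the target would yield larger MD and AUC values than a direct path, which is why both are read as indices of competition between response options.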

Language Production

Using measures such as VOT, psycholinguistic methods are fruitful in revealing insights about production. One well-studied production task is picture naming, whereby participants are presented with pictures and asked to name them aloud. This task reveals information on lexical access and the organization of the mental lexicon more generally and has been employed across many different populations and languages (Bates et al., 2003). Researchers usually measure naming accuracy and can gather naming latency, as well as VOT. This makes picture naming versatile in the information it provides; it has been used to test questions with implications for applied linguistics, including those pertaining to the effects of semantics on L2 word learning (Finkbeiner & Nicol, 2003) as well as conceptual/lexical representations during lexical access (for a review, see Kroll, Van Hell, Tokowicz, & Green, 2010).

Elicited imitation (EI) is another production task that is impressively simple. Participants listen to a sentence and are asked to repeat it verbatim. The premise of EI is that if participants can repeat a sentence quickly and precisely, they possess the grammatical knowledge contained in that sentence. Under this premise, EI has been established as a reliable measure of language proficiency, both globally (Wu & Ortega, 2013) and on specific linguistic structures (Rassaei, Moinzadeh, & Youhannaee, 2012). Additionally, EI can assess implicit linguistic knowledge (Ellis, 2005; Spada, Shiu, & Tomita, 2015), the attainment of which is a central research area in applied linguistics. EI has also been the subject of increased methodological rigor, spurring a recent meta-analysis of the technique (Yan, Maeda, Lv, & Ginther, 2016) and the extension of EI to assess proficiency in heritage languages (Bowden, 2016). The main critique of EI is in fact its main design element. Because participants are repeating predetermined sentences outside of a larger discourse context, the language produced is not representative of naturalistic speech. This critique is stronger for designs that include ungrammatical sentences (e.g., Erlam, 2006), which are quite common in L2 research, but by their very nature do not represent genuine language.

The study of speech disfluencies offers a more naturalistic psycholinguistic perspective. Disfluencies occur very frequently in natural language production (Fox Tree, 1995) and include editing terms (e.g., uh and um), pauses, repetitions, restarts, and repairs. They have long been used as a window into the effects of cognitive load on speech planning and production. For example, disfluencies are more frequent at the beginning of utterances, preceding longer utterances, and for unfamiliar conversational topics (Bortfeld, Leon, Bloom, Schober, & Brennan, 2001; Oviatt, 1995), where cognitive demand is relatively high. An increasing amount of research examines the effects of speaker disfluencies on language comprehension. Interestingly, these effects appear to be facilitative. For example, disfluencies speed up word recognition (Corley & Hartsuiker, 2011) and serve as cues to new information in discourse (Arnold, Fagnano, & Tanenhaus, 2003), though this may depend on speaker-specific factors such as native versus non-native speech (Bosker, Quené, Sanders, & de Jong, 2014). Growing psycholinguistic interest in disfluencies is promising for future research, in part because they represent naturalistic phenomena with implications for comprehension and production.



Event-Related Potentials

In recent years, psycholinguistic interests have expanded to consider questions on how the human brain acquires, represents, and processes language. One approach for considering such questions is the event-related potential (ERP) technique. Many of the methods discussed above can be coupled with this technique, such as acceptability judgments (Tanner & Van Hell, 2014) and LDTs (Barber, Otten, Kousta, & Vigliocco, 2013). Compared to other neuroimaging techniques like PET, MEG, and MRI, ERP equipment is generally much less expensive, and there are even excellent lower-cost ERP equipment options on the market (e.g., actiCHamp from Brain Products, Germany). As such, ERPs provide applied linguists with a cost-effective method for applying neuroscience technology to timely issues in applied linguistics research (for a review, see Morgan-Short, 2014).

ERPs are derived through recording naturally occurring electroencephalogram data, which consist of changes in the brain’s electrical activity measured from electrodes on the scalp. Using ERPs, researchers investigate, with millisecond precision, neurocognitive processes that are elicited in response to a time-locked event (e.g., the onset of the words “plane” or “cactus” in (2) above; see Luck, 2014, for detailed information). ERP language studies often use violation paradigms, which are characterized by measuring the neural activations of correct/standard stimuli (e.g., 1a, 2a above; or “cucumber” in LDTs) compared to matched “violation” stimuli (e.g., 1b, 2b; “nutolon” in LDTs). A key ecological advantage of ERPs is that the ERPs themselves serve as automatic, implicit responses to such stimuli, so researchers do not necessarily have to elicit an explicit decision from their subjects, which, as discussed above, could result in less naturalistic language processing.

ERP language research has revealed a set of well-studied neural responses, or ERP effects, that are considered to reflect different neurocognitive processes. Table 14.1 summarizes some of these effects and Fig. 14.3 illustrates a sample ERP pattern for semantic processing. In the last 15 years, ERPs have become popular for applied linguistics interests (Morgan-Short, 2014), with studies testing the role of L1/L2 proficiency (Newman, Tremblay, Nichols, Neville, & Ullman, 2012), different L2 training conditions (Batterink & Neville, 2013), cross-linguistic transfer (Gillon Dowens, Guo, Guo, Barber, & Carreiras, 2011), and prediction during L2 reading (Martin et al., 2013), among others. Notably, changes in ERP responses as a result of language development have been observed even when proficiency changes are not apparent at the behavioral level (McLaughlin, Osterhout, & Kim, 2004).
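The logic of a violation paradigm can be illustrated with a short Python sketch: epochs time-locked to the critical word are averaged within each condition, and the correct-condition average is subtracted from the violation-condition average to give a difference wave. This is a deliberately simplified sketch, assuming single-channel epochs that are already segmented, baseline-corrected and artifact-free; the function names and the toy voltages are invented for the example and do not come from any ERP toolbox.

```python
from typing import List

# One epoch = voltage samples (in microvolts) for a single trial, time-locked
# to the onset of the critical word. All epochs are assumed to be the same
# length and already baseline-corrected.
Epoch = List[float]


def average_erp(epochs: List[Epoch]) -> List[float]:
    """Point-by-point average across trials, which yields the ERP waveform."""
    if not epochs:
        raise ValueError("at least one epoch is required")
    n_samples = len(epochs[0])
    if any(len(epoch) != n_samples for epoch in epochs):
        raise ValueError("all epochs must have the same number of samples")
    return [sum(samples) / len(epochs) for samples in zip(*epochs)]


def difference_wave(violation: List[float], correct: List[float]) -> List[float]:
    """Violation-minus-correct waveform, the usual way an ERP effect is shown."""
    return [v - c for v, c in zip(violation, correct)]


# Invented toy data: two trials per condition, three samples per trial.
correct_trials = [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]]
violation_trials = [[0.0, 0.0, 0.0], [2.0, 0.0, -2.0]]

correct_erp = average_erp(correct_trials)
violation_erp = average_erp(violation_trials)
print(difference_wave(violation_erp, correct_erp))
```

Averaging over many trials is what allows the small event-related signal to emerge from the ongoing electroencephalogram; real studies average over dozens of trials per condition and many electrodes rather than the two trials shown here.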



Table 14.1 Common language-related ERP effects

ERP effect: N400
Language domain: Semantics
Sample stimuli: They wanted to make the hotel look more like a tropical resort, so along the driveway they planted rows of palms / †pines / ††tulips (Federmeier & Kutas, 1999)
^Representative study: McLaughlin et al. (2004)

ERP effect: P600
Language domain: (Morpho)syntax
Sample stimuli: The man in the restaurant doesn’t like the hamburgers that are on his plate. / The man in the restaurant doesn’t like the hamburger that †are on his plate. (Kaan & Swaab, 2003)
^Representative study: Bowden, Steinhauer, Sanz, and Ullman (2013)

ERP effect: AN¹
Language domain: (Morpho)syntax
Sample stimuli: The scientists criticized Max’s proof of the theorem. / The scientists criticized Max’s †of proof the theorem. (Neville, Nicol, Barss, Forster, & Garrett, 1991)
^Representative study: Batterink and Neville (2013)

ERP effect: PMN²
Language domain: Pre-lexical phonetics/phonology
Sample stimuli: snap–nap / snap–†tap (Newman & Connolly, 2009)
^Representative study: Goslin, Duffy, and Floccia (2012)

ERP effect: Frontal positivity
Language domain: Lexical prediction
Sample stimuli: The bakery did not accept credit cards so Peter would have to write a check / †an apology to the owner (DeLong, Groppe, Urbach, & Kutas, 2012)
^Representative study: Martin et al. (2013)

†Represents violation/unexpected/mismatch item. ^Related to interests in applied linguistics.
¹ANs (anterior negativities) have also been referred to as LANs (left anterior negativities). When present they generally appear as a biphasic AN-P600 response.
²PMN, phonological mapping negativity

ERPs represent a highly sensitive temporal measure of neural processing, which is important for understanding language. They do not, however, reflect the spatial location within the brain that gives rise to the observed effects. That is, observing an ERP effect in posterior (parietal) scalp locations does not mean the activity originated in the parietal cortex. For this level of spatial detail, researchers use fMRI or MEG, though their high costs and lower temporal resolution make these techniques less common in psycholinguistics.



Fig. 14.3 Sample ERP waves and scalp topography maps of the standard ERP correlate of semantic processing (N400). Note: Each tick mark on the x-axis represents 100 ms; the y-axis represents voltage in microvolts, ±3 μV; negative is plotted up. The black line represents brain activity to correct items, such as plane in example 2a. The blue line represents brain activity to a semantic anomaly, such as cactus in example 2b. The topographic scalp maps show the distribution of activity in the anomaly minus correct conditions with a calibration scale of ±4 μV. From data reported in Grey and Van Hell (2017)

The use of decision tasks, latency measures, production measures, and brain wave recordings reveals information about the cognitive bases of language and is fundamental to psycholinguistic research. By employing these measures, the psycholinguistic paradigms, or standard experimental designs, described in the following section allow researchers to understand language learning and processing in a way that complements and expands on traditional methods in applied linguistics.

Psycholinguistic Paradigms

Over the last few decades, several reliable psycholinguistic paradigms have been established to investigate the mental underpinnings of language. This section reviews common experimental approaches that are relevant for interests in applied linguistics: priming, visual world, and language learning paradigms.

Priming

Priming refers to the cognitive process whereby exposure to some language (the “prime”) influences processing of subsequently presented language (the “target”). This effect can be facilitative, wherein processing of the target speeds up as a result of the prime, or inhibitory, wherein processing of the target slows down. Priming effects generally reflect implicit processes: individuals are sensitive to language details without being aware of how such sensitivities influence subsequent language processing (McDonough & Trofimovich, 2009). In a standard priming paradigm, the participant is briefly (for …

… 0.30) are also needed for EFA to be appropriate. Researchers need


A. Phakiti

to examine a correlation matrix among items before performing EFA (a correlation matrix can be generated as part of EFA in SPSS).

7. Outliers based on zero or near-zero R² (shared variance). It is likely that some items will not be correlated at all with others. For EFA, an item that has a near-zero R² with other items is considered an outlier and should not be included in EFA.
8. Outliers based on perfect or near-perfect R². While there is a need for a number of sizable correlations, researchers do not want to have many perfect or near-perfect correlations in the dataset. An R² that is 1 or close to 1 signifies singularity or multicollinearity in the dataset. Such items are also considered outliers, and when found, they should be excluded from subsequent EFA.
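The two outlier checks above can be sketched outside SPSS as a simple screening routine. The Python sketch below is illustrative only: it computes the squared Pearson correlation (shared variance) between every pair of items and labels each item by its largest R² with any other item. The cut-offs `low=0.05` and `high=0.90`, the function names and the toy data are assumptions chosen for the example; the chapter does not prescribe numeric thresholds for “near-zero” or “near-perfect”.

```python
import math
from typing import Dict, List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_items(items: Dict[str, List[float]],
                 low: float = 0.05, high: float = 0.90) -> Dict[str, str]:
    """Label each item by its largest shared variance with any other item.

    `low` and `high` are illustrative cut-offs, not values from the chapter.
    """
    if len(items) < 2:
        raise ValueError("at least two items are required")
    flags = {}
    for name, scores in items.items():
        shared = [pearson_r(scores, other) ** 2
                  for other_name, other in items.items() if other_name != name]
        if max(shared) >= high:
            flags[name] = "near-perfect"
        elif max(shared) <= low:
            flags[name] = "near-zero"
        else:
            flags[name] = "ok"
    return flags


# Invented toy data: item_b duplicates item_a on a different scale,
# and item_c is essentially unrelated to the other items.
toy_items = {
    "item_a": [1, 2, 3, 4, 5],
    "item_b": [2, 4, 6, 8, 10],
    "item_c": [1, 2, 5, 2, 1],
    "item_d": [1, 3, 2, 5, 4],
}
print(screen_items(toy_items))
```

Flagged items are candidates for closer inspection rather than automatic deletion; whether to drop them remains a judgment for the researcher.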

Essential EFA Steps

Although researchers can instruct a programme such as SPSS to perform EFA in a single step (e.g., by specifying extraction and rotation methods at the outset), this chapter emphasises the importance of conducting EFA one step at a time because the early detection of problematic items or choices of extraction methods can prevent difficulties in interpreting common factor structures at a later stage, thereby promoting the validity of EFA. This is a parsimonious approach to EFA. Figure 20.2 presents 12 essential steps in EFA. While this diagram appears sequential, the actual steps are largely iterative. To illustrate how EFA can be performed using SPSS and following these steps, an unpublished dataset (available upon request for the purpose of practice only) from 275 international students who answered a 37-item, Likert-type scale questionnaire (1 = not at all true of me, 2 = not true of me, 3 = somewhat true of me, 4 = true of me and 5 = very true of me) about their strategy use in academic lectures is used. Note that Item 32 was reverse coded. Figure 20.3 shows a screenshot of part of this dataset.
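Reverse coding, as applied to Item 32, flips a response around the midpoint of the scale so that a high score carries the same meaning on every item. A minimal Python sketch for the 1–5 scale used here is given below; the function name is hypothetical and the snippet is only meant to show the arithmetic, which in SPSS would normally be done through a recode or compute step.

```python
def reverse_code(score: int, scale_min: int = 1, scale_max: int = 5) -> int:
    """Reverse a Likert-type response: on a 1-5 scale, 1 becomes 5, 2 becomes 4."""
    if not scale_min <= score <= scale_max:
        raise ValueError(f"score {score} is outside {scale_min}-{scale_max}")
    return scale_max + scale_min - score


# Invented responses to a negatively worded item.
raw_responses = [1, 4, 3, 5, 2]
print([reverse_code(r) for r in raw_responses])
```

The range check doubles as a simple data-entry check, since a value outside 1 to 5 raises an error instead of being silently recoded.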

Step 1: Entering, Checking and Cleaning Data

This step is common to all quantitative analyses. Unless data are obtained online, researchers first enter data into an SPSS spreadsheet and examine whether there are missing data or errors in the data entry. Odd values, for example, affect the variances used in EFA and can result in a different

  Exploratory Factor Analysis 


factor structure. Participants with missing values are not used in EFA (see Step 7), which will reduce the sample size. Missing data are common in quantitative research, and there are several methods that have been developed to deal with missing data, such as mean substitution, regression and multiple imputation (see Fig. 20.17). See Brown (2015), Fabrigar and Wegener (2012) and Osborne (2014) for further discussion of how to deal with missing data in EFA (Fig. 20.4).

Fig. 20.2 12 essential steps in EFA
Step 1: Entering and cleaning data
Step 2: Examining item-level descriptive statistics
Step 3: Checking an overall reliability coefficient
Step 4: Examining a correlation matrix and some initial EFA statistical requirements
Step 5: Choosing an extraction method
Step 6: Determining the number of factors
Step 7: Choosing a rotation method
Step 8: Re-run EFAs until a conclusion about factors has been reached.
Step 9: Computing a reliability coefficient for each factor
Step 10: Naming a factor
Step 11: Forming factor scores or composite scores
Step 12: Performing subsequent analysis to address research questions

Fig. 20.3 Screenshot of the strategy use in lectures data

Fig. 20.4 Descriptive statistics options in SPSS

Step 2: Examining Item-Level Descriptive Statistics

In research articles, authors may decide not to report item-level descriptive statistics due to word limits. However, this does not mean that researchers do not need to examine the item-level descriptive statistics, which include the mean, median, mode, standard deviation as well as skewness and kurtosis statistics. Item-level descriptive statistics are essential as they allow researchers and readers to understand the nature of the dataset prior to EFA. Table 20.1 is an example of descriptive statistics for items one to five. The data were approximately normally distributed, as the skewness and kurtosis statistics fell within the limits of ±1.
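For readers who want to see what lies behind the skewness and kurtosis columns, the Python sketch below implements the bias-adjusted sample estimators and their standard errors that are widely documented for statistical packages. That SPSS uses exactly these formulas is an assumption here, although the standard-error functions do reproduce the values reported in Table 20.1 for N = 275 (0.147 and 0.293). The function names are hypothetical.

```python
import math
import statistics
from typing import Sequence


def skewness(values: Sequence[float]) -> float:
    """Bias-adjusted sample skewness (requires at least 3 observations)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    z_cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * z_cubed


def excess_kurtosis(values: Sequence[float]) -> float:
    """Bias-adjusted excess kurtosis (0 for a normal distribution; n >= 4)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    z_fourth = sum(((v - mean) / sd) ** 4 for v in values)
    return (n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3)) * z_fourth \
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def se_skewness(n: int) -> float:
    """Standard error of skewness; depends only on the sample size."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n: int) -> float:
    """Standard error of kurtosis; depends only on the sample size."""
    return 2 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


print(round(se_skewness(275), 3), round(se_kurtosis(275), 3))
print(skewness([1, 2, 3, 4, 5]), excess_kurtosis([1, 2, 3, 4, 5]))
```

Because the standard errors depend only on N, they are identical for all five items in Table 20.1, which is a quick way to confirm that no cases were dropped for individual items.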

Table 20.1 Descriptive statistics of items one to five

Statistic                 Item 1    Item 2    Item 3    Item 4    Item 5
N (valid)                 275       275       275       275       275
N (missing)               0         0         0         0         0
Mean                      3.0036    2.8800    2.5491    2.4800    2.7127
Median                    3.0000    3.0000    3.0000    2.0000    3.0000
Mode                      3.00      3.00      3.00      2.00      3.00
Std. deviation            0.94559   0.97606   0.89624   0.88507   0.92473
Skewness                  −0.242    −0.137    0.219     0.045     −0.040
Std. error of skewness    0.147     0.147     0.147     0.147     0.147
Kurtosis                  −0.263    −0.428    −0.110    −0.709    −0.476
Std. error of kurtosis    0.293     0.293     0.293     0.293     0.293
Minimum                   1.00      1.00      1.00      1.00      1.00
Maximum                   5.00      5.00      5.00      4.00      5.00

Step 3: Checking an Overall Reliability Coefficient

After examining the descriptive statistics, researchers perform reliability analysis. In SPSS, this can be performed in the “Scale” sub-menu under “Analyze.” Cronbach’s alpha is commonly used for test and Likert-type scale data. If the overall Cronbach’s alpha is lower than 0.70, researchers need to decide whether to continue with an EFA approach. In SPSS, it is useful to choose the option “Scale if item deleted” in reliability analysis, so that some items that negatively affect the reliability estimates may be identified and excluded from EFA. Table 20.2 presents the overall Cronbach’s alpha of the questionnaire, which was 0.87.

Table 20.2 Cronbach’s alpha coefficient of the questionnaire (reliability statistics: Cronbach’s alpha; N of items)
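The coefficient that SPSS reports in this step can be reproduced with the standard formula for Cronbach’s alpha, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The Python sketch below is a minimal illustration under the assumption of complete data (no missing responses); the function name and the five invented respondents are not from the chapter’s dataset.

```python
import statistics
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores.

    Each inner sequence holds one respondent's answers to all k items.
    Assumes complete data and at least two items and two respondents.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Variance of each item (column) across respondents.
    item_variances = [statistics.variance(column) for column in zip(*responses)]
    # Variance of respondents' total scores.
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented example: five respondents answering three items on a 1-5 scale.
sample_responses = [
    (2, 3, 3),
    (4, 4, 5),
    (3, 2, 4),
    (5, 5, 4),
    (1, 2, 2),
]
print(round(cronbach_alpha(sample_responses), 3))
```

Re-running the function with one item column removed at a time mimics the information given by the “Scale if item deleted” option: an item whose removal raises alpha is lowering the scale’s internal consistency.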

Step 4: Examining a Correlation Matrix and Some Initial EFA Statistical Requirements

This step is related to points 2, 8, 9 and 10 in the assumptions discussed above. This step can be performed from the factor analysis menu in SPSS. To start performing EFA, go to “Analyze,” “Dimension Reduction” and then “Factor” (see Fig. 20.5). Figure 20.6 is the main dialog box for EFA. Here, move items 1 to 37 to the Variables pane on the right. In Fig. 20.6, there are sub-menu options that allow researchers to perform various stages of EFA. The Descriptives option is relevant to this step. Click “Descriptives” and a new dialog box will appear (see Fig. 20.7). In this option, tick the following boxes: initial solution, coefficients and Kaiser-Meyer-Olkin (KMO) and Bartlett’s test of sphericity. Then click “Continue” and “OK.” More options can be chosen (see e.g., Field, 2013), but for this chapter, this selection is sufficient. The descriptive statistics option is not selected because it only reports the means, standard deviations and the number of cases (which is not as comprehensive as in Step 2).

In this step, we first check whether the correlation coefficients in the matrix are reasonable and second check whether there are perfect correlations or several zero or near-zero correlations (refer to points 9 and 10 in the assumptions section). Due to limitations of space, this matrix is not presented here. A careful inspection suggests that several correlations were larger than 0.30, and there were no perfect correlations or near-zero correlations of a particular item to any other items, indicating that factor analysis is plausible.

Table 20.3 presents the results of the KMO and Bartlett’s test. The Kaiser-Meyer-Olkin measure is used to indicate data sampling adequacy and this statistic should be larger than 0.60 (Fabrigar & Wegener, 2012; Field, 2013). In this dataset, it was 0.84. The Bartlett’s test of sphericity is used to determine whether the dataset is factorable, and this statistic should be significant at the 0.05 level (Fabrigar & Wegener, 2012; Osborne, 2014), and in this output, it should be read as p < 0.001.

Fig. 20.5 EFA in SPSS

Fig. 20.6 Factor analysis menu

Fig. 20.7 SPSS Descriptives dialog box

Table 20.3 KMO and Bartlett’s test based on 37 items

KMO and Bartlett’s test
Kaiser-Meyer-Olkin measure of sampling adequacy          0.837
Bartlett’s test of sphericity    Approx. chi-square      3790.175
                                 df                      666.000
                                 Sig.                    0.000
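The degrees of freedom in Table 20.3 follow directly from the number of items: Bartlett’s test compares the observed correlation matrix with an identity matrix, so df = p(p − 1)/2, which is 666 for p = 37. The Python sketch below shows the textbook form of the statistic, χ² = −(n − 1 − (2p + 5)/6) × ln|R|, where |R| is the determinant of the correlation matrix and n is the number of cases. It is an illustration with hypothetical function names, not SPSS’s own routine, and it stops short of computing the p value, which requires a chi-square distribution function not available in the Python standard library.

```python
import math
from typing import List, Sequence, Tuple


def determinant(matrix: Sequence[Sequence[float]]) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    a: List[List[float]] = [[float(value) for value in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def bartlett_sphericity(corr: Sequence[Sequence[float]],
                        n_cases: int) -> Tuple[float, int]:
    """Return (chi-square, df) for Bartlett's test of sphericity.

    Raises ValueError if the correlation matrix is singular (|R| = 0),
    which is itself a sign of the multicollinearity discussed earlier.
    """
    p = len(corr)
    chi_square = -(n_cases - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df


# Degrees of freedom for the 37-item questionnaire, using an identity matrix
# (no correlations at all) so that the chi-square itself is zero.
identity_37 = [[1.0 if i == j else 0.0 for j in range(37)] for i in range(37)]
print(bartlett_sphericity(identity_37, 275))

# Invented two-item example with r = 0.5 and 100 cases.
print(bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], 100))
```

The identity-matrix case makes the logic of the test visible: when items are entirely uncorrelated the determinant is 1, its logarithm is 0 and the statistic vanishes, whereas sizable correlations shrink the determinant and inflate the chi-square, as in Table 20.3.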
