Making sense of the literature as a PhD student, Part 2: Inclusion criteria, exclusion criteria, and everything in between

This is the second post in a three-part series on literature searching and writing a literature review for a PhD. Some of the advice here may also apply to postgraduate students, or to any researcher undertaking a large-scale research project. The first post in the series was about literature searching. You can find it here.


What happens after the literature search? What do we do after we have retrieved hundreds of articles?

A while back, I talked about literature searching, which is the act of applying systematic search criteria to collect relevant sources for essays, reports, and dissertations. Much has been written elsewhere about performing literature searches and writing literature reviews. In fact, almost every university has a LibGuide page that hosts resources and provides guidance on these topics.

Yet there is one essential transitional step between the literature search and the literature review that is less transparent than the other two and needs further demystifying: the selection of sources for inclusion in the literature review. This step has received less coverage in LibGuides than the search and review steps, even though it is just as important and affects the final outcome just as much. The literature selection step entails the classification, organisation, evaluation, and filtering of retrieved items according to a set of criteria defined by the researcher. General guidance on inclusion and exclusion criteria is widely available – see, for instance, this LibGuide, which has good coverage of common criteria – though it is ultimately up to the researcher to decide what to include or exclude based on the nature and scope of their research.

A rigorous and systematic literature search generates a plethora of high-quality and highly relevant results: journal articles; book chapters; introductory texts; primary and secondary research; theoretical and conceptual texts; policy reports; meta-analyses and systematic reviews; methodological and philosophical texts; opinion pieces and critiques; and perhaps even grey literature. But would it be possible to include them all in the literature review, and would it be feasible to read them all in the limited time we have to complete our project? No! As interesting as they may be, there may simply be too many items to include at once. Were we to include them all, we would risk over-referencing, diluting the message of our literature review, or stretching the scope of the work far beyond what was originally intended. This is why we need to apply inclusion and exclusion criteria to the retrieved items before we can start writing.

What are inclusion and exclusion criteria?

Broadly speaking, inclusion criteria are those criteria that lead to a more in-depth reading of an article and that give certain articles priority for inclusion in the literature review on the basis of relevance and quality. Exclusion criteria are those criteria that signal that it is acceptable not to read an article in depth or include it in the literature review because it is peripheral to the topic being studied (or is just outright irrelevant!). The criteria are applied consistently and objectively to every item being evaluated, so the selection of papers is just as systematic as the literature search that preceded it.

Deciding on inclusion and exclusion criteria is relatively straightforward. All we need to do is decide how we are going to define quality and relevance, and how far our research topic will stretch (that is, what the scope of the research is going to be). Key questions to ask here are: What does a high-quality article look like in our field? What are we studying and not studying? What is relevant and irrelevant to our topic? This way, we introduce some boundaries to guide our interpretation. There is no pressure to get the inclusion and exclusion criteria perfect the first time – it is actually fairly common to go back and revise the criteria in the early stages of the selection process.

Some standard inclusion criteria might be: frequently cited, recent, peer-reviewed, and implements keywords throughout. (“Implements keywords throughout” means that we are not looking for items that mention our central keyword twice in the introduction and then go off talking about something else entirely; the keywords must be central to the narrative or the empirical investigation.) Exclusion criteria might be the exact opposites of the inclusion criteria, or additional exclusion criteria might be added (e.g. only articles in English or articles from our immediate geographical region will be considered, and everything else will be excluded). We just need to make sure that the inclusion and exclusion criteria are justifiable and specific. For instance, why did we choose to include only items covering the period between 2010 and 2020? Did this time range correspond to an important development in the history of the phenomenon, or are we applying this criterion as a way to showcase only the most recent work?
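
For readers who like to see rules written down explicitly, the criteria discussed above could be sketched in a few lines of Python. This is only an illustration: the `Item` and `Criteria` classes, the field names, the 2010 to 2020 window, and the citation threshold are all assumptions made up for the example, not recommended values. The function returns the reasons an item fails, so that every exclusion can be justified later.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One retrieved source; all field names are hypothetical."""
    title: str
    year: int
    peer_reviewed: bool
    language: str
    citations: int
    keywords_central: bool  # our own judgement after skimming the text


@dataclass
class Criteria:
    """Illustrative values only; each one should be justified in the thesis."""
    year_from: int = 2010
    year_to: int = 2020
    min_citations: int = 10
    languages: tuple = ("English",)


def failed_criteria(item: Item, c: Criteria) -> list:
    """Return the names of the criteria an item fails (empty list = include)."""
    failed = []
    if not (c.year_from <= item.year <= c.year_to):
        failed.append("outside date range")
    if not item.peer_reviewed:
        failed.append("not peer-reviewed")
    if item.language not in c.languages:
        failed.append("language")
    if item.citations < c.min_citations:
        failed.append("rarely cited")
    if not item.keywords_central:
        failed.append("keywords not central")
    return failed


# A made-up record, not a real publication.
paper = Item("Example paper A", 2008, True, "English", 42, True)
print(failed_criteria(paper, Criteria()))
```

Because the criteria live in one place, revising them in the early stages of selection means changing a single value and re-running the check on every item.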

The selection procedure goes something like this…

  1. Look for keywords in the title and the abstract.
  2. Skim the article quickly.
  3. Review methodology and findings (or main arguments for non-empirical research).
  4. Decide if the article will be included in the literature review or not.

If it will be included: Read further, highlight, and annotate.

If it will not be included: Extract any relevant cited articles and file away.

If unsure whether to include it or not: Put it in the ‘maybe’ pile and come back to it later!
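
The procedure above can also be pictured as a small sorting routine with three piles. The sketch below is a hypothetical illustration rather than a tool I actually use: the `triage` function, its argument names, and the article titles are invented for the example, and each answer is simply `True`, `False`, or `None` (not yet decided).

```python
from collections import defaultdict


def triage(keywords_in_abstract, relevant_after_skim, sound_on_review):
    """Return 'include', 'exclude', or 'maybe' for one article.

    Each argument is True, False, or None (None = could not decide yet).
    """
    checks = [keywords_in_abstract, relevant_after_skim, sound_on_review]
    if any(check is False for check in checks):
        return "exclude"  # fails a step: extract useful citations, file away
    if all(check is True for check in checks):
        return "include"  # read further, highlight, annotate
    return "maybe"  # undecided: come back to it later


# Made-up judgements for three placeholder articles.
judgements = {
    "Article A": (True, True, True),
    "Article B": (True, False, None),
    "Article C": (True, None, None),
}

piles = defaultdict(list)
for title, answers in judgements.items():
    piles[triage(*answers)].append(title)

print(dict(piles))
```

Keeping ‘maybe’ as its own outcome, rather than forcing a yes or no, mirrors the advice above: an unclear item is set aside and revisited instead of being lost.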

There are many ways you could go about the selection process. Find the one that works for you!

When it comes to the technical side of the process, everyone has their own way of managing their reading and selecting items for inclusion in their literature review. There seem to be two big camps here: the reference manager camp and the MS Office camp.

Reference managers

Reference managers can do so much more than just generate a bibliography! In Mendeley (and most other reference managers), there are sophisticated literature management features such as attaching custom tags to records and searching for keywords within abstracts. Mendeley has a great built-in PDF reader as well, which makes it possible to annotate articles and take notes all together in the same application. Since you can see your paper and notes side by side, there is no need to jump between applications; everything is handled by the same software in a streamlined way. Using reference managers as an end-to-end literature management solution is a really good option for novice researchers as reference managers are quite standardised and easy to use.

Custom templates via MS Office

Another way to manage the process is to use a combination of various MS Office applications to record one’s evaluations and notes. For example, you could save the results from your literature searches in themed folders, and then use Excel spreadsheets or Word headings/tables to record your observations. This method will require more jumping back and forth between applications, and a second monitor may be needed in some cases. Its big benefit, though, is that it offers more flexibility and customisation than reference managers. It allows us to develop our own systems for recording our progress, complete with colour coding and custom categories. The MS Office method is also a familiar and trusty solution which bypasses known problems with reference managers (such as cloud sync problems and potential data loss, to name a few – though you may argue that such problems are not unique to reference managers!).
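
If you prefer to generate such a tracking sheet rather than type it by hand, a plain CSV file (which Excel opens directly) is one option. The following is a minimal sketch under stated assumptions: the column names, the two rows, and the file name mentioned in the comment are hypothetical, and the rows are placeholders rather than real sources.

```python
import csv
import io

# Hypothetical columns; adapt them to your own criteria and categories.
COLUMNS = ["title", "year", "decision", "reason", "notes"]

rows = [
    {"title": "Article A", "year": 2015, "decision": "include",
     "reason": "meets all criteria", "notes": "key source for chapter 2"},
    {"title": "Article B", "year": 2004, "decision": "exclude",
     "reason": "outside date range", "notes": "check its reference list"},
]

# An in-memory buffer keeps the example self-contained; to save a real file,
# swap it for open("selection_log.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)

log_text = buffer.getvalue()
print(log_text)
```

Colour coding and other formatting would still be added in Excel itself; the script only produces the raw rows.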

Here is an example Excel spreadsheet for literature selection:


So far, I have been using a mixture of reference managers and MS Office applications to record the progress of my reading and to decide what literature to include in my literature review. What has worked best for me has been to use Mendeley (rather than Adobe) as my main PDF reader, Zotero as my bibliography generation tool, and Excel/Word for organising my notes and keeping track of everything.

I also found it most manageable and sensible to alternate cycles of reading, selecting, and writing, and to select items for inclusion in my literature review chapter by chapter, rather than to go through all retrieved items in one go and do all of my writing at the very end. This simple ‘Read-Select-Write’ formula has been taking between a month and a month and a half to complete per chapter on average. For example, last month, I did my reading on the broad theme of information literacy and work, then selected papers for inclusion, and drafted the majority of the chapter. This month, I started my reading on the role of information in career development, which will be another chapter in my literature review, and I will yet again be looking to select items for inclusion and to draft the chapter before I move on to a different chapter. In a way, I’m uniting two well-known productivity principles here – “write as you go” and microproductivity (that is, breaking big projects into smaller projects or big tasks into smaller tasks).
