  • Kyle Andrew Fitzgerald, Cape Peninsula University of Technology
  • Jane Alice Fitzgerald, Cape Peninsula University of Technology
  • Andrew John Bytheway, Cape Peninsula University of Technology




Background and Purpose: Researchers who use phrase terms in information retrieval systems still find it a challenge to retrieve documents relevant to just one classification of diabetes. The purpose of this study is to establish, test and evaluate a standard set of phrase terms within a text collection.

Methods: A collection of 52 phrase terms was derived from the literature and used to create nine queries, each relating to a diabetes classification. A specificity information retrieval system was used to judge and retrieve documents, both relevant and non-relevant to the queries. Results were analysed using document frequency, to measure research interest, and collection frequency, to measure phrase term usage.
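
The two measures named above can be illustrated with a minimal sketch. It assumes the usual definitions: document frequency is the number of documents containing a phrase term at least once, and collection frequency is the total number of occurrences of that term across the collection. The function name `phrase_frequencies`, the sample texts and the case-insensitive whole-word matching are assumptions made for illustration; they are not the retrieval system used in the study.

```python
import re
from collections import Counter


def phrase_frequencies(documents, phrases):
    """Return (document_frequency, collection_frequency) Counters.

    Hypothetical helper: matching is case-insensitive and a phrase must
    not be embedded inside a longer word.
    """
    document_frequency = Counter()
    collection_frequency = Counter()
    for text in documents:
        lowered = text.lower()
        for phrase in phrases:
            # Word-boundary guards so 'diabetes' does not match inside 'prediabetes'.
            pattern = r"(?<!\w)" + re.escape(phrase.lower()) + r"(?!\w)"
            hits = len(re.findall(pattern, lowered))
            if hits:
                document_frequency[phrase] += 1       # one per containing document
                collection_frequency[phrase] += hits  # every occurrence counts
    return document_frequency, collection_frequency


# Invented sample texts, not data from the study.
sample_documents = [
    "Type 2 diabetes and prediabetes. Type 2 diabetes again.",
    "Prediabetes screening.",
    "No matching phrase here.",
]
sample_phrases = ["type 2 diabetes", "prediabetes"]

df, cf = phrase_frequencies(sample_documents, sample_phrases)
print(dict(df))
print(dict(cf))
```

Note that in this sketch a shorter phrase such as ‘type 2 diabetes’ is also counted inside a longer one such as ‘type 2 diabetes mellitus’; whether the study treated overlapping phrase terms this way is not stated.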

Results: A total of 9,106 documents were retrieved from the collection. Research interest centres on ‘type 2 diabetes mellitus’, ‘type 1 diabetes mellitus’ and ‘gestational diabetes mellitus’, with the classification ‘type 2 diabetes mellitus’ attracting three times more research interest than ‘type 1 diabetes mellitus’. The five most frequently used phrase terms were: ‘type 2 diabetes’, ‘type 1 diabetes’, ‘diabetes mellitus’, ‘type 2 diabetes mellitus’ and ‘prediabetes’.

Conclusions: Document volumes from the conferences declined over the ten years. Most research interest remains in ‘type 2 diabetes mellitus’ and ‘type 1 diabetes mellitus’. Research interest is increasing for ‘prediabetes’ and ‘gestational diabetes mellitus’. Phrase term usage tends to increase when research interest is low.


