Enhancing access to reports of randomized trials published world-wide--the contribution of EMBASE records to the Cochrane Central Register of Controlled Trials (CENTRAL) in The Cochrane Library

Emerg Themes Epidemiol. 2008 Sep 30;5:13. doi: 10.1186/1742-7622-5-13.

Abstract

Background: Randomized trials are essential in assessing the effects of healthcare interventions and are a key component in systematic reviews of effectiveness. Searching for reports of randomized trials in databases is problematic due to the absence of appropriate indexing terms until the 1990s and inconsistent application of these indexing terms thereafter.

Objectives: To devise a search strategy for identifying reports of randomized trials in EMBASE that are not already indexed as trials in MEDLINE, and to make these reports easily accessible by including them in the Cochrane Central Register of Controlled Trials (CENTRAL) in The Cochrane Library, with the permission of Elsevier, the publishers of EMBASE.

Methods: A highly sensitive search strategy was designed for EMBASE based on free-text and thesaurus terms that occurred frequently in the titles, abstracts, EMTREE terms (or some combination of these) of reports of trials indexed in EMBASE. This search strategy was run against EMBASE from 1980 to 2005 (1974 to 2005 for four of the terms). Records retrieved by the search that were not already indexed as randomized trials in MEDLINE were downloaded from EMBASE, printed and read. An analysis of the language of publication was conducted for the reports of trials published in 2005 (the most recent year completed at the time of this study).

Results: Twenty-two search terms were used (including nine that were later rejected due to poor cumulative precision). More than a third of a million records were downloaded and scanned, and approximately 80,000 reports of trials were identified that were not already indexed as randomized trials in MEDLINE. These are now easily identifiable in CENTRAL, in The Cochrane Library. Cumulative sensitivity ranged from 0.1% to 60% and cumulative precision ranged from 8% to 61%. The truncated term 'random$' identified 60% of the total number of reports of trials, but only 35% of the more than 130,000 records retrieved by this term were reports of trials. The language analysis for the sample year 2005 indicated that of the 18,427 reports indexed as randomized trials in MEDLINE, 959 (5%) were in languages other than English. The EMBASE search identified an additional 658 reports in languages other than English, the largest number of which were in Chinese (320).
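
As an illustration of how per-term figures of this kind could be computed, the following is a minimal sketch, not the authors' procedure. It assumes that "cumulative" means each term is credited only with records not already retrieved by an earlier term, that precision is the share of those new records that are trial reports, and that sensitivity is the share of all identified trial reports contributed by the term. The function name, the record identifiers and the second term are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Row = Tuple[str, int, int, float, float]


def incremental_yield(
    term_hits: List[Tuple[str, Iterable[int]]],
    trial_ids: Set[int],
) -> List[Row]:
    """For each term, in order, report the records it adds beyond earlier terms.

    Returns rows of (term, new_records, new_trials, precision, sensitivity).
    Assumption: precision = new_trials / new_records and
    sensitivity = new_trials / total trial reports.
    """
    seen: Set[int] = set()
    total_trials = len(trial_ids)
    rows: List[Row] = []
    for term, hits in term_hits:
        # Credit the term only with records no earlier term retrieved.
        new_records = set(hits) - seen
        new_trials = new_records & trial_ids
        precision = len(new_trials) / len(new_records) if new_records else 0.0
        sensitivity = len(new_trials) / total_trials if total_trials else 0.0
        rows.append((term, len(new_records), len(new_trials), precision, sensitivity))
        seen |= new_records
    return rows


# Toy data (hypothetical): record ids 1-5 are the trial reports.
example_trials = {1, 2, 3, 4, 5}
example_hits = [
    ("random$", [1, 2, 3, 6, 7, 8]),
    ("second_term", [3, 4, 9, 10]),  # hypothetical term; record 3 is already seen
]

for row in incremental_yield(example_hits, example_trials):
    print(row)
```

Under these assumptions, a term applied late in the sequence can show low sensitivity even if it retrieves many trial reports, because most of them have already been credited to earlier terms.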

Conclusion: The results of the search to date have greatly increased access to reports of trials in EMBASE, especially in some languages other than English. The search strategy used was subjectively derived from a small 'gold standard' set of test records and was not validated in an independent test set. We intend to design an objectively derived, validated search strategy using logistic regression, based on the frequency of occurrence of terms in the approximately 80,000 reports of randomized trials identified compared with the frequency of these terms across the entire EMBASE database.