Criteria for Updating a Guideline and Updating a Systematic Review
An update of a systematic review for a guideline topic is an effort to search for and identify new evidence to incorporate into a previously completed systematic review.54 Guidelines can also be changed to correct typographical or content errors. Such changes do not constitute an update because they do not allow for the possibility of new evidence being identified.54
In general, guidelines and the systematic reviews they are based on should be updated as scheduled. An earlier update of the systematic review on a particular guideline topic can be prompted if all of the following conditions are met:
Evidence from surrogate end point trials can prompt an update of a systematic review for a guideline if the criteria outlined by the "Users' Guide for a Surrogate End Point Trial" are met (see Table 13).55 As an example, the Dialysis Patients’ Response to IV Iron with Elevated Ferritin (DRIVE) Study59 did not initiate an update of the systematic review for Guideline 3.2 on Iron Targets because the study examined a surrogate outcome (change in Hb) after short follow-up duration (6 weeks).
Methods Used for this Guideline Update
Process
For this guideline update, the Evidence Review Team (ERT) at Tufts-New England Medical Center in Boston, MA and the Work Group updated the systematic review of RCTs that compared the effect of targeting different Hb levels with ESA treatment. A detailed description of the methods can be found in the methods chapter of the 2006 Anemia guidelines.56 The inclusion criteria were: RCTs in patients with CKD stages 1 to 5, with a minimum of 2-month follow-up duration. Outcomes of interest were all-cause mortality; cardiovascular, cerebrovascular, and peripheral vascular disease; left ventricular hypertrophy; quality of life; hospitalizations; progression of kidney disease; dialysis adequacy; hypertension; transfusions; and seizures.
An updated search conducted on December 7, 2006, with the previously used key words of KIDNEY and ANEMIA identified 639 citations of English-language studies indexed in MEDLINE after November 2004. Furthermore, the ERT searched the clinicaltrials.gov registration website to identify additional studies that might be completed. The search update resulted in the addition of 6 RCTs to the systematic review on this topic.1-6 All were in patients not on dialysis therapy, mostly with CKD stages 3 to 4. The ERT also updated Table 1 of "ongoing studies" to show what trials will be completed in the future.
The new studies were critically appraised by the ERT. The ERT extracted the data from these studies and added them to the summary tables published in the KDOQI 2006 Anemia in CKD guidelines. Each study was graded with regard to its method quality. The Work Group experts reviewed and confirmed data and quality grades in the summary tables. The ERT and the Work Group members updated the evidence profiles for nondialysis patients following the modified Grades of Recommendation Assessment, Development, and Evaluation (GRADE) approach.57,60 The ERT also compiled an evidence matrix, which gives an overview of the quality of the reviewed evidence by tabulating all studies included in the review by type of outcome and quality.
A meeting of the original 2006 KDOQI Anemia guidelines Work Group members, the ERT, and NKF support staff was held in Dallas, TX, on February 2 and 3, 2007. Before the face-to-face meeting in Dallas, all Work Group members and the KDOQI Chair and Vice-Chair completed new financial disclosure statements. Based on these financial disclosure statements, the Work Group chose the KDOQI Vice-Chair to moderate the face-to-face meeting in Dallas. The Work Group reviewed the summary tables; evidence profiles; the FDA-approved prescribing information for ESAs current as of March 2005 (Appendix 1); and the table of ongoing studies (Table 1).
It then deliberated on which guideline recommendations the expanded evidence base would support. The Work Group then drafted recommendations and graded the strength of the recommendations. The strength of a guideline recommendation is shown in parentheses after the guideline statement as "strong" or "moderately strong." A "Clinical Practice Recommendation" is followed by "CPR" in parentheses. Issues considered in the grading of the quality of the evidence and the strength of the recommendations were detailed in the rationale section corresponding to each statement.
The draft of the updated guidelines underwent refinement and internal review by the Work Group through emails and conference calls. It was then reviewed by the KDOQI Advisory Board and the public in April 2007, followed by further revisions by the Work Group.
Grading of the quality of a study
A detailed description can be found in the methods section of the 2006 KDOQI Anemia guidelines.56 Each study was graded with regard to its method quality mainly for its primary outcome and also for the quality-of-life outcome, if this was reported and was not the primary outcome. Table 14 shows the grading scheme for study quality.
Grading of the quality of evidence
The evidence profile recorded the assessment of the quality of evidence, the summary of the effect for each outcome, the judgment about the overall quality of the evidence, and a summary assessment of the balance of benefits and harms.57
The quality of a body of evidence pertaining to a particular outcome of interest was initially categorized based on study design (Table 15). For questions of interventions, the initial quality grade is "high" if the body of evidence consists of RCTs, "low" if it consists of observational studies, or "very low" if it consists of studies of other study designs. The grade for the quality of evidence for each intervention/outcome pair was then decreased if there were limitations to the method quality of the aggregate of studies, if there were inconsistencies in the results across studies, if there was uncertainty about the directness of evidence including limited applicability of the findings to the population of interest, if the data were imprecise or sparse, or if there was thought to be a high likelihood of reporting bias (Table 15). The final grade for the quality of the evidence for an intervention/outcome pair could be one of the following 4 grades: "high," "moderate," "low," or "very low" (Table 15).
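The grading logic described above can be sketched in a few lines of Python. This is only an illustration, not the procedure defined in Table 15: the function name `grade_evidence`, the design labels, and in particular the rule of lowering the grade by exactly one level per identified concern are assumptions made for the example. The text states only that the grade "was then decreased," not by how much.

```python
# Illustrative sketch of the evidence-quality grading described in the text.
# ASSUMPTION: each concern lowers the grade by one level; the source does not
# specify the size of the decrease.

GRADES = ["very low", "low", "moderate", "high"]  # ordered worst to best

# Initial grade by study design, as stated in the text.
INITIAL_GRADE = {
    "rct": "high",
    "observational": "low",
    "other": "very low",
}

# Reasons for decreasing the grade, as listed in the text.
CONCERNS = {
    "method quality limitations",
    "inconsistency",
    "indirectness",
    "imprecise or sparse data",
    "reporting bias",
}


def grade_evidence(design, concerns=()):
    """Return the final quality grade for one intervention/outcome pair."""
    unknown = set(concerns) - CONCERNS
    if unknown:
        raise ValueError(f"unrecognized concerns: {sorted(unknown)}")
    level = GRADES.index(INITIAL_GRADE[design])
    # Hypothetical rule: one level down per distinct concern, floor at "very low".
    level = max(0, level - len(set(concerns)))
    return GRADES[level]


if __name__ == "__main__":
    print(grade_evidence("rct"))
    print(grade_evidence("rct", ["inconsistency"]))
    print(grade_evidence("observational", ["indirectness", "reporting bias"]))
```

Under these assumptions, a body of RCT evidence with one concern ends at "moderate," and the grade never drops below "very low."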
The quality of the overall body of evidence was then determined based on the quality grades for all outcomes of interest, taking into account explicit judgments about the relative importance of each of the outcomes. To judge the balance between benefits and harms, the summaries for the actual results for each outcome were reviewed. Four grades for the quality of overall evidence were used, as defined in Table 15.
Grading of guideline recommendations
Overall, the strength of a guideline was based on the extent to which the Work Group could be confident that adherence will do more good than harm. The strength of a recommendation was based on the quality of the overall supporting evidence, as well as additional considerations (Table 16). The strength of a guideline recommendation could be rated as either "strong" or "moderately strong." A "strong" guideline requires support by evidence of "high" quality. A "moderately strong" guideline requires support by evidence of at least "moderate" quality. Incorporation of additional considerations can modify the linkage between quality of evidence and strength of a guideline, usually resulting in a lower strength of the recommendation than would be supportable based on the quality of evidence alone.
A "strong" rating indicates the expectation that the guideline recommendation will be followed unless there are compelling reasons to deviate from the recommendation in an individual. This is based on "high"-quality evidence that the practice results in net medical benefit to the patient and the assumption that most well-informed individuals will make the same choice. A "moderately strong" rating indicates the expectation that consideration will be given to following the guideline recommendation. This is based on at least "moderate"-quality evidence that the practice results in net medical benefit to the patient and the assumption that a majority of well-informed individuals will make this choice, but a substantial minority may not.
Clinical practice recommendations
In the absence of "high"- or "moderate"-quality evidence or when additional considerations did not support "strong" or "moderately strong" evidence-based guideline recommendations, the Work Group was able to draft "CPRs" based on overall consensus of the opinions of the Work Group members (Table 16). As such, the Work Group recommends that clinicians give consideration to following these "CPRs" for eligible patients.
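The linkage between overall evidence quality and the label attached to a statement can be summarized as a small lookup. The following Python sketch is an illustration only: the function name `recommendation_label` and the boolean flag `considerations_lower` are hypothetical, and modeling the effect of "additional considerations" as a single one-step reduction is an assumption. In practice the Work Group made this judgment case by case (Table 16).

```python
# Illustrative mapping from overall evidence quality to the label of a statement.
# ASSUMPTION: additional considerations, when they apply, lower the label by
# exactly one step. The source says only that they usually result in a lower
# strength than the evidence alone would support.

LABELS = ["CPR", "moderately strong", "strong"]  # ordered weakest to strongest


def recommendation_label(overall_quality, considerations_lower=False):
    """Return 'strong', 'moderately strong', or 'CPR' for a statement."""
    if overall_quality == "high":
        step = LABELS.index("strong")
    elif overall_quality == "moderate":
        step = LABELS.index("moderately strong")
    elif overall_quality in ("low", "very low"):
        step = LABELS.index("CPR")
    else:
        raise ValueError(f"unknown quality grade: {overall_quality!r}")
    if considerations_lower:
        # Hypothetical one-step reduction, never below a CPR.
        step = max(0, step - 1)
    return LABELS[step]


if __name__ == "__main__":
    for quality in ("high", "moderate", "low", "very low"):
        print(quality, "->", recommendation_label(quality))
```

The sketch encodes the ceiling stated in the text: "high" evidence is required for "strong," at least "moderate" evidence for "moderately strong," and anything weaker leads to a consensus-based CPR.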
Meta-analyses
Meta-analyses were performed on a subset of RCTs in our systematic review that had 6 or more months of mean follow-up. RRs with 95% CIs were calculated for each study for mortality and for cardiovascular disease. For the cardiovascular disease end point, we combined events for coronary, cerebrovascular, and peripheral vascular disease and heart failure as defined in each study. For CHOIR2 and CREATE,1 we included all events from the primary composite outcomes, even though they also included deaths from any cause or from cardiac arrhythmias. We grouped studies according to whether they were conducted in nondialysis patients or dialysis patients. We included the study by Furuland et al12 with the dialysis studies, even though it contained a subgroup of nondialysis patients.
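For readers unfamiliar with the calculation, a per-study RR and its 95% CI can be obtained from the event counts in each arm. The Python sketch below uses the common log-scale normal approximation. This is an assumption, since the text does not say which formula Meta-Analyst applies, and the function name `relative_risk` and the counts in the usage example are invented for illustration rather than taken from any of the included trials.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Return (RR, lower, upper) for arm A versus arm B.

    Uses the standard large-sample approximation on the log scale:
        SE(log RR) = sqrt(1/a - 1/n1 + 1/c - 1/n2)
    ASSUMPTION: this is a generic textbook method, not necessarily the one
    used in the guideline's meta-analyses. It requires at least one event
    in each arm.
    """
    if events_a <= 0 or events_b <= 0:
        raise ValueError("each arm needs at least one event for this method")
    rr = (events_a / total_a) / (events_b / total_b)
    se_log = math.sqrt(
        1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 10/100 events in arm A, 20/100 in arm B.
    rr, lo, hi = relative_risk(10, 100, 20, 100)
    print(f"RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

The log RR and its standard error are also the inputs that a pooled analysis would need from each study.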
Calculations were performed using Meta-Analyst (version 0.99 1997; Joseph Lau, Tufts-New England Medical Center, Boston, MA). Because of the clinical heterogeneity of the studies in terms of populations, interventional protocols, durations of follow-up, and outcome definitions, we used a random-effects model according to DerSimonian and Laird for dichotomous outcomes. The random-effects model incorporates both within-study and between-studies variability in assigning weights to each study. It gives a wider CI when heterogeneity is present and thus is more conservative compared with a fixed-effects model.
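The weighting described above can be illustrated with a minimal Python sketch of the DerSimonian and Laird estimator. It is not the Meta-Analyst implementation; the function name `dersimonian_laird` and the example inputs are hypothetical. It assumes each study contributes a log RR and the variance of that log RR.

```python
import math


def dersimonian_laird(log_effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects method.

    log_effects: per-study effect estimates on the log scale (e.g. log RR)
    variances:   within-study variances of those estimates
    Returns (pooled_log_effect, standard_error, tau_squared).
    """
    k = len(log_effects)
    if k == 0 or k != len(variances):
        raise ValueError("need matching, non-empty effect and variance lists")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q and the moment estimate of between-studies variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical inputs: two studies with opposite effects, equal precision.
    pooled, se, tau2 = dersimonian_laird([-1.0, 1.0], [0.1, 0.1])
    lower = math.exp(pooled - 1.96 * se)
    upper = math.exp(pooled + 1.96 * se)
    print(f"pooled RR = {math.exp(pooled):.2f}, tau^2 = {tau2:.2f}")
    print(f"95% CI {lower:.2f} to {upper:.2f}")
```

When the studies disagree, the estimated between-studies variance is positive, every study's weight shrinks, and the standard error of the pooled estimate grows. This is the sense in which the random-effects CI is wider and more conservative. When the studies agree closely, the estimate of between-studies variance is zero and the result matches the fixed-effects analysis.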