An analysis of the effects of different multiple-choice item selection strategies on the reliability and validity of measures of physician competence in specialty certification