A Functional Comparison of the BILOG-MG, ICL, and PARSCALE Programs for Estimating IRT Item Parameters from Single-Group and Multiple-Group Test Data

English title: A Comparison on the Performances of the Computer Programs BILOG-MG, ICL, and PARSCALE for Estimating IRT Item Parameters with Single- and Multiple-Group Test Data
Publisher: 한국교육평가학회
Authors: Seonghoon Kim (김성훈), Sun Kim (김선)
Publication: 『교육평가연구』, Vol. 27, No. 2, pp. 327~356 (29 pages in total)
Subject classification: Social Sciences > Education
File format: PDF
Publication date: 2014.06.30

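Korean Abstract

This study used simulation to compare three computer programs, BILOG-MG, ICL, and PARSCALE, which support the marginal maximum likelihood and Bayes modal methods for estimating item response theory (IRT) item parameters. As groundwork, the principles and methods (command syntax) by which each program estimates IRT item parameters were presented for test data obtained from a single group and from multiple groups (common-item nonequivalent groups). The simulation results showed that, depending on the structure of the test data, the three programs differed clearly in the accuracy of IRT item parameter estimation. For single-group test data fitted by the two- and three-parameter logistic (2PL and 3PL) models, ICL performed best overall across test conditions that varied test length, test difficulty, and sample size, but there appeared to be no practical difference among the three programs. For multiple-group test data fitted by the 2PL and 3PL models, BILOG-MG and ICL performed at nearly the same level overall across test conditions that varied the degree of nonequivalence and the sample size. In multiple-group IRT item parameter estimation, PARSCALE ran only when a test form contained at most 23 items (5 common items + 18 unique items), and its mean estimation error was roughly 1.2 to 1.5 times or more that of BILOG-MG and ICL.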

English Abstract

The three computer programs BILOG-MG, ICL, and PARSCALE were developed to estimate item response theory (IRT) item parameters and ability distributions using the marginal maximum likelihood and Bayes modal methods. Using simulated single-group and multiple-group (common-item nonequivalent groups) test data, this study compared how accurately the three programs estimate the item parameters of the two- and three-parameter logistic (2PL and 3PL) models. As a methodological basis, the estimation principles underlying each program and the detailed command syntax for running each program were presented. PARSCALE could conduct multiple-group IRT estimation using its DIF model, but it could handle at most 23 items per test form. The simulation results showed that the relative performance of the three programs differed by test data structure (single-group vs. multiple-group data). For single-group test data fitted by the 2PL and 3PL models, generated under combinations of test length, test difficulty, and sample size, ICL performed best overall, but the three programs performed almost equally in a practical sense. For multiple-group test data, generated under combinations of the degree of nonequivalence between examinee groups and sample size, BILOG-MG and ICL performed almost equally. For this multiple-group IRT estimation, PARSCALE worked for 23-item test forms but performed much worse than BILOG-MG and ICL.
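For readers unfamiliar with the models named in the abstract, the following is a minimal sketch of the 3PL item response function (which reduces to the 2PL when the guessing parameter is zero) and of how dichotomous test data might be simulated for two examinee groups. It is not the authors' simulation design: the function names, the scaling constant `D = 1.7`, the normal ability distributions, and all item parameter values are illustrative assumptions.

```python
import math
import random


def irf_3pl(theta, a, b, c=0.0, D=1.7):
    """Probability of a correct response under the 3PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    Setting c = 0 gives the 2PL model. D = 1.7 is a conventional
    scaling constant, assumed here rather than taken from the paper.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def simulate_responses(items, n_examinees, mean=0.0, sd=1.0, seed=0):
    """Simulate 0/1 responses for one group of examinees.

    items is a list of (a, b, c) tuples. Abilities are drawn from a
    normal distribution (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(mean, sd)
        row = [1 if rng.random() < irf_3pl(theta, a, b, c) else 0
               for (a, b, c) in items]
        data.append(row)
    return data


# Hypothetical item parameters (a, b, c), for illustration only.
items = [(1.0, -0.5, 0.2), (1.2, 0.0, 0.2), (0.8, 0.7, 0.2)]

# Two groups with different ability means, mimicking nonequivalent groups.
group1 = simulate_responses(items, n_examinees=500, mean=0.0, seed=1)
group2 = simulate_responses(items, n_examinees=500, mean=0.5, seed=2)
```

In a common-item nonequivalent groups design, each group would take a form that shares only some items with the other; here both groups answer the same three items purely to keep the sketch short.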

Keywords

item response theory (IRT), BILOG-MG, ICL, PARSCALE, single- and multiple-group IRT estimation
