
A Comparison Study of Machine Learning Methods Considering the Structure of Longitudinal Data: Focusing on MERF and glmmLasso

Publisher: Korean Society for Educational Evaluation (한국교육평가학회)
Authors: Hyewon Chung (정혜원), Eunah Jang (장은아)
Journal: Journal of Educational Evaluation (교육평가연구), Vol. 38, No. 1, pp. 209-241 (33 pages)
Subject classification: Social Sciences > Education
File format: PDF
Publication date: 2025.03.31


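Korean Abstract

This study was conducted to compare the predictive performance and variable selection accuracy of MERF (mixed-effects random forest) and glmmLasso (generalized linear mixed models with Lasso regularization), two machine learning methods that can reflect the structure of longitudinal data. To this end, simulated data were generated under conditions varying the number of predictors (25, 50, 100), the number of repeated measurements (3, 4, 6), and the sample size (500, 1,000, 2,000, 4,000). MERF and glmmLasso were then applied, and the two methods were compared on root mean squared error (RMSE), the accuracy rate and error rate of variable selection, and relative bias and bias. The simulation results were as follows. First, glmmLasso showed better predictive performance than MERF under all simulation conditions considered in this study. Second, the RMSE of MERF increased as the number of predictors increased and as the sample size decreased, indicating degraded predictive performance. In contrast, the RMSE of glmmLasso was estimated relatively stably, with no clear trend across simulation conditions. Third, for both MERF and glmmLasso, RMSE rose as the number of repeated measurements increased in conditions with a small sample size. With glmmLasso, relative bias appeared in some conditions with a sample size of 500, but above a certain sample size RMSE showed no clear trend across the repeated-measurement conditions. Fourth, with MERF, the accuracy rate of variable selection increased as the number of predictors and the number of repeated measurements decreased and as the sample size increased. Based on these results, considerations for applying the two models to longitudinal data and suggestions for future research were presented.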

English Abstract

The purpose of this study is to compare the predictive performance and variable selection accuracy of two machine learning methods, MERF and glmmLasso, that can account for the structure of longitudinal data. To achieve this, a simulation study was conducted with conditions varying in the number of predictors (25, 50, 100), the number of repeated measurements (3, 4, 6), and the sample size (500, 1,000, 2,000, 4,000). The performance of MERF and glmmLasso was evaluated using root mean squared error (RMSE), variable selection accuracy rate, error rate, relative bias, and bias. The results are as follows. First, glmmLasso consistently outperformed MERF in predictive performance across all conditions investigated in the current study. Second, MERF showed increasing RMSE with more predictors and smaller sample sizes, whereas glmmLasso maintained stable RMSE. Third, for both methods, RMSE increased with the number of repeated measurements when the sample size was relatively small. Additionally, in glmmLasso, relative bias was observed under some conditions when the sample size was 500. However, with a sufficiently large sample size, RMSE differences across the different numbers of repeated measurements were minimal. Fourth, in MERF, the variable selection accuracy rate improved with fewer predictors, fewer repeated measurements, and a larger sample size. Based on these findings, practical guidelines for applying MERF and glmmLasso while considering the structure of longitudinal data, along with suggestions for future research, were provided.
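
As a rough illustration of the simulation design summarized in the abstract, the following Python sketch enumerates the fully crossed conditions (3 predictor counts x 3 numbers of repeated measurements x 4 sample sizes) and shows how RMSE and relative bias might be computed. The function names are hypothetical, and the relative-bias formula (mean estimate minus true value, divided by the true value) is a common convention assumed here; the abstract does not state the authors' exact definitions or their data-generating model.

```python
import itertools
import math

# Condition levels as listed in the abstract.
N_PREDICTORS = (25, 50, 100)
N_REPEATED_MEASUREMENTS = (3, 4, 6)
SAMPLE_SIZES = (500, 1000, 2000, 4000)


def design_grid():
    """Return every combination of the three design factors (fully crossed)."""
    return [
        {"n_predictors": p, "n_waves": t, "n_cases": n}
        for p, t, n in itertools.product(
            N_PREDICTORS, N_REPEATED_MEASUREMENTS, SAMPLE_SIZES
        )
    ]


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_bias(estimates, true_value):
    """Assumed definition: (mean of estimates - true value) / true value."""
    if true_value == 0:
        raise ValueError("relative bias is undefined for a true value of 0")
    mean_estimate = sum(estimates) / len(estimates)
    return (mean_estimate - true_value) / true_value


if __name__ == "__main__":
    grid = design_grid()
    print(f"number of simulation conditions: {len(grid)}")
    print(f"first condition: {grid[0]}")
```

In a full replication, each condition in the grid would be passed to a data generator and to the MERF and glmmLasso fitting routines, with these metrics averaged over replications; those steps are omitted here because the abstract does not specify them.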
