Development and Validation of the Korean Toddler Mental Health Screening Test
Abstract
Purpose
This study develops and validates a scale that comprehensively evaluates intellectual and social development and emotional and behavioral problems in children ages 2 to 5.
Methods
The Korean Toddler Mental Health Screening Test (K-TMHST) encompasses 5 domains to assess intellectual (language, motor, and social development), emotional (mood and affect), and behavioral (attention and activity) problems. After informed consent, we recruited 1,080 adult guardians aged 18 years or older with children aged 2–5 years (24–71 months), and they answered a K-TMHST. We examined reliability, construct validity, and criterion-related validity to explore the psychometric properties.
Results
The reliability was good to excellent. Confirmatory factor analysis yielded a 5-factorial solution for the symptom subscales supporting construct validity. Criterion-related validity was generally satisfactory in that all subscales of the K-TMHST showed significant correlations with relevant measures in the expected direction.
Conclusion
The K-TMHST is a newly developed parent-report questionnaire with good psychometric properties. The K-TMHST can contribute to early and accurate screening of such problems in Korea, which can be a starting point for timely and effective prevention and intervention.
INTRODUCTION
Early childhood, which spans the ages of 2–5 years, is a developmental stage that roughly corresponds to the period before entering elementary school. Scholars consider this period (birth to age 5) the most important developmental phase in an individual's life (Moharir & Kulkarni, 2023). During this period, infants develop their physical and motor skills rapidly, their language develops significantly, communication becomes more fluid, and their cognitive ability regarding the surrounding environment improves. Furthermore, most infants leave their familial environments at this stage and attend childcare facilities for the first time. During this period, they acquire fundamental social skills and develop emotional competencies, including emotional differentiation, emotional recognition and expression, and the management of emotional regulation (Ruba & Pollak, 2020; Slot et al., 2020).
Overall, successful development during early childhood, when rapid changes occur, becomes the cornerstone of later adaptation to school, academic achievement, and growth into healthy adults (Lo et al., 2017). In a neurodevelopmental context, human brain development occurs rapidly from birth to 3 years of age, and after that, brain function has high plasticity depending on learning and experience until adolescence (Guyer et al., 2018). Mental health issues in early childhood have a detrimental effect on brain development, with the potential to predict the onset of mental health problems in adolescence and adulthood (Bor et al., 2004; Pine & Fox, 2015; Toumbourou et al., 2011). Consequently, early detection and intervention of developmental delays and mental health problems in early childhood represent a significant social issue in promoting healthy development throughout the lifespan (Izett et al., 2021).
Several epidemiological studies have reported that the prevalence of mental health disorders in infants and preschool children ranges from approximately 7% to 35% (Briggs-Gowan et al., 2001; von Klitzing et al., 2015). Similarly, the Survey on the Promotion of Mental Health of Infants and Young Children of Korea, conducted by the Korea Health Promotion Institute in 2012–2013, indicates that 3 to 4 out of 10 infants and children under the age of 6 residing in Gwangmyeong-si, Gyeonggi-do, and Mapo-gu, Seoul, have social and emotional problems. Furthermore, according to Health Insurance Review and Assessment Service, from 2016 to 2020, the number of infants and children aged 0 to 9 diagnosed with mental illness (diagnosis codes F00 to F99) in the Department of Pediatric Psychiatry has continued to increase (Park & Heo, 2021).
Nevertheless, the rate of preschool children seeking help from mental health professionals is low worldwide (Pihlakoski et al., 2004). As infants and toddlers are unable to recognize and report their problems independently, guardians, such as parents, may play a pivotal role in intervening for mental health problems in preschool children. However, this is contingent on their awareness of the issue. Furthermore, given the practical difficulties guardians encounter in locating a mental health professional to assess their infants’ mental development, there is a pressing need to identify and address issues early through the large-scale implementation of early childhood development screening tests (Egger & Emde, 2011).
Among the early childhood development screening tests used in Korea, the Korean Ages and Stages Questionnaire (K-ASQ; Heo et al., 2006) and Korean Child Development Inventory (K-CDI; Kim & Shin, 2006) were originally developed overseas, and subsequently adapted and standardized for Korean populations. Practitioners should use these tests with caution when applied to Korean children, as detailed aspects of infant development may vary depending on cultural differences (Korea Disease Control and Prevention Agency, 2017).
A representative test for Korean infants is the Korean Development Screening Test for Infants and Children (K-DST), developed in 2007 by the Ministry of Health and Welfare and the Korea Centers for Disease Control and Prevention as part of the infant health checkup project and revised in 2017. This scale evaluates 6 areas (gross motor, fine motor, cognition, language, social skills, and self-help) in infants and toddlers aged 4–71 months. It includes an assessment of intellectual developmental disorder (IDD) or autism spectrum disorder (ASD). Tests screening for ASD can assist in identifying individuals with this condition. However, the test does not provide information on emotional and behavioral problems such as attention-deficit/hyperactivity disorder (ADHD), anxiety, or depression. This gap limits the scope of screening for mental health problems in children.
Infant screening tests that evaluate children's problem behaviors based on guardian reports include the widely used Infant Behavior Rating Scale (Child Behavior Checklist for ages 1.5–5 [CBCL 1.5–5]; Oh & Kim, 2013) and Korean Personality Rating Scale (KPRC; Cho et al., 2006). However, these scales have too many questions, and their norms were established more than 10 years ago; therefore, they do not reflect the characteristics of recent Korean children. Thus, the present study develops a scale that comprehensively evaluates intellectual development, social development (SD), and emotional and behavioral problems in children ages 2 to 5. This multifaceted structure enhances its utility for various purposes, including screening, measuring symptom severity, and evaluating treatment effects.
MATERIALS AND METHODS
1. Item Development and Pilot Study
We developed the scale in 3 stages: item development, a pilot study, and scale validation. First, we established the principal domains and a comprehensive item pool. Based on an extensive literature review and existing data on psychological problem areas in children in Korea, we identified 5 major domains: language development (LD), motor development (MD), attention and activity (AA), SD, and mood and affect (MA). Notably, language, motor, and SD are the central aspects of early childhood development in the operational definition of developmental delay (Congress.gov, 2004) and IDD (American Psychiatric Association, 2013). In addition, the AA domain was selected on the grounds that caregivers can observe early signs of hyperactivity and inattention from the toddler stage (American Psychiatric Association, 2013). Moreover, the MA domain was included to evaluate negative affectivity, which encompasses major and prevalent internalizing problems (e.g., anxiety, worry, depression, lack of confidence) during early childhood (King et al., 1991).
In this phase of the construction process, we populated the item pool with 50–100 items per subscale, including statements such as “It is difficult to understand what my child is saying,” “My child's movements are clumsy,” “My child never stops moving,” “My child frequently makes unusual vocalizations,” and “My child often feels nervous.” By applying 3 rules, we selected 104 items through author consensus. The first rule was to include items theoretically central to each subscale (e.g., the diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders [DSM]). The second rule was to include clinically relevant items implicated in prior literature, such as those related to developmental considerations. The third rule was to keep approximately the same number of items per domain. A panel of experts (4 clinical psychologists and 2 early childhood education practitioners) confirmed the overall adequacy of the scale and all preliminary items. The research team conducted periodic reviews of the questions based on assessments of their importance and suitability. Where necessary, we used consensus to eliminate or revise questions and resolve inconsistencies. A group of domestic infant experts led this process, reflecting the symptoms commonly reported by parents of children with developmental and emotional problems who visited pediatric psychiatry. Importantly, cultural considerations specific to the Korean context were integrated throughout the item development process. For example, feedback from Korean early childhood educators and clinicians helped ensure that the wording of items was culturally appropriate and easily understood by Korean parents, avoiding expressions that might be ambiguous or less relevant in this cultural setting.
In the subsequent phase of the pilot study, caregivers of 100 toddlers (aged 2–5) completed the scale. Approximately half of the participants were male (n=48, 48%), with a mean age of 53.4 months (standard deviation=12.7); we recruited them to assess the readability and validity of the preliminary scale. Based on the results of the descriptive statistics, reliability measures, discrimination and difficulty parameters, item information curves obtained from the item response theory model, and factor loadings and model fit indices from confirmatory factor analysis (CFA), in combination with theoretical importance, the authors agreed on eliminating 4 items, resulting in a final scale comprising 100 items. The expert group (4 clinical psychologists and 2 early childhood education practitioners) conducted a second evaluation of the items and determined that the revised final scale exhibited enhanced adequacy compared to the preliminary version.
2. Participants
In the final step, we recruited 1,080 adult guardians aged 18 or older with children aged 2–5 years (24–71 months) as study participants. We stratified the sample according to the 2021 Statistics Korea Population and Housing Census data by region (metropolitan area, provinces), child's gender, and age. Age was divided into 8 groups of 6-month increments: 24–29 months, 30–35 months, 36–41 months, 42–47 months, 48–53 months, 54–59 months, 60–65 months, and 66–71 months. Among the individuals who participated in the standardization study, 51 underwent retesting after a 1-month interval to calculate test-retest reliability. Additionally, 69 individuals completed the CBCL 1.5–5 (Oh & Kim, 2013) to evaluate criterion validity. We obtained informed consent from all participating parents/guardians. In addition, the Institutional Review Board of Chinju National University of Education (No. 2022-08) approved this study's materials and procedures.
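The 8 age groups follow directly from the 24–71 month range, and a minimal Python sketch of the banding is given here. The function name `age_band`, the category labels, and the idea of fully crossing region, gender, and age band into cells are illustrative assumptions; the source does not describe the sampling procedure at this level of detail.

```python
from itertools import product

MIN_MONTHS, MAX_MONTHS, BAND_WIDTH = 24, 71, 6


def age_band(months: int) -> str:
    """Return the 6-month band label (e.g. '24-29') for an age in months."""
    if not MIN_MONTHS <= months <= MAX_MONTHS:
        raise ValueError(f"age {months} months is outside the 24-71 month range")
    start = MIN_MONTHS + ((months - MIN_MONTHS) // BAND_WIDTH) * BAND_WIDTH
    return f"{start}-{start + BAND_WIDTH - 1}"


# The eight band labels, generated from the lower bound of each band.
BANDS = [age_band(m) for m in range(MIN_MONTHS, MAX_MONTHS + 1, BAND_WIDTH)]

# Hypothetical stratification cells: region x child's gender x age band.
# The labels below are placeholders, not taken from the census tables.
REGIONS = ["metropolitan", "provinces"]
GENDERS = ["girl", "boy"]
CELLS = list(product(REGIONS, GENDERS, BANDS))
```

Under this assumed full crossing there would be 2 × 2 × 8 = 32 cells, each of which would receive a quota proportional to its share in the census data.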
3. Measures
1) Korean Toddler Mental Health Screening Test
The final 100 items constituted the Korean Toddler Mental Health Screening Test (K-TMHST). Five domains assessed intellectual (language, motor, and SD), emotional (MA) and behavioral (AA) problems. Representative examples include the following: “My child speaks later than other children of the same age” (LD); “My child started walking later than other children of the same age” (MD); “My child cannot stay still and is always restless” (AA); “My child does not respond when his/her name is called” (SD); and “My child worries about everything” (MA). Based on primary caregiver reports, participants answered items on a 4-point Likert scale (0=not at all, 1=often, 2=very often, and 3=always). In general, higher scores indicate higher levels for each construct.
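As a rough illustration of how responses on this 0–3 scale could be turned into domain scores, the sketch below sums the items within each domain. Summation is an assumption (the text states only that higher scores indicate higher levels), the function name `score_domains` is hypothetical, and the item counts per domain are those reported for the 5-factor model in the Results.

```python
# Item counts per domain as reported for the 5-factor model (100 items in total).
ITEM_COUNTS = {"LD": 16, "MD": 13, "AA": 26, "SD": 25, "MA": 20}
VALID_RESPONSES = {0, 1, 2, 3}  # 0=not at all, 1=often, 2=very often, 3=always


def score_domains(responses: dict[str, list[int]]) -> dict[str, int]:
    """Sum item responses within each domain and add a total score.

    `responses` maps a domain code to the caregiver's answers for that domain.
    Simple summation is an assumption, not the published scoring rule.
    """
    scores = {}
    for domain, n_items in ITEM_COUNTS.items():
        items = responses[domain]
        if len(items) != n_items:
            raise ValueError(f"{domain}: expected {n_items} items, got {len(items)}")
        if any(r not in VALID_RESPONSES for r in items):
            raise ValueError(f"{domain}: responses must be 0, 1, 2, or 3")
        scores[domain] = sum(items)
    scores["total"] = sum(scores.values())
    return scores
```

With this assumed rule, the total score would range from 0 to 300 and the LD score from 0 to 48.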
2) Child Behavior Checklist for ages 1.5–5
To ascertain the convergent and divergent validity of the construct, we employed 4 subscales from the CBCL 1.5–5 (DSM pervasive developmental problems, attention problems, anxious/depressed, and withdrawn), which correspond to constructs measured by the K-TMHST. Each subscale consists of 8–13 items rated on a 3-point Likert scale. The overall internal consistency was excellent (Cronbach α=0.95).
4. Statistical Analyses
We evaluated the reliability of the K-TMHST using internal consistency through Cronbach α and test-retest correlation coefficients. Because heterogeneity between the sample and the population may underestimate the correlations calculated between measures at 2 time points, we calculated both the uncorrected correlation (r) and the corrected correlation (r’). Further, we performed a CFA to confirm the scale's construct validity. The individual items in the scale were rated on a 4-point Likert scale, which limits the assumption of continuity in the responses. Moreover, since the items assess features related to several aspects of problems (cognitive, emotional, social, and behavioral) in children aged 2 to 5, we expected a low rate of reported problem behaviors, resulting in positively skewed responses. Analyzing such skewed ordinal data with maximum likelihood estimation tends to overestimate the chi-square statistic, underestimate factor loadings, and bias standard errors (Kaplan, 2009; Muthén & Kaplan, 1985, 1992).
Therefore, we performed CFA using the diagonally weighted least squares (WLSMV) estimation method for analyzing categorical or ordinal data. This method does not assume normality or continuity of the data and is appropriate for analyzing large sample sizes of over 1,000 (Flora & Curran, 2004). Additionally, we analyzed the same data using the unweighted least squares (ULSMV) estimation method, which provides more accurate estimates in smaller samples compared to WLSMV. We obtained results from both estimation methods.
The fit indices estimated using those least squares estimation methods can be influenced by factors such as sample size and strength of the correlations among the variables and model size, making cutoff points difficult to universally apply to all models (Shi et al., 2019). Moreover, determining cutoff points for fit indices when dealing with non-normally distributed data is also challenging (Savalei, 2021; Xia & Yang, 2019). Therefore, we evaluated the model fit using stringent cutoff criteria commonly used in empirical research: comparative fit index (CFI) and Tucker-Lewis Index (TLI) >0.95, root mean square error of approximation (RMSEA) <0.06, and standardized root mean square residual (SRMR) <0.08 (Browne & Cudeck, 1993; Hu & Bentler, 1999).
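The cutoff criteria above can be expressed as a small check, sketched here in Python. The function name `check_fit` and the example values are hypothetical; the example values are for illustration only and are not the estimates reported in Table 2.

```python
# Cutoffs stated in the text: CFI and TLI > 0.95, RMSEA < 0.06, SRMR < 0.08.
CUTOFFS = {
    "CFI": (">", 0.95),
    "TLI": (">", 0.95),
    "RMSEA": ("<", 0.06),
    "SRMR": ("<", 0.08),
}


def check_fit(indices: dict[str, float]) -> dict[str, bool]:
    """Return, for each fit index, whether it meets its cutoff."""
    result = {}
    for name, (direction, cutoff) in CUTOFFS.items():
        value = indices[name]
        result[name] = value > cutoff if direction == ">" else value < cutoff
    return result


# Hypothetical values for illustration only; not taken from Table 2.
example = {"CFI": 0.97, "TLI": 0.97, "RMSEA": 0.05, "SRMR": 0.07}
verdict = check_fit(example)
```

Because the text notes that such cutoffs are hard to apply universally, a check like this is better read as a summary of the chosen criteria than as a decision rule.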
RESULTS
1. Reliability
Table 1 presents descriptive statistics and reliability measures for each subdomain of the scale (N=1,080). The stratified alpha (Cronbach et al., 1965), as a reliability measure for the entire scale, was 0.98. Moreover, each subdomain's reliability was good to excellent, ranging from 0.91 to 0.96. The test-retest correlations for the subdomains (N=51) ranged from 0.62 to 0.81, and that for the total score was 0.70; all were statistically significant (p<0.01).
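For readers unfamiliar with the two internal-consistency coefficients reported here, the following Python sketch computes Cronbach α for one subscale and the stratified alpha for a composite of subscales, using the standard formula α_s = 1 − Σ σ_i²(1 − α_i) / σ_X², where σ_i² is the variance of subscale i's total and σ_X² is the variance of the overall total. The function names and the toy data are assumptions for illustration; this is not the authors' analysis code.

```python
from statistics import pvariance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach alpha; items[i][p] is the response of person p to item i."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


def stratified_alpha(subscales: list[list[list[float]]]) -> float:
    """Stratified alpha; subscales[s] holds the item-by-person data of subscale s."""
    sub_totals = [[sum(person) for person in zip(*items)] for items in subscales]
    grand_totals = [sum(person) for person in zip(*sub_totals)]
    # Variance of each subscale total weighted by that subscale's unreliability.
    unreliable = sum(
        pvariance(totals) * (1 - cronbach_alpha(items))
        for totals, items in zip(sub_totals, subscales)
    )
    return 1 - unreliable / pvariance(grand_totals)


# Toy data (hypothetical): two 2-item subscales answered by four caregivers.
subscale_a = [[0, 1, 2, 3], [0, 1, 2, 3]]
subscale_b = [[1, 2, 3, 4], [2, 1, 4, 3]]
alpha_b = cronbach_alpha(subscale_b)
alpha_total = stratified_alpha([subscale_a, subscale_b])
```

The stratified coefficient is typically preferred for a multidimensional total score because it weights each subscale's unreliability by that subscale's variance instead of treating all 100 items as one homogeneous set.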
2. Construct Validity
Based on the validation sample, we performed a CFA to evaluate the construct validity of the K-TMHST. We assumed the 5-factor model, where the factor loadings loaded onto their corresponding 5 subdomains: LD (16 items), MD (13 items), AA (26 items), SD (25 items), and MA (20 items). The fit indices from the CFA based on the ULSMV and WLSMV estimation methods are in Table 2.
The results show that CFI and TLI values exceeded 0.95, while RMSEA and SRMR values were below the specified cutoff values. Moreover, all the factor loadings from the 5-factor model were statistically significant, ranging from 0.688 to 0.902 for LD, 0.655 to 0.915 for MD, 0.702 to 0.912 for SD, 0.399 to 0.954 for AA, and 0.535 to 0.950 for MA. Additionally, the estimated correlations between factors ranged from 0.79 to 0.90. These findings provide evidence that the 5-factor model appropriately explains the factor structure of the K-TMHST subscales.
3. Criterion-Related Validity
Table 3 presents the results of the correlation analysis among the subdomains that constitute the scale based on the validation sample (N=69). The results revealed that all subdomains and the total score of the scale were significantly and positively related to each other, ranging from 0.68 to 0.86. Among the subdomains, the correlations were particularly strong between SD and MA (r=0.72, p<0.01), LD and SD (r=0.59, p<0.01), and MD and SD (r=0.59, p<0.01).
Table 4 presents the correlation analysis results between the K-TMHST and several validity measures assessing similar or discriminative constructs. The results revealed that the pervasive developmental problems subscale showed significant correlations with all the subdomains, including LD (r=0.48, p<0.01), MD (r=0.28, p<0.05), AA (r=0.34, p<0.01), SD (r=0.79, p<0.01), and MA (r=0.79, p<0.01), as well as with the total score (r=0.67, p<0.01).
The attention problems also showed significant correlations with all subdomains and the total score, including LD (r=0.46, p<0.01), MD (r=0.45, p<0.01), AA (r=0.72, p<0.01), SD (r=0.45, p<0.01), MA (r=0.25, p<0.05), and with the total score (r=0.67, p<0.01).
The anxious/depressed subscale correlated significantly with several subdomains, including LD (r=0.39, p<0.01), AA (r=0.36, p<0.01), SD (r=0.50, p<0.01), and MA (r=0.74, p<0.01), as well as with the total score (r=0.59, p<0.01).
The withdrawn subscale showed a pattern of significant correlations similar to that of anxious/depressed: LD (r=0.35, p<0.01), AA (r=0.27, p<0.05), SD (r=0.70, p<0.01), and MA (r=0.62, p<0.01), as well as the total score (r=0.54, p<0.01).
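The significance levels attached to these correlations can be sanity-checked with a short Python sketch. It uses the Fisher z approximation and assumes N=69 (the number of caregivers who completed the CBCL 1.5–5); the source does not state which test produced the reported p-values, so both the method and the function name `fisher_p` are assumptions.

```python
from math import atanh, sqrt
from statistics import NormalDist


def fisher_p(r: float, n: int) -> float:
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


N = 69  # assumed sample size for the criterion-validity correlations

# A few of the coefficients reported in the text.
for r in (0.25, 0.28, 0.34, 0.79):
    print(f"r={r:.2f}  approximate p={fisher_p(r, N):.4f}")
```

Under this approximation, coefficients of 0.25 to 0.28 fall below 0.05 but not below 0.01, while 0.34 falls below 0.01, which agrees with the significance levels reported above.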
DISCUSSION
This study mainly aimed to develop and validate a parent-report questionnaire for comprehensively assessing Korean toddlers’ developmental and mental health issues. The newly developed K-TMHST demonstrated sound psychometric properties. Reliability was good to excellent, evidenced by internal consistency and test-retest correlation. Results of the CFA confirmed the 5-factor structure corresponding to the 5 subdomains of the K-TMHST: LD, MD, attention/activity, SD, and MA, supporting construct validity.
Criterion-related validity was promising, considering the differential pattern of correlations in the expected direction and magnitude. The score of the attention/activity subdomain demonstrated the highest positive correlation (r=0.72) with the attention problems subscale of the CBCL 1.5–5, whereas the positive correlation coefficients with other scales of the CBCL 1.5–5 were relatively lower (r=0.27–0.36). Also, the SD subdomain demonstrated a stronger positive correlation with pervasive developmental problems (r=0.79) and withdrawal (r=0.70) than with the attention problems (r=0.45) and anxious/depressed (r=0.50) subscales of the CBCL 1.5–5. These results indicate a satisfactory level of discriminant and convergent validity of the 2 subdomains of the K-TMHST.
On the other hand, the correlation patterns of the other subdomains were less specific. The MA subdomain showed an expected strong positive correlation with the anxious/depressed subscale of the CBCL 1.5–5 (r=0.74). It was also strongly associated with pervasive developmental problems (r=0.79) and withdrawal (r=0.62) subscales. Next, the LD subdomain was generally positively associated with all 4 CBCL 1.5–5 subscales (r=0.35–0.48), and the MD subdomain was only positively associated with 2 subscales: pervasive developmental problems and attention problems (r=0.28–0.45).
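The differential patterns discussed in the last two paragraphs can be laid out side by side with a short Python sketch. The coefficients are those reported in the text (pairs not reported as significant are omitted, and the MA–withdrawn value follows this Discussion); the helper names `profile` and `specificity_gap`, and the gap itself as a crude index of specificity, are illustrative assumptions.

```python
# Correlations reported in the text between CBCL 1.5-5 subscales (outer keys)
# and K-TMHST subdomains (inner keys). Unreported pairs are left out.
REPORTED = {
    "pervasive developmental problems": {"LD": 0.48, "MD": 0.28, "AA": 0.34, "SD": 0.79, "MA": 0.79},
    "attention problems": {"LD": 0.46, "MD": 0.45, "AA": 0.72, "SD": 0.45, "MA": 0.25},
    "anxious/depressed": {"LD": 0.39, "AA": 0.36, "SD": 0.50, "MA": 0.74},
    "withdrawn": {"LD": 0.35, "AA": 0.27, "SD": 0.70, "MA": 0.62},
}


def profile(subdomain: str) -> list[tuple[str, float]]:
    """Return (CBCL subscale, r) pairs for one subdomain, strongest first."""
    pairs = [(scale, rs[subdomain]) for scale, rs in REPORTED.items() if subdomain in rs]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def specificity_gap(subdomain: str) -> float:
    """Difference between the strongest and second-strongest reported r."""
    ranked = profile(subdomain)
    return round(ranked[0][1] - ranked[1][1], 2)


for code in ("LD", "MD", "AA", "SD", "MA"):
    print(code, profile(code)[0], "gap:", specificity_gap(code))
```

Read this way, AA stands out with a clear gap between its strongest correlate (attention problems) and the rest, whereas LD and MA show small gaps, in line with the less specific patterns described above.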
The K-TMHST has comparative advantages over existing screening tools, which can increase its utility in research and clinical settings. First, it provides wider content coverage than previously used screening tests for toddlers. When compared to the K-DST, the K-TMHST includes items that evaluate 2 additional subdomains of attention/activity (26 items) and MA (20 items) problems. Although ADHD is not yet officially diagnosed during the toddler stage, parents often retrospectively recall that these children were more hyperactive than peers from earlier developmental stages (American Psychiatric Association, 2013). Also, affectivity is a major component of temperament, an enduring individual difference of emotional and behavioral response with a genetic basis, which forms the foundation for subsequent personality development and parent–child relationships (Caspi & Silva, 1995). Children with difficult temperaments tend to display higher levels of negative affectivity, and such heightened negative affectivity relates to different functions of the neurodevelopmental system responsible for emotional processing and, ultimately, later development of psychopathology, including emotion regulation difficulty (Camacho et al., 2021; Karevold et al., 2012; Rettew & McKee, 2005). Thus, early and accurate detection of attention and emotional problems is an important but overlooked issue in the mental health screening of toddlers in Korea, which the K-TMHST can readily address.
Second, despite the wider coverage, we carefully arranged the total number and difficulty of items to help reduce respondents’ burden concerning time and effort. In our validation study, it typically took 20 minutes to complete the questionnaire. We examined the readability and difficulty of all items to ensure valid reports of caregivers who do not possess professional knowledge or skills in pediatric psychiatry. These features enhance the clinical utility of the K-TMHST, making it an economical and convenient alternative for parent-report-based mass screenings at a national level or in primary care settings where pediatric or psychiatric experts are unavailable (Pihlakoski et al., 2004).
Third, developing the K-TMHST with a large, recent sample of Korean caregivers of toddlers aged 2–5 years ensures culturally and historically sensitive test interpretation. This approach contrasts with previously used measures that translated questionnaires originally developed in foreign samples (e.g., K-ASQ and K-CDI) or relied on old norms (e.g., CBCL and KPRC). Also, this study adopted stratified sampling based on the 2021 Korean population census, thus increasing sample representativeness.
This study has limitations. First, important aspects of validity remained unaddressed. Future research should examine the diagnostic and predictive validity of the K-TMHST to determine if it can be a valid and effective screening tool. For example, recruiting parents with toddlers diagnosed with developmental delays, IDD, or ASD is necessary to investigate the optimal cutoff score of the K-TMHST. In particular, it is important to conduct longitudinal follow-up studies to clarify the predictive validity of the K-TMHST. For instance, high-risk groups identified by higher AA or MA scores should be followed up until they reach the age of diagnostic threshold to verify whether they are more prone to the corresponding diagnoses later. Second, researchers need to strengthen the discriminant validity of 3 subdomains (LD, MD, and emotional problems). The decreased specificity of the emotional problems subdomain might be due to the existing conceptual overlap, as negative affectivity could be a comorbid condition in pervasive developmental problems and withdrawal. For the language and MD subdomains, however, we found no corresponding direct validator. Future investigations should address these issues by employing different validating measures.
CONCLUSIONS
The K-TMHST is a newly developed parent-report questionnaire with good psychometric properties. While this test is comprehensive enough to provide a wide screening range of mental health problems that could show initial signs during the toddler stage, it is also economical and convenient in test administration. In addition, because researchers constructed the scale using a representative sample of Korean toddler caregivers, the test can provide more culturally sensitive results. Given the relatively high prevalence of mental health problems in toddlers within the community but the low detection rate (Briggs-Gowan et al., 2001; Pihlakoski et al., 2004), we expect the development and ongoing validation of the K-TMHST to improve early and accurate screening in Korea, serving as a foundation for timely and effective prevention and intervention efforts.
ACKNOWLEDGMENTS
This work was supported by Korea Psychology Co., LTD. The funding body had no role in the study design, data collection, data analysis, data interpretation, or writing of this report. The corresponding authors had full access to all the data in this study and had final responsibility for the decision to submit for publication.
