Learning with baby

The rate at which young children learn new words astounds caregivers and scientists alike, yet most models of early language acquisition have been tested only in laboratory settings using highly controlled stimuli. In a recent publication in Science, Vong et al. took a more naturalistic approach: they collected 61 hours of video footage from a head-mounted camera worn by a toddler going about their day-to-day life. Video frames, paired with transcribed audio, were extracted and used to train a neural network dubbed the Child's View for Contrastive Learning model (CVCL). By learning to align distributed vector representations of words and images through a contrastive objective, CVCL matched most of the words it was tested on with the appropriate visual referent, with an average accuracy of 61%. CVCL also showed a modest ability to generalize, correctly identifying novel examples of the same objects around 34% of the time. Interestingly, its classification accuracy was comparable to that of another neural network trained on a far more extensive database containing millions of stimuli. These findings suggest that relatively simple associative learning from limited training cues could be sufficient to support early word learning.
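The contrastive objective described above can be illustrated with a minimal sketch. The function below computes a symmetric CLIP-style InfoNCE loss over a batch of paired image and word vectors; all names, dimensions and the temperature value are illustrative assumptions, not details of the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale each vector to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def contrastive_loss(image_vecs, word_vecs, temperature=0.07):
    """Symmetric InfoNCE loss: matched (image, word) pairs sit on the diagonal
    of the similarity matrix and are pushed above all mismatched pairs.
    This is a generic sketch, not CVCL's actual training code."""
    img = l2_normalize(np.asarray(image_vecs, dtype=float))
    txt = l2_normalize(np.asarray(word_vecs, dtype=float))
    logits = img @ txt.T / temperature      # pairwise cosine similarities, scaled
    idx = np.arange(len(logits))            # index of each vector's true partner

    def cross_entropy(lg):
        # numerically stable log-softmax over each row
        lg = lg - lg.max(axis=1, keepdims=True)
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[idx, idx].mean()  # negative log-prob of the true pair

    # average the image-to-word and word-to-image directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

# Toy check: identical pairings should score a lower loss than shuffled ones.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(4, 8))
loss_matched = contrastive_loss(vecs, vecs)
loss_mismatched = contrastive_loss(vecs, vecs[::-1].copy())
```

Minimizing this loss pulls each word vector toward the frames it co-occurs with and away from the other frames in the batch, which is the kind of simple associative signal the study argues may suffice for early word learning.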

Original reference: Science 383, 504–511 (2024)
