
Using NLP: Do Parents Talk to Boys and Girls Differently During Play?

**All code used in this project is available on GitHub.**

Recently, there has been increasing focus on gender differences and parenting in research and the popular press. This is especially prominent in light of gender disparities in STEM education. One reason for this disparity might stem from how parents speak to their young children during play (e.g., Crowley, Callanan, Tenenbaum, & Allen, 2001; Tenenbaum & Leaper, 2003).

The goal of this project was to use natural language processing (NLP) to investigate whether parents speak to their children differently depending on gender in a neutral play setting.

Data for this project came from a study investigating word-learning skills in toddlers in a lab at UT Austin. With much help from a talented research assistant, Katherine Soon, we transcribed a random sample of parent-child play sessions. In this project, I examine the types of words parents use with their toddlers during a brief play session with a toy in the lab.

 

Part 1:

First, I wanted to see what words parents were using with boys and girls. An easy way for me to visualize the words and their frequency of usage was to create word clouds, a perennial favorite.

Next, I read in a data frame where each row represents a participant. To create this data frame, I read in each text file and appended it as a new row. Note that the participant ID has been removed for privacy.
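As a rough illustration of that loading step, here is a minimal sketch using only the standard library. The folder layout and the `<gender>_<number>.txt` file naming are my assumptions for the example, not the lab's actual scheme, and `load_transcripts` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

def load_transcripts(folder):
    """Return one row (a dict) per transcript file found in `folder`."""
    rows = []
    for path in sorted(Path(folder).glob("*.txt")):
        # Assumed naming scheme: "<gender>_<number>.txt", e.g. "girl_01.txt".
        gender = path.stem.split("_")[0]
        rows.append({"gender": gender, "text": path.read_text(encoding="utf-8")})
    return rows

# Build a tiny throwaway folder so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "girl_01.txt").write_text("Look, a spider!", encoding="utf-8")
    Path(tmp, "boy_01.txt").write_text("That's a ladybug.", encoding="utf-8")
    rows = load_transcripts(tmp)

for row in rows:
    print(row["gender"], "->", row["text"])
```

If pandas is available, passing `rows` to `pandas.DataFrame` should give a one-row-per-participant data frame like the one described above.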

The text looks messy. We need to remove punctuation, unrecognized characters, and common "stop" words. First, I removed punctuation and symbols.
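A minimal sketch of this kind of cleaning, assuming plain English transcripts, is below. `clean_text` is a hypothetical helper, and the choice to keep apostrophes (so "it's" stays one word) is mine.

```python
import re

def clean_text(text):
    """Lowercase the text and keep only ASCII letters, apostrophes, and spaces."""
    text = text.lower()
    # Replace punctuation, digits, symbols, and unrecognized characters with a space.
    text = re.sub(r"[^a-z' ]+", " ", text)
    # Collapse the repeated whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", text).strip()

sample = "Look!! It's a SPIDER... (a big one) #wow"
cleaned = clean_text(sample)
print(cleaned)  # look it's a spider a big one wow
```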

Next, I removed stop words. I also removed words that were very common but not helpful in determining meaningful differences between genders. Here are the top tokens, or words, for each gender:

Top Words for Girls:

['looks', 'like', 'spider', 'spider', 'spider', 'dragonfly', 'dragonfly', 'baby', 'dragonfly', 'goes']

Top Words for Boys:

['ladybug', 'butterfly', 'bee', 'another', 'butterfly', 'grasshopper', 'spider', 'bee', 'looks', 'like']
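The filtering step that produces token lists like these might look something like the sketch below. Both word sets are tiny illustrative stand-ins that I made up for the example; a real stop-word list would be much longer.

```python
# Tiny illustrative stop-word list (an assumption, not the list used in the project).
STOP_WORDS = {"a", "the", "it", "is", "that", "and", "you", "do", "what"}
# Hypothetical task-specific filler words that are common but uninformative.
EXTRA_COMMON = {"okay", "yeah", "oh"}

def tokenize_and_filter(cleaned_text, stop_words=STOP_WORDS | EXTRA_COMMON):
    """Split cleaned text on whitespace and drop any token in `stop_words`."""
    return [tok for tok in cleaned_text.split() if tok not in stop_words]

tokens = tokenize_and_filter("oh look it is a spider and a dragonfly")
print(tokens)  # ['look', 'spider', 'dragonfly']
```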

As a first step towards clarifying whether or not gender differences exist, I decided to look at the frequency of words used with each gender. I used a handy function, 'word_vectorizer.fit_transform', in Python.

Upon initial inspection, it looks like the frequency of word usage is fairly equivalent across genders, which is good news.
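The `fit_transform` name suggests a scikit-learn-style vectorizer, but since that setup isn't shown here, the sketch below swaps in the standard library's `Counter` to get the same kind of per-gender word frequencies. It uses the ten tokens listed above for each gender; `relative_frequencies` is a hypothetical helper.

```python
from collections import Counter

def relative_frequencies(tokens):
    """Map each word to its share of all tokens in the list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

girl_tokens = ["looks", "like", "spider", "spider", "spider", "dragonfly",
               "dragonfly", "baby", "dragonfly", "goes"]
boy_tokens = ["ladybug", "butterfly", "bee", "another", "butterfly",
              "grasshopper", "spider", "bee", "looks", "like"]

girl_freq = relative_frequencies(girl_tokens)
boy_freq = relative_frequencies(boy_tokens)

# Print every word seen in either list, with its share for each gender.
for word in sorted(set(girl_freq) | set(boy_freq)):
    print(f"{word:12s} girls={girl_freq.get(word, 0):.2f} boys={boy_freq.get(word, 0):.2f}")
```

Using shares rather than raw counts keeps the comparison fair when one group has more transcribed speech than the other.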

 

Part 2:

Another interesting analysis might be to investigate how often parents use certain TYPES of words across gender. To do so, I created four text corpora: positive, negative, science-related, and animal words.

I used positive and negative text corpora from here.

I created the science-related words by using a list of terms from this site, and then adding additional terms I frequently observed while transcribing. For example, I included the words "predict," "try" and "guess," which are heavily tied to STEM and the scientific method.

I then created the animal corpus by using part of the MCDI and adding specific animals from the game the parent-child dyads were playing with during the task (e.g., "caterpillar" and "ant").

Next, I used a script to count how many times a word in each category appeared in both sets of transcripts.
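A counting script along these lines could be sketched as follows. The category sets are shortened placeholders: "predict," "try," "guess," "caterpillar," and "ant" come from the description above, while the remaining words and the two sample token lists are invented for illustration.

```python
# Placeholder word lists; the real corpora are much larger.
CATEGORIES = {
    "positive": {"good", "great", "nice"},
    "negative": {"no", "bad", "wrong"},
    "science": {"predict", "try", "guess"},
    "animal": {"caterpillar", "ant", "spider"},
}

def count_categories(tokens, categories=CATEGORIES):
    """Count how many tokens fall into each word category."""
    counts = {name: 0 for name in categories}
    for tok in tokens:
        for name, words in categories.items():
            if tok in words:
                counts[name] += 1
    return counts

# Made-up token lists standing in for the two sets of transcripts.
girl_tokens = "good guess that ant is nice try again".split()
boy_tokens = "no wrong spider try to predict the caterpillar".split()

girl_counts = count_categories(girl_tokens)
boy_counts = count_categories(boy_tokens)
print("girls:", girl_counts)
print("boys: ", boy_counts)
```

Raw counts like these depend on how much speech was transcribed for each group, so dividing by the total number of tokens is one reasonable way to make the two sets comparable.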

Bar plot to compare the genders:

Results:

I did not observe any gender differences in the types of words (e.g., negative or STEM-related) that parents used with 2-year-olds. This is good news in light of the recent evidence that there are vast differences in how parents approach play and speak with their children depending on their gender.

However, this project is limited in several respects. First, our sample size was small: I only used a subsample of the data available in my lab. Second, I only looked at single tokens. I did not examine bigrams or other more advanced NLP techniques here. Third and finally, the task used in this parent-child interaction was not necessarily designed to promote any type of STEM education or related prompting from parents. Since the task was play-based and neutral (no direct instructions were given), this might explain the similar results between the genders.

Let's end with cleaned-up versions of the word clouds for each gender.

 

References:

1. Crowley, K., Callanan, M. A., Tenenbaum, H. R., & Allen, E. (2001). Parents explain more often to boys than to girls during shared scientific thinking. Psychological Science, 12(3), 258-261.

2. Tenenbaum, H. R., & Leaper, C. (2003). Parent-child conversations about science: The socialization of gender inequities. Developmental Psychology, 39(1), 34.

3. Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), Aug 22-25, 2004, Seattle, Washington, USA.

4. Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, J. S., & Bates, E. (2007). MacArthur-Bates Communicative Development Inventories: User's Guide and Technical Manual - Second Edition. Baltimore: Brookes Publishing.

5. http://www.descriptionkey.org/vocab/stem.html#pre-k

6. The Little Learners Lab at UT Austin
