Microblog discourse analysis for parenting style assessment

1 Introduction

Parenting style is the summary of the characteristics of the parents' various parenting behaviors, attitudes, and emotions. The concept of parenting style has been mainly divided into three orientations: dimensional orientation, practical behavior orientation, and comprehensive orientation (1). First, the dimensional orientation considers that the parenting style is a relatively fixed parenting behavior tendency shown by parents in daily life, including the speech and emotion between parents and children (2). The most representative example of this orientation is the classification of parenting as authoritative, permissive, or authoritarian (3). Authoritative parents not only have high demands and expectations for their children but also provide warm emotional support and responses (3). They encourage their children to express their thoughts and focus on cultivating their children's autonomy (3). Permissive parents grant their children a great deal of freedom, set few rules and requirements, and are overly tolerant of their children (3). Authoritarian parents, on the other hand, emphasize absolute obedience, establish strict rules, yet lack emotional warmth and communication (3). Snow et al. (4) further enriched the theory by adding the neglectful parenting style. Neglectful parents lack both affection for their children and the establishment of rules and requirements. This parenting style often has severe negative impacts on children's growth (4). Second, the practical behavior orientation considers the parenting style as the specific behavior of parents when raising their children, such as the time spent accompanying their children and the extent of their involvement in their children's educational activities (5). Finally, the comprehensive orientation integrates the first two definitions. This means that parenting style is not only a summary of the characteristics of the parents' various parenting behaviors but also includes parenting attitudes and emotions.

A negative parenting style is characterized by low emotional warmth and high rejection and overprotection (6), which has been identified as an important cause of anxiety, depression, and suicide among university students (7, 8). Assessment of the parenting style of people's parents allows healthcare institutions to pay attention to people with mild anxiety, mild depression, and potential suicidal ideation in a timely manner and make appropriate interventions to prevent the situation from worsening.

Currently, there is a lack of automated assessment methods for parenting styles. Traditional assessment methods based on questionnaires and interviews often require individuals to complete a professional questionnaire or participate in a face-to-face interview with a psychology expert. In 1980, Perris and Jacobsson (9) proposed the parental rearing behavior questionnaire (EMBU). The questionnaire has 15 subscales with 81 items. The 15 subscales form three father-related factors and four mother-related factors. To make this scale suitable for the Chinese population, Yue et al. (10) revised the EMBU and proposed a Chinese version. The revised questionnaire contains one sub-questionnaire for the father and one for the mother. The father's sub-questionnaire comprises 58 items in six dimensions: Emotional warmth, Excessive Interference, Punishment, Rejection, Overprotection, and Preference. The mother's sub-questionnaire comprises 57 items in five dimensions: Emotional warmth, Excessive Interference, Punishment, Rejection, and Preference. Although this questionnaire provides a comprehensive assessment of parenting styles, it may be time-consuming for respondents. Later, Jiang et al. (11) introduced the short-form Chinese version of EMBU (s-EMBU-C) by reducing the number of items from 115 to 42. The short-form questionnaire has only three dimensions: Rejection, Emotional warmth, and Overprotection. This questionnaire was tested among Chinese university students, and the results show that it has good reliability and validity.

However, this type of invasive method requires high enthusiasm from individuals and is difficult to apply on a large scale. Medical and healthcare institutions lack low-cost, automated, and non-invasive methods. With the development of social media, more and more people are expressing their inner feelings on social media (e.g., on microblogs) (12, 13). Being large-scale, low-cost, and open, microblogs allow us to assess the parenting styles of a large number of students' parents in a no-contact manner.

However, little research in the literature addresses the assessment of parenting styles of students' parents based on microblogs. The first challenge is that the correlation between students' discourses on microblogs and parents' parenting styles has been little explored. The second challenge is that a long time series of a student user's posts often contains a lot of noisy data, and critical information can be overwhelmed by numerous meaningless posts.

Figure 1 shows a real open post sequence written by a student on a microblog over the period of a year. We make the following observations.

www.frontiersin.org

Figure 1. Real open post sequence of a university student user from May 1, 2018 to April 30, 2019.

(Observation 1) People may talk about their childhood experiences of being parented on microblogs. According to the described experiences, the student's parents may have beaten her for small things, seldom praised her, and seldom let her do what she wanted when she was young. These descriptions indicate that the student grew up under a negative parenting style.

(Observation 2) Noisy information that is not related to the parenting style appears in the student's open post sequence, such as in the student's comments on daily trivia.

The above observations inspire us to assess the parenting styles of student microblog users' parents based on their discourses.

In this study, we propose to (1) first investigate the correlation between students' discourses on microblogs and their parents' parenting styles and (2) then predict the parenting style of the microblog student users' parents based on the found correlation and their discourses.

Subtask 1: exploration of the correlation between microblog student user's discourse and parents' parenting style

Subtask 1 is designed not only to explore the correlation between the discourse in students' open posts and the parenting style of their parents but also to understand the differences in emotional and linguistic expressions across parenting styles.

To address subtask 1, we first constructed a microblog-based parenting style dataset, which contains numerous discourses from microblog student users. The dataset contains 111,258 open posts made by 575 students from November 1, 2019, to October 31, 2022. Subsequently, the associations between the discourses and three types of parenting styles were investigated from the perspectives of students' linguistic and emotional expressions.

Through data analysis, we found that positive, mixed, and negative parenting styles show significant differences in linguistic expressions and emotional expressions on microblogs. In this study, linguistic and emotional expressions were analyzed by examining the frequency of different topics and emotional words used by students. This approach allowed us to identify distinct patterns of expression associated with each parenting style. Under a negative parenting style, the effect of gender on negative emotional expressions is more significant than that under a positive or mixed parenting style.

Subtask 2: microblog-based parenting style assessment

Subtask 2 focuses on inferring the parenting style of a student's parents, primarily by utilizing the correlation between the student's discourse and their parents' parenting style. This correlation, discovered through our analysis, plays a crucial role in providing a more accurate assessment than relying on microblog data alone.

The parenting style is represented in the form of scores in the dimensions Rejection, Emotional warmth, and Overprotection (11); i.e., $(s_r, s_e, s_o)$. Let $P = (post_1, \ldots, post_n)$ be a student's open post sequence. That is, we seek a function $F$ with $F(P) = (s_r, s_e, s_o)$.

To solve subtask 2, we first used Sentence BERT (14) to extract the linguistic and emotional information from the student's open posts. An attention layer was then used to handle data noise and capture the key discourses describing the student's experiences in childhood. Subsequently, we designed a novel correlation injection layer to merge the found correlations and the above key discourses. Last, based on the correlations and key discourses, two fully-connected layers predicted the parenting style of the parents of the student as scores $(s_r, s_e, s_o)$.

The parenting style dataset was used to evaluate the effectiveness of the method. Experimental results obtained on the dataset show that the parenting style assessment method effectively predicts the parenting style of the parents of the student with a mean square error (MSE) of 0.12 and mean absolute error (MAE) of 0.28, which outperforms all the baseline methods (e.g., ChatGPT-4).

In summary, this study makes the following contributions:

• Exploration of student's discourse and parenting style: We explore the correlation between microblog university student user's discourse and parents' parenting style from the perspectives of linguistic and emotional expressions. Compared with the students under the positive and mixed parenting styles, students under the negative parenting style tend to use more words related to “health,” “death,” and “love,” and express more negative emotions (i.e., “sad,” “fear,” and “hate”) in their discourses.

• Introduction of a microblog-based parenting style assessment method: We propose the first method of this kind. Performance study shows that this method can infer the parenting styles of students' parents with minor errors. This method facilitates the assessment of more suicide risk factors.

• Creation of a large microblog-based parenting style dataset: The dataset contains 111,258 open posts made by 575 students from November 1, 2019, to October 31, 2022. It facilitates subsequent research in the fields of psychology and computer science.

2 Materials and methods

In this section, we first explored the correlation between students' discourses and parenting style types, and then designed a deep learning-based parenting style assessment method using students' discourses and the found correlations.

2.1 An exploration of the correlation between discourse and parenting style type

2.1.1 Construction of parenting style dataset

Because no dataset related to parenting styles currently exists, we constructed a new parenting style dataset to explore the relationship between students' discourses and the parenting styles they experienced. We built this dataset via Weibo, the largest social media platform in China. To recruit participants, we posted announcements in the “University Student Questionnaire Completion” topic on Weibo. This topic, which gathers over 127,000 university students across the country, is the largest community for questionnaire completion on the platform.

Participants were asked to fill out a questionnaire that included their parenting style, Weibo ID (for collecting their public microblog data later), age, gender, payment account details, and real name. The payment account and real name were collected solely for the purpose of disbursing participant fees. Each participant who completed the questionnaire carefully received a compensation of 5 yuan. We used Python code to collect their open microblog data to construct a parenting style dataset.

To ensure data reliability and prevent individuals from completing the questionnaire multiple times using different accounts, we restricted submissions to one per IP address.

For the parenting style assessment, we used the short-form Chinese version of the EMBU (s-EMBU-C) (11). This 21-item questionnaire is widely applied in China and is well-suited for Chinese participants. It exhibits high reliability and validity: its developer administered the questionnaire to 700 Chinese college students, and the test-retest reliability, measured after 10 weeks, ranges from 0.70 to 0.81. In terms of validity, the questionnaire correlates significantly with the corresponding dimensions of the Chinese version of EMBU, with all correlation coefficients above 0.8. The questionnaire evaluates three parenting dimensions: Rejection (six items), Emotional Warmth (seven items), and Overprotection (eight items). Responses were rated on a four-point Likert scale, ranging from “never occurs” to “always occurs.” Scores for each dimension were calculated as the average of the respective items.
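As a minimal sketch of the scoring rule just described, the snippet below averages the item responses within each dimension. The item-to-dimension layout (first six items Rejection, next seven Emotional Warmth, last eight Overprotection) and the function name are assumptions made for illustration; the actual item order of the s-EMBU-C is not given here.

```python
from statistics import mean

# Hypothetical item layout: the real s-EMBU-C item order is not specified here.
DIMENSION_ITEMS = {
    "rejection": range(0, 6),          # six items
    "emotional_warmth": range(6, 13),  # seven items
    "overprotection": range(13, 21),   # eight items
}


def score_questionnaire(responses):
    """Return the mean response per dimension for one 21-item answer sheet.

    Each response is an integer from 1 ("never occurs") to 4 ("always occurs").
    """
    if len(responses) != 21:
        raise ValueError("expected 21 item responses")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("responses must be on the four-point scale 1..4")
    return {
        dim: mean(responses[i] for i in items)
        for dim, items in DIMENSION_ITEMS.items()
    }


# Toy answer sheet: all Rejection items 1, all Warmth items 4, all Overprotection items 2.
scores = score_questionnaire([1] * 6 + [4] * 7 + [2] * 8)
print(scores)
```

Averaging rather than summing keeps every dimension on the same 1 to 4 range even though the dimensions have different numbers of items.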

To protect participant privacy, we implemented several measures. First, participants' payment account details and real names were permanently deleted after fee disbursement. Second, public Weibo data was collected using participants' Weibo IDs, ensuring that no personal information was exposed. Additionally, participants were fully informed before participation that their public Weibo data would be used for scientific research purposes.

We received a total of 3,183 questionnaires over the period of a month. Approximately one-third of the users did not supply their microblog usernames and were removed because we could not confirm their authenticity. Among the remaining users, a user was regarded as a valid student user if he/she satisfied the following conditions: (1) He/she answered the polygraph question correctly. We added a simple polygraph question: “For this question, please choose ‘occasionally occurs’ for your father and ‘always occurs’ for your mother.” (2) He/she had between 5 and 5,000 followers. Too few followers meant that he/she was not active enough on microblog and too many followers meant that he/she may be an institution or a public figure. (3) He/she had made more than 10 original posts in the past year. Too few original posts indicated low activity on the microblog. The data collection process received approval from the local ethics committee, reference number: 202302220019. After filtering, there were 575 valid students.
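The three screening conditions can be sketched as a simple predicate. The `Respondent` record and its field names are hypothetical, and treating the follower bounds as inclusive is an assumption, since the text only says "between 5 and 5,000".

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical record; field names are illustrative only.
    weibo_id: str                   # empty string if not supplied
    passed_check: bool              # answered the check question as instructed
    followers: int
    original_posts_last_year: int


def is_valid_student(r: Respondent) -> bool:
    """Apply the username requirement and the three screening conditions."""
    return (
        bool(r.weibo_id)
        and r.passed_check
        and 5 <= r.followers <= 5000          # inclusive bounds are an assumption
        and r.original_posts_last_year > 10   # "more than 10 original posts"
    )


candidates = [
    Respondent("user_a", True, 120, 35),
    Respondent("", True, 120, 35),            # no username supplied
    Respondent("user_c", False, 120, 35),     # failed the check question
    Respondent("user_d", True, 20000, 35),    # likely an institution or public figure
    Respondent("user_e", True, 120, 10),      # not active enough
]
valid = [r.weibo_id for r in candidates if is_valid_student(r)]
print(valid)
```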

2.1.2 Description of the parenting style dataset

This dataset contains 575 microblog student users, including 281 males and 294 females. Each student was assigned three scores (sr, se, and so) for the dimensions of Rejection, Emotional Warmth, and Overprotection. The age distribution ranged from 14 to 51 years, with an average age of 24.0 years and a standard deviation of 5.8 years. We wrote a Python program to collect all the open posts of these students dated from November 1, 2019, to October 31, 2022. In total, we collected 111,258 open posts from the 575 students, an average of 193.5 posts per student. The maximum and minimum numbers of posts made by a student were 1,987 and 10, respectively. To develop the parenting style assessment method, we divided the 575 students into a training set of 475 students, a validation set of 50 students, and a test set of 50 students.

Figure 2 shows the distribution of parenting scores in the three dimensions. The average scores of the 575 students in the dimensions Rejection, Emotional warmth, and Overprotection were 1.59, 3.04, and 2.21, respectively.


Figure 2. The distribution of the parenting scores of 575 microblog student users in the dimensions (A) Rejection, (B) Emotional warmth, and (C) Overprotection.

Peng et al. (6) argued that parenting styles can be categorized into three types—positive, mixed, and negative—using Latent Profile Analysis (LPA). Following this approach, we classified the 575 students into these three types based on their scores in the dimensions of Rejection (sr), Emotional warmth (se), and Overprotection (so):

• Positive-parenting-style-type students. Students have higher scores in the dimension Emotional warmth and lower scores in the Rejection and Overprotection, i.e., (se > sr + 1)&(se > so + 1).

• Mixed-parenting-style-type students. Scores in the dimension Emotional warmth are not significantly different from those in the dimensions Rejection and Overprotection, i.e., (|se − sr| < 1)&(|se − so| < 1).

• Negative-parenting-style-type students. Students have lower scores in the dimension Emotional warmth and higher scores in the dimensions Rejection and Overprotection, i.e., (se + 1 < sr)&(se + 1 < so).
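The three rules above can be written as a small function. This is a sketch for illustration: the function name and the "unclassified" label for score profiles that satisfy none of the rules are our own choices.

```python
def parenting_style_type(sr: float, se: float, so: float) -> str:
    """Map (Rejection, Emotional warmth, Overprotection) scores to a style type."""
    if se > sr + 1 and se > so + 1:
        return "positive"
    if abs(se - sr) < 1 and abs(se - so) < 1:
        return "mixed"
    if se + 1 < sr and se + 1 < so:
        return "negative"
    return "unclassified"  # none of the three rules applies


print(parenting_style_type(1.2, 3.5, 2.0))  # positive
print(parenting_style_type(2.0, 2.5, 2.3))  # mixed
print(parenting_style_type(3.5, 1.5, 3.0))  # negative
print(parenting_style_type(1.0, 2.5, 3.0))  # unclassified
```

The three conditions are mutually exclusive, because each places a different constraint on the gap between se and sr, so the order of the checks does not affect the result.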

There are 379 (66.4%), 162 (28.4%), and 16 (2.8%) students who grew up under positive, mixed, and negative parenting style types. Moreover, there are 14 (2.5%) students whose parenting style scores did not satisfy the above three types, e.g., (sr + 1 < se)&(se < so). Those students were not considered in the following analysis. The statistical results are broadly in line with the statistical finding reported by Peng et al. (6) that the proportions of positive, mixed, and negative parenting style types are 69.1%, 22.3%, and 8.6%, respectively. Figure 3 shows the average scores of the three parenting style types in the dimensions Rejection, Emotional warmth, and Overprotection, respectively.


Figure 3. The average scores of the three parenting style types in the three dimensions.

2.1.3 Correlation exploration

We analyzed the difference between the three parenting style types from the perspectives of linguistic expressions and emotional expressions in discourses.

Table 1 lists the topic-related word frequencies used by the three types of students on microblogs. The word frequencies were calculated using the Simplified Chinese Linguistic Inquiry Word Count, TextMind (15). The corpus contains 72,748 open posts from positive-parenting-style-type students, 34,326 from mixed-parenting-style-type students, and 2,366 from negative-parenting-style-type students. There are 11 topics, i.e., “social,” “family,” “friend,” “health,” “work,” “leisure,” “money,” “religion,” “death,” “psychology,” and “love.” The differences between the maximum and secondary maximum word frequencies are significant (p < 0.01) for all topics. We found that compared with the other types of students, positive-parenting-style-type students were more likely to use words related to topics like “social,” “family,” “friend,” “work,” “leisure,” “money,” and “psychology.” Mixed-parenting-style-type students talked more about “religion.” Negative-parenting-style-type students used more words related to “health,” “death,” and “love.” Overall, “social,” “work,” and “leisure” are the most frequently discussed topics.


Table 1. The mean proportions of topic-related words used by 575 microblog student users under the parenting style types “Positive,” “Mixed,” and “Negative.”

From the perspective of emotional expression (Table 2), we observe that students under the positive and mixed parenting style types wrote more posts with positive emotions (i.e., “happy,” “like,” and “surprised”), whereas negative-parenting-style-type students tended to express more negative emotions (i.e., “sad,” “fear,” and “hate”) in their posts. We used the Chinese Affect Lexicon (16), which includes 27,466 words from seven emotional categories, i.e., “happy” (1,967 words), “like” (11,108 words), “surprised” (228 words), “angry” (388 words), “sad” (2,314 words), “fear” (1,179 words), and “hate” (10,282 words). Table 2 lists the mean proportions of emotional words used by students under the three types of parenting styles. We conducted a Student's t-test and verified that the differences between different types of students were significant (p < 0.01) in the emotional categories “happy,” “like,” and “sad.” Because a negative parenting style is a significant predictor of suicidal ideation, more attention should be paid to the negative parenting style type (17, 18).


Table 2. The mean proportions of emotional words used by 575 student users under the parenting style types “Positive,” “Mixed,” and “Negative.”
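The proportions reported in Tables 1 and 2 can be thought of as category-word counts divided by total word counts. The sketch below makes several assumptions: posts are already tokenized (Chinese word segmentation is out of scope), a tiny made-up English lexicon stands in for TextMind or the Chinese Affect Lexicon, and the group value is taken as the mean of per-student proportions, which may differ from the exact aggregation used in the study.

```python
from statistics import mean

# Toy lexicon for illustration only; the real category word lists are far larger.
TOY_LEXICON = {
    "happy": {"joy", "glad"},
    "sad": {"cry", "lonely"},
}


def category_proportions(tokens, lexicon):
    """Share of one student's tokens that fall in each lexicon category."""
    total = len(tokens)
    if total == 0:
        return {cat: 0.0 for cat in lexicon}
    return {
        cat: sum(token in words for token in tokens) / total
        for cat, words in lexicon.items()
    }


def mean_proportions(students_tokens, lexicon):
    """Average the per-student proportions over a group of students."""
    per_student = [category_proportions(toks, lexicon) for toks in students_tokens]
    return {cat: mean(p[cat] for p in per_student) for cat in lexicon}


group = [
    ["joy", "exam", "today", "glad"],  # 2 of 4 tokens are "happy" words
    ["lonely", "rain", "cry", "joy"],  # 1 "happy" word, 2 "sad" words
]
result = mean_proportions(group, TOY_LEXICON)
print(result)  # {'happy': 0.375, 'sad': 0.25}
```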

From Table 2, we further observe that under the negative parenting style type, there is a significant difference (p < 0.01) in the emotional expression between male and female students. Female students tend to express more negative emotions (i.e., “sad,” “fear,” and “hate”) than male students. Under the negative parenting style type, the female students' most common negative emotion is hate. Moreover, we conducted a t-test to investigate the effect of gender on negative emotional expressions. As shown in Table 3, under the negative parenting style type, the effect is more significant than that under a positive or mixed parenting style type.


Table 3. The difference in the expression of negative emotions between male and female students under the positive, mixed, and negative parenting style types.
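For readers who want to reproduce this kind of group comparison, the following sketch computes a two-sample t statistic. The text does not say which variant of the t-test was used, so the equal-variance (pooled) form is an assumption here, and the input values are made up for illustration. Converting the statistic to a p-value requires the t distribution, which is left out to keep the example dependency-free.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic (equal-variance form) and degrees of freedom."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up per-student proportions of negative-emotion words, for illustration only.
female = [0.030, 0.034, 0.028, 0.036]
male = [0.020, 0.018, 0.024, 0.022]
t, df = pooled_t_statistic(female, male)
print(round(t, 2), df)
```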

Overall, the differences in discourse among microblog university student users with different parenting styles indicate that it is feasible to assess parenting styles based on students' language patterns. These variations in students' expressions provide valuable insights, allowing us to focus on specific aspects when evaluating their parenting styles. In the next subsection, we incorporate these discourse differences into the construction of our assessment method to improve its accuracy and reliability.

2.2 Microblog-based parenting style assessment

In this subsection, we propose the microblog-based parenting style assessment method. Figure 4 shows the architecture of the parenting style assessment method. Given the student's open post sequence $P = (post_1, \ldots, post_n)$, the aim of the method is to predict the student's parenting style scores $(s_r, s_e, s_o)$ on the three dimensions (i.e., Rejection, Emotional warmth, Overprotection), where $1 \le s_r, s_e, s_o \le 4$ and $n$ is the number of posts. Because the parenting styles of 14 (2.5%) students' parents do not fall into the category of positive, mixed, or negative, this method assesses the scores directly instead of performing three-class classification.


Figure 4. The architecture of the parenting style assessment method. In the attention layer, a deeper color indicates a higher attention weight.

As illustrated in Figure 4, we first feed the student's open post sequence into Sentence BERT (14) to capture the linguistic and emotional information from the student's discourses. Sentence BERT has shown good performance in extracting linguistic and emotional information from texts (14, 19). The output of Sentence BERT is an embedding sequence with rich linguistic and emotional information. The sequence is represented as $(emb_1, \ldots, emb_n)$, where $emb_i \in \mathbb{R}^{1\times384}$ is the embedding vector of the $i$-th post $post_i$:

$$(emb_1, \ldots, emb_n) = \mathrm{SentenceBERT}(post_1, \ldots, post_n). \tag{1}$$

Given the embedding sequence, a two-layer LSTM is employed to model the relationship between consecutive posts. Since the student may intermittently reveal experiences related to his or her parenting style in the post sequence, the LSTM can connect this valuable information for subsequent in-depth analysis:

$$h_i^1 = \mathrm{LSTM}_1(emb_i, h_{i-1}^1), \quad h_i^2 = \mathrm{LSTM}_2(h_i^1, h_{i-1}^2), \tag{2}$$

where $h_i^1, h_i^2 \in \mathbb{R}^{1\times300}$ are the hidden states of LSTM1 and LSTM2 at step $i$, and $H = (h_1^2, \ldots, h_n^2) \in \mathbb{R}^{n\times300}$ is the output sequence.

Because the attention mechanism has demonstrated great capacity in key information extraction (20, 21), an attention mechanism is then applied to $H$ to find the key posts related to parenting style:

$$Att = \mathrm{Tanh}(H \times W_1 + b_1) \in \mathbb{R}^{n\times1}, \quad H' = Att^{T} \times H \in \mathbb{R}^{1\times300}, \tag{3}$$

where $Att = (e_1, \ldots, e_n)$ is a sequence of attention weights paid to $(post_1, \ldots, post_n)$. A higher value of $e_i$ means that $post_i$ is more related to the parenting style of the parents of the student. $H'$ is the output of the attention mechanism, containing the key discourse information related to the parenting style of the student's parents. $W_1 \in \mathbb{R}^{300\times1}$ and $b_1 \in \mathbb{R}^{1\times1}$ are learnable parameters.
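To make the shapes in Equation 3 concrete, here is a dependency-free sketch of the attention step on toy inputs. The function name and the tiny dimensions (3 posts, hidden size 2 instead of 300) are illustrative assumptions, and the weights are fixed by hand rather than learned.

```python
import math


def attention_pool(H, W1, b1):
    """Compute Att = tanh(H W1 + b1) and H' = Att^T H.

    H  : list of n rows, each a list of d floats (one hidden state per post)
    W1 : list of d floats (a d x 1 weight column)
    b1 : float bias
    """
    att = [
        math.tanh(sum(h_j * w_j for h_j, w_j in zip(row, W1)) + b1)
        for row in H
    ]
    d = len(W1)
    # Weighted sum of the rows of H, one output value per hidden dimension.
    pooled = [sum(att[i] * H[i][j] for i in range(len(H))) for j in range(d)]
    return att, pooled


# Toy example with n = 3 posts and d = 2.
H = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
W1 = [2.0, -2.0]
att, pooled = attention_pool(H, W1, 0.0)
print(att)     # one weight per post
print(pooled)  # a single vector of length d
```

As written in Equation 3, the weights come from a tanh and are not normalized to sum to one, so a post can receive a negative weight; this differs from the more common softmax-based attention.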

2.2.1 Correlation injection module

To obtain a better performance in parenting style assessment, we designed a tailor-made module (Figure 5), injecting the found correlations into the assessment method. These correlations are that students who grew up in different parenting styles have different tendencies in linguistic and emotional expressions. This insight led our method to focus on words related to positive, mixed, and negative parenting styles, allowing our method to achieve more accurate parenting style predictions.


Figure 5. The architecture of the correlation injection layer in the parenting style assessment method.

According to the respective tendencies of the three types of students, we created three topic word sets (Tp, Tm, Tn) and three emotional word sets (Ep, Em, En), respectively. For instance, Tn contains 497 topic words that negative-parenting-style students tend to use, that is, words related to topics like “health,” “death,” and “love.” The statistical information of word sets Tp, Tm, Tn, Ep, Em, En is shown in Table 4. All words are from the Simplified Chinese Linguistic Inquiry Word Count (15) and the Chinese Affect Lexicon (16).


Table 4. The statistical information of word sets Tp, Tm, Tn, Ep, Em, and En, which were used in the correlation injection module.

Let $T_n' \in \mathbb{R}^{497\times300}$ and $E_n' \in \mathbb{R}^{13775\times300}$ be the word embedding sets of all words in $T_n$ and $E_n$. All the word embeddings were calculated by Sentence BERT (14). Let $T_n'' \in \mathbb{R}^{1\times300}$ and $E_n'' \in \mathbb{R}^{1\times300}$ be the average of all word embeddings in $T_n'$ and $E_n'$, respectively. $T_n''$ and $E_n''$ can be seen as the representations of topic and emotional words that negative-parenting-style students tend to use. We used the same approach to generate $T_p'' \in \mathbb{R}^{1\times300}$, $E_p'' \in \mathbb{R}^{1\times300}$, $T_m'' \in \mathbb{R}^{1\times300}$, and $E_m'' \in \mathbb{R}^{1\times300}$.

To inject the found correlations into the assessment method, we concatenated the discourse information $H'$ and the six representations to generate the correlation-aware discourse representation $C$:

$$C = H' \,\|\, T_p'' \,\|\, E_p'' \,\|\, T_m'' \,\|\, E_m'' \,\|\, T_n'' \,\|\, E_n'' \in \mathbb{R}^{1\times2100}, \tag{4}$$

where || is the concatenate operation. C contains not only key discourse information (H′) but also the correlation information between discourse and parenting style (e.g., Tn″), which allows our method to more accurately evaluate the parenting style of the student's parents.
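The averaging and concatenation steps can be sketched as follows. The helper names are hypothetical, and toy vectors of size d = 4 stand in for the 300-dimensional representations; with d = 300 the same code yields the 7 × 300 = 2100 entries of Equation 4.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def inject_correlations(h_prime, word_set_embeddings):
    """Concatenate H' with the mean embedding of each word set, in the given order.

    word_set_embeddings is expected in the order (Tp, Ep, Tm, Em, Tn, En).
    """
    c = list(h_prime)
    for embeddings in word_set_embeddings:
        c.extend(mean_vector(embeddings))
    return c


d = 4
h_prime = [0.1] * d
# Six toy word sets, each holding two made-up word embeddings.
word_sets = [[[float(k)] * d, [float(k + 2)] * d] for k in range(6)]
c = inject_correlations(h_prime, word_sets)
print(len(c))  # 7 * d = 28
```

Because the six averaged vectors do not depend on the student, they can be computed once and reused for every input sequence.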

Finally, given the correlation-aware discourse representation C, the assessment method needs to deeply understand the student's discourse information and refer to the correlation information between the discourse and the parenting style to give the final parenting style assessment. Two fully-connected layers were utilized in this step:

$$U = \mathrm{Tanh}(C \times W_2 + b_2) \in \mathbb{R}^{1\times128}, \quad (s_r, s_e, s_o) = \mathrm{Tanh}(U \times W_3 + b_3), \tag{5}$$

where $U$ is the intermediate result, and $W_2 \in \mathbb{R}^{2100\times128}$, $W_3 \in \mathbb{R}^{128\times3}$, $b_2 \in \mathbb{R}^{1\times128}$, and $b_3 \in \mathbb{R}^{1\times3}$ are learnable parameters. $s_r$, $s_e$, and $s_o$ are the predicted scores of the parenting style of the parents of the student in the dimensions Rejection, Emotional Warmth, and Overprotection. Higher scores indicate higher levels. In this step, we tried one, two, and three fully-connected layers, and found that two layers gave the best assessment performance.
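A dependency-free sketch of the two fully-connected layers is given below. The weights are random placeholders rather than trained parameters, the helper names are our own, and toy sizes (8 inputs, 4 hidden units) stand in for 2100 and 128.

```python
import math
import random


def linear_tanh(x, W, b):
    """Return tanh(x W + b) for a row vector x, a weight matrix W (in x out), and bias b."""
    return [
        math.tanh(sum(x_i * W[i][j] for i, x_i in enumerate(x)) + b[j])
        for j in range(len(b))
    ]


def predict_scores(c, W2, b2, W3, b3):
    """Map the representation C to three scores through two tanh layers."""
    u = linear_tanh(c, W2, b2)     # intermediate result U
    return linear_tanh(u, W3, b3)  # (s_r, s_e, s_o), each strictly between -1 and 1


rng = random.Random(0)
d_in, d_hidden = 8, 4  # toy sizes standing in for 2100 and 128
W2 = [[rng.uniform(-0.1, 0.1) for _ in range(d_hidden)] for _ in range(d_in)]
W3 = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(d_hidden)]
scores = predict_scores([0.5] * d_in, W2, [0.0] * d_hidden, W3, [0.0] * 3)
print(scores)
```

Because the last layer ends in a tanh, the raw outputs lie in (−1, 1), so the questionnaire scores have to be rescaled to that range before they can serve as training targets.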

3 Results

3.1 Effectiveness of the parenting style assessment

We used Mean Square Error (MSE) and Mean Absolute Error (MAE) to evaluate the performance of the parenting style assessment method. The MSE and MAE are defined as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - y_i'|, \quad \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - y_i')^2, \tag{6}$$

where yi′ is the predicted score of the parenting style in the dimensions Rejection, Emotional warmth, and Overprotection. yi is the label acquired from the questionnaire, and N is the number of students. We normalized yi from 1 ≤ yi ≤ 4 to −1 ≤ yi ≤ 1 by the following equation:
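The stated ranges are consistent with a linear rescaling, y ← (2y − 5)/3, which sends 1 to −1 and 4 to 1. The sketch below assumes that form (it is our assumption, not a formula taken from the text) and implements the two metrics of Equation 6 on made-up values.

```python
def normalize(y):
    """Linearly map a score from [1, 4] to [-1, 1] (assumed form of the rescaling)."""
    return (2 * y - 5) / 3


def denormalize(y_norm):
    """Inverse of normalize: map a value from [-1, 1] back to [1, 4]."""
    return (3 * y_norm + 5) / 2


def mae(y_true, y_pred):
    """Mean absolute error over paired values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error over paired values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


labels = [normalize(1), normalize(4)]  # [-1.0, 1.0]
predictions = [-0.5, 0.5]              # made-up model outputs
print(mae(labels, predictions))  # 0.5
print(mse(labels, predictions))  # 0.25
```

Note that errors computed on the normalized scale are smaller than errors on the original 1 to 4 scale by a factor of 2/3 for MAE and 4/9 for MSE under this rescaling.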

Moreover, the hyper-parameter settings of the parenting style assessment method are shown in Table 5. For each student, we selected his/her last 100 posts as input.
