2020 Mathematical Contest in Modeling (MCM/ICM): Problem C Paper

Problem Chosen: C

2020 MCM/ICM Summary Sheet

Team Control Number: 2000316

 


In this paper, we build a comprehensive product evaluation model, track how product reputation changes over time, predict the future success or failure of each product, examine the impact of specific star ratings on subsequent reviews, and measure the correlation between review sentiment scores and star ratings.

 

We first preprocess the given data set by deleting irrelevant and interfering data. We then process the review bodies: we delete non-text characters, convert uppercase letters to lowercase, and correct spelling errors. Based on the preprocessed data, we perform sentiment analysis on the reviews to get the sentiment score of each review. Then, based on the number of product-related words in the review, the length of the review, the number of modifiers in the review, and the sentiment score of the review, we calculate a quality score for each review. Based on star rating, review quality score, and review sentiment score, we calculate the comprehensive score of each row in the data set using the entropy method. We then analyze the trend of each product's comprehensive score over time, that is, the change of product reputation over time, and use this trend to predict the future success or failure of the product. Moreover, by analyzing the correlation between the score at each inflection point of the comprehensive score-time curve and the average star rating of the preceding period, we conclude that a specific star rating does not affect subsequent reviews. We also apply correlation analysis, using the sentiment scores calculated above, to test whether the sentiment score of a review is strongly correlated with its star rating.

 

Finally, we advise Sunshine Company on preparing its new products for online sale, based on the models and rules we explored. During the modeling process, we extract and filter product-related words from the reviews. For example, the feature words for hair dryers include xxx, xxx, and those for baby pacifiers are xxx, xxx. These words extracted from the reviews reflect the product characteristics that consumers value most, and hence the aspects that Sunshine Company needs to pay attention to when designing products.

 

In addition, analyzing the historical data of competitors to predict the future direction of a product can help Sunshine Company's new products survive better in the market.

Contents

Introduction
Restatement of Problems
General Definitions
Models and Algorithms
Pre-process of Data
Calculate Review’s Sentiment Score
Calculate Review’s Quality Score
The Relationship of Star Ratings, Reviews, and Helpfulness Ratings
Comprehensive Evaluation Model (Most Informative Ratings and Reviews)
Product Reputation’s Change Over Time
Product’s Future Success or Failure
The Relationship Between Specific Star Ratings and Reviews
The Relationship Between Quality Descriptors and Rating Levels in Reviews

Introduction

Nowadays online shopping has become part of our lives. After shopping online, customers usually share their views on the product they bought and the service they experienced through star ratings and reviews. Other customers can browse these ratings and reviews while shopping and rate each review as helpful or not, which is called a helpfulness rating. For companies, these data give insight into the markets they participate in: by analyzing the data, a company can better understand customers' preferences and therefore improve its products and services to succeed in the future.

Sunshine Company is preparing to introduce and sell three new products in the online market: a microwave oven, a pacifier, and a hair dryer. It therefore wants to develop an online marketing strategy and identify essential product design features that would enhance its competitiveness, by analyzing customer feedback on similar products from competing companies.

To help Sunshine Company achieve its goals, we analyzed the given data sets to find patterns, relationships, and measures, and we provide Sunshine Company with suggestions for its future online product sales.

Restatement of Problems

The Sunshine Company asks us to:

  1. Inform their online sales strategy.
  2. Identify potentially important design features that would enhance product desirability.
  3. Find the ways in which these time-based data interact that may help Sunshine Company craft successful products.

 

In essence, this problem requires us to complete six tasks:

  1. Find the ways the star ratings, reviews, and helpfulness ratings interact with each other.
  2. Identify the most informative data measures based on ratings and reviews.
  3. Identify time-based measures and patterns that suggest a product’s reputation changes over time.
  4. Determine combinations of text-based measure(s) and ratings-based measures that best indicate a potentially successful or failing product.
  5. Figure out if specific star ratings incite more reviews.
  6. Figure out if specific quality descriptors of text-based reviews are strongly associated with rating levels.

General Definitions

Star rating: The 1-to-5 star rating given in the review.

Review: The review text.

Helpful Votes: Number of helpful votes the review received.

Total Votes: Number of total votes the review received.

Helpful Vote Ratio (HVR): Helpful votes as a proportion of total votes. That is,

$$\mathrm{HVR} = \frac{\text{Helpful Votes}}{\text{Total Votes}}$$

 

Vine: Customers are invited to become Amazon Vine Voices based on the trust that they have earned in the Amazon community for writing accurate and insightful reviews. Amazon provides Amazon Vine members with free copies of products that have been submitted to the program by vendors. Amazon doesn't influence the opinions of Amazon Vine members, nor do they modify or edit reviews.

 

Verified Purchase: A “Y” indicates Amazon verified that the person writing the review purchased the product at Amazon and didn't receive the product at a deep discount.

Models and Algorithms

Pearson Correlation Coefficient: The Pearson correlation coefficient measures how closely two data sets fall on a straight line, that is, the strength of the linear relationship between two interval-scale variables.
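
As a rough illustration of this definition, the coefficient can be computed directly from its textbook formula (covariance divided by the product of the standard deviations). The sketch below uses only the Python standard library; the function name `pearson` and the toy numbers are ours and are not taken from the contest data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance numerator and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Toy data (not from the contest data set): star ratings vs. sentiment scores.
stars = [5, 4, 1, 3, 5]
sentiment = [0.8, 0.5, -0.6, 0.1, 0.6]
r = pearson(stars, sentiment)
```

A value near +1 or -1 indicates an almost perfectly linear relationship, while a value near 0 indicates little or no linear relationship.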


The Entropy method
In information theory, entropy is a measure of uncertainty. The greater the amount of information, the smaller the uncertainty and the entropy; the smaller the amount of information, the greater the uncertainty and the entropy. Based on this property, we can judge the randomness and disorder of an event by calculating its entropy, and we can also use the entropy to judge the degree of dispersion of an indicator. The greater the dispersion of an indicator, the greater its impact on the comprehensive evaluation.

Therefore, the information entropy can be used to calculate the weight of each indicator according to the degree of variation of each indicator to provide a basis for comprehensive evaluation of multiple indicators.

 

TextBlob: TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

Pre-process of Data

 

 

 

 

Picture-1 The data pre-processing workflow

 

Step-1: Remove irrelevant rows

We found that the given data set contains reviews that do not describe the product under analysis. We therefore filter on the column product_title: if the product_title of a row does not contain a word related to the analyzed product, we delete that row.

Step-2: Remove rows where vine = ‘N’ and verified purchase = ‘N’

Reviews that are neither Vine reviews nor verified purchases are treated as invalid, because it cannot be confirmed that the reviewer actually received or used the product.

Step-3: Remove non-text from reviews

In this step we delete the non-text parts of the review, which include numbers, symbols, URLs, and certain specific character sequences such as <br />.

Step-4: Case conversion of reviews

In this step we convert all capital letters in the review to lower case, which makes later processing of the review easier.

Step-5: Remove stop words from reviews

Stop words are words that appear often but carry little meaning for classification, such as a, and, the, etc.

Step-6: Spell correction of reviews

We found misspelled words in the reviews. To make later processing easier, we correct the spelling in this step.
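
Steps 1 to 5 can be sketched in a few lines of standard-library Python. This is only an illustration under our own assumptions, not the code used for the paper: the column name `verified_purchase`, the helper names `keep_row` and `clean_review`, and the very short stop-word list are all hypothetical, and Step 6 (spell correction) is left out because it needs an external library.

```python
import re

# A tiny illustrative stop-word list; a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of"}

def keep_row(row, keyword):
    """Steps 1-2: keep rows about the product that are Vine or verified."""
    on_topic = keyword in row["product_title"].lower()
    trusted = (row["vine"].upper() == "Y"
               or row["verified_purchase"].upper() == "Y")
    return on_topic and trusted

def clean_review(text):
    """Steps 3-5: return the lower-case word tokens left after cleaning."""
    text = re.sub(r"<[^>]+>", " ", text)        # HTML remnants such as <br />
    text = re.sub(r"https?://\S+", " ", text)   # URLs
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # digits and symbols
    tokens = text.lower().split()               # Step 4: case conversion
    return [t for t in tokens if t not in STOP_WORDS]  # Step 5: stop words
```

For example, `clean_review("It is GREAT!<br />Dries hair in 5 min, see http://example.com")` keeps only the tokens great, dries, hair, in, min and see.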

 

Calculate Review’s Sentiment Score

The sentiment score of a review is the sentiment index of the review, which depends on the sentiment words in the review. To get the sentiment index, we used the sentiment function of the TextBlob tool. The sentiment value lies in the range [-1, 1]: positive numbers represent positive feelings and negative numbers represent negative feelings. The distribution of sentiment scores is as follows:

 

 

 

 

Picture-2 the emotion map

Calculate Review’s Quality Score

The quality of a review is an important description of a review. Based on experience and the data given, we picked seven indicators:

Score Indicators

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Vocabulary referring to the product | Count of product-related words appearing in the review | | | | |
| Length of review | Review length (measured in words) | | | | |
| Number of modifiers | Number of modifiers in the review | | | | |
| Helpful vote ratio | Helpful votes / total votes | | | | |
| Emotional expression intensity | Absolute value of the review's sentiment score | | | | |

Note 1: A Vine review (vine = Y) adds 2 points.

Note 2: A verified purchase adds 1 point.

Table-1 Review Quantitative Rating Method

 

Scoring criteria

 

The figures below are drawn from the hair_dryer data set; the other two data sets are processed in the same way.

1. Product-related vocabulary

To set a standard based on product-related vocabulary, we chose Python as the tool. First, we used NLP methods to select all noun phrases in the reviews, then counted the frequency of these noun phrases and sorted them from highest to lowest, which gave the following figure:

 

 

 

Picture-3 words frequency

Noun phrases sorted from highest to lowest frequency (partial)

Then we selected product-related nouns in descending order of frequency, put them into a dictionary, and traversed the reviews to count the product-related words in each review, which gave the following figure:

Picture-4 Number of product-related words per review

By analyzing these two figures, we obtain the corresponding standard:

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Vocabulary referring to the product | Count of product-related words appearing in the review | 0 | 1-5 | 5-10 | >10 |

Table-2 Indicator of product-related vocabulary

 

2. Comment length

To get a measure of comment length, we counted the length (number of words) of each comment and got the following distribution:

Picture-5 Review length

Then we analyzed the median and mean of the review lengths and obtained the following criteria:

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Length of review | Review length (measured in words) | <10 | 10-30 | 30-60 | >60 |

Table-3 Review length

 

3. The number of modifiers

We measure this indicator by finding all adjectives and counting the number of adjectives in each comment, which gives the following figure:

Picture-6 Number of adjectives/modifiers

We again calculated the mean and median and obtained the following criteria:

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Number of modifiers | Number of modifiers in the review | 0-2 | 3-4 | 5-10 | >10 |

Table-4 Number of modifiers

4. Helpful vote ratio

We get the support rate of each review as helpful_votes / total_votes. The distribution is as follows:

Picture-7 Helpful vote ratio

After calculating the mean and median, we obtain the standard:

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Helpful vote ratio | Helpful votes / total votes | 0-0.005 | 0.005-0.01 | 0.01-0.015 | >0.15 |

Table-5 Helpful vote ratio

 

5. The intensity of emotional expression

We reuse the sentiment score of each comment computed above and take its absolute value, which gives the following table:

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Emotional expression intensity | Absolute value of the review's sentiment score | \|sentiment\|=0 | \|sentiment\|<0.4 | \|sentiment\|<0.8 | \|sentiment\|>=0.8 |

Table-6 The intensity of emotional expression

 

In this way, the final measurement standard was obtained as follows.

Review Quantitative Rating Method

| Indicator | Scoring method | score=1 | score=2 | score=3 | score=4 |
| --- | --- | --- | --- | --- | --- |
| Number of product-related words | Count of product-related words appearing in the review | 0 | 1-5 | 5-10 | >10 |
| Length of review | Review length (measured in words) | <10 | 10-30 | 30-60 | >60 |
| Number of modifiers | Number of modifiers in the review | 0-2 | 3-4 | 5-10 | >10 |
| Helpful vote ratio | Helpful votes / total votes | 0-0.005 | 0.005-0.01 | 0.01-0.015 | >0.15 |
| Emotional expression intensity | Absolute value of the review's sentiment score | \|sentiment\|=0 | \|sentiment\|<0.4 | \|sentiment\|<0.8 | \|sentiment\|>=0.8 |

Note 1: A Vine review (vine = Y) adds 2 points.

Note 2: A verified purchase adds 1 point.

Table-7 Review Quantitative Rating Method

 

We then implemented a scoring model in Python based on this standard and obtained the Review’s Quality Score shown in the picture below.

Picture-8 Review’s Quality Score
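
A scoring model of this kind might look like the sketch below. It is not the code used for the paper. The function names are ours, and several boundary choices are assumptions, because the table leaves them open: a count of exactly 5 product words is scored 2, a length of exactly 30 words is scored 2, and the top helpful-vote band is assumed to start at 0.015 (the table lists ">0.15", which would leave the range 0.015 to 0.15 uncovered).

```python
def band(value, upper_bounds):
    """Score 1-4: 1 plus the number of upper bounds that value exceeds."""
    return 1 + sum(value > b for b in upper_bounds)

def sentiment_band(polarity):
    """Emotional expression intensity, scored from |sentiment|."""
    strength = abs(polarity)
    if strength == 0:
        return 1
    if strength < 0.4:
        return 2
    if strength < 0.8:
        return 3
    return 4

def review_quality_score(product_words, length, modifiers, hvr, polarity,
                         vine=False, verified=False):
    """Sum of the five indicator scores plus the Vine and verified bonuses."""
    score = band(product_words, (0, 5, 10))   # 0 | 1-5 | 6-10 | >10 (assumed)
    score += band(length, (9, 30, 60))        # <10 | 10-30 | 31-60 | >60 (assumed)
    score += band(modifiers, (2, 4, 10))      # 0-2 | 3-4 | 5-10 | >10
    score += band(hvr, (0.005, 0.01, 0.015))  # top band assumed to start at 0.015
    score += sentiment_band(polarity)
    score += 2 if vine else 0                 # Note 1: Vine bonus
    score += 1 if verified else 0             # Note 2: verified-purchase bonus
    return score
```

Under these assumptions the score ranges from 5 (every indicator in its lowest band, no bonus) to 23 (every indicator in its highest band, plus both bonuses).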

The Relationship of Star Ratings, Reviews, and Helpfulness Ratings

In this part, we discuss the relationship among star ratings, reviews and help_rating.

First, we quantify the three indicators. Star ratings are already quantified as 1, 2, 3, 4 or 5 stars. We set help_rating = helpful votes / total votes when total votes is not 0; when total votes is 0, we define the Helpful Vote Ratio as 0.01.

Stars = [1,2,3,4,5]

As for reviews, we classify the Review’s Sentiment Score into five levels.

Review’s Sentiment Level = [0,1,2,3,4]

| Level | Meaning |
| --- | --- |
| 0 | Very negative |
| 1 | Negative |
| 2 | Neutral |
| 3 | Positive |
| 4 | Very positive |

Table-8 Review’s sentiment level
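
The quantification of help_rating and of the sentiment level can be sketched as follows. The 0.01 fallback for reviews without votes comes from the text above. The cut points that split the sentiment score into five levels are not stated in the paper, so the equal-width bands used here (boundaries at -0.6, -0.2, 0.2 and 0.6) are purely an assumption for illustration, as are the function names.

```python
def help_rating(helpful_votes, total_votes):
    """Helpful votes / total votes, with 0.01 for reviews that have no votes."""
    if total_votes == 0:
        return 0.01
    return helpful_votes / total_votes

def sentiment_level(polarity):
    """Map a polarity in [-1, 1] to a level 0-4 (cut points are assumed)."""
    if polarity < -0.6:
        return 0   # very negative
    if polarity < -0.2:
        return 1   # negative
    if polarity <= 0.2:
        return 2   # neutral
    if polarity <= 0.6:
        return 3   # positive
    return 4       # very positive
```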

 

Then we calculate the Pearson Correlation Coefficient among star_rating, help_rating and review as follows.

 

                               

 

 

Table-9 Pearson Correlation Coefficient of hair dryer       Table-10 Pearson Correlation Coefficient of microwave

 

 

 

Table-11 Pearson Correlation Coefficient of pacifier

 

From the data in the tables above, we can conclude that, for every product, the correlation coefficients between each pair of variables are close to 0. Although the coefficients are statistically significant (the corresponding significance levels are 0.01, 0.05 and 0.01), their small magnitude indicates that there is essentially no linear relationship among star ratings, reviews and help_rating.

Based on this result, we go on to build a comprehensive evaluation of customers' opinions of a product from these three essentially uncorrelated factors.

We first selected the product with the most reviews in each of the three categories and analyzed it. They are: Remington ac2015 t|studio salon collection pearl emitted hair dryer, deep purple (hair_dryer); danby 0.7 cu.ft. Countertop microwave (microwave); and philips avent bpa free soothie pacifier, 0-3 months, 2 pack, packaging may vary (pacifier, 734 reviews in total). Their numbers of reviews are shown in the following charts.

 

 

 

            Picture-9 hair_dryers’ total number of reviews                          Picture-10 microwaves’ total number of reviews

 

According to the previous discussion, we have the values of the three factors. We now use AHP (the analytic hierarchy process) to assign a weight to each factor and compute the product score from the three factors. The steps are as follows:

1. Establish the hierarchical structure model

Take customer satisfaction as the goal and consider star_rating, help_rating, and review_score as the criteria. According to their mutual relations, they are divided into the highest and the lowest levels, and a hierarchical structure diagram is drawn.

2. Construct the judgment matrix

The factors are compared pairwise for importance. The nine importance levels given by Saaty and their values are listed in Table-12. The matrix formed by these pairwise comparisons is called the judgment matrix. The judgment matrix has the following properties:

$$a_{ij} > 0, \qquad a_{ji} = \frac{1}{a_{ij}}, \qquad a_{ii} = 1$$

The scale for the judgment matrix element $a_{ij}$ is as follows:

| Importance of factor i over factor j | Quantitative value |
| --- | --- |
| Equally important | 1 |
| Slightly more important | 3 |
| More important | 5 |
| Highly important | 7 |
| Extremely important | 9 |
| Intermediate value between two adjacent judgments | 2, 4, 6, 8 |

Table-12

 

The judgment matrix between the three is obtained:

 

 

 

                   Table-13                                                                        picture-11

 

From the figure, we can see that the weight of review_score is 0.75, and the weights of star_rating and help_rating are each 0.125. The overall evaluation of the product by the customer is given by the following formula:

Satisfaction = 0.75 * review_score + 0.125 * star_rating + 0.125 * help_rating
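
To make the weighting step concrete, the sketch below derives AHP weights with the common column-normalization approximation and then applies the satisfaction formula. The judgment matrix used in the paper (Table-13) is not reproduced here, so the matrix below is hypothetical: it simply rates review_score as 6 times as important as each of the other two factors, which happens to yield the reported weights. The function names and the choice to treat each lower bound of the score bands as inclusive are also our assumptions.

```python
def ahp_weights(matrix):
    """Approximate AHP priority weights: normalize columns, average rows."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
            for i in range(n)]

# Hypothetical judgment matrix (not the paper's Table-13).
# Row/column order: review_score, star_rating, help_rating.
judgment = [
    [1,     6, 6],
    [1 / 6, 1, 1],
    [1 / 6, 1, 1],
]
weights = ahp_weights(judgment)   # approximately [0.75, 0.125, 0.125]

def satisfaction(review_score, star_rating, help_rating, w=weights):
    """Weighted sum of the three factors."""
    return w[0] * review_score + w[1] * star_rating + w[2] * help_rating

def satisfaction_level(score):
    """Three satisfaction bands; each lower bound is assumed inclusive."""
    if score < 1.5:
        return "Negative"
    if score < 2.5:
        return "Neutral"
    return "Positive"
```

For a perfectly consistent matrix such as this one, the column-normalization estimate coincides with the principal eigenvector that AHP formally calls for.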

 

Then we get a comprehensive score for each piece of data, which falls into one of the following three levels:

| Score | Satisfaction |
| --- | --- |
| 0-1.5 | Negative |
| 1.5-2.5 | Neutral |
| 2.5-4 | Positive |

Table-14

 

The analysis results of three products are obtained, as shown in the following three figures

 

 

 

              Table-15 remington ac2015 t|studio salon collection                                         Table-16

 

We can see that among the 534 reviews of this product (Table-15), 252 are positive (47.2%), 163 are negative, and 119 are neutral.

We can see that among the 363 reviews of this product (Table-16), 146 are positive (40.2%), 113 are negative, and 104 are neutral.

 

 

 

 

                                                         Table-17

 

We can see that among the 734 reviews of this product (Table-17), 308 are positive (42.0%), 243 are negative, and 183 are neutral.

Comprehensive Evaluation Model (Most Informative Ratings and Reviews)

To find the most informative user feedback, we take star rating, review quality score, and review sentiment score as the measurement indicators and use the entropy method to weight the three indicators, which gives the comprehensive score of each feedback. The calculation steps are as follows:

 

1)Determine the Indicators

 

As mentioned above, they are star rating, review quality score and review sentiment score.

 

2) Standardize the indicators.

 

As the units of measurement of the indicators are not uniform, we must standardize them before computing the comprehensive score, that is, convert the absolute values of the indicators into relative values so that different indicators become comparable. Moreover, positive and negative indicators have different meanings (for a positive indicator a higher value is better; for a negative indicator a lower value is better), so we use different normalization formulas for the two kinds of indicators:

Positive indicators:

$$x'_{ij} = \frac{x_{ij} - \min(x_{1j}, \dots, x_{nj})}{\max(x_{1j}, \dots, x_{nj}) - \min(x_{1j}, \dots, x_{nj})}$$

Negative indicators:

$$x'_{ij} = \frac{\max(x_{1j}, \dots, x_{nj}) - x_{ij}}{\max(x_{1j}, \dots, x_{nj}) - \min(x_{1j}, \dots, x_{nj})}$$

Here $x_{ij}$ is the value of the j-th indicator of the i-th feedback (i = 1, 2, …, n; j = 1, 2, …, m). For convenience, the normalized data is still recorded as $x_{ij}$.

3) Calculate the proportion of the i-th feedback under the j-th indicator:

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}}$$

4) Calculate the entropy of the j-th indicator:

$$e_j = -k \sum_{i=1}^{n} p_{ij} \ln p_{ij}$$

where $k = 1 / \ln n > 0$, so that $e_j \ge 0$.

5) Calculate the information entropy redundancy:

$$d_j = 1 - e_j$$

6) Calculate the weight of each indicator:

$$w_j = \frac{d_j}{\sum_{j=1}^{m} d_j}$$

7) Calculate the overall score for each feedback:

$$s_i = \sum_{j=1}^{m} w_j x_{ij}$$

 

Due to the large amount of data, we implemented the above algorithm through MATLAB, and obtained the comprehensive score of each feedback in the three data sets, which was used in subsequent analysis.
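
For readers who prefer code to formulas, the entropy method can be sketched in standard-library Python as below. This is not the MATLAB implementation mentioned above, only an illustration of the standard textbook procedure; the function name `entropy_weights`, the handling of constant columns, and the convention 0 · ln 0 = 0 are our assumptions.

```python
from math import log

def entropy_weights(rows, positive=None):
    """Entropy-method weights and scores for an n x m table of indicators."""
    n, m = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two rows")
    positive = positive or [True] * m
    # Min-max normalization, reversed for negative indicators.
    norm = []
    for j, col in enumerate(zip(*rows)):
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            norm.append([0.0] * n)      # a constant column carries no information
        elif positive[j]:
            norm.append([(x - lo) / span for x in col])
        else:
            norm.append([(hi - x) / span for x in col])
    # Proportions, entropy e_j and redundancy d_j = 1 - e_j per indicator.
    k = 1.0 / log(n)
    redundancy = []
    for col in norm:
        total = sum(col)
        if total == 0:
            redundancy.append(0.0)
            continue
        p = [x / total for x in col]
        e = -k * sum(v * log(v) for v in p if v > 0)   # 0 * ln 0 taken as 0
        redundancy.append(1.0 - e)
    d_sum = sum(redundancy)
    if d_sum == 0:
        raise ValueError("all indicators are constant")
    # Weights proportional to redundancy, then a weighted sum per row.
    weights = [d / d_sum for d in redundancy]
    scores = [sum(weights[j] * norm[j][i] for j in range(m)) for i in range(n)]
    return weights, scores
```

Each row of `rows` holds the indicator values of one feedback, for example `[star_rating, review_quality_score, emotion_value]`. The scores returned here lie between 0 and 1; any further rescaling is a separate choice.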

Product Reputation’s Change Over Time

The comprehensive evaluation combines three components: the star rating, which most directly shows consumer satisfaction with the product; the review quality score, which shows how relevant the review content is to the product; and the sentiment score of the review, which expresses the reviewer's attitude toward the product in more detail. The comprehensive score therefore expresses the user's overall attitude toward the product and can be used to represent product reputation. Analyzing the change of product reputation over time thus becomes analyzing the comprehensive score of the product over time. The following table is an excerpt from our processed data set (total_score is the comprehensive score).

 

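| star_rating | review_quality_score | emotion_value | total_score |
| --- | --- | --- | --- |
| 5 | 11 | 0.75 | 121.33 |
| 5 | 9 | 0.40 | 100.31 |
| 1 | 11 | 0.50 | 68.18 |
| 5 | 9 | 0.50 | 101.37 |
| 1 | 9 | 0.24 | 48.15 |
| 4 | 10 | 0.25 | 94.74 |
| 5 | 14 | 0.30 | 142.47 |
| 5 | 12 | 0.35 | 125.76 |
| 5 | 10 | 0.60 | 111.08 |

Table-18 Comprehensive score of each feedback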

 

Using a pivot table, we obtain the trend of the comprehensive score of the three products by quarter, as shown in the figures below.
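
The quarterly aggregation behind such a pivot table can be sketched as follows. The paper does not state which aggregate was used, so taking the mean score per quarter is an assumption, as is the function name `quarterly_mean`. The dates in the toy records are invented; the scores are borrowed from Table-18.

```python
from collections import defaultdict
from datetime import date

def quarterly_mean(records):
    """Average total_score per (year, quarter) from (date, score) pairs."""
    buckets = defaultdict(list)
    for day, score in records:
        quarter = (day.month - 1) // 3 + 1
        buckets[(day.year, quarter)].append(score)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

# Toy records: the dates are invented, the scores are taken from Table-18.
records = [
    (date(2015, 1, 15), 121.33),
    (date(2015, 2, 3), 100.31),
    (date(2015, 5, 20), 68.18),
]
trend = quarterly_mean(records)
```

The result maps each (year, quarter) key to its mean score in chronological order, which is the series a trend line or a forecasting model would then be fitted to.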

 

Baby pacifier:

                                    

 

 

Picture-12 Evaluation of baby pacifiers over time

 

From the line chart, the overall evaluation of baby pacifiers has become more stable over the years. Judging from the forecast curve (dotted line), the reputation of baby pacifiers will not change much in the future and may increase slightly.

 

Hair dryer:

                                            

 

 

Picture-13 Evaluation of hair dryer over time

 

We analyze the time series using the ARIMA model. From the graph, the product evaluation fluctuated greatly from 2002 to 2006 (probably because the amount of data in those years was not large enough) and then stabilized. Judging from the forecast curve, the reputation of hair dryers may decrease slightly in the future but will remain stable overall.

 

Microwave:

                                  

 

 

Picture-14 Evaluation of microwave over time

 

From the graph, the overall evaluation of microwave ovens is on a downward trend, and the fluctuation of reputation is larger than that of the first two products. Judging from the forecast curve, the reputation of microwave ovens will decline in the future.

Product’s Future Success or Failure

In this part we discuss the future success or failure of the product based on the comprehensive score. From the analysis of the previous question, we obtained a graph of the reputation of each product (its comprehensive score) over time. Take microwave ovens as an example. The reputation of microwave ovens has decreased year by year, which means that consumers' satisfaction with online purchases of microwave ovens has decreased year by year. We therefore expect that the proportion of online purchases of microwave ovens will decrease in the future, which would lead to product failure.

The Relationship Between Specific Star Ratings and Reviews

For the 734 reviews of the pacifier product, we computed the correlation between each review and the average star_rating of the previous month. The results are shown in the following table:

                                                                           

Table-19

From the data in the table above, we can conclude that the correlation coefficient is close to 0, with a corresponding significance level of 0.01. The relationship between the two is therefore weak, which suggests that specific star ratings do not noticeably affect subsequent reviews.
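
One way to build the paired data for such an analysis is sketched below: each review is paired with the mean star rating of the preceding calendar month. The paper does not say exactly which per-review measure was used, so the sketch takes it as a generic `value` field; the function name and the decision to skip reviews whose preceding month has no ratings are assumptions.

```python
from collections import defaultdict
from datetime import date

def previous_month_pairs(reviews):
    """Pair each review's value with the prior calendar month's mean stars.

    reviews: list of (date, star_rating, value) tuples. Reviews whose
    previous month has no ratings are skipped.
    """
    monthly = defaultdict(list)
    for day, stars, _ in reviews:
        monthly[(day.year, day.month)].append(stars)
    pairs = []
    for day, _, value in reviews:
        if day.month == 1:
            prev = (day.year - 1, 12)
        else:
            prev = (day.year, day.month - 1)
        if prev in monthly:
            pairs.append((sum(monthly[prev]) / len(monthly[prev]), value))
    return pairs
```

The two columns of the returned pairs can then be fed to any correlation routine to obtain a coefficient of the kind reported in Table-19.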

The Relationship Between Quality Descriptors and Rating Levels in Reviews

In the previous sections, we performed sentiment analysis on the reviews and obtained the sentiment score of each review. Here we explore whether the sentiment score of a review is strongly related to its star rating. To this end, we ran a correlation analysis between the sentiment score and the star rating. The results are as follows:

 

 

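| | star_rating | emotion_value |
| --- | --- | --- |
| star_rating | 1 | |
| emotion_value | 0.359392029 | 1 |

Table-20 pacifier

| | star_rating | emotion_value |
| --- | --- | --- |
| star_rating | 1 | |
| emotion_value | 0.416941057 | 1 |

Table-21 hair dryer

| | star_rating | emotion_value |
| --- | --- | --- |
| star_rating | 1 | |
| emotion_value | 0.460324 | 1 |

Table-22 microwave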

 

According to the three correlation coefficients above, the sentiment score of a review is positively correlated with its star rating for all three products. The coefficients (about 0.36 to 0.46) indicate a moderate correlation.

 

posted @ 2021-02-28 11:51 小璐同学