A CROSS-PLATFORM MARKET STRUCTURE ANALYSIS METHOD USING ONLINE PRODUCT REVIEWS

Studies have shown that online product reviews can indicate the position of a competitive brand. Even though reviews on different platforms may express different opinions, most studies are based on only one platform. This may lead to an inaccurate analysis of market structure. To solve this problem, we develop a novel market structure analysis based on multi-attribute group decision-making which can integrate reviews from different platforms. Multiple platforms more comprehensively reflect the market than single platforms do. To verify the effectiveness of the proposed method, we conduct a case study of mobile phone reviews across three top e-commerce platforms in China. In addition, we propose a process to generate priorities for product-attribute improvements using a cross-platform market structure analysis method. Our experiments demonstrate the effectiveness of the proposed method.


Introduction
Market structure analysis is the identification of product-market boundaries and structure (Srivastava et al., 1981). Through market structure analysis, people can understand the position of products or brands in a competitive space in order to make strategic decisions. As more and more information is published by consumers on forums, blogs, e-commerce sites or other Internet platforms, the subject of how to find valuable information in user-generated content has become a research hotspot (Netzer et al., 2012). In theory, through monitoring consumer comments and discussions about products in a particular category, firms can better understand the following: the features of their products compared to those of competitors; the market structure and competitive landscape; and marketing opportunities (Netzer et al., 2012). This points to the possibility of drawing upon online user-generated content for market structure analysis.
However, user-generated content is vast, unstructured, and difficult to analyze. Researchers have begun using different methods to address this conundrum. Methods based on text mining are commonly used, as they are effective and efficient. Text mining-based market structure analysis usually consists of two steps. The first involves obtaining the attributes of the product or brand and the score of each attribute from the user-generated content. The second step involves using a model to evaluate a product or brand's market position based on its attributes and scores and then conducting a market structure analysis. For instance, Chen et al. (2015) proposed a novel Latent Dirichlet Allocation-based market structure analysis framework to identify product features. The method they used was largely automated, improving the efficiency of the analysis.
Despite this benefit, there are currently two problems with text mining-based market structure analysis. First, most studies are based on only one platform and thus ignore differences among platforms, which could lead to a biased analysis of market structure. Second, even though the sentiment in comments tends to reflect consumer perceptions of product attributes (Koh et al., 2010), text mining-based market analysis focuses too much on the extraction of product attributes rather than on making full use of the positive and negative comments in product reviews. These limitations have inspired us to develop a novel method for market structure analysis that can fully exploit the emotional content in online reviews and integrate reviews across multiple platforms.
In this study, we propose a cross-platform market structure analysis method that employs multi-attribute group decision making. This involves selecting the most appropriate option among a range of alternatives under a given standard (Koksalmis & Kabak, 2019). In addition, we propose a process to generate priorities for product-attribute improvements that are based on the method. Multi-attribute group decision making is the process of selecting the best alternatives based on numerous expert opinions and evaluation criteria (Yue, 2012; Kou et al., 2012, 2014, 2016; Schotten & Morais, 2019; Zolfani & Saparauskas, 2013). In this study, we regard the e-commerce platform as the decision-making expert, the product attributes as the evaluation indicators, and the products as alternatives. This study integrates the results from different platforms using group decision-making integration methods, which allows for a more comprehensive analysis of the market structure. The contributions of this research to the literature are as follows: (1) We first use sentiment classification and fuzzy set theory together to analyze market structure. Our work incorporates real opinions on products as well as the emotional content in product reviews. (2) We use a multi-attribute group decision-making method to integrate online product reviews across multiple platforms. We consider multiple e-commerce platforms, as this is more reliable than relying only on a single platform. (3) We also propose a method to generate priorities for product-attribute improvement. The purpose of this is to make full use of our proposed cross-platform market structure analysis method and provide guidance for firms to learn from other firms' products.
The remainder of the paper is organized as follows. In Section 1, we introduce related work. In Section 2, we describe the proposed method. In Section 3, we present the details of our experiment. In Section 4, we propose a process to generate priorities for product-attribute improvements using the cross-platform market structure analysis method. In the last section, we present our conclusions and offer suggestions for future research.

Related work
In this section, we review the literature related to the current study. The review covers two themes in the literature: market structure analyses and studies of product ratings based on online reviews. While market structure analysis is our main research topic, we also review studies of product ratings in online reviews, as our research methodology resembles the methodology of those studies.

Market structure analysis methods
Market structure analysis is a basic pillar of marketing research, and it has been evolving over time. Two categories of methods have been used in these analyses: traditional methods and text mining-based methods (Chen et al., 2015).
Traditional market structure analyses vary according to the types of data analyzed (e.g., panel scanner data, total sales, and consumer survey responses) and the methods used. Regardless of whether an analysis employs internal or external methods, ordinarily, a set of attributes and their underlying dimensions must be determined in advance based on survey or transaction sales data, assuming that "all customers perceive all products the same way and differ only in their evaluation of product attributes" (Elrod et al., 2002). Determining this set of attributes is, thus, fundamental to market structure analysis, but studies using traditional methods have rarely explored how to determine these attributes along with their magnitudes and underlying dimensions. As a result, traditional analyses depend more on manual manipulation to determine the attributes. Fraas and Greer (1977) examined the effects of structural conditions on price collusion between oligopolists. Using traditional methods, their analysis of 606 standards-compliant case studies provided evidence for structure-behavior relationships. Srivastava et al. (1981) used the substitution-in-use criterion to analyze product usage data in financial services to obtain an effective product-market structure. They employed a hierarchical clustering method and verified the effectiveness of their method. Fraser and Bradford (1983) proposed a competitive market structure analysis method based on principal component analysis and used their proposed Index of Revealed Substitutability as a segmentation indicator to identify brands or groups of competing products in a market. They demonstrated the effectiveness of the method in the context of the US coffee market.
Erdem (1996) suggested that past analyses of market structure have ignored consumer dynamics and heterogeneity in both the preferences for and perceptions of brand attributes. To solve this problem, Erdem proposed a novel model and tested it on Nielsen scanner panel data. The results showed that the in-sample and out-of-sample fit of this method was better than that of methods that had not taken consumer dynamics into account, verifying the importance of considering these dynamics to obtain the most accurate results. Cooper and Inoue (1996) proposed a market structure analysis method based on preference structure in the consumer market. The method divided the market into heterogeneous sub-markets according to different consumer perspectives. Yang et al. (2017) proposed a dynamic analysis model that incorporated eight influencing factors based on vector autoregressive analysis in the context of China's coal industry. This model revealed that the market structure had a dynamic response path to these factors. It also showed the changes in the contribution rate of these factors to market structure.
To reduce the manual effort required to determine attributes, some scholars began to consider text-mining technology to automatically extract the required attributes from user-generated content (Gupta et al., 2020). In the last decade, methods for market structure analysis that are based on text mining have fully emerged, primarily focusing on user-generated content on the Internet. This approach employs text-mining methods to shed light on consumer perceptions of products or brands as well as on market structures. As the method is relatively new, only a few studies have addressed it (Lee & Bradlow, 2011; Netzer et al., 2012). Lee and Bradlow (2011) proposed a text mining-based approach to identify product attributes and dimensions from user-generated content; they also used the method to calculate brand distances and visualize market structure. Following Lee and Bradlow (2011), Netzer et al. (2012) introduced semantic relations to define similarities, leading to more reasonable analysis results. However, since both these approaches rely on the "Bag of Words" assumption, they require manual classification of similar product attributes. To solve this problem, Chen et al. (2015) proposed a method based on a topic model that could automatically identify similar product attributes, circumventing the need for manual involvement.
There are two problems with these studies. First, they focus on the extraction of product attributes from user-generated content, ignoring emotional information. This is a problem, since user emotions can reflect perceptions of products or brands. Second, the research is based on only one platform, which could lead to an inaccurate market structure analysis. This is because people may express different perceptions of the same product or brand on different platforms. This study solves some of these problems by employing text mining. It makes full use of the emotional information in user-generated content, and further, it employs a multi-criteria group decision-making method to integrate online comments from multiple platforms. This approach allows for a more accurate analysis.

Ranking products based on online reviews
Product ratings based on online reviews help consumers select products that best meet their needs. This study's method of market structure analysis adopts this concept of product ratings. Up until now, research on product ratings based on online reviews has been rare. The research has usually consisted of three steps: first, extracting product attributes from comments; second, identifying emotional directions corresponding to product attributes; and finally, using a multi-attribute decision-making method (Galankashi et al., 2020; Wang et al., 2021) to score product attributes and weighting them to obtain the final product rating. Zhang et al. (2010) were the first to rate products using online reviews. They considered both emotional information and contrasting opinions in product ratings and used directed, weighted graphs to score digital cameras and televisions. The test proved the effectiveness of their method. Peng et al. (2014) used similarities in product-attribute extraction to classify attributes (a process which can avoid human operations) and used fuzzy PROMETHEE to score products, achieving satisfactory results in the context of mobile phones. However, their ratings of product attributes were mainly based on expert scoring, without consideration of the emotional content of comments. Najmi et al. (2015) were more comprehensive in their approach. They rated brands and products simultaneously and considered the usefulness of reviews. Liu et al. (2017b) considered both polar and neutral comments, used an emotional dictionary for sentiment analysis, and employed the intuitionistic fuzzy PROMETHEE to comprehensively rate products.
These studies did not consider differences between platforms, as they were based on only one platform. In addition, obtaining single-user preferences for product attributes and then determining attribute weights is challenging, making this method less suitable for user-personalized recommendations. However, users' overall preferences can be determined by the number of users concerned about specific product attributes, and this information could then be used to generate attribute weights. This would be a suitable approach to market structure analysis, as it considers the entire market rather than focusing on individual users, and it would be sufficient for identifying overall preferences.

The proposed method
To make full use of the attributes and sentiment tendencies in user reviews and to integrate the content from multiple platforms, our proposed method employs text classification, multi-objective group decision making, and intuitionistic fuzzy TOPSIS. Since we are using multi-attribute group decision making to solve the cross-platform market structure analysis, in the case of known alternatives (products) and experts (different e-commerce platforms), we first need to obtain evaluation criteria (product attributes) and criteria scores (scores of product attributes) from product reviews. We obtain the main attributes of products by counting the nouns in the reviews and manually classifying the synonyms. The attribute weights are generated by counting the frequencies of different product attribute words in the comments on each platform. We use the method of product rating (Zhang et al., 2010; Peng et al., 2014; Najmi et al., 2015; Liu et al., 2017b) and sentiment analysis to determine the scores of product attributes. After determining the alternatives, experts, attributes, and attribute scores, we can generate the decision matrix for each expert, which yields a typical multi-attribute group decision problem (Lin et al., 2020; Zhang et al., 2021; Yu et al., 2021). After that, we need to select an appropriate group-decision integration method to generate expert weights and to assemble the decision matrices of different experts into a single decision matrix. Finally, we use multi-attribute decision-making methods to score the products, which allows us to determine their market positions. The proposed method is shown in Figure 1. The detailed method includes the following steps:
Step 1. Collect review data for the specified product on the specified platform. Considering the functionality and simplicity of Python, we use it as the programming language for data collection. At this step, the crawled content is mainly the body of the review, which includes the first and the additional reviews. On some e-commerce platforms, parts of reviews appear as tags. Because this part of the content contains product attributes and emotional tendencies, the tag reviews are also collected and analyzed.
Step 2. Obtain product attributes from reviews and then clean the reviews. Product attributes are acquired as follows. First, words in the review content are segmented and their parts of speech marked; for part-of-speech tagging, we use the jieba word segmentation package in Python. Second, all the nouns from the reviews are taken and the document frequency $DF_i$ of each noun is counted, where $DF_i$ represents the number of documents containing noun $i$ among all documents. Third, nouns whose document frequency is less than a certain threshold are deleted, and the remaining words are the candidate attribute words. In this study, the threshold is set at five. Finally, the candidate attribute words that are not related to the product are eliminated, and the attributes are classified by experts. Attribute words belonging to the same class can be considered the same attribute. Cleaning of reviews is divided into two steps: the first is to delete the stop words and the reviews that do not contain any attribute words; the second is to split the reviews into single sentences. Sometimes, longer reviews can contain a range of emotions, but usually, the emotional tendency in a sentence is singular. This makes subsequent emotional analysis easier. In the analysis, each sentence that comes from the segmentation of reviews is treated as a single review rather than a raw review, and document frequency refers to the frequency of the segmented sentence.
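The noun-filtering part of Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the reviews have already been segmented and tagged (the paper uses jieba for this), and the function name `candidate_attribute_words`, the input layout, and the convention that noun tags start with "n" are all choices made for the example.

```python
from collections import Counter


def candidate_attribute_words(tagged_reviews, min_df=5):
    """Return nouns whose document frequency is at least `min_df`.

    tagged_reviews: list of reviews, each a list of (word, pos_tag) pairs.
    A tag starting with "n" is treated as a noun (an assumption of this sketch).
    """
    df = Counter()
    for review in tagged_reviews:
        # A set ensures each noun is counted once per review,
        # which is what document frequency means.
        nouns = {word for word, tag in review if tag.startswith("n")}
        df.update(nouns)
    # Nouns below the threshold are dropped; the rest are candidates.
    return {word: count for word, count in df.items() if count >= min_df}


# Toy data: "screen" occurs in six reviews, "box" in only one.
reviews = [[("screen", "n"), ("good", "a")] for _ in range(5)]
reviews.append([("box", "n"), ("screen", "n"), ("screen", "n")])
candidates = candidate_attribute_words(reviews, min_df=5)
```

The remaining candidates would still need the manual screening and expert classification described in the text; that part is not automated here.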
Step 3. Calculate the weights of product attributes on each platform. We assume that the more often an attribute appears in reviews, the greater its weight. This assumption is grounded in the belief that the more often an attribute appears in a review dataset, the more consumers are concerned about it. Based on this assumption, we calculate the weight of each attribute for each product on each platform. Let the review set of product $j$ on platform $i$ be $D_{ij}$; the weight of the $k$th attribute is then derived from the document frequency of that attribute's expressions in $D_{ij}$.
Step 4. Perform sentiment analysis on the reviews and, according to the results, obtain the product attribute scores on each platform to establish an intuitionistic fuzzy decision matrix. Since we are using sentiment analysis to obtain attribute scores for products, we need to address the sentiment analysis of reviews. Methods for sentiment analysis usually fall into two types: machine learning-based methods (Li et al., 2017; Chao et al., 2019; Kou et al., 2019; Kong et al., 2019; Zhong & Enke, 2019) and lexicon-based methods (Medhat et al., 2014). Although the method based on the sentiment lexicon is more accurate, its construction requires manual manipulation, which is inefficient compared to the machine learning-based method. Therefore, we use a machine learning-based approach to solve this problem. This step involves tagging a portion of the cleaned reviews manually. We use "-1" to represent negative emotions, "0" for neutral emotions, and "1" for positive emotions. The classifier is trained with the labeled samples to obtain the emotional tendency of the unlabeled samples. After that, we make an assumption: if one or more attributes appear in a review with emotion, the author's emotional tendency toward each attribute is the same as the emotional tendency of the sentence. For example, if a review of a cell phone mentions the screen and the sentiment of the review is negative, then the author of the review views the screen negatively. Through the sentiment analysis and based on our assumption, we can determine the emotional tendencies corresponding to the attributes in all reviews. Counting the numbers of emotional reviews for each attribute (for example, $p^{pos}_{ijk}$ for the positive reviews), we can build the intuitionistic fuzzy number scoring matrix using the method proposed by Liu et al. (2017a). Thus, we obtain an intuitionistic fuzzy number matrix for each platform.
Step 5. Determine the expert weights and integrate the expert matrices to arrive at the integrated matrix. Our approach to solving cross-platform market structure analysis is based on a multi-attribute group decision-making approach. This method is divided into two steps: the first is to integrate the decision matrices of multiple experts into a single matrix; the second is to use the multi-criteria decision-making method to score the alternatives in the integrated decision matrix and obtain the rankings of alternatives. We use the method proposed by Li et al. (2016) to obtain the integrated decision matrix, where $R_g$ and $w_i$ denote the integrated decision matrix and the weight of expert $i$, respectively. Since the generated decision matrices and weight matrices come from real and objective review data and we hope to obtain objective results for our market structure analysis, methods that maximize the level of consensus by adjusting the weights of decision makers with the help of a feedback mechanism (Zhang et al., 2019) are not suitable for our work; the derivation of expert weights should also be objective. Although there are many methods that could be used to generate expert weights objectively, most of them are designed for cases in which the decision matrix is real rather than fuzzy and the attribute weight is an unknown vector rather than a known matrix. A weight vector would mean that the weight of a single attribute is the same across different alternatives and different platforms, which makes these expert weight generation methods less suitable for our situation. The main reasons for this are as follows. First, our decision matrix is an intuitionistic fuzzy number matrix. Second, our attribute weights are known and in the form of a matrix rather than a vector because we generate attribute weights from consumers' reviews. Thus, our attribute weights vary across different platforms and products. Therefore, we cannot use the existing methods to generate the expert weights directly, and we need to modify them. In light of this consideration, we use the projection method (Yue, 2012) to determine the expert weights from the weight matrices. Using a projection method to determine the expert weights involves calculating the similarity between each expert's decision matrix and the ideal decision matrix. The ideal decision matrix is calculated by averaging all decision matrices, and the similarity between a decision matrix and the ideal decision matrix is calculated by projecting the decision matrix onto the average of all decision matrices. The higher the similarity, the higher the expert weight. The original projection method derives expert weights from real-valued decision matrices, but we have no real-valued decision matrix and cannot use it directly. However, our weight matrices are known, so we derive the expert weights from the weight matrices instead. The process of determining the expert weights from the weight matrices using the projection method is divided into three steps: first, calculate the average weight matrix; second, calculate the projection of each expert weight matrix onto the average weight matrix; third, calculate the expert weights from the projections. After deriving the expert weights, we integrate the decision matrices and the weight matrices.
Step 6. Calculate the product scores using intuitionistic fuzzy TOPSIS (Šaparauskas & Turskis, 2006) to obtain their market position rankings. The higher the product score, the higher the market position. Since the elements of the integrated decision matrix are intuitionistic fuzzy numbers, we need to use an intuitionistic fuzzy multi-attribute decision-making method to calculate the product score. There are many different kinds of fuzzy multi-attribute decision-making methods. Because the intuitionistic fuzzy TOPSIS is simple and effective, we use it to calculate the product scores. The calculation steps are as follows: first, from the integrated matrix $R_g$, obtain the positive ideal solution and the negative ideal solution; second, calculate the distance between the $j$th product and the positive ideal solution and the distance between the $j$th product and the negative ideal solution; third, calculate the closeness coefficient $c_j$ of the $j$th product; fourth, rank the market positions of the products based on the values of $c_j$.
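As a rough illustration of Steps 4 and 6, the sketch below builds intuitionistic fuzzy numbers from review counts and scores products with a TOPSIS-style closeness coefficient. Several choices here are assumptions rather than details confirmed by the text: the share-based construction (membership as the positive share, non-membership as the negative share), the weighted normalized Hamming distance, the per-attribute max/min ideal solutions, and the function names are all hypothetical.

```python
def ifn_from_counts(pos, neu, neg):
    """Assumed construction: (membership, non-membership) as review shares."""
    total = pos + neu + neg
    if total == 0:
        return (0.0, 0.0)  # no evidence: full hesitancy (an assumption)
    return (pos / total, neg / total)


def hesitancy(ifn):
    mu, nu = ifn
    return 1.0 - mu - nu


def ifn_distance(a, b):
    """Normalized Hamming distance between two intuitionistic fuzzy numbers."""
    return 0.5 * (abs(a[0] - b[0]) + abs(a[1] - b[1])
                  + abs(hesitancy(a) - hesitancy(b)))


def closeness_coefficients(R, W):
    """R[j][k]: IFN of attribute k for product j; W[j][k]: its weight."""
    n_attr = len(R[0])
    # Positive ideal: highest membership, lowest non-membership per attribute.
    pos_ideal = [(max(row[k][0] for row in R), min(row[k][1] for row in R))
                 for k in range(n_attr)]
    # Negative ideal: lowest membership, highest non-membership per attribute.
    neg_ideal = [(min(row[k][0] for row in R), max(row[k][1] for row in R))
                 for k in range(n_attr)]
    scores = []
    for row, weights in zip(R, W):
        d_pos = sum(w * ifn_distance(x, p)
                    for w, x, p in zip(weights, row, pos_ideal))
        d_neg = sum(w * ifn_distance(x, q)
                    for w, x, q in zip(weights, row, neg_ideal))
        # Undefined (0/0) if all products are identical; not handled here.
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Toy integrated matrices for two products and two attributes.
R_g = [[(0.8, 0.1), (0.6, 0.3)],
       [(0.5, 0.4), (0.7, 0.2)]]
W_g = [[0.5, 0.5],
       [0.5, 0.5]]
scores = closeness_coefficients(R_g, W_g)  # approximately [0.75, 0.25]
```

A larger closeness coefficient means the product lies nearer the positive ideal solution, which corresponds to a higher market position in the ranking step.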
In the next section, we will use the proposed method to analyze the market structure of four different brands of mobile phones based on reviews on three representative platforms in China.

Experimental study
Using the method we proposed in section 2, we convert the review text data from different platforms into numerical data to help us analyze the market structure. In this section, we use real data as an example to verify the effectiveness of our proposed method.

Data collection
In our experiment, we take mobile phones as an example and choose the three largest e-commerce platforms in China (Tmall, jd.com, and Suning) to verify the effectiveness of our proposed method. We then select four mobile phone brands and models: Huawei Mate 20, iPhone X, Mi MIX 3, and Oppo Find X. We choose these models as objects of the experiment because they were released in China in the past two years, they have excellent reputations, and they have distinct product attributes to facilitate the analysis. The websites of the official flagship stores or platform-owned stores usually have the largest number of product reviews among all websites affiliated with the same brands. Thus, we collect reviews from the websites of the official flagship stores for these phones on each platform or from the product links on the platforms' own stores. The 12 product links along with an example of the content we collect in each Suning review are in the Attachment. We collect three kinds of reviews: label, first, and additional reviews. The label reviews are always in the form of a phrase, such as "high performance cost ratio" and "beautiful appearance," which can be selected from among various phrases. The first and additional reviews are complete sentences written by consumers themselves. The reason for choosing these three kinds of reviews is that they all contain product attributes that attract reviewers' attention and toward which reviewers tend to have an emotional response. With other kinds of information, it is more difficult to assess reviewers' emotional tendencies towards attributes. To simplify the analysis, we collect only the text content and not the pictures, time, author information, or other items in the review. For Tmall and jd.com, we collect only first and additional reviews, since label reviews are not available on these platforms.
We used Python to crawl the review data, and we experienced some difficulty accessing reviews because of anti-crawling mechanisms. On Tmall, our crawler was easily detected and blocked, and on Suning, we could only see the first 50 pages, or a total of 500 reviews. To balance the number of reviews collected on different platforms, we did not collect more than 1,000 reviews on each product link. Specific review numbers are shown in Table 1. When the number of reviews is below 500, it means that the number of valid comments was less than 500 and we could not collect additional reviews.
To obtain review attributes, first, we cut the reviews into words and tag the parts of speech in the manner described in Section 2. An example of a review after word cutting and part-of-speech tagging is shown in the Attachment. The original text is well segmented and marked with parts of speech. The two items in each "()" are the word and the part of speech of the word, respectively. After that, we count the document frequency of all nouns, filter out the low-frequency nouns, and manually select candidate attribute words from the remaining nouns. We filter out the nouns whose document frequency is less than five. By manually classifying these candidate attribute words, we obtain 11 attributes, as shown in the Excel file "Attributes and their different expressions" in the Attachment, and each attribute has a variety of different expressions. After obtaining the 11 product attributes, the reviews are cleaned in two steps: First, remove reviews without expressions of the 11 attributes listed in the file "Attributes and their different expressions." This step can be performed directly through programming by traversing the expressions of every attribute in the file and matching them against the reviews one by one. If no expression of any attribute in the file matches a particular review, the review does not contain an attribute, can be considered noise, and can be deleted.
Second, break down the reviews into short sentences. Since most reviews contain more than one attribute and the same reviewer may express different emotional tendencies for different attributes, this step allows us to avoid having multiple emotions in one review. We split sentences using a method that matches the ending symbols or conjunctions of a sentence and then cleaves the sentence. The ending signs and conjunctions that are used in the study are included in the Attachment.
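The sentence-splitting step can be sketched with a regular expression. This is only an illustration: the actual ending symbols and conjunctions used in the study are listed in its Attachment, so the delimiter sets below (`ENDING_SYMBOLS`, `CONJUNCTIONS`) and the function name `split_review` are hypothetical placeholders.

```python
import re

# Hypothetical delimiters; the study's real list is in its Attachment.
ENDING_SYMBOLS = "。！？；，.!?;,"
CONJUNCTIONS = ["但是", "不过"]  # roughly "but" and "however"


def split_review(review):
    """Cut a review at ending symbols or conjunctions; drop empty pieces."""
    symbol_pattern = "[" + re.escape(ENDING_SYMBOLS) + "]"
    conjunction_pattern = "|".join(re.escape(c) for c in CONJUNCTIONS)
    parts = re.split(symbol_pattern + "|" + conjunction_pattern, review)
    return [part.strip() for part in parts if part.strip()]


# "The screen is very clear, but the battery does not last."
sentences = split_review("屏幕很清晰，但是电池不耐用。")
# sentences == ["屏幕很清晰", "电池不耐用"]
```

Each resulting short sentence would then be treated as one review in the later weighting and sentiment steps.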

Calculating the weights of attributes
The method in Section 2 is used to calculate the attribute weights. First, we count the document frequency of all expressions of the 11 attributes for each product on each platform. In this way, we can obtain the document frequency table for all attributes, as shown in Table 2.
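One plausible reading of the frequency-based weighting is to normalize the document frequencies of the attributes of one product on one platform so that they sum to one. The sketch below follows that reading; the normalization itself, the function name `attribute_weights`, and the toy counts are assumptions, since the exact formula is not shown in the text.

```python
def attribute_weights(doc_freq):
    """Normalize attribute document frequencies into weights that sum to 1.

    doc_freq: mapping from attribute name to its document frequency for
    one product on one platform (hypothetical input layout).
    """
    total = sum(doc_freq.values())
    if total == 0:
        raise ValueError("no attribute mentions to derive weights from")
    return {attr: df / total for attr, df in doc_freq.items()}


# Toy counts, not taken from Table 2.
weights = attribute_weights({"screen": 30, "battery": 50, "camera": 20})
# weights == {"screen": 0.3, "battery": 0.5, "camera": 0.2}
```

Repeating this for every product on every platform would give one weight matrix per platform, with rows for products and columns for attributes.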

Sentiment classification and generating the decision matrix
We select 762 reviews and manually label their emotional tendencies. Labels include positive, negative, and neutral, which are represented by "1," "-1," and "0," respectively. We find that some of the reviews mention product attributes but either have no emotional tendency or have nothing to do with the product. We label these reviews as "2," and we use the classifier to identify and filter them. However, when calculating the weight of the attribute, these kinds of reviews are not excluded. This is because if the reviewer mentions the attribute, it usually means that he or she pays attention to it. The numbers of labels in each class among the 762 reviews are shown in Table 6. From Table 6, we know that the classification problem we are trying to solve is an imbalanced multi-class classification problem. We use these labeled reviews to train the classifier to obtain the emotional tendencies of the reviews and filter out irrelevant ones. To improve the accuracy of our intuitionistic fuzzy decision matrix, we need to ensure the accuracy of our classification. In addition, because accuracy may not be a good measure when evaluating imbalanced classification problems, we introduce the macro average geometric (MAvG) measure (Ferri et al., 2009) alongside accuracy. Accuracy can be high even if classes with only a few samples are classified incorrectly, as long as the classes with many samples are classified well. MAvG, however, cannot be high if even one class has a poor classification performance. When the two measures are high simultaneously, the classification performance overall and the performances in all the single classes are satisfactory. We need to generate the intuitionistic fuzzy matrices with a satisfactory classification performance in each class, so we use both measures.
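The contrast between the two measures can be illustrated with a short sketch. It assumes that MAvG is the geometric mean of the per-class accuracies (recalls), which matches the behavior described above; the function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mavg(y_true, y_pred):
    """Geometric mean of per-class recall (assumed definition of MAvG)."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1 / len(recalls))


# Imbalanced toy labels: the classifier predicts the majority class only.
y_true = [1] * 8 + [-1] * 2
y_pred = [1] * 10
acc = accuracy(y_true, y_pred)  # 0.8
score = mavg(y_true, y_pred)    # 0.0, because the minority class is never found
```

This is why a classifier can show reasonable accuracy while its MAvG is 0: a single class with no correct predictions drives the geometric mean to zero.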
We test a variety of classifiers, using the 762 labeled reviews as the training and test sets. They include eight traditional classifiers: support vector machines (SVM); k-nearest neighbor (kNN); random forest (RF); logistic regression (LR); naive Bayes (NB); decision tree (DT); gradient boosting decision tree (GBDT); and XGBoost. There are also two deep-learning classifiers: convolutional neural networks (CNN) (Chen & Tsai, 2020) and long short-term memory (LSTM) (Selvamuthu et al., 2019; Fazelabdolabadi, 2019). Considering that traditional classifiers require dimensionality reduction to improve performance, whereas deep learning requires sufficient dimensions to train more accurate classifiers, we use the following feature selection methods for dimensionality reduction with the traditional classifiers: document frequency (DF); information gain (IG); Gini Index (GI); distinguished feature selector (DFS); expected cross entropy (ECE); chi-squared (CHI); odds ratio (OR); class discriminating measure (CDM); mutual information (MI); and weighted log likelihood ratio (WLLR). For the deep learning methods, we retain all features. We test 100, 200, 500, 1,000, and 2,000 features for the 10 feature selection methods with the eight traditional classifiers. Part of the performance results of the different classifiers for the 762 labeled reviews is shown in Table 7. The complete test results are in the Attachment.
The test results show that, for most classifiers, the accuracy is not good enough and the MAvG is 0. Thus, most of these classifiers perform poorly. Although LSTM achieves a high level of accuracy, it is not able to identify some classes that have only a few samples. Only one classifier, the CNN, performs well: its accuracy reaches 0.97, and its MAvG is 0.8. Therefore, in this experiment, we use the CNN as the classifier for sentiment analysis. We use the CNN to categorize all unlabeled reviews, and we remove reviews with the label "2". Then, we count the $p^{pos}_{ijk}$, [...]
The second step is to calculate the projection of each expert weight matrix on the average weight matrix. Through calculation, we find that the projections for Tmall, jd.com, and Suning are 1.07, 0.99, and 0.94, respectively.
In the third step, we find that the expert weights of Tmall, jd.com, and Suning are 0.36, 0.33, and 0.31, respectively.
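The reported expert weights are consistent with normalizing the three projections so that they sum to one. The sketch below makes that assumption explicit. The exact formula of the third step is defined elsewhere in the paper and may differ, and the helper name `normalize` is ours.

```python
# Projections of each platform's weight matrix on the average weight matrix,
# as reported in the text.
projections = {"Tmall": 1.07, "jd.com": 0.99, "Suning": 0.94}


def normalize(values):
    """Scale the values so that they sum to one (assumed third step)."""
    total = sum(values.values())
    return {name: v / total for name, v in values.items()}


expert_weights = normalize(projections)
rounded = {name: round(w, 2) for name, w in expert_weights.items()}
print(rounded)
```

Under this assumption, a platform whose weight matrix projects more strongly onto the group average receives a proportionally larger expert weight.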
Using the expert weights, the intuitionistic fuzzy decision matrices in Tables 10-12, and the weighting matrices in Tables 5-7, we can obtain the integrated intuitionistic fuzzy decision matrix (as shown in Table 12) and the integrated weight matrix (Table 13).
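As an illustration of how platform-level intuitionistic fuzzy numbers might be merged with the expert weights, the sketch below uses the intuitionistic fuzzy weighted averaging (IFWA) operator. This operator is an assumption made for illustration, and the aggregation actually defined in Section 2 may differ. The Tmall and jd.com values for "Brand" of the Huawei Mate 20 are taken from the text, while the Suning value is hypothetical.

```python
from math import prod


def ifwa(ifns, weights):
    """Aggregate (mu, nu) pairs with the IFWA operator (assumed operator).

    mu = 1 - prod((1 - mu_k) ** w_k),  nu = prod(nu_k ** w_k)
    """
    mu = 1 - prod((1 - m) ** w for (m, _), w in zip(ifns, weights))
    nu = prod(n ** w for (_, n), w in zip(ifns, weights))
    return mu, nu


# Expert weights for Tmall, jd.com, and Suning, as reported in the text.
weights = [0.36, 0.33, 0.31]

# "Brand" of the Huawei Mate 20: the Tmall and jd.com values are from the text;
# the Suning value is a hypothetical placeholder.
brand = [(0.97, 0.03), (0.88, 0.11), (0.90, 0.08)]

mu, nu = ifwa(brand, weights)
print(round(mu, 3), round(nu, 3))
```

When the weights sum to one, the aggregated membership and non-membership degrees stay between the smallest and largest platform values, so no single platform can push the integrated score outside the observed range.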

Generating the competitive positions of different cellphones
We use the intuitionistic fuzzy TOPSIS given in Section 2, combined with the intuitionistic fuzzy decision matrices of Tables 8-10 and 12 and the weight matrices of Tables 3-5 and Table 13, to calculate the scores and rankings of each mobile phone on the single-platform and integrated decision matrices, respectively. The scores and rankings are shown in Table 14. We also analyze the sensitivity of the integrated market position ranking to changes in the expert weights. We find that as long as Tmall's expert weight satisfies $0.25 < w_1 < 0.67$, the integrated ranking results are consistent with the results in Table 14, no matter how the expert weights for jd.com and Suning change. Because Tmall is the largest e-commerce platform in China but has not yet achieved a monopoly position, its expert weight should intuitively fall within the interval (0.25, 0.67), which means that our ranking results are relatively stable.
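The scoring step can be sketched as follows. This is a simplified intuitionistic fuzzy TOPSIS that assumes a weighted normalized Hamming distance and per-phone attribute weights. The formulation in Section 2 is not reproduced here and may differ in its distance measure and ideal solutions. All numbers are hypothetical toy values.

```python
def hamming(a, b):
    """Normalized Hamming distance between two intuitionistic fuzzy numbers."""
    (m1, n1), (m2, n2) = a, b
    p1, p2 = 1 - m1 - n1, 1 - m2 - n2  # hesitation degrees
    return 0.5 * (abs(m1 - m2) + abs(n1 - n2) + abs(p1 - p2))


def if_topsis(decision, weights):
    """Closeness score of each alternative (assumed simplified formulation).

    decision[i][j] is the (mu, nu) pair of attribute j for alternative i;
    weights[i][j] is the weight of attribute j for alternative i.
    Assumes the alternatives are not all identical (non-zero denominator).
    """
    n_attr = len(decision[0])
    cols = [[row[j] for row in decision] for j in range(n_attr)]
    # Positive ideal: best membership, lowest non-membership per attribute.
    pos = [(max(m for m, _ in c), min(n for _, n in c)) for c in cols]
    # Negative ideal: worst membership, highest non-membership per attribute.
    neg = [(min(m for m, _ in c), max(n for _, n in c)) for c in cols]
    scores = []
    for row, w in zip(decision, weights):
        d_pos = sum(wj * hamming(x, p) for wj, x, p in zip(w, row, pos))
        d_neg = sum(wj * hamming(x, q) for wj, x, q in zip(w, row, neg))
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical toy data: three phones, two attributes.
decision = [
    [(0.90, 0.05), (0.80, 0.10)],  # phone A
    [(0.70, 0.20), (0.60, 0.30)],  # phone B
    [(0.80, 0.10), (0.70, 0.30)],  # phone C
]
weights = [[0.6, 0.4], [0.5, 0.5], [0.5, 0.5]]

scores = if_topsis(decision, weights)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

A sensitivity check like the one described above could rerun this scoring on integrated matrices built from a grid of expert weights and compare the resulting rankings.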

Results and discussion
In this section, we discuss the experimental results, which are summarized as follows: 1) The weight of an attribute changes as the phone or platform changes; 2) Similar results appear in attribute scores; 3) The same phone ranks differently on different platforms. We discuss the main reasons for this result based on the differences among mobile phones and platforms.

Weight matrix
Tables 3-5 show the following two phenomena: 1) different phones have different attribute weights on the same platform. For example, the highest weight of the Huawei Mate 20 in Tmall is for "Brand". However, the highest weight attribute of the Oppo Find X in Tmall is "Look and feel"; 2) the same mobile phone can have different weights on different platforms. For example, for the Huawei Mate 20, the attribute with the highest weight in Tmall is "Brand", but it is "Screen" in Suning.
We analyze the first phenomenon, recognizing that people pay more attention to the attributes of mobile phone brands that have obvious advantages or disadvantages. For example, for the four mobile phones on Tmall, consider the attribute of "Brand", which may be related to Chinese culture. The Huawei Mate 20 and the Mi MIX 3 have always been representative of domestic mobile phones, with Huawei especially being the benchmark for affordable domestic mobile phones. Some extreme netizens have even issued slogans that the Chinese must buy a Huawei phone or be considered unpatriotic. In other words, its brand recognition in China is very high. Thus, "Brand" for the Huawei Mate 20 and the Mi MIX 3 has a higher weight than for the iPhone X or the Oppo Find X. Although the Oppo Find X is also a domestic mobile phone, its reputation has not been favorable for a long time. It is considered a high-priced, low-profile mobile phone; thus, its brand is at a disadvantage. The iPhone X is an Apple phone, and Apple has world-class influence. However, it is not a domestic mobile phone, and in the past two years, the company's word-of-mouth reputation has declined for various reasons.
The second phenomenon may be related to the way reviews are conducted on these platforms. For example, let us consider "Brand" in this case, too. For the Huawei Mate 20, the values of the weights are different on Tmall and jd.com. However, on both platforms, "Brand" is the attribute with the highest weight. On Suning, the weight for "Brand" ranks third among all attributes for this phone. The difference between the reviews on Suning versus those for Tmall and jd.com is that the former has label reviews, while the latter two do not. On Suning, consumers can select labels such as "system fluency" in their reviews. In other words, a wide variety of opinions can be expressed via these label reviews, which can be directly selected (e.g., via a mouse click) without the consumer having to input their own text. Thus, the option in Suning to express opinions via label reviews may have resulted in reduced textual content in the first and additional Suning reviews compared to those for Tmall and jd.com. Furthermore, there is not an option such as "Huawei is outstanding" referring to a particular brand in Suning's label reviews. This means that a reduction in textual content in the first and additional Suning reviews would lower the weight for "Brand" compared to the weight in the other two platforms. In sum, for Suning, the document frequency and weight for "Brand" are lower than in the other platforms because of the way weight is calculated.

Decision matrix
Tables 8-10 demonstrate phenomena similar to those for the weight matrix: 1) different phones have different attribute scores on the same platform, which is a common phenomenon; 2) the same mobile phone attribute has different fuzzy numbers across platforms. For example, the intuitionistic fuzzy number of the attribute "Brand" for the Huawei Mate 20 on Tmall is (0.97, 0.03) but only (0.88, 0.11) on jd.com. This means that "Brand" for the Huawei Mate 20 performs noticeably worse on jd.com, confirming that the same product can receive different attribute scores on different platforms. In addition, another phenomenon captures our attention: an attribute with a high weight does not necessarily have a high fuzzy number. As shown in Tables 5 and 10, the attribute "Brand" of the Mi Mix 3 on Tmall has a weight of 0.19, which is high, but the fuzzy number of the attribute is (0.85, 0.11), which is poor. This may be because Xiaomi's innovation abilities and long-term cost advantages enhanced its brand in the past, whereas its more recent products have been disappointing. For example, domestic mobile phone manufacturers have improved their processes over time, and there are now many companies with innovation capabilities that surpass Xiaomi's. Xiaomi still has a cost advantage, but this also means that it has to keep its costs low to sustain its profits. This pressure has led to quality control challenges and the failure of its mobile phones to keep up with consumer demand for high quality. This challenge, coupled with the long-standing problem of low supply, means that Xiaomi's brand performance is now waning in the eyes of customers.

Scores and rankings in the market structure analysis
Table 14 shows the scores and rankings for the integrated result and for the individual platforms before integration. Evidently, the market structure analysis results for the various phones and brands differ across platforms, which is consistent with our conjecture in the introduction. For example, the iPhone X has the lowest position on Tmall and jd.com, but it has the highest position on Suning. Once again, this shows that relying on just one platform may lead to inaccurate results. The cross-platform approach enables the integration of similar content from different platforms to achieve complementary results and to better reflect consumer perceptions of the brand.
More specifically, the cross-platform market structure analysis has advantages over the single-platform method in the following two ways. First, cross-platform analysis incorporates more reviews than single-platform analysis, and market analysis results based on a sufficient number of reviews tend to be more statistically reliable. In our experiment, the mobile phone products we selected have a sufficient number of reviews. However, platforms generally do not publish all reviews, and the number of reviews published by each platform may differ. In addition, anti-crawling mechanisms limit the number of reviews that can be automatically obtained; thus, the number of obtainable reviews may not be very high. For niche products, total reviews will be lower simply because these products are reviewed less frequently. A cross-platform approach gives access to more reviews and thus mitigates these problems.
Second, content from different platforms can be complementary, which does not happen with a single platform. The way people submit reviews on different platforms is limited by the platform mechanism. For example, in Tmall and jd.com, people can only use the first and additional reviews for textual reviews. In contrast, on Suning, people can also submit label reviews by directly selecting from a range of phrases. These two ways of reviewing products focus the reviews in different ways. For Tmall and jd.com, which do not have label reviews, people can only type words to describe satisfaction or dissatisfaction with the product attributes. When words are limited by the platform, people only review the attributes they really care about (such as the major advantages and disadvantages of a product) instead of wasting words describing minor attributes that do not really affect them. On Suning, label reviews allow reviewers to reduce unnecessary textual descriptions and choose directly from a wide range of phrases, producing a more comprehensive description of product attributes than on Tmall or jd.com. At the same time, because of reduced content in the first and additional reviews in Suning, it may be more difficult to identify the reviewer's focus. Thus, reviews on Tmall and jd.com might not be as comprehensive, but they may be more focused, whereas in Suning, the reviews may be more comprehensive but not as focused. This difference is reflected in Tables 10-12 and Table 16. In Tables 10-12, we find that the fuzzy numbers of Tmall and jd.com are similar to each other but differ greatly from Suning. Moreover, the proportion of negative reviews is higher on Tmall and jd.com than on Suning. In Table 16, product ranking results also show a similar phenomenon: the ranking results on Tmall and jd.com are similar to each other but quite different from those on Suning.
Based on these two advantages, we believe that the cross-platform market structure analysis of product reviews is preferable to single-platform market structure analysis.

A process to generate the priorities of product attribute improvements
In this section, to make full use of the research and improve the contribution of this paper, we propose a process to generate priorities for product improvement based on cross-platform market structure analysis. Product development depends on market demand. Consumers have different levels of satisfaction, and they pay attention to different attributes. Through cross-platform market structure analysis, we can obtain integrated decision and weight matrices of products that are based on multiple platforms. These are assumed to reflect the broader market's attention to and satisfaction with different attributes of products. Thus, we can identify the advantages and disadvantages of different products based on these two matrices and then develop product improvement strategies (see Figure 2). The idea is simple: calculate the relative distance between the product attribute and the ideal solution attribute and determine the priority for product-attribute improvement based on the relative distance. We only obtain the attribute scores and weights, and each attribute incorporates significant content. Therefore, we cannot determine the specific nature of the improvement, but we can at least identify which attributes the enterprise should focus on first in its product-improvement initiatives. The detailed steps to generate a product improvement strategy are as follows:
Step 1. Obtain the integrated decision matrix and integrated weight matrix using the cross-platform market structure analysis method.
Step 2. Obtain the positive ideal solutions from the integrated decision and weight matrices using intuitionistic fuzzy TOPSIS.
Step 3. Calculate the relative distance between each product attribute and its ideal solution attribute. The relative distance is as follows:
Step 4. Prioritize product attributes based on the distances (i.e., the greatest distance is assigned the highest priority, and so on).
For example, we directly use the results of the Section 4 experiment to calculate the distances. In Section 3, we obtain the integrated decision and weight matrices of the four mobile phones, as shown in Tables 12 and 13. Then, we can obtain the positive ideal solution, as shown in Table 15.
Next, we calculate the relative distances using the function in Step 3. The relative distances for each attribute are shown in Table 16. From Table 16, we can assign priorities for attribute improvement of the four mobile phones. For the Huawei Mate 20, the three attributes that most need improvement are operational performance, gift, and brand. For the iPhone X, the three attributes that most need improvement are price, function, and operational performance. For the Mi Mix 3, the three attributes that most need improvement are function, look and feel, and brand. For the Oppo Find X, the three attributes that most need improvement are screen, look and feel, and batteries and duration. This example shows the effectiveness of our approach for identifying priorities for product-attribute improvements. In addition, we find that the relative distances of all attributes of the Huawei Mate 20 are relatively small. Its maximum relative distance is only 0.003991, which is far less than the 0.015902 of the iPhone X, the 0.015123 of the Mi Mix 3, and the 0.010481 of the Oppo Find X. This means that the Huawei Mate 20 is closer to the ideal solution than the other three mobile phones, which further supports the results in Section 4.
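The prioritization in Step 4 amounts to sorting attributes by relative distance in descending order, as the following minimal sketch shows. Only the maximum distance of the Huawei Mate 20 (0.003991) comes from the text. Its pairing with "operational performance" is an assumption based on the reported order, and all other distances are hypothetical placeholders rather than values from Table 16.

```python
def improvement_priorities(distances, top_n=3):
    """Return the top_n attributes with the largest relative distance."""
    return sorted(distances, key=distances.get, reverse=True)[:top_n]


# Relative distances for the Huawei Mate 20. Only 0.003991 is reported in the
# text; the remaining values are hypothetical and chosen for illustration.
huawei_mate_20 = {
    "operational performance": 0.003991,
    "gift": 0.0030,
    "brand": 0.0025,
    "screen": 0.0010,
    "price": 0.0005,
}

print(improvement_priorities(huawei_mate_20))
```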
Being able to prioritize product-attribute improvements could help companies focus their product-development efforts and improve their products' reputations. Unfortunately, our method does not provide a way to specify improvement measures; it only assigns priorities. Thus, we cannot predict the cost of improving a product. In other words, there is no way to know how much to invest to improve various product attributes and thus, maximize the score of the product.

Conclusions
In the past, market structure analyses using text mining have been based on a single platform, and the results of these analyses may have been affected by the number of reviews and the manner in which the reviews were submitted. Thus, the results may not have fully reflected consumers' perceptions of products. To solve this problem, this study proposes a cross-platform market structure analysis method using online product reviews. Compared with previous studies, our method incorporates group decision making to make full use of the emotional content in the reviews, and it can integrate consumer opinions that are distributed across multiple platforms. Through this integration, content from different platforms becomes complementary, and a more comprehensive result is obtained. We use the proposed method to study the Chinese mobile phone market, selecting four representative mobile phones and three of the biggest e-commerce platforms. Then, we conduct market structure analysis tests across these platforms. The results support the effectiveness of our proposed method. Building on this analysis, we also propose a process to generate priorities for product improvement, and we provide an example to demonstrate the effectiveness of this process.
Although our method makes satisfactory use of the emotional content of reviews and integrates the content of reviews across multiple platforms, it has the following limitations. First, it assumes that the emotional tendency of an attribute is consistent with the emotional tendency of the review that contains the attribute, which may lead to misjudgments about attribute sentiment in some reviews. Second, to simplify the proposed method, our research considers neither the dynamics of the market nor the timeliness of the reviews. Third, the products we choose are mobile phones, which are search goods with relatively clear product attributes. For experience goods, attributes are more difficult to extract, and our method might not be effective for determining the market structure for such products. Fourth, the proposed process for generating priorities for product-attribute improvements can only provide priorities. It cannot specify how to improve product attributes or predict the costs of doing so. In addition, we have only briefly discussed differences in market structure analysis results for reviews on different platforms, and we did not delve deeply into the reasons for these differences.
To improve upon the current method, subsequent research might do the following: 1) Use a method for sentiment analysis that is superior to the one used in this study; 2) More fully consider the dynamics of the market and the timeliness of reviews; 3) Explore solutions for products without clear product attributes; and 4) Propose new methods for arriving at specific product-improvement strategies and determining the costs of the product improvement.