User-aware topic modeling of online reviews

  • Special Issue Paper
Multimedia Systems

Abstract

Online reviews are a type of social media in which users express opinions about specific items. Because sentiment depends on topic, probabilistic topic models have been widely used for sentiment analysis. However, most existing methods model only the text and rarely consider the users who express the opinions or the items on which the opinions are expressed. Different users are usually concerned with different topics and use different sentiment expressions: a lenient user tends to give more positive reviews than a critical user. Likewise, high-quality items tend to receive more positive reviews than low-quality items. To better model topics and sentiments, we argue that it is essential to explore users and items as well as the review text. To this end, we propose a novel model called User Item Sentiment Topic (UIST), which incorporates users and items into topic modeling and produces topic–word, user–topic, and item–topic distributions simultaneously. Extensive experiments on several datasets demonstrate the advantages and effectiveness of our method. The topics extracted by our method are more coherent and informative, which in turn improves the performance of sentiment classification. Furthermore, the user preferences obtained with our method could be utilized in many personalized applications.
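
To make the three kinds of distributions named in the abstract concrete, the following minimal Python sketch stores each one as a list of probabilities and draws the words of a toy review from them. It is an illustration only, not the UIST generative process, which this preview does not specify. The function names, the toy vocabulary, all numbers, and the rule of combining a user's and an item's topic preferences by an elementwise product are assumptions made for the example; sentiment is left out entirely.

```python
import random


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_review(user_topic, item_topic, topic_word, vocab, length, rng):
    """Toy generator. The product-based mixing rule is an assumption,
    not the rule used by UIST."""
    # Combine the user's and the item's topic preferences (hypothetical rule).
    mix = normalize([u * i for u, i in zip(user_topic, item_topic)])
    words = []
    for _ in range(length):
        topic = sample_index(mix, rng)               # pick a topic for this word
        word = sample_index(topic_word[topic], rng)  # pick a word from that topic
        words.append(vocab[word])
    return mix, words


# Made-up toy data: 2 topics and 4 vocabulary words.
vocab = ["plot", "acting", "price", "service"]
topic_word = [[0.5, 0.5, 0.0, 0.0],   # topic 0: film content
              [0.0, 0.0, 0.5, 0.5]]   # topic 1: purchase experience
user_topic = [0.8, 0.2]               # this user mostly writes about topic 0
item_topic = [0.5, 0.5]               # this item attracts both topics equally

mix, words = generate_review(user_topic, item_topic, topic_word, vocab,
                             6, random.Random(0))
print(mix, words)
```

Each row of `topic_word` and each of `user_topic` and `item_topic` sums to 1, which is the only structural property the sketch relies on. A real model would learn these values from review data rather than fix them by hand.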

Author information

Corresponding author

Correspondence to Xiaojia Pu.

About this article

Cite this article

Pu, X., Wu, G. & Yuan, C. User-aware topic modeling of online reviews. Multimedia Systems 25, 59–69 (2019). https://doi.org/10.1007/s00530-017-0557-6

  • DOI: https://doi.org/10.1007/s00530-017-0557-6
