The Contents-Based Website Classification for the Internet Advertising Planning: An Empirical Application of the Natural Language Analysis

The Review of Socionetwork Strategies

Abstract

This study proposes a model for classifying websites by their content and discusses its application to Internet advertising (ad) strategies. Internet ad agencies manage a vast number of ad spaces embedded in websites and must choose which advertisements are suitable to place in each. Ad agencies therefore need to know the properties and topics of each website to optimize their ad submission strategy. However, because website content is written in natural language, it must be converted from qualitative sentences into quantitative data before websites can be classified with statistical models. To address this issue, this study applies statistical analysis to website information written in natural language. We apply a dictionary of neologisms to decompose website sentences into words and create a data set of indicator matrices to classify the websites. From this data set, we estimate the topics of each website using latent Dirichlet allocation, which is a fast and robust method for sparse matrices. Finally, we discuss how to apply the results to optimize ad strategies.
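
The pipeline described in the abstract (split text into words, build a word-presence indicator matrix, estimate topics with latent Dirichlet allocation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `build_indicator_matrix` and `lda_gibbs`, the toy site texts, the hyperparameters, and the use of whitespace splitting in place of dictionary-based morphological analysis. The collapsed Gibbs sampler is one common way to fit LDA and is not necessarily the estimator used in the study.

```python
import random

def build_indicator_matrix(docs):
    """Return (vocab, rows), where rows[d][v] is 1 if word v occurs on site d."""
    # Assumption: whitespace splitting stands in for a morphological analyzer.
    tokenized = [set(text.lower().split()) for text in docs]
    vocab = sorted(set().union(*tokenized))
    rows = [[1 if w in toks else 0 for w in vocab] for toks in tokenized]
    return vocab, rows

def lda_gibbs(rows, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-site topic proportions."""
    rng = random.Random(seed)
    n_words = len(rows[0])
    # Each nonzero cell of the indicator matrix is treated as one word token.
    tokens = [[v for v, x in enumerate(row) if x] for row in rows]
    doc_topic = [[0] * n_topics for _ in rows]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, words in enumerate(tokens):
        z_d = []
        for v in words:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][v] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, words in enumerate(tokens):
            for i, v in enumerate(words):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Full conditional of the topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic proportions (theta) for each site.
    return [
        [(c + alpha) / (sum(counts) + n_topics * alpha) for c in counts]
        for counts in doc_topic
    ]

# Hypothetical site texts, invented for this example.
sites = [
    "game console review game release",
    "new game trailer console news",
    "recipe dinner pasta sauce",
    "easy pasta recipe for dinner",
]
vocab, rows = build_indicator_matrix(sites)
theta = lda_gibbs(rows, n_topics=2)
```

Each row of `theta` is a probability vector over topics for one site. An ad planner could, under these assumptions, match an advertisement to the sites whose dominant topic fits the advertised product.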



Acknowledgements

The authors would like to thank Kazuki Oomori and members of F@N Communications Information and Science Technology Department, and anonymous reviewers for helpful comments and suggestions. This work was supported by JSPS KAKENHI Grant Number 17H02573.

Author information

Corresponding author

Correspondence to Sotaro Katsumata.

About this article

Cite this article

Katsumata, S., Motohashi, E., Nishimoto, A. et al. The Contents-Based Website Classification for the Internet Advertising Planning: An Empirical Application of the Natural Language Analysis. Rev Socionetwork Strat 11, 129–142 (2017). https://doi.org/10.1007/s12626-017-0007-0

  • DOI: https://doi.org/10.1007/s12626-017-0007-0