Skip to main content
Log in

Knowledge representation analysis of graph mining

  • Published:
Annals of Mathematics and Artificial Intelligence Aims and scope Submit manuscript

Abstract

This paper analyses the graph mining problem, and the frequent pattern mining task associated with it. In general, frequent pattern mining looks for a graph which occurs frequently within a network or, in the transactional setting, within a dataset of graphs. We discuss this task in the transactional setting, which is a problem of interest in many fields such as bioinformatics, chemoinformatics and social networks. We look at the graph mining problem from a Knowledge Representation point of view, hoping to learn something about support for higher-order logics in declarative languages and solvers. Graph mining is studied as a prototypical problem; it is easily expressible mathematically and exists in many variations. As such, it appears to be a prime candidate for a declarative approach; one would expect this allows for a clear, structured, statement of the problem combined with easy adaptation to changing requirements and variations. Current state-of-the-art KR languages such as IDP and ASP aspire to be practical solvers for such problems (Bruynooghe, Theory Practice Logic Program. (TPLP) 15(6), 783–817 2015). Nevertheless, expressing the graph mining problem in these languages requires unexpectedly complicated and unintuitive encoding techniques. These techniques are in contrast to the ease with which one can transform the mathematical definition of graph mining to a higher-order logic specification, and distract from the problem essentials, complicating possible future adaptation. In this paper, we argue that efforts should be made towards supporting higher-order logic specifications in modern specification languages, without unintuitive and complicated encoding techniques. We argue that this not only makes representation clearer and more susceptible to future adaptation, but might also allow for faster, more competitive solver techniques to be implemented.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Abramson, H., Rogers, H.: Meta-Programming in Logic Programming. MIT Press (1989)

  2. Abrial, J.R.: The B-Book. Cambridge University Press. https://doi.org/10.1017/CBO9780511624162 (1996)

  3. Abrial, J.R.: Modeling in Event-B: System and Software Engineering. Cambridge University Press (2010)

  4. Aoga, J.O.R., Guns, T., Schaus, P.: An efficient algorithm for mining frequent sequence with constraint programming. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part II, Lecture Notes in Computer Science, vol. 9852, pp 315–330. Springer (2016), https://doi.org/10.1007/978-3-319-46227-1_20

  5. Babai, L.: Graph isomorphism in quasipolynomial time. CoRR 1512.03547 (2015)

  6. Bogaerts, B., Janhunen, T., Tasharrofi, S.: Solving QBF instances with nested SAT solvers. In: Darwiche, A. (ed.) Beyond NP, Papers from the 2016 AAAI Workshop, Phoenix, Arizona, USA, February 12, 2016., AAAI Workshops. http://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12603, vol. WS-16-05. AAAI Press (2016)

  7. Bowen, J.P.: Formal Specification and Documentation using Z. International Thomson Computer Press (1996)

  8. Brewka, G., Delgrande, J.P., Romero, J., Schaub, T.: asprin: Customizing answer set preferences without a headache. In: AAAI, pp. 1467–1474. AAAI Press (2015)

  9. Bruynooghe, M., Blockeel, H., Bogaerts, B., de Cat, B., Pooter, S.D., Jansen, J., Labarre, A., Ramon, J., Denecker, M., Verwer, S.: Predicate logic as a modeling language: modeling and solving some machine learning and data mining problems with IDP3. Theory Practice Logic Program. (TPLP) 15(6), 783–817 (2015). https://doi.org/10.1017/S147106841400009X

    Article  MathSciNet  MATH  Google Scholar 

  10. de Cat, B., Denecker, M., Bruynooghe, M., Stuckey, P.J.: Lazy model expansion: Interleaving grounding with search. J. Artif. Intell. Res. 52, 235–286 (2015). https://doi.org/10.1613/jair.4591

    Article  MathSciNet  MATH  Google Scholar 

  11. Chen, W., Kifer, M., Warren, D.S.: Hilog: A foundation for higher-order logic programming. J. Logic Program. 15(3), 187–230 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  12. Cuteri, B., Dodaro, C., Ricca, F., Schu̇ller, P.: Constraints, lazy constraints, or propagators in ASP solving: An empirical analysis. Theory Pract. Logic Program. (TPLP) 17(5-6), 780–799 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  13. Dasseville, I., van der Hallen, M., Janssens, G., Denecker, M.: Semantics of templates in a compositional framework for building logics. Theory Pract. Logic Program. (TPLP) 15(4-5), 681–695 (2015). https://doi.org/10.1017/S1471068415000319

    Article  MathSciNet  MATH  Google Scholar 

  14. De Cat, B., Bogaerts, B., Bruynooghe, M., Janssens, G., Denecker, M.: Predicate logic as a modelling language: The IDP system. CoRR 1401.6312v2 (2016)

  15. De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204–212 (2008)

  16. Eiter, T., Fink, M., Ianni, G., Krennwallner, T., Redl, C., Schu̇ller, P.: A model building framework for answer set programming with external computations. Theory Pract. Logic Program. (TPLP) 16(4), 418–464 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  17. Eiter, T., Ianni, G., Krennwallner, T.: Answer set programming: A primer. In: Reasoning Web, Lecture Notes in Computer Science, vol. 5689, pp. 40–110. Springer (2009)

  18. Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Answer Set Solving in Practice. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool Publishers (2012)

  19. Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Clingo = ASP + control: Preliminary report. CoRR 1405.3694 (2014)

  20. Gebser, M., Kaufmann, B., Schaub, T.: Solution enumeration for projected boolean search problems. In: Constraint Programming, Artificial Intelligence and Operations Research (CPAIOR), Lecture Notes in Computer Science, vol. 5547, pp. 71–86. Springer (2009)

  21. Guyet, T., Moinard, Y., Quiniou, R., Schaub, T.: Efficiency Analysis of ASP Encodings for Sequential Pattern Mining Tasks, pp 41–81. Springer International Publishing, Cham (2018)

    Google Scholar 

  22. van der Hallen, M., Janssens, G.: A grounder from second-order logic to qbf. In: Quantified Boolean Formulas, Papers from the 2018 FLoC Quantified Boolean Formulas and Beyond Workshop, Oxford, England, July 8, 2018 (accepted), Federated Logic Conference (FLoC): workshop proceedings (2018)

  23. van der Hallen, M., Paramonov, S., Leuschel, M., Janssens, G.: Knowledge representation analysis of graph mining. CoRR 1608.08956 (2016)

  24. Immerman, N.: Descriptive complexity and model checking. In: Arvind, V., Ramanujam, R. (eds.) Foundations of Software Technology and Theoretical Computer Science, 18th Conference, Chennai, India, December 17-19, 1998, Proceedings, Lecture Notes in Computer Science, vol. 1530, pp 1–5. Springer (1998), https://doi.org/10.1007/978-3-540-49382-2_1

  25. Järvisalo, M.: Itemset mining as a challenge application for answer set enumeration. Logic Programming and Nonmonotonic Reasoning (LPNMR), 304–310 (2011)

  26. Kaufmann, B., Leone, N., Perri, S., Schaub, T.: Grounding and solving in answer set programming. AI Mag. 37(3), 25–32 (2016). http://www.aaai.org/ojs/index.php/aimagazine/article/view/2672

    Article  Google Scholar 

  27. Kemmar, A., Lebbah, Y., Loudni, S., Boizumault, P., Charnois, T.: Prefix-projection global constraint and top-k approach for sequential pattern mining. Constraints 22(2), 265–306 (2017). https://doi.org/10.1007/s10601-016-9252-z

    Article  MathSciNet  MATH  Google Scholar 

  28. Lamport, L.: Specifying Systems, The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley (2002)

  29. Leuschel, M., Butler, M.J.: ProB: An automated analysis toolset for the B method. STTT 10(2), 185–203 (2008)

    Article  Google Scholar 

  30. Li, H., Yap, C.W., Ung, C.Y., Xue, Y., Cao, Z.W., Chen, Y.Z.: Effect of selection of molecular descriptors on the prediction of bloodbrain barrier penetrating and nonpenetrating agents by statistical learning methods. J. Chem. Inf. Model. 45(5), 1376–1384 (2005). https://doi.org/10.1021/ci050135u. PMID: 16180914

    Article  Google Scholar 

  31. Lonsing, F., Biere, A.: Depqbf: A dependency-aware QBF solver. JSAT 7(2–3), 71–76 (2010). http://jsat.ewi.tudelft.nl/content/volume7/JSAT7_6_Lonsing.pdf

    Google Scholar 

  32. Lonsing, F., Egly, U., Gelder, A.V.: Efficient clause learning for quantified boolean formulas via QBF pseudo unit propagation. In: Järvisalo, M., Gelder, A.V. (eds.) Theory and Applications of Satisfiability Testing - SAT 2013 - 16th International Conference, Helsinki, Finland, July 8-12, 2013. Proceedings, Lecture Notes in Computer Science, vol. 7962, pp 100–115. Springer (2013), https://doi.org/10.1007/978-3-642-39071-5_9

  33. McCarthy, J.: Elaboration tolerance. In: Working Papers of the Fourth International Symposium on Logical formalizations of Commonsense Reasoning, Commonsense-1998 (1998)

  34. Muggleton, S., Raedt, L.D.: Inductive logic programming: Theory and methods. J. Log. Program. 19(/20), 629–679 (1994). https://doi.org/10.1016/0743-1066(94)90035-3

    Article  MathSciNet  MATH  Google Scholar 

  35. Nijssen, S., Kok, J.N.: Frequent graph mining and its application to molecular databases. In: Proceedings of the IEEE International Conference on Systems, Man & Cybernetics: The Hague, Netherlands, 10-13 October 2004, pp. 4571–4577. IEEE. https://doi.org/10.1109/ICSMC.2004.1401252 (2004)

  36. Paramonov, S., Chen, T., Guns, T.: Generic mining of condensed pattern representations under constraints. In: CEUR: Young Scientist‘s Second International Workshop on Trends in Information Processing Proceedings (YSIP), vol. 1837, pp. 138–177 (2017)

  37. Peitl, T., Slivovsky, F., Szeider, S.: Dependency learning for QBF. In: Gaspers, S., Walsh, T. (eds.) Theory and Applications of Satisfiability Testing - SAT 2017 - 20th International Conference, Melbourne, VIC, Australia, August 28 - September 1, 2017, Proceedings, Lecture Notes in Computer Science, vol. 10491, pp 298–313. Springer (2017), https://doi.org/10.1007/978-3-319-66263-_19

  38. Rückert, U., Kramer, S.: Optimizing feature sets for structured data. In: Kok, J.N., Koronacki, J., de Mántaras, R.L., Matwin, S., Mladenic, D., Skowron, A. (eds.) Machine Learning: ECML 2007, 18th European Conference on Machine Learning, Warsaw, Poland, September 17-21, 2007, Proceedings, Lecture Notes in Computer Science, vol. 4701, pp 716–723. Springer (2007), https://doi.org/10.1007/978-3-540-74958-5_72

  39. Silva, J.P.M., Sakallah, K.A.: GRASP - a new search algorithm for satisfiability. In: International Conference on Computer-Aided Design (ICCAD), San Jose, California, USA, November 10-14 1996, pp. 220–227 (1996)

  40. Takigawa, I., Mamitsuka, H.: Graph mining: Procedure, application to drug discovery and recent advances. Drug Discov. Today 18(1), 50–57 (2013). https://doi.org/10.1016/j.drudis.2012.07.016. http://www.sciencedirect.com/science/article/pii/S1359644612002759

    Article  Google Scholar 

  41. Weinzierl, A.: Blending lazy-grounding and CDNL search for answer-set solving. In: Logic Programming and Nonmonotonic Reasoning (LPNMR), Lecture Notes in Computer Science, vol. 10377, pp. 191–204. Springer (2017)

  42. Yan, X., Han, J.: gspan: Graph-based substructure pattern mining. In: Proceedings of the 2002 IEEE International Conference on Data Mining, ICDM ’02, pp 721–. IEEE Computer Society, Washington (2002)

Download references

Acknowledgements

We would like to thank the reviewers for their insightful and helpful comments, thanks to which many improvements could be made.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Matthias van der Hallen.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Matthias van der Hallen is supported by a Ph.D. fellowship from the Research Foundation - Flanders (FWO - Vlaanderen).

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

van der Hallen, M., Paramonov, S., Janssens, G. et al. Knowledge representation analysis of graph mining. Ann Math Artif Intell 86, 21–60 (2019). https://doi.org/10.1007/s10472-019-09624-y

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10472-019-09624-y

Keywords

Mathematics Subject Classification (2010)

Navigation