On the impact of security vulnerabilities in the npm and RubyGems dependency networks

Empirical Software Engineering

Abstract

The increasing interest in open source software has led to the emergence of large language-specific package distributions of reusable software libraries, such as npm and RubyGems. These software packages can be subject to vulnerabilities that may expose dependent packages through explicitly declared dependencies. Using Snyk’s vulnerability database, this article empirically studies vulnerabilities affecting npm and RubyGems packages. We analyse how and when these vulnerabilities are disclosed and fixed, and how their prevalence changes over time. We also analyse how vulnerable packages expose their direct and indirect dependents to vulnerabilities. We distinguish between two types of dependents: packages distributed via the package manager, and external GitHub projects depending on npm packages. We observe that the number of vulnerabilities in npm is increasing, and that they are being disclosed faster than vulnerabilities in RubyGems. For both package distributions, the time required to disclose vulnerabilities is increasing over time. Vulnerabilities in npm packages affect a median of 30 package releases, compared with 59 releases in RubyGems packages. A large proportion of external GitHub projects is exposed to vulnerabilities coming from direct or indirect dependencies. Of the dependency vulnerabilities to which projects and packages are exposed, 33% and 40%, respectively, have their fixes in more recent releases within the same major release range of the used dependency. Our findings reveal that more effort is needed to better secure open source package distributions.

Notes

  1. https://github.com/dominictarr/event-stream/issues/116

  2. See https://semver.org

  3. https://github.com/atom/atom/blob/master/package.json

  4. https://www.npmjs.com/package/mocha

  5. https://github.com/discourse/discourse/blob/master/Gemfile

  6. https://rubygems.org/gems/json

  7. https://github.com/left-pad/left-pad/issues/4

  8. https://www.theregister.com/2018/11/26/npm_repo_bitcoin_stealer/

  9. cve.mitre.org

  10. https://snyk.io/vuln

  11. https://snyk.io/vuln/SNYK-RUBY-RESTCLIENT-459900

  12. If \(n\) different tests are carried out over the same dataset, for each individual test one can only reject \(H_0\) if \(p < \frac{0.05}{n}\). In our case \(n = 48\), i.e., \(p < 0.001\).

  13. https://github.com/AhmedZerouali/vulnerability_analysis

  14. \(R^2 \in [0, 1]\), and the closer to 1, the better the model fits the data.

  15. According to libraries.io, in May 2021, npm contained 1.79M packages compared to “only” 173K packages in RubyGems.

  16. https://snyk.io/vuln/SNYK-DEBIAN9-SUDO-1065095

  17. https://cve.mitre.org/cve/request_id.html

  18. We implicitly assume here that the first unaffected release is the one containing the fix.

  19. This analysis included Malicious Package vulnerabilities.

  20. The two categories of directly and indirectly exposed package releases are non-exclusive.

  21. The two categories of directly and indirectly exposed projects are non-exclusive.

  22. Top-level packages are packages that do not have any dependent packages themselves.

  23. https://www.npmjs.com/package/sql

  24. https://www.npmjs.com/package/lodash

  25. https://nvd.nist.gov/vuln/detail/cve-2019-10744

  26. https://dependabot.com/

  27. https://nvd.nist.gov/

  28. https://nvd.nist.gov/vuln/detail/cve-2019-10744

  29. https://docs.npmjs.com/cli/v7/commands/npm-dedupe

  30. https://www.whitesourcesoftware.com/resources/blog/prototype-pollution-vulnerabilities/

  31. https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610

  32. https://docs.npmjs.com/cli/v7/configuring-npm/package-lock-json

  33. https://bundler.io/rationale.html

  34. https://libraries.io/search?q=wowdude

  35. https://libraries.io/npm/neat-106

  36. https://nvd.nist.gov/

  37. https://snyk.io/vuln/npm:wysihtml:20121229

  38. https://www.npmjs.com/package/wysihtml

  39. https://github.com/Voog/wysihtml/tags?after=0.4.0
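
The significance threshold described in note 12 is a Bonferroni correction, and it can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names `bonferroni_threshold` and `reject_null` are hypothetical, and only the values 0.05 and n = 48 are taken from the note.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def reject_null(p_value: float, alpha: float = 0.05, n_tests: int = 48) -> bool:
    """Reject H0 only if the p-value is below the corrected threshold.

    The defaults (alpha = 0.05, n_tests = 48) mirror note 12; the helper
    itself is a hypothetical name introduced for this sketch.
    """
    return p_value < bonferroni_threshold(alpha, n_tests)


# Corrected threshold for the 48 tests mentioned in note 12.
threshold = bonferroni_threshold(0.05, 48)
print(f"{threshold:.5f}")  # 0.00104, which the note rounds to 0.001
```

The exact threshold is slightly above 0.001, so comparing against the rounded value of 0.001, as the note does, is marginally more conservative than the exact correction.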

Acknowledgments

This research was partially funded by the Excellence of Science project 30446992 SECO-Assist financed by F.R.S.-FNRS and FWO-Vlaanderen, as well as FNRS Research Credit J015120 and FNRS Research Project T001718. We express our gratitude to the security team of Snyk for granting us permission to use their dataset of vulnerability reports for research purposes.

Author information

Corresponding author

Correspondence to Ahmed Zerouali.

Additional information

Communicated by: Jeffrey C. Carver

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zerouali, A., Mens, T., Decan, A. et al. On the impact of security vulnerabilities in the npm and RubyGems dependency networks. Empir Software Eng 27, 107 (2022). https://doi.org/10.1007/s10664-022-10154-1

  • DOI: https://doi.org/10.1007/s10664-022-10154-1
