As a cybersecurity student exploring the world of data science, I often find myself downloading packages using the
pip package manager, not only for data science projects (inside conda environments) but also for Python-based security tools (which require dependencies).
I just want to bring the article below to your attention, as I saw similar “news” in a tweet that was reshared by another dev on LinkedIn.
I also recalled attending a talk on how depending on too many open source libraries is not always a good thing (esp. in enterprises) and how a typo may lead you to download a malicious package (so check the spelling).
What to do?
- Don’t do mass bogus uploads like this to prove your point. We appreciate the message you are trying to deliver, but it’s already been documented so you are just making distracting work for other people who could more usefully be doing something else for the project.
- Don’t choose a PyPI package just because the name looks right. Check that you really are downloading the right module from the right publisher. Even legitimate modules sometimes have names that clash, compete or confuse.
- Don’t hook internal projects to external repositories by mistake. If you are using Python packages that you haven’t published externally, then the one thing you can be sure of is that all external copies of “your” package are imposter modules, probably malware.
- Don’t blindly download package updates into your own development or build systems. Test and review everything you download before you approve it for use. Remember that packages typically include update-time scripts that run when you do the update, so malware infections could be delivered as part of the update process, not of the module source code that ultimately gets installed.
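To make the "check the spelling" advice a bit more concrete, here is a minimal sketch of a typo check you could run before installing anything. It is only an illustration: the `KNOWN_PACKAGES` allowlist, the `check_package_name` helper and the `0.8` similarity cutoff are my own assumptions, not part of pip or PyPI, and it does not replace verifying the publisher yourself.

```python
import difflib

# Hypothetical allowlist of packages you have already vetted (an assumption
# for this sketch; in practice you would maintain your own list).
KNOWN_PACKAGES = ["numpy", "pandas", "requests", "scikit-learn", "matplotlib"]


def check_package_name(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Classify a requested package name against a vetted allowlist.

    Returns a (status, match) tuple where status is one of:
      "ok"         - the name is exactly on the allowlist
      "suspicious" - the name is not on the list but closely resembles one
                     that is, so it may be a typo (or a typosquat)
      "unknown"    - nothing similar is on the list; review it manually
    """
    # Compare names case-insensitively and treat "_" and "-" as the same.
    normalized = name.strip().lower().replace("_", "-")
    if normalized in known:
        return ("ok", normalized)
    # difflib ranks candidates by a similarity ratio between 0 and 1.
    close = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    if close:
        return ("suspicious", close[0])
    return ("unknown", None)


if __name__ == "__main__":
    for candidate in ["requests", "reqeusts", "some-internal-tool"]:
        print(candidate, check_package_name(candidate))
```

A "suspicious" result only means the name looks like a near miss of something you trust, so it is a prompt to stop and double-check, not proof that the package is malicious.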
Let me know your thoughts below as I would love to hear other perspectives about this, not just from a security point of view.
Cheers and happy coding!