Show HN: I mirrored all the code from PyPI to GitHub and analysed it
This is a side project I've been working on for the last few months. I built an automated system to continuously mirror all the code on PyPI to a series of GitHub repositories. Mirroring PyPI code to GitHub enables:
1. Scanning of all new Python packages for accidentally published credentials
2. A browsable/searchable index of published code with a nice UI
3. Large-scale analysis of all published code to see how the language is evolving
With this project, anyone can download the contents of PyPI to their own machine and analyse every piece of code ever published, all in a matter of hours.
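As a rough illustration of the kind of analysis this makes possible, here is a minimal sketch that counts how often a few newer syntax features appear in a local directory of Python files. It is not the project's actual tooling: the `FEATURES` table and both function names are hypothetical, it uses only the standard library `ast` module, and it assumes the code has already been downloaded to disk. Files that the running interpreter cannot parse (Python 2 code, for example) are simply skipped.

```python
import ast
from collections import Counter
from pathlib import Path

# Hypothetical feature labels mapped to the AST node types that signal them.
FEATURES = {
    "walrus (:=)": ast.NamedExpr,                     # Python 3.8+
    "f-string": ast.JoinedStr,                        # Python 3.6+
    "async def": ast.AsyncFunctionDef,                # Python 3.5+
    "match statement": getattr(ast, "Match", None),   # Python 3.10+, else None
}


def count_features(source: str) -> Counter:
    """Count occurrences of each tracked feature in one source string."""
    counts = Counter()
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError):
        # Unparseable with this interpreter (e.g. Python 2 code): skip it.
        return counts
    for node in ast.walk(tree):
        for name, node_type in FEATURES.items():
            if node_type is not None and isinstance(node, node_type):
                counts[name] += 1
    return counts


def count_features_in_tree(root: str) -> Counter:
    """Sum feature counts over every .py file found under root."""
    total = Counter()
    for path in Path(root).rglob("*.py"):
        try:
            source = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # Unreadable file: ignore and move on.
        total += count_features(source)
    return total
```

Calling `count_features_in_tree("some/local/checkout")` on any directory of `.py` files returns a `Counter` keyed by feature label. Grouping those counts by release date would be one way to look at how the language is evolving.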
I hope it enables people to do things with the world's largest and oldest corpus of Python code that weren't possible before. This is likely totally useless to most people, but I think it is kind of cool and unique.
Comments URL: https://news.ycombinator.com/item?id=37364290
Points: 19
# Comments: 5