56 Groundbreaking Python Open-source Projects – Get started with Python
Python is booming, and so is its presence on GitHub. This year was a great one for Python, and we saw some very powerful Python open-source projects to contribute to. Today, we’re listing some of the top Python open-source projects; try contributing to at least one of them to improve your Python skills.
56 Python Open-source Projects
Below are the details of the 56 Python open-source projects. Let’s start.
Flask is a micro web framework written in Python. It has no built-in form validation or database abstraction layer; instead, it lets you use third-party libraries for these common functions, which is why it is called a microframework. Flask is designed to make creating apps easy and fast, and it is scalable and lightweight. It is based on the Werkzeug and Jinja2 projects. You can learn more about it in DataFlair’s latest article on Python Flask.
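To give a feel for how little code a Flask app needs, here is a minimal sketch. The route paths and function names are made up for illustration, and the example uses Flask's built-in test client so it runs without starting a server.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    # A view function returns the response body for its route.
    return "Hello from Flask!"


@app.route("/square/<int:n>")
def square(n):
    # The <int:n> converter parses the path segment into a Python int.
    return jsonify(n=n, square=n * n)


# Exercise the routes in-process with the test client (no server needed).
client = app.test_client()
print(client.get("/").get_data(as_text=True))
print(client.get("/square/7").get_json())
```

During development, the same app can be served locally by calling `app.run()`.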
Keras is an open-source neural network library written in Python. It is user-friendly, modular, and extensible, and can run on top of TensorFlow, Theano, PlaidML, or Microsoft Cognitive Toolkit (CNTK). Keras has it all: layers, objectives, activation functions, optimizers, and much more. It also supports convolutional and recurrent neural networks.
Work on the latest Keras based python open-source project – Breast Cancer Classification
spaCy is an open-source software library for Natural Language Processing, written in Python and Cython. While NLTK is geared more toward teaching and research, spaCy aims to provide software for production use. Thinc is spaCy’s machine learning library; it features CNN models for part-of-speech tagging, dependency parsing, and named entity recognition.
Sentry offers hosted, open-source error monitoring so you can discover and triage errors in real time. Simply install the SDK for your language(s) or framework(s) and get started. It lets you capture unhandled exceptions, examine the stack trace, analyze the impact of each problem, track errors across different projects, assign issues, and much more. Using Sentry means fewer bugs and more shipped code.
OpenCV is an open-source computer vision and machine learning library. It has more than 2,500 optimized algorithms for computer vision tasks such as detecting and recognizing objects, classifying different human activities, tracking movements with the camera, producing 3D models of objects, and stitching images together into high-resolution images. The library is available for many languages, including Python, C++, and Java.
Number of stars on Github: 39,585
Have you worked on any OpenCV project yet? Here is one for FREE – Gender and Age Detection Project
Nilearn is a module for fast and easy statistical learning on neuroimaging data. It uses scikit-learn for multivariate statistics, with applications such as predictive modeling, classification, decoding, and connectivity analysis. Nilearn is part of the NiPy ecosystem, a community devoted to using Python for analyzing neuroimaging data.
Number of stars on Github: 549
Scikit-learn is another Python open-source project and a very well-known machine learning library. Often used with NumPy and SciPy, it offers classification, regression, and clustering, with support for SVMs (Support Vector Machines), random forests, gradient boosting, k-means, and DBSCAN. The library is written in Python and Cython for performance.
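As a small illustration of the clustering support mentioned above, the sketch below runs k-means on six made-up 2D points. The data and parameter values are assumptions chosen only to keep the example tiny.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated toy clusters (invented data for illustration).
points = np.array([
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.3],
])

# Estimators share the same pattern: configure, then fit (and predict).
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

print(labels)                  # one cluster label per point
print(model.cluster_centers_)  # coordinates of the two centroids
```

The other estimators named above follow the same configure-then-fit pattern, so swapping in a different algorithm usually changes only the import and the constructor call.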
Number of stars on Github: 37,144
PyTorch is another open-source machine learning library written in and for Python. This is based on the Torch library, and is great for domains like computer vision and natural language processing (NLP). It also has a C++ frontend. Among many other features, PyTorch offers two high-level ones:
- Tensor computing with strong acceleration using GPU
- Deep neural networks
Number of stars on Github: 31,779
Librosa is one of the best Python libraries for music and audio analysis. It provides the building blocks needed to retrieve information from music. The library is well documented and has several tutorials and examples to make your task easier.
Number of stars on Github: 3,107
Implement Python Open-source Project with Librosa – Speech Emotion Recognition
Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. It targets the NLP and information retrieval communities. The name is short for ‘generate similar’: early on, the tool generated a shortlist of articles similar to a given article. Gensim is clear, efficient, and scalable, and it provides an efficient, hassle-free implementation of unsupervised semantic modeling from plain text.
Number of stars on Github: 9,870
Django is a high-level Python framework that encourages rapid development and follows the DRY principle (Don’t Repeat Yourself). It is a very powerful framework and the most widely used web framework for Python. It follows the MTV pattern (Model-Template-View).
Number of stars on Github: 44,214
12. Face Recognition
Face Recognition is a popular project on GitHub. Billed as the world’s simplest face recognition library, it recognizes and manipulates faces from Python or from the command line. It uses dlib with deep learning to detect faces, with an accuracy of 99.38% on the Labeled Faces in the Wild benchmark.
Number of stars on Github: 28,267
Number of stars on Github: 10,291
pandas is a data analysis and manipulation library for Python that offers labeled data structures and statistical functions.
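To make "labeled data structures" concrete, here is a minimal sketch with an invented three-row table. The column names and values are assumptions for illustration only.

```python
import pandas as pd

# Rows and columns both carry labels, so values are addressed by name.
df = pd.DataFrame(
    {"height_cm": [170, 182, 165], "weight_kg": [65.0, 80.5, 58.2]},
    index=["alice", "bob", "carol"],
)

print(df.loc["bob", "weight_kg"])  # label-based lookup of a single cell
print(df["height_cm"].mean())      # a statistical function on one column
print(df.describe())               # summary statistics for every column
```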
Number of stars on Github: 21,404
Python open-source project to try with Pandas – Detecting Parkinson Disease
Pipenv promises to be a production-ready tool that aims to bring the best of all packaging worlds to the world of Python. Its terminal colors are pretty, and it harnesses Pipfile, pip, and virtualenv into one command. It automatically creates and manages a virtualenv for your projects and gives users an easy way to set up a working environment.
Number of stars on Github: 18,322
SimpleCoin is a simple, insecure, and incomplete implementation of a blockchain for a cryptocurrency, written in Python. It is not meant for production use; it is for educational purposes and just aims to make a working blockchain currency while keeping it simple. It lets you preserve mined hashes and exchange them in any supported currency.
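The core idea behind an educational blockchain like this can be sketched in a few lines of standard-library Python. This is not SimpleCoin's code: the block fields, function names, and difficulty value below are assumptions chosen for illustration, and like SimpleCoin itself the sketch is not secure.

```python
import hashlib
import json


def block_hash(block):
    """Hash the block's contents (everything except the stored hash)."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "data", "prev_hash", "nonce")},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def mine_block(index, data, prev_hash, difficulty=3):
    """Proof of work: try nonces until the hash starts with enough zeros."""
    nonce = 0
    while True:
        block = {"index": index, "data": data,
                 "prev_hash": prev_hash, "nonce": nonce}
        digest = block_hash(block)
        if digest.startswith("0" * difficulty):
            block["hash"] = digest
            return block
        nonce += 1


def is_valid(chain, difficulty=3):
    """Check each stored hash, its proof of work, and the link to its parent."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if not block["hash"].startswith("0" * difficulty):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


chain = [mine_block(0, "genesis", "0" * 64)]
for i, data in enumerate(["alice pays bob 5", "bob pays carol 2"], start=1):
    chain.append(mine_block(i, data, chain[-1]["hash"]))
print(is_valid(chain))

# Editing a mined block changes its hash, so validation fails.
tampered = [dict(block) for block in chain]
tampered[1]["data"] = "alice pays bob 500"
print(is_valid(tampered))
```

Because every block stores the hash of its predecessor, changing any earlier block invalidates that block and breaks the links that follow it.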
Number of stars on Github: 1,343
This is a 3D rendering library written in vanilla Python. It renders 2D, 3D, and higher-dimensional objects and scenes, as well as animations. It finds use in creating videos, video games, physical simulations, and even pretty pictures. It requires PIL, numpy, and scipy.
Number of stars on Github: 451
MicroPython is Python for microcontrollers. It is an efficient implementation of Python 3 that ships with many packages from the Python standard library and is optimized to run on microcontrollers and in constrained environments. The pyboard is a small electronic circuit board that runs MicroPython on bare metal, so it can control all kinds of electronic projects.
Number of stars on Github: 9,197
Kivy is a Python library for development of mobile applications and other multitouch application software with a natural user interface (NUI). It has a graphic library, multiple widget options, the intermediate language Kv to design custom widgets, and input support for mouse, keyboard, TUIO, and multitouch events. This is an open-source library for rapid development of applications with innovative UIs. It is cross-platform, business-friendly, and GPU accelerated.
Number of stars on Github: 9,930
Excellent! You have read about the 19 python open-source projects.
Don’t forget to bookmark – Python projects with source code
Number of stars on Github: 9,883
Magenta is an open-source research project that focuses on machine learning as a tool in the creative process. This lets you create music and art using machine learning. It is a Python library powered by TensorFlow, and has utilities for manipulating source data, using it to train machine learning models, and using those to create new content.
22. Mask R-CNN
This is an implementation of Mask R-CNN on Python 3, TensorFlow, and Keras. The model takes each instance of an object in the image and creates bounding boxes and segmentation masks for it. It uses the Feature Pyramid Network (FPN) and a ResNet101 backbone. The code is easy to extend. This project also offers the Matterport3D dataset of 3D-reconstructed spaces captured by customers.
Number of stars on Github: 14,055
23. TensorFlow Models
This is a repository of different models implemented in TensorFlow: official models and research models. It also has samples and tutorials. The official models use TensorFlow’s high-level APIs. The research models are implemented in TensorFlow by researchers, and it is up to those researchers to maintain them and provide support on issues and pull requests.
Number of stars on Github: 57,745
Refer to this Free TensorFlow Tutorials Library and learn everything about TensorFlow
Snallygaster scans for secret files on HTTP servers: it looks for files that are accessible on web servers but shouldn’t be public and can pose a security risk.
Number of stars on Github: 1,477
This is a Python package that complements SciPy for statistical computations, including descriptive statistics and estimation and inference for statistical models. It provides classes and functions for these tasks, and it also lets us conduct statistical tests and explore data statistically.
Number of stars on Github: 4,246
This is an advanced firewall detection tool we can use to get an idea of whether there’s a web application firewall present. It detects a firewall on a web application and attempts to detect one or more bypasses for it on the specified target.
Number of stars on Github: 1,300
Chainer is a deep learning framework that focuses on flexibility. It is based in Python and offers differentiation APIs based on the define-by-run approach. Chainer also offers object-oriented high-level APIs to build and train neural networks. It is a powerful, flexible, and intuitive framework for neural networks.
Number of stars on Github: 5,054
This is a command-line tool that fetches results from Stack Overflow as soon as you get a compiler error. To use it, run your file with the rebound command. It was one of the 50 most popular Python open-source projects of 2018. It requires Python 3.0 or higher and supports Python, Node.js, Ruby, Golang, and Java files.
Number of stars on Github: 2,913
Detectron performs state-of-the-art object detection (also implements Mask R-CNN). It is Facebook AI Research’s (FAIR’s) software and is written in Python and powered by the Caffe2 Deep Learning framework. Detectron’s purpose is to provide a high-quality and high-performance codebase for object detection research. It is flexible and implements the following algorithms- Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, R-FCN.
Number of stars on Github: 21,873
Don’t Miss!! Practice top Data Science Projects for FREE
This one’s a library for automatically generating CLIs (Command Line Interfaces) from any Python object. It also helps you develop and debug code, explore existing code, or turn someone else’s code into a CLI. Python Fire makes it easier to transition between Bash and Python, and also makes using a Python REPL easier.
Number of stars on Github: 15,299
Pylearn2 is a Machine Learning library mostly built on top of Theano. Its purpose is to make ML research easy. It lets you write new algorithms and models.
Number of stars on Github: 2,681
matplotlib is a 2D plotting library for Python that produces publication-quality figures in a variety of hardcopy formats.
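As a brief sketch of the "different hardcopy formats" point, the example below renders one figure to PNG, PDF, and SVG in memory. The data is invented, and the non-interactive Agg backend is an assumption made so that the code runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt

xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(xs, ys, marker="o", label="y = x squared")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Write the same figure out in several formats.
outputs = {}
for fmt in ("png", "pdf", "svg"):
    buffer = io.BytesIO()
    fig.savefig(buffer, format=fmt, dpi=300)
    outputs[fmt] = buffer.getvalue()
plt.close(fig)

for fmt, data in outputs.items():
    print(fmt, len(data), "bytes")
```

Passing a filename such as `"plot.pdf"` to `savefig` instead of a buffer writes the figure to disk.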
Number of stars on Github: 10,072
Theano is a library for manipulating and evaluating mathematical expressions, especially matrix-valued ones. It is also an optimizing compiler. Theano uses a NumPy-like syntax to express computations and compiles them to run on CPU or GPU architectures. It is an open-source Python machine learning library that is written in Python and CUDA and runs on Linux, macOS, and Windows.
Number of stars on Github: 8,922
Multidiff is designed to make machine-friendly data easier to understand. It helps view the differences within a large number of objects by performing diffs between relevant objects and then displaying them. This visualization lets us look for patterns in proprietary protocols or unusual file formats. It is mostly used for reverse engineering and binary data analysis.
Number of stars on Github: 262
This project uses Self-Organizing Maps (SOMs) to tackle the Traveling Salesman Problem (TSP). Using a SOM, it finds sub-optimal solutions to the TSP, with problem instances supplied in the .tsp format. The TSP is NP-hard (its decision version is NP-complete), so it becomes harder to solve exactly as the number of cities increases.
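The general technique can be sketched in pure Python as follows. This is not the project's code: the function names, neuron count, decay rates, and city coordinates are all assumptions for illustration. A ring of neurons is pulled toward randomly sampled cities, and the final route visits the cities in the order of their nearest neurons along the ring.

```python
import math
import random


def som_tsp(cities, iterations=5000, learning_rate=0.8, seed=0):
    """Return a visiting order for `cities` using a ring-shaped SOM."""
    rng = random.Random(seed)
    n_neurons = 8 * len(cities)
    # Neurons start at random positions in the unit square.
    neurons = [[rng.random(), rng.random()] for _ in range(n_neurons)]
    radius = n_neurons / 2

    def closest_neuron(city):
        return min(
            range(n_neurons),
            key=lambda i: (neurons[i][0] - city[0]) ** 2
            + (neurons[i][1] - city[1]) ** 2,
        )

    for _ in range(iterations):
        city = rng.choice(cities)
        winner = closest_neuron(city)
        for i, neuron in enumerate(neurons):
            # Distance along the ring, wrapping around its ends.
            ring = abs(i - winner)
            ring = min(ring, n_neurons - ring)
            influence = math.exp(-(ring ** 2) / (2 * radius ** 2))
            neuron[0] += learning_rate * influence * (city[0] - neuron[0])
            neuron[1] += learning_rate * influence * (city[1] - neuron[1])
        # Shrink the step size and neighborhood so the ring settles.
        learning_rate *= 0.9997
        radius = max(radius * 0.997, 1.0)

    # Visit cities in the order of their closest neurons on the ring.
    return sorted(range(len(cities)), key=lambda c: closest_neuron(cities[c]))


def tour_length(cities, route):
    """Length of the closed tour that visits cities in `route` order."""
    return sum(
        math.hypot(cities[a][0] - cities[b][0], cities[a][1] - cities[b][1])
        for a, b in zip(route, route[1:] + route[:1])
    )


# Invented city coordinates, already scaled to the unit square.
cities = [(0.1, 0.1), (0.9, 0.1), (0.9, 0.9), (0.1, 0.9),
          (0.5, 0.05), (0.95, 0.5), (0.5, 0.95), (0.05, 0.5)]
route = som_tsp(cities)
print(route, round(tour_length(cities, route), 3))
```

The result is a heuristic tour with no optimality guarantee, which is the trade-off the SOM approach makes for speed.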
Number of stars on Github: 950
Number of stars on Github: 5,714
37. Social Mapper
Social Mapper is a social media mapping tool that correlates profiles using facial recognition. It does this across different websites on a large scale: it automates searching for names and pictures on social media websites, then tries to accurately detect and group somebody’s presence. Then, it creates a report for a human to review. This is useful in the security industry (e.g., for phishing). It supports the platforms LinkedIn, Facebook, Twitter, Google Plus, Instagram, VKontakte, Weibo, and Douban.
Number of stars on Github: 2,396
Camelot is a Python library that helps extract tables from PDF files. This works with text-based PDFs, but not with scanned documents. Here, each table is a pandas DataFrame; also, you can then export the tables in .json, .xls, .html, or .sqlite.
Number of stars on Github: 2,415
This is a Qt-based ebook reader. It supports the .pdf, .epub, .djvu, .fb2, .mobi, .azw/.azw3/.azw4, .cbr/.cbz, and .md file formats. Lector has a main window, table view, book reading view, distraction-free view, annotation support, comic reading view, and a settings window. It also supports bookmarks, viewing profiles, metadata editor, and an in-program dictionary.
Number of stars on Github: 835
This is a Telegram bot for self-testing of depression and anxiety.
Number of stars on Github: 145
This is an animation engine for explanatory math videos, and can be used to programmatically create precise animation. It uses Python for this.
Number of stars on Github: 13,491
This is a Python bot for a Tinder-like application. This is in Chinese.
Number of stars on Github: 5,959
This is a Cross Site Scripting detection package with four handwritten parsers, an intelligent payload generator, a powerful fuzzing engine, and an exceptionally fast crawler. Rather than injecting payloads and checking whether they work, it analyzes the response with multiple parsers.
Number of stars on Github: 7,050
This project is a collection of Python code for robotics algorithms, including algorithms for autonomous navigation.
Number of stars on Github: 6,746
45. Google Images Download
Google Images Download is a command line Python program that searches for keywords on Google images and gets the images for you. This is a small program without dependencies if you only need to download up to 100 images per keyword.
Number of stars on Github: 5,749
This lets you track and execute intelligent social engineering attacks in real-time. This helps discover how large Internet companies can obtain confidential information and control users without them knowing. This also tries to help track cybercriminals.
Number of stars on Github: 4,256
Xonsh is a cross-platform Unix-gazing shell language and command prompt based on Python. It is a superset of Python 3.5+ and has additional shell primitives like those in Bash and IPython. Xonsh works on Linux, Mac OS X, Windows, and other major systems.
Number of stars on Github: 3,426
48. GIF for CLI
This takes in a GIF, or a short video or query, and using the Tenor GIF API, converts it into animated ASCII art. It uses ANSI escape sequences for animation and color.
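The rendering step can be sketched with standard-library Python alone. This is not the project's code: the character ramp, function names, and tiny hand-made frames are assumptions for illustration, and fetching or decoding a real GIF is left out. Each pixel becomes a character chosen by brightness, colored with an ANSI 24-bit color sequence, and frames are replayed by clearing the screen between them.

```python
import sys
import time

ASCII_RAMP = " .:-=+*#%@"  # darkest to brightest


def frame_to_ascii(frame):
    """Convert rows of (r, g, b) pixels into colored ASCII text."""
    lines = []
    for row in frame:
        cells = []
        for r, g, b in row:
            brightness = (r + g + b) / 3
            char = ASCII_RAMP[int(brightness / 256 * len(ASCII_RAMP))]
            # ESC[38;2;R;G;Bm sets a 24-bit foreground color.
            cells.append(f"\x1b[38;2;{r};{g};{b}m{char}")
        lines.append("".join(cells) + "\x1b[0m")  # reset color at line end
    return "\n".join(lines)


def play(frames, delay=0.1, loops=1):
    """Animate by redrawing each frame from the top-left corner."""
    for _ in range(loops):
        for frame in frames:
            # ESC[H moves the cursor home; ESC[2J clears the screen.
            sys.stdout.write("\x1b[H\x1b[2J" + frame_to_ascii(frame) + "\n")
            sys.stdout.flush()
            time.sleep(delay)


if __name__ == "__main__":
    red, white, black = (255, 0, 0), (255, 255, 255), (0, 0, 0)
    frames = [
        [[red, black], [black, white]],
        [[black, red], [white, black]],
    ]
    play(frames, delay=0.05, loops=2)
```

Terminals without 24-bit color support would need the 256-color form of the sequence instead.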
Number of stars on Github: 2,547
Draw This is a polaroid camera that draws cartoons. It uses a neural network for object recognition, the Google Quick, Draw! dataset, a thermal printer, and a Raspberry Pi. Quick, Draw! is a game by Google that challenges players to draw a picture of an object or idea and then tries to guess what it represents in less than 20 seconds.
Number of stars on Github: 1,760
Learn more about Neural Networks through the latest article on Artificial Neural Networks.
Zulip is a group chat application that is real-time and also productive because of threaded conversations. Many Fortune 500 companies and open-source projects use it for a real-time chat system that can process thousands of messages in a day.
Number of stars on Github: 10,432
This is a command-line program that can download videos from YouTube and a few other sites. It is not platform-specific.
Number of stars on Github: 55,868
This is a simple IT automation system that can handle the following- configuration management, application deployment, cloud provisioning, ad-hoc task execution, network automation, and multi-node orchestration.
Number of stars on Github: 39,443
HTTPie is a command-line HTTP client. It makes CLI interaction with web services simpler. For the http command, it lets us send arbitrary HTTP requests with a simple syntax, and get colored output. We can use this to test, debug, and interact with HTTP servers.
Number of stars on Github: 43,199
54. Tornado Web Server
This is a web framework and an asynchronous networking library for Python. It uses non-blocking network I/O to scale to tens of thousands of open connections, which makes it a good choice for long polling and WebSockets.
Number of stars on Github: 18,306
requests is a library that lets you easily send HTTP/1.1 requests. You don’t need to manually add query strings to URLs or form-encode PUT and POST data.
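The sketch below shows the encoding that requests handles for you. It only prepares the requests without sending them, so it runs offline; the example.org URLs and parameter names are placeholders.

```python
import requests

# Query parameters go in a dict; requests encodes them into the URL.
get_request = requests.Request(
    "GET", "https://example.org/search", params={"q": "python", "page": 2}
).prepare()
print(get_request.url)

# A dict passed as `data` is form-encoded into the request body.
post_request = requests.Request(
    "POST", "https://example.org/login", data={"user": "alice", "lang": "en"}
).prepare()
print(post_request.headers["Content-Type"])
print(post_request.body)
```

In everyday use the same arguments are passed straight to `requests.get(url, params=...)` or `requests.post(url, data=...)`, which prepare and send the request in one step.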
Number of stars on Github: 40,294
scrapy is a fast, high-level web crawling and scraping framework that you can use to crawl websites and extract structured data from them. You can also use it for data mining, monitoring, and automated testing.
Number of stars on Github: 34,493
So, these were all 56 Python open-source projects that you can learn from and also contribute to. Want to add to the list? Comment below.
Also, you can practice more interesting projects by enrolling for the best Python Online Course