Python Modules for Data Science & Analytics
A collection of important Python modules for data scientists
This is a part of Python Knowledge and Resources List

Pandas
Pandas is a library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. Pandas is free software released under the three-clause BSD license.
Website: http://pandas.pydata.org/
Installation
Installing pandas and the rest of the NumPy and SciPy stack can be a little difficult for inexperienced users.
The easiest way to install pandas is to install it as part of the Anaconda distribution.
pandas can be installed via pip from PyPI.
pip install pandas
This will likely require the installation of a number of dependencies, including NumPy, will require a compiler to compile required bits of code, and can take a few minutes to complete.
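As a minimal sketch of the table and time-series operations mentioned above, the example below builds a small DataFrame and aggregates it in two ways. The column names and sales figures are made up purely for illustration.

```python
import pandas as pd

# Hypothetical daily sales figures, used only for illustration.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
sales = pd.DataFrame(
    {
        "store": ["A", "B", "A", "B", "A", "B"],
        "units": [10, 7, 12, 9, 11, 8],
    },
    index=dates,
)

# Table-style operation: total units sold per store.
units_per_store = sales.groupby("store")["units"].sum()

# Time-series operation: resample the daily rows into 2-day totals.
two_day_totals = sales["units"].resample("2D").sum()

print(units_per_store)
print(two_day_totals)
```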

Statsmodels
Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics are available for different types of data and each estimator.
Website: http://statsmodels.sourceforge.net/
Installation
You can obtain source distributions and Windows binaries from PyPI. Alternatively, you can use setuptools to install statsmodels:
easy_install statsmodels
or upgrade with:
easy_install -U statsmodels
Statsmodels can also be installed from source in the usual way with the command:
python setup.py install

scikit-learn
scikit-learn is an open source machine learning library for Python. It features various classification, regression and clustering algorithms including support vector machines, logistic regression, naive Bayes, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
Website: http://scikit-learn.org/stable/
Installation
At this time scikit-learn does not provide official binary packages for Linux, so you have to build from source.
Installing from source requires you to have installed the scikit-learn runtime dependencies, Python development headers and a working C/C++ compiler. Under Debian-based operating systems, which include Ubuntu, if you have Python 2 you can install all these requirements by issuing:
sudo apt-get install build-essential python-dev python-setuptools \
python-numpy python-scipy \
libatlas-dev libatlas3gf-base
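Once the library is installed, a classifier can be trained in a few lines. The sketch below uses logistic regression, one of the algorithms listed above, on the iris dataset bundled with scikit-learn. The 25% test split and the random seed are arbitrary choices for this example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the bundled iris dataset as NumPy arrays.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for evaluation (split size is arbitrary here).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the classifier; max_iter is raised so the solver converges.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = clf.score(X_test, y_test)
print("test accuracy:", accuracy)
```

Other estimators follow the same fit/predict/score pattern, so swapping in a different model usually means changing only the line that constructs clf.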

Mlpy
Mlpy is a Python machine learning library built on top of NumPy/SciPy and the GNU Scientific Library. mlpy provides a wide range of machine learning methods for supervised and unsupervised problems. mlpy is multiplatform and works with Python 2 and 3.
Website: http://mlpy.sourceforge.net/
Installation
Download the latest version for your OS from http://sourceforge.net/projects/mlpy/files/
You need GCC, Python, NumPy, SciPy, and GSL preinstalled.
Then, from the terminal, run:
python setup.py install

NumPy
NumPy is an open source extension module for Python. It provides fast, precompiled functions for numerical routines.
It adds support to Python for large, multidimensional arrays and matrices. It also supplies a large library of high-level mathematical functions to operate on these arrays.
Website: http://www.numpy.org/
Installation
Most of the major Linux distributions provide packages for NumPy, but these can lag behind the most recent NumPy release. Prebuilt binary packages for Ubuntu are available on the SciPy PPA. Red Hat binaries are available in Enthought Canopy.
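As a short sketch of the array support described above, the example below creates a small two-dimensional array and applies a few of the built-in operations. The values are arbitrary.

```python
import numpy as np

# A 2 x 3 array (matrix) of floats; the values are arbitrary.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Reduce along axis 0 to get the mean of each column.
column_means = a.mean(axis=0)

# Arithmetic is applied element by element, without explicit loops.
doubled = a * 2

# Matrix product of a (2 x 3) with its transpose (3 x 2) gives a 2 x 2 result.
product = a @ a.T

print(column_means)
print(doubled)
print(product)
```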

SciPy
SciPy is widely used in scientific and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.
Website: http://www.scipy.org/
Installation
Users on Linux can quickly install the necessary packages from repositories.
For example, Ubuntu users can install the dependencies by running:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
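Two of the task areas listed above, integration and optimization, can be sketched as follows. The integrand and the function being minimized are simple made-up examples with known answers (1/3 and x = 2).

```python
from scipy import integrate, optimize

# Numerically integrate x^2 from 0 to 1; the exact answer is 1/3.
# quad returns the estimate and an upper bound on the absolute error.
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# Minimize the one-dimensional function (x - 2)^2; the minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

print("integral:", area)
print("minimizer:", result.x)
```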

matplotlib
matplotlib is a plotting library for Python and its numerical extension NumPy.
Website: http://matplotlib.org/
Installation
sudo apt-get install python-matplotlib
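A minimal plotting sketch, assuming a headless environment: it selects the non-interactive Agg backend and renders a sine curve into an in-memory PNG rather than opening a window. The curve and labels are illustrative choices.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np

# Sample one period of a sine wave from a NumPy array.
x = np.linspace(0.0, 2.0 * np.pi, 100)

fig, ax = plt.subplots()
(line,) = ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()

# Render the figure as PNG bytes in memory instead of writing a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

In an interactive session you would typically drop the matplotlib.use("Agg") line and call plt.show() instead of saving to a buffer.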

NLTK
The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for statistical natural language processing (NLP) in Python. NLTK includes graphical demonstrations and sample data. NLTK has been used successfully as a platform for prototyping and building research systems.
Website: http://www.nltk.org/
Installation
On Ubuntu:
sudo pip install -U numpy
sudo pip install -U nltk
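As a small sketch of what NLTK offers, the example below stems words and counts token frequencies. It splits on whitespace instead of using an NLTK tokenizer, an assumption made here so that no extra data packages need to be downloaded. The sentence is made up.

```python
from nltk import FreqDist
from nltk.stem import PorterStemmer

text = "the cats and the dogs are running"

# Plain whitespace split; NLTK's own tokenizers need extra data downloads.
tokens = text.split()

# Reduce each token to its stem with the Porter stemming algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

# Count how often each token occurs.
freq = FreqDist(tokens)

print(stems)
print(freq.most_common(3))
```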

Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multidimensional arrays efficiently.

nolearn
This package contains a number of utility modules that are helpful with machine learning tasks. Most of the modules work together with scikit-learn; others are more generally useful.

PyBrain
PyBrain is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library. Its goal is to offer flexible, easy-to-use yet still powerful algorithms for machine learning tasks and a variety of predefined environments to test and compare your algorithms.

Orange
Orange is a component-based data mining and machine learning software suite, featuring a visual programming front end for explorative data analysis and visualization, and Python bindings and libraries for scripting. It includes a set of components for data preprocessing, feature scoring and filtering, modeling, model evaluation, and exploration techniques. It is implemented in C++ and Python. Its graphical user interface builds upon the cross-platform Qt framework. Unlike its competitors scikit-learn and mlpy, Orange does not tie into NumPy and its ecosystem of tools; it focuses on traditional, symbolic algorithms more than numeric ones.
Website: http://orange.biolab.si/

Keras
Keras is a minimalist, highly modular neural network library in the spirit of Torch, written in Python, that uses Theano under the hood for fast tensor manipulation on GPU and CPU. It was developed with a focus on enabling fast experimentation.

Hebel
Hebel is a library for deep learning with neural networks in Python using GPU acceleration with CUDA through PyCUDA. It implements the most important types of neural network models and offers a variety of different activation functions and training methods such as momentum, Nesterov momentum, dropout, and early stopping.
Website: https://github.com/hannesbrt/hebel