Intro
Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.
Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales. Van Rossum led the language community until stepping down as leader in July 2018.
Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.
The Python interpreter and the extensive standard library are freely available in source or binary form for all major platforms from the Python website, https://www.python.org, and may be freely distributed. The same site also contains distributions of and pointers to many free third-party Python modules, programs and tools, and additional documentation.
Features and Philosophy
Python is a multi-paradigm programming language. Object-oriented programming and structured programming are fully supported, and many of its features support functional programming and aspect-oriented programming (including through metaprogramming and metaobjects, also known as magic methods). Many other paradigms are supported via extensions, including design by contract and logic programming.
Python uses dynamic typing, and a combination of reference counting and a cycle-detecting garbage collector for memory management. It also features dynamic name resolution (late binding), which binds method and variable names during program execution.
Python's design offers some support for functional programming in the Lisp tradition. It has filter(), map(), and reduce() functions; list comprehensions, dictionaries, and sets; and generator expressions.
The standard library has two modules (itertools and functools) that implement functional tools borrowed from Haskell and Standard ML.
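As a minimal sketch of the functional tools mentioned above (the sample data and variable names are illustrative only), the snippet below applies filter(), map(), and reduce() and then shows comprehension and itertools equivalents. Note that in Python 3, reduce() is imported from functools rather than being a built-in.

```python
from functools import reduce
from itertools import accumulate, takewhile

numbers = [1, 2, 3, 4, 5, 6]  # illustrative sample data

# filter() and map() return lazy iterators in Python 3, so wrap them in list().
evens = list(filter(lambda n: n % 2 == 0, numbers))
squares = list(map(lambda n: n * n, numbers))

# reduce() folds the sequence into a single value, starting from 0.
total = reduce(lambda acc, n: acc + n, numbers, 0)

# The same ideas written as a list comprehension, a dict comprehension,
# and a generator expression.
evens_comp = [n for n in numbers if n % 2 == 0]
square_by_number = {n: n * n for n in numbers}
sum_of_squares = sum(n * n for n in numbers)

# itertools: running totals, then keep values while they stay below 10.
running = list(accumulate(numbers))
small_running = list(takewhile(lambda t: t < 10, running))
```

The comprehension forms are usually considered the more idiomatic choice when the transformation is simple; the functional forms read well when a named function already exists.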
The language's core philosophy is summarized in the document The Zen of Python, which includes aphorisms such as:
- Beautiful is better than ugly
- Explicit is better than implicit
- Simple is better than complex
- Complex is better than complicated
- Readability counts
Rather than having all of its functionality built into its core, Python was designed to be highly extensible. This compact modularity has made it particularly popular as a means of adding programmable interfaces to existing applications.
Van Rossum's vision of a small core language with a large standard library and easily extensible interpreter stemmed from his frustrations with ABC, which espoused the opposite approach.
Syntax and Semantics
Python is meant to be an easily readable language. Its formatting is visually uncluttered, and it often uses English keywords where other languages use punctuation. Unlike many other languages, it does not use curly brackets to delimit blocks, and semicolons after statements are optional. It has fewer syntactic exceptions and special cases than C or Pascal.
Indentation
Python uses whitespace indentation, rather than curly brackets or keywords, to delimit blocks. An increase in indentation comes after certain statements; a decrease in indentation signifies the end of the current block. Thus, the program's visual structure accurately represents the program's semantic structure. This feature is also sometimes termed the off-side rule.
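A short illustration of how indentation alone delimits blocks; the function name classify and its data are hypothetical and chosen only for this sketch.

```python
def classify(numbers):
    """Return a list of (number, label) pairs."""
    labelled = []
    for n in numbers:                # the loop body is the deeper-indented block
        if n % 2 == 0:               # one more level for the if block
            label = "even"
        else:
            label = "odd"
        labelled.append((n, label))  # dedented: back in the loop body
    return labelled                  # dedented again: the loop has ended

result = classify([1, 2, 3])
```

Moving the return line one level deeper would place it inside the loop and change the behavior, which is why the visual structure and the semantic structure cannot drift apart.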
Statements & control flow
Python's statements include (among others):
- The if statement, which conditionally executes a block of code, along with else and elif (a contraction of else-if).
- The for statement, which iterates over an iterable object, capturing each element to a local variable for use by the attached block.
- The while statement, which executes a block of code as long as its condition is true.
- The try statement, which allows exceptions raised in its attached code block to be caught and handled by except clauses; it also ensures that clean-up code in a finally block will always be run regardless of how the block exits.
- The raise statement, used to raise a specified exception or re-raise a caught exception.
- The class statement, which executes a block of code and attaches its local namespace to a class, for use in object-oriented programming.
- The def statement, which defines a function or method.
- The with statement, introduced in Python 2.5 (released in September 2006), which encloses a code block within a context manager (for example, acquiring a lock before the block of code is run and releasing the lock afterwards, or opening a file and then closing it), allowing Resource Acquisition Is Initialization (RAII)-like behavior and replacing a common try/finally idiom.
- The pass statement, which serves as a NOP. It is syntactically needed to create an empty code block.
- The assert statement, used during debugging to check for conditions that ought to apply.
- The yield statement, which returns a value from a generator function. From Python 2.5, yield is also an operator. This form is used to implement coroutines.
- The import statement, which is used to import modules whose functions or variables can be used in the current program. There are three ways of using import: import <module name> [as <alias>] or from <module name> import * or from <module name> import <definition 1> [as <alias 1>], <definition 2> [as <alias 2>], ....
- The print statement, which was changed to the print() function in Python 3.
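Several of the statements listed above can be seen working together in the following sketch. The names ValidationError, parse_positive, and events are hypothetical and exist only for this example; the lock comes from the standard threading module.

```python
import threading

lock = threading.Lock()
events = []  # records what happened, in order


class ValidationError(Exception):
    pass  # pass fills the otherwise empty class body


def parse_positive(text):
    """Convert text to a positive int, logging each attempt."""
    try:
        value = int(text)
        if value <= 0:
            raise ValidationError("expected a positive number")
        return value
    except ValueError:
        # Only ValueError is handled here; ValidationError propagates.
        events.append("not a number")
        return None
    finally:
        # Runs however the try block exits: return, handled or unhandled error.
        events.append("checked " + repr(text))


# with acquires the lock on entry and releases it on exit.
with lock:
    held_inside = lock.locked()
held_after = lock.locked()

good = parse_positive("7")
bad = parse_positive("abc")
assert good == 7, "assert documents a condition that ought to hold"
```

The with block here stands in for the try/finally idiom it replaces: the release happens even if the body raises.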
Python does not support tail call optimization or first-class continuations, and, according to Guido van Rossum, it never will. However, better support for coroutine-like functionality is provided in 2.5, by extending Python's generators. Before 2.5, generators were lazy iterators; information was passed unidirectionally out of the generator. From Python 2.5, it is possible to pass information back into a generator function, and from Python 3.3, the information can be passed through multiple stack levels.
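The two generator extensions described above can be sketched as follows: send() passes a value back into a generator (Python 2.5 and later), and yield from forwards those values through another stack level (Python 3.3 and later). The function names are illustrative.

```python
def running_average():
    """Coroutine-style generator: receives numbers via send(), yields the mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # yield used as an expression
        total += value
        count += 1
        average = total / count


def delegating():
    # yield from passes send() calls through to the inner generator.
    yield from running_average()


gen = delegating()
primed = next(gen)      # advance to the first yield; nothing sent yet
first = gen.send(10)    # mean of [10]
second = gen.send(20)   # mean of [10, 20]
gen.close()
```

The initial next() call is required because a value can only be sent to a generator that is already paused at a yield.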
Expressions
Some Python expressions are similar to languages such as C and Java, while some are not:
- Addition, subtraction, and multiplication are the same, but the behavior of division differs. There are two types of division in Python: floor division (//) and true, floating-point division (/). Python also added the ** operator for exponentiation.
- From Python 3.5, the new @ infix operator was introduced. It is intended to be used by libraries such as NumPy for matrix multiplication.
- In Python, == compares by value, versus Java, which compares numerics by value and objects by reference. (Value comparisons in Java on objects can be performed with the equals() method.) Python's is operator may be used to compare object identities (comparison by reference). In Python, comparisons may be chained, for example a <= b <= c.
- Python uses the words and, or, not for its boolean operators rather than the symbolic &&, ||, ! used in Java and C.
- Python has a type of expression termed a list comprehension. Python 2.4 extended list comprehensions into a more general expression termed a generator expression.
- Anonymous functions are implemented using lambda expressions; however, these are limited in that the body can only be one expression.
- Conditional expressions in Python are written as x if c else y (different in order of operands from the c ? x : y operator common to many other languages).
- Python makes a distinction between lists and tuples. Lists are written as [1, 2, 3], are mutable, and cannot be used as the keys of dictionaries (dictionary keys must be immutable in Python). Tuples are written as (1, 2, 3), are immutable and thus can be used as the keys of dictionaries, provided all elements of the tuple are immutable. The + operator can be used to concatenate two tuples, which does not directly modify their contents, but rather produces a new tuple containing the elements of both provided tuples. Thus, given the variable t initially equal to (1, 2, 3), executing t = t + (4, 5) first evaluates t + (4, 5), which yields (1, 2, 3, 4, 5), which is then assigned back to t, thereby effectively "modifying the contents" of t, while conforming to the immutable nature of tuple objects. Parentheses are optional for tuples in unambiguous contexts.
- Python features sequence unpacking where multiple expressions, each evaluating to anything that can be assigned to (a variable, a writable property, etc.), are associated in the identical manner to that forming tuple literals and, as a whole, are put on the left hand side of the equal sign in an assignment statement. The statement expects an iterable object on the right hand side of the equal sign that produces the same number of values as the provided writable expressions when iterated through, and will iterate through it, assigning each of the produced values to the corresponding expression on the left.
- Python has a "string format" operator %. This functions analogously to printf format strings in C, e.g. "spam=%s eggs=%d" % ("blah", 2) evaluates to "spam=blah eggs=2". In Python 3 and 2.6+, this was supplemented by the format() method of the str class, e.g. "spam={0} eggs={1}".format("blah", 2). Python 3.6 added "f-strings": blah = "blah"; eggs = 2; f'spam={blah} eggs={eggs}'.
- Python has various kinds of string literals:
  - Strings delimited by single or double quote marks. Unlike in Unix shells, Perl and Perl-influenced languages, single quote marks and double quote marks function identically. Both kinds of string use the backslash (\) as an escape character. String interpolation became available in Python 3.6 as "formatted string literals".
  - Triple-quoted strings, which begin and end with a series of three single or double quote marks. They may span multiple lines and function like here documents in shells, Perl and Ruby.
  - Raw string varieties, denoted by prefixing the string literal with an r. Escape sequences are not interpreted; hence raw strings are useful where literal backslashes are common, such as regular expressions and Windows-style paths. Compare "@-quoting" in C#.
- Python has array index and array slicing expressions on lists, denoted as a[key], a[start:stop] or a[start:stop:step]. Indexes are zero-based, and negative indexes are relative to the end. Slices take elements from the start index up to, but not including, the stop index. The third slice parameter, called step or stride, allows elements to be skipped and reversed. Slice indexes may be omitted, for example a[:] returns a copy of the entire list. Each element of a slice is a shallow copy.
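The expression forms in the list above can be tried out with a short sketch like the one below. It assumes Python 3.6 or later (for the f-string); all variable names and values are illustrative.

```python
# Tuples are immutable: + builds a new tuple, and t is rebound to it.
t = (1, 2, 3)
original = t
t = t + (4, 5)

# A tuple of immutable elements can serve as a dictionary key.
distances = {("A", "B"): 12}

# Sequence unpacking, including the common swap idiom.
x, y = 1, 2
x, y = y, x

# Three ways to format the same string.
blah, eggs = "blah", 2
percent_style = "spam=%s eggs=%d" % (blah, eggs)
format_style = "spam={0} eggs={1}".format(blah, eggs)
fstring_style = f"spam={blah} eggs={eggs}"

# Raw strings leave backslashes alone.
raw_len = len(r"\n")     # two characters: backslash and n
cooked_len = len("\n")   # one character: newline

# Indexing and slicing.
a = [0, 1, 2, 3, 4, 5]
last = a[-1]
middle = a[1:4]
every_other = a[::2]
backwards = a[::-1]
copy = a[:]

# Conditional expression, and the word-based boolean operators.
parity = "even" if last % 2 == 0 else "odd"
in_range = last > 0 and not last > 10

# == compares values; is compares identity.
same_value = copy == a
same_object = copy is a
```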
In Python, a distinction between expressions and statements is rigidly enforced, in contrast to languages such as Common Lisp, Scheme, or Ruby. This leads to duplicating some functionality. For example:
- List comprehensions vs. for-loops
- Conditional expressions vs. if blocks
- The eval() vs. exec() built-in functions (in Python 2, exec is a statement); the former is for expressions, the latter is for statements.
Statements cannot be a part of an expression, so list and other comprehensions or lambda expressions, all being expressions, cannot contain statements. A particular case of this is that an assignment statement such as a = 1 cannot form part of the conditional expression of a conditional statement. This has the advantage of avoiding a classic C error of mistaking an assignment operator = for an equality operator == in conditions: if (c = 1) { ... } is syntactically valid (but probably unintended) C code but if c = 1: ... causes a syntax error in Python.
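The expression/statement split can be observed directly with the built-in eval(), exec(), and compile() functions. This is a small sketch under Python 3 (where exec is a function); the source strings are made up for the example.

```python
# eval() accepts an expression and returns its value.
value = eval("1 + 2 * 3")

# exec() accepts statements; results are read back from the namespace dict.
namespace = {}
exec("total = 0\nfor i in range(4):\n    total += i", namespace)

# An assignment statement is rejected where an expression is required.
try:
    eval("a = 1")
    eval_accepts_assignment = True
except SyntaxError:
    eval_accepts_assignment = False

# The C-style slip "if c = 1" fails at compile time, before anything runs.
try:
    compile("if c = 1:\n    pass", "<example>", "exec")
    condition_accepts_assignment = True
except SyntaxError:
    condition_accepts_assignment = False
```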
Methods
Methods on objects are functions attached to the object's class; the syntax instance.method(argument) is, for normal methods and functions, syntactic sugar for Class.method(instance, argument). Python methods have an explicit self parameter to access instance data, in contrast to the implicit self (or this) in some other object-oriented programming languages (e.g., C++, Java, Objective-C, or Ruby).
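The equivalence between the two call forms can be checked with a tiny class; Greeter and its method are hypothetical names used only for this sketch.

```python
class Greeter:
    def __init__(self, greeting):
        self.greeting = greeting  # instance data is reached through self

    def greet(self, name):
        return f"{self.greeting}, {name}!"


g = Greeter("Hello")

via_instance = g.greet("world")        # the usual spelling
via_class = Greeter.greet(g, "world")  # what it is sugar for
```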
Typing
Python uses duck typing and has typed objects but untyped variable names. Type constraints are not checked at compile time; rather, operations on an object may fail, signifying that the given object is not of a suitable type. Despite being dynamically typed, Python is strongly typed, forbidding operations that are not well-defined (for example, adding a number to a string) rather than silently attempting to make sense of them.
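A brief sketch of the three properties named above: names are untyped, any object with the right method is accepted (duck typing), and ill-defined operations fail rather than being silently coerced. The classes Duck and Robot and the function announce are hypothetical.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # No type check: any object with a speak() method works.
    return thing.speak()


sounds = [announce(obj) for obj in (Duck(), Robot())]

# The name is untyped; the objects it refers to are typed.
label = 1
label = "one"

# Strong typing: adding a number to a string raises instead of guessing.
try:
    1 + "1"
    mixed_add_ok = True
except TypeError:
    mixed_add_ok = False

# An unsuitable object fails only when the operation is attempted.
try:
    announce(42)
    int_can_speak = True
except AttributeError:
    int_can_speak = False
```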
Python allows programmers to define their own types using classes, which are most often used for object-oriented programming. New instances of classes are constructed by calling the class (for example, SpamClass() or EggsClass()), and the classes are instances of the metaclass type (itself an instance of itself), allowing metaprogramming and reflection.
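The relationship between instances, classes, and the metaclass type can be inspected at runtime. The sketch below reuses the SpamClass and EggsClass names from the text; the count attribute is an assumption added for illustration.

```python
class SpamClass:
    pass


spam = SpamClass()  # calling the class constructs an instance

instance_of_class = isinstance(spam, SpamClass)
class_is_instance_of_type = type(SpamClass) is type
type_is_instance_of_itself = type(type) is type

# Because classes are ordinary objects, one can be built at runtime
# with the three-argument form of type(name, bases, namespace).
EggsClass = type("EggsClass", (object,), {"count": 12})
eggs = EggsClass()
```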
Before version 3.0, Python had two kinds of classes: old-style and new-style. The syntax of both styles is the same, the difference being whether the class object is inherited from, directly or indirectly (all new-style classes inherit from object and are instances of type). In versions of Python 2 from Python 2.2 onwards, both kinds of classes can be used. Old-style classes were eliminated in Python 3.0.
The long-term plan is to support gradual typing. From Python 3.5, the syntax of the language allows specifying static types, but they are not checked in the default implementation, CPython. An experimental optional static type checker named mypy supports compile-time type checking.
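The point that annotations are recorded but not enforced at runtime can be seen in a few lines. The function double is a hypothetical example; a separate checker such as mypy would be expected to flag the second call, while CPython runs it.

```python
def double(x: int) -> int:
    return x * 2


hints = double.__annotations__  # the annotations are stored on the function

as_intended = double(21)
# The annotation says int, but the interpreter does not check it:
# a str is accepted and * repeats it.
unchecked = double("ab")
```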
Mathematics
Python has the usual C language arithmetic operators (+, -, *, /, %). It also has ** for exponentiation, e.g. 5**3 == 125 and 9**0.5 == 3.0, and a new matrix multiply @ operator is included in version 3.5. Additionally, it has a unary operator (~), which essentially inverts all the bits of its one argument. For integers, this means ~x == -x - 1. Other operators include bitwise shift operators x << y, which shifts x to the left y places, the same as x*(2**y), and x >> y, which shifts x to the right y places, the same as x//(2**y).
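The operator identities stated above can be verified on sample integers; the values 13 and 3 are arbitrary choices for this sketch.

```python
x, y = 13, 3

power = 5 ** 3
root = 9 ** 0.5

inverted = ~x        # equals -x - 1
left = x << y        # equals x * (2 ** y)
right = x >> y       # equals x // (2 ** y)

# The shift identity also holds for negative integers,
# because // rounds towards negative infinity.
negative_right = -x >> y
```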
The behavior of division has changed significantly over time:
- Python 2.1 and earlier use the C division behavior. The / operator is integer division if both operands are integers, and floating-point division otherwise. Integer division rounds towards 0, e.g. 7/3 == 2 and -7/3 == -2.
- Python 2.2 changes integer division to round towards negative infinity, e.g. 7/3 == 2 and -7/3 == -3. The floor division // operator is introduced. So 7//3 == 2, -7//3 == -3, 7.5//3 == 2.0 and -7.5//3 == -3.0. Adding from __future__ import division causes a module to use Python 3.0 rules for division (see next).
- Python 3.0 changes / to be always floating-point division. In Python terms, the pre-3.0 / is classic division, the version-3.0 / is real division, and // is floor division.
Rounding towards negative infinity, though different from most languages, adds consistency. For instance, it means that the equation (a + b)//b == a//b + 1 is always true. It also means that the equation b*(a//b) + a%b == a is valid for both positive and negative values of a. However, maintaining the validity of this equation means that while the result of a%b is, as expected, in the half-open interval [0, b), where b is a positive integer, it has to lie in the interval (b, 0] when b is negative.
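Assuming Python 3 semantics, the division rules and the two identities above can be checked over all four sign combinations. The operand pairs are illustrative.

```python
pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]

# (quotient, remainder) under floor division for each sign combination.
results = {(a, b): (a // b, a % b) for a, b in pairs}

# b*(a//b) + a%b == a holds for every pair.
reconstruction_holds = all(b * (a // b) + a % b == a for a, b in pairs)

# (a + b)//b == a//b + 1 holds for every pair.
shift_holds = all((a + b) // b == a // b + 1 for a, b in pairs)

# The remainder takes the sign of the divisor b.
remainder_sign_follows_divisor = all(
    (0 <= a % b < b) if b > 0 else (b < a % b <= 0) for a, b in pairs
)

true_division = 7 / 3          # always a float in Python 3
float_floor = (7.5 // 3, -7.5 // 3)
```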
Python provides a round function for rounding a float to the nearest integer. For tie-breaking, versions before 3 use round-away-from-zero: round(0.5) is 1.0, round(-0.5) is −1.0. Python 3 uses round to even: round(1.5) is 2, round(2.5) is 2.
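Under Python 3, the round-to-even tie-breaking rule produces the following results for a few halfway values (the inputs are chosen for illustration):

```python
ties = (0.5, 1.5, 2.5, 3.5, -0.5)

# Each tie goes to the nearest even integer.
rounded = [round(value) for value in ties]

# With a single argument, round() returns an int in Python 3.
result_type = type(round(2.5))
```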
Python allows boolean expressions with multiple equality relations in a manner that is consistent with general use in mathematics. For example, the expression a < b < c tests whether a is less than b and b is less than c. C-derived languages interpret this expression differently: in C, the expression would first evaluate a < b, resulting in 0 or 1, and that result would then be compared with c.
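The difference between the two readings shows up with values such as 1, 3, 2 (chosen for this sketch). Parenthesizing the first comparison reproduces the C-style evaluation order within Python, since True compares as 1.

```python
a, b, c = 1, 3, 2

# Python's chained form means (a < b) and (b < c).
chained = a < b < c

# C-style reading: evaluate a < b first, then compare that result with c.
c_style = (a < b) < c
```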
Python has extensive built-in support for arbitrary precision arithmetic. Integers are transparently switched from the machine-supported maximum fixed-precision (usually 32 or 64 bits), belonging to the Python type int, to arbitrary precision, belonging to the Python type long, where needed. The latter have an "L" suffix in their textual representation. (In Python 3, the distinction between the int and long types was eliminated; this behavior is now entirely contained by the int class.) The Decimal type/class in module decimal (since version 2.4) provides decimal floating point numbers to arbitrary precision and several rounding modes. The Fraction type in module fractions (since version 2.6) provides arbitrary precision for rational numbers.
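A short sketch of the three numeric facilities described above, assuming Python 3 (where int is unbounded). The specific values are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP
from fractions import Fraction

# Integers grow as needed; no overflow at 32 or 64 bits.
big = 2 ** 100

# Binary floats cannot represent 0.1 exactly; Decimal can.
float_sum_is_exact = (0.1 + 0.2) == 0.3
decimal_sum = Decimal("0.1") + Decimal("0.2")

# One of the selectable rounding modes: ties round away from zero.
half_up = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Exact rational arithmetic.
fraction_sum = Fraction(1, 3) + Fraction(1, 6)
```

Constructing Decimal from a string rather than a float matters here: Decimal(0.1) would capture the float's binary approximation instead of the decimal value 0.1.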
Due to Python's extensive mathematics library, and the third-party library NumPy that further extends the native capabilities, it is frequently used as a scientific scripting language to aid in problems such as numerical data processing and manipulation.
Libraries
Python's large standard library, commonly cited as one of its greatest strengths, provides tools suited to many tasks. For Internet-facing applications, many standard formats and protocols such as MIME and HTTP are supported. It includes modules for creating graphical user interfaces, connecting to relational databases, generating pseudorandom numbers, arithmetic with arbitrary precision decimals, manipulating regular expressions, and unit testing.
Some parts of the standard library are covered by specifications (for example, the Web Server Gateway Interface (WSGI) implementation wsgiref follows PEP 333), but most modules are not. They are specified by their code, internal documentation, and test suites (if supplied). However, because most of the standard library is cross-platform Python code, only a few modules need altering or rewriting for variant implementations.
As of March 2018, the Python Package Index (PyPI), the official repository for third-party Python software, contains over 130,000 packages with a wide range of functionality, including:
- Graphical user interfaces
- Web frameworks
- Multimedia
- Databases
- Networking
- Test frameworks
- Automation
- Web scraping
- Documentation
- System administration
- Scientific computing
- Text processing
- Image processing
Development
Most Python implementations (including CPython) include a read–eval–print loop (REPL), permitting them to function as a command line interpreter for which the user enters statements sequentially and receives results immediately.
Other shells, including IDLE and IPython, add further abilities such as auto-completion, session state retention and syntax highlighting.
As well as standard desktop integrated development environments, there are Web browser-based IDEs; SageMath (intended for developing science and math-related Python programs); PythonAnywhere, a browser-based IDE and hosting environment; and Canopy IDE, a commercial Python IDE emphasizing scientific computing.
Implementation
Reference implementations
CPython is the reference implementation of Python. It is written in C, meeting the C89 standard with several select C99 features. It compiles Python programs into an intermediate bytecode which is then executed by its virtual machine. CPython is distributed with a large standard library written in a mixture of C and native Python. It is available for many platforms, including Windows and most modern Unix-like systems. Platform portability was one of its earliest priorities.
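The compile-then-execute pipeline can be observed from within Python using the built-in compile() function and the standard dis module. This is only a sketch: the source string is made up, and the exact instruction names that CPython emits differ between versions, so none are listed here.

```python
import dis

source = "x * 2 + 1"

# Compile source text into a code object holding bytecode.
code = compile(source, "<example>", "eval")

# Inspect the bytecode instructions (names vary by CPython version).
instructions = [ins.opname for ins in dis.get_instructions(code)]

# The virtual machine executes the code object.
result = eval(code, {"x": 20})
```

Calling dis.dis(code) prints a human-readable listing of the same instructions.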
Other implementations
PyPy is a fast, compliant interpreter of Python 2.7 and 3.5. Its just-in-time compiler brings a significant speed improvement over CPython. Stackless Python is a significant fork of CPython that implements microthreads; it does not use the C memory stack, thus allowing massively concurrent programs. PyPy also has a stackless version. MicroPython and CircuitPython are Python 3 variants optimised for microcontrollers.
Unsupported implementations
Other just-in-time Python compilers have been developed, but are now unsupported:
- Google began a project named Unladen Swallow in 2009 with the aim of speeding up the Python interpreter fivefold by using the LLVM, and of improving its multithreading ability to scale to thousands of cores.
- Psyco is a just-in-time specialising compiler that integrates with CPython and transforms bytecode to machine code at runtime. The emitted code is specialised for certain data types and is faster than standard Python code.
In 2005, Nokia released a Python interpreter for the Series 60 mobile phones named PyS60. It includes many of the modules from the CPython implementations and some additional modules to integrate with the Symbian operating system. The project has been kept up-to-date to run on all variants of the S60 platform, and several third-party modules are available. The Nokia N900 also supports Python with GTK widget libraries, enabling programs to be written and run on the target device.
Cross-compilers to other languages
There are several compilers to high-level object languages, with either unrestricted Python, a restricted subset of Python, or a language similar to Python as the source language:
- Jython compiles into Java byte code, which can then be executed by every Java virtual machine implementation. This also enables the use of Java class library functions from the Python program.
- IronPython follows a similar approach in order to run Python programs on the .NET Common Language Runtime.
- The RPython language can be compiled to C, Java bytecode, or Common Intermediate Language, and is used to build the PyPy interpreter of Python.
- Pyjs compiles Python to JavaScript.
- Cython compiles Python to C and C++.
- Pythran compiles Python to C++.
- The somewhat dated Pyrex (latest release in 2010) and Shed Skin (latest release in 2013) compile to C and C++ respectively.
- Google's Grumpy compiles Python to Go.
- MyHDL compiles Python to VHDL.
- Nuitka compiles Python into C++.
Performance
A performance comparison of various Python implementations on a non-numerical (combinatorial) workload was presented at EuroSciPy '13.
References
The main resource for this technical documentation project was the Python (programming language) Wikipedia entry.
The Python logo is from the Python Software Foundation.