Python is a freely distributable high-level computer programming language that has become very popular for everything from scripting applications and web-page generation to solving scientific problems.
Python shares many basic characteristics with languages like Mathematica, Matlab, and Labview, and has an extensive set of numerical and scientific modules.
Python has several freely distributable add-ons available that help make it more general. These include NumPy (Numerical Python) and SciPy that allow you to solve complex scientific problems in an efficient manner, and matplotlib to plot data fairly nicely.
This course will focus on analyzing natural and social science problems, introducing just enough Python to implement them.
We will use versions 2.7 and 3.7 of Python; in 2020 the former is becoming a “legacy” version, while the latter is being used for most new development.
Version 3 is different enough from earlier versions that there are still many extras that have not been updated, but it will be the primary version we use, with 2.7 for a special case or two.
You will often want to review some of the online resources provided by the Python community:
In the summer of 2021, Python 2.7 and 3.7 are available in all of the Amherst College computing labs, both separately and as part of the Anaconda distribution, on both Macs and Windows.
For your own computer you can install these items individually, but the Anaconda distribution is recommended.
If you do choose to install them individually:
You’ll need to work using the command line user interface (CLUI): on Macintosh menu Go> Utilities and then open Terminal; on Windows menu Start and type cmd.
You’ll want to first install Python (if it isn’t already there).
If you have your own Windows computer, you will need to install Python yourself.
If you have your own Macintosh computer, Python is already installed, but it may be older than you want; macOS 10.7 – 10.14 come with version 2.7, while macOS 10.6 has version 2.6.
If you are running a version of Python 2 before 2.7.9 or Python 3 before 3.4, you will need to download the Python package installer pip yourself; later versions already include it. To install pip, download the script get-pip.py and run the command python get-pip.py (ideally as an administrator so that it’s generally available). Then you can install the other packages, e.g. pip install numpy (again as administrator, if possible). If you don’t know what this means, you probably want to use Anaconda instead (which also includes pip, though Anaconda’s package installer conda should be tried first).
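Before installing anything, it can help to check which Python you are running and which of the scientific packages are already present. The following is a minimal sketch using only the standard library; the list of package names is simply the three add-ons mentioned above, and the output will differ from one computer to another.

```python
import sys
import importlib.util

# Report which Python interpreter is running this code.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")

# Look for each package without importing it;
# find_spec() returns None when a top-level package is not installed.
packages = ["numpy", "scipy", "matplotlib"]
installed = {name: importlib.util.find_spec(name) is not None for name in packages}

for name, present in installed.items():
    print(f"{name}: {'installed' if present else 'missing'}")
```

Any package reported as missing can then be installed with conda or pip as described above.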
Integrated Development Environments (IDEs) are extremely useful tools for programming, providing syntax coloring and checking, debugging capabilities, and more.
Python comes preinstalled with an IDE called IDLE (can you guess why?), but we will instead use Spyder, which has many more features and in some ways is easier to use.
You can also just use any good programmer’s text editor (TextWrangler for Mac, Notepad++ for Windows, Emacs, VIM, etc.) and run Python from a built-in command line interpreter (on Macintosh menu Go> Utilities and then open Terminal; on Windows menu Start and type cmd).
On Amherst College Windows computers, several versions of Python are installed, so you can launch Spyder by, for example, choosing Start > Anaconda3(64-bit) > Spyder.
On Macs, Spyder is buried in the folder /opt/anaconda/bin; the simplest way to launch it is to look in the folder Applications for Anaconda-Navigator.app and drag that to the dock. Spyder will be listed there along with other tools like Jupyter. Once you launch Spyder you can drag its icon to another location in the dock and it will stick.
In either case, a window should open that provides a set of panes and toolbars that let you interact with Python:
The panes include:
the IPython console, where you can try out various Python statements to see their effect;
the Editor, where you can edit files that contain Python statements, and then run them to see their output in the console;
the File explorer, which displays the locations of your Python files for easy access;
the Variable explorer, which provides information about Python variables and their values;
the History log, which maintains a list of previously executed Python statements.
By default there are three visible panes, sometimes with buttons that let you switch between multiple overlapping panes.
Panes can be “popped out” by clicking on the pane’s pop-out button, and then moved around into a new location or to overlap another pane.
Other panes providing additional information can be added via the menu View > Panes. If you accidentally close a pane, you can reopen it here, too.
The toolbars include:
The File toolbar, with the usual buttons New, Open, and Save to select a file to work on in the Editor;
The Run toolbar, which lets you evaluate the Python statements in the file currently open in the Editor by clicking on the button Run;
The Global working directory toolbar, which lets you select the working directory that appears in the File explorer by clicking on the button Browse a Working Directory.
Note: A directory is the same thing as a folder.
Other toolbars providing additional functionality can be added via the menu View > Toolbars.
Consider the basic law of gravitational acceleration near the Earth’s surface, which applies if you throw a ball up into the air:
y = y_{0} + v_{y0}t - ½ gt^{2}
Here y_{0} is the initial height, v_{y0} is the initial speed in the vertical direction, g = 9.81 m/s^{2} is the acceleration of gravity, and t is time.
At its most basic, Python will act like a scientific calculator and let you calculate the result of this equation by typing in particular values for these symbols and using the standard mathematical operators:
+ addition
- subtraction
* multiplication
/ division
** exponentiation
Using y_{0} = 2 m, v_{y0} = 15 m/s, and t = 3 s, type the following expression into the IPython console at the prompt In [1]:, and then evaluate it by pressing <return>/<enter>:
2 + 15 * 3 - 1/2 * 9.81 * 3**2
⇒ 2.8549999999999969
The second line, the value of the expression, will appear in the console window.
There are several things to be aware of here:
Like a calculator, you need to make sure your units match up. Since the acceleration of gravity is 9.81 m/s^{2}, 2 must be in meters, 3 in seconds, and 15 in m/s, but the units themselves aren’t actually typed into Python.
The operators have different precedences, so that exponentiation will be applied before multiplication and division, which will be evaluated before addition and subtraction.
If you want to change this order, you need to place parentheses () around the lower-precedence operation, e.g. (2 + 15)*3.
Python distinguishes between integer and floating point (real) values, which you can tell apart by the presence or absence of a decimal point, e.g. 1 and 1.0 are different types of data and are stored by the computer in different ways.
You can write real values in scientific notation using the letter e to stand for “×10 to the power of” (there should be no space around it):
2e-3
⇒ 0.002
This representation of real numbers should tell you where the name “floating point” comes from; we could just as easily represent this number as 0.2e-2 or 20e-4. In fact, this is how these values are stored in memory on a computer, as a pair of integers, the mantissa and the exponent.
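The points above about precedence, numeric types, and scientific notation can be checked directly. This is a small illustrative sketch; the variable names are made up for the example, and math.frexp() is used only to show that a float really is stored as a (base-2) mantissa and exponent.

```python
import math

# Multiplication is applied before addition unless parentheses say otherwise.
default_order = 2 + 15 * 3      # 15 * 3 is evaluated first
forced_order = (2 + 15) * 3     # parentheses force the addition first
print(default_order, forced_order)   # 47 51

# 1 and 1.0 compare as equal but are different types of data.
print(type(1), type(1.0))

# The same real number written three ways in scientific notation.
print(2e-3 == 0.2e-2 == 20e-4)       # True

# frexp() splits a float into mantissa m and exponent e with x == m * 2**e.
mantissa, exponent = math.frexp(2e-3)
print(mantissa, exponent)
```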
Computer memory is structured as a set of binary digits or bits, which each have the value 0 or 1.
Bits are generally grouped in units of eight called bytes, e.g. 01110011, which form base-2 numbers that range from 0 to 2^{8} – 1 = 255.
Often the eighth (left-most) bit of a byte is used as a sign bit, in which case the values range from –128 to 127:
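| Binary | Unsigned Decimal | Signed Decimal |
|----------|------------------|----------------|
| 00000000 | 0 | 0 |
| 00000001 | 1 | 1 |
| 00000010 | 2 | 2 |
| 00000011 | 3 | 3 |
| 00000100 | 4 | 4 |
| … | … | … |
| 00001000 | 8 | 8 |
| … | … | … |
| 01000000 | 64 | 64 |
| … | … | … |
| 01111111 | 127 | 127 |
| 10000000 | 128 | –128 |
| … | … | … |
| 11111111 | 255 | –1 |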
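The two interpretations of a byte can be reproduced in a few lines. This is a sketch under the assumption that the signed column uses the common two's-complement convention (which matches the values 10000000 = –128 and 11111111 = –1); the function names are made up for illustration.

```python
def unsigned_value(bits):
    """Interpret a string of 0s and 1s as an unsigned base-2 number."""
    return int(bits, 2)


def signed_value(bits):
    """Interpret the same bits with the left-most bit acting as a sign bit
    (two's complement): if it is set, subtract 2**(number of bits)."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 2 ** len(bits)
    return value


for bits in ["00000011", "01111111", "10000000", "11111111"]:
    print(bits, unsigned_value(bits), signed_value(bits))
```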
Integers are defined by grouping together 1, 2, or 4 bytes, depending on the range of values desired, and interpreting them as binary numbers, as above.
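Floating-point values are defined by grouping together 4 or 8 bytes, commonly called single and double precision, respectively. In the IEEE 754 format used by nearly all current hardware, the available bits are split between a sign bit, the exponent, and the mantissa: 1 + 8 + 23 for single precision and 1 + 11 + 52 for double precision. An implied leading bit gives the mantissa 24 and 53 bits of effective precision, respectively.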
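Common Computer Numeric Data Types

| Data Type | Size | Minimum Value | Maximum Value | Approximate Digits of Precision |
|------------------|------------------|----------------|---------------|----|
| Unsigned Integer | 8 bit = 1 byte | 0 | 255 | 3 |
| | 16 bit = 2 bytes | 0 | 65,535 | 5 |
| | 32 bit = 4 bytes | 0 | 4,294,967,295 | 10 |
| Signed Integer | 8 bit = 1 byte | −128 | 127 | 3 |
| | 16 bit = 2 bytes | −32,768 | 32,767 | 5 |
| | 32 bit = 4 bytes | −2,147,483,648 | 2,147,483,647 | 10 |
| Floating Point | 32 bit = 4 bytes | −3.4 × 10^{38} | 3.4 × 10^{38} | 7 |
| | 64 bit = 8 bytes | −1.8 × 10^{308} | 1.8 × 10^{308} | 16 |

The smallest positive (normalized) floating-point values are about 1.2 × 10^{−38} for single precision and 2.2 × 10^{−308} for double precision.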
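The integer limits in the table follow directly from the number of bits, and the roughly 7-digit precision of a 4-byte float can be seen by forcing a value through that format. This is an illustrative sketch; it uses the standard struct module only as a convenient way to store a number in 4 bytes, and the variable names are made up.

```python
import struct

# Integer ranges computed from the number of bits.
for bits in (8, 16, 32):
    unsigned_max = 2 ** bits - 1
    signed_min = -(2 ** (bits - 1))
    signed_max = 2 ** (bits - 1) - 1
    print(bits, unsigned_max, signed_min, signed_max)

# Pack 9.81 into 4 bytes (single precision) and read it back.
packed = struct.pack("<f", 9.81)
as_single = struct.unpack("<f", packed)[0]
print(len(packed), as_single)   # 4 bytes; the value agrees with 9.81 to about 7 digits
```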
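The sizes of Python integer and floating-point values depend on the Python version and the underlying platform.
In Python 3, ints are not fixed in size and grow to as many bytes as a value needs, whereas the plain int of Python 2 was a fixed-size signed integer (4 or 8 bytes, depending on the operating system). Floats are double precision. Fixed-size types like those in the table become important when using libraries such as NumPy.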
You can tell the floating-point precision by the number of digits that Python prints out for values like 2.8549999999999969, showing all the available precision (keep in mind, though, that if the input data isn’t this precise the output won’t be, either!).
When describing data sizes, keep in mind that numbers like 1, 2, 4, and 8 will usually refer to bytes, while numbers like 8, 16, 32, and 64 will usually refer to bits.
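Python can report the properties of its own float type through sys.float_info, which is a quick way to confirm that floats are double precision on your computer. The sketch below also shows that a Python 3 int is not confined to the 4-byte range; the name big is just an example.

```python
import sys

print(sys.float_info.mant_dig)   # bits of mantissa precision (53 for double precision)
print(sys.float_info.dig)        # decimal digits that are always preserved
print(sys.float_info.max)        # largest representable float
print(sys.float_info.min)        # smallest positive normalized float

# A Python 3 integer simply uses more memory as its value grows.
big = 2 ** 100
print(big, big.bit_length())
```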
You can explicitly force conversion of an integer to a real value with the built-in constructor float():
float(1)
⇒ 1.0
Try this!
Or go the opposite direction with the constructor int(), which truncates the fractional part of a real number:
int(9.81)
⇒ 9
Try this!
To round off to the nearest integer, instead of truncating, apply the function round():
round(2.8549999999999969)
⇒ 3
Try this!
Be aware that in Python 3 round() always rounds half-values, e.g. 1.5 or 2.5, to the nearest even value, so both of these will round to 2. This helps reduce cumulative errors that might occur when working with large amounts of data.
More generally, your data will have an inherent precision determined by measurement and systematic errors, and you can round calculations to maintain that precision. Simply provide round() with a second argument specifying the number of decimal places to preserve:
round(2.8549999999999969, 3)
⇒ 2.855
Try this!
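The difference between truncating and rounding is easiest to see side by side, including for negative numbers. This is a short sketch assuming Python 3; the last line uses an example from the Python documentation to show that rounding to decimal places can occasionally surprise you, because a value like 2.675 is stored as a binary number slightly below 2.675.

```python
# int() truncates toward zero; round() goes to the nearest integer.
print(int(9.81), int(-9.81))       # 9 -9
print(round(9.81), round(-9.81))   # 10 -10

# Half-values go to the nearest even integer in Python 3.
print(round(1.5), round(2.5), round(3.5))   # 2 2 4

# A second argument keeps that many decimal places.
print(round(2.8549999999999969, 3))   # 2.855
print(round(2.675, 2))                # 2.67, not 2.68
```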
Spyder provides a quick reference for Python’s built-in and library functions like round() — if you click on their name and then type <ctrl>-I (Windows) or <command>-I (Mac), its description will appear in the window Help, so that you can verify its argument format:
You can also learn about Python’s other built-in functions by typing in their names and pressing <return>/<enter>.
Warning: In Python version 3, dividing two integers with / always produces a float, so any fractional part is kept:
1/2 ⇒ 0.5
But in versions 1 and 2, the result is an integer so any fractional part will be truncated, e.g.
1/2 ⇒ 0
To force an integer result in any version, you can use the integer division operator //, which rounds the quotient down to the next lower integer, e.g. 1//2 gives 0.
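The two division operators can be compared directly. This is a brief sketch assuming Python 3; note that // rounds down rather than toward zero, which only matters for negative results, and that the built-in divmod() returns the quotient and remainder together.

```python
print(1 / 2)     # 0.5   true division always gives a float
print(4 / 2)     # 2.0   even when the result is a whole number
print(1 // 2)    # 0     integer division of two ints gives an int
print(-1 // 2)   # -1    rounds down, not toward zero
print(7.5 // 2)  # 3.0   with a float operand the result is a float

quotient, remainder = divmod(7, 2)
print(quotient, remainder)   # 3 1
```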
Just as in mathematics, assigning literal values to variables will allow you to remember their purpose and make use of them more easily.
Python lets you use variable names of any length, though they must start with a letter or an underscore (_) and afterward can contain letters, numbers, and underscores. Be aware that variable names are case-sensitive, i.e. y is not the same as Y.
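You can ask Python whether a piece of text follows these naming rules with the string method isidentifier(). The candidate names below are made up for illustration; note that this check only looks at the characters used, so it does not rule out words that Python reserves for its own use.

```python
# Check a few candidate names against the naming rules.
candidates = ["y0", "vy0", "initial_height", "0y", "v-y0"]
for name in candidates:
    print(name, name.isidentifier())

# Names are case-sensitive: y and Y are two separate variables.
y = 1.0
Y = 2.0
print(y == Y)   # False
```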
If you follow variables with an = sign and any value or expression such as the above, they will be assigned that value and have that type:
y0 = 2
vy0 = 15
g = 9.81
t = 3
y = y0 + vy0 * t - 1/2 * g * t**2
y
⇒ 2.8549999999999969
Try this!
Note how similar the last expression is now to the way we write the law of gravity in scientific texts.
(Also note that IPython doesn’t print out the values of assignments, only single variables or expressions.)
Python variables are dynamically typed, so you can redefine a variable at will, e.g. vy0 is an integer above but becomes a float with:
vy0 = 15.0
We can check the type of a variable with one of Python’s built-in functions:
type(y0)
⇒ int
type(g)
⇒ float
Try this!
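Dynamic typing can be observed by checking a variable's type before and after reassigning it. This is a small sketch assuming Python 3 (where 1/2 is 0.5); it also shows that mixing an int with a float in an expression produces a float.

```python
vy0 = 15
print(type(vy0).__name__)   # int

vy0 = 15.0                  # the same name now refers to a float
print(type(vy0).__name__)   # float

# Mixing ints and floats in one expression gives a float result.
y0 = 2
g = 9.81
t = 3
y = y0 + vy0 * t - 1/2 * g * t**2
print(type(y0).__name__, type(y).__name__)   # int float
```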
A generally useful rule: if you will use the result of a calculation more than once, it’s usually a good idea to first assign it to a variable.
It would be useful to be able to repeat the calculation of position in Earth’s gravity for many different times without having to type out the whole expression repeatedly.
We can achieve this by packaging the statement above inside our own function definition statement:
def y(t):
    return y0 + vy0 * t - 1./2 * g * t**2
y(3.1)
⇒ 1.3629499999999908
Try this!
There are several things to be aware of here:
The colon (:) is required to terminate the initial statement;
All subsequent statements that are calculated as part of the function must be indented by a standard amount (4 spaces = 1 tab);
When typing this into IPython, it will automatically indent after it sees the colon, and at the very end you need to type an extra <return>/<enter> to leave the indented level and evaluate the function definition;
The function is aware of the earlier assignments to y0, vy0, and g, which are known as global variables;
When the function is called with the value 3.1 in place of the argument t, the latter is replaced by the calling value everywhere it occurs inside the function, and the previous global assignment t = 3 is ignored;
The value of the function call y(t) is the value of the expression in the return statement.
Function calls can be used anywhere a number or variable can be used.
The reserved words def and return have special meaning for Python, and are examples of several that must not be used for variable names.
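Once the calculation is packaged as a function, repeating it for many times takes one line per call, or a short loop. The sketch below repeats the definition from above so that it runs on its own; the names times, heights, and rise are made up for illustration, and the for loop is a preview of a statement not yet introduced here.

```python
# Global variables used inside the function.
y0 = 2
vy0 = 15
g = 9.81

def y(t):
    return y0 + vy0 * t - 1./2 * g * t**2

# Evaluate the height at several times (in seconds).
times = [0, 1, 2, 3]
heights = [y(time) for time in times]
for time, height in zip(times, heights):
    print(time, round(height, 3))

# A function call can be used anywhere a number can, e.g. inside an expression.
rise = y(1) - y(0)
print(round(rise, 3))
```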
We can generalize this function by including multiple arguments in the definition, and we can also make g a local variable:
def y(t, y0, vy0):
    g = 9.81
    return y0 + vy0 * t - 1./2 * g * t**2
y(3.1, 2, 16)
⇒ 4.462949999999992
Try this!
Just like the function arguments, when g is defined as a local variable inside the function it will supersede any assignments to g made outside of the function, which are known as global variables.
Local and global variables have different scopes or namespaces, and therefore are independent of each other.
Using local variables helps keep your definitions in one place and ensure they aren’t accidentally overwritten, which will make your code easier to debug.
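The independence of local and global variables can be demonstrated by giving the global g a deliberately different value. In this sketch the global value 1.62 (roughly the Moon's surface gravity in m/s^2) is an assumption chosen only to make the difference visible; the function still uses its own local g, and the global one is left unchanged afterward.

```python
g = 1.62   # a global g, deliberately different from the local one below

def y(t, y0, vy0):
    g = 9.81   # local g: exists only inside this function
    return y0 + vy0 * t - 1./2 * g * t**2

height = y(3.1, 2, 16)
print(round(height, 5))   # 4.46295, computed with the local g = 9.81
print(g)                  # 1.62, the global g was not overwritten
```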