Scientific Programming with Python

Part 1: The Earth’s Gravity

Current: The Earth’s Gravity

Next: Newton’s Laws of Motion

Python is a freely distributable high-level computer programming language that has become very popular for everything from scripting applications and web-page generation to solving scientific problems.

Python shares many basic characteristics with languages like Mathematica, Matlab, and Labview, and has an extensive set of numerical and scientific modules.

Python has several freely distributable add-ons available that help make it more general. These include NumPy (Numerical Python) and SciPy that allow you to solve complex scientific problems in an efficient manner, and matplotlib to plot data fairly nicely.

This course will focus on analyzing natural and social science problems, introducing just enough Python to implement them.

Getting Started

We will use version 3.10 of Python. In 2023 you may still see version 2.7 occasionally as a “legacy” version, but it isn’t being used for new development.

You will often want to review some of the online resources provided by the Python community:

The Python Tutorial: 3.10
The Python Language Reference: 3.10
The Python Standard Library: 3.10
If you’ve used Python 2.7 previously, you will probably be interested in the Key Differences Between Python 2.7.x and Python 3.x with Examples

A number of important add-ons or packages work with both of these versions of Python:

NumPy makes it significantly easier to work with numerical arrays, and is essential for scientific/mathematical projects.
SciPy provides a number of scientific tools such as numerical integration.
Matplotlib is a 2-D plotting package.
Pandas is a data analysis and manipulation tool.
Spyder is an interactive development environment (IDE) for Python with a number of useful features that make it easier to develop code.
Jupyter is a documentation tool for Python (and some other languages) that allows one to intermix live code with text explanations.

The home web sites of these technologies are linked above, providing their specifications, full documentation, and more general tutorials.

These technologies together with a few others are known as the SciPy Stack.

The Stack and a number of other useful add-ons can be easily installed via the Anaconda distribution.

You might also find this summary useful:

Cheat Sheet for Scientific Python

This course will get you started but will only be a taste of possibilities, so if you want to learn more, these are good books:

Installation

In the summer of 2023, Python 3.x is available in all of the Amherst College computing labs, both separately and as part of the Anaconda distribution, on both Macs and Windows.

For your own computer you can install these items individually, but the Anaconda distribution is recommended.

If you do choose to install them individually:

You’ll need to work using the command line user interface (CLUI): on Macintosh menu Go > Utilities and then open Terminal; on Windows menu Start and type cmd.
You’ll want to first install Python (if it isn’t already there).

If you have your own Windows computer, you will need to install Python yourself.
If you have your own Macintosh computer, Python 2.7 is automatically installed on macOS 12.2 and earlier. Python 3 is available with the XCode Developer Tools, but we won’t use it.

Many Python packages require installation before they can be imported into your program, and many of these are available in online archives. Package installers make it relatively easy to install them, once you locate them. Anaconda’s package installer conda should be tried first, but the older program pip is still often required, and is installed with Anaconda and with Python 3.7 or newer. Otherwise, you’ll need to download and install them yourself:

conda: Download from https://docs.conda.io/en/latest/miniconda.html. Then in your terminal, change directory to the Downloads folder and run the command bash Miniconda3*.sh (ideally as an administrator so that it’s generally available).
pip: Download from https://pip.pypa.io/en/latest/installation/. Then in your terminal, change directory to the downloaded folder and run the command python get-pip.py (ideally as an administrator so that it’s generally available). Then you can install the other packages, e.g. pip install numpy (again as administrator, if possible).

If you don’t know what this means, you probably want to use Anaconda instead.

Integrated Development Environments

Integrated Development Environments (IDEs) are extremely useful tools for programming, providing syntax coloring and checking, debugging capabilities, and more.

Python comes preinstalled with an IDE called IDLE (can you guess why?), but we will instead use Spyder, which has many more features and in some ways is easier to use.

There are many other Python IDEs available, see this fairly comprehensive list.

You can also just use any good programmer’s text editor (BBEdit for Mac, Notepad++ for Windows, Emacs, VIM, etc.) and run Python from a built-in command line interpreter (on Macintosh menu Go > Utilities and then open Terminal; on Windows menu Start and type cmd).

On Amherst College Windows computers, several versions of Python are installed, so you can launch Spyder by, for example, menuing Start > Anaconda3(64-bit) > Spyder.

On Macs, Spyder is buried in the folder /opt/anaconda/bin; the simplest way to launch it as by looking in the folder Applications for Anaconda-Navigator.app, and drag that to the dock. Spyder will be listed along with other tools like Juypter. Once you launch Spyder you can drag to another location in the dock and it will stick.

In either case, a window should open that provides a set of panes and toolbars that let you interact with Python:

The Spyder IDE

The panes include:

the IPython console, where you can try out various Python statements to see their effect;
the Editor, where you can edit files that contain Python statements, and then run them to see their output in the console;
the File explorer, which displays the locations of your Python files for easy access;
the Variable explorer, which provides information about Python variables and their values;
the History log, which maintains a list of previously executed Python statements.

By default there are three visible panes, sometimes with buttons that let you switch between multiple overlapping panes.

Panes can be “popped out” by clicking on the button , and then moved around into a new location or to overlap another pane.

Other panes providing additional information can be added via the menu View > Panes. If you accidentally close a pane, you can reopen it here, too.

The toolbars include:

The File toolbar, with the usual buttons New, Open, and Save to select a file to work on in the Editor;
The Run toolbar, which lets you evaluate the Python statements in the file currently open in the Editor by clicking on the button Run;
The Global working directory toolbar, which lets you select the working directory that appears in the File explorer by clicking on the button Browse a Working Directory.

Note: A directory is the same thing as a folder.

Other toolbars providing additional functionality can be added via the menu View > Toolbars.

The Acceleration of Gravity

Programming languages let you represent mathematical expressions with relative ease.

A Scientific Calculator

Consider the basic law of gravitational acceleration near the Earth’s surface, which applies if you throw a ball up into the air:

y = y₀ + v_y0t - ½ gt²

Here y₀ is the initial height, v_y0 is the initial speed in the vertical direction, g = 9.81 m/s² is the acceleration of gravity, and t is time.

At its most basic, Python will act like a scientific calculator and let you calculate the result of this equation by typing in particular values for these symbols and using the standard mathematical operators:

Using y₀ = 2 m, v_y0 = 15 m/s, and t = 3 s, type the following expression into the IPython console at the prompt In [1]:, and then evaluate it by typing the character <return>/<enter>:

2 + 15 * 3 - 1/2 * 9.81 * 3**2

⇒ 2.8549999999999969

The second line, the value of the expression, will appear in the console window.

There are several things to be aware of here:

Like a calculator, you need to make sure your units match up. Since the gravitational constant is 9.81 m/s², 2 must be in meters, 3 in seconds, and 15 in m/s — but the units themselves aren’t actually typed into Python.
The operators have different precedences, so that exponentiation will be applied before multiplication and division, which will be evaluated before addition and subtraction.

If you want to change this order, you need to place parentheses () around the lower-precedence operation, e.g. (2 + 15)*3.

A complete list of operator precedence can be found here.
Within the precedence levels of addition/subtraction and multiplication/division, operations are applied left-to-right.

Therefore, 1/2*9.81 is calculated exactly the same as (1/2)*9.81, while 1/(2*9.81) will produce a different value.
Exponentiation, on the other hand, is applied right-to-left:

2**3**2

⇒ 512

(2**3)**2

⇒ 64

Computer Numbers

Python distinguishes between integer and floating point (real) values, which you can tell apart by the presence or absence of a decimal point, e.g. 1 and 1.0 are different types of data and are stored by the computer in different ways.

You can write real values in scientific notation using the letter e to represent the ×10 (there should be no space around it):

2e-3

⇒ 0.002

This representation of real numbers should tell you where the name “floating point” comes from; we could just as easily represent this number as 0.2e-2 or 20e-4. In fact, this is how these values are stored in memory on a computer, as a pair of integers, the mantissa and the exponent.

Computer memory is structured as a set of binary digits or bits, which each have the value 0 or 1.

Bits are generally grouped in units of eight called bytes, e.g. 01110011, which form base-2 numbers that range from 0 to 28 – 1 = 255.

Often the eighth (left-most) bit of a byte is used as a sign bit, in which case the values range from –128 to 127:

Binary	Unsigned Decimal	Signed Decimal
00000000	0	0
00000001	1	1
00000010	2	2
00000011	3	3
00000100	4	4
…	…	…
00001000	8	8
…	…	…
01000000	64	64
…	…	…
01111111	127	127
10000000	128	–128
…	…	…
11111111	255	–1

Integers are defined by grouping together 1, 2, or 4 bytes, depending on the range of values desired, and interpreting them as binary numbers, as above.

Floating-point values are defined by grouping together 4 or 8 bytes, commonly called single and double precision, respectively. The available bits are split between the mantissa and exponent and their signs, 24 + 6 + 2 and 53 + 9 + 2, respectively.

Common Computer Numeric Data Types
Data Type	Size	Minimum Value	Maximum Value	Approximate Digits of Precision
Unsigned Integer	8 bit = 1 byte	0	255	3
	16 bit = 2 bytes	0	65535	5
	32 bit = 4 bytes	0	4294967295	10
Signed Integer	8 bit = 1 byte	−128	127	3
	16 bit = 2 bytes	−32,768	32,767	5
	32 bit = 4 bytes	−2,147,483,648	2,147,483,647	10
Floating Point	32 bit = 4 bytes	−3.4 x 1038	1.2 x 1038	7
Floating Point	64 bit = 8 bytes	−2.2 x 10308	1.8 x 10308	16

The size of Python integer and floating-point values are, by default, determined by the operating system.

On newer computers (i.e. with 64-bit hardware that can transfer 8 bytes of data at once), ints are 4-byte signed, while floats are double precision.

You can tell the floating-point precision by the number of digits that Python prints out for values like 2.8549999999999969, showing all the available precision (keep in mind, though, that if the input data isn’t this precise the output won’t be, either!).

When describing data sizes, keep in mind that numbers like 1, 2, 4, and 8 will usually refer to bytes, while numbers like 8, 16, 32, and 64 will usually refer to bits.

You can explicitly force conversion of an integer to a real value with the built-in constructor float():

float(1)

⇒ 1.0

Or go the opposite direction with the constructor int(), which truncates the fractional part of a real number:

int(9.81)

⇒ 9

To round off to the nearest integer, instead of truncating, apply the function round():

round(2.8549999999999969)

⇒ 3

Be aware that round()always rounds half-values, e.g. 1.5 or 2.5, to the nearest even value — both of these will round to 2. This helps reduce cumulative errors that might occur when working with large amounts of data.

More generally, your data will have an inherent precision determined by measurement and systematic errors, and you can round calculations to maintain that precision. Simply provide round() with a second argument specifying the number of decimal places to preserve:

round(2.8549999999999969, 3)

⇒ 2.855

Spyder provides a quick reference for Python’s built-in and library functions like round() — if you click on their name and then type <ctrl>-I (Windows) or <command>-I (Mac), its description will appear in the window Help, so that you can verify its argument format:

Object Inspector Window

You can also learn about Python’s other built-in functions by typing in their names and pressing <return>/<enter>.

Warning: In Python version 3, when you divide two integers, the result will be a float if there is a fractional part:

1/2 ⇒ 0.5

But in versions 1 and 2, the result is an integer so any fractional part will be truncated, e.g.

1/2 ⇒ 0

To force truncation in any version, you can use the integer division operator //:

1//2 ⇒ 0

Variables

Just as in mathematics, assigning literal values to variables will allow you to remember their purpose and make use of them more easily.

Python lets you use variable names of any length, though they must start with a letter and afterward can contain letters, numbers, and the underscore (_). Be aware that variable names are case-sensitive, i.e. y is not the same as Y.

If you follow variables with an = sign and any value or expression such as the above, they will be assigned that value and have that type:

y0 = 2
vy0 = 15
g = 9.81
t = 3
y = y0 + vy0 * t - 1/2 * g * t**2
y

⇒ 2.8549999999999969

Note how similar the last expression is now to the way we write the law of gravity in scientific texts.

(Also note that the IPython console doesn’t print out the values of assignments, only single variables or expressions.)

Python variables are dynamically typed, so you can redefine a variable at will, e.g. vy0 is an integer above but becomes a float with:

vy0 = 15.0

We can check the type of a variable with one of Python’s built-in functions:

type(y0)

⇒ int

type(g)

⇒ float

A generally useful rule: if you will use the result of a calculation more than once, it’s usually a good idea to first assign it to a variable.

Variables are always pointers to areas of computer memory; they can be reassigned to other values at will, e.g.

temp = 100

temp = 200.5

The previous memory area holding the value 100 will be marked for deletion and eventually cleared when Python starts its garbage collection process.

If a variable and the memory it references are no longer needed and the latter is particularly large, it is a good idea to delete the variable manually:

del temp

If you ask for the value of a variable that has not been assigned a value or has been deleted, a name error will occur:

temp

⇒ NameError: name 'temp' is not defined

Functions

It would be useful to be able to repeat the calculation of position in Earth’s gravity for many different times without having to type out the whole expression repeatedly.

We can achieve this by packaging the statement above inside our own function definition statement:

def y(t):
return y0 + vy0 * t - 1./2 * g * t**2

y(3.1)

⇒ 1.3629499999999908

There are several things to be aware of here:

The colon (:) is required to terminate the initial statement;
All subsequent statements that are calculated as part of the function must be indented by a standard amount (4 spaces = 1 tab);
When typing this into IPython, it will automatically indent after it sees the colon, and at the very end you need to type an extra <return>/<enter> to leave the indented level and evaluate the function definition;
The function is aware of the earlier assignments to y0, vy0, and g, which are known as global variables;
When the function is called with the value 3.1 in place of the argument t, the latter is replaced by the calling value everywhere it occurs inside the function, and the previous global assignment t = 3 is ignored;
The value of the function call y(t) is the value of the expression in the return statement.
Function calls can be used anywhere a number or variable can be used.
The reserved words def and return have special meaning for Python, and are examples of several that must not be used for variable names.

Here is a complete list of Python’s reserved words.

We can generalize this function by including multiple arguments in the definition, and we can also make g a local variable :

def y(t, y0, vy0):
g = 9.81
return y0 + vy0 * t - 1./2 * g * t**2

y(3.1, 2, 16)

⇒ 4.462949999999992

Just like the function arguments, when g is defined as a local variable inside the function it will supercede any declarations outside of the function, which are known as global variables.

Local and global variables have different scopes or namespaces, and therefore are independent of each other.

Using local variables helps keep your definitions in one place and ensure they aren’t accidentally overwritten, which will make your code easier to debug.

The acceleration of gravity far from the Earth depends on distance r from the center of the Earth:

g(r) = GM/r²

Here G is the Universal Constant of Gravitation and M is the mass of the Earth:

G = 6.673 × 10^-11 m³/(kg•s²)

M = 5.972 × 10²⁴ kg

Write a function of r that calculates the acceleration of gravity, and use it at 1, 2, and 3 times the Earth’s radius:

R = 6.378 × 10⁶ m

How do these values compare to the value previous asserted for g at the surface of the Earth? Does their behavior match what you know about gravity?

The acceleration of gravity at the surface of the Earth depends on its radius R:

g(r) = GM/R²

Here G is the Universal Constant of Gravitation:

G = 6.673 × 10^-11 m³/(kg•s²)

and M is the mass of the Earth. But this equation actually applies to any spherical object with a surface, not just the Earth.

Write a function that calculates the acceleration of gravity on any planet or moon, given its mass and diameter. Using your function, calculate gravity’s value for the first five worlds in this table. How do these values compare to the value previous asserted for g at the surface of the Earth?

Vectorially Speaking

Motion in a gravitational field is not always up and down, and if you give a ball an initial horizontal velocity, it will also move in that direction:

x = x₀ + v_x0t

y = y₀ + v_y₀t - ½ gt²

So it’s frequently important to calculate the vector position of an object, often represented with unit vectors along the x- and y-axes as x î + y ĵ , but which can be written more simply as a pair of values (x, y).

Python recognizes a similar sequence of values called a tuple, which is simply two or more values separated by commas:

0, 2

⇒ (0, 2)

Tuple commas have a very low precedence, meaning most calculations between commas happen first and grouping into a tuple happens last:

1+1, 5-4, 7*2

⇒ (2, 1, 14)

Parentheses around tuples are usually optional on input, but are sometimes necessary to remove ambiguity, e.g. commas are also used in function argument lists, so if an argument is itself a tuple, parentheses are required.

We can generalize the function above by using tuples as vectors and including the equation for x:

def xy(t, xy0, v0):
    g = 9.81
    (x0,y0) = xy0
    (vx0,vy0)= v0
    return x0 + vx0 * t, y0 + vy0 * t - 1./2 * g * t**2

xy(3, (0, 2), (0, 15))

⇒ (0, 2.854999999999997)

Here tuples are passed both as input to the function but also returned as the output of the function.
Note how this function extracts individual values from the tuple by assigning it to a tuple of variables, which can then be used in calculations in each of the x and y directions.

The evaluation of the function above is a check, which reproduces the previous case in the y direction and also the expected value in the x direction. Another result with an initial speed vx0 = 10 in the x direction would be:

xy0 = (0, 2)
v0 = (10,15)
xy(3, xy0, v0)

⇒ (30, 2.854999999999997)

Once you have created a tuple, you cannot modify it; it’s immutable, like literal integers and floats.

Unlike vectors, tuples can’t be added or subtracted; the operator + is overloaded to produce the concatenation of tuples:

xy0 + v0

⇒ (0, 2, 10, 15)

And vector scalar multiplication also doesn’t work; the operator * combining an integer (only) and a tuple is overloaded to produce the repetition of the tuple:

3 * v0

⇒ (10, 15, 10, 15, 10, 15)

Tuples therefore have limited use as vectors, being primarily a compact notation for handling sequences of values. We will discuss more general vector analysis later.

To determine the length of a tuple, use the built-in function len():

len(xy0)

⇒ 2

The Index Operator

While assigning a tuple to a tuple of variables is one way to extract its values, a more general approach to retrieving a particular value from a tuple or list would be to use the index operator [], which refers to the first element in the tuple or list by the index 0 and the last element by the index len() – 1:

⇒ (10, 15)

v0[0]

⇒ 10

v0[1]

⇒ 15

An often useful feature of Python’s indexing is that a negative value refers to the end of the tuple, i.e. index –n is a shorthand for index len() – n:

v0[-1]

⇒ 15

v0[-2]

⇒ 10

The function len() returns the number of elements in a tuple, but this is different from its “vector length” or norm, the square root of the sum of the squares. We can write a function to calculate that value, but it will depend on the tuple length, requiring the use of conditional statements to control the flow of evaluation:

def norm(v):
    length = len(v)
    if length == 0:
        result = 0
    elif length == 1:
        result = v[0]
    elif length == 2:
        result = (v[0]**2 + v[1]**2)**0.5
    elif length == 3:
        result = (v[0]**2 + v[1]**2 + v[2]**2)**0.5
    else:
        result = None
    return result

norm((3, 4))

⇒ 5.0

The reserved words if, elif, and else introduce conditional statements, and the first two include a conditional expression that produces a boolean value, either True or False, here using the equality operator ==. If the test is True, the following block of code is evaluated, and the flow of evaluation then breaks to the next statement outside of this group of conditional statements, which in this case is the return statement. If the test is False, the next statement is evaluated. The else statement and its following block of code are only evaluated if the preceding if and elif statements are all False.

There are a few other new things to see here:

The test length == 0 uses the conditional operator == (“double equals”) and returns a boolean value, either True or False. Do not confuse this with the assignment operator = (“single equals”).

Other comparison operators one can use are less than (<), less than or equal to (<=), greater than (>), greater than or equal to (>=), and inequality (!=).
Like def, each of these statements is followed by a colon (:), which introduces a block of code that must be indented relative to each reserved word to distinguish them.
Generally speaking, the use of the elif and else statements are optional.
The special value None represents a result that does not exist or is undefined. It is often used to provide a simple way to check results by testing for it, e.g.

if norm(v) == None: {do something}
This code could actually be simplified by using return statements in each block of code to exit the function, rather than assigning the result and returning the latter at the end.
This code has both a conceptual error and a coding error in it — can you spot them? How would you fix them? (Hint: does this work for anything besides tuples?)

While You Throw the Ball…

The norm() function from the previous section only works with tuples of a certain length, but generally speaking vectors could have an arbitrary length. How can we do this calculation without having an endless chain of elif statements?

The most general way to work with objects of arbitrary extent is with a while loop, which lets us work with each part of the object in turn, and test for their ending in some way:

def norm(v):
    if type(v) == int or type(v) == float:
        return abs(v)
    length = len(v)
    if length == 0:
        return 0
    if length == 1:
        return abs(v[0])
    result = 0
    index = 0
    while index < length:
        result += v[index]**2
        index += 1
    return result**0.5

norm((3, 4, 5, 6))

⇒ 9.273618495495704

Note the same statement—colon—indented-block-of-statements structure that we saw for function definitions and if, elif, and else statements.

Generally speaking while loops require that we set an initial value for some variable, provide a conditional expression to test for the final value, and in the body of the loop provide an expression or expressions that updates the time in such a way that the conditional expression is eventually False, whereupon the loop breaks to the next statement.

Here, to increment the value of the index for the next round of the loop, we use an assignment operator += , which takes the value on the right side and adds it to the variable on the left side. The result is equivalent to the statement index = index + 1, but is more efficient. Eventually index == length, and the loop will terminate.

Similarly, we use += to accumulate the sum of the squares in result, and when the loop terminates, the next statement takes the square root of this sum.

Returning to our ball example, to find the full path of a ball thrown under the influence of gravity, we’ll want to calculate its position for more than just a few values of time, and the while loop is one way to do this:

t = 0
while t < 4:
print((t,) + xy(t, xy0, v0))
t += 0.5

⇒ (0, 0, 2.0)
  (0.5, 5.0, 8.27375)
  (1.0, 10.0, 12.094999999999999)
  (1.5, 15.0, 13.46375)
  (2.0, 20.0, 12.379999999999999)
  (2.5, 25.0, 8.84375)
  (3.0, 30.0, 2.854999999999997)
  (3.5, 35.0, -5.58625)

There are also a couple of other new things to see here:

The parameter t is converted to a one-element tuple with the notation (t,), which allows it to be concatenated with the tuple output by xy() to form a tuple of values (t, x, y).
The print() function is used inside of the loop to see the results, otherwise they are suppressed by IPython.

Often you will want to collect these values in a variable rather than print them out, so you can work with them as a group, e.g. to plot them. One approach is to initialize an empty tuple and then concatenate subsequent results:

t = 0
trajectory = ()
while t < 4:
trajectory += (t,)+ xy(t, xy0, v0),
t += 0.5
trajectory

⇒ ((0, 0, 2.0),
(0.5, 5.0, 8.27375),
(1.0, 10.0, 12.094999999999999),
(1.5, 15.0, 13.46375),
(2.0, 20.0, 12.379999999999999),
(2.5, 25.0, 8.84375),
(3.0, 30.0, 2.854999999999997),
(3.5, 35.0, -5.58625))

Here the assignment operator += is also used to concatenate a tuple of values at the end of trajectory — but note the extra , at the end which makes the right-hand side a tuple of a tuple! That is,(t,x,y) is a one-dimensional array, but ((t,x,y),) is a two-dimensional array, and concatenating them produces a time sequence as above.

Question: what will you get if you leave off that extra comma?

Rewrite the above loop to avoid negative values of y instead of lasting for a certain length of time (because, of course, y = 0 is the ground!).

We see above that tuples look like but don’t act like vectors:

Operation	Tuple	Vector
`(x, y) + (u, v)`	Concatenation: `(x, y, u, v)`	Component Addition: `(x + u, y + v)`
`(x, y) * s`	Repetition `s` times: `(x, y, x, y, ...)`	Scalar Multiplication: `(x * s, y * s)`
`(x, y) * (u, v)`	Undefined	Vector Multiplication (dot product): `x * u + y * v`

Write three functions that treat tuples of arbitrary length as vectors:

Take two tuples as arguments and produce their component addition.
Take a tuple and a number as arguments and produce scalar multiplication.
Take two tuples as arguments and produce their dot product.

…For a Specific Range of Time…

When you have a particular sequence of values that you want to iterate through, an alternative to the while loop is a for loop:

trajectory = ()
for t in range(8):
t /= 2.
trajectory += (t,)+ xy(t, xy0, v0),
trajectory

⇒ ((0, 0, 2.0)
  (0.5, 5.0, 8.27375)
  (1.0, 10.0, 12.094999999999999)
  (1.5, 15.0, 13.46375)
  (2.0, 20.0, 12.379999999999999)
  (2.5, 25.0, 8.84375)
  (3.0, 30.0, 2.854999999999997)
  (3.5, 35.0, -5.58625))

Again, there is the same statement—colon—indented-block-of-statements structure that we saw for function definitions and while statements.

The expression after the in keyword can be any kind of sequence, which Python calls iterators. This includes tuples and lists (which we will see shortly).

In this case, the iterator is another object called a range. The range() constructor in this case takes a single argument, here 8, and represents a sequence of integer values starting from zero:

0, 1, 2, 3, 4, 5, 6, 7

These values are assigned in turn to t, and the following block of statements is evaluated.

Note this means we don’t need to increment the time ourselves, the for loop does it for us, which is generally more efficient.

Because the range() constructor can only produce integer values, we need to divide those values by 2 to generate steps of 0.5, as in the previous case of the while loop. We therefore use another assignment operator, t /= 2, which is equivalent to t = t / 2 but is more efficient.

More generally, the range() constructor can have an initial integer value (inclusive), a final integer value (exclusive), and an integer step value, so that the following produces the same sequence as above:

range(0,8,1)

If the step value isn’t included, it’s assumed to be 1, so the following again produces the same sequence:

range(0,8)

And if the initial value is also not included (as above), it’s assumed to be 0.

When you type in a range expressions by itself, the output looks more or less the same; this is because the range() function is a constructor of another type of sequence called, unsurprisingly, the range, which is distinct from the tuple (though it is also immutable):

rangeItems = range(-1,2)
rangeItems

⇒ range(-1,2)

You can, however, convert from a range to a tuple with a constructor:

tuple(rangeItems)

⇒ (-1, 0, 1)

You can also use the index operator:

rangeItems[2]

⇒ 1

Warning: In versions 1 and 2 of Python, the range() function returned a more general type of sequence called a list!

…Unless You List Towards Lists

Sometimes you’ll want to work with a sequence of data that, unlike a tuple, can be modified, e.g. by changing individual elements, inserting other elements, etc.; for this purpose Python provides the list data type.

Like a tuple, a list is simply two or more values separated by commas, but in this case surrounded by square brackets:

[0, 2]

⇒ [0, 2]

Like tuples and ranges, you can assign a list to a variable and reference individual items with the index operator:

listItems = [1, 5, 9]
listItems[1]

⇒ 5

So how is a list different than a tuple? The primary difference is that a list is mutable, i.e. it can be modified, and is therefore much more flexible. In particular, you can assign new values to individual items in a list by using index notation:

listItems[1] = 6
listItems

⇒ [1, 6, 9]

Lists are also a structure known as an object, which provide a number of useful methods (self-operating functions). Object attributes are referenced by appending them to the object with a period. For example, to insert a new value at index 2 (pushing later items to higher indices):

listItems.insert(2, 8)
listItems

⇒ [1, 6, 8, 9]

As an alternative to using += to concatenate one list on the end of another, we can also use the append() method:

trajectory = []
for t in range(8):
t /= 2.
trajectory.append((t,)+ xy(t, xy0, v0))
trajectory

⇒ [(0, 0, 2.0)
  (0.5, 5.0, 8.27375)
  (1.0, 10.0, 12.094999999999999)
  (1.5, 15.0, 13.46375)
  (2.0, 20.0, 12.379999999999999)
  (2.5, 25.0, 8.84375)
  (3.0, 30.0, 2.854999999999997)
  (3.5, 35.0, -5.58625)]

This is less efficient than using tuples, however, since lists have more overhead.

See this documentation on list methods to learn more about the possibilities.

Tuples and ranges can be converted to lists with a constructor:

list(xy0)

⇒ [0, 2]

list(range(-1,2))

⇒ [-1, 0, 1]

Speaking of Time

How do you know which of the above loops will be the fastest way to calculate these values? The timeit module provides the tools to test them:

import timeit

loop1 = """
t = 0
trajectory = ()
while t < 4:
trajectory += ((t,)+ xy(t, xy0, v0)),
t += 0.5
"""

timeit.timeit(stmt=loop1, setup='from __main__ import xy0, v0, xy', number=100000)

⇒ 0.5816807200899348

The """ are a way to define a string of characters that crosses multiple lines, and is used to define the code that you want to run. (Note that all printing has been removed here, to focus on the looping.)

When imported, the module timeit becomes an object with its functions as methods, in particular timeit.timeit().

This particular function expression uses keyword arguments stmt, setup, and number, which can be placed in any position in the argument list, unlike the positional arguments we saw previously.

The parameter setup is used to ensure that predefined parameters are available when the code assigned to stmt is run.

The parameter number lets you choose the number of times to run the code; the larger the number, the less important system-related time should be in the total.

loop2 = """
trajectory = ()
for t in range(8):
t /= 2.
trajectory += (t,)+ xy(t, xy0, v0),
"""

timeit.timeit(stmt=loop2, setup='from __main__ import xy0, v0, xy', number=100000)

⇒ 0.5703178539406508

And also:

loop3 = """
t = 0
trajectory = []
while t < 4:
trajectory.append( (t,)+ xy(t, xy0, v0) )
t += 0.5
"""

timeit.timeit(stmt=loop3, setup='from __main__ import xy0, v0, xy', number=100000)

⇒ 0.6067627309821546

And finally:

loop4 = "trajectory = [(t/2.,) + xy(t/2., xy0, v0) for t in range(8)]"

timeit.timeit(stmt=loop4, setup='from __main__ import xy0, v0, xy', number=100000)

⇒ 0.6161218710476533

So the last approach is about as efficient as the previous case. But building a tuple is still the most efficient approach, due to their lower overhead.

Back Down to Earth: Working with Files

Writing groups of functions in separate files will allow you to access them when necessary, and reuse them in other programs.

Python Files

So far we’ve been ignoring the Editor window, which initially shows a default file named temp.py or untitled0.py (if you happen to have an existing Python file open, click on the button New to start fresh).

We can use files such as this to store the Python statements that we write, and this will also make it easy to edit these statements for bug fixes or improvements.

Initially this temporary file contains a bit of text inside of triple-double quotes """, describing something about the file’s contents, which for most purposes Python will ignore. Let’s update this text to read:

"""
A Study of the Utmost Gravity

A set of functions to calculate the effects of gravity.
"""

We have also written a fair bit of Python code in IPython. Let’s add some of it to this file, in particular:

def xy(t, xy0, v0):
    g = 9.81
    (x0,y0) = xy0
    (vx0,vy0)= v0
    return x0 + vx0 * t, y0 + vy0 * t - 1./2 * g * t**2

xy0 = (0, 2)
v0 = (10, 15)

trajectory = ()
for t in range(8):
trajectory += (t/2.,) + xy(t/2., xy0, v0),

print(trajectory)

Important: You’ll also need to include the function print in a file to see any particular results (unlike in the IPython console).

When putting code into a Python file, it’s a good idea to save it periodically, and in general before you run it (since this can sometimes crash Spyder).

Procedure 1: Saving a Python File

To preserve your code for future use and for ease of editing, it’s best to save it in a folder where it can be easily found and run when you return to the computer:

Menu File > Save as…
Navigate to a good location to store it, such as your local Documents folder or on your U: drive or G: drive;
Create a new folder with a name such as Scientific Programming;
Give the file a name, e.g. gravity.py ;
Click the button Save.

Finally, to execute the code in the current file, click the toolbar button Run file. The file will be loaded directly into IPython as if you had typed its statements there, and any output will also appear there.

The result of the print() command will appear in the IPython console, along with any error messages.

The file extension .py is recognized by many programs to be a particular kind of file called containing Python statements, and if you double-click on it, it may be run as an application rather than being opened again by Spyder.

Spyder’s Files pane is therefore a better way to access your Python files, because it will always open .py files in Spyder.

If you click on the tab Files, you will see Spyder’s current working directory (folder), which initially will be your home folder.

Procedure 2: Remembering a Python File’s Folder

You will often be working on multiple projects in different folders, and Spyder can remember them for easy access in its menu of working directories, by default placed at the top right of Spyder’s window:

To add the current file’s location to this menu:

In the upper right corner of the Editor window, click on the button ≡ Options;
Choose the item Set console working directory;
The folder path should now appear in the working directory menu, and the Files pane should also display that folder and its files.

Python Modules

In addition to running a file directly, you can also load the file in IPython (and even other files or modules) with the statement:

import gravity

Note that the file extension .py is not used here.

This is the same command used for the Python-supplied module timeit in the previous section; Python searches through a number of predefined folders to find modules you reference.

Warning: This only works for your own modules if you have already established the location of this file as the working directory. This is why it’s good practice to keep all of your files together in the same folder.

An alternative to the above statement lets you choose a shortened name:

import gravity as g

When you import the file gravity.py, any functions and variables you’ve defined therein become available for use — but you must reference them as attributes of the gravity or g object (depending on how you imported it). For example,

g.xy0

⇒ (0, 2)

g.xy

⇒ <function gravity.xy>

g.xy(3, g.xy0, g.v0)

⇒ (30, 2.854999999999997)

We’ve seen object methods such as g.xy() previously; g.xy0 and g.v0 are known as object properties.

If you want to import these methods and properties directly, so that you don’t need to include a prefix such as g., you can use the statement

from gravity import xy

from gravity import *

to important everything.

The names of imported attributes become part of the name space of your program, so be certain that you aren’t already using them for something else, as those earlier definitions will be wiped out.

Documenting Your Code

Once you start writing a fair amount of Python code, you will want to include documentation to help you understand your intentions! Otherwise you will return weeks or months later and not know what you did, and it will also make it easier for others to use your code.

Python provides two primary ways to document your code.

Docstrings

We saw above that we began the file gravity.py file with this text as its first few lines:

"""
A Study of the Utmost Gravity

A set of functions to calculate the effects of gravity.
"""

This is known as a docstring, and is always placed at the very beginning of a module file or a function as its first statement.

Simply write a good description of its purpose, and surround it by a matched pair of single, double, or “triple double” quotes.

Then, this text can be referenced with the object property __doc__:

import gravity as g
g.__doc__

⇒ '\nA Study of the Utmost Gravity\n\nA set of functions to calculate the effects of gravity.\n'

The print() function often provides a better display of such text, by, for example, implementing the newline character \n:

print(g.__doc__)

⇒
A Study of the Utmost Gravity

A set of functions to calculate the effects of gravity.

You could also add the following docstring to the function xy() in your file:

def xy(t, xy0, v0):
    """This function calculates the trajectory of an object
    under the influence of Earth's gravity near its surface:

        xy(t, (x0,y0), (vx0,vy0))

    where xy, x0, and y0 are in meters and vx0 and vy0 are in meters/second."""

Then when this module or function is imported, this description will be available for users:

from gravity import xy

xy.__doc__

⇒
"This function calculates the trajectory of an object\n under the influence of Earth's gravity near its surface:\n\n xy(t, (x0,y0), (vx0,vy0))\n\n where xy, x0, and y0 are in meters and vx0 and vy0 are in meters/second."

Spyder also provides a quick reference for objects you’ve defined, e.g. if you click on the name of an object such as

xy()

and then type <ctrl>-I (Windows) or <command>-I (Mac), its definition and docstring will appear in the window Help, just like built-in functions:

Object Inspector Window

Comments

More generally, Python allows you to describe in detail what is happening with your code where it isn’t obvious, by including comments where appropriate.

Python begins comments with a hash character #, and everything that follows to the end of the line is ignored by the interpreter:

# Split vectors into their components:
(x0,y0) = xy0

Comments that apply to subsequent code typically appear on lines by themselves, as above.

Short comments that apply to preceding code typically appear on the same line:

trajectory += (t,)+ xy(t, xy0, v0), # create a tuple (t, x, y) and cat it

Character Strings

A docstring is an example of a character string, or string for short, text surrounded by a matched pair of either single, double, or “triple double” quotes, such as the following:

'This is the trajectory of a ball thrown into the air.'

"It's certain to come back down."

"""
A Study of the Utmost Gravity

A set of functions to calculate the effects of gravity.
"""

There is no difference between using matched pairs of single or double quotes, but they allow you to embed the other character within them, as with the second line above.

Strings surrounded by triple double quotes have the advantage that they can be written to cross multiple lines.

Character strings are commonly used to share data between programs and people, since they are more understandable than binary data, but also between programs because it’s a widely understood format.

Strings as Sequences

Character strings are a new type of sequence data, str.

Like the tuple and list sequences, strings can be concatenated with the operator +:

'# This is the trajectory of a ball thrown into the air.' + \
" It's certain to come back down."

Note the continuation character \ that allows any statement to continue onto the next line.

Also like tuples and lists, strings can be assigned to variables:

intro = '# This is the trajectory of a ball thrown into the air.' + \
" It's certain to come back down."

print(intro)

⇒ # This is the trajectory of a ball thrown into the air. It's certain to come back down.

Like other sequences, you can reference individual characters in a string with the index operator:

intro[0]

⇒ '#'

However, like tuples and ranges, strings are immutable, so you can’t change them with index assignment (as you can for lists).

String Formatting

You may have noticed that when printing the trajectory information, it doesn’t come out very easy-to-read:

print(trajectory)

⇒ ((0, 0, 2.0), (0.5, 5.0, 8.2737499999999997), (1.0, 10.0, 12.094999999999999), (1.5, 15.0, 13.463749999999999), (2.0, 20.0, 12.379999999999999), (2.5, 25.0, 8.84375), (3.0, 30.0, 2.8549999999999969), (3.5, 35.0, -5.5862499999999997))

To make this look much nicer and more understandable, we can first print a string as a descriptive header:

tableheader = ' t(s) x(m) y(m)'
print(tableheader)

We can then use the string formatting operator % which designates format specifiers in a string, and fills them in with values from a tuple:

tabledata = '%5.2f %5.2f %5.2f'
for position in trajectory:
print(tabledata % position) # position is a tuple (t, x, y)

⇒

 t(s)   x(m)   y(m)
 0.00   0.00   2.00
 0.50   5.00   8.27
 1.00  10.00  12.09
 1.50  15.00  13.46
 2.00  20.00  12.38
 2.50  25.00   8.84
 3.00  30.00   2.85
 3.50  35.00  -5.59

The format specifier %5.2f specifies a floating-point output, with a field width of five characters, including two decimal places. The number is right-justified, and blank spaces will be used, if necessary, to fill five positions.

When printing data such as the trajectory above, it’s also a good idea to provide other contextual information about the data, called metadata. Here we’ll use another way to include data in a string, a formatted string, which precedes the first quote (of any kind) with the letter f:

metadata = f"""
# Initial position: {xy0} m
# Initial velocity: {v0} m/s.
"""
print(metadata)

⇒
# Initial position: (0, 2) m
# Initial velocity: (10, 15) m/s.

Formatting codes can also be used in formatted strings, but with a colon instead of a %, e.g. {xy0[0]:.2f}.

There are a number of other formatting codes, including %e for scientific notation, %i for integer values, and %s for strings (to be embedded within the formatting string). The sequence %% represents a single % in the leading string.

See this documentation on string formatting options for more information.

Data Output and Input

Rather than writing the pretty table above to the screen, we can also open a file for writing and save it there.

First we must open the file for writing :

output = open('trajectory.txt', 'w')

This statement assumes the file will be in the current folder (directory), which should be the same place from where you opened your file gravity.py.

This statement will also delete any existing content in the file if it already exists; if you want to append data to an existing file, the second argument should instead be 'a'.

The result of the open() command is a file handler we’ve called output, whose methods allow us to output data to the file.

We can make use of the same strings as above:

output.write(intro + '\n')
output.write(metadata + '\n')
output.write(tableheader + '\n')
for position in trajectory:
output.write(tabledata % position + '\n')

But note the two differences from the print statement:

We are using the method output.write() instead;
We must explicitly append a newline character '\n' at the end of each string to start new lines.

Finally, we must close the file:

output.close()

To read the file back in again:

input = open('trajectory.txt', 'r')
data = input.read()
print(data)
input.close()

⇒
# This is the trajectory of a ball thrown into the air. It's certain to come back down.
# Initial position: (0, 2) m
# Initial velocity: (10, 15) m/s.

t(s)   x(m)   y(m)
0.00   0.00   2.00
0.50   5.00   8.27
1.00  10.00  12.09
1.50  15.00  13.46
2.00  20.00  12.38
2.50  25.00   8.84
3.00  30.00   2.85
3.50  35.00  -5.59

Question: if you simply type the name of the variable data to display it, how does its output differ from using the print statement?

String Parsing

Note that the input data is a single string — if this were actual data you would want to read it in line-by-line so that you could parse it back into its component parts (viz. header and rows):

input = open('trajectory.txt', 'r')
data = input.readlines()
input.close()
print(data)

⇒
["# This is the trajectory of a ball thrown into the air. It's certain to come back down.\n", '\n', '# Initial position: (0, 2) m\n', '# Initial velocity: (10, 15) m/s.\n', '\n', ' t(s) x(m) y(m)\n', ' 0.00 0.00 2.00\n', ' 0.50 5.00 8.27\n', ' 1.00 10.00 12.09\n', ' 1.50 15.00 13.46\n', ' 2.00 20.00 12.38\n', ' 2.50 25.00 8.84\n', ' 3.00 30.00 2.85\n']

The actual data begins in the seventh element of the list, which has index 6, so we can process it beginning there:

trajectory = []
for line in data[6:]:
    trajectory.append(
        tuple(
            map(float, line.split())
        )
    )
print(trajectory)

⇒
[(0.0, 4.0, 2.0), (0.5, 9.0, 8.27), (1.0, 14.0, 12.09), (1.5, 19.0, 13.46), (2.0, 24.0, 12.38), (2.5, 29.0, 8.84), (3.0, 34.0, 2.85), (3.5, 39.0, -5.59)]

Again, there’s a lot going on here:

The index notation [n:m], known as a slice, lets you pick out any set of values from a sequence, from element [n] to
element [m-1]— yes, the first index is inclusive while the second is exclusive (!).
If you leave out the second index in a slice, e.g. [n:], it continues to the end of the list,
and if you leave out the first index, e.g. [:m], it starts from the beginning.
The string method .split() will split its string at intervening spaces, producing a list of three separate strings:

' 0.00 0.00 2.00\n' ⇒ ['0.00', '0.00', '2.00']
The function map() will apply its first argument, the function float(), to each element of its second argument, a list:

['0.00', '0.00', '2.00'] ⇒ [0.0, 0.0, 2.0]The result is another type of sequence called an iterator, which is converted to a tuple with the function tuple().

Here is a complete list of methods available for working on strings.

How would you go about extracting the four numbers from this string?

Hint: strings have a .find() method!

Parsing the contents of a file doesn’t actually need to be this difficult — this is more of a learning exercise. Others have been here before you and have written modules to simplify Python’s input/output (I/O). We’ll look at one example in a couple of days.

Natural Arrays

Data describing Nature are often best stored in arrays, rectangular (or higher-dimensional) collections of values, and NumPy makes it much easier to work with them.

NumPy Arrays

Remember our previous data set describing the motion of a ball thrown into the air near the Earth’s surface:

tableheader = ' t(s) x(m) y(m)'
tabledata = '%5.2f %5.2f %5.2f'
print(tableheader)
for position in trajectory:
print(tabledata % position)

⇒

t(s)   x(m)   y(m)
0.00   0.00   2.00
0.50   5.00   8.27
1.00  10.00  12.09
1.50  15.00  13.46
2.00  20.00  12.38
2.50  25.00   8.84
3.00  30.00   2.85
3.50  35.00  -5.59

which is a nice way of displaying trajectory, a list of tuples in the format (t, x, y):

trajectory

⇒ [(0, 0, 2.0), (0.5, 5.0, 8.2737499999999997), (1.0, 10.0, 12.094999999999999), (1.5, 15.0, 13.463749999999999), (2.0, 20.0, 12.379999999999999), (2.5, 25.0, 8.84375), (3.0, 30.0, 2.8549999999999969), (3.5, 35.0, -5.5862499999999997)]

We would like to be able to plot, two-dimensionally, the data pairs (t, x), (t, y), and (x, y). Unfortunately, Python by itself doesn’t provide good tools for extracting “columns” from lists of lists; you might try syntax that youv’e seen in other languages, such as

trajectory[:,1]
trajectory[:][1]

to select the second column, where [:]should mean to select all of the rows, but this won’t get you what you want; the first will produce an error and the second will produce the second row instead!

The only choice you have is to iterate through each row and compile the individual elements, but that’s very inefficient.

Fortunately, Python is very general and can be used to build better data types than the lowly list, and the add-on NumPy takes advantage of these features to provide a better data type, the array.

First, though, we must import Numpy:

import numpy as np

Including the as np phrase allows us to refer to the functions imported as np.method() rather than numpy.method().

With this large toolbox in hand, we can then convert trajectory to an array as follows:

ntra = np.array(trajectory)

And it is now possible to use the first notation above to select the columns we want, e.g. :

ntra[:,1]

⇒ array([ 0., 5., 10., 15., 20., 25., 30., 35.])

We can also select multiple columns:

ntra[:,1:3]

⇒
array([[ 0., 2. ]
[ 5., 8.27]
[ 10., 12.09 ]
[ 15., 13.46]
[ 20., 12.38 ]
[ 25., 8.84]
[ 30., 2.85 ]
[ 35., -5.59]])

Remember one of Python’s oddities: the sequence slice 1:3 means elements 1 and 2, because the second index in a slice is exclusive.

Plotting Arrays

Now that we’ve extracted the data pairs we want, we can plot them using the add-on matplotlib.

Because matplotlib generally requires several statements to build a plot, we will want to write them in a .py file while we perfect them.

But because we will be producing interactive plots, it’s a good idea to embed the plotting commands in a function that we can call from the Python shell after loading our .py file:

def xyplot():
    'This function plots the trajectory of a ball in a gravitional field.'
    from matplotlib.pyplot import plot, title, xlabel, ylabel, show, close
    plot(ntra[:,1],ntra[:,2])
    title('Trajectory of a Ball through the Air.')
    xlabel('x')
    ylabel('y')
    show()
    close()

Here we see a variation of the import command that loads just the particular functions we need from matplotlib. It also avoids the use of a prefix such as np.method(), since any name conflicts will only occur inside the function, where we’ll have tighter control over them.

The plot is built up from multiple pieces: the actual plot itself, the title, and the x- and y-axis labels. Only after these have been “constructed” do we show() the plot, which produces a pop-up window with the figure displayed:

xyplot()

⇒

If the figure is satisfactory, you can save it by right-clicking on it, and in the contextual menu that appears select the item Save Image As….

A list of plotting commands can be found in the Matplotlib documentation, providing a great deal of flexibility in displaying your data.

Scientific Programming with Python

Part 1: The Earth’s Gravity

Current: The Earth’s Gravity

Next: Newton’s Laws of Motion

Topics

Common Computer Numeric Data Types

While You Throw the Ball…

…For a Specific Range of Time…

Procedure 2: Remembering a Python File’s Folder

Scientific Programming with Python: Gravity Near the Earth’s Surface

Next: Newton’s Laws of Motion

Scientific Programming with Python:
Gravity Near the Earth’s Surface

Next:
Newton’s Laws of Motion