String Data Type

A string (str) is a Python data type that holds a sequence of characters and provides methods that operate on them.

Literal Definition

Strings can be defined literally by typing text between matching " (double) or ' (single) quotes.

Strings in Python can also be defined using matching """ or ''' ("triple-quoted strings") if you need a long string to span multiple lines.

In [2]:
greet = "Hello Bob"
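As a small sketch of the quoting styles described above (the variable names single, double, quote, and poem are illustrative and not part of the original notebook):

```python
# Single and double quotes produce the same kind of str object.
single = 'Hello Bob'
double = "Hello Bob"
print(single == double)  # True

# Choosing one quote style lets the other appear inside without escaping.
quote = "It's a string"
print(quote)

# Triple quotes let a literal span several lines; each line break in the
# source becomes a newline character in the resulting string.
poem = """Roses are red,
Violets are blue"""
print(poem.count("\n"))   # 1
print(poem.splitlines())  # ['Roses are red,', 'Violets are blue']
```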

Indexing

You can access individual characters in a string using indexing: an integer in square brackets that identifies a position in the string, counting from 0.

In [3]:
greet[0]
Out[3]:
'H'
In [4]:
print(greet[0], greet[2], greet[4])
H l o
In [6]:
for char in greet:
    print(char, end=" ")
H e l l o   B o b 

Expressions as Indices

An index can be given as a more complicated arithmetic expression, so long as evaluating that expression results in an integer value.

So, be sure that your index always evaluates to an int and that the position doesn't exceed the bounds of the string.

In [7]:
x = 8
In [8]:
greet[x - 2]
Out[8]:
'B'
In [9]:
y = 10.0
greet[y - 2]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-212cf27dc72a> in <module>()
      1 y = 10.0
----> 2 greet[y - 2]

TypeError: string indices must be integers
In [10]:
y = 10
greet[y - 2]
Out[10]:
'b'
In [11]:
z = 20
greet[z - 2]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-11-d0bbd647caad> in <module>()
      1 z = 20
----> 2 greet[z - 2]

IndexError: string index out of range
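One way to guard against both failures shown above is to catch the two exception types. The following is only a sketch; char_at is a hypothetical helper invented for illustration, not something from the original notebook or the standard library.

```python
greet = "Hello Bob"

def char_at(text, position):
    """Return text[position], or None when the index is unusable.

    Hypothetical helper for illustration only.
    """
    try:
        return text[position]
    except TypeError:
        # Raised when the index is not an integer, e.g. 8.0
        return None
    except IndexError:
        # Raised when the index is past either end of the string
        return None

print(char_at(greet, 8))     # b
print(char_at(greet, 8.0))   # None
print(char_at(greet, 18))    # None

# Converting a float result to int before indexing also works.
y = 10.0
print(greet[int(y - 2)])     # b
```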

Reverse Indexing

You can reference the end of a string using negative numbers, a shorthand that simplifies many string programming tasks. Instead of needing to know how long a string is in order to reference the last character, all you need to do is write "-1" as an index to say "give me the last character". From there, more negative integers count further in from the right (-2, -3, and -4 are the 2nd, 3rd, and 4th characters from the string's end).

In [12]:
greet[-1]  # Last character of our string
Out[12]:
'b'
In [13]:
greet[-2]
Out[13]:
'o'
In [16]:
for n in range(-1, -10, -1):
    print(n, greet[n])
-1 b
-2 o
-3 B
-4  
-5 o
-6 l
-7 l
-8 e
-9 H
In [17]:
greet[len(greet)]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-17-fdbeb5e74c65> in <module>()
----> 1 greet[len(greet)]

IndexError: string index out of range
In [18]:
len(greet)
Out[18]:
9
In [19]:
greet[len(greet)-1]
Out[19]:
'b'
In [20]:
greet[len(greet)-2]
Out[20]:
'o'
In [22]:
for n in range(-1, -10, -1):
    print(n, greet[len(greet)+n])
-1 b
-2 o
-3 B
-4  
-5 o
-6 l
-7 l
-8 e
-9 H
In [24]:
greet[len(greet)-1]
greet[len(greet)-2]
greet[len(greet)-3]
Out[24]:
'B'
In [25]:
greet[-1]
greet[-2]
greet[-3]
Out[25]:
'B'
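The cells above suggest that a negative index -k refers to the same character as len(greet) - k. A short sketch that checks this for every valid k (the names n, matches, and too_far are illustrative):

```python
greet = "Hello Bob"
n = len(greet)

# For k = 1 .. n, greet[-k] and greet[n - k] name the same character.
matches = all(greet[-k] == greet[n - k] for k in range(1, n + 1))
print(matches)  # True

# One step further left than the first character is out of range.
too_far = False
try:
    greet[-(n + 1)]
except IndexError:
    too_far = True
print(too_far)  # True
```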

Slicing

Slicing is a natural and intuitive extension of indexing. Instead of a single character, you get back a series of characters. The slicing operation looks like [start:end:step], where start and end are integers (leave either or both off to default to the beginning and end of the string, respectively) and the character at position end is not included. The step parameter is optional and lets you take every $n$th character of the string.

In [26]:
greet[0:3]
Out[26]:
'Hel'
In [27]:
greet[5:9]
Out[27]:
' Bob'
In [28]:
greet[5:90]
Out[28]:
' Bob'
In [29]:
greet[5:]
Out[29]:
' Bob'
In [30]:
greet[:5]
Out[30]:
'Hello'
In [31]:
greet[0:5]
Out[31]:
'Hello'
In [32]:
greet[:]
Out[32]:
'Hello Bob'
In [33]:
print(id(greet), id(greet[:]))
4525755696 4525755696
In [34]:
greet[6] = 'T'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-8b63bdbc7c92> in <module>()
----> 1 greet[6] = 'T'

TypeError: 'str' object does not support item assignment
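Because item assignment fails, a changed string has to be built as a new object. A minimal sketch using slicing and concatenation (the names changed and replaced are illustrative):

```python
greet = "Hello Bob"

# Take the slices on either side of position 6 and join them
# around the new character.
changed = greet[:6] + "T" + greet[7:]
print(changed)   # Hello Tob

# str.replace also returns a new string rather than modifying greet.
replaced = greet.replace("Bob", "Tom")
print(replaced)  # Hello Tom
print(greet)     # Hello Bob (the original is unchanged)
```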

Negative Slicing

Slicing with negative indices is also possible and takes many forms, some of which are not obvious or intuitive (e.g. [-3:9]).

In [35]:
greet
Out[35]:
'Hello Bob'
In [36]:
greet[-1:]
Out[36]:
'b'
In [37]:
greet[-3:]
Out[37]:
'Bob'
In [38]:
greet[-3:8]
Out[38]:
'Bo'
In [39]:
greet[-3:9]
Out[39]:
'Bob'
In [40]:
greet[-9:-4]
Out[40]:
'Hello'
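One way to make the less obvious forms easier to read is to ask Python which non-negative positions a slice resolves to. The sketch below uses the built-in slice type and its indices method; treat it as an aid for exploration rather than something the original notebook relies on.

```python
greet = "Hello Bob"
n = len(greet)

# slice.indices(length) returns the concrete (start, stop, step)
# that the slice uses for a sequence of that length.
print(slice(-3, 9).indices(n))    # (6, 9, 1)
print(slice(-9, -4).indices(n))   # (0, 5, 1)
print(slice(5, 90).indices(n))    # (5, 9, 1)

# So these pairs select the same characters.
print(greet[-3:9] == greet[6:9])   # True
print(greet[-9:-4] == greet[0:5])  # True
```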

Slicing with Steps

In [42]:
greet[::-1]
Out[42]:
'boB olleH'
In [43]:
greet[0:9:2]
Out[43]:
'HloBb'
In [44]:
greet
Out[44]:
'Hello Bob'
In [45]:
greet[0:9:3]
Out[45]:
'HlB'
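The step value in the cells above can be read as "move this many positions between picks", with a negative step walking from right to left. A few more variations as a sketch (all variable names are illustrative):

```python
greet = "Hello Bob"

every_other = greet[::2]      # positions 0, 2, 4, 6, 8
odd_positions = greet[1::2]   # positions 1, 3, 5, 7
backwards = greet[::-1]       # whole string, right to left
last_three_reversed = greet[:-4:-1]  # positions 8, 7, 6

print(every_other)            # HloBb
print(odd_positions)          # el o
print(backwards)              # boB olleH
print(last_three_reversed)    # boB

# A step slice picks the same characters as a loop over range()
# with the same start, stop, and step.
manual = "".join(greet[i] for i in range(0, len(greet), 3))
print(manual == greet[0:9:3])  # True
```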

"Arithmetic" with Strings

The + (addition) and * (multiplication) operators are "overloaded" to work with strings in addition to numbers; with strings, however, you aren't performing any arithmetic. When you "add" two strings together, you are performing a string concatenation: joining the strings to form a new string. When you "multiply" a string by an integer $n$, you create a new string whose contents are repeated $n$ times.

In [47]:
s1 = "Spam"
s2 = "Green Eggs"

verse = s2 + " and " + s1
In [48]:
print(verse)
Green Eggs and Spam
In [49]:
width = 10
print("=" * width)
print("First Name", "Last Name")
print("=" * width)
==========
First Name Last Name
==========
In [51]:
header1 = "First Name"
header2 = "Last Name"
width = len(header1 + header2) + 1

print("=" * width)
print(header1, header2)
print("=" * width)
====================
First Name Last Name
====================
In [52]:
header = "=" * 10
In [53]:
header
Out[53]:
'=========='
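A few edge cases of these two operators, as a sketch (the names mixed_failed, label, and verse are illustrative): + only joins strings with strings, and repeating a string zero or a negative number of times gives an empty string.

```python
s1 = "Spam"
s2 = "Green Eggs"

# Adding a str and an int raises TypeError; convert the number first.
mixed_failed = False
try:
    s1 + 3
except TypeError:
    mixed_failed = True
print(mixed_failed)          # True

label = s1 + " x " + str(3)
print(label)                 # Spam x 3

# Repetition works with the integer on either side.
print(3 * s1)                # SpamSpamSpam

# Zero or negative counts produce an empty string.
print(s1 * 0 == "")          # True
print(s1 * -2 == "")         # True

# str.join is an alternative to chaining + for several pieces.
verse = " and ".join([s2, s1])
print(verse)                 # Green Eggs and Spam
```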