Python Data Types

· Seokhyeon Byun

Note: This post is based on old study notes from when I was teaching myself programming.

Characteristics of Data in Python

All data in Python is represented by objects or by relations between objects. Every object has an identity, a type and a value.
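
To make the three properties concrete, here is a minimal sketch using the built-ins id(), type(), and the is and == operators. The variable names are only for illustration.

```python
# Every object has an identity, a type, and a value.
a = [1, 2, 3]
b = a            # a second name bound to the same object
c = [1, 2, 3]    # a distinct object that happens to hold an equal value

print(id(a) == id(b))  # True: same identity
print(a is c)          # False: different objects
print(a == c)          # True: equal values
print(type(a))         # <class 'list'>
```

The is operator compares identity, while == compares value, which is why a and c are equal but not the same object.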

Primitive Types

Numeric Types

  • Integer: Whole numbers (positive, negative, or zero)
  • Float: Numbers with a fractional part, written with a decimal point or an exponent

# Integer examples
age = 25
temperature = -10
count = 0

# Float examples
price = 19.99
pi = 3.14159
scientific = 2.5e6  # 2,500,000

String Type

A sequence of characters. Use single quotes ('') or double quotes ("") to create strings.

# String examples
name = "John Doe"
message = 'Hello, World!'
multiline = """This is a
multiline string"""

# String operations
full_name = "John" + " " + "Doe"  # Concatenation
repeated = "Ha" * 3  # "HaHaHa"

Reference: Python String Methods Documentation

Boolean Type

True or False values.

is_student = True
is_graduated = False

# Important notes:
# - False == 0 and True == 1 (bool is a subclass of int)
# - The T and F must be capitalized: true and false are not defined

# Boolean operations
result = True and False  # False
result = True or False   # True
result = not True        # False
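
The link between booleans and the integers 0 and 1 can be checked directly. This is a small sketch, and the flags list is made up for the example.

```python
# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic.
print(isinstance(True, int))   # True
print(True == 1, False == 0)   # True True
print(True + True)             # 2

# A practical use: counting how many conditions hold.
flags = [True, False, True]
print(sum(flags))              # 2
```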

None Type

None is a single object, the only value of type NoneType. It represents the absence of a value.

data = None
result = None

# Important: None is not the same as:
# - 0 (zero)
# - False
# - An empty string ""
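
Because None differs from 0, False, and "", the usual way to test for it is the is operator rather than a truthiness check. The describe function below is a hypothetical helper written only to show the difference.

```python
def describe(value):
    """Hypothetical helper: separate None from other falsy values."""
    if value is None:          # identity check, matches only None
        return "missing"
    if not value:              # truthiness check, matches 0, "", [], ...
        return "present but falsy"
    return "present"

print(describe(None))  # missing
print(describe(0))     # present but falsy
print(describe(""))    # present but falsy
print(describe("hi"))  # present
```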

Sequence Types

The sequence types covered here fall into three groups:

Immutable Sequences

  • Strings: Use quotes, e.g. "hello"
  • Tuples: Use ()

# Tuple examples
coordinates = (10, 20)
colors = ("red", "green", "blue")
single_item = (42,)  # Note the comma for single item
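
As a rough illustration of what "immutable" means for tuples, the sketch below unpacks a tuple and then tries to assign to one of its items. The variable names are assumptions for the example.

```python
point = (10, 20)
x, y = point           # tuple unpacking
print(x, y)            # 10 20

modified = True
try:
    point[0] = 99      # tuples do not support item assignment
except TypeError:
    modified = False

print(modified)        # False
print(point)           # (10, 20)
```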

Mutable Sequences

  • Lists: Use []

# List examples
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True]
nested = [[1, 2], [3, 4]]

Range Objects

# Range examples
numbers = range(10)        # 0 to 9
evens = range(0, 10, 2)    # 0, 2, 4, 6, 8
countdown = range(10, 0, -1)  # 10, 9, 8, ..., 1
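
A range does not store its numbers; it computes them on demand. The sketch below shows that it still supports the usual sequence operations, and that list() materializes the values when you want to see them.

```python
evens = range(0, 10, 2)

print(list(evens))   # [0, 2, 4, 6, 8]
print(len(evens))    # 5
print(evens[2])      # 4
print(6 in evens)    # True
print(7 in evens)    # False

countdown = range(10, 0, -1)
print(list(countdown))  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```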

Reference: Python Sequence Types Documentation


Data Type Conversion

Basic Conversions

  • int(): Convert to integer
  • float(): Convert to float (real number)
  • str(): Convert to string

# Integer conversion
num_str = "123"
num_int = int(num_str)  # 123
float_num = 3.14
int_from_float = int(float_num)  # 3 (truncated)

# Float conversion
int_num = 42
float_num = float(int_num)  # 42.0
str_num = "3.14"
float_from_str = float(str_num)  # 3.14

# String conversion
number = 42
str_num = str(number)  # "42"
boolean = True
str_bool = str(boolean)  # "True"
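
Conversions from strings fail with ValueError when the text is not a valid number, so input from users or files is often wrapped in try/except. The to_int helper below is a hypothetical sketch of that pattern, not a standard function.

```python
def to_int(text, default=None):
    """Hypothetical helper: int(text), or default if the conversion fails."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return default

print(to_int("123"))     # 123
print(to_int(" 42 "))    # 42 (surrounding whitespace is ignored by int())
print(to_int("3.14"))    # None
print(to_int("abc", 0))  # 0
```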

Advanced Conversions

# List, tuple, set conversions
string = "hello"
char_list = list(string)      # ['h', 'e', 'l', 'l', 'o']
char_tuple = tuple(string)    # ('h', 'e', 'l', 'l', 'o')
char_set = set(string)        # {'h', 'e', 'l', 'o'}

# Boolean conversion
bool(1)       # True
bool(0)       # False
bool("")      # False
bool("text")  # True
bool([])      # False
bool([1, 2])  # True
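
One thing to keep in mind with set(): it drops duplicates but does not promise any particular order. If order matters, one common alternative is dict.fromkeys(), since dictionaries keep insertion order in Python 3.7+. A small sketch:

```python
word = "hello"

unique_unordered = set(word)                 # duplicates removed, order not guaranteed
unique_ordered = list(dict.fromkeys(word))   # duplicates removed, first-seen order kept

print(sorted(unique_unordered))  # ['e', 'h', 'l', 'o']
print(unique_ordered)            # ['h', 'e', 'l', 'o']
```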

Useful Built-in Functions and Methods

String Methods

text = "  Hello World  "

# Case conversion
text.upper()     # "  HELLO WORLD  "
text.lower()     # "  hello world  "
text.title()     # "  Hello World  "

# Whitespace handling
text.strip()     # "Hello World"
text.lstrip()    # "Hello World  "
text.rstrip()    # "  Hello World"

# Search and replace
text.find("World")      # 8 (index of "World")
text.count("l")         # 3 (number of "l"s)
text.replace("World", "Python")  # "  Hello Python  "

# Splitting and joining
sentence = "apple,banana,orange"
fruits = sentence.split(",")     # ["apple", "banana", "orange"]
joined = "-".join(fruits)        # "apple-banana-orange"
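
Since each string method returns a new string, calls can be chained. The sketch below cleans up a messy comma-separated line; the input text is invented for the example.

```python
raw = "  Apple, banana ,ORANGE  "

# Split on commas, then trim whitespace and normalize case for each piece.
fruits = [part.strip().lower() for part in raw.split(",")]

print(fruits)              # ['apple', 'banana', 'orange']
print(" | ".join(fruits))  # apple | banana | orange
```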

List Methods

numbers = [1, 2, 3]

# Adding elements
numbers.insert(0, 0)    # [0, 1, 2, 3] (insert at index)
numbers.append(4)       # [0, 1, 2, 3, 4] (add to end)

# Removing elements
numbers.pop()           # Returns and removes 4: [0, 1, 2, 3]
numbers.pop(0)          # Returns and removes 0: [1, 2, 3]
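
A few more list methods follow the same in-place pattern. The sketch below uses its own scores list so it does not depend on the values above. Note that in-place methods such as sort() return None, while sorted() returns a new list.

```python
scores = [3, 1, 2]

scores.extend([5, 4])    # [3, 1, 2, 5, 4] (add several items at the end)
scores.remove(1)         # [3, 2, 5, 4]    (remove the first matching value)
result = scores.sort()   # sorts in place and returns None

print(scores)            # [2, 3, 4, 5]
print(result)            # None

ordered = sorted([3, 1, 2])  # sorted() leaves the input alone and returns a new list
print(ordered)               # [1, 2, 3]
```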

General Built-in Functions

# Type checking
type(42)           # <class 'int'>
isinstance(42, int)  # True

# Length and size
len("hello")       # 5
len([1, 2, 3])     # 3

# Math functions
abs(-5)           # 5
max([1, 2, 3])    # 3
min([1, 2, 3])    # 1
sum([1, 2, 3])    # 6

# Input/Output
name = input("Enter your name: ")  # Get user input
print("Hello,", name)              # Print output

Reference: Python Built-in Functions Documentation


Type Checking and Validation

# Check data types
data = 42
print(type(data))           # <class 'int'>
print(isinstance(data, int))  # True

# Multiple type checking
def process_data(value):
    if isinstance(value, (int, float)):
        return value * 2
    elif isinstance(value, str):
        return value.upper()
    else:
        return None

# Type validation example
def safe_divide(a, b):
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        return "Error: Both arguments must be numbers"
    if b == 0:
        return "Error: Cannot divide by zero"
    return a / b
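
One detail worth knowing about checks like isinstance(value, (int, float)): booleans pass them, because bool is a subclass of int. If that is not wanted, a stricter check might look like the sketch below. is_number is a hypothetical helper, not part of the standard library.

```python
def is_number(value):
    """Hypothetical helper: True for int or float, but not for bool."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

print(isinstance(True, (int, float)))  # True (bool counts as int)
print(is_number(True))                 # False
print(is_number(3.5))                  # True
print(is_number("3.5"))                # False
```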

Common Pitfalls and Tips

Mutable vs Immutable

# Immutable - creates new object
text = "hello"
text.upper()  # Returns "HELLO" but doesn't change original
print(text)   # Still "hello"

# Mutable - modifies existing object
numbers = [1, 2, 3]
numbers.append(4)  # Modifies the original list
print(numbers)     # [1, 2, 3, 4]
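
Mutability matters most when two names refer to the same object. In the sketch below, alias and snapshot are illustrative names: the first shares the list, the second is a shallow copy made with list.copy().

```python
original = [1, 2, 3]
alias = original            # no copy: both names refer to the same list
snapshot = original.copy()  # shallow copy: a separate list object

alias.append(4)             # changes the shared list

print(original)   # [1, 2, 3, 4]
print(snapshot)   # [1, 2, 3]
```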

Type Conversion Edge Cases

# Be careful with these conversions
int("3.14")     # ValueError: invalid literal
int(float("3.14"))  # 3 (correct way)

# Any non-empty string converts to True, whatever its content
bool("False")   # True (string "False" is truthy)
bool("")        # False (empty string is falsy)
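
If the goal is to read text such as "False" or "no" as a boolean, bool() is the wrong tool and the string has to be compared explicitly. The parse_bool function below is a hypothetical sketch, and the accepted spellings are an assumption you would adapt to your own input.

```python
def parse_bool(text):
    """Hypothetical helper: interpret common textual booleans."""
    normalized = text.strip().lower()
    if normalized in {"true", "1", "yes"}:
        return True
    if normalized in {"false", "0", "no", ""}:
        return False
    raise ValueError(f"not a recognized boolean: {text!r}")

print(parse_bool("False"))  # False
print(parse_bool(" YES "))  # True
print(bool("False"))        # True (for comparison)
```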

Memory Efficiency

# For large numeric ranges, use range() instead of list
# Memory efficient
for i in range(1000000):
    pass

# Memory intensive
for i in list(range(1000000)):  # Creates list in memory
    pass
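
The difference can be measured with sys.getsizeof(). The exact byte counts depend on the Python version and platform, so the sketch prints them without claiming specific numbers.

```python
import sys

lazy = range(1_000_000)   # stores only start, stop, and step
eager = list(lazy)        # stores a reference to every element

print(sys.getsizeof(lazy))   # small, and the same for any range length
print(sys.getsizeof(eager))  # grows with the number of elements
```

Note that getsizeof() reports only the container itself, not the integer objects the list refers to, so the real gap is larger still.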

Technical Interview Essentials

Time & Space Complexity of Data Types

Access/Lookup Time Complexity:

# List: O(1) access by index, O(n) search
my_list = [1, 2, 3, 4, 5]
item = my_list[2]        # O(1) - direct access
found = 3 in my_list     # O(n) - linear search

# Dictionary: O(1) average case for lookup
my_dict = {"key": "value", "python": "awesome"}
value = my_dict["key"]   # O(1) average case
exists = "python" in my_dict  # O(1) average case

# Set: O(1) average case for lookup
my_set = {1, 2, 3, 4, 5}
exists = 3 in my_set     # O(1) average case
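
To see the O(n) versus O(1) difference in practice, the standard timeit module can time the same membership test on a list and on a set. This is a rough sketch: the sizes and repeat count are arbitrary, and actual timings vary by machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # last element: the worst case for a linear search

list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.4f}s, set: {set_time:.6f}s")
```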

Memory Usage Comparison

import sys

# Compare memory usage
numbers_list = [1, 2, 3, 4, 5]
numbers_tuple = (1, 2, 3, 4, 5)
numbers_set = {1, 2, 3, 4, 5}

print(f"List size: {sys.getsizeof(numbers_list)} bytes")
print(f"Tuple size: {sys.getsizeof(numbers_tuple)} bytes")
print(f"Set size: {sys.getsizeof(numbers_set)} bytes")

# Tuple is typically more memory-efficient than list

Common Interview Patterns

1. Two Pointers with Strings

def is_palindrome(s):
    """Check if string is palindrome - common interview question"""
    left, right = 0, len(s) - 1

    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1

    return True

# Test cases
print(is_palindrome("racecar"))  # True
print(is_palindrome("hello"))    # False
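
A frequent follow-up is to ignore case and punctuation. One way to extend the same two-pointer idea is sketched below; is_palindrome_normalized is a name chosen here, and str.isalnum() is used to skip non-alphanumeric characters.

```python
def is_palindrome_normalized(s):
    """Two-pointer palindrome check that ignores case and non-alphanumerics."""
    left, right = 0, len(s) - 1

    while left < right:
        if not s[left].isalnum():
            left += 1                       # skip punctuation/space on the left
        elif not s[right].isalnum():
            right -= 1                      # skip punctuation/space on the right
        elif s[left].lower() != s[right].lower():
            return False
        else:
            left += 1
            right -= 1

    return True

print(is_palindrome_normalized("A man, a plan, a canal: Panama"))  # True
print(is_palindrome_normalized("race a car"))                      # False
```

This still runs in O(n) time and O(1) extra space, since no cleaned copy of the string is built.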

2. Hash Map for Counting

def character_count(text):
    """Count character frequency - very common pattern"""
    count_dict = {}

    for char in text:
        count_dict[char] = count_dict.get(char, 0) + 1
        # Alternative: count_dict[char] = count_dict.setdefault(char, 0) + 1

    return count_dict

print(character_count("hello"))  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
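
The standard library also provides collections.Counter, which implements the same counting pattern. A short sketch:

```python
from collections import Counter

counts = Counter("hello")

print(counts["l"])            # 2
print(counts["z"])            # 0 (missing keys count as zero, no KeyError)
print(counts.most_common(1))  # [('l', 2)]
```

In an interview it is usually worth mentioning Counter while still being able to write the dictionary version by hand.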

3. Type Conversion Edge Cases (Interview Traps)

# Common interview traps with type conversion
def safe_conversions():
    # String to int edge cases
    try:
        result = int("3.14")  # ValueError!
    except ValueError:
        result = int(float("3.14"))  # Correct: 3

    # Boolean edge cases that trip people up
    print(bool("0"))      # True - string "0" is truthy!
    print(bool("False"))  # True - string "False" is truthy!
    print(bool(""))       # False - empty string is falsy
    print(bool([]))       # False - empty list is falsy
    print(bool({}))       # False - empty dict is falsy

    # Integer division vs float division
    print(7 // 3)         # 2 - floor division
    print(7 / 3)          # 2.333... - true division

    return result

safe_conversions()

Quick Reference for Interviews

Falsy Values in Python (memorize this!):

# These are the most common falsy built-in values in Python
falsy_values = [
    None,
    False,
    0,           # Zero integer
    0.0,         # Zero float
    0j,          # Zero complex number
    "",          # Empty string
    (),          # Empty tuple
    [],          # Empty list
    {},          # Empty dictionary
    set(),       # Empty set
    frozenset(), # Empty frozenset
]

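# Most other objects are truthy (exceptions include range(0), b"",
# and objects whose __bool__ returns False or whose __len__ returns 0)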
for value in falsy_values:
    print(f"{repr(value):15} -> {bool(value)}")
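
Truthiness is decided by an object's __bool__ method, or by __len__ when __bool__ is not defined, so other objects can be falsy as well. The Basket class below is a hypothetical container written only to illustrate this.

```python
class Basket:
    """Hypothetical container used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, bool() falls back to len() != 0.
        return len(self.items)

print(bool(Basket()))           # False
print(bool(Basket(["apple"])))  # True
print(bool(range(0)))           # False
print(bool(b""))                # False
```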

String Immutability - Interview Gotcha:

# Repeated += can take O(n²) time in the worst case - BAD!
def build_string_wrong(items):
    result = ""
    for item in items:
        result += item  # Creates new string each time!
    return result

# This is O(n) time complexity - GOOD!
def build_string_right(items):
    return "".join(items)  # Efficient string concatenation

# Performance comparison for large inputs
items = ["hello"] * 1000
# build_string_wrong(items)  # Slow!
fast_result = build_string_right(items)  # Fast!
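
When each piece needs some processing before it is added, a common pattern is to collect the pieces in a list and join once at the end. build_csv_line is a hypothetical example of that pattern.

```python
def build_csv_line(values):
    """Hypothetical example: collect parts in a list, then join once."""
    parts = []
    for value in values:
        parts.append(str(value))   # appending to a list is cheap (amortized O(1))
    return ",".join(parts)         # a single pass builds the final string

print(build_csv_line([1, "a", 2.5]))  # 1,a,2.5
```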