Go getters: a monadic way

Apr 9, 2024

In the previous post, I covered some functional-programming ideas around monads. Let’s put them to work. In this post, I’ll show you how to implement a simple monadic type in Python to handle safe getters.

Call me Maybe

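This is a trivial example, but it shows the basic idea. We’ll create a Maybe type that can either hold a value or be empty, and implement map and bind methods to work with it. For reference, the Haskell definition of the Monad class and its Maybe instance comes first, followed by a Python class that mirrors it.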

class Monad m where
    return :: a -> m a
    (>>=) :: m a -> (a -> m b) -> m b

instance Monad Maybe where
    return x = Just x
    Nothing >>= f = Nothing
    Just x >>= f = f x

class Maybe:
    def __init__(self, value=None):
        self.value = value

    def is_nothing(self):
        return self.value is None

    def map(self, func):
        if self.is_nothing():
            return self
        else:
            return Maybe(func(self.value))

    def bind(self, func):
        if self.is_nothing():
            return self
        else:
            return func(self.value)

    def __repr__(self):
        if self.is_nothing():
            return "Nothing"
        else:
            return f"Just({self.value})"

# Example usage:
def safe_divide(a, b):
    try:
        return Maybe(a / b)
    except ZeroDivisionError:
        return Maybe()

def increment(x):
    return x + 1

result = Maybe(10).bind(lambda x: safe_divide(x, 2))
print(result)  # Just(5.0)

result = result.map(increment)
print(result)  # Just(6.0)

result = Maybe(10).bind(lambda x: safe_divide(x, 0))
print(result)  # Nothing

result = result.map(increment)
print(result)  # Nothing
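
As a quick sanity check, here is a sketch that tests the three monad laws against this Maybe design, with bind playing the role of >>= and the constructor playing the role of return. The class below is a trimmed copy with an __eq__ I added for comparisons, and half and reciprocal are made-up helper functions. One caveat: because Nothing is encoded as a None value, Maybe(None) cannot be told apart from Nothing, so the laws only hold for non-None values.

```python
class Maybe:
    """Trimmed copy of the Maybe class above, plus __eq__ for comparisons."""

    def __init__(self, value=None):
        self.value = value

    def bind(self, func):
        return self if self.value is None else func(self.value)

    def __eq__(self, other):
        return isinstance(other, Maybe) and self.value == other.value


def half(x):
    # Hypothetical helper: returns Nothing for odd numbers.
    return Maybe(x // 2) if x % 2 == 0 else Maybe()


def reciprocal(x):
    # Hypothetical helper: returns Nothing for zero.
    return Maybe(1 / x) if x != 0 else Maybe()


m = Maybe(8)

# Left identity: wrapping a value and binding f equals calling f directly.
left_identity = Maybe(8).bind(half) == half(8)

# Right identity: binding the constructor changes nothing.
right_identity = m.bind(Maybe) == m

# Associativity: the grouping of binds does not matter.
associativity = (
    m.bind(half).bind(reciprocal)
    == m.bind(lambda x: half(x).bind(reciprocal))
)

print(left_identity, right_identity, associativity)  # True True True
```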

Some refactoring to make the interface more user-friendly:

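def safe_divide(maybe_a: Maybe, maybe_b: Maybe) -> Maybe:
    def inner_divide(a, b):
        # Return an empty Maybe instead of raising, so the failure stays inside the monad.
        if b == 0:
            return Maybe()
        return Maybe(a / b)

    # inner_divide already returns a Maybe, so both levels use bind rather than map.
    return maybe_a.bind(lambda a: maybe_b.bind(lambda b: inner_divide(a, b)))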

# Example usage:
a = Maybe(10)
b = Maybe(2)
result = safe_divide(a, b)
print(result)  # Just(5.0)

a = Maybe(10)
b = Maybe(0)
result = safe_divide(a, b)
print(result)  # Nothing

a = Maybe(10)
b = Maybe()
result = safe_divide(a, b)
print(result)  # Nothing

a = Maybe()
b = Maybe()
result = safe_divide(a, b)
print(result)  # Nothing

The value of using a Maybe monad over a simple division operation in Python, especially in the context of handling operations that might fail (like division by zero), lies in several key areas:

  1. Error Handling and Safety
     • Explicit Handling of Failure Cases: The Maybe monad makes the handling of errors and exceptional cases explicit and a fundamental part of the type system. This contrasts with simple division, where errors must be handled through conditional checks or exception handling, which can be more error-prone and verbose.
     • No Exception Required for Control Flow: Using exceptions for control flow is generally considered a bad practice because it can make the code harder to understand and maintain. The Maybe monad allows you to encode potential failure in the type system, making the flow of data and errors more explicit and less reliant on side effects or exceptions.
  2. Composability and Chaining
     • Chaining Operations: With the Maybe monad, you can easily chain operations that might fail without having to check for errors after each step. This leads to cleaner and more readable code, especially when dealing with multiple operations that can fail.
     • Unified Interface for Nullable Operations: It provides a unified interface for dealing with operations that might return a null or undefined value, reducing the need for null checks scattered throughout the code.
  3. Functional Programming Paradigm
     • Encourages Pure Functions: The use of monads encourages the design of pure functions that don’t have side effects, making the code easier to reason about, test, and reuse.
     • Declarative Code Style: It promotes a more declarative style of programming, where you describe what you want to achieve rather than how to do it step by step. This can lead to more concise and readable code.
  4. Null Safety
     • Prevents Null Reference Errors: By encapsulating the presence or absence of a value in a Maybe object, you avoid the common pitfall of null reference errors, which are a frequent source of bugs in many programming languages.
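
To make the chaining point concrete, here is a minimal sketch of a three-step pipeline in which every step can fail. The Maybe class is a trimmed copy of the one above; parse_number, safe_sqrt, safe_reciprocal and pipeline are names I made up for illustration.

```python
import math


class Maybe:
    """Trimmed copy of the Maybe class from earlier in this post."""

    def __init__(self, value=None):
        self.value = value

    def bind(self, func):
        return self if self.value is None else func(self.value)

    def __repr__(self):
        return "Nothing" if self.value is None else f"Just({self.value})"


def parse_number(text):
    # Fails on input that is not a number.
    try:
        return Maybe(float(text))
    except ValueError:
        return Maybe()


def safe_sqrt(x):
    # Fails on negative input.
    return Maybe(math.sqrt(x)) if x >= 0 else Maybe()


def safe_reciprocal(x):
    # Fails on zero.
    return Maybe(1 / x) if x != 0 else Maybe()


def pipeline(text):
    # Each step runs only if the previous one produced a value.
    return parse_number(text).bind(safe_sqrt).bind(safe_reciprocal)


print(pipeline("16"))   # Just(0.25)
print(pipeline("-4"))   # Nothing (square root of a negative number)
print(pipeline("0"))    # Nothing (division by zero)
print(pipeline("abc"))  # Nothing (not a number)
```

No step inspects the result of the previous one; the short-circuiting lives entirely in bind.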

Safe getters

Implementing a monad specifically designed to handle attribute access (getattr) operations safely, dealing gracefully with None values (or any situation where an attribute might not exist), is a practical way to manage data access in a more functional style. This can be particularly useful when working with deeply nested data structures where any part of the chain might be None (or missing).

Let’s call this monad SafeGetter. It will encapsulate a value and allow us to chain attribute access operations safely, returning a special value (e.g., None or a custom default) if any operation in the chain fails due to the target being None or the attribute not existing.

Here’s how you might implement and use such a monad:

class SafeGetter:
    def __init__(self, value):
        self.value = value

    def get(self, attr, default=None):
        """Attempts to get an attribute from the current value, safely."""
        if self.value is None:
            return SafeGetter(default)
        try:
            return SafeGetter(getattr(self.value, attr, default))
        except AttributeError:
            return SafeGetter(default)

    def or_else(self, default):
        """Returns the contained value or a default if None."""
        if self.value is None:
            return default
        return self.value

    def __repr__(self):
        return f"SafeGetter({repr(self.value)})"

# Example usage:
class Person:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

# Constructing a nested structure
grandparent = Person("Grandparent")
parent = Person("Parent", grandparent)
child = Person("Child", parent)

# Safe attribute access
grandparent_name = SafeGetter(child).get("parent").get("parent").get("name").or_else("No name")
print(grandparent_name)  # Output: Grandparent

# Handling missing attributes safely
unknown = SafeGetter(child).get("parent").get("sibling").get("name").or_else("No name")
print(unknown)  # Output: No name

# Handling None at any level
none_test = SafeGetter(None).get("parent").get("name").or_else("No value")
print(none_test)  # Output: No value

This provides a way to safely access attributes in a chain of nested objects, handling missing attributes or None values gracefully without raising exceptions or requiring explicit null checks at each step.

PEP 505 (None-aware operators) discusses adding similar syntax to Python itself. To quote it: Several modern programming languages have so-called “null-coalescing” or “null-aware” operators, including C#, Dart, Perl, Swift, and PHP (starting in version 7). There are also stage 3 draft proposals for their addition to ECMAScript (a.k.a. JavaScript). These operators provide syntactic sugar for common patterns involving null references.

  • The “null-coalescing” operator is a binary operator that returns its left operand if it is not null. Otherwise it returns its right operand.
  • The “null-aware member access” operator accesses an instance member only if that instance is non-null. Otherwise it returns null. (This is also called a “safe navigation” operator.)
  • The “null-aware index access” operator accesses an element of a collection only if that collection is non-null. Otherwise it returns null. (This is another type of “safe navigation” operator.)
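
Python has no such operators today, but each of the three behaviours can be spelled out as an ordinary function. The sketch below is only meant to pin down the semantics described in the list; coalesce, maybe_attr and maybe_item are hypothetical helper names, not part of PEP 505 or the standard library.

```python
from types import SimpleNamespace


def coalesce(left, right):
    """Null-coalescing: return left unless it is None."""
    return left if left is not None else right


def maybe_attr(obj, name):
    """Null-aware member access: None if obj is None, else obj.<name>."""
    return None if obj is None else getattr(obj, name)


def maybe_item(collection, key):
    """Null-aware index access: None if collection is None, else collection[key]."""
    return None if collection is None else collection[key]


user = SimpleNamespace(name="Ada", tags=["admin"])

print(coalesce(None, "fallback"))  # fallback
print(coalesce(0, "fallback"))     # 0 (only None triggers the fallback)
print(maybe_attr(user, "name"))    # Ada
print(maybe_attr(None, "name"))    # None
print(maybe_item(user.tags, 0))    # admin
print(maybe_item(None, 0))         # None
```

Note that these helpers only guard against None. A missing attribute or an out-of-range index still raises, which is where SafeGetter is more forgiving.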

Getters in the wild

It’s common to use nested Pydantic models to represent complex data structures in Python.

from pydantic import BaseModel, Field
from typing import List, Optional

class CashHandling(BaseModel):
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    cashflow: Optional[float] = Field(None)

class AssetWeight(BaseModel):
    asset_id: Optional[str] = Field(None)
    trade_direction: Optional[str] = Field(None)
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)

class RebalanceSettings(BaseModel):
    cash_handling: CashHandling = Field(default_factory=CashHandling)
    asset_weights: List[AssetWeight] = Field(default_factory=list)

#%%
rebal_settings = RebalanceSettings(**{
    "cash_handling": {
        "min_weight": 0.005,
        "max_weight": 0.01,
        "cashflow": 1000.0
    },
    "asset_weights": [
        {
            "asset_id": "AAPL",
            "trade_direction": "BUY",
            "min_weight": 0.05,
            "max_weight": 0.06
        },
        {
            "asset_id": "MSFT",
            "trade_direction": "SELL",
            "min_weight": 0.05,
            "max_weight": 0.07
        }
    ]
})

It is left to the users to access nested attributes in a safe way to avoid exceptions.

if (self.rebal_settings is not None) and (self.rebal_settings.cash_handling is not None):
    cashflow = self.rebal_settings.cash_handling.cashflow
else:
    cashflow = 0.0

v1: Using a monadic getter

We can apply the same idea as SafeGetter here, this time as an OptionalMonad with a bind method and a get method on the models.

from pydantic import BaseModel, Field, ValidationError
from typing import List, Optional, Any, TypeVar, Generic

T = TypeVar('T')

class OptionalMonad(Generic[T]):
    def __init__(self, value: Optional[T]):
        self.value = value

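    def bind(self, func):
        if self.value is None:
            return OptionalMonad(None)
        try:
            result = func(self.value)
        except (AttributeError, ValidationError, KeyError):
            return OptionalMonad(None)
        # func may already return an OptionalMonad (as SettingsModel.get does);
        # wrapping it again would make or_else return a monad instead of a value.
        return result if isinstance(result, OptionalMonad) else OptionalMonad(result)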

    def or_else(self, default: T) -> T:
        return self.value if self.value is not None else default

    def __repr__(self):
        return f"OptionalMonad({self.value})"

class SettingsModel(BaseModel):
    def get(self, attr: str) -> OptionalMonad:
        return OptionalMonad(getattr(self, attr, None))

class CashHandling(SettingsModel):
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    cashflow: Optional[float] = Field(None)

class AssetWeight(SettingsModel):
    asset_id: Optional[str] = Field(None)
    trade_direction: Optional[str] = Field(None)
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)

class RebalanceSettings(SettingsModel):
    cash_handling: CashHandling = Field(default_factory=CashHandling)
    asset_weights: List[AssetWeight] = Field(default_factory=list)

# Example usage
rebal_settings = RebalanceSettings(**{
    "cash_handling": {
        "min_weight": 0.005,
        "max_weight": 0.01,
        "cashflow": 1000.0
    },
    "asset_weights": [
        {
            "asset_id": "AAPL",
            "trade_direction": "BUY",
            "min_weight": 0.05,
            "max_weight": 0.06
        },
        {
            "asset_id": "MSFT",
            "trade_direction": "SELL",
            "min_weight": 0.05,
            "max_weight": 0.07
        }
    ]
})

# Accessing and using the monadic getter
cashflow = rebal_settings.get('cash_handling').bind(lambda ch: ch.get('cashflow')).or_else(0)
print(cashflow)  # Output: 1000.0

This implementation is technically correct. It is more verbose than plain getattr, but it provides a more functional and safer way to access nested attributes in a chain of objects, especially when dealing with complex data structures like those defined by Pydantic models. But the interface is not user-friendly: no programmer would buy into this pattern of binding an anonymous function just to access an attribute.

v2: Simplifying the interface

We can simplify the interface by overriding the __getattr__ method in the SettingsModel class to return a SafeAccess object for any attribute access. This way, users can access attributes directly without having to call the get method explicitly.

from pydantic import BaseModel, Field, ValidationError
from typing import List, Optional, Any, TypeVar, Generic, Callable

T = TypeVar('T')

class SafeAccess:
    def __init__(self, value: Optional[T]):
        self.value = value

    def __getattr__(self, name: str) -> 'SafeAccess':
        if self.value is None:
            return SafeAccess(None)
        try:
            return SafeAccess(getattr(self.value, name, None))
        except AttributeError:
            return SafeAccess(None)

    def or_else(self, default: T) -> T:
        return self.value if self.value is not None else default

    def __call__(self, *args, **kwargs) -> 'SafeAccess':
        if callable(self.value):
            try:
                return SafeAccess(self.value(*args, **kwargs))
            except Exception:
                return SafeAccess(None)
        return SafeAccess(None)

    def __repr__(self):
        return f"SafeAccess({self.value})"

class SettingsModel(BaseModel):
    def __getattr__(self, item: str) -> SafeAccess:
        # This method is only called when accessing an undefined attribute,
        # so we wrap the result in SafeAccess for safety.
        value = self.__dict__.get(item, None)
        return SafeAccess(value)

class CashHandling(SettingsModel):
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    cashflow: Optional[float] = Field(None)

class AssetWeight(SettingsModel):
    asset_id: Optional[str] = Field(None)
    trade_direction: Optional[str] = Field(None)
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)

class RebalanceSettings(SettingsModel):
    cash_handling: CashHandling = Field(default_factory=CashHandling)
    asset_weights: List[AssetWeight] = Field(default_factory=list)

# Example usage
rebal_settings = RebalanceSettings(**{
    "cash_handling": {
        "min_weight": 0.005,
        "max_weight": 0.01,
        "cashflow": 1000.0
    },
    "asset_weights": [
        {
            "asset_id": "AAPL",
            "trade_direction": "BUY",
            "min_weight": 0.05,
            "max_weight": 0.06
        },
        {
            "asset_id": "MSFT",
            "trade_direction": "SELL",
            "min_weight": 0.05,
            "max_weight": 0.07
        }
    ]
})

# Accessing attributes without explicit bind
cashflow = rebal_settings.cash_handling.cashflow.or_else(0)
print(cashflow)  # Output: 1000.0

If you run this code, you’ll get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/workspaces/d-.github.io/scripts/test.py in line 80
     57 rebal_settings = RebalanceSettings(**{
     58     "cash_handling": {
     59         "min_weight": 0.005,
   (...)
     76     ]
     77 })
     79 # Accessing attributes without explicit bind
---> 80 cashflow = rebal_settings.cash_handling.cashflow.or_else(0)
     81 print(cashflow)  # Output: 1000.0

AttributeError: 'float' object has no attribute 'or_else'

This is a common pitfall in monadic code. The or_else method is not defined on float, which is the type of cashflow. Python calls __getattr__ only when normal attribute lookup fails; cashflow is a declared field, so rebal_settings.cash_handling.cashflow returns the raw float rather than a SafeAccess object. We could fix this by wrapping every attribute access in SafeAccess, not just the missing ones. This is the moral of the story: these getters should be a one-way street into the monadic world. If you take the purist functional approach, you should not be able to get out of the monadic world except through the or_else escape hatch. You could wrap all primitive types in a SafeAccess object, but that would be overkill. Instead, we just assume the last attribute in the chain is a primitive type and return it directly.

#unsafe
cashflow = rebal_settings.cash_handling.cashflow
print(cashflow)  # Output: 1000.0 instead of SafeAccess(1000.0)
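
The difference between the two hooks is easy to see in plain Python, without Pydantic. The sketch below uses made-up classes (Wrapped stands in for SafeAccess, and FallbackOnly and AlwaysWrapped are illustration names): __getattr__ is a fallback for failed lookups, while __getattribute__ intercepts every access. I have not tried the second variant on a Pydantic model, where it may well interfere with the library’s internals, so treat it as an explanation of the behaviour rather than a recommended fix.

```python
class Wrapped:
    """Marker wrapper standing in for SafeAccess."""

    def __init__(self, value):
        self.value = value


class FallbackOnly:
    def __init__(self):
        self.cashflow = 1000.0

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        return Wrapped(None)


class AlwaysWrapped:
    def __init__(self):
        self.cashflow = 1000.0

    def __getattribute__(self, name):
        # Runs for every attribute access, including existing attributes.
        if name.startswith("_"):
            # Leave private and dunder names alone.
            return object.__getattribute__(self, name)
        try:
            return Wrapped(object.__getattribute__(self, name))
        except AttributeError:
            return Wrapped(None)


a = FallbackOnly()
print(type(a.cashflow).__name__)  # float: __getattr__ was never called
print(type(a.missing).__name__)   # Wrapped

b = AlwaysWrapped()
print(type(b.cashflow).__name__)  # Wrapped
print(b.cashflow.value)           # 1000.0
print(b.missing.value)            # None
```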

v3: handling safe access to lists

You still need to handle safe access to lists. You can create a SafeList class that inherits from list and overrides the __getitem__ method to return a SafeAccess object for the accessed element. This way, you can safely access elements in a list without worrying about index out of range errors.

from pydantic import BaseModel, Field
from typing import List, Optional, Generic, TypeVar

T = TypeVar('T')

class SafeAccess:
    def __init__(self, value: Optional[T]):
        self.value = value

    def __getattr__(self, name: str) -> 'SafeAccess':
        if self.value is None:
            return SafeAccess(None)
        try:
            return SafeAccess(getattr(self.value, name, None))
        except AttributeError:
            return SafeAccess(None)

    def or_else(self, default: T) -> T:
        return self.value if self.value is not None else default

    def __call__(self, *args, **kwargs) -> 'SafeAccess':
        if callable(self.value):
            try:
                return SafeAccess(self.value(*args, **kwargs))
            except Exception:
                return SafeAccess(None)
        return SafeAccess(None)

    def __repr__(self):
        return f"SafeAccess({self.value})"

class SafeList(List[T]):
    def __getitem__(self, index: int) -> SafeAccess:
        try:
            return SafeAccess(super().__getitem__(index))
        except IndexError:
            return SafeAccess(None)


class SettingsModel(BaseModel):
    def __getattr__(self, item: str) -> SafeAccess:
        value = self.__dict__.get(item, None)
        return SafeAccess(value)

    class Config:
        arbitrary_types_allowed = True

class CashHandling(SettingsModel):
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    cashflow: Optional[float] = Field(None)

class AssetWeight(SettingsModel):
    asset_id: Optional[str] = Field(None)
    trade_direction: Optional[str] = Field(None)
    min_weight: Optional[float] = Field(None, ge=0.0, le=1.0)
    max_weight: Optional[float] = Field(None, ge=0.0, le=1.0)

class RebalanceSettings(SettingsModel):
    cash_handling: CashHandling = Field(default_factory=CashHandling)
    asset_weights: List[AssetWeight] = Field(default_factory=SafeList)

# Example usage modified to use SafeList
rebal_settings = RebalanceSettings(**{
    "cash_handling": {
        "min_weight": 0.005,
        "max_weight": 0.01,
        "cashflow": 1000.0
    },
})

#%%
print(rebal_settings.asset_weights[0].asset_id)  # Output: SafeAccess(None)

Passing an empty list explicitly, however, fails: as the traceback below shows, the field then holds a plain list rather than a SafeList. I think it’s reasonable to require users of RebalanceSettings to rely on the default factory of List[AssetWeight] to handle empty lists, instead of passing an empty list directly. This way, the default factory can return a SafeList object, ensuring that all list accesses are safe.

# Passing an explicit empty list instead of relying on the default factory
rebal_settings = RebalanceSettings(**{
    "cash_handling": {
        "min_weight": 0.005,
        "max_weight": 0.01,
        "cashflow": 1000.0
    },
    "asset_weights": []
})

rebal_settings.asset_weights[0].min_weight  # raises IndexError

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/workspaces/d-.github.io/scripts/test.py in line 74
     64 # Example usage modified to use SafeList
     65 rebal_settings = RebalanceSettings(**{
     66     "cash_handling": {
     67         "min_weight": 0.005,
   (...)
     71     "asset_weights": []
     72 })
---> 74 rebal_settings.asset_weights[0].min_weight

IndexError: list index out of range
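
One way to avoid depending on how the list was constructed is to do the wrapping at access time instead. The sketch below is an alternative I am assuming would fit here, not part of the model code above: safe_index is a hypothetical helper, SafeAccess is a trimmed copy of the wrapper, and Weight is a plain stand-in for AssetWeight so the example runs without Pydantic.

```python
class SafeAccess:
    """Trimmed copy of the SafeAccess wrapper above."""

    def __init__(self, value):
        self.value = value

    def __getattr__(self, name):
        if self.value is None:
            return SafeAccess(None)
        return SafeAccess(getattr(self.value, name, None))

    def or_else(self, default):
        return self.value if self.value is not None else default


def safe_index(items, index):
    """Hypothetical helper: wrap items[index], or None if the lookup fails."""
    try:
        return SafeAccess(items[index])
    except (IndexError, TypeError):
        # IndexError: index out of range; TypeError: items is None.
        return SafeAccess(None)


class Weight:
    """Plain stand-in for AssetWeight."""

    def __init__(self, asset_id):
        self.asset_id = asset_id


weights = [Weight("AAPL")]
print(safe_index(weights, 0).asset_id.or_else("n/a"))  # AAPL
print(safe_index(weights, 5).asset_id.or_else("n/a"))  # n/a
print(safe_index([], 0).asset_id.or_else("n/a"))       # n/a
print(safe_index(None, 0).asset_id.or_else("n/a"))     # n/a
```

The trade-off is that callers have to remember to use the helper, whereas SafeList makes ordinary indexing safe.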

DataFrameMonads

To make the DataFrame operations more expressive and chainable in a monadic style, we can wrap the DataFrame in a class that implements the monadic operations more seamlessly. This approach allows us to use method chaining for a more fluent and readable syntax. Below is an example of how you might implement such a class:

import pandas as pd

class DataFrameMonad:
    def __init__(self, df):
        self.df = df

    @staticmethod
    def return_df(value):
        """Static method to encapsulate a value into the monad."""
        return DataFrameMonad(pd.DataFrame(value))

    def bind(self, func):
        """Apply a function to the DataFrame and return a new monad."""
        try:
            # Apply the function to the DataFrame
            result = func(self.df)
            # Ensure the result is a DataFrame
            if not isinstance(result, pd.DataFrame):
                raise ValueError("The function did not return a DataFrame")
            return DataFrameMonad(result)
        except Exception as e:
            # Handle or log the error
            print(f"Error during bind: {e}")
            # Optionally, return a monad with an empty DataFrame or some error indicator
            return DataFrameMonad(pd.DataFrame())

    def to_dataframe(self):
        """Utility method to get the underlying DataFrame."""
        return self.df

def filter_rows(df):
    """Example function to filter rows of the DataFrame."""
    return df[df['value'] > 10]

def add_column(df):
    """Example function to add a new column to the DataFrame."""
    # assign returns a new DataFrame, so the (possibly filtered) input is not mutated
    return df.assign(new_column=df['value'] * 2)

df_monad = DataFrameMonad.return_df({'id': [1, 2, 3], 'value': [5, 15, 25]})

result_monad = df_monad.bind(filter_rows).bind(add_column)

result_df = result_monad.to_dataframe()

print(result_df)

The choice to use a return_df method (or any similarly named method) for initialization, instead of directly using the initializer of the DataFrameMonad class, is primarily a matter of adhering to the monadic pattern and its terminology. It also lets the constructor stay strict about its input while return_df handles conversion. For comparison, here is the same class using the initializer directly, with a type check:

class DataFrameMonad:
    def __init__(self, df):
        if not isinstance(df, pd.DataFrame):
            raise ValueError("Expected a pandas DataFrame")
        self.df = df

    def bind(self, func):
        ...  # Implementation remains the same as in the previous listing

df_monad = DataFrameMonad(pd.DataFrame({'id': [1, 2, 3], 'value': [5, 15, 25]}))

To give a DataFrameMonad class the same methods and attributes as a usual pandas DataFrame, you essentially want to make the DataFrameMonad behave like a DataFrame. This can be achieved through a combination of delegation and dynamic attribute access. The goal is to allow users of DataFrameMonad to invoke DataFrame methods directly on a DataFrameMonad instance, with the class transparently passing these calls through to the underlying DataFrame.


class DataFrameMonad:
    def __init__(self, df):
        self._df = df

    def __getattr__(self, name):
        # Delegate attribute access to the underlying DataFrame
        attr = getattr(self._df, name)

        if callable(attr):
            def wrapper(*args, **kwargs):
                # Call the DataFrame method and wrap the result in a new DataFrameMonad
                result = attr(*args, **kwargs)
                if isinstance(result, pd.DataFrame):
                    return DataFrameMonad(result)
                else:
                    return result
            return wrapper
        else:
            return attr

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
monad = DataFrameMonad(df)

result = monad.sum()
print(result)  # This will print the sum of the DataFrame columns

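# Square-bracket indexing is not forwarded by __getattr__ (Python looks up
# __getitem__ on the class), so filter through a named method instead.
filtered = monad.query('a > 1')
print(filtered._df)  # Accessing the underlying DataFrame to display it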
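
One caveat with __getattr__-based delegation: Python looks up special methods such as __getitem__ on the class, not through __getattr__, so square-bracket indexing on the wrapper is not forwarded automatically. The sketch below shows this with a plain list standing in for the DataFrame, so it runs without pandas; Delegator and IndexableDelegator are names I made up for illustration.

```python
class Delegator:
    """Forwards unknown attributes to a wrapped object (a list here)."""

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, name):
        return getattr(self._inner, name)


class IndexableDelegator(Delegator):
    def __getitem__(self, key):
        # Special methods must be defined on the class to support obj[key].
        return self._inner[key]


plain = Delegator([10, 20, 30])
print(plain.count(20))  # 1, forwarded through __getattr__

indexing_failed = False
try:
    plain[0]
except TypeError as exc:
    # __getattr__ is not consulted for the implicit __getitem__ lookup.
    indexing_failed = True
    print(f"TypeError: {exc}")

print(IndexableDelegator([10, 20, 30])[0])  # 10
```

The same reasoning suggests that a DataFrameMonad meant to support df[...] syntax would need its own __getitem__ that delegates to the underlying DataFrame and re-wraps DataFrame results.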
