List Comparison Strategies

When comparing list[BaseModel] fields in similarity or fill rate accuracy computations, you can use different comparison strategies to align items when list order may differ.

Available Strategies

Strategy	Description	Use Case
`pairwise`	Compare items by index (default)	Lists have same order
`levenshtein`	Align items preserving relative order	Lists with insertions/deletions
`optimal_assignment`	Find optimal one-to-one mapping	Lists with different order

Using list_compare_strategy

Set the strategy using Spec(list_compare_strategy=...):

from cobjectric import BaseModel, Spec, ListCompareStrategy

class Item(BaseModel):
    name: str
    price: float

class Order(BaseModel):
    # Default: pairwise comparison
    items_pairwise: list[Item]
    # Levenshtein alignment (preserves relative order)
    items_levenshtein: list[Item] = Spec(
        list_compare_strategy=ListCompareStrategy.LEVENSHTEIN
    )
    # Optimal assignment (Hungarian algorithm)
    items_optimal: list[Item] = Spec(
        list_compare_strategy=ListCompareStrategy.OPTIMAL_ASSIGNMENT
    )

You can also use strings:

class Order(BaseModel):
    items: list[Item] = Spec(list_compare_strategy="levenshtein")

Pairwise Strategy (Default)

Compares items by their index. Simple and fast, but requires lists to be in the same order:

class Item(BaseModel):
    name: str

class Order(BaseModel):
    items: list[Item]  # Uses pairwise by default

order_got = Order.from_dict({"items": [{"name": "Apple"}, {"name": "Banana"}]})
order_expected = Order.from_dict({"items": [{"name": "Apple"}, {"name": "Banana"}]})

result = order_got.compute_similarity(order_expected)
print(result.fields.items[0].fields.name.value)  # 1.0 (Apple == Apple)
print(result.fields.items[1].fields.name.value)  # 1.0 (Banana == Banana)

If items are in different order, pairwise comparison will fail to match them:

order_got = Order.from_dict({"items": [{"name": "Apple"}, {"name": "Banana"}]})
order_expected = Order.from_dict({"items": [{"name": "Banana"}, {"name": "Apple"}]})

result = order_got.compute_similarity(order_expected)
print(result.fields.items[0].fields.name.value)  # 0.0 (Apple != Banana)
print(result.fields.items[1].fields.name.value)  # 0.0 (Banana != Apple)

Levenshtein Strategy

Uses dynamic programming to find the best alignment while preserving relative order. Good for lists with insertions or deletions:

class Item(BaseModel):
    name: str

class Order(BaseModel):
    items: list[Item] = Spec(list_compare_strategy=ListCompareStrategy.LEVENSHTEIN)

# got: [Apple, Cherry, Banana]
# expected: [Apple, Banana]
# Best alignment: Apple-Apple, Banana-Banana (skip Cherry)
order_got = Order.from_dict({
    "items": [{"name": "Apple"}, {"name": "Cherry"}, {"name": "Banana"}]
})
order_expected = Order.from_dict({
    "items": [{"name": "Apple"}, {"name": "Banana"}]
})

result = order_got.compute_similarity(order_expected)
print(len(result.fields.items))  # 2
print(result.fields.items[0].fields.name.value)  # 1.0 (Apple)
print(result.fields.items[1].fields.name.value)  # 1.0 (Banana)

Important: Levenshtein preserves relative order. It cannot match items that would violate the original order:

# got: [Apple, Banana]
# expected: [Banana, Apple]
# Levenshtein can only align ONE item (Apple-Apple OR Banana-Banana)
order_got = Order.from_dict({"items": [{"name": "Apple"}, {"name": "Banana"}]})
order_expected = Order.from_dict({"items": [{"name": "Banana"}, {"name": "Apple"}]})

result = order_got.compute_similarity(order_expected)
print(len(result.fields.items))  # 1 (only one item aligned)

Optimal Assignment Strategy

Uses the Hungarian algorithm to find the optimal one-to-one mapping regardless of order. Best for lists where order doesn't matter:

from cobjectric import BaseModel, Spec, ListCompareStrategy

class Item(BaseModel):
    name: str
    price: float

class Order(BaseModel):
    items: list[Item] = Spec(
        list_compare_strategy=ListCompareStrategy.OPTIMAL_ASSIGNMENT
    )

# got: [Apple, Banana]
# expected: [Banana, Apple]
# Optimal alignment: Apple-Apple, Banana-Banana
order_got = Order.from_dict({
    "items": [
        {"name": "Apple", "price": 1.0},
        {"name": "Banana", "price": 0.5},
    ]
})
order_expected = Order.from_dict({
    "items": [
        {"name": "Banana", "price": 0.5},
        {"name": "Apple", "price": 1.0},
    ]
})

result = order_got.compute_similarity(order_expected)
print(len(result.fields.items))  # 2
# All items are perfectly matched
print(result.fields.items[0].fields.name.value)   # 1.0
print(result.fields.items[0].fields.price.value)  # 1.0
print(result.fields.items[1].fields.name.value)   # 1.0
print(result.fields.items[1].fields.price.value)  # 1.0

Note: The optimal_assignment strategy requires scipy to be installed:

pip install scipy

Strategy Comparison

Scenario	Pairwise	Levenshtein	Optimal Assignment
Same order	✅ Best	✅ Works	✅ Works
Insertions/deletions	❌ Poor	✅ Best	✅ Works
Different order	❌ Poor	❌ Poor	✅ Best
Performance	⚡ O(n)	📊 O(n×m)	📊 O(n³)

Usage with Similarity

List comparison strategies are used when computing similarity between two models. See Similarity for details.

Usage with Fill Rate Accuracy

List comparison strategies are also used when computing fill rate accuracy between two models. See Fill Rate Accuracy for details.

Invalid Usage

Using list_compare_strategy on non-list[BaseModel] fields raises InvalidListCompareStrategyError:

from cobjectric import InvalidListCompareStrategyError

# Error: Using on a non-list field
class Person(BaseModel):
    name: str = Spec(list_compare_strategy=ListCompareStrategy.LEVENSHTEIN)

person_got = Person(name="John")
person_expected = Person(name="Jane")

try:
    person_got.compute_similarity(person_expected)
except InvalidListCompareStrategyError as e:
    print(f"Error: {e}")

# Error: Using on list[Primitive] (only list[BaseModel] is supported)
class Person(BaseModel):
    tags: list[str] = Spec(list_compare_strategy=ListCompareStrategy.LEVENSHTEIN)

person_got = Person(tags=["python"])
person_expected = Person(tags=["rust"])

try:
    person_got.compute_similarity(person_expected)
except InvalidListCompareStrategyError as e:
    print(f"Error: {e}")

Similarity - Learn about similarity computation
Fill Rate - Learn about fill rate and fill rate accuracy computation

API Reference

See the API Reference for ListCompareStrategy documentation and List Results for aggregation methods.