Pollution Primitives
A pollution primitive describes what an attacker can choose at each step of the access-then-assign sequence. The “get” primitive captures choices across the access steps. The “set” primitive captures the final assignment.
The atomic operations that compose each primitive are listed on the Get & Set Atomics page.
Get primitives
There are two get primitives, distinguished by whether the program lets the attacker choose between attribute access and item access at each step.
Agnostic-Get. The attacker can pick either atomic at every step. This shape arises in two ways. The first is a control-flow branch that selects different atomics on different paths.
for key in path:
    if isinstance(obj, dict):     # branching
        obj = obj[key]            # item access
    else:
        obj = getattr(obj, key)   # attribute access
The second is a hybrid reflection function such as eval or exec, which evaluates an expression that can use either form depending on the input.
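The eval route can be sketched as follows. This is a minimal illustration, not code taken from any real project: the `resolve` helper, the `Settings` class, and the sample data are all hypothetical. It shows that a single input string can interleave attribute access and item access.

```python
class Settings:
    """Hypothetical application object holding a nested dict."""

    def __init__(self):
        self.options = {"theme": {"name": "dark"}}


def resolve(root, accessor):
    # Hypothetical helper: evaluates "root" + accessor, so the input string
    # decides between attribute access and item access at every step.
    return eval("root" + accessor, {"root": root})


settings = Settings()

# One attribute access followed by two item accesses, all chosen by the input.
theme_name = resolve(settings, '.options["theme"]["name"]')

# The same helper follows dunder attributes up to the globals of __init__.
init_globals = resolve(settings, ".__class__.__init__.__globals__")
```

Because the expression grammar admits both forms, no branch in the program is needed for the traversal to be agnostic.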
Constrained-Get. The program fixes one atomic at every step, and the attacker must follow that pattern. In practice this is almost always a chain of attribute accesses, because a walk that uses item access alone rarely escapes its container.
for part in path.split("."):
    obj = getattr(obj, part)   # attribute only
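A small sketch of what the constraint means in practice, assuming a hypothetical `walk_attrs` helper and `Account` class: an attribute-only walk can climb from an instance to the globals dict of `__init__`, but it cannot step into that dict, because dict keys are not attributes.

```python
class Account:
    """Hypothetical class with a Python-level __init__."""

    def __init__(self):
        self.owner = "alice"


def walk_attrs(obj, path):
    # Constrained-Get: every step is getattr, whatever the current object is.
    for part in path.split("."):
        obj = getattr(obj, part)
    return obj


account = Account()

# The attribute chain reaches the globals dict of __init__.
reached = walk_attrs(account, "__class__.__init__.__globals__")

# Keys of that dict are not attributes, so one more getattr step fails.
try:
    walk_attrs(account, "__class__.__init__.__globals__.walk_attrs")
    stepped_into_dict = True
except AttributeError:
    stepped_into_dict = False
```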
Set primitives
There are three set primitives, distinguished by whether the program lets the attacker choose between attribute and item assignment at the final write.
| Primitive | Attacker capability | Example |
|---|---|---|
| Dual-Set | Either attribute or item assignment | setattr(obj, k, v) or obj[k] = v |
| Attr-Set | Attribute assignment only | setattr(obj, k, v) |
| Item-Set | Item assignment only | obj[k] = v |
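The three sinks in the table can be compared with a short sketch. The function names `attr_set`, `item_set`, `dual_set`, and `try_write`, and the `Box` class, are hypothetical and exist only for this illustration. Each sink is tried against a plain object and against a dict.

```python
class Box:
    """Hypothetical plain object with no item-assignment support."""


def attr_set(obj, k, v):
    setattr(obj, k, v)   # Attr-Set: attribute assignment only


def item_set(obj, k, v):
    obj[k] = v           # Item-Set: item assignment only


def dual_set(obj, k, v):
    # Dual-Set: the sink adapts to the target, so either form is reachable.
    if isinstance(obj, dict):
        obj[k] = v
    else:
        setattr(obj, k, v)


def try_write(sink, target):
    # Returns True when the sink manages to write key "x" on the target.
    try:
        sink(target, "x", 1)
        return True
    except (AttributeError, TypeError):
        return False


results = {
    (sink.__name__, type(target).__name__): try_write(sink, target)
    for sink in (attr_set, item_set, dual_set)
    for target in (Box(), {})
}
```

In this sketch only `dual_set` succeeds on both targets. `attr_set` fails on the dict and `item_set` fails on the plain object.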
Six variants
The two get primitives combined with the three set primitives yield the six pollution variants summarized in the capability matrix. Each subsection below gives the shape of the variant, a minimal Python snippet, and observed real-world cases where they have been confirmed.
| Variant | Get | Set | Status |
|---|---|---|---|
| Agnostic-Get × Dual-Set | Agnostic | Dual | Previously known |
| Agnostic-Get × Attr-Set | Agnostic | Attr | New |
| Agnostic-Get × Item-Set | Agnostic | Item | New |
| Constrained-Get × Dual-Set | Constrained | Dual | New |
| Constrained-Get × Attr-Set | Constrained | Attr | New |
| Constrained-Get × Item-Set | Constrained | Item | New |
Agnostic-Get × Dual-Set
The most permissive variant. The attacker chooses attribute or item access at every step and can use either atomic for the final write. This is the only variant documented before our work, and it covers the canonical recursive-merge bug pattern (Program A from the taxonomy overview).
def update(obj, data):
    for k, v in data.items():
        if isinstance(v, dict):
            if isinstance(obj, dict):
                update(obj[k], v)
            else:
                update(getattr(obj, k), v)
        else:
            if isinstance(obj, dict):
                obj[k] = v
            else:
                setattr(obj, k, v)
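As a hedged illustration of how such a merge can be driven, the sketch below repeats the function and feeds it two payloads. The `User` class, the `is_admin` attribute, and the `POLLUTED_FLAG` key are hypothetical names chosen for this example. One payload ends in the attribute sink and the other in the item sink.

```python
class User:
    def __init__(self):   # a Python-level __init__ exposes __globals__
        self.name = "guest"


def update(obj, data):
    # Recursive merge: Agnostic-Get traversal with a Dual-Set sink.
    for k, v in data.items():
        if isinstance(v, dict):
            if isinstance(obj, dict):
                update(obj[k], v)
            else:
                update(getattr(obj, k), v)
        else:
            if isinstance(obj, dict):
                obj[k] = v
            else:
                setattr(obj, k, v)


# Attribute sink: the write lands on the class, not on the instance.
update(User(), {"__class__": {"is_admin": True}})

# Item sink: the traversal stops on the globals dict of __init__.
update(User(), {"__class__": {"__init__": {"__globals__": {"POLLUTED_FLAG": "yes"}}}})

fresh_user_is_admin = User().is_admin
polluted_value = User.__init__.__globals__["POLLUTED_FLAG"]
```

Both writes outlive the instance that was passed in, which is what separates pollution from an ordinary update.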
Observed in: pydash (set_), Azure CLI (set_properties), and django-unicorn (set_property_value).
Agnostic-Get × Item-Set
The traversal is mixed, but the final write is always an item assignment. There are two ways this is exploitable.
The first is when the final target is a dict-like object, such as __globals__, os.environ, or sys.modules, where obj[k] = v directly modifies the entry.
def assign(obj, path, val):
    for k in path[:-1]:
        obj = obj[k] if isinstance(obj, dict) else getattr(obj, k)
    obj[path[-1]] = val
The second is when the target is a general object with a writable __dict__. In that case the agnostic traversal lets the attacker step into __dict__, and the final item write acts as an attribute write: for an ordinary instance attribute, a.v = x has the same effect as a.__dict__["v"] = x. For such targets, Agnostic-Get × Item-Set has the same effective capability as Agnostic-Get × Dual-Set, because any attribute set reachable by the latter can be re-encoded as an item set on __dict__ by the former.
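The __dict__ re-encoding can be checked with a short sketch. The `User` class and the `role` attribute are hypothetical. The second call shows a limit that follows from standard Python behavior: a class exposes its __dict__ as a read-only mappingproxy, so the item write is rejected there.

```python
class User:
    def __init__(self):
        self.role = "guest"


def assign(obj, path, val):
    # Agnostic-Get traversal with an Item-Set sink.
    for k in path[:-1]:
        obj = obj[k] if isinstance(obj, dict) else getattr(obj, k)
    obj[path[-1]] = val


user = User()

# An item write on the instance __dict__ changes what attribute lookup returns.
assign(user, ["__dict__", "role"], "admin")
role_after = user.role

# The class __dict__ is a mappingproxy, which does not support item assignment.
try:
    assign(user, ["__class__", "__dict__", "role"], "admin")
    class_write_ok = True
except TypeError:
    class_write_ok = False
```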
Observed in: see the Collection for confirmed cases.
Agnostic-Get × Attr-Set
The attacker can mix attribute and item access during traversal, but the final write is always an attribute assignment. The dual-namespace traversal lets the attacker reach a class or module, and then the program forces an attribute write at the sink.
def assign(obj, path, val):
    for k in path[:-1]:
        obj = obj[k] if isinstance(obj, dict) else getattr(obj, k)
    setattr(obj, path[-1], val)
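A minimal sketch of the mixed traversal follows, assuming a hypothetical `App` object that keeps a class inside a dict. `App`, `Config`, `registry`, and `DEBUG` are illustrative names only. The path uses one attribute step and one item step before the forced attribute write.

```python
class Config:
    DEBUG = False


class App:
    # Hypothetical application object that stores a class inside a dict.
    registry = {"config": Config}


def assign(obj, path, val):
    # Agnostic-Get traversal with an Attr-Set sink.
    for k in path[:-1]:
        obj = obj[k] if isinstance(obj, dict) else getattr(obj, k)
    setattr(obj, path[-1], val)


# "registry" is resolved with getattr, "config" with item access,
# and the final setattr lands on the Config class itself.
assign(App(), ["registry", "config", "DEBUG"], True)

debug_after = Config.DEBUG
```

A getattr-only walk would stop at the `registry` dict. The item step is what lets the path continue to the class stored inside it.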
Observed in: see the Collection for confirmed cases.
Constrained-Get × Dual-Set
Single-atomic traversal, typically getattr only, with both setattr and obj[k] = v available as the sink. The branch that picks between the two writes is usually based on the target type.
def assign(obj, path, val):
    for part in path[:-1]:
        obj = getattr(obj, part)
    if isinstance(obj, dict):
        obj[path[-1]] = val
    else:
        setattr(obj, path[-1], val)
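The sketch below exercises both branches of the sink with a getattr-only traversal. The `User` class, the `is_admin` attribute, and the `POLLUTED_FLAG` key are hypothetical names used for illustration.

```python
class User:
    def __init__(self):   # a Python-level __init__ exposes __globals__
        self.name = "guest"


def assign(obj, path, val):
    # Constrained-Get traversal (getattr only) with a Dual-Set sink.
    for part in path[:-1]:
        obj = getattr(obj, part)
    if isinstance(obj, dict):
        obj[path[-1]] = val
    else:
        setattr(obj, path[-1], val)


user = User()

# The walk stops on the class, so the setattr branch is taken.
assign(user, ["__class__", "is_admin"], True)

# The walk stops on a dict (the globals of __init__), so the item branch is taken.
assign(user, ["__class__", "__init__", "__globals__", "POLLUTED_FLAG"], "yes")

fresh_user_is_admin = User().is_admin
polluted_value = User.__init__.__globals__["POLLUTED_FLAG"]
```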
Observed in: Google Mesop (_recursive_update_dataclass_from_json_obj).
Constrained-Get × Attr-Set
The most prevalent variant in our scan. The program walks an attribute chain via getattr and ends with setattr. This is the shape of Program B from the taxonomy overview, and it is easy to introduce by accident in any “deep set by dotted path” helper.
def deep_set(obj, dotted, val):
    parts = dotted.split(".")
    for p in parts[:-1]:
        obj = getattr(obj, p)
    setattr(obj, parts[-1], val)
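A brief sketch of what this shape allows and where it stops, using a hypothetical `User` class and hypothetical `is_admin` and `POLLUTED_FLAG` names: a dotted path can overwrite a class attribute, but a write that ends on a dict is rejected because a dict instance does not accept new attributes.

```python
class User:
    is_admin = False

    def __init__(self):
        self.name = "guest"


def deep_set(obj, dotted, val):
    # Constrained-Get traversal (getattr only) with an Attr-Set sink.
    parts = dotted.split(".")
    for p in parts[:-1]:
        obj = getattr(obj, p)
    setattr(obj, parts[-1], val)


user = User()

# The write lands on the class, so every instance observes the new value.
deep_set(user, "__class__.is_admin", True)
fresh_user_is_admin = User().is_admin

# The walk can reach the globals dict, but setattr on a dict raises.
try:
    deep_set(user, "__class__.__init__.__globals__.POLLUTED_FLAG", "yes")
    dict_write_ok = True
except AttributeError:
    dict_write_ok = False
```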
Observed in: Taipy (_attrsetter).
Constrained-Get × Item-Set
Single-atomic traversal followed by an item write. Pure item-only walks rarely escape their container, so this variant most commonly appears when the constrained traversal uses attribute access to reach a module or class and only the final write is constrained to item form.
def assign(obj, path, val):
    for part in path[:-1]:
        obj = getattr(obj, part)
    obj[path[-1]] = val
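As a sketch under the same assumptions (a hypothetical `User` class and a hypothetical `POLLUTED_FLAG` key), the attribute chain can carry the walk to the globals dict of `__init__`, where the item write succeeds. The same sink fails when the walk stops on a class, because a class does not support item assignment.

```python
class User:
    def __init__(self):   # a Python-level __init__ exposes __globals__
        self.name = "guest"


def assign(obj, path, val):
    # Constrained-Get traversal (getattr only) with an Item-Set sink.
    for part in path[:-1]:
        obj = getattr(obj, part)
    obj[path[-1]] = val


user = User()

# getattr steps reach a dict, so the final item write succeeds.
assign(user, ["__class__", "__init__", "__globals__", "POLLUTED_FLAG"], "yes")
polluted_value = User.__init__.__globals__["POLLUTED_FLAG"]

# Stopping on the class leaves nothing the item write can target.
try:
    assign(user, ["__class__", "is_admin"], True)
    class_write_ok = True
except TypeError:
    class_write_ok = False
```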
Observed in: see the Collection for confirmed cases.