The point of enumerating all of those specific issues, wasn't to say "here are some bugs" and if you fix them you're good. It was to say "here are some examples of the much more fundamental problem", which seemed to be a fundamental misunderstanding of the language memory model and the guarantees offered by assignments, atomic.CompareAndSwap, etc., and those operations' interactions with package unsafe.
For example this code
https://github.com/sirgallo/cmapv2/blob/6bcaa0253b1b0b261e8a...
and in particular its use of this code
https://github.com/sirgallo/cmapv2/blob/6bcaa0253b1b0b261e8a...
is still completely unsound.
Looking at *only this code path* -- and there are *many more* --
---
Put
- Snapshots the current root pointer with atomic.LoadPointer
- Makes an updated root pointer via putRecursive, given the snapshotted root pointer
- Spins on a CAS of the root ptr and the updated ptr with runtime.Gosched() between attempts
---
atomic.LoadPointer isn't a real snapshot
- It's atomic only over the root ptr, not any interior field
- Those interior fields are mutated in-place via e.g. setBitmap, setChild, etc.
- Any goroutine can see partial data, violating the memory model, etc.
---
putRecursive is unsound
- copyNode performs a shallow copy, child pointers are shared, subsequent setChild, extendTable, etc. mutate nodes other goroutines can still hold -- this is a fundamental bug that seems to remain un-addressed from the previous review
- Those mutations use plain writes (no atomics/locks/etc.) -- data race, memory model violation, etc.
- Get later returns the internal []byte slice directly -- data race, memory model violation, etc.
- Newly created nodes are cast to unsafe.Pointer without an atomic store, bypassing the write barrier required by the GC
---
That compareAndSwap is unsound
- It compares only the root pointer, a shallow copy
- After a successful CAS other writers can still mutate any shared children (see above), so readers following any shared path see data races, memory model violations, etc.
- The retry loop can livelock, details elided
---
The implementation still seems to confuse "atomic pointer swap" with "atomic update of a complex, shared value", misunderstands the requirements of the Go memory model, and consistently mis-uses unsafe.Pointer.
tl;dr here is probably to just stop using package unsafe altogether, until you have some time to properly understand its semantics, requirements, and limitations...
Be more specific? OK
---
CopyNode broken
`CopyNode` duplicates only the parent; every child pointer is still shared
nodeCopy.setChildren(make([]*node, len(n.children)))
copy(nodeCopy.children, n.children) // pointers reused
https://github.com/sirgallo/cmapv2/blob/main/node.go#L11-L17Any later mutation (for example `setValue`) writes through those shared pointers, so readers and historical snapshots are modified concurrently -- invalid, data race, memory model violation
---
Bitmap corruption
`SetBit` uses XOR rather than “set”:
func SetBit(bitmap uint32, position int) uint32 { return bitmap ^ (1 << position) }
https://github.com/sirgallo/cmapv2/blob/main/utils.go#L41-L4...Calling it twice on the same index flips the bit back to 0. During branch-creation on insert and during delete, this function is invoked multiple times on the same index, clearing a bit that should remain set and leaving orphaned children.
---
Invalid assumptions re: interior pointers
Only the root pointer is read with `atomic.LoadPointer`. All deeper fields like `children[pos]`, `bitmap`, and the byte-slice keys/values, are accessed directly after a successful CAS. Readers therefore race with writers that mutate these fields in place -- race condition, memory model violation, etc.
pos := cMap.getPosition(node.Bitmap(), hash, level)
if node.Child(pos).IsLeaf() && bytes.Equal(key, node.Child(pos).Key()) {
return node.Child(pos).Value()
}
https://github.com/sirgallo/cmapv2/blob/main/operation.go#L5...---
All xxxRecursive functions rely on those invalid interior pointer assumptions
Sequence in `putRecursive` / `deleteRecursive` is
1. `curr := atomic.LoadPointer(ptr)`
2. Build `nodeCopy`
3. Recurse; grandchildren are mutated in place
4. `atomic.CompareAndSwap(ptr, curr, nodeCopy)`
https://github.com/sirgallo/cmapv2/blob/main/operation.go#L1...If another goroutine has already swapped in a different copy of `curr` (and mutated it) the CAS still succeeds because the pointer value is unchanged, merging incompatible sub-tries and corrupting the data
---
Use-after-free in sync.Pool
On CAS failure the freshly built `nodeCopy` is immediately returned to a `sync.Pool` -- undefined behavior
cMap.pool.PutNode(nodeCopy) // may race with outstanding readers
https://github.com/sirgallo/cmapv2/blob/main/operation.go#L1...Other goroutines still holding references to that node can now access a reclaimed object, oops.
---
K/V Aliasing
Keys and values (both []byte slices, which are not safe for concurrent r/w access) are stored by reference, a mistake:
n.setKey(key)
n.setValue(value)
If the caller mutates those slices later (or concurrently in another goroutine), data races ahoy---
Reader panics, etc.
- `getRecursive` accesses `children[pos]` without bounds or nil checks, concurrent shrink can make `pos` invalid
- `GetIndex` allows a negative `shiftSize` once `level >= 7` with `chunkSize = 5`, producing nonsense indices and potential slice-out-of-bounds
This repo is completely unsound, code like [1] is pervasive and demonstrates a total misunderstanding of what guarantees are provided -- or, really, not provided -- by "atomic" reads of unsafe.Pointer values. Data races everywhere!
Not safe, do not pass go, do not collect $200, absolutely do not use.
[1] https://github.com/sirgallo/cmapv2/blob/280e3017ae4ba212f6f8...
> There are practical limitations mostly with backend analysis tools
Not just end-of-line analysis tools, but also initiating SDKs, and system agents, and intermediate middle-boxes -- really anything that needs to parse OTel.
Spec > SDK > Trace > Span limits: https://opentelemetry.io/docs/specs/otel/trace/sdk/#span-lim...
Spec > Common > Attribute limits: https://opentelemetry.io/docs/specs/otel/common/#attribute-l...
I know the spec says the default AttributeValueLengthLimit = infinity, but...
> It’s quite common in LLM Observability to capture full prompts and LLM responses as attributes on spans, for example.
...I'd love to learn about any OTel-compatible pipeline/system that supports attribute values of arbitrary size! because I've personally not seen anything that lets you get bigger than O(1MB).