Debugging Guide

This guide covers common debugging scenarios and troubleshooting techniques for pyisolate.

Environment Variables

PYISOLATE_DEBUG_RPC

Enable verbose RPC message logging:

export PYISOLATE_DEBUG_RPC=1

This logs all RPC messages sent and received, useful for diagnosing communication issues.

PYISOLATE_ENABLE_CUDA_IPC

Enable CUDA IPC for zero-copy GPU tensor transfer:

export PYISOLATE_ENABLE_CUDA_IPC=1

Without this, CUDA tensors are copied to CPU for transfer.

PYISOLATE_CHILD

Set automatically in child processes. Check this to determine if running in sandbox:

import os
if os.environ.get("PYISOLATE_CHILD") == "1":
    print("Running in isolated sandbox")

Common Issues

“No such file or directory” for Shared Memory

Symptom: RPC recv failed ... No such file or directory

Cause: Tensor was garbage collected before remote process could access shared memory.

Solution: The TensorKeeper class holds references for 30 seconds by default. If you see this error:

  1. Increase TensorKeeper.retention_seconds for slow networks

  2. Ensure tensors aren’t being explicitly deleted too early

  3. Check that /dev/shm has sufficient space

Sandbox Launch Fails

Symptom: Extension fails to start with bwrap errors

Diagnosis:

# Check if bwrap is available
which bwrap

# Check user namespace restrictions
cat /proc/sys/kernel/unprivileged_userns_clone
# Should be 1 for full isolation

# Check AppArmor restrictions (Ubuntu)
aa-status | grep bwrap

Solutions:

  1. Install bubblewrap: apt install bubblewrap

  2. Enable unprivileged user namespaces: sysctl kernel.unprivileged_userns_clone=1

  3. For AppArmor issues, see sandbox_detect.py

RPC Timeout

Symptom: RPC calls hang or timeout

Diagnosis:

import logging
logging.getLogger("pyisolate").setLevel(logging.DEBUG)

Check for:

  1. Deadlocks in callback chains

  2. Extension process crashed (check process status)

  3. Socket connection issues

Solutions:

  1. Check extension logs for exceptions

  2. Verify socket path is accessible

  3. Ensure no circular RPC calls

Singleton Not Found

Symptom: KeyError: <class 'MySingleton'>

Cause: Singleton accessed before use_remote() was called.

Solution: Ensure use_remote() is called before any instantiation:

# Correct order
MySingleton.use_remote(rpc)
instance = MySingleton()

# Wrong order - will fail
instance = MySingleton()  # Creates local instance
MySingleton.use_remote(rpc)  # Too late!

CUDA IPC Failures

Symptom: CUDA tensors fail to transfer between processes

Diagnosis:

import torch
# Check CUDA IPC support
print(torch.cuda.is_available())
print(torch.version.cuda)

# Test IPC handle creation
t = torch.zeros(10, device='cuda')
try:
    import torch.multiprocessing.reductions as r
    func, args = r.reduce_tensor(t)
    print("CUDA IPC supported")
except Exception as e:
    print(f"CUDA IPC failed: {e}")

Common causes:

  1. Different CUDA versions between processes

  2. Tensor received from another process (can’t re-share)

  3. CUDA driver issues

Solutions:

  1. Clone tensors that were received via IPC before re-sharing

  2. Ensure same CUDA/driver version in all processes

  3. Fall back to CPU transfer: unset PYISOLATE_ENABLE_CUDA_IPC

Logging Configuration

Enable Debug Logging

import logging

# Enable all pyisolate debug logging
logging.getLogger("pyisolate").setLevel(logging.DEBUG)

# Enable specific module logging
logging.getLogger("pyisolate._internal.rpc_protocol").setLevel(logging.DEBUG)
logging.getLogger("pyisolate._internal.sandbox").setLevel(logging.DEBUG)

Pytest Debug Options

# Enable debug logging during tests
pytest --debug-pyisolate

# Log to file
pytest --pyisolate-log-file=debug.log

# Verbose output
pytest -v --tb=long

Inspecting RPC State

Check Registered Singletons

from pyisolate._internal.rpc_protocol import SingletonMetaclass

# List all registered singletons
for cls, instance in SingletonMetaclass._instances.items():
    print(f"{cls.__name__}: {type(instance)}")

Check Pending RPC Calls

# In AsyncRPC instance
print(f"Pending calls: {len(rpc._pending_calls)}")
for call_id, pending in rpc._pending_calls.items():
    print(f"  {call_id}: {pending['method']} on {pending['object_id']}")

Sandbox Debugging

Inspect Sandbox Command

from pyisolate._internal.sandbox import build_bwrap_command
from pyisolate._internal.sandbox_detect import detect_restriction_model

restriction = detect_restriction_model()
cmd = build_bwrap_command(
    python_exe="/path/to/python",
    module_path="/path/to/extension",
    venv_path="/path/to/venv",
    uds_address="/tmp/socket",
    allow_gpu=True,
    restriction_model=restriction
)
print(" ".join(cmd))

Test Sandbox Manually

# Run sandbox command manually to see errors
bwrap --ro-bind /usr /usr --dev /dev --proc /proc \
    --ro-bind /path/to/venv /path/to/venv \
    /path/to/python -c "print('Hello from sandbox')"

Memory Debugging

Track Tensor References

from pyisolate._internal.tensor_serializer import _tensor_keeper

# Check how many tensors are being kept
print(f"Tensors in keeper: {len(_tensor_keeper._keeper)}")

Detect Memory Leaks

import gc
import weakref

# Track singleton garbage collection
from pyisolate._internal.rpc_protocol import SingletonMetaclass

class MyService(ProxiedSingleton):
    pass

instance = MyService()
ref = weakref.ref(instance)

del instance
SingletonMetaclass._instances.clear()
gc.collect()

if ref() is None:
    print("Properly collected")
else:
    print("Memory leak detected!")

Getting Help

  1. Enable debug logging and capture output

  2. Include pyisolate version: python -c "import pyisolate; print(pyisolate.__version__)"

  3. Include Python version and platform info

  4. Check existing issues at: https://github.com/anthropics/claude-code/issues