# Advanced Usage

This guide covers advanced topics and customization options for Bio Transformations. It is intended for users who are already familiar with the basic usage of the package and want to explore its full capabilities.
## Customizing BioConverter with Comprehensive Configuration

The `BioConfig` class provides extensive configuration options for fine-tuning the bio-inspired modifications. Here is an example of creating a `BioConverter` with custom settings across all features:
```python
from bio_transformations import BioConverter, BioConfig
from bio_transformations.bio_config import Distribution
import torch.nn as nn

# Comprehensive configuration
converter = BioConverter(
    # Fuzzy learning rate parameters
    fuzzy_learning_rate_factor_nu=0.2,          # Controls the variability in learning rates
    fuzzy_lr_distribution=Distribution.NORMAL,  # Distribution strategy for learning rates
    fuzzy_lr_dynamic=True,                      # Whether to update learning rates during training
    fuzzy_lr_min=0.5,                           # Minimum value for learning rates
    fuzzy_lr_max=1.5,                           # Maximum value for learning rates
    fuzzy_lr_update_freq=5,                     # How often to update temporal rates
    fuzzy_lr_decay=0.98,                        # Decay factor for temporal rates

    # Synaptic stabilization parameters
    dampening_factor=0.7,                       # Factor for reducing learning rates during crystallization
    crystal_thresh=5e-05,                       # Threshold for identifying weights to crystallize

    # Structural plasticity parameters
    rejuvenation_parameter_dre=10.0,            # Controls the rate of weight rejuvenation

    # Multi-synaptic connectivity parameters
    weight_splitting_Gamma=2,                   # Number of sub-synapses per connection (0 = disabled)
    weight_splitting_activation_function=nn.ReLU(),  # Activation for weight splitting

    # Volume-dependent plasticity parameters
    base_lr=0.05,                               # Base learning rate for volume-dependent plasticity
    stability_factor=2.5,                       # Controls how quickly stability increases with weight size
    lr_variability=0.15,                        # Controls the amount of variability in learning rates

    # Dale's principle parameters
    apply_dales_principle=True                  # Whether to enforce Dale's principle
)

# Convert a model with this comprehensive configuration
bio_model = converter(model)
```
## Creating and Updating BioConverter from Dictionary

You can create a `BioConverter` from a dictionary of parameters or update an existing converter:
```python
# Create from dictionary
config_dict = {
    'fuzzy_learning_rate_factor_nu': 0.2,
    'dampening_factor': 0.7,
    'crystal_thresh': 5e-05,
    'rejuvenation_parameter_dre': 10.0,
    'weight_splitting_Gamma': 2
}
converter = BioConverter.from_dict(config_dict)

# Update an existing converter
converter.update_config(
    fuzzy_lr_distribution=Distribution.LOGNORMAL,
    fuzzy_lr_dynamic=True,
    apply_dales_principle=True
)

# Get the current configuration
current_config = converter.get_config()
print(current_config)
```
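Because `from_dict` accepts a plain dictionary, the same dictionary can double as an experiment record. A minimal sketch (the file name and workflow are illustrative, not part of the library):

```python
import json

# Persist the settings used for a run so they can be reproduced later.
with open("bio_config.json", "w") as f:
    json.dump(config_dict, f, indent=2)

# Later, rebuild an equivalent converter from the saved settings.
with open("bio_config.json") as f:
    restored_converter = BioConverter.from_dict(json.load(f))
```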
## Detailed Guide to Fuzzy Learning Rate Distributions

Bio Transformations offers multiple distribution strategies for fuzzy learning rates, each with unique characteristics:
```python
from bio_transformations import BioConfig
from bio_transformations.bio_config import Distribution

# 1. BASELINE - No variability (all parameters = 1.0)
baseline_config = BioConfig(fuzzy_lr_distribution=Distribution.BASELINE)
# Good for establishing a performance baseline without diversity

# 2. UNIFORM - Uniform distribution around 1.0
uniform_config = BioConfig(
    fuzzy_lr_distribution=Distribution.UNIFORM,
    fuzzy_learning_rate_factor_nu=0.16  # Controls the range: [1-0.16, 1+0.16]
)
# Simple, predictable variability across all weights

# 3. NORMAL - Normal distribution centered at 1.0
normal_config = BioConfig(
    fuzzy_lr_distribution=Distribution.NORMAL,
    fuzzy_learning_rate_factor_nu=0.16  # Standard deviation
)
# Bell-curve distribution with most values near 1.0

# 4. LOGNORMAL - Log-normal with mean 1.0
lognormal_config = BioConfig(
    fuzzy_lr_distribution=Distribution.LOGNORMAL,
    fuzzy_learning_rate_factor_nu=0.16  # Controls the shape
)
# Skewed distribution with long tail, all positive values

# 5. GAMMA - Gamma distribution (positive, skewed)
gamma_config = BioConfig(
    fuzzy_lr_distribution=Distribution.GAMMA,
    fuzzy_learning_rate_factor_nu=0.16  # Controls the shape
)
# Models continuous waiting times, good for activity-dependent processes

# 6. BETA - Beta distribution scaled to [1-nu, 1+nu]
beta_config = BioConfig(
    fuzzy_lr_distribution=Distribution.BETA,
    fuzzy_learning_rate_factor_nu=0.16  # Controls the shape and range
)
# Flexible distribution bounded on both sides

# 7. LAYER_ADAPTIVE - Layer-dependent variability
layer_config = BioConfig(
    fuzzy_lr_distribution=Distribution.LAYER_ADAPTIVE,
    fuzzy_learning_rate_factor_nu=0.16  # Base variability
)
# Early layers get more variability than later layers
# Mimics biological observation of layer-specific plasticity in cortex

# 8. WEIGHT_ADAPTIVE - Weight-dependent scaling
weight_config = BioConfig(
    fuzzy_lr_distribution=Distribution.WEIGHT_ADAPTIVE,
    fuzzy_learning_rate_factor_nu=0.16  # Base variability
)
# Smaller weights get more variability than larger weights
# Mimics size-dependent plasticity of dendritic spines

# 9. TEMPORAL - Evolves over time
temporal_config = BioConfig(
    fuzzy_lr_distribution=Distribution.TEMPORAL,
    fuzzy_lr_dynamic=True,               # Must be True for temporal evolution
    fuzzy_learning_rate_factor_nu=0.16,  # Base variability
    fuzzy_lr_update_freq=10,             # Update every 10 steps
    fuzzy_lr_decay=0.95                  # Decay factor for temporal rates
)
# Learning rates change during training, mimicking developmental changes

# 10. ACTIVITY - Based on neuron activation patterns
activity_config = BioConfig(
    fuzzy_lr_distribution=Distribution.ACTIVITY,
    fuzzy_lr_dynamic=True,              # Must be True for activity tracking
    fuzzy_learning_rate_factor_nu=0.16  # Base variability
)
# Adjusts learning rates based on neuron activity
# More active neurons become more stable (less variable)
```
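To build intuition for how these strategies differ, it can help to sample comparable distributions directly in PyTorch. The following is a standalone illustration, not the library's internal sampling code; the exact parameterizations used by Bio Transformations may differ:

```python
import torch

nu, n = 0.16, 10_000

# Illustrative learning-rate multipliers around 1.0 for a few strategies.
samples = {
    "UNIFORM": torch.empty(n).uniform_(1 - nu, 1 + nu),
    "NORMAL": torch.normal(1.0, nu, size=(n,)),
    "LOGNORMAL": torch.distributions.LogNormal(0.0, nu).sample((n,)),
    "BETA": 1 - nu + 2 * nu * torch.distributions.Beta(2.0, 2.0).sample((n,)),
}

for name, s in samples.items():
    print(f"{name:>9}: mean={s.mean().item():.3f}  std={s.std().item():.3f}")
```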
## Implementing Custom Activation Functions for Weight Splitting

You can implement custom activation functions for weight splitting to modify how multi-synaptic connections behave:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

from bio_transformations import BioConverter

# Custom activation: Leaky ReLU with specific negative slope
def custom_leaky_activation(x):
    return F.leaky_relu(x, negative_slope=0.05)

# Custom activation: Sigmoid with scaling
def custom_sigmoid_activation(x):
    return torch.sigmoid(x) * 2.0  # Scale output range to (0, 2)

# Custom activation: Tanh with gain
def custom_tanh_activation(x):
    return torch.tanh(x * 1.5)  # Apply gain before tanh

# Use the custom activation in BioConverter
converter = BioConverter(
    weight_splitting_Gamma=2,  # Enable weight splitting
    weight_splitting_activation_function=custom_leaky_activation
)
bio_model = converter(model)
```
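Before training, it is worth a quick sanity check that a custom activation behaves as intended on a sample tensor. This is a trivial standalone check, not a library feature:

```python
x = torch.linspace(-2.0, 2.0, steps=5)

print(custom_leaky_activation(x))    # negative inputs scaled by 0.05
print(custom_sigmoid_activation(x))  # outputs fall in (0, 2)
print(custom_tanh_activation(x))     # saturates faster than plain tanh
```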
## Selective Application of Bio-Inspired Features

You can selectively apply bio-inspired features to specific layers of your model:
```python
class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)
        self.fc3 = nn.Linear(5, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = CustomModel()

# Mark fc3 to skip weight splitting.
# This is particularly useful for output layers where
# changing the output dimension would affect the task.
BioConverter.set_last_module_token_for_module(model.fc3)

# Convert the model
bio_model = converter(model)
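```

To confirm which layers actually received bio-inspired modifications after conversion, you can inspect the converted model. A minimal sketch, assuming converted layers expose the `bio_mod` child module used throughout this guide:

```python
for name, module in bio_model.named_modules():
    if hasattr(module, "bio_mod"):
        print(f"{name}: BioModule attached")
```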
## Monitoring Bio-Inspired Modifications

To monitor the effects of bio-inspired modifications during training:
```python
class MonitoredModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

model = MonitoredModel()
converter = BioConverter()
bio_model = converter(model)

# Track changes during training
crystallized_weights_history = []
rejuvenated_weights_history = []
learning_rate_variability_history = []

# Training loop with monitoring
for epoch in range(100):
    # Forward and backward pass
    outputs = bio_model(inputs)
    loss = criterion(outputs, targets)
    optimizer.zero_grad()
    loss.backward()

    # Get pre-modification weights for comparison
    pre_weights_fc1 = bio_model.fc1.weight.data.clone()

    # Apply bio-inspired modifications
    bio_model.crystallize()

    # Track crystallized weights after each epoch
    with torch.no_grad():
        # Count crystallized weights (those with reduced learning rates)
        crystallized_count = torch.sum(
            bio_model.fc1.bio_mod.fuzzy_learning_rate_parameters < 0.9
        ).item()
        crystallized_weights_history.append(crystallized_count)

        # Check learning rate variability
        lr_variability = bio_model.fc1.bio_mod.fuzzy_learning_rate_parameters.std().item()
        learning_rate_variability_history.append(lr_variability)

    # Apply other modifications
    bio_model.fuzzy_learning_rates()
    optimizer.step()

    # Periodically apply rejuvenation
    if epoch % 10 == 0:
        # Get pre-rejuvenation weights
        pre_rejuv_weights = bio_model.fc1.weight.data.clone()

        # Apply rejuvenation
        bio_model.rejuvenate_weights()

        # Count rejuvenated weights
        with torch.no_grad():
            rejuvenated_count = torch.sum(
                (bio_model.fc1.weight.data - pre_rejuv_weights).abs() > 1e-6
            ).item()
            rejuvenated_weights_history.append(rejuvenated_count)

# Plot the results
import matplotlib.pyplot as plt

plt.figure(figsize=(15, 5))

plt.subplot(1, 3, 1)
plt.plot(crystallized_weights_history)
plt.title('Crystallized Weights Over Time')
plt.xlabel('Epoch')
plt.ylabel('Number of Crystallized Weights')

plt.subplot(1, 3, 2)
plt.plot(rejuvenated_weights_history)
plt.title('Rejuvenated Weights Over Time')
plt.xlabel('Epoch')
plt.ylabel('Number of Rejuvenated Weights')

plt.subplot(1, 3, 3)
plt.plot(learning_rate_variability_history)
plt.title('Learning Rate Variability Over Time')
plt.xlabel('Epoch')
plt.ylabel('Standard Deviation of Learning Rates')

plt.tight_layout()
plt.show()
```
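The per-layer bookkeeping above can be collected in one place. The helper below is a sketch (`bio_stats` is not a library function); it relies only on the `bio_mod` attribute and `fuzzy_learning_rate_parameters` tensor used in the loop above, and the 0.9 cutoff is the same illustrative threshold:

```python
def bio_stats(model, crystal_cutoff=0.9):
    # Gather crystallization counts and learning-rate spread for every
    # layer that carries a BioModule.
    stats = {}
    for name, module in model.named_modules():
        if hasattr(module, "bio_mod"):
            lr_params = module.bio_mod.fuzzy_learning_rate_parameters
            stats[name] = {
                "crystallized": int((lr_params < crystal_cutoff).sum().item()),
                "lr_std": float(lr_params.std().item()),
            }
    return stats

print(bio_stats(bio_model))
```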
## Creating a Custom BioModule Extension

You can extend the `BioModule` class with your own bio-inspired methods:
```python
import torch
import torch.nn as nn

from bio_transformations.bio_module import BioModule
from bio_transformations import BioConverter

class CustomBioModule(BioModule):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Add custom state variables
        self.register_buffer('activity_history', torch.zeros(10))
        self.current_step = 0

    def custom_bio_method(self):
        """
        Custom bio-inspired plasticity rule that scales weights
        based on a simulated neuromodulator presence.
        """
        with torch.no_grad():
            # Simulate neuromodulator concentration varying over time
            neuromodulator = 0.5 + 0.5 * torch.sin(torch.tensor(self.current_step / 10))
            self.current_step += 1

            # Scale weights based on neuromodulator concentration
            scale_factor = 1.0 + 0.1 * neuromodulator
            self.get_parent().weight.data *= scale_factor

            # Track activity for this step
            idx = self.current_step % 10
            self.activity_history[idx] = neuromodulator.item()

# Update BioModule.exposed_functions to include your new method
CustomBioModule.exposed_functions = BioModule.exposed_functions + ("custom_bio_method",)

# Create a custom BioConverter that uses your extended BioModule
class CustomBioConverter(BioConverter):
    def _bio_modulize(self, module):
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            module.add_module('bio_mod', CustomBioModule(lambda: module, config=self.config))

# Use your custom converter
custom_converter = CustomBioConverter()
bio_model = custom_converter(model)

# Training loop with custom method
for epoch in range(100):
    # Standard training steps
    # ...

    # Apply custom bio method
    bio_model.custom_bio_method()

    # Continue with other modifications
    # ...
```
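Before wiring a custom method into a full training run, a quick smoke test can confirm it behaves as intended. This is a sketch reusing the `MonitoredModel` defined earlier; the expected scale factor follows from the sine schedule above (about 1.05 on the first call):

```python
tiny = MonitoredModel()
tiny_bio = CustomBioConverter()(tiny)

before = tiny_bio.fc1.weight.data.clone()
tiny_bio.custom_bio_method()
ratio = (tiny_bio.fc1.weight.data / before).mean().item()

print(f"mean scale factor after one call: {ratio:.3f}")  # expected ~1.05
```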
## Combining Bio Transformations with Other PyTorch Features

Bio Transformations can be combined with other PyTorch features like DataParallel for multi-GPU training or TorchScript for deployment:
```python
import torch
import torch.nn as nn
from torch.nn.parallel import DataParallel

from bio_transformations import BioConverter

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create and convert model
model = YourModel()
converter = BioConverter()
bio_model = converter(model)

# Wrap with DataParallel for multi-GPU training
if torch.cuda.device_count() > 1:
    bio_model = DataParallel(bio_model)
bio_model = bio_model.to(device)

# Standard training loop
for inputs, targets in train_loader:
    inputs, targets = inputs.to(device), targets.to(device)
    outputs = bio_model(inputs)
    loss = criterion(outputs, targets)
    optimizer.zero_grad()
    loss.backward()

    # For DataParallel models, we need to access the module attribute
    if isinstance(bio_model, DataParallel):
        bio_model.module.volume_dependent_lr()
        bio_model.module.fuzzy_learning_rates()
        bio_model.module.crystallize()
    else:
        bio_model.volume_dependent_lr()
        bio_model.fuzzy_learning_rates()
        bio_model.crystallize()

    optimizer.step()

# Export with TorchScript (script the underlying model, not the DataParallel wrapper).
# Note: after conversion to TorchScript, bio-inspired methods
# can no longer be called, so this is for deployment only.
export_model = bio_model.module if isinstance(bio_model, DataParallel) else bio_model
scripted_model = torch.jit.script(export_model)
scripted_model.save("bio_model_scripted.pt")
```
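The repeated `isinstance` branch in the training loop can be factored into a tiny helper. This is just a convenience sketch; `unwrap` is not a library function:

```python
def unwrap(model):
    # Return the underlying bio-converted model whether or not it is
    # wrapped in DataParallel.
    return model.module if isinstance(model, DataParallel) else model

# Inside the training loop:
core = unwrap(bio_model)
core.volume_dependent_lr()
core.fuzzy_learning_rates()
core.crystallize()
```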
## Performance Optimization Tips

Here are some tips to optimize performance when using Bio Transformations:
### Selective Application of Bio-Inspired Methods

Not all bio-inspired methods need to be applied at every iteration. For example:

```python
# Instead of applying everything every iteration:
for i, (inputs, targets) in enumerate(train_loader):
    outputs = bio_model(inputs)
    loss = criterion(outputs, targets)
    optimizer.zero_grad()
    loss.backward()
    bio_model.volume_dependent_lr()
    bio_model.fuzzy_learning_rates()
    bio_model.crystallize()
    bio_model.rejuvenate_weights()  # Expensive operation
    optimizer.step()

# Consider selective application:
for i, (inputs, targets) in enumerate(train_loader):
    outputs = bio_model(inputs)
    loss = criterion(outputs, targets)
    optimizer.zero_grad()
    loss.backward()

    # Apply these every iteration
    bio_model.fuzzy_learning_rates()

    # Apply some methods less frequently
    if i % 10 == 0:
        bio_model.crystallize()

    # Apply expensive operations very selectively
    if i % 100 == 0:
        bio_model.rejuvenate_weights()

    optimizer.step()
```
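If the schedule grows beyond two or three methods, it can help to keep the frequencies in one place. A small sketch; `apply_bio_schedule` and `BIO_SCHEDULE` are hypothetical helpers, not part of the package:

```python
# Hypothetical helper: apply each exposed bio method every N steps.
BIO_SCHEDULE = {
    "fuzzy_learning_rates": 1,    # every step
    "crystallize": 10,            # every 10 steps
    "rejuvenate_weights": 100,    # every 100 steps (expensive)
}

def apply_bio_schedule(model, step, schedule=BIO_SCHEDULE):
    for method_name, every in schedule.items():
        if step % every == 0:
            getattr(model, method_name)()

# Usage inside the training loop, after loss.backward() and before optimizer.step():
# apply_bio_schedule(bio_model, i)
```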
### Use Non-Dynamic Distributions When Possible

Dynamic distributions require updates during training and may be more computationally expensive:

```python
# More efficient (no updates required during training):
config = BioConfig(
    fuzzy_lr_distribution=Distribution.NORMAL,
    fuzzy_lr_dynamic=False
)

# Less efficient (requires updates during training):
config = BioConfig(
    fuzzy_lr_distribution=Distribution.TEMPORAL,
    fuzzy_lr_dynamic=True
)
```
### Batch Processing for Activity-Dependent Learning

When using activity-dependent learning rates, process data in batches rather than as single examples:

```python
# Process entire batches (more efficient):
outputs = bio_model(inputs_batch)  # inputs_batch shape: [batch_size, features]
bio_model.update_fuzzy_learning_rates(inputs_batch)

# Avoid processing individual examples (less efficient):
for single_input in inputs_batch:
    output = bio_model(single_input.unsqueeze(0))
    bio_model.update_fuzzy_learning_rates(single_input.unsqueeze(0))
```
### Consider Model Size vs. Weight Splitting

Weight splitting increases memory usage and computation. For large models, consider using smaller values:

```python
# For smaller models, more splitting might be fine:
small_model_converter = BioConverter(weight_splitting_Gamma=4)

# For larger models, use less splitting or disable it:
large_model_converter = BioConverter(weight_splitting_Gamma=2)       # Less splitting
very_large_model_converter = BioConverter(weight_splitting_Gamma=0)  # Disabled
```
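If you are unsure how much overhead a given `weight_splitting_Gamma` adds for your architecture, one quick check is to compare parameter counts before and after conversion. A sketch using the `YourModel` placeholder from the earlier example:

```python
base_model = YourModel()
split_model = BioConverter(weight_splitting_Gamma=2)(YourModel())

print("parameters without splitting:", sum(p.numel() for p in base_model.parameters()))
print("parameters with splitting:   ", sum(p.numel() for p in split_model.parameters()))
```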
### Adjust Bio-Inspired Parameters Based on Network Size

Larger networks may require different parameter settings:

```python
# For small networks:
small_config = BioConfig(
    fuzzy_learning_rate_factor_nu=0.16,
    rejuvenation_parameter_dre=8.0
)

# For large networks, use more conservative settings:
large_config = BioConfig(
    fuzzy_learning_rate_factor_nu=0.1,  # Less variability
    rejuvenation_parameter_dre=12.0     # Less aggressive rejuvenation
)
```
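One way to act on this guideline automatically is to pick settings from the model's parameter count. The helper and the threshold below are illustrative assumptions, not library defaults:

```python
def settings_for_model(model, small_threshold=1_000_000):
    # Choose more conservative bio-inspired settings for larger networks.
    n_params = sum(p.numel() for p in model.parameters())
    if n_params < small_threshold:
        return {'fuzzy_learning_rate_factor_nu': 0.16, 'rejuvenation_parameter_dre': 8.0}
    return {'fuzzy_learning_rate_factor_nu': 0.1, 'rejuvenation_parameter_dre': 12.0}

converter = BioConverter.from_dict(settings_for_model(model))
```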
These advanced usage examples should help you customize and extend Bio Transformations to suit your specific needs. Remember to refer to the API documentation for detailed information on each class and method.