
Four Storage Backends, One Interface: Building a Pluggable Image Storage Layer in Python


[!note] 📚 Smart Assets Manager Series

  1. Why Storage Abstraction Matters — May 11
  2. Four Backends, One Interface ← you are here
  3. The Unified API: Credits and Rate Limiting — April 27
  4. Testing Strategy: Unit vs E2E — April 20
  5. 5 Edge Cases That Break Image APIs — June 1
  6. API Documentation: Swagger + Postman — March 30

In the previous post, I made the case for why coupling an image generator to a single storage provider creates a constraint that compounds as the system grows. Here’s the implementation.

The design is a classic strategy pattern: one abstract base class that defines the contract, four concrete classes that implement it, and one factory function that dispatches based on the caller’s preference. Every caller uses the same interface. Only the factory knows which backend is active.


The Contract: Abstract Base Class

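The base class is where the design is encoded. Every backend must implement upload() and generate_signed_url(), the two methods callers depend on, plus an internal _delete() hook that the shared cleanup logic uses. Everything else is a detail.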

# backend/app/integrations/storage_manager.py

from abc import ABC, abstractmethod
from enum import Enum
import logging

logger = logging.getLogger(__name__)


class StorageType(str, Enum):
    """Supported storage backends, passed as the 'storage' request parameter."""
    CLOUDINARY = "cloudinary"
    LOCAL = "local"
    S3 = "s3"
    DIRECT = "direct"


class StorageBackend(ABC):
    """Abstract base class for all image storage backends.

    upload() and generate_signed_url() are the interface callers depend on;
    _delete() is an internal hook used only by cleanup_failed_assets().
    Switching backends requires changing only the factory, not any caller code.
    """

    @abstractmethod
    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        """Store image bytes and return a URL or data URI.

        The visibility parameter controls access:
        - 'public': open URL, no authentication required
        - 'private': requires a signed URL to access (supported by Cloudinary and S3)
        """
        ...

    @abstractmethod
    def generate_signed_url(self, asset_id: str, expiry_seconds: int = 86400) -> str:
        """Generate a time-limited signed URL for a private asset.

        The default expiry of 86400 seconds (24 hours) is long enough for most
        downstream workflows but short enough to expire before stale links circulate.
        """
        ...

    def cleanup_failed_assets(self, asset_ids: list[str]) -> None:
        """Best-effort cleanup of partial uploads after a batch failure.

        When a multi-size generation partially fails — some images uploaded
        before the error, the rest not — those orphaned uploads should be removed.
        This method is non-raising by design: cleanup failure is logged, never re-raised.
        """
        for asset_id in asset_ids:
            try:
                self._delete(asset_id)
            except Exception as e:
                logger.warning("Cleanup failed for asset %s: %s", asset_id, e)

    @abstractmethod
    def _delete(self, asset_id: str) -> None:
        """Delete a stored asset by identifier (internal use only)."""
        ...

Three decisions in this base class are worth examining.

The visibility parameter on upload() encodes a meaningful distinction: public images get open CDN URLs; private images require signed access. Not all backends support this — local storage doesn’t have the concept — but the interface surfaces it so callers can express intent, even when some backends treat it as a no-op.

The cleanup_failed_assets() method is concrete, not abstract. This is intentional: the cleanup logic (iterate asset IDs, call _delete(), swallow exceptions) is identical across all backends. Only _delete() varies. Making cleanup concrete on the base class means backends get it for free — they just implement _delete().

The non-raising behavior of cleanup_failed_assets() is deliberate. Cleanup runs in an error context — a batch failed, and we’re trying to remove the partial uploads before returning the error. If the cleanup itself raises, it obscures the original error and makes debugging harder. Warn on cleanup failure; never re-raise.
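
To make the non-raising behavior concrete, here is a minimal sketch that exercises the cleanup path with a test double. The FlakyInMemoryStorage class, the generate_batch() helper, and the asset IDs are hypothetical names invented for this illustration; the base class is trimmed to the cleanup logic only.

```python
import logging
from abc import ABC, abstractmethod

logger = logging.getLogger(__name__)


class StorageBackend(ABC):
    """Trimmed copy of the base class: only the cleanup path is kept."""

    def cleanup_failed_assets(self, asset_ids: list[str]) -> None:
        for asset_id in asset_ids:
            try:
                self._delete(asset_id)
            except Exception as e:
                logger.warning("Cleanup failed for asset %s: %s", asset_id, e)

    @abstractmethod
    def _delete(self, asset_id: str) -> None:
        ...


class FlakyInMemoryStorage(StorageBackend):
    """Hypothetical test double: deleting 'img_2' always fails."""

    def __init__(self) -> None:
        self.assets = {"img_1": b"a", "img_2": b"b", "img_3": b"c"}

    def _delete(self, asset_id: str) -> None:
        if asset_id == "img_2":
            raise RuntimeError("simulated provider outage")
        del self.assets[asset_id]


def generate_batch(storage: StorageBackend) -> None:
    """Simulate a batch that fails after three uploads succeeded."""
    uploaded = ["img_1", "img_2", "img_3"]
    try:
        raise ValueError("generation failed on size 4 of 4")
    except ValueError:
        # Cleanup never raises, so the original error is the one that surfaces.
        storage.cleanup_failed_assets(uploaded)
        raise
```

Calling generate_batch() with this double surfaces the original ValueError, not the RuntimeError from the failed delete, and the two deletable assets are gone afterwards.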


The Cloudinary Backend

Cloudinary is the production default: images uploaded here get CDN URLs, automatic optimization, and format transformation support.

import time
import cloudinary.uploader
import cloudinary.utils
from pathlib import Path


class CloudinaryStorage(StorageBackend):

    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        # The 'authenticated' upload type requires a signature to access the URL.
        # This prevents hotlinking and ensures private images stay private.
        upload_type = "authenticated" if visibility == "private" else "upload"

        result = cloudinary.uploader.upload(
            image_bytes,
            public_id=Path(filename).stem,
            resource_type="image",
            type=upload_type,
        )
        return result["secure_url"]

    def generate_signed_url(self, asset_id: str, expiry_seconds: int = 86400) -> str:
        expires_at = int(time.time()) + expiry_seconds
        signed_url, _ = cloudinary.utils.cloudinary_url(
            asset_id,
            type="authenticated",
            sign_url=True,
            expires_at=expires_at,
        )
        return signed_url

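    def _delete(self, asset_id: str) -> None:
        # destroy() targets type="upload" unless told otherwise; assets stored
        # as "authenticated" would need type="authenticated" passed here as well.
        cloudinary.uploader.destroy(asset_id, resource_type="image")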

The upload_type = "authenticated" if visibility == "private" line is a detail that matters: Cloudinary’s “authenticated” upload type locks the asset behind a signature. Without this, marking an image “private” in the API call would have no effect — the URL would still be publicly accessible.


The Local File System Backend

Local storage exists for two reasons: development (no credentials, no costs) and on-premise deployments (data stays on the internal server).

from pathlib import Path


class LocalStorage(StorageBackend):

    BASE_PATH = Path("generated_assets")
    BASE_URL = "http://localhost:8000/assets"

    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        # Create the directory if it doesn't exist — first run in a new environment
        self.BASE_PATH.mkdir(parents=True, exist_ok=True)

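        # Keep only the final path component so a filename such as
        # "../../secrets.txt" cannot escape BASE_PATH (mirrors _delete below).
        safe_name = Path(filename).name
        target = self.BASE_PATH / safe_name
        target.write_bytes(image_bytes)

        # The caller is responsible for serving this path.
        # In development, the FastAPI app serves /assets from BASE_PATH.
        return f"{self.BASE_URL}/{safe_name}"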

    def generate_signed_url(self, asset_id: str, expiry_seconds: int = 86400) -> str:
        # Signed URLs don't apply in local storage — trusted environments only.
        return asset_id

    def _delete(self, asset_id: str) -> None:
        path = self.BASE_PATH / Path(asset_id).name
        if path.exists():
            path.unlink()

Having generate_signed_url() return asset_id unchanged is intentional. Local storage is used in development and trusted on-premise environments where the concept of “signing” a URL doesn’t apply. Returning the asset ID as-is keeps the interface consistent without pretending the backend does something it doesn’t.
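
A quick way to check the local backend's behavior is a round trip through a temporary directory. The sketch below is an assumption-laden stand-in, not the class above: TempLocalStorage and round_trip() are hypothetical names, the base path is injected through the constructor so nothing is written to the working directory, and the filename is reduced to its final component before writing.

```python
import tempfile
from pathlib import Path


class TempLocalStorage:
    """Hypothetical stand-in for LocalStorage with an injectable base path."""

    def __init__(self, base_path: Path, base_url: str = "http://localhost:8000/assets") -> None:
        self.base_path = base_path
        self.base_url = base_url

    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        self.base_path.mkdir(parents=True, exist_ok=True)
        safe_name = Path(filename).name  # drop any directory components
        (self.base_path / safe_name).write_bytes(image_bytes)
        return f"{self.base_url}/{safe_name}"

    def generate_signed_url(self, asset_id: str, expiry_seconds: int = 86400) -> str:
        return asset_id  # no signing in a trusted environment

    def _delete(self, asset_id: str) -> None:
        # Accepts either a bare filename or the URL returned by upload().
        (self.base_path / Path(asset_id).name).unlink(missing_ok=True)


def round_trip() -> tuple[str, bool, bool]:
    """Upload, confirm the file exists, delete it, confirm it is gone."""
    with tempfile.TemporaryDirectory() as tmp:
        storage = TempLocalStorage(Path(tmp) / "generated_assets")
        url = storage.upload(b"\x89PNG fake bytes", "../escape/hero.png")
        stored = (storage.base_path / "hero.png").exists()
        storage._delete(url)
        removed = not (storage.base_path / "hero.png").exists()
        return url, stored, removed
```

The traversal-style filename is deliberate: the file still lands inside the base directory, and the returned URL contains only the final path component.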


The Direct Download Backend

This is the backend that gets the most use from developers and that nobody planned for.

Instead of storing anything, it base64-encodes the image bytes and returns a data URI directly in the API response. No credentials. No storage costs. No round-trip.

import base64


class DirectDownloadStorage(StorageBackend):
    """Returns image bytes as a base64-encoded data URI.

    No external storage. No credentials required.
    Ideal for: CI/CD pipelines, development and testing, API consumers
    who handle their own storage downstream.
    """

    # Format detection by byte signature rather than file extension.
    # Callers don't always provide correct extensions; bytes are authoritative.
    _FORMAT_SIGNATURES = {
        b"\x89PNG": "png",
        b"\xff\xd8\xff": "jpeg",
        b"RIFF": "webp",
    }

    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        image_format = "png"  # Safe default
        for signature, fmt in self._FORMAT_SIGNATURES.items():
            if image_bytes[: len(signature)] == signature:
                image_format = fmt
                break

        encoded = base64.b64encode(image_bytes).decode("utf-8")
        return f"data:image/{image_format};base64,{encoded}"

    def generate_signed_url(self, asset_id: str, expiry_seconds: int = 86400) -> str:
        # Data URIs don't expire — they're ephemeral by nature.
        return asset_id

    def _delete(self, asset_id: str) -> None:
        pass  # Nothing to delete.
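
On the consuming side, a client that receives the data URI has to reverse the encoding before it can write the image anywhere. A minimal sketch of that step, where decode_data_uri() is a hypothetical helper and only base64-encoded URIs in the shape produced above are handled:

```python
import base64


def decode_data_uri(data_uri: str) -> tuple[str, bytes]:
    """Split a base64 data URI into (mime_type, raw_bytes)."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

A CI job, for example, could call this on the API response and write the returned bytes straight to an artifact file.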

The byte-signature detection in _FORMAT_SIGNATURES is worth noting. Callers pass a filename parameter, but filenames lie: a caller can upload a PNG with a .jpg extension, and a data URI labeled image/jpeg would then misdescribe its own payload, which stricter consumers may reject or mishandle. Reading the first few bytes of the actual image data and inferring the format from the magic bytes is more reliable than trusting the filename. One caveat: RIFF is a generic container prefix shared with WAV and AVI, so that check identifies WebP only on the assumption that the input is already an image.

Takeaway: Detect image format from byte signatures, not file extensions. The b"\x89PNG" magic bytes at the start of a PNG are always there; the filename extension is whatever the caller provided.
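
A stricter variant of the same idea can be sketched as a standalone function. This is an illustration, not the code used by the backend above: detect_image_format() is a hypothetical name, the GIF check is an addition, and the WebP check also looks at the "WEBP" tag that follows the RIFF size field.

```python
def detect_image_format(image_bytes: bytes, default: str = "png") -> str:
    """Infer an image format from leading magic bytes, else return default."""
    if image_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if image_bytes.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # RIFF is a generic container (WAV and AVI use it too); WebP files
    # additionally carry the ASCII tag "WEBP" at byte offsets 8 to 11.
    if image_bytes[:4] == b"RIFF" and image_bytes[8:12] == b"WEBP":
        return "webp"
    if image_bytes[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    return default
```

Slicing rather than indexing keeps the function safe on inputs shorter than the signature, including empty bytes, which simply fall through to the default.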


The Factory Function

One function. This is the only place in the codebase that knows which class maps to which StorageType. All callers depend on the interface, not the implementation.

def get_storage_backend(storage_type: str) -> StorageBackend:
    """Return the correct storage backend for the given type string.

    Raises ValueError if the type is unrecognized. Callers should still validate
    the storage parameter against StorageType before calling this function.
    """
    backends = {
        StorageType.CLOUDINARY: CloudinaryStorage,
        StorageType.LOCAL: LocalStorage,
        StorageType.S3: S3Storage,  # class not shown in this post
        StorageType.DIRECT: DirectDownloadStorage,
    }

    # StorageType(...) itself raises ValueError for an unknown string, so the
    # conversion sits inside the try block to keep one consistent error message.
    try:
        backend_class = backends[StorageType(storage_type)]
    except (ValueError, KeyError):
        raise ValueError(f"Unknown storage type: {storage_type!r}") from None

    return backend_class()

Adding a fifth backend — say, Google Cloud Storage — means: add a class that implements upload(), generate_signed_url(), and _delete(), add it to this dictionary, add the value to StorageType. The rest of the codebase is unchanged.
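
The three steps for a fifth backend can be sketched in a reduced, self-contained form. Everything named GCS here is hypothetical: GCSStorage is a placeholder that returns a made-up URL instead of calling any real client, and the base class is cut down to a single method to keep the example short.

```python
from abc import ABC, abstractmethod
from enum import Enum


class StorageType(str, Enum):
    DIRECT = "direct"
    GCS = "gcs"  # step 3: hypothetical new enum value


class StorageBackend(ABC):
    """Reduced to one method for the purposes of this sketch."""

    @abstractmethod
    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        ...


class DirectDownloadStorage(StorageBackend):
    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        return "data:image/png;base64,"  # placeholder, real encoding omitted


class GCSStorage(StorageBackend):
    """Step 1: hypothetical backend; a real one would call a storage client."""

    def upload(self, image_bytes: bytes, filename: str, visibility: str = "public") -> str:
        return f"https://storage.example.invalid/{filename}"  # made-up URL


def get_storage_backend(storage_type: str) -> StorageBackend:
    backends = {
        StorageType.DIRECT: DirectDownloadStorage,
        StorageType.GCS: GCSStorage,  # step 2: register the class
    }
    try:
        backend_class = backends[StorageType(storage_type)]
    except (ValueError, KeyError):
        raise ValueError(f"Unknown storage type: {storage_type!r}") from None
    return backend_class()


# Caller code is identical for every backend.
storage = get_storage_backend("gcs")
url = storage.upload(b"image bytes", "hero.png")
```

The caller at the bottom never names GCSStorage, which is the point of the pattern: only the enum and the dictionary changed.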


What’s Next

With storage handled, the next piece is the API layer: a unified endpoint that accepts all generation types, validates requests with Pydantic, and coordinates credit reservation so users are never charged without delivery.

→ Next: The Unified API with Credit Reservation

Full Bright
