The 5 Edge Cases That Break Image Generation APIs (Found in Production, Tested in Code)

Summary

Case 1: Partial Batch Failure → Proportional Charge
Case 2: SVG Injection → Sanitize, Don’t Reject
Cases 3 and 4: Credits — Before and After
Case 5: Rate Limit State Across Sequential Requests
The Result: 90% E2E Coverage

[!note] 📚 Smart Assets Manager Series

Why Storage Abstraction Matters — May 11

Four Backends, One Interface — May 18

The Unified API: Credits and Rate Limiting — April 27

Testing Strategy: Unit vs E2E — April 20

5 Edge Cases That Break Image APIs ← you are here

API Documentation: Swagger + Postman — March 30

The testing strategy post established the principle: unit tests for function logic, E2E tests for integration behavior. This post is about five specific cases where that principle had to be applied — scenarios that look fine in normal usage but surface in production under conditions that are hard to predict and expensive to debug.

None of these are theoretical. Each one represents either a bug caught by the test or a guarantee that the test validates across the full stack.

Case 1: Partial Batch Failure → Proportional Charge

When a user requests 10 image sizes and 3 fail mid-generation, what does the API charge?

The obvious answer — charge for 7 successes, not 10 — requires three services to coordinate correctly: the generator, the storage backend, and the credit service. The generator knows how many images succeeded. The credit service holds the reservation. For the charge to reflect reality, these two pieces of information have to meet in the right place, in the right order.

The initial implementation got this wrong. The credit confirmation ran after the batch loop, using the requested count, not the success count. The logic was in the right place conceptually but reading the wrong variable.

@pytest.mark.asyncio
async def test_partial_failure_proportional_credit_charge(
    async_client, test_user, monkeypatch
):
    call_count = 0

    def mock_generator_with_failures(data, width, height):
        nonlocal call_count
        call_count += 1
        # Fails on attempts 4, 7, and 9 — simulates intermittent generation errors
        if call_count in [4, 7, 9]:
            raise RuntimeError(f"Generation failed for {width}x{height}")
        return b"\x89PNG\r\n\x1a\n"  # Minimal valid PNG header

    monkeypatch.setattr(
        "app.services.generators.SocialCardGenerator.generate",
        mock_generator_with_failures
    )

    response = await async_client.post(
        "/api/v1/deterministic/generate",
        json={
            "type": "social_card",
            "storage": "direct",
            "generate_sizes": True,
            "preset_name": "custom_10",
            "data": {"title": "Test", "brand_color": "#000"},
        },
        headers={"Authorization": f"Bearer {test_user.api_key}"},
    )

    assert response.status_code == 207  # Multi-status: partial success
    data = response.json()
    assert data["success_count"] == 7
    assert data["error_count"] == 3
    # Charge: 0.25 base + 6 additional variants × 0.1 = 0.85
    assert abs(data["credits_used"] - 0.85) < 0.01

The test uses monkeypatch to inject deterministic failures at specific call counts. The assertion on credits_used is the one that caught the bug — when the charge was computed from the requested count, it was always 1.15, not 0.85.

Takeaway: Proportional charging requires the success count to flow to the billing system. Test this with injected failures at specific positions in the batch, then assert the charge matches what succeeded.

Case 2: SVG Injection → Sanitize, Don’t Reject

This one involves a security question with a non-obvious correct answer.

Smart Assets Manager accepts user-provided SVG templates. A malicious user can submit an SVG containing <script> tags, onclick handlers, and <foreignObject> elements wrapping iframes with javascript: URIs. The wrong response is to reject the request. The right response is to strip the dangerous elements and still render the image.

Why? Because rejecting-on-encounter is too aggressive. Real SVGs from legitimate tools sometimes include namespace declarations, conditional processing instructions, or elements that parsers flag as suspicious but that don’t constitute actual attacks. Rejecting those inputs creates false negatives and frustrates legitimate users.

@pytest.mark.asyncio
async def test_svg_injection_sanitized_and_renders(async_client, test_user):
    malicious_svg = """<svg xmlns="http://www.w3.org/2000/svg" width="800" height="600">
        <script>document.cookie = 'stolen=' + document.cookie;</script>
        <rect onclick="alert('xss')" width="800" height="600" fill=""/>
        <foreignObject><div xmlns="http://www.w3.org/1999/xhtml">
            <iframe src="javascript:alert('xss')"></iframe>
        </div></foreignObject>
        <text x="400" y="300"></text>
    </svg>"""

    response = await async_client.post(
        "/api/v1/deterministic/generate",
        json={
            "type": "svg_template",
            "storage": "direct",
            "data": {
                "template_content": malicious_svg,
                "variables": {"bg_color": "#1A2980", "title": "Safe"},
            },
        },
        headers={"Authorization": f"Bearer {test_user.api_key}"},
    )

    # Generation succeeds — sanitization strips, doesn't block
    assert response.status_code == 200
    data = response.json()
    assert data["urls"][0]["url"].startswith("data:image/")

    # The sanitization report is part of the response
    assert data["sanitization"]["elements_removed"] >= 3
    removed_types = data["sanitization"]["elements_removed_types"]
    assert "script" in removed_types
    assert "foreignObject" in removed_types

The test asserts two things that pull in opposite directions: the response is 200 (not rejected), and dangerous elements were removed (not passed through). Both must be true simultaneously.

The bug this caught: the original sanitizer raised an exception when encountering <foreignObject> elements — it treated nested XML namespaces as malformed SVG rather than stripping them. The fix was to add foreignObject to the strip list and continue rendering, not to treat it as a parse error.

Cases 3 and 4: Credits — Before and After

Two credit-related edge cases belong together because they test opposite failure modes: the API overcharging by starting before it should, and the API losing money by not recovering after a server error.

Case 3: Insufficient credits → 402 before generation starts

If the credit check happens after generation, you deliver the image and can’t collect for it. The check must be first — before any computation runs.

@pytest.mark.asyncio
async def test_insufficient_credits_returns_402_before_generation(
    async_client, db_session
):
    poor_user = User(email="[email protected]", credits=0.5, api_key="poor-key")
    db_session.add(poor_user)
    db_session.commit()

    response = await async_client.post(
        "/api/v1/deterministic/generate",
        json={
            "type": "social_card",
            "storage": "direct",
            "generate_sizes": True,
            "preset_name": "blog_images",  # Total cost: 1.75
            "data": {"title": "Test"},
        },
        headers={"Authorization": "Bearer poor-key"},
    )

    assert response.status_code == 402
    data = response.json()
    assert data["detail"]["error"] == "insufficient_credits"
    assert data["detail"]["available"] == 0.5
    assert data["detail"]["required"] == 1.75

The test verifies that the response body includes both the available balance and the required amount — giving the caller enough information to explain the error to the user without a second API call.

Case 4: Server error → automatic credit refund

Credits are reserved before generation starts (the atomic reservation described in post 3). If a server error occurs mid-generation, those reserved credits must return to the user’s balance. This is a trust-destroying bug if it fails silently.

@pytest.mark.asyncio
async def test_server_error_triggers_credit_refund(
    async_client, test_user, monkeypatch, db_session
):
    initial_credits = test_user.credits

    def mock_generation_failure(*args, **kwargs):
        raise RuntimeError("Simulated server error during generation")

    monkeypatch.setattr(
        "app.services.generators.SocialCardGenerator.generate",
        mock_generation_failure
    )

    response = await async_client.post(
        "/api/v1/deterministic/generate",
        json={"type": "social_card", "storage": "direct", "data": {"title": "Test"}},
        headers={"Authorization": f"Bearer {test_user.api_key}"},
    )

    assert response.status_code == 500

    # The user's balance must be unchanged — no net charge on server error
    db_session.refresh(test_user)
    assert test_user.credits == initial_credits

The critical pattern here is db_session.refresh(test_user) — without refreshing from the database, the in-memory object holds the pre-refund state. The test passes through the HTTP layer (not mocking the endpoint directly) because the refund logic must execute inside the error handler, not the test fixture.

[!info] The refund test pattern To test credit refunds, inject an error into the generation layer (not the credit layer), make a real HTTP call, then refresh the database object and assert the balance is unchanged. Any shortcut that bypasses the HTTP layer also bypasses the error handler where the refund lives.

Case 5: Rate Limit State Across Sequential Requests

Rate limiting is the clearest example of an E2E-only test. The token bucket state lives in Redis. Unit tests don’t touch Redis. Only sequential real requests through the full stack can verify that the rate limiter counts correctly and enforces limits where it’s supposed to.

@pytest.mark.asyncio
async def test_free_tier_rate_limit_enforced(async_client, free_tier_user):
    # The free tier allows 5 requests per minute — all should succeed
    for i in range(5):
        r = await async_client.post(
            "/api/v1/deterministic/generate",
            json={
                "type": "url_personalization",
                "storage": "direct",
                "data": {
                    "text": f"Test {i}",
                    "background_url": "https://example.com/bg.jpg",
                },
            },
            headers={"Authorization": f"Bearer {free_tier_user.api_key}"},
        )
        assert r.status_code == 200, f"Request {i+1} failed unexpectedly: {r.json()}"

    # The 6th request hits the limit
    r = await async_client.post(
        "/api/v1/deterministic/generate",
        json={
            "type": "url_personalization",
            "storage": "direct",
            "data": {"text": "Over limit", "background_url": "https://example.com/bg.jpg"},
        },
        headers={"Authorization": f"Bearer {free_tier_user.api_key}"},
    )

    assert r.status_code == 429
    assert "Retry-After" in r.headers
    assert 0 < int(r.headers["Retry-After"]) <= 60

Two assertions on the 429 response: the status code and the Retry-After header. The header is checked for range (between 1 and 60 seconds) rather than exact value — because the rate limiter’s bucket refill time is relative to when the test runs, not a fixed number. Asserting a range is the correct behavior here; asserting an exact value would make the test timing-dependent and flaky.

The Result: 90% E2E Coverage

After these five cases — plus the basic happy-path tests from the testing strategy post — E2E coverage landed at 90%, matching the target. The remaining 10% represents scenarios that are genuinely difficult to automate: signed URL expiry validation (depends on real clock behavior in ways that defeat test environments) and Cloudinary-specific error responses (simulating these without mocking the entire SDK defeats the purpose of E2E tests). Both are documented as manual test scenarios.

The five edge cases here share a pattern: they’re all scenarios where the system’s behavior across component boundaries is what needs testing, not any individual function’s logic. The generator doesn’t know about credit refunds. The credit service doesn’t know about rate limits. Only a test that exercises the full stack catches the coordination failures.

→ Next: API Documentation: Swagger + Postman as First-Class Deliverables

The 5 Edge Cases That Break Image Generation APIs (Found in Production, Tested in Code)

Summary

Case 1: Partial Batch Failure → Proportional Charge

Case 2: SVG Injection → Sanitize, Don’t Reject

Cases 3 and 4: Credits — Before and After

Case 5: Rate Limit State Across Sequential Requests

The Result: 90% E2E Coverage

Niveaux OAuth Gmail : Ce Que Tout Développeur Workspace Doit Savoir

Gmail OAuth Scope Tiers: What Every Workspace Addon Developer Must Know

Le Mensonge des 50 000 $ qui a Failli Tuer Mon Add-on Gmail

Le Mensonge à 50 000 $ qui a Failli Tuer Mon Extension Gmail

Full Bright

Search

Recent Posts

Gmail OAuth Scope Tiers: What Every Workspace Addon Developer Must Know

The $50,000 Lie That Almost Killed My Gmail Addon

Gmail OAuth Scopes Decoded: The 3-Tier System That Determines Your Launch Path

Legal Documents for Gmail Add-ons: What Google Reviews and How to Deploy in 15 Minutes

Categories

Tags

Contact

To order one of our services, navigate to the order service page

Address

Email Us

Open Hours