Building Real-Time Usage Tracking Without Background Jobs

December 22, 2024

When building AuthMint, a usage-based billing platform, I faced a critical architectural decision: how to track and calculate user token consumption in real-time without the complexity and potential failures of background job systems.

The Problem

Most billing systems rely on background jobs to:

  • Calculate usage periodically
  • Update user balances
  • Check quota violations
  • Generate billing data

This approach has several downsides:

  • Delayed updates - Users don't see real-time usage
  • Complexity - Job scheduling, failure handling, retries
  • Race conditions - Between consumption and calculation
  • Data inconsistency - Temporary states during processing
  • Operational overhead - Monitoring job queues, handling failures

The Solution: Database-Level Real-Time Calculations

Instead of background jobs, I built the entire system using PostgreSQL functions that calculate everything in real-time during each API call.

Core Architecture

The heart of the system is a single PostgreSQL function that handles token consumption:

CREATE OR REPLACE FUNCTION consume_tokens_v2(
  p_project_id uuid,
  p_external_user_id text,
  p_tokens_consumed integer,
  p_tag_id uuid DEFAULT NULL,
  p_metadata jsonb DEFAULT NULL
) RETURNS TABLE(
  success boolean,
  tokens_consumed integer,
  base_quota_used integer,
  grants_used integer,
  message text
) LANGUAGE plpgsql AS $$
BEGIN
  -- Full body omitted for brevity; it validates, enforces restrictions,
  -- consumes tokens, and records the audit trail in one transaction
  -- (each step is described below)
END;
$$;

This function performs several operations atomically:

  1. Validates availability - Checks quotas and grants
  2. Enforces restrictions - Daily/hourly limits, tag restrictions
  3. Consumes tokens - From quotas first, then grants (FIFO)
  4. Records audit trail - Complete transaction history
  5. Returns real-time state - Current balance and usage

Token Sources & Priority

The system supports multiple token sources with a clear consumption priority (a toy sketch of the branch logic follows the summary):

-- For Subscription Plans:
1. Check tag restrictions (if applicable)
2. Use base quota first (if payment active)
3. Use grants (FIFO by expiry date)

-- For Pay-as-you-go Plans:
1. Check tag restrictions (if applicable)  
2. Use grants only (FIFO by expiry date)
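
To make the branch concrete, here is a toy, self-contained sketch of the priority rule; the variable names and hard-coded values are illustrative, not the production function:

DO $$
DECLARE
  v_plan_type       text    := 'subscription';  -- or 'payg'
  v_payment_active  boolean := true;
  v_requested       integer := 500;
  v_quota_remaining integer := 300;  -- tokens left on the base quota
  v_from_quota      integer := 0;
  v_from_grants     integer;
BEGIN
  -- Subscription plans draw on the base quota first (while payment is active)
  IF v_plan_type = 'subscription' AND v_payment_active THEN
    v_from_quota := LEAST(v_requested, v_quota_remaining);
  END IF;
  -- Whatever remains comes from grants, consumed FIFO by expiry date
  v_from_grants := v_requested - v_from_quota;
  RAISE NOTICE 'from quota: %, from grants: %', v_from_quota, v_from_grants;
END;
$$;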

Restriction System

One of the most complex parts is handling flexible restrictions:

-- Example: GPT-4 tokens limited to 2000/day
INSERT INTO plan_restrictions (
  plan_id,
  restriction_type,
  restriction_config
) VALUES (
  uuid_plan_id,
  'tag',
  '{
    "tag_name": "GPT-4",
    "max_tokens": 2000,
    "duration": "1 day"
  }'
);

The system calculates restriction periods dynamically (a sketch of the window math follows the list):

  • 1 hour → Current hour window
  • 1 day → Current day window
  • 1 week → Current week window
  • 1 month → Current month window
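
PostgreSQL's date_trunc makes these windows cheap to derive on the fly. A minimal sketch, assuming the "duration" value from restriction_config picks the unit:

-- Current-day window; substitute 'hour', 'week', or 'month' for other durations
SELECT date_trunc('day', now())                    AS period_start,
       date_trunc('day', now()) + interval '1 day' AS period_end;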

Implementation Details

Database Schema Design

Key tables that enable real-time calculations (a concrete sketch of one table follows the diagrams):

-- User token grants (FIFO consumption)
user_token_grants
├── token_amount (original grant)
├── expires_at (FIFO ordering)
└── grant_type (trial, bonus, purchase)

-- Grant consumption tracking
user_token_grants_consume
├── user_token_grants_id
├── token_deduction
└── created_at

-- Quota usage tracking
user_quota_usage
├── quota_id
├── period_start/period_end
└── tokens_used

-- Complete audit trail
token_transfers
├── amount (+ for grants, - for consumption)
├── transaction_type
└── metadata (full context)
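
For concreteness, here is a minimal sketch of the grants table; the column names follow the diagram above, while the types, constraints, and the user_id reference are my assumptions:

CREATE TABLE user_token_grants (
  id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id      uuid NOT NULL REFERENCES tracked_users(id),  -- assumed FK
  token_amount integer NOT NULL CHECK (token_amount > 0),
  grant_type   text NOT NULL,  -- 'trial', 'bonus', 'purchase'
  expires_at   timestamptz,    -- NULL means the grant never expires
  created_at   timestamptz NOT NULL DEFAULT now()
);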

Real-Time Balance Calculation

The system calculates available balance on every request:

-- Available grants (FIFO by expiry)
SELECT SUM(
  GREATEST(0, utg.token_amount - COALESCE(consumed.total_consumed, 0))
) 
FROM user_token_grants utg
LEFT JOIN (
  SELECT user_token_grants_id, SUM(token_deduction) as total_consumed
  FROM user_token_grants_consume 
  GROUP BY user_token_grants_id
) consumed ON utg.id = consumed.user_token_grants_id
WHERE utg.user_id = ? 
  AND (utg.expires_at IS NULL OR utg.expires_at > NOW())
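
Consumption walks the same grants in expiry order. A sketch of the FIFO selection, where NULLS LAST keeps never-expiring grants for last:

-- Grants with remaining balance, oldest expiry first
SELECT utg.id,
       utg.token_amount - COALESCE(consumed.total_consumed, 0) AS remaining
FROM user_token_grants utg
LEFT JOIN (
  SELECT user_token_grants_id, SUM(token_deduction) AS total_consumed
  FROM user_token_grants_consume
  GROUP BY user_token_grants_id
) consumed ON utg.id = consumed.user_token_grants_id
WHERE utg.user_id = ?
  AND (utg.expires_at IS NULL OR utg.expires_at > NOW())
  AND utg.token_amount > COALESCE(consumed.total_consumed, 0)
ORDER BY utg.expires_at ASC NULLS LAST;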

Handling Concurrency

PostgreSQL's transactional guarantees, combined with row locking, handle concurrent token consumption safely (see the sketch after this list):

  • Atomic operations - Each consumption is a single transaction
  • Isolation - Concurrent requests don't interfere
  • Consistency - Database constraints prevent overselling
  • Durability - All changes are immediately persistent
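
At the default READ COMMITTED level, a read-then-write sequence still needs explicit row locks to rule out two requests spending the same grant. A common pattern, and an assumption here rather than a verbatim excerpt, is to lock the user's active grants at the top of the function:

-- Serialize concurrent consumers for the same user on these rows
SELECT id
FROM user_token_grants
WHERE user_id = ?
  AND (expires_at IS NULL OR expires_at > now())
ORDER BY expires_at NULLS LAST
FOR UPDATE;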

Performance Optimizations

Strategic Indexing

-- Fast user lookups
CREATE INDEX idx_tracked_users_project_external 
ON tracked_users(project_id, external_id);

-- Grant expiry ordering (FIFO)
CREATE INDEX idx_user_token_grants_expires 
ON user_token_grants(expires_at) 
WHERE expires_at IS NOT NULL;

-- Usage period lookups
CREATE INDEX idx_user_quota_usage_user_period 
ON user_quota_usage(user_id, period_start, period_end);

Query Optimization

  • Efficient joins - Minimize data retrieval
  • Early filtering - Use WHERE clauses effectively
  • Proper aggregation - SUM/COUNT at database level (example after this list)
  • Avoid N+1 queries - Batch related operations
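
As one example, a per-type usage summary can be aggregated entirely in the database instead of fetching transfer rows and summing them in application code (a sketch; the user_id column on token_transfers is my assumption):

-- Let the database do the arithmetic; ship back a few rows, not thousands
SELECT transaction_type,
       SUM(amount) AS net_tokens
FROM token_transfers
WHERE user_id = ?
GROUP BY transaction_type;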

Results & Benefits

This architecture delivers several key advantages:

Immediate Consistency

  • Users see real-time usage updates
  • No eventual consistency delays
  • Accurate quota enforcement

Simplified Operations

  • No background job monitoring
  • No queue management
  • No retry logic needed
  • Fewer failure modes

Performance Metrics

  • Average response time: 50-100ms
  • Concurrent requests: Handled safely
  • Database load: Manageable with proper indexing
  • Scalability: Limited by database performance

Developer Experience

// A single API call returns real-time data
const result = await fetch('/api/v1/consume', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    user_id: 'user_123',
    tokens_to_consume: 500,
    tag: 'GPT-4'
  })
});

// Immediate response with the current state
const { success, tokens_consumed, message } = await result.json();

Challenges & Trade-offs

Database Load

Real-time calculations put more load on the database, but this is manageable with:

  • Proper indexing strategy
  • Connection pooling
  • Query optimization
  • Database scaling (if needed)

Complex Business Logic

Moving logic to the database requires:

  • SQL expertise for complex functions
  • Careful testing of edge cases
  • Version control for database schema
  • Database-specific deployment strategies

Limited Language Features

PostgreSQL functions are less flexible than application code for:

  • Complex data transformations
  • External API calls
  • Rich error handling
  • Advanced logging

When to Use This Approach

This architecture works well when:

  ✅ Accuracy is critical - Financial/billing data
  ✅ Real-time updates required - User dashboards
  ✅ Simple deployment preferred - Fewer moving parts
  ✅ Database expertise available - Team comfortable with SQL
  ✅ Moderate scale - Database can handle the load

Avoid when:

  ❌ Very high scale - Database becomes bottleneck
  ❌ Complex external integrations - Need application-level logic
  ❌ Team prefers application logic - Limited SQL experience
  ❌ Eventual consistency acceptable - Background jobs sufficient

Lessons Learned

Start with Real-Time

It's much easier to build real-time calculations from the beginning than to retrofit them later. The data model and query patterns are fundamentally different.

PostgreSQL is Powerful

Modern PostgreSQL can handle sophisticated business logic efficiently. Don't underestimate what you can accomplish with well-written functions and proper indexing.

Simplicity Wins

Sometimes the "simple" approach of calculating everything in real-time is more robust than complex distributed systems with background jobs.

Test Thoroughly

Real-time systems need extensive testing for:

  • Concurrent access patterns (see the pgbench sketch after this list)
  • Edge cases in business logic
  • Performance under load
  • Data consistency scenarios
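
For the concurrency side, pgbench with a custom script is one workable way to hammer the consume path; this is an assumption about tooling, not the project's actual suite, and the project UUID is a placeholder:

-- consume.sql: every client spends tokens for the same user
-- run with: pgbench -f consume.sql -c 50 -T 30 authmint_db
SELECT * FROM consume_tokens_v2(
  '00000000-0000-0000-0000-000000000001'::uuid,  -- placeholder project id
  'load_test_user',
  10
);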

Future Considerations

As the system scales, potential optimizations include:

  • Read replicas for analytics queries
  • Caching layers for frequently accessed data
  • Database partitioning for large datasets
  • Async processing for non-critical operations

But the core real-time architecture provides a solid foundation that can evolve with business needs.

Conclusion

Building real-time usage tracking without background jobs proved to be a successful architectural choice for AuthMint. The approach delivers immediate consistency, simplifies operations, and provides a better user experience.

The key is designing your data model and queries for real-time use from the start, leveraging PostgreSQL's strengths, and accepting the trade-offs that come with database-centric business logic.

For SaaS applications where accurate, real-time usage tracking is critical, this approach offers a robust alternative to traditional background job architectures.