When building AuthMint, a usage-based billing platform, I faced a critical architectural decision: how to track and calculate user token consumption in real time without the complexity and potential failures of background job systems.
The Problem
Most billing systems rely on background jobs to:
- Calculate usage periodically
- Update user balances
- Check quota violations
- Generate billing data
This approach has several downsides:
- Delayed updates - Users don't see real-time usage
- Complexity - Job scheduling, failure handling, retries
- Race conditions - Between consumption and calculation
- Data inconsistency - Temporary states during processing
- Operational overhead - Monitoring job queues, handling failures
The Solution: Database-Level Real-Time Calculations
Instead of background jobs, I built the entire system using PostgreSQL functions that calculate everything in real time during each API call.
Core Architecture
The heart of the system is a single PostgreSQL function that handles token consumption:
CREATE OR REPLACE FUNCTION consume_tokens_v2(
    p_project_id uuid,
    p_external_user_id text,
    p_tokens_consumed integer,
    p_tag_id uuid DEFAULT NULL,
    p_metadata jsonb DEFAULT NULL
) RETURNS TABLE(
    success boolean,
    tokens_consumed integer,
    base_quota_used integer,
    grants_used integer,
    message text
)
This function performs several operations atomically (a simplified sketch follows this list):
- Validates availability - Checks quotas and grants
- Enforces restrictions - Daily/hourly limits, tag restrictions
- Consumes tokens - From quotas first, then grants (FIFO)
- Records audit trail - Complete transaction history
- Returns real-time state - Current balance and usage
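The production body is too long to reproduce here, but a heavily simplified sketch shows the shape of the core loop. Everything below is illustrative: consume_tokens_sketch is a made-up name, it drains grants only (no quotas, restrictions, or audit rows), and the tables match the schema described later in this post.

-- Simplified, illustrative version of the consume loop
CREATE OR REPLACE FUNCTION consume_tokens_sketch(
    p_user_id uuid,
    p_tokens integer
) RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
    v_grant record;
    v_remaining integer := p_tokens;
    v_take integer;
BEGIN
    -- Walk unexpired grants oldest-expiry-first, locking each row so a
    -- concurrent call cannot spend the same grant twice
    FOR v_grant IN
        SELECT g.id,
               g.token_amount - COALESCE((
                   SELECT SUM(c.token_deduction)
                   FROM user_token_grants_consume c
                   WHERE c.user_token_grants_id = g.id
               ), 0) AS available
        FROM user_token_grants g
        WHERE g.user_id = p_user_id
          AND (g.expires_at IS NULL OR g.expires_at > NOW())
        ORDER BY g.expires_at ASC NULLS LAST
        FOR UPDATE OF g
    LOOP
        EXIT WHEN v_remaining = 0;
        CONTINUE WHEN v_grant.available <= 0;
        v_take := LEAST(v_remaining, v_grant.available);
        INSERT INTO user_token_grants_consume (user_token_grants_id, token_deduction)
        VALUES (v_grant.id, v_take);
        v_remaining := v_remaining - v_take;
    END LOOP;
    IF v_remaining > 0 THEN
        -- Insufficient balance: raise so the partial inserts above roll back
        RAISE EXCEPTION 'insufficient tokens for user %', p_user_id;
    END IF;
    RETURN true;
END;
$$;

Raising an exception on shortfall is what makes the operation all-or-nothing: PostgreSQL rolls back the partial consumption rows automatically.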
Token Sources & Priority
The system supports multiple token sources with a clear consumption priority:
-- For Subscription Plans:
1. Check tag restrictions (if applicable)
2. Use base quota first (if payment active)
3. Use grants (FIFO by expiry date)
-- For Pay-as-you-go Plans:
1. Check tag restrictions (if applicable)
2. Use grants only (FIFO by expiry date)
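The FIFO part boils down to an ORDER BY. Here is a minimal sketch of the grant ordering, assuming the user_token_grants table described below, with a nullable expires_at for grants that never expire:

-- Drain grants that expire soonest first; never-expiring grants go last
SELECT id, token_amount, expires_at
FROM user_token_grants
WHERE user_id = $1
  AND (expires_at IS NULL OR expires_at > NOW())
ORDER BY expires_at ASC NULLS LAST;

Spending soon-to-expire grants first minimizes the tokens a user loses to expiry.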
Restriction System
One of the most complex parts is handling flexible restrictions:
-- Example: GPT-4 tokens limited to 2000/day
INSERT INTO plan_restrictions (
    plan_id,
    restriction_type,
    restriction_config
) VALUES (
    uuid_plan_id,  -- placeholder for the plan's uuid
    'tag',
    '{
        "tag_name": "GPT-4",
        "max_tokens": 2000,
        "duration": "1 day"
    }'
);
The system calculates restriction periods dynamically:
- 1 hour → Current hour window
- 1 day → Current day window
- 1 week → Current week window
- 1 month → Current month window
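PostgreSQL's date_trunc makes these windows cheap to derive on every request. The CASE mapping below is my illustration of the idea, not necessarily the production code:

-- Derive the current restriction window from the configured duration
SELECT ps.period_start,
       ps.period_start + (r.restriction_config->>'duration')::interval AS period_end
FROM plan_restrictions r
CROSS JOIN LATERAL (
    SELECT date_trunc(
               CASE r.restriction_config->>'duration'
                   WHEN '1 hour'  THEN 'hour'
                   WHEN '1 day'   THEN 'day'
                   WHEN '1 week'  THEN 'week'
                   WHEN '1 month' THEN 'month'
               END,
               NOW()
           ) AS period_start
) ps
WHERE r.plan_id = $1
  AND r.restriction_type = 'tag';

A nice side effect of storing durations as strings like '1 day' is that the same value casts directly to a SQL interval when computing period_end.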
Implementation Details
Database Schema Design
Key tables that enable real-time calculations:
-- User token grants (FIFO consumption)
user_token_grants
├── token_amount (original grant)
├── expires_at (FIFO ordering)
└── grant_type (trial, bonus, purchase)
-- Grant consumption tracking
user_token_grants_consume
├── user_token_grants_id
├── token_deduction
└── created_at
-- Quota usage tracking
user_quota_usage
├── quota_id
├── period_start/period_end
└── tokens_used
-- Complete audit trail
token_transfers
├── amount (+ for grants, - for consumption)
├── transaction_type
└── metadata (full context)
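For concreteness, here is illustrative DDL reconstructed from the two grant tables above. The column types and constraints are my assumptions, not the exact production schema:

-- Reconstructed sketch; real types and constraints may differ
CREATE TABLE user_token_grants (
    id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),  -- PG 13+
    user_id      uuid NOT NULL,            -- references tracked_users
    token_amount integer NOT NULL CHECK (token_amount > 0),
    expires_at   timestamptz,              -- NULL = never expires
    grant_type   text NOT NULL             -- 'trial' | 'bonus' | 'purchase'
);

CREATE TABLE user_token_grants_consume (
    id                   bigserial PRIMARY KEY,
    user_token_grants_id uuid NOT NULL REFERENCES user_token_grants(id),
    token_deduction      integer NOT NULL CHECK (token_deduction > 0),
    created_at           timestamptz NOT NULL DEFAULT NOW()
);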
Real-Time Balance Calculation
The system calculates available balance on every request:
-- Available grants (FIFO by expiry)
SELECT SUM(
    GREATEST(0, utg.token_amount - COALESCE(consumed.total_consumed, 0))
) AS grants_available
FROM user_token_grants utg
LEFT JOIN (
    SELECT user_token_grants_id, SUM(token_deduction) AS total_consumed
    FROM user_token_grants_consume
    GROUP BY user_token_grants_id
) consumed ON utg.id = consumed.user_token_grants_id
WHERE utg.user_id = $1  -- parameterized user id
  AND (utg.expires_at IS NULL OR utg.expires_at > NOW());
Handling Concurrency
PostgreSQL's ACID properties handle concurrent token consumption safely:
- Atomic operations - Each consumption is a single transaction
- Isolation - Concurrent requests never see each other's uncommitted writes
- Consistency - Database constraints prevent overselling
- Durability - All changes are immediately persistent
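Concretely, the overselling guard can be as simple as a conditional UPDATE. This is a sketch rather than the exact production statement; $3 stands in for the quota's token limit:

-- Atomically add usage only if the quota still has room; two concurrent
-- requests can never both push tokens_used past the limit
UPDATE user_quota_usage
SET tokens_used = tokens_used + $2
WHERE quota_id = $1
  AND period_start <= NOW() AND NOW() < period_end
  AND tokens_used + $2 <= $3
RETURNING tokens_used;
-- Zero rows returned means the quota is exhausted (or no open period):
-- reject the consumption and roll back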
Performance Optimizations
Strategic Indexing
-- Fast user lookups
CREATE INDEX idx_tracked_users_project_external
ON tracked_users(project_id, external_id);
-- Grant expiry ordering (FIFO)
CREATE INDEX idx_user_token_grants_expires
ON user_token_grants(expires_at)
WHERE expires_at IS NOT NULL;
-- Usage period lookups
CREATE INDEX idx_user_quota_usage_user_period
ON user_quota_usage(user_id, period_start, period_end);
Query Optimization
- Efficient joins - Minimize data retrieval
- Early filtering - Use WHERE clauses effectively
- Proper aggregation - SUM/COUNT at database level
- Avoid N+1 queries - Batch related operations
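As an example of the last point, the balance check can fetch grant balance and quota usage in a single round trip instead of one query per source (a sketch over the tables above):

-- One round trip for both token sources
WITH grant_balance AS (
    SELECT COALESCE(SUM(
               GREATEST(0, utg.token_amount - COALESCE(c.total_consumed, 0))
           ), 0) AS available
    FROM user_token_grants utg
    LEFT JOIN (
        SELECT user_token_grants_id, SUM(token_deduction) AS total_consumed
        FROM user_token_grants_consume
        GROUP BY user_token_grants_id
    ) c ON c.user_token_grants_id = utg.id
    WHERE utg.user_id = $1
      AND (utg.expires_at IS NULL OR utg.expires_at > NOW())
),
quota AS (
    SELECT COALESCE(SUM(tokens_used), 0) AS used
    FROM user_quota_usage
    WHERE user_id = $1
      AND period_start <= NOW() AND NOW() < period_end
)
SELECT g.available AS grants_available, q.used AS quota_used
FROM grant_balance g, quota q;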
Results & Benefits
This architecture delivers several key advantages:
Immediate Consistency
- Users see real-time usage updates
- No eventual consistency delays
- Accurate quota enforcement
Simplified Operations
- No background job monitoring
- No queue management
- No retry logic needed
- Fewer failure modes
Performance Metrics
- Average response time: 50-100ms
- Concurrent requests: Handled safely
- Database load: Manageable with proper indexing
- Scalability: Limited by database performance
Developer Experience
// Simple API call gets real-time data
const result = await fetch('/api/v1/consume', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    user_id: 'user_123',
    tokens_to_consume: 500,
    tag: 'GPT-4'
  })
});

// Immediate response with current state
const { success, tokens_consumed, message } = await result.json();
Challenges & Trade-offs
Database Load
Real-time calculations put more load on the database, but this is manageable with:
- Proper indexing strategy
- Connection pooling
- Query optimization
- Database scaling (if needed)
Complex Business Logic
Moving logic to the database requires:
- SQL expertise for complex functions
- Careful testing of edge cases
- Version control for database schema
- Database-specific deployment strategies
Limited Language Features
PostgreSQL functions are less flexible than application code for:
- Complex data transformations
- External API calls
- Rich error handling
- Advanced logging
When to Use This Approach
This architecture works well when:
- ✅ Accuracy is critical - Financial/billing data
- ✅ Real-time updates required - User dashboards
- ✅ Simple deployment preferred - Fewer moving parts
- ✅ Database expertise available - Team comfortable with SQL
- ✅ Moderate scale - Database can handle the load

Avoid when:
- ❌ Very high scale - Database becomes a bottleneck
- ❌ Complex external integrations - Need application-level logic
- ❌ Team prefers application logic - Limited SQL experience
- ❌ Eventual consistency acceptable - Background jobs sufficient
Lessons Learned
Start with Real-Time
It's much easier to build real-time calculations from the beginning than to retrofit them later. The data model and query patterns are fundamentally different.
PostgreSQL is Powerful
Modern PostgreSQL can handle sophisticated business logic efficiently. Don't underestimate what you can accomplish with well-written functions and proper indexing.
Simplicity Wins
Sometimes the "simple" approach of calculating everything in real time is more robust than complex distributed systems with background jobs.
Test Thoroughly
Real-time systems need extensive testing for:
- Concurrent access patterns
- Edge cases in business logic
- Performance under load
- Data consistency scenarios
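One invariant check worth running during concurrent load tests, using the tables described earlier: no grant should ever be consumed beyond its original amount.

-- Any rows returned indicate a race or a logic bug
SELECT utg.id, utg.token_amount, SUM(c.token_deduction) AS consumed
FROM user_token_grants utg
JOIN user_token_grants_consume c ON c.user_token_grants_id = utg.id
GROUP BY utg.id, utg.token_amount
HAVING SUM(c.token_deduction) > utg.token_amount;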
Future Considerations
As the system scales, potential optimizations include:
- Read replicas for analytics queries
- Caching layers for frequently accessed data
- Database partitioning for large datasets
- Async processing for non-critical operations
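The append-only token_transfers audit trail would be the natural first candidate for partitioning. A hypothetical sketch of that direction (not part of the current schema):

-- Hypothetical: range-partition the audit trail by month
CREATE TABLE token_transfers_partitioned (
    amount           integer NOT NULL,   -- + for grants, - for consumption
    transaction_type text NOT NULL,
    metadata         jsonb,
    created_at       timestamptz NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);

CREATE TABLE token_transfers_2025_01
    PARTITION OF token_transfers_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');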
But the core real-time architecture provides a solid foundation that can evolve with business needs.
Conclusion
Building real-time usage tracking without background jobs proved to be a successful architectural choice for AuthMint. The approach delivers immediate consistency, simplifies operations, and provides a better user experience.
The key is designing your data model and queries for real-time use from the start, leveraging PostgreSQL's strengths, and accepting the trade-offs that come with database-centric business logic.
For SaaS applications where accurate, real-time usage tracking is critical, this approach offers a robust alternative to traditional background job architectures.