When building AuthMint, a usage-based billing platform, I faced a critical architectural decision: how to track and calculate user token consumption in real time without the complexity and potential failures of background job systems.
The Problem
Most billing systems rely on background jobs to:
- Calculate usage periodically
- Update user balances
- Check quota violations
- Generate billing data
This approach has several downsides:
- Delayed updates - Users don't see real-time usage
- Complexity - Job scheduling, failure handling, retries
- Race conditions - Between consumption and calculation
- Data inconsistency - Temporary states during processing
- Operational overhead - Monitoring job queues, handling failures
The Solution: Database-Level Real-Time Calculations
Instead of background jobs, I built the entire system using PostgreSQL functions that calculate everything in real time during each API call.
Core Architecture
The heart of the system is a single PostgreSQL function that handles token consumption:
CREATE OR REPLACE FUNCTION consume_tokens_v2(
    p_project_id uuid,
    p_external_user_id text,
    p_tokens_consumed integer,
    p_tag_id uuid DEFAULT NULL,
    p_metadata jsonb DEFAULT NULL
) RETURNS TABLE(
    success boolean,
    tokens_consumed integer,
    base_quota_used integer,
    grants_used integer,
    message text
)
This function performs several operations atomically (a simplified sketch follows this list):
- Validates availability - Checks quotas and grants
- Enforces restrictions - Daily/hourly limits, tag restrictions
- Consumes tokens - From quotas first, then grants (FIFO)
- Records audit trail - Complete transaction history
- Returns real-time state - Current balance and usage
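The production body is too long to reproduce here, but a heavily simplified sketch shows the shape of the core loop. Everything below is illustrative: consume_tokens_sketch is a made-up name, it drains grants only (no quotas, restrictions, or audit rows), and the tables match the schema described later in this post.

-- Simplified, illustrative version of the consume loop
CREATE OR REPLACE FUNCTION consume_tokens_sketch(
    p_user_id uuid,
    p_tokens integer
) RETURNS boolean
LANGUAGE plpgsql AS $$
DECLARE
    v_grant record;
    v_remaining integer := p_tokens;
    v_take integer;
BEGIN
    -- Walk unexpired grants oldest-expiry-first, locking each row so a
    -- concurrent call cannot spend the same grant twice
    FOR v_grant IN
        SELECT g.id,
               g.token_amount - COALESCE((
                   SELECT SUM(c.token_deduction)
                   FROM user_token_grants_consume c
                   WHERE c.user_token_grants_id = g.id
               ), 0) AS available
        FROM user_token_grants g
        WHERE g.user_id = p_user_id
          AND (g.expires_at IS NULL OR g.expires_at > NOW())
        ORDER BY g.expires_at ASC NULLS LAST
        FOR UPDATE OF g
    LOOP
        EXIT WHEN v_remaining = 0;
        CONTINUE WHEN v_grant.available <= 0;
        v_take := LEAST(v_remaining, v_grant.available);
        INSERT INTO user_token_grants_consume (user_token_grants_id, token_deduction)
        VALUES (v_grant.id, v_take);
        v_remaining := v_remaining - v_take;
    END LOOP;
    IF v_remaining > 0 THEN
        -- Insufficient balance: raise so the partial inserts above roll back
        RAISE EXCEPTION 'insufficient tokens for user %', p_user_id;
    END IF;
    RETURN true;
END;
$$;

Raising an exception on shortfall is what makes the operation all-or-nothing: PostgreSQL rolls back the partial consumption rows automatically.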
Token Sources & Priority
The system supports multiple token sources with a clear consumption priority:
-- For Subscription Plans:
1. Check tag restrictions (if applicable)
2. Use base quota first (if payment active)
3. Use grants (FIFO by expiry date)
-- For Pay-as-you-go Plans:
1. Check tag restrictions (if applicable)
2. Use grants only (FIFO by expiry date)
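The FIFO part boils down to an ORDER BY. Here is a minimal sketch of the grant ordering, assuming the user_token_grants table described below, with a nullable expires_at for grants that never expire:

-- Drain grants that expire soonest first; never-expiring grants go last
SELECT id, token_amount, expires_at
FROM user_token_grants
WHERE user_id = $1
  AND (expires_at IS NULL OR expires_at > NOW())
ORDER BY expires_at ASC NULLS LAST;

Spending soon-to-expire grants first minimizes the tokens a user loses to expiry.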
Restriction System
One of the most complex parts is handling flexible restrictions:
-- Example: GPT-4 tokens limited to 2000/day
INSERT INTO plan_restrictions (
    plan_id,
    restriction_type,
    restriction_config
) VALUES (
    uuid_plan_id,  -- placeholder for the plan's uuid
    'tag',
    '{
        "tag_name": "GPT-4",
        "max_tokens": 2000,
        "duration": "1 day"
    }'
);
The system calculates restriction periods dynamically:
- 1 hour → Current hour window
- 1 day → Current day window
- 1 week → Current week window
- 1 month → Current month window
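PostgreSQL's date_trunc makes these windows cheap to derive on every request. The CASE mapping below is my illustration of the idea, not necessarily the production code:

-- Derive the current restriction window from the configured duration
SELECT ps.period_start,
       ps.period_start + (r.restriction_config->>'duration')::interval AS period_end
FROM plan_restrictions r
CROSS JOIN LATERAL (
    SELECT date_trunc(
               CASE r.restriction_config->>'duration'
                   WHEN '1 hour'  THEN 'hour'
                   WHEN '1 day'   THEN 'day'
                   WHEN '1 week'  THEN 'week'
                   WHEN '1 month' THEN 'month'
               END,
               NOW()
           ) AS period_start
) ps
WHERE r.plan_id = $1
  AND r.restriction_type = 'tag';

A nice side effect of storing durations as strings like '1 day' is that the same value casts directly to a SQL interval when computing period_end.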
Implementation Details
Database Schema Design
Key tables that enable real-time calculations:
-- User token grants (FIFO consumption)
user_token_grants
├── token_amount (original grant)
├── expires_at (FIFO ordering)
└── grant_type (trial, bonus, purchase)
-- Grant consumption tracking
user_token_grants_consume
├── user_token_grants_id
├── token_deduction
└── created_at
-- Quota usage tracking
user_quota_usage
├── quota_id
├── period_start/period_end
└── tokens_used
-- Complete audit trail
token_transfers
├── amount (+ for grants, - for consumption)
├── transaction_type
└── metadata (full context)
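For concreteness, here is illustrative DDL reconstructed from the two grant tables above. The column types and constraints are my assumptions, not the exact production schema:

-- Reconstructed sketch; real types and constraints may differ
CREATE TABLE user_token_grants (
    id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),  -- PG 13+
    user_id      uuid NOT NULL,            -- references tracked_users
    token_amount integer NOT NULL CHECK (token_amount > 0),
    expires_at   timestamptz,              -- NULL = never expires
    grant_type   text NOT NULL             -- 'trial' | 'bonus' | 'purchase'
);

CREATE TABLE user_token_grants_consume (
    id                   bigserial PRIMARY KEY,
    user_token_grants_id uuid NOT NULL REFERENCES user_token_grants(id),
    token_deduction      integer NOT NULL CHECK (token_deduction > 0),
    created_at           timestamptz NOT NULL DEFAULT NOW()
);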
Real-Time Balance Calculation
The system calculates available balance on every request:
-- Available grants (FIFO by expiry)
SELECT SUM(
    GREATEST(0, utg.token_amount - COALESCE(consumed.total_consumed, 0))
) AS grants_available
FROM user_token_grants utg
LEFT JOIN (
    SELECT user_token_grants_id, SUM(token_deduction) AS total_consumed
    FROM user_token_grants_consume
    GROUP BY user_token_grants_id
) consumed ON utg.id = consumed.user_token_grants_id
WHERE utg.user_id = $1  -- parameterized user id
  AND (utg.expires_at IS NULL OR utg.expires_at > NOW());
Handling Concurrency
PostgreSQL's ACID properties handle concurrent token consumption safely:
- Atomic operations - Each consumption is a single transaction
- Isolation - Concurrent requests never see each other's uncommitted writes
- Consistency - Database constraints prevent overselling
- Durability - All changes are immediately persistent
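Concretely, the overselling guard can be as simple as a conditional UPDATE. This is a sketch rather than the exact production statement; $3 stands in for the quota's token limit:

-- Atomically add usage only if the quota still has room; two concurrent
-- requests can never both push tokens_used past the limit
UPDATE user_quota_usage
SET tokens_used = tokens_used + $2
WHERE quota_id = $1
  AND period_start <= NOW() AND NOW() < period_end
  AND tokens_used + $2 <= $3
RETURNING tokens_used;
-- Zero rows returned means the quota is exhausted (or no open period):
-- reject the consumption and roll back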
Performance Optimizations
Strategic Indexing
-- Fast user lookups
CREATE INDEX idx_tracked_users_project_external
ON tracked_users(project_id, external_id);
-- Grant expiry ordering (FIFO)
CREATE INDEX idx_user_token_grants_expires
ON user_token_grants(expires_at)
WHERE expires_at IS NOT NULL;
-- Usage period lookups
CREATE INDEX idx_user_quota_usage_user_period
ON user_quota_usage(user_id, period_start, period_end);
Query Optimization
- Efficient joins - Minimize data retrieval
- Early filtering - Use WHERE clauses effectively
- Proper aggregation - SUM/COUNT at database level
- Avoid N+1 queries - Batch related operations
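As an example of the last point, the balance check can fetch grant balance and quota usage in a single round trip instead of one query per source (a sketch over the tables above):

-- One round trip for both token sources
WITH grant_balance AS (
    SELECT COALESCE(SUM(
               GREATEST(0, utg.token_amount - COALESCE(c.total_consumed, 0))
           ), 0) AS available
    FROM user_token_grants utg
    LEFT JOIN (
        SELECT user_token_grants_id, SUM(token_deduction) AS total_consumed
        FROM user_token_grants_consume
        GROUP BY user_token_grants_id
    ) c ON c.user_token_grants_id = utg.id
    WHERE utg.user_id = $1
      AND (utg.expires_at IS NULL OR utg.expires_at > NOW())
),
quota AS (
    SELECT COALESCE(SUM(tokens_used), 0) AS used
    FROM user_quota_usage
    WHERE user_id = $1
      AND period_start <= NOW() AND NOW() < period_end
)
SELECT g.available AS grants_available, q.used AS quota_used
FROM grant_balance g, quota q;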
Results & Benefits
This architecture delivers several key advantages:
Immediate Consistency
- Users see real-time usage updates
- No eventual consistency delays
- Accurate quota enforcement
Simplified Operations
- No background job monitoring
- No queue management
- No retry logic needed
- Fewer failure modes
Performance Metrics
- Average response time: 50-100ms
- Concurrent requests: Handled safely
- Database load: Manageable with proper indexing
- Scalability: Limited by database performance
Developer Experience
// Simple API call gets real-time data
const result = await fetch('/api/v1/consume', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    user_id: 'user_123',
    tokens_to_consume: 500,
    tag: 'GPT-4'
  })
});

// Immediate response with current state
const { success, tokens_consumed, message } = await result.json();
Challenges & Trade-offs
Database Load
Real-time calculations put more load on the database, but this is manageable with:
- Proper indexing strategy
- Connection pooling
- Query optimization
- Database scaling (if needed)
Complex Business Logic
Moving logic to the database requires:
- SQL expertise for complex functions
- Careful testing of edge cases
- Version control for database schema
- Database-specific deployment strategies
Limited Language Features
PostgreSQL functions are less flexible than application code for:
- Complex data transformations
- External API calls
- Rich error handling
- Advanced logging
When to Use This Approach
This architecture works well when:
- ✅ Accuracy is critical - Financial/billing data
- ✅ Real-time updates required - User dashboards
- ✅ Simple deployment preferred - Fewer moving parts
- ✅ Database expertise available - Team comfortable with SQL
- ✅ Moderate scale - Database can handle the load

Avoid when:
- ❌ Very high scale - Database becomes a bottleneck
- ❌ Complex external integrations - Need application-level logic
- ❌ Team prefers application logic - Limited SQL experience
- ❌ Eventual consistency acceptable - Background jobs sufficient
Lessons Learned
Start with Real-Time
It's much easier to build real-time calculations from the beginning than to retrofit them later. The data model and query patterns are fundamentally different.
PostgreSQL is Powerful
Modern PostgreSQL can handle sophisticated business logic efficiently. Don't underestimate what you can accomplish with well-written functions and proper indexing.
Simplicity Wins
Sometimes the "simple" approach of calculating everything in real time is more robust than complex distributed systems with background jobs.
Test Thoroughly
Real-time systems need extensive testing for:
- Concurrent access patterns
- Edge cases in business logic
- Performance under load
- Data consistency scenarios
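One invariant check worth running during concurrent load tests, using the tables described earlier: no grant should ever be consumed beyond its original amount.

-- Any rows returned indicate a race or a logic bug
SELECT utg.id, utg.token_amount, SUM(c.token_deduction) AS consumed
FROM user_token_grants utg
JOIN user_token_grants_consume c ON c.user_token_grants_id = utg.id
GROUP BY utg.id, utg.token_amount
HAVING SUM(c.token_deduction) > utg.token_amount;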
Future Considerations
As the system scales, potential optimizations include:
- Read replicas for analytics queries
- Caching layers for frequently accessed data
- Database partitioning for large datasets
- Async processing for non-critical operations
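The append-only token_transfers audit trail would be the natural first candidate for partitioning. A hypothetical sketch of that direction (not part of the current schema):

-- Hypothetical: range-partition the audit trail by month
CREATE TABLE token_transfers_partitioned (
    amount           integer NOT NULL,   -- + for grants, - for consumption
    transaction_type text NOT NULL,
    metadata         jsonb,
    created_at       timestamptz NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);

CREATE TABLE token_transfers_2025_01
    PARTITION OF token_transfers_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');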
But the core real-time architecture provides a solid foundation that can evolve with business needs.
Conclusion
Building real-time usage tracking without background jobs proved to be a successful architectural choice for AuthMint. The approach delivers immediate consistency, simplifies operations, and provides a better user experience.
The key is designing your data model and queries for real-time use from the start, leveraging PostgreSQL's strengths, and accepting the trade-offs that come with database-centric business logic.
For SaaS applications where accurate, real-time usage tracking is critical, this approach offers a robust alternative to traditional background job architectures.