Best practices

1. Batch Processing

✅ Recommended:

Retrieve 100 to 250 records at once and process them in batches. This approach significantly improves performance by reducing the number of API calls to the Feed API.

Benefits:

  • Reduces API call frequency
  • Improves throughput
  • Better resource utilization
  • Lower latency overall

❌ Avoid:

Do not retrieve just 1 record at a time. While it might seem simpler, single-event processing causes several performance issues.

Drawbacks:

  • High API call volume
  • Increased network overhead
  • Poor throughput
  • Risk of hitting rate limits

2. Polling Strategy

✅ Recommended:

Implement an adaptive polling strategy that responds to the hasNext flag in the API response. This approach balances responsiveness with resource efficiency.

Algorithm:

  1. Check the hasNext flag from the previous response
  2. If hasNext is true, immediately fetch the next batch without waiting
  3. If hasNext is false, wait 5-10 seconds before attempting to consume again
  4. Exponentially increase the wait time after several consecutive empty responses

Polling Scenarios:

  • When hasNext = true: Immediately retry with no wait to process data as quickly as it becomes available
  • When hasNext = false: Wait 5 seconds, then retry to avoid excessive polling when no data is available
  • 3+ consecutive no-data responses: Increase wait time to 10-15 seconds to reduce unnecessary API calls
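The scenarios above reduce to a small wait-time policy. This sketch maps `hasNext` and the count of consecutive empty responses to a delay; the exact ramp from 10 to 15 seconds is an illustrative choice, not part of the API contract:

```python
def next_wait_seconds(has_next: bool, consecutive_empty: int) -> float:
    """Return how long to wait before the next poll.

    Mirrors the scenarios above: fetch immediately while hasNext is
    true, wait 5 s on the first empty responses, and back off toward
    10-15 s after three or more consecutive empty responses.
    """
    if has_next:
        return 0.0  # more data is queued; poll again immediately
    if consecutive_empty >= 3:
        # Ramp from 10 s up to a 15 s cap as empty responses accumulate.
        return min(10.0 + 2.5 * (consecutive_empty - 3), 15.0)
    return 5.0
```

A consumer loop would call this after each response and `time.sleep()` for the returned value.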

3. Connection Management

✅ Recommended:

Implement connection pooling to improve resource utilization and overall performance. Connection reuse reduces the overhead of establishing new connections for each API call.

Connection Pool Configuration:

  • Reuse Connections: true - Keep connections open for reuse across multiple requests
  • Keep-Alive: true - Maintain active connections to the API server
  • Connection Timeout: 30 seconds - Maximum time to wait for a connection to be established
  • Read Timeout: 45 seconds - Maximum time to wait for a response from the server
  • Pool Size: 5-10 concurrent connections - Optimal number of concurrent connections to maintain
  • Benefit: Better resource utilization and improved performance through connection reuse
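The configuration table above might look like this in practice. This sketch assumes the third-party `requests` library; any HTTP client with a connection pool exposes equivalent knobs:

```python
# Sketch only: `requests` is an assumption, not mandated by the Feed API.
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()  # a Session reuses connections with Keep-Alive by default
adapter = HTTPAdapter(
    pool_connections=10,  # number of host pools to cache
    pool_maxsize=10,      # concurrent connections per host (5-10 recommended)
)
session.mount("https://", adapter)

# Timeouts are passed per request as (connect, read) in seconds:
# session.get(feed_url, timeout=(30, 45))
```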

4. Error Handling

✅ Recommended:

Implement comprehensive error handling that differentiates between temporary, permanent, and unknown errors. Each error type requires a different response strategy.

Temporary Errors (Retry with exponential backoff):

  • Status Codes: 408, 429, 500, 502, 503, 504
  • Action: Retry the request with increasing delays between attempts
  • Example: Connection timeouts, rate limits, temporary server issues

Permanent Errors (Log and escalate; do not retry):

  • Status Codes: 400, 401, 403, 404
  • Action: Log the error and escalate to your monitoring/alerting system
  • Example: Invalid requests, authentication failures, permission issues, resource not found

Unknown Errors (Log, retry once, then escalate):

  • Action: Log the error, attempt one retry, then escalate if it persists
  • Helps distinguish transient issues from permanent failures
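The three error classes above can be expressed as a small classifier plus a backoff helper. The status-code sets come straight from the lists above; the "full jitter" backoff variant is one common choice, not the only valid one:

```python
import random

TEMPORARY = {408, 429, 500, 502, 503, 504}  # retry with backoff
PERMANENT = {400, 401, 403, 404}            # log and escalate, never retry

def classify(status: int) -> str:
    """Map an HTTP status code to the handling strategy it requires."""
    if status in TEMPORARY:
        return "temporary"
    if status in PERMANENT:
        return "permanent"
    return "unknown"  # log, retry once, then escalate if it persists

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A retry loop would call `classify()` on each failure, sleep for `backoff_delay(attempt)` on temporary errors, and raise immediately on permanent ones.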

5. Data Persistence

✅ Recommended:

Implement checkpoint-based recovery to ensure your application can resume consuming events after unexpected shutdowns or failures.

Implementation Steps:

  1. Store Checkpoint: Save the last successfully processed eventId locally in your database or storage system
  2. Resume on Restart: When your application restarts, query the Feed API from the last checkpoint to avoid reprocessing old events
  3. Handle Duplicates: Implement idempotent processing logic to safely handle duplicate events if they occur
  4. Transactional Writes: Use database transactions to ensure data consistency between event processing and checkpoint updates

Benefit: Resilience against application crashes and network interruptions. Your system can recover and continue processing without losing or duplicating data.
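The four steps above fit naturally into a single database transaction. This sketch uses stdlib SQLite; the table names and the `eventId`/`payload` fields are illustrative assumptions about the event shape:

```python
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (event_id TEXT PRIMARY KEY, payload TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint "
        "(id INTEGER PRIMARY KEY CHECK (id = 1), last_event_id TEXT)")

def process_batch(conn: sqlite3.Connection, batch: list) -> None:
    """Store events and advance the checkpoint in ONE transaction.

    INSERT OR IGNORE on the primary key makes reprocessing idempotent,
    so duplicate deliveries after a crash are safe to replay.
    """
    with conn:  # commits on success, rolls back on any error
        for event in batch:
            conn.execute("INSERT OR IGNORE INTO events VALUES (?, ?)",
                         (event["eventId"], event["payload"]))
        conn.execute("INSERT OR REPLACE INTO checkpoint VALUES (1, ?)",
                     (batch[-1]["eventId"],))

def last_checkpoint(conn: sqlite3.Connection):
    """Read the resume point; None means start from the beginning of the feed."""
    row = conn.execute("SELECT last_event_id FROM checkpoint").fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
init_db(conn)
process_batch(conn, [{"eventId": "e1", "payload": "a"},
                     {"eventId": "e2", "payload": "b"}])
```

On restart, the consumer reads `last_checkpoint()` and passes it to the Feed API as the starting position, so nothing is reprocessed and nothing is skipped.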

6. Rate Limit Management

✅ Recommended:

Proactively manage rate limits by monitoring rate limit headers and adjusting your request frequency accordingly.

Rate Limit Practices:

  • Monitor Headers: Track the following response headers to understand your rate limit status:
    • x-ratelimit-limit - Your total rate limit
    • x-ratelimit-remaining - Requests remaining in the current window
    • x-ratelimit-reset - When the rate limit resets
  • Preemptive Slowing: Reduce your request frequency when remaining requests fall below 20% of your limit to avoid hitting the limit
  • Respect Retry-After: Always honor the Retry-After header value in 429 responses; this indicates the minimum time to wait before retrying
  • Implement Exponential Backoff: When you receive a 429 (Too Many Requests) response, implement exponential backoff to gradually increase the wait time between retries
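The header-driven practices above can be sketched as two helpers: one that triggers preemptive slowing at the 20% threshold, and one that picks the wait after a 429, preferring `Retry-After` over computed backoff:

```python
def should_slow_down(headers: dict, threshold: float = 0.2) -> bool:
    """True when remaining requests fall below `threshold` of the limit."""
    limit = int(headers.get("x-ratelimit-limit", 0))
    remaining = int(headers.get("x-ratelimit-remaining", 0))
    return limit > 0 and remaining < threshold * limit

def wait_after_429(headers: dict, attempt: int, base: float = 1.0) -> float:
    """Honor Retry-After when the server sends it; else exponential backoff."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # server-mandated minimum wait
    return base * 2 ** attempt
```

When `should_slow_down()` returns true, the consumer can simply lengthen its polling interval before the limit is actually hit.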

7. Data Validation

✅ Recommended:

Implement comprehensive data validation to ensure the integrity and consistency of events received from the Feed API.

Validation Checks:

  • Response Structure: Verify that all expected fields are present in each response. Missing fields may indicate API issues or version mismatches.
  • Event Type Validation: Ensure each eventType is in your known and supported types list. Reject or log unexpected event types for investigation.
  • Data Type Validation: Validate that field values match their expected data types (e.g., integers, strings, objects). Type mismatches may cause processing errors.
  • Timestamp Validation: Verify that timestamps are in valid ISO-8601 format and represent logical time progression. Timestamp issues can indicate data corruption.
  • Deduplication: Track processed eventIds to detect duplicate events. Maintain a deduplication cache or database index to identify and skip already-processed events.
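The checks above can be combined into one validator. The required field names and the event types here are illustrative assumptions, not the Feed API's actual schema; substitute your feed's real field list:

```python
from datetime import datetime
from typing import Any, Dict, List, Set

REQUIRED_FIELDS = {"eventId", "eventType", "timestamp", "payload"}  # assumed schema
KNOWN_TYPES = {"ORDER_CREATED", "ORDER_UPDATED"}  # example values only

def validate_event(event: Dict[str, Any], seen: Set[str]) -> List[str]:
    """Return a list of validation problems; an empty list means the event is usable."""
    problems: List[str] = []
    missing = REQUIRED_FIELDS - event.keys()
    if missing:  # structure check first; later checks need these fields
        problems.append(f"missing fields: {sorted(missing)}")
        return problems
    if event["eventType"] not in KNOWN_TYPES:
        problems.append(f"unknown eventType: {event['eventType']}")
    if not isinstance(event["eventId"], str):
        problems.append("eventId must be a string")
    try:  # ISO-8601 check; fromisoformat needs the Z suffix expanded
        datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        problems.append(f"bad timestamp: {event['timestamp']!r}")
    if event["eventId"] in seen:
        problems.append("duplicate eventId")
    else:
        seen.add(event["eventId"])  # dedup cache; use a DB index for durability
    return problems
```

In production the `seen` set would typically be a database index or a bounded cache rather than in-process memory, so deduplication survives restarts.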

8. Monitoring & Logging

✅ Recommended:

Implement comprehensive monitoring and logging to track the health and performance of your Feed API consumer.

Key Metrics to Track:

  • API Call Count: Monitor the total number of API calls made to the Feed API for capacity planning and cost estimation
  • Events Processed: Count the number of events successfully processed to measure throughput
  • Error Rate: Monitor the percentage of failed requests to identify reliability issues
  • Latency: Track request/response times to detect performance degradation
  • Data Lag: Monitor the time between event creation and processing to measure the freshness of your data

Logging Best Practices:

  • Log Level: Use INFO level for normal operations and ERROR level for failures and exceptions
  • Log Format: Include essential context in each log entry:
    • requestId - Unique identifier for tracking individual requests
    • feedId - Identifies which feed subscription the request is for
    • timestamp - When the event occurred
    • eventCount - Number of events processed in the request
  • Log Retention: Keep logs for a minimum of 30 days to support investigation and auditing of issues
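The log-format fields above lend themselves to structured (JSON) log lines, which are easy to search and aggregate. A minimal sketch with the stdlib `logging` module; the field names match the list above:

```python
import json
import logging

logger = logging.getLogger("feed-consumer")

def log_entry(request_id: str, feed_id: str, timestamp: str, event_count: int) -> str:
    """Emit one INFO-level JSON log line carrying the context fields above."""
    entry = json.dumps({
        "requestId": request_id,    # tracks the individual request
        "feedId": feed_id,          # which feed subscription
        "timestamp": timestamp,     # when the event occurred
        "eventCount": event_count,  # events processed in this request
    }, sort_keys=True)
    logger.info(entry)
    return entry

line = log_entry("req-123", "feed-42", "2024-01-01T00:00:00Z", 200)
```

Failures would go through `logger.error()` with the same fields plus the error detail, keeping both levels queryable with the same tooling.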