Engineering•March 29, 2025•By Atul Kumar
Detailed Workflow Document: Request Deduplication & Processing System
Parent Article: Request Deduplication & Processing System
Lodgement
1. Overview
This document describes the detailed workflow for the Request Deduplication and Processing System within the Fenix Platform for the lodgement process. It covers request submission, processing, error handling, and failure recovery mechanisms.
2. Workflow Steps
2.1 Step 1: Request Submission
- Client Request:
  - The client sends a request with a unique externalRequestId in the header.
- Orchestration Service:
  - The Orchestration Service checks DynamoDB (DDB) for the externalRequestId and its current status.
2.1.1 If Status is Success
- The service responds with a 400 error indicating that a duplicate request was found in the success state, and provides the fenixTxnId along with fenixErrorCode and metadata (EMPTY in case of success).
- The message will contain a note saying that this externalRequestId has already been processed.
2.1.2 If Status is Failed
- The service responds with a 400 error indicating that a duplicate request was found in the failure state, and provides fenixErrorCode and metadata (NON-EMPTY in case of failure). fenixTxnId will be null in this case.
- The message will contain a note saying that this externalRequestId has already been processed.
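The two terminal-state responses described in 2.1.1 and 2.1.2 can be sketched as follows. This is a minimal illustration, not the Fenix implementation: the function name `duplicate_response`, the response layout, the message wording, and the use of `None` and `{}` to represent EMPTY are all assumptions. Only the 400 status and the fenixTxnId, fenixErrorCode, and metadata fields come from the text above.

```python
from typing import Any, Dict


def duplicate_response(record: Dict[str, Any]) -> Dict[str, Any]:
    """Build the 400 response for a request already in a terminal state.

    `record` is a hypothetical DDB item with keys: externalRequestId,
    status ("success" or "failed"), fenixTxnId, fenixErrorCode, metadata.
    """
    status = record["status"]
    if status not in ("success", "failed"):
        raise ValueError(f"not a terminal status: {status}")

    succeeded = status == "success"
    return {
        "httpStatus": 400,
        "body": {
            "message": (
                f"Duplicate request: externalRequestId "
                f"{record['externalRequestId']} has already been processed "
                f"with status {status}."
            ),
            # fenixTxnId is only returned for successful requests.
            "fenixTxnId": record.get("fenixTxnId") if succeeded else None,
            # Error details are empty on success and populated on failure.
            "fenixErrorCode": None if succeeded else record.get("fenixErrorCode"),
            "metadata": {} if succeeded else record.get("metadata", {}),
        },
    }
```

Keeping both cases in one function makes it easy to see that the only differences between them are which fields are populated.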
2.1.3 If Status is Processing
- The service immediately calls CTMS to fetch the respective fenixTxnId.
- If no record is found in CTMS:
  - The externalRequestId is placed into a Dead Letter Queue (DLQ).
  - A 500 error is returned to the client, instructing them to stop retries and wait for a webhook response.
  - Fenix will investigate through on-call alerts. These alarms will improve our latency.
2.1.4 If Status is Pending
- Similar to the Processing state:
  - The service calls CTMS to fetch the fenixTxnId.
  - If no record is found, the externalRequestId is placed in the DLQ (to check why processing is delayed).
  - A 500 error is returned with a request to wait for the webhook.
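The handling of in-flight duplicates (Processing and Pending) can be sketched as below. The names `handle_in_flight`, `InFlightResult`, and `ctms_lookup` are hypothetical, and a plain list stands in for the DLQ. The document does not say what is returned when CTMS does have a record, so that branch deliberately leaves the HTTP status unset rather than guessing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class InFlightResult:
    fenix_txn_id: Optional[str]  # None when CTMS has no record
    sent_to_dlq: bool
    http_status: Optional[int]   # 500 when CTMS has no record, else unspecified


def handle_in_flight(
    external_request_id: str,
    status: str,
    ctms_lookup: Callable[[str], Optional[str]],
    dlq: List[Dict[str, str]],
) -> InFlightResult:
    """Handle a duplicate whose stored status is "processing" or "pending"."""
    if status not in ("processing", "pending"):
        raise ValueError(f"not an in-flight status: {status}")

    fenix_txn_id = ctms_lookup(external_request_id)
    if fenix_txn_id is None:
        # No CTMS record: park the ID for investigation and tell the client
        # to stop retrying and wait for the webhook.
        dlq.append({"externalRequestId": external_request_id, "status": status})
        return InFlightResult(None, True, 500)

    # CTMS knows the transaction. The response for this branch is not
    # specified in the document, so http_status is left unset.
    return InFlightResult(fenix_txn_id, False, None)
```

Recording the stored status alongside the ID in the DLQ entry lets the on-call engineer distinguish a stuck Processing request from a delayed Pending one.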
2.1.5 If No Record is Found in DDB
- This indicates a new request. The following steps occur:
  - Write { requestId, status: pending, metadata } to DDB.
  - Send the request to SQS for further processing.
  - Lambda consumes the SQS message and updates the DDB with the correct data.
  - The request is forwarded to CTMS for processing with requestId in the header.
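The new-request path can be sketched with in-memory stand-ins: a dict for the DDB table and a deque for SQS. The function names `submit_new_request` and `consume_one` are hypothetical, and what "the correct data" means for the Lambda update is not stated in the document, so the consumer here simply copies the message metadata onto the record as a placeholder.

```python
from collections import deque
from typing import Any, Deque, Dict


def submit_new_request(
    request_id: str,
    metadata: Dict[str, Any],
    table: Dict[str, Dict[str, Any]],
    queue: Deque[Dict[str, Any]],
) -> bool:
    """Record a new request as pending and enqueue it.

    Returns False when the request ID already exists, in which case the
    caller falls through to the duplicate-status branches instead.
    """
    if request_id in table:
        return False
    table[request_id] = {
        "requestId": request_id,
        "status": "pending",
        "metadata": metadata,
    }
    queue.append({"requestId": request_id, "metadata": metadata})
    return True


def consume_one(
    queue: Deque[Dict[str, Any]],
    table: Dict[str, Dict[str, Any]],
) -> Dict[str, str]:
    """Stand-in for the Lambda consumer.

    Updates the stored record (placeholder logic) and returns the headers
    to use when forwarding the request to CTMS.
    """
    message = queue.popleft()
    table[message["requestId"]]["metadata"] = message["metadata"]
    return {"requestId": message["requestId"]}
```

In this sketch the existence check and the write are two separate steps, which is only safe in a single thread. Against a real DynamoDB table, a conditional put (for example a condition expression using `attribute_not_exists`) would be one way to make the check and the write atomic, so that two concurrent submissions of the same ID cannot both be treated as new.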
2.2 Step 2: CTMS Processing
- CTMS checks DDB for the requestId and status.
2.2.1 If Status is Success
- CTMS responds with a 400 error to the Orchestration Service indicating the transaction is already processed.
- It returns the requestId, fenixTxnId, and the transaction status.
2.2.2 If Status is Failed
- CTMS responds with a 400 error indicating the transaction has failed.
- It provides the requestId, fenixTxnId as null, and transaction status as failed.
- Clients can retry with a new requestId.
2.2.3 If Status is Pending
- CTMS immediately updates the status to Processing.
- It then starts processing the request.
  - On successful completion, it updates DDB with the fenixTxnId and status=success.
  - On failure, it updates the DDB with status=failed.
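The status transitions for a pending request (pending, then processing, then success or failed) can be sketched as below. The function name `process_pending` and the `do_work` callback are hypothetical, a dict stands in for the DDB item, and treating any raised exception as a processing failure is an assumption made for illustration.

```python
from typing import Any, Callable, Dict


def process_pending(
    item: Dict[str, Any],
    do_work: Callable[[Dict[str, Any]], str],
) -> Dict[str, Any]:
    """Move a pending item through processing to success or failed.

    `do_work` is a hypothetical callback that performs the transaction and
    returns the resulting fenixTxnId, or raises on failure.
    """
    if item["status"] != "pending":
        raise ValueError(f"expected pending, got {item['status']}")

    # Mark the item as in progress before doing any work, so a concurrent
    # duplicate sees "processing" rather than "pending".
    item["status"] = "processing"
    try:
        fenix_txn_id = do_work(item)
    except Exception:
        # Assumption: any exception counts as a processing failure.
        item["status"] = "failed"
        item["fenixTxnId"] = None
        return item

    item["fenixTxnId"] = fenix_txn_id
    item["status"] = "success"
    return item
```

For example, `process_pending({"requestId": "req_123", "status": "pending"}, lambda item: "fenix_789")` leaves the item with `status` set to `"success"` and `fenixTxnId` set to `"fenix_789"`.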
3. Error Handling and Failure Recovery
3.1 SQS Message Update Failures
- If there is a failure while updating messages in SQS, the message will be sent to a DLQ.
- Alerts are configured to notify engineers through CloudWatch when a message reaches the DLQ.
- Failure while putting messages into SQS or DLQ will not halt the overall process.
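The "will not halt the overall process" behaviour can be sketched as a best-effort enqueue: try the main queue, fall back to the DLQ, and log rather than raise if both fail. The name `enqueue_best_effort` and the two sender callbacks are hypothetical stand-ins for the real SQS calls, and the returned labels are illustrative only.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("request_dedup")


def enqueue_best_effort(
    message: Dict[str, Any],
    send_to_queue: Callable[[Dict[str, Any]], None],
    send_to_dlq: Callable[[Dict[str, Any]], None],
) -> str:
    """Try the main queue, then the DLQ; never raise to the caller."""
    try:
        send_to_queue(message)
        return "queued"
    except Exception:
        logger.exception("queue send failed, falling back to DLQ")

    try:
        send_to_dlq(message)
        return "dead-lettered"
    except Exception:
        # Both sends failed: log and carry on so the request flow continues.
        logger.exception("DLQ send failed, message not enqueued")
        return "dropped"
```

Returning a label instead of raising lets the caller continue while still recording which path the message took.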
3.2 Dead Letter Queue (DLQ)
- DLQ is used for capturing failed messages.
- Alerts are configured to raise incidents whenever a request with Processing or Pending status is sent to the DLQ.
- Engineers can investigate via on-call.
4. Webhook Notifications
- Webhook responses will include the following fields:
  {
    "requestId": "req_123",
    "fenixTxnId": "fenix_789",
    "externalTxnId": "ext_456",
    "status": "success"
  }
- The webhook is sent once the request is successfully processed.
- In case of failure, the client is notified with the appropriate status and failure reason.
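A small builder for the webhook body might look like the sketch below. The success fields mirror the example payload above. The document does not name the field that carries the failure reason, so `failureReason` is a hypothetical name, as is the function `build_webhook_payload`.

```python
from typing import Any, Dict, Optional


def build_webhook_payload(
    request_id: str,
    external_txn_id: str,
    status: str,
    fenix_txn_id: Optional[str] = None,
    failure_reason: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the webhook body for a finished request."""
    if status not in ("success", "failed"):
        raise ValueError(f"unsupported status: {status}")

    payload: Dict[str, Any] = {
        "requestId": request_id,
        # A failed request has no Fenix transaction ID.
        "fenixTxnId": fenix_txn_id if status == "success" else None,
        "externalTxnId": external_txn_id,
        "status": status,
    }
    if status == "failed":
        # Hypothetical field name; the document only says a failure reason
        # is included.
        payload["failureReason"] = failure_reason
    return payload
```

Calling `build_webhook_payload("req_123", "ext_456", "success", fenix_txn_id="fenix_789")` reproduces the example payload shown above.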
5. Conclusion
- This system ensures deduplication and consistent state management using DDB.
- SQS and Lambda handle asynchronous workflows.
- DLQ and alerts enable effective failure recovery.
- Clients are provided with real-time responses or webhooks for transparency.
Next Steps:
- Implement the detailed design in the staging environment.
- Perform load testing to validate the workflow.
- Set up monitoring and alerting mechanisms.
- Provide integration guidelines to clients.