A Case Study on the Amazon Prime Video Audio/Video Monitoring Service Re-architecture
I recently read an article on the re-architecture of the audio/video monitoring service in Amazon Prime Video. Previously, the service was split into distributed components, with tasks connected using AWS Step Functions and AWS Lambda serverless components.
It all starts with a service that has three major processes:
– Media converter, which runs as an AWS Lambda function, converts input audio/video streams to frames and stores them in S3.
– Defect detectors, which also run as AWS Lambda functions, read the frames and audio buffers from S3 and analyze them in real time to identify defects. They send a real-time notification whenever a defect is found.
– Orchestration, which runs as an AWS Step Functions state machine, controls the flow within the service.
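The three stages above can be sketched in a few lines of Python. This is purely illustrative: the function names are hypothetical, the `bucket` dict stands in for the S3 bucket between stages, and the defect rule is a toy placeholder, not Amazon's actual logic.

```python
bucket = {}  # stand-in for the S3 bucket used between the two Lambdas

def media_converter(stream_id, stream):
    """Split an input stream into frames and store them (first Lambda)."""
    frames = [f"{stream_id}-frame-{i}" for i, _ in enumerate(stream)]
    bucket[stream_id] = frames
    return stream_id

def defect_detector(stream_id):
    """Read the frames back from storage and flag defects (second Lambda)."""
    frames = bucket[stream_id]
    return [f for f in frames if f.endswith("-3")]  # toy defect rule

def orchestrate(stream_id, stream):
    """Run the stages in order (the Step Functions state machine's job)."""
    key = media_converter(stream_id, stream)
    return defect_detector(key)

print(orchestrate("cam1", ["chunk"] * 5))  # → ['cam1-frame-3']
```

Note that every frame makes a round trip through `bucket`: written once by the converter, read once per detector. That round trip is exactly where the next section finds the cost.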
By removing this overhead and moving the processes that need to work together into one place, they were able to reduce their costs by over 90%. This reduction is all the more significant given how high the original cost was.
Working with videos and images can be expensive. In this case, the media converter splits videos into image frames and writes them to S3, while the detectors read those images back from S3. The load and cost of these read/write operations alone are enormous, not to mention the cost of the serverless components and S3 storage itself.
Do you see the problem here? Yup, S3 is a major overhead in this case.
To address this issue, they removed S3 from the middle and moved the entire process to one server (monolith). This way, all data transfers can be done in-memory, eliminating the cost of using S3 for this process.
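The monolith version of the earlier sketch is even simpler: frames flow directly from the converter to the detectors inside one process, with no object store in between. Again, all names and the defect rule are illustrative assumptions, not Amazon's code.

```python
def convert(stream):
    """Split the input into frames, yielding each one in-memory."""
    for i, _ in enumerate(stream):
        yield f"frame-{i}"  # never written to external storage

def detect(frames):
    """Flag defective frames (toy placeholder rule)."""
    return [f for f in frames if f.endswith("-3")]

def monitor(stream):
    # Converter and detectors run in the same process: the generator hands
    # each frame straight to the detector, with no network or storage hop.
    return detect(convert(stream))

print(monitor(["chunk"] * 6))  # → ['frame-3']
```

The design choice here is the generator: frames are produced and consumed one at a time in memory, so the per-frame S3 PUT and GET requests simply disappear from the bill.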
Another cost consideration is the use of AWS Step Functions and Lambda serverless components. AWS Step Functions, to put it simply, serve as a pipeline for executing specific tasks in a sequential manner. However, users are charged for each state transition, which occurs when moving to the next task after completing one. This can create bottlenecks when approaching transition limits. Since this service involved multiple state transitions per second, they quickly reached their account limits.
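A quick back-of-the-envelope calculation shows why per-transition billing adds up for a workflow that transitions several times per second. The rate and transition frequency below are assumed for illustration, not quotes of the article's figures or of current AWS pricing.

```python
# Assumed illustrative rate: dollars charged per state transition.
PRICE_PER_TRANSITION = 0.025 / 1000

transitions_per_second = 5          # assumed: "multiple transitions per second"
seconds_per_month = 30 * 24 * 3600  # 2,592,000

monthly_transitions = transitions_per_second * seconds_per_month
monthly_cost = monthly_transitions * PRICE_PER_TRANSITION
print(f"{monthly_transitions:,} transitions -> ${monthly_cost:,.2f}/month")
# → 12,960,000 transitions -> $324.00/month
```

And that is for a single monitored stream; multiply by the number of streams and the orchestration layer alone becomes a major line item, before any Lambda or S3 charges.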
Therefore, they moved the media conversion, detectors, and orchestration processes to one instance and scaled vertically instead of horizontally. Vertical scaling means adding more resources like memory, computation, or storage capacity to the server to increase performance, while horizontal scaling involves adding more servers to distribute the workload across multiple machines.
With the new design, the workload eventually exceeded the capacity of a single instance, so they cloned the service, with each copy running a different subset of the detectors. They also added an orchestration layer to distribute customer requests across these copies.
Here’s an example:
– The first service includes detector 1 and detector 2
– The second service includes detector 3, detector 4, and detector 5
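The sharding scheme above boils down to a routing table plus a lookup. A minimal sketch, with hypothetical shard and detector names matching the example:

```python
# Each cloned service owns a subset of detectors; the orchestration layer
# routes a request to whichever shard runs the requested detector.
SHARDS = {
    "service-1": ["detector-1", "detector-2"],
    "service-2": ["detector-3", "detector-4", "detector-5"],
}

def route(detector):
    """Return the service shard responsible for a given detector."""
    for shard, detectors in SHARDS.items():
        if detector in detectors:
            return shard
    raise KeyError(f"no shard runs {detector}")

print(route("detector-4"))  # → service-2
```

Note this is still vertical scaling within each shard: every shard is a full copy of the monolith, just responsible for fewer detectors.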
In conclusion, this is a specific use case where having different processes in one place makes more sense and saves a lot of money. However, this approach may not apply to other use cases. Every need is different, and so is the solution. Case studies like this help us learn different aspects of architecture and planning.
Understanding the specific requirements of an application will help us design a more appropriate architecture.