“Diagrams for Devs: Hulu’s Architecture Before and After Cassandra” was created by Pete Soderling, as part of Hakka Labs’ Cassandra Week.
Matt Jurik (Software Developer, Hulu) gave an excellent talk at Cassandra Day Silicon Valley about Hulu’s migration to Cassandra. The talk features awesome diagrams of Hulu’s architecture with a focus on the Hugetop service. Hugetop tracks how far each user has watched in a piece of content. Hulu has been able to scale this service to accommodate over 400 million monthly plays. Here are my favorite snapshots from the talk.
1. Hulu’s old architecture (MySQL)
“The old architecture was based on MySQL. As you can see, at the top, we have Hulu.com, devices, and other services – these aren’t exposed to the Internet. These are the primary three sources of web requests that come through Hugetop. Hugetop itself is really just a Python application. We use TFS to make sure that it’s async. As you can imagine, this service is one of our busier ones – there are lots of people watching videos, getting updates on where people are on videos, or processing requests for various traits on the progress indicator. There’s a lot of concurrency and we use Python to make sure it can handle an extremely high load… As for data stores, we used Redis and originally MySQL… The Redis shards are there for caching…”
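The read path Matt describes (a Python service, Redis shards used for caching, and a database behind them) resembles the common cache-aside pattern. Here is a minimal Python sketch of that idea, not Hulu’s actual code. Everything in it is an assumption made for illustration: the `ProgressStore` class, its method names, the CRC32-based shard choice, and the use of plain dictionaries in place of real Redis shards and the MySQL database.

```python
import zlib


class ProgressStore:
    """Hypothetical cache-aside store for playback progress.

    Plain dicts stand in for the Redis shards, and `database` (also a dict)
    stands in for the backing data store. None of this is Hulu's real API.
    """

    def __init__(self, num_shards, database):
        self.shards = [dict() for _ in range(num_shards)]
        self.database = database

    def _shard_for(self, user_id):
        # A stable hash keeps each user on the same shard across calls.
        index = zlib.crc32(user_id.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def get_position(self, user_id, video_id):
        """Return the saved position in seconds, or 0 if none is stored."""
        shard = self._shard_for(user_id)
        key = (user_id, video_id)
        if key in shard:
            # Cache hit: the database is not touched.
            return shard[key]
        # Cache miss: read from the database, then populate the cache.
        position = self.database.get(key, 0)
        shard[key] = position
        return position

    def set_position(self, user_id, video_id, seconds):
        """Write to the database first, then refresh the cached copy."""
        key = (user_id, video_id)
        self.database[key] = seconds
        self._shard_for(user_id)[key] = seconds
```

For example, after `store = ProgressStore(4, {})` and `store.set_position("alice", "v1", 120)`, a later `store.get_position("alice", "v1")` is served from the shard that holds Alice’s data. A real service would also need cache expiry and failure handling, which this sketch leaves out.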
2. The switch to C*
“We got rid of MySQL and the Python application that was directly talking to MySQL and we replaced it with the Cassandra Rust API…”
3. Hardware Considerations
“The complexity here is how Redis and Cassandra talk to each other…”
Watch Matt’s talk for a more detailed description of the technologies and strategies his team uses.