Needing to mine data in multiple warehouses and legacy systems, Capital One pulled off one of the largest digital transformations by migrating to Cassandra to achieve near real-time analytics.
450M rows consolidated from legacy systems
21K transactions per second
Best known for its credit cards and Samuel L. Jackson-hosted “What’s in your wallet?” commercials, Capital One is a diversified bank that offers a broad array of financial products and services to consumers, small businesses, and commercial clients. A Fortune 500 company, Capital One is one of the most recognized brands in the United States.
Capital One is a financial industry leader, and its applications generate huge amounts of data. Mining that data has become critical to making business decisions. Capital One was storing this data in multiple warehouses and databases and relying on traditional batch analysis for decision making. To make faster decisions, the company needed to process and analyze its data in near real time, so it set out to quickly build a new real-time reporting and analytics platform. “It’s a challenge that the business is going through from thinking about batch analysis to a near-real-time platform,” said Javed Roshan, Director of Data Services at Capital One.
A stringent service level agreement (SLA) requirement and the need to quickly process large amounts of data made Apache Cassandra a natural choice for Capital One. With Cassandra, Capital One got a distributed environment with multi-datacenter replication out of the box. The environment is also flexible: it handles the very fast writes Capital One required and offers tunable consistency, so different types of queries can trade consistency against speed. Furthermore, reliable scalability makes it easy to grow data clusters and support new features.
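To make the idea of tunable consistency concrete, here is a minimal sketch of the replica arithmetic behind Cassandra's `ONE`, `QUORUM`, and `ALL` consistency levels. It is plain Python and does not talk to a cluster. The function names and the replication factor of 3 are illustrative assumptions; the case study does not state which settings Capital One used.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Return how many replicas must acknowledge a request at this level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def read_overlaps_write(write_level: str, read_level: str,
                        replication_factor: int) -> bool:
    """True when every read is guaranteed to reach at least one replica
    that acknowledged the latest write (read + write replicas > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, for illustration only
    for write_level, read_level in [("ONE", "ONE"),
                                    ("QUORUM", "QUORUM"),
                                    ("ONE", "ALL")]:
        overlap = read_overlaps_write(write_level, read_level, rf)
        print(f"write={write_level:6} read={read_level:6} overlap={overlap}")
```

Under these assumptions, a write-heavy workload could write at `ONE` for speed and accept that some reads may briefly return older data, while queries that must see the latest value would pair `QUORUM` writes with `QUORUM` reads.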
The new platform kick-started Capital One’s big data intelligence by quickly incorporating a year’s worth of data (450 million rows) from a legacy Oracle database. That data was consolidated into far fewer rows (11.5 million), all handled on a six-node cluster at a rate of 21,000 transactions per second while meeting Capital One’s 99.99% uptime requirement. “We pulled in one year of data from Oracle and once we got it into Cassandra, it was a smooth ride and it was processing at a very high rate,” said Mukram Aziz, Senior Manager of Data Services at Capital One.