I have to create multiple reports every day from a Cassandra table, and over 24 hours the data can grow to close to a million rows.
I initially thought of running a single query, fetching the data with paging, keeping a local copy of the file, and uploading it to an FTP server. However, my application runs on a Kubernetes cluster, so writing local files is not recommended.
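For context, the paging approach I initially considered looked roughly like this (a sketch with the DataStax Java driver 4.x; the query and page size are placeholders, and the file-writing step is the part Kubernetes makes awkward):

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;
import com.datastax.oss.driver.api.core.cql.SimpleStatement;

public class PagedExport {
    public static void main(String[] args) {
        // Connection details come from the driver's application.conf.
        try (CqlSession session = CqlSession.builder().build()) {
            SimpleStatement stmt = SimpleStatement
                    .newInstance("SELECT id, value FROM reports.daily_events") // placeholder query
                    .setPageSize(5000); // rows fetched per page
            // Iterating the result set fetches subsequent pages transparently.
            for (Row row : session.execute(stmt)) {
                // append the row to a local file here -- the step that is
                // discouraged inside a Kubernetes pod
            }
        }
    }
}
```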
So instead I am trying to read the data out of Cassandra with a Flink job, stream it to an object store (Azure Blob Storage), and, once the upload is complete, transfer the file to an FTP server (a sketch of each step follows below).
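Here is a minimal sketch of the Flink job I have in mind, assuming the `flink-connector-cassandra` `CassandraInputFormat` for reading and the `flink-azure-fs-hadoop` plugin so Flink can resolve `abfs://` paths; the query, tuple schema, contact point, and storage account are all placeholders:

```java
import com.datastax.driver.core.Cluster;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
import org.apache.flink.batch.connectors.cassandra.CassandraInputFormat;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

public class CassandraToBlobJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Batch mode so the FileSink finalizes its part files when the bounded job ends.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        ClusterBuilder cluster = new ClusterBuilder() {
            @Override
            protected Cluster buildCluster(Cluster.Builder builder) {
                return builder.addContactPoint("cassandra-host").build(); // placeholder host
            }
        };

        // Placeholder query and schema; the real report would select the actual columns.
        CassandraInputFormat<Tuple2<String, Long>> source = new CassandraInputFormat<>(
                "SELECT id, value FROM reports.daily_events", cluster);

        DataStream<Tuple2<String, Long>> rows = env.createInput(
                source, TupleTypeInfo.getBasicTupleTypeInfo(String.class, Long.class));

        // Needs the flink-azure-fs-hadoop plugin on the cluster for the abfs:// scheme.
        FileSink<String> blobSink = FileSink
                .forRowFormat(new Path("abfs://reports@myaccount.dfs.core.windows.net/daily"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();

        rows.map(t -> t.f0 + "," + t.f1)  // render each row as a CSV line
            .returns(Types.STRING)
            .sinkTo(blobSink);

        env.execute("cassandra-to-blob-report");
    }
}
```

One thing I already noticed: the FileSink writes rolling part files, so the report may land in Blob Storage as several objects that would have to be merged (or concatenated during the FTP transfer) rather than as a single file.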
I'm not sure whether this will work; any suggestions or improvements on this design would be appreciated.
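For the final hop, I am thinking of streaming the finished blob straight to the FTP server so that no local file is needed. A rough sketch with the Azure Storage SDK and Apache Commons Net (the connection string, blob name, FTP host, credentials, and remote path are all placeholders):

```java
import com.azure.storage.blob.BlobClient;
import com.azure.storage.blob.BlobClientBuilder;
import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;

import java.io.InputStream;

public class BlobToFtp {
    public static void main(String[] args) throws Exception {
        BlobClient blob = new BlobClientBuilder()
                .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
                .containerName("reports")
                .blobName("daily/part-0-0") // placeholder blob name
                .buildClient();

        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com"); // placeholder host
        ftp.login("user", "password");  // placeholder credentials
        ftp.enterLocalPassiveMode();
        ftp.setFileType(FTP.BINARY_FILE_TYPE);

        // Stream the blob directly to the FTP server; nothing is written to local disk.
        try (InputStream in = blob.openInputStream()) {
            ftp.storeFile("/reports/daily.csv", in);
        } finally {
            ftp.logout();
            ftp.disconnect();
        }
    }
}
```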