Error while processing a list of dictionaries in Cassandra through PySpark

I have the code below, which I am trying to execute:

prepared_statement = session.prepare("update table set somecol = ? where col = ?")
params = ([Row(key1=value1), Row(key2=value2), ..., Row(keyn=valuen)], value2)
batch.add(prepared_statement, params)

But the code fails when adding the statement to the batch. Below is the error I am receiving. Can someone please help with this?

  File "/mnt/yarn/<path>/pyspark.zip/pyspark/worker.py", line 619, in main
    process()
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/worker.py", line 609, in process
    out_iter = func(split_index, iterator)
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/rdd.py", line 2918, in pipeline_func
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/rdd.py", line 2918, in pipeline_func
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/rdd.py", line 2918, in pipeline_func
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/rdd.py", line 417, in func
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/rdd.py", line 933, in func
  File "/mnt/yarn/<path>/processor/python_file.py", line 35, in <lambda>
  File "/mnt/yarn/<path>/processor/python_file.py", line 111, in partition_method
    raise e
  File "/mnt/yarn/<path>/processor/python_file.py", line 103, in partition_method
    raise e
  File "/mnt/yarn/<path>/processor/python_file.py", line 72, in partition_method
    batch_statement.add(prepared_statement, params)
  File "cassandra/query.py", line 825, in cassandra.query.BatchStatement.add
  File "cassandra/query.py", line 504, in cassandra.query.PreparedStatement.bind
  File "cassandra/query.py", line 634, in cassandra.query.BoundStatement.bind
  File "cassandra/cqltypes.py", line 808, in cassandra.cqltypes._ParameterizedType.serialize
  File "cassandra/cqltypes.py", line 847, in cassandra.cqltypes._SimpleParameterizedType.serialize_safe
  File "cassandra/cqltypes.py", line 313, in cassandra.cqltypes._CassandraType.to_binary
  File "cassandra/cqltypes.py", line 808, in cassandra.cqltypes._ParameterizedType.serialize
  File "cassandra/cqltypes.py", line 1034, in cassandra.cqltypes.UserType.serialize_safe
  File "/mnt/yarn/<path>/pyspark.zip/pyspark/sql/types.py", line 1556, in __getitem__
    return super(Row, self).__getitem__(item)
IndexError: tuple index out of range
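
The last two frames show `UserType.serialize_safe` calling `Row.__getitem__`, which suggests the serializer is reading the Row by position and running past its end. Below is a minimal, driver-free sketch of that failure mode, assuming the user-defined type declares more fields than each Row carries. The `Row` here is a `namedtuple` stand-in for `pyspark.sql.Row` (also a tuple subclass), and `UDT_FIELDS`, `read_positionally`, and `pad_to_fields` are hypothetical names for illustration, not part of PySpark or the Cassandra driver.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row, which is likewise a tuple subclass.
Row = namedtuple("Row", ["key1"])

# Hypothetical UDT field order; the real names come from the table schema.
UDT_FIELDS = ["key1", "key2", "key3"]


def read_positionally(value, field_names):
    """Mimic a serializer that reads one item per declared field by position."""
    return [value[i] for i in range(len(field_names))]


def pad_to_fields(mapping, field_names):
    """Build a full-length tuple in field order, using None for absent fields."""
    return tuple(mapping.get(name) for name in field_names)


short_row = Row(key1="a")

# A one-field Row read against three declared fields fails on index 1.
try:
    read_positionally(short_row, UDT_FIELDS)
    outcome = "ok"
except IndexError as exc:
    outcome = str(exc)

# Padding to the full field list lets the positional read succeed.
padded = pad_to_fields(short_row._asdict(), UDT_FIELDS)
values = read_positionally(padded, UDT_FIELDS)
```

With a real `pyspark.sql.Row`, the equivalent of `_asdict()` would be `asDict()`. If the assumption holds, building each value with every UDT field present, in schema order, before calling `batch.add` may avoid the `IndexError`; whether that matches the actual schema here is not something the post confirms.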