You can create a ConstantInputDStream with the CassandraRDD as its input. ConstantInputDStream provides the same RDD on each streaming interval, and executing an action on that RDD triggers materialization of the RDD lineage, so the query runs against Cassandra on every batch.
Make sure the data being queried does not grow unbounded, otherwise query times will keep increasing and destabilize the streaming job (see the bounded-query sketch after the example).
Something like this should do the trick (using your code as a starting point):
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.ConstantInputDStream
import com.datastax.spark.connector.streaming._ // adds cassandraTable to StreamingContext

// `conf` is the SparkConf from your existing setup
val ssc = new StreamingContext(conf, Seconds(10))

val cassandraRDD = ssc.cassandraTable("mykeyspace", "users")
  .select("fname", "lname")
  .where("lname = ?", "yu")
val dstream = new ConstantInputDStream(ssc, cassandraRDD)
dstream.foreachRDD { rdd =>
  // any action will trigger the underlying Cassandra query; collect is used here for simple output
  println(rdd.collect.mkString("\n"))
}
ssc.start()
ssc.awaitTermination()
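To keep each interval's scan bounded, one option is to push a narrower predicate down to Cassandra. The following is only a minimal sketch of that idea: `signup_day` is a hypothetical clustering column assumed for illustration, not part of your actual schema.

// Minimal sketch: bound the per-interval scan with an extra predicate.
// `signup_day` is a hypothetical clustering column assumed here so the
// clause can be pushed down to Cassandra; adapt it to your schema.
val boundedRDD = ssc.cassandraTable("mykeyspace", "users")
  .select("fname", "lname")
  .where("lname = ? AND signup_day = ?", "yu", "2015-06-01")

val boundedStream = new ConstantInputDStream(ssc, boundedRDD)

Also note that collect pulls the whole result set to the driver, which is fine for a demo; for real workloads prefer an action such as count, or save the RDD out, so the driver does not become the bottleneck.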