In Scala, we would write an RDD to Redis like this:
datardd.foreachPartition(iter => {
  val r = new RedisClient("hosturl", 6379)
  iter.foreach(i => {
    val (str, it) = i
    val map = it.toMap
    r.hmset(str, map)
  })
})
I tried doing the same in PySpark with datardd.foreachPartition(storeToRedis), where the function storeToRedis is defined as:
def storeToRedis(x):
    r = redis.StrictRedis(host='hosturl', port=6379)
    for i in x:
        r.set(i[0], dict(i[1]))
This fails with:

ImportError: ('No module named redis', function subimport at 0x47879b0, ('redis',))
Of course, I have imported redis.
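For context, this ImportError is typically raised on the worker nodes, not the driver: the function is pickled, shipped to each executor, and unpickled there, so the redis package must be importable on every worker as well (e.g. pip-installed on each node or shipped with --py-files). A minimal sketch of the partition writer under that assumption (hosturl is a placeholder from the question; hset with a mapping is the redis-py 3+ counterpart of the Scala hmset call):

```python
def store_to_redis(partition):
    # Import inside the function so the lookup happens on the worker
    # that runs this partition; redis-py must still be installed there.
    import redis

    # One connection per partition, created on the worker itself
    # (connections are not picklable, so never create them on the driver).
    r = redis.StrictRedis(host='hosturl', port=6379)
    for key, values in partition:
        # Store each record as a Redis hash, mirroring hmset in the Scala code
        r.hset(key, mapping=dict(values))

# Usage (sketch): datardd.foreachPartition(store_to_redis)
```

Importing inside the function is only a convenience for local testing; it does not remove the requirement that the package exist on every worker.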