http://spark.incubator.apache.org/docs/latest/streaming-programming-guide.html
Unifying Batch and Stream Processing Models
- Spark program on Twitter log file using RDDs
    val tweets = sc.hadoopFile("hdfs://...")
    val hashTags = tweets.flatMap(status => getTags(status))
    hashTags.saveAsHadoopFile("hdfs://...")
- Spark Streaming program on Twitter stream using DStreams
    val tweets = ssc.twitterStream()
    val hashTags = tweets.flatMap(status => getTags(status))
    hashTags.saveAsHadoopFiles("hdfs://...")
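Both programs share the same flatMap step; only the source and the sink differ. The sketch below illustrates that idea without Spark, using plain Scala collections. It is not the code behind the snippets above: getTags here is a hypothetical stand-in that treats a status as a plain string and extracts hashtags with a regular expression, and the "stream" is simulated by splitting the input into small batches.

```scala
// Minimal sketch, no Spark dependency. Everything here is an assumption
// made for illustration: the real getTags works on Twitter status objects.
object HashTagSketch {
  // A hashtag is assumed to be '#' followed by one or more word characters.
  private val TagPattern = "#\\w+".r

  // Hypothetical stand-in for the getTags helper used in both programs.
  def getTags(status: String): List[String] =
    TagPattern.findAllIn(status).toList

  def main(args: Array[String]): Unit = {
    val statuses = List(
      "Trying #spark and #streaming",
      "no tags here",
      "#scala rocks"
    )

    // Batch style: the whole data set is available at once.
    val batchTags = statuses.flatMap(status => getTags(status))

    // Stream style: statuses arrive in small batches, and the same
    // flatMap is applied to each batch as it arrives.
    val streamTags = statuses
      .grouped(1)
      .flatMap(batch => batch.flatMap(status => getTags(status)))
      .toList

    println(batchTags)
    println(streamTags)
  }
}
```

Both results contain the same tags in the same order, which is the point of the comparison: the transformation logic does not need to change when the input moves from a stored file to a sequence of small batches.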