Spark Streaming

Spark Streaming is a library built on top of the Spark Core framework that enables scalable, high-throughput, fault-tolerant stream processing. Because it builds on Spark Core, it can reuse many of the same APIs. Spark Streaming uses a micro-batch architecture: incoming data is divided into small batches at a fixed interval and each batch is processed as a short Spark job, which allows latencies as low as 1-2 seconds. Spark Streaming introduces a data abstraction called the Discretized Stream (DStream). A DStream represents a continuous stream of data as a sequence of batches; internally, each batch is an RDD, so DStreams are used in much the same way as RDDs. Examples of streams include Twitter tweets and messages in a messaging system.
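The micro-batch idea can be illustrated without Spark at all. The following is a minimal sketch in plain Python, not the Spark Streaming API: the function names `discretize` and `word_counts`, the sample `events`, and the one-second batch interval are all assumptions made for illustration. It groups timestamped messages into fixed-interval batches and then runs the same computation on each batch, which is roughly what happens to the batches of a DStream.

```python
from collections import Counter, defaultdict


def discretize(events, batch_interval):
    """Group (timestamp, value) events into micro-batches by time window."""
    batches = defaultdict(list)
    for timestamp, value in events:
        # Every event falls into the window that contains its timestamp.
        batches[int(timestamp // batch_interval)].append(value)
    # Return the non-empty batches in time order.
    return [batches[index] for index in sorted(batches)]


def word_counts(batch):
    """Per-batch computation: count the words in all messages of one batch."""
    return Counter(word for message in batch for word in message.split())


# Hypothetical sample stream: (arrival time in seconds, message text).
events = [
    (0.2, "spark streaming"),
    (0.9, "spark core"),
    (1.4, "micro batch"),
    (2.7, "spark"),
]

# Apply the same computation to every batch, one batch at a time.
results = [word_counts(batch) for batch in discretize(events, batch_interval=1.0)]

for index, counts in enumerate(results):
    print(index, dict(counts))
```

This sketch skips windows that received no events and processes everything on one machine. In Spark Streaming the batches are distributed and the per-batch work is expressed with RDD-style operations, but the shape of the computation is similar.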
