Moving Data [Spark Streaming - simple data files]
Spark Streaming a local directory

Introduction

For our streaming exercise, we will create two classes, one of which is a utility class that initially only handles the log4j properties for our application. We will compile our classes manually and use spark-submit to run our application.

Note: For this example, the logging class is defined inline, in the same source file as our main class. This lets you get to grips with the dynamics of Spark Scala programming gradually and build up your knowledge as you go.

We will not use the "spark/bin/run-example" script that most examples use, as I want you to understand, step by step, what you are doing. Doing it this way also makes it easier for you to alter your code.

Use Case

We are creating a streaming Apache Spark Scala program that watches a directory for new files and counts the number of words in them.

Example

Create our class FileStream.scala (which contains the StreamingUtility class), which will for this e...