I am having a hard time visualizing what exactly a stream means in terms of IO. I imagine a stream as a continuous flow of data coming from a file, a socket, or any other data source. Is that correct? But then I get confused about how our Java programs react to a stream, because when we write Java code, let's say:
Customer getCustomer(Customer customer)
Doesn't the above Java code expect the whole object to be present before it gets processed? Now let's say we are reading from a stream, something like:
FileInputStream in = new FileInputStream("abc.txt");
in.read();
Doesn't in.read() expect the whole file to be present in memory before it is processed? If it does, then how is it a stream? Why do we call them streams? Do they process data as it is read?
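To make the question concrete, here is a minimal sketch of how such a read loop is usually written. As documented for InputStream, each call to read() returns only the next single byte (as an int from 0 to 255), or -1 once the end of the input is reached. The class name ReadLoopSketch and the helper countBytes are made up for illustration, and a ByteArrayInputStream stands in for the FileInputStream so the sketch runs without a file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadLoopSketch {
    // Hypothetical helper: pulls bytes from the stream one at a time and counts them.
    static int countBytes(InputStream in) {
        int count = 0;
        try {
            // read() hands back one byte per call; -1 signals end of stream.
            while (in.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            // Wrapped only to keep the sketch free of checked exceptions.
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in source (assumption); a FileInputStream would be consumed the same way.
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println(countBytes(in)); // prints 5
    }
}
```

The loop never holds more than one byte of the input at a time, which is the behavior the question is asking about.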
I have a similar confusion when reading about Hadoop streams; they seem to have a different meaning altogether.
From here:
A Stream is a free flowing sequence of elements. They do not hold any storage as that responsibility lies with collections such as arrays, lists and sets. Every stream starts with a source of data, sets up a pipeline, processes the elements through a pipeline and finishes with a terminal operation. They allow us to parallelize the load that comes with heavy operations without having to write any parallel code. A new package java.util.stream was introduced in Java 8 to deal with this feature.
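The source, pipeline, and terminal operation described in the quote could look roughly like the sketch below. The class name PipelineSketch, the method shortNamesUpper, and the sample names are invented for illustration; only the java.util.stream calls come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PipelineSketch {
    // Hypothetical helper: keeps names of at most 4 characters and upper-cases them.
    static List<String> shortNamesUpper(List<String> names) {
        return names.stream()                      // source: the list supplies the elements
                .filter(n -> n.length() <= 4)      // intermediate operation
                .map(String::toUpperCase)          // intermediate operation
                .collect(Collectors.toList());     // terminal operation: runs the pipeline
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("anna", "bartholomew", "carl");
        System.out.println(shortNamesUpper(names)); // prints [ANNA, CARL]
    }
}
```

The list itself is left unchanged; the stream only describes the processing. Swapping names.stream() for names.parallelStream() is the usual way to ask for parallel execution of the same pipeline.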
Streams adhere to the common Pipes and Filters software pattern. A pipeline is created with data events flowing through, with various intermediate operations being applied to individual events as they move through the pipeline. The stream is said to be terminated when the pipeline is ended with a terminal operation. Please keep in mind that a stream is expected to be immutable: any attempt to modify the underlying collection during the pipeline will typically raise a ConcurrentModificationException.
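The warning about modifying the collection can be tried out with a sketch like the following. It assumes an ArrayList as the source, whose spliterator is fail-fast on a best-effort basis, so other collection types may behave differently. The class name InterferenceSketch and the method modifyingSourceFails are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class InterferenceSketch {
    // Hypothetical helper: returns true if modifying the source inside the pipeline was rejected.
    static boolean modifyingSourceFails() {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            // The lambda adds to the same list the stream is reading from.
            numbers.stream().forEach(n -> numbers.add(n * 10));
            return false;
        } catch (ConcurrentModificationException e) {
            // ArrayList detects the structural change made during traversal.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyingSourceFails());
    }
}
```

Collecting the results into a new list, rather than adding to the source, avoids the problem.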