I have a file with a bunch of primes. For this purpose we can say that the contents of this file called "primes" are as follows:
2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,
I want to read the contents of this file (and do some other unrelated stuff later on).
My problem is that the way I am reading the file seems to have a lot of boilerplate and also appears very amateurish. This is what I have:
    const std = @import("std");
    const print = std.debug.print;
    pub fn main() void {
        print("This is the beginning of the program\n", .{});
        const file: std.fs.File = std.fs.cwd().openFile(
            "primes",
            .{}) catch | err | {
            print("An error occurred while opening file: {}\n", .{err});
            std.os.exit(1);
        };
        defer file.close();
        const reader = file.reader();
        var buff: [1024]u8 = undefined;
        while(reader.readUntilDelimiterOrEof(&buff, ',')) | number | { // THIS IS A PROBLEM LINE
            if(number == null) break; // AND SO IS THIS
            print("...running {s}\n", .{buff});
        } else | err | {
            print("An error occurred while opening file: {}\n", .{err});
        }
        print("Buffer: {s}\n", .{buff});
    }
It seems redundant and almost dumb to unpack the value into "number" while the buffer "buff" will also have the same contents, unless "number" is null (which will break the loop). I want to know if there is a more streamlined way to do this.
I am avoiding a return type of !void for the main function because I am trying to practice error handling and want to handle the errors myself.
Thank you in advance
There are some fundamental problems in the posted code, and I don't think that it works as OP expects. There is some "boilerplate" associated with handling errors and potential null values, but that is one of the prices you pay for working in a systems programming language. You can have less such boilerplate in C code, but not in robustly written C code.
Problems in the Posted Code
The OP code is printing the value of buff, which is an array of u8, instead of the value of number, which is what is returned by readUntilDelimiterOrEof. While this function does read into the buffer, it returns a slice formed from the contents of the buffer based on the number of bytes read. You want the slice if you intend to print those contents as a string.
The trouble is, and I suspect that this is where OP ran into problems, readUntilDelimiterOrEof doesn't return a simple slice, but rather an error union with an optional slice. The return value must be unwrapped (twice) in order to make use of its value.
A while loop with payload capture unwraps the error union first, and OP code essentially handles this correctly, although the error message seems misplaced. The payload captured by the while loop is an optional type. OP code correctly checks this against null and breaks from the loop when null has been returned. But to print the value contained in the optional, the optional must be unwrapped. You can do this with .{number orelse unreachable} in place of .{buff} in the printing code, or you can use the shorthand version of this: .{number.?}.
The OP code prints the contents of the buffer buff in the final line of the program. This is bad for two reasons. First, buff is not a slice and shouldn't be printed as a string. Second, if an error was encountered in the while loop this line attempts to print the contents of a possibly uninitialized array. The presence of this line, together with the large size of buff, makes me wonder if the OP believes that buff somehow accumulates the results as they are read in the loop. It does not; readUntilDelimiterOrEof simply reads into the buffer starting from the beginning each time.
OP says that they want to "do some other unrelated stuff later on". It isn't clear what this unrelated stuff is, but presumably they want to do some other work with the data read from the file. As it is written, the OP code does not persist any of the data read from the file. In the next section I suggest one possible approach to this problem.
Here is the simplest correction to the posted code:
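A minimal sketch of such a correction follows, reconstructed from the points listed above rather than taken from any authoritative listing. It assumes an older Zig standard library (roughly the 0.11 to 0.13 era), where file.reader() takes no arguments and readUntilDelimiterOrEof is available on the reader; newer releases have reworked the I/O interfaces, so the calls may need adjusting. It also uses std.process.exit in place of std.os.exit, on the assumption that the former is available in the version being used.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    const file = std.fs.cwd().openFile("primes", .{}) catch |err| {
        print("An error occurred while opening the file: {}\n", .{err});
        std.process.exit(1);
    };
    defer file.close();

    const reader = file.reader();
    var buff: [1024]u8 = undefined;

    // The while loop unwraps the error union; `number` is still an optional slice.
    while (reader.readUntilDelimiterOrEof(&buff, ',')) |number| {
        if (number == null) break; // null signals end of file
        // Unwrap the optional and print the returned slice, not the whole buffer.
        print("...running {s}\n", .{number.?});
    } else |err| {
        print("An error occurred while reading the file: {}\n", .{err});
    }
}
```

The only substantive changes from the posted code are printing number.? instead of buff, rewording the error message in the loop so that it refers to reading rather than opening, and dropping the final line that printed the whole buffer.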
You could write the loop in a slightly more verbose style with another payload capture to make the double unwrapping more explicit:
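As a sketch of that more explicit style, under the same assumptions as before (an older Zig standard library in which file.reader() takes no arguments and readUntilDelimiterOrEof exists), the optional can be unwrapped with an if payload capture. The name maybe_number is only illustrative.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    const file = std.fs.cwd().openFile("primes", .{}) catch |err| {
        print("An error occurred while opening the file: {}\n", .{err});
        std.process.exit(1);
    };
    defer file.close();

    const reader = file.reader();
    var buff: [1024]u8 = undefined;

    // First unwrapping: the while loop captures the payload of the error union.
    while (reader.readUntilDelimiterOrEof(&buff, ',')) |maybe_number| {
        // Second unwrapping: the if captures the payload of the optional.
        if (maybe_number) |number| {
            print("...running {s}\n", .{number});
        } else {
            break; // null signals end of file
        }
    } else |err| {
        print("An error occurred while reading the file: {}\n", .{err});
    }
}
```

This is a few lines longer than checking against null and using .? but each layer of the return type is handled in exactly one place.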
Reading the Whole File into a Buffer
It might be better to read the entire file into a buffer and work with that buffer instead of taking one piece at a time from the file. This is probably a more robust approach, and probably more performant.
The program below reads the contents of a file into a dynamically allocated buffer, and then iterates over those contents with an iterator provided by splitAny. One virtue of using splitAny is that you can provide it with a slice of delimiters; any of these delimiters indicates where the contents of the buffer should be split. In the case of the OP data file, it would be good to split not only on commas, but on spaces and newlines as well.
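A minimal sketch of this approach is shown here. It is an illustration under stated assumptions, not a definitive implementation: it assumes an older Zig standard library (roughly 0.11 to 0.14) in which std.fs.Dir.readFileAlloc takes an allocator, a path, and a maximum size, and in which std.mem.splitAny is available. The 1 MiB size limit and the choice of GeneralPurposeAllocator are arbitrary choices made for the example.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Read the whole file into a heap-allocated buffer (at most 1 MiB here).
    const contents = std.fs.cwd().readFileAlloc(allocator, "primes", 1024 * 1024) catch |err| {
        print("An error occurred while reading the file: {}\n", .{err});
        std.process.exit(1);
    };
    defer allocator.free(contents);

    // Split on commas, spaces, and newlines; any one of them ends a field.
    var it = std.mem.splitAny(u8, contents, ", \n");
    while (it.next()) |number| {
        // Adjacent delimiters (such as a trailing ",\n") yield empty fields.
        if (number.len == 0) continue;
        print("...running {s}\n", .{number});
    }
}
```

Because contents stays alive until the end of main, the slices returned by the iterator remain valid for any later work with the data. If empty fields are never of interest, std.mem.tokenizeAny may be a better fit, since it skips runs of delimiters instead of returning empty slices.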