Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor out potentially buggy fseek() call for getting file size #222

Open
saxbophone opened this issue Nov 14, 2018 · 2 comments
Open
Labels

Comments

@saxbophone
Copy link
Owner

Current code:

libsxbp/sxbp/utils.c

Lines 96 to 107 in 0b30978

/*
* private, works out and returns the size of the file referred to by the given
* file handle
*/
static size_t sxbp_get_file_size(FILE* file_handle) {
// seek to end
// NOTE: This isn't portable due to lack of meaningful support of `SEEK_END`
fseek(file_handle, 0, SEEK_END);
// get size
size_t file_size = (size_t)ftell(file_handle);
// seek to start again
fseek(file_handle, 0, SEEK_SET);

This is a generic function, provided by the library to ease reading of files into buffer objects.

Unfortunately, the current implementation makes use of fseek() using the SEEK_END argument, which according to the documentation:

Library implementations are allowed to not meaningfully support SEEK_END (therefore, code using it has no real standard portability).

This means that my current approach is non-portable, and although it appears to work on Linux, Mac and Windows, I don't want to take any chances!

Probably a solution which uses fstat() on Unix and the equivalent functions on Windows is the best approach, a lá this Stack Overflow answer: https://stackoverflow.com/a/238609/6177253

@saxbophone saxbophone added the bug label Nov 14, 2018
@saxbophone
Copy link
Owner Author

Easier alternative: ready bytes from file into the buffer, resizing the latter until the file is empty, or we run out of memory.

  • Pros:
    • No need to write OS-specific code
    • No need to call fseek(), which might incur a performance penalty for large files. Feels a strange way to get a file's size anyway.
  • Cons:
    • May end up allocating a load of memory before failing if someone tries to make it read from a special block file like /dev/zero or similar ones.
    • I'm going to have to write another function for 'resizing a buffer'. On the other hand, maybe this is a good thing to have anyway!
    • It's a bit annoying to have to keep on reallocating memory for something as it grows in size. How much memory do we increase it by each time? We'll have to shrink it back down to its actual size once we've read the whole file.

@saxbophone
Copy link
Owner Author

Further reading on why my current approach to getting a file's size is pretty lame:
https://wiki.sei.cmu.edu/confluence/display/c/FIO19-C.+Do+not+use+fseek%28%29+and+ftell%28%29+to+compute+the+size+of+a+regular+file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant