Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERL-327 On zero-size files attempt to read until EOF #1524

Closed
wants to merge 8 commits into from

Conversation

kvakvs
Copy link
Contributor

@kvakvs kvakvs commented Jul 31, 2017

When read_file is invoked on a file with zero size, it will attempt to read 256kb 64kb blocks until EOF then stop and return the result. Assuming most of files in /proc/ are few kb in length, this typically should result in 1 allocation of 256kb 64kb and 1 following reallocation down to the real size.

@kvakvs kvakvs force-pushed the erl327-zero-size-read_file branch from f24a563 to e83c11a Compare July 31, 2017 16:03
datap = d->c.read_file.binp->orig_bytes + actual_size;

/* Read until we hit EOF (read less than requested) */
read_result = read((int) d->fd, datap, FILE_SEGMENT_READ);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use the efile_read(errInfo, flags, fd, buf, count, pBytesRead) wrapper instead of plain read(fd, p, size) since the former for example handles both retry on EINTR as well as uses the right call on Windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Will check it and submit soon.

/* [ERL-327] Some special files like /proc/... have reported size 0 */
void read_file_zero_size(struct t_data* d) {
size_t actual_size = 0;
while (1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer the for (;;) construct instead of while (1) since a compiler once warned me that 1 was a constant expression.

static void invoke_read_file(void *data)
{
struct t_data *d = (struct t_data *) data;
size_t read_size;
int chop;
DTRACE_INVOKE_SETUP(FILE_READ_FILE);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whitespace changes makes the diff harder to read

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix this in a separate commit. Then squash before merge.

EFILE_MODE_READ, &fd, &size))) {
goto done;
}
d->fd = fd;
d->c.read_file.size = (int) size;
if (size < 0 || size != d->c.read_file.size ||
! (d->c.read_file.binp =
! (d->c.read_file.binp =
driver_alloc_binary(d->c.read_file.size))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible that for the special case size == 0 pre-allocate one chunk and then in read_file_zero_size(d) not have to start with reallocating it from size 0 to one chunk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Will check it and submit soon.

size_t actual_size = 0;
while (1) {
ssize_t read_result;
char* datap = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to use d->c.read_file.orig_bytes and d->c.read_file.offset instead of actual_size and datap to reduce the number of variables and to make the code more similar to the non-0-size code (less places where the type could get wrong)

@RaimoNiskanen RaimoNiskanen added the testing currently being tested, tag is used by OTP internal CI label Aug 4, 2017
@RaimoNiskanen
Copy link
Contributor

Thank you for the updates!

One more small nitpick that I am sorry I missed the first time: we like use tests a'la if (size) for boolean values only, so if you test for not zero, please use if (size != 0). We think it increases readability. Two places.

@RaimoNiskanen
Copy link
Contributor

I see you also changed the chunk size to 64 kB. Great! We talked about 256 kB being on the fat side internally, but I did not remember having said that to you. We can talk more about the final chunk size later...

This kind of change has been up for discussion many times before and always been rejected. I do not remember the exact reasons, nor the exact proposals, and can not find any meeting protocols about it. So the final decision will have to wait until everybody with an opinion gets back from their vacations.

@kvakvs
Copy link
Contributor Author

kvakvs commented Aug 4, 2017

The chunk size only affects zero-file algorithm as my common sense told me 256kb is too much for possibly a ~3kb read. I did not modify the original FILE_SEGMENT_READ constant.

@RaimoNiskanen RaimoNiskanen added team:VM Assigned to OTP team VM enhancement labels Aug 4, 2017
@bjorng bjorng removed the testing currently being tested, tag is used by OTP internal CI label Aug 9, 2017
@@ -1290,6 +1340,11 @@ static void invoke_read_file(void *data)
}
/* Invariant: d->c.read_file.size >= d->c.read_file.offset */

if (d->c.read_file.size != 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

97e33f9 accidentally inverted this condition, it should be d->c.read_file.size == 0

read_file_zero_size sets d->again to 0 and doesn't close the file on its own, so jumping to chop_done will leak. goto close; instead?

Copy link
Contributor Author

@kvakvs kvakvs Aug 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed! I am committing a fix now (it all will be squashed anyway, so doesn't matter in the end)
C really needs some RAII and resource lifetimes 😄 (which it has in form of __attribute__((cleanup(my_function_name))) but that is Clang/GCC specific and isn't a generic C89)

@RaimoNiskanen
Copy link
Contributor

Now we are just missing some test cases. It would be simple to test if there is a /proc filesystem and then read some sure to find file. Also test to read it a few thousand times to ensure detection of leaking file descriptors.

@jhogberg jhogberg self-assigned this Sep 12, 2017
@jhogberg jhogberg added the testing currently being tested, tag is used by OTP internal CI label Sep 12, 2017

%% Fail if none of mentioned files exist in /proc, did we just get
%% a normal /proc directory without any special files?
?assertNotEqual([], Inputs1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the /proc/ files in "Inputs" are available on freebsd or solaris.

On freebsd I suggest you use /proc/curproc/cmdline.
On solaris all of the files in /proc/ seem to have the correct filesize, so not sure what we can do there. I suggest skipping them.

@garazdawi
Copy link
Contributor

I squashed this and merged it to master. Thanks!

@garazdawi garazdawi closed this Sep 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants