Provide external file (mmaped) representation #108
Comments
You can already do it with a bit of wrapper code. There is a There is no need for a special representation, once you have a There is no point in providing this sort of functionality in massiv directly, because mmap is very much OS-specific and I want to stay as OS-agnostic as possible. However, I might consider a helper package that does just this. Let me know how it goes if you do figure it out, or hit me up on gitter if you get stuck: https://gitter.im/haskell-massiv/Lobby
I'll keep this ticket open in case I find time to experiment with it and create such a package in the future.
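For readers following along, the "bit of wrapper code" could look roughly like the sketch below. It is only an illustration under assumptions: it uses the `mmap` and `vector` packages, neither of which is named in the surviving text of this comment, and `mmapVector` is a hypothetical helper name.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.Vector.Storable as VS
import Foreign.ForeignPtr (plusForeignPtr)
import Foreign.Storable (Storable, sizeOf)
import System.IO.MMap (Mode (ReadOnly), mmapFileForeignPtr)

-- | Hypothetical helper: view a whole file as a read-only storable vector.
-- The pages are loaded lazily by the OS; nothing is copied up front.
mmapVector :: forall a. Storable a => FilePath -> IO (VS.Vector a)
mmapVector path = do
  -- 'Nothing' maps the whole file. The result is
  -- (base pointer, byte offset of the data, size of the data in bytes).
  (base, offset, size) <- mmapFileForeignPtr path ReadOnly Nothing
  let elemSize = sizeOf (undefined :: a)
      -- Trailing bytes that do not fill a whole element are ignored.
      len = size `div` elemSize
  -- 'plusForeignPtr' takes a byte offset, so the vector itself starts at 0.
  pure (VS.unsafeFromForeignPtr0 (base `plusForeignPtr` offset) len)
```

A storable vector obtained this way can then be wrapped by massiv's storable-backed `S` representation. The exact conversion function depends on the massiv version, so it is left out here. The mapping is read-only, so the resulting vector must never be thawed and mutated in place.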
Thanks for the reply.
If you expect massiv to be used in numerical code (personally, I consider it my best bet for the project I'm currently planning), you should keep in mind that such code often has to deal with data sets exceeding available RAM by orders of magnitude. If we construct a massiv array representing such data the way you described, its cost model for various access patterns would differ from that of purely in-memory arrays. In addition, specialized prefetch calls are available for mmapped files. Given that, it makes sense to have specialized algorithms for an mmapped representation.
@permeakra I didn't say this functionality isn't useful. I said that it should not be implemented in
It makes sense to have a new representation to account for different usage patterns, I certainly agree with that, but one way or another it will have to be a representation that is a wrapper around This is how I would implement this representation:
I have two use cases in mind. The first is (limited) persistence that allows access to the raw data. The second is working with datasets that exceed memory size by several orders of magnitude. In both cases it might be desirable to allow several arrays in a single file. Interaction with madvise could probably be of use as well.
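To make the "several arrays in a single file" idea concrete, here is a minimal sketch that maps only a byte range of a file as its own vector. It assumes the `mmap` and `vector` packages; `mmapSlice` and the file layout are hypothetical, and madvise is not covered.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Int (Int64)
import qualified Data.Vector.Storable as VS
import Foreign.ForeignPtr (plusForeignPtr)
import Foreign.Storable (Storable, sizeOf)
import System.IO.MMap (Mode (ReadOnly), mmapFileForeignPtr)

-- | Hypothetical helper: map @len@ elements that start at byte @start@ of
-- the file. Calling it with different ranges yields independent read-only
-- views of arrays stored side by side in one file.
mmapSlice :: forall a. Storable a => FilePath -> Int64 -> Int -> IO (VS.Vector a)
mmapSlice path start len = do
  let bytes = len * sizeOf (undefined :: a)
  -- The mapping is page-aligned internally; the returned byte offset points
  -- at the requested start within the mapped region.
  (base, offset, _size) <- mmapFileForeignPtr path ReadOnly (Just (start, bytes))
  pure (VS.unsafeFromForeignPtr0 (base `plusForeignPtr` offset) len)
```

With a layout such as a first array of `n` doubles followed by a second one, the second array would be `mmapSlice path (fromIntegral (n * 8)) m`. The start offset should respect the element type's alignment, and where each array begins has to be recorded somewhere, for example in a small header; that part is left open here.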