Wednesday, August 26, 2009
A few days ago I implemented the vnode cache - it works, and the file system got faster. Now I'm working on file mappings.
While designing the mappings I found an interesting subject: if you have a page-aligned, in-memory file system, you can map its storage directly into the address space of a process.
This avoids wasting memory - you don't need to copy data from the file system storage into separate memory pages.
For example, init.fs is a flat (but not page-aligned) file system; if we provide a page-aligned file system instead, we can avoid this memory loss.
This feature is already implemented in some monolithic kernels, but I haven't seen it in microkernel, multiservice-based OSes.
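To illustrate the idea, here is a minimal C sketch, assuming the image is already loaded at fs_image_base and a hypothetical vm_map_phys_pages() call exists - none of these names are real Jari OS APIs:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Hypothetical kernel/service call: insert existing physical frames into a
   process's address space. Illustrative only, not a real Jari OS API. */
extern int vm_map_phys_pages(void *dest_vaddr, uintptr_t src_paddr,
                             size_t npages);

/* Map 'len' bytes of a file stored at 'file_off' inside a page-aligned
   file system image, without copying the data. */
static int map_file_direct(uintptr_t fs_image_base, size_t file_off,
                           size_t len, void *dest_vaddr)
{
    uintptr_t src = fs_image_base + file_off;

    /* Direct mapping only works if the file data itself is page aligned;
       otherwise we must fall back to the copying path. */
    if (src % PAGE_SIZE != 0)
        return -1;

    /* The file system and the process now share the same frames:
       no copy, no extra memory. */
    return vm_map_phys_pages(dest_vaddr, src,
                             (len + PAGE_SIZE - 1) / PAGE_SIZE);
}
```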
Wednesday, August 19, 2009
Vnode cache design
I found a development quest in Jari OS. It all starts with the design of the device driver services: some drivers are actually libraries that are linked into a service supporting some piece of hardware, so driver developers need dlopen()-family functions to link these libraries at runtime - for example, when the block service finds a specific host, its driver should be linked at runtime, not statically. The OS has no dlopen()-family functions yet; worse, dynamically linked ELF support is very early (it works, but it works slowly and burns a lot of the service's time slices). The reason is poor file mapping support: you can map a file, but when you close the file descriptor you get a segfault on the next page fault in the mapped area. This is a complex problem: to support dlopen(), ELF support must be improved, i.e. work with libraries and binaries should move out of the process manager service and stop being done via read/write calls - dlopen() support will be one of the benefits.
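For concreteness, this is the POSIX behavior in question: an mmap() mapping must survive close() of the descriptor, so the pattern below is valid on POSIX systems but is exactly what currently segfaults here. The library path is made up for the example:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/lib/libfoo.so", O_RDONLY); /* path is hypothetical */
    if (fd < 0)
        return 1;

    char *p = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return 1;

    close(fd); /* per POSIX, the mapping must stay valid after this */

    /* The page fault triggered by this access is where the current
       implementation segfaults, since the vnode backing the mapping
       may already have been destroyed. */
    printf("first byte: %d\n", p[0]);

    munmap(p, 4096);
    return 0;
}
```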
I know that one of the weak places of the file subsystem layer is libv2. It is a VFS-like layer, but implemented as a library linked into each file system (don't confuse it with the VFS service, which is the file system manager, resource storage, redirection mechanism and so on). libv2 already supports many POSIX features and operates on an abstract vnode_t (in Linux this is called an inode), but its vnode cache is too poor: every mapping is connected to a vnode, and if the cache decides to destroy that vnode, the mapping can be lost. One solution is map count handling, but that is very simple; a more complete solution is to build a more sophisticated cache.
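As a minimal sketch of that simple map count fix - the cache refuses to evict a vnode that still backs an open file or a mapping. The field and function names are mine, not the real libv2 layout:

```c
#include <stdbool.h>

typedef struct vnode {
    unsigned long open_count; /* open descriptors referring to this vnode */
    unsigned long map_count;  /* live memory mappings backed by this vnode */
    /* ... file-system-specific state ... */
} vnode_t;

/* The cache may destroy a vnode only when nobody opens or maps it. */
static bool vnode_is_evictable(const vnode_t *vn)
{
    return vn->open_count == 0 && vn->map_count == 0;
}
```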
Assume a file system under high load: you are constantly performing lookup, map, read and write operations. There are situations where a vnode has no positive map or open counter for a short time, and there are also rarely used vnodes - a vnode may be mapped at some point but then not accessed for a day, wasting data structures and memory while you cannot delete it at all. On the other hand, you cannot rely on the metadata of the real file system node, because lookup calls don't update its access time. So what could the solution be? First, I defined several vnode stages, each with its own features; second, I designed a table for vnodes that are mapped but not accessed - such an entry will not take as many resources as a regular vnode, but the vnode will still be virtually present.
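One possible shape for that table, assuming a vnode can be identified by a file system id plus a node number: each entry keeps just enough to revive the full vnode on the next access, so a mapped-but-idle vnode costs a few words instead of a whole vnode_t. The layout is a guess for illustration, not the libv2 design:

```c
#include <stdint.h>

/* An entry in the virtual vnode table: a mapped but long-unaccessed vnode
   shrunk down to its identity plus the counters needed to bring it back. */
struct virtual_vnode {
    uint64_t fs_id;      /* which file system the node lives on */
    uint64_t node_id;    /* node number within that file system */
    uint32_t map_count;  /* mappings that still reference the vnode */
    uint32_t open_count; /* usually 0 for entries parked here */
};
```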
The vnode stages might be the following (a code sketch follows the list):
- strong vnodes: not candidates for the virtual vnode list; they have positive open and mapped counts and were recently used
- potential oldie vnodes: only the mapped count or the opened count is positive, and they weren't recently used
- oldie vnodes: vnodes that live in the virtual vnode list
- dead vnodes: vnodes that were removed via an unlink() call
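Expressed as code, the stages could look like the enum below, with a classifier built from the counters; the exact predicates are my reading of the list above, not a fixed design:

```c
#include <stdbool.h>

typedef enum vnode_stage {
    VNODE_STRONG,          /* positive open and map counts, recently used */
    VNODE_POTENTIAL_OLDIE, /* only mapped or only opened, not recently used */
    VNODE_OLDIE,           /* parked in the virtual vnode list */
    VNODE_DEAD             /* removed via unlink() */
} vnode_stage_t;

/* Classify a vnode from its counters and usage bits. */
static vnode_stage_t vnode_classify(unsigned open_count, unsigned map_count,
                                    bool recently_used, bool unlinked,
                                    bool in_virtual_list)
{
    if (unlinked)
        return VNODE_DEAD;
    if (in_virtual_list)
        return VNODE_OLDIE;
    if (open_count > 0 && map_count > 0 && recently_used)
        return VNODE_STRONG;
    return VNODE_POTENTIAL_OLDIE;
}
```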
I'm not sure I will implement all of these features in the near future, but this functionality will be added to libv2 anyway - and I will continue my quest: improve the page cache, improve the memory events protocol, and design and implement a linker with dlopen() ;)