Tuesday, October 18, 2011

Something more: about general system high-level design

Well, I want to repeat something again, but in more detail.
Today I want to introduce the intercommunication design of the
whole system.
Firstly, I want to say: IDL is a good idea in theory and a bad idea
in practice.
The main reason is its implementation. An IDL usually describes interfaces and,
in addition, generates some code that unpacks the request, calls the handler and, finally,
replies. I.e. receive, call and reply to the message, nothing more.
But in our system we have something more complex; let's list it:
  • IPC message forwarding
  • Postponed calls (blocking operations)
  • IPC message modification operations
Well, at that point you need something more featured than a receive - call - reply
cycle: you need to determine when you need to forward a message, at which
point you need to do it, which message parts are immutable, and which parts
should be modified and/or cut off.

The other "funny" thing is postponed messages; in this case the IDL-generated code
must contain a lot of stuff like quick memory allocation.
Finally, the generated code becomes huge, and in addition you need to modify
it in various cases.

Another problem is the problem with interfaces; I will try to describe it briefly.
For each instance you have an interface with a predefined set of functions;
many functions are similar or identical: for example, a device has a read() function,
and a file has a read() function, but you have one set of interfaces for the file system
and another set of interfaces for devices. That means you need to try to develop some
generic interface and extend it with your own functions again and again.
In practice you will end up with a big set of interface libraries, and it will be a real headache.

From the other point of view, you can resolve this problem on the libc client side:
just determine the resource type and call the specific function, i.e. POSIX read() in our case
will call read_file() or read_device() depending on the opened resource.
I think the last idea is very, very ugly. I don't think it's a good idea to change libc
every time you add something new to the system; also, we should be modular:
in some cases I can turn off support for pipes or sockets, and I don't want to recompile
libc and all the other system parts for that.

That's why IDL sucks: I spent a lot of time solving the problems with nontrivial things like
advanced message control, and I don't see the point. But the old one, uni_rpc_t, sucks too:
in practice uni_rpc_t creates more problems than it should solve.

And I found a pretty solution by coming at this problem from another direction.
Each time you design RPC/IDL/something else, you should think more closely
about the system and its objects. My error was in the way I tried to solve this problem: I thought
about RPC/IDL/etc. only, but you need to think about the system as well.
What I mean: the system can be presented as a set of objects hosted on different servers and
interacting with each other, and in this case there is no difference between a
file and a task. I.e. a file, a task, a device, an IPC object - each is a node with a predefined set of functions.
Well, yes, a task and a file have different operations, but all possible operations can
be represented via one set.
Anyway, you will have very similar operations; for example, changing a file's owner uid and changing
a task's effective uid are very similar: on the RPC level there is no difference between them.
What I decided: I decided to represent all objects (or their high-level representations) as nodes.
Each node has 2 groups of operations: the first one to control the node and manage its attributes,
the second one for data i/o (yep, a task won't have a data i/o interface, but we can add it if needed).
In that case RPC becomes simple: you have an RPC signature in every message,
and this signature points to the following things:
  • Function group
  • Function within the group
  • IDs to select the right node for the operation
Everything after the RPC signature goes to the implementation.

How are the ipcbox and sbuf abstractions used?
It's a good question with a brief and simple answer: ipcbox is used by the
RPC library routines to operate with IPC, and sbuf is used for data, because the implementation
gets the sbuf as its data; the RPC code doesn't allocate anything for the implementation,
just the sbuf.
Why is it pretty?
Because it's simple, and this solution doesn't require implementing a huge set of
APIs for each task, i.e. good old getvfspid(), getsomeothercrap() and so on.
Each task has a special system-reserved iolink; depending on the task's role (file system,
regular task, translator, resource carrier, etc.) its node has a set of operations.
For example, to make a fork() you just need to make a control request via the system
iolink to your node with fork()-related data; to change your effective uid you just
send a stat request to your node representation. But if you are a regular task (i.e. you don't have the rights
to link a file system onto the namespace tree), your node doesn't have an operation of that kind;
otherwise, you will be able to make a special call via your system iolink to do it.
Is it simple? I guess it's a very simple solution.

Friday, October 07, 2011

Jari OS RPCv7 and Vnode

I wrote before about why IDL failed. And I know that the old uni_rpc_t RPC failed too: it's not flexible, and it's ugly and slow.
I have been working on this system for about 5-6 years, and there have been 6 attempts to create a good IPC/RPC chain:
- v1: create a lot of structures and calls and implement them on the microkernel side
- v2: create one generic RPC on the microkernel side
- v3: remove it from the microkernel
- v4/v5: uni_rpc_t and related, trying to map all operations onto the standard POSIX calls
- v6: IDL
None of these approaches is applicable.
I guess it's all about the way you approach the problem. Each time I was trying to solve one problem, not the whole set of them.
But let's get back to my favorite method of problem solving: divide and solve.
When you have one big problem, it's better to divide it into many little ones and solve them one by one; in the end your big problem will be solved. Simple, isn't it?
And now we have a system with many objects and many problems, but ... make a guess: all operations within this complex system are requests to objects.
In that case, you just need to create one generic object that can represent anything you want, and provide a generic interface to it.
Well, welcome to RPCv7.
Usually you can divide all your operations into several groups: getting data, controlling the object, and so on. Also, in Jari OS you need to have a session with this object. That's all.
All your requests can be described with one generic header - not the data of the request, not some specific flags like it was done in the v4/v5 versions; the header describes the request only:
struct rpc_signature.
Everything else is not RPCv7 stuff; it might be your data, headers and so on.
That was the first small problem solved. The second one is object representation.
And I got it: the vnode. All things can be represented as a vnode; it can have a memory mapping, a set of specific operations, many IDs, relations ... and RPC callbacks, a standard set
of them.
All you need is to implement these functions and forget about the IPC/RPC headache.
And the carrier functions: all data goes via the sbuf, so just forget about the IPC/RPC chain.

Well, I will write more about this implementation later, but you can always check out the fresh sources.

Why IDL failed, or: sometimes you need to fail first to rise next

IDL for a microkernel multiservice system is a failure. No, not just because it wasn't implemented - going deeper, it is implemented.
Okay, let's try to understand why. But first: I spent a very long time designing and implementing an IDL for this purpose without any applicable results - that's my failure first, and the idea's failure second.
Usually you have the simple IDL case: take the message, parse it, call the server function, serialize the result and reply. It's common for the widely used client-server architecture, and yes, there are many already-implemented IDLs for this simple case. Most IDLs cover these needs.
You can tell me about the L4 IDL - you are welcome; I haven't seen anything serious developed with the L4 IDL, and I can tell you - it will suck.
My approach was more complex.
In the case of Jari OS services you need more than a receive - call - reply cycle. A request might be postponed, or changed and forwarded.
In a real system (not like the L4 ping-pong demos), to resolve a resource by name (symlinks and files) you always need to forward the request, because the symlink and the file might be located on different servers. Also, in a real system, blocking server threads is quite a poor idea.
The IDL concept was very simple before I added postponed and forwarded messages, but it wasn't so fatal as to make me abandon the idea.
The next problem surfaced later, like a time bomb: message modification.
In the end I got something huge, and neither flexible nor scalable.
An IDL implementation that supports these features is no simpler than some plain RPC interface.
Well, at this point I decided to stop. I spent about 1 year designing, researching, testing and trying it - and got an ugly solution.
However, in Jari OS you will be able to use an IDL with these features, but it will not be used in the base system.

IDL research implementation here - http://jarios.org/gitweb/?p=jari_os.git;a=summary , checkout idlc branch.