Hi Lars:

Thank you for your message. I have never written a marshalling program,
so this area is new to me. Of course, I don't have any measurements.

Your arguments, combined with Mitch's and Johan's, indicate that using
Java Serialization is the best bet.

Mitch: we do this for implementing Ridl.

Johan: what is the best design now that we use Serialization?
Can we use the CopyVisitor or do we need a SerializationVisitor?
Crista sends the strategy to the other side, too. Do we want that also?

In RMI the marshalling is fixed per class, as I remember it.
But we want to marshal objects of the same class differently,
based on a given strategy. How do we best get this variability?

Johan: can you please give a presentation on marshalling in Java
next week?

-- Karl

From lth@ccs.neu.edu Fri Oct 24 15:34:52 1997
To: Karl Lieberherr
cc: binoy@ccs.neu.edu, johan@ccs.neu.edu, dem@ccs.neu.edu, jxiao@vnet.ibm.com
Subject: Re: marshalling revisited
Date: Fri, 24 Oct 1997 15:34:43 -0400
From: Lars Thomas Hansen

>I plan to marshall non-circular and non-self-circular objects
>quite frequently. So I would like to have a good solution for them.
>If I need a circular object on the other side, I am willing to break the
>circles and relink them after transmission.

I note that the serialization algorithms used for circular objects are
not really different from those used for preserving simple acyclic
sharing. Are you willing to give up sharing too?

Frankly, I think your argument is roughly equivalent to the good old
"If I want garbage collection, I'll just implement it myself." If you
have non-trivial sharing relationships, you're going to have to
implement a variation of the full algorithm anyway.

In my as-always humble opinion, many programs have complicated sharing
relationships. This strikes me as particularly true for OO programs.

>With printing/parsing we can hope for a much better compression than
>serialization can get us.
This is an interesting conjecture; do you have any measurements to
support it? I am skeptical (although I'll be happy if it turns out to
be true).

For example, a reasonable external binary representation for a vector
of n integers uses n+1 (or n+2) words. A textual representation will
use at least 2 bytes per element for one-digit numbers (a digit and a
space or comma), but given the typical distribution of integer values
in programs I would assume values to be representable in an average of
3 or 4 characters, for 4 or 5 characters overall per element. Sure, you
can compress this, but the binary data might compress to the same
degree.

Furthermore, it seems to me that any argument about how we can use _no_
space for absent or "default-valued" elements by using suitable
grammars is only true for objects with many such elements.

The only way I can think of in which serialization seems to give you
worse results than ASCII is by including voluminous class information
in each serialized string. But then it's not really marshalling -- a
marshalling scheme should send the type information once, or not at
all.

--lars

From wand@ccs.neu.edu Sun Oct 26 07:36:40 1997
From: Mitchell Wand
To: Karl Lieberherr
Cc: Lars Thomas Hansen, binoy@ccs.neu.edu, johan@ccs.neu.edu, dem@ccs.neu.edu, jxiao@vnet.ibm.com
Subject: Re: marshalling revisited

I guess I missed the beginning of this thread. Why are you doing
marshalling in the first place? I assume it's for communication between
different machines or address spaces.

Even if communication is slow, it can't be much slower than a
print-read-parse cycle. (Lars, do you have a back-of-the-envelope
calculation on this?) Furthermore, print/read, as Lars and Johan point
out, lose information about sharing and circularity, which are both
crucial in an OO (as opposed to tree-style) data model.

Since Java is kind enough to offer this facility already, why not use
it instead of reimplementing an inferior substitute? Talk about
hematomas of duplication!

Just my $0.02.

--Mitch
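
Two points from the thread can be tried out with a few lines of Java: that Java Serialization keeps sharing and cycles intact across a round trip, and how a length-prefixed binary encoding of an integer vector compares in size with a decimal text encoding. The sketch below is only an illustration under assumptions. The class MarshalSketch, the Node type, and all method names are hypothetical and come from none of the messages. The in-memory byte buffer merely stands in for a network connection, and the size comparison assumes 32-bit words and single-space separators.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

public class MarshalSketch {
    /** Hypothetical node type, used only for this illustration. */
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node left;
        Node right;
        Node(String name) { this.name = name; }
    }

    /** Builds a graph with sharing (root reaches "shared" twice) and a cycle. */
    static Node buildGraph() {
        Node root = new Node("root");
        Node shared = new Node("shared");
        root.left = shared;
        root.right = shared; // acyclic sharing: two references to one object
        shared.left = root;  // cycle: shared points back to root
        return root;
    }

    /** Serializes to an in-memory buffer and reads the object graph back. */
    static Object copyViaSerialization(Serializable obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    /** Bytes for a length word followed by n 32-bit integers: 4 * (n + 1). */
    static int binarySize(int[] values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new IllegalStateException("in-memory write failed", e);
        }
        return buffer.size();
    }

    /** Bytes for the same values printed in decimal, separated by single spaces. */
    static int textSize(int[] values) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                text.append(' ');
            }
            text.append(values[i]);
        }
        return text.toString().getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        Node copy = (Node) copyViaSerialization(buildGraph());
        System.out.println("sharing preserved: " + (copy.left == copy.right));
        System.out.println("cycle preserved:   " + (copy.left.left == copy));

        int[] sample = {7, 42, 1997, 100000}; // arbitrary sample values
        System.out.println("binary bytes: " + binarySize(sample));
        System.out.println("text bytes:   " + textSize(sample));
    }
}
```

The identity checks in main compare references on the deserialized copy, so they show that the copy has the same shape as the original graph rather than a tree with duplicated nodes. The two size methods settle nothing by themselves: the outcome depends on the distribution of values, which is exactly the measurement Lars asks for.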