Hi Lars and Johan:

Our colloquium speaker on Wednesday taught us that communication is
very expensive. So I think printing and parsing will be fast enough
in this context.

I plan to marshal non-circular and non-self-circular objects quite
frequently, so I would like to have a good solution for them. If I
need a circular object on the other side, I am willing to break the
circles and relink them after transmission.

With printing/parsing we can hope for much better compression than
serialization can get us. As Johan says, we could use the CopyVisitor
with a traversal to cut down the objects before transmission, but I
think that leads to unnecessarily long transmissions.

Johan, do you still want to go ahead with Java serialization and then,
when we discover that we often marshal tree objects, add the
optimization later?

-- Karl

=========================

From johan@ccs.neu.edu Fri Oct 24 11:03:24 1997
From: Johan Ovlinger
To: lieber@ccs.neu.edu
CC: dem@ccs.neu.edu, jxiao@vnet.ibm.com
Subject: Re: marshalling revisited
Status: R

Strings? I dunno. Seems awfully slow and not very good for circular
data structures. Wouldn't it be better to use the copyvisitor (do we
have one of those?) to make a copy of the subgraph and then call
serialize on that copy? This keeps as much of the slow stuff in the
(hopefully) native implementation in the Java class libs as possible.
Perhaps this is what you meant by string?

From johan@ccs.neu.edu Fri Oct 24 11:22:08 1997

Learn from my bugs: circular objects (i.e., objects that contain, or
whose subparts contain, a self reference) are a hassle. Avoid them if
possible. They can be dealt with, but it is easier not to introduce
them to begin with.
Johan

---------------

From johan@ccs.neu.edu Fri Oct 24 13:22:31 1997
From: Johan Ovlinger
To: binoy@ccs.neu.edu
CC: lieber@ccs.neu.edu, dem@ccs.neu.edu, jxiao@vnet.ibm.com
Subject: Re: marshalling revisited

From the JDK docs:

Object Serialization extends the core Java Input/Output classes with
support for objects. Object Serialization supports the encoding of
objects and the objects reachable from them into a stream of bytes and
it supports the complementary reconstruction of the object graph from
the stream. Serialization is used for lightweight persistence and for
communication via sockets or Remote Method Invocation (RMI). The
default encoding of objects protects private and transient data, and
supports the evolution of the classes. A class may implement its own
external encoding and is then solely responsible for the external
format.

Further down:

Within a stream, the first reference to any object results in the
object being serialized or externalized and the assignment of a handle
for that object. Subsequent references to that object are encoded as
the handle. Using object handles preserves sharing and circular
references that occur naturally in object graphs. Subsequent
references to an object use only the handle allowing a very compact
representation.

Seems right up our alley, eh?

Johan

From lth@ccs.neu.edu Fri Oct 24 11:05:48 1997
To: Karl Lieberherr
cc: johan@ccs.neu.edu, dem@ccs.neu.edu, jxiao@vnet.ibm.com
Subject: Re: marshalling revisited
Date: Fri, 24 Oct 1997 11:05:42 -0400
From: Lars Thomas Hansen

>Binoy suggested that we use printing and parsing for marshalling.
>A discussion of an alternative approach is below. But the
>printing-parsing approach seems much better.

Some questions I like to ask about any marshaling algorithm:

(1) Does it preserve shared structure?
(2) Can it deal with circular/self-referencing objects?
(3) Is the marshaling algorithm (as opposed to the size of the sent
    data) a bottleneck?
For example, on a fast network among hosts where at most byte order is
different, it may take a while to print and then parse a bunch of
floating-point numbers; in particular, it may take longer than the
transmission itself. In these situations, fast representations should
be used when possible.

There was a paper in this year's (or last year's) PLDI on the
performance of marshaling and related issues in the context of CORBA
systems; I've only read the abstract so far, but it might be a
worthwhile read if one is looking for even moderate performance.

--lars
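
Questions (1) and (2) above, and the handle mechanism quoted from the
JDK docs, can be checked with a small round trip through
ObjectOutputStream and ObjectInputStream. What follows is only a
minimal sketch under stated assumptions: the names MarshalSketch, Node,
marshal, unmarshal, and roundTrip are hypothetical and do not come from
the thread, and an in-memory byte array stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: class and method names are illustrative only.
final class MarshalSketch {

    /** A tiny graph node (hypothetical); fields may form cycles or aliases. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        Node next;    // may point back to an earlier node, forming a cycle
        Node shared;  // may alias a node that is also reachable via next

        Node(String label) {
            this.label = label;
        }
    }

    /** Encodes root and everything reachable from it into a byte array. */
    static byte[] marshal(Serializable root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the object graph from bytes produced by marshal. */
    static Object unmarshal(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Marshals and unmarshals in one step; wraps checked exceptions. */
    static Node roundTrip(Node root) {
        try {
            return (Node) unmarshal(marshal(root));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;    // cycle: a -> b -> a
        a.shared = b;  // b is reachable from a along two paths

        Node copy = roundTrip(a);
        System.out.println("is a new object:   " + (copy != a));
        System.out.println("cycle preserved:   " + (copy.next.next == copy));
        System.out.println("sharing preserved: " + (copy.next == copy.shared));
    }
}
```

Each of the three checks should print true: the copy is a distinct
object graph, yet the cycle and the aliasing survive. The sketch says
nothing about question (3); measuring it would mean timing marshal and
unmarshal against a print-and-parse version on the actual data.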