Recovery for Service-Oriented Applications

Project Award Number IIS-0533625

This material is based upon work supported by the National Science Foundation under Grant No. IIS-0533625. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Principal Investigator

Betty J. Salzberg
College of Computer and Information Science
Northeastern University
Boston, MA 02115
Phone: (617) 373-2229
Fax: (617) 373-5121
Email: salzberg@ccs.neu.edu
URL: http://www.ccs.neu.edu/home/salzberg

Keywords

recovery, logging, optimistic logging, pessimistic logging, server processes, reliability, performance, application fault tolerance

Project Summary

This project provides a recovery system for processes running methods at middleware servers. We use four types of logging: (1) pessimistic logging for communication with the outside world or between distinct service domains; (2) optimistic logging for communication among servers in the same service domain; (3) separate shared-variable logging; and (4) logging for communication between a server method and a DBMS. These four types of logging are integrated to provide recovery for server methods that guarantees exactly-once execution, good performance, and correct semantics.
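The split between pessimistic and optimistic logging in items (1) and (2) can be illustrated with a minimal sketch. This is not the project's implementation: the `MessageLog` class, the `send` function, the domain labels, and the convention that a receiver domain of `None` means "the outside world" are all assumptions made for illustration. The sketch assumes the usual meaning of the two terms: a pessimistic send forces the log to stable storage before the message leaves, while an optimistic send only buffers the record and lets a later force carry it to stable storage.

```python
from dataclasses import dataclass, field


@dataclass
class MessageLog:
    """Toy log: `stable` stands in for disk, `buffer` for volatile memory."""
    stable: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    forces: int = 0  # number of simulated forced writes to stable storage

    def append(self, record):
        self.buffer.append(record)

    def force(self):
        # Simulate a synchronous flush of all buffered records.
        if self.buffer:
            self.stable.extend(self.buffer)
            self.buffer.clear()
            self.forces += 1


def send(log, message, sender_domain, receiver_domain):
    """Log a message and force the log only when the policy requires it.

    Assumption: receiver_domain None means "the outside world".
    """
    log.append(message)
    pessimistic = receiver_domain is None or receiver_domain != sender_domain
    if pessimistic:
        log.force()  # the record is stable before the message leaves
    return "pessimistic" if pessimistic else "optimistic"


log = MessageLog()
modes = [
    send(log, "m1", "A", "A"),   # same service domain: buffered only
    send(log, "m2", "A", "A"),   # same service domain: buffered only
    send(log, "m3", "A", "B"),   # crosses domains: one force covers m1..m3
    send(log, "m4", "A", None),  # outside world: forced
]
print(modes, log.forces)
```

In this toy run, four messages cost two forced writes rather than four, which is the performance argument for logging optimistically inside a service domain.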

Publications and Products

Kanoulas, E. et al., "Derivation of the Tumor Position From External Respiratory Surrogates with Periodical Updating of External/Internal Correlation [abstract]", Medical Physics, vol. 33, (2006) p. 2232.

Kanoulas, E. et al., "Finding Fastest Paths on a Road Network with Speed Patterns", ICDE 2006 vol. 22, (2006) pp. 10-20.

Rui Wang, Betty Salzberg and David Lomet, "Log-Based Recovery for Middleware Servers", accepted in SIGMOD 2007.

Rui Wang, "Log-Based Recovery for Middleware Servers", Ph.D. thesis, Northeastern University, 2006.

Panfeng Zhou, "Querying Multidimensional Data and Spatio-Temporal Data with Non-Overlapping Access Methods", Ph.D. thesis, Northeastern University, 2006.

Project Impact

Human Resources: Several graduate students working on the Ph.D. degree in the College of Computer and Information Science at Northeastern University were supported by this project. Two of these students finished the Ph.D. in 2006 and are working in the database industry.

Education and Curriculum Development at all levels: Professor Salzberg teaches courses in Database systems and in Algorithms for undergraduate and graduate students. Materials for these classes are influenced by this research. In particular, Professor Salzberg presents recovery algorithms in database courses in more detail than can be found in most standard textbooks.

Industry Collaboration: This work is in collaboration with Dr. David Lomet at Microsoft Research in Redmond, Washington.

Goals, Objectives and Targeted Activities

The goal of this project is to provide algorithms and simulations which will enable recovery of server methods in spite of the nondeterminism resulting from message receiving, shared variables and interaction with DBMSs.

The targeted activities are:

1. Publications on server recovery.
2. Experiments and simulations measuring performance on a commercial web services platform.
3. Integration of logging and recovery of shared variables and interaction with DBMSs as well as effects of message passing.

Area Background

Although database management systems (DBMSs) provide recovery for data that is written to the database, applications that use this data lose all other state when the system fails. This project is a step towards making application state recoverable. Specifically, when an application calls a method run at a server, our algorithms can be used to make the state of that method recoverable. Thus server state, and not only the data in a DBMS, is protected against system failure. Reliability is an important property for applications, yet programmers and system managers must currently do a great deal of work to provide recovery on a case-by-case basis. Our work will allow programmers to call methods without having to provide recovery logic, and systems can be restarted with confidence that a consistent state will be recovered.
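One common way to make method state recoverable is to log the result of every nondeterministic event (a received message, a shared-variable read) and to replay those results after a failure. The sketch below illustrates that general idea only; it is not the project's algorithm, and `EventLog`, `server_method`, and the sample values are hypothetical names and data chosen for the example.

```python
class EventLog:
    """Replays logged results first, then falls back to live sources."""

    def __init__(self, records=()):
        self.records = list(records)
        self.pos = 0

    def read(self, source):
        if self.pos < len(self.records):
            value = self.records[self.pos]  # recovery: reuse the logged result
        else:
            value = source()                # normal run: nondeterministic read
            self.records.append(value)      # log it so it can be replayed
        self.pos += 1
        return value


def server_method(log, receive_message, read_shared):
    """Hypothetical method whose result depends on nondeterministic inputs."""
    total = 0
    for _ in range(3):
        total += log.read(receive_message)
    return total * log.read(read_shared)


# Normal execution: three messages arrive, then a shared variable is read.
inbox = iter([5, 7, 11])
first_log = EventLog()
original = server_method(first_log, lambda: next(inbox), lambda: 2)

# Simulated restart: the live sources would now return different values,
# but replaying the log reproduces the original state.
recovered = server_method(EventLog(first_log.records), lambda: 0, lambda: 99)
print(original, recovered)
```

If only a prefix of the log survives the failure, the same `read` call replays that prefix and then continues with live inputs. A real recovery system must also decide whether such a continuation is consistent with what other servers have already seen.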

Area References

Roger Barga and Shimin Chen and David Lomet, "Improving Logging and Recovery Performance in Phoenix/App", ICDE, 2004, pp. 486-497.

Roger Barga and David Lomet and Stelios Paparizos and Haifeng Yu and Sirish Chandrasekaran, "Persistent Component-Based Applications via Automatic Recovery", IDEAS, 2003, pp. 258-267.

Roger Barga and David Lomet and Gerhard Weikum, "Recovery Guarantees for General Multi-Tier Applications", ICDE, 2002, pp. 543-554.

Philip A. Bernstein and Meichun Hsu and Bruce Mann, "Implementing Recoverable Requests Using Queues", SIGMOD, 1990, pp. 112-122.

Om P. Damani and Ashis Tarafdar and Vijay K. Garg, "Optimistic Recovery in Multi-threaded Distributed Systems", SRDS, 1999, pp. 234-243.

E. N. Elnozahy and Lorenzo Alvisi and Yimin Wang and David B. Johnson, "A Survey of Rollback-Recovery Protocols in Message Passing Systems", ACM Comput. Surv., 2002, vol. 34, number 3, pp. 375-408.

David Lomet and Gerhard Weikum, "Efficient Transparent Application Recovery In Client-Server Information Systems", SIGMOD, 1998, pp. 460-471.

Jeff Napper and Lorenzo Alvisi and Harrick Vin, "A Fault-Tolerant Java Virtual Machine", IEEE Dependable Systems and Networks, 2003, pp. 425-434.

Michiel Ronsse et al., "Record/Replay for Nondeterministic Program Executions", Commun. ACM, vol. 46, number 9, 2003, pp. 62-67.

Robert E. Strom and Shaula Yemini, "Optimistic Recovery in Distributed Systems", ACM Trans. on Computer Systems, vol. 3, number 3, 1985, pp. 204-226.

Project Websites

The project website (this site) is at http://www.ccs.neu.edu/home/salzberg/soa.