Research
Research Interests
Subsequence similarity matching over time series stream data
Time series data modeling, analysis, indexing, clustering, and prediction
Biomedical research
Bioinformatics
Web database system design
Summary of Past Research:
In the database area, my major research is on time series stream data modeling, analysis,
clustering, and prediction using subsequence similarity matching.
In biomedical research, I focused on tumor respiratory motion analysis, prediction, and
correlation discovery.
In bioinformatics, I mainly worked on computational biology and bioanalytical chemistry.
In web database system design, we have designed and implemented the CenSSIS web image
database system.
Research Experience:
Research Fellow, 03/2003 – present
Harvard Medical School, Boston, MA
- Proposed and implemented a new finite state automaton to simulate tumor
respiratory motion, which fits well with our natural understanding of breathing
phases. This model is suitable for both online and offline motion analysis.
The implementation of the model produces a piecewise linear representation of
raw data in an online fashion, which makes the model a convenient tool to be used
in real-time image guided radiation therapy.
- Designed and implemented a model-based tumor motion prediction approach for
image guided gated treatment, using statistics, probability, and database similarity
matching techniques, based on the finite state model.
- Characterized the spatio-temporal features of tumors and organs based on
time-resolved Computed Tomography (CT) images. Optimized the treatment angle based
on the associated probability density function.
Research Assistant, 01/2003 – present
CCIS, Northeastern University, Boston, MA
- Introduced a new event-driven subsequence matching approach over massive
financial data streams. Designed and proposed a simultaneous segmentation and
pruning approach for piecewise linear representation of the raw streams.
Defined a metric subsequence distance function based on a permutation and
performed subsequence matching over an up-to-date stream database.
- Proposed a framework for time series stream data modeling, analysis,
clustering, and result analysis. Considered the internal structure within the data
directly during similarity matching.
- Defined a new concept, subsequence stability, to adjust the length of a query
subsequence dynamically, and to assure that a query subsequence is a good
representative of the current moving condition.
- Defined a new model-based, multi-layer, weighted, and parametric sequence
distance function, which captures the relative importance of each source stream,
amplitude, frequency, and proximity in time. Based on subsequence similarity,
whole stream similarity is defined, which can be used for correlation discovery,
and dynamic clustering of streams.
- Assessed radiation damage, normal lung tolerance or radiobiology in Boron
Neutron Capture Radiation Therapy (BNCT) using non-invasive breathing pattern
variations. Implemented an approach to evaluate inter-session frequency changes
and intra-session amplitude changes.
Research Assistant, 01/2003 – present
CenSSIS, Northeastern University, Boston, MA
- Designed, implemented, and maintained the CenSSIS web image database system.
The system adopted a robust three-tiered web-accessible architecture and a
flexible data design, with a project-centered user interface, a powerful query
engine and a friendly hierarchical view.
- Incorporated an image-tagging package within the CenSSIS database system,
which enabled content-based image retrieval. The XML tags are stored, indexed
and queried as part of the database metadata.
Research Assistant, 08/1997 – 05/1999
University of Texas, Houston, TX
- Studied post-transcriptional regulation of mammalian gene expression,
including mRNA stability and translation. Validated hypotheses using DNA, RNA,
and protein analysis.