``Patient was taking fluocinonide cream 1 bag p.o. from Jan 12 to May 15 this year X 3 q.d. until ready for d/c home. Before this, the patient had a 50-point hematocrit drop.''
If this piece of text is on lines 37 and 38 of a record, the corresponding annotation is:
m=``fluocinonide cream'' 37:3 37:5
do=``1 bag'' 37:6 37:7
mo=``p.o.'' 37:8 37:8
f=``X 3 q.d.'' 37:17 38:0
du=``from Jan 12 to May 15 this year ... until ready for d/c home'' 37:9 37:16 38:1 38:5
r=``50-point hematocrit drop'' 38:12 38:14
ln=``narrative''
Here ``m'' means ``medication name'', ``do'' means ``dosage'', ``mo'' means ``mode for the medication'', ``f'' means ``frequency to take the medication'', ``du'' means ``duration'', ``r'' means ``reason to take the medication'', and ``ln'' indicates whether ``the information for this medication is from a narrative or a list''. The content within the double quotes is the content of a specific field, and the numbers ``37:3 37:5'' are the offset of ``fluocinonide cream''.
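To make the annotation format concrete, here is a minimal Python sketch that parses one field line into its tag, content, and offset spans. It rests on assumptions not stated in the text: that the raw files use plain double quotes (the ``...'' quoting here being typesetting), that each offset is a line:token pair, and that offsets come in start/end pairs. The names Field and parse_field are hypothetical.

```python
import re
from typing import NamedTuple


class Field(NamedTuple):
    name: str    # field tag, e.g. "m", "do", "f"
    text: str    # content between the double quotes
    spans: list  # list of ((line, token), (line, token)) start/end pairs


# Assumed shape of one field: tag="content" L:T L:T [L:T L:T ...]
FIELD_RE = re.compile(r'^(\w+)="(.*)"((?:\s+\d+:\d+)*)\s*$')


def parse_field(line: str) -> Field:
    """Parse a single annotation field such as: do="1 bag" 37:6 37:7"""
    match = FIELD_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a field line: {line!r}")
    name, text, offsets = match.groups()
    # Each offset "37:6" becomes the tuple (37, 6).
    points = [tuple(int(n) for n in p.split(":")) for p in offsets.split()]
    if len(points) % 2:
        raise ValueError(f"offsets must come in start/end pairs: {line!r}")
    # Pair consecutive points; a discontinuous field yields several spans.
    spans = list(zip(points[0::2], points[1::2]))
    return Field(name, text, spans)
```

A discontinuous field such as the duration above would yield two spans, one per fragment, while the ``ln'' field would yield none.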
The challenges here are:
Because the number of ground truth files provided was very limited, we decided to extract information manually from the remaining records, five records per person per week. These records turned out to be extremely difficult to interpret by hand: we spent around three weeks finishing the first five records, with hundreds still remaining. I should mention that this experience is one of the main reasons I do research on the evaluation of IR systems where judgments are incomplete. During this period, I wrote a Python program that colors the content of the different fields in these medical records according to their ground truth data. This program helped us examine how well our system worked.
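The coloring idea can be sketched in a few lines of Python. This is not the original program, only an illustration under stated assumptions: tokens are whitespace-separated, an offset is a (line, token) pair with tokens counted from 0, and ANSI terminal colors are used for display. The palette and the function name colorize are hypothetical.

```python
# ANSI color per field tag; the palette is an arbitrary choice.
COLORS = {"m": "\033[31m", "do": "\033[32m", "mo": "\033[33m",
          "f": "\033[34m", "du": "\033[35m", "r": "\033[36m"}
RESET = "\033[0m"


def colorize(lines, annotations):
    """Wrap annotated tokens in ANSI color codes.

    lines: dict mapping line number -> text of that line
    annotations: list of (tag, (start_line, start_tok), (end_line, end_tok))
    Returns a dict mapping line number -> colored text.
    """
    tokens = {no: text.split() for no, text in lines.items()}
    for tag, start, end in annotations:
        color = COLORS[tag]
        for no in range(start[0], end[0] + 1):
            if no not in tokens:
                continue
            # A span may cross lines: the first line starts at start_tok,
            # the last line stops at end_tok, lines in between are whole.
            first = start[1] if no == start[0] else 0
            last = end[1] if no == end[0] else len(tokens[no]) - 1
            last = min(last, len(tokens[no]) - 1)
            for i in range(first, last + 1):
                tokens[no][i] = f"{color}{tokens[no][i]}{RESET}"
    return {no: " ".join(toks) for no, toks in tokens.items()}
```

Printing the returned lines in a terminal shows each field in its own color, which makes missing or misplaced spans easy to spot by eye.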
We divided our task into three parts. My part was to extract each medication's frequency. There are three basic categories: frequencies, like ``b.i.d'' and ``X 3 daily''; expressions that mean ``as needed'', like ``prn'' and ``as necessary''; and temporal phrases that specify when a medication should be taken, like ``after meal'' and ``at 4pm''. These may also be combined, as in ``x 3 a day after meal as needed''.
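The three categories can be illustrated with a small pattern-based tagger. This is only a sketch built from the examples quoted above, not the Medication Frequency Decision Algorithm itself; the regular expressions and the name categories are hypothetical and cover far fewer variants than real records contain.

```python
import re

# One illustrative pattern per category, derived only from the examples
# in the text; real records contain many more variants.
PATTERNS = {
    "frequency": re.compile(
        r"\b(?:[bqt]\.?i\.?d\.?|q\.?d\.?|x\s*\d+(?:\s+(?:daily|a\s+day))?)(?=\s|$)",
        re.IGNORECASE),
    "as_needed": re.compile(
        r"\b(?:p\.?r\.?n\.?|as\s+(?:needed|necessary))(?=\s|$)",
        re.IGNORECASE),
    "temporal": re.compile(
        r"\b(?:(?:after|before|with)\s+meals?|at\s+\d{1,2}(?::\d{2})?\s*(?:am|pm))\b",
        re.IGNORECASE),
}


def categories(phrase):
    """Return the names of all categories whose pattern occurs in phrase."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(phrase)]
```

Because the function returns every category that matches, a combined phrase is reported under all three labels rather than being forced into one.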
We developed a simple algorithm, which we called the Medication Frequency Decision Algorithm.
It is shown in Algorithms 1, 2, and 3.
The left string length, right string length, and span length are constants. UNITLIST contains most of the possible single-element frequency strings. All of these were obtained by analyzing the given ground truth data and the manually extracted data. We obtained the best results with one specific setting of these constants. Although we have not yet received results for larger sets of test data, our algorithm was very effective on the training set.