Learning and Using POMDP models of Patient-Caregiver Interactions During Activities of Daily Living
Older adults living with cognitive disabilities (such as Alzheimer's disease or other forms of dementia) have difficulty completing activities
of daily living (ADLs). They forget the proper sequence of tasks that need to be completed, or they lose track of the steps
that they have already completed. The current solution is to have a human caregiver assist the patient at all times,
prompting them through tasks or reminding them of their situation. The dependence on a caregiver
is difficult for the patient, and can lead to anger and helplessness, particularly for private ADLs
such as using the washroom (LoPresti et al., 2004).
A cognitive orthosis is a system that automates this caregiving process, in order to provide alternative solutions for
patients and to reduce caregiver burden (LoPresti et al., 2004).
Such systems would be able to non-invasively monitor the patient, stepping in to provide help in the form of verbal or visual prompts when necessary,
and ensuring the health and safety of the patient. Computer vision is an ideal sensing modality for such a task
because it is non-invasive and can generalise across tasks. This is in contrast to more invasive and interactive
monitoring tools such as bracelets, specialised sensors, or call devices, which may require the patient to ask for help, may need to be
carried or attached to the patient, and may need to be re-engineered for each task.
The ultimate goal of a computer-vision based cognitive orthosis for assisting dementia patients
during ADLs is to choose a prompting strategy
that maximises some notion of utility over the possible outcomes given visual observations of the patient.
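
As a minimal sketch of this decision rule, assume a discrete set of task states, a belief (a probability distribution over those states) derived from the visual observations, and a utility table. The state names, prompt names, and numbers below are hypothetical and chosen only for illustration; this one-step choice also ignores future outcomes, which a full POMDP policy would take into account.

```python
# Illustrative sketch only: states, prompts, and utilities are hypothetical.

def best_prompt(belief, utility):
    """Return the prompt with the highest expected utility under `belief`.

    belief:  dict mapping task state -> probability (values sum to 1)
    utility: dict mapping prompt -> dict mapping task state -> utility
    """
    def expected(prompt):
        # Expected utility of issuing `prompt`, averaged over the belief.
        return sum(belief[s] * utility[prompt][s] for s in belief)

    return max(utility, key=expected)


# Hypothetical belief after observing the patient at the sink.
belief = {"hands_dirty": 0.7, "hands_soapy": 0.2, "hands_clean": 0.1}

# Hypothetical utilities: prompting helps when it matches the state
# and is penalised when it is unnecessary or misleading.
utility = {
    "no_prompt":       {"hands_dirty": -1.0, "hands_soapy": 0.0,  "hands_clean": 1.0},
    "prompt_use_soap": {"hands_dirty": 1.0,  "hands_soapy": -0.5, "hands_clean": -1.0},
    "prompt_rinse":    {"hands_dirty": -0.5, "hands_soapy": 1.0,  "hands_clean": -1.0},
}

chosen = best_prompt(belief, utility)
```

With these assumed numbers the belief is concentrated on dirty hands, so the soap prompt has the highest expected utility.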
Our research focusses on learning models of ADL behavior.
The principal benefit of the model we describe is that it does not
require patient behaviors to be labeled in video sequences.
The learning method discovers the classes of behaviors present in the
training data and their relationship to the task state.
The burden on human experts for the training of the
system is thus reduced, since they need only provide intermittent annotations of a small number of variables.
For example, for handwashing, these variables describe whether the patient's hands are wet, soapy, dirty or clean.
After training, the model can be used to infer the task state from unlabeled data by inferring what behaviors
are taking place, and how they are advancing (or retarding) the task state. In future, this inference will be used
to select appropriate prompting actions. From a computer vision perspective, the features we use
must be able to generalise across tasks, contexts, and individuals. Thus, we do not want to engineer features for each ADL, such as
skin color for the detection of hands during handwashing (Mihailidis et al., 2004[6]). Instead, we want features
which can be learned from training data in such a way that the recognised behaviors are most useful for predicting
state or value.
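
The inference of task state from unlabeled data described above can be illustrated with a discrete Bayes filter. This is a simplified sketch, not the model used in this research: it assumes the learned behaviors have already been folded into a state transition table, and that a likelihood of the current visual features under each state is available. All state names and probabilities are hypothetical.

```python
# Illustrative sketch only: a discrete Bayes filter over a hypothetical task state.

def update_belief(belief, transition, likelihood):
    """One filtering step: predict with `transition`, then weight by `likelihood`.

    belief:     dict state -> probability before the new observation
    transition: dict state -> dict next_state -> probability
    likelihood: dict next_state -> P(observation | next_state)
    """
    # Prediction: push the current belief through the transition model.
    predicted = {
        s2: sum(belief[s1] * transition[s1].get(s2, 0.0) for s1 in belief)
        for s2 in likelihood
    }
    # Correction: weight each state by how well it explains the observation.
    unnormalised = {s: predicted[s] * likelihood[s] for s in predicted}
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalised.items()}


# Hypothetical handwashing dynamics: hands go from dirty to soapy to clean.
transition = {
    "dirty": {"dirty": 0.6, "soapy": 0.4},
    "soapy": {"soapy": 0.7, "clean": 0.3},
    "clean": {"clean": 1.0},
}
belief = {"dirty": 1.0, "soapy": 0.0, "clean": 0.0}

# Hypothetical likelihood of the current visual features under each state.
likelihood = {"dirty": 0.2, "soapy": 0.8, "clean": 0.1}

belief = update_belief(belief, transition, likelihood)
```

After this single step the belief shifts from certainty about dirty hands towards the soapy state, because the assumed observation is better explained by soapy hands. Repeating the update for each new frame would track the task state over time.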