The SPHERE Challenge: Activity Recognition with Multimodal Sensor Data

This is a guest post by Niall Twomey of the SPHERE project's machine learning team. You can download the notebook here to follow along. Enjoy!

Welcome to the SPHERE Challenge! We are very excited to be working with DrivenData, the ECML-PKDD conference and the AARP Foundation on this challenge!

My name is Niall Twomey, and I am a postdoctoral researcher on the machine learning team of the SPHERE project. In this post I'm going to first walk you through how to work with the data we provide, and then I'm going to provide an implementation of a baseline classification algorithm. This baseline isn't by any means going to win the challenge :-) but it will show you how you can work with the data, as there is a bit of work needed to transform the raw data into the tabular formats that we are most familiar with in machine learning problems.

There are four main sections in this post:

  1. I will start by visualising all of the data modalities (a crucial initial step in all prediction problems!);
  2. Then I will extract very simple features from all of the sensor modalities;
  3. Due to the data collection system, we will need to deal with missing data using data imputation; and
  4. Finally, after we get the data into a nice tabular format, we will test a k-nearest neighbour based classifier on the test data.

I hope this is helpful!

Time to start! :-)

In [1]:
from __future__ import print_function

# For number crunching
import numpy as np
import pandas as pd

# For visualisation
import matplotlib.pyplot as pl 
import seaborn as sns 

# For prediction 
import sklearn

# Misc
from itertools import cycle
import json 
import os

# Magic and formatting
%matplotlib inline


current_palette = cycle(sns.color_palette())

Module versions

These are the library versions we worked with to produce our results. 
In [2]:
print('  numpy version: {}'.format(np.__version__))
print(' pandas version: {}'.format(pd.__version__))
print('seaborn version: {}'.format(sns.__version__))
print('   json version: {}'.format(json.__version__))
print('sklearn version: {}'.format(sklearn.__version__))
  numpy version: 1.10.4
 pandas version: 0.18.0
seaborn version: 0.7.0
   json version: 2.0.9
sklearn version: 0.17.1


Before doing any prediction, I think it will be helpful to visualise the data itself. To do this, we will use a set of classes that we wrote for easy access to the data. These are stored in a Python file called visualise_data.py (available here) and are quite interpretable.

We will visualise the training sequence identified by 00001. This is an arbitrary selection, and the cells below will work on all records (though if visualising the test instances the ground truth won't be shown, of course).

The structure of the public data download is shown below.

├── public_data
│   ├── accelerometer_axes.json
│   ├── access_point_names.json
│   ├── annotations.json
│   ├── pir_locations.json
│   ├── rooms.json
│   ├── sample_submission.csv
│   ├── train
│   │   ├── 00001
│   │   ├── ...
│   │   └── 00010
│   ├── test
│   │   ├── 00011
│   │   ├── ...
│   │   └── 00882
│   ├── video_feature_names.json
│   └── video_locations.json

We use a helper class called SequenceVisualisation, which can be found in the visualise_data.py file. This class loads all of the available data, lets you slice training data by activity, and plots some of the sensor data and annotations (when they are available), amongst other super useful things!

We use three main sensor types in this challenge: PIR sensors, accelerometers, and video. We will first show you what those data look like, then show how you might perform feature extraction on the data, and finally show how you might classify the data!

In [3]:
from visualise_data import SequenceVisualisation

plotter = SequenceVisualisation('../public_data', '../public_data/train/00001')
sequence_window = (plotter.meta['start'], plotter.meta['end'])

# We can extract the time range of the activity 'jump' with the 'times_of_activity' function. 
# This function returns all of the times that jump was annotated, and it is indexed first 
# by the annotator, and then by the time at which it occurred. 
times_of_jump = plotter.times_of_activity('a_jump')

# Print the start time, end time and duration of each annotated jump, per annotator. 
for ai, annotator_jumps in enumerate(times_of_jump): 
    print ('Annotator {}'.format(ai))
    # annotator_jumps is a list of tuples. Its length is the number of segments that this annotator
    # labelled as 'a_jump'. Each element is a tuple that holds the start and end time of the t-th
    # annotation, in that order, i.e.: 
    for ti, (start, end) in enumerate(annotator_jumps, start=1): 
        print ('  Annotation {}'.format(ti))
        print ('    Start time: {}'.format(start))
        print ('    End time:   {}'.format(end))
        print ('    Duration:   {}'.format(end - start))
        print ()

# The sequence object also holds metadata regarding the length of the sequence. 
sequence_window = (plotter.meta['start'], plotter.meta['end'])
print (sequence_window)
Annotator 0
  Annotation 1
    Start time: 1774.195
    End time:   1776.739
    Duration:   2.5440000000000964

Annotator 1
  Annotation 1
    Start time: 1775.6870000000001
    End time:   1776.348
    Duration:   0.6609999999998308

  Annotation 2
    Start time: 1776.901
    End time:   1777.87
    Duration:   0.9689999999998236

Annotator 2
  Annotation 1
    Start time: 1774.974
    End time:   1778.844
    Duration:   3.8700000000001182

(0, 1823.9170000000001)

Plotting location.

The black lines in the image below are the times during which the PIR sensors were activated. The red, green, and blue lines here are then the annotations provided by the three annotators.

Note that PIR sensors can produce both false negative and false positive activations, i.e. they do not always trigger when someone is present, and they can trigger when no one is present due to infrared radiation from the sun in warm weather.

In this challenge, the evaluation does not consider predictions of location, but knowing the location should help classification performance. For example, someone on the staircase is more likely to be ascending or descending the stairs than walking on the flat, and they would be very unlikely to be sitting. Location information is provided in the training data sets only.

In [4]:
plotter.plot_pir(sequence_window, sharey=True)
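
If the PIR activations are available as (start, end) intervals in seconds, which is an assumption about the data layout rather than something the helper class is known to expose, one simple way to turn them into a tabular feature is to mark each fixed-length window as active or inactive. The function name `pir_occupancy`, the one-second window and the example intervals below are all made up for illustration.

```python
import numpy as np


def pir_occupancy(intervals, t_start, t_end, step=1.0):
    """Return (window_starts, active), where active[i] is 1 if any PIR
    activation interval overlaps the window [t, t + step)."""
    window_starts = np.arange(t_start, t_end, step)
    active = np.zeros(len(window_starts), dtype=int)
    for start, end in intervals:
        # A window overlaps an interval if it starts before the interval
        # ends and ends after the interval starts.
        overlaps = (window_starts < end) & (window_starts + step > start)
        active[overlaps] = 1
    return window_starts, active


# Made-up activation intervals (start, end) in seconds for a single sensor.
intervals = [(0.4, 2.1), (5.0, 5.5)]
starts, active = pir_occupancy(intervals, t_start=0.0, t_end=8.0)
print(active.tolist())  # [1, 1, 1, 0, 0, 1, 0, 0]
```

Repeating this for every PIR sensor gives one binary column per room, which can then sit alongside features from the other modalities.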

Plotting RSSI.

This function plots the received signal strength indicator (RSSI) of the wireless transmission. RSSI is a measure of how strongly the accelerometer's signal was received. As a simple rule, the closer you are to the receiver (which we call an access point), the more power there is in the received signal (but this rule doesn't necessarily apply if your body gets in the way between the accelerometer and the access point!).

There are four access points located in the SPHERE house, and these are located in:

  1. the kitchen;
  2. the lounge;
  3. upstairs; and
  4. in the study.

RSSI is a useful indication of location (together with PIR data). In the image below, the RSSI data is plotted together with the locations. There is a very strong correspondence, for example, between the lounge trace (green) and the annotation of presence in the lounge. And in locations without an access point (e.g. bed2) you can see patterns in the RSSI signal that should be indicative of your location.
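
To make the "closer means stronger" rule concrete, here is a minimal sketch that names the access point with the highest RSSI for each sample. It assumes the four RSSI values sit in the order the access points are listed above and that missing packets are stored as NaN; both are assumptions to check against access_point_names.json and the data files. The function name `strongest_access_point` and the example values are made up.

```python
import numpy as np

# Access points in the order listed in the post. Whether the data columns
# follow the same order is an assumption.
ACCESS_POINTS = ['kitchen', 'lounge', 'upstairs', 'study']


def strongest_access_point(rssi):
    """Name the access point with the highest RSSI in each row.

    `rssi` is an (n_samples, 4) array with NaN where nothing was received.
    Rows that are entirely NaN yield None.
    """
    rssi = np.asarray(rssi, dtype=float)
    # Replace NaN with -inf so that missing values never win the argmax.
    filled = np.where(np.isnan(rssi), -np.inf, rssi)
    best = filled.argmax(axis=1)
    any_received = ~np.isnan(rssi).all(axis=1)
    return [ACCESS_POINTS[i] if ok else None
            for i, ok in zip(best, any_received)]


example = [
    [-60.0, -85.0, -90.0, np.nan],     # strongest at the kitchen
    [-95.0, -70.0, np.nan, -88.0],     # strongest at the lounge
    [np.nan, np.nan, np.nan, np.nan],  # nothing received
]
print(strongest_access_point(example))  # ['kitchen', 'lounge', None]
```

This is only a coarse guess at location: as noted above, the body can block the signal, and rooms without an access point never appear in the output.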

In [5]:

Plotting acceleration.

This plots the acceleration data (in the continuous line traces) and annotated activities.

Here we can see another interesting aspect of the data, namely that while the annotators agree quite a lot of the time, often there is disagreement regarding the annotated activities. There are various different types of disagreement, and one of the most interesting is the disagreement of the specific start and end times of the activities.

In [6]:
plotter.plot_acceleration((sequence_window[0] + 180, sequence_window[0] + 300))
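
One way to put a number on the disagreement about start and end times is the intersection over union (IoU) of two annotated intervals. This measure is a suggestion for exploring the data, not part of the challenge evaluation, and the helper name `interval_iou` is made up. The example uses the 'a_jump' times printed earlier for annotators 0 and 2.

```python
def interval_iou(a, b):
    """Intersection over union of two (start, end) intervals, in [0, 1]."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


# 'a_jump' annotations for sequence 00001, copied from the printed output.
annotator_0 = (1774.195, 1776.739)
annotator_2 = (1774.974, 1778.844)

print('IoU: {:.2f}'.format(interval_iou(annotator_0, annotator_2)))  # IoU: 0.38
```

An IoU of 1 would mean the two annotators marked exactly the same interval, and 0 would mean their intervals do not overlap at all.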

Plotting video data.

The visualisation below shows the video data that we give with this challenge. We do not give the raw video data (in order to protect the anonymity of the participants); instead we give the coordinates of the 2D and 3D bounding boxes, and the centres of these boxes. These are quite coarse as features go, but there is certainly valuable information in the features that you might extract from these bounding boxes (including volume, height, aspect ratios, etc.). To understand what all of the columns are, see §2.2 of the following paper:

Niall Twomey, Tom Diethe, Meelis Kull, Hao Song, Massimo Camplani, Sion Hannuna, Xenofon Fafoutis, Ni Zhu, Pete Woznowski, Peter Flach, Ian Craddock: “The SPHERE Challenge: Activity Recognition with Multimodal Sensor Data”, 2016;arXiv:1603.00797.
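
As a rough sketch of the kind of features mentioned above, the functions below compute width, height, area and aspect ratio from 2D box corners, and volume from two opposite corners of a 3D box. The function names and the corner layout are assumptions made for illustration; the real column names and conventions are described in §2.2 of the paper, and the example coordinates are made up.

```python
import numpy as np


def box_features_2d(top_left, bottom_right):
    """Width, height, area and aspect ratio of 2D bounding boxes.

    Both arguments are (n_frames, 2) arrays of (x, y) corner coordinates.
    The aspect ratio is height / width, so a zero-width box gives inf.
    """
    top_left = np.asarray(top_left, dtype=float)
    bottom_right = np.asarray(bottom_right, dtype=float)
    size = np.abs(bottom_right - top_left)
    width, height = size[:, 0], size[:, 1]
    return {
        'width': width,
        'height': height,
        'area': width * height,
        'aspect_ratio': height / width,
    }


def box_volume_3d(corner_a, corner_b):
    """Volume of axis-aligned 3D boxes, given two opposite corners each."""
    corner_a = np.asarray(corner_a, dtype=float)
    corner_b = np.asarray(corner_b, dtype=float)
    return np.abs(corner_b - corner_a).prod(axis=1)


# One made-up frame: a tall, narrow box, as you might see for a standing person.
features = box_features_2d([[100, 50]], [[160, 230]])
print(features['aspect_ratio'])  # [3.]
print(box_volume_3d([[0.0, 0.0, 0.0]], [[0.5, 1.8, 0.4]]))
```

A tall, narrow box and a short, wide one are likely to correspond to different postures, which is why ratios can be more informative than the raw coordinates.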

In the images below, we plot the centre of the 2D and 3D bounding boxes. The cameras are located in the hallway, living_room and kitchen respectively. The red, blue and green horizontal lines in the images correspond to the ground truth location annotations that are provided, and these correspond to the right hand axis labels. For example, at 500 seconds, the annotations state that the participant was in the kitchen. The other lines indicate the values of the features as the participant moves throughout the room.

In [7]:
plotter.plot_video(plotter.centre_2d, sequence_window)
pl.gcf().suptitle('2D bounding box')

plotter.plot_video(plotter.centre_3d, sequence_window)
pl.gcf().suptitle('3D bounding box')

Visualising the classification targets

We now take a look at the ground truth targets that accompany the training sequences. The labels that are annotated (20 in total) are described on the main page. In the image below, we plot the first two minutes of the target data. In the image, the x-axis is time and the y-axis is probability. Each subplot represents a particular activity, and we only plot the activities that have been annotated over this time window so as to save vertical space.

One of the most important things to take away from this image is that the targets are not 0/1 labels but probabilities, and these probabilities represent the average label assigned by the annotators.
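
To illustrate what "average label" means, the sketch below averages one-hot labels from several annotators into a probability vector for a single window. It is a simplification that assumes each annotator assigns exactly one label to the window. The function name `soft_targets` and the three-activity subset are made up for illustration, and this is not the code used to build the provided targets.

```python
import numpy as np

# A small subset of activity names, used only for illustration.
ACTIVITIES = ['a_jump', 'a_walk', 'p_stand']


def soft_targets(annotator_labels):
    """Average one-hot annotator labels into one probability vector."""
    one_hot = np.zeros((len(annotator_labels), len(ACTIVITIES)))
    for row, label in enumerate(annotator_labels):
        one_hot[row, ACTIVITIES.index(label)] = 1.0
    return one_hot.mean(axis=0)


# Three annotators label the same window; two of them say 'a_walk'.
probabilities = soft_targets(['a_walk', 'a_walk', 'p_stand'])
for name, p in zip(ACTIVITIES, probabilities):
    print('{:>8}: {:.2f}'.format(name, p))
```

In this example 'a_walk' receives a probability of about 0.67 and 'p_stand' about 0.33, so a classifier that outputs probabilities can match the targets more closely than one that outputs a single hard label.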

In [8]:
annotation_names = plotter.targets.columns.difference(['start', 'end'])

# Select only the first two minutes of data 
sub_df = plotter.targets.ix[:60 * 2]

# Select only the columns that are non-empty 
sub_df_cols = [col for col in annotation_names if sub_df[col].sum() > 0]

# Plot a bar-plot w
current_palette = cycle(sns.color_palette())
    figsize=(20, 3 * len(sub_df_cols)), 
    color=[next(current_palette) for _ in annotation_names]