Benchmarks for Out-of-Distribution Generalization in Time Series
View the Project on GitHub jc-audet/WOODS
View the Documentation on RtD woods.readthedocs.io
Motivation: The intrinsic biases of inaccurate and poorly calibrated sensors in smart devices, along with the biases accumulated through everyday use, make human activity recognition a notoriously difficult task when done across devices. Contrary to static tasks, where uninformative features can often be segmented out of the input (e.g., the background when classifying an animal in an image), invariant features in time series are often highly convoluted with spurious features. We study the ability of models to ignore spurious information in complex signals with the HHAR dataset.
Problem: We consider the human activity classification task from accelerometer and gyroscope measurements of smartphones and smartwatches. The dataset has five source domains, where each domain contains data gathered with a different device. The goal is to generalize to unseen smart devices.
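One common way to measure generalization to an unseen device is to hold out each domain in turn and train on the remaining four. The sketch below only enumerates such splits; it is not the WOODS API, and the domain names are hypothetical placeholders rather than the actual HHAR device identifiers.

```python
from typing import List, Tuple

# Hypothetical placeholder names; the actual HHAR device identifiers may differ.
DOMAINS = ["device_0", "device_1", "device_2", "device_3", "device_4"]


def leave_one_domain_out(domains: List[str]) -> List[Tuple[List[str], str]]:
    """Return one (training domains, held-out test domain) pair per domain."""
    splits = []
    for test_domain in domains:
        # Train on every domain except the one held out for testing.
        train_domains = [d for d in domains if d != test_domain]
        splits.append((train_domains, test_domain))
    return splits


for train_domains, test_domain in leave_one_domain_out(DOMAINS):
    print(f"train on {train_domains}, test on {test_domain}")
```

Each split keeps the test device entirely out of training, so a model that relies on device-specific sensor biases should score worse on the held-out domain than on the training domains.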
python -m woods.scripts.download_datasets HHAR --data_path /path/to/data
python -m woods.scripts.fetch_and_preprocess HHAR --data_path /path/to/data
[1] Stisen, Allan, et al. “Smart devices are different: Assessing and mitigating mobile sensing heterogeneities for activity recognition.” Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems. 2015.
[2] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
This dataset is licensed under the Open Data Commons Attribution License (ODC-By): https://opendatacommons.org/licenses/by/summary/index.html