RF-based 3D skeletons

Mingmin Zhao, Yonglong Tian, Hang Zhao, Mohammad Abu Alsheikh, Tianhong Li, Rumen Hristov, Zachary Kabelac, Dina Katabi, Antonio Torralba

Research output: A Conference proceeding or a Chapter in Book (Conference contribution)

12 Citations (Scopus)

Abstract

This paper introduces RF-Pose3D, the first system that infers 3D human skeletons from RF signals. It requires no sensors on the body, and works with multiple people and across walls and occlusions. Further, it generates dynamic skeletons that follow the people as they move, walk or sit. As such, RF-Pose3D provides a significant leap in RF-based sensing and enables new applications in gaming, healthcare, and smart homes. RF-Pose3D is based on a novel convolutional neural network (CNN) architecture that performs high-dimensional convolutions by decomposing them into low-dimensional operations. This property allows the network to efficiently condense the spatio-temporal information in RF signals. The network first zooms in on the individuals in the scene, and crops the RF signals reflected off each person. For each individual, it localizes and tracks their body parts - head, shoulders, arms, wrists, hip, knees, and feet. Our evaluation results show that RF-Pose3D tracks each keypoint on the human body with an average error of 4.2 cm, 4.0 cm, and 4.9 cm along the X, Y, and Z axes respectively. It maintains this accuracy even in the presence of multiple people, and in new environments that it has not seen in the training set. Demo videos are available at our website: http://rfpose3d.csail.mit.edu.
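
The abstract says the network performs high-dimensional convolutions by decomposing them into low-dimensional operations. This record does not give the details of that decomposition, so the sketch below is only a generic illustration of the idea, not the RF-Pose3D architecture: a 2D convolution with a separable kernel computed as two 1D passes. All function names are hypothetical, and the kernel is assumed to be an outer product of two 1D kernels.

```python
# Generic illustration only: a separable 2D kernel (outer product of two
# 1D kernels) can be applied as two cheaper 1D passes. This is NOT the
# decomposition used in RF-Pose3D; all names here are hypothetical.

def outer(col_k, row_k):
    """Build the 2D kernel whose entry (i, j) is col_k[i] * row_k[j]."""
    return [[a * b for b in row_k] for a in col_k]

def conv2d_direct(image, kernel):
    """'Valid' 2D cross-correlation, computed with the full 2D kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

def conv1d_rows(image, row_k):
    """Slide a 1D kernel along each row ('valid' mode)."""
    kw = len(row_k)
    return [
        [sum(row[c + j] * row_k[j] for j in range(kw))
         for c in range(len(row) - kw + 1)]
        for row in image
    ]

def conv1d_cols(image, col_k):
    """Slide a 1D kernel down each column ('valid' mode)."""
    kh = len(col_k)
    ih, iw = len(image), len(image[0])
    return [
        [sum(image[r + i][c] * col_k[i] for i in range(kh))
         for c in range(iw)]
        for r in range(ih - kh + 1)
    ]

def conv2d_separable(image, col_k, row_k):
    """Same result as conv2d_direct(image, outer(col_k, row_k)),
    obtained with a row pass followed by a column pass."""
    return conv1d_cols(conv1d_rows(image, row_k), col_k)

if __name__ == "__main__":
    image = [
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10],
        [11, 12, 13, 14, 15],
        [2, 4, 6, 8, 10],
    ]
    col_k, row_k = [1, 2, 1], [1, 0, -1]
    direct = conv2d_direct(image, outer(col_k, row_k))
    separable = conv2d_separable(image, col_k, row_k)
    print(direct == separable)
```

For a kernel of size k by k, the direct form costs k*k multiplications per output value while the two-pass form costs about 2*k, which is the kind of saving that motivates decomposing higher-dimensional convolutions.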

Original language: English
Title of host publication: SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
Publisher: Association for Computing Machinery, Inc
Pages: 267-281
Number of pages: 15
ISBN (Electronic): 9781450355674
DOI: https://doi.org/10.1145/3230543.3230579
Publication status: Published - 7 Aug 2018
Externally published: Yes
Event: 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018 - Budapest, Hungary
Duration: 20 Aug 2018 to 25 Aug 2018

Conference

Conference: 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018
Country: Hungary
City: Budapest
Period: 20/08/18 to 25/08/18

Cite this

Zhao, M., Tian, Y., Zhao, H., Alsheikh, M. A., Li, T., Hristov, R., ... Torralba, A. (2018). RF-based 3D skeletons. In SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (pp. 267-281). Association for Computing Machinery, Inc. https://doi.org/10.1145/3230543.3230579
Zhao, Mingmin ; Tian, Yonglong ; Zhao, Hang ; Alsheikh, Mohammad Abu ; Li, Tianhong ; Hristov, Rumen ; Kabelac, Zachary ; Katabi, Dina ; Torralba, Antonio. / RF-based 3D skeletons. SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. Association for Computing Machinery, Inc, 2018. pp. 267-281
@inproceedings{2ac177076aaf426db4d4f22f58824f84,
title = "RF-based 3D skeletons",
abstract = "This paper introduces RF-Pose3D, the first system that infers 3D human skeletons from RF signals. It requires no sensors on the body, and works with multiple people and across walls and occlusions. Further, it generates dynamic skeletons that follow the people as they move, walk or sit. As such, RF-Pose3D provides a significant leap in RF-based sensing and enables new applications in gaming, healthcare, and smart homes. RF-Pose3D is based on a novel convolutional neural network (CNN) architecture that performs high-dimensional convolutions by decomposing them into low-dimensional operations. This property allows the network to efficiently condense the spatio-temporal information in RF signals. The network first zooms in on the individuals in the scene, and crops the RF signals reflected off each person. For each individual, it localizes and tracks their body parts - head, shoulders, arms, wrists, hip, knees, and feet. Our evaluation results show that RF-Pose3D tracks each keypoint on the human body with an average error of 4.2 cm, 4.0 cm, and 4.9 cm along the X, Y, and Z axes respectively. It maintains this accuracy even in the presence of multiple people, and in new environments that it has not seen in the training set. Demo videos are available at our website: http://rfpose3d.csail.mit.edu.",
keywords = "3D Human Pose Estimation, Localization, Machine Learning, Neural Networks, RF Sensing, Smart Homes",
author = "Mingmin Zhao and Yonglong Tian and Hang Zhao and Alsheikh, {Mohammad Abu} and Tianhong Li and Rumen Hristov and Zachary Kabelac and Dina Katabi and Antonio Torralba",
year = "2018",
month = "8",
day = "7",
doi = "10.1145/3230543.3230579",
language = "English",
pages = "267--281",
booktitle = "SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication",
publisher = "Association for Computing Machinery, Inc",

}

Zhao, M, Tian, Y, Zhao, H, Alsheikh, MA, Li, T, Hristov, R, Kabelac, Z, Katabi, D & Torralba, A 2018, RF-based 3D skeletons. in SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. Association for Computing Machinery, Inc, pp. 267-281, 2018 Conference of the ACM Special Interest Group on Data Communication, ACM SIGCOMM 2018, Budapest, Hungary, 20/08/18. https://doi.org/10.1145/3230543.3230579

RF-based 3D skeletons. / Zhao, Mingmin; Tian, Yonglong; Zhao, Hang; Alsheikh, Mohammad Abu; Li, Tianhong; Hristov, Rumen; Kabelac, Zachary; Katabi, Dina; Torralba, Antonio.

SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. Association for Computing Machinery, Inc, 2018. p. 267-281.

TY - GEN

T1 - RF-based 3D skeletons

AU - Zhao, Mingmin

AU - Tian, Yonglong

AU - Zhao, Hang

AU - Alsheikh, Mohammad Abu

AU - Li, Tianhong

AU - Hristov, Rumen

AU - Kabelac, Zachary

AU - Katabi, Dina

AU - Torralba, Antonio

PY - 2018/8/7

Y1 - 2018/8/7

N2 - This paper introduces RF-Pose3D, the first system that infers 3D human skeletons from RF signals. It requires no sensors on the body, and works with multiple people and across walls and occlusions. Further, it generates dynamic skeletons that follow the people as they move, walk or sit. As such, RF-Pose3D provides a significant leap in RF-based sensing and enables new applications in gaming, healthcare, and smart homes. RF-Pose3D is based on a novel convolutional neural network (CNN) architecture that performs high-dimensional convolutions by decomposing them into low-dimensional operations. This property allows the network to efficiently condense the spatio-temporal information in RF signals. The network first zooms in on the individuals in the scene, and crops the RF signals reflected off each person. For each individual, it localizes and tracks their body parts - head, shoulders, arms, wrists, hip, knees, and feet. Our evaluation results show that RF-Pose3D tracks each keypoint on the human body with an average error of 4.2 cm, 4.0 cm, and 4.9 cm along the X, Y, and Z axes respectively. It maintains this accuracy even in the presence of multiple people, and in new environments that it has not seen in the training set. Demo videos are available at our website: http://rfpose3d.csail.mit.edu.

AB - This paper introduces RF-Pose3D, the first system that infers 3D human skeletons from RF signals. It requires no sensors on the body, and works with multiple people and across walls and occlusions. Further, it generates dynamic skeletons that follow the people as they move, walk or sit. As such, RF-Pose3D provides a significant leap in RF-based sensing and enables new applications in gaming, healthcare, and smart homes. RF-Pose3D is based on a novel convolutional neural network (CNN) architecture that performs high-dimensional convolutions by decomposing them into low-dimensional operations. This property allows the network to efficiently condense the spatio-temporal information in RF signals. The network first zooms in on the individuals in the scene, and crops the RF signals reflected off each person. For each individual, it localizes and tracks their body parts - head, shoulders, arms, wrists, hip, knees, and feet. Our evaluation results show that RF-Pose3D tracks each keypoint on the human body with an average error of 4.2 cm, 4.0 cm, and 4.9 cm along the X, Y, and Z axes respectively. It maintains this accuracy even in the presence of multiple people, and in new environments that it has not seen in the training set. Demo videos are available at our website: http://rfpose3d.csail.mit.edu.

KW - 3D Human Pose Estimation

KW - Localization

KW - Machine Learning

KW - Neural Networks

KW - RF Sensing

KW - Smart Homes

UR - http://www.scopus.com/inward/record.url?scp=85055315303&partnerID=8YFLogxK

UR - https://dl.acm.org/citation.cfm?doid=3230543.3230579

U2 - 10.1145/3230543.3230579

DO - 10.1145/3230543.3230579

M3 - Conference contribution

SP - 267

EP - 281

BT - SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication

PB - Association for Computing Machinery, Inc

ER -

Zhao M, Tian Y, Zhao H, Alsheikh MA, Li T, Hristov R et al. RF-based 3D skeletons. In SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. Association for Computing Machinery, Inc. 2018. p. 267-281 https://doi.org/10.1145/3230543.3230579