TY - GEN
T1 - RF-based 3D skeletons
AU - Zhao, Mingmin
AU - Tian, Yonglong
AU - Zhao, Hang
AU - Alsheikh, Mohammad Abu
AU - Li, Tianhong
AU - Hristov, Rumen
AU - Kabelac, Zachary
AU - Katabi, Dina
AU - Torralba, Antonio
PY - 2018/8/7
Y1 - 2018/8/7
N2 - This paper introduces RF-Pose3D, the first system that infers 3D human skeletons from RF signals. It requires no sensors on the body, and works with multiple people and across walls and occlusions. Further, it generates dynamic skeletons that follow the people as they move, walk or sit. As such, RF-Pose3D provides a significant leap in RF-based sensing and enables new applications in gaming, healthcare, and smart homes. RF-Pose3D is based on a novel convolutional neural network (CNN) architecture that performs high-dimensional convolutions by decomposing them into low-dimensional operations. This property allows the network to efficiently condense the spatio-temporal information in RF signals. The network first zooms in on the individuals in the scene, and crops the RF signals reflected off each person. For each individual, it localizes and tracks their body parts - head, shoulders, arms, wrists, hip, knees, and feet. Our evaluation results show that RF-Pose3D tracks each keypoint on the human body with an average error of 4.2 cm, 4.0 cm, and 4.9 cm along the X, Y, and Z axes respectively. It maintains this accuracy even in the presence of multiple people, and in new environments that it has not seen in the training set. Demo videos are available at our website: http://rfpose3d.csail.mit.edu.
KW - 3D Human Pose Estimation
KW - Localization
KW - Machine Learning
KW - Neural Networks
KW - RF Sensing
KW - Smart Homes
UR - http://www.scopus.com/inward/record.url?scp=85055315303&partnerID=8YFLogxK
UR - https://dl.acm.org/citation.cfm?doid=3230543.3230579
UR - http://www.mendeley.com/research/rfbased-3d-skeletons
U2 - 10.1145/3230543.3230579
DO - 10.1145/3230543.3230579
M3 - Conference contribution
AN - SCOPUS:85055315303
SN - 9781450355674
T3 - SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
SP - 267
EP - 281
BT - SIGCOMM 2018 - Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
PB - Association for Computing Machinery, Inc
T2 - ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication
Y2 - 1 January 2011
ER -