People Watching: Human Actions as a Cue for Single-View Geometry
People
Abstract
We present an approach that exploits the coupling between human actions and scene geometry. We investigate the use of human pose as a cue for single-view 3D scene understanding. Our method builds upon recent advances in still-image pose estimation to extract functional and geometric constraints about the scene. These constraints are then used to improve state-of-the-art single-view 3D scene understanding approaches. The proposed method is validated on a collection of monocular time-lapse sequences collected from YouTube and on a dataset of still images of indoor scenes. We demonstrate that observing people performing different actions can significantly improve estimates of 3D scene geometry.
Paper
- ECCV Paper (pdf)
- Slides (zip, 60MB)
- Watch the talk from ECCV (at videolectures.net)
- Citation
Extended Results
This video shows a selection of input timelapses and the evolution of functional surfaces and the resulting geometric interpretation.
Download (mp4, 18MB)
Timelapse Results Gallery (40 Sequences)
Still Results Gallery (100 Images)
Data
Still image dataset (9.1MB ZIP file) - 100 JPGs
Videos (list of URLs to videos used)
Related Works
Funding
This research is supported by:
- NSF Graduate Research Fellowship for David Fouhey
- ONR-MURI Grant N000141010934
- Quaero
- OSEO
- MSR-INRIA
- EIT-ICT
- ERC grant Videoworld
Copyright Notice
The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright.