The Visual Computing Database: A Platform for Visual Data Processing and Analysis at Internet Scale
Funding: NSF IIS-1539069, Google 2015 Faculty Fellowship, Intel Machine Learning ISRA
Today, it is clear that the next generation of visual computing applications will require efficient analysis and mining of large repositories of visual data (images, videos, RGBD). But scaling visual data analysis to operate on collections the size of all public photos and videos on Facebook, all security video cameras in a major city, or petabytes of images in an astronomy sky survey, presents supercomputing-scale storage and computation challenges. Very few programmers have the capability to operate efficiently at these scales, inhibiting the field’s ability to explore advanced data-driven visual computing applications. To meet this challenge, we are developing a distributed computing platform – combining ideas from high-performance image processing languages, data analytics, and database functionality – that facilitates the development of applications that query, analyze and mine image and video collections at scale.
This ongoing work presents two significant questions:
- What is the programming system for large-scale visual data analytics? What are scalable primitives for expressing visual data queries and describing visual concepts of interest?
- How do we architect an efficient runtime system for executing visual analysis pipelines, and what is the ideal hardware platform for such computations?
Automatically Scheduling Halide Programs. In recent years, the Halide image processing language has proven to be an effective system for authoring high-performance image processing code, as evidenced by its use at companies like Google to author popular computational photography applications used on hundreds of millions of smartphones. However, although Halide enables programmers to work more quickly, obtaining high performance still requires programmers to have expertise in modern code optimization techniques and hardware architectures. We have developed an algorithm for automatically generating high-performance implementations of Halide image processing programs. In seconds, the algorithm generates schedules for a wide set of image processing benchmarks that are competitive with (and often better than) schedules manually authored by expert Halide developers on both server and mobile platforms. (This activity is also supported by PI Fatahalian's CAREER grant IIS-1253530.)
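To make the algorithm/schedule separation concrete, the sketch below mimics, in plain Python rather than Halide syntax, one of the loop transformations a Halide schedule expresses: the same blur algorithm computed with and without tiling of the output loop. Choosing transformations like this automatically is exactly what the autoscheduler does; the code is illustrative only, not Halide's API.

```python
# Illustrative sketch (not Halide code): a 1D blur computed two ways.
# In Halide the *algorithm* (what to compute) is fixed, and a *schedule*
# (loop order, tiling, fusion) determines performance; the autoscheduler
# searches over schedules. Here we apply one schedule choice -- tiling
# the output loop -- by hand.

def blur_naive(src):
    """3-point box blur; edge pixels copied unchanged."""
    n = len(src)
    out = list(src)
    for i in range(1, n - 1):
        out[i] = (src[i - 1] + src[i] + src[i + 1]) // 3
    return out

def blur_tiled(src, tile=4):
    """Same algorithm, but the output loop is split into tiles --
    the kind of loop transformation a Halide schedule expresses."""
    n = len(src)
    out = list(src)
    for start in range(1, n - 1, tile):
        stop = min(start + tile, n - 1)
        for i in range(start, stop):
            out[i] = (src[i - 1] + src[i] + src[i + 1]) // 3
    return out

pixels = [10, 20, 30, 40, 50, 60, 70, 80]
assert blur_naive(pixels) == blur_tiled(pixels)
```

Both versions produce identical output; a schedule only changes how (and how fast) the result is computed, which is what lets a search procedure explore schedules safely.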
Lantern: A Query Language for Visual Concept Retrieval. Lantern addresses a rapidly growing need to efficiently explore and mine massive visual datasets for information, with tasks like locating people in a video or determining similarity between images. A number of recent top-performing computer vision tools for these tasks rely on machine learning methods, specifically end-to-end training and evaluation, which can take days or weeks to learn effective concept detectors. The language provides an abstraction, the spatial concept hierarchy, for combining existing vision algorithms with coarse-grained rules for quickly developing new queries and interactively exploring visual data. Lantern compiles queries into operations on distributed collections to enable rapid execution on large clusters. We demonstrate the use of Lantern by building an interactive system for exploration of visual datasets, an object detector error analysis platform, and a tool to blur faces in videos. We show Lantern queries running on a cluster on the Google Cloud Platform.
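The sketch below illustrates the query style described above: combining an existing vision primitive with a coarse-grained spatial rule and mapping it over a collection of frames. The names (`detect_faces`, `Box`, `left_half`) and the rule are hypothetical, not Lantern's actual syntax or API; on a cluster the map/filter would run over a distributed collection.

```python
# Hypothetical sketch of a Lantern-style query (illustrative only; the
# names below are assumptions, not Lantern's API). The query combines a
# vision primitive (a face detector) with a coarse-grained spatial rule
# ("face in the left half of the frame") over a frame collection.

from dataclasses import dataclass

@dataclass
class Box:
    x: float  # left edge, normalized to [0, 1]
    y: float
    w: float
    h: float

def detect_faces(frame):
    """Stand-in for a learned face detector; returns candidate boxes."""
    return frame["boxes"]

def left_half(box):
    """Coarse-grained spatial rule: box center lies in the left half."""
    return box.x + box.w / 2 < 0.5

def query(frames):
    """'Frames containing a face in the left half of the image.'
    A plain comprehension here; a distributed map/filter in a cluster."""
    return [f["id"] for f in frames
            if any(left_half(b) for b in detect_faces(f))]

frames = [
    {"id": 0, "boxes": [Box(0.1, 0.2, 0.2, 0.3)]},  # face on the left
    {"id": 1, "boxes": [Box(0.7, 0.2, 0.2, 0.3)]},  # face on the right
    {"id": 2, "boxes": []},                          # no faces
]
assert query(frames) == [0]
```

Because the rule is an ordinary predicate over detector output, a user can revise the query interactively without retraining a detector, which is the workflow advantage the paragraph above describes.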
COMING SOON! We will be placing versions of our distributed cluster runtimes for video data analytics online on GitHub soon.
This project is supported by the National Science Foundation under grant IIS-1539069.
Funding for research assistants working on the project is also provided by Google through a 2016 Faculty Fellowship, and Intel through the Machine Learning ISRA program.
Last updated September 2016.