An MIT researcher is recording the first three years of his child's life in order to collect data to teach robots how to learn language. Excerpts from a Wired story:
Almost every new dad breaks out a videocam to record his kid's early years. But Roy is working on a much more ambitious scale. Eleven cameras and 14 microphones are embedded in the ceilings of the Roy household and connected by some 3,000 feet of cable to a terabyte disk array in the basement. Roy has already captured more than 120,000 hours of footage. Data from the disks gets backed up to an automated tape library, and every 40 days Roy shows up at work with a rolling suitcase to download his new haul of data onto a dedicated 250-terabyte array in the air-conditioned machine room of the MIT Media Lab.
Roy, 38, directs the Media Lab's Cognitive Machines Group, known for teaching remedial English to a robot named Ripley. By recording the early stages of his boy's life, Roy is seeking to supplement his steel-and-silicon investigations: His three-year-long study will document practically every utterance his young son makes, from the first gurglings of infancy through the ad hoc eloquence of toddlerdom, in an unprecedented effort to chart — uninterrupted — the entire course of early language acquisition. The goal of the Human Speechome Project, as he boldly calls his program, is to amass a huge and intricate database on a fundamental human phenomenon. Roy believes the Speechome Project will, in turn, unlock the secrets of teaching robots to understand and manipulate language.
Disarmingly convincing with a calm manner and understated black attire, Roy goes on to explain how the project will ultimately let him combine human observation and robotic experimentation to address some of the most basic questions about how words work and what language reveals about cognition. There's a practical side to this: the motivation of an engineer who wants to make machines talk and think. There's also a speculative side: the motivation of a scientist who wants to explore language as a means of investigating the brain.
Over the past months, though, such grand problems have been the least of Roy's concerns. Kubat, along with grad students Philip DeCamp and Brandon Roy (no relation to Deb), has been wrestling with the task of managing and analyzing the hundreds of thousands of hours of raw multichannel video that are accruing. With input from his wife, Northeastern University speech pathologist Rupal Patel, Roy is attempting to make the project scientifically meaningful without turning baby Dwayne's life into The Truman Show. Even if Roy's work — endorsed by academic luminaries like experimental psychologist Steven Pinker and philosopher Daniel Dennett — fails to provide major linguistic insights, the data-mining techniques he's developing and the experimental protocols he's establishing will change how early childhood development is researched. His colleagues in the field are watching his methods with interest. "This is groundbreaking work," says Carnegie Mellon developmental psychologist Brian MacWhinney, keeper of the world's leading repository of childhood speech transcripts. "More and more, it's the technology that drives the science."
Still, there are gaps in the record, and not only while Dwayne sleeps or when the family goes out. (Despite rumors circulating on the Internet, Dwayne isn't under house arrest and has even had his first summer vacation.) Sometimes several cameras are down; other times the spectrograms register hours of silence. These blank spots are intentional, blinders that Roy allows himself in the eye of his self-imposed panopticon. In fact, Roy is fanatical about privacy, declining all requests from reporters to visit his home and refusing to reveal his baby's real name. ("Dwayne" was chosen for this article in keeping with Roy's practice of naming his robotic research subjects after Aliens characters — in this case, Corporal Dwayne Hicks.) "It comes down to managing privacy issues in an experiment that's the first of its kind," Roy says. "I've been erring on the conservative side because right now I'm living it and my wife is living it, so I don't trust my intuition."
Erring on the conservative side means killing the system if he or his wife is in a bad mood and might want to vent over dinner. They can also switch off the cameras while Patel is breast-feeding or hit the "oops" button when something too personal gets recorded. In fact, a glowing, wall-mounted "oops" button can be found in every room, allowing them to make Total Recall's archive something less than total. Roy pressed it one day after emerging naked from the shower when the cameras were running.
via Bryan Appleyard; see also: Human Speechome Project.
While it sounds like they're making an effort with privacy, this still seems pretty close to the ethical line. And I don't get how using a pseudonym for the baby but not the parents makes a bit of difference.