Alumni Profile

Electrical engineer helps machines make sense of the world.
Parham Aarabi (PhD 2001 EE) has been interested in connecting computers to the physical world since he was 11. He now directs the Artificial Perception Lab at the University of Toronto, where he is an associate professor of electrical and computer engineering. His research focuses on systems that allow machines to make some sense of the world around them. Projects include building arrays of microphones that can locate a speaker in a room and software that can classify images based on what they depict.

Aarabi completed his PhD with advisors Vaughan Pratt and Bernard Widrow in just two years. Technology Review magazine recently named Aarabi one of its top 35 innovators under 35 years of age. When he was hired at age 24, he became Canada’s youngest professor.
What is Artificial Perception?
It is a term that I use to mean computers extracting any kind of information about an environment that allows them to understand certain aspects of the environment or the people within. It’s extracting important information and making sense of it.
One source of that information is sound. Why is that something we want machines to “perceive”?
Speech recognition would be useful for many applications. It would be a lot easier to simply tell your car to turn the radio off or to change the station than to press buttons. For using computers or appliances, speech is an interface that comes very naturally to us humans. For a long time we’ve had to adapt and learn to use computers. With speech the goal is to make computers adapt to how we communicate.
You’re not just doing speech recognition. You are also localizing it in a room.
Yes. My general work is extracting useful information from noisy sources. The problem with speech recognition is that the commercially available systems do not work well in very noisy environments, for example in an office building where many people might be talking. So what I try to do is design systems that in very simple ways try to mimic how we humans communicate through speech.

We focus on a single conversation and we tune out other conversations so that we can understand, even in a very noisy environment, what the other person is saying. It boils down to listening in a specific direction — and you have to know what direction you want to listen to — and trying to tune out or cancel voices and sounds from other directions. There is the first step of localization — finding the direction you want — and the second step of speech enhancement.
What is another example of what you do at the lab?
My speech work has been mostly in the past two to five years. More recently in the last two years my group and I have become very heavily involved in searching images. As an example, consider a very large database of images that are not all tagged. So you don’t have a person sitting there writing that an image contains an apple and a flower. How would you search this database if you wanted to find a flower? We’ve focused on trying to extract the contents of these images – very simple information relating one flower in one image to a different flower in a different image. By these relational links that we produce we allow this database to be very easily searched. So you could click on one flower and all of the flowers in the database would automatically come up without having a human operator directly describing what each image contains.
How did you become interested in helping computers perceive?
It all goes back to when I was 11 and my family was living in Atlanta. My parents got me my first personal computer, a PC XT. I remember that after playing with all the games that came with it, I started taking it apart, looking at all the wires and poking around the back to see what they did. For the next few years I was intrigued by trying to connect devices to this computer. I tried to connect my exercise bike to the computer so that when I biked I would see a virtual image going by. The faster I biked, the faster I would see these images go by. I would pretend that I was biking on a lane or a road. So I was very intrigued by making computers connect to the world. Later on I realized that this connection is sometimes very hard because it is very hard for computers to make sense of information. The automatic extraction of information is somewhat difficult. Making sense of it is extremely difficult. This became my undergraduate thesis, my master’s thesis and eventually my PhD thesis at Stanford.
What did you work on while you were here?
I worked on sound localization using microphone arrays. The idea that if you have multiple microphones you can find out the location of the speaker has been known for a long time. What I tried to do in my PhD thesis, the novelty, was to answer the question of what if you are not sure about the location of your microphones? Some of the microphones could be moving around. Some of them could be faulty or broken. What if you had a microphone array that was either damaged partially, deformable, or was changing? Could you use those microphones to find out the location of the sound source? The answer in many cases turned out to be yes. You would try to find the location of the source and if the microphones couldn’t agree on a location you would go back and see why they wouldn’t agree and you would revise their positions or status. And then you would look again. So you would iterate through a series of microphone position estimates as well as localization estimates and after a few iterations you would obtain a good estimate of the location of both the microphones and the speaker.
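
For readers curious about the long-known idea Aarabi mentions, that multiple microphones can reveal where a speaker is, the following is a minimal Python sketch of the textbook two-microphone case. It is not Aarabi's thesis method and does not handle uncertain microphone positions. It assumes two fixed microphones a known distance apart and a far-away source. The function names, the sample rate, and the 343 m/s speed of sound are illustrative assumptions, not details from the interview.

```python
import math


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag (in samples) at which sig_b best lines up with sig_a.

    Uses a brute-force cross-correlation. A positive result means sig_b
    is a delayed copy of sig_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(sig_a)):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_delay(lag_samples, sample_rate_hz, mic_spacing_m,
                       speed_of_sound=343.0):
    """Far-field angle of arrival in degrees, measured from broadside."""
    delay_s = lag_samples / sample_rate_hz
    # Clamp so rounding or noise cannot push asin outside its domain.
    ratio = max(-1.0, min(1.0, delay_s * speed_of_sound / mic_spacing_m))
    return math.degrees(math.asin(ratio))


# Hypothetical example: a short pulse reaches microphone B three samples
# after it reaches microphone A.
mic_a = [0.0] * 32
mic_a[5:10] = [1.0, 3.0, 5.0, 3.0, 1.0]
mic_b = [0.0, 0.0, 0.0] + mic_a[:-3]

lag = estimate_delay(mic_a, mic_b, max_lag=8)
angle = bearing_from_delay(lag, sample_rate_hz=16000, mic_spacing_m=0.2)
print(lag, round(angle, 1))
```

With more than two microphones, each pair gives its own delay estimate. The question described above is what to do when those estimates disagree because the microphone positions themselves are uncertain.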
Have you been back here since then?
I came back last year to give a talk. I love Stanford, I must say. My two PhD years at Stanford were the best learning experience of my life. Hopefully I’m going to come back again in a few months to visit some friends. It’s a wonderful place. There is something in the atmosphere at Stanford. It is more than just the weather. It is the people and the sort of environment that is so innovative and intellectually stimulating.
Have any applications come out of the lab yet?
In the next few months my students and I are going to start a company based on the image search idea. We have a Web site that is in alpha testing right now. We are trying to fine-tune it. By next summer we will have released this to the public. All I can say at this point is that it is not exactly an Internet image searching site. There are some unique twists here and there. It is certainly an idea that Google or Yahoo! might be very interested in. It complements what they have, but doesn’t try to redo what they do.
   

October 2005