Archive for the ‘3D California’ Category
3D California is proud to present a new demo.
In Equilibrium, you can see how Total Immersion’s Markerless Tracking can help you leverage your existing 2D Flash content, or help you create full Augmented Reality applications that run on any Flash-enabled device with a webcam (computers, mobile phones, tablets, TVs, …).
To play this game, print the target, then click on the image below with your webcam plugged in.
We hope you will enjoy this demo as much as we enjoyed creating it.
Happy Chinese New Year!
We have received several questions lately about QR Codes and how they are similar to or different from the usual augmented reality markers. A few points here may help you understand the topic better.
What is a QR Code?
Characters in a text can be encoded as bits – zeros and ones – which can then be printed in black and white. Following a specific pattern, we can encode a full string of characters as a set of small black and white squares. This is a QR Code (see one example in the picture).
If you want to read a QR Code, several mobile applications can do it for you. Your cell phone camera captures the QR Code and the app outputs a string (usually the URL of a website you may want to visit). Such an app performs 2D image analysis: it locates the finder patterns (the large squares in the corners, which are always the same) and from them deduces the position of every small square in the QR Code. There is no 3D computation, only 2D image analysis.
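To make the text-to-squares idea concrete, here is a toy Python sketch. It is not the real QR Code specification (which adds finder patterns, error correction and masking); it only illustrates the core principle of turning characters into bits and printing the bits as black and white modules, then reading them back:

```python
def text_to_bit_grid(text, width=8):
    """Encode each character as 8 bits and lay the bits out in rows."""
    bits = []
    for ch in text:
        bits.extend(int(b) for b in format(ord(ch), "08b"))
    # Pad with zeros so the last row is complete.
    while len(bits) % width:
        bits.append(0)
    return [bits[i:i + width] for i in range(0, len(bits), width)]

def render(grid):
    """Draw the grid with '##' for a black module, spaces for a white one."""
    return "\n".join("".join("##" if b else "  " for b in row) for row in grid)

def bit_grid_to_text(grid):
    """Invert the encoding: read the bits back into characters."""
    bits = [b for row in grid for b in row]
    chars = []
    for i in range(0, len(bits) - 7, 8):
        code = int("".join(str(b) for b in bits[i:i + 8]), 2)
        if code:  # skip padding zeros
            chars.append(chr(code))
    return "".join(chars)

grid = text_to_bit_grid("http://a.io")
print(render(grid))            # a small black-and-white pattern
print(bit_grid_to_text(grid))  # http://a.io
```

A real QR reader does the same inversion, except it must first find and un-distort the pattern inside a photograph, which is where the 2D image analysis comes in.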
QR Code or AR Marker?
A QR Code is not an augmented reality (AR) marker. They can look quite similar, but AR markers usually have fewer, larger black and white squares. The aim of an AR marker is not to convey a string: an AR application analyzes the camera image to compute the marker’s position and orientation in 3D. The computation is very different. With a QR Code, we read the value of the black and white squares but do not estimate its position, and we want it to stay still during the computation. With AR markers, we recognize a known marker among a set of previously learned ones, and we also get its position and orientation in real time as it moves. We then usually play interactive 3D animations in real time, driven by the marker’s position.
Why would a QR Code not be a good AR marker?
A QR Code is usually small, so the camera needs to be close to read the coded string. The QR Code reading algorithm is also very sensitive to movement. Once you stay still for a short moment and the coded string is recognized, the QR Code has done its job: you don’t want to track it while it moves.
AR markers are usually bigger and can easily be tracked by good augmented reality solutions. You don’t have to read or decode them: all you need is to recognize the marker and then track its movements to render interactive 3D animations accordingly, in real time. Recent advanced augmented reality solutions enable Markerless Tracking, which does not mean there is no marker at all, but lets you use any image as a marker (the logo of a company, or a picture) instead of a black and white AR marker.
What if I want to use a QR Code?
There are plenty of solutions. For example, you could make your mobile demo downloadable online and use a QR Code to spread the URL, while the actual demo uses any other marker.
Thank you again, and we hope you will have plenty of ideas for projects using computer vision and natural interface technologies. And don’t forget… Zest your ideas with 3D!
Hi everyone, and Happy New Year to you from 3D California!
First of all, we would like to thank you all for the tremendous year 2010 we had, and wish you the best for 2011! And because we love 3D, we made a small demo for you using our partner’s technology, D’Fusion; it is available here. Feel free to try it and tell us what you think of it!
Let’s open our eyes
So, we were in the middle of this series of articles about position detection, and this episode is supposed to show how computer vision can be used for it. Here we go!
First things first: what is computer vision? We briefly explained in one of the previous episodes what light and colors are. Our eyes perceive light and colors, and that is mainly what they do. They then send the information to the brain, where light and colors (low-level information) become distances, objects, faces (known or unknown), words, etc. (high-level information).
Computer vision is the field that studies the algorithms a computer needs in order to see, and to see high-level information: to recognize that two objects in two different images are identical, to recognize a bunch of pixels as a tree or a bike, to detect a face in an image and to know whose face it is. And, as this is the subject of this article, to determine the position of an object inside an image.
Two birds with one stone
So, why should we use computer vision for position detection? Because in most cases, we already have all the hardware up and running: we are trying to do augmented reality, and for that we need to add virtual objects to a live video stream, so in most cases we already have a camera. This single piece of hardware allows us to do both reality sampling and position detection. No need for an expensive magnetic device, no need for infrared lights. Just a camera and a computer.
A computer vision algorithm can be more or less complicated, but it usually relies on one thing: the value of the pixels. Take a pixel: its color or brightness can be quantified. So if we replace every pixel by its numerical value, we get a grid of numbers, a matrix, and mathematics is really good at extracting information from matrices. That’s all there is to it: we throw some maths at our digital image, and we get the information out.
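As a minimal illustration of "pixels as a matrix of numbers", here is a small numpy sketch. The image is a made-up 6x6 grid with a dark left half and a bright right half, and a single difference operation is enough maths to locate the vertical edge between them:

```python
import numpy as np

# A tiny 6x6 "grayscale image": 0 is black, 255 is white.
# The left half is dark, the right half bright: there is a vertical edge.
image = np.array([[10] * 3 + [240] * 3] * 6, dtype=float)

# Replace each pixel by the difference with its right-hand neighbour.
# A large value means a strong horizontal change, i.e. a vertical edge.
gradient = np.abs(np.diff(image, axis=1))

# The edge shows up as one column of large gradient values.
edge_column = int(np.argmax(gradient[0]))
print(gradient[0])   # the large value marks the edge
print(edge_column)   # 2: the edge sits between columns 2 and 3
```

Real edge detectors (Sobel, Canny, …) refine this same idea with smoothing and thresholds, but the principle is exactly this: arithmetic on the matrix of pixel values.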
So what’s the battle plan? We have an incoming video stream, and we want to output the coordinates of an object in it, as fast as possible. There are multiple solutions, and most of them differ in three aspects: assumptions, learning and running. To find your favorite teapot, you first have to know what a teapot is and what your favorite teapot looks like, and then you need to look everywhere and figure out whether you can see it. The steps are the same here.
“The least questioned assumptions are often the most questionable”
The quote is from Paul Broca. The question is: what do we know about the object we want to find? For example, if it is a building, we know we will probably see a lot of horizontal and vertical lines. If we are looking for an old-style augmented reality marker, we will see a thick black square with black and white squares inside. With Total Immersion’s Markerless Tracking (MLT), only a few assumptions are necessary: we only require an image that has no symmetry and some contrast in it. So first, we decide which assumptions we keep. The stronger the assumptions, the easier the computation will be, but the more constraints they will put on the application.
“The moment you stop learning, you stop leading”
This quote from Rick Warren makes a good introduction. Learning is the phase where we give the algorithm a way to tell apart any image that fits the assumptions from the image we are specifically looking for. With MLT, for example, it is the step where we give the algorithm a clear view of the target we will be tracking. In this phase, which is usually not real time, we extract features (middle-level information) from the images, such as borders, corners, interest points or keypoints.
Keypoints are points with a special property (usually a mathematical one) chosen to be stable: when the object moves within the video stream, these properties stay attached to the object and remain visible. During the learning phase, we learn how the keypoints are positioned relative to one another. And now we’re ready for the race.
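As a concrete (and deliberately minimal) example of such a mathematical property, here is a sketch of the classic Harris corner response in numpy, run on a made-up synthetic image. Real trackers use far more robust detectors and descriptors, but the idea is the same: a score that is high exactly where the image looks like a corner.

```python
import numpy as np

def harris_response(img, k=0.04, win=1):
    """Minimal Harris corner response: high values at corner-like keypoints."""
    iy, ix = np.gradient(img.astype(float))
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
    h, w = img.shape
    R = np.zeros((h, w))
    for y in range(win, h - win):
        for x in range(win, w - win):
            # Sum the gradient products over a small window around (y, x).
            sxx = ixx[y - win:y + win + 1, x - win:x + win + 1].sum()
            syy = iyy[y - win:y + win + 1, x - win:x + win + 1].sum()
            sxy = ixy[y - win:y + win + 1, x - win:x + win + 1].sum()
            # Strong gradients in BOTH directions (a corner) give a
            # large determinant; a plain edge or flat area does not.
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            R[y, x] = det - k * trace * trace
    return R

# Synthetic image: a bright square on a dark background.
img = np.zeros((12, 12))
img[3:9, 3:9] = 255.0

R = harris_response(img)
y, x = np.unravel_index(np.argmax(R), R.shape)
print(y, x)  # row and column of the strongest response, near a square corner
```

The stability is what matters: as the square translates in the video stream, the response peaks move with its corners, so the same keypoints can be found again frame after frame.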
Running… For president?
So this is it. We now have a user in front of the camera, and we need to get the position of the target he holds in his hands. So what do we do? We use the algorithm we prepared. If we learned the positions of corners, we analyze the image searching for corners; if we learned keypoints, we look for keypoints. It is easy to determine that the image we are looking for is indeed somewhere in the video stream; the hard part is to determine where. That’s where clever filtering and modeling algorithms like RANSAC come into play. RANSAC (RANdom SAmple Consensus) takes some data as input (say, the points in image A) and a model (say, a line) whose parameters we want to find (for a line, the parameters could be height and slope), then looks for the points that fit the model best, completely ignoring the points that don’t fit. It then outputs the good points and their best model (the blue line).
In exactly the same way, given all the keypoints in the image and a model (the keypoints from the learning phase, the unknown parameters being their position and orientation), RANSAC gives us the model (position and orientation) that best fits our object’s keypoints, ignoring the background keypoints.
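The line-fitting example above can be sketched in a few lines of Python. This is a toy RANSAC on made-up data, not Total Immersion’s implementation, but it shows the pick-two-points, count-the-consensus loop:

```python
import random

def ransac_line(points, iters=200, threshold=0.5, seed=0):
    """Fit a line y = slope * x + height to noisy points with RANSAC:
    repeatedly pick 2 random points, build the line through them, and
    keep the line that the largest number of points agrees with."""
    rng = random.Random(seed)
    best_inliers, best_model = [], None
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical pair, cannot express as y = slope * x + height
        slope = (y2 - y1) / (x2 - x1)
        height = y1 - slope * x1
        # Inliers are the points closer to this line than the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + height)) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, (slope, height)
    return best_model, best_inliers

# 20 points on the line y = 2x + 1, plus a few gross outliers.
points = [(x, 2 * x + 1) for x in range(20)]
points += [(3, 40), (7, -15), (12, 90)]

(slope, height), inliers = ransac_line(points)
print(slope, height)  # 2.0 1.0
print(len(inliers))   # 20: the outliers are completely ignored
```

Replace "line" with "3D pose of the learned keypoints" and "points" with "keypoints detected in the frame", and you have the tracking step described above.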
Phew, that was quite a trip, wasn’t it? We did it! We now have the position and orientation of our object in the video stream. Even better: now that we know the position in this image, it will be easier to find it in the next one, because we know it can’t have moved that much. We can make new assumptions, and new assumptions mean easier computation, which means it runs faster!
So those were a few selected ideas on how mathematics and its applications in computer vision can really make your life easier when you are trying to augment reality. This concludes this three-part Quick peek behind the curtain article. I hope you liked it! If you want me to tackle a specific subject on Augmented and Virtual Reality technologies next time, feel free to drop a comment, a mail, or anything.
In 1976, Atari released the arcade game Breakout. This so-simple-yet-so-addictive game inspired many and soon became recognized as a masterpiece of video game culture, a position it has kept ever since, alongside Pong, Pac-Man and Space Invaders.
3D California revisits this classic among classics with a whole new gameplay. This 3D version is controlled by image motion detection through your webcam: you simply move the paddle by moving your hand. For extra interactivity, you can move the whole game in 3D simply by printing this flyer. We developed this project with our partner Total Immersion’s incredible technology, D’Fusion!
Ready, set, start playing!
Ah, there you are! It’s time for another short article on the insides of augmented (and virtual) reality techniques. One of the big challenges when merging real and virtual objects is that we need to place our virtual camera in the exact same position as our real camera. This means that the real camera must be able to report its coordinates relative to a reference object, which will have a virtual representation. So the point is: given a real world (let’s say ours, for example), we need the coordinates of a moving object. This position detection, or tracking, how can we do it?
I, robot arm
The simplest solution is to attach the object to a robot arm equipped with angle or slide sensors. Moving the object changes the measured angles and distances, which a computer detects and uses to update the virtual object’s position. It can even work both ways: if your robot arm is equipped with sensors AND motors, it can provide force feedback, preventing you from entering solid (virtual) objects. You can see below an example of a haptic device (i.e. one linked to the sense of touch). The main problem is a quite limited range: the arm is usually expensive, and building an arm that allows much more than 50 cm of free movement in each direction is not easy.
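To show how angle sensors become a position, here is a sketch of the forward kinematics of a hypothetical 2-joint planar arm; the link lengths (0.3 m and 0.2 m) are made up for the example:

```python
import math

def arm_position(theta1, theta2, l1=0.3, l2=0.2):
    """Forward kinematics of a 2-joint planar arm (link lengths in meters):
    given the two angles read from the sensors, compute where the tip of
    the arm, and hence the tracked object, is relative to the base."""
    # Position of the elbow, then of the tip.
    elbow_x = l1 * math.cos(theta1)
    elbow_y = l1 * math.sin(theta1)
    tip_x = elbow_x + l2 * math.cos(theta1 + theta2)
    tip_y = elbow_y + l2 * math.sin(theta1 + theta2)
    return tip_x, tip_y

# Arm fully stretched along the x axis: tip at l1 + l2 = 0.5 m.
print(arm_position(0.0, 0.0))         # ~(0.5, 0.0)
# Elbow bent 90 degrees.
print(arm_position(0.0, math.pi / 2)) # ~(0.3, 0.2)
```

A computer polling the sensors just evaluates this formula every frame to update the virtual object; a real device would extend it to three dimensions and more joints.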
A whole new (magnetic) field
Another way to detect an object’s position is to use a magnetic field. There are different ways to do it, but many of them are quite similar. The plan is to generate a magnetic field with an electromagnet and to measure the intensity of this field along the 3 directions of space: the closer to the source, the more intense the field. From the measured intensities we can recover the distance from the source to the sensor along all 3 directions, thus giving the coordinates, et voilà. How do we do it? This is a question of physics: if we make a small circuit containing a coil, then a moving magnetic field induces a current in the circuit, and we can measure that current. It’s the same principle we use to generate electricity in power plants. But there is a problem: we need to know the shape of the magnetic field we generate, and it is highly dependent on the environment we are in, especially the presence of metal objects. So this is a great system, but if someone brings in a metal chair to watch, it stops working. And how could I talk about magnetic tracking without mentioning another device that has been used for thousands of years: the good old compass, which uses the Earth’s magnetic field and a natural magnet to point north and help lost travelers. Even this old trick found its way into Augmented Reality:
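As a very rough sketch of the "closer means more intense" idea, here is a toy inversion that assumes the intensity falls off as 1/r³ (a dipole-like model) with a made-up, pre-calibrated source strength. A real system must also deal with the field’s shape, the sensor’s orientation, noise, and the nearby metal objects mentioned above:

```python
K = 8.0  # source strength in made-up units, known from calibration

def intensity_at(r, k=K):
    """Simplified dipole-like fall-off: intensity = k / r^3."""
    return k / r ** 3

def distance_from_intensity(intensity, k=K):
    """Invert intensity = k / r^3 to recover the distance r."""
    return (k / intensity) ** (1.0 / 3.0)

# Simulate a sensor 2.0 units from the electromagnet, then recover
# the distance from the intensity it reads.
reading = intensity_at(2.0)
print(distance_from_intensity(reading))  # ~2.0
```

Doing this inversion once per spatial axis is what turns three intensity readings into the three coordinates of the sensor.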
Let’s throw our friends into outer space!
While we’re talking about compasses, we may also consider another (more recent) technology that can be used for tracking, and that has a huge range: the Global Positioning System, better known as GPS. Imagine you’re on a road, but you don’t know where. You have a friend at the 30th mile of the road. If you know that you’re 10 miles away from him, then you’re either at the 20th mile or at the 40th. Now if you know you have another friend at the 50th mile, and you’re 10 miles away from him too, well, you know where you are. You’ve just created a one-dimensional GPS. For a three-dimensional one, you need 4 friends. So let’s say your friends are satellites revolving around the Earth, and their positions are known. If you can tell how far you are from each of them, then you can derive your own position. Your 4 satellite friends, who carry very precise clocks in their electronics, send a message containing the current time, and this message travels through space at the speed of light to the GPS device in your car. When you compare the time in the message with the current time, there is a slight difference due to the travel time, and, knowing the speed of light, this time difference can be converted into a distance to the satellite. Thanks to your space friends, you now know the shortest path to the nearest grocery store!

GPS powers most of the Augmented Reality features we find in iPhone apps today: when an app points to a direction and a distance, it uses GPS. But there are some limitations. GPS is not very accurate on most devices, and even if you could get your position with an error of less than a meter, many Augmented Reality applications require something more like a centimeter or even a millimeter. And GPS does not give you your orientation, so even with a compass, you would still need other information to get everything we need.
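The "friends on the road" example can be sketched directly: a clock delay becomes a distance via the speed of light, and two distance measurements from known positions pin you down on a 1D road. The numbers are the ones from the story above:

```python
C = 299_792_458.0  # speed of light, m/s

def distance_from_delay(t_sent, t_received):
    """Time of flight of the satellite's message -> distance to it."""
    return (t_received - t_sent) * C

def locate_1d(friends):
    """friends: list of (position, distance) pairs on a 1D road.
    Each pair allows two candidate positions (left or right of the
    friend); keep only the candidates consistent with every pair."""
    candidates = None
    for pos, dist in friends:
        pair = {pos - dist, pos + dist}
        candidates = pair if candidates is None else candidates & pair
    return candidates

# A 0.1 s delay corresponds to about 30 000 km.
print(distance_from_delay(0.0, 0.1))

# Friend at mile 30, we are 10 miles away: mile 20 or mile 40.
# Friend at mile 50, also 10 miles away: mile 40 or mile 60.
print(locate_1d([(30, 10), (50, 10)]))  # {40}: we are at mile 40
```

The real 3D problem works the same way, except each measurement constrains you to a sphere instead of two points, which is why four satellites are needed (the fourth also absorbs the error of the receiver’s cheap clock).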
Pretty much all we can do for Augmented Reality with GPS alone is something like this:
So that’s it for today, but part 2 will soon be there for you, 3D fans. We’ll be talking about infrared light, accelerometers and gyroscopes, and finally computer vision. Stay tuned!