Posts Tagged ‘3D software’
Hi everyone, and Happy New Year to you from 3DCalifornia!
First of all, we would like to thank you all for the tremendous year 2010 we had, and wish you the best for 2011! And because we love 3D, we made a small demo for you using D’Fusion, our partner’s technology, and it is available here. Feel free to try it and tell us what you think of it!
Let’s open our eyes
So we were in the middle of this series of articles about position detection, and this episode is supposed to show how computer vision can be used for that. Here we go!
First things first, what is computer vision? We explained briefly in one of the previous episodes what light is and what colors are. Our eyes can perceive light and color, and that’s mainly what they do. Then they send the information to the brain, where light and color (low level information) become distances, objects, faces (known, unknown), words, etc. (high level information).
Computer vision is the field that studies the algorithms a computer needs in order to see, and in particular to see high level information: to recognize that 2 objects are identical in 2 different images, to recognize a bunch of pixels as a tree or a bike, to recognize a face in an image and to know that it’s someone’s face in particular. And, as this is the subject of this article, to determine the position of an object inside an image.
Two birds with one stone
So, why should we use computer vision for position detection? Because in most cases, we already have all the hardware up and running: we’re trying to do augmented reality, and for that, we need to add virtual objects to a live video stream. So in most cases, we already have a camera. This single piece of hardware allows us to do both reality sampling and position detection. No need for expensive magnetic devices, no need for infrared lights. Just a camera and a computer.
A computer vision algorithm can be more or less complicated, but it will usually rely on one thing: the value of the pixels. We have a pixel. Its color or brightness can be quantified. So if we replace all the pixels by their numerical values, we now have a grid of numbers, a matrix, and mathematics is really good at extracting information from matrices, so that’s all we need. We throw some maths at our digital image, and we get the info.
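To make the “grid of numbers” idea concrete, here is a minimal sketch in plain Python. The 4x4 image and the function names are made up for illustration; a real frame would come from a camera and be much larger. Subtracting neighbouring pixels is one very simple way of “throwing some maths” at an image: large differences reveal an edge.

```python
# A tiny grayscale "image": each number is one pixel's brightness
# (0 = black, 255 = white). The values are invented for this example.
image = [
    [10, 12, 200, 210],
    [11, 13, 205, 208],
    [9, 14, 198, 212],
    [10, 12, 202, 209],
]


def horizontal_gradient(img):
    """Difference between each pixel and its right-hand neighbour.

    Large absolute values mark a vertical edge between two columns.
    """
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in img]


def strongest_edge_column(img):
    """Index (in the gradient grid) of the column with the strongest edge."""
    grad = horizontal_gradient(img)
    width = len(grad[0])
    strength = [sum(abs(row[x]) for row in grad) for x in range(width)]
    return strength.index(max(strength))


print(horizontal_gradient(image)[0])  # [2, 188, 10]
print(strongest_edge_column(image))   # 1: the edge sits between columns 1 and 2
```

The dark-to-bright jump between the second and third columns shows up as the largest numbers in the gradient, without the program “knowing” anything about the picture.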
So what’s the battle plan? We have an incoming video stream and we want to output the coordinates of our object in it, as fast as possible. There are multiple solutions, and most of them differ in 3 aspects: Assumptions, Learning and Running. In order to find your favorite teapot, you first have to know what a teapot is and what your favorite teapot looks like, and then you need to look everywhere and figure out if you can see it. The steps are the same here.
“The least questioned assumptions are often the most questionable”
The quote is from Paul Broca. The question is “what do we know about the object we want to search for?”. For example, if it is a building, we know that we will probably see a lot of horizontal and vertical lines. If we’re looking for an old augmented reality marker, we will see a thick black square with black and white squares inside. And with Total Immersion’s technology, Markerless Tracking (MLT), only a few assumptions are necessary: we only assume that we will be seeing an image that has no symmetry and that has some contrast in it. So first, we decide what kind of assumptions we keep. The stronger the assumptions, the easier the computation will be, but the more constraints they will create.
“The moment you stop learning, you stop leading”
This quote from Rick Warren will surely be a good introduction. Learning is the phase where we give the algorithm a way to tell apart any image that merely fits the assumptions from the image we’re specifically looking for. With MLT, for example, it is the step where we give the algorithm a clear view of the target we’ll be tracking. In this phase, which is usually not real time, we extract features (middle level information) from images, such as borders, corners, interest points or keypoints.
Keypoints are points that have a special property (usually a mathematical property), and this property is chosen to be stable, which means that when the object moves within your video stream, these properties will stay with the object and still be visible. During the learning phase, you learn how the keypoints are positioned relative to one another. And now you’re ready for the race.
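As an illustration of what such a “mathematical property” can look like, here is a small sketch in the spirit of the classic Moravec corner measure. This is a stand-in chosen for simplicity, not the feature detector used by MLT; the synthetic image and all function names are assumptions made for the example. The idea: a small window around a corner changes no matter which way you shift it, whereas a window on a flat area or along a straight edge can be shifted in at least one direction without changing.

```python
def make_image(size=10, start=5):
    """Synthetic image: black background, white square in the lower-right part."""
    return [[255 if (y >= start and x >= start) else 0 for x in range(size)]
            for y in range(size)]


# Shift directions (dy, dx) tested around each pixel.
SHIFTS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def window_difference(img, y, x, dy, dx):
    """Sum of squared differences between a 3x3 window and its shifted copy."""
    total = 0
    for j in (-1, 0, 1):
        for i in (-1, 0, 1):
            a = img[y + j][x + i]
            b = img[y + j + dy][x + i + dx]
            total += (a - b) ** 2
    return total


def corner_score(img, y, x):
    """Smallest change over all shifts: high only if every shift changes the window."""
    return min(window_difference(img, y, x, dy, dx) for dy, dx in SHIFTS)


def best_keypoint(img):
    """(row, column) of the highest score, skipping a 2-pixel border."""
    h, w = len(img), len(img[0])
    candidates = [(y, x) for y in range(2, h - 2) for x in range(2, w - 2)]
    return max(candidates, key=lambda p: corner_score(img, p[0], p[1]))


image = make_image()
print(corner_score(image, 2, 2))  # 0: flat area
print(corner_score(image, 7, 5))  # 0: straight edge
print(best_keypoint(image))       # (5, 5): the corner of the square
```

The corner of the square keeps a high score wherever the square is drawn in the image, which is the kind of stability we want from a keypoint.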
Running… For president?
So this is it. Now, we have a user in front of the camera, and we need to get the position of a target he has in his hands. So what do we do? We use the algorithm we prepared. If we learned the positions of the corners, we’ll be analyzing the image, searching for corners. If we learned the keypoints, we’ll look for the keypoints. It’s easy to determine that the image we’re looking for is indeed in the video stream. The hard part is to determine where. That’s where some clever filtering and modeling algorithms like RANSAC come into play. RANSAC (for RANdom SAmple Consensus) takes some data as an input (let’s say the points in image A) and a model (let’s say a line) whose parameters we want to find (for a line, these would be the intercept and the slope, for example). It then looks for the points that fit the model best, and completely ignores the points that don’t fit it. Finally, it outputs the good points and the model that fits them (the blue line).
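Here is a minimal sketch of RANSAC for the line example, in plain Python. The sample points, the number of iterations and the tolerance are all assumptions picked for the illustration; production implementations choose them more carefully and usually refit the model on all inliers at the end.

```python
import random


def fit_line(p, q):
    """Slope and intercept of the line through two points with distinct x."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Return ((slope, intercept), inliers) for the line most points agree on."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)  # 1. pick a minimal random sample
        if p[0] == q[0]:
            continue  # vertical pair: cannot be written as y = slope * x + intercept
        slope, intercept = fit_line(p, q)  # 2. fit the model to the sample
        inliers = [(x, y) for (x, y) in points
                   if abs(y - (slope * x + intercept)) <= tolerance]  # 3. who agrees?
        if len(inliers) > len(best_inliers):  # 4. keep the most popular model
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers


# Ten points exactly on y = 2x + 1, plus three points that have nothing to do with it.
line_points = [(x, 2 * x + 1) for x in range(10)]
outliers = [(2, 15), (5, -4), (8, 3)]
model, inliers = ransac_line(line_points + outliers)
print(model)         # (2.0, 1.0)
print(len(inliers))  # 10
```

Note that the three stray points never influence the result: a line drawn through one of them gathers almost no support, so it is simply discarded.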
In exactly the same way, we give RANSAC all the keypoints in the image and a model (the keypoints from the learning phase, whose unknown parameters are their position and orientation), and it gives us the model parameters (position and orientation) that fit our object’s keypoints, ignoring the background keypoints.
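The same loop can be sketched for a 2D position and orientation. This is a simplified illustration, not Total Immersion’s algorithm: it assumes each learned keypoint has already been paired with a keypoint in the image (some pairs being wrong), and it only recovers a flat rotation plus translation. All names and numbers are hypothetical.

```python
import math
import random


def estimate_pose(m1, m2, s1, s2):
    """Rotation angle and translation taking model points m1, m2 onto scene points s1, s2."""
    angle = (math.atan2(s2[1] - s1[1], s2[0] - s1[0])
             - math.atan2(m2[1] - m1[1], m2[0] - m1[0]))
    c, s = math.cos(angle), math.sin(angle)
    tx = s1[0] - (c * m1[0] - s * m1[1])
    ty = s1[1] - (s * m1[0] + c * m1[1])
    return angle, tx, ty


def apply_pose(pose, p):
    """Rotate point p by the pose angle, then translate it."""
    angle, tx, ty = pose
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def ransac_pose(matches, iterations=200, tolerance=0.1, seed=0):
    """matches: list of (model_point, scene_point). Returns (pose, inlier matches)."""
    rng = random.Random(seed)
    best_pose, best_inliers = None, []
    for _ in range(iterations):
        (m1, s1), (m2, s2) = rng.sample(matches, 2)
        if m1 == m2:
            continue  # two identical model points cannot define a direction
        pose = estimate_pose(m1, m2, s1, s2)
        inliers = []
        for m, s in matches:
            px, py = apply_pose(pose, m)
            if math.hypot(px - s[0], py - s[1]) <= tolerance:
                inliers.append((m, s))
        if len(inliers) > len(best_inliers):
            best_pose, best_inliers = pose, inliers
    return best_pose, best_inliers


# Learned keypoints, and where they appear once the object is rotated 30 degrees
# and moved by (10, -2). Two extra matches point at unrelated background positions.
true_pose = (math.radians(30), 10.0, -2.0)
model_points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 5), (1, 1)]
good_matches = [(m, apply_pose(true_pose, m)) for m in model_points]
bad_matches = [((3, 3), (-30, 10)), ((5, 1), (12, -40))]
pose, inliers = ransac_pose(good_matches + bad_matches)
print(len(inliers))  # 6: only the matches that belong to the object
```

The recovered pose agrees with the one used to generate the data, and the two background matches are left out of the consensus.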
Phew, that was quite a trip, wasn’t it? We did it! We now have the position and orientation of our object in the video stream. And, even better: now that we know the position in this image, it will be even easier to find it in the next image, because we know it can’t have moved that much. We can make new assumptions, and new assumptions mean it’s easier to compute, which means it runs faster!
So those were a few selected ideas of how mathematics and its applications in computer vision can really make your life easier when you’re trying to augment reality. And this concludes this three-part Quick peek behind the curtain article. I hope you liked it! If you want me to tackle a specific subject on Augmented and Virtual technologies next time, feel free to drop a comment, a mail, or anything.
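A quick back-of-the-envelope sketch shows why this helps. The frame size (640x480) and the maximum motion between two frames (20 pixels) are assumptions chosen for the example, and the function names are hypothetical.

```python
def search_window(prev_x, prev_y, max_motion, width, height):
    """Square window around the previous position, clamped to the image bounds.

    Returns (left, top, right, bottom), all inclusive.
    """
    left = max(0, prev_x - max_motion)
    top = max(0, prev_y - max_motion)
    right = min(width - 1, prev_x + max_motion)
    bottom = min(height - 1, prev_y + max_motion)
    return left, top, right, bottom


def pixels_to_scan(window):
    """Number of pixels inside an inclusive (left, top, right, bottom) window."""
    left, top, right, bottom = window
    return (right - left + 1) * (bottom - top + 1)


full_frame = pixels_to_scan((0, 0, 639, 479))
window = search_window(320, 240, 20, 640, 480)
local = pixels_to_scan(window)
print(window)      # (300, 220, 340, 260)
print(full_frame)  # 307200
print(local)       # 1681
```

Under these assumptions, looking only around the previous position means examining fewer than 1% of the pixels of the full frame.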
Pride is an essential part of a sales job. Being proud to deliver products or services that can change the way people live is a major incentive in a salesperson’s everyday life. But when it comes to 3D software and products, pride just reaches a new dimension.
3D application vendors are not just regular vendors. 3D apps vendors sell apps that change the way we live. Applications that change the way you live.
Remember, for example, CAD/CAM software? Such 3D applications have revolutionized the automotive and aeronautics industries. 20 or 30 years ago, designers had to spend hours with pen and paper to create products, engineers spent days manufacturing those products based on drawings, and support engineers spent months testing products before the sales process began. In these industries, 3D has simply changed the way things are done, from design to manufacturing and support.
Superimposing computer-generated information on the visual field in real time has become a reality in Japan. The goal is to improve and enrich the shopper experience. Several firms are testing such devices.
Toppan Printing Co Ltd provides a dedicated terminal that can, for example, improve signage and item information, or even locate a nearby store. It’s very simple: the consumer just has to show the product in front of the terminal’s camera, which recognizes the product package and returns a 3D augmented display of the product with its description.
A complementary application linking the physical and virtual worlds: QR code technology.
You can capture QR codes with your mobile phone, then simply show the QR code to the camera of the dedicated Toppan terminal to get samples of the associated product.
Another example comes from Sony Music Communications Inc and Sky&Road Co Ltd, which offer an “interactive show-window”: you just have to stand in front of the terminal (fitted with a camera) and a virtual image (representing a pattern or decoration…) is superimposed on your real image.
Finally, from the same two companies, we note the “Magic mirror”, which could be a major innovation in the field of clothing.
Imagine you’re wearing a long-sleeved shirt and you want to know what you would look like in a short-sleeved shirt: you just have to stand in front of the mirror, and the computer generates a pair of bare arms. It is easy to see the scope of such applications in clothing and elsewhere.
The current applications are developed with Total Immersion’s AR development kit.
The men’s magazine Esquire dedicated the cover of its December edition to the actor Robert Downey Jr. On this occasion, the magazine is offering an AR (Augmented Reality) issue including several AR experiences for the reader.
The choice was made to use big black & white markers on each targeted page of the magazine, and an exe download (from the Esquire website) with no clear information about the publisher in the Windows dialog box. So far, the interaction with the content is very simple, and having celebrities involved in this kind of initiative is always exciting.
Another style of AR experience comes from www.instyle.com (which belongs to Time Inc), still involving celebrities… This time, we have markerless technology: your webcam tracks images directly from the magazine, and you download a clearly identified plugin (ActiveX in a browser).
Who will be next? Which magazine? Which celebrity? Let me know your thoughts…
The main innovation lies in the musical object itself, the physical album (CD and jacket cover).
The idea is to turn the audio CD into an object enriched by augmented reality. You need the CD, a computer and a webcam to live this new digital experience.
Music consumers may interact with a displayed virtual projection and create their own musical experience, for example by using an application to create a virtual musical mix.
This way, the physical object becomes a kind of dynamic open work, with an endless potential.
Since the physical CD is no longer a finite and static object, music buyers, and especially digital natives, will probably come back to the store to buy CDs.
Bonuses and additional tracks, like a “Making-of”, are a common feature on most DVDs. With augmented reality, the physical medium gains additional value of its own: objects turn into concepts, in a world where bits and atoms may co-exist.
Just have a look at the following French video: