Unlocking AR’s hidden potential for scalable and maintainable experiences
At Q42, we have a strong collaboration with the Rijksmuseum, focusing on developing innovative digital solutions based on the museum's challenges and opportunities. One of these is the use of Augmented Reality (AR) in the museum. Overlaying AR layers on artworks provides visitors with additional information and new insights, enriching their experience, like this earlier example by Q-er Daniello. Many paintings in the Rijksmuseum contain hidden stories that are not visible to the naked eye: details revealed by an infrared camera, for example, or sketches showing how the work looked in earlier stages. In addition, AR offers the possibility to digitally enlarge and/or explain specific elements of artworks.
My graduation project as an intern at Q42 focused on developing an AR solution that enables Rijksmuseum staff to easily create, manage, and publish AR content for their visitors on a large scale, bringing these hidden stories and possibilities to life.
AR under the hood
AR is an impressive piece of technology in which an algorithm builds a 3D representation of the environment from camera images and the motion detected by the device's sensors. It does this by tracking feature points and combining them into a so-called point cloud. Based on this point cloud, planes are recognised to which content can be attached. In the case of the Rijksmuseum, a Cloud Anchor is attached, which, as the name suggests, is an anchor point stored in the cloud. This makes it possible to create a consistent experience across various devices.
The application compares what is stored in the cloud with what the user is currently scanning, and if a match is found, the user receives the Cloud Anchor at the correct position.
Cloud Anchors are part of ARCore. More information on how ARCore works under the hood can be found in this deep-dive.
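To make this concrete, below is a minimal sketch of hosting and resolving a Cloud Anchor with ARCore Extensions for AR Foundation, the Unity-based setup used in this project. The exact method names differ between versions of the ARCore Extensions package; this sketch assumes the older polling-style API, and the class and field names are illustrative rather than taken from the actual project code.

```csharp
using Google.XR.ARCoreExtensions;
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class CloudAnchorExample : MonoBehaviour
{
    [SerializeField] private ARAnchorManager anchorManager;

    private ARCloudAnchor pendingAnchor;

    // Step 1 (on location): host a local anchor so other devices can find it later.
    public void Host(ARAnchor localAnchor)
    {
        // Uploads the visual features around the anchor; keep scanning the
        // artwork from multiple angles while this is in progress.
        pendingAnchor = anchorManager.HostCloudAnchor(localAnchor, 365);
    }

    // Step 2 (visitor's device): resolve the anchor by scanning the same spot.
    public void Resolve(string cloudAnchorId)
    {
        pendingAnchor = anchorManager.ResolveCloudAnchorId(cloudAnchorId);
    }

    private void Update()
    {
        if (pendingAnchor == null) return;

        if (pendingAnchor.cloudAnchorState == CloudAnchorState.Success)
        {
            // The anchor's transform now matches the physical spot that was
            // scanned when hosting; AR content can be parented to it.
            Debug.Log($"Cloud Anchor ready: {pendingAnchor.cloudAnchorId}");
            pendingAnchor = null;
        }
        else if (pendingAnchor.cloudAnchorState != CloudAnchorState.TaskInProgress)
        {
            Debug.LogWarning($"Cloud Anchor failed: {pendingAnchor.cloudAnchorState}");
            pendingAnchor = null;
        }
    }
}
```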
The challenge
Before I started this assignment, some aspects of AR in the Rijksmuseum had already been validated through small proofs of concept (POCs). Image Anchors, for example in the form of QR codes, were ruled out: the museum didn't want to place markers on every artwork, and the technology wasn't accurate or reliable enough, causing drift that made the content jitter across the artwork. However, one technique that consistently worked and proved promising was Google's Cloud Anchors.
Given that I only had about five months for this graduation project and didn't have much experience with technologies like Vulkan or OpenGL (for native development on Android), I decided to use Unity for this project, mainly because I already had some experience working with the Unity engine. A nice advantage was the ability to develop for both iOS and Android simultaneously from a single codebase.
However, there were several requirements that made it challenging to make AR usable for the Rijksmuseum. These requirements were made to ensure a consistent and accurately displayed AR experience for the user and the content creators, i.e. Rijksmuseum staff.
These are as follows:
- The Rijksmuseum must be able to create Cloud Anchors during opening hours without causing inconvenience to visitors.
- To do this 'on a large scale' it's important that it doesn't take too much time on location. Therefore, content creation should be possible outside the gallery.
- The product should be resilient to changes in the environment: it must keep working even during temporary exhibitions.
- The content linked to an art object must be displayed with high precision (maximum deviation of 5mm).
These requirements run counter to the nature of AR. As we learned above, we can only retrieve a Cloud Anchor when we're scanning the exact same environment where it was previously created. However, if we could somehow recreate the environment in a 3D editor, that would solve this problem, as it would allow us to map the content externally.
This meant looking for a creative solution that splits the process into two steps:
- Create the Cloud Anchor on location.
- Digitally align the content with the anchor somehow, away from the gallery.
I looked for an existing approach, but it seemed no one had tried this before, which forced me to think outside the box and come up with my own solution.
Journey to the solution
After some brainstorming, I came up with the idea of storing the three-dimensional position of the four corners of the painting relative to the anchor, as shown in figure 1. For consistency we'll call these four corners the reference points. Storing the position of these reference points enables us to create an accurate representation of the position and rotation of the painting in 3D space. We can then accurately map the content on top of the painting in an external editor, without actually being present at the spot. In theory, when we later resolve the content by scanning the actual artwork, it should show up at exactly the place where we mapped it in the editor.
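As a rough sketch of what this looks like in Unity: the corners are converted into the anchor's local space when capturing, and back into world space when reconstructing. The class and method names below are illustrative, not the project's actual code.

```csharp
using System;
using UnityEngine;

// Stores each painting corner relative to the anchor, so the painting's
// position and rotation can be reconstructed later in an external 3D editor.
[Serializable]
public class PaintingReferencePoints
{
    // Corner positions in the anchor's local space
    // (order: top-left, top-right, bottom-right, bottom-left).
    public Vector3[] localCorners = new Vector3[4];

    // On location: convert the measured world-space corners into anchor space.
    public static PaintingReferencePoints Capture(Transform anchor, Vector3[] worldCorners)
    {
        var result = new PaintingReferencePoints();
        for (int i = 0; i < 4; i++)
            result.localCorners[i] = anchor.InverseTransformPoint(worldCorners[i]);
        return result;
    }

    // Later (in the editor, or when resolving on a visitor's device):
    // reconstruct the world-space corners from the anchor's pose.
    public Vector3[] ToWorld(Transform anchor)
    {
        var worldCorners = new Vector3[4];
        for (int i = 0; i < 4; i++)
            worldCorners[i] = anchor.TransformPoint(localCorners[i]);
        return worldCorners;
    }
}
```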
Well… easier said than done.
When testing my first POC, this workaround definitely seemed promising. The result was consistent, but the implemented solution did not allow for high precision: deviations in positioning sometimes reached 5 to 10cm, far exceeding the maximum of 5mm.
The cause was the technique used for placing the reference points. Since we cannot interact directly with the point cloud that ARCore generates in the background, we depend on the planes that ARCore draws for us, and these are fairly inaccurate since they are only rough estimates of the surfaces.
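For reference, placing a point this way boils down to raycasting a screen tap against the detected planes, roughly like the sketch below (class name and structure are illustrative). The resulting point is only as accurate as the plane estimate, which is exactly what caused the deviations described above.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class ReferencePointPlacer : MonoBehaviour
{
    [SerializeField] private ARRaycastManager raycastManager;

    private static readonly List<ARRaycastHit> hits = new List<ARRaycastHit>();

    // Converts a tap on the screen into a point on the nearest detected plane.
    public bool TryPlace(Vector2 screenPosition, out Vector3 worldPosition)
    {
        worldPosition = default;
        if (raycastManager.Raycast(screenPosition, hits, TrackableType.PlaneWithinPolygon))
        {
            // The first hit is the closest plane intersection.
            worldPosition = hits[0].pose.position;
            return true;
        }
        return false;
    }
}
```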
I immediately had a workaround in mind for increasing the accuracy of a specific reference point: being able to drag it along the X, Y or Z axis after placement would let us correct the inaccuracies caused by ARCore. The idea was to create an indicator similar to the XYZ axes you see in a 3D editor like Blender or Cinema 4D (figure 2). The user could then move the reference point by tapping a specific axis and making a drag gesture to move the point along it.
During production and validation of this second POC I quickly realised that this solution made it easy to move the reference points, but unfortunately not with high precision.
My second approach used buttons to move a reference point: six buttons per axis, three to move it backwards and three to move it forwards, in steps of 1mm, 5mm and 10mm. This allows for precise adjustments, quick large movements, or something in between, and it worked far better than the dragging behaviour.
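A minimal sketch of what such a nudge boils down to in Unity, assuming the axis and step size come straight from the tapped button (names are illustrative):

```csharp
using UnityEngine;

public class ReferencePointNudger : MonoBehaviour
{
    // The reference point currently being adjusted.
    public Transform selectedPoint;

    // Called from the UI buttons, e.g. Nudge(Vector3.right, 1f) for "+1mm on X"
    // or Nudge(Vector3.up, -5f) for "-5mm on Y".
    public void Nudge(Vector3 axis, float millimetres)
    {
        // Unity units are metres, so convert the step before applying it.
        selectedPoint.position += axis.normalized * (millimetres / 1000f);
    }
}
```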
However…
Even though we gained the ability to accurately place each reference point, we had to compromise on the time spent creating an anchor, which now takes a couple of minutes or even more, depending on how difficult it proves to get a certain painting accurate. This workload unfortunately multiplies quickly when the museum has to maintain a lot of different experiences, so there was still room for quite some improvement.
And so the search for a better solution began…
The answer to life, the universe and everything
After a few brainstorming sessions, I realised that I could leverage the massive collection of high-resolution scans stored in Micrio, the zoomable image viewer we built for the Rijksmuseum. By feeding these scans into AR Foundation, I could use image recognition to place an Image Anchor directly on the painting itself, eliminating the need for a Cloud Anchor or for an Image Anchor based on a QR code or similar marker.
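Conceptually, this means adding each scan to the image-tracking library at runtime. A rough sketch of that idea with AR Foundation is shown below; it assumes a version of AR Foundation (4.1 or later) that supports mutable runtime reference image libraries, and the class and parameter names are illustrative.

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class PaintingImageLibrary : MonoBehaviour
{
    [SerializeField] private ARTrackedImageManager trackedImageManager;

    private MutableRuntimeReferenceImageLibrary library;

    private void Start()
    {
        // Create a library that can be extended at runtime and make the
        // tracked image manager use it. Not all devices support this,
        // hence the cast-and-check.
        library = trackedImageManager.CreateRuntimeLibrary() as MutableRuntimeReferenceImageLibrary;
        if (library != null)
            trackedImageManager.referenceLibrary = library;
    }

    public void AddPainting(Texture2D scan, string objectNumber, float physicalWidthMetres)
    {
        if (library == null)
        {
            Debug.LogWarning("Runtime reference image libraries are not supported on this device.");
            return;
        }

        // The physical width lets the tracker estimate the painting's pose
        // and scale from a single detection.
        library.ScheduleAddImageWithValidationJob(scan, objectNumber, physicalWidthMetres);
    }
}
```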
Initial tests were promising, nice! So, I headed to the museum to test the solution on a large number of artworks in the actual environment. Unfortunately, it didn't work for some paintings, which was a bummer. This was caused by a lack of complexity in certain images: very dark paintings with minimal colour, paintings with insufficient lighting for tracking, or paintings where reflections from the varnish made them untrackable. However, for the paintings where it did work, the quality and accuracy of the tracking were great, well within the requirement of less than 5mm deviation.
Because of this, replacing Cloud Anchors entirely with Image Anchors was off the table. However, I wondered if we could combine the two to get the best of both worlds. After some more brainstorming, we decided on the following approach: we'd still use a Cloud Anchor to map the experience, but we'd also use an Image Anchor to determine where the reference points are in virtual space. For paintings that aren't recognised or don't have a usable high-res scan in Micrio, there's a fallback to manual placement.
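In this combined setup, the tracked image provides the painting's pose and physical size, from which the four corner reference points can be derived automatically instead of being placed by hand. A sketch of that derivation is shown below, assuming AR Foundation's convention that a detected image lies in the trackable's local X-Z plane with +Y as its normal; the resulting world-space corners can then be stored relative to the Cloud Anchor as before.

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public static class ImageAnchorCorners
{
    // Derives the painting's four corners from a tracked image's pose and size.
    public static Vector3[] GetWorldCorners(ARTrackedImage trackedImage)
    {
        Vector2 half = trackedImage.size * 0.5f;
        Transform t = trackedImage.transform;

        return new[]
        {
            t.TransformPoint(new Vector3(-half.x, 0f,  half.y)), // top-left
            t.TransformPoint(new Vector3( half.x, 0f,  half.y)), // top-right
            t.TransformPoint(new Vector3( half.x, 0f, -half.y)), // bottom-right
            t.TransformPoint(new Vector3(-half.x, 0f, -half.y)), // bottom-left
        };
    }
}
```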
During the implementation of this idea, we quickly realised it was fruitful: the perfect middle ground between achieving very high precision and mapping quickly. However, even with this approach, we encountered some issues that required optimisation.
Image tracking essentially involves an algorithm attempting to "guess" the position and rotation of an image in 3D space, which makes it quite jittery. During tests, I quickly realised that saving at the exact moment one of these jitters occurred could result in massive deviations. This was primarily visible when trying to load the content (essentially a semi-transparent picture of the painting we're trying to map, overlaid on the real one).
I managed to solve this problem by calculating an averaged position. This was achieved by storing the positioning of the four reference points in dictionaries, which acted as buffers to average out the jitters. I stored all observed positions for the last ten seconds, using a first-in, first-out principle. I then displayed only the averaged position in the UI, allowing the user to visually confirm when the positioning was accurate before pressing save. This approach worked great in initial tests, and once again we were on our way to the museum to test it there. After a final test in the museum and evaluating the current functionality, the project finally reached a point where all requirements were met—hooray!
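As a sketch, such a buffer can be as simple as a queue of timestamped samples kept per reference point (the actual implementation used dictionaries as buffers; the names below are illustrative):

```csharp
using System.Collections.Generic;
using UnityEngine;

// Every frame the position reported by image tracking is pushed into the
// buffer; samples older than ten seconds are dropped (first in, first out)
// and the UI shows the average of what remains.
public class SmoothedPosition
{
    private const float WindowSeconds = 10f;
    private readonly Queue<(float time, Vector3 position)> samples = new Queue<(float, Vector3)>();

    public void AddSample(Vector3 position)
    {
        samples.Enqueue((Time.time, position));

        // Drop everything that fell outside the ten-second window.
        while (samples.Count > 0 && Time.time - samples.Peek().time > WindowSeconds)
            samples.Dequeue();
    }

    public Vector3 Average
    {
        get
        {
            Vector3 sum = Vector3.zero;
            foreach (var sample in samples)
                sum += sample.position;
            return samples.Count > 0 ? sum / samples.Count : Vector3.zero;
        }
    }
}
```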
Conclusion
I demoed the results of my experiment to one of our stakeholders at the Rijksmuseum, who loved the opportunities this solution could provide for bringing the hidden stories within the museum to life. Hopefully, we'll soon be able to experience the amazing AR interactions that can now be seamlessly mapped onto the many paintings housed in the Rijksmuseum.
The final showcase. (Keep in mind that the accuracy shown isn't very representative of the actual result. It was very busy at the Rijksmuseum on the day we went to film, so it was quite hard to make accurate scans and I had to use a debug option to forcibly create the anchor. In spite of that, the result is still quite good.)
I thoroughly enjoyed the challenges that arose during this internship assignment, and I hope that in the near future the Rijksmuseum and I, or other Q-ers, will have the chance to expand on the foundation laid during this project. AR as a technology has a lot of potential for museums, but unfortunately suffers from quite a few inherent limitations. This approach could serve as inspiration for developers and companies looking to leverage the technology to tackle these issues.
In the meantime we hired Paul and he's now working as one of our Android engineers. Do you also love to dive into innovative technologies? Then check out our job vacancies (in Dutch) at werkenbij.q42.nl!