


Our passion for experimental user interfaces and our drive to discover new and improved ways of utilizing technology compelled us to envision how a personal voice communicator could function and appear on Apple Vision Pro.


We initiated the project by analyzing the current functionality of voice messaging. The primary applications that support voice messaging features include WhatsApp, Messenger, and Instagram.

However, a mobile communicator is used in vastly different circumstances than the Apple Vision Pro headset, so we had to think completely outside the box.

The headset is typically operated at home, when there is sufficient time available for its use. Starting with a blank slate, we kept a single goal in mind: to use the spatial interface and a comfortable at-home setting to conceive and develop a genuinely intimate, personal voice communicator that would forge a closer connection between two individuals.


This is when the concept of Closer was conceived—a spatial communicator designed for intimacy, aiming to bring people closer even when they are physically apart.

Our ambition was to employ Artificial Intelligence (AI) to visualize the semantic meaning of the received audio message and immerse the listener in the intended thoughts of the sender during the recording of the note.


We aimed to provide the user with a simple yet intuitive interface to navigate the app. The dashboard view utilizes a side tab bar for switching between the inbox and the recording mode. The UI is streamlined to a minimum, avoiding unnecessary obstruction of the user's view.

Capitalizing on the advantages of incorporating a third spatial dimension, we crafted an intuitive and straightforward recording interface. An audio orb responds to the user's microphone input, providing visual confirmation that the sound is being captured.
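The post doesn't detail how the orb is driven, but a visual like this is typically tied to the signal level of the incoming microphone buffer: compute the RMS level of each batch of samples and map it to the orb's scale. A minimal sketch for illustration (the function names and the gain mapping are our assumptions, not the app's actual code):

```python
import math

def rms_level(samples):
    """Root-mean-square level of an audio buffer with samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def orb_scale(level, base=1.0, gain=0.5):
    """Map a 0..1 signal level to a scale factor for the orb.

    The level is clamped so a loud spike can't blow the orb up
    beyond base + gain.
    """
    return base + gain * min(max(level, 0.0), 1.0)

# Silence leaves the orb at its resting size; speech makes it swell.
quiet = orb_scale(rms_level([0.0, 0.0, 0.0, 0.0]))   # 1.0
loud = orb_scale(rms_level([0.8, -0.8, 0.8, -0.8]))  # 1.4
```

In the shipping app this mapping would run per audio callback, feeding the scale (smoothed over a few frames) into the 3D orb's transform.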

Our aspiration was to forge an intensely personal and intimate connection between the sender and the receiver. Here's how we brought this vision to life:

Received messages, while fundamentally recorded audio, undergo analysis by AI and are transcribed into text. Subsequently, the AI generates a spherical image that encapsulates the message's content.

This visual representation is then transformed into a three-dimensional bubble, enabling the recipient to gain an immediate 'sense' of the message's essence.

visionOS immersion mode

When a message is played back, we immerse the user in an AI-generated representation of the message's content. For instance, below Agnes is sharing a dream she had, vividly describing her experience of walking through a bustling street in India. The other image shows immersion mode for another message—this time from Mike—where he announces that he and his wife intend to visit for dinner.

Attention to detail

While implementing Closer, we dedicated significant attention to micro-animations, enriching the user experience.

Spatial app icon

We devised a spatial icon in accordance with the visionOS guidelines, causing it to emerge when the user gazes upon it.

The icon's visuals encompass several concepts. The white shapes can be read as two letter 'C's, echoing the app's name, 'Closer.' They are also mirrored, symbolizing two users situated on opposite sides of the app. Where the 'C's intersect, the combined shape resembles an eye. Lastly, the 'C' shapes evoke the immersion mode, encircling the user across 180 or even 360 degrees.


We developed the application using Xcode Beta, Swift, and SwiftUI.

We designed and animated 3D components, such as the bubble representation of the message, using custom shaders and shader graphs in Reality Composer Pro.

On the backend, we established an auto-scaling infrastructure on the Google Cloud Platform, using a Google Cloud Run service to host our backend API. Whenever a voice message was sent, it was routed through this backend and transcribed into text using Google Cloud Speech-to-Text. We then used OpenAI's ChatGPT API to analyze the message's semantic content and distill the key concepts articulated by the user, and tasked ChatGPT with turning these concepts into a descriptive scene. Lastly, we fed the description into Blockade Labs' AI-powered Skybox generator to produce an immersive 360-degree scene.
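The flow above is a linear pipeline: audio in, transcript, scene description, skybox out. As an illustration only (not the actual backend code), it can be sketched as one orchestration function with the three external services abstracted as injected callables; the names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ImmersiveMessage:
    """The artifacts produced for one voice message."""
    transcript: str
    scene_description: str
    skybox_url: str

def process_voice_message(
    audio: bytes,
    transcribe: Callable[[bytes], str],    # e.g. wraps Google Cloud Speech-to-Text
    describe_scene: Callable[[str], str],  # e.g. wraps a ChatGPT prompt
    generate_skybox: Callable[[str], str], # e.g. wraps the Blockade Labs Skybox API
) -> ImmersiveMessage:
    """Run the three pipeline stages in order and bundle the results."""
    transcript = transcribe(audio)
    scene = describe_scene(transcript)
    url = generate_skybox(scene)
    return ImmersiveMessage(transcript, scene, url)

# With stub services, the stages compose end to end:
msg = process_voice_message(
    b"raw-audio",
    transcribe=lambda a: "a dream about India",
    describe_scene=lambda t: "a bustling street in India",
    generate_skybox=lambda s: "https://example.com/skybox.png",
)
```

Keeping the services behind plain callables like this makes the pipeline easy to test without touching the real cloud APIs, which is also how such a Cloud Run handler would typically be structured.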

Get in touch to discuss the possibility of launching your brand on Apple Vision Pro today at