Real-time SDK

Overview

Formant's real-time SDK allows web applications to establish and utilize real-time connections to devices over the internet or locally.

The underlying technology used to establish connections is called WebRTC. It's a web standard for real-time rich media and data applications.

After a web application connects to a device, it can create custom data channels between the application and a Formant device. Data channels are bi-directional and have configurable reliability. Any data payload can be sent over custom data channels.

The real-time SDK comes with a video player, which can be used to embed a live video stream from your device into any application.

Using this toolkit, developers can build real-time applications tailored to their own needs.

If you want to jump right into example code, head to the repos we reference in this guide:

In this guide, we'll cover how to build a real-time application and embed it within Formant as a custom view.

Before we get started, a few quick notes:

  • A device can only have one real-time connection at a time. A new connection will disconnect any existing connections.
  • The real-time SDK only officially supports Google Chrome.

React template for real-time SDK applications

Formant maintains a starter template repo for new custom web view real-time SDK applications. If you want to get started right away with a bare bones React template, clone the following repo:

Then, follow the instructions on the custom web view documentation to run the app embedded in Formant as a custom web view.

In the file src/App.js, we can see how real-time client initialization is done:

import { RtcClient, SignalingPromiseClient } from '@formant/realtime-sdk';

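const formantApiUrl = "https://api.formant.io";

// Small helper used below to pause between polling attempts
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));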

// Create an instance of the real-time communication client
const rtcClient = new RtcClient({
  signalingClient: new SignalingPromiseClient(formantApiUrl, null, null),
  getToken: () => (new URLSearchParams(window.location.search)).get("auth"),
  receive: (peerId, message) => this.receiveRtcMessage(peerId, message),
});

console.log("Waiting for RTC client to initialize...")
await delay(500);

// Each online device and user has a peer in the system
const peers = await rtcClient.getPeers()
console.log(peers);

// Find a device peer. Device peers have a deviceId; user peers do not.
// This picks the first device peer found.
const devicePeer = peers.find((peer) => peer.deviceId !== undefined)
if (!devicePeer) {
  // If the device is offline, we won't be able to find its peer.
  console.log("Failed to find device peer.")
  return
}

// We can connect our real-time communication client to device peers by their ID
const devicePeerId = devicePeer.id;
await rtcClient.connect(devicePeerId)

// WebRTC requires a signaling phase when forming a new connection.
// Wait for the signaling process to complete...
while (rtcClient.getConnectionStatus(devicePeerId) !== "connected") {
  await delay(100);
  console.log("Waiting for connection ...")
}
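
The polling loop above never gives up if the device stays unreachable, and the peer lookup takes the first device peer it sees. Here is a minimal sketch of one way to wrap both steps, assuming getPeers, connect, and getConnectionStatus behave as they do in the template. The connectToDevice helper and its timeoutMs and pollMs options are hypothetical names, not part of the SDK.

```javascript
// Hypothetical helper; not part of @formant/realtime-sdk.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Connects to the peer that belongs to `deviceId` and resolves with its peer ID.
// Rejects if no such peer exists (for example, the device is offline) or if the
// connection does not reach "connected" within `timeoutMs`.
async function connectToDevice(rtcClient, deviceId, { timeoutMs = 10000, pollMs = 100 } = {}) {
  const peers = await rtcClient.getPeers();
  const devicePeer = peers.find((peer) => peer.deviceId === deviceId);
  if (!devicePeer) {
    throw new Error(`No peer found for device ${deviceId}; is it online?`);
  }

  await rtcClient.connect(devicePeer.id);

  // Poll the connection status until signaling completes or the deadline passes.
  const deadline = Date.now() + timeoutMs;
  while (rtcClient.getConnectionStatus(devicePeer.id) !== "connected") {
    if (Date.now() >= deadline) {
      throw new Error(`Timed out connecting to peer ${devicePeer.id}`);
    }
    await delay(pollMs);
  }
  return devicePeer.id;
}
```

Rejecting on timeout lets the interface show an error or retry instead of waiting forever.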

Custom data channel initialization and usage

After the rtcClient is initialized, we can create custom data channels. Let's see what that looks like:

// Create a custom data channel to the device peer with a name, settings, and handlers.
// The device-side application can send and receive messages
// on this channel using the agent API
rtcClient.createCustomDataChannel(
  devicePeerId, // device peer to open the channel with
  "example-unreliable-channel", // channel name
  { ordered: false, maxRetransmits: 0 }, // channel settings
  true, // use binary data format
  (_, channel) => {
    this.dataChannel = channel;
    channel.onopen = () => {
      console.log("Channel opened.")
    }
    channel.onmessage = (event) => this.onChannelEvent(event);
  },
);

channel is an RTCDataChannel. It's a real-time bi-directional transport for arbitrary data with configurable reliability. Here is a good starting point for more information: https://developer.mozilla.org/en-US/docs/Web/API/RTCDataChannel

Notice the argument for channel settings: { ordered: false, maxRetransmits: 0 }

This is an RTCDataChannelInit object, and it determines the reliability of the data channel you're creating. Here's an excellent, detailed breakdown:
https://jameshfisher.com/2017/01/17/webrtc-datachannel-reliability/

And here's the TL;DR. Most applications will only need the following settings:
TCP-like: { ordered: true }
UDP-like: { ordered: false, maxRetransmits: 0 }
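
To keep these two configurations consistent across an application, they can be named once and checked before use. This is a small sketch: CHANNEL_PRESETS and validateChannelSettings are hypothetical helpers, not SDK exports. The one rule it checks comes from the WebRTC specification, which does not allow maxRetransmits and maxPacketLifeTime to be set on the same channel.

```javascript
// Hypothetical presets mirroring the two settings listed above.
const CHANNEL_PRESETS = Object.freeze({
  reliable: Object.freeze({ ordered: true }), // TCP-like
  unreliable: Object.freeze({ ordered: false, maxRetransmits: 0 }), // UDP-like
});

// Returns the settings unchanged, or throws if they combine the two
// mutually exclusive retransmission limits.
function validateChannelSettings(settings) {
  const hasRetransmits = settings.maxRetransmits != null;
  const hasLifetime = settings.maxPacketLifeTime != null;
  if (hasRetransmits && hasLifetime) {
    throw new TypeError("Set either maxRetransmits or maxPacketLifeTime, not both");
  }
  return settings;
}
```

The validated object can then be passed as the channel settings argument of createCustomDataChannel.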

How do we send and receive data over the channel? Since the channel we created uses a binary data format, we must encode data before sending, and decode data after receiving.

const decoder = new TextDecoder('utf-8');
const encoder = new TextEncoder(); // TextEncoder always encodes to UTF-8 and takes no arguments

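Receiving UTF-8 encoded binary data

channel.onmessage = (event) => {
  // Decode the received bytes back into a string before using them
  const text = decoder.decode(event.data);
  console.log("Received:", text);
}

Sending UTF-8 encoded binary data

channel.send(encoder.encode("Arbitrary UTF-8 encoded data"))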
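
Structured messages can travel over the same binary channel if they are serialized first. The sketch below assumes both sides agree on JSON as the payload format, which the SDK does not require. encodeMessage and decodeMessage are hypothetical helpers.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder("utf-8");

// Object -> UTF-8 bytes (Uint8Array), suitable for channel.send()
function encodeMessage(message) {
  return encoder.encode(JSON.stringify(message));
}

// ArrayBuffer or typed array (for example, event.data) -> object
function decodeMessage(data) {
  return JSON.parse(decoder.decode(data));
}

// Usage with an open channel:
//   channel.send(encodeMessage({ x: 10, y: 20 }));
//   channel.onmessage = (event) => console.log(decodeMessage(event.data));
```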

Example real-time application

Let's take a look at an example application which uses the real-time SDK to create three functional real-time components.

Here's the repository:

Follow the instructions in README.md to run the example locally and see how things work. For this example to run, you will need to install Python and some dependencies.

Adapters

There are two sides to every real-time application: the user interface, and the device application. We often refer to applications which run on the device and use the Formant agent's capabilities as a "Formant adapter". The most common language for these adapters is Python.

An adapter is included with the example: see adapter.py in the root of the repository: https://github.com/FormantIO/realtime-sdk-guide/blob/master/adapter.py

Check it out and read through the comments to learn how you can build your own real-time adapter. At a high level, this adapter:

  • instantiates a Formant agent client
  • sends CPU core utilization percent periodically over the "cores" channel
  • creates a handler for real-time messages on the "path" and "textToSpeech" channels, and handles them

This adapter makes use of the following agent client methods:

register_custom_data_channel_message_callback
        :param f: A callback that will be called with messages
            received on the specified custom data channel
        :param channel_name_filter: An optional allow list of custom channel names
            for this callback

send_on_custom_data_channel
        :param channel_name: The name of the channel to send a message over
        :param payload: The payload of the message (bytes)

Custom views

The other side of a real-time application is the user interface. This can be its own web application hosted anywhere, or it can be a custom view. Custom views are static websites that are embedded inside the Formant app as views and are passed the credentials and information required to use Formant APIs and SDKs. Developing a custom view for an application-specific interface is a great way to allow viewers and operators to get their jobs done without switching between multiple apps and websites.

We've written an example so you can learn how to develop a real-time custom view of your own. The salient code can be found here: https://github.com/FormantIO/realtime-sdk-guide/blob/master/src/App.js

Read through the code and comments to learn how we establish a real-time connection to the device, open channels, and send and receive messages. If you're interested in playing with the example, clone this repo and run the interface and adapter locally alongside an agent to see what's possible.

At a high level, this custom view example does the following:

  • Creates and credentials an RtcClient object
  • Establishes a real-time connection to a specific device
  • Sets up three custom data channels, "path", "cores", and "textToSpeech"
  • Path Control: Collects mouse drag data and sends it to the device application using the "path" real-time channel
  • CPU Core Utilization: Handles messages on the "cores" real-time channel and displays a canvas visualization
  • Text-to-speech: Sends user input to the device application using the "textToSpeech" real-time channel
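
The payload format on each channel is whatever the adapter and the view agree on. To illustrate the receiving side of the "cores" channel, the sketch below assumes the adapter sends a UTF-8 encoded JSON array of per-core percentages. That format and the parseCoreUtilization helper are assumptions made for this example and may not match the example repository.

```javascript
const decoder = new TextDecoder("utf-8");

// Decodes a binary channel message into an array of percentages in [0, 100].
// ASSUMPTION: the payload is a UTF-8 JSON array such as [12.5, 80, 3].
function parseCoreUtilization(data) {
  const values = JSON.parse(decoder.decode(data));
  if (!Array.isArray(values)) {
    throw new TypeError("Expected a JSON array of core utilization percentages");
  }
  return values.map((value) => {
    const percent = Number(value);
    // Treat non-numeric entries as 0 and clamp the rest to the valid range.
    return Number.isFinite(percent) ? Math.min(100, Math.max(0, percent)) : 0;
  });
}
```

Clamping before drawing keeps a malformed message from producing bars outside the canvas.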

Video Example

Live video is an essential ingredient for many real-time applications. To add video using the real-time SDK, the first step is to configure a video stream in the teleop configuration of your Formant device.

Two image streams configured in the "teleoperation" tab for the device "spot"

We've created an example application to show how developers can embed Formant live video into a custom view or website. Check out the repo to see it in action:

Let's go through that example and call out a few things.

Notice the file public/formant-ui-sdk.js and our import in public/index.html:

    <script src="formant-ui-sdk.js"></script>

This adds the ability to create a Formant real-time video player right from the window object:

    this.h264BytestreamCanvasDrawer =
      new window.Formant.H264BytestreamCanvasDrawer(
        this.setWebglYUVSupported,
        this.setWarningText,
        this.handleCanvasDrawerWarning,
        {
          Stream: "",
        }
      );

Notice that when we create the RtcClient this time, the receive handler passes each incoming H.264 frame to the canvas drawer, which draws it directly onto the canvas.

    // Create an instance of the real-time communication client
    const rtcClient = new RtcClient({
      signalingClient: new SignalingPromiseClient(formantApiUrl, null, null),
      getToken: () => this.auth,
      receive: (_peerId, message) => {
        this.h264BytestreamCanvasDrawer.receiveEncodedFrame(
          message.payload.h264VideoFrame
        );
      },
    });
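
The handler above assumes every received message carries a video frame. If the same client also receives other kinds of messages, a guard avoids passing undefined to the drawer. This sketch assumes non-video messages simply lack payload.h264VideoFrame. makeVideoReceiver is a hypothetical helper.

```javascript
// Returns a function that can be used as the RtcClient `receive` option.
// The returned handler reports whether the message was forwarded as a frame.
function makeVideoReceiver(canvasDrawer) {
  return (_peerId, message) => {
    const frame = message?.payload?.h264VideoFrame;
    if (!frame) {
      return false; // not a video frame, so it is ignored here
    }
    canvasDrawer.receiveEncodedFrame(frame);
    return true;
  };
}
```

It would be wired in as receive: makeVideoReceiver(this.h264BytestreamCanvasDrawer).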

Note: For video streams, we have to "enable" the stream in the following manner:

    const videoStream = await this.getActiveVideoStream();
    rtcClient.controlRemoteStream(devicePeerId, {
      streamName: videoStream,
      enable: true,
      pipeline: "rtc",
    });

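The getActiveVideoStream method asks the Formant backend for the device's desired configuration version, fetches that configuration, and returns the name of the first hardware stream in its teleop settings.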

  async getActiveVideoStream() {
    let response = await fetch(
      formantApiUrl + "/v1/admin/devices/" + this.deviceId,
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: "Bearer " + this.auth,
        },
      }
    );
    response = await response.json();

    let latestVersion = response.desiredConfigurationVersion;

    response = await fetch(
      formantApiUrl +
        "/v1/admin/devices/" +
        this.deviceId +
        "/configurations/" +
        latestVersion,
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: "Bearer " + this.auth,
        },
      }
    );
    response = await response.json();

    const hardwareStream = response.document.teleop.hardwareStreams[0].name;

    return hardwareStream;
  }
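
The lookup above throws if the device has no teleop configuration or no hardware streams. Here is a sketch of a more defensive extraction, assuming the configuration response has the document.teleop.hardwareStreams shape used above. hardwareStreamNames is a hypothetical helper.

```javascript
// Returns the names of all configured hardware streams, or an empty array
// when the configuration has no teleop section or no hardware streams.
function hardwareStreamNames(configuration) {
  const streams = configuration?.document?.teleop?.hardwareStreams;
  if (!Array.isArray(streams)) {
    return [];
  }
  return streams
    .map((stream) => stream?.name)
    .filter((name) => typeof name === "string" && name.length > 0);
}

// Usage inside getActiveVideoStream, after parsing the configuration response:
//   const [firstStream] = hardwareStreamNames(response);
//   if (!firstStream) { /* show a "no video stream configured" message */ }
```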

The h264BytestreamCanvasDrawer draws frames onto a canvas element, which we hand to it with setCanvas:

          <canvas
            ref={(_) =>
              this.h264BytestreamCanvasDrawer.setCanvas(_ || undefined)
            }
          />

Contact us

Formant believes that human-in-the-loop real-time interfaces are a necessary step for many industries in the transition to autonomy. Let's work together to build your real-time application. https://formant.io/get-started/