A Full-stack Platform for Robotic Applications

Christian Fritz
Feb 5, 2021

Robotics is hard. And creating, operating, and scaling a robotics business is even harder. Not only is your agility in testing market fit handicapped by the fact that your product involves hardware that isn’t easily changed; you also realize quickly that “everything” in Murphy’s Law means everything, and hardware failures, which require an engineer to fly somewhere to fix a robot, are only a small part of that.

And then there is software. For your fleet of robots to function and for you and your team to deploy, operate, and maintain it, a lot of software is required. ROS, the Robot Operating System, is great because it provides a platform for the robotic needs and a lot of packages one can use to get started quickly. But there is so much more! We all appreciate a robot that stays localized and navigates gracefully. Achieving this with 99.9% reliability is no small feat and can take many years of tuning. But an application that is of value to a customer requires a lot more, and so does the operation of a large fleet of robots. Robotic applications need to integrate with infrastructure and other software systems, and also expose user interfaces for a variety of user types. Consider the example of a delivery robot operating in a hospital. It needs to ride elevators, open automated doors, and provide graphical interfaces for various users, including delivery senders and receivers, bystanders, supervisors and administrators, and remote field operators. Some of these UIs need to be on a display on the robot itself, others available via dashboards in the cloud, and yet others are meant to be used via mobile devices.

The following figure lists some of the software capabilities that robotics and other autonomous vehicle companies need in order to operate large fleets of robots. Like many people in the industry, we have seen and felt these needs first-hand and as a result have developed those capabilities ourselves, sometimes multiple times for multiple companies.

List of capabilities.
Deploying and operating a fleet of robots requires a lot of software.

Only a small fraction of these are capabilities for the robot itself. The vast majority span multiple systems. For instance, the ability for a robot to ride elevators typically requires software running on an on-prem server, software running in the cloud, and application logic on the robot. The on-prem server, which is physically connected to the elevator controller, needs to receive requests from the robot (e.g., call elevator to Floor 3) and translate them into requests to the elevator controller. It also needs to provide status information back to the robot (e.g., current floor is Lobby, doors are open). The robot needs to know which elevators connect which floors, how to call them, how to maneuver in and out of each car, how to behave when the elevator appears too full to enter, etc. Finally, the cloud software needs to provide a GUI for defining elevators on maps and indicating which map belongs to which floor, and of course a lot of status and health information about these elevators that will allow the remote field operations team to detect and analyze problems. You’d be surprised how often elevators fail without people noticing!
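To make the on-prem part of the elevator example a bit more concrete, here is a minimal Python sketch of the translation step. Everything in it is an assumption made for illustration: `ElevatorBridge`, `ElevatorStatus`, and `FakeController` are hypothetical names, and the in-memory controller stands in for a vendor-specific elevator interface that in practice looks very different.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ElevatorStatus:
    """Status reported back to the robot, e.g. floor='Lobby', doors_open=True."""
    floor: str
    doors_open: bool


class FakeController:
    """Hypothetical stand-in for a vendor elevator controller.

    It addresses floors by index and, unlike a real elevator, arrives instantly.
    """

    def __init__(self) -> None:
        self.floor_index = 0
        self.doors_open = True

    def register_call(self, floor_index: int) -> None:
        self.floor_index = floor_index
        self.doors_open = True

    def read_state(self) -> Tuple[int, bool]:
        return self.floor_index, self.doors_open


class ElevatorBridge:
    """On-prem translator between robot requests and controller calls."""

    def __init__(self, controller: FakeController, floor_names: List[str]) -> None:
        self._controller = controller
        self._floor_names = list(floor_names)

    def call_to(self, floor: str) -> None:
        """Handle a robot request such as 'call elevator to Floor 3'."""
        if floor not in self._floor_names:
            raise ValueError(f"unknown floor: {floor!r}")
        self._controller.register_call(self._floor_names.index(floor))

    def status(self) -> ElevatorStatus:
        """Translate the controller state into robot-facing status."""
        floor_index, doors_open = self._controller.read_state()
        return ElevatorStatus(self._floor_names[floor_index], doors_open)


bridge = ElevatorBridge(FakeController(), ["Lobby", "Floor 2", "Floor 3"])
bridge.call_to("Floor 3")
print(bridge.status())
```

The point of the sketch is the boundary: the robot speaks in floor names and status messages, while the controller-specific details stay on the on-prem server.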

Neither ROS nor any other software platform we know of addresses this breadth of needs across multiple systems, and as a result, the above capabilities do not exist as commercial-off-the-shelf or open-source modules that robotics companies can use. Instead, each robotics startup independently recreates these capabilities in-house, reinventing the wheel over and over again, both in terms of the platform and infrastructure as well as the capabilities themselves.

Another thing that is striking about the above list is that very few of these capabilities provide differentiation for the companies who build and use them. Many of these are just table-stakes (e.g., map editing) or are tools only used internally for efficient operation of the fleet (e.g., health monitoring), making it even more unnecessary for companies to implement them in isolation without sharing. Money spent on building and maintaining these takes away from their ability to execute on their core competencies.

If this feels like the pre-ROS era when every new robotics project began by creating its own robotics framework, then that’s because it is. There’s gotta be a better way.

A Platform with Capabilities

We believe the best approach to address this need is to design and build an open, full-stack platform for robotics applications and a repository of commonly needed capabilities that run on top of it. Capabilities can be distributed via packages whose components may span the entire stack including robot, on-prem servers, cloud, and user interfaces, as opposed to just one of these systems as is the case today. We also envision the platform to explicitly support several time-frames, including development time, test time, build time, various stages of run-time, and post-hoc analysis. Logically continuing on the design principles of ROS, the platform extends ROS’ data abstractions and communication modalities to all of these systems. It provides selective data synchronization, on-demand data fetching, and remote procedure calls across systems. It includes a front-end framework for creating reusable UI components. A company using a package can hence quickly add new elements to their existing UIs, providing access to the capabilities the package provides. That last piece is important: companies don’t like switching costs, and users don’t want to use separate UIs for separate tasks either. To address this, packages on the platform provide reusable web components that can be easily embedded in existing web applications, or native UI components for Android, iOS, or desktop operating systems. For those who want to get started even more quickly, the platform includes a ready-to-go default UI that is auto-populated by all the UI components installed by packages.
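One way to picture selective data synchronization is as a filter between the robot’s local data and the cloud: only topics that some remote consumer has asked for ever cross the (often bandwidth-limited) link. The Python sketch below is purely illustrative. The platform does not exist yet, so `SelectiveSync`, its methods, and the glob-style topic patterns are all assumptions of ours, not a proposed API.

```python
from fnmatch import fnmatchcase
from typing import Any, List, Set, Tuple


class SelectiveSync:
    """Forwards only those topic updates that match a subscribed pattern."""

    def __init__(self) -> None:
        self._patterns: Set[str] = set()
        # Stand-in for the uplink: records what would be sent to the cloud.
        self.sent: List[Tuple[str, Any]] = []

    def subscribe(self, pattern: str) -> None:
        """Register interest in topics matching a glob pattern."""
        self._patterns.add(pattern)

    def publish(self, topic: str, value: Any) -> bool:
        """Forward the update if any pattern matches; report whether it was sent."""
        if any(fnmatchcase(topic, pattern) for pattern in self._patterns):
            self.sent.append((topic, value))
            return True
        return False


sync = SelectiveSync()
sync.subscribe("/*/battery")              # a dashboard wants battery levels only
sync.publish("/robot1/battery", 0.82)     # forwarded
sync.publish("/robot1/camera/raw", b"")   # stays on the robot
```

On-demand fetching and remote procedure calls would complement this for data that is too large or too rarely needed to synchronize continuously.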

Screenshot of a teleop application.
Screenshot of a simple remote-teleop UI with low-latency video streaming built from components provided by packages.

As an example, consider a remote-tele-operation package. On the client side this package provides a soft-joystick UI component that can be used on a smartphone or other touch-enabled device to accurately control linear and angular velocities of the robot. The package also provides a video display component that shows a live-stream of the robot’s camera(s). On the robot, this package contains two components: one that receives velocity commands from the soft-joystick via the platform-provided data layer, performs liveness and safety checks, and sends motor commands. A second component opens a video stream from the robot’s camera(s), processes the video to reduce resolution and color depth, and compresses the stream for sending it to all subscribed clients via the cloud. All along this chain the package ensures that low-latency requirements are met. Because when you are remotely joy-sticking a robot that is half-way around the world, connected to the Internet over a sometimes flaky LTE connection, you want to be certain that what you see is as close to real-time as possible. Otherwise, the control commands you send may have unintended consequences:

An illustration depicting a robot falling down stairs.
Certain types of failures in robotics, such as falling down stairs, can be catastrophic and even lethal and hence must be avoided at all costs.

Given this package, other companies can add a simple but very effective remote-teleop capability to their robotic application in less than an hour, rather than the several weeks or months it would take to re-create it from scratch.
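The liveness and safety checks mentioned for the on-robot teleop component can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: `TeleopWatchdog`, the 0.3-second timeout, and the velocity limits are hypothetical values, and a real implementation would use the robot’s monotonic clock and hand the result to the motor controller.

```python
from typing import Optional, Tuple


def clamp(value: float, limit: float) -> float:
    """Restrict value to the range [-limit, limit]."""
    return max(-limit, min(limit, value))


class TeleopWatchdog:
    """Passes on velocity commands only while they are fresh and within limits."""

    def __init__(self, timeout_s: float = 0.3,
                 max_linear: float = 1.0, max_angular: float = 1.5) -> None:
        self.timeout_s = timeout_s
        self.max_linear = max_linear    # m/s, illustrative limit
        self.max_angular = max_angular  # rad/s, illustrative limit
        self._last: Optional[Tuple[float, float]] = None
        self._received_at: Optional[float] = None

    def on_command(self, linear: float, angular: float, now: float) -> None:
        """Record a joystick command together with its local receive time."""
        self._last = (linear, angular)
        self._received_at = now

    def motor_output(self, now: float) -> Tuple[float, float]:
        """Return the velocities to execute right now.

        Liveness check: stop if no command has arrived within timeout_s.
        Safety check: clamp velocities to the configured limits.
        """
        if self._last is None or self._received_at is None:
            return (0.0, 0.0)
        if now - self._received_at > self.timeout_s:
            return (0.0, 0.0)
        linear, angular = self._last
        return (clamp(linear, self.max_linear), clamp(angular, self.max_angular))
```

Timing is measured from when the robot received the command, not from when the operator sent it, so the check does not depend on synchronized clocks. If the LTE link stalls, the robot stops on its own instead of continuing with a stale command.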

The following image depicts the conceptual architecture we envision. In addition to the elements already mentioned, i.e., the on-robot agent, on-prem software, cloud instance, UI framework, and communication layer, the platform also provides a package manager for easy installation of packages. Packages can contain components for any subset of the four systems: it is this feature that sets this architecture apart. The ability to bundle components that work together across the various systems is what enables developers to provide complete robotic capabilities, not just pieces of them. Such full-stack capabilities provide value to the end-user of the robot the moment they are installed. No coding required.

An architectural diagram of the platform we describe.
Platform architecture including full-stack packages that can be installed at run-time.

Above we described the software components required across the various systems in order to enable robots to ride elevators. As can be seen from the diagram, these components, independent of the system they need to be installed on, can all be bundled and installed as one package, hence providing a complete capability, the ability to ride elevators, with one click in the package manager UI. The package manager ensures that the included components are installed on their respective systems (robot, on-prem, cloud, and front-end UIs), the data layer will ensure that they can communicate as necessary, and the included UI components will let your team start configuring elevators and floors right away. Within hours, not months, your robots will be able to provide services across multiple floors of a building.
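As a thought experiment, the bundling described above could look roughly like the following Python sketch: a package lists its components, each tagged with the system it belongs on, and the package manager derives an install plan per system. The names (`Component`, `Package`, `plan_install`), the system labels, and the component names are hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# The four systems a package may target (labels are illustrative).
SYSTEMS = ("robot", "on_prem", "cloud", "ui")


@dataclass(frozen=True)
class Component:
    name: str
    system: str  # one of SYSTEMS


@dataclass(frozen=True)
class Package:
    name: str
    components: Tuple[Component, ...]


def plan_install(package: Package) -> Dict[str, List[str]]:
    """Group a package's components by the system they must be installed on."""
    plan: Dict[str, List[str]] = {}
    for component in package.components:
        if component.system not in SYSTEMS:
            raise ValueError(f"unknown system: {component.system!r}")
        plan.setdefault(component.system, []).append(component.name)
    return plan


elevator_package = Package(
    name="elevator-riding",
    components=(
        Component("elevator-bridge", "on_prem"),
        Component("elevator-behavior", "robot"),
        Component("elevator-status", "cloud"),
        Component("elevator-map-editor", "ui"),
    ),
)

for system, names in plan_install(elevator_package).items():
    print(system, names)
```

A package does not have to cover all four systems; one that only ships, say, a UI component and a cloud service would simply produce a plan with two entries.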

Another example of a capability every robotics company needs is Health Monitoring. Each and every system involved in your operation can fail in many surprising ways, and when they do, your customer’s workflows may be disrupted. But part of the robotics-as-a-service business model that most robotics companies follow is a service level agreement, implicit or explicit, that guarantees smooth operation of your robotic fleet. It is therefore essential to monitor your fleet, anticipate issues before they happen, and resolve any issues that still occur as quickly as possible. For that, companies need to monitor several vital signs of their robots, on-prem equipment, and cloud instances, aggregate them, check them against thresholds, and alert their operations teams when necessary. Getting this right is not easy though: many of the connected systems have limited bandwidth to the cloud, predicting problems from the observable vitals is not trivial, and alerting often fails to be effective when there are too many false positives. Companies we spoke with could save money through automated anomaly detection, but are stretched too thin to develop it themselves. The platform we describe provides many of the pieces required for someone to develop a very good health monitoring capability, and a mechanism for distributing this capability to the companies that need it.
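To illustrate the threshold checking and the false-positive problem, here is a deliberately simple Python sketch. It only raises an alert after a vital sign has exceeded its limit for several consecutive samples, and it alerts once per episode rather than on every sample. `ThresholdMonitor`, the limit of 80, and the window of three samples are assumptions chosen for the example; real anomaly detection would be considerably more involved.

```python
class ThresholdMonitor:
    """Alerts when a vital sign stays above its limit for several samples."""

    def __init__(self, limit: float, consecutive: int = 3) -> None:
        self.limit = limit
        self.consecutive = consecutive
        self._breaches = 0       # current run of samples above the limit
        self._alerting = False   # True while an alert episode is ongoing

    def observe(self, value: float) -> bool:
        """Record one sample; return True only when a new alert should fire."""
        if value > self.limit:
            self._breaches += 1
        else:
            # Back to normal: reset the run and end any ongoing episode.
            self._breaches = 0
            self._alerting = False
        if self._breaches >= self.consecutive and not self._alerting:
            self._alerting = True
            return True
        return False


# Example: CPU temperature samples (degrees Celsius) from one robot.
monitor = ThresholdMonitor(limit=80.0, consecutive=3)
readings = [85.0, 70.0, 85.0, 86.0, 90.0, 95.0, 60.0, 90.0]
alerts = [monitor.observe(reading) for reading in readings]
print(alerts)
```

In this run the single spike at the start does not trigger anything, and the sustained rise produces exactly one alert, on the fifth sample. Running such checks on the robot and sending only the alerts also helps with the limited bandwidth to the cloud.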

An Open Repository of Capabilities

Oftentimes, the real impact of a platform is not a result of the breadth of its features, but of the ecosystem that develops on top of it. Hence, we believe that the platform needs to come with an online repository where developers can offer their packages, each implementing a different capability or feature. Other companies can then use these packages to develop or enhance their robotic fleet with very little effort and robotic know-how. The repository serves more than just the distribution of packages. It also provides a way to vet packages and developers. Following the lead of other successful repositories, we envision it to provide additional information about each package, including how many robots are currently using it, how actively it is being maintained, and what users say about it.

Benefits

By defining and developing a full-stack platform like this, a common ground is created on which the robotics community can develop new capabilities and applications that are reusable across multiple robotics use-cases and industries. An ecosystem that grows on top of this platform, offering a wide variety of packages, will allow robotics companies to stop reinventing the wheel and spending their scarce resources on developing table-stakes capabilities like robot health monitoring. Instead they can focus on creating intellectual property and features that create real differentiation in their respective markets, e.g., new algorithms for identifying grasping-points for a fruit-picking robot, or hazard prediction for self-driving cars. And the math works out: rather than 40 robotics companies all hiring engineers to build and maintain the same software capability, one company can develop and maintain such a capability once, very well, and license it to other companies for a monthly fee. This is more cost effective for everyone as it avoids duplicate work by highly-paid engineers and also increases quality.

In addition, the platform democratizes robotics even further. It lowers the bar for applying robotics to the automation needs of ever more companies, including those without robotics expertise. For the robotics industry this is significant, because one of the biggest challenges for robotics startups today is the identification and validation of market fit. This task of finding a match between what is possible and what is necessary is facilitated when the agility of experimenting with robotic solutions is increased. And that’s exactly what a platform with an active developer community does. There are already a number of vendors that sell platform robots on which new use-cases can be developed. Using existing ROS packages one can quickly create maps for these robots and make them navigate on these maps without bumping into things or getting lost. But that is not enough. Fully addressing the automation needs in question currently still requires robotics software engineers to turn a robot with such basic skills into an end-to-end application that integrates with infrastructure, implements the required business logic, and can be controlled and interacted with by end-users.

There is plenty of precedent for platforms like this dramatically accelerating innovation, with ROS in robotics, deep learning frameworks like PyTorch, TensorFlow, and Keras in machine learning, and SDKs for iOS and Android being just a few examples. In all these cases, the platform facilitated match-making between developers and end-users, which established a fluid exchange of ideas for new, innovative use-cases that no one had previously anticipated. We believe that an application platform like this will have a similar impact on robotic automation.

The platform we describe doesn’t yet exist. And just creating it is not enough. The success of a platform follows the same dynamics as a marketplace: it has an inherent network effect, where the network is a bipartite graph. On the one side are developers and capabilities, on the other are robotics companies, enterprises requiring automation, and their use-cases. As more and more capabilities exist on this platform, more and more companies with automation needs will be inclined to address their use-cases via this platform. This increase in demand in turn will encourage more developers to participate, and so on. However, for this dynamic to happen and in order to overcome the initial chicken-and-egg problem, a great degree of openness and community outreach is required. Publishing this post is our first step towards that. The second step is to build an essential set of capabilities that will help corral users and developers around this platform, and seed an ecosystem. For that, please raise your hand if there are specific capabilities you would like to see on this platform, you want to help build the platform and capabilities, or both!

Click here to get in touch.

Christian Fritz

CEO of Lumin Robotics; Former VP of Software at Savioke; PhD in CS/AI from UofT.