AIRSHIP

    

AIRSHIP: Empowering Intelligent Robots through Embodied AI, Stronger United, Yet Distinct.

Welcome to the AIRSHIP project page! AIRSHIP is an open-source embodied AI robotic software stack that empowers various forms of intelligent robots.

 

Panopto Video (AIRSHIP)

 

Codes & Papers 

AIRSHIP 1.0

1. Introduction

While embodied AI holds immense potential for shaping the future economy, it presents significant challenges, particularly in the realm of computing. Achieving the necessary flexibility, efficiency, and scalability demands sophisticated computational resources, but the most pressing challenge remains software complexity. Complexity often leads to inflexibility. 

Embodied AI systems must seamlessly integrate a wide array of functionalities, from environmental perception and physical interaction to the execution of complex tasks. This requires the harmonious operation of components such as sensor data analysis, advanced algorithmic processing, and precise actuator control. To support the diverse range of robotic forms and their specific tasks, a versatile and adaptable software stack is essential. However, creating a unified software architecture that ensures cohesive operation across these varied elements introduces substantial complexity, making it difficult to build a streamlined and efficient software ecosystem.

AIRSHIP has been developed to tackle the problem of software complexity in embodied AI. Its mission is to provide an easy-to-deploy software stack that empowers a wide variety of intelligent robots, thereby facilitating scalability and accelerating the commercialization of the embodied AI sector. AIRSHIP takes inspiration from Android, which played a crucial role in the mobile computing revolution by offering an open-source, flexible platform. Android enabled a wide range of device manufacturers to create smartphones and tablets at different price points, sparking rapid innovation and competition. This led to the widespread availability of affordable and powerful mobile devices. Android's robust ecosystem, supported by a vast library of apps through the Google Play Store, allowed developers to reach a global audience, significantly advancing mobile technology adoption.

Similarly, AIRSHIP's vision is to empower robot builders by providing an open-source embodied AI software stack. This platform enables the creation of truly intelligent robots capable of performing a variety of tasks that were previously unattainable at a reasonable cost. AIRSHIP's motto, "Stronger United, Yet Distinct," embodies the belief that true intelligence emerges through integration, but such integration should enhance, not constrain, the creative possibilities for robotic designers, allowing for distinct and innovative designs.

To realize this vision, AIRSHIP has been designed with flexibility, extensibility, and intelligence at its core. In this release, AIRSHIP offers both software and hardware specifications, enabling robot builders to develop complete embodied AI systems for a range of scenarios, including home, retail, and warehouse environments. AIRSHIP can understand natural language instructions and execute navigation and grasping tasks based on those instructions. The current AIRSHIP robot form factor is a hybrid design that includes a wheeled chassis, a robotic arm, a suite of sensors, and an embedded computing system. However, AIRSHIP is rapidly evolving, with plans to support many more form factors in the near future. The software architecture follows a hierarchical and modular design, incorporating large model capabilities into traditional robot software stacks. This modularity allows developers to customize the AIRSHIP software and swap out modules to meet specific application requirements.

 

AIRSHIP is distinguished by the following characteristics:

The AIRSHIP source code is available here: https://github.com/airs-admin/airship

 

2. Demo Videos

 

3. Hardware Architecture

The AIRSHIP hybrid robot comprises a wheeled chassis, a robotic arm with a compatible gripper, an Nvidia Orin computing board, and a sensor suite including a LiDAR, a camera, and an RGBD camera. Detailed hardware specifications are provided in this file. The following figure illustrates the AIRSHIP robot's hardware architecture.

 

4. Software Architecture

AIRSHIP is designed for scenarios that can be decomposed into sequential navigation and grasping tasks. Leveraging state-of-the-art language and vision foundation models, AIRSHIP augments traditional robotics with embodied AI capabilities, including high-level human instruction comprehension and scene understanding. It integrates these foundation models into existing robotic navigation and grasping software stacks.

The software architecture, illustrated in the above figure, employs an LLM to interpret high-level human instructions and break them down into a series of basic navigation and grasping actions. Navigation is accomplished using a traditional robotic navigation stack encompassing mapping, localization, path planning, and chassis control. The semantic map translates semantic objects into map locations, bridging high-level navigation goals with low-level robotic actions. Grasping is achieved through a neural network that determines gripper pose from visual input, followed by traditional robotic arm control for execution. A vision foundation model performs zero-shot object segmentation, converting semantic grasping tasks into vision-based ones.
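
As a rough illustration of how a semantic map can bridge an LLM-produced plan and low-level goals, the following minimal sketch resolves a sequence of navigation and grasping actions into goal records. Everything named here (the Action type, the SEMANTIC_MAP dictionary, the resolve_plan function, the goal fields, and the example coordinates) is hypothetical and chosen for illustration only; it is not taken from the project's code, and the plan is written by hand rather than returned by a real LLM.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical semantic map: semantic name -> (x, y, yaw) in the map frame.
# The entries and coordinates are made up for this example.
SEMANTIC_MAP: Dict[str, Tuple[float, float, float]] = {
    "kitchen table": (2.0, 1.5, 0.0),
    "sofa": (-1.0, 3.0, 1.57),
}


@dataclass
class Action:
    kind: str    # "navigate" or "grasp"
    target: str  # semantic name of a place or an object


def resolve_plan(plan: List[Action]) -> List[dict]:
    """Translate semantic actions into low-level goal records."""
    goals = []
    for action in plan:
        if action.kind == "navigate":
            if action.target not in SEMANTIC_MAP:
                raise KeyError(f"'{action.target}' is not in the semantic map")
            x, y, yaw = SEMANTIC_MAP[action.target]
            goals.append({"type": "nav_goal", "x": x, "y": y, "yaw": yaw})
        elif action.kind == "grasp":
            # The object name is handed to the vision pipeline unchanged.
            goals.append({"type": "grasp_goal", "object": action.target})
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
    return goals


# A plan an LLM might produce for "bring the apple from the kitchen table
# to the sofa"; here it is written by hand.
plan = [
    Action("navigate", "kitchen table"),
    Action("grasp", "apple"),
    Action("navigate", "sofa"),
]
goals = resolve_plan(plan)
for goal in goals:
    print(goal)
```

Keeping the plan as a list of small, typed actions means the language model only has to name places and objects, while map coordinates stay in one lookup table that the navigation stack can consume.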

 

4.1 Navigation software

The navigation software pipeline operates as follows. The localization module fuses LiDAR and IMU data to produce robust odometry and accurately determines the robot's position within a pre-built point cloud map. The path planning module generates collision-free trajectories using both global and local planners. The path planner then sends velocity (twist) commands to the base controller, which produces the control signals needed to follow the planned path.
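
To make the last step concrete, the sketch below computes a linear and angular velocity command that steers a wheeled base toward the next waypoint on a planned path. It is a generic proportional heading controller written for illustration, not the planner or controller used in the project; the function name twist_to_waypoint and the parameters max_linear, k_angular, and stop_radius are assumptions.

```python
import math
from typing import Tuple


def twist_to_waypoint(
    pose: Tuple[float, float, float],
    waypoint: Tuple[float, float],
    max_linear: float = 0.5,    # m/s, assumed speed limit
    k_angular: float = 1.5,     # assumed proportional gain on heading error
    stop_radius: float = 0.05,  # m, treat the waypoint as reached inside this
) -> Tuple[float, float]:
    """Return (linear, angular) velocity toward a waypoint in the map frame."""
    x, y, yaw = pose
    dx, dy = waypoint[0] - x, waypoint[1] - y
    if math.hypot(dx, dy) < stop_radius:
        return 0.0, 0.0

    # Heading error, wrapped to [-pi, pi].
    error = math.atan2(dy, dx) - yaw
    error = math.atan2(math.sin(error), math.cos(error))

    # Drive forward only when roughly facing the waypoint; otherwise turn.
    linear = max_linear * max(0.0, math.cos(error))
    angular = k_angular * error
    return linear, angular


# Robot at the origin facing +x.
print(twist_to_waypoint((0.0, 0.0, 0.0), (1.0, 0.0)))  # waypoint straight ahead
print(twist_to_waypoint((0.0, 0.0, 0.0), (0.0, 1.0)))  # waypoint to the left
```

Scaling the forward speed by the cosine of the heading error makes the base rotate nearly in place when the waypoint is off to the side, which is a common simplification for differential-drive platforms.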

 

4.2 Grasping software

The grasping software pipeline operates as follows: GroundingDINO receives an image and object name, outputting the object's bounding box within the image. SAM utilizes this bounding box to generate a pixel-level object mask. GraspingNet processes the RGB and depth images to produce potential gripper poses for all objects in the scene. The object mask filters these poses to identify the optimal grasp for the target object.
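
The final filtering step can be sketched as follows, assuming each candidate grasp carries a 3D position in the camera frame and a quality score, and that the object mask is a 2D grid of booleans aligned with the RGB image. The Grasp type, the select_grasp function, and the toy camera intrinsics are hypothetical; the detection, segmentation, and grasp networks are not called here, and only the standard pinhole projection is used to map grasp positions to pixels.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Grasp:
    position: Tuple[float, float, float]  # (X, Y, Z) in the camera frame, metres
    score: float                          # higher is better


def project(point, fx, fy, cx, cy) -> Tuple[int, int]:
    """Pinhole projection of a camera-frame point to pixel (u, v)."""
    X, Y, Z = point
    return int(round(fx * X / Z + cx)), int(round(fy * Y / Z + cy))


def select_grasp(
    grasps: List[Grasp],
    mask: Sequence[Sequence[bool]],
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Grasp]:
    """Keep grasps whose centre projects onto the mask; return the best one."""
    height, width = len(mask), len(mask[0])
    best = None
    for grasp in grasps:
        if grasp.position[2] <= 0:  # behind the camera, cannot be projected
            continue
        u, v = project(grasp.position, fx, fy, cx, cy)
        inside = 0 <= v < height and 0 <= u < width and mask[v][u]
        if inside and (best is None or grasp.score > best.score):
            best = grasp
    return best


# Toy 4x4 mask: the target object covers the central 2x2 pixels.
mask = [
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
]
candidates = [
    Grasp((0.0, 0.0, 1.0), 0.6),    # projects to (2, 2): on the object
    Grasp((-1.0, -1.0, 1.0), 0.9),  # projects to (0, 0): another object
    Grasp((-0.5, 0.0, 1.0), 0.8),   # projects to (1, 2): on the object
]
best = select_grasp(candidates, mask, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(best)
```

In this toy case the highest-scoring candidate overall is discarded because it lies outside the mask, so the grasp with score 0.8 is selected for the target object.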

 

AIRSHIP 2.0

 

 

Ecosystem

References:

  1. Liu, S., 2024. Shaping the Outlook for the Autonomy Economy. Communications of the ACM, 67(6), pp.10-12.
  2. Liu, S., 2024. Societal Impacts of Embodied AI. Communications of the ACM, https://cacm.acm.org/blogcacm/societal-impacts-of-embodied-ai/
  3. Liu, S., 2024. Establishing Standards for Embodied AI. Communications of the ACM, https://cacm.acm.org/blogcacm/establishing-standards-for-embodied-ai/
  4. Wang, F., Liu, S., 2024. Building Foundation Models for Embodied Artificial Intelligence. Communications of the ACM, https://cacm.acm.org/blogcacm/building-foundation-models-for-embodied-artificial-intelligence/
  5. Liu, S., Wu, S., 2024. A Brief History of Embodied Artificial Intelligence, and Its Future Outlook. Communications of the ACM, https://cacm.acm.org/blogcacm/a-brief-history-of-embodied-artificial-intelligence-and-its-future-outlook/
  6. Hao, Y., Gan, Y., Yu, B., Liu, Q., Han, Y., Wan, Z. and Liu, S., 2024, April. ORIANNA: An Accelerator Generation Framework for Optimization-based Robotic Applications. In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (pp. 813-829).
  7. Liu, S., 2024. The Value of Data in Embodied Artificial Intelligence. Communications of the ACM, https://cacm.acm.org/blogcacm/the-value-of-data-in-embodied-artificial-intelligence/
  8. Liu, S., Ding, N., 2024. Building Computing Systems for Embodied Artificial Intelligence. Communications of the ACM, https://cacm.acm.org/blogcacm/building-computing-systems-for-embodied-artificial-intelligence/
  9. Wu, S., Yu, B., Liu, S. and Zhu, Y., 2023. Autonomy 2.0: The Quest for Economies of Scale. arXiv preprint arXiv:2307.03973.
  10. Gan, Y., Whatmough, P., Leng, J., Yu, B., Liu, S. and Zhu, Y., 2022, October. Braum: Analyzing and protecting autonomous machine software stack. In 2022 IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE) (pp. 85-96). IEEE.
  11. Hao, Y., Yu, B., Liu, Q., Liu, S. and Zhu, Y., 2022, October. Factor graph accelerator for lidar-inertial odometry. In Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design (pp. 1-7).
  12. Liu, S., Li, X., Geng, T., Zuckerman, S. and Gaudiot, J.L., 2022, October. Programming Autonomous Machines: Special Session Paper. In 2022 International Conference on Embedded Software (EMSOFT) (pp. 24-33). IEEE.
  13. Yu, B., Tang, J. and Liu, S.S., 2023, July. Autonomous Driving Digital Twin Empowered Design Automation: An Industry Perspective. In 2023 60th ACM/IEEE Design Automation Conference (DAC) (pp. 1-4). IEEE.
  14. Hao, Y., Gan, Y., Yu, B., Liu, Q., Liu, S.S. and Zhu, Y., 2023, July. Blitzcrank: Factor graph accelerator for motion planning. In 2023 60th ACM/IEEE Design Automation Conference (DAC) (pp. 1-6). IEEE.
  15. Niu, X., Zhang, Y., Zhang, Y., Tian, H., Yu, B., Liu, S. and Huang, S., 2024, April. Accelerating Autonomous Path Planning on FPGAs with Sparsity-Aware HW/SW Co-Optimizations. In Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (pp. 42-42).

 

Further Reading: