ShelfHelp - A smart assistive system for independent grocery shopping
2022 - Ongoing
Team members:
Shivendra Agrawal (lead)
Suresh Nayak (Graduate researcher)
Ashutosh Naik (Graduate researcher)
Bradley Hayes (Advisor)
We are working towards an end-to-end system that can assist with independent grocery shopping, since shopping with a sighted guide is often impractical and comes at the cost of privacy. Grocery shopping primarily consists of three subtasks: navigation, product retrieval, and product examination. Our current work focuses on product retrieval: ShelfHelp can locate items on the shelf and verbally provide fine-grained manipulation guidance to help the user retrieve the desired item from an aisle.
With our system, a blindfolded user is able to retrieve the desired product from the shelf.
Video with sound (recommended version)
ShelfHelp consists of a robotic cane equipped with Intel RealSense D455 and T265 cameras and is powered by a laptop carried in a backpack. Left: the system used as a navigation device, providing audio and haptic feedback for navigation guidance. Right: the system used as a manipulation device, providing audio feedback for manipulation guidance.
Poster
System Diagram. Alignment, perception, planning, and verbal conveyance are executed on a backpack-worn laptop, while all the sensing is mounted on the cane.
ShelfHelp uses a novel two-stage computer vision pipeline to locate products on the shelf; the pipeline requires no re-training for new products. In the first stage, a YOLOv5 network trained on the SKU-110K dataset proposes the bounding boxes most likely to contain any product. In the second stage, these proposed regions are compared against the image of the desired product. We freeze the weights of an autoencoder trained on the MS-COCO dataset and use just the encoder portion to extract features from the desired product image and the proposed regions; the feature vectors are compared using cosine similarity. Each new image added to the product database is passed once through the frozen encoder and its feature vector is saved to disk, so this step also requires no re-training of the autoencoder.
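A minimal sketch of how such a two-stage pipeline can be wired together is shown below. The weight files, encoder input size, and thresholds are illustrative placeholders, not the actual ShelfHelp code.

```python
# Sketch of the two-stage product locator: YOLOv5 proposals + frozen-encoder matching.
# "yolov5_sku110k.pt" and "coco_autoencoder_encoder.pt" are hypothetical weight files.
import torch
import torch.nn.functional as F

# Stage 1: class-agnostic product detector (YOLOv5 fine-tuned on SKU-110K).
detector = torch.hub.load("ultralytics/yolov5", "custom", path="yolov5_sku110k.pt")

# Stage 2: frozen encoder taken from an autoencoder trained on MS-COCO.
encoder = torch.load("coco_autoencoder_encoder.pt")
encoder.eval()

@torch.no_grad()
def embed(crop_hwc):
    """Embed an RGB crop (H, W, 3 numpy array) into a unit-norm feature vector."""
    x = torch.from_numpy(crop_hwc).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    x = F.interpolate(x, size=(128, 128))        # assumed encoder input size
    feat = encoder(x).flatten(1)
    return F.normalize(feat, dim=1)

@torch.no_grad()
def locate_product(frame_rgb, target_feature, conf_threshold=0.5):
    """Return the proposed box whose appearance best matches the target product."""
    proposals = detector(frame_rgb).xyxy[0]      # rows: x1, y1, x2, y2, conf, cls
    best_box, best_sim = None, -1.0
    for x1, y1, x2, y2, conf, _ in proposals.tolist():
        if conf < conf_threshold:
            continue
        crop = frame_rgb[int(y1):int(y2), int(x1):int(x2)]
        sim = F.cosine_similarity(embed(crop), target_feature).item()
        if sim > best_sim:
            best_box, best_sim = (x1, y1, x2, y2), sim
    return best_box, best_sim

# New catalog images only need one forward pass through the frozen encoder;
# their feature vectors can be cached to disk, e.g. torch.save(embed(img), "item.pt").
```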
Our product search algorithm reliably locates desired products on a grocery shelf. The first stage proposes regions with a high likelihood of containing any product; the features of these regions are then compared against those of the target product image. Our data association solution identifies whether detections from incoming camera frames are new products or re-detections of existing ones. The product classification component has been tested and validated in actual grocery stores, while the data association and manipulation guidance components were validated in a lab-based user study.
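The text above does not spell out the association rule, so the sketch below shows one plausible greedy scheme: a detection counts as a re-detection when both its appearance feature and its 3D position are close to an existing track, and spawns a new track otherwise. The thresholds and the position update are assumptions for illustration only.

```python
# Illustrative greedy data-association step over encoder features and 3D positions.
import numpy as np

class Track:
    """One physical product hypothesis accumulated across camera frames."""
    def __init__(self, feature, position):
        self.feature = feature            # unit-norm appearance feature (numpy vector)
        self.position = position          # 3D position from the depth camera (numpy vector)
        self.hits = 1

def associate(tracks, detections, sim_thresh=0.8, dist_thresh=0.10):
    """Match each (feature, position) detection to an existing track or create a new one."""
    for feat, pos in detections:
        best, best_score = None, -1.0
        for track in tracks:
            sim = float(np.dot(feat, track.feature))           # cosine sim of unit vectors
            dist = float(np.linalg.norm(pos - track.position)) # metres apart in 3D
            if sim > sim_thresh and dist < dist_thresh and sim > best_score:
                best, best_score = track, sim
        if best is None:
            tracks.append(Track(feat, pos))                    # new product
        else:
            best.position = 0.5 * best.position + 0.5 * pos    # re-detection: refine position
            best.hits += 1
    return tracks
```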
(Left to right) A sample of the discrete commands; the movement (in meters) each command caused; the MDP and solution definition. We train a model of human hand movement from demonstrations, which informs the transition probabilities T. S defines the state space, A the discrete set of verbal actions, and R the reward function. A policy is learned offline that can be reused across reaching tasks.
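As a rough illustration of how such a policy can be computed offline, the sketch below runs value iteration over a demonstration-derived transition model. The command set, reward, and state discretization are placeholders rather than the system's actual definitions.

```python
# Sketch: solve the verbal-guidance MDP offline with value iteration.
# T[a, s, s'] is the probability the hand moves from state s to s' after command a,
# estimated from recorded reaching demonstrations; R[s'] rewards states near the product.
import numpy as np

COMMANDS = ["left", "right", "up", "down", "forward", "grasp"]   # illustrative action set

def value_iteration(T, R, gamma=0.95, eps=1e-6):
    """T: (A, S, S) transition probabilities, R: (S,) reward. Returns a policy over S."""
    num_actions, num_states, _ = T.shape
    V = np.zeros(num_states)
    while True:
        Q = T @ (R + gamma * V)        # (A, S): expected return of each command in each state
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < eps:
            break
        V = V_new
    return Q.argmax(axis=0)            # index into COMMANDS for each hand state

# Usage: policy = value_iteration(T, R); speak(COMMANDS[policy[current_state]])
```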
Papers
AAMAS
ShelfHelp: Empowering Humans to Perform Vision-Independent Manipulation Tasks with a Socially Assistive Robotic Cane
The ability to shop independently, especially in grocery stores, is important for maintaining a high quality of life. This can be particularly challenging for people with visual impairments (PVI). Stores carry thousands of products, with approximately 30,000 new products introduced each year in the US market alone, presenting a challenge even for modern computer vision solutions. Through this work, we present a proof-of-concept socially assistive robotic system we call ShelfHelp, and propose novel technical solutions for enhancing instrumented canes traditionally meant for navigation tasks with additional capability within the domain of shopping. ShelfHelp includes a novel visual product locator algorithm designed for use in grocery stores and a novel planner that autonomously issues verbal manipulation guidance commands to guide the user during product retrieval. Through a human subjects study, we show the system's success in locating and providing effective manipulation guidance to retrieve desired products with novice users. We compare two autonomous verbal guidance modes, achieving performance comparable to a human assistance baseline, and present encouraging findings that validate our system's efficiency and effectiveness through positive subjective metrics including competence, intelligence, and ease of use.
IROS
ShelfHelp: Empowering Humans to Perform Vision-Independent Manipulation Tasks with a Socially Assistive Robotic Cane
The ability to shop independently, especially in grocery stores, is important for maintaining a high quality of life. This can be particularly challenging for people with visual impairments (PVI). Stores carry thousands of products, with approximately 30,000 new products introduced each year in the US market alone, presenting a challenge even for modern computer vision solutions. In this work, we present our work-in-progress investigating technical solutions for enhancing instrumented canes, traditionally meant for navigation tasks, with capability within the domain of shopping. Our system includes a novel visual product search algorithm designed for use in the wild and a novel planner that autonomously issues verbal commands to guide the user in a reaching task to acquire the desired product.