Introduction
Amid Japan's deepening labor shortage, companies are turning to physical AI robots to fill roles that are difficult to staff. This tutorial shows how to build and deploy a robot control system using Python, ROS (Robot Operating System), and computer vision to automate repetitive tasks in manufacturing or service environments.
This project demonstrates how AI can be applied to real-world problems by creating a robot that can identify, pick up, and move objects using computer vision and motion control.
Prerequisites
- Basic understanding of Python programming
- Knowledge of ROS (Robot Operating System) fundamentals
- Access to a physical robot platform (or simulation environment)
- Computer vision libraries (OpenCV, NumPy)
- Basic understanding of robotics kinematics and motion planning
Step-by-Step Instructions
1. Set up your ROS environment
First, we need to initialize our ROS workspace and install required packages. This sets up the foundation for robot control and communication.
mkdir -p ~/robot_ws/src
cd ~/robot_ws/src
catkin_init_workspace
Next, create a new ROS package for our robot control system:
catkin_create_pkg robot_control rospy roscpp std_msgs sensor_msgs image_transport cv_bridge
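The nodes in the following steps exchange a custom ObjectPosition message, which ROS does not provide out of the box. A minimal definition (assuming only pixel coordinates are needed) goes in robot_control/msg/ObjectPosition.msg:

```
# Pixel coordinates of the detected object in the camera image
int32 x
int32 y
```

Remember to enable message generation before building: add message_generation to find_package in CMakeLists.txt, list ObjectPosition.msg under add_message_files, call generate_messages, and add a message_runtime dependency in package.xml, then rebuild with catkin_make.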
2. Create the object detection node
We'll build a computer vision node that can detect and locate objects in real-time. This is crucial for the robot to know where to move its gripper.
import rospy
import cv2
import numpy as np
from sensor_msgs.msg import Image
from cv_bridge import CvBridge, CvBridgeError
from robot_control.msg import ObjectPosition

class ObjectDetector:
    def __init__(self):
        rospy.init_node('object_detector')
        self.bridge = CvBridge()
        self.image_sub = rospy.Subscriber('/camera/image_raw', Image, self.image_callback)
        self.object_pub = rospy.Publisher('/detected_object', ObjectPosition, queue_size=10)

    def image_callback(self, msg):
        try:
            cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
        except CvBridgeError as e:
            rospy.logerr("cv_bridge conversion failed: %s", e)
            return

        # Simple color-based object detection (red objects)
        hsv = cv2.cvtColor(cv_image, cv2.COLOR_BGR2HSV)
        lower_red = np.array([0, 50, 50])
        upper_red = np.array([10, 255, 255])
        mask = cv2.inRange(hsv, lower_red, upper_red)

        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            largest_contour = max(contours, key=cv2.contourArea)
            M = cv2.moments(largest_contour)
            if M["m00"] != 0:
                cx = int(M["m10"] / M["m00"])
                cy = int(M["m01"] / M["m00"])
                # Publish the object's pixel coordinates
                pos_msg = ObjectPosition()
                pos_msg.x = cx
                pos_msg.y = cy
                self.object_pub.publish(pos_msg)

if __name__ == '__main__':
    detector = ObjectDetector()
    rospy.spin()
3. Implement robot motion control
Now we create the motion control node that will move the robot's arm to the detected object location:
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal
from control_msgs.msg import FollowJointTrajectoryAction, FollowJointTrajectoryGoal
from trajectory_msgs.msg import JointTrajectoryPoint
from robot_control.msg import ObjectPosition

class RobotController:
    def __init__(self):
        rospy.init_node('robot_controller')
        # Initialize action clients
        self.move_base = actionlib.SimpleActionClient('move_base', MoveBaseAction)
        self.arm_client = actionlib.SimpleActionClient('arm_controller/follow_joint_trajectory', FollowJointTrajectoryAction)
        self.object_sub = rospy.Subscriber('/detected_object', ObjectPosition, self.object_callback)
        self.move_base.wait_for_server()
        self.arm_client.wait_for_server()

    def object_callback(self, msg):
        rospy.loginfo("Object detected at: (%d, %d)", msg.x, msg.y)
        # Convert pixel coordinates to robot workspace coordinates
        robot_x, robot_y = self.pixel_to_robot_coords(msg.x, msg.y)
        # Move robot to object position
        self.move_to_position(robot_x, robot_y)
        # Pick up object
        self.pick_up_object()

    def pixel_to_robot_coords(self, px, py):
        # Placeholder linear mapping; replace with a calibrated camera-to-robot
        # transform (e.g. from tf or a homography) for a real deployment.
        scale = rospy.get_param('~pixels_to_meters', 0.001)
        return px * scale, py * scale

    def move_to_position(self, x, y):
        goal = MoveBaseGoal()
        goal.target_pose.header.frame_id = "map"
        goal.target_pose.header.stamp = rospy.Time.now()
        goal.target_pose.pose.position.x = x
        goal.target_pose.pose.position.y = y
        goal.target_pose.pose.orientation.w = 1.0
        self.move_base.send_goal(goal)
        self.move_base.wait_for_result()

    def pick_up_object(self):
        # Define a joint trajectory for the arm; joint names must match
        # your controller configuration
        goal = FollowJointTrajectoryGoal()
        goal.trajectory.joint_names = ['joint1', 'joint2', 'joint3', 'joint4', 'joint5', 'joint6']
        point = JointTrajectoryPoint()
        point.positions = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # Gripper closed
        point.time_from_start = rospy.Duration(1.0)
        goal.trajectory.points.append(point)
        self.arm_client.send_goal(goal)
        self.arm_client.wait_for_result()

if __name__ == '__main__':
    controller = RobotController()
    rospy.spin()
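The controller's pixel_to_robot_coords uses a single uniform scale factor, which is rarely accurate. A slightly better lightweight approach, when the camera looks straight down at the workspace, is independent linear interpolation per axis between two calibration points. The pixel/world pairs below are hypothetical, not from a real calibration:

```python
def make_pixel_to_robot(px0, world0, px1, world1):
    """Return a function mapping pixel (u, v) to workspace (x, y) in metres,
    assuming a top-down camera and independent linear scaling per axis."""
    sx = (world1[0] - world0[0]) / (px1[0] - px0[0])
    sy = (world1[1] - world0[1]) / (px1[1] - px0[1])

    def convert(u, v):
        return (world0[0] + (u - px0[0]) * sx,
                world0[1] + (v - px0[1]) * sy)
    return convert

# Hypothetical calibration: pixel (0, 0) maps to (0.10 m, 0.30 m),
# pixel (640, 480) maps to (0.74 m, -0.18 m)
pixel_to_robot = make_pixel_to_robot((0, 0), (0.10, 0.30), (640, 480), (0.74, -0.18))
x, y = pixel_to_robot(320, 240)  # image centre maps to the workspace midpoint
```

For production, calibrate with a checkerboard and use tf or a full homography instead; this linear sketch ignores lens distortion and camera tilt.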
4. Create launch files
Launch files coordinate the startup of multiple nodes and parameters:
<launch>
  <node name="object_detector" pkg="robot_control" type="object_detector.py" output="screen"/>
  <node name="robot_controller" pkg="robot_control" type="robot_controller.py" output="screen"/>
  <param name="robot_description" command="$(find xacro)/xacro '$(find robot_control)/urdf/robot.urdf.xacro'" />
  <node name="robot_state_publisher" pkg="robot_state_publisher" type="robot_state_publisher" />
</launch>
5. Configure robot parameters
Define the robot's physical characteristics and joint limits in a URDF file (this is what the launch file loads into the robot_description parameter):
<robot name="robot_arm">
  <link name="base_link">
    <visual>
      <geometry>
        <box size="0.5 0.5 0.1" />
      </geometry>
    </visual>
  </link>
  <link name="arm_link">
    <visual>
      <geometry>
        <box size="0.1 0.5 0.1" />
      </geometry>
    </visual>
  </link>
  <joint name="base_to_arm" type="revolute">
    <parent link="base_link" />
    <child link="arm_link" />
    <origin xyz="0 0 0.1" />
    <axis xyz="0 0 1" />
    <limit lower="-1.57" upper="1.57" effort="100" velocity="1" />
  </joint>
</robot>
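The limit element above is enforced by the ROS joint controllers, but it is good practice to clamp commanded positions in software as well before sending a trajectory goal. A minimal helper matching the -1.57..1.57 rad range from the URDF:

```python
def clamp_joint(angle, lower=-1.57, upper=1.57):
    """Clamp a commanded joint angle (radians) to the URDF joint limits."""
    return max(lower, min(upper, angle))

# Out-of-range commands are pulled back to the nearest limit
safe = clamp_joint(2.0)    # 1.57
passthrough = clamp_joint(0.5)  # 0.5, already within limits
```

Applying this to every element of point.positions in pick_up_object prevents the controller from rejecting (or worse, partially executing) an out-of-range trajectory.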
6. Test and deploy the system
Finally, bring up the complete system with a single launch command:
roslaunch robot_control robot_system.launch
Monitor the system using:
rosnode list
rostopic echo /detected_object
rostopic echo /robot_status   # only if your platform publishes a status topic; the nodes above do not
For production deployment, consider adding error handling, safety checks, and logging to ensure the robot operates reliably in real-world conditions.
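For instance, calls that may never complete should be bounded and retried. The helper below sketches a generic retry pattern without ROS dependencies so the idea stays clear; with actionlib you would additionally pass a timeout duration to wait_for_result rather than blocking forever:

```python
def call_with_retries(fn, retries=3, on_error=None):
    """Call fn(), retrying up to `retries` times on exception.

    on_error, if given, is called with (attempt_index, exception) so the
    caller can log each failure. Re-raises the last exception if all
    attempts fail.
    """
    last_exc = None
    for attempt in range(retries):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if on_error:
                on_error(attempt, exc)
    raise last_exc

# Demonstration with a hypothetical action that fails twice, then succeeds
attempts = []
def flaky_action():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = call_with_retries(flaky_action)  # succeeds on the third attempt
```

Wrapping move_to_position and pick_up_object this way, combined with an emergency-stop check between attempts, goes a long way toward safe unattended operation.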
Summary
This tutorial demonstrates how to build a practical robot control system that addresses Japan's labor shortage challenges. By combining computer vision with robotic motion control, we've created an automated solution that can identify and manipulate objects. The system uses ROS for communication and control, OpenCV for computer vision, and a modular node architecture for maintainability.
Key learning outcomes include understanding robot control architecture, implementing computer vision for object detection, and creating ROS nodes that work together seamlessly. This approach can be scaled to handle more complex tasks in manufacturing, logistics, and service industries where labor shortages are common.