Turtlebot Voice Teleoperation
This tutorial will introduce how to control your turtlebot robot using speech recognition.
Make sure that you completed installing all the required packages in the previous tutorials and your network set-up is working fine between the ROS Master node and the Host node.
In this tutorial you will learn how to:
- Teleoperate a real and simulated turtlebot with voice commands
- Create a custom voice-command vocabulary
- Work with the pocketsphinx package in ROS
Install PocketSphinx for Speech Recognition
To install the pocketsphinx package on Ubuntu, you need to install gstreamer0.10-pocketsphinx as well as the ROS sound drivers:
sudo apt-get install gstreamer0.10-pocketsphinx
sudo apt-get install ros-indigo-pocketsphinx
sudo apt-get install ros-indigo-audio-common
sudo apt-get install libasound2
You do not need to worry about connecting the audio input stream with your PC or publishing a topic because the node recognizer.py in the
pocketsphinx package does all the work for you.
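Once the recognizer is launched (as shown in the next section), you can double-check that it publishes the recognized speech by inspecting its output topic, which should report the message type std_msgs/String:
rostopic info /recognizer/output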
Test the PocketSphinx Recognizer
First, you need to test whether the recognizer is working. For the best results, it is better to have an external microphone connected to your PC via USB, a standard audio jack, or Bluetooth.
Make sure that the microphone you are using is the one selected in the sound settings of your PC. Test your microphone beforehand to make sure that the quality is good, and check the volume meter.
pocketsphinx comes as a package in ROS Indigo. For more details, refer to the pocketsphinx package page on the ROS Wiki.
You can browse the directory of the pocketsphinx package with the roscd command:
roscd pocketsphinx
In the demo/ directory, you can find the voice commands and the dictionary. You may want to have a look at their content using a text editor such as nano.
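For example, assuming you are still inside the pocketsphinx package directory after the roscd command above:
nano demo/robocup.dic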
Then, run the following command:
roslaunch pocketsphinx robocup.launch
The robocup.launch file contains the following code:
<launch>
  <node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
    <param name="lm" value="$(find pocketsphinx)/demo/robocup.lm"/>
    <param name="dict" value="$(find pocketsphinx)/demo/robocup.dic"/>
  </node>
</launch>
This command runs the recognizer.py script, loading the robocup.dic dictionary and the robocup.lm language model.
A list of INFO messages will appear showing the loading of the recognition model. The last few messages will look like this:
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 12 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 328
INFO: ngram_search_fwdtree.c(338): after: 77 root, 200 non-root channels, 11 single-phone words
Now, you can start saying some words to check if the recognizer will be able to understand what you are saying. For example, try to say the following words:
hello
go to the room
my name is
door
follow
room
You can also try different words.
It is possible that the recognizer detects words different from what you pronounced. This may be due to (1) a bad microphone, in which case try to get a high-quality microphone, or (2) unclear pronunciation, in which case try to repeat the word.
If the recognizer successfully detected your spoken words, you can move to the next step and talk to your robot.
The spoken words found by the recognizer are published to the topic /recognizer/output. Run
rostopic echo /recognizer/output
in another terminal to see the results as follows:
data: go to the room
---
data: hello
---
To see all the predefined commands in the RoboCup demo, run the following commands:
roscd pocketsphinx/demo
more robocup.corpus
Try saying a phrase that is not in the list, such as "the food is hot", and see the result on the topic /recognizer/output: it will show something different. The recognizer will always try to find the nearest match to what you say.
Make sure that you mute the recognizer when you are not using it; otherwise it will keep sending random data to the robot.
Code and dependencies
The voice teleoperation scripts can be found in gaitech_edu/src/turtlebot/voice_teleop/ in the gaitech_edu repository that you imported from GitHub. If you did not import the code from GitHub, you can still create a new ROS package as follows:
- Go to your catkin workspace and then go to the src/ folder (~/catkin_ws/src/).
- Create a new ROS package called gaitech_edu (or choose any other name) that depends on roscpp, rospy, pocketsphinx, sound_play, and std_msgs, as follows:
catkin_create_pkg gaitech_edu roscpp rospy pocketsphinx sound_play std_msgs
- In ~/catkin_ws/src/, list the contents of the new package (for example with ls gaitech_edu) to see all the files and folders that were created.
- Now, compile your newly added package:
$ cd ~/catkin_ws
~/catkin_ws$ catkin_make
- Finally, open your package.xml file and add all the required dependencies (otherwise, your project will not find the required packages):
<!-- Remove the commented parts -->
<package>
  <name>gaitech_edu</name>
  <version>0.0.1</version>
  <description>gaitech_edu</description>
  <maintainer email="firstname.lastname@example.org">ros</maintainer>
  <license>TODO</license>
  <buildtool_depend>catkin</buildtool_depend>
  <build_depend>pocketsphinx</build_depend>
  <build_depend>roscpp</build_depend>
  <build_depend>rospy</build_depend>
  <build_depend>sound_play</build_depend>
  <build_depend>std_msgs</build_depend>
  <run_depend>pocketsphinx</run_depend>
  <run_depend>roscpp</run_depend>
  <run_depend>rospy</run_depend>
  <run_depend>sound_play</run_depend>
  <run_depend>std_msgs</run_depend>
</package>
Now, you are done with creating the ROS package.
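A small note that may save you trouble later: if you write the Python scripts of this tutorial yourself instead of importing them from GitHub, remember to make them executable so that rosrun and roslaunch can start them. The path below is only an example and depends on where you place the script:
chmod +x ~/catkin_ws/src/gaitech_edu/src/turtlebot/voice_teleop/scripts/voice_teleop.py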
Create Your Vocabulary of Commands
In this section, you will learn how to add a vocabulary, or corpus, in the format expected by the pocketsphinx recognizer.
In particular, we will create a simple vocabulary of commands to move the turtlebot robot forward and backward, and to rotate it left and right.
Create a folder called config and, inside this folder, create a text file called motion_commands.txt. Put the following commands (which you can extend later) for the robot motion in it:
move forward
move backwards
turn right
turn left
Feel free to add/delete/change any command you want as long as you follow the conventions.
Do not use punctuation marks and pay attention to upper- and lower-case letters. If you want to add a number, you have to spell it out: you cannot write 1, 55, 87, etc.; instead write one, fifty five, eighty seven.
After editing the motion_commands.txt file, you have to compile it into special dictionary and pronunciation files so that it matches the format expected by the recognizer.
The online CMU language model (lm) tool is very useful for this: visit its website and upload your file.
Click on the Compile Knowledge Base button, then download the file labeled COMPRESSED TARBALL, which contains all the language model files that you need and that PocketSphinx can understand.
Extract these files into the config subdirectory of the gaitech_edu package (or of whichever package you are using for this example). These files must be provided as input parameters to the recognizer.py node. To do so, you need to create a launch file as follows.
- First, create a folder called launch, where you will create your launch files.
- Then, create a file called
recognizer.launch, and add the following XML code:
<launch>
  <node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
    <param name="lm" value="$(find gaitech_edu)/turtlebot/voice_teleop/config/motion_commands.lm"/>
    <param name="dict" value="$(find gaitech_edu)/turtlebot/voice_teleop/config/motion_commands.dic"/>
  </node>
</launch>
If your package name is different from gaitech_edu, make sure to adjust it in the value="$(find gaitech_edu)" instruction of the launch file; otherwise, ROS will not be able to find the parameters. Also make sure that you put the correct paths to the .lm and .dic files.
This file runs the
recognizer.py node from the
pocketsphinx package mentioned before in this tutorial.
The output="screen" attribute lets us see the recognition results in real time in the launch window.
Run the launch file:
roslaunch gaitech_edu recognizer.launch
Then, in another terminal, run the following command to see the recognition results published after you say a couple of commands:
rostopic echo /recognizer/output
Make sure to close all the running launch files and all the demos running from previous examples before you run the previous commands.
This is the content of the voice_teleop.py file in gaitech_edu/src/turtlebot/voice_teleop/:
#!/usr/bin/env python
import rospy
from geometry_msgs.msg import Twist
from std_msgs.msg import String

class RobotVoiceTeleop:
    #define the constructor of the class
    def __init__(self):
        #initialize the ROS node with a name voice_teleop
        rospy.init_node('voice_teleop')
        # Publish the Twist message to the cmd_vel topic
        self.cmd_vel_pub = rospy.Publisher('/cmd_vel', Twist, queue_size=5)
        # Subscribe to the /recognizer/output topic to receive voice commands.
        rospy.Subscriber('/recognizer/output', String, self.voice_command_callback)
        #create a Rate object to sleep the process at 5 Hz
        rate = rospy.Rate(5)
        # Initialize the Twist message we will publish.
        self.cmd_vel = Twist()
        #make sure to make the robot stop by default
        self.cmd_vel.linear.x = 0
        self.cmd_vel.angular.z = 0
        # A mapping from keywords or phrases to commands
        #we consider the following simple commands, which you can extend on your own
        self.commands = ['stop',
                         'forward',
                         'backward',
                         'turn left',
                         'turn right',
                         ]
        rospy.loginfo("Ready to receive voice commands")
        # We have to keep publishing the cmd_vel message if we want the robot to keep moving.
        while not rospy.is_shutdown():
            self.cmd_vel_pub.publish(self.cmd_vel)
            rate.sleep()

    def voice_command_callback(self, msg):
        # Get the motion command from the recognized phrase
        command = msg.data
        if (command in self.commands):
            if command == 'forward':
                self.cmd_vel.linear.x = 0.2
                self.cmd_vel.angular.z = 0.0
            elif command == 'backward':
                self.cmd_vel.linear.x = -0.2
                self.cmd_vel.angular.z = 0.0
            elif command == 'turn left':
                self.cmd_vel.linear.x = 0.0
                self.cmd_vel.angular.z = 0.5
            elif command == 'turn right':
                self.cmd_vel.linear.x = 0.0
                self.cmd_vel.angular.z = -0.5
            elif command == 'stop':
                self.cmd_vel.linear.x = 0.0
                self.cmd_vel.angular.z = 0.0
        else:
            #command not found
            #print 'command not found: '+command
            self.cmd_vel.linear.x = 0.0
            self.cmd_vel.angular.z = 0.0
        print("linear speed : " + str(self.cmd_vel.linear.x))
        print("angular speed: " + str(self.cmd_vel.angular.z))

if __name__ == "__main__":
    try:
        RobotVoiceTeleop()
        rospy.spin()
    except rospy.ROSInterruptException:
        rospy.loginfo("Voice navigation terminated.")
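Before testing with a microphone, you can sanity-check the node by publishing a fake recognition result on /recognizer/output yourself; this mimics what recognizer.py would publish when it hears the word forward:
rostopic pub -1 /recognizer/output std_msgs/String -- "forward"
The node should then report a linear speed of 0.2 in its terminal output.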
To execute the code, we create the following launch file, called turtlebot_voice_teleop_stage.launch, which runs the recognizer.py and voice_teleop.py nodes and starts the Turtlebot Stage simulator:
<launch>
  <node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
    <param name="lm" value="$(find gaitech_edu)/src/turtlebot/voice_teleop/config/motion_commands.lm"/>
    <param name="dict" value="$(find gaitech_edu)/src/turtlebot/voice_teleop/config/motion_commands.dic"/>
  </node>
  <node name="voice_teleop" pkg="gaitech_edu" type="voice_teleop.py" output="screen">
    <remap from="/cmd_vel" to="/cmd_vel_mux/input/teleop"/>
  </node>
  <include file="$(find turtlebot_stage)/launch/turtlebot_in_stage.launch"/>
</launch>
The first node starts the
recognizer.py with the
motion_commands dictionary that we created previously.
The second node starts the voice_teleop.py node, which receives the voice commands and maps them to velocities.
The last part of the launch file includes turtlebot_in_stage.launch, which starts the Stage simulator and brings up the turtlebot robot in simulation. This is equivalent to the command roslaunch turtlebot_stage turtlebot_in_stage.launch. It is also possible to run the voice_teleop.py node with the Turtlebot Gazebo simulator by replacing the last line of the launch file with:
<launch>
  ...
  <include file="$(find turtlebot_gazebo)/launch/turtlebot_world.launch"/>
</launch>
It is important to note the topic remapping instruction
<remap from="/cmd_vel" to="/cmd_vel_mux/input/teleop"/> in the launch file for the voice_teleop node.
In fact, the voice_teleop.py node publishes velocities to the topic /cmd_vel, whereas the turtlebot_stage simulator waits for velocities on the topic /cmd_vel_mux/input/teleop.
This is the reason why we need to remap the topic /cmd_vel to /cmd_vel_mux/input/teleop. For more information about the <remap> tag, refer to the <remap> page on the ROS Wiki.
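If you want to verify that the remapping took effect while the launch file is running, you can inspect the target topic (adapt the names if you changed them); the voice_teleop node should appear among its publishers:
rostopic info /cmd_vel_mux/input/teleop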
Testing the Voice-Control in the Gazebo and Stage Simulators
To test the voice teleoperation using the Turtlebot Stage simulator, simply run the launch file described above:
roslaunch gaitech_edu turtlebot_voice_teleop_stage.launch
Or, if you want to test with
Turtlebot Gazebo Simulator, simply run the following:
roslaunch gaitech_edu turtlebot_voice_teleop_gazebo.launch
This is equivalent to running the following three commands in three terminals:
roslaunch gaitech_edu recognizer.launch
rosrun gaitech_edu voice_teleop.py
roslaunch turtlebot_stage turtlebot_in_stage.launch
These simulators require a powerful PC with a good graphics card. They may also crash when you start them; don't worry, this is quite common, just rerun the launch file until it starts.
Make sure to check your microphone settings as described above. If you get an error running the recognizer node, try installing the following package:
sudo apt-get install gstreamer0.10-gconf
To view the commands that are recognized by the robot, run rqt_console with the following command:
rqt_console
Now, test your robot by giving it any command from the list of commands you created above.
Note that commands may not be correctly recognized if your voice is not clear or your microphone is not good enough. Try a high-quality microphone for more reliable results.
You can then update the code to add more commands, such as faster, to increase the speed of the robot, and slower, to decrease it. The file voice_teleop_advanced.py contains a more elaborate example with more commands.
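As an illustration only (this is a sketch, not the actual content of voice_teleop_advanced.py), faster and slower could be handled by adjusting the current linear velocity by a fixed increment while respecting a maximum speed; the increment and limit values below are arbitrary assumptions:
# assumed tuning values, not taken from the original scripts
LINEAR_INCREMENT = 0.05
MAX_LINEAR_SPEED = 0.8

def adjust_linear_speed(current_speed, command):
    """Return the new linear speed after a 'faster' or 'slower' voice command."""
    if command == 'faster':
        # speed up, but never beyond the allowed maximum
        return min(current_speed + LINEAR_INCREMENT, MAX_LINEAR_SPEED)
    if command == 'slower':
        # slow down, but do not start moving backwards
        return max(current_speed - LINEAR_INCREMENT, 0.0)
    return current_speed

# inside voice_command_callback you would then write, for example:
#   self.cmd_vel.linear.x = adjust_linear_speed(self.cmd_vel.linear.x, command)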
Testing the Voice-Control with a Real Turtlebot Robot
Now, you will test the voice teleoperation with a real turtlebot robot.
Before you test the robot, make sure that it is in an open space with no obstacles or edges next to it. Also, make sure that your computer is correctly configured to work with the Turtlebot laptop, as described in the Network Configuration tutorial.
From the ROS Master (the Turtlebot's laptop), run the following command:
roslaunch turtlebot_bringup minimal.launch
To make the monitoring process easier, bring up rqt_console by running:
rqt_console
On the host node (the user PC), run the voice teleoperation launch file:
roslaunch gaitech_edu voice_teleop.launch
Try a simple command at first, like rotate right, to avoid any accidents.
You can change the robot's speed by giving the command "go faster" or "slow down"; this will change the speed parameters in the teleoperation script. However, you will have to add these commands, as explained previously, to the config/voice_teleop.txt file and redo all the steps again.
In this section, you will extend the example above and implement a smarter version of the voice teleoperation application.
Question 1: Extending Voice Vocabulary
Extend the example above to consider more voice commands to control the robot motion. Implement the following commands:
First, define the following variables:
start_speed: the initial speed used when starting the script.
linear_increment: an increment to add to or subtract from the linear velocity. The default value is 0.05
angular_increment: an increment to add to or subtract from the angular velocity. The default value is 0.4
max_speed: the maximum linear velocity allowed
max_angular_speed: the maximum angular velocity allowed
rotate left: makes the robot rotate to the left (with a linear velocity equal to zero)
rotate right: makes the robot rotate to the right (with a linear velocity equal to zero)
turn left: if the linear velocity is different from zero, it increases the angular velocity by angular_increment. If the linear velocity is zero, it sets the angular velocity to the default angular speed (e.g. 0.5)
turn right: if the linear velocity is different from zero, it decreases the angular velocity with
angular_increment. If the linear velocity is zero, it sets the angular velocity to the default negative angular speed (e.g. -0.5)
slower: decreases the linear velocity by linear_increment and the angular velocity by angular_increment
faster: increases the linear velocity by linear_increment and the angular velocity by angular_increment
half: sets the linear and angular velocities to the half of their current values.
full: sets the linear speeds to the maximum speed value
Make sure that all speeds are within the range of the minimum and maximum speeds allowed.
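For example, a minimal clamping helper could look like the sketch below; the function name and the limit values are assumptions for illustration, not part of the original code:
# hypothetical limits, adjust to your robot
MAX_SPEED = 0.8          # maximum linear velocity allowed
MAX_ANGULAR_SPEED = 1.5  # maximum angular velocity allowed

def clamp(value, minimum, maximum):
    """Keep a velocity within the allowed range."""
    return max(minimum, min(value, maximum))

# usage inside the callback, for example:
#   self.cmd_vel.linear.x = clamp(self.cmd_vel.linear.x, -MAX_SPEED, MAX_SPEED)
#   self.cmd_vel.angular.z = clamp(self.cmd_vel.angular.z, -MAX_ANGULAR_SPEED, MAX_ANGULAR_SPEED)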
After modification, test your code as illustrated above on both the simulated and the real turtlebot robot.
Question 2: Leveraging the Use of Launch Files
The variables defined above (start_speed, linear_increment, angular_increment, max_speed, and max_angular_speed) are set based on the user's preference.
It is more appropriate to set these values as parameters in the launch file and then read these parameters from your program or script before using them. Modify the code to allow the user to define these parameters from the launch file.
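A minimal sketch of the Python side, assuming private parameters with these illustrative names and defaults (the corresponding <param> tags would go inside the voice_teleop <node> element of your launch file):
#!/usr/bin/env python
import rospy

rospy.init_node('voice_teleop')
# read private (~) parameters set from the launch file, falling back to defaults
start_speed = rospy.get_param('~start_speed', 0.2)
linear_increment = rospy.get_param('~linear_increment', 0.05)
angular_increment = rospy.get_param('~angular_increment', 0.4)
max_speed = rospy.get_param('~max_speed', 0.8)
max_angular_speed = rospy.get_param('~max_angular_speed', 1.5)
rospy.loginfo("start_speed=%s max_speed=%s", start_speed, max_speed)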
Question 3: Enabling/Disabling Voice Teleoperation
Now, you will add a new functionality to either enable or disable the voice teleoperation. For this, you need to add two keywords to the vocabulary.
pause: when the user says pause, the voice teleoperation should be paused. This means that even if the user says more voice commands, they will not be executed.
resume: when the user says resume, the voice teleoperation is resumed and voice commands will be executed again.
Make necessary changes to provide these functionalities.
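One possible approach, sketched here under the assumption that you add a paused attribute to the RobotVoiceTeleop class (initialized to False in __init__) and that pause and resume are added to the corpus, is to check the flag at the top of the callback:
def voice_command_callback(self, msg):
    command = msg.data
    if command == 'pause':
        # disable voice teleoperation and stop the robot
        self.paused = True
        self.cmd_vel.linear.x = 0.0
        self.cmd_vel.angular.z = 0.0
        return
    if command == 'resume':
        # voice commands will be executed again
        self.paused = False
        return
    if self.paused:
        # teleoperation is disabled: ignore all motion commands
        return
    # ... existing handling of forward/backward/turn/stop goes here ...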