2. Introduction to Autnomous Navigation 2
14 May 2020 | ROBOTICS
Autonomous Navigation System
Client & Cloud System
S/W and skill requirements for autonomous driving
the 5 levels of driving automation
Client System
- ROS
- Hardware Platform
Client system in autonomous driving
- Requirements
- Multiple algorithms are integrated to satisfy real-time and reliability requirements.
- The processing pipeline must be fast enough to handle the vast amount of data generated by the sensor.
- Sufficient enough to recover even if an error occurs in any part
- Strict energy and resource constraints must also be met in processing all of these operations.
- Component
- operating system
- Hardware platform
ROS(Robot operating System)
- ROS
- Powerful distributed computing framework specialized for robotics applications
- Specific tasks are driven from the node.
- Nodes communicate with each other through topics and services.
- Problem
- Reliability: In ROS, there is only one master, and there is no monitor to recover the failed node.
- Performance: In the process of sending a message in a broadcast manner, the same message may be duplicated multiple times, resulting in a decrease in performance.
- Security: No authentication and encryption mechanisms.
- responsibility
- Since there is only one master node, if the master node dies, the entire system dies.
- This is against the safety requirements of the autonomous driving system.
- To solve this, a mechanism similar to that of the zookeeper (distributed coordinator) was introduced.
- When the main master node dies, the backup master node is taken over to operate the system normally.
- When monitoring through the zookeeper mechanism and detecting a failed node, restart it.
- Performance
- The communication between the nodes is very frequent, so it must be handled efficiently.
- Local nodes communicate with each other using a loop-back mechanism, which causes about 20ms of overhead each time.
- To eliminate this overhead
- Use shared memory mechanism to process messages to go straight to the destination node without going through the TCP / IP stack
- ROS node replaces the message communication method from broadcasting to multicast mechanism, greatly reducing the bandwidth consumed by making multiple copies during the broadcasting process.
- security
- Security problem occurs because there is no symptom and encryption mechanism
- Because there is no authentication, the node may be down in such a way that it infiltrates the node and continuously allocates memory, exhausting system memory.
- Since it does not encrypt the messages that are sent and received between nodes, it can be subjected to a man-in-the-middle attack that intercepts them.
- solution
- It is solved by using LXC (Linux Container) to limit the amount of allowed resources for each node, and protect the node from the outside through the sandbox mechanism.
- Encrypt messages sent and received during communication
ROS1 vs ROS2
- Resolving the reliability, performance, and security problems of ROS1 in ROS2
- Changed data processing method (XMLRPS / TCPROS => DDS / RTPS)
- Multi-node is possible in one process (Peer to Peer communication method does not have a separate master and is performed according to the role.)
- Improving reliability and performance through methods such as notifying the master of the polling method based on the parameter, etc.
- Real-time computing (not for performance, but for final decision)
- Enhanced security through DDS (Data Distribution Service) security
- Compatible with various OS, python3 support, various embedded system support etc. https
Hardware Platform
- Example
- Specifications used by autonomous driving companies
- Usually use 2 compute boxes (main + backup)
- Processor: Intel Xeon E5 x (1 pc.)
- GPU: NvidiaK80 x (4 ~ 8)
- 5,000W power
- $ 20,000 ~ $ 30,000 (about 2,000 ~ 30 million won or more)
- Embedded specification
- ARM SoC
- 15W power
- Localization Pipeline: Processing 25 images per second
- Deep Learning Pipeline: 2-3 objects per second
- Planning and control pipeline: route planning within 6ms
- Driving at about 5 miles per hour (8 km / h)
- Example: Apollo test hardware pack presented by Baidu
Deeplearning on Embeded board
Example
Cloud Platform
Client platform in autonomous driving
- Requirements
- Because autonomous vehicles are a type of mobile system, a cloud platform that supports them is required.
- The main functions of the cloud are largely distributed computing (processing, analysis) and distributed storage (storage).
- Simulation to verify new algorithm
- HD map production and update
- Deep Learning Model Learning and Updates
- Required elements
- Distributed Computing: Spark
- Heterogeneous Computing: OpenCL
- In-memory storage: Alluxio
Simulation
- Reasons for simulation
- Rigorous testing is required before deploying the newly developed algorithm to the vehicle.
- Testing with a real car is a huge cost and time consuming and safety issue.
- Mainly tested in a simulator implemented by playing data on ROS nodes
- It takes a long time to test on one machine, and the test range is not enough.
- Distributed computing nodes are managed as sparks, and ROS replay instance is driven for each node.(3 hour test running on one server => test in 25 minutes using 8 nodes through a distributed system)
- Opensource
- Compatible with various platforms
- There are many methods based on ROS, Unity, Unreal Engine, etc.
- LGsvl: led by LG
- Airsim: led by MS
- Carla: Intel, Toyota led
HD MAP
- HD map creation process
- Original data processing
- Point cloud creation
- Point Cloud Alignment
- 2D reflection map generation
- HD Map Labeling
- Create final map
- Each step is connected to one job through Spark (In-memory computing mechanism eliminates the need to store intermediate data on the hard disk, greatly improving the performance of the map creation process)
- Evolution of HD map
Deeplearning Model
- Requirements
- Because autonomous driving systems use a variety of deep learning models, it is necessary to update them to continuously improve the effectiveness and efficiency of the model.
- It is difficult to quickly learn a model using only one server.
- Requires a distributed deep learning system based on distributed computing platforms (e.g. Spark) and deep learning frameworks (e.g. paddles) (frameworks such as Tensorflow support them on their own)
- Requires a parameter server (e.g., Luxi)
- Autonomous driving platform
- Autoware
- Apollo
-
Nividia Drive Platform
- Reference
SLAM KR 발표 자료
Autonomous Navigation System
Client & Cloud System
S/W and skill requirements for autonomous driving
the 5 levels of driving automation
Client System
- ROS
- Hardware Platform
Client system in autonomous driving
- Requirements
- Multiple algorithms are integrated to satisfy real-time and reliability requirements.
- The processing pipeline must be fast enough to handle the vast amount of data generated by the sensor.
- Sufficient enough to recover even if an error occurs in any part
- Strict energy and resource constraints must also be met in processing all of these operations.
- Component
- operating system
- Hardware platform
ROS(Robot operating System)
- ROS
- Powerful distributed computing framework specialized for robotics applications
- Specific tasks are driven from the node.
- Nodes communicate with each other through topics and services.
- Problem
- Reliability: In ROS, there is only one master, and there is no monitor to recover the failed node.
- Performance: In the process of sending a message in a broadcast manner, the same message may be duplicated multiple times, resulting in a decrease in performance.
- Security: No authentication and encryption mechanisms.
- responsibility
- Since there is only one master node, if the master node dies, the entire system dies.
- This is against the safety requirements of the autonomous driving system.
- To solve this, a mechanism similar to that of the zookeeper (distributed coordinator) was introduced.
- When the main master node dies, the backup master node is taken over to operate the system normally.
- When monitoring through the zookeeper mechanism and detecting a failed node, restart it.
- Performance
- The communication between the nodes is very frequent, so it must be handled efficiently.
- Local nodes communicate with each other using a loop-back mechanism, which causes about 20ms of overhead each time.
- To eliminate this overhead
- Use shared memory mechanism to process messages to go straight to the destination node without going through the TCP / IP stack
- ROS node replaces the message communication method from broadcasting to multicast mechanism, greatly reducing the bandwidth consumed by making multiple copies during the broadcasting process.
- security
- Security problem occurs because there is no symptom and encryption mechanism
- Because there is no authentication, the node may be down in such a way that it infiltrates the node and continuously allocates memory, exhausting system memory.
- Since it does not encrypt the messages that are sent and received between nodes, it can be subjected to a man-in-the-middle attack that intercepts them.
- solution
- It is solved by using LXC (Linux Container) to limit the amount of allowed resources for each node, and protect the node from the outside through the sandbox mechanism.
- Encrypt messages sent and received during communication
- Security problem occurs because there is no symptom and encryption mechanism
ROS1 vs ROS2
- Resolving the reliability, performance, and security problems of ROS1 in ROS2
- Changed data processing method (XMLRPS / TCPROS => DDS / RTPS)
- Multi-node is possible in one process (Peer to Peer communication method does not have a separate master and is performed according to the role.)
- Improving reliability and performance through methods such as notifying the master of the polling method based on the parameter, etc.
- Real-time computing (not for performance, but for final decision)
- Enhanced security through DDS (Data Distribution Service) security
- Compatible with various OS, python3 support, various embedded system support etc. https
Hardware Platform
- Example
- Specifications used by autonomous driving companies
- Usually use 2 compute boxes (main + backup)
- Processor: Intel Xeon E5 x (1 pc.)
- GPU: NvidiaK80 x (4 ~ 8)
- 5,000W power
- $ 20,000 ~ $ 30,000 (about 2,000 ~ 30 million won or more)
- Embedded specification
- ARM SoC
- 15W power
- Localization Pipeline: Processing 25 images per second
- Deep Learning Pipeline: 2-3 objects per second
- Planning and control pipeline: route planning within 6ms
- Driving at about 5 miles per hour (8 km / h)
- Specifications used by autonomous driving companies
- Example: Apollo test hardware pack presented by Baidu
Deeplearning on Embeded board
Example
Cloud Platform
Client platform in autonomous driving
- Requirements
- Because autonomous vehicles are a type of mobile system, a cloud platform that supports them is required.
- The main functions of the cloud are largely distributed computing (processing, analysis) and distributed storage (storage).
- Simulation to verify new algorithm
- HD map production and update
- Deep Learning Model Learning and Updates
- Required elements
- Distributed Computing: Spark
- Heterogeneous Computing: OpenCL
- In-memory storage: Alluxio
Simulation
- Reasons for simulation
- Rigorous testing is required before deploying the newly developed algorithm to the vehicle.
- Testing with a real car is a huge cost and time consuming and safety issue.
- Mainly tested in a simulator implemented by playing data on ROS nodes
- It takes a long time to test on one machine, and the test range is not enough.
- Distributed computing nodes are managed as sparks, and ROS replay instance is driven for each node.(3 hour test running on one server => test in 25 minutes using 8 nodes through a distributed system)
- Opensource
- Compatible with various platforms
- There are many methods based on ROS, Unity, Unreal Engine, etc.
- LGsvl: led by LG
- Airsim: led by MS
- Carla: Intel, Toyota led
HD MAP
- HD map creation process
- Original data processing
- Point cloud creation
- Point Cloud Alignment
- 2D reflection map generation
- HD Map Labeling
- Create final map
- Each step is connected to one job through Spark (In-memory computing mechanism eliminates the need to store intermediate data on the hard disk, greatly improving the performance of the map creation process)
- Evolution of HD map
Deeplearning Model
- Requirements
- Because autonomous driving systems use a variety of deep learning models, it is necessary to update them to continuously improve the effectiveness and efficiency of the model.
- It is difficult to quickly learn a model using only one server.
- Requires a distributed deep learning system based on distributed computing platforms (e.g. Spark) and deep learning frameworks (e.g. paddles) (frameworks such as Tensorflow support them on their own)
- Requires a parameter server (e.g., Luxi)
- Autonomous driving platform
- Autoware
- Apollo
-
Nividia Drive Platform
- Reference SLAM KR 발표 자료
Comments