Assignment 5 - Multimodal Input: An Autonomous Driving System

Due - Sunday, November 24, 2024

This is a group project of two or three students. The groups can be found in the document Assignment5_Groups.docx.

Introduction

The use of multimodal input in machine learning is gaining popularity and receiving increasing attention in the research and industry communities. Multimodal learning refers to the integration and simultaneous processing of data from multiple modalities, such as text, images, audio, video, or sensor data, to improve the performance and robustness of machine learning models.

UML Diagram: Multimodal Input

In this assignment, you must provide a UML diagram of the multimodal input design pattern for machine learning. You may use general components in your UML diagram, or the components described in this assignment if you wish.

Common Usage: Multimodal Input

The multimodal input machine learning design pattern has found various applications across different industries. Here are some common uses of multimodal input in industry:

  1. Multimedia Analysis: Multimodal learning is extensively used in multimedia analysis tasks, such as image and video understanding, object recognition, image captioning, and video summarization. By combining visual and textual information, models can achieve better comprehension and interpretation of multimedia content.
  2. Natural Language Processing (NLP): Multimodal input is employed in NLP tasks to incorporate additional contextual information from images or other modalities. For example, in text sentiment analysis, combining text with accompanying images or social media posts can provide more comprehensive sentiment understanding.
  3. Human-Computer Interaction (HCI): Multimodal input enables more natural and intuitive human-computer interaction. Applications such as voice assistants, gesture recognition systems, and emotion detection systems utilize multimodal input to better understand and respond to user inputs, incorporating both visual and auditory cues.
  4. Autonomous Vehicles: In the field of autonomous driving, multimodal input plays a critical role. Systems leverage data from sensors, cameras, LIDAR, and GPS to process visual, spatial, and temporal information, enabling vehicles to perceive the environment, detect objects, and make informed decisions.
  5. Healthcare: Multimodal input is increasingly used in healthcare applications. For instance, combining patient data from electronic health records (EHR), medical images, and clinical notes can aid in diagnosis, treatment planning, and patient monitoring.
  6. Robotics: In robotics applications, multimodal input enables robots to perceive the world using multiple sensors, including vision, audio, and touch. Combining these inputs helps robots understand their environment, interact with objects, and perform complex tasks.
  7. Augmented and Virtual Reality: Multimodal input is utilized in augmented and virtual reality systems to enhance user experiences. By integrating visual, auditory, and haptic feedback, these systems provide a more immersive and interactive environment.
  8. Surveillance and Security: Multimodal input is valuable in surveillance and security applications. Integrating video, audio, and sensor data allows for more accurate event detection, anomaly detection, and threat identification.

The Autonomous Driving System

You are to design and implement an autonomous driving system using the multimodal input design pattern. You do not have to use this pattern; you may use another if you wish, but you must justify your choice. Your autonomous driving system combines data from three sensors - cameras, LIDAR (light detection and ranging), and GPS - to perceive the environment and make informed driving decisions.

The Components

You will require the following data structures: CameraData, LidarData and GPSData. These data structures have been given to you. You can modify them if you wish.
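The actual definitions of these data structures ship with the skeleton code. Purely as an illustration of their role (the field names below are assumptions, not the provided ones), they might look like:

```python
from dataclasses import dataclass

@dataclass
class CameraData:
    # Hypothetical field: a measure of visual obstacle density,
    # which the perception module could use to adjust speed
    obstacle_density: float = 0.0

@dataclass
class LidarData:
    # Hypothetical field: angular offset (degrees) toward the
    # clearest path, used to adjust the vehicle's direction
    obstacle_angle: float = 0.0

@dataclass
class GPSData:
    latitude: float = 0.0
    longitude: float = 0.0
```

If the provided structures differ, use them as given; the point is that each modality carries its own data type into the perception module.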

You will require the following modules:
PerceptionModule: The perception module processes camera, LIDAR, and GPS data to detect objects and obstacles once per tick (one tick = one hour). Camera data adjusts the vehicle's speed, and LIDAR data adjusts the vehicle's direction. At every tick the current GPS data is updated. The perception module is already given to you for this assignment. You can modify it if you wish.
PlanningModule: The planning module plans the driving route based on the current GPS data and the destination GPS data, and updates the route based on information from the camera data and the LIDAR data.
ControlModule: The control module initializes the GPS data (current and destination) and then processes camera, LIDAR, and GPS data through the perception module. It also plans and updates the route through the planning module. If the vehicle is within 25 km of its destination, the simulation stops.

You might need an AutonomousDrivingSystem class to act as the central controller: it passes the initial GPS data to the control module and runs the simulation. Your simulation should run for 24 hours (24 ticks) or until the vehicle is within 25 km of its destination. If your design does not need this class, you may omit it.
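One possible shape for the simulation loop is sketched below. The function names and the flat-grid distance constant are assumptions for illustration; the `step` callback stands in for whatever your perception, planning, and control modules do each tick.

```python
import math

DEGREE_KM = 111.2        # rough km per degree of latitude/longitude (assumption)
ARRIVAL_RADIUS_KM = 25   # stop when within 25 km of the destination
MAX_TICKS = 24           # 24 one-hour ticks

def distance_km(cur, dest):
    # Flat-grid approximation: treat (lat, lon) degrees as a plane
    dlat = dest[0] - cur[0]
    dlon = dest[1] - cur[1]
    return math.hypot(dlat, dlon) * DEGREE_KM

def run_simulation(current, destination, step):
    """Advance `current` toward `destination` once per tick until arrival.

    `step` is a caller-supplied function standing in for the perception,
    planning, and control modules: given the current and destination
    positions, it returns the next position.
    """
    for _ in range(MAX_TICKS):
        if distance_km(current, destination) <= ARRIVAL_RADIUS_KM:
            break
        current = step(current, destination)
    return current
```

Whether this loop lives in an AutonomousDrivingSystem class or directly in the control module is a design decision you should justify in your submission.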

In the main() function, an object of your autonomous driving system is created, and the user is prompted for the current and destination GPS data, which are passed to the autonomous driving system. The simulation is then run. At the end, main() prints out:
You have arrived! (close enough)
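A minimal sketch of this main() flow in Python follows; the class name and its methods are assumptions (they depend on your own design), so the calls to them are shown commented out.

```python
def read_gps(prompt_lat, prompt_lon):
    # Prompt the user for a latitude/longitude pair
    lat = float(input(prompt_lat))
    lon = float(input(prompt_lon))
    return (lat, lon)

def main():
    current = read_gps("What is your initial location (latitude): ",
                       "                             (longitude): ")
    destination = read_gps("What is your destination location (latitude): ",
                           "                                 (longitude): ")
    # system = AutonomousDrivingSystem(current, destination)  # hypothetical class
    # system.run()   # runs up to 24 ticks, or until within 25 km
    print("You have arrived! (close enough)")
```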

As with previous assignments, you can write your code in one of the following four languages: C++, C#, Java or Python. Keep in mind that at least one of your assignments has to be written in Java and at least one in C#.

Skeleton code has been provided for you. Please be sure to break this code into appropriate files:
C++: AutoDrivingSystem.cpp,
C#: AutoDrivingSystem.cs,
Java: AutoDrivingSystem.java,
Python: AutoDrivingSystem.py.

Test Program

A sample run might look as follows:

What is your initial location (latitude): 30
                             (longitude): 30
What is your destination location (latitude): 32
                                 (longitude): 32
You are at 30 long, 30 lat. You want to be at 32 long, 32 lat. You are 314.584km away from your destination.
You need to travel at 45 degrees.

You are at 30.3314 long, 30.2321 lat. You want to be at 32 long, 32 lat. You are 270.381km away from your destination.
You need to travel at 46.6561 degrees.

You are at 30.5378 long, 30.3211 lat. You want to be at 32 long, 32 lat. You are 247.619km away from your destination.
You need to travel at 48.946 degrees.

You are at 30.5524 long, 30.3637 lat. You want to be at 32 long, 32 lat. You are 242.994km away from your destination.
You need to travel at 48.5022 degrees.

You are at 30.6674 long, 30.4341 lat. You want to be at 32 long, 32 lat. You are 228.692km away from your destination.
You need to travel at 49.6013 degrees.

You are at 30.6674 long, 30.4341 lat. You want to be at 32 long, 32 lat. You are 228.692km away from your destination.
You need to travel at 49.6013 degrees.

You are at 30.8031 long, 30.8153 lat. You want to be at 32 long, 32 lat. You are 187.307km away from your destination.
You need to travel at 44.7079 degrees.

You are at 31.1334 long, 31.049 lat. You want to be at 32 long, 32 lat. You are 143.103km away from your destination.
You need to travel at 47.6568 degrees.

You are at 31.3413 long, 31.1345 lat. You want to be at 32 long, 32 lat. You are 120.976km away from your destination.
You need to travel at 52.726 degrees.

You are at 31.3586 long, 31.1759 lat. You want to be at 32 long, 32 lat. You are 116.143km away from your destination.
You need to travel at 52.1064 degrees.

You are at 31.4778 long, 31.239 lat. You want to be at 32 long, 32 lat. You are 102.647km away from your destination.
You need to travel at 55.5423 degrees.

You are at 31.4778 long, 31.239 lat. You want to be at 32 long, 32 lat. You are 102.647km away from your destination.
You need to travel at 55.5423 degrees.

You are at 31.6523 long, 31.6041 lat. You want to be at 32 long, 32 lat. You are 58.6075km away from your destination.
You need to travel at 48.7082 degrees.

You are at 31.998 long, 31.8142 lat. You want to be at 32 long, 32 lat. You are 20.6636km away from your destination.
You need to travel at 89.388 degrees.

You have arrived! (close enough).

Your sample run does not have to match the above data exactly. What is important is that at each time step the vehicle is closer to its destination.
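For reference, the distances and headings in the sample run appear consistent with a simple flat-grid approximation (about 111.2 km per degree, with no curvature correction) and a heading measured counterclockwise from due east. This is reverse-engineered from the sample output, not a required formula; your program may use a different model as long as it converges on the destination.

```python
import math

KM_PER_DEGREE = 111.23  # approximate; inferred from the sample run's numbers

def flat_distance_km(cur_lat, cur_lon, dest_lat, dest_lon):
    # Flat-grid approximation: treat degrees of latitude and longitude
    # as equal-length axes (matches the sample run's 314.584 km figure)
    return math.hypot(dest_lat - cur_lat, dest_lon - cur_lon) * KM_PER_DEGREE

def heading_degrees(cur_lat, cur_lon, dest_lat, dest_lon):
    # Angle measured counterclockwise from due east (the positive
    # longitude axis), so equal lat/lon offsets give 45 degrees
    return math.degrees(math.atan2(dest_lat - cur_lat, dest_lon - cur_lon))
```

For example, (30, 30) to (32, 32) comes out to roughly 314.6 km at 45 degrees, matching the first line of the sample run.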

Questions

  1. Compare and contrast standard design patterns with machine learning design patterns. Find one similarity and one difference.
  2. In general, do you find machine learning design patterns useful?
  3. Which standard design pattern could you have employed for this assignment? How?
  4. Which other machine learning design pattern could you have employed for this assignment? How?

Marking Rubric

You will be marked out of 10 according to the following:

Each criterion is rated on four levels: Does not meet expectations, Satisfactory, Good, and Exceeds expectations.

UML Diagram (2 marks), Planning Module (2 marks), Control Module (2 marks), Autonomous Driving System, if necessary (1 mark):
  Does not meet expectations: Does not meet requirements
  Satisfactory: Meets the most important requirements
  Good: Meets all requirements with minor errors
  Exceeds expectations: Meets all requirements with no errors

Code Documentation (1 mark):
  Does not meet expectations: Does not contain documentation
  Satisfactory: Contains header documentation for either all files or for all functions within each file
  Good: Contains header documentation for all files and for most functions within each file
  Exceeds expectations: Contains header documentation for all files and for all functions within each file; documents unclear code

Questions (2 marks):
  Does not meet expectations: Answers no question correctly
  Satisfactory: Answers some questions correctly
  Good: Answers most questions correctly
  Exceeds expectations: Answers all questions correctly

Submission

Please email all source code and answers to questions to: miguel.watler@senecapolytechnic.ca

Your answers to questions can be submitted in a separate document or embedded within your source code.

Late Policy

You will be docked 10% if your assignment is submitted 1-2 days late.
You will be docked 20% if your assignment is submitted 3-4 days late.
You will be docked 30% if your assignment is submitted 5-6 days late.
You will be docked 40% if your assignment is submitted 7 days late.
You will be docked 50% if your assignment is submitted over 7 days late.