Science, Technologies, Innovations, No. 4(32), 2024, pp. 134-141

http://doi.org/10.35668/2520-6524-2024-4-15

Bobarchuk О. А. — PhD in Engineering, National Aviation University, 1, Lubomyr Huzar Ave., Kyiv, Ukraine, 03058; a.bobarchuk@gmail.com; ORCID: 0000-0003-3176-7231

Zlotkivska T. V. — Student, National Aviation University, 1, Lubomyr Huzar Ave., Kyiv, Ukraine, 03058; 7395409@stud.nau.edu.ua; ORCID: 0009-0009-0661-8956

INNOVATIVE POSSIBILITIES OF DEVELOPING ARTIFICIAL INTELLIGENCE USING A MULTIMODAL APPROACH

Abstract. The publication studies multimodal artificial intelligence in comparison with unimodal systems. The paper discusses the mathematical aspects of multimodal systems and draws conclusions about their potential and their advantages over unimodal systems.
Multimodal artificial intelligence has significant potential to improve people’s lives in areas such as medicine, education, business, and entertainment, so research into new applications is highly relevant. The technology is used to improve the efficiency and accuracy of existing systems, and it can address complex problems that unimodal systems cannot solve.
The study concludes that multimodal AI systems have the potential to suit users better than unimodal systems and to give consumers a detailed understanding of complex real-world data. Ongoing research and advances in multimodal representation, fusion methods, and the management of large multimodal datasets are helping to address the remaining challenges and to extend the capabilities of current unimodal AI.

Keywords: artificial intelligence, multimodal, unimodal, automation, generation, modalities, sound processing, product design.
