What is a pre-trained AI model?  |  NVIDIA Blog

Imagine trying to teach a toddler what a unicorn is. A good starting point might be to show the child pictures of the creature and describe its unique characteristics.

Now imagine trying to teach an artificially intelligent machine what a unicorn is. Where would one even start?

Pre-trained AI models offer a solution.

A pre-trained AI model is a deep learning model – an expression of a brain-like neural algorithm that finds patterns or makes predictions based on data – that is trained on large data sets to accomplish a specific task. It can be used as is or refined to meet specific application needs.

Why are pre-trained AI models used?

Instead of building an AI model from scratch, developers can use pre-trained models and customize them to meet their needs.

To create an AI application, developers first need an AI model that can perform a particular task, whether it’s identifying a mythical horse, detecting a safety hazard for an autonomous vehicle or diagnosing cancer from medical imaging. This model needs a lot of representative data to learn from.

This learning process involves walking through multiple layers of incoming data and emphasizing features relevant to the goals at each layer.

To create a model that can recognize a unicorn, for example, one could first give it pictures of unicorns, horses, cats, tigers, and other animals. This is incoming data.

Then layers of representative data features are built, starting with the simple ones – like lines and colors – and progressing to the complex structural features. These characteristics are assigned varying degrees of relevance when calculating probabilities.

For example, the more a creature resembles a horse rather than a cat or a tiger, the greater the likelihood that it is a unicorn. These probabilistic values are stored at each neural network layer in the AI model, and as layers are added, its understanding of the representation improves.
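The layered idea above can be sketched in a few lines of Python. This is a toy illustration rather than a real vision model: the four input features, the label names and every weight below are hand-picked assumptions, whereas a real model learns its weights from training data. A first layer combines simple features into detectors, and a second layer turns those detectors into class probabilities.

```python
import math

# Hypothetical, hand-picked weights for illustration only; a real model
# learns its weights from training data.
# Input features: [horse_like_shape, has_horn, has_stripes, has_whiskers]
HIDDEN_WEIGHTS = [
    [1.0, 0.0, 0.0, -1.0],  # "equine" detector
    [0.0, 1.0, 0.0, 0.0],   # "horn" detector
    [0.0, 0.0, 1.0, 1.0],   # "big cat" detector
]
LABELS = ["unicorn", "horse", "tiger"]
OUTPUT_WEIGHTS = [
    [1.0, 2.0, -1.0],    # unicorn: equine and horned
    [1.5, -2.0, -1.0],   # horse: equine, not horned
    [-1.0, -1.0, 2.0],   # tiger: big cat
]


def linear(inputs, weights):
    """Weighted sum of the inputs for each row of weights."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def relu(values):
    """Keep positive activations and zero out negative ones."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features):
    """Run the two layers and return a probability per label."""
    hidden = relu(linear(features, HIDDEN_WEIGHTS))
    scores = linear(hidden, OUTPUT_WEIGHTS)
    return dict(zip(LABELS, softmax(scores)))


# A horse-shaped creature with a horn.
print(classify([1.0, 1.0, 0.0, 0.0]))
```

With these assumed weights, a horse-shaped creature with a horn gets its highest probability for "unicorn", while the same creature without a horn scores highest as "horse".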

To build such a model from scratch, developers need huge datasets, often with billions of rows of data. These can be expensive and difficult to obtain, but compromising on data can lead to poor model performance.

Pre-computed probabilistic representations, called weights, save time, money and effort. A pre-trained model is already built and trained with these weights.

Using a high-quality pre-trained model with a large number of accurate representative weights increases the chances of a successful AI deployment. Weights can be changed and more data can be added to the model to customize or refine it further.
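To make the idea of refining weights concrete, here is a minimal sketch using a tiny linear model and plain gradient descent. Everything in it is an assumption made for illustration: the `PRETRAINED` values stand in for weights learned elsewhere on a related task, and `NEW_DATA` is a made-up dataset for the new task. It compares a few training steps that start from the pre-trained weights with the same number of steps that start from zero.

```python
def predict(weights, x):
    """Tiny linear model: y = w * x + b."""
    w, b = weights
    return w * x + b


def mse(weights, data):
    """Mean squared error of the model on (x, y) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def fine_tune(weights, data, steps=5, lr=0.1):
    """Return refined weights after a few gradient-descent steps."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


# Hypothetical weights learned earlier on a large, related dataset.
PRETRAINED = (2.0, 0.0)

# Made-up data for the new task, which is close to the old one.
NEW_DATA = [(x, 2.2 * x + 0.5) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

tuned = fine_tune(PRETRAINED, NEW_DATA)
scratch = fine_tune((0.0, 0.0), NEW_DATA)

print("pre-trained, before tuning:", mse(PRETRAINED, NEW_DATA))
print("pre-trained, after tuning: ", mse(tuned, NEW_DATA))
print("from scratch, same steps:  ", mse(scratch, NEW_DATA))
```

In this toy setup the fine-tuned weights end with a lower error than the weights trained from zero for the same number of steps, because they started closer to a good solution. Real models have millions or billions of weights, but the principle of starting from learned values and adjusting them on new data is the same.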

Developers who rely on pre-trained models can build AI applications faster, without having to worry about managing mountains of input data or calculating probabilities for dense layers.

In other words, using a pre-trained AI model is like getting a dress or a shirt and then tailoring it to your needs, rather than starting with fabric, thread and a needle.

Pre-trained AI models are often used for transfer learning and can be based on several types of model architecture. A popular type of architecture is the transformer model, a neural network that learns context and meaning by following relationships in sequential data.
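One common form of transfer learning keeps the pre-trained layers frozen and trains only a small new layer on top. The sketch below assumes a toy setup: `BASE_WEIGHTS` is a hypothetical stand-in for a downloaded pre-trained layer, the three input features are invented, and the new "head" is a simple logistic regression trained on four made-up examples.

```python
import math

# Hypothetical "pre-trained" base layer. It is stored as a tuple so the
# training code below cannot modify it.
# Raw inputs: [horse_like_shape, has_horn, has_stripes]
BASE_WEIGHTS = (
    (1.0, 0.0, -1.0),  # responds to a horse-like shape without stripes
    (0.0, 1.0, 0.0),   # responds to a horn
)


def extract_features(raw):
    """Frozen base: linear layer plus ReLU. Never updated during training."""
    return [max(0.0, sum(w * x for w, x in zip(row, raw))) for row in BASE_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, steps=2000, lr=0.5):
    """Fit a logistic-regression head on top of the frozen features."""
    feats = [(extract_features(raw), label) for raw, label in examples]
    w, b = [0.0, 0.0], 0.0
    n = len(feats)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for f, label in feats:
            err = sigmoid(w[0] * f[0] + w[1] * f[1] + b) - label
            grad_w[0] += err * f[0] / n
            grad_w[1] += err * f[1] / n
            grad_b += err / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


def is_unicorn(raw, head):
    """Classify a raw input with the frozen base and the trained head."""
    w, b = head
    f = extract_features(raw)
    return sigmoid(w[0] * f[0] + w[1] * f[1] + b) > 0.5


# Made-up labeled examples for the new task: 1 means unicorn.
EXAMPLES = [
    ([1.0, 1.0, 0.0], 1),  # horse-like with a horn
    ([1.0, 0.0, 0.0], 0),  # plain horse
    ([0.0, 0.0, 1.0], 0),  # tiger
    ([0.0, 1.0, 0.0], 0),  # horned, but not horse-like
]

HEAD = train_head(EXAMPLES)
print([is_unicorn(raw, HEAD) for raw, _ in EXAMPLES])
```

Only the two head weights and one bias are learned here; the base layer supplies ready-made features. That is why transfer learning can work with far less data than training an entire model from scratch.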

According to Alfredo Ramos, senior vice president of platform at AI company Clarifai – a Premier Partner of the NVIDIA Inception program for startups – pre-trained models can reduce development time for AI applications by up to one year and lead to savings of hundreds of thousands of dollars.

How do pre-trained models advance AI?

Since pre-trained models simplify and speed up AI development, many developers and companies use them to accelerate various AI use cases.

Key areas where pre-trained models are advancing AI include:

  • Natural language processing. Pre-trained models are used for translation, chatbots, and other natural language processing applications. Large language models, often based on the transformer model architecture, are an extension of pre-trained models. An example of a pre-trained LLM is NVIDIA NeMo Megatron, one of the largest AI models in the world.
  • Voice AI. Pre-trained models can help voice AI applications plug and play in different languages. Use cases include call center automation, AI assistants, and voice recognition technologies.
  • Computer vision. As in the unicorn example above, pre-trained models can help AI quickly recognize creatures – or objects, places and people. In this way, pre-trained models accelerate computer vision, providing applications with human-like vision capabilities in sports, smart cities and more.
  • Health care. For healthcare applications, pre-trained AI models like MegaMolBART, part of the NVIDIA BioNeMo service and framework, can understand the language of chemistry and learn the relationships between atoms in real-world molecules, providing the scientific community with a powerful tool to accelerate drug discovery.
  • Cybersecurity. Pre-trained models provide a starting point for implementing AI-based cybersecurity solutions and extend the capabilities of human security analysts to detect threats faster. Examples include digital fingerprinting of humans and machines, and detection of anomalies, sensitive information and phishing.
  • Artistic and creative workflows. Supporting the recent wave of AI art, pre-trained models can help accelerate creative workflows with tools like GauGAN and NVIDIA Canvas.

Pre-trained AI models can be applied in industries other than these, as their customization and fine-tuning can lead to endless use-case possibilities.

Where to find pre-trained AI models

Companies like Google, Meta, Microsoft, and NVIDIA are inventing state-of-the-art model architectures and frameworks to build AI models.

These are sometimes published on model hubs or as open source, allowing developers to refine pre-trained AI models, improve their accuracy, and extend model repositories.

NVIDIA NGC – a hub for GPU-optimized AI software, models, and Jupyter Notebook samples – includes pre-trained models as well as AI benchmarks and training recipes optimized for use with the NVIDIA AI platform.

NVIDIA AI Enterprise, a fully managed and secure cloud-native suite of AI and data analytics software, includes pre-trained models without encryption. This allows developers and companies looking to integrate pre-trained NVIDIA models into their custom AI applications to view model weights and biases, improve explainability, and debug easily.

Thousands of open-source models are also available on hubs like GitHub, Hugging Face and others.

It is important that pre-trained models are trained using ethical data that is transparent and explainable, privacy compliant, and obtained with consent and without bias.

NVIDIA Pre-Trained AI Models

To help more developers take AI from prototype to production, NVIDIA offers several pre-trained models that can be deployed out of the box, including:

  • NVIDIA SegFormer, a transformer model for simple, efficient, and powerful semantic segmentation, available on GitHub.
  • Purpose-built NVIDIA computer vision models trained on millions of images for smart cities, parking management, and other applications.
  • NVIDIA NeMo Megatron, the world’s largest customizable language model, as part of NVIDIA NeMo, an open-source framework for building high-performance, flexible applications for conversational AI, speech AI, and biology.
  • NVIDIA StyleGAN, a style-based generator architecture for generative adversarial networks, or GANs. It uses transfer learning to generate infinite paintings in a variety of styles.

Additionally, NVIDIA Riva, a GPU-accelerated software development kit for building and deploying voice AI applications, includes pre-trained models in ten languages.

And MONAI, an open-source AI framework for healthcare research developed by NVIDIA and King’s College London, includes pre-trained models for medical imaging.

Learn more about AI models pre-trained by NVIDIA.
