In the context of generative AI models, what are tokens?
1. Tokens are the smallest units of text, such as words or subwords, that a generative model processes during input and output.
In generative AI, tokens are the core building blocks that represent segments of text. These units could be full words, subwords, or even characters, depending on the model. The model processes input as sequences of tokens to generate predictions or responses.
A company is developing an AI application that uses language models for inference on edge devices. The company needs the lowest possible latency for real-time predictions.
Which approach should the company take?
1. Deploy optimized small language models (SLMs) on edge devices.
This is correct because deploying small, optimized models directly onto edge devices minimizes latency by avoiding network calls. Small models are specifically designed to run efficiently on devices with limited computational power, ensuring real-time performance.
Reference:
AWS for the Edge
A company wants to improve developer productivity and streamline software development using generative AI. The company plans to use Amazon Q Developer.
What functionality does Amazon Q Developer offer to help meet these goals?
1. Generate code snippets, track references, and manage open source licenses.
Amazon Q Developer is designed to assist developers by automating tasks like creating code snippets, tracking dependencies, and ensuring compliance with open-source licenses. This helps improve developer productivity by reducing manual effort.
Reference:
Amazon Q Developer
A media company has implemented a generative AI solution that uses large language models (LLMs) to automatically generate subtitles for video content in different languages. The company wants to assess the quality of the translations generated by the model.
Which model evaluation strategy should the company use?
3. Bilingual Evaluation Understudy (BLEU)
BLEU is a widely used metric for evaluating the quality of machine-generated translations by comparing them to reference translations. It measures the accuracy of translation models, making it ideal for assessing subtitle generation.
A travel company is using a pre-trained large language model (LLM) to create a chatbot that provides vacation suggestions. The company needs the chatbot’s responses to be concise and delivered in a specific language.
Which solution will help ensure the LLM produces responses that meet the company’s requirements?
1. Adjust the prompt.
Adjusting the prompt allows the company to control the style, length, and language of the responses generated by the LLM. Providing clear instructions within the prompt helps guide the model to produce shorter responses and ensures they are written in the specified language.
Reference:
Prompt Engineering Concepts
A fashion brand wants to use a pre-trained generative AI model to create content for its marketing campaigns. The company needs to ensure that the generated content reflects the brand’s tone and messaging.
Which solution will meet these requirements?
3. Craft detailed prompts with clear instructions to ensure the generated content aligns with the brand’s voice.
Creating detailed prompts is the most efficient way to control the output of a pre-trained generative AI model. By specifying the tone, style, and context in the prompts, the company can guide the model to generate content that aligns with its brand messaging without needing to modify the model itself.
Reference:
Prompt Engineering Concepts
A research organization wants to build an AI application using large language models (LLMs) to process scientific papers and automatically extract the main findings for easier review.
Which solution meets these requirements?
1. Develop a summarization system.
A summarization system is designed to read large amounts of text and extract the key points, making it ideal for processing scientific papers and summarizing the main findings.
A healthcare company is building a virtual assistant using a large language model (LLM) to provide medical advice. The company wants to ensure that the LLM does not generate incorrect medical recommendations or misleading information when prompted by users.
Which action should the company take to mitigate these risks?
1. Implement human-in-the-loop review for medical responses to ensure accuracy.
Human-in-the-loop review ensures that sensitive medical responses generated by the LLM are validated by healthcare professionals. This approach adds a layer of oversight, helping to prevent the spread of incorrect or harmful medical advice.
Reference:
Create and Start a Human Loop
A company is using a large language model (LLM) to automate customer support. However, they are concerned that users could inject malicious instructions into prompts to manipulate the LLM into providing incorrect or unauthorized responses.
Which risk is the company trying to address?
2. Prompt injection
Prompt injection refers to an attack where malicious users modify input prompts to manipulate the model into producing unintended, incorrect, or unauthorized outputs. This is a critical risk in systems that rely on prompt engineering for generating responses.
Reference:
Common Prompt Injection Attacks
A company is using AWS generative AI services to deploy a custom language model for customer support. The company wants to optimize costs while ensuring high availability and fast response times for users during peak hours.
Which factor should the company consider to manage these cost tradeoffs?
1. Token-based pricing
Token-based pricing is a key cost factor for generative AI services, where the cost is based on the number of tokens (words or characters) processed during inference. The company can manage costs by optimizing the number of tokens processed during peak hours, while maintaining high availability and responsiveness.
Reference:
Amazon Bedrock Pricing
A gaming company is deploying a custom generative AI model on AWS to generate real-time game dialogues. The company needs high availability and low-latency responses for global users, but it also wants to control costs.
Which factor should the company consider to manage these cost tradeoffs?
1. Provisioned throughput
Provisioned throughput allows the company to allocate dedicated resources for inference, ensuring high availability and low-latency responses during peak times while maintaining control over costs. By provisioning the right amount of throughput, the company can balance performance with cost efficiency.
Reference:
Amazon Bedrock Pricing